-
Notifications
You must be signed in to change notification settings - Fork 28.7k
[SPARK-32170][CORE] Improve the speculation for the inefficient tasks by the task metrics. #28994
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…s by the task metrics.
|
@cloud-fan @dongjoon-hyun kindly review, thanks. |
|
Can one of the admins verify this patch? |
|
@venkata91 You might be interested in this. |
|
@maropu @cloud-fan @gatorsmile @mridulm @dongjoon-hyun Could you help check this PR? Thanks. |
| }.map(_.taskMetrics).filter(_.isDefined).map(_.get).foreach { task => | ||
| if (task.inputMetrics != null) { | ||
| sumInputRecords += task.inputMetrics.recordsRead | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about recordsWritten? Should that also be considered wrt progress same wrt shuffleRecordsWritten?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even cache can also take time when written to disk, does that need to be taken into consideration? Similarly GC time, shuffle read blocked time etc. could also impact task progress
| } else if (taskData != null && taskData.contains(tid) && taskData(tid) != null && | ||
| taskData(tid).taskMetrics.isDefined) { | ||
| val taskMetrics = taskData(tid).taskMetrics.get | ||
| val currentTaskProgressRate = (taskMetrics.inputMetrics.recordsRead + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would it make sense to add taskProgress as part of taskMetrics that way it can also be shown in SparkUI? Although taskProgress for tasks which doesn't involve input/output/shuffle records would be hard to measure?
|
This is an interesting idea and a good start. Just considering the runTime of a task alone might not be useful in many cases. Thanks! |
|
We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable. |
What changes were proposed in this pull request?
Improve the speculation for the inefficient tasks by the task metrics.
Why are the changes needed?
Does this PR introduce any user-facing change?
No
How was this patch tested?
Add UT.