-
Notifications
You must be signed in to change notification settings - Fork 4.5k
[BEAM-11033] Updates Dataflow Metrics processor for portable job submission #13024
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BEAM-11033] Updates Dataflow Metrics processor for portable job submission #13024
Conversation
|
R: @pabloem |
|
Refactored code to not to depend on an exception path. PTAL. |
Codecov Report
@@ Coverage Diff @@
## master #13024 +/- ##
=======================================
Coverage 82.49% 82.49%
=======================================
Files 455 455
Lines 54867 54890 +23
=======================================
+ Hits 45264 45283 +19
- Misses 9603 9607 +4
Continue to review full report at Codecov.
|
pabloem
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I just have questions about how this works with the Dataflow API / UI / etc.
Does this mean that the Dataflow API reports metrics with portability-API step names? Is this for all portable jobs? Does this mean that the UI has made this change as well?
|
With portable job submission, we directly convert Beam proto pipeline to internal Dataflow steps (instead of going through v1beta3 steps). Here we will be using transform IDs (from proto pipeline transform map) as the Dataflow step names. Hence this change to map such step names to user names expected by metrics. Do you think we need additional changes to map such step names to UI for displaying metrics ? |
|
I guess currently, the API works this way: And the SDK translates v1b3 names to user names. Is something like this happening for portable job submission? and SDK translating fnapi names to user names? |
|
So for portable job submission "v1beta3 step names" do not exist. So we'll have: At least looking at a simple wordcount job, seems like expected metrics are available in the UI. |
Currently, Dataflow metrics processor expects Dataflow internal step names generated for v1beta3 job description in metrics returned by Dataflow service.
But with portable job submission, Dataflow uses PTransform ID (in proto pipeline) as the internal step name. Hence metrics processor should be updated to handle this.
Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
R: @username).[BEAM-XXX] Fixes bug in ApproximateQuantiles, where you replaceBEAM-XXXwith the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.CHANGES.mdwith noteworthy changes.See the Contributor Guide for more tips on how to make review process smoother.
Post-Commit Tests Status (on master branch)
Pre-Commit Tests Status (on master branch)
See .test-infra/jenkins/README for trigger phrase, status and link of all Jenkins jobs.
GitHub Actions Tests Status (on master branch)
See CI.md for more information about GitHub Actions CI.