rfc: binary: Support broadcast source0 #869
Conversation
9c49e60 to 7e8f7c3
> on src0 broadcasting. The reference implementation will be the only
> implementation supporting this broadcasting until we have an additional
> request. The dimensions of the destination should be defined as:
> `dst_dims[d] = max(src0_dims[d], src1_dims[d])` and
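The quoted rule can be sketched as a small standalone helper. This is illustrative Python, not oneDNN code, and `infer_dst_dims` is a hypothetical name: each dimension of `dst` is the maximum of the two source dimensions, and a source dimension must either match `dst` or be 1 (broadcast).

```python
def infer_dst_dims(src0_dims, src1_dims):
    """dst_dims[d] = max(src0_dims[d], src1_dims[d]).

    A source dimension participates in broadcast only when it is 1;
    otherwise the two source dimensions must match.
    """
    assert len(src0_dims) == len(src1_dims), "ranks must match"
    dst = []
    for a, b in zip(src0_dims, src1_dims):
        if a != b and 1 not in (a, b):
            raise ValueError(f"dims {a} and {b} are not broadcast-compatible")
        dst.append(max(a, b))
    return dst

# src0 broadcast over N and the spatial dims of a 5D NxCxDxHxW problem:
print(infer_dst_dims([1, 16, 1, 1, 1], [8, 16, 4, 28, 28]))  # [8, 16, 4, 28, 28]
```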
dzarukin
Nov 4, 2020
Contributor
Right now dst_md is allowed to have format_kind::any, which is later converted into the src0 memory format. Since the dimensions will not match, what's the plan for handling format_kind::any for dst in such a case?
bartekxk
Nov 5, 2020
Author
Contributor
I added a section for this problem with two solutions.
> - Pros:
>   - Very simple solution.
> - Cons:
>   - Complicates integration into an application. Before this change
dzarukin
Nov 4, 2020
Contributor
I don't understand this point. A user was already able to create a binary post-op md with any dimension values, and at the stage of creating a primitive an error would pop up since the dimensions don't match. I don't see how the proposed behavior changes this scenario.
igorsafo
Nov 5, 2020
Contributor
Let me answer this question, as I am the one who brought it to the discussion. The integration of DNNL into some frameworks (TF) is separated into 2 stages:
- Graph-rewriter. During this stage graph nodes are merged according to the fusions supported by oneDNN. For example, Conv and Binary nodes could be merged into a single node "ConvBinary". If you look into the integration, it doesn't call oneDNN code and doesn't try to create primitives; it just merges nodes based on operation kinds.
- Execution. During this stage the node "ConvBinary" will create a conv with binary fusion, execute it, and destroy it.
So my point is that adding restrictions on a fusion of binary based on sizes will bring complications on the integration side, because the integration should also manage these restrictions. Basically, you can no longer assume that if conv is supported, binary is supported, and conv + binary fusion is supported from the API perspective, then the conv + binary implementation will be supported.
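The two-stage integration described above can be sketched as follows. The node kinds and helper names here are illustrative, not TensorFlow or oneDNN API; the point is that the rewriter fuses purely by operation kind, so a size-based restriction would only surface later, at primitive creation.

```python
# Stage 1: the graph rewriter merges nodes by operation kind alone.
# It never creates oneDNN primitives, so it cannot see shape restrictions.
def rewrite(nodes):
    merged, i = [], 0
    while i < len(nodes):
        if nodes[i] == "Conv" and i + 1 < len(nodes) and nodes[i + 1] == "Binary":
            merged.append("ConvBinary")  # fuse based on kinds only
            i += 2
        else:
            merged.append(nodes[i])
            i += 1
    return merged

# Stage 2: execution creates the fused primitive. A size-based restriction
# on the binary post-op is only discovered here, after rewriting is done.
def execute(kind, src0_dims, dst_dims):
    if kind == "ConvBinary" and src0_dims != dst_dims:
        raise RuntimeError("conv + binary fusion rejected at primitive creation")
    return "ok"

print(rewrite(["Conv", "Binary", "ReLU"]))  # ['ConvBinary', 'ReLU']
```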
> * JIT implementation allows broadcast with following syntax:
>   `NxCxDxHxW:{NxCx1x1x1,1xCx1x1x1,1x1x1x1x1} -> NxCxDxHxW`
> * JIT i8 implementation allows broadcast with following syntax:
>   `NxCxDxHxW:{N,1}xCx1x1x1 -> NxCxDxHxW`
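The broadcast patterns quoted above for a 5D `NxCxDxHxW` destination can be written out as an explicit membership check. This is an illustrative sketch, not oneDNN's dispatch logic, and `jit_broadcast_supported` is a hypothetical name.

```python
def jit_broadcast_supported(src1_dims, dst_dims):
    """Check src1 against the NxCxDxHxW broadcast shapes quoted from the RFC:
    NxCx1x1x1 (per-batch, per-channel), 1xCx1x1x1 (per-channel),
    and 1x1x1x1x1 (a single scalar)."""
    n, c = dst_dims[0], dst_dims[1]
    allowed = [
        [n, c, 1, 1, 1],
        [1, c, 1, 1, 1],
        [1, 1, 1, 1, 1],
    ]
    return src1_dims in allowed

print(jit_broadcast_supported([1, 16, 1, 1, 1], [8, 16, 4, 28, 28]))  # True
print(jit_broadcast_supported([1, 1, 4, 1, 1], [8, 16, 4, 28, 28]))   # False
```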
dzarukin
Nov 4, 2020
Contributor
I believe the int8 implementation is aligned in terms of broadcast support.
bartekxk
Nov 5, 2020
Author
Contributor
You're right, thanks, fixed.
> - Pros:
>   - Ease of use (see cons of the first option).
> - Cons:
>   - It should be supported as jit version which is supported in most
dzarukin
Nov 4, 2020
Contributor
Just a side note: for the current coverage of the optimized binary post-op version, this is quite irrelevant. There are only two cases supported: a single point and channel-wise. A single point will work regardless of the dst_md dims values, but doing a per-channel broadcast when dst_md has C=1 doesn't make sense to me at all. So I won't add this one as a con.
bartekxk
Nov 5, 2020
Author
Contributor
So why not add it as a con?
> binary primitive and fuse them together without looking into shapes.
> After this change an application should track binary sizes and create
> a fusion only when src0_dims are equal to dst_dims.
> 2. Extend binary post operation to support src0 broadcast.
dzarukin
Nov 4, 2020
Contributor
Is there a real-world use case for such a scenario at all?
bartekxk
Nov 5, 2020
Author
Contributor
I don't know of any need for this, or where it could be used.
0ba1988 to 2635a47
RFC document for enabling broadcast of source0 in the binary primitive.
Link to rendered document