HIP: C++ Heterogeneous-Compute Interface for Portability
Updated Oct 22, 2021 - C++
Bug summary
There is evidence that `sub_group::get_group_id()` does not return the same value as `threadIdx.x / warpSize` (assuming a 1D kernel), which is what we would expect on CUDA. We should check the implementation of this function. Our implementation performs bit-manipulation magic; presumably the optimization went too far...
To Reproduce
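As a host-only aid for the comparison, here is a minimal sketch of the relationship the report expects. It is not the project's implementation: `reference_sub_group_id`, `shifted_sub_group_id`, and `first_mismatch` are hypothetical names, and the shift-based variant is only an assumed example of the kind of bit manipulation that can drift from the division (for instance, if a shift of 5 is hard-coded while the device uses a 64-wide wavefront, as some AMD GPUs do). Whether this is what actually goes wrong here is unknown.

```cpp
// Expected value from the report: threadIdx.x / warpSize for a 1D kernel.
constexpr unsigned reference_sub_group_id(unsigned thread_idx_x, unsigned warp_size) {
    return thread_idx_x / warp_size;
}

// Hypothetical shift-based shortcut (an assumption, NOT the project's code).
// It only equals the division when warp_size == 1u << log2_warp_size.
constexpr unsigned shifted_sub_group_id(unsigned thread_idx_x, unsigned log2_warp_size) {
    return thread_idx_x >> log2_warp_size;
}

// Returns the first thread index at which the two formulas disagree,
// or -1 if they agree for every thread in a block of block_size threads.
int first_mismatch(unsigned block_size, unsigned warp_size, unsigned log2_warp_size) {
    for (unsigned tid = 0; tid < block_size; ++tid) {
        if (reference_sub_group_id(tid, warp_size) !=
            shifted_sub_group_id(tid, log2_warp_size)) {
            return static_cast<int>(tid);
        }
    }
    return -1;
}
```

With a warp size of 32 and a shift of 5 the two formulas agree for every thread. With a warp size of 64 and the same shift of 5 they first diverge at thread 32, where the division yields 0 and the shift yields 1. On a device, the same check can be done by writing both values to a buffer per thread and comparing them on the host.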
Compare `sub_group{}.get_group_id()` or `sub