#

rocm

Here are 66 public repositories matching this topic...

numba
rhjmoore
rhjmoore commented Sep 1, 2021

I have seen comments, as well as Numba's own FAQ (https://numba.pydata.org/numba-doc/latest/user/faq.html), suggesting adding the following to understand how Numba handles loops:

from llvmlite import binding as llvm
llvm.set_option('','--debug-only=loop-vectorize')

You would then create your njit function and run it, and I believe the idea is that it prints debug information about whether

hipSYCL
illuhad
illuhad commented Sep 6, 2021

Bug summary
There is evidence that sub_group::get_group_id() does not return the same value as threadIdx.x / warpSize (assuming a 1D kernel), which is what we would expect on CUDA. We should check the implementation of this function. It performs bit manipulation magic, and presumably the optimization went too far...
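
For reference, the expected relationship can be sketched on the host side in plain Python. This is only an illustration of the check described above, not hipSYCL code: the helper names `expected_sub_group_id` and `find_mismatches` are hypothetical, and the default warp size of 32 is an assumption that matches current CUDA hardware (many AMD GPUs use a wavefront size of 64 instead).

```python
def expected_sub_group_id(thread_idx_x: int, warp_size: int = 32) -> int:
    """Reference value for a 1D kernel: threadIdx.x / warpSize (integer division).

    warp_size=32 is an assumption for CUDA devices; pass 64 for
    AMD hardware that uses 64-wide wavefronts.
    """
    if warp_size <= 0:
        raise ValueError("warp_size must be positive")
    return thread_idx_x // warp_size


def find_mismatches(reported_ids, warp_size: int = 32):
    """Compare per-thread reported sub-group ids against the reference.

    reported_ids[i] is the value that thread i observed (for example,
    copied back from a device buffer). Returns a list of
    (thread index, reported id, expected id) for every disagreement.
    """
    mismatches = []
    for tid, got in enumerate(reported_ids):
        want = expected_sub_group_id(tid, warp_size)
        if got != want:
            mismatches.append((tid, got, want))
    return mismatches
```

A kernel that writes each work-item's sub-group id into a buffer indexed by its local id would produce the `reported_ids` list; an empty result from `find_mismatches` means the two definitions agree for that launch.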

To Reproduce
Compare sub_group{}.get_group_id() or `sub

MIVisionX

MIVisionX toolkit is a set of comprehensive computer vision and machine intelligence libraries, utilities, and applications bundled into a single toolkit. AMD MIVisionX also delivers a highly optimized open-source implementation of the Khronos OpenVX™ and OpenVX™ Extensions.

  • Updated Nov 17, 2021
  • C++
trafficVision
