Pull requests: NVIDIA/cub
Reapply changes from NVIDIA/cub#404 that were lost in conflict resolution.
P0: must have
Absolutely necessary. Critical issue, major blocker, etc.
testing: gpuCI passed
Passed gpuCI testing.
type: bug: functional
Does not work as intended.
P2322R6 accumulator types for scan and reduce by key
P1: should have
Necessary, but not critical.
release: breaking change
Include in "Breaking Changes" section of release notes.
testing: gpuCI passed
Passed gpuCI testing.
type: enhancement
New feature or request.
P2322R6 accumulator types for reduce
P1: should have
Necessary, but not critical.
release: breaking change
Include in "Breaking Changes" section of release notes.
testing: gpuCI passed
Passed gpuCI testing.
type: bug: functional
Does not work as intended.
Fix DeviceHistogram::Even for mixed float/int levels and sample types.
P1: should have
Necessary, but not critical.
testing: gpuCI passed
Passed gpuCI testing.
type: bug: functional
Does not work as intended.
Update CDP support macros for if-target compatibility
P0: must have
Absolutely necessary. Critical issue, major blocker, etc.
release: breaking change
Include in "Breaking Changes" section of release notes.
release: notes
PR description contains pre-written release notes.
Fix begin_bit == end_bit == 0 for device-wide and segmented sort
P2: nice to have
Desired, but not necessary.
type: bug: functional
Does not work as intended.
Add documentation generation
area: docs
Related to documentation.
blocked
Currently cannot make progress.
P1: should have
Necessary, but not critical.
type: enhancement
New feature or request.
add support FutureValue for reduce
P2: nice to have
Desired, but not necessary.
type: enhancement
New feature or request.
Add labels for CTest.
blocked
Currently cannot make progress.
only: cmake
CMake changes only. Doesn't need internal NVIDIA CI (DVS).
only: gpuci
Changes to gpuCI only. Doesn't need internal NVIDIA CI.
P2: nice to have
Desired, but not necessary.
[WIP] Allow cub::DeviceRadixSort and cub::DeviceSegmentedRadixSort to use iterator as input
helps: pytorch
Helps or needed by PyTorch.
P3: backlog
Unprioritized
Draft for catch2 testing framework usage
P2: nice to have
Desired, but not necessary.
testing: gpuCI in progress
Started gpuCI testing.
Remove pragma unroll from device radix sort, thread reduce, histogram, radix rank, select if and block exchange
#315
opened Jun 1, 2021 by
senior-zero
cub::ThreadLoadAsync and friends, abstractions for asynchronous data movement
#209
opened Oct 5, 2020 by
brycelelbach
3 tasks
Retune radix sort, run length encoding, reduce by key, scan, select if, and histogram for SM70 and SM80
area: performance
Does not perform as intended.
helps: rapids
Helps or needed by RAPIDS.
P1: should have
Necessary, but not critical.
#208
opened Oct 5, 2020 by
brycelelbach
Draft
Add assignment operator to the TestBar test util class.
P2: nice to have
Desired, but not necessary.
triage
Needs investigation and classification.
fix 'invalid arguments' warp sync error on Volta
info needed
Cannot make progress without more information.
P1: should have
Necessary, but not critical.
repro: missing
Missing a complete example that reproduces the issue.
type: bug: functional
Does not work as intended.
Add support for bfe.u64 and bfi.b64
P1: should have
Necessary, but not critical.
type: enhancement
New feature or request.
#117
opened Sep 21, 2017 by
aterenin