# nvidia-cuda
Here are 101 public repositories matching this topic...
Build userspace NVMe drivers and storage applications with CUDA support
disk
gpu
driver
cuda
nvm
ssd
gpudirect-rdma
dax
nvme
dma
pcie
cluster-computing
nvidia-cuda
userspace-driver
sisci
disk-io
nvm-express
dolphinics
smartio
gpudirect
-
Updated
Mar 5, 2020 - C
neworderofjamie
commented
Sep 15, 2021
These are not being tested by the feature tests and our coverage agrees:
https://codecov.io/gh/genn-team/genn/src/master/src/genn/genn/code_generator/generateNeuronUpdate.cc#L213
Ubuntu 18.04: how to install the NVIDIA driver, CUDA, and cuDNN, then build TensorFlow for GPU, step by step from the command line
python
linux
build
tutorial
neural-network
ubuntu
tensorflow
gcc
cuda
bazel
python3
nvidia
gcc-complier
compile
cudnn
cuda-toolkit
nvidia-cuda
nvidia-gpu
tensorflow-gpu
ubuntu1804
-
Updated
Jul 9, 2018
Live on-demand transcoding in go using ffmpeg. Also with NVIDIA GPU hardware acceleration.
-
Updated
Sep 7, 2021 - Go
Dockerized Folding@home client with NVIDIA GPU support to help battle COVID-19
docker
distributed-computing
nvidia
nvidia-docker
folding
foldingathome
nvidia-cuda
nvidia-container-toolkit
2019-ncov
coronavirus
covid-19
folding-at-home
-
Updated
May 9, 2021 - Dockerfile
-
Updated
Nov 3, 2019
Implementation of the conjugate gradient method using C and NVIDIA CUDA
c
numpy
gpgpu
cuda-kernels
mkl-pardiso
linear-equations
numerical-methods
conjugate-gradient
nvidia-cuda
-
Updated
Apr 20, 2021 - Python
NVIDIA DeepStream SDK
c-plus-plus
real-time
caffe
tensorflow
inference
nvidia
video-processing
object-detection
video-streaming
nvidia-gpus
video-decoding
nvidia-cuda
tensorrt
object-classification
-
Updated
Jan 14, 2019 - C++
An implementation of parallel exclusive scan in CUDA
-
Updated
Feb 23, 2018 - Cuda
-
Updated
Nov 12, 2020 - Shell
An open-source program based on NVIDIA CUDA. It includes forward modeling and reverse time migration imaging for 2D and 3D VTI media, reverse time migration imaging for 2D TTI media, and ADCIGs extraction for all of these media.
-
Updated
Jun 18, 2021 - Cuda
NVIDIA GPU-accelerated finite-difference forward seismic modeling of VTI media
-
Updated
Jun 18, 2021 - C
StiffMa: Fast finite element STIFFness MAtrix generation in MATLAB by using GPU computing.
matlab
parallel-computing
gpu-acceleration
cuda-kernels
gpu-computing
finite-element-analysis
nvidia-cuda
finite-element-methods
stiffness
pde-solver
cuda-programming
parallel-computing-toolbox
-
Updated
Sep 16, 2020 - MATLAB
HTML/JS port of CUDA Occupancy Calculator
-
Updated
Mar 21, 2019 - CoffeeScript
FFmpeg 4.0 with NVIDIA P4 GPU Driver Support
-
Updated
Feb 22, 2019 - C
Sparse linear Boolean algebra for Nvidia Cuda
python
cplusplus
graph-algorithms
linear-algebra
sparse-matrix
boolean-algebra
graph-analysis
nvidia-cuda
graphblas
-
Updated
Jun 12, 2021 - C++
PyTorch Image and Video Super-Resolution, specialized for vehicle and traffic view processing and performed by using Deep Convolutional Neural Networks
-
Updated
May 21, 2018 - Python
This system tracks artifacts in a museum and triggers an alarm if an artifact goes missing from the frame.
python
machine-learning
museum
computer-vision
tensorflow
gpu
cuda
feed
object-tracking
tensorflow-examples
nvidia-cuda
nvidia-gpu
yolov2
smart-surveillance
triggers-alarm
-
Updated
Aug 25, 2021 - Python
A simple image classifier built with Keras using NVIDIA cuda libraries.
-
Updated
Dec 7, 2019 - Python
My bachelor and master thesis at Voronezh State University
-
Updated
May 21, 2020 - Jupyter Notebook
CUDA miner project, compatible with most Maxwell/Pascal nVidia cards
-
Updated
Aug 5, 2018 - C
2D convolution with cuDNN: C++ implementation demo
deep-neural-networks
ai
deep-learning
cpp
cuda
inference
nvidia
convolutional-layers
convolution
convolutional-neural-network
cudnn
nvidia-cuda
inference-engine
-
Updated
May 19, 2020 - C++
GPUDirect Async implementation of HPGMG-FV CUDA
-
Updated
May 11, 2018 - Cuda
Restream live content as HLS using ffmpeg in docker. Also with NVIDIA GPU hardware acceleration.
-
Updated
Sep 5, 2021 - Shell
Portably Performant Physical Algebra
cmake
cpp
hpc
vector
cuda
simd
gpgpu
cpp17
hip
avx512
nvidia-cuda
amd-gpu
sandia-national-laboratories
cpp17-library
hpc-tools
snl-science-libs
-
Updated
Sep 15, 2021 - C++
This script collects information about NVLink and PCI bus traffic of NVIDIA GPUs. Results are published as Prometheus metrics via a websocket.
-
Updated
Jul 29, 2019 - Python
pyhf Docker images built on Nvidia Container Toolkit enabled base images
-
Updated
Sep 6, 2021 - Shell
3D convolution with cuDNN: C++ implementation demo
ai
deep-learning
cpp
cuda
inference
nvidia
convolutional-layers
convolution
convolutional-neural-network
cudnn
nvidia-cuda
inference-engine
deep-neural-network
-
Updated
Feb 5, 2020 - C++
Bug summary
There is evidence that `sub_group::get_group_id()` does not return the same value as `threadIdx.x / warpSize` (assuming a 1D kernel), as expected on CUDA. We should check the implementation of this function. Our implementation performs bit-manipulation magic; presumably the optimization went too far...
To Reproduce
Compare `sub_group{}.get_group_id()` or `sub