# rocm

Here are 67 public repositories matching this topic...

apache / tvm

Open deep learning compiler stack for cpu, gpu and specialized accelerators

javascript machine-learning performance deep-learning metal compiler gpu vulkan opencl tensor spirv rocm tvm

Updated Jul 8, 2022
Python

cupy / cupy

Open

Support `dtype` argument in `cupy.corrcoef` and `cupy.cov`

emcastillo commented Jan 27, 2022

Description

https://numpy.org/doc/stable/reference/generated/numpy.corrcoef.html

https://docs.cupy.dev/en/stable/reference/generated/cupy.corrcoef.html

The arguments of the two functions appear to differ.

Additional Information

The `dtype` argument was added in NumPy 1.20.
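
For reference, the NumPy behavior the issue asks CuPy to match can be sketched as below. This is a minimal illustration using NumPy only (version 1.20 or later); the sample array is made up for the example, and nothing here shows how CuPy implements or plans to implement the argument.

```python
import numpy as np

# Illustrative data: two variables (rows), four observations each.
x = np.array([[1.0, 2.0, 3.0, 4.0],
              [2.0, 4.0, 6.0, 9.0]])

# Default behavior: the result has at least float64 precision.
r64 = np.corrcoef(x)

# With NumPy >= 1.20, `dtype` selects the data type of the result.
r32 = np.corrcoef(x, dtype=np.float32)
c32 = np.cov(x, dtype=np.float32)

print(r64.dtype, r32.dtype, c32.dtype)
```

A CuPy version that accepted the same keyword would let code written against `numpy.corrcoef` and `numpy.cov` run unchanged on the GPU, which is the compatibility gap the issue describes.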

contribution welcome cat:numpy-compat good first issue

Open

Add missing parameters to the `nan<x>` functions

Open

[Tracker] Implement all `scipy.*` APIs in CuPy

dmlc / nnvm

deep-learning deployment metal optimization opencl cuda computation-graph rocm nnvm tvm

Updated Sep 11, 2018
C++

deepmodeling / deepmd-kit

A deep learning package for many-body potential energy representation and molecular dynamics

python deep-learning cpp tensorflow cuda molecular-dynamics lammps ipi rocm potential-energy deepmd

Updated Jul 7, 2022
C++

stotko / stdgpu

stdgpu: Efficient STL-like Data Structures on the GPU

cpp gpu modern-cpp cpp14 openmp cuda stl data-structures gpgpu gpu-acceleration cpp17 stl-containers hip gpu-computing rocm cpp20 stl-like

Updated Jun 22, 2022
C++

illuhad / hipSYCL

Open

Remove references to deprecated hcc

tomdeakin commented Jun 8, 2020

Just an FYI whilst I was trawling through the ROCm GitHub page:

https://rocmdocs.amd.com/en/latest/Programming_Guides/Programming-Guides.html#

good first issue potential student work

Open

Upgrade cuda installation script to 10.1

RadeonOpenCompute / ROCm-docker

Dockerfiles for the various software layers defined in the Radeon Open Compute Platform

Updated May 29, 2022
Shell

agenium-scale / nsimd

Agenium Scale vectorization library for CPUs and GPUs

hpc neon cuda avx simd avx2 sse2 simd-programming aarch64 avx512 simd-instructions simd-library sse42 rocm cpp20 sve neon128 cpp20-library vectorization-library

Updated Oct 21, 2021
C

alpaka-group / alpaka

Abstraction Library for Parallel Kernel Acceleration 🦙

cpp hpc gpu openmp cuda header-only cpp17 hip heterogeneous-parallel-programming tbb openacc rocm

Updated Jul 7, 2022
C++

GPUOpen-Tools / gpu_performance_api

GPU Performance API for AMD GPUs

sdk opengl vulkan opencl d3d12 d3d11 rocm gpu-performance-counters

Updated Apr 25, 2022
C++

ROCmSoftwarePlatform / rocBLAS

Next generation BLAS implementation for ROCm platform

Updated Jun 30, 2022
C++

JuliaGPU / AMDGPU.jl

Open

Add option to disable automatic mark/wait of specific arrays

jpsamaroo commented Apr 6, 2021

Since arrays may not actually be modified by a given operation, or might only be partially modified (or the user has some other way to ensure correctness).

good first issue performance

Open

User-accessible objects should print nicely

GPUOpen-ProfessionalCompute-Libraries / amdovx-core

AMD OpenVX Core, a sub-module of amdovx-modules.

linux cmake cpu opencl range vcxproj amdgpu rocm radeon-open-compute openvx radeon-instinct-mi-series radeon-vega-series amd-openvx khronos-openvx vx-loomsl

Updated Feb 5, 2019
C++

GPUOpen-ProfessionalCompute-Libraries / MIVisionX

MIVisionX toolkit is a set of comprehensive computer vision and machine intelligence libraries, utilities, and applications bundled into a single toolkit. AMD MIVisionX also delivers a highly optimized open-source implementation of the Khronos OpenVX™ and OpenVX™ Extensions.

Updated Jul 8, 2022
C++

RadeonOpenCompute / k8s-device-plugin

Kubernetes (k8s) device plugin to enable registration of AMD GPU to a container cluster

kubernetes k8s rocm kubernetes-device-plugins

Updated Jul 7, 2022
Go

eth-cscs / COSMA

Distributed Communication-Optimal Matrix-Matrix Multiplication Algorithm

linear-algebra mpi cuda scalapack matrix-multiplication gpu-acceleration rocm matmul communication-optimal pdgemm

Updated Jul 7, 2022
C++

ROCm-Developer-Tools / aomp

AOMP is an open source Clang/LLVM based compiler with added support for the OpenMP® API on Radeon™ GPUs. Use this repository for releases, issues, documentation, packaging, and examples.

amd llvm openmp clang rocm

Updated Jul 2, 2022
Fortran

GPUOpen-ProfessionalCompute-Libraries / amdovx-modules

AMD OpenVX modules, such as neural network inference and 360 video stitching.

video-stitching rocm radeon-open-compute openvx onnx neural-network-inference radeon-instinct-mi-series radeon-vega-series

Updated Feb 5, 2019
C++

ROCmSoftwarePlatform / rocFFT

Next generation FFT implementation for ROCm

Updated Jul 8, 2022
C++

ROCmSoftwarePlatform / gpufort

GPUFORT: S2S translation tool for CUDA Fortran and Fortran+X in the spirit of hipify

fortran gpu openmp cuda interoperability gpgpu hip cuda-fortran openacc rocm

Updated May 24, 2022
Fortran

ROCmSoftwarePlatform / rocPRIM

ROCm Parallel Primitives

amd gpu parallel cuda primitive hip rocm

Updated Jul 5, 2022
C++

GPUOpen-Tools / radeon_compute_profiler

The Radeon Compute Profiler (RCP) is a performance analysis tool that gathers data from the API run-time and GPU for OpenCL™ and ROCm/HSA applications. This information can be used by developers to discover bottlenecks in the application and to find ways to optimize the application's performance.

profiler opencl rocm

Updated Jun 16, 2020
C++

electronic-structure / SIRIUS

Domain specific library for electronic structure calculations

Updated Jul 6, 2022
C++

ROCmSoftwarePlatform / rocRAND

RAND library for HIP programming language

gpu random cuda rng hip rocm

Updated Jul 7, 2022
C

NUCAR-DEV / Hetero-Mark

A Benchmark Suite for Heterogeneous System Computation

benchmark gpu opencl cuda hip rocm hetero-mark

Updated Nov 8, 2021
Jupyter Notebook

sukhmeetbawa / OpenCL-AMD-Fedora

AMD OpenCL userspace drivers for Fedora.

amd opencl rocm fedora-workstation

Updated Jun 24, 2022
Shell

eth-cscs / SpFFT

Sparse 3D FFT library with MPI, OpenMP, CUDA and ROCm support

hpc mpi cuda gpu-acceleration fft rocm fft-library

Updated Feb 17, 2022
C++

Grench6 / RX580-rocM-tensorflow-ubuntu20.4-guide

Installation guide for ROCm and TensorFlow on Ubuntu for the RX580

rocm tensorflow-rocm

Updated Nov 22, 2021

ROCmSoftwarePlatform / hipfort

Fortran interfaces for ROCm libraries

fortran gpu random solver cuda interoperability gpgpu sparse blas fft hip rocm

Updated Jun 29, 2022
Fortran

rocmsys / RET

ROCm Machine Learning and HPC Stack installer

deep-learning hpc amd tensorflow pytorch rocm

Updated Jul 31, 2020
Shell
