#sse42
Here are 19 public repositories matching this topic...
Performance-optimized wheels for TensorFlow (SSE, AVX, FMA, XLA, MPI)
Updated Jul 15, 2019
Topics: python, c, openmp, avx, simd, cosmology, astrophysics, galaxies, large-scale-structure, pair-counting, intrinsics, avx2, avx512, sse42, correlation-functions
Updated Jul 21, 2020 - C
Agenium Scale vectorization library for CPUs and GPUs
Topics: neon, cuda, avx, simd, avx2, sse2, simd-programming, aarch64, avx512, simd-instructions, sse42, rocm, sve, neon128, vectorization-library
Updated Aug 7, 2020 - Python
Updated Jun 9, 2020 - C++
A GPU-based Graph500 implementation providing compressed data movements. WEB: http://unihd-ceg.github.io/gpugraph500 - GITHUB:
Topics: c, performance, compression, cpp, hpc, gpu, mpi, cuda, slurm, simd, performance-tuning, performance-visualization, performance-analysis, gpu-computing, performance-monitoring, sse42, graph500
Updated May 11, 2019 - C++
A collection of high-speed non-cryptographic hashing algorithms
Updated Nov 6, 2019 - C
Bilinear image filtering implemented with SSE4 and AVX2.
Updated Aug 5, 2020 - C++
tinyosc with pattern matching implemented using SIMD string intrinsics
Updated Jan 12, 2020 - C
A tiny, optimized, game-oriented math framework
Updated Jun 26, 2020 - C++
What features do your CPU and OS support?
Updated May 11, 2020 - C++