Single Instruction, Multiple Data (SIMD) describes CPU instructions that apply one operation to multiple data elements in parallel.

22 votes, 1 answer, 658 views

SIMD matrix multiplication

I recently started toying with SIMD and came up with the following code for matrix multiplication. First I attempted to implement it using SIMD the same way I did in SISD, just using SIMD for things ...
7 votes, 2 answers, 166 views

SSE instruction to check if byte array is zeroes C#

My fundamental problem is how to check whether byte[] is full of zeroes. I posted a range of implementations (with timings) and one clearly beats others. In fact, ...
7 votes, 1 answer, 378 views

Writing SIMD libraries for C++ on FASM in x86-64 Linux

I have recently started a project to develop SIMD libraries for C++ in FASM on x86-64 Linux. I would be glad to hear any opinion or feedback about the project, cleanliness of the code and ...
5 votes, 0 answers, 220 views

Bilinear interpolation using Neon intrinsics

I'm trying to do bilinear interpolation on ARM NEON. However, I find that my vectorized code is slower than the regular one on a BeagleBone Black. Any idea why this could happen? I'm using ...
4 votes, 0 answers, 131 views

SSE optimisation for audio resampling

I'm learning SSE for the first time and trying to optimise some code. Using oprofile shows that the CPU usage in this function went down from 2.5% to 0.9% using the ...
2 votes, 0 answers, 93 views

FUTABA SBUS serial communication in C++

I would like to reimplement the current Futaba SBUS protocol in ArduPilot for Navio+. It seems to be a relatively expensive protocol, so I changed the code from an existing git project and to make it ...
1 vote, 1 answer, 113 views

HPC kernel for DGEMM: compiler v.s. assembly

This is a correct version, for computing a small matrix multiplication: C += A * B, where C is ...
0 votes, 1 answer, 298 views

Computing tangent space basis vectors for an arbitrary mesh

This is more like a share and a request than a question. I converted Eric Lengyel's code, which calculates tangents of a mesh for the purpose of texturing and normal mapping, to support SIMD. For this ...
-3 votes, 0 answers, 15 views

How can I change my code using SIMD/AVX to calculate ranks in my page rank algorithm? [closed]

I am writing a page rank program. I am writing a method for updating the rankings. I have successfully got it working with nested for loops and also a threaded version. However I would like to instead ...