Tagged Questions


Single Instruction, Multiple Data describes CPU instructions that process many operands in parallel.
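As a rough illustration of the tag description above, here is a minimal sketch in C that adds two float arrays four elements at a time. The function name `add_arrays` is hypothetical and not taken from any of the questions listed here. The sketch assumes an x86 target with SSE when `__SSE__` is defined and falls back to a plain scalar loop otherwise.

```c
#include <stddef.h>
#if defined(__SSE__)
#include <xmmintrin.h>  /* SSE intrinsics: _mm_loadu_ps, _mm_add_ps, _mm_storeu_ps */
#endif

/* Hypothetical helper: out[i] = a[i] + b[i] for every i in [0, n). */
void add_arrays(const float *a, const float *b, float *out, size_t n)
{
    size_t i = 0;
#if defined(__SSE__)
    /* Each iteration loads four floats from each input and adds them
       with a single packed-add instruction. Unaligned loads and stores
       are used so the caller needs no special alignment. */
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
    }
#endif
    /* Scalar loop: handles the tail, or everything when SSE is unavailable. */
    for (; i < n; ++i)
        out[i] = a[i] + b[i];
}
```

Calling `add_arrays(a, b, out, 6)` would process the first four elements in one vector step and the remaining two in the scalar tail. Whether this beats what a compiler auto-vectorizes from the scalar loop alone depends on the compiler and flags.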

votes

0 answers

37 views

Cache conscious SIMD matrix multiply of unsigned integers

The goal of the code review by order of importance (i.e., what I hope to hear from you): I've verified correctness using a straightforward matrix multiply function, though I am open to those who want ...

c matrix sse simd

asked Oct 2 at 23:29

Franco Solleza

1213

votes

0 answers

35 views

Multiplication of n-dimensional arrays with broadcasting

For an explanation of multiplication with broadcasting, see here. Problem: The nested loop of the simplified code is not vectorized. How to fix the simplified code so that its nested loop would ...

c++ array vectorization simd

asked Sep 28 at 20:06

r xu

113

vote

2 answers

130 views

Hash calculation for array of long values in C#

Can the following function be improved in terms of performance? I am calculating millions of such hashes. The long array represents a record from a data table where all values are encoded as long ...

c# performance hashcode simd

asked Aug 23 at 6:08

Sebastian Widz

1664

vote

1 answer

146 views

HPC kernel for DGEMM: compiler vs. assembly

This is a correct version, for computing a small matrix multiplication: C += A * B, where C is ...

c matrix assembly sse simd

asked Apr 4 at 15:26

Zheyuan Li

1147

votes

0 answers

155 views

FUTABA SBUS serial communication in C++

I would like to reimplement the current Futaba SBUS protocol in ArduPilot for Navio+. It seems to be a relatively expensive protocol, so I changed the code from an existing git project and to make it ...

c++ bitwise serial-port device-driver simd

asked Dec 10 '15 at 14:25

dgrat

254212

votes

2 answers

189 views

SSE instruction to check if byte array is zeroes C#

My fundamental problem is how to check whether a byte[] is full of zeroes. I posted a range of implementations (with timings) and one clearly beats the others. In fact, ...

c# performance array simd

asked Oct 23 '15 at 8:43

ArekBulski

26911

votes

1 answer

962 views

SIMD matrix multiplication

I recently started toying with SIMD and came up with the following code for matrix multiplication. First I attempted to implement it using SIMD the same way I did in SISD, just using SIMD for things ...

c++ c matrix simd

asked Aug 17 '15 at 1:19

Peter

1114

votes

0 answers

272 views

Bilinear interpolation using Neon intrinsics

I'm trying to do bilinear interpolation on ARM Neon. However, I find that my vectorized code is slower than the regular one on a BeagleBone Black. Any idea why this could happen? I'm using ...

performance c simd

asked Mar 17 '15 at 22:42

Josejulio

1311

votes

0 answers

142 views

SSE optimisation for audio resampling

I'm learning SSE for the first time and trying to optimise some code. Using oprofile shows that the CPU usage in this function went down from 2.5% to 0.9% using the ...

performance c audio sse simd

asked May 30 '14 at 14:50

Tim

211

votes

1 answer

321 views

Computing tangent space basis vectors for an arbitrary mesh

This is more like a share and a request than a question. I converted Eric Lengyel's code, which calculates tangents of a mesh for the purpose of texturing and normal mapping, to support SIMD. For this ...

c++ optimization computational-geometry matrix simd

asked Dec 17 '13 at 14:34

rashmatash

votes

1 answer

385 views

Writing SIMD libraries for C++ on FASM in x86-64 Linux

I have recently started a project developing SIMD libraries for C++ in FASM for x86-64 Linux. I would be glad to hear any opinion or feedback about the project, cleanness of the code and ...

linux library assembly sse simd

asked Aug 23 '12 at 17:56

Jack Black

362
