17 votes, 5 answers, 1k views
Naive C++ Matrix Multiplication 100 times slower than BLAS?
I am looking at large matrix multiplication and ran the following experiment to establish a baseline:
Randomly generate two 4096x4096 matrices X, Y from the standard normal distribution (mean 0, standard deviation 1).
Z = X*Y
...
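The excerpt is cut off before any code appears, so the following is only a minimal sketch of what such a baseline might look like, not the asker's actual program. It assumes row-major storage in a single `std::vector<double>`. The names `Matrix`, `random_normal_matrix`, `naive_multiply`, `time_naive_multiply`, and the seed values are all hypothetical.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <random>
#include <vector>

// Row-major square matrix in one contiguous buffer (an assumption of this sketch).
using Matrix = std::vector<double>;

// Fill an n x n matrix with draws from N(0, 1). The seed parameter is hypothetical.
Matrix random_normal_matrix(std::size_t n, unsigned seed) {
    std::mt19937 gen(seed);
    std::normal_distribution<double> dist(0.0, 1.0);
    Matrix m(n * n);
    for (double& v : m) v = dist(gen);
    return m;
}

// Naive triple loop: Z[i][j] = sum over k of X[i][k] * Y[k][j].
Matrix naive_multiply(const Matrix& x, const Matrix& y, std::size_t n) {
    assert(x.size() == n * n && y.size() == n * n);
    Matrix z(n * n, 0.0);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            double sum = 0.0;
            for (std::size_t k = 0; k < n; ++k) {
                // y is read with stride n here, which is cache-unfriendly for large n.
                sum += x[i * n + k] * y[k * n + j];
            }
            z[i * n + j] = sum;
        }
    }
    return z;
}

// Wall-clock seconds for one naive multiply of two random n x n matrices.
double time_naive_multiply(std::size_t n) {
    const Matrix x = random_normal_matrix(n, 1);
    const Matrix y = random_normal_matrix(n, 2);
    const auto start = std::chrono::steady_clock::now();
    const Matrix z = naive_multiply(x, y, n);
    const auto stop = std::chrono::steady_clock::now();
    volatile double sink = z[0];  // keep the result observable so the multiply is not optimized away
    (void)sink;
    return std::chrono::duration<double>(stop - start).count();
}
```

Calling `time_naive_multiply(4096)` would correspond to the experiment size described above. A smaller n is a more practical first check, since the loop does n^3 multiply-adds.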