OpenMP is an API that supports shared memory multiprocessing in C, C++, and Fortran.


5 votes, 1 answer, 179 views

Sparse matrix multiplication for billion by billion matrices

I have to multiply two billion-by-billion sparse matrices (on a CPU), so any help or hints on optimizing the code given below would be extremely useful. Note: I am only showing the ...

4 votes, 1 answer, 112 views

Neural Network Simulator with OpenMP

I wrote a simple neural network simulator (the biophysical kind) from scratch, and was hoping to get some feedback on how I can speed things up, or any C++ / compilation best practices that I can ...

2 votes, 1 answer, 98 views

Simple speed up of C++ OpenMP kernel

This function calculates the standard deviation of a patch, given a kernel size and greyscale OpenCV image. The middle pixel of the patch is kept if stdev of the patch is below the given threshold, ...

1 vote, 0 answers, 34 views

Big Integers and parallel execution with OpenMP

This is part of an implementation of Rabin-Williams signatures as described by Bernstein in Section 6 of "RSA signatures and Rabin–Williams signatures: the state of the art" using Tweaked Roots. The ...

12 votes, 2 answers, 122 views

Parallelization of number factors using OpenMP

For a simple try at parallelization on my own outside of school, I've created a number factors calculator. I hope to eventually come up with something more creative. Since I don't have access to ...

3 votes, 1 answer, 84 views

Trajectory optimizations

Here's a function doing some trajectory optimization in the following manner: ...

4 votes, 2 answers, 391 views

OpenMP loop parallel for loop with function calls and STL vector

I have a function initialize_path_statistics(). I have used OpenMP to make it parallel. I am not sure where certain lines such as ...

4 votes, 2 answers, 162 views

Increasing performance of OpenMP based advection equation solver for Xeon Phi

I am solving the linear advection equation in order to learn parallel programming. I am running this code on a Xeon Phi coprocessor. Below is a plot of how it scales up with an increasing number of threads. I ...

1 vote, 1 answer, 196 views

Generating Ulam numbers with OpenMP and single-thread versions

I'm making a program that takes an integer n and generates the first n Ulam numbers. I followed this guide about OpenMP. This ...

4 votes, 2 answers, 3k views

OpenMP parallelization of a for loop with function calls

Using OpenMP, is it correct to parallelize a for loop inside a function "func" as follows? ...

6 votes, 1 answer, 298 views

OpenMP parallel for critical section and use of flush

I am not sure where flush should be used (if it should be used at all here). ...

5 votes, 1 answer, 227 views

Optimizing multiplication of square matrices for full CPU utilization

Problem: I am learning about HPC and code optimization. I attempt to replicate the results in Goto's seminal matrix multiplication paper. Despite my best efforts, I cannot get over ~50% maximum ...

11 votes, 3 answers, 385 views

Pi Benchmarking in C

I wrote the following program to calculate n digits of Pi (where n could be anything, like 10M) in order to benchmark the CPU and it works perfectly (without OpenMP): ...

4 votes, 1 answer, 454 views

Vectors assignations and operations in a loop and parallelization with OpenMP

I use this piece of code to compute a short-time Fourier transform: ...