OpenMP is an API that supports shared memory multiprocessing in C, C++, and Fortran.
0
votes
0 answers
20 views
Renumbering vertices in a 3D mesh
I'm wondering whether there is any room for improving the performance of this code:
...
4
votes
0 answers
120 views
Optimizing the calculation of complex exponential numbers using OpenMP
I am trying to create a function that can either beat numexpr or perform comparably for the vectorized mathematical operation ...
5
votes
0 answers
181 views
Parallelizing an algorithm with OpenMP using a dynamic work queue
I'm looking for comments on the design, correctness and performance (not so much style) of a dynamic work queue for OpenMP worker threads.
I have an algorithm that can be thought of in terms of some ...
5
votes
0 answers
163 views
Gauss-Seidel+SOR
I am learning OpenMP+MPI hybrid programming. As an example, I have chosen Gauss-Seidel+SOR. My implementation uses MPI_THREAD_FUNNELED-style hybrid programming, ...
1
vote
0 answers
65 views
Big Integers and parallel execution with OpenMP
This is part of an implementation of Rabin-Williams signatures as described by Bernstein in Section 6 of "RSA signatures and Rabin–Williams signatures: the state of the art" using Tweaked Roots.
The ...
2
votes
1 answer
571 views
Simple speed up of C++ OpenMP kernel
This function calculates the standard deviation of a patch, given a kernel size and a greyscale OpenCV image. The middle pixel of the patch is kept if the standard deviation of the patch is below the given threshold, ...
4
votes
1 answer
232 views
Neural Network Simulator with OpenMP
I wrote a simple neural network simulator (the biophysical kind) from scratch, and was hoping to get some feedback on how I can speed things up, or any C++ / compilation best practices that I can ...
5
votes
1 answer
240 views
Sparse matrix multiplication for billion by billion matrices
I have to multiply two billion-by-billion sparse matrices (on a CPU), so any help or hints on optimizing the code given below would be extremely useful.
Note: I am only showing the ...
12
votes
2 answers
205 views
Parallelization of number factors using OpenMP
For a simple try at parallelization on my own outside of school, I've created a number factors calculator. I hope to eventually come up with something more creative.
Since I don't have access to ...
3
votes
1 answer
85 views
Trajectory optimizations
Here's a function doing some trajectory optimization in the following manner:
...
1
vote
1 answer
248 views
Generating Ulam numbers with OpenMP and single-thread versions
I'm making a program that takes an integer n and generates the first n Ulam numbers. I followed this guide about OpenMP.
This ...
4
votes
2 answers
193 views
Increasing performance of OpenMP based advection equation solver for Xeon Phi
I am solving the linear advection equation in order to learn parallel programming. I am running this code on a Xeon Phi coprocessor. Below is a plot of how it scales with an increasing number of threads. I ...
6
votes
1 answer
292 views
Optimizing multiplication of square matrices for full CPU utilization
Problem
I am learning about HPC and code optimization. I am attempting to replicate the results in Goto's seminal matrix multiplication paper. Despite my best efforts, I cannot get over ~50% maximum ...
6
votes
1 answer
497 views
OpenMP parallel for critical section and use of flush
I am not sure where flush should be used (if it should be used here at all).
...
4
votes
2 answers
550 views
OpenMP loop parallel for loop with function calls and STL vector
I have a function initialize_path_statistics(). I have used OpenMP to parallelize it. I am not sure where certain lines such as
...
11
votes
3 answers
463 views
Pi Benchmarking in C
I wrote the following program to calculate n digits of Pi (where n could be anything, such as 10M) in order to benchmark the CPU, and it works perfectly (without OpenMP):
...
4
votes
2 answers
4k views
OpenMP parallelization of a for loop with function calls
Using OpenMP, is it correct to parallelize a for loop inside a function "func" as follows?
...
4
votes
1 answer
621 views
Vectors assignations and operations in a loop and parallelization with OpenMP
I use this piece of code to compute a short-time Fourier transform:
...