I'm currently working on a project1 which will result in millions of FFTs being solved time and time again. I have a finished prototype in Python (using numpy, scipy, and the pyfftw3 package), but there are significant bottlenecks in calculation speed. I have been writing a GPU version using arrayfire, but there's still a large amount of number crunching which can't be done in parallel. I'm hoping that translating the project to C++, either the entire thing or just the compute-intensive parts, will provide an extra boost over Python where the GPU isn't appropriate.
As a starting point, I have written a very basic C++ script which defines a Gaussian curve, takes the FFT using the FFTW3 library, and plots the input curve with its FFT using gnuplot. As this is my first script in C++ (neglecting 'hello world') I'd like to make sure I'm not wandering severely off track when it comes to optimizing efficiency and speed. I haven't benchmarked this yet, but was hoping that someone with more experience in using these libraries and the language could scan over it and point out any howlers.
The code:
#include "gnuplot-iostream.h"
#include <iostream>
#include <vector>
#include <cmath>
#include <fftw3.h>
#include <complex>
typedef std::complex<double> doubleComp;
int main()
{
Gnuplot gp;
doubleComp i;
i = -1;
i = sqrt(i);
int gridOrder = 9;
int nPoints = pow(2, gridOrder);
int halfPoints = pow(2, gridOrder-1);
float tau = 18;
float inTemp, outTemp;
fftw_complex in[nPoints], out[nPoints], pArr1[nPoints], pArr2[nPoints];
fftw_plan p;
for (int idx=0; idx<nPoints; idx++) {
in[idx][0] = exp(-pow((idx-halfPoints)/tau, 2));
in[idx][1] = 0;
}
p = fftw_plan_dft_1d(nPoints, pArr1, pArr2, FFTW_FORWARD, FFTW_MEASURE);
fftw_execute_dft(p, in, out);
for (int idx=0; idx<nPoints; idx++) {
out[idx][0] *= (1/double(nPoints));
out[idx][1] *= (1/double(nPoints));
}
std::vector<std::pair<double, double> >inPlot;
std::vector<std::pair<double, double> >outPlot;
for (int idx=0; idx<nPoints; idx++) {
inTemp = std::abs(in[idx][0] + i*in[idx][1]);
// Deal with fftshifts needed for visuals:
if (idx > halfPoints)
outTemp = std::abs(out[idx-halfPoints][0] +
i*out[idx-halfPoints][1]);
if (idx < halfPoints)
outTemp = std::abs(out[idx+halfPoints][0] +
i*out[idx+halfPoints][1]);
inPlot.push_back(std::make_pair(idx-halfPoints, inTemp));
outPlot.push_back(std::make_pair(idx-halfPoints, outTemp));
}
gp << "set xlabel 'Grid points' font 'Helvetica, 12'\n" <<
"set ylabel 'Intensity' font 'Helvetica, 12'" << std::endl;
gp << "set tics font 'Helvetica, 10'" << std::endl;
gp << "set title 'Gaussian pulse example' font 'Helvetica, 13'" << std::endl;
gp << "set grid xtics lt 0 lw 1 lc 'gray'\n" <<
"set grid ytics lt 0 lw 2 lc 'gray'" << std::endl;
gp << "set term wxt '0'" << std::endl;
gp << "plot '-' lt 1 lw 2 linecolor 'blue' with line title 'Time domain'\n";
gp.send1d(inPlot);
gp << "set term wxt '1'" << std::endl;
gp << "plot '-' lt 11 lw 2 with line title"
"'Spectral domain'\n";
gp.send1d(outPlot);
fftw_destroy_plan(p);
return 0;
}
To compile (Ubuntu 16.04):
g++ -O3 GaussianPulseDefinition.cpp -o GaussianPulseDefinition -lboost_system -lboost_iostreams -lboost_filesystem -lfftw3 -lm
1 For those who like to know the application: The project is basically a coupled PDE solver (with Qt GUI) aimed at simulating the nonlinear dynamics of optical pulse propagation in optical fibre, fibre laser systems, optical amplifiers, etc.