OK, I've been playing with D for a while (and been in love with its expressive power and simplicity, to be honest). However, since I'm still new to D, I'm facing a few issues.
Let's take the following example. All the program does is take each number in the range 0..10000000 (a fairly big one, for benchmarking purposes) and, for each one, return a vector/array with the positions of its set bits (in binary).
E.g.
4 = 100(2) => [ 2 ]
5 = 101(2) => [ 0, 2 ]
6 = 110(2) => [ 1, 2 ]
7 = 111(2) => [ 0, 1, 2]
And so on...
Now here's my C++ code (no vector reserve etc. being used):
// bits.cpp
#include <vector>
#include <iostream>
#include <cmath>

using namespace std;

vector<unsigned int> bitsSet(unsigned long long bitboard)
{
    unsigned int n;
    vector<unsigned int> res;
    for (n = 0; bitboard != 0; n++, bitboard &= (bitboard - 1))
    {
        res.push_back(log2(bitboard & ~(bitboard - 1)));
    }
    return res;
}

int main()
{
    for (int k = 0; k < 10000000; k++)
    {
        vector<unsigned int> res = bitsSet(k);
    }
    return 0;
}
And here's my D code:
// bits.d
import std.stdio;
import std.math;

int[] bitsSet(ulong bitboard)
{
    int[] res;
    for (int n = 0; bitboard != 0; n++, bitboard &= (bitboard - 1))
        res ~= cast(int) log2(bitboard & ~(bitboard - 1));
    return res;
}

void main(string[] args)
{
    for (ulong k = 0; k < 10000000; k++)
    {
        int[] bits = bitsSet(k);
    }
}
Now, given that the two pieces of code are compiled with g++ bits.cpp -o cbits (or clang++ bits.cpp -o cbits) and dmd bits.d -ofdbits, respectively, these are my benchmark results, using time (on Mac OS X 10.8.2):
For C++:
time ./cbits
real 0m19.742s
user 0m19.722s
sys 0m0.012s
For D:
time ./dbits
real 0m14.914s
user 0m14.891s
sys 0m0.017s
This looks OK (with D being, for me, noticeably faster).
NOTE:
Now, if I try to use something like res.reserve(64); in my C++ bitsSet function, the time drops to around 7s, which IS significantly faster. I tried something like res.length = 64; in my D code, and the time dropped to around 11s (still slower than C++), but I'm not sure the result is the same...
What further optimizations would you suggest for my D code, so that it's at least as fast as my C++ code?
Results with compiler optimization flags:
With clang++ bits.cpp -O3 -o cbits
time ./cbits
real 0m8.994s
user 0m8.986s
sys 0m0.006s
With dmd bits.d -O -release -inline -m64 -ofdbits
time ./dbits
real 0m14.083s
user 0m14.034s
sys 0m0.014s
Which looks pretty amazing (or bizarre): Clang managed to go from 19 to 8 seconds, while D's optimization flags did almost nothing?
EDIT: So, is there no hope that my D code will run as fast as its C++ counterpart?
Comments:
"-O3?" – R. Martinho Fernandes, Feb 6 at 8:34
"… (-O2 or -Os) for the C++ code." – Matthieu M., Feb 6 at 8:34