This is some fairly heavily optimized C#. Aside from reading like C++ code due to all of the pointers, does this pass sanity checks, and is there any way to improve it, while keeping its speed (or perhaps a performant way to fix the weighting issue)? For reference, it takes ~170ns to produce each 16 character string when averaged out over 1 million iterations.
Requirements were that it produce Base36 (ie. all capital letters and numbers) strings of specific sizes (typically 16 in length), with a fairly even distribution (ie. avoiding collisions). I'm aware of the slight weighting towards the lowerbound characters (specifically the first 15 in the constant due to ushort not lining up with a multiple of 36)
private const int PoolSize = 4096;
private const string Base36 = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
private static readonly Lazy<RNGCryptoServiceProvider> Crypto = new Lazy<RNGCryptoServiceProvider>(() => new RNGCryptoServiceProvider());
private static readonly ThreadLocal<byte[]> BytePoolProvider = new ThreadLocal<byte[]>(() => new byte[PoolSize]);
private static ThreadLocal<int> currentIndex = new ThreadLocal<int>(() => PoolSize);
public static unsafe string Create(int size)
{
if (size > 1024) return CreateLarge(size); //not included in this post, essentially the same but allocates a new byte array every time
//Get a reference to the byte pool for our thread.
var bytePool = BytePoolProvider.Value;
//Get a reference to the current index for our thread.
var index = currentIndex.Value;
//Allocate a new string of length size.
var newString = new string('\0', size);
//Get a pointer to the first character of newString and Base36
fixed (char* buffer = newString, b36 = Base36)
{
//reset the byte pool if we'd go out of bounds on it.
if (index + size * 2 > PoolSize)
{
Crypto.Value.GetBytes(bytePool);
index = 0;
}
//Get a pointer to the byte pool at our current index
fixed (byte* pbyte = &bytePool[index])
{
for (var cnt = 0; cnt < size; cnt++)
{
//Now here's where the real "magic" begins.
//&pbyte[cnt * 2] obtains a pointer to the byte by current step, skipping every other byte because we're casting to ushort
//*((ushort*) lets us use the data at the pointer (and its next byte) as a ushort, which is then truncated to 36 via modulous,
//and then used to index on the pointer to the base36 constant string.
buffer[cnt] = b36[(*((ushort*) &pbyte[cnt*2]))%36];
}
//Increment the index for this thread.
currentIndex.Value = index + size*2;
//After we've done all that, we want the string, right?
return newString;
}
}
}
Crypto.Value.GetBytes(bytePool);
fills the byte array with random values -- have you considered doing the work yourself with a fixed value? I would assume that it might be a tiny bit more performant since you don't have to generate the random values and the values by themselves don't matter anyway. – Jeroen Vannevel Aug 28 '15 at 18:25