Code Review Stack Exchange is a question and answer site for peer programmer code reviews. Join them; it only takes a minute:

Sign up
Here's how it works:
  1. Anybody can ask a question
  2. Anybody can answer
  3. The best answers are voted up and rise to the top

Below is my code for generating a random string of random length. I am using it in integration tests. Currently I have almost 2000 tests running each time a commit is made to master and a few of those tests are using this class.

I would like to make sure the code is optimized so that the tests complete as quickly as possible.

public class Strings
{
    private static readonly Random Random = new Random();

    private static int RandomChar => Random.Next(char.MinValue, char.MaxValue);

    public static string GenerateRandomStringOfSetLength(int length)
    {
        return GenerateRandomString(length);
    }

    private static string GenerateRandomString(int length)
    {
        var stringBuilder = new StringBuilder();

        while (stringBuilder.Length - 1 <= length)
        {
            var character = Convert.ToChar(RandomChar);
            if (!char.IsControl(character))
            {
                stringBuilder.Append(character);
            }
        }

        return stringBuilder.ToString();
    }
}
share|improve this question
    
Why do you need random strings ? – denis 15 hours ago
2  
Retorical question: How valid are these tests; any given test? How can random input on every test run be covering the intended conditions? Unless test output results are being compared. But that sounds like a test dependency that should be avoided. If this is unit testing, we don't want to use a database precisely because of its variable/uncontrolled state. – radarbob 15 hours ago
    
The intention of these random strings are purely to add some form of realism to the integration tests I've built to ensure that characters I would normally not think of from my English speak perspective won't unintentionally break my code. Maybe this can be seen as showing a lack of knowledge in my chosen platform, however I approach it from the point of view of completeness in my tests. Also, These random strings are not be not used in a security scenario as generating "random" passwords, but purely for testing purposes. – MrThursday 10 hours ago
    
@radarbob I forgot to add that I am not using or hitting a real DB in my tests either. I am rather generating the tables I need in memory from my EF models. This random string class is part of the routine of seeding those tables. – MrThursday 10 hours ago
1  
Randomizing tests is a bad idea. Suppose a test fails one time in a million; how are you going to be able to reproduce it to find the problem? Generate your random data once and hard code it into the tests. – Eric Lippert 8 hours ago
up vote 14 down vote accepted

Just two remarks:

  • RandomChar should be a method because it returns a different result each time - this is just a convention that we usually follow in C#
  • You can cast the result to char and don't need the Convert.ToChar

_

private static char RandomChar() => (char)Random.Next(char.MinValue, char.MaxValue);

You can also improve the performance by precalculating the array with chars:

private static readonly char[] Chars = 
    Enumerable
    .Range(char.MinValue, char.MaxValue)
    .Select(x => (char)x)
    .Where(c => !char.IsControl(c))
    .ToArray();

the RandomChar method would take the values from this array:

private static char RandomChar() => Chars[Random.Next(0, Chars.Length)];

so building the string can be a simple loop:

for (int i = 0; i < length; i++)
{
    stringBuilder.Append(RandomChar());
}

Without the StringBuilder this seems to be faster in tests by just ~8ms for 100.000 loops and a string lenght of 1.000

var chars = new char[length];
for (int i = 0; i < length; i++)
{
    chars[i] = RandomChar();
}
return new string(chars);

Changing the Random.Next(0, max) to Random.Next(max) improves the performance by another 10ms for the same tests.

share|improve this answer
    
I am interested about the convention where you put non-deterministic things in their own method. Does that have a name or is somewhere I can read about it? – Evorlor 5 hours ago

One small improvement is since you know the size of the string when you create the StringBuilder tell it how much it will hold.

 var stringBuilder = new StringBuilder(length);

This will prevent it from having to allocate more memory in the process if the string is long.

share|improve this answer

LINQ solution:

public string GetRandomString(int length)
{
    return String.Concat(RandomSequence().Where(x => !char.IsControl(x)).Take(length));
}

private IEnumerable<char> RandomSequence()
{
    while(true)
    {
        yield return (char)Random.Next(char.MinValue, char.MaxValue);
    }
}

Pretty short, but it is most likely the slowest option.

share|improve this answer
1  
Nice and clean indeed, but you're right about the performance (just tested it). – t3chb0t 14 hours ago

Random is pseudo random, as stated in the docs:

Represents a pseudo-random number generator, which is a device that produces a sequence of numbers that meet certain statistical requirements for randomness.

This is good for modelling, and games. However not for security. And so I'd recommend that you instead use RNGCryptoServiceProvider. This has a method, GetBytes(byte[] data), that looks exactly like what you want. Using this you can build an ASCII string, rather than a Unicode string, and will have the same level of randomness per bit/byte. But not per char, so your strings will need to be twice as long, as you're generating numbers in the range 0 to 65535, where the above is 0 to 256. Finally you can change the byte array to a string using Encoding.ASCII.GetString.

And so you can get:

private static string GenerateRandomString(int size)
{
    var b = new byte[size];
    new RNGCryptoServiceProvider().GetBytes(b);
    return Encoding.ASCII.GetString(b);
}

If you do need to have the program produce chars that are 16bits each, then you can chunk the above bytes array into chunks of two, shift the first by 8 bits, first << 8, and then add them.

share|improve this answer
    
Good point regarding security. I was actually thinking about this when I wrote the class as illustrated, however I do not intend to used it in such a scenario. I purely want a very good performing random string generator which goes beyond the normal alpha numeric characters we use in our every day English world. – MrThursday 10 hours ago

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.

Not the answer you're looking for? Browse other questions tagged or ask your own question.