Tuesday 2 February 2010

When random is too consistent...

A while back I posted an extension method for shuffling Lists in C# using the Fisher Yates Shuffle. The algorithm is valid but my implementation relied on the default random number generator in .Net (System.Random) which is flawed since it generates a predictable sort of random.

Here's a correction that uses a more random sort of random from System.Security.Cryptography:
public static void Shuffle<T>(this IList<T> list)
{
var provider = new RNGCryptoServiceProvider();
int n = list.Count;
while (n > 1)
{
var box = new byte[1];
do provider.GetBytes(box);
while (!(box[0] < n * (Byte.MaxValue / n)));
var k = (box[0] % n);
n--;
var value = list[k];
list[k] = list[n];
list[n] = value;
}
}

Here's a simple test you can run to demonstrate the difference in the algorithms:

using System;
using System.Collections.Generic;
using System.Security.Cryptography;

namespace AlphabetSoup
{
class Program
{
static void Main()
{
var alphabet = GetAlphabet();
foreach (var c in alphabet)
Console.Write(c);
Console.Write('\n');
Console.Write('\n');

Console.WriteLine("System.Random:");
for (int i = 0; i < 10; i++)
{
alphabet = GetAlphabet();
alphabet.QuickShuffle();
foreach (var c in alphabet)
Console.Write(c);
Console.Write('\n');
}
Console.Write('\n');
Console.Write('\n');

Console.WriteLine("System.Security.Cryptography:");
for (int i = 0; i < 10; i++)
{
alphabet = GetAlphabet();
alphabet.Shuffle();
foreach (var c in alphabet)
Console.Write(c);
Console.Write('\n');
}
Console.Write('\n');
Console.Write('\n');

Console.ReadKey();
}

static List<char> GetAlphabet()
{
var alphabet = new List<char>();
for (int c = 65; c < 91; c++)
alphabet.Add((char)c);
return alphabet;
}
}
static class Ext
{
public static void Shuffle<T>(this IList<T> list)
{
var provider = new RNGCryptoServiceProvider();
int n = list.Count;
while (n > 1)
{
var box = new byte[1];
do provider.GetBytes(box);
while (!(box[0] < n * (Byte.MaxValue / n)));
var k = (box[0] % n);
n--;
var value = list[k];
list[k] = list[n];
list[n] = value;
}
}
public static void QuickShuffle<T>(this IList<T> list)
{
var rng = new Random();
var n = list.Count;
while (n > 1)
{
n--;
var k = rng.Next(n + 1);
var value = list[k];
list[k] = list[n];
list[n] = value;
}
}
}
}

Here's the output on my machine:

2 comments:

Tom said...

The parameterless constructor for Random seeds with the system clock, so if your loop time is under a millisecond it'll be using the same seed for subsequent calls to QuickShuffle(), hence the repetition.

From an implementation perspective it's pretty nasty that the randomness of the RNG is based on something like that. I've never been sure why they didn't make the whole thing static and have done with it (thread safety, perhaps?).

Andy H said...

Not sure if anyone ran into this with this extension class, but if you are using build integration, you will run into this error if trying to implement this class.

http://social.msdn.microsoft.com/Forums/en-SG/vstscode/thread/a3d28c88-8a4a-49f4-90ad-6a56a6885a8e

Anyone know how to get around this without editing FXCop?