I ran across a practical example of algorithm analysis when I wrote a method that generates unique random values inside a provided range. Figured I would share it...
The code in this article was included in my https://github.com/ZacharyPatten/Towel project. Please check it out if you want to see more code like it. :)
Note: all code examples are in C#.
Introduction
When you want to generate unique random numbers, there are two approaches you can take: 1) you can track the remaining possible values of generation or 2) you can track the values that have already been randomly selected. For variables/parameters, let count
be the number of values to generate, minValue
be the inclusive lower bound of generation, and maxValue
be the exclusive upper bound of generation.
Algorithm #1/A
After a bit of googling, option #1 seems to be more common. You often see people throw all the possible values into an array/list and remove a randomly generated index. Here is an implementation of that algorithm:
public static IEnumerable<int> NextUniqueA(this Random random, int count, int minValue, int maxValue) { // Algorithm A: Θ(range + count) int pool = maxValue - minValue; int[] array = new int[pool]; for (int i = 0, j = minValue; j < maxValue; i++, j++) // Θ(range) array[i] = j; for (int i = 0; i < count; i++) // Θ(count) { int rollIndex = random.Next(0, pool); int roll = array[rollIndex]; array[rollIndex] = array[--pool]; yield return roll; } }
Algorithm #2/B
Now, here is what option #2 can look like. For this implementation, I use a linked list to store all the values that have already been generated in sorted order:
public static IEnumerable<int> NextUniqueB(this Random random, int count, int minValue, int maxValue) { // Algorithm B: O(.5*count^2), Ω(count), ε(.5*count^2) Node head = null; for (int i = 0; i < count; i++) // Θ(count) { int roll = random.Next(minValue, maxValue - i); Node node = head; Node previous = null; while (!(node is null)) // O(count / 2), Ω(0), ε(count / 2) { if (node.Value > roll) break; roll++; previous = node; node = node.Next; } yield return roll; if (previous is null) head = new Node() { Value = roll, Next = head, }; else previous.Next = new Node() { Value = roll, Next = previous.Next }; } } internal class Node { internal int Value; internal Node Next; }
Analyzing Algorithm Complexities
These two implementations have different runtime algorithm complexities. The #1/A option has a runtime complexity of Θ(range + count)
while the #2/B option has a runtime complexity of O(.5*count^2), Ω(count), ε(.5*count^2)
where O
is the worst case, Θ
is every case, Ω
is the best case, and ε
is the expected average between the worst and the best cases. Long story short, when the count
is relatively low you want to use algorithm #2/B. When the count
is relatively high you want to use algorithm #1/A. So if you want to make a faster all-around general purpose method, you need to combine the two approaches. But how do you do that? You need to find the intersection of the two runtime complexities of each method.
If you treat both algorithm complexities as linear functions and do some simple math to find the intesection, you will find that they intersect at about √(maxValue - minValue)
the square root of the range of random generation relative to count
. Now that you know the intersection point, you can combine the two algorithms into a single function that is optimized for more use cases:
Combination of #1/A and #2/B
public static IEnumerable<int> NextUnique(this Random random, int count, int minValue, int maxValue) { if (count > Math.Sqrt(maxValue - minValue)) { // Algorithm A } else { // Algorithm B } }
Benchmarks
Here are some benchmarks so you can compare theory with practical results. Note that for these benchmarks I used a random generation range of 0..10000
. So minValue = 0
and maxValue = 10000
. From the theory, I knew that the intersection of the runtime algorithms is roughly √(maxValue - minValue) = √(100000 - 0) = √(100000) = 100
. So when count
is less than ~100
the #2/B algorithm should be faster, but when count
is greater than ~100
algorithm #1/A should be faster.
Benchmarks Source Code
Note: This code uses the BenchmarkDotNet nuget package. It also uses the Plots exporter to generate the graphs, which requires R be installed.
using BenchmarkDotNet.Attributes; using BenchmarkDotNet.Running; using System; using System.Linq; using System.Collections.Generic; using BenchmarkDotNet.Exporters; using BenchmarkDotNet.Configs; namespace UniqueRandomGeneration { class Program { static void Main(string[] args) => BenchmarkRunner.Run<Benchmarks>(); } [RPlotExporter] [Config(typeof(Config))] public class Benchmarks { public class Config : ManualConfig { public Config() => Add(RPlotExporter.Default); } const int MinValue = 0; const int MaxValue = 10000; [Params(25, 50, 75, 100, 125, 150, 175, 200, 225, 250)] public int Count; readonly Random random = new Random(); [Benchmark] public void RandomUniqueA() => random.NextUniqueA(Count, MinValue, MaxValue).ToArray(); [Benchmark] public void RandomUniqueB() => random.NextUniqueB(Count, MinValue, MaxValue).ToArray(); [Benchmark] public void RandomUnique() => random.NextUnique(Count, MinValue, MaxValue).ToArray(); } public static class Extensions { internal class Node { internal int Value; internal Node Next; } public static IEnumerable<int> NextUnique(this Random random, int count, int minValue, int maxValue) { if (maxValue < minValue) throw new ArgumentOutOfRangeException(nameof(minValue) + " > " + nameof(minValue)); if (count < 0) throw new ArgumentOutOfRangeException(nameof(count) + " < 0"); if (maxValue - minValue < count) throw new ArgumentOutOfRangeException(nameof(count) + " is larger than " + nameof(maxValue) + " - " + nameof(minValue) + "."); if (count > Math.Sqrt(maxValue - minValue)) { // Algorithm A: Θ(range + count) int pool = maxValue - minValue; int[] array = new int[pool]; for (int i = 0, j = minValue; j < maxValue; i++, j++) // Θ(range) array[i] = j; for (int i = 0; i < count; i++) // Θ(count) { int rollIndex = random.Next(0, pool); int roll = array[rollIndex]; array[rollIndex] = array[--pool]; yield return roll; } } else { // Algorithm B: O(.5*count^2), Ω(count), ε(.5*count^2) Node head = null; for (int i = 0; i < count; i++) // Θ(count) { int roll = random.Next(minValue, maxValue - i); Node node = head; Node previous = null; while (!(node is null)) // O(count / 2), Ω(0), ε(count / 2) { if (node.Value > roll) break; roll++; previous = node; node = node.Next; } yield return roll; if (previous is null) head = new Node() { Value = roll, Next = head, }; else previous.Next = new Node() { Value = roll, Next = previous.Next }; } } } public static IEnumerable<int> NextUniqueA(this Random random, int count, int minValue, int maxValue) { if (maxValue < minValue) throw new ArgumentOutOfRangeException(nameof(minValue) + " > " + nameof(minValue)); if (count < 0) throw new ArgumentOutOfRangeException(nameof(count) + " < 0"); if (maxValue - minValue < count) throw new ArgumentOutOfRangeException(nameof(count) + " is larger than " + nameof(maxValue) + " - " + nameof(minValue) + "."); // Algorithm A: Θ(range + count) int pool = maxValue - minValue; int[] array = new int[pool]; for (int i = 0, j = minValue; j < maxValue; i++, j++) // Θ(range) array[i] = j; for (int i = 0; i < count; i++) // Θ(count) { int rollIndex = random.Next(0, pool); int roll = array[rollIndex]; array[rollIndex] = array[--pool]; yield return roll; } } public static IEnumerable<int> NextUniqueB(this Random random, int count, int minValue, int maxValue) { if (maxValue < minValue) throw new ArgumentOutOfRangeException(nameof(minValue) + " > " + nameof(minValue)); if (count < 0) throw new ArgumentOutOfRangeException(nameof(count) + " < 0"); if (maxValue - minValue < count) throw new ArgumentOutOfRangeException(nameof(count) + " is larger than " + nameof(maxValue) + " - " + nameof(minValue) + "."); // Algorithm B: O(.5*count^2), Ω(count), ε(.5*count^2) Node head = null; for (int i = 0; i < count; i++) // Θ(count) { int roll = random.Next(minValue, maxValue - i); Node node = head; Node previous = null; while (!(node is null)) // O(count / 2), Ω(0), ε(count / 2) { if (node.Value > roll) break; roll++; previous = node; node = node.Next; } yield return roll; if (previous is null) head = new Node() { Value = roll, Next = head, }; else previous.Next = new Node() { Value = roll, Next = previous.Next }; } } } }
Benchmarks Results Table
BenchmarkDotNet=v0.12.0, OS=Windows 10.0.18363 Intel Core i7-4790K CPU 4.00GHz (Haswell), 1 CPU, 8 logical and 4 physical cores .NET Core SDK=3.1.100 [Host] : .NET Core 3.1.0 (CoreCLR 4.700.19.56402, CoreFX 4.700.19.56404), X64 RyuJIT DefaultJob : .NET Core 3.1.0 (CoreCLR 4.700.19.56402, CoreFX 4.700.19.56404), X64 RyuJIT
Method | Count | Mean | Error | StdDev |
---|---|---|---|---|
RandomUniqueA | 25 | 7.406 us | 0.1576 us | 0.1876 us |
RandomUniqueB | 25 | 1.307 us | 0.0256 us | 0.0414 us |
RandomUnique | 25 | 1.300 us | 0.0046 us | 0.0043 us |
RandomUniqueA | 50 | 7.761 us | 0.0263 us | 0.0246 us |
RandomUniqueB | 50 | 2.893 us | 0.0084 us | 0.0079 us |
RandomUnique | 50 | 3.022 us | 0.0755 us | 0.0707 us |
RandomUniqueA | 75 | 8.387 us | 0.1317 us | 0.1232 us |
RandomUniqueB | 75 | 5.180 us | 0.1070 us | 0.1001 us |
RandomUnique | 75 | 5.298 us | 0.0634 us | 0.0593 us |
RandomUniqueA | 100 | 9.151 us | 0.1800 us | 0.2340 us |
RandomUniqueB | 100 | 8.013 us | 0.0994 us | 0.0930 us |
RandomUnique | 100 | 8.246 us | 0.1205 us | 0.1127 us |
RandomUniqueA | 125 | 9.335 us | 0.1533 us | 0.1434 us |
RandomUniqueB | 125 | 11.291 us | 0.1093 us | 0.1023 us |
RandomUnique | 125 | 9.394 us | 0.1773 us | 0.1658 us |
RandomUniqueA | 150 | 9.509 us | 0.0323 us | 0.0302 us |
RandomUniqueB | 150 | 14.486 us | 0.0653 us | 0.0610 us |
RandomUnique | 150 | 9.355 us | 0.0428 us | 0.0400 us |
RandomUniqueA | 175 | 9.972 us | 0.0319 us | 0.0298 us |
RandomUniqueB | 175 | 18.778 us | 0.0792 us | 0.0741 us |
RandomUnique | 175 | 9.789 us | 0.0593 us | 0.0554 us |
RandomUniqueA | 200 | 10.270 us | 0.0297 us | 0.0278 us |
RandomUniqueB | 200 | 23.384 us | 0.0623 us | 0.0520 us |
RandomUnique | 200 | 10.112 us | 0.0499 us | 0.0466 us |
RandomUniqueA | 225 | 10.724 us | 0.1754 us | 0.1465 us |
RandomUniqueB | 225 | 28.748 us | 0.0872 us | 0.0816 us |
RandomUnique | 225 | 10.612 us | 0.0430 us | 0.0381 us |
RandomUniqueA | 250 | 11.144 us | 0.0518 us | 0.0484 us |
RandomUniqueB | 250 | 34.320 us | 0.1249 us | 0.1107 us |
RandomUnique | 250 | 10.891 us | 0.0826 us | 0.0772 us |
Benchmarks Results Bar Graph
Probability/Validity Testing
Here is some validity testing to hopefully make sure I am not full of shit. In this code, I generate 5
values in the 0..26
range. I do this 1000000
times, and I calculate the mean that each values was randomly selected. All the values should be selected an average of 5 / 26 = 0.1923...
times.
Validity Testing Source Code
using System; using System.Linq; using System.Collections.Generic; namespace ValidityTesting { class Program { static void Main() { Console.WriteLine(nameof(Extensions.NextUnique)); PrintPriorities(CalculateProbabilities(Extensions.NextUnique)); Console.WriteLine(); Console.WriteLine(nameof(Extensions.NextUniqueA)); PrintPriorities(CalculateProbabilities(Extensions.NextUniqueA)); Console.WriteLine(); Console.WriteLine(nameof(Extensions.NextUniqueB)); PrintPriorities(CalculateProbabilities(Extensions.NextUniqueB)); Console.WriteLine(); } internal static void PrintPriorities(Dictionary<int, double> results) { foreach (var keyVal in results.OrderBy(x => x.Key)) Console.WriteLine(keyVal.Key + ": " + keyVal.Value); } internal static Dictionary<int, double> CalculateProbabilities(Func<Random, int, int, int, IEnumerable<int>> algorithm) { Random random = new Random(); const int size = 5; const int min = 0; const int max = 26; const int iterations = 1000000; var counts = new Dictionary<int, int>(); for (int i = min; i < max; i++) counts.Add(i, 0); for (int i = 0; i < iterations; i++) foreach (int j in algorithm(random, size, min, max)) counts[j]++; var results = new Dictionary<int, double>(); foreach (var keyVal in counts) results.Add(keyVal.Key, keyVal.Value / (double)iterations); return results; } } public static class Extensions { internal class Node { internal int Value; internal Node Next; } public static IEnumerable<int> NextUnique(this Random random, int count, int minValue, int maxValue) { if (maxValue < minValue) throw new ArgumentOutOfRangeException(nameof(minValue) + " > " + nameof(minValue)); if (count < 0) throw new ArgumentOutOfRangeException(nameof(count) + " < 0"); if (maxValue - minValue < count) throw new ArgumentOutOfRangeException(nameof(count) + " is larger than " + nameof(maxValue) + " - " + nameof(minValue) + "."); if (count > Math.Sqrt(maxValue - minValue)) { // Algorithm A: Θ(range + count) int pool = maxValue - minValue; int[] array = new int[pool]; for (int i = 0, j = minValue; j < maxValue; i++, j++) // Θ(range) array[i] = j; for (int i = 0; i < count; i++) // Θ(count) { int rollIndex = random.Next(0, pool); int roll = array[rollIndex]; array[rollIndex] = array[--pool]; yield return roll; } } else { // Algorithm B: O(.5*count^2), Ω(count), ε(.5*count^2) Node head = null; for (int i = 0; i < count; i++) // Θ(count) { int roll = random.Next(minValue, maxValue - i); Node node = head; Node previous = null; while (!(node is null)) // O(count / 2), Ω(0), ε(count / 2) { if (node.Value > roll) break; roll++; previous = node; node = node.Next; } yield return roll; if (previous is null) head = new Node() { Value = roll, Next = head, }; else previous.Next = new Node() { Value = roll, Next = previous.Next }; } } } public static IEnumerable<int> NextUniqueA(this Random random, int count, int minValue, int maxValue) { if (maxValue < minValue) throw new ArgumentOutOfRangeException(nameof(minValue) + " > " + nameof(minValue)); if (count < 0) throw new ArgumentOutOfRangeException(nameof(count) + " < 0"); if (maxValue - minValue < count) throw new ArgumentOutOfRangeException(nameof(count) + " is larger than " + nameof(maxValue) + " - " + nameof(minValue) + "."); // Algorithm A: Θ(range + count) int pool = maxValue - minValue; int[] array = new int[pool]; for (int i = 0, j = minValue; j < maxValue; i++, j++) // Θ(range) array[i] = j; for (int i = 0; i < count; i++) // Θ(count) { int rollIndex = random.Next(0, pool); int roll = array[rollIndex]; array[rollIndex] = array[--pool]; yield return roll; } } public static IEnumerable<int> NextUniqueB(this Random random, int count, int minValue, int maxValue) { if (maxValue < minValue) throw new ArgumentOutOfRangeException(nameof(minValue) + " > " + nameof(minValue)); if (count < 0) throw new ArgumentOutOfRangeException(nameof(count) + " < 0"); if (maxValue - minValue < count) throw new ArgumentOutOfRangeException(nameof(count) + " is larger than " + nameof(maxValue) + " - " + nameof(minValue) + "."); // Algorithm B: O(.5*count^2), Ω(count), ε(.5*count^2) Node head = null; for (int i = 0; i < count; i++) // Θ(count) { int roll = random.Next(minValue, maxValue - i); Node node = head; Node previous = null; while (!(node is null)) // O(count / 2), Ω(0), ε(count / 2) { if (node.Value > roll) break; roll++; previous = node; node = node.Next; } yield return roll; if (previous is null) head = new Node() { Value = roll, Next = head, }; else previous.Next = new Node() { Value = roll, Next = previous.Next }; } } } }
Validity Testing Console Output
NextUnique 0: 0.192031 1: 0.19208 2: 0.192785 3: 0.192287 4: 0.192593 5: 0.192686 6: 0.191937 7: 0.191998 8: 0.19244 9: 0.192112 10: 0.19212 11: 0.192707 12: 0.192877 13: 0.192086 14: 0.192277 15: 0.191907 16: 0.192366 17: 0.192386 18: 0.192423 19: 0.19229 20: 0.192017 21: 0.191901 22: 0.19285 23: 0.19282 24: 0.191902 25: 0.192122 NextUniqueA 0: 0.192705 1: 0.191956 2: 0.192313 3: 0.192613 4: 0.193418 5: 0.191957 6: 0.192116 7: 0.192217 8: 0.192773 9: 0.191862 10: 0.19233 11: 0.192145 12: 0.192044 13: 0.192749 14: 0.192097 15: 0.19224 16: 0.192526 17: 0.19225 18: 0.191653 19: 0.192546 20: 0.192156 21: 0.191686 22: 0.192908 23: 0.19174 24: 0.192605 25: 0.192395 NextUniqueB 0: 0.192408 1: 0.191776 2: 0.191809 3: 0.192623 4: 0.192399 5: 0.192451 6: 0.192135 7: 0.1931 8: 0.192594 9: 0.19232 10: 0.192281 11: 0.191723 12: 0.192358 13: 0.19216 14: 0.192697 15: 0.192975 16: 0.192357 17: 0.192507 18: 0.191768 19: 0.191874 20: 0.192803 21: 0.191934 22: 0.192075 23: 0.192704 24: 0.192155 25: 0.192014
Top comments (0)