The Mystery of Random Numbers -- a Pascal Exercise
Have you ever wondered how random numbers are generated by a computer? Computers are generally very precise, and mathematical operations are generally very predictable, so it seems strange that computers can do something randomly, on purpose. So I'll try to explain the 'magic' of random numbers without getting too mathematical.
A common method of generating a random sequnce of numbers on a computer is to start with any number except zero, then generate the next random number by multiplying it by some 'special' number. Of course, if we keep doing this, the number will soon get too big for the computer to hancle. So the higher digits of the number are chopped off to prevent an 'arithmetic overflow' error. The 'Random' function of Turbo Pascal and Borland Pascal use this kind of method with 16-bit integers. But we will try doing it with 8-bit integers (bytes), because it will be easier to see how it works.
Try compiling and running this program:
program Rand1; var R: byte; I: integer; begin writeln; R:= 71; for I:= 1 to 80 do begin R:= Lo(R * 157); write(R:4); if (I mod 16) = 0 then writeln; end; readln; end.
The purpose of the writeln statement is to put a blank line between the output of each experiment that we do. The readln statement makes the program pause, waiting for us to press the Enter key after we look at the output. The 'for' loop is to generate 80 numbers for each experiment. The "if (I mod 16) = 0 then writeln" statement starts a new line after every 16 numbers. The "write(R:4)" statement writes each 'random' value of R. Since byte values range from 0 to 255, no more than 3 digits are needed, and the ":4" allows for at least one space to separate the random numbers.
We initially set R equal to 71. Why 71? It doesn't really matter, except that odd values work better (Try other values). We just picked a number 'at random'. The 'special' number 157 is called the 'multiplier'. (Why 157? We'll explain later.) The "R:= Lo(R * 157)" statement inside the loop generates the next 'random' value of R, which depends of the previous value of R. So the number sequence isn't really random, because the next number is determined by the previous number and an exact rule. (The sequence is called 'deterministic'.) In a really random process, like tossing coins or dice, the process is too complex and imprecise for us to determine the outcome.
The result of the multiplication "R * 157" is generally larger than a byte -- a 2-byte integer, and the Lo function takes the low byte of the result.
The output of the program is:
139 63 163 247 123 111 19 167 107 159 131 87 91 207 243 7 75 255 99 183 59 47 211 103 43 95 67 23 27 143 179 199 11 191 35 119 251 239 147 39 235 31 3 215 219 79 115 135 203 127 227 55 187 175 83 231 171 223 195 151 155 15 51 71 139 63 163 247 123 111 19 167 107 159 131 87 91 207 243 7
The output looks random (except that they are all odd values), but since it is really deterministic, it's called 'pseudorandom' (false random). If you look at the output carefully, you will notice that the output repeats after 64 values; and that's another indication that the sequence isn't really random.
We can't avoid repetition, because since we limited the output to byte values, and since there are 256 byte values, we will need to repeat some values after at most the first 256 values. Also, since each value depends on the previous value, every time that 139 occurs, the next number will be 63; and every time that 63 occurs, the next number will be 163, etc. So the best we can do is a byte sequence that repeats every 256 values. But this program generates a byte sequence that repeats every 64 values.
You have probably been wondering why we chose a multiplier value of 157. Now is a good time for you to try other values, to see what happens. You will find that even numbers work very badly, and that some odd numbers provide longer sequences than others. (The length of the sequence is how many numbers are generated before it starts to repeat.) The longest length you will find is 64, so the multiplier value of 157 is one of the best.
I did some experimenting, trying to get the maximum length (for 8-bit bytes) of 256. I could get a length of 256 with the multiplier value of 157 if I chopped the product "R * 157" down to 10 bits rather than 8 bits. The 10-bit values are too big. But I found that the right-most 2 bits of the 10 bits are always 01, so they are not random. This gave me an idea: The other 8 bits are random, so use them. So this is how we can start with one 'random' value R, and compute the next 'random' R:
(1) Multiply R by 4 and add one. (This changes the 8-bit R into a 10-bit number that, in binary, ends with 01.)
(2) Multiply by 157. (This makes an 18-bit value.)
(3) Chop off the left-most 8 bits and the right-most 2 bits. (This leaves 8 bits.)
The first two steps compute (4*R + 1) * 157, which can be simplified as 628*R + 157, because 4*157 = 628 and 1*157 = 157. But how do we do step 3?
To chop off the left-most 8 bits of an 18-bit value (leaving 10 bits), we can use Pascal's 'and' operator on integers. When the 'and' operator is used on integers, each bit of each integer is treated as a boolean value -- 1 is true and 0 is false. The 'and' operator is applied to corresponding bits like this:
A = 100011100110111101 (145853 in decimal) B = 000000001111111111 (1023 in decimal) A and B = 000000000110111101 (445 in decimal) which is = 0110111101 (10 bits)
Notice that on the left side where B is zeros, "A and B" is made zeros, no matter the value of A. But on the right side where B is ones, "A and B" is just a copy of A. Since zeros on the left of an integer do not contribute to its value, they can be removed -- 'chopped off'.
To chop off the right-most 2 bits of an integer, we can use Pascal's 'shr' (shift right) operator. For example, "445 shr 2" shifts the binary representation of 445 to the right two bit positions, like this:
445 = 0110111101 445 shr 2 = 01101111 (111 in decimal)
So, putting all these steps together, a Pascal statement for changing one value of R to the next value is:
R:= ((628 * R + 157) and 1023) shr 2;
But we will get an "arithmetic overflow" error when this statement runs. This happens because we have declared R to be a byte (8 bits), but the multiplication generates an 18-bit value. That is, we need longint precision. To tell this to the compiler, we can write:
R:= ((longint(628) * R + 157) and 1023) shr 2;
This statement is now correct; but we can make the method clearer if we write the constant 1023 in hexadecimal notation. In Pascal, hexadecimal notation is indicated by a '$' prefix, so we change the "1023" to "$3FF". We can easily translate the hexadecimal to binary, because each hexadecimal digit is equivalent to four bits like this:
$0 = 0000 $1 = 0001 $2 = 0010 $3 = 0011 $4 = 0100 $5 = 0101 $6 = 0110 $7 = 0111 $8 = 1000 $9 = 1001 $A = 1010 $B = 1011 $C = 1100 $D = 1101 $E = 1110 $F = 1111
So 3FF in hexadecimal translates to 0011 1111 1111 in binary (1023 in decimal). The $3FF makes it clearer that the number is 10 ones in a row. Here is the modified program:
program Rand2; var R: byte; I: integer; begin writeln; R:= 71; for I:= 1 to 272 do begin R:= ((longint(628) * R + 157) and $3FF) shr 2; write(R:4); if (I mod 16) = 0 then writeln; end; readln; end.
The program output is:
178 81 212 43 134 85 72 79 154 153 252 179 238 29 240 87 130 225 36 59 86 229 152 95 106 41 76 195 190 173 64 103 82 113 116 75 38 117 232 111 58 185 156 211 142 61 144 119 34 1 196 91 246 5 56 127 10 73 236 227 94 205 224 135 242 145 20 107 198 149 136 143 218 217 60 243 46 93 48 151 194 33 100 123 150 37 216 159 170 105 140 3 254 237 128 167 146 177 180 139 102 181 40 175 122 249 220 19 206 125 208 183 98 65 4 155 54 69 120 191 74 137 44 35 158 13 32 199 50 209 84 171 6 213 200 207 26 25 124 51 110 157 112 215 2 97 164 187 214 101 24 223 234 169 204 67 62 45 192 231 210 241 244 203 166 245 104 239 186 57 28 83 14 189 16 247 162 129 68 219 118 133 184 255 138 201 108 99 222 77 96 7 114 17 148 235 70 21 8 15 90 89 188 115 174 221 176 23 66 161 228 251 22 165 88 31 42 233 12 131 126 109 0 39 18 49 52 11 230 53 168 47 250 121 92 147 78 253 80 55 226 193 132 27 182 197 248 63 202 9 172 163 30 141 160 71 178 81 212 43 134 85 72 79 154 153 252 179 238 29 240 87
The last row is a repeat of the first row. Before the repeating starts, each of the values from 0 to 255 appears exactly once.
As we said before, the number sequence appears to be random, but unlike really random numbers like those generated by a cage full of numbered balls for the lottery, this sequence is predictable and repeatable. Every time we start with 71, we get the same sequence, which is actually a cycle that returns to 71 and starts over again. If we start with a different number, we get the same sequence except that we start at a different part of the same cycle. If we have a random number generator with a VERY long cycle, however, it is likely that the program that uses these random numbers will finish before the random number cycle is finished. In this case, the random sequence is not repeated. It is even likely, or almost certain, that by changing the starting number, we will get a completely different sequence.
What if we could get just one truely random, unpredictable number? If we start the random number generator with this 'seed' number when the program starts, the whole sequence would be (in some sense) unpredictable. What is usually done is that the program looks at the computer's clock, including the fractions of the second, to get a 'seed' number. For example, when a Solitaire game program does this, the deck of cards are 'shuffled' differently every time. But if you were using a program to do a scientific experiment with random numbers, and you wanted to repeat the experiment to check for errors, then you might want to start the random number generator with a fixed number, so that you could repeat some trouble that you were investigating.
Most random number generators are similar to this small one that we experimented with, but are larger, making sequences that don't repeat until after several billion numbers. For example, Borland Pascal's "Random" function uses the rule R:= 134775813 * R + 1, which generates a sequence of 4,294,967,296 values before it repeats. But the sequence is not quite as random as it should be. The values are alternately odd and even, and don't change from positive to negative (and back to positive) as often as they should. If we had a better random number generator, with a long period, and initialized by the computer's clock, it might be very hard to tell the difference between the pseudorandom sequence and a truly random sequence.
Many years ago, I noticed an article in a technical magazine about how to generate random numbers. The authors (Park and Miller) had tested many different random number generators, using some very sophisticated mathematics, and had concluded that the random number generators provided with most programming languages were not very good. They provided an algorithm for generating random numbers that they claimed was much better. (An algorithm is an exact procedure, whether written in English, Pascal, or some other programming language.) I wrote the Park/Miller algorithm in Pascal, as follows:
var S: longint; {seed, random state} const M = 2147483647; {MaxLongint} A = 16807; Q = 127773; R = 2836; function PMrandom: longint; {returns 1 <= X <= MaxLongint} var Lo, Hi, T: longint; begin Hi:= S div Q; Lo:= S mod Q; T:= A*Lo - R*Hi; if T > 0 then S:= T else S:= T+M; PMrandom:= S; end; {PMrandom}
This one is obviously more complex; and I won't try to explain how it works or why it is better. We will just accept that the experts think that it is better. The PMrandom function generates random positive longint (31-bit) values, and has a period of 2,147,483,646. The positive longint values range from 1 to 2147483647, and all values in this range are equally probable.
(Years later, I found another random number generator that was MUCH bigger, called the "Mersenne Twister". The length of its period is a 6000-digit number!)
Many programs have use for a random number generator, so I put the PMrandom function in a unit. But such programs generally need different kinds of random values. For example, if we are simulating the toss of a coin, we need a result that has two values that are equally likely. If we are simulating the toss of a die, we need a result that has six values that are equally likely. Or we might need to pick a card randomly. These all have one thing in common: the ability to choose one of N values randomly, with all N values equally likely. Another kind of randomness is to have something true with a certain probability. For example, when a target is hit, it randomly falls 80% of the time. Another kind of randomness is a value in some range that isn't necessarily an integer, such as a length between 3.5 and 6.75 feet.
So I started writing functions to generate various kinds of random values. Each of these functions used the PMrandom function, and all of these functions were put into the new unit.
To get random lengths between 3.5 and 6.75 feet, or random amounts of fuel between 2.5 and 10 gallons, and other such, we could use a random function that returns a real value in the range 0.0 to 1.0. We name this function Rrand (R for Real). To get random lengths between 3.5 and 6.75 feet, we compute "3.5 + (6.75 - 3.5) * Rrand". To get random amounts of fuel between 2.5 and 10 gallons, we compute "2.5 + (10 - 2.5) * Rrand".
How do we implement the Rrand function? To get a random real value in the range 0.0 to 1.0, we compute "Rrand:= PMrandom / 2147483647.0". Notice that we need to do a real division '/' rather than integer division 'div', to get a real result. For a real division, Pascal requires that at least one of the operands must be real. So we include ".0" after the "2147483647" to make it a real value. So, the Rrand function is simply this:
function Rrand: extended; {returns 0.0 < X < 1.0} begin Rrand:= PMrandom / 2147483647.0; end; {Rrand}
To choose one of N values randomly, with all N values equally likely, we can compute "PMrandom mod N". This will produce random values in the range 0 .. N-1, and the values in this range will be equally likely provided that N is much smaller than the largest longint (MaxLongint = 2147483647). We can satisfy this condition if N is a 'word' (a 16-bit unsigned integer). So, the Irand function (I for Integer) is:
function Irand(N: word): word; {returns 0 <= X < N} var L: longint; begin L:= PMrandom mod N; Irand:= L; end; {Irand}
We couldn't write "Irand:= PMrandom mod N" because PMrandom returns a longint, but Irand should return a word. So we pass the value through the longint variable L.
As an example, "Irand(6)" will return random values in the range 0 .. 5. If we want random values in the range 1 .. 6 (for a die), we write "Irand(6) + 1".
Next, how can we produce a random boolean value that is true with a certain probability P? If we generate a random real value R in the range 0.0 .. 1.0 by using the Rrand function, then the probability that R is less than P is P. That is, the condition "R < P" is true with probability P. A few examples will make this more understandable:
So, the Brand function for random booleans is:
function Brand(P: extended): boolean; {returns true with probability P} begin Brand:= (PMrandom / 2147483647.0 < P); end; {Brand}
We could have written "Brand:= (Rrand < P)", but since the Rrand function is so simple, the program can run faster with "Brand:= (PMrandom / 2147483647.0 < P)".
The Rrand, Irand, and Brand functions for random Reals, Integers, and Booleans are all 'uniform distributions', which means that all values are equally possible within the range of values for each of these types. Think of the probability being distributed uniformly (evenly) over the range of values. There is a non-uniform distribution that typically occurs when we are dealing with a large sum of random events -- mathematicians call it the 'normal distribution'. I included a Nrand function for this kind of random variable. For example, if you dumped a container of 1000 coins on the floor, the number of heads would be a random number that could range from 0 to 1000. But it would be far more probable for the number of heads to be between 490 and 510 than between 0 and 20 or than between 980 and 1000. In this example, the 'mean' (average) number of heads is 500. There is also a measure of how widely (or narrowly) the random values are distributed around the 'mean' value, called the 'standard deviation'. Nrand produces a normally-distributed random value with a mean of zero and a standard deviation of one. To get a normally-distributed random value with a mean of M and a standard deviation of S, compute M + S * Nrand.
I also wrote a procedure to start the random number generator with a 'seed' value. If the argument I is non-zero, then I is used as the seed value. If the argument I is zero, then the Randomize procedure of the System unit is used to get a seed value from the computer's clock. If the seed value is not in the range 1 .. 2147483647, then it is modified to put it in this range.
procedure PMrandomize(I: word); {initializes the randomizer} begin if I <> 0 then S:= I else begin System.Randomize; S:= System.RandSeed; end; S:= S and M; if S = 0 then inc(S); end; {PMrandomize}
All of these random functions, and the seeding procedure, are provided in the following unit. (You can also download the PMrand.pas file.)
unit PMrand; {Park / Miller random number generator} {see Comm. ACM Oct. 88 Vol. 31 No. 10.} {*******************************************************} { } { 2-16-95 Added Nrand function, changed real } { type to extended. } { } { 3-9-99 Made PMRandom far. } { } { 9-11-06 Fixed minor bugs. } { } {*******************************************************} interface {deterministic for I>0, non-deterministic for I=0:} procedure PMrandomize(I: word); {initializes the randomizer} {for functions PMrandom, Rrand, and Irand,} {all returned values in the range given are equally probable} {the basic routine:} function PMrandom: longint; {returns 1 <= X <= MaxLongint} far; {for real values:} function Rrand: extended; {returns 0.0 < X < 1.0} {for subrange integers:} function Irand(N: word): word; {returns 0 <= X < N} {for random booleans:} function Brand(P: extended): boolean; {returns true with probability P} { approximately normal distribution } { with mean = 0 and standard deviation = 1: } function Nrand: extended; {actual range is -6 .. 6} {*******************************************************} implementation var S: longint; {seed, random state} const M = 2147483647; {MaxLongint} A = 16807; Q = 127773; R = 2836; {*******************************************************} procedure PMrandomize(I: word); {initializes the randomizer} begin if I <> 0 then S:= I else begin System.Randomize; S:= System.RandSeed; end; S:= S and M; if S = 0 then inc(S); end; {PMrandomize} {*******************************************************} function PMrandom: longint; {returns 1 <= X <= MaxLongint} var Lo, Hi, T: longint; begin Hi:= S div Q; lo:= S mod Q; T:= A*Lo - R*Hi; if T > 0 then S:= T else S:= T+M; PMrandom:= S; end; {PMrandom} {*******************************************************} function Rrand: extended; {returns 0.0 < X < 1.0} begin Rrand:= PMrandom / 2147483647.0; end; {Rrand} {*******************************************************} function Irand(N: word): word; {returns 0 <= X < N} var L: longint; begin L:= PMrandom mod N; Irand:= L; end; {Irand} {*******************************************************} function Brand(P: extended): boolean; {returns true with probability P} begin Brand:= (PMrandom / 2147483647.0 < P); end; {Brand} {*******************************************************} {-------------------------------------------------------} { Here, we use the fact that the sum of N Rrand } { values has a mean value of N/2 and a variance } { of N/12, and is normally distributed for 'large' } { N. It is convenient to choose N = 12. } {-------------------------------------------------------} { approximately normal distribution } { with mean = 0 and standard deviation = 1: } function Nrand: extended; var I: integer; S: extended; begin S:= -6.0; for I:= 1 to 12 do begin S:= S + (PMrandom / 2147483647.0); { + Rrand} end; Nrand:= S; end; {Nrand} {*******************************************************} begin PMrandomize(0); end.
In the initialization section of the unit, "PMrandomize(0)" is called to seed the random number generator from the computer's clock. If you don't want it started this way, your program can call PMrandomize with some non-zero value.
Here is a program that demonstrates how to use the PMrand unit:
program Rand3; uses PMrand; type TCoin = array[boolean] of string[4]; const Coin: TCoin = ('head', 'tail'); NCoins = 10; NDice = 8; var Seed: longint; C: boolean; I, Tails, D1, D2: integer; begin write('Seed value = '); Readln(Seed); PMrandomize(Seed); writeln; writeln('Tossing ', NCoins, ' coins:'); Tails:= 0; for I:= 1 to NCoins do begin C:= Brand(0.5); write(Coin[C]:5); if C then inc(Tails); end; writeln; writeln(Tails, ' tails and ', NCoins-Tails, ' heads'); writeln; writeln('Tossing ', NDice, ' pairs of dice:'); for I:= 1 to NDice do begin D1:= Irand(6)+1; D2:= Irand(6)+1; write(' ', D1, '+', D2, '=', D1+D2); end; writeln; writeln; writeln('Six random angles between 180 and 270 degrees:'); for I:= 1 to 6 do begin write(180 + 90*Rrand :8:3); end; writeln; readln; end.
Notice how the structured constant named 'Coin' is used to translate the boolean values false and true to the string values 'head' and 'tail'. Notice also how we can use an expression to modify the range of a random variable by 'rescaling' the random function. For example, the expression "180 + 90*Rrand" obtains a range of 180 to 270; but Rrand by itself has a range of 0 to 1. To determine the new range from the old range, substitute the old range limits 0 and 1 into the expression, like this:
180 + 90*0 = 180 180 + 90*1 = 270
(Since Rrand returns a real value, 90*Rrand becomes a real value, even though 90 is an integer; and likewise the addition is real, even though 180 is an integer.)
A typical output of this demo program is:
Seed value = 45 Tossing 10 coins: tail head tail head head head tail head head tail 4 tails and 6 heads Tossing 8 pairs of dice: 2+5=7 6+6=12 2+5=7 4+4=8 2+4=6 4+4=8 2+5=7 5+5=10 Six random angles between 180 and 270 degrees: 192.458 218.372 244.797 229.822 266.799 206.902
If you enter a seed value of 45, you will always get this output; and if you enter a seed value of 91, you will always get some other output. But if you enter a seed value of 0, the output will always be different, because the program will actually use a seed taken from the computer's clock time.
Most programs that use the PMrand unit will not call the PMrandomize procedure, because the initialization section of PMrand calls PMrandomize to start the random sequence from the computer's clock -- and this is what most programs need.