Not-So-Random Numbers

By Robert Edwards

Recently Dr.William A.Huber described the characteristics of the ArcView random number generator in his excellent web page ArcView Random Numbers.In his conclusions, Huber urges ArcView users to consider alternatives to a naive use of the ArcView MakeRandom function.

The MapInfo RND random number function is also flawed, and users should likewise be aware of its characteristics before using it in their applications.To see the pecular characteristics of the MapInfo random number generator, consider the following example program.

Define true 1
Define false 0

Dim i,j as integer
Dim D(30000), E(30000) as float
Dim identical as logical
print chr$(12)
set window message position (1,1 ) Height 4

for j = 1 to 3
print "Cycle #:"+Str$(j)
' Generate 30,000 random numbers first in the D array
print "Generating the 30,000 random"
print " numbers in the D array"
for i = 1 to 30000
D(i) = rnd(1)
next

' Now generate 30,000 random numbers in the E array
print "Generating the 30,000 random"
print " numbers in the E array"
for i = 1 to 30000
E(i) = rnd(1)
next

' Now compare the random numbers in the two arrays
print "Start comparing the D and E arrays"
identical = true
for i = 1 to 30000
if D(i) <> E(i) then
identical = false
print "Different at i = "+Str$(i)
end if
next

if identical then
print "D and E arrays are identical"
end if

next

This program first generates 30,000 random numbers in the "D" array, then 30,000 in the "E" array, and finally compares the values in the two arrays.And this process is repeated three times (called cycles). The results are very surprising:

Cycle #1
Generating the 30,000 random
numbers in the D array
Generating the 30,000 random
numbers in the E array
Start comparing the D and E arrays
Different at i = 1
Different at i = 2

Cycle #2
Generating the 30,000 random
numbers in the D array
Generating the 30,000 random
numbers in the E array
Start comparing the D and E arrays
D and E arrays are identical

Cycle #3
Generating the 30,000 random
numbers in the D array
Generating the 30,000 random
numbers in the E array
Start comparing the D and E arrays
D and E arrays are identical

The number of elements that are different at the beginning of the first cycle of the example is variable and can range from none different, to as many as eight different.If you just rerun the example without closing the MapInfo application, the D and E arrays are always identical for all three cycles.

Now lets see what happens if we insert a "Randomize" statement at the beginning of a cycle.The result is the same as if one started MapInfo anew at the beginning of each cycle: as many as eight of the first numbers are different at the beginning of each cycle, the rest are identical.

So from these experiments we can conclude that the MapInfo random number function has a cycle of 30,000 numbers, and there appears to be a "bug" in the randomization that causes differences in only the first n numbers (n<=8) generated in an instance of the MapInfo application. This short cycle causes a "granularity" in the random numbers, as can be seen in the following display of 5,000 points with random (x,y) coordinates:

5,000 Points with random coordinates defined by MapInfo's RND(1)
Inset from 5,000 points with random coordinates defined by MapInfo's RND(1)
Click to enlarge and see all points

So what is a better alternative to the MapInfo RND function? The current state-of-the-art in random number generation appears to lie in the algorithms described as "multiple recursive generator" routines, or MRGs.These algorithms are similar to the conventional recursive generators that most random number generators use, except that several random number generations are produced in parallel, and the returned number is a composite of all the several chains.For example, one algorithm that has received a lot of attention has the form:

xn = (a1xn-1 + a2xn-2 + a3xn-3 ) mod m

where xn is the random number of cycle n, and a1, a2, a3 and m are generating parameters.

The parameters for an algorithm suggested by Grube in his PhD thesis gives parameters for generating random numbers between zero and the maximum four byte integer (231-1).The algorithm requires 64-bit integer arithmetic, which is easily implemented in "C" or Delphi.One such example is the random number function in the MapTools SDK (which is written in Delphi) which produced the following display of 5,000 points with random coordinates:

5,000 Points with random coordinates defined by MapTools SDK GrubeMRG function
5,000 Points with random coordinates defined by the MapTools SDK GrubeMRG function
Click to enlarge and see all points

So the bottom line is that you should be cautious in using the MapInfo random number function.The safest recommendation is to rely on one of the more modern algorithms that has received the scrunity of theoretical and empirical testing, such as the modern MRG algorithms.

Reference: PLAB - A Server on the Theory and Practice of Random Number Generation (University of Salzburg)

Published Tuesday, January 11th, 2000

Written by Robert Edwards



If you liked this article subscribe to our newsletter...stay informed on the latest geospatial technology

© 2016 Directions Media. All Rights Reserved.