Random Numbers For Simulations

13d ago
1.72 MB
132 Pages

Random Numbers for SimulationsPseudo Random Number GeneratorsRoberto InnocenteSISSA, Triestefor the joint ICTP/SISSA MHPC.it master courseApr 4, 2017 - Revised versionRoberto InnocenteRandom Numbers for Simulations

Where it all began ?At the end of 1700 George Leclerc, Comte de Buffontried to compute π with random experiments.Roberto InnocenteRandom Numbers for Simulations

George-Louis Leclerc, Comte de Buffon, 1777 :Montecarlo method ante-litteramA needle of length 1 is thrown over a lined paper with lines every 1unit . The probability that it hits a line can be computed as : Z πsin(θ)cos(θ) πShaded Portion Area dθ 122θ 00Prob 1Shaded Portion Area Area of rectangleπ/2Roberto Innocente π Random Numbers for Simulations2Prob

Example : Buffon’s needle in R# R codebuffonp - function ( n ) {k 0;for ( i in 1: n ) {theta runif (1 , min 0 , max pi )y runif (1 , min 0 , max 1/2) ;if ( y 1/2* sin ( theta ) 1/2) k k 1 }return ( k / n )}for ( i in 1:6) {w 10 i ; bp buffonp ( w )cat ( ’ rn ’ ,w , ’ computed pi ’ ,2.0/ bp , ’ error ’ , abs ( pi -2.0/ bp) , ’\n ’)}# rn 10 computed pi 2 error 1.141593# rn 100 computed pi 3.389831 error 0.2482379# rn 1000 computed pi 3.04878 error 0.09281217# rn 10000 computed pi 3.134305 error 0.007287686# rn 1 e 05 computed pi 3.138584 error 0.003008469# rn 1 e 06 computed pi 3.144595 error 0.003002102Roberto InnocenteRandom Numbers for Simulations

RNG, TRNG, PRNG ITRNG (True Random Number Generators)noise based e.g from atmospheric noisehttps://www.random.org/free running oscillator (FRO) : simplest and cheapest waychaos basedquantum based e.g. Geiger counters, fluctuations of vacuumhttps://qrng.anu.edu.au/PRNG (Pseudo Random Number Generators) or simply RNGalgorithmicWe want to be able to produce RN fast, in a cheap way, we needrepeatability : we need algorithmic RNG !!.There is a subclass of PRNG that will not be covered here and it isthe class of cryptographically robust PRNG. These require thespecial quality of unpredictability that is computationally expensiveand is never shown by the efficient PRNG we need for simulations.Roberto InnocenteRandom Numbers for Simulations

Divide and conquerGeneration of Random Numbers is usually splitted into :1Generation of uniformly distributed integers xi in[0 . . . (m 1)]2Mapping of integers in [0 . . . (m 1)] to uniformly distributedxireals in U(0, 1) using ui m. In many cases the 1st step isallowed to produce 0 while usually we want the 2nd step notxi 1to produce it. Therefore often is used ui m 1.3Mapping of uniformly distributed reals in U (0, 1) to thewanted CDF (Cumulative Distribution Function) F (x) usingmost of the time its inverse F 1 (x)This is the reason why we will center our discussion about uniformrandom numbers on (0, 1): U (0, 1).Roberto InnocenteRandom Numbers for Simulations

How to map a uniform variate or deviate U (0, 1) in adifferently distributed variate? IWe say that a sequence of numbers is a sample from a cumulativedistribution function F , if they are a realization of a rv with CDFF . If F is the uniform distribution over (0, 1) then we call thesamples from F uniform deviates/variates and we write U (0, 1).Theorem : inversionSuppose U U (0, 1) and F to be a continuos strictly increasingcumulative distribution function (CDF). Then F 1 (U) is a samplefrom F .Transform to a Normal deviate :Box-Muller transform is more efficient:123generate U1 U (0, 1) and U2 U (0, 1) θ 2πU2 , ρ 2 log U1Z1 ρ cos θ is a normal variateRoberto InnocenteRandom Numbers for Simulations

Example in R# exponential distribution : R codepdf ( file ’ exp . pdf ’)lambda 1; x seq (0 ,5 ,0.05)y exp ( - lambda * x ) ; z 1 - exp ( - lambda * x )plot (x ,y , type ’n ’)title ( ’ Exponential distr : l * exp ( - l * x) ’)lines (x ,y , col ’ red ’) ; lines (x ,z , col ’ green ’)text (3 ,0.5 , ’ RN obtained from uniformdeviate using inversion ’)text (2 ,0.8 , ’ Exponential CDF ’ , col ’green ’)text (2 ,0.1 , ’ Exponential pdf ’ , col ’red ’)0.81.0Exponential distr: l*exp( l*x)0.6Exponential CDF0.20.4yRN obtained from uniform deviate using inversion0.0Exponential pdf0123x45invcdf - function ( yy ) { lambda 1;xx - log (1 - yy ) / lambda ;return ( xx ) ; }w runif (1000) ; ic invcdf ( w )lines ( ecdf ( ic ) , xlim c (0 ,5) , ylim c(0 ,1) )Roberto InnocenteRandom Numbers for Simulations

Qualities of Good RNGGood Theoretical BasisLong Period”pass” Empirical TestsEfficientRepeatablePortableRoberto InnocenteRandom Numbers for Simulations

Theoretical frameworkTheoretical Framework for Random Number Generators :SSet of statesinitial states0 initial statecurrent stateg:S-- (0,1)outputfunctionnext statef:S-- Stransitionfunction(0,1)Sset of all states/seeds (e.g. 1 integer - 2ˆ32 states)f:S-- Stransition function that moves the rng to the next stateg:S-- (0,1) output function that from a state outputs a number in the (0,1) intervalAn upper bound on the period of the generator is the cardinality of S : S Roberto InnocenteRandom Numbers for Simulations

Pre-period, PeriodThere can be confusion about these terms, they refer to the generalcase in which a generator can cycle skipping some initial outcomes.Roberto InnocenteRandom Numbers for Simulations

History of field based on scholars leadersVon Neumann was maybe the first to devise an algorithm : themiddlesquare method. A few leaders in the field during more than75 years of electronic computing were in succession :Donald KnuthGeorge MarsagliaPierre L'EcuyerRoberto InnocenteRandom Numbers for Simulations

Donald Knuthborn 1938, PhD at Caltech, worked at Stanford,now Professor Emeritus at StanfordWriter of the first bible of algorithms. 3 and now4 books nick-named TAOCP: The art ofcomputer programming.Creator of TEX and METAFONT and theComputer Modern family of fonts. Writer of the5 books about them : Computer andtypesetting: A,B,C,D,EVolume 2 of The Art Of ComputerProgramming : Seminumerical Algorithms(1998)dedicates the 189 pages of chapter 3 to RandomNumbers. In it there is also the description of abattery of rng tests.Roberto InnocenteRandom Numbers for Simulations

Knuth reports his attempt at a Super-random ng.Sometimes complexity hides simple behaviourThe first time Knuth ran this program, it converged quickly to6065038420 (a fixed point of the algorithm). After this time it wasmainly converging to a cycle of length 3178 !Roberto InnocenteRandom Numbers for Simulations

Linear Congruential Generators or LCG ILCG notationxn 1 (a xn c)(mod m)will be indicated by LCG (m, a, c, x0 ). m is called the modulus, athe multiplier , c the increment, x0 the starting value or seed. Weuse c like Knuth does and we set b a 1 for convenience.Introduced by Lehmer in 1949. Sometimes when c 0 they arecalled Multiplicative LCG or MLCG and denoted byMLCG (m, a). When c 6 0 Mixed Linear Congruential.Lehmer generator is u0 6 0, un 1 (23 un ) (mod 108 1)ANSICLCG (231 , 1103515245, 12345, 12345)Super-duper LCG (232 , 69069, 0, 1)MINSTD LCG (231 1 , 75 , 0, 1)NAG LCG (259 , 1313 , 0, 232 1)RANDU LCG (231 , 216 , 0, 1)DRAND48LCG (248 , 25214903917, 11, 0)APPLE LCG (235 , 513 , 0, 1)Roberto InnocenteRandom Numbers for Simulations

Linear Congruential Generators or LCG IIc 0 takes less time to compute, but cuts down the period ofthe sequence that anyway can still be longit can be proved thatxn k (ak xn (ak 1)c/b)(mod m)that expresses the n k term in terms of the n term. Inparticular respect to x0 .xk (ak x0 (ak 1)c/b)(mod m)That is: the subsequence consisting of every k th term is alsoan LC sequence.Choice of m : should be large because it’s a limit for theperiod ρ of the rng, should make it simple to compute(a xn c) (mod m),Roberto InnocenteRandom Numbers for Simulations

Linear Congruential Generators or LCG IIIMLCG (m, a) :If m is prime and a is a primitive root of m and x0 6 0 then thesequences {xn } are periodic with period length ρ m 1 and thegenerator is called a full period MLCG.If m 2w then the maximal period is ρ 2w 2 m/4 and isattained in particular when a 5 (mod 8).[14] We want a large m to make the grid of RN finer. But we needto keep m not larger than a computer word so that we can dooperations efficiently. Therefore we choose m 232 for 32-bitprocessors or m 264 for 64-bit processor. The theory is nicer if mis prime or is a power of 2 like 2w . Common choices:32-bit procPrime2km 231 1 m 232Roberto Innocente64-bit procPrime2km 248 59 m 264m 263 25m 264 59Random Numbers for Simulations

Linear Congruential Generators or LCG IVLCG (m, a, c, x0 ) :An LCG has full period m if and only if :1The GCD(Greatest Common Divisor) of m and c is 1.2if q is a prime that divides m then q divides (a 1).3if 4 divides m, then 4 divides (a 1)(Hull-Dobell Theorem)A lot of work has been done on these generators especially to givemultipliers that provide as little as possible of Marsaglia’s effect.You can’t use them without reading [14] that for common wordsizes computes the highest prime smaller than the largest integerand gives good multipliers , e.g.:Roberto InnocenteRandom Numbers for Simulations

Figure: from l'Ecuyer1988, MLCG :m 2e , c 0Roberto InnocenteRandom Numbers for Simulations

Figure: from l'Ecuyer1988, LCG : m 2e ,coddRoberto InnocenteRandom Numbers for Simulations

Figure: l'Ecuyer1988, LCG : m primeRoberto InnocenteRandom Numbers for Simulations

LCG : low order bits are less randomFigure: Lowest order bit 256x256 b0 , 256x256 third bit b2 , 256x256successive bitsRoberto InnocenteRandom Numbers for Simulations

George Marsagliaborn 1924, † 2011, PhD Ohio State, thenUniversity of Florida, University of Washingtondiscovered what is called Marsaglia’s effect. Thesuccessive n-tuples generated by LinearCongruential Generators (LCG) lie on a smallnumber of equally spaced hyperplanes inn-dimensional space.developed the diehard statistical tests for rng,1996developed many of the well known methods forgenerating rn : multiply-with-carry, subtractwith borrow, xorshift , KISS93, KISS99, . . .Roberto InnocenteRandom Numbers for Simulations

Fibonacci, Lagged Fibonacci(Marsaglia 1983) IProbably you know the Fibonacci's sequence : an attempt made byLeonardo Fibonacci (aka il Pisano, born in Pisa 1175, † 1245)to model the growth of a population of rabbits xn xn 1 xn 2 .Why not to generate rn based on a longer previous history ?FibonacciXn Xn 1 Xn 2(mod m) , denoted F (1, 2, )has poor distribution qualities.Lagged FibonacciXn Xn k Xn l(mod m) , denoted F (k, l, ) is a generic operator from , , , . is the XOR binaryoperator. k , l are called lags . Lags larger then 16 produce goodrng. k 24, l 55 was studied extensively, like 30, 127. TheyRoberto InnocenteRandom Numbers for Simulations

Fibonacci, Lagged Fibonacci(Marsaglia 1983) IIwere used extensively, but in the ' 90 it was discovered that theyfail a famous test of randomness (but a workaround exists). Theone proposed by Marsaglia is xn xn 5 xn 17 (mod 2k ). Periodof this lagged Fibonacci is 2k (217 1), quite longer than LCGs.State is an array of 17 integers.Roberto InnocenteRandom Numbers for Simulations

Marsaglia’s theoremRoberto InnocenteRandom Numbers for Simulations

Lattice structure of LCG generatorsMarsaglia's effect. Successive t-uples obtained from an LCGgenerator fall on, at most, (t!m)1/t parallel hyperplanes, where mis the modulus used in the LCG(marsaglia1968) : 70000 60000 50000 40000 30000 20000 10000 000100002000030000400005000060000y[seq(2, n (n%%3), 3)]y[seq(3, n (n%%3), 3)]10000 20000 30000