Random Number Generation. Was Re: Beowulf Questions

Florent Calvayrac Florent.Calvayrac at univ-lemans.fr
Mon Jan 6 00:28:25 PST 2003

Randall Jouett wrote:
> Hi Donald (folks),
> On Sat, 2003-01-04 at 11:29, Donald Becker wrote:
>>On Sat, 4 Jan 2003, Florent.Calvayrac wrote:
>>>>This is trivial.  Once you get a good a serial pseudo-random number
>>>>generator, you can use a cluster to generate more.
>>>Sorry to disagree here, but you forget that you can not be sure
>>>that the initial seed on one of the machines will not
>>>be generated by another one...so you can end up with
>>>completely correlated series. Check "scalable parallel number
>>>generator" (sprng), working well for us, using different
>>>recurrence formulas on different machines.
>>I was assuming a sophisticated RNG.  With such, the likelyhood of
>>identical seeds is very low, exactly the same as correlation within the
>>number stream.  Anyone that needs a cluster to generate random numbers
>>will be far beyond using a LFSR with a small seed.

well, theoretically at least, I am right. I agree that the probability
of identical seeds is the same than the one of correlation (this is 
trivial), but this probability increases with the number of nodes and
the length of the run. Robert G Brown pointed that out as well.

>>I'll even put forth a hand-waving argument that multiple machines will
>>be working from a much richer entropy pool, and thus generate better
>>quality numbers.

is this a joke ? For sure also, the more expensive the computer, the
better the results are ! (especially with blinkenlights)

> Yep, and let's not forget that a semi-decent sized, hardware-encryption
> based cluster could be set aside to generate the initial seeds, and then
> the seeds could propagate over the entire network,
> To take things even further, another cluster could be used that would
> try guess the random sequences being generated (pattern matching), and
> if it found something, it could report back to the entropy cluster and
> tell it to change things up and get with the program :
> Taking this method a step further still, MPI latency might even able to
> be used (delta time between the compute nodes and the head) 

this is certainly much simpler to implement than downloading sprng,
(on http://sprng.cs.fsu.edu/ )
or just implementing a node-number dependent recurrence formula on each 
machine, as RGB is also doing. Measuring an MPI latency time
(in order of several thousands cycles), communicating back and forth,
and running the same calculation on another cluster to predict the 
results as you suggest is also certainly faster and more elegant to 
achieve independence of the streams.



Florent Calvayrac                          | Tel : 02 43 83 26 26
Laboratoire de Physique de l'Etat Condense | Fax : 02 43 83 35 18
UMR-CNRS 6087         | http://www.univ-lemans.fr/~fcalvay
Universite du Maine-Faculte des Sciences   |
72085 Le Mans Cedex 9

More information about the Beowulf mailing list