[Beowulf] OT: public random numbers?
Vincent Diepeveen
diep at xs4all.nl
Fri Aug 26 09:53:14 PDT 2011
On Aug 26, 2011, at 2:57 PM, Robert G. Brown wrote:
> On Fri, 26 Aug 2011, Vincent Diepeveen wrote:
>
>> EVERY PROGRAMMER IS DOING THIS TO USE RANDOM NUMBERS IN THEIR
>> PROGRAM.
>
> Bullshit. "Every programmer" isn't dumb as a post. Or wasn't my
> argument clear enough? Do you need me to actually post the code
> for how
> the GSL -- written by at least some of these programmers -- do this?
> Here, I'll try again. This time I'll use smaller numbers and make an
> actual table of the outcomes: Imagine only two lousy random bits,
> enough to make 00, 01, 10, 11 (or 0,1,2,3). Here is the probability
> table:
>
> r = 0 1 2 3
> ------------------------
> p = 0.25 0.25 0.25 0.25
>
> Let us generate N samples from this distribution. Our expected
> frequency of the occurrence of all of these numbers is:
>
> r = 0 1 2 3
> ---------------------------------
> Np = 0.25*N 0.25*N 0.25*N 0.25*N
>
> Is this all clear? If I generate 100 random numbers, the expected
> number of 3's is 0.25*100 = 25. Now apply mod 3 the outcomes are
> now:
>
> r = 0 1 2 3
> r%3 = 0 1 2 0
> ---------------------------------
> Np = 0.25*N 0.25*N 0.25*N 0.25*N
>
> You now sum the number of outcomes for each element in the mod 3
> table,
> since we have two values of r that make one value of r%3 and frequency
> clearly aggregates as the outcomes are independent.
>
> r%3 = 0 1 2
> ---------------------------------
> Np = 0.50*N 0.25*N 0.25*N 0.25*N
>
> It is therefore twice as likely that two random bits, modulus 3, will
> produce a zero.
>
If you have a domain of 0..3 where a generator generates and your
modulo n is
just n-1, obviously that means it'll map a tad more to 0.
Basically the deviation one would be able to measure in such case is
that if
we have a generator that runs over a field of say size m and we want
to map
that onto n entries then we have the next formula :
m = x * n + y;
Now your theory is basically if i summarize it that in such case the
entries
0..y-1 will have a tad higher hit than y.. m-1.
However if x is large enough that shouldn't be a big problem.
If we map now in the test i'm doing onto say a few million to a
billion entries,
the size of that x is a number of 40+ bits for most RNG's.
So that means that the deviation of the effect you show above the
order of
magnitued of 1 / 2^40 in such case, which is rather small.
Especially because the 'test' if you want to call it like that, is
operating in the
granularity O ( log n ), we can fully ignore then the expected
deviation granularity O ( 2 ^ 40 ).
>>
>> Apologies for the caps. I hope how important this is. You're
>> claiming all programmers
>> use random numbers in a faulty manner?
>
> They don't. Only you do. Everybody else takes a uniform deviate and
> scales it by the number of desired integer outcomes, checking to make
> sure that they don't go out of bounds and thereby e.g. get an
> incorrect
> endpoint frequency. The gsl code is open source and it takes two
> minutes to download it and check (I just timed it). Go on, look. the
> file is rng/rng.c in the gsl distro directory, the function name is
> gsl_rng_uniform_int. No modulus.
>
> The exception is (obviously) when the range is a power of 2. In that
> case ONLY, r%n where r is a binary uint and n is a power of 2 will
> (obviously) equally balance the table above. Personally I'd use >>
> and
> shift the bits because it is faster than mod, but suit yourself, after
> you've learned what you are doing.
>
>>
>> This is important enough to further discuss about it.
>>
>> As nearly always you need random numbers from within a given
>> domain say 0.. n-1
>> So projecting a RNG onto that domain is pretty crucial. How would
>> you want to do that in a correct manner?
>>
>> In the slot test in fact a simple AND is enough.
>
> No, as I've just proven algebraically. The correct manner for
> general n
> is the gsl code, but in rough terms it is n*r/r_max (with care used to
> avoid roundoff errors at the ends as noted). If you've been using
> modulus, all your results are crap.
>
> Look, the reason God invented the GSL and made it open source is so
> numb-nuts and smart people alike wouldn't have to constantly reinvent
> the wheel, badly. Use it. Don't question it -- you obviously aren't
> competent to. Just use it. If you want a random integer from 0 to n,
> use gsl_rn_uniform_int. If you want this for e.g. mt19937 don't write
> the latter, set up the gsl to use it to generate your ints. Learn to
> use it carefully, use it correctly, but use it.
>
>> It's not interesting to discuss - but yes this strategy makes
>> money in casino's,
>> you just get thrown out of the casino and end up at the blacklist
>> if you do.
>
> You are clearly too stupid to be allowed out of the house without a
> caretaker. I'm not going to walk you through the proof that this
> isn't
> so as it is openly published and I've already referenced a step my
> step
> analysis that you can't be bothered, apparently, to actually read.
> I'll
> just reiterate the previous offer -- I, too, am happy to buy a
> roulette
> wheel and you can come over and bet Martingale against me all day.
> Just
> one 0, no limits and no quitting, infinite credit on both sides, we
> play
> until it is obvious to you that you are losing, have lost, will always
> lose, and the longer you play the more that you will lose. Loser buys
> the winner a case of truly excellent beer.
>
> Look, why don't you fix your random number code and try again, since
> your simulations are obviously trash. It isn't difficult to show this
> with simulations, once you actually code them correctly, but I have to
> go and don't have time to do it for you.
>
> rgb
>
> Robert G. Brown http://www.phy.duke.edu/~rgb/
> Duke University Dept. of Physics, Box 90305
> Durham, N.C. 27708-0305
> Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu
>
>
More information about the Beowulf
mailing list