[Beowulf] Open source prime number application

Peter St. John peter.st.john at gmail.com
Thu Aug 16 14:53:40 PDT 2007


actually I'm ambitious to build a small one myself, why I lurk here, but
haven't committed to a specific plan yet. I really ought to get started with
just a couple of heterogenous clunkers sitting around my place. Will be
edifying.
Peter


On 8/16/07, Jack C <jack at crepinc.com> wrote:
>
> Haha ok, too bad you didn't know me a year ago when i had one with nothing
> to do on it...
>
> -Jack Carrozzo
>
> On 8/16/07, Peter St. John < peter.st.john at gmail.com> wrote:
> >
> > Ssssh. I'm trying to take over Tim's cluster.
> > peter
> >
> >
> >  On 8/16/07, Jack C <jack at crepinc.com> wrote:
> > >
> > > Peter,
> > >
> > > You can always install MPI on a single host to test your code, even if
> > > you don't have a cluster. You can even run multple threads on that single
> > > host, it just won;t have the speedup that multiple hosts would.
> > >
> > > -Jack Carrozzo
> > >
> > > On 8/16/07, Peter St. John <peter.st.john at gmail.com > wrote:
> > > >
> > > > Tim,
> > > > I thought about this some. I don't have a cluster, but I can program
> > > > in C; you have a cluster, but don't want to program too much. It's possible
> > > > we can help each other.
> > > >
> > > > A toy application I thought of is finding numbers that can written
> > > > as a sum of two cubes in two different ways (there's a famous story about
> > > > Ramanujan recognizing the four-digit number of a taxicab as "the smallest
> > > > number than can be expressed as a sum of two cubes in two distinct ways").
> > > >
> > > > So I had this plan. I went to wiki and got the source of an MPI
> > > > "hello world" application, http://en.wikipedia.org/wiki/Message_Passing_Interface
> > > >
> > > >
> > > > Then I wrote a C program to take a range of numbers, say 1000 to
> > > > 1999, and find the numbers in that interval that have the taxicab quality
> > > > (it turns out that's what such numbers are called now!).
> > > > Since I don't have the MPI library or a cluster I can't test the
> > > > integrated thing, so at the moment all I have is the wiki example (which
> > > > just fills a buffer with "I am node #<whatever>" for later reporting), which
> > > > presumably is an OK example of MPI, and my own code (which just does
> > > > arithmetic) which I tested myself. So if you wanted, you could try and
> > > > integrate these, and see if you can find bigger reults by compiling for 64
> > > > bit arithmetic.
> > > >
> > > > My program wastes a huge amount of redundant calculation; it's only
> > > > virtue is to split the memory required among nodes. So it computes x^3 + y^3
> > > > for silly huge ranges of x and y, at each node, but only stores results for
> > > > comparisons in limited ranges which can be split up among nodes.
> > > >
> > > > I was disappointed at first because I only got 1729 (which was
> > > > Ramanujan's number, the smallest), and 4,104:
> > > >
> > > > C:\PeteStJohn\Experiments\Ramanujan\Debug>ramanujan
> > > > Deubg: i = 0
> > > > DEBUG: HIT:
> > > >  01729 = 12^3 + 1^3,
> > > >  01729 = 10^3 + 9^3
> > > > DEBUG: HIT:
> > > >  04104 = 16^3 + 2^3,
> > > >  04104 = 15^3 + 9^3
> > > > Deubg: i = 1000
> > > > DEBUG: i = 1588 is too big.
> > > > Findrama made 2 hits
> > > > Winner 4104:
> > > > = 16 + 2
> > > > = 15 + 9
> > > >
> > > > but I checked, putting those two numbers, 1729 and 4104, into the
> > > > Online Encyclopedia of Integer Sequences http://www.research.att.com/~njas/sequences/?q=1729%2C+4104&language=english&go=Search
> > > > , and that got me the "Taxi Cab Sequence" and the next value is
> > > > 13,832. My program is naive, using an exhaustive and redundant search, with
> > > > only 32 bit, so I get stuck when 1588 is too big becaues it's cube is bigger
> > > > than about 4 billion, the extent of a 32 bit int. Also I didn't take the
> > > > time even to accommodate the signum bit.
> > > >
> > > > I suggest you look at the wiki link. If you want to hack that code
> > > > (which could be confusing because the address of something in this node's
> > > > memory may not make sense when passed to some other node, but the MPI
> > > > library makes some accommodation) let me know and I'll send you my C
> > > > program, to call from inside that Wiki program.
> > > >
> > > > You might start by just running the wiki example, and see if it
> > > > works and what you have to do to make it work. If that seems like fun then
> > > > we can try doing math with it.
> > > >
> > > > I think I'll integrate them myself, but I wouldn't be able to test
> > > > it directly. If you think this toy project is the kind of thing you could
> > > > use let me know. Maybe some day you will run my genetic algorithm app, which
> > > > is not a toy, for me, and I won't need a cluster :-)
> > > >
> > > > Well, here. It's not that big a program so I'll just paste it in.
> > > > Serious software engineers may avert their eyes.
> > > >
> > > >
> > > > // FindRama
> > > > // Toy parallelizable math application
> > > > // Pete St.John 8/2007
> > > > //
> > > >
> > > > #include <stdio.h>
> > > > #include <stdlib.h>
> > > > #include <malloc.h>
> > > >
> > > >
> > > > #define BIGGEST 4000000000
> > > > // sloppy way to deal with 32-bit limitation
> > > >
> > > > struct pair
> > > > {
> > > >  long x;
> > > >  long y;
> > > > };
> > > > // a pair of integers; the sum of their cubes matters to us.
> > > >
> > > > struct winner
> > > > {
> > > >  long win;
> > > >  struct pair first;
> > > >  struct pair second;
> > > > };
> > > > // any pair you want to keep and report, e.g. the largest summand
> > > > found in the range.
> > > >
> > > > int findrama(int myid, long z1, long z2, struct winner *winnerp);
> > > > // findrama finds numbers which are sums of cubes in two distince
> > > > ways,
> > > > // e.g. 1729 = 12^3 + 1^3 but also = 10^3 + 9^3
> > > > // "myid" is the hypothetical node number of the MPI instance that
> > > > would invoke this
> > > > // z1 and z2 are the range of numbers to search, e.g. 1729 might be
> > > > found
> > > > // between 1000 and 19999;
> > > > // winner is the "best" result found, which in this example is the
> > > > largest.
> > > >
> > > >
> > > > int main(int argc, char **argv)
> > > > {
> > > >  // I'm not doing much in this main. We want to use the wiki MPI
> > > > example
> > > >  // main to call "findrama" with a range of numbers, perhaps based
> > > > on myid; e.g.
> > > >  // node 1 does z = 1 to 1999, node 2 does 2000 to 2999, etc, based
> > > > on the number
> > > >  // of nodes you have and the largest arithmetic you can do.
> > > >
> > > >  int myid;
> > > >
> > > >  int hits;
> > > >
> > > >  struct winner mywinner;
> > > >
> > > >  myid = 0;
> > > >
> > > >  hits = findrama(myid, 1, 10000, &mywinner);
> > > >
> > > >  printf("Findrama made %d hits\n", hits);
> > > >
> > > >  if(hits < 0)
> > > >  {
> > > >   printf("Error, probably malloc error.\n");
> > > >   return 0;
> > > >  }
> > > >
> > > >  printf("Winner %d:\n", mywinner.win);
> > > >  printf("= %d, %d\n", mywinner.first.x, mywinner.first.y);
> > > >  printf("= %d, %d\n", mywinner.second.x, mywinner.second.y);
> > > >
> > > >  return 0;
> > > > }
> > > >
> > > > int findrama(int myid, long z1, long z2, struct winner *winnerp)
> > > > {
> > > >  struct pair *p;
> > > >  long range;
> > > >  long i, j, k;
> > > >  int hits;
> > > >  long bigz, z;
> > > >  long xi, xj;
> > > >  long xsum;
> > > >
> > > >  //struct pair newpair;
> > > >  struct winner mywinner;
> > > >
> > > >  bigz = 0;
> > > >
> > > >  hits = 0;
> > > >
> > > >  range =  (z2 - z1);
> > > >  p =  malloc( range * sizeof(struct pair));
> > > >  //
> > > >  // malloc should return a pointer to a range of allocated memory
> > > >  // large enough for "range" pairs, and pair takes up space for two
> > > > long integers.
> > > >  //
> > > >
> > > >  if(!p)
> > > >  {
> > > >   fprintf(stderr, "Error: unable to malloc at myid = %d\n", myid);
> > > >   return -1;
> > > >  }
> > > >  // in the zth offset we will store the pair x, y st x^3 + y^3 = z^3
> > > >  // for z in the range from z1 <= z < z2
> > > >
> > > >  // zero out
> > > >  for(i=0; i < range; i++)
> > > >  {
> > > >   p[i].x = 0;
> > > >   p[i].y = 0;
> > > >
> > > >  }
> > > >
> > > >  //z2cube = z2 * z2 * z2;
> > > >
> > > >  for(i = 0; i < z2; i++)
> > > >  {
> > > >   xi = i*i*i;
> > > >
> > > >   if(xi > BIGGEST)
> > > >   {
> > > >    printf("DEBUG: i = %d is too big.\n", i);
> > > >    break;
> > > >   }
> > > >
> > > >   // debug
> > > >   if(i%1000 == 0)
> > > >   {
> > > >    printf("Deubg: i = %d\n", i);
> > > >   }
> > > >
> > > >   for(j = 0; j <= i; j++)
> > > >   {
> > > >    xj  = j*j*j;
> > > >    xsum = xi + xj;
> > > >
> > > >    if(xsum > z2)
> > > >    {
> > > >     // we've exceeded our range
> > > >     break;
> > > >    }
> > > >
> > > >    for(k = 0; k< range; k++)
> > > >    {
> > > >     z = z1 + k;
> > > >
> > > >     if(z == xsum)
> > > >     {
> > > >      if(p[z].x == 0 && p[z].y == 0)
> > > >      {
> > > >       // found this sum of cubes for the first time
> > > >       p[z].x = i;
> > > >       p[z].y = j;
> > > >      }
> > > >      else
> > > >      {
> > > >       // we have discovered a number that
> > > >       // can be written as a sum of two cubes
> > > >       // in two distince ways.
> > > >       hits++;
> > > >
> > > >       printf("DEBUG: HIT:\n %05d = %d^3 + %d^3,\n %05d = %d^3 +
> > > > %d^3\n",
> > > >        z, i, j, z, p[z].x, p[z].y
> > > >       );
> > > >       mywinner.win = z;
> > > >       mywinner.first.x = i;
> > > >       mywinner.first.y = j;
> > > >       mywinner.second.x = p[z].x;
> > > >       mywinner.second.y = p[z].y;
> > > >      }
> > > >     }
> > > >    }
> > > >   }
> > > >  }
> > > >  winnerp->win = mywinner.win;
> > > >  winnerp->first.x = mywinner.first.x;
> > > >  winnerp->first.y = mywinner.first.y;
> > > >  winnerp->second.x = mywinner.second.x;
> > > >  winnerp->second.y = mywinner.second.y;
> > > >
> > > >
> > > >
> > > >  return hits;
> > > > }
> > > >
> > > > Peter
> > > >
> > > >
> > > >
> > > >
> > > > On 8/16/07, Robert G. Brown <rgb at phy.duke.edu > wrote:
> > > > >
> > > > > On Tue, 14 Aug 2007, Tim Simon wrote:
> > > > >
> > > > > > I would like to know if there is any open source software that
> > > > > will
> > > > > > run on a beowulf, which will do something like find prime
> > > > > numbers, or
> > > > > > something simillar. I cant afford a commercial program.
> > > > > >
> > > > > > I am guessing that there is not "one size fits all" type thing -
> > > > > what
> > > > > > fits your beowulf wont fit mine. Is there anything that can be
> > > > > easily
> > > > > > made to fit? I have searched all over the net, and all i can
> > > > > find are
> > > > > > educational or high research type applications. I just would
> > > > > like
> > > > > > something reasonably simple, (I thought prime numbers but hey,
> > > > > > anything is good).
> > > > >
> > > > > Writing your own is good.  Hunting for primes is actually
> > > > > something that
> > > > > will parallelize decently simply because the Sieve of Erasthones
> > > > > for a
> > > > > proposed number N only requires that you try all the primes found
> > > > > up to
> > > > > N/2.  Therefore a collection of nodes can contain and share all
> > > > > the
> > > > > primes found up to N-1, and distribute the testing of N, N+2, N+4,
> > > > > N+6... N+M (obviously skipping all even numbers) as long as N+M <
> > > > > 2N.
> > > > >
> > > > > For a small number M of nodes, if you precompute and load all the
> > > > > primes
> > > > > less than M -- trivially done -- then you're in business.
> > > > >
> > > > > You will have a few longer term problems.  A single processor can
> > > > > sieve
> > > > > all the unsigned int primes from 0-(2^32-1) out fairly quickly
> > > > > anyway.
> > > > > To go beyond this, you'll need to use symbolic division instead of
> > > > > binary computer arithmetic, I think.  You'll also run into
> > > > > storage/memory problems as you start to get a significant number
> > > > > of
> > > > > primes in your running table.
> > > > >
> > > > > But it is a fun problem, for sure.  Go for it.
> > > > >
> > > > > Beyond that, there are a number of programs people use to play
> > > > > with or
> > > > > demonstrate a beowulf's speedup.  The "funnest" is any of the
> > > > > rubberbandable Mandelbrot set exploration tools parallelized on
> > > > > top of
> > > > > MPI or PVM -- makes nifty graphics.  It isn't quite as cool as it
> > > > > used
> > > > > to be because Moore's Law has overwhelmed the computational
> > > > > difficulty
> > > > > of the task.  When I started in this business, it might have taken
> > > > > minutes to compute a rubberbanding down into the MS, depending on
> > > > > where
> > > > > one was (and how deep one had to go on all those pixels to
> > > > > terminate).
> > > > > Parallelize that on sixty or so nodes and it drops DRAMATICALLY to
> > > > > seconds.
> > > > >
> > > > > Now CPUs are so fast that a SINGLE CPU can rubberband down in a
> > > > > second
> > > > > or two, and networks haven't kept pace so that computing the
> > > > > pixels in a
> > > > > single thread might actually be faster than splitting them.  So
> > > > > speedup
> > > > > is a bit less dramatic, but still very cool.
> > > > >
> > > > >    rgb
> > > > >
> > > > > > I am running Red Hat FC4.
> > > > >
> > > > > Hmmm, might at least TRY 6 or 7.
> > > > >
> > > > >
> > > > > >
> > > > > >
> > > > > > Tim
> > > > > > _______________________________________________
> > > > > > Beowulf mailing list, Beowulf at beowulf.org
> > > > > > To change your subscription (digest mode or unsubscribe) visit
> > > > > http://www.beowulf.org/mailman/listinfo/beowulf
> > > > > >
> > > > >
> > > > > --
> > > > > Robert G. Brown                         http://www.phy.duke.edu/~rgb/
> > > > >
> > > > > Duke University Dept. of Physics, Box 90305
> > > > > Durham, N.C. 27708-0305
> > > > > Phone: 1-919-660-2567  Fax: 919-660-2525
> > > > > email:rgb at phy.duke.edu
> > > > >
> > > > >
> > > > > _______________________________________________
> > > > > Beowulf mailing list, Beowulf at beowulf.org
> > > > > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
> > > > >
> > > > >
> > > >
> > > >
> > > > _______________________________________________
> > > > Beowulf mailing list, Beowulf at beowulf.org
> > > > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
> > > >
> > > >
> > > >
> > >
> >
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.beowulf.org/pipermail/beowulf/attachments/20070816/4cf6206d/attachment.html>


More information about the Beowulf mailing list