Beginning with beowulf
smuelas at mecanica.upm.es
Fri Feb 28 00:50:35 PST 2003
Just to give you all a big amount of thanks. Now I'm beginning to understand what this world is all about. I haven't answered R.G. Brown before because I downloaded your book, Robert, and couldn't take my eyes off it :-)
It is a nice piece of literature, in the same easy, pleasant and educational style that I try for in my own writing. Congratulations!!
Thanks also to M. Hahn and Rafael (I'm afraid, Rafael, that you have taken a very dangerous decision answering my mail. You are too near to me .... :-)
Once more, thanks to all
On Thu, 27 Feb 2003 06:14:28 -0500 (EST)
"Robert G. Brown" <rgb at phy.duke.edu> wrote:
> On Thu, 27 Feb 2003, Santiago Muelas wrote:
> > Hello,
> > I am a newbie to this list and to the beowulf world.
> > As it seems to me that the level of this list is quite high, perhaps I
> > should find another for "beginners". If so, any suggestion on the
> > possible list will be very welcome.
> Naaa, this is the right one, and your questions are pretty focused.
> Don't worry about it.
> > My question is a simple (and I guess a sempiternal) one. I have just
> > run my first parallel program in a "beowulf". I have used up to five
> > processors in an intranet in this School of Engineering. Everything ran
> > o.k. with just one "small" inconvenience: the best result was obtained
> > using three processors, but it was almost the same as with two. Using
> > all five was a total disaster.
> > The reason seems clear. One computer is a dual one. The network is a
> > totally standard ethernet 100M.
> Don't be TOO hasty to make conclusions. You could be right, but you
> first need to fully understand why this is normal and expected behavior
> for all parallel code, with the major difference being one of scale.
> The program spends some time computing and some time communicating, and
> at a FIXED scale, especially a small one, one often finds that
> communication times scale up to overwhelm computational advantages as a
> problem is subdivided. This is basically Amdahl's Law and its more
> quantitative generalizations.
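[Editor's note: the scaling argument above can be sketched numerically. The model and all constants below are illustrative assumptions, not numbers from the post: fixed total compute time, plus a per-node communication overhead that grows with the node count.]

```python
# A minimal sketch of the Amdahl-style argument above: for a FIXED
# problem size, per-node compute time shrinks as 1/P while
# communication overhead grows with P, so speedup peaks and then
# falls off. All constants are illustrative assumptions.

def speedup(P, t_compute=1.0, t_comm_per_node=0.02):
    """Parallel speedup on P nodes under a toy Amdahl-like model.

    t_compute:        total serial compute time (arbitrary units)
    t_comm_per_node:  communication overhead added per extra node
    """
    t_parallel = t_compute / P + t_comm_per_node * (P - 1)
    return t_compute / t_parallel

if __name__ == "__main__":
    for P in (1, 2, 3, 5, 10):
        print(P, round(speedup(P), 2))
```

With these (made-up) constants the speedup peaks at a handful of nodes and then degrades, which matches the "best with three, disaster with five" experience described above.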
> There is a whole chapter on this very question that derives at least
> simple semi-quantitative scaling forms in:
> (which is the latest updated version -- I've actually been working on it
> some once again).
> You will very likely find that just making your code BIGGER will make it
> scale well to ten nodes, so computation dominates communication. This
> is also explained in Sterling, Becker, et al.'s book. There are some
> "talks" on the brahma site that go through parallelizing an actual
> application and show how your experience is a perfectly normal one on
> the first pass through.
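[Editor's note: one common way to see why "making the code BIGGER" helps is domain decomposition, where per-node work scales with a subdomain's volume but communication scales with its surface. The grid model below is a hypothetical illustration, not taken from the post.]

```python
# Toy model: an n x n x n grid split into P slabs. Work per node
# scales with the subdomain volume; communication per node scales
# with the two exchanged faces. The compute/comm ratio therefore
# improves as the problem grows. All units are arbitrary.

def compute_to_comm_ratio(n, P):
    """Ratio of compute work to boundary-exchange traffic per node."""
    work_per_node = n**3 / P      # interior cells updated per node
    comm_per_node = 2 * n**2      # two n x n faces exchanged
    return work_per_node / comm_per_node

if __name__ == "__main__":
    for n in (32, 64, 128):
        print(n, compute_to_comm_ratio(n, P=10))
```

Doubling the grid edge doubles the ratio in this model, which is the sense in which a bigger problem "scales well to ten nodes".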
> Finally, although a full analysis of your problem may well indicate that
> you need gig ethernet or myrinet to get good scaling (if it is fine
> grained and intrinsically has a lot of communication and is synchronous
> and all that) you might ALSO look at a book or two on parallel
> programming, as efficient parallel programming is not always intuitive
> to serial programmers, even experienced ones. Parallel algorithms are
> often "different", and naive parallelizations of common tasks may be
> very inefficient. There is at least one online book on parallel
> programming linked (IIRC) to brahma and referenced in my online book.
> With a very limited number of nodes to scale to, there is no point in
> getting faster communications unless you really need it, and you won't
> know that until you try scaling up your application, studying your
> parallel algorithm, and studying parallel programming in general.
> > Now the question:
> > What would be your advice about network cards (giga-ethernet seems
> > clear but would they be fast enough?), and switches. My plan is not to
> > use more than ten computers in this first period of approach to beowulf.
> We can't answer that or even help you answer it without much more
> information. You probably can't answer it yourself without learning a
> bit more. Study your code, work out its computation to communications
> times (possibly with a profiler), see how the ratio changes when the
> code is scaled up, see especially if it is latency dominated. If
> latency is killing you (lots of small packets) you may not get as much
> of a boost moving to gigE as you might think. Latency dominated
> communications problems require high end networks to resolve, e.g.
> Myrinet, as there can be an order of magnitude difference between
> ethernet latencies and myrinet latencies (with other networks scattered
> out across the field as well -- this isn't intended to disrespect any or
> endorse any, and latency times can vary with implementation even within
> a single paradigm).
> If you look back at the list archives, there is an ongoing and lively
> discussion on comparative virtues of the various networks, so getting a
> definitive, comprehensive answer on which is "best" (has best
> cost-benefit performance, meets your needs) for you will be VERY
> difficult and will definitely require your active participation in
> analyzing the details of your communications pattern.
> Robert G. Brown http://www.phy.duke.edu/~rgb/
> Duke University Dept. of Physics, Box 90305
> Durham, N.C. 27708-0305
> Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
Professor of Strength of Materials and Structural Analysis
ETSI de Caminos, Canales y Puertos (U.P.M)
smuelas at mecanica.upm.es http://w3.mecanica.upm.es/~smuelas