Beowulf Questions
Donald Becker becker at scyld.com
Fri Jan 3 23:05:13 PST 2003
On Fri, 3 Jan 2003, Randall Jouett wrote:
> Donald Becker wrote:
> > It transparently uses all available cluster nodes, and works even if
> > that number is '0'.
>
> I like this.

Our cluster philosophy is that the end user should not be required to do
anything new or special to run a cluster application. That means

  Applications should work even if there is only a single machine in the
  cluster. Many beginner MPI applications don't handle this case correctly.

  Cluster applications should not require a helper program such as
  'mpirun' or 'mpiexec'.

  The application code should interact with the scheduler to set any
  special scheduling requirements or suggestions.

A sophisticated user should still be able to optimize and do clever things,
but the basic operation shouldn't require any new knowledge.

> > It does all of the serial setup and run-time I/O on the front end
> > machine (technically, the MPI rank 0 node). This minimizes
> > overall work and keeps the POV-Ray call-out semantics unchanged
> > It does the rendering only on compute nodes (except for the N=0 case).
> > It completes the rendering even with crashed or slow nodes.
>
> Ah. So it redistributes the work, huh? Kewl.

Here we use knowledge about the application semantics to implement failure
tolerance. When we have idle workers and the rendering isn't finished, we
send some of the remaining work to the idle machine. If a machine fails we
still finish the rendering and do the final call-outs, but don't cleanly
terminate.

-- 
Donald Becker                   becker at scyld.com
Scyld Computing Corporation     http://www.scyld.com
410 Severn Ave. Suite 210       Scyld Beowulf cluster system
Annapolis MD 21403              410-990-9993
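
The two behaviors described in the message (working with zero compute nodes, and handing unfinished work to idle machines) can be sketched roughly as follows. This is a plain-Python simulation with no MPI, and it is not the actual Scyld or POV-Ray code. The names `render_tile`, `run`, and `failed` are hypothetical, and a "failed" worker is simply modeled as one that accepts a tile and never reports back.

```python
from collections import deque


def render_tile(tile):
    """Stand-in for real rendering work (hypothetical)."""
    return tile * tile


def run(tiles, workers, failed=frozenset()):
    """Simulate the front end (rank 0) handing unique tiles to workers.

    workers: list of worker names; may be empty (the N=0 case).
    failed:  workers that accept a tile but never return a result.
    """
    results = {}
    if not workers:
        # N=0: no compute nodes, so the front end renders everything itself.
        for tile in tiles:
            results[tile] = render_tile(tile)
        return results

    pending = deque(tiles)   # tiles not yet handed out
    outstanding = {}         # worker -> tile it is currently holding
    idle = deque(workers)

    while len(results) < len(tiles):
        # Give every idle worker something to do.
        while idle:
            if pending:
                tile = pending.popleft()
            else:
                # No fresh work left: reissue a tile that is still unfinished,
                # for example one held by a crashed or slow node.
                unfinished = [t for t in outstanding.values() if t not in results]
                if not unfinished:
                    break
                tile = unfinished[0]
            outstanding[idle.popleft()] = tile

        # Collect results from every worker that is still alive.
        progressed = False
        for worker, tile in list(outstanding.items()):
            if worker in failed:
                continue  # never reports back; its tile stays outstanding
            results.setdefault(tile, render_tile(tile))
            del outstanding[worker]
            idle.append(worker)
            progressed = True
        if not progressed:
            raise RuntimeError("no live workers left")
    return results
```

Calling `run([0, 1, 2, 3], [])` renders everything on the front end, while `run(list(range(6)), ["a", "b", "c"], failed={"b"})` still returns all six tiles because the tile stuck on "b" is reissued to an idle worker. Reissuing can duplicate work, which is acceptable here only because rendering a tile twice gives the same result; that is the kind of application-specific knowledge the message refers to.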
