FW: MPI cases and PVM
Robert G. Brown rgb at phy.duke.edu
Sat Mar 16 11:02:19 PST 2002
- Previous message: FW: MPI cases and PVM
- Next message: Miniature Beowulf
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On Thu, 14 Mar 2002, Anand Singh Bisen wrote:

> Hello
>
> I dont want to start off with a flame i just want to ask that if i have
> a homogenous cluster i.e. all the Nodes are exactly same with a high
> speed interconnect then which API should i use PVM or MPI. I know there
> might be some areas where PVM must be better and some where MPI so i
> just wanted to know in which cases PVM should be used and where all MPI.
> I cant discuss my computational problem because i am not supposed to.
>
> Anand Singh Bisen (abisen at iupui.edu)
> Graduate Student @ Purdue School of Science (CS)

Either one or both. There used to be a lovely white paper on PVM vs MPI here:

  http://www.epm.ornl.gov/pvm/PVMvsMPI.ps

that compares their features and talks about when one might be preferred over the other. Frankly I don't think it much matters -- most people I know use one or the other depending as much on their personal history in parallel computing as on a rational decision process.

A VERY brief recapitulation of the history and features that are apropos to a rational decision is something like:

PVM: Developed at ORNL to facilitate parallel supercomputing on commodity workstations. I'd even say that the development of PVM was "the" critical enabling technology for beowulfery, and there were lots of people, myself included, who did massive parallel computations on large clusters of e.g. Sun or Digital workstations using PVM back before Linux. Note that even using a bunch of relatively expensive Suns or SGIs or DECstations as nodes still beat the hell out of buying a Cray or CM5, especially if a lot of the Suns were already in place on desktops.

PVM was designed from the beginning to run with the (TCP and UDP IP) network as "the" IPC channel in a heterogeneous environment. It uses a nifty "AI" expert system to determine the architecture of the hosts it is installed on and creates a tree of binary directories so that a task can be compiled on several architectures and run across them.
It doesn't do automated load balancing, but it isn't horribly difficult to balance load across systems with different speeds either, depending on the type of task. The PVM library basically passes packed messages and provides various routines and tools to manage e.g. task spawning and parallel process control.

MPI: Back when people were spending large amounts of government money for massive parallel supercomputers, vendors generally provided their own proprietary API to use the parallel features of their big systems. That meant that you first bought a supercomputer, then you spent months to years porting your application to use its proprietary language interface, then it became obsolete (often before you finished porting:-) and then you bought another and started over. Sometimes you even made it into production for a while in between;-).

After a few multimillion dollar passes through this cycle, the government finally decided that it had had enough and told the supercomputer manufacturers that either they came up with a portable API or no more government funded supercomputers would be purchased. Faced with that (and the fact that damn near nobody BUT the government could afford them) a consortium was formed that wrote the MPI spec, and vendors (all hoping for a Microsoftish monopoly fueled by the high cost of porting out of their proprietary APIs) reluctantly participated and complied, although a lot of them still offered and touted their proprietary interfaces as well. (Forgive me if this isn't perfectly accurate -- I'm doing this from memory).

MPI was almost immediately turned into a PVM-like language that would support the creation of virtual parallel supercomputers out of Unix workstations, but I >>think<< that they came in well after PVM in this arena, more or less as a network device IPC channel in open source MPI implementations for MP Unix systems. MPI is obviously a message passing interface and also provides job creation and management tools.
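
Underneath the different APIs, both libraries follow the same basic pattern: pack typed data into a buffer, send it with a message tag, and unpack it on the receiving side. The sketch below illustrates that pattern only. It is plain Python using threads and queues as a stand-in for the network, not PVM or MPI code, and the names `pack_message`, `unpack_message`, `TAG_WORK` and `TAG_RESULT` are invented for this illustration.

```python
import queue
import struct
import threading

# Hypothetical message tags, chosen for this sketch only.
TAG_WORK = 1
TAG_RESULT = 2

HEADER = "!ii"  # network byte order: tag, number of doubles that follow


def pack_message(tag, values):
    """Pack a tag, a count, and a list of doubles into one byte buffer."""
    return struct.pack(f"{HEADER}{len(values)}d", tag, len(values), *values)


def unpack_message(buf):
    """Recover (tag, values) from a buffer built by pack_message."""
    tag, count = struct.unpack_from(HEADER, buf, 0)
    values = struct.unpack_from(f"!{count}d", buf, struct.calcsize(HEADER))
    return tag, list(values)


def worker(inbox, outbox):
    """Receive one work message, sum its payload, and send the sum back."""
    tag, values = unpack_message(inbox.get())
    if tag == TAG_WORK:
        outbox.put(pack_message(TAG_RESULT, [sum(values)]))


def run_demo():
    """Master side: send one packed work message and unpack the reply."""
    to_worker, to_master = queue.Queue(), queue.Queue()
    t = threading.Thread(target=worker, args=(to_worker, to_master))
    t.start()
    to_worker.put(pack_message(TAG_WORK, [1.0, 2.0, 3.5]))
    tag, result = unpack_message(to_master.get())
    t.join()
    return tag, result[0]


if __name__ == "__main__":
    print(run_demo())
```

Packing in a fixed byte order is what lets a message cross between machines with different architectures, which is the situation the heterogeneous-cluster discussion above has in mind. A real library also handles delivery, buffering and task startup, none of which is modelled here.
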
I personally started with PVM and am therefore far from an MPI expert, but my impression is that MPI hides more of the details of the parallel supercomputer from the user and is thereby arguably more scalable for straightforward applications, although perhaps not so well suited to custom applications or load balancing.

As I said before, which of them one uses is largely determined by one's history and to a lesser extent by where you plan to run your code. If you came from big iron to beowulfs, you will almost certainly have MPI based code and will want to use MPI. If you need to run code on BOTH a beowulf AND big iron, you will likely want to use MPI, as MPI is likely to be the parallel API on the big iron. If you ran on a NOW (network of workstations), or did odd computations that used a Cray for vector code and a possibly heterogeneous NOW to do parallel blocks of computations, or if you just wanted something to facilitate coarse grained to embarrassingly parallel job distribution in a master-slave paradigm, you probably started with PVM and use PVM to this day on beowulfs (which tend to be at least speed heterogeneous after the first year as newer nodes get mixed in with the old).

MPI is arguably a tiny bit better in its basic design, although given the quality of the authors of PVM (whom I think very highly of) it is a tiny bit indeed.

One area where MPI might hold a small advantage is in low-level network device support, specifically Myrinet. I think MPI has native Myrinet drivers and can avoid TCP altogether. I don't really know if PVM does also at this point (although somebody on the list that uses Myrinet probably does:-). OTOH, MPICH at least has been plagued with TCP problems over the years and may be yet for all that I know (again, I expect that somebody currently expert will say a word one way or another:-). And then there is also LAM-MPI -- with MPI you have a bit of a choice of implementations, while with PVM there is really just one.

Hope this helps.
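
For what it is worth, the point about master-slave job distribution on speed-heterogeneous nodes can be illustrated with a toy simulation. This is a sketch in plain Python, not PVM or MPI code. It assumes communication is free and each task's cost is known, and the function names `static_split_makespan` and `self_scheduling_makespan` are made up for the example. It compares dealing tasks out evenly in advance with letting the master hand the next task to whichever node goes idle first.

```python
import heapq


def static_split_makespan(task_costs, speeds):
    """Deal tasks out round-robin up front; return when the last node finishes."""
    finish = [0.0] * len(speeds)
    for i, cost in enumerate(task_costs):
        node = i % len(speeds)
        finish[node] += cost / speeds[node]
    return max(finish)


def self_scheduling_makespan(task_costs, speeds):
    """Master gives the next task to whichever node becomes idle first."""
    # Heap of (time at which the node becomes idle, node index).
    idle = [(0.0, node) for node in range(len(speeds))]
    heapq.heapify(idle)
    makespan = 0.0
    for cost in task_costs:
        t, node = heapq.heappop(idle)
        done = t + cost / speeds[node]
        makespan = max(makespan, done)
        heapq.heappush(idle, (done, node))
    return makespan


if __name__ == "__main__":
    tasks = [1.0] * 12      # twelve equal-cost tasks
    speeds = [1.0, 2.0]     # the second node is twice as fast as the first
    print(static_split_makespan(tasks, speeds))
    print(self_scheduling_makespan(tasks, speeds))
```

With twelve unit-cost tasks on two nodes, one twice as fast as the other, the even split finishes at 6.0 time units because the slow node holds everything up, while the hand-out-on-demand version finishes at 4.0. In real code the message latency per task eats into that gain, which is why this works best for coarse grained work.
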
Although I'm a fairly satisfied PVM user because of MY history, I've tried to be balanced in my treatment of the two. As I said, I don't think it matters terribly from the point of view of performance (except where one or the other might have weaknesses in a specific communication stack) or ease of programming, but it does matter in terms of portability and maybe support of heterogeneous operation.

   rgb

-- 
Robert G. Brown http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu
More information about the Beowulf mailing list
