LAM and openmosix
Peter Beerli
beerli at gs.washington.edu
Tue Nov 26 16:49:42 PST 2002
Hi,
Earlier I asked why LAM/MPI jobs did not migrate on an openMosix [not Mosix]
cluster. Meanwhile I found out myself [sort of].
I first configured LAM 6.5.8 with
./configure --prefix=$HOME/lam --with-rpi=usysv
[the cluster contains 8 dual AMD machines]
Programs compiled and linked with these LAM libraries and binaries run, but they do NOT migrate away from the
master node.
Today I retried, after reading that shared-memory programs such as Java engines do not migrate, with
./configure --prefix=$HOME/lam --with-romio --without-impi
[leaving out the rpi option; I assume the default (tcp) is then used]
Now it works fine and the jobs migrate. I assume that openMosix adds a layer that is
invisible to LAM. I do not know whether any of the ROMIO or IMPI material blocks migration among
cluster nodes.
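One way to check whether the MPI processes really left the master node is to read the per-process files that openMosix adds under /proc. The sketch below is only an illustration under assumptions: it assumes the kernel exposes /proc/<pid>/where (the node currently running the process, with 0 assumed to mean the home node), and report_where is a hypothetical helper name, not part of LAM or openMosix.

```shell
#!/bin/sh
# report_where PROC_ROOT PID...
# Hypothetical helper: print the node each PID is currently running on,
# as read from PROC_ROOT/<pid>/where (an openMosix-specific file that is
# assumed to exist on the cluster). Prints "unknown" when the file is
# missing or not readable, e.g. on a kernel without openMosix.
report_where() {
    proc_root=$1
    shift
    for pid in "$@"; do
        where_file="$proc_root/$pid/where"
        if [ -r "$where_file" ]; then
            node=$(cat "$where_file")
        else
            node=unknown
        fi
        echo "pid $pid: node $node"
    done
}

# Example use on the master node while the job is running:
#   report_where /proc $(pidof migrate-n)
```

If every worker still reports the home node after the job has been under load for a while, the processes are probably not migrating; a monitor such as mosmon should give a cluster-wide view of the same thing.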
It was a good LAM/MPI day today,
Peter
PS. I do not believe that using LAM with openMosix is the best use of computer time,
but the machine is used by others and it seems that I am the only MPI person on it.
On Mon, 25 Nov 2002, Manish Chablani wrote:
> Hi,
>
> To run lam successfully on Mosix, there needs to be support for socket
> migration. LAM has never previously supported migration and I am not sure if
> Mosix supports socket migration. I would be surprised if MPI programs
> worked in Mosix.
> If you could send in the details of how you ran lam jobs in Mosix, it
> would be great.
>
> Hope this helps,
> Manish Chablani
> ------------------------------------------------------
> Graduate Student, CS Department, Indiana University.
> http://www.cs.indiana.edu/~mchablan
>
> LAM/MPI Developer
> Make today a LAM/MPI day !!!
> http://www.lam-mpi.org
> ------------------------------------------------------
>
>
> On Mon, 25 Nov 2002, Peter Beerli wrote:
>
> > Hi,
> > I have a question concerning LAM jobs and openmosix job migration.
> > Last year I was demoing parallel runs of my population genetics program
> > (evolution.gs.washington.edu/lamarc/migrate.html) on an openmosix cluster
> > using LAM version 5.6.7? and it worked fine. Now I want to run
> > the same program on a different system that also runs openmosix.
> > I compiled the newest production LAM, started lam,
> > and started the job on the main node, but
> > it does not migrate my workers. I asked the former sysadmin, but
> > all he could tell me was that he compiled LAM from source.
> > The sysadmin on this new cluster says that other jobs do migrate,
> > although currently there is no traffic on the machine.
> >
> > I am sure I am missing something; any hints on where to look?
> > Peter
> >
> > I did this:
> > - installed lam in my home directory from source
> > - lamboot -v lamsetup [lamsetup contains name-of-local-network cpu=8]
> > - mpirun -s n0 -np 8 migrate-n
> > the program runs but it runs all workers on node0.
> >
> >
> >
> > _______________________________________________
> > This list is archived at http://www.lam-mpi.org/MailArchives/lam/
> >
>
--
Peter Beerli, Genome Sciences, Box #357730, University of Washington,
Seattle WA 98195-7730 USA, Ph:2065438751, Fax:2065430754
http://evolution.genetics.washington.edu/PBhtmls/beerli.html