lam - recon works lamboot doesn't!

Eric Linenberg elinenbe at umich.edu
Mon Aug 6 08:23:07 PDT 2001


I am trying to run LAM. recon works fine, but lamboot gives me errors
(full output below). Could someone give me some insight into this problem?
I have read everything I can find, to no avail. Help a newbie!

Thanks,
eric

[guest at kitkat bin]$ lamboot -d -v -b beowulf
 
LAM 6.5.4/MPI 2 C++/ROMIO - University of Notre Dame
 
lamboot: boot schema file: /usr/local/lam/etc/beowulf
lamboot: opening hostfile /usr/local/lam/etc/beowulf
lamboot: found the following hosts:
lamboot:   n0 kitkat
lamboot:   n1 snickers
lamboot:   n2 twix
lamboot:   n3 rolo
lamboot:   n4 butterfinger
lamboot: found 5 host node(s)
lamboot: origin node is 0 (kitkat)
Executing hboot on n0 (kitkat - 2 CPUs)...
lamboot: attempting to execute "hboot -t -c lam-conf.lam -d -v -I " -H 127.0.0.1 -P 35993 -n 0 -o 0    ""
hboot: process schema = "/usr/local/lam/etc/lam-conf.lam"
hboot: found /usr/local/bin/lamd
hboot: performing tkill
hboot: tkill
hboot: booting...
hboot: fork /usr/local/bin/lamd
hboot: attempting to execute
[1]  24080 lamd -H 127.0.0.1 -P 35993 -n 0 -o 0 -d
Executing hboot on n1 (snickers - 2 CPUs)...
lamboot: -b used, assuming same shell on remote nodes
lamboot: got local shell /bin/bash
lamboot: attempting to execute "/usr/bin/rsh snickers -n hboot -t -c lam-conf.lam -d -v -s -I "-H 127.0.0.1 -P 35993 -n 1 -o 0    ""
hboot: process schema = "/usr/local/lam/etc/lam-conf.lam"
hboot: found /usr/local/lam/bin/lamd
hboot: performing tkill
hboot: tkill
hboot: booting...
hboot: fork /usr/local/lam/bin/lamd
[1]    918 lamd -H 127.0.0.1 -P 35993 -n 1 -o 0 -d
-----------------------------------------------------------------------------
lamboot encountered some error (see above) during the boot process,
and will now attempt to kill all nodes that it was previously able to
boot (if any).
 
Please wait for LAM to finish; if you interrupt this process, you may
have LAM daemons still running on remote nodes.
-----------------------------------------------------------------------------
wipe ...
 
LAM 6.5.4/MPI 2 C++/ROMIO - University of Notre Dame
 
Executing tkill on n0 (kitkat)...
Executing tkill on n1 (snickers)...
lamboot did NOT complete successfully





thanks,
-eric



