[Beowulf] probs with mpdboot for 4 nodes
Vinodh
gvinodh1980 at yahoo.co.in
Mon Nov 8 00:08:56 PST 2004
Hi,
I set up a four-node cluster. I mounted the directory
/home/vinodh/cluster on all 3 slave nodes. I created
an RSA public key in /home/vinodh/.ssh and copied
id_rsa.pub to all 3 slave nodes as authorized_keys2.
When I run the command mpdboot -n 4, it reports
"Host key verification failed", yet it still starts
mpd on 3 of the nodes, depending on the order of the
names in mpd.hosts. It fails for the last one.
The names of the nodes are:
beomaster
beoslave
beoslave1
beoslave2
Here is the output of mpdboot -n 4 --debug:
mpdboot_rank_0 (mpdboot 222): starting
mpdboot_rank_0 (mpdboot 225): p=-1 l=1 r=2
mpdboot_rank_0 (mpdboot 236): cmd to run local mpd = :/usr/local/bin/mpd.py -d -e --ncpus=1:
mpdboot_rank_0 (mpdboot 288): cmd to run lchild boot = :ssh -x beoslave1 -n '/usr/local/bin/mpdboot.py --ncpus=1 -r ssh -m /usr/local/bin/mpd.py -n 4 -d -zentry beomaster:50442 -zrank 1 -zhosts beomaster:1,beoslave1:1,beoslave:1,beoslave2:1 </dev/null ' :
mpdboot_rank_0 (mpdboot 302): cmd to run rchild boot = :ssh -x beoslave -n '/usr/local/bin/mpdboot.py --ncpus=1 -r ssh -m /usr/local/bin/mpd.py -n 4 -d -zentry beomaster:50442 -zrank 2 -zhosts beomaster:1,beoslave1:1,beoslave:1,beoslave2:1 </dev/null ' :
mpdboot_rank_1 (mpdboot 222): starting
mpdboot_rank_1 (mpdboot 225): p=0 l=3 r=-1
mpdboot_rank_1 (mpdboot 236): cmd to run local mpd = :/usr/local/bin/mpd.py -h beomaster -p 50442 -d -e --ncpus=1:
mpdboot_rank_2 (mpdboot 222): starting
mpdboot_rank_2 (mpdboot 225): p=0 l=-1 r=-1
mpdboot_rank_2 (mpdboot 236): cmd to run local mpd = :/usr/local/bin/mpd.py -h beomaster -p 50442 -d -e --ncpus=1:
mpdboot_rank_1 (mpdboot 288): cmd to run lchild boot = :ssh -x beoslave2 -n '/usr/local/bin/mpdboot.py --ncpus=1 -r ssh -m /usr/local/bin/mpd.py -n 4 -d -zentry beomaster:50442 -zrank 3 -zhosts beomaster:1,beoslave1:1,beoslave:1,beoslave2:1 </dev/null ' :
Host key verification failed.
However, it works if I start mpd by hand on each node
with the command
mpd -h beomaster -p <no> &
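
The p, l and r values in the debug output look like the parent, left child and right child of each rank in a binary tree laid over the -zhosts list. The following is a minimal sketch of that reading, inferred only from the numbers printed above and not taken from the mpdboot source; the function names tree_links and launch_plan are made up for illustration. It shows which host would have to ssh into which other host.

```python
def tree_links(rank, n):
    """Return (parent, left, right) for `rank` in a binary tree of n ranks.

    -1 means "no such node", matching the p/l/r values in the debug output.
    This layout is an assumption inferred from that output.
    """
    parent = (rank - 1) // 2 if rank > 0 else -1
    left = 2 * rank + 1
    right = 2 * rank + 2
    return (parent,
            left if left < n else -1,
            right if right < n else -1)


def launch_plan(hosts):
    """List (source_host, target_host) pairs: who starts whom over ssh."""
    plan = []
    for rank, host in enumerate(hosts):
        _, left, right = tree_links(rank, len(hosts))
        for child in (left, right):
            if child != -1:
                plan.append((host, hosts[child]))
    return plan


# Host order as it appears in the -zhosts argument of the debug output.
hosts = ["beomaster", "beoslave1", "beoslave", "beoslave2"]

for source, target in launch_plan(hosts):
    print(f"{source} -> ssh -> {target}")
```

Under this reading, the last ssh hop runs from beoslave1 to beoslave2 rather than from beomaster, so the ssh setup on beoslave1 matters as well as the one on the master.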
Then I tried unmounting /home/vinodh/cluster and
mounting /home/vinodh itself, and in .ssh I copied
id_rsa.pub to authorized_keys2. Now the command
mpdboot -n 4
works on all four nodes.
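
One possible explanation, which is a guess and not confirmed in this message, is that a slave node that has to ssh onward had no host key stored for its target while each node still had its own .ssh directory, and that sharing /home/vinodh also shared a single known_hosts file. The sketch below lists which hosts have no entry in the text of a known_hosts file. The function name missing_host_keys and the sample file contents are hypothetical. It only understands plain host names: hashed entries, wildcards and [host]:port patterns are not matched.

```python
def missing_host_keys(known_hosts_text, hosts):
    """Return the hosts that have no plain-text entry in known_hosts_text."""
    known = set()
    for line in known_hosts_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if fields[0].startswith("@") and len(fields) > 1:
            fields = fields[1:]  # skip a leading marker such as @cert-authority
        if fields[0].startswith("|"):
            continue  # hashed host name, cannot be compared directly
        # The first field may hold several comma-separated names.
        known.update(fields[0].split(","))
    return [host for host in hosts if host not in known]


# Hypothetical known_hosts contents; the key data is a placeholder.
sample = """\
# example file
beoslave1 ssh-rsa AAAA...
beoslave ssh-rsa AAAA...
"""

print(missing_host_keys(sample, ["beoslave1", "beoslave", "beoslave2"]))
# prints ['beoslave2']
```

To use it on a real node, read that user's ~/.ssh/known_hosts on the node that makes the ssh call and pass the text in, together with the names from mpd.hosts.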
The output of mpdboot -n 4 --debug is now:
mpdboot_rank_0 (mpdboot 222): starting
mpdboot_rank_0 (mpdboot 225): p=-1 l=1 r=2
mpdboot_rank_0 (mpdboot 236): cmd to run local mpd = :/usr/local/bin/mpd.py -d -e --ncpus=1:
mpdboot_rank_0 (mpdboot 288): cmd to run lchild boot = :ssh -x beoslave1 -n '/usr/local/bin/mpdboot.py --ncpus=1 -r ssh -m /usr/local/bin/mpd.py -n 4 -d -zentry beomaster:50864 -zrank 1 -zhosts beomaster:1,beoslave1:1,beoslave:1,beoslave2:1 </dev/null ' :
mpdboot_rank_0 (mpdboot 302): cmd to run rchild boot = :ssh -x beoslave -n '/usr/local/bin/mpdboot.py --ncpus=1 -r ssh -m /usr/local/bin/mpd.py -n 4 -d -zentry beomaster:50864 -zrank 2 -zhosts beomaster:1,beoslave1:1,beoslave:1,beoslave2:1 </dev/null ' :
mpdboot_rank_1 (mpdboot 222): starting
mpdboot_rank_1 (mpdboot 225): p=0 l=3 r=-1
mpdboot_rank_1 (mpdboot 236): cmd to run local mpd = :/usr/local/bin/mpd.py -h beomaster -p 50864 -d -e --ncpus=1:
mpdboot_rank_2 (mpdboot 222): starting
mpdboot_rank_2 (mpdboot 225): p=0 l=-1 r=-1
mpdboot_rank_2 (mpdboot 236): cmd to run local mpd = :/usr/local/bin/mpd.py -h beomaster -p 50864 -d -e --ncpus=1:
mpdboot_rank_1 (mpdboot 288): cmd to run lchild boot = :ssh -x beoslave2 -n '/usr/local/bin/mpdboot.py --ncpus=1 -r ssh -m /usr/local/bin/mpd.py -n 4 -d -zentry beomaster:50864 -zrank 3 -zhosts beomaster:1,beoslave1:1,beoslave:1,beoslave2:1 </dev/null ' :
mpdboot_rank_3 (mpdboot 222): starting
mpdboot_rank_3 (mpdboot 225): p=1 l=-1 r=-1
mpdboot_rank_3 (mpdboot 236): cmd to run local mpd = :/usr/local/bin/mpd.py -h beomaster -p 50864 -d -e --ncpus=1:
Regards,
G. Vinodh Kumar