From saville at comcast.net Mon Feb 18 14:42:11 2008
From: saville at comcast.net (Gregg Germain)
Date: Tue Nov 9 01:14:29 2010
Subject: [scyld-users] Cluster up - no action on slaves
Message-ID: <47BA09C3.6090503@comcast.net>

>> I have the freeware version of SCYLD Beowulf up and running on a 5
>> node system. I've added the 4 slaves to the Master using Beosetup. The
>> slaves boot and the status monitor shows them as being up. I can ping
>> them using their IP address. I ran the beofdisk, beoboot-install, and
>> bpctl commands as instructed by SCYLD.

>> 3) I ran a simple Hello World program (on the Master and two slaves),
>> using MPI calls (not BeoMPI) and I get the following output:
>>
>> $ mpirun -np 3 HelloWorld
>> I am the Master! Rank 0, size 3, name localhost.localdomain
>> Rank 1, size 3, name .0
>> Rank 2, size 3, name .1

>> So things SEEM to be working. However the Beowulf Status Monitor
>> statistics portion of the Slave nodes never budge. Ok maybe the program
>> runs too quickly to get a reaction.

>It's likely that you are not seeing anything on the display because
>your program is so trivial.

That's what I thought. So I wrote a simple program with a big delay loop.
Code fragment:

    startwtime = MPI_Wtime();
    for (ii=0; ii<1000; ii++)
      for (jj=0; jj<1000; jj++)
        for (kk=0; kk<1000; kk++)
          n++;
    endwtime = MPI_Wtime();

    sprintf(greeting,
            "Slave: rank %d of %d running on node: %s N-val is: %f for an elapsed loop time of: %f\n",
            rank, size, name, n, endwtime-startwtime);
    MPI_Send(greeting, strlen(greeting)+1, MPI_BYTE, 0, 1, MPI_COMM_WORLD);

The results are ("einar" is the SCYLD Master, or -1, node):

    $ mpirun -np 4 Loop
    Hello World: rank 0 of 4 running on node: einar N-val is: 0.000000
    Slave: rank 1 of 4 running on node: .0 N-val is: 1000000000.000000 for an elapsed loop time of: 74.098218
    Slave: rank 2 of 4 running on node: .1 N-val is: 1000000000.000000 for an elapsed loop time of: 74.061121
    Slave: rank 3 of 4 running on node: .2 N-val is: 1000000000.000000 for an elapsed loop time of: 74.063888

So each run takes over 74 seconds, and still there's no reaction on the
Beowulf Status Monitor for the slave nodes.

So what does it take to get the slave entries on the Beowulf Status
Monitor to come to life?

> ..............so it's best to change the Beostatus update
>period to once per second.

How does one change the update period?

Thanks,
Gregg