[scyld-users] Cluster up - no action on slaves

Gregg Germain saville at comcast.net
Mon Feb 18 14:42:11 PST 2008


 >> I have the freeware version of SCYLD Beowulf up and running on a 5
 >> node system. I've added the 4 slaves to the Master using Beosetup. The
 >> slaves boot and the status monitor shows them as being up. I can ping
 >> them using their IP address. I ran the beofdisk, beoboot-install, and
 >> bpctl commands as instructed by SCYLD.

 >> 3) I ran a simple Hello World program (on the Master and two slaves),
 >> using MPI calls (not BeoMPI) and I get the following output:
 >>
 >> $ mpirun -np 3 HelloWorld
 >> I am the Master! Rank 0, size 3, name localhost.localdomain
 >> Rank 1, size 3, name .0
 >> Rank 2, size 3, name .1

 >> So things SEEM to be working. However the Beowulf Status Monitor
 >> statistics portion of the Slave nodes never budge. Ok maybe the program
 >> runs too quickly to get a reaction.

 >It's likely that you are not seeing anything on the display because 
 >your program is so trivial.

That's what I thought. So I wrote a simple program with a big delay 
loop. Code fragment:

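        startwtime = MPI_Wtime();

        /* Delay loop: 1000 * 1000 * 1000 increments of n */
        for (ii = 0; ii < 1000; ii++)
            for (jj = 0; jj < 1000; jj++)
                for (kk = 0; kk < 1000; kk++)
                    n++;

        endwtime = MPI_Wtime();

        /* Adjacent string literals are joined by the compiler */
        sprintf(greeting,
                "Slave: rank %d of %d running on node: %s N-val is: %f "
                "for an elapsed loop time of: %f\n",
                rank, size, name, n, endwtime - startwtime);

        /* Send the greeting (including the trailing NUL) to rank 0, tag 1 */
        MPI_Send(greeting, strlen(greeting) + 1, MPI_BYTE, 0, 1,
                 MPI_COMM_WORLD);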

The results are ("einar" is the SCYLD Master, i.e. node -1):

$ mpirun -np 4 Loop
Hello World: rank 0 of 4 running on node:  einar N-val is: 0.000000
Slave: rank 1 of 4 running on node: .0 N-val is: 1000000000.000000 for an elapsed loop time of: 74.098218
Slave: rank 2 of 4 running on node: .1 N-val is: 1000000000.000000 for an elapsed loop time of: 74.061121
Slave: rank 3 of 4 running on node: .2 N-val is: 1000000000.000000 for an elapsed loop time of: 74.063888

So each run takes over 74 seconds, and still there's no reaction on the
Beowulf Status Monitor for the slave nodes.
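
For anyone who wants to try the same delay loop on a single machine
without MPI, here is a minimal standalone sketch. It is not the program
that produced the output above. The names busy_loop, cpu_seconds and
timed_loop are made up for illustration, and it assumes the standard C
clock() call, which reports CPU time rather than the wall-clock time
that MPI_Wtime() returns. The counter is declared volatile on the
assumption that an optimizing compiler might otherwise collapse the
loops.

```c
#include <time.h>

/* Run dim * dim * dim increments, like the triple loop in the fragment.
 * 'volatile' forces every increment to actually be performed. */
double busy_loop(int dim)
{
    volatile double n = 0.0;
    int ii, jj, kk;

    for (ii = 0; ii < dim; ii++)
        for (jj = 0; jj < dim; jj++)
            for (kk = 0; kk < dim; kk++)
                n = n + 1.0;

    return n;
}

/* CPU seconds used by this process so far (assumption: stands in for
 * MPI_Wtime(), which measures wall-clock time instead). */
double cpu_seconds(void)
{
    return (double)clock() / (double)CLOCKS_PER_SEC;
}

/* Run the loop and store the CPU time it consumed in *elapsed. */
double timed_loop(int dim, double *elapsed)
{
    double start = cpu_seconds();
    double n = busy_loop(dim);

    *elapsed = cpu_seconds() - start;
    return n;
}
```

Calling timed_loop(1000, &elapsed) from a small main() gives the same
number of iterations as the fragment; watching the process in top while
it runs shows whether the loop really keeps a CPU busy.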

So what does it take to get the slave entries on the Beowulf Status 
Monitor to come to life?


 > ..............so it's best to change the Beostatus update period to
 > once per second.

How does one change the update period?

thanks

Gregg


