[Beowulf] RE: [mvapich-discuss] Two problems related to slowness and TASK_UNINTERRUPTABLE process

Tahir Malas tmalas at ee.bilkent.edu.tr
Mon Jun 18 09:23:04 PDT 2007


Hi Sayantan,
We have installed OFED 1.2, and our two problems are gone! We no longer see
suspended processes or inconsistent communication times. The time per
package in our test is now:

   512-byte packages:   1.76
  4096-byte packages:  13.83

whereas with OFED 1.1 the same test gave:

   512-byte packages:  29.434
  4096-byte packages:  16.209

Thanks and regards,
Tahir Malas
Bilkent University 
Electrical and Electronics Engineering Department
Phone: +90 312 290 1385 

> -----Original Message-----
> From: Sayantan Sur [mailto:surs at cse.ohio-state.edu]
> Sent: Tuesday, June 12, 2007 6:09 PM
> To: Tahir Malas
> Cc: mvapich-discuss at cse.ohio-state.edu; beowulf at beowulf.org;
> teoman.terzi at gmail.com; 'Ozgur Ergul'
> Subject: Re: [mvapich-discuss] Two problems related to slowness and
> TASK_UNINTERRUPTABLE process
> 
> Hi Tahir,
> 
> Thanks for sharing this data and your observations. It is interesting.
> We have a more recent release, MVAPICH 0.9.9, which is available from our
> website (mvapich.cse.ohio-state.edu) as well as with the OFED 1.2
> distribution. Could you please try out the newer release and see whether
> the results change or remain the same?
> 
> Thanks,
> Sayantan.
> 
> Tahir Malas wrote:
> > Hi all,
> > We have an 8-node HP cluster (dual quad-core processors per node)
> > connected via InfiniBand, with Voltaire DDR cards and a 24-port switch.
> > We use OFED 1.1 and MVAPICH 0.9.7. We have two interesting problems that
> > we have not been able to overcome yet:
> >
> > 1. In our test program, which mimics the communications in our code, the
> > nodes are paired as follows: (0 and 1), (2 and 3), (4 and 5), (6 and 7).
> > We perform one-to-one communications between these pairs of nodes
> > simultaneously, using blocking MPI send and receive calls to communicate
> > an integer array of various sizes. In addition, we consider different
> > numbers of processes:
> > (a) 1 process per node, 8 processes overall: one link is established
> > between the pairs of nodes.
> > (b) 2 processes per node, 16 processes overall: two links are established
> > between the pairs of nodes.
> > (c) 4 processes per node, 32 processes overall: four links are
> > established between the pairs of nodes.
> > (d) 8 processes per node, 64 processes overall: eight links are
> > established between the pairs of nodes.
> >
> > The timings we obtain are reasonable, except for the following
> > interesting comparison:
> >
> > For 32 processes (4 processes per node), the 512-byte arrays are
> > communicated more slowly than the 4096-byte arrays. In both cases we
> > send/receive 1,000,000 arrays and take the average to find the time per
> > package; only the package size changes. We have made many trials and
> > confirmed that this abnormal behavior is persistent. More specifically,
> > communication of the 4096-byte packages is about 2 times faster than
> > communication of the 512-byte packages.
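
[For reference, the test above is essentially a paired blocking ping-pong.
A minimal sketch of that kind of loop is given below; it is NOT our actual
test code, and the block rank-to-node pairing formula and the constants
PROCS_PER_NODE, NUM_MSGS and MSG_INTS are illustrative assumptions, set
here to match case (c) with 512-byte packages.]

/* Minimal sketch of a paired blocking send/receive timing loop.
 * Assumes a block rank-to-node mapping, i.e. PROCS_PER_NODE consecutive
 * MPI ranks per node, so that node 2k is paired with node 2k+1. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define PROCS_PER_NODE 4        /* case (c): 4 processes per node        */
#define NUM_MSGS       1000000  /* messages averaged per measurement     */
#define MSG_INTS       128      /* 128 x 4-byte ints = 512-byte package  */

int main(int argc, char **argv)
{
    int rank, size, i, node, partner;
    int *buf;
    double t0, t1;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (size % (2 * PROCS_PER_NODE) != 0) {
        if (rank == 0)
            fprintf(stderr, "need an even number of nodes with %d ranks each\n",
                    PROCS_PER_NODE);
        MPI_Finalize();
        return 1;
    }

    /* Partner rank: the same slot on the neighbouring (paired) node. */
    node    = rank / PROCS_PER_NODE;
    partner = (node ^ 1) * PROCS_PER_NODE + (rank % PROCS_PER_NODE);

    buf = malloc(MSG_INTS * sizeof(int));
    for (i = 0; i < MSG_INTS; i++)
        buf[i] = i;

    MPI_Barrier(MPI_COMM_WORLD);
    t0 = MPI_Wtime();

    for (i = 0; i < NUM_MSGS; i++) {
        if (node % 2 == 0) {    /* even node: send first, then receive */
            MPI_Send(buf, MSG_INTS, MPI_INT, partner, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, MSG_INTS, MPI_INT, partner, 0, MPI_COMM_WORLD,
                     &status);
        } else {                /* odd node: receive first, then send  */
            MPI_Recv(buf, MSG_INTS, MPI_INT, partner, 0, MPI_COMM_WORLD,
                     &status);
            MPI_Send(buf, MSG_INTS, MPI_INT, partner, 0, MPI_COMM_WORLD);
        }
    }

    t1 = MPI_Wtime();
    if (rank == 0)
        printf("avg time per package: %.3f usec\n",
               (t1 - t0) * 1.0e6 / (2.0 * NUM_MSGS));

    free(buf);
    MPI_Finalize();
    return 0;
}

[Compiled with mpicc and run with 32 processes over the 8 nodes, this prints
one averaged time per package size; the exact measurement in our own code
may differ slightly, but the communication pattern is the same.]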
> >
> > The OSU bandwidth and latency tests around these sizes show:
> >
> > Bandwidth:
> > Bytes     MB/s
> >  256     417.53
> >  512     592.34
> > 1024     691.02
> > 2048     857.35
> > 4096     906.04
> > 8192    1022.52
> >
> > Latency:
> > Bytes     Time (usec)
> >  256       4.79
> >  512       5.48
> > 1024       6.60
> > 2048       8.30
> > 4096      11.02
> > So this behavior does not seem reasonable to us.
> >
> > 2. Sometimes, after the test with 32 processes overall, one of the four
> > processes on node3 hangs in the TASK_UNINTERRUPTIBLE "D" state. The test
> > program prints "done." but then waits for some time. We can neither kill
> > the process nor soft-reboot the node; we have to wait for the process to
> > terminate, which can take a long time.
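
[As a side note on how we identify the state: the "D" is the process state
character reported in /proc/<pid>/stat (and by ps), i.e. uninterruptible
sleep inside the kernel, which is why the process ignores signals. The
little checker below is only an illustration, not part of our test code.]

/* Illustrative only: print the scheduler state of a given PID by reading
 * /proc/<pid>/stat.  'D' means TASK_UNINTERRUPTIBLE (uninterruptible
 * sleep), which is why kill has no effect on the stuck process. */
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv)
{
    char path[64], line[512];
    char *p;
    FILE *f;

    if (argc != 2) {
        fprintf(stderr, "usage: %s <pid>\n", argv[0]);
        return 1;
    }
    snprintf(path, sizeof(path), "/proc/%s/stat", argv[1]);
    f = fopen(path, "r");
    if (f == NULL || fgets(line, sizeof(line), f) == NULL) {
        perror(path);
        return 1;
    }
    fclose(f);

    /* The state field follows the ')' that closes the command name. */
    p = strrchr(line, ')');
    if (p != NULL && p[1] == ' ')
        printf("pid %s state: %c\n", argv[1], p[2]);
    else
        fprintf(stderr, "unexpected /proc format\n");
    return 0;
}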
> >
> > Does anybody have any comments on these issues?
> > Thanks in advance,
> > Tahir Malas
> > Bilkent University
> > Electrical and Electronics Engineering Department
> >
> >
> >
> > _______________________________________________
> > mvapich-discuss mailing list
> > mvapich-discuss at cse.ohio-state.edu
> > http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
> >
> 
> 
> --
> http://www.cse.ohio-state.edu/~surs
> 





