[Beowulf] Re: [mvapich-discuss] Two problems related to slowness and TASK_UNINTERRUPTIBLE process

Sayantan Sur surs at cse.ohio-state.edu
Tue Jun 12 08:09:01 PDT 2007


Hi Tahir,

Thanks for sharing this data and your observations. It is interesting. 
We have a more recent release, MVAPICH 0.9.9, which is available from our 
website (mvapich.cse.ohio-state.edu) as well as with the OFED-1.2 
distribution. Could you please try out the newer release and see whether 
the results change or remain the same?

Thanks,
Sayantan.

Tahir Malas wrote:
> Hi all,
> We have an 8-node HP cluster of dual quad-core nodes connected via
> InfiniBand. We use Voltaire DDR cards and a 24-port switch. We also use
> OFED 1.1 and MVAPICH 0.9.7. We have two interesting problems that we could
> not overcome yet:
>
> 1. In our test program which mimics the communications in our code, the
> nodes are paired as follows: (0 and 1), (2 and 3), (4 and 5), (6 and 7). We
> perform one to one communications between these pairs of nodes
> simultaneously. We use blocking MPI send and receive commands to communicate
> an integer array of various sizes. In addition, we consider different
> numbers of processes:
> (a) 1 process per node, 8 processes overall: One link is established
> between the pairs of nodes.
> (b) 2 processes per node, 16 processes overall: Two links are established
> between the pairs of nodes.
> (c) 4 processes per node, 32 processes overall: Four links are established
> between the pairs of nodes.
> (d) 8 processes per node, 64 processes overall: Eight links are established
> between the pairs of nodes.
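
The pairing above can be sketched in a few lines. This is only an illustration of the rank-to-partner mapping the post implies, assuming ranks are block-distributed across nodes (ranks 0..ppn-1 on node 0, the next ppn ranks on node 1, and so on); the original post does not state the actual placement, so the `partner_rank` helper and its layout assumption are mine:

```python
def partner_rank(rank, procs_per_node):
    """Rank this process exchanges with, pairing node 2k with node 2k+1
    and matching local slots one-to-one.

    Assumes block distribution of ranks across nodes (an assumption;
    the original post does not specify the rank placement)."""
    node = rank // procs_per_node
    local = rank % procs_per_node
    partner_node = node ^ 1  # (0 and 1), (2 and 3), (4 and 5), (6 and 7)
    return partner_node * procs_per_node + local

# Case (c): 4 processes per node, 32 processes overall.
# Rank 5 (node 1, local slot 1) pairs with rank 1 (node 0, local slot 1),
# giving four simultaneous links between each pair of nodes.
```

With this mapping, each of the `procs_per_node` local slots on a node forms one link to its counterpart on the paired node, matching the link counts in (a)-(d).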
>
> We obtain logical timings, except for the following interesting comparison:
>
> For 32 processes (4 processes per node), arrays of 512-byte size are
> communicated more slowly than 4096-byte arrays. For both sizes, we
> send/receive 1,000,000 arrays and take the average to find the time per
> package; only the package size changes. We have made many trials and
> confirmed that this abnormal case is persistent. More specifically,
> communication of 4096-byte packages is about two times faster than
> communication of 512-byte packages.
>
> The OSU bandwidth and latency tests around these sizes show:
>
> Size (Bytes)    Bandwidth (MB/s)
> 256             417.53
> 512             592.34
> 1024            691.02
> 2048            857.35
> 4096            906.04
> 8192            1022.52
>
> Size (Bytes)    Latency (usec)
> 256             4.79
> 512             5.48
> 1024            6.60
> 2048            8.30
> 4096            11.02
> So this behavior does not seem reasonable to us.
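
A quick sanity check on the numbers quoted above: the ping-pong latencies imply that a 512-byte message should take roughly half the time of a 4096-byte one per message, which is the opposite of what the pairwise test measures. A small sketch, using only the values from the latency table above:

```python
# One-way latencies from the OSU latency test quoted above (usec).
latency_usec = {256: 4.79, 512: 5.48, 1024: 6.60, 2048: 8.30, 4096: 11.02}

def effective_mb_per_s(size_bytes):
    """Effective per-message bandwidth implied by the one-way latency.
    bytes / usec equals (decimal) MB/s, so no unit conversion is needed."""
    return size_bytes / latency_usec[size_bytes]

ratio = latency_usec[4096] / latency_usec[512]
print(f"4096B takes {ratio:.2f}x the time of a 512B message")
print(f"512B effective rate: {effective_mb_per_s(512):.1f} MB/s")
```

Note that the effective per-message rate from the latency test is far below the OSU bandwidth figures for the same sizes; that is expected, since the bandwidth test streams many messages in flight while the latency test is a single-message ping-pong. Neither test reproduces the slowdown at 512 bytes, which is what makes the pairwise result surprising.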
>
> 2. SOMETIMES, after the test with 32 processes overall, one of the four
> processes on node3 hangs in the TASK_UNINTERRUPTIBLE ("D") state. The test
> program prints "done." and then waits for some time. We can neither kill
> the process nor soft-reboot the node. We have to wait for that process to
> terminate, which can take a long time.
>
> Does anybody have any comments on these issues?
> Thanks in advance,
> Tahir Malas
> Bilkent University 
> Electrical and Electronics Engineering Department
>
>
>
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>   


-- 
http://www.cse.ohio-state.edu/~surs
