[Beowulf] Two problems related to slowness and
hahn at mcmaster.ca
Tue Jun 12 08:14:55 PDT 2007
> For 32 processes (4 process per node), the arrays with 512-Byte size are
> communicated slower than the 4096-Byte size arrays. For both of them, we
do you mean that this is not the case in other configurations?
an interconnect _should_ have some steep rise in effective bandwidth
as packet size is increased. it's a useful metric to know the packet
size at which half-peak bandwidth is achieved, since this offers some
"sense of scale" to programmers judging whether their own packet sizes
> this abnormal case is persistent. More specifically, communication of
> 4k-Byte packages are 2 times faster than the communication of 512-Byte
perhaps I'm dense this morning, but what's unexpected about that?
> The OSU bandwidth and latency test around these points shows:
> Byte MB/s
> 256 417.53
> 512 592.34
> 1024 691.02
> 2048 857.35
> 4096 906.04
> 8192 1022.52
the osu_bw test is a streaming, fire-and-forget one which strongly
rewards message aggregation. (this is not necessarily deceptive -
it's measuring a real communication pattern, though it's not the
only way to quantify bandwidth.) you can see that it's aggregating
because the reported bandwidth for small packets is much higher than
you'd expect if each packet took the latency reported below.
(unless my math is wrong, 256/(2*4.79e-6) = 26.7 MB/s)
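that arithmetic can be checked directly (the round-trip assumption is mine,
matching the 2x in the formula above):

```python
# If every 256-byte message paid a full round trip at the osu_latency
# time quoted below, effective bandwidth would be far lower than the
# 417.53 MB/s that osu_bw reports -- evidence of aggregation/pipelining.
size = 256          # bytes
lat = 4.79e-6       # one-way latency at 256 bytes, seconds

bw_naive = size / (2 * lat) / 1e6   # MB/s, one message per round trip
print(f"{bw_naive:.1f} MB/s")       # 26.7 MB/s
```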
> Time (usec)
> 256 4.79
> 512 5.48
> 1024 6.60
> 2048 8.30
> 4096 11.02
> So this behavior does not seem reasonable to us.
> 2. SOMETIMES, after the test with overall 32 processes, one of the four
> processes at node3 hangs in TASK_UNINTERRUPTIBLE "D" state. Hence, the test
> program shows a "done." and waits for sometime. We can neither kill the
> process nor soft reboot the node. We have to wait for that process to
> terminate, which can last long.
does /proc/$pid/wchan (on the 'D' state process) tell you anything?
do all the ranks return from MPI_Finalize?
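those /proc checks are easy to script; a minimal sketch (the pid of the
stuck rank is of course whatever you observe, os.getpid() here is just a
placeholder so the snippet runs):

```python
import os

def proc_state(pid):
    """Return the one-letter state ('R', 'S', 'D', ...) from /proc/<pid>/stat."""
    with open(f"/proc/{pid}/stat") as f:
        # the second field, (comm), may contain spaces, so split on the
        # closing paren rather than naively on whitespace
        return f.read().rsplit(")", 1)[1].split()[0]

def proc_wchan(pid):
    """Return the kernel symbol the process is sleeping in, if readable."""
    try:
        with open(f"/proc/{pid}/wchan") as f:
            return f.read().strip()
    except OSError:
        return ""

pid = os.getpid()  # substitute the pid of the 'D'-state process
print(pid, proc_state(pid), proc_wchan(pid))
```

a 'D'-state process with a wchan inside an interconnect or filesystem
driver would point at where the hang actually lives.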
regards, mark hahn.