[Beowulf] cluster softwares supporting parallel CFD computing
m.janssens at opencfd.co.uk
Fri Sep 15 07:39:13 PDT 2006
Patrick Geoffray wrote:
> Alas, people use blocking calls in general because
> they are lazy (50%), they don't know (40%) or they don't care (10%).
We did some tests with non-blocking vs. blocking calls. Unfortunately, in our
code there is only a small window of overlap, i.e. almost immediately after one
computes a result and swaps it to the neighbouring processor, the value
received from the neighbour is needed.
On small cases non-blocking was faster than blocking; on larger cases blocking
was definitely advantageous. Maybe the MPI implementation used (LAM) does not
handle multiple outstanding sends/receives well, maybe the network card does
not like it, maybe it causes collisions at the destination processor. Can
anybody comment on this?
(We got the best performance by scheduling the communication so it happens in
pairs. Every processor swaps its data with one of its neighbours (we're using
domain decomposition), then goes and swaps with a different neighbour. This
schedule lasts until every processor has swapped with all its neighbours. The
schedule is determined at the start of the run since the decomposition does
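
A minimal sketch of one way such a pairwise schedule could be built. It is an
illustration only, not the code described above. The function name
pairwise_schedule, the neighbours mapping and the greedy edge-colouring
approach are all assumptions made for the example. Each round is a list of
pairs in which no processor appears twice, so every processor swaps with at
most one neighbour per round.

```python
from collections import defaultdict


def pairwise_schedule(neighbours):
    """Group neighbour pairs into rounds (hypothetical helper).

    neighbours maps a processor id to the ids it shares a boundary with.
    Returns a list of rounds; each round is a list of (p, q) pairs in
    which every processor appears at most once.
    """
    # Collect each undirected pair once, in a deterministic order.
    edges = sorted({tuple(sorted((p, q)))
                    for p, qs in neighbours.items() for q in qs if p != q})
    busy = defaultdict(set)  # processor -> rounds in which it is already paired
    rounds = []
    for p, q in edges:
        # Greedy choice: the earliest round in which both ends are free.
        r = 0
        while r in busy[p] or r in busy[q]:
            r += 1
        if r == len(rounds):
            rounds.append([])
        rounds[r].append((p, q))
        busy[p].add(r)
        busy[q].add(r)
    return rounds


# Example: a 2x2 block decomposition, processors numbered 0..3.
grid = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
for r, pairs in enumerate(pairwise_schedule(grid)):
    print("round", r, pairs)
```

For the 2x2 example this gives two rounds: (0, 1) and (2, 3) swap first, then
(0, 2) and (1, 3). The greedy choice does not guarantee the smallest possible
number of rounds on every decomposition; it only guarantees that no processor
is paired twice within a round.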
The Mews, Picketts Lodge,
Picketts Lane, Salfords,
Surrey RH1 5RG.
Tel: +44 (0)1293 821272
Email: M.Janssens at OpenCFD.co.uk