Kidger's comments on Quadric's design and performance

James Cownie jcownie at etnus.com
Wed Apr 24 01:42:03 PDT 2002


> > This message also reminded me to ask if a long-held opinion is valid - and
> > that opinion is "that a cache coherent interconnect would offer performance
> > enhancement when applications are at the 'more tightly coupled' end of the
> > spectrum."  I know that present PCI based interfaces can't do that without
> > invoking software overhead and latencies.  Anyone have data - or an argument
> > for invalidating this opinion?
> 
> You would need another programming model than MPI for that (see below),
> maybe OpenMP as you basically have the characteristics of a SMP system
> with cc-NUMA architecture.

No, you don't have an SMP model.

You need to distinguish between a system which has a single address
space and one with multiple address spaces accessed explicitly.
You can have a cache coherent interface in the second, but that
doesn't make it into the first. 

What you have in the Quadrics (assuming it's still like the Meiko in
this respect) is an explicit cache coherent remote store access model.

You can access remote store without the active collaboration of the
owner of the remote store (so it's not message passing), but you have
to _know_ that you're accessing remotely and generate different code
(maybe execute a channel program) to do it. You can't just indirect a
random int * and fetch from remote store.
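
A minimal sketch in C of what "explicit" means here. It is a
single-process simulation, not the Quadrics or Meiko interface: the
names remote_addr, remote_get and remote_put are hypothetical, and the
memcpy stands in for whatever channel program or DMA the hardware
would run. The point is only that a remote word is named by a
(node, offset) pair and reached through a different code path than a
local int *.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NODES 2
#define WORDS 4

/* Each simulated node owns its own store; no int * into it is handed out. */
static int node_store[NODES][WORDS];

/* Hypothetical remote address: a (node, offset) pair, not a pointer. */
typedef struct {
    int    node;
    size_t offset;
} remote_addr;

/* One-sided read: the owning node runs no code to service it. */
int remote_get(remote_addr src)
{
    int value;
    assert(src.node >= 0 && src.node < NODES && src.offset < WORDS);
    memcpy(&value, &node_store[src.node][src.offset], sizeof value);
    return value;
}

/* One-sided write, again without the owner's collaboration. */
void remote_put(remote_addr dst, int value)
{
    assert(dst.node >= 0 && dst.node < NODES && dst.offset < WORDS);
    memcpy(&node_store[dst.node][dst.offset], &value, sizeof value);
}

/* Code "running on node 0": local store is reached through a plain
   pointer, remote store only through the explicit calls. */
int sum_local_and_remote(void)
{
    int *local = node_store[0];
    remote_addr far = { 1, 2 };

    local[1] = 40;          /* ordinary store to local memory */
    remote_put(far, 2);     /* explicit store to node 1 */
    return local[1] + remote_get(far);
}
```

The caller has to decide, at the point where the code is written,
which of the two paths to take; that decision is exactly what a plain
pointer dereference hides.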

In the OpenMP model you generally don't know which accesses are
remote; all of the OpenMP threads live in the same address space and
can pass pointers around at will. The compiler does not know which
references will be to non-local store.
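
For contrast, a small sketch of the single address space case, using
only the standard OpenMP parallel for and reduction clauses. The
function names sum_shared and demo_sum are made up for illustration.
Nothing in the type of data tells the compiler where the words live;
on a cc-NUMA machine some of them could be far away and the code
would look the same.

```c
#include <assert.h>
#include <stddef.h>

/* Sums through a plain pointer shared by all threads. */
long sum_shared(const int *data, size_t n)
{
    long sum = 0;
    long i;

    assert(data != NULL || n == 0);

    /* With OpenMP enabled the iterations are split across threads;
       without it the pragma is ignored and the loop runs serially. */
    #pragma omp parallel for reduction(+:sum)
    for (i = 0; i < (long)n; i++)
        sum += data[i];     /* same dereference, local or remote */

    return sum;
}

/* Small driver: the array is passed to the threads as a bare pointer. */
long demo_sum(void)
{
    int data[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };
    return sum_shared(data, 8);
}
```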

Languages for the explicit remote store access model include:

UPC               http://hpc.gwu.edu/~upc/
Co-array Fortran  http://www.co-array.org/
Titanium          http://www.cs.berkeley.edu/~liblit/titanium/

Of course these languages can also run on SMP machines (and indeed one
might hope that they can achieve better performance than something
like OpenMP, because the compiler can better lay out shared areas to
avoid false sharing effects and has better knowledge about which
accesses are to shared variables).
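
To make the false sharing point concrete, here is a sketch of the
kind of layout choice involved, done by hand in C. The 64-byte cache
line is an assumption (it varies by machine), and the type and
function names are hypothetical. It shows only the layout arithmetic,
not a measurement: packed per-thread counters share a line, so writes
by different threads contend for it, while padded counters get a line
each.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_LINE 64   /* assumed line size in bytes */

/* Naive layout: adjacent per-thread counters land in one cache line. */
typedef struct {
    long value;
} packed_counter;

/* Padded layout: each counter is stretched to fill a whole line. */
typedef struct {
    long value;
    char pad[CACHE_LINE - sizeof(long)];
} padded_counter;

/* How many counters of each kind fit in one assumed cache line. */
size_t packed_per_line(void)
{
    return CACHE_LINE / sizeof(packed_counter);
}

size_t padded_per_line(void)
{
    return CACHE_LINE / sizeof(padded_counter);
}

/* Byte distance between neighbouring threads' counters in an array. */
size_t padded_stride(void)
{
    padded_counter counters[2];
    return (size_t)((char *)&counters[1] - (char *)&counters[0]);
}
```

A compiler that knows which variables are shared could, in principle,
pick the second layout itself; with plain pointers in an OpenMP
program the programmer usually has to do it by hand.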

Enjoy

-- Jim 

James Cownie	<jcownie at etnus.com>
Etnus, LLC.     +44 117 9071438
http://www.etnus.com



More information about the Beowulf mailing list