[Beowulf] fast interconnects, HT 3.0 ...
Richard Walsh
rbw at ahpcrc.org
Wed May 24 12:54:24 PDT 2006
Eugen Leitl wrote:
> On Wed, May 24, 2006 at 09:09:23AM -0500, Richard Walsh wrote:
>
>
>> Jim, I meant cache coherence. As we know, HT provides cache
>> coherent and non-cache coherent
>> memory management. Typically within the board complex on an SMP
>> device we want cache coherency.
>>
>
> You cannot have cache coherency over a large number of systems *and*
> have temporally unconstrained execution. There is no free lunch.
> There are already coherency issues in distributing such a simple
> thing as clock over such a small area as a single die. (Which
> is why global clocks will go away one day).
>
Yes, yes ... ;-) ... ccNUMA gets to be heavy-weight as the processor
count rises, but again what I was asking is: if HT 3.0, by providing a
chassis-to-chassis connection, defines a single, globally addressable
address space across enough processors (thus my question about switches
and scalability), then it could become a standard protocol on top of
which pGAS languages like UPC and CAF can run. The HT 3.0 layer would
play a role similar to the (non-coherent) global address space that the
Cray X1 provides to UPC and CAF across its node boards, while allowing
on-board applications a cache-coherent alternative ... or it could
function, as a standard, as an important alternative to GASnet in a
COTS/cluster regime.
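To make concrete what a pGAS language actually needs from the layer
below, here is a minimal UPC sketch (illustrative only, not tied to
HT 3.0 or any particular interconnect): a shared array spread across
the UPC threads, with remote elements read by ordinary indexing. Every
off-affinity reference is turned by the compiler/runtime into a remote
access over whatever global-address-space transport sits underneath ...
GASnet conduits today, conceivably an HT-style fabric later.

  #include <upc.h>
  #include <stdio.h>

  #define N 256

  /* default (cyclic) layout: element i has affinity to thread i % THREADS */
  shared double x[N * THREADS];

  int main(void)
  {
      double sum = 0.0;
      int i;

      /* each thread initializes only the elements it has affinity to */
      upc_forall (i = 0; i < N * THREADS; i++; &x[i])
          x[i] = (double) i;

      upc_barrier;

      /* thread 0 then reads the whole array; local and remote elements
         use the same syntax, and each off-affinity read becomes a
         remote get in the runtime/network layer */
      if (MYTHREAD == 0) {
          for (i = 0; i < N * THREADS; i++)
              sum += x[i];
          printf("sum = %f\n", sum);
      }
      return 0;
  }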
>> The HT 3.0 standard, as I understand it, offers off-chassis memory
>> access at lower bit rates using AC-coupled links, but without cache
>> coherence. This is quite similar to the approach taken on the Cray X1,
>> with cache-coherent on-board images and non-coherent access off-board.
>> The Cray X1
>>
>
> I think cache coherency on 4-16 CPUs on-board does make some sense.
>
Yes, again ... as I said above, as the Cray X1 design shows, etc.
>
>> supports the partitioned Global Address Space (pGAS) programming
>> models of UPC and CAF. The question here
>>
>
> pGAS assumes shared memory. There is no such thing as shared memory,
> beyond multiport memory, where the "crossbars do not scale" thing applies.
>
pGAS only assumes a >>somehow globally addressable memory<<; that's why
you can still run UPC and CAF on a cluster. The >>global addressability<<
is currently provided through the GASnet API written for your particular
interconnect, although it can even run "upside down" on top of MPI!
HT 3.0 would seem to offer a more uniform and efficient way of providing
the >>somehow global addressability<< that pGAS languages need in a
COTS/cluster regime, with better latency and bandwidth.
I was hoping someone on here who knows HT 3.0 well would be able to
comment, but it seems the folks are not too familiar with it yet, or
with UPC and CAF in a cluster context. People should download Berkeley
UPC and give it a try ... ;-) ...
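For anyone who does, a Berkeley UPC session for a sketch like the one
above looks roughly as follows (the file name is made up for the
example, and exact flags and conduit names vary by installation and
version):

  upcc -o pgas_sum pgas_sum.upc        # compile with the Berkeley UPC driver
  upcrun -n 4 ./pgas_sum               # run on 4 UPC threads

  # GASnet conduit selection at compile time, e.g. the "upside down"
  # case of running the runtime on top of MPI mentioned above
  upcc -network=mpi -o pgas_sum pgas_sum.upc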
Perhaps you are asserting that, with latencies of ~1.5 microseconds for
a 1-byte message and an N-1/2 bandwidth of 400 Mbytes/sec at 400-byte
messages, we can't hope to do better at the scale to which we would like
to take cluster systems. This problem is exacerbated by the absence of
the vector memory operations that are available on the Cray X1 ...
although there is a message-vector-like set of UPC memory copy libraries
that can be used (see the sketch below). The Cray delivers better
latency and bandwidth in UPC and CAF than it does in MPI, which is an
argument in favor of there still being some room for improvement.
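As a sketch of that message-vector-like style (names and sizes here are
invented for illustration), the idea is to aggregate what would
otherwise be many one-element remote stores into a single bulk
upc_memput() of a contiguous block:

  #include <upc.h>

  #define CHUNK 4096

  /* one CHUNK-sized block with affinity to each thread */
  shared [CHUNK] double buf[CHUNK * THREADS];

  int main(void)
  {
      double local[CHUNK];
      int dest = (MYTHREAD + 1) % THREADS;   /* neighbor chosen arbitrarily */
      int i;

      for (i = 0; i < CHUNK; i++)
          local[i] = MYTHREAD + i * 1.0e-6;

      /* one bulk put of CHUNK doubles into the neighbor's block, rather
         than CHUNK separate one-element remote stores */
      upc_memput(&buf[dest * CHUNK], local, CHUNK * sizeof(double));

      upc_barrier;
      return 0;
  }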
It may be that little more can be extracted from the interconnect
physics with HT 3.0 ... but pGAS language performance should at least
equal MPI's, and pGAS would have a programming-elegance advantage.
Regards,
rbw
>> was: What do those who understand HT 3.0 better than I do think about
>> its ability to similarly support the pGAS programming style
>> efficiently? The follow-up question was: What might be the
>> implications for commodity parallel programming in MPI? I want to get
>> a feel for HT 3.0's scalability in this context, the need/density of
>> potential HT switches, etc.
>>
>> The discussion on signal coherence was of course interesting ... ;-) ...
>>
--
Richard B. Walsh
Project Manager
Network Computing Services, Inc.
Army High Performance Computing Research Center (AHPCRC)
rbw at ahpcrc.org | 612.337.3467
-----------------------------------------------------------------------
This message (including any attachments) may contain proprietary or
privileged information, the use and disclosure of which is legally
restricted. If you have received this message in error please notify
the sender by reply message, do not otherwise distribute it, and delete
this message, with all of its contents, from your files.
-----------------------------------------------------------------------