[Beowulf] traverse @ princeton

Bill Wichser bill at princeton.edu
Thu Oct 10 10:49:49 PDT 2019


Actually 12 per rack.  The reasoning was that there are 2 connections 
per host to the top-of-rack switch, leaving 12 uplinks to the two 
tier-0 switches, 6 to each.
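
To spell out that port budget (a rough sketch in Python, assuming 
36-port EDR top-of-rack switches -- the switch model isn't stated 
here):

    # Per-rack port budget; the 36-port ToR switch is an assumption.
    nodes_per_rack = 12
    ports_per_node = 2                  # one EDR link per CPU socket
    tier0_switches = 2
    tor_ports = 36

    downlinks = nodes_per_rack * ports_per_node    # 24 node-facing ports
    uplinks = tor_ports - downlinks                # 12 ports left over
    per_tier0 = uplinks // tier0_switches          # 6 uplinks per tier-0

    print(downlinks, uplinks, per_tier0)           # 24 12 6
    print(f"inter-rack oversubscription: {downlinks // uplinks}:1")  # 2:1

That lines up with the 1:1 in-rack and 2:1 between-rack figures quoted 
further down.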

For the IB cards, they are a special flavor of Mellanox HCA which 
attach to the PCIe Gen4 slots, 8 lanes each.  And since 8 lanes of 
Gen4 match the bandwidth of 16 lanes of Gen3, we get full EDR to both 
CPU sockets.
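
And roughly, the lane arithmetic behind that (nominal signaling rates 
with 128b/130b encoding; a sanity-check sketch, not vendor numbers):

    # PCIe Gen4 x8 vs. Gen3 x16 vs. EDR InfiniBand, nominal rates
    gen3_lane = 8e9  * 128 / 130 / 8    # ~0.985 GB/s per lane (8 GT/s)
    gen4_lane = 16e9 * 128 / 130 / 8    # ~1.969 GB/s per lane (16 GT/s)

    gen3_x16 = 16 * gen3_lane           # ~15.75 GB/s
    gen4_x8  =  8 * gen4_lane           # ~15.75 GB/s, same as Gen3 x16
    edr      = 100e9 / 8                # 12.5 GB/s EDR link rate

    print(f"Gen3 x16 {gen3_x16/1e9:.2f} GB/s, Gen4 x8 {gen4_x8/1e9:.2f} GB/s,"
          f" EDR {edr/1e9:.2f} GB/s")
    # -> an x8 Gen4 slot has the bandwidth to keep a full EDR port busy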

Bill

On 10/10/19 12:57 PM, Scott Atchley wrote:
> That is better than 80% peak, nice.
> 
> Is it three racks of 15 nodes? Or two racks of 18 and 9 in the third rack?
> 
> You went with a single-port HCA per socket rather than a shared, 
> dual-port HCA in a single PCIe slot?
> 
> On Thu, Oct 10, 2019 at 8:48 AM Bill Wichser <bill at princeton.edu> wrote:
> 
>     Thanks for the kind words.  Yes, we installed something more like a
>     mini-Sierra machine, which is air cooled.  There are 46 nodes of the
>     IBM AC922: two sockets and 4 V100s per node, with each core running
>     SMT4 threading.  So two 16-core chips, 32 cores/node, 128 threads per
>     node.  The GPUs all use NVLink.
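> 
>     (A trivial restatement of that arithmetic in Python, if it helps:
> 
>         sockets_per_node = 2
>         cores_per_socket = 16
>         smt = 4                          # SMT4 on the POWER9 cores
> 
>         cores_per_node = sockets_per_node * cores_per_socket   # 32
>         threads_per_node = cores_per_node * smt                # 128
>         print(cores_per_node, threads_per_node)                # 32 128
> 
>     so 32 cores and 128 hardware threads per node.)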
> 
>     There are two EDR connections per host, each tied to a CPU socket:
>     1:1 within a rack of 12 and 2:1 between racks.  We have a 2 PB scratch
>     filesystem running GPFS.  Each node also has a 3 TB NVMe card for
>     local scratch.
> 
>     And we're running Slurm as our scheduler.
> 
>     We'll see if it makes the Top500 in November.  It fits there today,
>     but who knows what else got on there since June.  With the help of
>     NVIDIA we managed to get 1.09 PF across 45 nodes.
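> 
>     Divided out (a rough sketch using only the figures quoted here, not
>     an official efficiency number):
> 
>         hpl_tflops = 1.09 * 1000             # 1.09 PF sustained HPL
>         nodes = 45
>         gpus_per_node = 4
> 
>         per_node = hpl_tflops / nodes        # ~24.2 TF per node
>         per_gpu  = per_node / gpus_per_node  # ~6.1 TF per V100
>         print(f"{per_node:.1f} TF/node, {per_gpu:.1f} TF/GPU")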
> 
>     Bill
> 
>     On 10/10/19 7:45 AM, Michael Di Domenico wrote:
>      > for those that may not have seen
>      >
>      > https://insidehpc.com/2019/10/traverse-supercomputer-to-accelerate-fusion-research-at-princeton/
>      >
>      > Bill Wichser and Prentice Bisbal are frequent contributors to the
>      > list, congrats on the acquisition.  It's nice to see more HPC
>      > expansion in our otherwise barren hometown... :)
>      >
>      > Maybe one of them will pass along some detail on the machine...
> 

