[Beowulf] numad?
Ryan Novosielski
novosirj at rutgers.edu
Tue Jan 18 20:10:44 UTC 2022
It’s not installed on our nodes, so nothing to turn off.
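For anyone who does have it, numad ships as its own package with a systemd unit on RHEL 7, so (assuming the stock packaging) checking for it and turning it off looks something like:

    rpm -q numad              # is the package installed at all?
    systemctl status numad    # is the daemon currently running?
    systemctl stop numad      # stop it immediately
    systemctl disable numad   # keep it from starting at boot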
--
#BlackLivesMatter
____
|| \\UTGERS, |---------------------------*O*---------------------------
||_// the State | Ryan Novosielski - novosirj at rutgers.edu
|| \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
|| \\ of NJ | Office of Advanced Research Computing - MSB C630, Newark
`'
> On Jan 18, 2022, at 1:18 PM, Michael Di Domenico <mdidomenico4 at gmail.com> wrote:
>
> Does anyone turn numad on/off on their clusters? I'm running RHEL 7.9
> on Intel CPUs and seeing a heavy performance impact on MPI jobs when
> running numad.
>
> The diagnosis is pretty preliminary right now, so I'm light on details. When
> running numad, I'm seeing MPI jobs stall while numad pokes at the job.
> The stall is notable, roughly 10-12 seconds.
>
> It's particularly interesting because if one rank stalls while numad
> runs, the others wait. Once it frees up, they all continue, but then
> another rank gets hit, so I end up seeing this cyclic stall.
>
> Like I said, I'm still looking into things, but I'm curious what
> everyone's take on numad is. My sense is we probably don't even
> really need it, since Slurm/Open MPI should be handling process
> placement anyhow.
>
> thoughts?
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit https://beowulf.org/cgi-bin/mailman/listinfo/beowulf
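On Michael's placement point: if Slurm/Open MPI are already pinning ranks, explicit binding at launch time makes numad largely redundant. A rough sketch, with flags that vary a bit by Slurm/Open MPI version, and a hypothetical 32-core node and ./mpi_app binary:

    # via Slurm, binding each rank to its cores
    srun --ntasks-per-node=32 --cpu-bind=cores ./mpi_app

    # or launching directly with Open MPI
    mpirun --bind-to core --map-by socket ./mpi_app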