[Beowulf] NUMA zone weirdness
John Hearns
hearnsj at googlemail.com
Fri Dec 16 06:36:03 PST 2016
This is in the context of Omni-Path cards and the hfi1 driver.
In the file pio.c there is a check that the NUMA nodes are online and
compactly numbered:

    num_numa = num_online_nodes();
    /* enforce the expectation that the numas are compact */
    for (i = 0; i < num_numa; i++) {
            if (!node_online(i)) {
                    dd_dev_err(dd, "NUMA nodes are not compact\n");
                    ret = -EINVAL;
                    goto done;
            }
    }

(Source: <http://lxr.free-electrons.com/source/drivers/staging/rdma/hfi1/pio.c?v=4.4#L1711>)

On some of my servers I see this weirdness with the NUMA zones
(2650-v4 processors, HT is off):
[root@comp006 ~]# numactl --hardware
available: 2 nodes (0,2)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 18 19 20 21 22 23
node 0 size: 32673 MB
node 0 free: 29840 MB
node 2 cpus: 12 13 14 15 16 17
node 2 size: 32768 MB
node 2 free: 31753 MB
node distances:
node   0   2
  0:  10  20
  2:  20  10
Someone will be along in a minute to explain why.
I am sure this is a BIOS setting, but which one is not making itself clear
to me.
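
To spell out why the driver check trips on this layout, here is a minimal
userspace sketch of the same logic. It is not kernel code: MAX_NODES,
set_online, count_online and check_compact are hypothetical names made up for
illustration, and a plain bool array stands in for the kernel's online-node
mask.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical upper bound on node ids, for this sketch only. */
#define MAX_NODES 8

/* Mark node `id` as online in the stand-in mask. */
static void set_online(bool online[MAX_NODES], int id)
{
    assert(id >= 0 && id < MAX_NODES);
    online[id] = true;
}

/* Count online nodes, playing the role of num_online_nodes(). */
static int count_online(const bool online[MAX_NODES])
{
    int n = 0;

    for (int i = 0; i < MAX_NODES; i++)
        if (online[i])
            n++;
    return n;
}

/*
 * Same shape as the check in pio.c: with N nodes online, ids 0..N-1
 * must all be online. Returns 0 if so, -EINVAL otherwise.
 */
static int check_compact(const bool online[MAX_NODES])
{
    int num_numa = count_online(online);

    for (int i = 0; i < num_numa; i++)
        if (!online[i])
            return -EINVAL;
    return 0;
}
```

With nodes 0 and 2 online, as in the numactl output above, the count is 2, so
the loop tests ids 0 and 1. Node 1 is not online, so the check fails, even
though both sockets are present and working. With nodes 0 and 1 online the
same check passes.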