[Beowulf] log_mtts_per_seg in mlx4 driver

Ryan Novosielski novosirj at rutgers.edu
Tue Jul 9 20:39:43 PDT 2019


Yeah,

I saw that too, which is part of why I was confused.

It appears to me that, at the very least, the kernel driver and OFED ship with two different defaults. The CentOS kernel driver (from CentOS 7.6 at least, but likely earlier too) seems to set log_mtts_per_seg to 3 on all of my nodes. You can see from my earlier reply that this driver already takes total memory into account.

OFED, on the other hand, seems to set that value to 0 by default. (I was looking at a VM with OFED 4.5 on it; I still have to confirm the host version, etc.)

So I’m not a whole lot less confused, but I’m reasonably confident that the right move with the CentOS kernel driver is not to set that value, as the defaults are sane.
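
For anyone who wants to compare nodes the same way, here is a minimal sketch for reading the values the loaded driver reports. It assumes the mlx4_core module exposes its parameters under /sys/module/mlx4_core/parameters (the usual sysfs location for module parameters); the function name read_mlx4_param is just something made up for illustration, and the reported value may not reflect what the driver computes internally.

```python
from pathlib import Path


def read_mlx4_param(name, base="/sys/module/mlx4_core/parameters"):
    """Return an mlx4_core module parameter as an int, or None if unavailable.

    `base` is an assumption about where the parameters live; it is an
    argument so the function can be pointed at another directory.
    """
    path = Path(base) / name
    try:
        return int(path.read_text().strip())
    except (FileNotFoundError, PermissionError, ValueError):
        # Module not loaded, parameter not exposed, or unexpected content.
        return None


if __name__ == "__main__":
    for param in ("log_num_mtt", "log_mtts_per_seg"):
        print(param, "=", read_mlx4_param(param))
```

Running this on a node with the CentOS kernel driver and on one with OFED should show whether the two really differ on a given system.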

--
____
|| \\UTGERS,       |---------------------------*O*---------------------------
||_// the State     |         Ryan Novosielski - novosirj at rutgers.edu
|| \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
||  \\    of NJ     | Office of Advanced Research Computing - MSB C630, Newark
    `'

On Jul 9, 2019, at 21:25, Jonathan Engwall <engwalljonathanthereal at gmail.com> wrote:

Hello,
They make a recommendation: https://community.mellanox.com/s/article/howto-increase-memory-size-used-by-mellanox-adapters
Interesting stuff.
Jonathan Engwall

On Tue, Jul 9, 2019 at 1:13 PM Ryan Novosielski <novosirj at rutgers.edu> wrote:
Hi all,

There seems to be a whole lot of misinformation out there about the appropriate setting for the log_mtts_per_seg parameter in the mlx4 driver. Some folks suggest that it’s different between the RHEL/kernel.org-provided driver and Mellanox OFED, some suggest it’s been fixed such that the default can register 2x the total memory on a node since at least RHEL 6.6 (and so one would assume even longer ago in OFED), and some places seem to still be carrying the setting forward because “maybe it matters.”

Is anyone here sure of the current state? I’ll probably read the source code if not, but I’d like to spare myself the hassle.
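
For reference, the "2x the total memory" check can be sketched with the rule of thumb that is commonly quoted for this driver: registerable memory = 2^log_num_mtt * 2^log_mtts_per_seg * page size. This is a sketch under that assumption, not something confirmed from the driver source in this thread; the function names and the 4 KiB page size default are illustrative.

```python
def max_registerable_bytes(log_num_mtt, log_mtts_per_seg, page_size=4096):
    """Commonly quoted estimate of the memory the HCA can register.

    Assumed formula: 2**log_num_mtt * 2**log_mtts_per_seg * page_size.
    """
    return (1 << log_num_mtt) * (1 << log_mtts_per_seg) * page_size


def covers_twice_ram(log_num_mtt, log_mtts_per_seg, ram_bytes, page_size=4096):
    """True if the estimate is at least twice the node's physical memory."""
    limit = max_registerable_bytes(log_num_mtt, log_mtts_per_seg, page_size)
    return limit >= 2 * ram_bytes


if __name__ == "__main__":
    GiB = 1 << 30
    # Example inputs only: log_num_mtt=20, log_mtts_per_seg=3, 4 KiB pages.
    print(max_registerable_bytes(20, 3) // GiB, "GiB")
    print(covers_twice_ram(20, 3, ram_bytes=16 * GiB))
```

With those example inputs the estimate is 2^35 bytes (32 GiB), which would cover twice the RAM of a 16 GiB node but not of a 64 GiB one.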

