[Beowulf] Anybody here still use SystemImager?

David Mathog mathog at caltech.edu
Fri Mar 1 13:23:54 PST 2019


> Joe Landman wrote
> 
> 
> On 2/27/19 9:08 PM, David Mathog wrote:
>> Joe Landman wrote:
> [...]
>> 
>> I'm about 98% of the way there now, with a mashup of parts from boel
>> and Centos 7.
>> The initrd is pretty large though.
>> 
>> Wasted most of a day on a mysterious issue with "sh" (busybox) not
>> responding to the keyboard with a 3.10.108 kernel built starting from
>> the boel config, but it would respond using the same initrd and a
>> stock Centos 7 kernel.  So 3.10.108 was recompiled with the Centos 7
>> config (which makes WAY too many modules for an initrd) with the
>> network drivers built into the kernel.  This fixes that problem but I
>> could not tell you why.
> 
> This is a driver issue.  Likely you aren't including the hid components
> in your initramfs, or built into the kernel.
> 
> lsmod | grep hid
> mac_hid                16384  0
> hid_generic            16384  0
> usbhid                 49152  0
> hid                   118784  2 usbhid,hid_generic

That was part of it.
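
One way to catch this before booting is to confirm that the modules named
in the lsmod output above actually exist in the module tree that goes
into the initrd.  What follows is a minimal sketch, not part of BOEL or
SystemImager: missing_modules is an illustrative helper name and the path
in the example is hypothetical.  It only looks for module files, so a
driver built into the kernel will be reported as missing.

```shell
# Print each module name for which no .ko file (optionally compressed)
# exists under the given module tree.  Module file names may use '-'
# where lsmod shows '_', so both spellings are accepted.
missing_modules() {
    tree=$1
    shift
    for m in "$@"; do
        # Turn every '_' or '-' into a glob class matching either one.
        pat=$(printf '%s' "$m" | sed 's/[_-]/[_-]/g')
        if [ -z "$(find "$tree" -type f -name "${pat}.ko*" 2>/dev/null | head -n 1)" ]; then
            printf '%s\n' "$m"
        fi
    done
}

# Example (hypothetical path):
#   missing_modules ./initrd_root/lib/modules/3.10.108 hid hid_generic usbhid
```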

Ran into another issue with 3.10.108 in the interim: the stock kernel
didn't support "v5" xfs, which is what mkfs.xfs from Centos 7 uses.
RH/Centos must have backported patches into their kernel.  Finally tried
a generic 3.14.78 kernel to resolve the xfs problem, and it turned out
that

yes "" | make oldconfig

starting from the 3.10.108 .config automatically enabled 
CONFIG_MODVERSIONS, and that in turn broke xfs (and many other modules), 
which was indicated by 291 messages like this in the build log:

WARNING: "generic_getxattr" [fs/xfs/xfs.ko] has no CRC!

Changed .config like this

# CONFIG_MODVERSIONS is not set

rebuilt, and the terminal and all the devices are working now.  When 
sd_mod loads there is still a warning about a missing crc-t10dif symbol, 
even though crc-t10dif.ko.xz is present.  But the disk still mounts and 
works via ahci.
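
The check and the .config change described above can be scripted.  This
is a minimal sketch under stated assumptions: the function names are
illustrative, and the build log is assumed to be a saved copy of the make
output.  The second function writes to stdout instead of editing .config
in place, so the result can be inspected first.

```shell
# Count the "has no CRC!" warnings in a saved kernel build log.
count_crc_warnings() {
    grep -c 'has no CRC!' "$1"
}

# Print a copy of a .config in which the given option reads "is not set".
# The original file is not modified.
disable_kconfig_option() {
    opt=$1
    file=$2
    sed "s/^${opt}=.*/# ${opt} is not set/" "$file"
}

# Example:
#   count_crc_warnings build.log
#   disable_kconfig_option CONFIG_MODVERSIONS .config > .config.new
```

After swapping in the edited file, running make oldconfig again lets
kconfig settle any options that depended on the one that was turned off.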

I had to retain the modules.pcimap file (verbatim) from the last version
of BOEL so that it could find all the devices via:

  pci-automod --hwlist --class storage --class net \
     --class serial --class bridge > /tmp/hardware.lst
  MODULES=`cat /tmp/hardware.lst | sed 's/  */ /g' | cut -d' ' -f4`

There seems to no longer be a tool to automatically generate a 
modules.pcimap file, and I don't know how this sort of "detect all 
devices and load their modules" is supposed to be done without it.  
Anyway, the old method works for now.
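
One commonly used replacement is the modalias mechanism: the kernel
exports a modalias string for each PCI device in sysfs, depmod records
the matching aliases in modules.alias, and modprobe can then be given the
alias directly.  A minimal sketch of that approach follows.  It assumes
sysfs is mounted at /sys and that the modprobe in the initrd resolves
aliases; list_pci_modaliases is an illustrative name, and unlike the
pci-automod call above it does not filter by device class.

```shell
# Print the unique modalias strings of all PCI devices.
# The optional argument overrides the sysfs mount point (default /sys).
list_pci_modaliases() {
    sysfs=${1:-/sys}
    for f in "$sysfs"/bus/pci/devices/*/modalias; do
        [ -r "$f" ] && cat "$f"
    done | sort -u
}

# Loading step (left commented out so the sketch has no side effects):
#   for a in $(list_pci_modaliases); do modprobe -q "$a"; done
```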

> In both cases, it is a driver issue.  For large initramfs, it varies
> from about 710MB for everything and the kitchen sink in debian9, to
> about 1.5GB for CentOS7.
> 
> root@zoidberg:/data/tiburon/diskless/images/nyble# ls -alF centos7/
> total 2736520
> drwxr-xr-x 2 root root        138 Jun 15  2018 ./
> drwxr-xr-x 4 root root         36 Apr 25  2018 ../
> -rw-r--r-- 1 root root 1436202727 Jun  5  2018 initramfs-4.16.13.nlytiq.img
> -rw-r--r-- 1 root root 1356007691 Jun 15  2018 initramfs-4.16.15.nlytiq.img
> -rw-r--r-- 1 root root    5023504 Jun  5  2018 vmlinuz-4.16.13.nlytiq
> -rw-r--r-- 1 root root    4953872 Jun 15  2018 vmlinuz-4.16.15.nlytiq
> 

These are the sizes if the entire /lib/modules/3.14.78 is stored in the
binaries file.  This is a three part load: the kernel, a small initrd,
and then everything else in the binaries file.  busybox, dhclient, and
just a few others are in the initrd, while everything else is in the
tar.gz.

   4264895 Mar  1 11:44 C7knl_initrd.img
254743094 Mar  1 11:36 C7knl_binaries.tar.gz
   5370032 Mar  1 11:31 C7knl_3.14.78

I could remove most of the drivers from /lib/modules for this
application and trim the binaries file down to ~50 MB.  Not sure if it is
worth the trouble though, since even the big one loads in about 2 seconds
on 1000baseT.
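
The 2 second figure matches simple wire-rate arithmetic on the file size
listed above.  A minimal sketch of that calculation, ignoring all
protocol overhead (so it is a lower bound); transfer_seconds is an
illustrative helper name.

```shell
# Seconds needed to move BYTES over a link of MBITS megabits per second,
# ignoring all protocol overhead.
transfer_seconds() {
    awk -v bytes="$1" -v mbits="$2" \
        'BEGIN { printf "%.2f\n", bytes * 8 / (mbits * 1000000) }'
}

transfer_seconds 254743094 1000   # full binaries file: prints 2.04
```

On the same assumptions a file trimmed to about 50 MB would take roughly
0.4 seconds, so the saving is under 2 seconds per boot.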

Is there a small dhclient for this sort of application around somewhere?
I kept the one from BOEL, which is quite old, because the dhclient in
Centos 7 has so many library dependencies:

ldd `which dhclient` | wc -l
45

vs.

ldd sbin/dhclient
         linux-vdso.so.1 =>  (0x00007ffebf9f7000)
         libc.so.6 => /lib64/libc.so.6 (0x00007f57f38b0000)
         /lib64/ld-linux-x86-64.so.2 (0x00007f57f3c7d000)

The old binary is also 3x smaller.  (I suspect a statically linked 
modern dhclient would be pretty big.)
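
When comparing candidates it can help to list the library files a binary
would pull into the initrd rather than just counting ldd lines.  A
minimal sketch that filters ldd output down to absolute library paths; it
assumes the ldd output format shown above, and ldd_paths is an
illustrative helper name.

```shell
# Read ldd output on stdin and print the absolute path of each library.
# Entries without a file on disk (for example linux-vdso) are skipped.
ldd_paths() {
    awk '$2 == "=>" && $3 ~ /^\// { print $3; next }
         $1 ~ /^\//               { print $1 }'
}

# Example:
#   ldd sbin/dhclient | ldd_paths
```

The resulting list can be fed to wc -l for a count that excludes the vdso
entry, or to a copy step when populating the initrd tree.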

Thanks,

David Mathog
mathog at caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech

