[Beowulf] Packaging

Bogdan Costescu Bogdan.Costescu at iwr.uni-heidelberg.de
Thu Sep 29 05:34:28 PDT 2005


On Wed, 28 Sep 2005, Greg M. Kurtzer wrote:

>> But your MPI implementation likely requires the X libraries, and 
>> perhaps a few interpreters
>
> Can you elaborate on why an MPI implementation would require Xlibs?

I can't read Donald's mind, but I suppose that he refers not to the 
communication libraries themselves, but to the associated tools, like 
profiling or debugging ones... which points to a packaging problem: 
why bundle the two when their purposes are clearly different (runtime 
vs. development)?

But this raises another packaging vs. usage question: how 
fine-grained can the packages be made? Let's consider the 2 extremes:

1. Put everything in one package.
   Advantages:
     easy to install (rpm -i M.rpm or yum install M)
     everything is installed (so no users complaining of missing parts)
   Disadvantages:
     large(r) package size (bad for ramdisks, speed of transfer)
     dependencies (the package requires everything that any of its
       parts needs, e.g. X libs for the tools)

2. Put any files that have any chance of being used for a different 
purpose or in a different way in a different package, like libraries 
in one package, startup/runtime binaries in another, CLI tools in yet 
another, X tools in another, Java (and implicitly X) tools in another, 
etc. and spend time finding proper dependencies for each package.
   Advantages:
     small individual package size
     dependencies (each package requires only what it really needs)
   Disadvantages:
     difficult to install in the right order (yum helps here)
     packages might be missed

The last 2 disadvantages are actually those that IMHO make a negative 
impression. The fact that dependency resolution can be done 
automatically is not the whole story, even in the (ideal) situation 
where all software (including end-user applications) is packaged. 
There is still the issue of how users want to use the various parts of 
the software, which are now in different packages and might not be 
installed by the admin, because they were not deemed important enough 
or because the dependency chain did not bring them in automatically. 
For example, running an MPI application requires only the runtime 
parts of the MPI distribution, which means that the development parts, 
documentation, etc. might not be installed. At this point the admin 
either wants to be on the safe side or is confused by the multitude of 
packages, and installs most (or all) packages anyway - and then 
where's the advantage over the single package that provides 
everything?
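
To make the dependency-chain point concrete, here is a minimal Python 
sketch of what a resolver does: it installs the transitive closure of 
what was requested and nothing else. The package names (mpi-libs, 
mpi-runtime, mpi-devel, mpi-doc, mpi-xtools, xlibs, app) and the 
dependency table are hypothetical, invented for illustration; they do 
not describe any real MPI distribution or any real resolver.

```python
# Hypothetical fine-grained split of an MPI distribution.
# Keys are package names, values are the packages they require.
REQUIRES = {
    "mpi-libs": [],
    "mpi-runtime": ["mpi-libs"],
    "mpi-devel": ["mpi-libs"],
    "mpi-doc": [],
    "xlibs": [],
    "mpi-xtools": ["mpi-libs", "xlibs"],
    "app": ["mpi-runtime"],  # an end-user MPI application
}


def closure(requested, requires=REQUIRES):
    """Return the set of packages installed when `requested` is asked for."""
    installed = set()
    stack = list(requested)
    while stack:
        pkg = stack.pop()
        if pkg in installed:
            continue
        installed.add(pkg)
        stack.extend(requires[pkg])
    return installed


installed = closure(["app"])
# Everything that exists but was never pulled in by the chain.
missing = sorted(set(REQUIRES) - installed)

print(sorted(installed))
print(missing)
```

Installing only the application brings in the runtime and the 
libraries; the development, documentation and X tool packages stay 
out unless the admin asks for them explicitly.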

An intermediate solution is to add some meta-packages that have no 
content, only dependencies on the fine-grained packages. This helps a 
bit those admins who know in principle what is needed, but a 
meta-package might still bring in too many dependencies and confuse a 
clueless admin even more. [the same applies to yum groups]
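
A small Python sketch of the meta-package idea, under the same kind 
of assumptions: a meta-package is modelled as an entry with 
requirements only, and the resolver installs the transitive closure. 
All package and meta-package names (mpi-compute-node, mpi-everything, 
etc.) are made up for illustration.

```python
# Hypothetical packages; values are the packages each one requires.
REQUIRES = {
    "mpi-libs": [],
    "mpi-runtime": ["mpi-libs"],
    "mpi-devel": ["mpi-libs"],
    "xlibs": [],
    "java": ["xlibs"],
    "mpi-xtools": ["mpi-libs", "xlibs"],
    "mpi-javatools": ["mpi-libs", "java"],
    # Meta-packages (assumed names): no files, only requirements.
    "mpi-compute-node": ["mpi-runtime"],
    "mpi-everything": ["mpi-runtime", "mpi-devel",
                       "mpi-xtools", "mpi-javatools"],
}


def closure(requested):
    """Return every package installed when `requested` is asked for."""
    installed = set()
    stack = list(requested)
    while stack:
        pkg = stack.pop()
        if pkg not in installed:
            installed.add(pkg)
            stack.extend(REQUIRES[pkg])
    return installed


compute = closure(["mpi-compute-node"])
everything = closure(["mpi-everything"])

print(sorted(compute))
print(len(everything), "xlibs" in everything, "java" in everything)
```

The narrow meta-package stays small, while the catch-all one drags in 
X and Java again, which is the single-package situation under another 
name.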

> No need to replicate technologies that already exist for this purpose.

IMHO, it's even worse to use existing technologies in an improper or 
inefficient way: just because you can doesn't mean you have to. As an 
example, I'll take one of your commands:

>   # yum --installroot /vnfs/default install/remove moooo

If you want to install a number-crunching application that doesn't 
need any fancy libs, it's much faster to use 'rpm -i' than 'yum 
install', as yum first has to load the repository metadata and run 
the dependency resolution. IMHO running 'yum update' nightly on 
compute nodes is a waste, especially when you make the effort of 
getting all kinds of optimized libraries and optimizing compilers to 
speed up the application that runs on the very same nodes.

-- 
Bogdan Costescu

IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: Bogdan.Costescu at IWR.Uni-Heidelberg.De


