[Beowulf] Threaded code

Mark Hahn hahn at physics.mcmaster.ca
Tue Aug 17 09:16:59 PDT 2004

> Ok, I haven't used atlas in a while.  Are you saying that it hardcodes 
> the number of processors into the code itself?  Wouldn't this 

yes.  I didn't read its optimizer or its ncpus-detection code,
but this is completely in line with the whole point of atlas,
which is to tune the library to suit the particular configuration.

> effectively render binary RPMs of Atlas completely useless?  Would also 

no, just very specific.  for instance, you certainly shouldn't run Atlas
on a machine whose caches differ in size from those of the machine the
library was built on.

> make building static binaries (don't know if it is possible with Atlas 
> libs) a waste of time if you need a portable binary.

Atlas is inherently non-portable, at least to machines 
which have a different configuration.

I think this is different from fftw, which at runtime dynamically
generates a semi-optimal strategy of combining precompiled units.
if Atlas did this, it would need to compile code blocked for a whole
range of cache sizes, for instance.  possible, but I don't believe 
it does that.  naively, it could simply parameterize for NTHREADS,
but that would probably give trouble on machines with a shared cache,
where NTHREADS would interact with the blocking: threads sharing a cache
each get a smaller share of it, so the best block size changes too.
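
to make the cache/thread interaction concrete, here is a minimal sketch
in Python.  it is not Atlas code: the function names, the 8-byte element
size and the "three tiles must fit in this thread's share of the cache"
rule are all assumptions made for illustration.  Atlas itself picks its
parameters by tuning at build time, as described above.

```python
import math


def block_size(cache_bytes, nthreads_sharing_cache=1, elem_bytes=8):
    """Hypothetical heuristic: the largest square tile edge such that three
    tiles (one each from A, B and C) fit in one thread's share of the cache."""
    per_thread = cache_bytes // nthreads_sharing_cache
    return max(1, math.isqrt(per_thread // (3 * elem_bytes)))


def blocked_matmul(a, b, n, nb):
    """Multiply two n x n matrices (lists of lists) using nb x nb tiles."""
    c = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, nb):
        for kk in range(0, n, nb):
            for jj in range(0, n, nb):
                # work on one tile at a time so its operands stay cache-resident
                for i in range(ii, min(ii + nb, n)):
                    for k in range(kk, min(kk + nb, n)):
                        aik = a[i][k]
                        for j in range(jj, min(jj + nb, n)):
                            c[i][j] += aik * b[k][j]
    return c


if __name__ == "__main__":
    cache = 512 * 1024  # assumed cache size in bytes, purely illustrative
    print("private cache:", block_size(cache))
    print("shared by two threads:", block_size(cache, nthreads_sharing_cache=2))
```

under this heuristic the tile edge shrinks as more threads share the same
cache, so a block size fixed at build time for one thread count is no
longer the right one when NTHREADS changes.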

regards, mark.
