[Beowulf] Re: OT: informatics software for linux clusters

David Mathog mathog at caltech.edu
Mon May 15 12:53:32 PDT 2006

>    Scalable Informatics has released Scalable HMMer, an optimized 
> version of HMMer 2.3.2 that is 1.6-2.5x faster per node on benchmark 
> tests run on Opteron systems.

Did you remove the memory organization changes SE put in to make
it run better on the Altivec Macs?  Those really made life hard when I
was trying to optimize this code to run on our Beowulf with Athlon MP
processors.  The problem was that the P7Viterbi data structures didn't
fit entirely into cache (no matter how they were organized), and this
resulted in "toxic" query lengths that ran several times slower than
nearby lengths.  That is, take a query sequence of length 1000, run
hmmpfam, nip off the last character, run it again, etc.  Execution
time was anything but a smooth function of query length.  Working
around the Altivec stuff helped some but didn't entirely eliminate
the effect.  Probably the bigger cache on the Opteron would eliminate
this effect for smaller sequences, but I'm guessing you could still
run into it with a long query.
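
The length scan described above (run hmmpfam, trim one residue, run
again) could be scripted roughly as follows.  This is only a sketch,
not the script used for the original measurements: the function names,
the temporary FASTA file name and the min_len parameter are all
hypothetical, and it assumes an HMMER 2.x hmmpfam on the PATH, invoked
as "hmmpfam <hmm database> <sequence file>".

```python
import subprocess
import tempfile
import time
from pathlib import Path


def truncations(sequence, min_len):
    """Yield the full sequence, then each shorter prefix, down to min_len."""
    for length in range(len(sequence), min_len - 1, -1):
        yield sequence[:length]


def write_fasta(path, name, sequence, width=60):
    """Write one sequence as a FASTA record with fixed-width lines."""
    lines = [f">{name}"]
    lines += [sequence[i:i + width] for i in range(0, len(sequence), width)]
    Path(path).write_text("\n".join(lines) + "\n")


def time_command(argv):
    """Return the wall-clock seconds taken by a single run of argv."""
    start = time.perf_counter()
    subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)
    return time.perf_counter() - start


def scan_query_lengths(sequence, hmm_db, min_len, timer=time_command):
    """Time hmmpfam on every prefix; return a list of (length, seconds)."""
    results = []
    with tempfile.TemporaryDirectory() as tmp:
        query = Path(tmp) / "query.fa"  # hypothetical scratch file name
        for prefix in truncations(sequence, min_len):
            write_fasta(query, f"query_len{len(prefix)}", prefix)
            # Assumed HMMER 2.x argument order: HMM database, then sequences.
            seconds = timer(["hmmpfam", str(hmm_db), str(query)])
            results.append((len(prefix), seconds))
    return results
```

Plotting seconds against length from the returned list should show
the toxic lengths as spikes above an otherwise smooth curve.  A single
run per length is noisy, so in practice one would probably repeat each
measurement and keep the minimum.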

This has nothing to do with the parallel implementation, though; it
was a data size vs. cache size effect.


David Mathog
mathog at caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech
