[Beowulf] O'Reilly Clusters Book Review
gotero at linuxprophet.com
Fri Feb 25 00:23:31 PST 2005
My review of O'Reilly's latest clusters book published at HPCwire
> 'Crazy Talk' Clutters New Cluster Book
> Glen Otero, Linux Prophet
> When my colleagues and I heard that O'Reilly was releasing another
> cluster book ("High Performance Linux Clusters with OSCAR, Rocks,
> openMosix & MPI"), we knew it would not turn out well. One of my
> colleagues even said, "It's going to be written by some guy that
> doesn't know anything and [gets all excited] over clusters."
> Why such a pessimistic prediction?
> For one, it was uttered by the same cluster expert that O'Reilly
> ignored while producing their first cluster book debacle several
> years ago. When told that their first book ("Building Linux Clusters" by
> David Spector) should be scrapped and rewritten, O'Reilly ignored
> their reviewers. The advice only came from the knowledgeable folks at
> VA Linux, *the* cluster company at that time. But what does VA Linux
> know? It's O'Reilly, they obviously know better.
> The first O'Reilly cluster book was a complete disaster. I wrote a
> scathing review of it for Linux Journal in 2000. Completely void of
> anything useful, the book and included software were simply not
> finished. It was like reading a rough draft. Totally embarrassed, and
> suddenly void of hubris, O'Reilly apologized to its audience and
> pulled the book from print.
> Not satisfied to sit around pointing fingers and complaining, I told
> O'Reilly I would help them with their next cluster book attempt, if
> there even was one. Before long, I signed a contract to write a
> clusters book for O'Reilly. But in their infinite wisdom, they didn't
> like the first few chapters that I submitted. Although I had gotten
> other cluster experts to review what I had written, O'Reilly didn't
> bother to get any experts to review what I was writing. They just
> didn't like it, so they dismissed it out of hand. Needless to say the
> "we know better" attitude was back, and that ended the contract.
> Which brings us to present day. This latest cluster book suffers from
> the same brain damaged, hubris-driven process at O'Reilly. Just like
> the first book, it's written by a virtual unknown in the cluster
> community (Joseph D. Sloan) and comes across as having been written
> in a vacuum.
> Let's start with the book's title, "High Performance Linux Clusters
> with OSCAR, Rocks, openMosix & MPI." There's nothing high-performance
> about this book because there's no discussion of using any high
> performance networks like Myrinet, Infiniband, or Quadrics outside of
> four paragraphs on page 40. There are so many ill-informed sweeping
> generalizations made about cluster networks on that page that I threw
> the book against the wall when I read them. For example, Quadrics and
> Infiniband are clearly established networking technologies, not
> "emerging," as the author believes. Sloan obviously hasn't attended a
> Supercomputing conference in the last several years. Unfortunately,
> the rest of the book is rife with several inaccurate cluster
> oversimplifications and incorrect definitions of terms like single
> system image (SSI) and virtual machine interface (VMI). The
> "beginner's guide" design of the book is no excuse for inaccuracies
> and oversimplifications.
> In my eyes, this book was doomed for the trash after page 8. Sloan
> states that the term "Beowulf" is a politically charged term that
> would be avoided in the book. That is the most ridiculous thing I
> have ever heard. It's impossible to take that comment seriously,
> especially since the author doesn't even take the time to properly
> define a Beowulf. For these reasons alone, I can't take this book
> seriously. I've thrown back my share of adult beverages with Don
> Becker, and trust me when I say that the political nature of Beowulf
> has never come up. Adding to the confusion, the phrase "more
> traditional Beowulf-style cluster" is then used on page 63. I hope
> you'll understand why I think this book is schizophrenic at best.
> Defining a Beowulf shouldn't have been too difficult for Sloan. He
> could have used a term that he introduced on page 10, "asymmetric
> cluster." But I guess it's too much to ask that the Beowulf project,
> Tom Sterling and Don Becker's brainchild that started the high
> performance cluster phenomenon, be properly described and defined in
> a clusters book. By the way, I've never heard the term "asymmetric
> architecture" used when describing clusters. And, outside this book,
> you won't either.
> After page 8, it's apparent that the author has nothing original to
> offer and is going to regurgitate what has already been written about
> clusters. There is absolutely no value in this because the online
> documentation for all of the cluster projects covered by the author
> is far more informative than what is included in the book. For example,
> while screenshots of a cluster install are included in the online
> Rocks documentation, they are omitted in the book. Furthermore, after
> regurgitating much of the online Rocks documentation, the author
> doesn't offer any additional helpful hints or troubleshooting advice.
> As someone who runs a company that provides and supports cluster
> software based on Rocks, I can tell you that there are plenty of
> pitfalls that should have been mentioned.
> This underscores my major complaint with this book. There's nothing
> new, nothing novel and no real help offered. Everything is just laid
> out superficially in front of the reader for them to make the right
> cluster decision. The book should guide the cluster decision-making
> process, but it only offers a bunch of questions -- with no
> substantial answers.
> Sloan even admits on page 91 that there is a very detailed set of
> installation instructions for OSCAR, including screen shots,
> online. So why is this book necessary again? Oh yeah, the author is
> supposed to help the reader decide if OSCAR, or any cluster toolkit
> for that matter, is right for the reader. Unfortunately, no help of
> any kind is offered.
> The typos and omissions weren't rampant this time, but the errors I
> found on pages 76, 123, 127, 130, and 136 provided nasty flashbacks
> to the first O'Reilly book. Good thing I resigned myself to do a shot of
> tequila after every typo I found. It dulled the pain this book
> OK. "Part I -- An Introduction to Clusters" is just inaccurate and
> infuriating. "Part II -- Getting Started Quickly" contains recycled
> and reformatted content easily found for free online. "Part III --
> Building Custom Clusters" isn't really about building custom
> clusters, but looks more closely at some software that was glossed over in
> Parts I & II. While I don't agree with the inclusion of the Parallel
> Virtual File System (PVFS) and the omission of Sun Grid Engine in Part III,
> I'm sure this can be chalked up to one of the tough decisions the
> author had to make, like the omission of PVM and Condor from the
> book. "Part IV -- Cluster Programming" is actually a very good introduction
> to programming, debugging, and profiling MPI programs.
> It's obvious that this book has no clear identity. It's like a 5th
> grader's book report: a lifeless facsimile of what's been read,
> totally void of originality, wisdom or topic advancement. But it's a
> quick read because it uses small words.
> Should I be this harsh? After all, cluster computing is a complex
> subject where the answer to most questions is "it depends." However,
> I believe that O'Reilly owed us an excellent book after their first
> cluster gaffe, so I'm disappointed that O'Reilly took the easy way
> out by reorganizing and watering down documentation that is available
> elsewhere. Even the content in the exemplary Part IV can be found in
> several other places. It's just a lot less technical and intimidating.
> There are better ways to write a clusters book. I know because I've
> read several cluster book outlines by members of the cluster
> intelligentsia that would have been better than this offering. So I'm
> not going easy on O'Reilly, no matter how good their intentions. The
> cluster community has a difficult enough time assisting people with
> clusters without books like this dynamiting the proverbial cluster
> well. The statement on page 28, "...benchmarking is probably a
> meaningless activity and waste of time," is just plain wrong and
> demonstrates a glaring lack of cluster understanding.
> If you really want to learn about clusters, pick up a copy of
> Sterling's "Beowulf Cluster Computing with Linux," 2nd edition, or
> check out Warewulf, Rocks, OSCAR, OpenMosix, and ClusterWorld online.
> You could join a mailing list, like the Beowulf mailing list, and
> subscribe to ClusterWorld Magazine. This is where the creators and
> maintainers of all that is clustering hang out, announce, debate,
> rant, create, lurk, help, and publish. If you want to be part of
> clustering's future, then you'll check out the community's Cluster
> Agenda and attend this year's ClusterWorld conference.
> Glen Otero received his Ph.D. in Microbiology and Immunology from
> in 1995 and immediately escaped to the more temperate climes and
> better surf in San Diego. After some research on the molecular and
> cellular biology of HIV and Herpes viruses at the Salk Institute for
> Biological Sciences, Glen left the wet lab research bench in 1999.
> Although he left the research bench, he didn't leave science
> altogether, traveling all the way across the street to the San Diego
> Supercomputer Center (SDSC) for a stint at the Protein Data Bank. It
> was while at SDSC that Glen had his Linux clusters and bioinformatics
> epiphany. Soon after that illuminating event, Glen founded Linux
> Prophet, a bioinformatics consultancy specializing in the
> implementation, design, and deployment of Linux Beowulf clusters in
> the life sciences. Late in 2002 Linux Prophet evolved into Callident,
> a Linux cluster software and high performance computing company.
Glen Otero Ph.D.