support - Re: [Beowulf] Redmond is at it, again

Alvin Oga alvin at
Wed Jun 2 07:13:48 PDT 2004

hi ya "clusters"

On Wed, 26 May 2004, Roger L. Smith wrote:

> On Tue, 25 May 2004, Jeffrey B. Layton wrote:
> > Redhat, Suse, etc. have gone to a per-CPU price for license
> > and support driving up the costs of installing the OS on a cluster.
> > In fact, the last quote I got for a 72 node cluster averaged about
> > $400 a node!
> Amen!
> I recently talked to RedHat about purchasing RHEL for my systems.  Based
> on my needs for two large clusters and a smattering of desktops, they were
> asking nearly $10,000 PER YEAR for it, and that was educational pricing!

i donno about you folks, but i think $400/node is super super cheap
for support ..
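
for a rough sense of scale, here is what the quoted figures add up to. the 72 nodes and $400/node come from the quote above; treating it as a flat per-node charge is an assumption, since the quote doesn't say what period or services it covers, and the function name is made up for illustration.

```python
# rough total from the figures quoted above
# assumption: $400 is a flat per-node charge; the quote does not say
# what period or what services it actually covers
NODES = 72
COST_PER_NODE = 400  # dollars, the quoted average


def cluster_support_cost(nodes, cost_per_node):
    """Total support cost for the cluster at a flat per-node rate."""
    return nodes * cost_per_node


total = cluster_support_cost(NODES, COST_PER_NODE)
print(f"{NODES} nodes x ${COST_PER_NODE}/node = ${total:,}")
```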

	- but, to be fair, i think one would have to say what the
 	"$400/node" support consists of

basically ... "support" should be free if:

	- it's just applying (free) patches
	- it's building the boxes
	- it's keeping the boxes running w/o any major changes/differences
	- it's under warranty ( hardware should NOT fail )

and conversely, "support" is NOT free if:
	- things get changed/upgraded to keep up with the times
	- you want things tested before you get the "tested patches"
	- things were separated into their various line items on
	the original purchase order

to me .. the support costs of building and maintaining the cluster are:

  - deciding which cluster to build/buy

	- pick the right set of hw ( cpu, mem, disk, motherboard )
	- pick the right set of sw ( cluster apps )
	( both of the above can be done by the end user and usually
	( are part of the specs

	- applying initial patches .... good idea to do, as it might, and
	probably will, fix existing known problems
	( part of the cluster install process )

	- it's also cheaper to support hardware you built yourself
	than to inherit someone else's "hardware", picked out and built
	randomly by the $5/hr tech at some big outfit
	( it's a non-trivial problem .. getting "good" hardware )

  - sw support

	- applying patches over the course of the year ..
	( that is an "i want it my way" issue ...
		- if you apply new patches to a working system,
		you can break it ( no longer works at all )

		- if you apply new patches, you can change the results
		of prior tests

		- it's probably NOT a good idea to change things if
		repeatability and predictability are important

	- adding new sw apps over the course of the year ..
		- requires prior testing that the new app will 
		work with the rest of the apps already installed
		and all the dependencies are working properly
		between the libraries and other apps

		( major pain in the butt )
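
one way to keep an eye on the repeatability problem above is to snapshot what is installed before and after patching and compare the two. a minimal sketch, assuming each snapshot is just a package-name-to-version mapping; the function name, package names and versions below are all hypothetical, and how you collect the snapshot depends on your distro's package tool.

```python
def diff_manifests(before, after):
    """Return (added, removed, changed) between two package snapshots.

    Each snapshot maps package name -> version string.
    """
    added = {p: v for p, v in after.items() if p not in before}
    removed = {p: v for p, v in before.items() if p not in after}
    changed = {
        p: (before[p], after[p])
        for p in before
        if p in after and before[p] != after[p]
    }
    return added, removed, changed


# hypothetical snapshots; names and versions are made up for illustration
before = {"libfoo": "1.0", "solver": "2.4", "mpi-app": "0.9"}
after = {"libfoo": "1.1", "solver": "2.4", "mpi-app": "0.9", "newtool": "3.2"}

added, removed, changed = diff_manifests(before, after)
print("added:", added)
print("removed:", removed)
print("changed:", changed)
```

if "changed" is non-empty on a node whose earlier results you still care about, that is the cue to rerun the prior tests before trusting new numbers.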

   - hw support ?? 
	- if the node dies, is it under parts-replacement warranty
	to be swapped out and replaced within 24hrs ??
	( warranties from cpu/disk/mb/mem manufacturers run 1yr - 5yr
	( depending on the item, but do NOT include 24hr hot swap

	- whether your cluster can survive a node failure is a
	separate "cluster spec" issue, and the cluster should obviously
	survive failures of one or more nodes
		( ie.... 24hr hot swap hardware replacement is NOT worth
		( it, especially since very very few people can properly
		( remotely diagnose the problem, and send the "right"
		( replacement part in a 24hr period

	- consumer COTS parts have a market lifespan of about 2-3 months,
	after which you'd be lucky to get an identical brand new
	replacement part, and worse still, replacement parts are typically
	someone else's returned part :-)
		( check the replacement parts carefully )

	- in my book, hardware should NOT fail :-)
	( with the possible exception of fans and power supply .. )

   - mw support ??
	- how do i do this ???
	- how do i do that ???
	( these unexpected support issues are what drive both parties
	( bonkers, and their support costs can be controlled, or get out
	( of control if the "support" is free

   - on site support vs email support vs web-based support vs phone
     support, shall be another day's ballgame

- my comment is: if all the support folks are giving you is patches that
  are freely available by scouring the net, then "that support" should be
  FREE  and hardware should NOT fail .. 


have fun
