[Beowulf] Nobody ever got fired for using Hadoop on a cluster

Adam DeConinck ajdecon at ajdecon.org
Wed Apr 24 13:00:10 PDT 2013


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


http://research.microsoft.com/pubs/163083/hotcbp12%20final.pdf

An interesting paper from Microsoft Research on the feasibility of
using single large-memory servers as a more cost-effective replacement
for Hadoop clusters. Especially since "Big Data" often isn't...

"However, evidence suggests that the majority of analytics jobs do not
process huge data sets. For example, as we will discuss in more detail 
later, at least two analytics production clusters (at Microsoft and
Yahoo) have median job input sizes under 14 GB, and 90%
of jobs on a Facebook cluster have input sizes under 100 GB."

It was published last year, but I've seen it pop up in a couple of
places recently and figured it might be interesting to this list. As
much as I usually prefer clusters for long-term scalability reasons,
I've seen a lot of cases where a single large-memory workstation made
more sense for both performance and simplicity.

Cheers,
Adam


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (Darwin)

iQEcBAEBAgAGBQJReDnIAAoJEGqxms5cZfz0g0gIAKx07C81I9SnJZq/r3CkQqpP
B4NHJ+WadNNpr0LlAY9WD7Q7D6URWqx4JjlkWENCOFtPjEl3yVZ2eFXEtr5dz0Sd
5EyBzdVtMuKc4Z14AXgcLDqFwEBWg6hgu+YVzQL3JUVSnBD9s14MxYz8cJC8VzKF
n5HxTJbZ7siErR+Dzsh/eGkGzb1hehzWV5Bw27oajhSRMAYwBNmbWQGRvtUxYdkp
uEaAahEH2tgZmZ6tEX+HkvNScIES8V7SF3jUWfzGaHj5wSNgtRhm9/5OZTlFSF9q
bv17m3IEgunrx7NBk2gSKgR0gUA99BlU9c2fPBCJ0fEp7ROzNF3Pea7OTpsYdBU=
=JwPq
-----END PGP SIGNATURE-----

