[Beowulf] Nobody ever got fired for using Hadoop on a cluster
ajdecon at ajdecon.org
Wed Apr 24 13:00:10 PDT 2013
An interesting paper from Microsoft Research on the feasibility of
using single large-memory servers as a more cost-effective replacement
for Hadoop clusters. Especially since "Big Data" often isn't...
"However, evidence suggests that the majority of analytics jobs do not
process huge data sets. For example, as we will discuss in more detail
later, at least two analytics production clusters (at Microsoft and
Yahoo) have median job input sizes under 14 GB, and 90%
of jobs on a Facebook cluster have input sizes under 100 GB."
It was published last year, but I've seen it pop up a couple places
recently and figured it might be interesting to this list. I know that
as much as I usually prefer clusters for long-term scalability reasons,
I've seen a lot of cases where a single large-memory workstation made
more sense for both performance and simplicity.