[Beowulf] $1,279-per-hour, 30,000-core cluster built on Amazon EC2 cloud

Rayson Ho raysonlogin at gmail.com
Tue Oct 4 07:55:39 PDT 2011


On Mon, Oct 3, 2011 at 3:21 PM, Robert G. Brown <rgb at phy.duke.edu> wrote:
> I would be very interested in seeing the
> detailed scaling of "fine grained parallel" applications on cloud
> resources -- one point that the talk made that I agree with is that
> embarrassingly parallel applications that require minimal I/O or IPCs
> will do well in a cloud where all that matters is how many instances you
> can run of jobs that don't talk to each other or need much access to
> data.  But what of jobs that require synchronous high speed
> communications?

Amazon (and I believe other cloud providers have something similar?)
introduced Cluster Compute Instances with 10 Gigabit Ethernet. For
traditional MPI workloads, though, the real advantage comes from HVM
(hardware-assisted virtualization), which cuts the communication
latency by quite a lot.
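
For a concrete sense of what "latency" means here: a ping-pong
microbenchmark is the usual way to measure it. Below is a minimal
sketch using mpi4py (my own illustration, not something from this
thread); run it with two ranks, e.g. "mpirun -np 2 python pingpong.py",
once on paravirtual and once on HVM instances, and compare:

    # Minimal MPI ping-pong latency sketch (assumes mpi4py is installed).
    from mpi4py import MPI
    import time

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    msg = bytearray(8)     # tiny message, so latency dominates, not bandwidth
    reps = 10000

    comm.Barrier()
    t0 = time.time()
    for _ in range(reps):
        if rank == 0:
            comm.Send(msg, dest=1)
            comm.Recv(msg, source=1)
        else:
            comm.Recv(msg, source=0)
            comm.Send(msg, dest=0)
    t1 = time.time()

    if rank == 0:
        # each rep is a full round trip; one-way latency is half of that
        print("one-way latency: %.1f us" % ((t1 - t0) / reps / 2 * 1e6))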


> What of jobs that require access to huge datasets?

Getting data into and out of the cloud is still a big problem, and the
highest-bandwidth way of sending data to AWS is by FedEx (that is
literally what the AWS Import/Export service does: you ship them
physical disks). In fact, shipping disks is quite often the fastest way
to move data from one data center to another when the data set is big.
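
Back-of-the-envelope (my numbers, purely illustrative):

    # Shipping disks vs. pushing bits over the wire.
    data_tb = 2.0        # dataset size, TB
    link_mbps = 100.0    # sustained WAN throughput, Mb/s
    seconds = data_tb * 8e6 / link_mbps
    print("over the wire: %.1f days" % (seconds / 86400.0))  # ~1.9 days
    print("overnight courier: 1 day, regardless of size")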


And processing data in the cloud is easier to set up with Amazon
Elastic MapReduce (which recently added support for spot instances).

http://aws.amazon.com/elasticmapreduce/
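
For flavor, launching a streaming job flow on spot instances from boto
looks roughly like the sketch below (bucket names, instance types, and
the bid price are placeholders; check the boto documentation for the
exact API of your version):

    # Sketch: an Elastic MapReduce job flow on spot instances via boto.
    from boto.emr.connection import EmrConnection
    from boto.emr.step import StreamingStep
    from boto.emr.instance_group import InstanceGroup

    conn = EmrConnection('<aws access key>', '<aws secret key>')

    step = StreamingStep(
        name='Word count',
        mapper='s3n://elasticmapreduce/samples/wordcount/wordSplitter.py',
        reducer='aggregate',
        input='s3n://elasticmapreduce/samples/wordcount/input',
        output='s3n://my-bucket/wordcount-output')

    # Bid for the worker nodes on the spot market instead of on-demand.
    groups = [InstanceGroup(1,  'MASTER', 'm1.small', 'ON_DEMAND', 'master'),
              InstanceGroup(10, 'CORE',   'm1.small', 'SPOT', 'cores', '0.08')]

    jobid = conn.run_jobflow(name='wordcount-on-spot',
                             log_uri='s3://my-bucket/logs',
                             steps=[step],
                             instance_groups=groups)
    print(jobid)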


> Ultimately the problem comes down to this.  Your choice is to rent time
> on somebody else's hardware or buy your own hardware.  For many people,
> one can scale to infinity and beyond, so using "all" of the
> time/resource you have available either way is a given.  In which case
> no matter how you slice it, Amazon or Google have to make a profit above
> and beyond the cost of delivering the service.  You don't (or rather,
> your "profit" is just the ability to run your jobs and get paid as usual
> to do your research either way).  This means that it will always be
> cheaper to directly provision a lot of computing rather than run it in
> the cloud, or for that matter at an HPC center.

Provided that the machines are used 24x7. A lot of enterprise users do
not have enough work to keep their machines loaded. E.g., I worked with
a client that has lots of data and numbers to crunch at night, but
during the day most of its machines sit idle.
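
The arithmetic is straightforward (all prices illustrative, not quotes):

    # Rough rent-vs-buy break-even for one compute node.
    node_cost = 3000.0      # purchase price, USD
    overhead = 2.0          # multiplier for power, cooling, admin, space
    years = 3.0             # amortization period
    ec2_per_hour = 0.68     # on-demand price of a comparable instance, USD

    owned_per_hour = node_cost * overhead / (years * 365 * 24)
    print("owned: $%.2f/hr amortized" % owned_per_hour)           # ~$0.23
    print("cloud wins below ~%.0f%% utilization"
          % (owned_per_hour / ec2_per_hour * 100))                # ~34%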

For traditional HPC centers, on the other hand, the batch queue length
is almost never zero; there I agree that the cloud wouldn't help, and
could even make the problem worse.

Rayson

=================================
Grid Engine / Open Grid Scheduler
http://gridscheduler.sourceforge.net


>  Not all -- lots of
> nonlinearities and thresholds associated with infrastructure and admin
> and so on -- but a lot.  Enough that I don't see Amazon's Pinky OR the
> Brain ever taking over the (HPC) world...
>
>   rgb
>
>>
>> http://web.mit.edu/stardev/cluster/
>>
>> StarCluster sets up NFS, SGE, the BLAS libraries, Open MPI, etc.
>> automatically for the user in around 10-15 minutes. StarCluster is
>> licensed under LGPL, written in Python+Boto, and supports a lot of the
>> new EC2 features (Cluster Compute Instances, Spot Instances, Cluster
>> GPU Instances, etc). Support for launching higher node count (100+
>> instances) clusters is even better with the new scalability
>> enhancements in the latest version (0.92).
>>
>> And there are some tutorials on YouTube:
>>
>> - "StarCluster 0.91 Demo":
>> http://www.youtube.com/watch?v=vC3lJcPq1FY
>>
>> - "Launching a Cluster on Amazon Ec2 Spot Instances Using StarCluster":
>> http://www.youtube.com/watch?v=2Ym7epCYnSk
>>
>> Rayson
>>
>> =================================
>> Grid Engine / Open Grid Scheduler
>> http://gridscheduler.sourceforge.net
>>
>>
>>
>> On Wed, Sep 21, 2011 at 7:02 AM, Eugen Leitl <eugen at leitl.org> wrote:
>>>
>>>
>>> http://arstechnica.com/business/news/2011/09/30000-core-cluster-built-on-amazon-ec2-cloud.ars
>>>
>>> $1,279-per-hour, 30,000-core cluster built on Amazon EC2 cloud
>>>
>>> By Jon Brodkin | Published September 20, 2011 10:49 AM
>>>
>>> Amazon EC2 and other cloud services are expanding the market for
>>> high-performance computing. Without access to a national lab or a
>>> supercomputer in your own data center, cloud computing lets businesses
>>> spin
>>> up temporary clusters at will and stop paying for them as soon as the
>>> computing needs are met.
>>>
>>> A vendor called Cycle Computing is on a mission to demonstrate the
>>> potential
>>> of Amazon’s cloud by building increasingly large clusters on the Elastic
>>> Compute Cloud. Even with Amazon, building a cluster takes some work, but
>>> Cycle combines several technologies to ease the process and recently used
>>> them to create a 30,000-core cluster running CentOS Linux.
>>>
>>> The cluster, announced publicly this week, was created for an unnamed
>>> “Top 5
>>> Pharma” customer, and ran for about seven hours at the end of July at a
>>> peak
>>> cost of $1,279 per hour, including the fees to Amazon and Cycle
>>> Computing.
>>> The details are impressive: 3,809 compute instances, each with eight
>>> cores
>>> and 7GB of RAM, for a total of 30,472 cores, 26.7TB of RAM and 2PB
>>> (petabytes) of disk space. Security was ensured with HTTPS, SSH and
>>> 256-bit
>>> AES encryption, and the cluster ran across data centers in three Amazon
>>> regions in the United States and Europe. The cluster was dubbed
>>> “Nekomata.”
>>>
>>> Spreading the cluster across multiple continents was done partly for
>>> disaster
>>> recovery purposes, and also to guarantee that 30,000 cores could be
>>> provisioned. “We thought it would improve our probability of success if
>>> we
>>> spread it out,” Cycle Computing’s Dave Powers, manager of product
>>> engineering, told Ars. “Nobody really knows how many instances you can
>>> get at
>>> any one time from any one [Amazon] region.”
>>>
>>> Amazon offers its own special cluster compute instances, at a higher cost
>>> than regular-sized virtual machines. These cluster instances provide 10
>>> Gigabit Ethernet networking along with greater CPU and memory, but they
>>> weren’t necessary to build the Cycle Computing cluster.
>>>
>>> The pharmaceutical company’s job, related to molecular modeling, was
>>> “embarrassingly parallel” so a fast interconnect wasn’t crucial. To
>>> further
>>> reduce costs, Cycle took advantage of Amazon’s low-price “spot
>>> instances.” To
>>> manage the cluster, Cycle Computing used its own management software as
>>> well
>>> as the Condor High-Throughput Computing software and Chef, an open source
>>> systems integration framework.
>>>
>>> Cycle demonstrated the power of the Amazon cloud earlier this year with a
>>> 10,000-core cluster built for a smaller pharma firm called Genentech.
>>> Now,
>>> 10,000 cores is a relatively easy task, says Powers. “We think we’ve
>>> mastered
>>> the small-scale environments,” he said. 30,000 cores isn’t the end game,
>>> either. Going forward, Cycle plans bigger, more complicated clusters,
>>> perhaps
>>> ones that will require Amazon’s special cluster compute instances.
>>>
>>> The 30,000-core cluster may or may not be the biggest one run on EC2.
>>> Amazon
>>> isn’t saying.
>>>
>>> “I can’t share specific customer details, but can tell you that we do
>>> have
>>> businesses of all sizes running large-scale, high-performance computing
>>> workloads on AWS [Amazon Web Services], including distributed clusters
>>> like
>>> the Cycle Computing 30,000 core cluster to tightly-coupled clusters often
>>> used for science and engineering applications such as computational fluid
>>> dynamics and molecular dynamics simulation,” an Amazon spokesperson told
>>> Ars.
>>>
>>> Amazon itself actually built a supercomputer on its own cloud that made
>>> it
>>> onto the list of the world’s Top 500 supercomputers. With 7,000 cores,
>>> the
>>> Amazon cluster ranked number 232 in the world last November with speeds
>>> of
>>> 41.82 teraflops, falling to number 451 in June of this year. So far,
>>> Cycle
>>> Computing hasn’t run the Linpack benchmark to determine the speed of its
>>> clusters relative to Top 500 sites.
>>>
>>> But Cycle’s work is impressive no matter how you measure it. The job
>>> performed for the unnamed pharma company “would take well over a week for
>>> them to run internally,” Powers says. In the end, the cluster performed
>>> the
>>> equivalent of 10.9 “compute years of work.”
>>>
>>> The task of managing such large cloud-based clusters forced Cycle to step
>>> up
>>> its own game, with a new plug-in for Chef the company calls Grill.
>>>
>>> “There is no way that any mere human could keep track of all of the
>>> moving
>>> parts on a cluster of this scale,” Cycle wrote in a blog post. “At Cycle,
>>> we’ve always been fans of extreme IT automation, but we needed to take
>>> this
>>> to the next level in order to monitor and manage every instance, volume,
>>> daemon, job, and so on in order for Nekomata to be an efficient 30,000
>>> core
>>> tool instead of a big shiny on-demand paperweight.”
>>>
>>> But problems did arise during the 30,000-core run.
>>>
>>> “You can be sure that when you run at massive scale, you are bound to run
>>> into some unexpected gotchas,” Cycle notes. “In our case, one of the
>>> gotchas
>>> included such things as running out of file descriptors on the license
>>> server. In hindsight, we should have anticipated this would be an issue,
>>> but
>>> we didn’t find that in our prelaunch testing, because we didn’t test at
>>> full
>>> scale. We were able to quickly recover from this bump and keep moving
>>> along
>>> with the workload with minimal impact. The license server was able to
>>> keep up
>>> very nicely with this workload once we increased the number of file
>>> descriptors.”
>>>
>>> Cycle also hit a speed bump related to volume and byte limits on Amazon’s
>>> Elastic Block Store volumes. But the company is already planning bigger
>>> and
>>> better things.
>>>
>>> “We already have our next use-case identified and will be turning up the
>>> scale a bit more with the next run,” the company says. But ultimately,
>>> “it’s
>>> not about core counts or terabytes of RAM or petabytes of data. Rather,
>>> it’s
>>> about how we are helping to transform how science is done.”
>>>
>>
>>
>>
>> --
>> Rayson
>>
>> ==================================================
>> Open Grid Scheduler - The Official Open Source Grid Engine
>> http://gridscheduler.sourceforge.net/
>>
>> Wikimedia Commons
>> http://commons.wikimedia.org/wiki/User:Raysonho
>>
>
> Robert G. Brown                        http://www.phy.duke.edu/~rgb/
> Duke University Dept. of Physics, Box 90305
> Durham, N.C. 27708-0305
> Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu
>
>


