<div>I used Doug's BPS package to benchmark a virtual cluster on Amazon EC2, and was hoping the Beowulf list could give their feedback on the results and the feasibility of using this for on-demand clusters. This approach is currently being used to run some MPI code which is tolerant of poor latency, e.g. mpiBLAST, Monte Carlo runs, etc. <br>
<br>You get gigabit ethernet on EC2, but the latency from NetPIPE seems to be roughly an order of magnitude higher than Doug's Kronos example on the Cluster Monkey page (NetPIPE reports latency in seconds):<br><br>Amazon EC2 Latency: 0.000492 seconds (492 microseconds)<br>
Kronos Latency: 0.000029 seconds (29 microseconds)<br><br>Full Results/Charts for a "small" cluster of two extra-large nodes are here (I just used the default BPS config with MPICH2):<br><br>
<a href="http://www.datawrangling.com/media/BPS-AmazonEC2-xlarge-run-1/index.html">http://www.datawrangling.com/media/BPS-AmazonEC2-xlarge-run-1/index.html</a><br>
<a href="http://www.datawrangling.com/media/BPS-AmazonEC2-xlarge-run-2/index.html">http://www.datawrangling.com/media/BPS-AmazonEC2-xlarge-run-2/index.html</a><br><br>The unixbench results are misleading inside a VM, so I left those out. Others have verified the performance mentioned in the EC2 documentation: "One EC2 Compute Unit provides the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor."<br>
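For anyone wanting a quick sanity check on the NetPIPE numbers above, here is the unit conversion and the ratio spelled out (raw values are in seconds):

```python
# NetPIPE reports latency in seconds; convert to microseconds
# and compare the two clusters.
ec2_latency_s = 0.000492     # Amazon EC2 (gigabit ethernet under Xen)
kronos_latency_s = 0.000029  # Doug's Kronos cluster from the Cluster Monkey page

ec2_us = ec2_latency_s * 1e6        # 492 microseconds
kronos_us = kronos_latency_s * 1e6  # 29 microseconds

ratio = ec2_latency_s / kronos_latency_s
print("EC2: %.0f us, Kronos: %.0f us, ratio: %.1fx" % (ec2_us, kronos_us, ratio))
```

So the gap is about 17x, i.e. roughly an order of magnitude, which is why only latency-tolerant MPI jobs make sense here.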
<br>Some bonnie results are here:<br><a href="http://blog.dbadojo.com/2007/10/bonnie-io-benchmark-vs-ec2.html">http://blog.dbadojo.com/2007/10/bonnie-io-benchmark-vs-ec2.html</a><br><br>The cluster is launched and configured using some Python scripts and a custom Beowulf Amazon Machine Image (AMI), which is basically a Xen image configured to run on EC2. You end up paying 80 cents/hour for 8 cores with 15 GB RAM, and can scale that up to 100 or more if you need to. I'm cleaning up the code, and will post it on my blog if anyone wants to try it out. I think this could be a cost-effective path for people who, for whatever reason, can't build/use a dedicated cluster. <br>
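As a back-of-the-envelope sketch of the economics (the $0.80/hour, 8 cores, and 15 GB per extra-large instance come from the specs; the node counts and run lengths are just hypothetical examples):

```python
# Rough cost model for an on-demand EC2 cluster of extra-large
# instances ($0.80/hour, 8 cores and 15 GB RAM each).
PRICE_PER_INSTANCE_HOUR = 0.80
CORES_PER_INSTANCE = 8

def cluster_cost(instances, hours):
    """Total dollar cost of running `instances` nodes for `hours` hours."""
    return instances * hours * PRICE_PER_INSTANCE_HOUR

# Two nodes (the benchmarked config, 16 cores) for an 8-hour day:
print(cluster_cost(2, 8))
# Thirteen nodes gets you past 100 cores:
print(13 * CORES_PER_INSTANCE, cluster_cost(13, 8))
```

Since you only pay while the instances are up, the cost scales linearly with both node count and wall-clock time, which is the whole appeal for occasional latency-tolerant runs.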
<br>Here are the specifications for each instance:<br><br>Extra Large Instance:<br><br> 15 GB memory<br> 8 EC2 Compute Units (4 virtual cores with 2 EC2 Compute Units each)<br> 1,690 GB instance storage (4 x 420 GB plus 10 GB root partition)<br>
64-bit platform<br> I/O Performance: High<br> Price: $0.80 per instance hour<br><br>-Pete<br><br> </div><blockquote style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;" class="gmail_quote">
<pre style="margin: 0em;">There are plenty of parallel chores that are tolerant of poor latency --<br>the whole world of embarrassingly parallel computations plus some<br>extension up into merely coarse grained, not terribly synchronous real<br>
parallel computations.</pre></blockquote><div> </div><div><br></div><blockquote style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;" class="gmail_quote"><pre style="margin: 0em;">
VMs can also be wonderful for TEACHING clustering and for managing<br>"political" problems. ... Having any sort of access to a high-latency Linux VM<br>node running on a Windows box beats the hell out of having no node at<br>
all or having to port one's code to work under Windows.</pre></blockquote><div> <br></div><blockquote style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;" class="gmail_quote">
<pre style="margin: 0em;"><br>We can therefore see that there are clearly environments where the bulk<br>of the work being done is latency tolerant and where VMs may well have<br>benefits in administration and security and fault tolerance and local<br>
politics that make them a great boon in clustering, just as there are<br>without question computations for which latency is the devil and any<br>suggestion of adding a layer of VM latency on top of what is already<br>inherent to the device and minimal OS will bring out the peasants with<br>
pitchforks and torches. Multiboot systems, via grub and local<br>provisioning or PXE and remote e.g. NFS provisioning is also useful but<br>is not always politically possible or easy to set up.<br><br>It is my hope that folks working on both sorts of multienvironment<br>
provisioning and sysadmin environments work hard and produce spectacular<br>tools. I've done way more work than I care to setting up both of these<br>sorts of things. It is not easy, and requires a lot of expertise.<br>
Hiding this detail and expertise from the user would be a wonderful<br>contribution to practical clustering (and of course useful in the HA<br>world as well).</pre></blockquote><br clear="all"><br>-- <br>Peter N. Skomoroch<br>
<a href="mailto:peter.skomoroch@gmail.com">peter.skomoroch@gmail.com</a><br><a href="http://www.datawrangling.com">http://www.datawrangling.com</a>