<p dir="ltr">FWIW - SR-IOV on Mellanox is good and turning to great this year so near bare metal performance in a vm is becoming possible with the flexibiliy of migrating VMs over IB. We don't use it yet in production but expect to by SC15.<br></p>
<p dir="ltr">[Shameless_Promo]<br>
R-HPC (<a href="http://www.r-hpc.com">www.r-hpc.com</a>) sells bare metal "HPC as a Service" in 2 modes today:<br>
1) Utility Shared Queues with IB and Lustre on Scientific Linux (pay per job; a rough submission sketch follows below)<br>
2) Dedicated Clusters with flexible OS/FS and superuser if you need it (pay per days/months dedicated)</p>
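<p dir="ltr">The sketch mentioned above: in mode 1 submission is plain Torque/Maui batch, so a job looks the same as it would on an in-house cluster. The queue name, resource counts, and application below are made-up placeholders, not R-HPC specifics.</p>
<pre>
# submit_sketch.py -- hedged illustration of a pay-per-job submission to a
# Torque/Maui shared queue; queue "utility" and ./my_mpi_app are placeholders.
import subprocess

job_script = """#!/bin/bash
#PBS -N mpi_example
#PBS -q utility
#PBS -l nodes=4:ppn=16
#PBS -l walltime=02:00:00
cd $PBS_O_WORKDIR
mpirun ./my_mpi_app
"""

with open("job.pbs", "w") as f:
    f.write(job_script)

# qsub prints the job ID it assigned
print(subprocess.check_output(["qsub", "job.pbs"]).decode().strip())
</pre>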
<p dir="ltr">Our access model is ssh or ssh over VPN, so there is infinite flexibility in mode 2. We can help make sure the user experience is minimally impacted either way. We have HPC Admins who can deep dive in to support compiling codes or other challenges, so we are an extension of your admin/support team any time we are invited to help. We are open source and happy to help you recreate anything we do on your site. Our only "vendor lock-in strategy" is you will love our support.</p>
<p dir="ltr">We use Dell as a sales channel too (Dell HPC CLoud Services) which has simplified purchasing for some academic institutions too. We are Internet2 connected and working on Net+ Service provider status... so you may already be conntected at 10Gb speeds!<br>
[/Shameless_Promo]</p>
<p dir="ltr">Hope This Helps...<br>
Cheers!<br>
Greg W. Keller</p>
<p dir="ltr">> Date: Thu, 7 May 2015 22:28:11 +0000<br>
> From: "Hutcheson, Mike" <<a href="mailto:Mike_Hutcheson@baylor.edu">Mike_Hutcheson@baylor.edu</a>><br>
> To: "<a href="mailto:beowulf@beowulf.org">beowulf@beowulf.org</a>" <<a href="mailto:beowulf@beowulf.org">beowulf@beowulf.org</a>><br>
> Subject: [Beowulf] HPC in the cloud question<br>
> Message-ID: <<a href="mailto:D1714A97.56B37%25Mike_Hutcheson@baylor.edu">D1714A97.56B37%Mike_Hutcheson@baylor.edu</a>><br>
> Content-Type: text/plain; charset="iso-8859-1"<br>
><br>
> Hi. We are working on refreshing the centralized HPC cluster resources<br>
> that our university researchers use. I have been asked by our<br>
> administration to look into HPC in the cloud offerings as a possibility to<br>
> purchasing or running a cluster on-site.<br>
><br>
> We currently run a 173-node, CentOS-based cluster with ~120TB (soon to<br>
> increase to 300+TB) in our datacenter. It's a standard cluster<br>
> configuration: IB network, distributed file system (BeeGFS. I really<br>
> like it), Torque/Maui batch. Our users run a varied workload, from<br>
> fine-grained, MPI-based parallel apps scaling to 100s of cores to<br>
> coarse-grained, high-throughput jobs (We're a CMS Tier-3 site) with high<br>
> I/O requirements.<br>
><br>
> Whatever we transition to, whether it be a new in-house cluster or<br>
> something "out there", I want to minimize the amount of change or learning<br>
> curve our users would have to experience. They should be able to focus on<br>
> their research and not have to spend a lot of their time learning a new<br>
> system or trying to spin one up each time they have a job to run.<br>
><br>
> If you have worked with HPC in the cloud, either as an admin and/or<br>
> someone who has used cloud resources for research computing purposes, I<br>
> would appreciate learning your experience.<br>
><br>
> Even if you haven't used the cloud for HPC computing, please feel free to<br>
> share your thoughts or concerns on the matter.<br>
><br>
> Sort of along those same lines, what are your thoughts about leasing a<br>
> cluster and running it on-site?<br>
><br>
> Thanks for your time,<br>
><br>
> Mike Hutcheson<br>
> Assistant Director of Academic and Research Computing Services<br>
> Baylor University<br>
><br>
</p>