[Beowulf] Docker in HPC
Peter Clapham
pc7 at sanger.ac.uk
Wed Nov 27 04:01:20 PST 2013
On 27/11/13 11:45, John Hearns wrote:
> Here is a Register article, which covers the same ground as Joe's post:
> http://www.theregister.co.uk/2013/11/26/docker_spreads_to_more_linux_distros/
> " For instance, Docker could be used to run a database in one
> container and an app server in another, and the configurable isolation
> properties"
> So can we think of batch schedulers which would reserve parts of big
> NUMA machines, and run docker containers on them?
> Also from the blog, Offline Transfer:
> "The exported bundles are regular directories, and can be transported
> by any file transfer mechanism, included ftp, physical media,
> proprietary installers, etc.
> This feature is particularly interesting for software vendors who
> need to ship their software as sealed appliances to their “enterprise”
> customers.
> Using offline transfer, they can use docker containers as the delivery
> mechanism for software updates"
> That is really interesting.
> Can we foresee users running on in-house clusters with Docker
> containers, which may be commercial applications delivered
> pre-packaged by an ISV,
> or locally developed?
> Then when they need more capacity in short timescales just exporting
> those containers to run on a cloud (let's say AWS ) and be confident
> they will run in the same way?
>
This is something that is being strongly considered in-house. As we are
increasingly being exposed to restricted data sets, the security model is
very compelling. There is also a secondary, and possibly equally
important, aspect for our users.
In the bioinformatics arena the local software half-life is
approximately 6-12 months. This, along with the wide range of
applications in use, rapidly creates an environment where users can
cross-link or pick up binaries or libraries that they weren't expecting.
Rolling containers with predefined environments could not only
alleviate these potential pitfalls but also provide an
environment in which data can be re-analysed at a future date against
the same predefined environment.
So in short, I would be very surprised if we are not running something
along exactly these lines in the (hopefully) very near future. If there
is interest, we'd be happy to pass on our war stories / experiences
along the way.
Commercial options also exist...
As an aside, for those who pay to use IBM's Platform LSF, it has had an
integrated cgroup environment for a while now. IBM also provide various
supported options for managing such instances in their portfolio. So far
we have only investigated the cgroup integration within LSF itself, and
whilst we have found some interesting features / bugs, the patches
provided and early results are very promising.
If anyone has similarly prodded the world of HPC and cgroups, we'd be
very interested in hearing how you got on.
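
For anyone wanting to check which cgroups a job has actually landed in, here is a minimal sketch. It is not LSF-specific and does not use any LSF API. It only parses the `hierarchy-ID:controller-list:cgroup-path` lines that the kernel exposes in /proc/<pid>/cgroup, as described in cgroups(7). The names `CgroupEntry`, `parse_cgroup_text` and `read_cgroups` are made up for illustration.

```python
from typing import List, NamedTuple


class CgroupEntry(NamedTuple):
    """One line of /proc/<pid>/cgroup (hypothetical helper type)."""
    hierarchy_id: int
    controllers: List[str]
    path: str


def parse_cgroup_text(text: str) -> List[CgroupEntry]:
    """Parse the contents of a /proc/<pid>/cgroup file.

    Each non-empty line has the form
        hierarchy-ID:controller-list:cgroup-path
    The controller list is comma separated and is empty for a
    cgroup v2 (unified) hierarchy, e.g. "0::/some/path".
    """
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first two colons only; the path may contain ':'.
        hierarchy, controllers, path = line.split(":", 2)
        names = [c for c in controllers.split(",") if c]
        entries.append(CgroupEntry(int(hierarchy), names, path))
    return entries


def read_cgroups(pid: str = "self") -> List[CgroupEntry]:
    """Read and parse the cgroup membership of a process (Linux only)."""
    with open(f"/proc/{pid}/cgroup") as fh:
        return parse_cgroup_text(fh.read())
```

Calling `read_cgroups()` from inside a job script (or `read_cgroups("<pid>")` for another process) should show whether the job is sitting in the memory / cpu groups the scheduler was expected to create for it. The actual group paths depend entirely on how the scheduler is configured.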
Pete
--
The Wellcome Trust Sanger Institute is operated by Genome Research
Limited, a charity registered in England with number 1021457 and a
company registered in England with number 2742969, whose registered
office is 215 Euston Road, London, NW1 2BE.