<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class="">
<br class="">
<div><br class="">
<blockquote type="cite" class="">
<div class="">On 24 Nov 2020, at 18:31, Alex Chekholko via Beowulf <<a href="mailto:beowulf@beowulf.org" class="">beowulf@beowulf.org</a>> wrote:</div>
<br class="Apple-interchange-newline">
<div class="">
<div style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;" class="">
If you can run your task on just one computer, you should always do that rather than having to build a cluster of some kind and all the associated headaches.</div>
<br class="Apple-interchange-newline">
</div>
</blockquote>
</div>
<div class=""><br class="">
</div>
<div class="">If you take on the cloud message, that of course isn’t necessarily the case. If you use very high level cloud services like lambda, you don’t have to build that infrastructure. It’s very unlikely to be anywhere near as efficient, of course,
but throughput efficiency is not what your average scientist cares about. What they care about is getting their answer quickly (and to a lesser extent, cheaply)</div>
<div class=""><br class="">
</div>
<div class="">I saw a recent example where someone took a fairly simple sequencing read alignment process, which normally runs on a single 16-core node in about 6 hours, and split the input files small enough that the alignment code execution time and memory
use would fit with AWS Lambda’s envelope. The result executed in a couple of minutes, elapsed, but used about four times as many core-hours as the optimised single node version. Of course, this is an embarrassingly parallel problem, so this is a relatively
easy analysis to move to this sort of design.</div>
<div class=""><br class="">
</div>
<div class="">From the scientist’s point of view, which is better? Getting their answer in 5 minutes or 6 hours? Especially if they’ve also reduced their development time as well because they don’t have to worry so much about infrastructure and optimisation.</div>
<div class=""><br class="">
</div>
<div class="">The total value is hard to work out, many of these considerations are hard to put a dollar value on. When I saw that article, I did ask the author how much the analysis actually cost, and she didn’t have a number. But I don’t think we can dogmatically
say that we should always run a task on a single machine if we can.</div>
<div class=""><br class="">
</div>
<div class="">Tim</div>
--
The Wellcome Sanger Institute is operated by Genome Research
Limited, a charity registered in England with number 1021457 and a
company registered in England with number 2742969, whose registered
office is 215 Euston Road, London, NW1 2BE.
</body>
</html>