[Beowulf] Project Heron at the Sanger Institute [EXT]

Jonathan Aquilina jaquilina at eagleeyet.net
Thu Feb 4 10:27:35 UTC 2021

Would love to help you guys out in any way I can in terms of hardware processing.

Have you guys thought of doing something like SETI@home and similar projects, using idle compute power to help churn through the massive amounts of data?

From: Tim Cutts <tjrc at sanger.ac.uk>
Sent: 04 February 2021 11:26
To: Jonathan Aquilina <jaquilina at eagleeyet.net>
Cc: Beowulf <beowulf at beowulf.org>
Subject: Re: [Beowulf] Project Heron at the Sanger Institute [EXT]

On 4 Feb 2021, at 10:14, Jonathan Aquilina via Beowulf <beowulf at beowulf.org> wrote:

I am curious, though: to chunk up such large data, is something like Hadoop/HBase or a similar platform what's being used?

It’s a combination of our home-grown sequencing pipeline which we use across the board, and then a specific COG-UK analysis of the genomes themselves.  This pipeline is common to all consortium members who are contributing sequence data.  It’s a Nextflow pipeline, and the code is here:


Because it is Nextflow, you can run it on anything for which Nextflow has a backend scheduler. It supports data from both Illumina and Oxford Nanopore sequencers.
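
As a rough illustration of what choosing a backend looks like (a sketch only, not taken from the COG-UK pipeline): Nextflow picks its scheduler through the `process.executor` setting in a config file. The Python helper below just writes such a config fragment. The function name, the `cluster.config` filename and the queue name `normal` are hypothetical, and the executor list is a small subset of what Nextflow supports.

```python
from typing import Optional

# A few executor names Nextflow recognises; this is not the full list.
KNOWN_EXECUTORS = {"local", "slurm", "lsf", "sge", "pbs", "awsbatch", "k8s"}


def executor_config(executor: str, queue: Optional[str] = None) -> str:
    """Return a minimal Nextflow config fragment selecting a backend.

    Hypothetical helper for illustration; real pipelines usually ship
    their own profiles and need more settings than this.
    """
    if executor not in KNOWN_EXECUTORS:
        raise ValueError(f"unrecognised executor: {executor!r}")
    lines = ["process {", f"    executor = '{executor}'"]
    if queue is not None:
        # 'queue' is the Nextflow process directive for the scheduler queue.
        lines.append(f"    queue = '{queue}'")
    lines.append("}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Write a config for a Slurm cluster with an assumed queue name.
    with open("cluster.config", "w") as handle:
        handle.write(executor_config("slurm", queue="normal"))
```

The resulting file would then be passed with Nextflow's `-c` option, along the lines of `nextflow run <pipeline> -c cluster.config`, where `<pipeline>` stands for whichever pipeline is being run.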

-- The Wellcome Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE.