[Beowulf] Project Heron at the Sanger Institute [EXT]

Thu Feb 4 10:26:07 UTC 2021

On 4 Feb 2021, at 10:14, Jonathan Aquilina via Beowulf <beowulf at beowulf.org> wrote:

I am curious, though: to chunk up such large data, is something like Hadoop/HBase or a similar platform being used?

It’s a combination of our home-grown sequencing pipeline, which we use across the board, and a specific COG-UK analysis of the genomes themselves. The COG-UK analysis pipeline is common to all consortium members who are contributing sequence data. It’s a Nextflow pipeline, and the code is here:

https://github.com/connor-lab/ncov2019-artic-nf

Being Nextflow, it can run on any platform for which Nextflow has a backend scheduler. It supports data from both Illumina and Oxford Nanopore sequencers.
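
As a rough illustration of what "run it on anything" looks like in practice, here is a minimal Python sketch that assembles a `nextflow run` command for the repository above. `nextflow run <user/repo>` and `-profile` are standard Nextflow usage, and pipeline parameters are passed as `--name value`. The profile name ("slurm") and the parameter names in the usage example are assumptions for illustration only, so check the repository's README for the ones it actually defines.

```python
import shlex

# GitHub "user/repo" shorthand that `nextflow run` accepts.
PIPELINE = "connor-lab/ncov2019-artic-nf"


def build_nextflow_command(profile, params):
    """Build an argv list for `nextflow run` with one profile.

    profile: name of a config profile selecting the executor/scheduler
             (which names exist depends on the pipeline's config).
    params:  pipeline parameters; a value of True emits a bare flag,
             anything else is emitted as `--name value`.
    """
    cmd = ["nextflow", "run", PIPELINE, "-profile", profile]
    for name, value in params.items():
        cmd.append(f"--{name}")
        if value is not True:
            cmd.append(str(value))
    return cmd


if __name__ == "__main__":
    # Hypothetical profile and parameter names, for illustration only.
    cmd = build_nextflow_command(
        "slurm",
        {"illumina": True, "prefix": "run01", "directory": "/data/reads"},
    )
    # Print the command instead of running it, so it can be inspected first.
    print(shlex.join(cmd))
```

Switching schedulers is then a matter of changing the profile argument; the pipeline code itself stays the same.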

Tim
