-
Recommended analysis pipeline
Analysis of the FASTQ-format sequence data is performed using a Nextflow workflow called wf-flu. The workflow can be run from the command line with Nextflow, or through the graphical interface provided by EPI2ME Labs.
The workflow processes the basecalled and demultiplexed DNA sequence data output by MinKNOW. Sequences are first filtered against minimum length and quality thresholds (200 nucleotides and Q9, respectively) and then aligned to the CDC multi-FASTA influenza reference using the Minimap2 software. Depth of coverage across the mapped sequences is measured using Samtools, after which genetic variants are called using Medaka. A coverage-masked consensus sequence is prepared for each sample using bcftools. Influenza strain typing is then performed using the abricate software with an insaflu database. The influenza strains included in the database are listed in the project documentation pages at https://github.com/epi2me-labs/wf-flu.
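The length and quality filter described above can be illustrated with a minimal Python sketch. This is not the workflow's own code: the function names are hypothetical, and the sketch assumes Phred+33 encoded qualities and a mean read quality calculated by averaging per-base error probabilities, which may differ from how wf-flu computes it. Only the two thresholds (200 nucleotides and Q9) come from the text.

```python
import math

MIN_LENGTH = 200   # nucleotides, threshold stated in the text
MIN_QUALITY = 9.0  # Phred scale (Q9), threshold stated in the text


def mean_quality(quality_string, offset=33):
    """Return the mean Phred quality, averaged in error-probability space.

    Assumption: Phred+33 encoding. Averaging probabilities rather than
    Phred scores is an illustrative choice, not a documented wf-flu detail.
    """
    if not quality_string:
        return 0.0
    error_probs = [10 ** (-(ord(ch) - offset) / 10) for ch in quality_string]
    mean_error = sum(error_probs) / len(error_probs)
    return -10 * math.log10(mean_error)


def passes_filter(sequence, quality_string):
    """True if the read meets both the length and the quality threshold."""
    return (len(sequence) >= MIN_LENGTH
            and mean_quality(quality_string) >= MIN_QUALITY)


def filter_fastq(lines):
    """Yield (header, sequence, quality) for four-line FASTQ records that pass."""
    records = [line.rstrip("\n") for line in lines]
    for i in range(0, len(records) - 3, 4):
        header, sequence, _, quality = records[i:i + 4]
        if passes_filter(sequence, quality):
            yield header, sequence, quality
```

Passing an open FASTQ file handle to `filter_fastq` would yield only the reads that are at least 200 nucleotides long with a mean quality of Q9 or higher; shorter or lower-quality reads are skipped.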
The workflow returns a per-run HTML-format summary report along with a CSV file of typing results. The workflow output also includes additional files, such as BAM files of the read mappings and VCF files of the variants called by Medaka.
For more information, please refer to the Influenza workflow blog.
-
Software set-up and installation
The wf-flu workflow requires Nextflow and either Docker or Conda to be installed. The EPI2ME Labs Workflow quick start guide provides installation instructions for these requirements for GridION, PromethION and general Ubuntu Linux users, along with a short introduction to the Nextflow software.
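Before launching the workflow, it can be useful to confirm that these requirements are present. The following Python sketch shows one way to do so, assuming the tools are installed under their usual executable names (`nextflow`, `docker`, `conda`) and are on the `PATH`. The `check_requirements` helper is hypothetical and is not part of wf-flu or EPI2ME Labs.

```python
import shutil


def check_requirements(which=shutil.which):
    """Return a list of problems; an empty list means the prerequisites were found.

    Assumption: the executables are named nextflow, docker and conda.
    The `which` argument exists only so the lookup can be replaced in tests.
    """
    problems = []
    if which("nextflow") is None:
        problems.append("nextflow not found on PATH")
    # Either Docker or Conda is sufficient, so only report if both are missing.
    if which("docker") is None and which("conda") is None:
        problems.append("neither docker nor conda found on PATH")
    return problems
```

Calling `check_requirements()` with no arguments inspects the current environment. Note that it only checks that the executables can be found, not that their versions are suitable or that the Docker daemon is running.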
To run EPI2ME Labs via the GUI instead of the command line, download the executable for your operating system from https://labs.epi2me.io/downloads and consult the Quick Start Guide to set up and run the software.