sars-cov-2-consensus-from-variation/COVID-19-CONSENSUS-CONSTRUCTION

Workflow Type: Galaxy

COVID-19: consensus construction

This workflow aims to generate reliable consensus sequences from variant calls according to transparent criteria that capture at least some of the complexity of variant calling.

It takes a collection of VCFs and a collection of the corresponding aligned reads (used to calculate genome-wide coverage), such as those produced by any of the four variant calling workflows in https://github.com/galaxyproject/iwc/tree/main/workflows/sars-cov-2-variant-calling, and generates a collection of viral consensus sequences plus a multisample FASTA of all these sequences.

Each consensus sequence is guaranteed to capture all called, filter-passing variants as defined in the VCF of its sample that reach a user-defined consensus allele frequency threshold.

Filter-failing variants and variants below a second user-defined minimal allele frequency threshold will be ignored.

Genomic positions of filter-passing variants with an allele frequency in between the two thresholds will be hard-masked (with N) in the consensus sequence of their sample.

Genomic positions with a coverage (calculated from the read alignments input) below another user-defined threshold will be hard-masked, too, unless they are consensus variant sites.
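
The four rules above can be summarized as a per-site decision. The following is a minimal Python sketch of that logic, not the workflow's actual implementation (which chains Galaxy tools). The names `VariantCall`, `classify_call` and `site_decision` are hypothetical, the threshold values in the usage lines are purely illustrative, and the handling of values exactly equal to a threshold is an assumption, since the description does not pin it down.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VariantCall:
    """Hypothetical minimal view of one VCF record."""
    af: float             # allele frequency of the called variant
    passes_filter: bool   # True if the record passed the VCF filters


def classify_call(call: VariantCall, min_af_consensus: float,
                  min_af_failed: float) -> str:
    """Classify one variant call as 'consensus', 'mask' or 'ignore'."""
    # Filter-failing calls and calls below the lower AF threshold are ignored.
    if not call.passes_filter or call.af < min_af_failed:
        return "ignore"
    # Assumption: a call that exactly reaches the upper threshold counts
    # as a consensus variant.
    if call.af >= min_af_consensus:
        return "consensus"
    # Filter-passing call with an AF between the two thresholds.
    return "mask"


def site_decision(depth: int, call: Optional[VariantCall],
                  min_af_consensus: float, min_af_failed: float,
                  depth_threshold: int) -> str:
    """Decide what happens to one genomic position in the consensus."""
    status = "ignore" if call is None else classify_call(
        call, min_af_consensus, min_af_failed)
    if status == "consensus":
        # Consensus variant sites are exempt from low-coverage masking.
        return "apply variant"
    if status == "mask" or depth < depth_threshold:
        return "mask with N"
    return "keep reference base"


# Illustrative thresholds only: consensus AF 0.7, minimal AF 0.25, depth 5.
print(site_decision(100, VariantCall(0.9, True), 0.7, 0.25, 5))
print(site_decision(100, VariantCall(0.5, True), 0.7, 0.25, 5))
print(site_decision(2, None, 0.7, 0.25, 5))
```

The sketch treats each position independently; a real implementation also has to deal with multi-base variants such as insertions and deletions, which this example leaves out.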

Inputs

ID | Name | Description | Type
Variant calls | Variant calls | Collection of VCFs produced by upstream workflows for variation analysis | n/a
min-AF for consensus variant | min-AF for consensus variant | Only variant calls with an allele frequency greater than this value will be considered consensus variants. | n/a
min-AF for failed variants | min-AF for failed variants | Variant calls with an allele frequency higher than this value, but lower than the AF threshold for consensus variants, will be considered questionable and the respective sites will be masked (with Ns) in the consensus sequence. | n/a
aligned reads data for depth calculation | aligned reads data for depth calculation | Fully processed BAMs as generated by upstream workflows for variation analysis. Note: for ARTIC data, these BAMs should NOT have undergone processing with ivar removereads. | n/a
Depth-threshold for masking | Depth-threshold for masking | Sites in the viral genome covered by fewer than this number of reads are considered questionable and will be masked (with Ns) in the consensus sequence independent of whether a variant has been called at them or not. | n/a
Reference genome | Reference genome | The SARS-CoV-2 reference genome | n/a

Steps

ID | Name | Description
0 | Variant calls | Collection of VCFs produced by upstream workflows for variation analysis
1 | min-AF for consensus variant | Only variant calls with an allele frequency greater than this value will be considered consensus variants.
2 | min-AF for failed variants | Variant calls with an allele frequency higher than this value, but lower than the AF threshold for consensus variants, will be considered questionable and the respective sites will be masked (with Ns) in the consensus sequence.
3 | aligned reads data for depth calculation | Fully processed BAMs as generated by upstream workflows for variation analysis. Note: for ARTIC data, these BAMs should NOT have undergone processing with ivar removereads.
4 | Depth-threshold for masking | Sites in the viral genome covered by fewer than this number of reads are considered questionable and will be masked (with Ns) in the consensus sequence independent of whether a variant has been called at them or not.
5 | Reference genome | The SARS-CoV-2 reference genome
6 | Compose text parameter value | toolshed.g2.bx.psu.edu/repos/iuc/compose_text_param/compose_text_param/0.1.1
7 | Compose text parameter value | toolshed.g2.bx.psu.edu/repos/iuc/compose_text_param/compose_text_param/0.1.1
8 | bedtools Genome Coverage | toolshed.g2.bx.psu.edu/repos/iuc/bedtools/bedtools_genomecoveragebed/2.29.2
9 | Compose text parameter value | toolshed.g2.bx.psu.edu/repos/iuc/compose_text_param/compose_text_param/0.1.1
10 | SnpSift Filter | toolshed.g2.bx.psu.edu/repos/iuc/snpsift/snpSift_filter/4.3+t.galaxy1
11 | SnpSift Filter | toolshed.g2.bx.psu.edu/repos/iuc/snpsift/snpSift_filter/4.3+t.galaxy1
12 | Filter | Filter1
13 | SnpSift Extract Fields | toolshed.g2.bx.psu.edu/repos/iuc/snpsift/snpSift_extractFields/4.3+t.galaxy0
14 | SnpSift Extract Fields | toolshed.g2.bx.psu.edu/repos/iuc/snpsift/snpSift_extractFields/4.3+t.galaxy0
15 | Compute | toolshed.g2.bx.psu.edu/repos/devteam/column_maker/Add_a_column1/1.6
16 | Compute | toolshed.g2.bx.psu.edu/repos/devteam/column_maker/Add_a_column1/1.6
17 | Compute | toolshed.g2.bx.psu.edu/repos/devteam/column_maker/Add_a_column1/1.6
18 | Compute | toolshed.g2.bx.psu.edu/repos/devteam/column_maker/Add_a_column1/1.6
19 | Cut | Cut1
20 | Cut | Cut1
21 | Concatenate | toolshed.g2.bx.psu.edu/repos/devteam/concat/gops_concat_1/1.0.1
22 | Merge | toolshed.g2.bx.psu.edu/repos/devteam/merge/gops_merge_1/1.0.0
23 | Subtract | toolshed.g2.bx.psu.edu/repos/devteam/subtract/gops_subtract_1/1.0.0
24 | Compute | toolshed.g2.bx.psu.edu/repos/devteam/column_maker/Add_a_column1/1.6
25 | Cut | Cut1
26 | bcftools consensus | toolshed.g2.bx.psu.edu/repos/iuc/bcftools_consensus/bcftools_consensus/1.10
27 | Collapse Collection | toolshed.g2.bx.psu.edu/repos/nml/collapse_collections/collapse_dataset/4.2
27 Collapse Collection toolshed.g2.bx.psu.edu/repos/nml/collapse_collections/collapse_dataset/4.2

Outputs

ID | Name | Description | Type
out1 | out1 | | n/a
out1 | out1 | | n/a
output | output | | n/a
out1 | out1 | | n/a
output | output | | n/a
output | output | | n/a
out_file1 | out_file1 | | n/a
output | output | | n/a
output | output | | n/a
out_file1 | out_file1 | | n/a
out_file1 | out_file1 | | n/a
out_file1 | out_file1 | | n/a
out_file1 | out_file1 | | n/a
out_file1 | out_file1 | | n/a
out_file1 | out_file1 | | n/a
output | output | | n/a
output | output | | n/a
output | output | | n/a
out_file1 | out_file1 | | n/a
out_file1 | out_file1 | | n/a
output_file | output_file | | n/a
output | output | | n/a
Creators and Submitter
Creator
  • Wolfgang Maier

Created: 30th Aug 2021 at 17:00

Last updated: 30th Aug 2021 at 17:00

Version History

Version 2 (latest) Created 30th Aug 2021 at 17:00 by Simone Leo
