Workflow Type: Galaxy

Given a set of VCF files and the reference genome used for mapping and SNP calling, create a multifasta file containing the genomes of all samples and calculate the matrix of pairwise SNP distances.

Associated Tutorial

This workflow is part of the tutorial "From VCFs to SNP distance matrix", available in the GTN.

Thanks to...

Tutorial Author(s): Galo A. Goig, Daniela Brites, Christoph Stritt

Tutorial Contributor(s): Wolfgang Maier


Inputs

ID Name Description Type
Collection of VCFs to analyze #main/Collection of VCFs to analyze n/a
  • array containing
    • File
Reference genome of the MTBC ancestor #main/Reference genome of the MTBC ancestor n/a
  • File

Steps

ID Name Description
2 Filter TB variants This step ensures that the variants used to build the MSA are fixed variants and that variants in low-confidence, repetitive regions of the MTB genome are filtered out. toolshed.g2.bx.psu.edu/repos/iuc/tb_variant_filter/tb_variant_filter/0.1.3+galaxy0
3 Generate the complete genome of each of the samples The complete genome of each sample is generated by inserting the SNPs defined in its VCF into the reference genome that was used for mapping and SNP calling. toolshed.g2.bx.psu.edu/repos/iuc/bcftools_consensus/bcftools_consensus/1.9+galaxy2
4 Concatenate genomes to build a MSA All genomes are concatenated into a single multifasta file. Because all of them have the same length, this file can be treated as a multiple sequence alignment. toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_cat/0.1.1
5 Keep only variable positions Discard invariant positions from the MSA to simplify the file, so that it only contains positions with at least one SNP in at least one strain. toolshed.g2.bx.psu.edu/repos/iuc/snp_sites/snp_sites/2.5.1+galaxy0
6 Calculate SNP distances Calculate pairwise SNP distances between samples from the MSA. toolshed.g2.bx.psu.edu/repos/iuc/snp_dists/snp_dists/0.6.3+galaxy0
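
The logic of steps 3 to 6 can be illustrated with a small Python sketch. This is not the code run by the Galaxy tools listed above. It is a simplified illustration that assumes SNP-only variants (no indels), 1-based positions as in a VCF, and counts every differing character as one SNP. The function names (apply_snps, variable_columns, snp_distances) and the toy data are hypothetical, and the actual tools may treat ambiguous bases or gaps differently.

```python
from itertools import combinations


def apply_snps(reference, snps):
    """Return a copy of the reference with each SNP substituted in.

    `snps` maps a 1-based position (as in the VCF POS column) to the
    alternate base. Only single-base substitutions are handled here.
    """
    genome = list(reference)
    for pos, alt in snps.items():
        genome[pos - 1] = alt
    return "".join(genome)


def variable_columns(msa):
    """Return 0-based indices of columns with more than one distinct base."""
    sequences = list(msa.values())
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must have the same length")
    return [i for i in range(length) if len({seq[i] for seq in sequences}) > 1]


def snp_distances(msa):
    """Return a symmetric nested dict of pairwise mismatch counts."""
    names = list(msa)
    dist = {a: {b: 0 for b in names} for a in names}
    for a, b in combinations(names, 2):
        d = sum(x != y for x, y in zip(msa[a], msa[b]))
        dist[a][b] = dist[b][a] = d
    return dist


# Toy data (hypothetical): one short reference and three samples.
reference = "ACGTACGT"
sample_snps = {
    "s1": {2: "T"},
    "s2": {2: "T", 7: "A"},
    "s3": {},
}

# Step 3: one full-length genome per sample.
genomes = {name: apply_snps(reference, snps) for name, snps in sample_snps.items()}

# Step 4: equal-length genomes placed together form the alignment.
# Step 5: keep only the columns that vary between samples.
columns = variable_columns(genomes)
reduced = {name: "".join(seq[i] for i in columns) for name, seq in genomes.items()}

# Step 6: pairwise SNP distances computed on the reduced alignment.
distances = snp_distances(reduced)
```

Because invariant columns contribute nothing to a mismatch count, computing distances on the reduced alignment gives the same values as computing them on the full-length genomes, which is why step 5 can simplify the file without changing the result of step 6.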

Outputs

ID Name Description Type
{input_file} #main/{input_file} n/a
  • File
_anonymous_output_1 #main/_anonymous_output_1 n/a
  • File
_anonymous_output_2 #main/_anonymous_output_2 n/a
  • File
_anonymous_output_3 #main/_anonymous_output_3 n/a
  • File
_anonymous_output_4 #main/_anonymous_output_4 n/a
  • File

Version History

1.0 (latest) Created 16th Jul 2024 at 14:24 by Helena Rasche

Added/updated 4 files


Open master 058351f

2.0 (earliest) Created 25th Jun 2024 at 11:22 by Helena Rasche

Added/updated 4 files


Frozen 2.0 5a40a06
Created: 25th Jun 2024 at 11:22

Last updated: 25th Jun 2024 at 11:22
