Preparing genomic data for phylogeny reconstruction (GTN)

Workflow Type: Galaxy

This workflow starts from a set of genome assemblies from different samples, strains, or species. Each genome is first annotated with Funannotate, and the predicted proteins are further assessed with BUSCO. Next, Proteinortho finds orthologs across the samples and groups them into orthogroups. Only orthogroups in which every sample is represented are extracted. The orthologs in each orthogroup are aligned with ClustalW, the alignments are trimmed with ClipKIT, and PhyKit builds the concatenated alignment matrix, which can then be used for phylogeny reconstruction.
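The orthogroup selection described above can be sketched in Python. This is only an illustration, not the workflow's actual Filter step: it assumes the usual Proteinortho table layout (tab-separated, three leading columns for species count, gene count, and connectivity, then one column per proteome, with `*` marking an absent sample). The function name `complete_orthogroups` and the `single_copy` option are hypothetical.

```python
import csv
import io


def complete_orthogroups(table_text, single_copy=True):
    """Keep rows of a Proteinortho-style table in which every sample is present.

    Assumption: columns 1-3 are species count, gene count, and connectivity;
    the remaining columns hold one proteome each.
    """
    reader = csv.reader(io.StringIO(table_text), delimiter="\t")
    header = next(reader)
    sample_columns = header[3:]
    n_samples = len(sample_columns)
    kept = []
    for row in reader:
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and stray comment lines
        n_species, n_genes = int(row[0]), int(row[1])
        if n_species != n_samples:
            continue  # at least one sample is missing from this orthogroup
        if single_copy and n_genes != n_samples:
            continue  # optional: drop orthogroups that contain paralogs
        kept.append(dict(zip(sample_columns, row[3:])))
    return kept


# Made-up example table with three proteomes.
example = (
    "# Species\tGenes\tAlg.-Conn.\tA.faa\tB.faa\tC.faa\n"
    "3\t3\t1\ta1\tb1\tc1\n"
    "2\t2\t1\ta2\t*\tc2\n"
    "3\t4\t0.8\ta3,a4\tb3\tc3\n"
)
complete = complete_orthogroups(example)
```

Requiring one gene per sample is a stricter choice than the description states. Pass `single_copy=False` to keep every orthogroup in which all samples are represented.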

Associated Tutorial

This workflow is part of the tutorial "Preparing genomic data for phylogeny reconstruction", available in the GTN.

Thanks to...

Tutorial Author(s): Miguel Roncoroni, Brigida Gallone

Workflow Author(s): Miguel Roncoroni


Inputs

ID Name Description Type
Input genomes as collection #main/Input genomes as collection n/a
  • array containing
    • File

Steps

ID Name Description
1 Replace Text toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_replace_in_line/1.1.2
2 RepeatMasker toolshed.g2.bx.psu.edu/repos/bgruening/repeat_masker/repeatmasker_wrapper/4.1.2-p1+galaxy0
3 Funannotate predict annotation toolshed.g2.bx.psu.edu/repos/iuc/funannotate_predict/funannotate_predict/1.8.9+galaxy2
4 Extract ORF toolshed.g2.bx.psu.edu/repos/bgruening/glimmer_gbk_to_orf/glimmer_gbk_to_orf/3.02
5 Regex Find And Replace toolshed.g2.bx.psu.edu/repos/galaxyp/regex_find_replace/regex1/1.0.1
6 Collapse Collection toolshed.g2.bx.psu.edu/repos/nml/collapse_collections/collapse_dataset/4.2
7 Proteinortho toolshed.g2.bx.psu.edu/repos/iuc/proteinortho/proteinortho/6.0.14+galaxy2.9.1
8 Busco toolshed.g2.bx.psu.edu/repos/iuc/busco/busco/4.1.4
9 Filter Filter1
10 Proteinortho grab proteins toolshed.g2.bx.psu.edu/repos/iuc/proteinortho_grab_proteins/proteinortho_grab_proteins/6.0.14+galaxy2.9.1
11 Regex Find And Replace toolshed.g2.bx.psu.edu/repos/galaxyp/regex_find_replace/regex1/1.0.1
12 ClustalW toolshed.g2.bx.psu.edu/repos/devteam/clustalw/clustalw/2.1
13 ClipKIT. Alignment trimming software for phylogenetics. toolshed.g2.bx.psu.edu/repos/padge/clipkit/clipkit/0.1.0
14 PhyKit - Alignment-based functions toolshed.g2.bx.psu.edu/repos/padge/phykit/phykit_alignment_based/0.1.0
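Several of the steps above are text replacements on FASTA headers. The exact patterns used in the workflow are not given here, so the following Python sketch only illustrates the general idea: prefix each header with its sample name so that proteins remain traceable once all proteomes are merged into one file. The function name, the underscore separator, and the choice to drop the header description are assumptions.

```python
import re


def add_sample_to_headers(fasta_text, sample):
    """Rewrite every FASTA header as '>sample_id', dropping any description."""
    return re.sub(
        r"^>(\S+).*$",  # header line: '>' followed by the sequence ID
        lambda match: f">{sample}_{match.group(1)}",
        fasta_text,
        flags=re.MULTILINE,
    )


# Made-up two-record proteome for a hypothetical sample "strainA".
renamed = add_sample_to_headers(">g1 hypothetical protein\nMK\n>g2\nAL\n", "strainA")
```

Using a function as the replacement avoids problems with sample names that contain backslashes or group references.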

Outputs

ID Name Description Type
A partition file ready for input into RAxML or IQ-tree #main/A partition file ready for input into RAxML or IQ-tree n/a
  • File
An occupancy file that summarizes the taxon occupancy per sequence #main/An occupancy file that summarizes the taxon occupancy per sequence n/a
  • File
ClustalW on input dataset(s): clustal #main/ClustalW on input dataset(s): clustal n/a
  • File
Concatenated fasta alignment file #main/Concatenated fasta alignment file n/a
  • File
Proteinortho on input dataset(s): orthology-groups #main/Proteinortho on input dataset(s): orthology-groups n/a
  • File
Proteinortho_extract_by_orthogroup #main/Proteinortho_extract_by_orthogroup n/a
  • File
Trimmed alignment. #main/Trimmed alignment. n/a
  • File
_anonymous_output_1 #main/_anonymous_output_1 n/a
  • File
_anonymous_output_2 #main/_anonymous_output_2 n/a
  • File
_anonymous_output_3 #main/_anonymous_output_3 n/a
  • File
_anonymous_output_4 #main/_anonymous_output_4 n/a
  • File
_anonymous_output_5 #main/_anonymous_output_5 n/a
  • File
_anonymous_output_6 #main/_anonymous_output_6 n/a
  • File
_anonymous_output_7 #main/_anonymous_output_7 n/a
  • File
extracted_ORFs #main/extracted_ORFs n/a
  • File
fasta_header_cleaned #main/fasta_header_cleaned n/a
  • File
funannotate_predicted_proteins #main/funannotate_predicted_proteins n/a
  • File
headers_shortened #main/headers_shortened n/a
  • File
proteomes_to_one_file #main/proteomes_to_one_file n/a
  • File
repeat_masked #main/repeat_masked n/a
  • File
sample_names_to_headers #main/sample_names_to_headers n/a
  • File
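The concatenated alignment and partition file listed among the outputs are produced by PhyKit in the workflow. To show what such a concatenation involves, here is a minimal Python sketch, not PhyKit's implementation. It assumes that sequence names are identical across alignments (one name per sample), that missing taxa are padded with `?`, and that partitions are written as RAxML-style lines with an `AUTO` model placeholder. All function names are hypothetical.

```python
def parse_fasta(text):
    """Parse FASTA text into a dict mapping sequence name to sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def concatenate(alignments):
    """Concatenate per-gene alignments into one matrix plus partition lines.

    alignments: dict mapping gene name to a dict of taxon -> aligned sequence.
    """
    taxa = sorted({taxon for aln in alignments.values() for taxon in aln})
    pieces = {taxon: [] for taxon in taxa}
    partitions = []
    start = 1
    for gene, aln in alignments.items():
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"sequences in {gene} differ in length")
        length = lengths.pop()
        for taxon in taxa:
            # Pad taxa that are absent from this gene with missing data.
            pieces[taxon].append(aln.get(taxon, "?" * length))
        partitions.append(f"AUTO, {gene}={start}-{start + length - 1}")
        start += length
    matrix = {taxon: "".join(parts) for taxon, parts in pieces.items()}
    return matrix, partitions


# Made-up alignments for two orthogroups and two samples.
genes = {
    "og1": parse_fasta(">A\nMK-L\n>B\nMKAL\n"),
    "og2": parse_fasta(">A\nGG\n>B\nGA\n"),
}
matrix, partitions = concatenate(genes)
```

Because the workflow keeps only orthogroups in which every sample is represented, the padding branch should rarely be needed. It is included so the sketch also handles incomplete alignments.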

Version History

1.0 (latest) Created 16th Jul 2024 at 14:10 by Helena Rasche

Added/updated 4 files


Open master 17df822

2.0 (earliest) Created 25th Jun 2024 at 11:17 by Helena Rasche

Added/updated 4 files


Frozen 2.0 0a6a734

Created: 25th Jun 2024 at 11:17

Last updated: 25th Jun 2024 at 11:17
