6 - RecSel
Version 1

Workflow Type: Galaxy

Evolutionary Analysis

Live Resources

Instance           Workflow          History
usegalaxy.org      Galaxy workflow   Galaxy history
usegalaxy.eu       Galaxy workflow   Galaxy history
usegalaxy.org.au   Galaxy workflow   Galaxy history
usegalaxy.be       Galaxy workflow   Galaxy history

What's the point?

Wu et al. showed recombination between COVID-19 and bat coronaviruses located within the S-gene. We want to confirm this observation and provide a publicly accessible workflow for recombination detection.

In previous coronavirus outbreaks (SARS), retrospective analyses determined that adaptive substitutions might have occurred in the S-protein (Zhang et al.), e.g., related to ACE2 receptor utilization. While data on COVID-19 are currently limited, we investigated whether the lineage leading to the COVID-19 isolates showed any evidence of positive diversifying selection.


We employ a recombination detection algorithm (GARD) developed by Kosakovsky Pond et al. and implemented in the hyphy package. To select a representative set of S-genes, we perform a blast search against the nr database using the S-gene CDS from NC_045512 as the query. From the results we select the coding regions corresponding to the S-gene from a number of COVID-19 genomes and original SARS isolates. This set of sequences can be found in this repository.

We then generate a codon-based alignment using the workflow shown below and perform the recombination analysis using the gard tool from the hyphy package.

For selection analyses, we apply the adaptive Branch-Site Random Effects Likelihood (aBSREL) method, using the absrel tool from the hyphy package, to test whether each branch of the tree shows evidence of diversifying positive selection along a fraction of sites.


A set of unaligned CDS sequences for the S-gene.


A recombination report:

and a map of possible recombination hotspots:

A selection analysis summary and tree (COVID-19 isolate is MN988668_1)

and a plot of the inferred ω distribution for the MN988668_1 branch.

History and workflow

A Galaxy workspace (history) containing the most current analysis can be imported from here.

The publicly accessible workflow can be downloaded and installed on any Galaxy instance. It contains version information for all tools used in this analysis.

The workflow takes unaligned CDS sequences, translates them with EMBOSS:transeq, aligns the translations using mafft, realigns the original CDS input using the mafft alignment as a guide, and sends this codon-based alignment to gard.
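The realignment step can be illustrated with a minimal Python sketch. It is not the EMBOSS implementation: it only shows the idea of threading each original codon onto an aligned protein sequence, so that every protein gap becomes a three-nucleotide gap. The function names (`translate`, `thread_codons`) and the toy sequences are hypothetical, and the sketch assumes the standard genetic code and in-frame CDS input.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(cds):
    """Translate an in-frame CDS; unknown codons become 'X'."""
    cds = cds.upper()
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def thread_codons(cds, aligned_protein, gap="-"):
    """Map the codons of `cds` onto a gapped protein sequence.

    Each residue consumes the next codon; each gap becomes three gap characters.
    """
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    residues = sum(1 for ch in aligned_protein if ch != gap)
    if residues != len(codons):
        raise ValueError("protein alignment and CDS disagree in length")
    out, next_codon = [], 0
    for ch in aligned_protein:
        if ch == gap:
            out.append(gap * 3)
        else:
            out.append(codons[next_codon])
            next_codon += 1
    return "".join(out)


# Toy example: two CDS sequences and a hand-made protein alignment.
cds_a, cds_b = "ATGGCTTGG", "ATGTGG"
codon_alignment = [
    thread_codons(cds_a, "MAW"),
    thread_codons(cds_b, "M-W"),
]
```

Because gaps are only ever inserted in multiples of three, the reading frame is preserved, which is what codon-based methods such as gard and absrel require of their input.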


Tools used in this analysis are also available from BioConda:

  • emboss
  • mafft
  • hyphy
  • fasttree


ID         Name       Description   Type
S_nt.fna   S_nt.fna   n/a           File


ID Name Description
1 transeq toolshed.g2.bx.psu.edu/repos/devteam/emboss_5/EMBOSS: transeq101/5.0.0
2 MAFFT toolshed.g2.bx.psu.edu/repos/rnateam/mafft/rbc_mafft/7.221.3
3 tranalign toolshed.g2.bx.psu.edu/repos/devteam/emboss_5/EMBOSS: tranalign100/5.0.0
4 FASTTREE toolshed.g2.bx.psu.edu/repos/iuc/fasttree/fasttree/2.1.10+galaxy1
5 HyPhy-GARD toolshed.g2.bx.psu.edu/repos/iuc/hyphy_gard/hyphy_gard/2.5.4+galaxy0
6 HyPhy-aBSREL toolshed.g2.bx.psu.edu/repos/iuc/hyphy_absrel/hyphy_absrel/2.5.4+galaxy0
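For readers who want to approximate these six steps outside Galaxy with the BioConda packages, the following Python sketch assembles plausible command lines without running them. It is an approximation under stated assumptions, not an export of the workflow: the intermediate file names are hypothetical, the executable name `FastTree` and all flags should be checked against the installed versions, and feeding the FastTree tree to aBSREL is an assumption about how the steps are wired. The Galaxy workflow file remains the authoritative record of tool versions and parameters.

```python
def build_commands(cds="S_nt.fna", prefix="S"):
    """Return (argv, stdout_path) pairs for the six steps; stdout_path is None
    when the tool writes its own output file. File names are hypothetical."""
    protein = f"{prefix}_aa.fasta"          # translated CDS
    protein_aln = f"{prefix}_aa.aln.fasta"  # protein alignment
    codon_aln = f"{prefix}_codon.fasta"     # codon-based alignment
    tree = f"{prefix}.nwk"                  # Newick tree

    return [
        # 1. Translate the CDS sequences (EMBOSS transeq).
        (["transeq", "-sequence", cds, "-outseq", protein], None),
        # 2. Align the translations (mafft writes to stdout).
        (["mafft", "--auto", protein], protein_aln),
        # 3. Thread the CDS onto the protein alignment (EMBOSS tranalign).
        (["tranalign", "-asequence", cds, "-bsequence", protein_aln,
          "-outseq", codon_aln], None),
        # 4. Build a nucleotide tree (FastTree writes to stdout).
        (["FastTree", "-nt", codon_aln], tree),
        # 5. Screen for recombination breakpoints.
        (["hyphy", "gard", "--alignment", codon_aln], None),
        # 6. Test each branch for episodic diversifying selection.
        (["hyphy", "absrel", "--alignment", codon_aln, "--tree", tree], None),
    ]


commands = build_commands()
```

Each pair could be passed to `subprocess.run(argv, stdout=open(path, "w"), check=True)` (or without `stdout` when the path is `None`) once the tools are installed.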


ID                    Name                  Description   Type
_anonymous_output_1   _anonymous_output_1   n/a           File
_anonymous_output_2   _anonymous_output_2   n/a           File
_anonymous_output_3   _anonymous_output_3   n/a           File

Version History

Version 1 (earliest) Created 25th Mar 2020 at 10:05 by Finn Bacall

Added/updated 13 files

Open master ba9b2c7

Created: 25th Mar 2020 at 10:05

Last updated: 25th Mar 2020 at 11:23


