Analysis of S-protein polymorphism
What's the point?
In the previous portion of this study we found a non-synonymous polymorphism within the S-gene. In this section we try to interpret its possible effect.
Obtain coding sequences of S proteins from a diverse group of coronaviruses. Generate amino acid alignment to assess conservation of the polymorphic location.
Bat SARS Coronavirus Rs806/2006
Bat SARS-like coronavirus BatCoV/BB9904/BGR/2008
Bat SARS-like coronavirus WIV1
MERS coronavirus isolate Riyadh_1175/KSA/2014
MERS coronavirus isolate Riyadh_1337/KSA/2014
MERS coronavirus isolate Riyadh_1340/KSA/2014
MERS coronavirus strain Tunisia-Qatar_2013
Murine hepatitis virus
Murine hepatitis virus strain TY
SARS coronavirus A013
SARS coronavirus A021
SARS coronavirus B029
SARS coronavirus C013
SARS coronavirus C018
SARS coronavirus HHS-2004
SARS coronavirus isolate CUHKtc10NP
SARS coronavirus isolate CUHKtc14NP
SARS coronavirus isolate CUHKtc32NP
Feline infectious peritonitis virus
Swine enteric coronavirus strain Italy/213306/2009
Transmissible gastroenteritis virus
These viruses were chosen based on a publication by Duquerroy et al. (2005). The sequences were extracted manually, which was a painful process. We will develop a tool for parsing particular CDS sequences automatically for future analyses.
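As an illustration of what automated CDS extraction could look like, here is a minimal sketch. It is not the tool mentioned above. It assumes the genome is already available as a plain string and that the CDS location is given in simple GenBank feature-table syntax (`start..end`, `join(...)`, `complement(...)`). The function name `extract_cds` is hypothetical, and partial locations such as `<1..>200` are not handled.

```python
import re

# Complement table used for features annotated on the minus strand.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def extract_cds(genome: str, location: str) -> str:
    """Hypothetical helper: cut a CDS out of a genome string.

    Supports 'start..end', 'join(a..b,c..d)' and an outer 'complement(...)'.
    Coordinates are 1-based and inclusive, as in GenBank feature tables.
    """
    location = location.replace(" ", "")
    reverse = location.startswith("complement(")
    # Collect every 'start..end' span in order of appearance.
    spans = [(int(a), int(b)) for a, b in re.findall(r"(\d+)\.\.(\d+)", location)]
    if not spans:
        raise ValueError(f"unsupported location: {location!r}")
    # Convert 1-based inclusive coordinates to Python slices and concatenate.
    cds = "".join(genome[a - 1:b] for a, b in spans)
    if reverse:
        cds = cds.translate(_COMPLEMENT)[::-1]
    return cds
```

For example, `extract_cds(genome, "join(3..5,9..11)")` concatenates the two spans in the order listed.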
We produce two alignments of Betacoronavirus spike proteins, one at the nucleotide level and one at the amino acid level. The alignments can be visualized with the Multiple Sequence Alignment visualization in Galaxy:
Figure: Alignments of Spike proteins. A. CDS alignments. B. Protein alignment.
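Once the amino acid alignment exists, conservation at the polymorphic location can be summarized by counting residues in the corresponding alignment column. The sketch below is one simple way to do that. It assumes the alignment has been loaded into a dictionary of equal-length strings with `-` as the gap character. The function name `column_conservation` and the 0-based column index are illustrative choices, not part of the original workflow.

```python
from collections import Counter


def column_conservation(aligned, column):
    """Hypothetical helper: summarize residue usage at one alignment column.

    aligned: dict mapping sequence name -> aligned protein string.
    column:  0-based column index in the alignment.
    Returns (most common residue, its fraction among non-gap residues, counts).
    """
    residues = [seq[column] for seq in aligned.values()]
    # Gaps carry no residue information, so they are excluded from the counts.
    counts = Counter(r for r in residues if r != "-")
    if not counts:
        raise ValueError("column contains only gaps")
    residue, n = counts.most_common(1)[0]
    return residue, n / sum(counts.values()), counts
```

A fraction near 1.0 indicates a column where almost all sequences share the same residue. A lower value indicates a variable position.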
Workflow and history
The Galaxy history containing the latest analysis can be found here. The publicly accessible workflow can be downloaded and installed on any Galaxy instance. It contains all information about tool versions and parameters used in this analysis.
The transeq tool converts the CDS sequences into protein sequences, which we then align to each other using mafft. The output is fed into tranalign along with the nucleotide sequences. tranalign produces a nucleotide alignment coherent with the protein alignment.
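The logic of the translate-then-back-thread steps can be sketched in plain Python. This is a simplified illustration of the idea, not the implementation of transeq or tranalign. It assumes the standard genetic code and in-frame CDS sequences, and it takes the aligned protein as given (in the workflow that alignment comes from mafft). The function names `translate` and `thread_codons` are hypothetical.

```python
from itertools import product

# Standard genetic code, built in the conventional TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds):
    """Translate an in-frame CDS; unknown codons become 'X'."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, usable, 3))


def thread_codons(aligned_protein, cds):
    """Map an aligned protein back onto its CDS, one codon per residue.

    Each gap in the protein alignment becomes a three-nucleotide gap,
    so the nucleotide alignment stays coherent with the protein alignment.
    """
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    n_residues = sum(1 for aa in aligned_protein if aa != "-")
    if n_residues > len(codons):
        raise ValueError("protein has more residues than the CDS has codons")
    out, k = [], 0
    for aa in aligned_protein:
        if aa == "-":
            out.append("---")
        else:
            out.append(codons[k])
            k += 1
    return "".join(out)
```

For two toy sequences whose proteins align as `MKL` and `M-L`, threading the second CDS `ATGCTG` through `M-L` gives `ATG---CTG`, which lines up codon by codon with `ATGAAACTG`.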
Tools used in this analysis are also available from BioConda:
Created: 25th Mar 2020 at 10:05
Last updated: 25th Mar 2020 at 11:23
Last used: 1st Mar 2021 at 21:46