This implementation was developed for the ELIXIR EXCELERATE Demonstrator and is intended to run in a WES-TES environment.
The workflow also includes tasks for transferring the input files (with curl) and for transferring the output files to a private FTP server. The required URLs and FTP credentials are given as input parameters.
The input files are assumed to come from a web server. The idea is to make some of the configuration more modular: for tools such as lftp and curl, script or parameter files are easier to write than command-line parameters.
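As an illustration of the parameter-file approach, the following minimal sketch renders a curl config file (the format read by `curl --config`) from a list of downloads. The URLs, file names, and the choice of the `fail` and `location` options are assumptions made for this example and are not taken from the workflow itself.

```python
from typing import Iterable, Tuple


def quote(value: str) -> str:
    """Quote a value for a curl config file, escaping backslashes and double quotes."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def render_curl_config(downloads: Iterable[Tuple[str, str]]) -> str:
    """Render one url/output pair per download in curl's config-file syntax."""
    # Long option names may be written without the leading dashes.
    lines = ["fail", "location"]  # assumed options: fail on HTTP errors, follow redirects
    for url, output in downloads:
        lines.append(f"url = {quote(url)}")
        lines.append(f"output = {quote(output)}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical URLs and file names, for illustration only.
    config = render_curl_config([
        ("https://example.org/data/sample_1.fastq.gz", "sample_1.fastq.gz"),
        ("https://example.org/data/sample_2.fastq.gz", "sample_2.fastq.gz"),
    ])
    print(config)
```

The printed text could be saved to a file such as `downloads.cfg` (a hypothetical name) and passed to curl with `curl --config downloads.cfg`, which keeps the download list out of the command line.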
A Pure-FTPd server was installed for the output file transfer. It allows creating virtual user accounts and executing an upload script (see man pure-uploadscript). It is therefore possible to move uploaded files automatically to another location, so that the FTP credentials used for the upload cannot be used to download the files later. A separate user account can then be used to access the files.
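The upload-script idea can be sketched as follows. This is a minimal example and not the script used by the authors; it assumes that pure-uploadscript passes the absolute path of the uploaded file as the first argument and sets the `UPLOAD_VUSER` environment variable for virtual users, as described in its man page. The destination directory `/srv/pipeline-results` is hypothetical.

```python
#!/usr/bin/env python3
import os
import shutil
import sys
from pathlib import Path


def move_upload(uploaded: Path, dest_root: Path, vuser: str) -> Path:
    """Move an uploaded file into dest_root/<vuser>/ and return its new path."""
    target_dir = dest_root / vuser
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / uploaded.name
    shutil.move(str(uploaded), str(target))
    return target


if __name__ == "__main__" and len(sys.argv) > 1:
    # The path of the newly uploaded file arrives as the first argument.
    uploaded_path = Path(sys.argv[1])
    # Virtual user name; falls back to a placeholder if the variable is unset.
    user = os.environ.get("UPLOAD_VUSER", "unknown")
    # Hypothetical destination outside the upload account's FTP directory.
    dest_root = Path("/srv/pipeline-results")
    print(move_upload(uploaded_path, dest_root, user))
```

Because the file leaves the upload account's directory as soon as the transfer completes, the upload credentials no longer give access to it.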
The .yml files allow each CWL file to be run separately with cwltool. For this, the input files must be available at the paths defined in the .yml file. Using local files allows faster execution of the pipeline, or execution of just a single task.
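A job file for a single step could be generated along these lines. This is a sketch under assumptions: the input name `reference_genome`, the local path, and the file names `gunzip.cwl` and `gunzip-job.yml` are hypothetical, and the real tool descriptions may expect different input names or types. The output is JSON, which cwltool also accepts as an input object.

```python
import json
from typing import Any, Dict


def cwl_file(path: str) -> Dict[str, str]:
    """Return a CWL File object pointing at a local path."""
    return {"class": "File", "path": path}


def build_job(inputs: Dict[str, Any]) -> str:
    """Serialize an input object as JSON text for use as a job file."""
    return json.dumps(inputs, indent=2, sort_keys=True) + "\n"


if __name__ == "__main__":
    # Hypothetical input name and local path, for illustration only.
    job = build_job({"reference_genome": cwl_file("data/reference.fa.gz")})
    print(job)
```

Saved as `gunzip-job.yml`, such a file would be passed to cwltool together with the tool description, for example `cwltool gunzip.cwl gunzip-job.yml`.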
Inputs
ID | Name | Description | Type |
---|---|---|---|
fastq_files | n/a | List of paired-end input FASTQ files | |
reference_genome | n/a | Compressed FASTA file with the reference genome chromosomes | |
known_indels_file | n/a | VCF file of known indels, corresponding to the reference genome assembly | |
known_sites_file | n/a | VCF file of known sites (for instance dbSNP), corresponding to the reference genome assembly | |
chromosome | n/a | Label of the chromosome to be used for the analysis. By default, all chromosomes are used | |
readgroup_str | n/a | Parsing header, which should correspond to the FASTQ files | |
sample_name | n/a | Sample name | |
gqb | n/a | Exclusive upper bounds for reference confidence GQ bands (must be in [1, 100] and specified in increasing order) | |
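The constraint on `gqb` can be checked before a run is submitted. The following minimal sketch assumes that "increasing order" means strictly increasing and that the bands are given as integers; the function name `validate_gq_bands` and the example values are hypothetical.

```python
from typing import List, Sequence


def validate_gq_bands(bands: Sequence[int]) -> List[int]:
    """Return the bands as a list, or raise ValueError if they break the rule."""
    for band in bands:
        if not 1 <= band <= 100:
            raise ValueError(f"GQ band {band} is outside [1, 100]")
    # Assumption: equal neighbouring values are rejected as well.
    for previous, current in zip(bands, bands[1:]):
        if current <= previous:
            raise ValueError(f"GQ bands must increase: {previous} then {current}")
    return list(bands)


if __name__ == "__main__":
    print(validate_gq_bands([20, 30, 60]))  # hypothetical band values
```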
Steps
ID | Name | Description |
---|---|---|
unzipped_known_sites | n/a | n/a |
unzipped_known_indels | n/a | n/a |
gunzip | n/a | n/a |
picard_dictionary | n/a | n/a |
cutadapt2 | n/a | n/a |
bwa_index | n/a | n/a |
samtools_index | n/a | n/a |
bwa_mem | n/a | n/a |
samtools_sort | n/a | n/a |
picard_markduplicates | picard-MD | n/a |
gatk3-rtc | gatk3-rtc | n/a |
gatk-ir | gatk-ir | n/a |
gatk-base_recalibration | gatk-base_recalibration | n/a |
gatk-base_recalibration_print_reads | gatk-base_recalibration_print_reads | n/a |
gatk_haplotype_caller | gatk-haplotype_caller | n/a |
Outputs
ID | Name | Description | Type |
---|---|---|---|
metrics | n/a | Several metrics about the result | |
gvcf | n/a | Unannotated gVCF output file from the mapping and variant calling pipeline | |
Version History
Version 4 (latest) Created 11th Jan 2021 at 21:40 by Laura Rodriguez-Navas
new step to unzip known indels file
master
8854542
Version 3 Created 3rd Nov 2020 at 11:56 by Laura Rodriguez-Navas
master
7e970c0
Version 2 Created 30th Oct 2020 at 10:11 by Laura Rodriguez-Navas
master
a9d0efc
Version 1 (earliest) Created 2nd Mar 2020 at 10:49 by Laura Rodriguez-Navas
Added/updated 1 files
master
816231b
Created: 2nd Mar 2020 at 10:49
Last updated: 16th Apr 2021 at 10:38