Australian-Structural-Biology-Computing/bindflow
Introduction
Australian-Structural-Biology-Computing/bindflow is a bioinformatics pipeline that generates protein binders for user-defined hotspot residues on a target protein structure. The pipeline runs until a user-defined number of designs pass the in silico quality control (QC) criteria.
Bindflow wraps the BindCraft tool for convenient execution on HPC infrastructure. BindCraft combines four core binder design modules in a single tool:
- Structure proposal (AlphaFold2 multimer hallucination)
- Sequence design (SolubleMPNN)
- Structure prediction (AlphaFold2)
- Post-design quality control (PyRosetta)
[!WARNING] Post-design QC filtering is conducted with PyRosetta. Users must agree to the PyRosetta license terms.
Usage
[!NOTE] If you are new to Nextflow and nf-core, please refer to this page on how to set up Nextflow. Make sure to test your setup with -profile test before running the workflow on actual data.
First, prepare a samplesheet (samplesheet.csv) with your input data that looks as follows:
id,binder_name,starting_pdb,chains,target_hotspot_residues,min_length,max_length,number_of_final_designs,settings_advanced,settings_filters
demo,PDL1,PDL1.pdb,A,"56",65,150,10,default_4stage_multimer.json,default_filters.json
Each row represents a single design instance. Detailed documentation describing job parameters can be found in the BindCraft documentation. Briefly:
- id is a unique job identifier.
- binder_name is an identifier for the protein target.
- starting_pdb contains the target structure in PDB format.
- chains defines the target chains in the starting_pdb.
- target_hotspot_residues defines the residue indices in starting_pdb that the design process will target.
- min_length defines the minimum length of the designed binder.
- max_length defines the maximum length of the designed binder.
- number_of_final_designs defines the number of binders required to pass QC criteria before the job is complete.
- settings_advanced defines advanced BindCraft settings (JSON format).
- settings_filters defines advanced BindCraft filter settings (JSON format).
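A samplesheet can be sanity-checked before launching a run. The following is a minimal sketch, not the pipeline's own validation: check_samplesheet is a hypothetical helper, the column list is copied from the header shown above, and the rules it enforces (unique id, integer lengths, min_length not exceeding max_length, at least one final design) are assumptions drawn from the column descriptions.

```python
import csv
import io

# Column names copied from the samplesheet header shown above.
EXPECTED_COLUMNS = [
    "id", "binder_name", "starting_pdb", "chains", "target_hotspot_residues",
    "min_length", "max_length", "number_of_final_designs",
    "settings_advanced", "settings_filters",
]


def check_samplesheet(text):
    """Return a list of problems found in samplesheet text (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(text))
    if reader.fieldnames != EXPECTED_COLUMNS:
        return [f"unexpected header: {reader.fieldnames}"]

    problems = []
    seen_ids = set()
    for row in reader:
        line_no = reader.line_num
        # DictReader stores surplus fields under the key None and
        # fills missing fields with the value None.
        if None in row or None in row.values():
            problems.append(f"line {line_no}: wrong number of fields")
            continue
        if row["id"] in seen_ids:
            problems.append(f"line {line_no}: duplicate id {row['id']!r}")
        seen_ids.add(row["id"])
        try:
            shortest = int(row["min_length"])
            longest = int(row["max_length"])
            wanted = int(row["number_of_final_designs"])
        except ValueError:
            problems.append(f"line {line_no}: lengths and design count must be integers")
            continue
        if shortest > longest:
            problems.append(f"line {line_no}: min_length exceeds max_length")
        if wanted < 1:
            problems.append(f"line {line_no}: number_of_final_designs must be at least 1")
    return problems


if __name__ == "__main__":
    with open("samplesheet.csv", newline="", encoding="utf-8") as handle:
        for problem in check_samplesheet(handle.read()):
            print(problem)
```

The csv module is used rather than splitting on commas so that quoted values, such as the target_hotspot_residues field in the example row, are read as a single field.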
Workloads can be distributed over multiple GPUs by setting the --batches command-line argument, which splits the number of final designs into separate batches.
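To make the arithmetic behind batching concrete, the sketch below shows one way a design quota could be divided. It assumes an even split with any remainder assigned to the first batches; the rule the pipeline actually applies is not described here, and split_designs is a hypothetical helper, not part of bindflow.

```python
def split_designs(number_of_final_designs, batches):
    """Divide a design quota across batches as evenly as possible.

    Assumption: remainders go to the first batches. The pipeline's real
    splitting rule may differ.
    """
    if batches < 1:
        raise ValueError("batches must be at least 1")
    base, extra = divmod(number_of_final_designs, batches)
    return [base + 1 if index < extra else base for index in range(batches)]


if __name__ == "__main__":
    # Ten final designs, as in the example samplesheet, over three batches.
    print(split_designs(10, 3))
```

Under this assumption, each batch would need to produce only its share of passing designs, so batches can run on separate GPUs at the same time.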
Now, you can run the pipeline using:
nextflow run Australian-Structural-Biology-Computing/bindflow \
-profile <PROFILE> \
--input samplesheet.csv \
--outdir <OUTDIR> \
--batches 1
[!WARNING] Please provide pipeline parameters via the CLI or the Nextflow -params-file option. Custom config files, including those provided by the -c Nextflow option, can be used to provide any configuration except for parameters; see docs.
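As an illustration of the -params-file route, the following sketch writes the three parameters from the run command (input, outdir and batches) to a JSON file; Nextflow's -params-file option accepts JSON or YAML. The render_params helper, the params.json file name and the results output directory are illustrative assumptions, not names defined by the pipeline.

```python
import json


def render_params(input_csv, outdir, batches=1):
    """Return JSON text holding the pipeline parameters (hypothetical helper)."""
    params = {
        "input": input_csv,   # corresponds to --input
        "outdir": outdir,     # corresponds to --outdir
        "batches": batches,   # corresponds to --batches
    }
    return json.dumps(params, indent=2) + "\n"


if __name__ == "__main__":
    # "results" is a placeholder output directory chosen for this example.
    with open("params.json", "w", encoding="utf-8") as handle:
        handle.write(render_params("samplesheet.csv", "results", batches=1))
```

The resulting file could then be passed to nextflow run with -params-file params.json in place of the individual --input, --outdir and --batches flags.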
Credits
Australian-Structural-Biology-Computing/bindflow was originally written by Ziad Al-Bkhetan.
We thank the following people for their extensive assistance in the development of this pipeline:
Contributions and Support
If you would like to contribute to this pipeline, please see the contributing guidelines.
Citations
An extensive list of references for the tools used by the pipeline can be found in the CITATIONS.md file.
This pipeline uses code and infrastructure developed and maintained by the nf-core community, reused here under the MIT license.
The nf-core framework for community-curated bioinformatics pipelines.
Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.
Nat Biotechnol. 2020 Feb 13. doi: 10.1038/s41587-020-0439-x.
Version History
add-docs @ 61c983a (earliest) Created 5th Jul 2025 at 05:59 by Thomas Litfin