Where bindsight fits: positioning & landscape
A short, honest map of where bindsight sits in the protein-design / target-discovery ecosystem: who it's for, the gap it fills, and what's on the roadmap. If you're evaluating bindsight for your group, start here.
The gap bindsight fills
Two mature ecosystems sit side by side and barely talk to each other:
- Genomics / transcriptomics (DESeq2, edgeR, Seurat, scanpy, TCGA, recount3) stops at "here are the interesting genes."
- De novo protein / binder design (RFdiffusion, ProteinMPNN, BindCraft, BoltzGen, AlphaFold, Boltz-2) starts from "given a target structure…"
Every open binder-design tool we surveyed (BindCraft, BinderFlow, PXDesign, Latent-X) begins with a known target. The path from expression data to a designed, ranked, provenance-tracked binder candidate is built ad hoc for each project and is rarely reproducible.
bindsight ships that bridge as one open, reproducible tool — and records the receipts all the way back to the patient cohort.
Who it's for
| Audience | What bindsight gives you |
|---|---|
| Translational researchers | A free, reproducible "data → designed binder" path on a laptop, with free-GPU offload. |
| Clinical / cancer biologists | An audit trail from any binder candidate back to the cohort it came from. |
| Method developers | A held-out evaluation harness (rediscovery of known antigens) to benchmark new designers/validators behind a stable plugin interface. |
| Early-discovery teams | An open, extensible comparator you can plug proprietary designers into — no fork required. |
How it relates to neighboring tools
bindsight is an orchestration + provenance layer, not a new model. It stands on, and credits, the best open tools at each step (see Acknowledgments). Its contribution is the connective tissue: the surfaceome/targetable-site filter (SURFACE-Bind), the multi-objective ranking, the cost-aware GPU offload, the failure taxonomy, and the PROV-O + RO-Crate provenance that makes a run citable and reproducible.
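To make "multi-objective ranking" concrete, here is a minimal sketch of one common approach: sorting candidates into Pareto fronts. This is an illustration only. The source does not say which ranking method bindsight uses, and the objective names, the `Candidate` type, and the numbers below are all hypothetical.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    # All three objectives are hypothetical and "higher is better".
    name: str
    expression_score: float
    surface_score: float
    design_score: float


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is at least as good as b on every objective and better on one."""
    a_obj, b_obj = a[1:], b[1:]  # skip the name field
    return all(x >= y for x, y in zip(a_obj, b_obj)) and any(
        x > y for x, y in zip(a_obj, b_obj)
    )


def pareto_rank(candidates: list[Candidate]) -> dict[str, int]:
    """Rank 1 is the non-dominated front, rank 2 the next front, and so on."""
    remaining = list(candidates)
    ranks: dict[str, int] = {}
    rank = 1
    while remaining:
        # A candidate is on the current front if nothing left dominates it.
        front = [c for c in remaining if not any(dominates(o, c) for o in remaining)]
        for c in front:
            ranks[c.name] = rank
        remaining = [c for c in remaining if c not in front]
        rank += 1
    return ranks


# Made-up placeholder values, not real results.
candidates = [
    Candidate("target_A", 4.0, 0.90, 0.8),
    Candidate("target_B", 3.0, 0.95, 0.7),
    Candidate("target_C", 2.5, 0.60, 0.6),
    Candidate("target_D", 3.5, 0.50, 0.9),
]
ranks = pareto_rank(candidates)
print(ranks)
```

A Pareto ranking avoids collapsing unlike objectives into one weighted score, at the cost of leaving ties within a front; a real pipeline would need a tie-breaking rule on top.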
| | Typical binder-design tool | bindsight |
|---|---|---|
| Input | Target structure | RNA-seq counts |
| Provenance | PDB + maybe a log | PROV-O JSON-LD + RO-Crate, audit trail to cohort |
| Hardware | HPC assumed | CPU laptop + free Colab/Modal/Kaggle offload |
| Cost-awareness | None | `--dry-run` estimates GPU $ before running |
| Negative results | Discarded | Catalogued (`failure_taxonomy.parquet`) |
| Citability | Code dump | DOI per release, JSON-Schema-validated outputs |
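As an illustration of what an "audit trail to cohort" can look like, the sketch below builds a tiny PROV-O graph in JSON-LD and walks it from a binder candidate back to its source. Only the `prov:` terms (`prov:Entity`, `prov:wasDerivedFrom`) come from the W3C PROV-O vocabulary. The `ex:` identifiers, the graph shape, and the `trace_to_source` helper are hypothetical and are not bindsight's actual output schema.

```python
import json

# Hypothetical provenance graph: each entity points at what it was derived from.
doc = {
    "@context": {
        "prov": "http://www.w3.org/ns/prov#",
        "ex": "https://example.org/run/",  # placeholder namespace
    },
    "@graph": [
        {"@id": "ex:cohort", "@type": "prov:Entity"},
        {"@id": "ex:counts", "@type": "prov:Entity",
         "prov:wasDerivedFrom": {"@id": "ex:cohort"}},
        {"@id": "ex:target", "@type": "prov:Entity",
         "prov:wasDerivedFrom": {"@id": "ex:counts"}},
        {"@id": "ex:binder_001", "@type": "prov:Entity",
         "prov:wasDerivedFrom": {"@id": "ex:target"}},
    ],
}


def trace_to_source(graph: list[dict], start_id: str) -> list[str]:
    """Follow prov:wasDerivedFrom links from start_id until there is no parent."""
    by_id = {node["@id"]: node for node in graph}
    chain = [start_id]
    seen = {start_id}
    while True:
        parent = by_id.get(chain[-1], {}).get("prov:wasDerivedFrom")
        if parent is None:
            break
        parent_id = parent["@id"]
        if parent_id in seen:  # guard against a malformed, cyclic graph
            raise ValueError(f"cycle detected at {parent_id}")
        chain.append(parent_id)
        seen.add(parent_id)
    return chain


chain = trace_to_source(doc["@graph"], "ex:binder_001")
print(" <- ".join(chain))
# The graph is plain JSON, so it can be written next to the other run outputs.
serialized = json.dumps(doc, indent=2)
```

Running this prints `ex:binder_001 <- ex:target <- ex:counts <- ex:cohort`, which is the chain a reviewer would follow to check where a candidate came from.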
Roadmap
- v0.2.0 (now) — discovery half end-to-end on CPU; design + validation proven on a free GPU (bindsight's first real ERBB2 binders — see the designer benchmark); multi-page web UI live.
- v0.3.0 — live (async, non-blocking) Modal/Colab job submission; BindCraft + BoltzGen plugins fully wired; scRNA-seq input.
- v1.0.0 — JOSS submission + validation paper (blinded rediscovery of HER2/EGFR/MSLN/CLDN6).
Get involved
bindsight is AGPL-3.0-licensed and built in the open. If you run target discovery or binder design and want to compare notes — or you'd like to try it on your own cohort — open an issue or reach the author (@mikhaeelatefrizk, ORCID 0009-0006-1069-9558). Feedback from real workflows directly shapes the roadmap.