Use cases¶

Three concrete scenarios where bindsight saves serious time and produces defensible artifacts. Each is sized for a single user with a CPU laptop and ≤$50 of cloud GPU budget.

Use case 1 — Triaging a cancer cohort for surface-antigen targets¶

Scenario. You're a translational researcher with a TCGA cohort (or your own RNA-seq study) of tumor vs. matched normal samples. You want a ranked shortlist of cell-surface antigens that are (a) over-expressed in tumor, (b) low in vital tissues, and (c) druggable with antibody-class binders. From that shortlist, you want designed binder candidates.

Without bindsight (typical workflow today): - Week 1: Run DESeq2, manually filter, dump gene list. - Week 2: Cross-check against SURFY by hand. Annotate UniProt accessions. Pull AlphaFoldDB structures one by one. - Week 3: Set up RFdiffusion + ProteinMPNN environment on the cluster. Spend 3 days on PyRosetta install issues. - Week 4: Run designs. Set up Boltz-2 separately. Validate. Rank in Excel. - Week 5: Realize you forgot to filter for safety liabilities. Restart.

With bindsight:

bindsight discover my_cohort.yaml --out runs/cohort_v01
bindsight design   runs/cohort_v01 --backend modal --designer rfdiff_mpnn --trajectories 50
bindsight validate runs/cohort_v01 --backend modal --validator boltz2
bindsight rank     runs/cohort_v01
bindsight report   runs/cohort_v01 --format html
bindsight export   runs/cohort_v01 --format ro-crate --out cohort_v01.crate.zip

Wall time: ~30 min CPU + ~5–10 GPU-hours on Modal A100 (~$25–40), or several hours of free Colab.

Output: Ranked binders with iPTM > 0.65 against the top-5 surface antigens in your cohort, every one traceable back to the patients it came from.

Use case 2 — Methods benchmark of a new designer¶

Scenario. You're a methods developer who just published a new backbone diffusion model and want to compare it against RFdiffusion, BindCraft, and BoltzGen on a held-out target set with a fair upstream pipeline.

Without bindsight: you write 4 different driver scripts, hope you've configured each fairly, and reviewers complain that the comparisons aren't apples-to-apples.

With bindsight:

Implement your designer as a bindsight.design.Designer plugin (Protocol in bindsight/design/protocol.py, ~50 lines).

Register it via pyproject.toml:

[project.entry-points."bindsight.designers"]
my_designer = "my_package:MyDesigner"

Run the same config four times with different --designer flags:

for d in rfdiff_mpnn bindcraft boltzgen my_designer; do
    bindsight run examples/benchmark_held_out.yaml --designer "$d" \
        --out runs/bench_${d}
done

Compare via the bundled benchmark script:

bindsight benchmark runs/bench_*/  --known-antigens benchmarks/known.tsv \
    --out benchmark_report.html

Output: A side-by-side comparison of all four designers on identical inputs (same DEGs, same SURFY filter, same epitope prediction, same Boltz-2 validator), with the raw artifacts shipped as RO-Crates so reviewers can re-run any cell.

Use case 3 — Teaching computational biology end-to-end¶

Scenario. You're a PI running a graduate course on computational biology and want one project that touches DEG analysis, structural biology, deep learning, software engineering, and reproducibility — all in a 12-week semester.

The lesson plan with bindsight:

Week	Topic	Hands-on
1–2	RNA-seq + DEG	Run `bindsight discover` on a public TCGA cohort; interpret the volcano plot
3	Surfaceome + Open Targets	Inspect the `candidates.parquet`; understand each filter
4	Protein structure	Pull a candidate's AlphaFoldDB mmCIF; visualize in PyMOL
5–6	De novo design	Run `bindsight design --backend colab` (free); read the Colab notebook step by step
7	Validation	Boltz-2 vs Chai-1r — which agrees on which design?
8	Multi-objective ranking	Modify the rank weights; see which targets move
9	Reproducibility	Re-run with a different seed; diff the manifests
10	Report writing	Customize the HTML report template
11	Plugin development	Each student adds a custom filter or scorer
12	Final projects	Apply to a different cohort or disease

The same single repo serves the full arc from "what is differential expression" to "I added a custom plugin." Students get a portfolio piece.

Use case 4 (bonus) — Pharma early-discovery comparator¶

Scenario. You're at a small biotech with a proprietary scoring model and a few internal designers. You want a free, reproducible open-source pipeline to run alongside, both for sanity checks and for collaborator handoffs.

Why bindsight fits:

All defaults are MIT/Apache/BSD/CC-BY (LICENSING.md).
Plugin interface lets you wrap your proprietary designer / validator without forking. Internal models stay private; only the wrappers are added.
Container-pinned reproducibility means a partner running the same Docker image gets byte-identical outputs.
RO-Crate exports satisfy increasingly common funder reproducibility requirements.

Concretely:

# in your_company/internal_designer.py — never published
class InternalDesigner:
    name = "internal_designer_v3"
    version = "3.2.1"

    def make_spec(self, ...) -> DesignSpec: ...
    def submit(self, spec, runner) -> DesignResult:
        # call your internal model, package results to bindsight's schema
        ...

Then in pyproject.toml of your private package:

[project.entry-points."bindsight.designers"]
internal_designer = "your_company.internal_designer:InternalDesigner"

bindsight --designer internal_designer Just Works alongside RFdiff+MPNN / BindCraft / BoltzGen in the same comparison.

What unifies these use cases¶

In every scenario above, the value isn't a single algorithm — it's the opinionated, reproducible join between genomics evidence and protein design, with provenance preserved. You bring the data and the question; bindsight brings the pipeline.

If your use case isn't here, open a discussion — we'd like to add it.