● Capabilities

What it can do today

Everything below is built and working today; together it forms the governed Foundation. It is organised by what a user actually does: ask and explore the data, build and analyse, govern and trust, and record and restore. The roadmap at the bottom is kept separate so expectations stay honest.

💬 Conversational 🧱 Governed analysis 🛡️ Reviewed & reproducible

Available today

💬

Ask & explore

Plain-English chat agent — ask questions, get tables, charts & explanations
Schema Explorer — Flatiron tables/fields + Guardant gene panels (searchable)
Live catalog browse — catalogs → schemas → tables from Unity Catalog
Data dictionary — exact field meanings, coding, gene→panel coverage
Connection check — confirm the secure M2M identity any time

🧱

Build & analyse

Governed cohort builder — from validated definitions (biomarker, drug-class)
Saved, versioned cohorts — reusable, with patient counts
Feasibility reports — eligible N, biomarker & follow-up availability, covariate completeness, attrition funnel
Comparative analysis — cohort vs cohort, side-by-side with deltas
Time-to-event — Kaplan-Meier-style (rwPFS / time-to-next-treatment)
Interactive charts — switch type, zoom, recolour, export PNG/CSV
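For readers unfamiliar with the time-to-event item above, here is a minimal sketch of a Kaplan-Meier product-limit estimate in plain Python. It is an illustration only, not the product's implementation: the function name `kaplan_meier`, its inputs, and the sample values are all hypothetical.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: time from index date to the event or to censoring
    events:    1 if the event (e.g. progression) was observed, 0 if censored
    """
    observed = Counter(t for t, e in zip(durations, events) if e)
    leaving = Counter(durations)  # every patient leaves the risk set at their own time
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = observed.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction that did not have the event at t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        # Patients with an event or censored at t are no longer at risk afterwards.
        at_risk -= leaving[t]
    return curve


# Hypothetical follow-up times (months) and event flags for five patients.
curve = kaplan_meier([5, 8, 8, 12, 15], [1, 1, 0, 1, 0])
for t, s in curve:
    print(t, round(s, 3))
```

Censored patients stay in the risk set up to their censoring time and then drop out without lowering the curve, which is what distinguishes this estimate from a simple proportion.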

🛡️

Govern & trust

Review & validation — Draft → Validated, reviewer-gated, with comments
De-duplication — canonical signature + AI similarity check
Validated-cohort reuse — reuse trusted definitions instead of rebuilding
Chain of trust — evidence inherits its cohort’s status (trusted vs provisional)
Data-version pinning — every result tied to the exact data it read
Read-only, least-privilege — safe by construction
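The canonical-signature idea behind de-duplication can be sketched as follows. This is a hedged illustration, not the product's actual scheme: the function `canonical_signature`, the definition fields, and the normalisation rules (case-insensitive strings, order-insensitive lists) are assumptions chosen for the example.

```python
import hashlib
import json


def canonical_signature(definition):
    """Hash a cohort definition so that equivalent definitions collide.

    Assumed normalisation: keys and string values are trimmed and lowercased,
    and list order is treated as insignificant.
    """

    def normalise(value):
        if isinstance(value, dict):
            return {str(k).strip().lower(): normalise(v) for k, v in value.items()}
        if isinstance(value, (list, tuple)):
            items = [normalise(v) for v in value]
            # Sort by serialised form so mixed or nested items order deterministically.
            return sorted(items, key=lambda v: json.dumps(v, sort_keys=True))
        if isinstance(value, str):
            return value.strip().lower()
        return value

    payload = json.dumps(normalise(definition), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Two hypothetical definitions that differ only in ordering and letter case.
a = {"biomarker": "EGFR", "drug_class": ["TKI", "Chemo"]}
b = {"drug_class": ["chemo", "tki"], "Biomarker": "egfr "}
print(canonical_signature(a) == canonical_signature(b))
```

An exact signature only catches definitions that normalise to the same form; near-duplicates with different wording would still need a similarity check of the kind listed above.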

📋

Record & restore

Provenance log — append-only; SQL, code hash, model, prompt, data version
Audit trail — auth, every tool & model call, admin/review actions
Conversation history — every chat restorable, charts intact
Source-chat links — jump from any cohort/report to where it was made
Copy — one-click copy on every question & answer
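One common way to make an append-only log tamper-evident is to chain each entry to the hash of the previous one. The sketch below shows that technique using the fields named in the provenance item above (SQL, code hash, model, prompt, data version). Hash chaining is an assumption here, not a confirmed detail of the product, and the class `ProvenanceLog` and its methods are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder previous-hash for the first entry


def _digest(record):
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class ProvenanceLog:
    """In-memory, append-only log; each entry commits to the one before it."""

    def __init__(self):
        self._entries = []

    def append(self, sql, code, model, prompt, data_version):
        record = {
            "sql": sql,
            "code_hash": hashlib.sha256(code.encode("utf-8")).hexdigest(),
            "model": model,
            "prompt": prompt,
            "data_version": data_version,
            "prev_hash": self._entries[-1]["entry_hash"] if self._entries else GENESIS,
        }
        record["entry_hash"] = _digest(record)  # computed before this key is added
        self._entries.append(record)
        return dict(record)

    def verify(self):
        """Return True if no stored entry has been altered or reordered."""
        prev = GENESIS
        for entry in self._entries:
            body = {k: v for k, v in entry.items() if k != "entry_hash"}
            if body["prev_hash"] != prev or _digest(body) != entry["entry_hash"]:
                return False
            prev = entry["entry_hash"]
        return True


log = ProvenanceLog()
log.append("SELECT 1", "print('hi')", "example-model", "count patients", "v1")
log.append("SELECT 2", "print('hi')", "example-model", "count again", "v1")
print(log.verify())
```

Because every entry includes the previous entry's hash, editing or removing an earlier record causes `verify()` to fail for the whole chain from that point on.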

On the roadmap — not yet built

These items extend the same governed base toward regulatory-grade evidence. They are listed here for transparency and are not available today.

📈

Inferential statistics

Cox models & hazard ratios with confidence intervals
Confounding adjustment — IPTW, propensity-score matching
Sensitivity analyses & bias diagnostics — via governed Python on Databricks

🧪

External Control Arm

Map trial eligibility → real-world variables (target trial emulation)
Baseline-comparability “Table 1”
Contextualisation for FDA / EMA submissions

📝

Scientific output

STROBE-aligned tables & figures
Abstract / manuscript scaffolds
Regulatory-ready summaries

🗂️

Scale & harmonisation

Multi-asset harmonised data model (EAPs, NIS, trial-linked, synthetic)
Fit-for-purpose data-suitability gate at ingestion
Evidence-output sign-off lifecycle (beyond cohorts)

The thread that connects them

Today’s capabilities are the governed Foundation. Each roadmap item builds on the same base, so every result remains reproducible, reviewed, and logged.
