● Capabilities

What it can do today

Everything below is built and working today; together it forms the governed Foundation. It’s organised by what a user actually does: ask & explore, build & analyse, govern & trust, and record & restore. Roadmap items are listed separately at the bottom so expectations stay honest.

💬 Conversational 🧱 Governed analysis 🛡️ Reviewed & reproducible

Available today

💬

Ask & explore

  • Plain-English chat agent — ask questions, get tables, charts & explanations
  • Schema Explorer — Flatiron tables/fields + Guardant gene panels (searchable)
  • Live catalog browse — catalogs → schemas → tables from Unity Catalog
  • Data dictionary — exact field meanings, coding, gene→panel coverage
  • Connection check — confirm the secure M2M identity any time

🧱

Build & analyse

  • Governed cohort builder — from validated definitions (biomarker, drug-class)
  • Saved, versioned cohorts — reusable, with patient counts
  • Feasibility reports — eligible N, biomarker & follow-up availability, covariate completeness, attrition funnel
  • Comparative analysis — cohort vs cohort, side-by-side with deltas
  • Time-to-event — Kaplan-Meier-style (rwPFS / time-to-next-treatment)
  • Interactive charts — switch type, zoom, recolour, export PNG/CSV
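
To illustrate the Kaplan-Meier-style time-to-event item above, here is a minimal Python sketch of the product-limit estimator. It is not the platform's implementation: the function name `kaplan_meier` and the toy follow-up data are assumptions made purely for illustration.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(time, survival)] at each distinct event time.

    durations: follow-up time per patient (any consistent unit)
    events:    1 if the event (e.g. progression) was observed, 0 if censored
    """
    observed = Counter(t for t, e in zip(durations, events) if e)
    exits = Counter(durations)  # every patient leaves the risk set at their own time
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = observed.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving past t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        # Events and censorings at time t both leave the risk set after t.
        at_risk -= exits[t]
    return curve


# Hypothetical toy data: five patients, two of them censored.
curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
for time, surv in curve:
    print(time, round(surv, 3))
```

Censored patients still count in the risk set up to their last follow-up time, which is why they reduce `at_risk` without lowering the survival estimate.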

🛡️

Govern & trust

  • Review & validation — Draft → Validated, reviewer-gated, with comments
  • De-duplication — canonical signature + AI similarity check
  • Validated-cohort reuse — reuse trusted definitions instead of rebuilding
  • Chain of trust — evidence inherits its cohort’s status (trusted vs provisional)
  • Data-version pinning — every result tied to the exact data it read
  • Read-only, least-privilege — safe by construction
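
The canonical signature behind de-duplication can be pictured roughly as follows. This is a sketch under stated assumptions, not the product's actual scheme: it assumes a cohort definition is a nested dictionary, that criteria lists are unordered, and that keys and string values are case-insensitive. The field names `biomarker` and `drug_class` are hypothetical.

```python
import hashlib
import json


def canonical_signature(definition: dict) -> str:
    """Hash a cohort definition so that equivalent definitions get the same signature."""

    def normalise(value):
        if isinstance(value, dict):
            return {str(k).strip().lower(): normalise(v) for k, v in value.items()}
        if isinstance(value, (list, tuple, set)):
            # Assumption: criteria lists are unordered, so sort by serialised form.
            items = [normalise(v) for v in value]
            return sorted(items, key=lambda v: json.dumps(v, sort_keys=True))
        if isinstance(value, str):
            return value.strip().lower()
        return value

    payload = json.dumps(normalise(definition), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Two hypothetical definitions that differ only in ordering, case and whitespace.
a = {"biomarker": ["EGFR", "ALK"], "drug_class": "TKI"}
b = {"Drug_Class": " tki ", "biomarker": ["alk", "egfr"]}
c = {"biomarker": ["EGFR"], "drug_class": "TKI"}

print(canonical_signature(a) == canonical_signature(b))
print(canonical_signature(a) == canonical_signature(c))
```

An exact signature match only catches definitions that normalise to the same form; near-duplicates that are worded differently would still need the similarity check listed alongside it.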

📋

Record & restore

  • Provenance log — append-only; SQL, code hash, model, prompt, data version
  • Audit trail — auth, every tool & model call, admin/review actions
  • Conversation history — every chat restorable, charts intact
  • Source-chat links — jump from any cohort/report to where it was made
  • Copy — one-click copy on every question & answer
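
One way to picture an append-only provenance log that carries the fields listed above (SQL, code hash, model, prompt, data version) is the sketch below. The class names, the in-memory storage and the hash chaining between entries are all assumptions added for illustration; the source does not say how the log is stored or protected.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


def _digest(fields: dict) -> str:
    return hashlib.sha256(json.dumps(fields, sort_keys=True).encode("utf-8")).hexdigest()


@dataclass(frozen=True)  # frozen: an entry cannot be modified after creation
class ProvenanceEntry:
    sql: str
    code_hash: str
    model: str
    prompt: str
    data_version: str
    prev_hash: str   # assumption: entries are chained to make tampering detectable
    entry_hash: str


class ProvenanceLog:
    """Hypothetical in-memory log exposing append and read, but no update or delete."""

    def __init__(self):
        self._entries = []

    def append(self, sql, code, model, prompt, data_version):
        fields = {
            "sql": sql,
            "code_hash": hashlib.sha256(code.encode("utf-8")).hexdigest(),
            "model": model,
            "prompt": prompt,
            "data_version": data_version,
            "prev_hash": self._entries[-1].entry_hash if self._entries else "",
        }
        entry = ProvenanceEntry(entry_hash=_digest(fields), **fields)
        self._entries.append(entry)
        return entry

    def entries(self):
        return tuple(self._entries)

    def verify(self):
        """Recompute every hash and check that the chain is unbroken."""
        prev = ""
        for entry in self._entries:
            fields = asdict(entry)
            stored = fields.pop("entry_hash")
            if fields["prev_hash"] != prev or _digest(fields) != stored:
                return False
            prev = stored
        return True


log = ProvenanceLog()
first = log.append("SELECT 1", "print('km')", "example-model", "How many patients?", "v1")
second = log.append("SELECT 2", "print('km')", "example-model", "And by stage?", "v1")
print(log.verify())
```

Storing a hash of the analysis code rather than the code itself keeps each entry small while still showing whether two results were produced by the same code.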

On the roadmap — not yet built

These items extend the same governed base toward regulatory-grade evidence. They are listed here for transparency and are not available today.

📈

Inferential statistics

  • Cox models & hazard ratios with confidence intervals
  • Confounding adjustment — IPTW, propensity-score matching
  • Sensitivity analyses & bias diagnostics — via governed Python on Databricks

🧪

External Control Arm

  • Map trial eligibility → real-world variables (target trial emulation)
  • Baseline-comparability “Table 1”
  • Contextualisation for FDA / EMA submissions

📝

Scientific output

  • STROBE-aligned tables & figures
  • Abstract / manuscript scaffolds
  • Regulatory-ready summaries

🗂️

Scale & harmonisation

  • Multi-asset harmonised data model (EAPs, NIS, trial-linked, synthetic)
  • Fit-for-purpose data-suitability gate at ingestion
  • Evidence-output sign-off lifecycle (beyond cohorts)

The thread that connects them

Today’s capabilities form the governed Foundation. Each roadmap item builds on that same base, so every result remains reproducible, reviewed, and logged.