Governance is built into every layer — from the data grant, to the kind of query that can run, to a human signing off before a result is trusted, to a complete trail of who did what.
The service principal is granted read access only to the specific catalogs it needs — nothing more — managed centrally in Unity Catalog and revocable at any time.
Every query passes a guard that permits a single SELECT only — all data-changing commands are blocked and results are row-capped. Safety is structural, not a matter of trust.
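The query guard described above is the piece a security reviewer will want to see in code. The block below is an added illustration, not code from the page's product or from Databricks: a conservative, stdlib-only Python guard that strips comments and quoted literals before inspection (so a keyword inside a string or comment neither bypasses nor falsely trips the check), requires exactly one statement that begins with SELECT or WITH, rejects a denylist of data-changing and administrative keywords, and enforces the row cap by wrapping the query in a LIMIT rather than appending one (appending would break a query that already ends in its own LIMIT). The denylist, the 1000-row cap, the alias `guarded_query` and the helper names are all assumptions for the example.

```python
import re

# Denylist of statement-starting and data-changing keywords (assumed list for
# this example; extend for the engine you actually run). REPLACE is deliberately
# absent because it is also a SELECT-time string function; CREATE OR REPLACE is
# still caught by CREATE. A false positive here costs a rewrite; a false
# negative costs data, so the bias is towards rejecting.
FORBIDDEN = {
    "INSERT", "UPDATE", "DELETE", "MERGE", "UPSERT", "INTO",
    "CREATE", "ALTER", "DROP", "TRUNCATE", "RENAME",
    "GRANT", "REVOKE", "COPY", "LOAD", "CALL", "EXEC", "EXECUTE",
    "SET", "USE", "REFRESH", "ANALYZE", "VACUUM", "OPTIMIZE", "MSCK",
}
MAX_ROWS = 1000  # assumed row cap


class QueryRejected(ValueError):
    """Raised when a query fails the guard; the caller never runs it."""


def _scrub(sql: str) -> str:
    """Return sql with comments removed and quoted literals/identifiers masked.

    Keyword checks run on this text so that 'DROP' inside a string literal or a
    comment is ignored, and a ';' inside a literal is not a statement break.
    Handles -- and /* */ comments, doubled quotes ('it''s') and backslash escapes.
    """
    out, i, n = [], 0, len(sql)
    while i < n:
        ch, two = sql[i], sql[i:i + 2]
        if two == "--":                       # line comment: drop to end of line
            j = sql.find("\n", i)
            i = n if j == -1 else j
        elif two == "/*":                     # block comment
            j = sql.find("*/", i + 2)
            if j == -1:
                raise QueryRejected("unterminated block comment")
            out.append(" ")
            i = j + 2
        elif ch in "'\"`":                    # string literal or quoted identifier
            quote, j = ch, i + 1
            while j < n:
                if sql[j] == "\\":            # backslash escape: skip next char
                    j += 2
                    continue
                if sql[j] == quote:
                    if sql[j + 1:j + 2] == quote:  # doubled quote inside literal
                        j += 2
                        continue
                    break
                j += 1
            if j >= n:
                raise QueryRejected("unterminated quoted literal or identifier")
            out.append(quote + "?" + quote)   # mask the contents
            i = j + 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def guard_query(sql: str, max_rows: int = MAX_ROWS) -> str:
    """Validate sql and return the single, row-capped statement to execute."""
    scrubbed = _scrub(sql).strip()
    if scrubbed.endswith(";"):                # one trailing semicolon is tolerated
        scrubbed = scrubbed[:-1].rstrip()
    if ";" in scrubbed:
        raise QueryRejected("multiple statements are not allowed")
    words = re.findall(r"[A-Za-z_]+", scrubbed)
    first = words[0].upper() if words else ""
    if first not in ("SELECT", "WITH"):
        raise QueryRejected(f"only SELECT is permitted, got {first or 'nothing'}")
    hits = [w.upper() for w in words if w.upper() in FORBIDDEN]
    if hits:
        raise QueryRejected(f"forbidden keyword: {hits[0]}")
    # Wrap (rather than append) the cap so a query that already has ORDER BY/LIMIT
    # stays valid. The newline before ')' keeps a trailing '-- comment' from
    # swallowing the closing parenthesis.
    original = sql.strip()
    if original.endswith(";"):
        original = original[:-1].rstrip()
    return f"SELECT * FROM (\n{original}\n) AS guarded_query LIMIT {max_rows}"


def is_allowed(sql: str) -> bool:
    """Convenience wrapper: True if the guard would accept sql."""
    try:
        guard_query(sql)
        return True
    except QueryRejected:
        return False
```

Run the guarded text through a read-only service principal regardless; the guard is a second layer, not a substitute for the catalog grant, and it intentionally rejects some legitimate queries (for example a column referenced as `t.delete`) in exchange for simplicity.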
Access is role-based. Only reviewers (admins) can validate or reject definitions — creators can’t rubber-stamp their own work.
Only de-identified real-world data, queried in place. We return aggregates and small result sets — never bulk patient extracts.
Nothing is “valid” until a qualified reviewer signs off. Editing a validated definition supersedes it and forces re-review — a validated artifact is never silently changed.
Clinical concepts (e.g. “CDK4/6 inhibitor”) are defined once, validated, and reused — so the same term means the same thing every time.
Saving a cohort that already exists reuses it instead of creating a near-twin — using a canonical signature plus a similarity check.
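One way to implement the canonical signature and similarity check above, as an added example that is not the page's own code: a cohort definition is reduced to a canonical form (criteria normalised for case and whitespace, keys and criteria sorted, the display name excluded) and hashed with SHA-256, so an exact re-submission maps to the same identifier; a Jaccard similarity over the canonical criteria then flags near-twins that differ by one criterion. The definition shape, the `window_days` field, the 0.6 threshold and the registry dict are assumptions for the example; a near-twin is returned to the caller for a reviewer decision rather than merged silently.

```python
import hashlib
import json
import re


def _norm(text: str) -> str:
    """Casefold and collapse whitespace: 'CDK4/6  Inhibitor' == 'cdk4/6 inhibitor'."""
    return re.sub(r"\s+", " ", text).strip().casefold()


def canonical_criteria(definition: dict) -> list[str]:
    """One normalised, key-sorted JSON string per criterion, sorted so that
    criterion order in the submission does not matter."""
    items = []
    for criterion in definition["criteria"]:
        normalised = {k: (_norm(v) if isinstance(v, str) else v)
                      for k, v in criterion.items()}
        items.append(json.dumps(normalised, sort_keys=True, separators=(",", ":")))
    return sorted(items)


def signature(definition: dict) -> str:
    """SHA-256 over the canonical form. The display name is deliberately left
    out: two cohorts with different labels but identical logic are the same cohort."""
    canon = {"criteria": canonical_criteria(definition),
             "window_days": definition.get("window_days")}  # assumed field
    return hashlib.sha256(json.dumps(canon, sort_keys=True).encode()).hexdigest()


def jaccard(a: list[str], b: list[str]) -> float:
    """Set overlap of two canonical criteria lists, 0.0 to 1.0."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if (sa | sb) else 1.0


def save_cohort(registry: dict, definition: dict, threshold: float = 0.6) -> tuple:
    """Returns (action, cohort_id).

    'reused'    exact canonical match; nothing is written.
    'near_twin' similarity >= threshold; the existing id is returned so the
                reviewer can decide, and nothing is written.
    'created'   no match; the definition is stored under its signature.
    """
    sig = signature(definition)
    if sig in registry:
        return "reused", sig
    mine = canonical_criteria(definition)
    best_sig, best = None, 0.0
    for other_sig, other in registry.items():
        score = jaccard(mine, canonical_criteria(other))
        if score > best:
            best_sig, best = other_sig, score
    if best_sig is not None and best >= threshold:
        return "near_twin", best_sig
    registry[sig] = definition
    return "created", sig


# Constructed sample definitions for the example.
COHORT_A = {
    "name": "HR+ breast cancer on CDK4/6i",
    "criteria": [{"concept": "CDK4/6 inhibitor", "type": "exposure"},
                 {"concept": "Breast cancer", "type": "diagnosis"}],
    "window_days": 365,
}
COHORT_A_RESUBMITTED = {              # same logic, different label, order and casing
    "name": "cdk46 cohort v2",
    "criteria": [{"type": "diagnosis", "concept": "breast  cancer"},
                 {"concept": "cdk4/6 inhibitor", "type": "exposure"}],
    "window_days": 365,
}
COHORT_A_PLUS_ONE = {                 # one extra criterion: a near-twin
    "name": "HR+ subset",
    "criteria": COHORT_A["criteria"] + [{"concept": "HR positive", "type": "biomarker"}],
    "window_days": 365,
}
COHORT_UNRELATED = {
    "name": "statin users",
    "criteria": [{"concept": "statin", "type": "exposure"}],
    "window_days": 180,
}
```

The Jaccard check only looks at criteria; if the window or other scalar fields also matter clinically, fold them into the similarity score or treat a differing window as a distinct cohort even when the criteria overlap.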
Evidence inherits its inputs’ status: a report built on an unvalidated cohort is shown as provisional, not trusted.