Apex Atlas™
A research-grade synthetic patient population for model training, simulation, and benchmarking. Apache 2.0 for open use; commercial agreement for production deployment.
Healthcare AI is bottlenecked by access to data. PHI lives behind decade-long data use agreements (DUAs), real-world cohorts under-represent edge cases, and academic datasets are too narrow to train on. Atlas removes the bottleneck with a continuously generated synthetic patient population that is statistically faithful to the Data Lake, free of any real PHI, and openly licensed for the research community.
What Atlas does.
A continuously regenerated synthetic patient population spanning U.S. demographics, common comorbidity structures, and longitudinal history. Distributions are statistically anchored to the source Data Lake.
Synthetic EHR encounters, lab panels, imaging studies (DICOM-compliant), genomic variant calls, wearable streams, and free-text clinical notes — all generated as a single coherent record per patient.
Generation pipelines are differentially private (ε ≤ 1.0) with respect to the source Data Lake, so re-identification risk is mathematically bounded. The guarantee is independently audited by MITRE and NIST PSCR.
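For readers new to differential privacy, the sketch below shows what an ε bound controls, using the textbook Laplace mechanism on a single counting query. It is an illustration only: the source does not describe how the Atlas pipelines achieve their guarantee, and the function names `laplace_noise` and `private_count` are made up for this example.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Zero-mean Laplace sample, built as the difference of two
    independent exponential draws with rate 1/scale."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(true_count: int, epsilon: float, rng: random.Random) -> float:
    """Release a count with epsilon-differential privacy (Laplace mechanism).

    Adding or removing one patient changes a count by at most 1
    (sensitivity 1), so the noise scale is 1 / epsilon.
    Illustrative only; not the Atlas generation pipeline.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Smaller epsilon means more noise and a tighter privacy bound.
rng = random.Random(42)
print(private_count(120, epsilon=1.0, rng=rng))
```

At ε = 1.0 the noise has an expected magnitude of about one patient per released count. A production mechanism would also need to handle floating-point side channels and a privacy budget across many queries, which this sketch ignores.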
Need a synthetic cohort of patients with HFpEF and stage 3 CKD on SGLT2 inhibitors? Specify the criteria and generate it via the Atlas API. Surface edge cases that don't exist at scale in the real world.
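As a rough picture of what "specify the criteria" could look like, here is a minimal sketch that builds a cohort request as JSON. Everything in it is an assumption: the `CohortSpec` class and its field names (`conditions`, `medication_classes`, `size`) are hypothetical and are not taken from the Atlas API, whose actual schema is not shown here.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class CohortSpec:
    """Hypothetical cohort request; field names are illustrative only."""
    conditions: list[str]
    medication_classes: list[str] = field(default_factory=list)
    size: int = 1000

    def to_json(self) -> str:
        # Basic validation before the request would be sent anywhere.
        if not self.conditions:
            raise ValueError("at least one condition is required")
        if self.size <= 0:
            raise ValueError("size must be positive")
        return json.dumps(asdict(self), sort_keys=True)


# The cohort from the example above: HFpEF + stage 3 CKD on SGLT2 inhibitors.
spec = CohortSpec(
    conditions=["HFpEF", "CKD stage 3"],
    medication_classes=["SGLT2 inhibitor"],
    size=500,
)
payload = spec.to_json()
print(payload)
```

Keeping the criteria in a small typed object makes a request easy to validate and version alongside the experiment that used it. Consult the Atlas API documentation for the real request format.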
Core dataset, generation models, and benchmark suite are Apache 2.0. Use it in academic work, model evals, FDA submissions, or open-source health AI without a license conversation.
A separate commercial agreement covers production deployment, indemnification, custom cohort generation, and SLA-backed API access. Use the Apache 2.0 license for R&D, then convert to the commercial agreement when you deploy.
Plays well with the stack you already run.
Every integration is first-party and maintained in-house. No fragile middleware, no orphaned connectors.
- FHIR R5
- OMOP CDM v5.4
- i2b2
- DICOM
- VCF/BCF
- CSV/Parquet bulk export
- Hugging Face Datasets
- AWS Open Data
- Google Cloud Public Datasets
- Snowflake Marketplace
- Direct Atlas API
- MedQA-Atlas
- Clinical-NER-Atlas
- Risk-Stratification-Atlas
- Triage-LLM-Atlas
- Coding-Accuracy-Atlas
- Zero real PHI: de-identified by construction under both the HIPAA Safe Harbor and Expert Determination methods
- Differential privacy guarantee (ε ≤ 1.0) independently audited
- MITRE re-identification risk assessment published quarterly
- Apache 2.0 license for non-production use
- Commercial Use Agreement with indemnification for production deployment
- FDA Model-Informed Drug Development (MIDD) reference dataset