ReClaim: Unlocking Healthcare Insights from Claims Data

ReClaim, a 1.7B-parameter foundation model trained on 43.8B medical events, leverages administrative claims data to achieve state-of-the-art performance in disease prediction and real-world evidence (RWE) analysis.

Figure: Conceptual overview of the ReClaim foundation model's pipeline for processing longitudinal healthcare data.

Administrative claims data holds vast, underutilized potential for healthcare AI. Although rich in longitudinal detail, it has been largely unexplored as a foundation for advanced modeling. ReClaim, a new generative transformer trained from scratch on 43.8 billion medical events, demonstrates the power of this data source.

Administrative Claims as a Scalable Healthcare Foundation Model Substrate

ReClaim, a generative transformer trained on data from more than 200 million enrollees spanning 2008–2022, models longitudinal trajectories across diagnoses, procedures, medications, and expenditures. Scaled to 1.7 billion parameters, the model demonstrates that administrative claims are not merely billing records but a potent substrate for building powerful healthcare foundation models. Its ability to capture financial outcomes and improve RWE analyses underscores this potential.
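The article does not reproduce ReClaim's exact event representation, but modeling a trajectory as an ordered stream of medical events maps naturally onto standard autoregressive tokenization. Here is a minimal sketch under that assumption; the `ClaimEvent` type, codes, and cost buckets below are illustrative, not ReClaim's actual scheme:

```python
# Sketch: flattening an enrollee's claims history into a token sequence
# for autoregressive modeling. Event kinds and codes are hypothetical.
from dataclasses import dataclass

@dataclass
class ClaimEvent:
    date: str   # ISO service date, e.g. "2019-03-07" (sorts lexicographically)
    kind: str   # "dx" | "px" | "rx" | "cost"
    code: str   # e.g. ICD-10 "E11.9", a drug name, or a cost bucket

def to_tokens(events: list[ClaimEvent]) -> list[str]:
    """Order events chronologically and emit one discrete token per event."""
    return [f"{ev.kind}:{ev.code}" for ev in sorted(events, key=lambda e: e.date)]

history = [
    ClaimEvent("2019-03-07", "dx", "E11.9"),      # type 2 diabetes diagnosis
    ClaimEvent("2019-03-07", "rx", "metformin"),  # prescription fill
    ClaimEvent("2019-06-01", "cost", "bucket_3"), # expenditure bucket
]
print(to_tokens(history))
# ['dx:E11.9', 'rx:metformin', 'cost:bucket_3']
```

Once a trajectory is a single token sequence, a transformer can be trained with the usual next-token objective, which is what makes claims histories amenable to the same scaling recipes used for language models.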


Unprecedented Performance in Disease Prediction and RWE

Across more than 1,000 disease-onset prediction tasks, the ReClaim foundation model achieved a mean AUC of 75.6%, substantially outperforming disease-specific LightGBM models (66.3%) and the transformer-based Delphi model (69.4%). Notably, ReClaim showed the largest gains for rare diseases, a critical area for clinical advancement. These advantages held across retrospective and prospective evaluations as well as in external validation on independent datasets. For healthcare expenditure forecasting, ReClaim increased explained variance from 0.28 to 0.37 relative to LightGBM, and in target trial emulation it reduced systematic bias by 72% on average compared with Delphi, underscoring its real-world utility.
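For readers unfamiliar with the two headline metrics, here is a brief sketch of how mean AUC over many tasks and explained variance for expenditure forecasts are conventionally computed. The arrays are toy placeholders, not the study's data:

```python
# Sketch of the evaluation metrics: mean AUC across many binary
# disease-onset tasks, and explained variance for spend forecasts.
import numpy as np
from sklearn.metrics import roc_auc_score, explained_variance_score

rng = np.random.default_rng(0)

# Mean AUC: one label/score pair per task, averaged over all tasks.
task_aucs = []
for _ in range(1000):                          # ~1,000 onset-prediction tasks
    y_true = rng.integers(0, 2, size=500)
    y_score = y_true * 0.5 + rng.random(500)   # weakly informative scores
    task_aucs.append(roc_auc_score(y_true, y_score))
print(f"mean AUC: {np.mean(task_aucs):.3f}")

# Explained variance (the 0.28 -> 0.37 comparison):
# 1 - Var(y - y_hat) / Var(y), on a skewed spend distribution.
y = rng.lognormal(mean=7.0, sigma=1.2, size=500)      # annual spend, dollars
y_hat = y * 0.6 + rng.normal(0, y.std() * 0.5, 500)   # imperfect forecast
print(f"explained variance: {explained_variance_score(y, y_hat):.2f}")
```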

Monotonic Scaling and Post-Training Gains Drive Efficacy

Performance improvements in the ReClaim foundation model scaled monotonically with model size, highlighting the benefits of continued scaling. Crucially, post-training added 13.8 percentage points to performance over pre-training alone, indicating the significant value of fine-tuning on specific downstream tasks. This synergy between scale and targeted training enables learned representations that generalize effectively across time periods and diverse data sources, supporting critical applications like disease surveillance, expenditure forecasting, and robust RWE generation.
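The article does not detail ReClaim's post-training recipe. One common form of post-training is supervised fine-tuning of a pretrained encoder on a downstream label, and the PyTorch sketch below illustrates a single such update step under that assumption; `PretrainedBackbone` is a stand-in, not ReClaim's architecture:

```python
# Sketch: one supervised fine-tuning step on a disease-onset label,
# a plausible (assumed) form of the post-training the article describes.
import torch
import torch.nn as nn

class PretrainedBackbone(nn.Module):
    """Placeholder for a pretrained claims transformer."""
    def __init__(self, vocab=50_000, dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        layer = nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, tokens):                  # (batch, seq) -> (batch, dim)
        h = self.encoder(self.embed(tokens))
        return h.mean(dim=1)                    # pooled sequence representation

backbone = PretrainedBackbone()                 # weights would come from pre-training
head = nn.Linear(256, 1)                        # one disease-onset logit
opt = torch.optim.AdamW(
    list(backbone.parameters()) + list(head.parameters()), lr=1e-5
)

tokens = torch.randint(0, 50_000, (8, 128))     # toy batch of claim-token sequences
labels = torch.randint(0, 2, (8, 1)).float()    # onset within horizon: yes/no

opt.zero_grad()
logits = head(backbone(tokens))
loss = nn.functional.binary_cross_entropy_with_logits(logits, labels)
loss.backward()
opt.step()
```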
