A Multimodal Foundation Model for EHR Time Series and Clinical Notes with Outcome Calibration

Authors

  • Tao Sun, Department of Computer Science, ETH Zurich, Zurich 8092, Switzerland

DOI:

https://doi.org/10.71465/mrcis172

Keywords:

Electronic Health Records, Multimodal Learning, Foundation Models, Uncertainty Calibration, Deep Learning in Healthcare

Abstract

The digitization of healthcare has led to a proliferation of Electronic Health Records (EHRs), which combine heterogeneous data modalities, chiefly structured time series and unstructured clinical notes. Although deep learning has shown considerable potential for predictive healthcare, existing approaches often process these modalities in isolation or rely on naive fusion mechanisms that fail to capture the complex, asynchronous interplay between physiological measurements and clinical narratives. Moreover, modern neural networks, and large foundation models in particular, are often poorly calibrated: they frequently produce overconfident predictions, which is detrimental to clinical decision-making. In this paper, we introduce MedCali-FM, a multimodal foundation model that integrates sparse, irregularly sampled EHR time series with clinical text through a calibrated cross-attention mechanism. We propose a joint pre-training objective that combines masked forecasting, masked language modeling, and a novel contrastive alignment loss to learn unified patient representations. Crucially, we integrate a differentiable calibration objective directly into the fine-tuning phase, so that the model's confidence scores align with empirical outcome frequencies. Extensive experiments on the MIMIC-IV dataset show that MedCali-FM achieves state-of-the-art performance in mortality prediction and phenotype classification while significantly reducing Expected Calibration Error (ECE) compared to existing ensemble and deep fusion methods.
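
For readers unfamiliar with the calibration metric named above, the following is a minimal sketch of how ECE is commonly computed for a binary outcome such as mortality. It is not taken from the paper: the function name, the choice of 10 equal-width bins, and the use of the predicted positive-class probability as the confidence score are all illustrative assumptions, and the differentiable calibration objective used by MedCali-FM is not reproduced here.

```python
def expected_calibration_error(probs, labels, n_bins=10):
    """Binary ECE with equal-width bins (illustrative sketch, not the paper's code).

    probs  : predicted probabilities of the positive class, each in [0, 1]
    labels : observed outcomes, each 0 or 1
    n_bins : number of equal-width probability bins (assumed default of 10)
    """
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")
    if not probs:
        raise ValueError("at least one prediction is required")

    # Assign each prediction to a bin; p == 1.0 falls into the last bin.
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))

    total = len(probs)
    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        mean_confidence = sum(p for p, _ in members) / len(members)
        observed_rate = sum(y for _, y in members) / len(members)
        # Weight each bin's confidence/outcome gap by its share of samples.
        ece += (len(members) / total) * abs(mean_confidence - observed_rate)
    return ece


# Four predictions at 0.9 confidence, of which three are correct:
# the model claims 90% but the observed rate is 75%.
example_ece = expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 1, 1, 0])
```

A value of 0 means that, within every bin, the mean predicted probability matches the observed event rate; larger values indicate a wider gap between stated confidence and actual outcomes, which is the overconfidence the abstract refers to.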

Published

2025-12-30