Sequences of Events from the Electronic Medical Record and the Onset of Infection

Caitlin E. Coombes, Kevin R. Coombes, Naleef Fareed

Research output: Contribution to journal › Article › peer-review


We present a novel time-series analysis model that learns from electronic health record (EHR) data when infection occurred in the intensive care unit (ICU), translating methods from proteomics and Bayesian statistics. Using data from 48,536 patients hospitalized in an ICU, we describe each hospital course as a sequence, in temporal order, of physician actions (‘events’) drawn from an ‘alphabet’ of 23. We analyze these sequences as k-mers of length 3–12 events and apply a Bayesian model of (cumulative) relative risk (RR). The log2-transformed RR (median = 0.248, mean = 0.226) supported the conclusion that the selected events were individually associated with increased risk of infection. Among all possible cutoffs of maximum gain (MG), the selected cutoff MG > 0.0244 predicts administration of antibiotics with PPV 82.0%, NPV 44.4%, and AUC 0.706. Our approach holds value for retrospective analysis of other clinical syndromes for which time of onset is critical to analysis but poorly marked in EHRs, including delirium and decompensation.
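The k-mer and relative-risk ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' Bayesian model: it assumes each hospital course is a sequence of event codes (the single-letter codes in the usage note are hypothetical stand-ins for the 23 physician actions, which the abstract does not list), and it estimates a smoothed log2 relative risk per k-mer using a simple pseudocount. The function names `kmers` and `log2_relative_risk` and the `pseudocount` parameter are illustrative assumptions.

```python
from collections import Counter
from math import log2


def kmers(events, k_min=3, k_max=12):
    """Yield every contiguous run of k events (a k-mer) for k_min <= k <= k_max."""
    for k in range(k_min, k_max + 1):
        for start in range(len(events) - k + 1):
            yield tuple(events[start:start + k])


def log2_relative_risk(courses, outcomes, k_min=3, k_max=12, pseudocount=0.5):
    """Map each observed k-mer to log2(risk if present / risk if absent).

    courses  : list of event sequences, one per patient
    outcomes : list of booleans (True = outcome of interest occurred)
    The pseudocount is plain additive smoothing (an assumption made here),
    not the Bayesian model described in the abstract.
    """
    n_total = len(courses)
    n_cases = sum(outcomes)
    exposed = Counter()        # patients whose course contains the k-mer
    exposed_cases = Counter()  # ...and who also had the outcome
    for events, outcome in zip(courses, outcomes):
        # set() ensures each k-mer is counted at most once per patient
        for kmer in set(kmers(events, k_min, k_max)):
            exposed[kmer] += 1
            if outcome:
                exposed_cases[kmer] += 1

    result = {}
    for kmer, n_exposed in exposed.items():
        cases_exposed = exposed_cases[kmer]
        cases_unexposed = n_cases - cases_exposed
        n_unexposed = n_total - n_exposed
        risk_exposed = (cases_exposed + pseudocount) / (n_exposed + 2 * pseudocount)
        risk_unexposed = (cases_unexposed + pseudocount) / (n_unexposed + 2 * pseudocount)
        result[kmer] = log2(risk_exposed / risk_unexposed)
    return result
```

For example, with toy courses `["ABC", "ABCD", "XYZ", "XYZW"]` and outcomes `[True, True, False, False]`, calling `log2_relative_risk(courses, outcomes, k_min=3, k_max=3)` gives a positive value for `('A', 'B', 'C')` and a negative one for `('X', 'Y', 'Z')`. A positive log2 RR means the k-mer is associated with higher risk of the outcome, which matches how the abstract interprets its positive median and mean.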

Original language: English (US)
Article number: e202200657
Journal: Chemistry and Biodiversity
Issue number: 11
State: Published - Nov 2022
Externally published: Yes


Keywords

  • Bayesian statistics
  • electronic health record
  • protein sequence
  • proteomics
  • time-series analysis

ASJC Scopus subject areas

  • Bioengineering
  • Biochemistry
  • Chemistry(all)
  • Molecular Medicine
  • Molecular Biology

