I am working to develop new methods for representing data in electronic health records (EHR) to improve predictive modeling and interpretation of patient outcomes. EHR data offer a promising opportunity for advancing the understanding of how clinical decisions and patient conditions interact over time to influence patient health. However, EHR data are difficult to use for predictive modeling due to the various data types they contain (continuous, categorical, text, etc.), their longitudinal nature, the high amount of non-random missingness for certain measurements, and other concerns. Furthermore, patient outcomes often have heterogenous causes and require information to be synthesized from several clinical lab measures and patient visits. The core challenge at hand is overcoming the mismatch between data representations in the EHR and the assumptions underlying commonly used statistical and machine learning (ML) methods.