Interpreting ML models from electronic health records

Feature importances clustered with diagnosis.

Last month, I had the pleasure of attending the AMIA 2019 Symposium to present some recent biomedical informatics work. You can see my slides here. This is the first work supported by my K99, the goal of which is to develop interpretable ML methods using electronic health records (EHR).

Many hospitals and biomedical informaticists are hoping that ML can build useful predictive models of patient outcomes. Predictive models are already used to assist at the point of care, for example via early warning systems and patient prioritization models. In this paper, we empirically analyzed the interpretability of models trained on EHR data. We found that, even when models perform similarly in prediction, they often disagree about which factors drive diagnoses. This disagreement was reduced somewhat when we replaced each model's ‘built-in’ measures of importance with permutation testing.
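To make the idea of permutation-based importance concrete, here is a minimal sketch in plain Python. It is not the code from the paper: the toy data is synthetic, and `model`, `accuracy`, and `permutation_importance` are hypothetical names chosen for illustration. The idea is to shuffle one feature column at a time and record how much the model's score drops. Because this only needs predictions, the same measure can be applied to any model.

```python
import random

def accuracy(model, X, y):
    """Fraction of rows where the model's prediction matches the label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)

def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Mean drop in accuracy when each feature column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle column j only, leaving every other feature intact.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances

# Synthetic toy data (an assumption, not EHR data): the label depends
# only on feature 0, and feature 1 is pure noise.
data_rng = random.Random(42)
X = [[data_rng.random(), data_rng.random()] for _ in range(200)]
y = [int(row[0] > 0.5) for row in X]

# Hypothetical stand-in for a trained classifier.
def model(row):
    return int(row[0] > 0.5)

importances = permutation_importance(model, X, y)
print(importances)
```

In this toy setup, shuffling the noise feature leaves accuracy unchanged, while shuffling the informative feature causes a large drop. With real models, scikit-learn's `sklearn.inspection.permutation_importance` provides a maintained implementation of the same idea.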

You can read the preprint on arXiv, or browse the code on GitHub.