Supervised learning predicts therapeutic success in acute myeloid leukemia

Achieving complete remission is a crucial milestone in the therapy of acute myeloid leukemia (AML), while refractory disease is associated with dismal outcomes. Hence, accurately identifying patients at risk is essential for tailoring treatment to individual disease biology.

We developed a multi-model supervised learning pipeline to predict complete remission (CR) and 2-year overall survival in a large multicenter cohort of 1383 leukemia patients who received intensive induction therapy, using clinical, laboratory, cytogenetic, and molecular genetic data. Our classification algorithms were completely agnostic of any pre-existing risk classifications and autonomously selected predictive features, including established markers of favorable or adverse risk as well as markers of so-far controversial or even unknown relevance.

We then validated our classifiers in a large external multicenter cohort of 664 AML patients. Our multi-model approach clearly outperforms previous CR predictors: while previously reported models achieved an area under the curve (AUC) of roughly 0.60, our top model predicts complete remission with an AUC of 0.86.
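
For readers less familiar with the metric: the AUC can be read as the probability that a model assigns a higher score to a randomly chosen patient who reached remission than to one who did not, so 0.5 corresponds to chance and 1.0 to perfect ranking. The short Python sketch below illustrates this definition and how several candidate models might be compared on a held-out cohort. It is not our pipeline: the function name `roc_auc`, the model names, and all labels and scores are made-up toy values chosen only for illustration.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Toy validation cohort (invented): 1 = complete remission, 0 = no remission.
labels = [1, 1, 1, 0, 1, 0, 0, 1]

# Predicted remission probabilities from two hypothetical models.
model_scores = {
    "model_a": [0.9, 0.8, 0.4, 0.5, 0.7, 0.3, 0.6, 0.2],
    "model_b": [0.9, 0.8, 0.6, 0.5, 0.7, 0.1, 0.3, 0.4],
}

# Score every model on the same cohort and keep the one with the highest AUC.
auc_by_model = {name: roc_auc(labels, s) for name, s in model_scores.items()}
best_model = max(auc_by_model, key=auc_by_model.get)

for name, auc in auc_by_model.items():
    print(f"{name}: AUC = {auc:.2f}")
print("best:", best_model)
```

The pairwise loop is quadratic in cohort size, which is fine for a toy example; for real cohorts a rank-based implementation from an established library would be the more sensible choice.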

The underlying pipeline is disease-agnostic, so if you have data from a few hundred patients and would like to collaborate on a research project, give us a call!

Our paper has just been published in Haematologica’s March 2023 issue with a corresponding editorial.