A new machine-learning model can analyze proteins in blood samples to predict survival in patients critically ill with COVID-19, according to an article published on January 18 in PLOS Digital Health.
"Using a machine learning model which combines the measurements of multiple proteins, we were able to accurately predict survival in critically ill patients with COVID-19 from single blood samples, weeks before the outcome, substantially outperforming established risk predictors," wrote the authors, led by Florian Kurth, PhD, and Markus Ralser, PhD, of the Charité – Universitätsmedizin in Berlin.
Intensive care risk assessments, such as the Sequential Organ Failure Assessment (SOFA) score or the Acute Physiology and Chronic Health Evaluation (APACHE) II score, are useful for assessing moderately ill COVID-19 patients but not for predicting the clinical outcomes of severely ill patients who are on mechanical ventilation or other organ support.
In the current study, Kurth et al sought to devise a more targeted clinical assessment for severely ill patients. To do that, they looked to the plasma proteome (the proteins expressed by an organism and present in its blood) for diagnostic clues as to what biomarkers are present during this stage of the disease.
"We have leveraged the power of the proteome to address a problematic diagnostic gap in the prognosis of the most critical form of COVID-19, that is not covered by established clinical assessments, such as the SOFA or APACHE II scores," the authors wrote.
To obtain data for the model, the researchers acquired longitudinal blood samples from 50 patients with COVID-19 confirmed by polymerase chain reaction (PCR) tests at Charité – Universitätsmedizin. All 50 patients were critically ill and receiving invasive mechanical ventilation. The researchers also performed clinical phenotyping of the patients, recording intensive care and disease severity scores, treatment parameters, and outcomes.
The blood samples were then used to generate high-resolution time series for 321 protein quantities at 349 time points. The median time between the collection of samples and the determination of outcome (survival or nonsurvival) was 39 days.
Studying the plasma proteomes, the researchers found 78 proteins for which the concentration changed significantly during the patients' disease course. Of these, the researchers identified a set of 14 proteins that showed different trajectories between survivors and nonsurvivors.
For instance, the inflammatory proteins SAA1, SAA2, C-reactive protein, ITIH3, LRG1, SERPINA1, SERPINA10, and lipopolysaccharide binding protein were significantly increased in patients who died. In contrast, the anticoagulative proteins thrombin and plasma kallikrein, which were already lower in nonsurvivors, continued to decrease over time in patients who died while increasing in survivors, according to the authors.
COVID-19 patient survival outcomes
To predict patient survival from the protein data, the researchers trained a machine-learning model based on parenclitic networks, a relatively novel technique that converts multidimensional data into graph form so that the topology of the graph can be used to evaluate features. The networks are generated by considering every pair of analytes (proteins) individually and setting the weight of the edge connecting each pair to the estimated probability of nonsurvival versus survival.
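The pairwise edge-weight idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes a simple two-dimensional Gaussian density per outcome class for each protein pair (the article does not say which density estimator the study used), it runs on synthetic numbers rather than patient data, and the function names and the label coding (0 for survivor, 1 for nonsurvivor) are hypothetical.

```python
import itertools
import math
import random


def fit_gaussian_2d(points):
    """Estimate mean and covariance of 2-D points (needs at least 2 points)."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # A small ridge on the variances keeps the covariance matrix invertible.
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1) + 1e-6
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1) + 1e-6
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return mx, my, sxx, syy, sxy


def gaussian_pdf_2d(x, y, params):
    """Density of the fitted 2-D Gaussian at the point (x, y)."""
    mx, my, sxx, syy, sxy = params
    det = sxx * syy - sxy ** 2
    dx, dy = x - mx, y - my
    # Quadratic form via the closed-form inverse of a 2x2 matrix.
    q = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.exp(-0.5 * q) / (2 * math.pi * math.sqrt(det))


def fit_pair_models(samples, labels):
    """Fit one density per outcome class for every pair of proteins."""
    n_features = len(samples[0])
    models = {}
    for i, j in itertools.combinations(range(n_features), 2):
        per_class = {}
        for label in (0, 1):  # assumed coding: 0 = survivor, 1 = nonsurvivor
            pts = [(s[i], s[j]) for s, l in zip(samples, labels) if l == label]
            per_class[label] = (fit_gaussian_2d(pts), len(pts) / len(samples))
        models[(i, j)] = per_class
    return models


def edge_weights(sample, models):
    """Edge weight per protein pair: estimated probability of nonsurvival."""
    weights = {}
    for (i, j), per_class in models.items():
        joint = {
            label: prior * gaussian_pdf_2d(sample[i], sample[j], params)
            for label, (params, prior) in per_class.items()
        }
        total = joint[0] + joint[1]
        # Fall back to 0.5 if both densities underflow to zero.
        weights[(i, j)] = joint[1] / total if total > 0 else 0.5
    return weights


# Synthetic demonstration data only: 4 "proteins", 20 samples per class.
rng = random.Random(0)
survivors = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(20)]
nonsurvivors = [[rng.gauss(2, 1) for _ in range(4)] for _ in range(20)]
models = fit_pair_models(survivors + nonsurvivors, [0] * 20 + [1] * 20)

network = edge_weights([2.0, 2.0, 2.0, 2.0], models)
mean_weight = sum(network.values()) / len(network)
print(f"{len(network)} edges, mean edge weight {mean_weight:.3f}")
```

Each sample yields its own weighted graph with one node per protein. In this sketch the graph is summarized by its mean edge weight; a fuller treatment would compute further topological features of the graph and pass them to a classifier.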
After training, the system was tested on a new cohort of critically ill patients admitted to the intensive care unit at the University Hospital of Innsbruck in Austria. Even though the model was trained on data from a different hospital and healthcare system, it correctly predicted the outcome for 18 out of 19 patients who survived and 5 out of 5 patients who died at the Innsbruck site.
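The Innsbruck counts translate into standard performance measures with simple arithmetic: 23 of the 24 patients were classified correctly, or about 96%. The short Python sketch below works this through. It assumes that death is treated as the "positive" class, a convention chosen here for illustration, and the function name is hypothetical.

```python
def outcome_metrics(true_pos, false_neg, true_neg, false_pos):
    """Compute basic classification metrics from confusion-matrix counts."""
    total = true_pos + false_neg + true_neg + false_pos
    return {
        "accuracy": (true_pos + true_neg) / total,
        "sensitivity": true_pos / (true_pos + false_neg),
        "specificity": true_neg / (true_neg + false_pos),
    }


# Counts from the validation cohort, with death as the positive class:
# 5 of 5 deaths predicted correctly, 18 of 19 survivors predicted correctly.
metrics = outcome_metrics(true_pos=5, false_neg=0, true_neg=18, false_pos=1)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```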
"The ML [machine learning] predictor trained on these samples substantially outperformed established clinical risk scores and predicted the outcome among a group of severely ill patients with similar clinical presentation with high accuracy," the authors wrote.
As for future work, the authors wrote that it is necessary to validate their method in larger cohorts, to test whether a given treatment changes the projected trajectory of an individual patient, and to test the proteins they identified in non-COVID-19 conditions.
"The panel of proteins identified in our study should also be assessed for other conditions such as non-COVID-19 ARDS," they wrote.