An integrated network representation of multiple cancer-specific data for graph-based machine learning

Limeng Pu, Manali Singha, Hsiao Chun Wu, Costas Busch, J. Ramanujam, Michal Brylinski

Research output: Contribution to journal, Article, peer-review

4 Scopus citations


Genomic profiles of cancer cells provide valuable information on genetic alterations in cancer. Several recent studies employed these data to predict the response of cancer cell lines to drug treatment. Nonetheless, due to the multifactorial phenotypes and intricate mechanisms of cancer, accurately predicting the effect of pharmacotherapy on a specific cell line from genetic information alone is problematic. Emphasizing the system-level complexity of cancer, we devised a procedure to integrate multiple heterogeneous data, including biological networks, genomics, inhibitor profiling, and gene-disease associations, into a unified graph structure. To construct compact, yet information-rich cancer-specific networks, we developed a novel graph reduction algorithm. Driven by both topological information and biological knowledge, the graph reduction increases the feature-only entropy while preserving the valuable graph-feature information. Subsequent comparative benchmarking simulations employing a tissue-level cross-validation protocol demonstrate that the accuracy of a graph-based predictor of drug efficacy is 0.68, which is notably higher than the accuracies measured for more traditional, matrix-based techniques on the same data. Overall, the non-Euclidean representation of the cancer-specific data improves the performance of machine learning in predicting the response of cancer to pharmacotherapy. The generated data are freely available to the academic community at
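The tissue-level cross-validation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not spell out the exact protocol, so the code below assumes a leave-one-tissue-out scheme in which every sample carries a tissue label and all samples from one tissue are held out together. The function name `tissue_level_folds` and the sample labels are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from typing import Dict, Iterator, List, Sequence, Tuple


def tissue_level_folds(
    tissues: Sequence[str],
) -> Iterator[Tuple[str, List[int], List[int]]]:
    """Yield (held_out_tissue, train_indices, test_indices), one fold per tissue.

    Assumption: leave-one-tissue-out splitting. The paper's actual
    protocol may group tissues differently.
    """
    # Group sample indices by their tissue label.
    by_tissue: Dict[str, List[int]] = defaultdict(list)
    for idx, tissue in enumerate(tissues):
        by_tissue[tissue].append(idx)

    # Hold out each tissue in turn; train on all remaining tissues.
    for held_out in sorted(by_tissue):
        test_idx = by_tissue[held_out]
        train_idx = sorted(
            i
            for tissue, indices in by_tissue.items()
            if tissue != held_out
            for i in indices
        )
        yield held_out, train_idx, test_idx


# Hypothetical tissue labels for seven samples (illustrative only).
sample_tissues = ["lung", "breast", "lung", "skin", "breast", "skin", "lung"]
folds = list(tissue_level_folds(sample_tissues))

for held_out, train_idx, test_idx in folds:
    print(held_out, "train:", train_idx, "test:", test_idx)
```

Splitting by tissue rather than by random sample keeps all samples of a tissue on the same side of the split, so the measured accuracy reflects generalization to a tissue the model has not seen during training.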

Original language: English (US)
Article number: 14
Journal: npj Systems Biology and Applications
Issue number: 1
State: Published - Dec 2022

ASJC Scopus subject areas

  • Modeling and Simulation
  • General Biochemistry, Genetics and Molecular Biology
  • Drug Discovery
  • Computer Science Applications
  • Applied Mathematics

