Integrative analysis of multiple types of genomic data using an accelerated failure time frailty model

Shirong Deng, Jie Chen, Huidong Shi

Research output: Contribution to journal › Article › peer-review

Abstract

As high-throughput technologies develop rapidly, multiple types of genomic data are becoming available within and across studies. Using all of these data types together to infer disease-related genetic information has become a challenging task in modern statistical research. In this work, we propose an integrative analysis of multiple types of genomic data, clinical covariates, and survival data under an accelerated failure time (AFT) frailty model. The approach addresses part of this complex problem by identifying relevant genomic features and using them to infer patients’ survival times. It uses weighted least squares with a sparse group LASSO penalty as the objective function, so that the relevant features are estimated and selected simultaneously. Extensive simulation studies assess the performance of the proposed method with two types of genomic data, DNA methylation and copy number variation, on 600 genes and three clinical covariates. The simulation results show that the proposed method is promising. The method is then applied to The Cancer Genome Atlas data on glioblastoma, a lethal brain cancer, and yields biologically interpretable results.
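To make the penalized objective concrete, here is a minimal Python sketch of weighted least squares with a sparse group LASSO penalty, fitted by proximal gradient descent. It is an illustration under stated assumptions, not the authors' implementation. The abstract does not specify how the observation weights `w` are built or how the frailty term enters, so the weights are taken as given and the frailty term is omitted. The function names, the `sqrt(group size)` group weights, the `alpha` mixing parameter, and the choice of solver are all assumptions made for this sketch.

```python
import numpy as np


def soft_threshold(z, t):
    """Elementwise soft-thresholding: the proximal operator of t * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def sgl_prox(z, groups, step, lam, alpha):
    """Proximal operator of the sparse group LASSO penalty.

    Soft-threshold each coefficient (L1 part), then shrink each group
    toward zero as a block (group part). Groups must not overlap.
    """
    out = soft_threshold(z, step * lam * alpha)
    for idx in groups:
        norm = np.linalg.norm(out[idx])
        thresh = step * lam * (1.0 - alpha) * np.sqrt(len(idx))
        if norm <= thresh:
            out[idx] = 0.0          # the whole group is dropped
        else:
            out[idx] = (1.0 - thresh / norm) * out[idx]
    return out


def sgl_objective(beta, X, y, w, groups, lam, alpha):
    """Weighted least-squares loss plus the sparse group LASSO penalty."""
    resid = y - X @ beta
    loss = 0.5 * np.sum(w * resid ** 2)
    l1 = lam * alpha * np.sum(np.abs(beta))
    grp = lam * (1.0 - alpha) * sum(
        np.sqrt(len(idx)) * np.linalg.norm(beta[idx]) for idx in groups
    )
    return loss + l1 + grp


def fit_sgl(X, y, w, groups, lam, alpha, n_iter=500):
    """Proximal gradient descent with a fixed step size 1/L."""
    # L is the largest eigenvalue of X^T diag(w) X (Lipschitz constant of the gradient).
    L = np.linalg.norm(X * np.sqrt(w)[:, None], 2) ** 2
    step = 1.0 / L
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = -X.T @ (w * (y - X @ beta))
        beta = sgl_prox(beta - step * grad, groups, step, lam, alpha)
    return beta


# Toy usage: 6 features in 3 groups of 2 (think of each group as one gene
# measured on two platforms); y stands in for log survival time.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 6))
true_beta = np.array([1.5, -1.0, 0.0, 0.0, 0.0, 0.0])
y = X @ true_beta + 0.1 * rng.normal(size=50)
w = np.full(50, 1.0 / 50)           # placeholder weights (assumption)
groups = [[0, 1], [2, 3], [4, 5]]
beta_hat = fit_sgl(X, y, w, groups, lam=0.05, alpha=0.5)
```

Grouping the columns that belong to the same gene lets the group part of the penalty drop a gene as a whole, while the L1 part can still zero out an individual data type within a retained gene.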

Original language: English (US)
Pages (from-to): 1499-1532
Number of pages: 34
Journal: Computational Statistics
Volume: 36
Issue number: 2
State: Published - Jun 2021

Keywords

  • Accelerated failure time frailty model
  • Genomic data
  • High-dimensional data
  • Integrative analysis
  • Sparse group lasso

ASJC Scopus subject areas

  • Statistics and Probability
  • Statistics, Probability and Uncertainty
  • Computational Mathematics
