IntLIM: Integration using linear models of metabolomics and gene expression data

Jalal K. Siddiqui; Elizabeth Baskin; Mingrui Liu; Carmen Z. Cantemir-Stone; Bofei Zhang; Russell Bonneville; Joseph P. McElroy; Kevin R. Coombes; Ewy A. Mathé

doi:10.1186/s12859-018-2085-6

Research output: Contribution to journal › Article › peer-review

28 Scopus citations

Abstract

Background: Integration of transcriptomic and metabolomic data improves functional interpretation of disease-related metabolomic phenotypes and facilitates discovery of putative metabolite biomarkers and gene targets. For this reason, these data are increasingly collected in large (> 100 participants) cohorts, driving a need for user-friendly, open-source methods and tools for their integration. Of note, clinical/translational studies typically provide snapshot (e.g. one time point) gene and metabolite profiles and, oftentimes, most metabolites measured are not identified. Thus, in these types of studies, pathway/network approaches that take into account the complexity of transcript-metabolite relationships may neither be applicable nor readily uncover novel relationships. With this in mind, we propose a simple linear modeling approach to capture disease- (or other phenotype-) specific gene-metabolite associations, with the assumption that co-regulation patterns reflect functionally related genes and metabolites.

Results: The proposed linear model, metabolite ~ gene + phenotype + gene:phenotype, evaluates whether gene-metabolite relationships differ by phenotype by testing whether the relationship in one phenotype is significantly different from the relationship in another phenotype (via the p-value of the statistical interaction term gene:phenotype). Interaction p-values are computed for all possible gene-metabolite pairs, and significant pairs are then clustered by the directionality of their associations (e.g. strong positive association in one phenotype, strong negative association in another phenotype). We implemented our approach as an R package, IntLIM, which includes a user-friendly R Shiny web interface, thereby making the integrative analyses accessible to non-computational experts. We applied IntLIM to two previously published datasets, collected in the NCI-60 cancer cell lines and in human breast tumor and non-tumor tissue, for which transcriptomic and metabolomic data are available. We demonstrate that IntLIM captures relevant tumor-specific gene-metabolite associations involved in known cancer-related pathways, including glutamine metabolism. Using IntLIM, we also uncover biologically relevant novel relationships that could be further tested experimentally.

Conclusions: IntLIM provides a user-friendly, reproducible framework to integrate transcriptomic and metabolomic data, help interpret metabolomic data, and uncover novel gene-metabolite relationships. The IntLIM R package is publicly available on GitHub (https://github.com/mathelab/IntLIM) and includes a user-friendly web application, vignettes, sample data and data/code to reproduce results.
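To make the interaction test in the Results concrete, the following is a minimal Python sketch, not the IntLIM implementation (which is an R package). It assumes a phenotype with exactly two groups, in which case fitting metabolite ~ gene + phenotype + gene:phenotype is equivalent to fitting one straight line per group and pooling the residual variance: the gene:phenotype coefficient is the difference between the two slopes. The function names (`simple_fit`, `interaction_t`) and the toy values are hypothetical and chosen only for illustration.

```python
from statistics import mean


def simple_fit(x, y):
    """Least-squares line y = a + b*x; returns (slope, Sxx, residual sum of squares)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return slope, sxx, rss


def interaction_t(gene, metabolite, phenotype):
    """Per-group slopes, t-statistic and degrees of freedom for gene:phenotype.

    Assumes exactly two phenotype groups (an assumption of this sketch).
    """
    groups = sorted(set(phenotype))
    if len(groups) != 2:
        raise ValueError("this sketch handles exactly two phenotype groups")
    fits = {}
    for g in groups:
        x = [gi for gi, p in zip(gene, phenotype) if p == g]
        y = [mi for mi, p in zip(metabolite, phenotype) if p == g]
        fits[g] = simple_fit(x, y)
    (b0, sxx0, rss0), (b1, sxx1, rss1) = fits[groups[0]], fits[groups[1]]
    df = len(gene) - 4                      # four coefficients in the full model
    sigma2 = (rss0 + rss1) / df             # pooled residual variance
    se = (sigma2 * (1 / sxx0 + 1 / sxx1)) ** 0.5
    slopes = {g: fits[g][0] for g in groups}
    return slopes, (b1 - b0) / se, df


# Toy data (invented): positive association in "normal", negative in "tumor".
gene = [1, 2, 3, 4, 5, 1, 2, 3, 4, 5]
metabolite = [1.1, 1.9, 3.2, 3.9, 5.1, 5.0, 4.1, 2.9, 2.1, 0.9]
phenotype = ["normal"] * 5 + ["tumor"] * 5

slopes, t_stat, df = interaction_t(gene, metabolite, phenotype)
print(slopes, t_stat, df)
```

A two-sided p-value would follow from the t distribution with `df` degrees of freedom (for example via `scipy.stats.t.sf`, if SciPy is available). The signs of the per-group slopes give the directionality that the abstract describes using to cluster significant pairs.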

Original language	English (US)
Article number	81
Journal	BMC Bioinformatics
Volume	19
Issue number	1
DOIs	https://doi.org/10.1186/s12859-018-2085-6
State	Published - Mar 5 2018
Externally published	Yes

Keywords

Integration
Linear Modeling
Metabolomics
Transcriptomics

ASJC Scopus subject areas

Structural Biology
Biochemistry
Molecular Biology
Computer Science Applications
Applied Mathematics

Cite this

@article{6e435ea4b56849d0a2da9522c7d4d369,
title = "IntLIM: Integration using linear models of metabolomics and gene expression data",
keywords = "Integration, Linear Modeling, Metabolomics, Transcriptomics",
author = "Siddiqui, {Jalal K.} and Elizabeth Baskin and Mingrui Liu and Cantemir-Stone, {Carmen Z.} and Bofei Zhang and Russell Bonneville and McElroy, {Joseph P.} and Coombes, {Kevin R.} and Math{\'e}, {Ewy A.}",
note = "Publisher Copyright: {\textcopyright} 2018 The Author(s).",
year = "2018",
month = mar,
day = "5",
doi = "10.1186/s12859-018-2085-6",
language = "English (US)",
volume = "19",
journal = "BMC Bioinformatics",
issn = "1471-2105",
publisher = "BioMed Central",
number = "1",
}

TY - JOUR
T1 - IntLIM
T2 - Integration using linear models of metabolomics and gene expression data
AU - Siddiqui, Jalal K.
AU - Baskin, Elizabeth
AU - Liu, Mingrui
AU - Cantemir-Stone, Carmen Z.
AU - Zhang, Bofei
AU - Bonneville, Russell
AU - McElroy, Joseph P.
AU - Coombes, Kevin R.
AU - Mathé, Ewy A.
PY - 2018/3/5
Y1 - 2018/3/5
KW - Integration
KW - Linear Modeling
KW - Metabolomics
KW - Transcriptomics
UR - http://www.scopus.com/inward/record.url?scp=85043345678&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85043345678&partnerID=8YFLogxK
U2 - 10.1186/s12859-018-2085-6
DO - 10.1186/s12859-018-2085-6
M3 - Article
C2 - 29506475
AN - SCOPUS:85043345678
SN - 1471-2105
VL - 19
JO - BMC Bioinformatics
JF - BMC Bioinformatics
IS - 1
M1 - 81
ER -
