Journal article, BMC Bioinformatics, Year: 2018

Classification based on extensions of LS-PLS using logistic regression: application to clinical and multiple genomic data

Abstract

Prediction from high-dimensional genomic data is an active field in today's medical research. Most proposed prediction methods use genomic data alone, without considering established clinical data that are often available and known to have predictive value. Recent studies suggest that combining clinical and genomic information may improve predictions. In this paper we consider classification methods that use both types of variables simultaneously while applying dimension reduction only to the high-dimensional genomic ones. A usual way to handle this is a two-step approach. In step one, a dimensionality reduction technique is applied to the genomic dataset only. In step two, the selected genomic variables are merged with the clinical variables to build a classification model on the combined dataset. However, the dimension reduction is then built without taking into account the link between the response variable and the clinical data. To address this issue, using Partial Least Squares (PLS) as the reduction technique, we propose a one-step approach based on three extensions of the LS-PLS method (LS for Least Squares) to the logistic regression context. We perform a simulation study to evaluate these approaches against methods using only the clinical data or only the genetic data. We then illustrate their performance in classifying two real data sets containing both clinical information and gene expression.
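To make the baseline concrete, here is a minimal sketch of the two-step approach described in the abstract, not of the LS-PLS extensions the paper proposes. Everything in it is an assumption made for illustration: the simulated data, the function names `pls1_scores` and `fit_logistic`, the choice of two components, the use of plain PLS1 with the 0/1 label treated as a numeric response, and the ridge-penalised Newton fit. The abstract does not specify any of these details.

```python
import numpy as np


def pls1_scores(X, y, n_components=2):
    """Step one: PLS1 scores of the genomic block (NIPALS-style deflation)."""
    Xk = X - X.mean(axis=0)          # centre the genomic variables
    yc = y - y.mean()                # centre the 0/1 response, treated as numeric
    scores = []
    for _ in range(n_components):
        w = Xk.T @ yc                # weights: covariance of each gene with y
        w /= np.linalg.norm(w)
        t = Xk @ w                   # score (latent component)
        p = Xk.T @ t / (t @ t)       # loading
        Xk = Xk - np.outer(t, p)     # deflate X before the next component
        scores.append(t)
    return np.column_stack(scores)


def fit_logistic(Z, y, n_iter=50, ridge=1.0):
    """Step two: ridge-penalised logistic regression fitted by Newton steps.

    For simplicity the penalty is also applied to the intercept.
    """
    Z1 = np.column_stack([np.ones(len(Z)), Z])   # add an intercept column
    beta = np.zeros(Z1.shape[1])
    for _ in range(n_iter):
        eta = np.clip(Z1 @ beta, -30, 30)        # guard against overflow in exp
        mu = 1.0 / (1.0 + np.exp(-eta))
        grad = Z1.T @ (y - mu) - ridge * beta
        hess = Z1.T @ (Z1 * (mu * (1 - mu))[:, None]) + ridge * np.eye(Z1.shape[1])
        beta = beta + np.linalg.solve(hess, grad)
    probs = 1.0 / (1.0 + np.exp(-np.clip(Z1 @ beta, -30, 30)))
    return beta, probs


# Simulated data (hypothetical): 3 clinical variables, 200 genomic variables.
rng = np.random.default_rng(0)
n, n_clin, n_genes = 100, 3, 200
clinical = rng.normal(size=(n, n_clin))
genomic = rng.normal(size=(n, n_genes))
signal = clinical[:, 0] + genomic[:, :5].sum(axis=1)
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-signal))).astype(float)

# Step one uses only the genomic block and the response.
T = pls1_scores(genomic, y, n_components=2)

# Step two merges the PLS scores with the clinical variables.
Z = np.column_stack([clinical, T])
Z = (Z - Z.mean(axis=0)) / Z.std(axis=0)
beta, probs = fit_logistic(Z, y)
print("in-sample accuracy:", np.mean((probs > 0.5) == (y == 1)))
```

Note that `pls1_scores` never sees `clinical`, which is exactly the limitation the abstract points out: the components are built without accounting for what the clinical variables already explain. The printed accuracy is computed on the training data, so it is optimistic and would need cross-validation in any real comparison.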
Main file
Bazzoli_et_al-2018-BMC_Bioinformatics.pdf (1.02 MB)
Origin: Publisher files allowed on an open archive

Dates and versions

hal-01405101, version 1 (29-11-2016)
hal-01405101, version 2 (25-07-2017)
hal-01405101, version 3 (15-10-2018)

Cite

Caroline Bazzoli, Sophie Lambert-Lacroix. Classification based on extensions of LS-PLS using logistic regression: application to clinical and multiple genomic data. BMC Bioinformatics, 2018, 19 (1), ⟨10.1186/s12859-018-2311-2⟩. ⟨hal-01405101v3⟩