Classification based on extensions of LS-PLS using logistic regression: application to clinical and multiple genomic data

Caroline Bazzoli 1 Sophie Lambert-Lacroix 2
1 SVH - Statistique pour le Vivant et l’Homme
LJK - Laboratoire Jean Kuntzmann
2 TIMC-IMAG-BCM - Biologie Computationnelle et Mathématique
TIMC-IMAG - Techniques de l'Ingénierie Médicale et de la Complexité - Informatique, Mathématiques et Applications, Grenoble
Abstract: Prediction from high-dimensional genomic data is an active field in today's medical research. Most proposed prediction methods use genomic data alone, without considering established clinical data that are often available and known to have predictive value. Recent studies suggest that combining clinical and genomic information may improve predictions. In this paper we consider classification methods that use both types of variables simultaneously, but apply dimension reduction only to the high-dimensional genomic ones. The usual way to do this is a two-step approach. In step one, a dimensionality reduction technique is applied to the genomic dataset alone. In step two, the resulting genomic variables are merged with the clinical variables to build a classification model on the combined dataset. However, the dimension reduction is then built without taking into account the link between the response variable and the clinical data. To address this issue, using Partial Least Squares (PLS) as the reduction technique, we propose a one-step approach based on three extensions of the LS-PLS (LS for Least Squares) method to the logistic regression context. We perform a simulation study to evaluate these approaches against methods using only the clinical data or only the genetic data. We then illustrate their performance in classifying two real data sets containing both clinical information and gene expression.
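The two-step baseline described in the abstract can be illustrated with a minimal sketch. This is not the authors' LS-PLS extension; it only shows the baseline whose limitation motivates the paper: the PLS component is computed from the genomic block and the response, ignoring the clinical variables. Everything here is an assumption made for illustration: the simulated data, the function names `pls1_scores` and `fit_logistic`, the single PLS component, and the small ridge term added for numerical stability.

```python
import numpy as np

def pls1_scores(X, y, n_components=1):
    """PLS1 scores of X supervised by y (NIPALS with deflation of X)."""
    Xk = X - X.mean(axis=0)          # center the genomic block
    yc = y - y.mean()                # center the response
    scores = []
    for _ in range(n_components):
        w = Xk.T @ yc                # weight vector: covariance with the response
        w /= np.linalg.norm(w)
        t = Xk @ w                   # latent score
        p = Xk.T @ t / (t @ t)       # loading vector
        Xk = Xk - np.outer(t, p)     # deflate X before the next component
        scores.append(t)
    return np.column_stack(scores)

def fit_logistic(Z, y, n_iter=25, ridge=1e-3):
    """Logistic regression by Newton-Raphson; tiny ridge term for stability (assumption)."""
    Z1 = np.column_stack([np.ones(len(Z)), Z])   # add intercept column
    beta = np.zeros(Z1.shape[1])
    for _ in range(n_iter):
        eta = np.clip(Z1 @ beta, -30, 30)
        prob = 1.0 / (1.0 + np.exp(-eta))
        grad = Z1.T @ (y - prob) - ridge * beta
        hess = Z1.T @ (Z1 * (prob * (1 - prob))[:, None]) + ridge * np.eye(len(beta))
        beta = beta + np.linalg.solve(hess, grad)
    return beta

# Hypothetical simulated data: 2 clinical variables, 500 genes driven by one latent factor.
rng = np.random.default_rng(0)
n, n_genes = 200, 500
clinical = rng.normal(size=(n, 2))
latent = rng.normal(size=n)
genomic = np.outer(latent, rng.normal(size=n_genes)) + rng.normal(size=(n, n_genes))
logit = 1.0 * clinical[:, 0] + 1.5 * latent
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(float)

# Step one: reduce the genomic block only (clinical data are ignored here).
scores = pls1_scores(genomic, y, n_components=1)
scores = scores / scores.std(axis=0)

# Step two: logistic regression on clinical variables plus PLS scores.
design = np.column_stack([clinical, scores])
beta = fit_logistic(design, y)
fitted = 1.0 / (1.0 + np.exp(-(beta[0] + design @ beta[1:])))
accuracy = np.mean((fitted > 0.5) == (y == 1))
print("coefficients:", beta)
print("training accuracy:", accuracy)
```

The accuracy printed here is computed on the training data and is therefore optimistic; a fair comparison of methods would use held-out data or cross-validation.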
https://hal.archives-ouvertes.fr/hal-01405101
Contributor: Caroline Bazzoli
Submitted on: Monday, October 15, 2018 - 16:26:19
Last modified on: Friday, October 26, 2018 - 18:00:02

File

Bazzoli_et_al-2018-BMC_Bioinfo...
Publisher files allowed on an open archive

Citation

Caroline Bazzoli, Sophie Lambert-Lacroix. Classification based on extensions of LS-PLS using logistic regression: application to clinical and multiple genomic data. BMC Bioinformatics, BioMed Central, 2018, 19 (1), 〈10.1186/s12859-018-2311-2〉. 〈hal-01405101v3〉

Metrics

Record views: 40
File downloads: 52