No full text
Unpublished conference/Abstract (Scientific congresses and symposiums)
Tackling the problem of data processing reproducibility in multidimensional chromatography by developing a metabolomics reference data set
Stefanuto, Pierre-Hugues; Dejong, Thibaut; Massenet, Thibault et al.
2023, Metabolomics 2023
Peer reviewed
 


Details



Abstract :
[en] In the quest to make multidimensional chromatography a robust method for untargeted screening of small molecules, one of the key remaining challenges is reproducibility. To reach this objective, some analytical aspects need to be investigated, such as column dimensions and separation conditions. The biggest challenge, however, is the data processing step, going from raw data to information extraction. To enable the evaluation of data analysis methods and processing workflows, a reference data set is required. In this study, we used whole stool research grade test materials (RGTMs) from NIST to develop a control data set covering the comparison of sampling, analysis, and processing workflows. These RGTMs were designed for an inter-method and interlaboratory study on whole stool samples, with the aim of developing a standard reference material. The RGTMs cover two diets, vegan and omnivore, and two sample preparations, liquid and lyophilized. In this presentation, we will focus on the use of this data set to evaluate data processing approaches. The robustness of several workflows involving commercial, in-house, and open-source solutions was investigated. First, we investigated user impact on a well-established ANOVA-based workflow. The goal was to evaluate the weight of human decisions on the final classification metrics and on the significant features identified. Next, we developed and evaluated a new processing approach combining tile-based image comparison and machine learning-based feature selection. In the user impact study, our well-established workflow proved robust to human decisions: choices made during data cleaning, pre-processing, and model building did not affect the global output of the study. For our new processing approach, the combination of tile-based alignment and random forest classification increased robustness compared to the ANOVA-based approach: the false positive rate decreased during feature selection, and we were able to process unbalanced data sets.
Disciplines :
Chemistry
Author, co-author :
Stefanuto, Pierre-Hugues  ;  Université de Liège - ULiège > Département de chimie (sciences) > Chimie analytique, organique et biologique
Dejong, Thibaut  ;  Université de Liège - ULiège > Molecular Systems (MolSys) ; Université de Liège - ULiège > Département de chimie (sciences) > Chimie analytique, organique et biologique
Massenet, Thibault  ;  Université de Liège - ULiège > Molecular Systems (MolSys)
Bhatt, Kinjal  ;  Université de Liège - ULiège > Molecular Systems (MolSys)
Gaida, Meriem  ;  Université de Liège - ULiège > Molecular Systems (MolSys)
Focant, Jean-François  ;  Université de Liège - ULiège > Département de chimie (sciences) > Chimie analytique, organique et biologique
Language :
English
Title :
Tackling the problem of data processing reproducibility in multidimensional chromatography by developing a metabolomics reference data set
Publication date :
2023
Event name :
Metabolomics 2023
Event date :
18-22 June 2023
Audience :
International
Peer reviewed :
Peer reviewed
Available on ORBi :
since 04 July 2023
