High-Dimensional Micro-array Data Classification Using Minimum Description Length and Domain Expert Knowledge

BOSIN, ANDREA;DESSI, NICOLETTA;PES, BARBARA
2006-01-01

Abstract

This paper evaluates three machine learning methods, namely Naïve Bayes (NB), Adaptive Bayesian Network (ABN) and Support Vector Machines (SVM), for multi-target classification on micro-array datasets that combine a large feature space with very few samples. Features are ranked and selected using the Minimum Description Length (MDL) criterion, and experiments investigate the accuracy and effectiveness of the three methods in classifying many targets, as well as the effect of feature selection on the sensitivity of each classifier. The paper also shows how the knowledge of a domain expert makes it possible to decompose the multi-target classification into a set of binary classifications, one for each target, with a substantial improvement in accuracy. The effectiveness of the MDL criterion for choosing feature subsets is supported by empirical results showing that MDL is comparable with the entropy-based feature selection methodologies reported in earlier work.
Year: 2006
ISBN: 978-3-540-35453-6
Keywords: Data Mining & Knowledge Discovery, Machine Learning, Bioinformatics
Use this identifier to cite or link to this document: https://hdl.handle.net/11584/26595