A Comparative Study of Statistical and Artificial Intelligence based Classification Algorithms on Central Nervous System Cancer Microarray Gene Expression Data

2016-09-03
Arslan, Mustafa Turan
Kalınlı, Adem
A variety of methods are used to classify cancer gene expression profiles based on microarray data. In particular, statistical methods such as Support Vector Machines (SVM), Decision Trees (DT) and Bayes are widely preferred for classifying microarray cancer data. However, statistical methods can often be inadequate for problems based on particularly large-scale data such as DNA microarray data. Therefore, artificial intelligence-based methods have recently been used to classify microarray data. We are interested in classifying microarray cancer gene expression data using both artificial intelligence-based and statistical methods. In this study, Multi-Layer Perceptron (MLP), Radial Basis Function Network (RBFNetwork) and Ant Colony Optimization Algorithm (ACO) have been used alongside the statistical methods. The performance of these classification methods has been tested with validation methods such as v-fold validation. The Correlation-based Feature Selection (CFS) technique has been used to reduce the dimensionality of the DNA microarray gene expression data. According to the results of the experimental study, the artificial intelligence-based classification methods perform better than the statistical methods.
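
As a rough illustration of the CFS step mentioned in the abstract, the sketch below scores a gene subset with the standard CFS merit heuristic, merit = k * r_cf / sqrt(k + k * (k - 1) * r_ff), where r_cf is the mean feature-class correlation and r_ff the mean feature-feature correlation, and grows the subset by greedy forward search. It is not the authors' implementation: the use of absolute Pearson correlation (CFS is often run with symmetrical uncertainty on discretized features), the greedy stopping rule, the function names, and the toy data are all assumptions made for this example.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def cfs_merit(subset, features, labels):
    """CFS merit of a non-empty subset of feature indices."""
    k = len(subset)
    # Mean absolute feature-class correlation.
    r_cf = sum(abs(pearson(features[i], labels)) for i in subset) / k
    if k == 1:
        return r_cf
    # Mean absolute feature-feature correlation over all unordered pairs.
    pairs = [(i, j) for pos, i in enumerate(subset) for j in subset[pos + 1:]]
    r_ff = sum(abs(pearson(features[i], features[j])) for i, j in pairs) / len(pairs)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def cfs_forward_select(features, labels):
    """Greedy forward search: add the best feature until the merit stops improving."""
    selected, best_merit = [], 0.0
    remaining = list(range(len(features)))
    while remaining:
        candidate = max(
            remaining, key=lambda i: cfs_merit(selected + [i], features, labels)
        )
        merit = cfs_merit(selected + [candidate], features, labels)
        if merit <= best_merit:
            break
        selected.append(candidate)
        remaining.remove(candidate)
        best_merit = merit
    return selected, best_merit


# Hypothetical toy data: each inner list is one gene measured across six samples.
genes = [
    [1.0, 1.2, 0.9, 3.0, 3.1, 2.9],  # strongly class-related
    [1.0, 2.0, 1.5, 2.5, 3.5, 2.0],  # weaker and partly redundant with gene 0
    [5.0, 1.0, 3.0, 5.0, 1.0, 3.0],  # unrelated to the class label
]
labels = [0, 0, 0, 1, 1, 1]

selected, merit = cfs_forward_select(genes, labels)
print(selected, round(merit, 3))
```

On this toy input the search keeps only gene 0, because adding the redundant or the uninformative gene lowers the merit. On real microarray data, which has thousands of genes, the pairwise correlations would normally be cached rather than recomputed on every call.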

Suggestions

Training of ANFIS Network by Genetic Algorithm for Diagnosis of Leukemia Cancer Subtypes Using Gene Expression Profile
Arslan, Mustafa Turan; Haznedar, Bülent; Kalınlı, Adem (2017-05-12)
In this study, subtypes of Leukemia cancer have been classified by using microarray gene expression profiles. An approach is proposed to train an Adaptive Neuro Fuzzy Inference System (ANFIS) network by using a population-based Genetic Algorithm (GA) to classify this cancer data. The classification success of the proposed model has been compared with that of Backpropagation (BP)-ANFIS and Hybrid-ANFIS, which are derivative-based ANFIS models. According to the obtained results, the GA-ANFIS model has performed ve...
Determination of the effect of polyadenylation SLR values on microarray data classification
Aslan, Ümit; Can, Tolga; Department of Computer Engineering (2014)
Microarray data classification is generally used to predict unknown sample outcomes with the help of models created using preprocessed and categorized microarray data that includes gene expression values. Preparation of microarray experiments, design of Affymetrix chips and availability of previous microarray experiments give the opportunity to extract a new kind of data: differential expressions of proximal and distal probes (Short to Long Ratio -SLR- values), which is used to predict the alternative pol...
Comparing Clustering Techniques for Real Microarray Data
Purutçuoğlu Gazi, Vilda (2012-08-29)
The clustering of genes detected as significant or differentially expressed provides useful information to biologists about the functions and functional relationships of genes. There are various types of clustering methods that can be applied to genomic data. These are mainly divided into two groups, namely hierarchical and partitional methods. In this paper, as a novelty, we perform a detailed clustering analysis for the recently collected boron microarray dataset to investigate biologically more interes...
An integrative approach to structured snp prioritization and representative snp selection for genome-wide association studies
Üstünkar, Gürkan; Aydın Son, Yeşim; Weber, Gerhard Wilhelm; Department of Information Systems (2011)
Single Nucleotide Polymorphisms (SNPs) are the most frequent genomic variations and the main basis for genetic differences among individuals and many diseases. As genotyping millions of SNPs at once is now possible with the microarrays and advanced sequencing technologies, SNPs are becoming more popular as genomic biomarkers. Like other high-throughput research techniques, genome wide association studies (GWAS) of SNPs usually hit a bottleneck after statistical analysis of significantly associated SNPs, as ...
A multi-layered graphical model of the relation among SNPS, GENES, and pathways based on subgraph search
Ersoy, Gökhan; Aydın Son, Yeşim; Can, Tolga; Department of Bioinformatics (2015)
The analysis of Single Nucleotide Polymorphisms (SNPs) through Genome Wide Association Studies (GWAS) presents great potential for describing disease loci and gaining insight into the underlying etiology of diseases. The recently described combined p-value approach allows identification of associations at the gene and pathway level. Integrated programs like METU-SNP produce simple lists of either SNP id/gene id/pathway title and their p-values and significance status or SNP id/disease id/pathway information. In...
Citation Formats
M. T. Arslan and A. Kalınlı, “A Comparative Study of Statistical and Artificial Intelligence based Classification Algorithms on Central Nervous System Cancer Microarray Gene Expression Data,” 2016, Accessed: 00, 2021. [Online]. Available: https://hdl.handle.net/11511/78718.