Subtree selection in kernels for graph classification

2013-01-01
Tan, Mehmet
Polat, Faruk
Alhajj, Reda
Classification of structured data is essential for a wide range of problems in bioinformatics and cheminformatics. One such problem is the in silico prediction of small-molecule properties such as toxicity, mutagenicity and activity. In this paper, we propose a new feature selection method for graph kernels that use the subtrees of graphs as their feature sets. For this purpose, we propose a masking procedure that amounts to feature selection. We present experiments conducted on several data sets, together with a comparison of our method with some frequent-subgraph-based approaches.
International Journal of Data Mining and Bioinformatics
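
The abstract does not spell out the masking procedure, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that each graph has already been mapped to counts of subtree patterns, that the kernel is the dot product of those counts, and that the mask is simply the set of patterns kept by feature selection. The function name `masked_subtree_kernel`, the pattern strings, and the counts are all hypothetical and chosen for illustration.

```python
from typing import Dict, Set


def masked_subtree_kernel(
    phi_a: Dict[str, int],
    phi_b: Dict[str, int],
    mask: Set[str],
) -> float:
    """Dot product of two subtree-count vectors restricted to `mask`.

    phi_a, phi_b: hypothetical feature maps (subtree pattern -> count).
    mask: the subtree patterns retained by feature selection (assumed
          to be a plain set; the paper's actual mask may differ).
    """
    # Only patterns that are both selected and present in both graphs
    # contribute to the kernel value.
    return float(
        sum(phi_a[p] * phi_b[p] for p in mask if p in phi_a and p in phi_b)
    )


# Made-up subtree counts for two small molecular graphs.
phi_g1 = {"C(C,O)": 2, "C(N)": 1, "O(C)": 3}
phi_g2 = {"C(C,O)": 1, "C(N)": 4, "N(C)": 2}

# Keeping every pattern gives the unmasked kernel value.
all_patterns = set(phi_g1) | set(phi_g2)
k_full = masked_subtree_kernel(phi_g1, phi_g2, all_patterns)

# Masking out everything except one pattern is a feature-selection step.
k_masked = masked_subtree_kernel(phi_g1, phi_g2, {"C(C,O)"})
```

Under these assumptions, choosing the mask is the same as choosing which subtree features the kernel may use, which is why a masking procedure can be read as feature selection.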

Citation Formats
M. TAN, F. Polat, and R. Alhajj, “Subtree selection in kernels for graph classification,” INTERNATIONAL JOURNAL OF DATA MINING AND BIOINFORMATICS, pp. 294–310, 2013, Accessed: 00, 2020. [Online]. Available: https://hdl.handle.net/11511/39134.