Knowledge discovery in microarray data of bioinformatics

2012
Kocabaş, Fahri Salih
This thesis analyzes major microarray repositories and presents a metadata framework both to address current issues and to support the main operations of knowledge discovery, sharing, integration, and exchange. The proposed framework is demonstrated in a case study on real data and can be used for other high-throughput repositories in the biomedical domain. Not only is the number of microarray experiments increasing, but the size and complexity of the results are also rising in response to biomedical inquiries. Moreover, experiment results become significant when they are examined in a batch and placed in a biological context. There have been standardization initiatives on content, object model, exchange format, and ontology. However, they have proprietary information spaces. There are backlogs, and data cannot be exchanged among the repositories. There is currently a need for a format and data management standard. We introduce a metadata framework consisting of metadata cards and semantic nets to make the experiment results visible, understandable, and usable. These are encoded in standard syntax encoding schemes and represented in XML/RDF. They can be integrated with other metadata cards and semantic nets, and they can be queried, exchanged, and shared. We demonstrate the performance and potential benefits with a case study on a microarray repository. This study does not replace any product on the repositories. A metadata framework is required to manage such large volumes of data. We state that, with this metadata framework, backlogs can be reduced, and complex knowledge discovery queries and exchange of information can become possible.
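As a rough illustration of the idea in the abstract, the sketch below builds two small metadata cards as RDF/XML, merges them into one document, and runs a simple query over the result. It is not the thesis's implementation: the use of Dublin Core elements, the `urn:example:` identifiers, the field values, and the helper names `make_card`, `merge_cards`, and `find_by_subject` are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Standard namespace URIs for RDF and Dublin Core (the choice of Dublin Core
# as the card vocabulary is an assumption for this sketch).
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("rdf", RDF)
ET.register_namespace("dc", DC)


def make_card(about, fields):
    """Build one rdf:Description element; `fields` maps DC terms to values."""
    card = ET.Element(f"{{{RDF}}}Description", {f"{{{RDF}}}about": about})
    for term, value in fields.items():
        ET.SubElement(card, f"{{{DC}}}{term}").text = value
    return card


def merge_cards(cards):
    """Integrate several cards under a single rdf:RDF root."""
    root = ET.Element(f"{{{RDF}}}RDF")
    root.extend(cards)
    return root


def find_by_subject(graph, subject):
    """Return the identifiers of all cards whose dc:subject equals `subject`."""
    hits = []
    for card in graph.findall(f"{{{RDF}}}Description"):
        subjects = [e.text for e in card.findall(f"{{{DC}}}subject")]
        if subject in subjects:
            hits.append(card.get(f"{{{RDF}}}about"))
    return hits


# Hypothetical cards; identifiers and values are invented for illustration.
cards = [
    make_card("urn:example:experiment:1",
              {"title": "Hypothetical experiment A", "subject": "breast cancer"}),
    make_card("urn:example:experiment:2",
              {"title": "Hypothetical experiment B", "subject": "leukemia"}),
]
graph = merge_cards(cards)
xml_text = ET.tostring(graph, encoding="unicode")  # exchangeable RDF/XML text
matches = find_by_subject(graph, "leukemia")
```

Because the serialized `xml_text` is plain RDF/XML, another repository could parse it with any RDF-aware tool. A production system would more likely use a dedicated RDF library and a query language such as SPARQL than hand-written element lookups.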

Suggestions

Data interoperability through federated semantic metadata registries
Sınacı, Ali Anıl; Çiçekli, Fehime Nihan; Doğaç, Asuman; Department of Computer Engineering (2014)
This study introduces a unified methodology, together with a supporting framework, for the problem of data interoperability that brings together the power of metadata registries and semantic web technologies. A federated architecture of semantic metadata registries which are purely based on the ISO/IEC 11179 standard leads to the Linked Open Data integration of data element repositories where each element can be uniquely identified, referenced and processed to enable the syntactic and semantic interoper...
Semantic concept recognition from structured and unstructured inputs within cyber security domain
Hoşsucu, Alp Gökhan; Baykal, Nazife; Department of Information Systems (2015)
The linked data initiative has been quite successful in terms of publishing and interlinking data over ontological structures. The success is due to answering semantically rich queries over highly structured data. Linked data structures are widely used in various domains to solve the problem of producing domain-specific knowledge which can be interpreted by automated agents without any human interference. The cyber security field is one of the domains that suffer from the excessiveness of the raw...
Data integration over horizontally partitioned databases in service-oriented data grids
Sunercan, Hatice Kevser Sönmez; Çiçekli, Fehime Nihan; Alpdemir, Mahmut Nedim; Department of Computer Engineering (2010)
Information integration over distributed and heterogeneous resources has been challenging in many respects: coping with various kinds of heterogeneity, including data model, platform, and access interfaces; coping with various forms of data distribution and maintenance policies; scalability; performance; security and trust; reliability and resilience; legal issues; etc. It is obvious that each of these dimensions deserves a separate thread of research efforts. One particular challenge among the ones listed above tha...
Automated integration of real-time and non-real-time defense systems
Dalkiran, Emre; Onel, Tolga; Oğuztüzün, Mehmet Halit S.; Demir, Kadir Alpaslan (2021-04-01)
Various application domains require the integration of distributed real-time or near-real-time systems with non-real-time systems. Smart cities, smart homes, ambient intelligent systems, or network-centric defense systems are among these application domains. Data Distribution Service (DDS) is a communication mechanism based on Data-Centric Publish-Subscribe (DCPS) model. It is used for distributed systems with real-time operational constraints. Java Message Service (JMS) is a messaging standard for enterpr...
CLOUDGEN: Workload generation for the evaluation of cloud computing systems
Koltuk, Furkan; Yazar, Alper; Schmidt, Şenan Ece (2019-04-01)
In this paper, we propose the CLOUDGEN workflow, which produces synthetic workloads for Infrastructure and Platform as a Service for the evaluation of resource management approaches in cloud computing systems. To this end, CLOUDGEN systematically processes and clusters records in a given workload trace and fits distributions for different workload parameters within the clusters. Unlike previous work, clustering is carried out to produce different virtual machine types for achieving models that are sui...
Citation Formats
F. S. Kocabaş, “Knowledge discovery in microarray data of bioinformatics,” Ph.D. - Doctoral Program, Middle East Technical University, 2012.