Optimization of Progressive Queries via Materialized Views for Large Databases

Zhu, Chao

2014-12-13

ChaoZhu_Thesis_final.pdf (1.3MB PDF)

Dissertation

Abstract

There is an increasing demand from numerous contemporary applications, including telematics, e-commerce, and social media, to efficiently process emerging types of queries, such as progressive queries (PQs), on large-scale databases. Unlike a conventional query, a PQ consists of a set of interrelated step-queries (SQs). A user formulates a new SQ on the fly based on the result(s) of the previously executed SQ(s). Processing PQs raises a number of new challenges, and existing database management systems were not designed to process such queries efficiently. In this dissertation, we propose a suite of novel materialized-view based techniques to efficiently process PQs.

First, we propose a dynamic materialized-view based approach for efficiently processing a special type of PQ, called a monotonic linear PQ. We introduce a so-called superior relationship graph to capture the superior relationships among the SQs of such a PQ, and we suggest a method that uses the graph to estimate the benefit of keeping the result of an SQ as a materialized view. To efficiently construct the superior relationship graph, we propose two algorithms: generating-based and pruning-based. To improve the efficiency and quality of view searching, we design an algorithm with a special storage structure to store and manage the materialized views.

Second, to handle generic PQs, we define a so-called multiple query dependency graph to capture the data source dependency relationships that exist among the SQs and external tables of a generic PQ. Using the graph, we derive a mathematical benefit estimation model that takes both the impact and the effectiveness of materialization into consideration. We propose a greedy method and a dynamic programming method to solve the view maintenance problem.

Third, to efficiently find usable materialized views in the view space/set for answering a given SQ, we suggest a dynamic materialized view index method. We design a special index tree structure whose nodes are ordered by a two-level priority rule that facilitates efficient locating of different types of nodes. Bitmaps encoded with special methods are also used to refine the pruning of unusable views during a search.

Fourth, to support PQs in a big data environment such as Hadoop, we propose an index-based technique for performing a new column family join operation on HBase tables. To efficiently process such a join operation, we suggest a multiple freedom family index and develop a parallel MapReduce algorithm to construct it. To perform a column family join on two HBase tables using the indexes, we present two partitioning methods that balance the workload among map nodes in a MapReduce algorithm. The introduced column family join operation and its processing technique ensure the closure property that is essential to the processing of PQs.

To examine the performance of the proposed techniques, we performed extensive empirical and theoretical analyses. Our studies show that the proposed techniques are quite promising for efficiently processing PQs. To our knowledge, our work is the first to apply a materialized-view based approach to efficiently processing progressive queries on large databases.
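As a rough illustration of the idea of choosing which step-query results to keep as materialized views, the following Python sketch applies a generic greedy heuristic under a storage budget. It is not the dissertation's benefit estimation model or its greedy method: the `View` class, the `greedy_select` function, the benefit and size figures, and the benefit-per-size ranking rule are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class View:
    name: str       # hypothetical label for a step-query result
    benefit: float  # assumed estimated benefit of keeping the result
    size: int       # assumed storage cost in arbitrary units (must be > 0)


def greedy_select(views, budget):
    """Keep views in descending benefit-per-size order while they fit the budget."""
    chosen = []
    remaining = budget
    for v in sorted(views, key=lambda v: v.benefit / v.size, reverse=True):
        if v.size <= remaining:
            chosen.append(v.name)
            remaining -= v.size
    return chosen


# Made-up candidates: results of three step-queries of one progressive query.
views = [
    View("sq1", benefit=10.0, size=5),
    View("sq2", benefit=6.0, size=2),
    View("sq3", benefit=4.0, size=4),
]
selected = greedy_select(views, budget=7)
print(selected)
```

A greedy choice of this kind is fast but not guaranteed to be optimal, which is one common reason to pair it with a dynamic programming alternative, as the abstract does.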

Subjects

Databases

Progressive Queries

Materialized Views

Index

Query Optimization

Big Data

Types

Thesis

Handle

https://hdl.handle.net/2027.42/110311

Collections

Dissertations and Theses (Ph.D. and Master's)
