Liu, Ying
[UCL]
In this thesis, we have investigated two directions for improving the performance of distributed storage systems. In the first, we present techniques and algorithms that reduce the request latency of geographically distributed storage services. In the second, we propose and design elasticity controllers that maintain predictable performance of distributed storage systems under dynamic workloads and platform uncertainties.

Along the first direction, we have proposed a lease-based data consistency algorithm that allows a distributed storage system to serve read-dominant workloads efficiently at a global scale. Essentially, leases assert the freshness of data within a time interval, and replicas holding valid leases may serve read requests locally. As a result, most read requests are served with little latency. Furthermore, the time-based assertion of leases guarantees the liveness of the consistency algorithm even in the presence of failures.

We have then investigated the efficiency of quorum-based data consistency algorithms when deployed globally. We have proposed the MeteorShower framework, based on replicated logs and loosely synchronized clocks, to augment quorum-based data consistency algorithms. At its core, MeteorShower lets these algorithms maintain data consistency using slightly stale replica values provided by the replicated logs. Consequently, the quorum-based algorithms no longer need to query remote replicas for updates, which significantly reduces request latency.

Based on similar insights, we have built Catenae, a transaction framework for geo-distributed data stores. It employs replicated logs to distribute transactions and to aggregate their execution results. Transactions are distributed in order to accomplish a speculative execution phase, which is coordinated using a transaction chain algorithm.
The algorithm orders transactions based on their execution speed with respect to each data partition, which maximizes the concurrency and determinism of transaction execution. As a result, the execution results on replicas in different data centers are usually consistent when examined in a validation phase. This allows Catenae to commit a serializable read-write transaction within a single inter-DC round-trip time (RTT) in most cases.

Following the second research direction, we examine and control the factors that cause performance degradation when scaling a distributed storage system. First, we have proposed BwMan, a model-based network bandwidth manager. It alleviates the performance degradation caused by data migration when scaling a distributed storage system, by dynamically throttling the network bandwidth allocated to migration activities. As a result, the performance of the storage system is more predictable, i.e., it satisfies a latency-based service level objective (SLO) even in the presence of data migration.

As a step forward, we have systematically modeled the impact of data migration. Using this model, we have built an elasticity controller, ProRenaTa, which combines proactive and reactive control to achieve better control accuracy. With the help of workload prediction and the data migration model, ProRenaTa calculates the best possible scaling plan to resize a distributed storage system under the constraints of meeting scaling deadlines, reducing latency SLO violations, and minimizing VM provisioning cost. As a result, ProRenaTa achieves much higher resource utilization and fewer latency SLO violations than state-of-the-art approaches when provisioning a distributed storage system.
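The lease mechanism described earlier can be sketched as follows. This is an illustrative sketch only, not the thesis's actual implementation: the class name, the fixed lease duration, and the `fetch_from_master` callback are all hypothetical.

```python
import time


class LeasedReplica:
    """Sketch of a lease-holding replica (hypothetical API)."""

    def __init__(self, lease_duration=5.0):
        self.lease_duration = lease_duration  # seconds the lease asserts freshness
        self.lease_expiry = 0.0
        self.value = None

    def renew_lease(self, fresh_value, now=None):
        # A lease is granted together with the current value; within the
        # lease window the replica may answer reads locally.
        now = time.monotonic() if now is None else now
        self.value = fresh_value
        self.lease_expiry = now + self.lease_duration

    def read(self, fetch_from_master, now=None):
        now = time.monotonic() if now is None else now
        if now < self.lease_expiry:
            return self.value  # local read: no wide-area round trip
        # Lease expired: fetch a fresh value remotely, then re-arm the lease.
        self.renew_lease(fetch_from_master(), now=now)
        return self.value
```

The time-based expiry is what gives the scheme its liveness: even if the lease grantor fails, a replica simply stops serving local reads once its lease runs out.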
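The proactive side of a ProRenaTa-style controller can be sketched as below. This is a simplified illustration under assumed inputs (per-VM capacity, per-VM data share, and a rebalancing rate), not the controller's actual model; the function name and parameters are hypothetical.

```python
import math


def scaling_plan(current_vms, predicted_load, capacity_per_vm,
                 data_per_vm_gb, migration_rate_gb_s, deadline_s):
    """Size the cluster for the predicted load, then check whether the
    data migration triggered by the resize can finish before the
    scaling deadline (illustrative sketch)."""
    target_vms = math.ceil(predicted_load / capacity_per_vm)
    delta = target_vms - current_vms
    # Assume rebalancing moves roughly one VM's data share per VM added
    # or removed.
    migration_gb = abs(delta) * data_per_vm_gb
    migration_time_s = migration_gb / migration_rate_gb_s
    feasible = migration_time_s <= deadline_s
    return target_vms, feasible
```

The point of coupling the sizing decision to a migration-time check is that scaling is not instantaneous: a plan that ignores migration cost can miss its deadline and violate the latency SLO while data is still moving.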
Based on ProRenaTa, we have built an elasticity controller named Hubbub-scale, whose control model generalizes the data migration overhead to the performance interference caused by multi-tenancy in the Cloud.
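The generalization can be illustrated as follows: instead of accounting only for migration overhead, the scaling decision discounts each VM's nominal capacity by a measured interference index. This is a toy sketch of the idea, not Hubbub-scale's actual control model; both function names and the linear discount are assumptions.

```python
def effective_capacity(nominal_capacity, interference_index):
    """Discount a VM's nominal capacity by an interference index in
    [0, 1), where 0 means no co-tenant interference (illustrative)."""
    return nominal_capacity * (1.0 - interference_index)


def should_scale_out(load, vms, nominal_capacity, interference_index):
    """Scale out when the load exceeds the interference-adjusted
    capacity of the current cluster (illustrative)."""
    return load > vms * effective_capacity(nominal_capacity, interference_index)
```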
Bibliographic reference: Liu, Ying. Towards elastic high-performance geo-distributed storage in the cloud. Prom.: Van Roy, Peter; Vlassov, Vladimir
Permanent URL: http://hdl.handle.net/2078.1/178914