Title: A Load-Aware Scheduler for MapReduce Framework in Heterogeneous Environments
Authors: 尤信翰
You, Hsin-Han
黃俊龍
Huang, Jiun-Long
Institute of Computer Science and Engineering
Keywords: Cloud Computing; Hadoop MapReduce; Scheduling Algorithm
Issue Date: 2010
Abstract: MapReduce has become a widely used programming model for large-scale data processing. Data mining, log processing, web indexing, and other data-intensive scientific research can all achieve good scalability and execution efficiency with it. MapReduce is a distributed batch data-processing framework that decomposes a job into many small map tasks and reduce tasks; the master node distributes these tasks to worker nodes, and each worker node completes some of them to finish the whole job. Hadoop MapReduce is currently the most popular open-source implementation of MapReduce. It provides a pluggable scheduler interface, TaskScheduler, and its default scheduler schedules jobs in first-in, first-out (FIFO) order. How the scheduler selects jobs and assigns map and reduce tasks affects both the execution efficiency of MapReduce jobs and the overall utilization of the cluster. In practice, we find that several issues still need to be considered during scheduling to improve performance: the dynamic load of worker nodes, the heterogeneity of worker nodes, and task selection when many jobs coexist. The current Hadoop MapReduce scheduler does not handle these issues well, and overall performance degrades when they arise. We propose a new scheduler, the Load-Aware Scheduler, to address these issues, and implement it on Hadoop MapReduce to improve overall performance and utilization. Experimental results show that, in typical multi-job workloads, it improves performance by 10% to 20% on average by avoiding unnecessary speculative tasks.
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT079755525
http://hdl.handle.net/11536/45872
Appears in Collections: Thesis


Files in This Item:

  1. 552501.pdf
