Exploiting large-scale data analytics platforms with accelerator hardware

Title:
Exploiting large-scale data analytics platforms with accelerator hardware
Creator:
Li, Xiangyu (Author)
Contributor:
Kaeli, David (Advisor)
Lin, Xue (Committee member)
Mi, Ningfang (Committee member)
Language:
English
Publisher:
Boston, Massachusetts : Northeastern University, December 2018
Date Awarded:
December 2018
Date Accepted:
August 2018
Type of resource:
Text
Genre:
Dissertations
Format:
electronic
Digital origin:
born digital
Abstract/Description:
The volume of data generated today across application domains, including scientific exploration, web search, e-commerce, and medical research, continues to grow without bound. The value of applying machine learning to analyze big data has driven the popularity of high-level distributed computing frameworks such as Apache Hadoop and Spark. These frameworks have significantly improved the programmability of distributed systems, accelerating big data analysis whose workloads are typically beyond the processing and storage capabilities of a single machine.

GPUs have been shown to provide an effective path to accelerate machine learning tasks. These devices offer high memory bandwidth and thousands of parallel cores, which can deliver up to an order of magnitude better performance for machine learning applications compared to multi-core CPUs.

While both distributed systems and GPUs have been shown to independently provide benefits when processing machine learning tasks on big data, developing an integrated framework that can exploit the parallelism provided by GPUs, while maintaining an easy-to-use programming interface, has not been aggressively explored.

In this thesis, we explore the seamless integration of GPUs with Hadoop and Spark to achieve performance and scalability, while preserving their flexible programming interfaces. We propose techniques that expose GPU details for fine-tuned kernels in a Java/Scala-based distributed computing environment, reduce JVM overhead, and increase on/off-heap memory efficiency. We evaluate our approach with a set of representative machine learning data analytics applications and achieve promising performance improvements over Hadoop/Spark's multi-core CPU implementations.
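
For readers unfamiliar with the on/off-heap distinction the abstract refers to, the following is a minimal, hypothetical Scala sketch (not taken from the thesis) of the general integration pattern: a Spark mapPartitions stage batches a partition into a direct, off-heap ByteBuffer that a native GPU kernel could read without an extra copy. The gpuScale stub stands in for a JNI-backed kernel launch and runs on the CPU so the sketch is self-contained; its name and signature are assumptions for illustration only.

    import java.nio.ByteBuffer
    import org.apache.spark.sql.SparkSession

    object OffHeapGpuSketch {
      // Hypothetical stand-in for a JNI-backed GPU kernel launch.
      // Scales n floats in place using absolute buffer indexing.
      def gpuScale(buf: ByteBuffer, n: Int, scale: Float): Unit = {
        var i = 0
        while (i < n) {
          buf.putFloat(i * 4, buf.getFloat(i * 4) * scale)
          i += 1
        }
      }

      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder
          .appName("off-heap-gpu-sketch")
          .master("local[*]")
          .getOrCreate()

        val data = spark.sparkContext.parallelize(1 to 1000000).map(_.toFloat)

        val scaled = data.mapPartitions { it =>
          val values = it.toArray
          // A direct buffer lives outside the JVM heap, so native code can
          // access it without copying and the GC never relocates it.
          val buf = ByteBuffer.allocateDirect(values.length * 4)
          values.foreach(v => buf.putFloat(v))
          gpuScale(buf, values.length, 2.0f) // kernel-launch stub
          (0 until values.length).iterator.map(i => buf.getFloat(i * 4))
        }

        println(s"first results: ${scaled.take(3).mkString(", ")}")
        spark.stop()
      }
    }

In a real system the buffer would be handed to a CUDA or OpenCL kernel through JNI; the point of the sketch is only the batching and off-heap placement, which avoid per-element JVM object overhead during the native hand-off.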
Subjects and keywords:
distributed computing
GPU
machine learning
DOI:
https://doi.org/10.17760/D20294268
Permanent Link:
http://hdl.handle.net/2047/D20294268
Use and reproduction:
In Copyright: This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s). (http://rightsstatements.org/vocab/InC/1.0/)
Copyright restrictions may apply.
