Learning at the virtualization layer: intrusion detection and workload characterization from within the virtual machine monitor.

Azmandian, Fatemeh

Permanent URL: http://hdl.handle.net/2047/d20002822

Title:

Learning at the virtualization layer : intrusion detection and workload characterization from within the virtual machine monitor

Creator:

Azmandian, Fatemeh (Author)

Contributor:

Kaeli, David R. (Advisor)
Dy, Jennifer G. (Committee member)
Aslam, Javed A. (Committee member)

Publisher:

Boston, Massachusetts : Northeastern University, 2012

Copyright date:

2012

Date Accepted:

August 2012

Date Awarded:

May 2013

Type of resource:

Text

Genre:

Dissertations

Format:

electronic

Digital origin:

born digital

Abstract/Description:

Virtualization technology has many attractive qualities, including improved reliability, scalability, and resource sharing and management. As a result, virtualization has been deployed on an array of platforms, from mobile devices to high-end enterprise servers. In this work, we contribute to the benefits of virtualization by providing two key features for virtual machines (VMs): enhanced security using a novel intrusion detection system and workload characterization for virtual machine workloads. What makes our contributions unique is the fact that we only make use of low-level architectural data available at the virtual machine monitor (VMM) layer. This gives rise to several advantages, including a reduction in the overhead introduced into the system, as only the VMM need be modified, and ease of deployment, since there are no ties to a specific OS and deployment can occur transparently below different operating systems. In addition, it limits the perturbation introduced into the system, thereby reducing the Observer Effect, where the phenomenon under observation is altered or lost due to the measurement itself.

The low-level VMM-layer data, in itself, lacks the semantic information available at higher computing abstraction layers, such as the application layer or operating system layer. Only with the right set of tools is it possible to realize the richness hidden within the raw data. Thus, we take the approach of learning at the VMM layer; we apply machine learning and data mining techniques to understand what it means for an execution stream to be identified as "normal". Then we can flag deviations from normal as suspicious activity, signaling the presence of malware, as well as break down normal behavior into its constituent parts corresponding to prevalent components of a computer system.
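
The idea in the paragraph above, learning what "normal" looks like and then flagging deviations, can be illustrated with a minimal sketch. This is not the detector used in the dissertation; it is a simple per-feature z-score test, and the feature names, sample values, and threshold are all hypothetical and chosen only for illustration.

```python
import statistics


def fit_normal_profile(windows):
    """Learn per-feature mean and standard deviation from normal-only windows."""
    columns = list(zip(*windows))
    means = [statistics.fmean(c) for c in columns]
    # Guard against zero spread so the z-score below never divides by zero.
    stdevs = [statistics.pstdev(c) or 1.0 for c in columns]
    return means, stdevs


def anomaly_score(window, profile):
    """Largest absolute z-score across features; higher means less normal."""
    means, stdevs = profile
    return max(abs(x - m) / s for x, m, s in zip(window, means, stdevs))


def is_suspicious(window, profile, threshold=3.0):
    """Flag a window whose score exceeds the (hypothetical) threshold."""
    return anomaly_score(window, profile) > threshold


# Hypothetical feature vectors, one per time window:
# (disk I/O events, network I/O events, page faults).
normal_windows = [
    (100, 20, 5),
    (110, 22, 6),
    (90, 18, 4),
    (105, 21, 5),
    (95, 19, 5),
]
profile = fit_normal_profile(normal_windows)

print(is_suspicious((102, 20, 5), profile))   # close to the learned profile
print(is_suspicious((100, 400, 5), profile))  # large spike in network I/O
```

A real system would likely need richer features and a more robust model than independent z-scores, but the sketch shows the two phases: fit a profile on normal execution only, then score new windows against it.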

Our experiments on over 300 real-world malware samples and exploits illustrate that there is sufficient information embedded within the VMM-level data to allow accurate detection of malicious attacks, with an acceptable false alarm rate. In this thesis, we also demonstrate that the information available at the VMM level still retains rich workload characteristics that can be used to identify application behavior. We show that we are able to capture enough information about a workload to characterize and decompose it into a combination of CPU-, memory-, disk I/O-, and network I/O-intensive components. By dissecting the behavior of a workload in terms of these components, we can develop significant insight into the behavior of any application.

Finally, in this thesis we propose a novel feature selection algorithm designed to facilitate the process of identifying outliers. It is the first of its kind to tackle the difficult task of selecting features suitable for outlier detection problems. With its opportunities for parallelism, the algorithm is an excellent candidate for implementation on a graphics processing unit (GPU). Through the acceleration provided by general-purpose computing on a GPU (GPGPU), we demonstrate the benefits of the proposed approach over popular and state-of-the-art feature selection techniques, as well as its high applicability to large datasets.

Subjects and keywords:

Data Mining
Feature Selection
Intrusion Detection
Machine Learning
Virtualization
Workload Characterization
Electrical and Computer Engineering
Systems and Communications

DOI:

https://doi.org/10.17760/d20002822

Permanent Link:

http://hdl.handle.net/2047/d20002822

Use and reproduction:

In Copyright: This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the right-holder(s). (http://rightsstatements.org/vocab/InC/1.0/)
Copyright restrictions may apply.
