Identifying Polymorphic Malware Variants Using Biosequence Analysis Techniques

Date
2018
Authors
Naidu, Vijay Jeevanantham
Supervisor
Narayanan, Ajit
Whalley, Jacqueline
Pears, Russel
Item type
Thesis
Degree name
Doctor of Philosophy
Publisher
Auckland University of Technology
Abstract

Modern antivirus systems (AVSs) cannot detect new polymorphic malware variants until they emerge, even when signatures of one or more variants belonging to the same polymorphic malware family are already known. Polymorphic malware can transform itself into functionally identical variants. Polymorphism typically changes the order of the viral code, but not the code itself, to avoid signature-based detection. Current AVSs detect malware using signatures based on the most essential parts of a known virus, such as execution traces and instruction sequences. Virus writers exploit the weaknesses of malware signature databases by creating new variants with the same engine used by an existing polymorphic malware family. This thesis presents virus detection and signature extraction techniques that were developed by exploring string matching techniques traditionally employed in biosequence analysis. The main contribution of these matching techniques is the extraction of syntactic patterns (i.e. conserved regions/sequences) from semantically rich polymorphic hex code. The extracted patterns act as signatures and are used to identify polymorphic malware variants belonging to the same family. They can also help identify new variants that introduce only simple alterations. The string matching approaches presented in this thesis may revolutionise our knowledge of polymorphic variant generation and give rise to a new era of string-based syntactic AVSs.
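
As an illustration of the kind of biosequence string matching the abstract refers to, the following is a minimal sketch of Smith-Waterman local alignment (named in the keywords below) applied to two sequences of hex byte tokens. It is not the thesis's implementation: the function name `smith_waterman`, the scoring values (match 2, mismatch -1, gap -1), and the two byte sequences are hypothetical choices made purely for illustration and are not taken from any real malware sample.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Return (best_score, aligned_a, aligned_b) for the best local alignment.

    a and b are lists of tokens (here, hex byte strings). The scoring
    parameters are illustrative assumptions, not values from the thesis.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the dynamic programming matrix; scores never drop below zero,
    # which is what makes the alignment local rather than global.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("--")
            i -= 1
        else:
            out_a.append("--")
            out_b.append(b[j - 1])
            j -= 1
    return best, out_a[::-1], out_b[::-1]


# Hypothetical hex code from two variants that share a conserved region.
variant_a = "90 90 e8 00 00 5d 81 ed".split()
variant_b = "4b e8 00 00 5d 81 ed 43".split()

best_score, aligned_a, aligned_b = smith_waterman(variant_a, variant_b)
print(best_score, " ".join(aligned_a))
```

In this sketch, the aligned region shared by both inputs plays the role of a candidate conserved sequence. How such regions are selected, combined across many variants, and turned into signatures is described in the thesis itself and is not reproduced here.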

Keywords
Smith-Waterman algorithm, Dynamic programming, Polymorphic malware, Syntactic approach, Sequence alignment techniques, String matching algorithm, Biological sequences, Bioinformatics, Data mining, Automatic signature generation, Phylogenetics