Ghosal_2022_Invoice.pdf (2.05 MB)
Automated analysis of malicious Microsoft Office documents
journal contribution
posted on 2022-02-23, 14:27 authored by Vasilios Koutsokostas, Nikolaos Lykousas, Theodoros Apostolopoulos, Gabriele Orazi, Amrita Ghosal, Fran Casino, Mauro Conti, Constantinos Patsakis
Microsoft Office may be by far the most widely used suite for processing documents, spreadsheets, and
presentations. Due to its popularity, it is continuously utilised to carry out malicious campaigns. Threat
actors, exploiting the platform’s dynamic features, use it to launch their attacks and penetrate millions of
hosts in their campaigns.
This work explores the modern landscape of malicious Microsoft Office documents, exposing the means
that malware authors use. We leverage a taxonomy of the tools used to weaponise Microsoft Office documents and explore the modus operandi of malicious actors. Moreover, we generated and publicly shared
a specially crafted dataset, which relies on incorporating benign and malicious documents containing
many dynamic features such as VBA macros and DDE. The latter is crucial for a fair and realistic analysis,
an open issue in the current state of the art. This allows us to draw safe conclusions on the malicious
features and behaviour. More precisely, we extract the necessary features with an automated analysis
pipeline to efficiently and accurately classify a document as benign or malicious using machine learning
with an F1 score above 0.98, outperforming current state-of-the-art detection algorithms.
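For readers unfamiliar with the metric, the F1 score quoted above is the harmonic mean of precision and recall. The following is a minimal sketch of that calculation; the function name and the counts are hypothetical and chosen for illustration only, they are not results or code from the paper.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return the F1 score: the harmonic mean of precision and recall.

    tp: malicious documents correctly flagged (true positives)
    fp: benign documents wrongly flagged (false positives)
    fn: malicious documents missed (false negatives)
    """
    if tp == 0:
        # With no true positives, precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only; not taken from the paper.
score = f1_score(tp=98, fp=2, fn=2)
print(round(score, 2))
```

Because F1 penalises both false positives and false negatives, it is a common choice when benign and malicious samples are not evenly balanced.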
Publication
Computers and Security;114, 102582
Publisher
Elsevier
Note
peer-reviewed
Other Funding information
European Union (EU), Horizon 2020, Government of Catalonia
Language
English