Doctoral thesis (Dissertations and theses)
Machine Learning Techniques for Suspicious Transaction Detection and Analysis
Camino, Ramiro Daniel
2020
 

Files


Full Text
PhD_Thesis.pdf
Author postprint (5.27 MB)
Details



Keywords :
machine learning; fraud detection; deep generative models; anti-money laundering; ripple; ethereum
Abstract :
[en] Financial services must monitor their transactions to avoid being used for money laundering and to combat the financing of terrorism. Initially, organizations in charge of fraud regulation were concerned only with financial institutions such as banks. Nowadays, however, the Fintech industry, online businesses, and platforms involving virtual assets can also be affected by similar criminal schemes. Despite the differences between these entities, the malicious activities affecting them share many common patterns. The first goal of this dissertation is to compile and compare existing studies that use machine learning to detect and analyze suspicious transactions. The second goal is to synthesize the methodologies from the first goal in order to tackle different use cases in an organized manner. Finally, the third goal is to assess the applicability of deep generative models for enhancing existing solutions. In the first part of the thesis, we propose an unsupervised methodology for detecting suspicious transactions and apply it to two case studies. One concerns transactions from a money remittance network, and the other concerns a novel payment network based on distributed ledger technologies. Anomaly detection algorithms rank user accounts based on recency, frequency, and monetary features. Domain experts manually validate the results, confirming known scenarios and uncovering unexpected new cases. In the second part, we carry out an analogous analysis with supervised methods, along with a case study in which we classify Ethereum smart contracts into honeypots and non-honeypots. We take features from the source code, the transaction data, and the characterization of the flow of funds. The proposed classification models generalize well to unseen honeypot instances and techniques, and they allowed us to characterize previously unknown techniques.
In the third part, we analyze the challenges that tabular data, the type of data used to represent financial transactions in the previous two parts, brings to the domain of deep generative models. We propose a new model architecture by adapting state-of-the-art methods to output multiple variables from mixed-type distributions. Additionally, we extend the evaluation metrics used in the literature to the multi-output setting, and we show empirically that our approach outperforms existing methods. Finally, in the last part, we extend the work from the third part by applying the presented models to enhance the classification tasks from the second part, which commonly exhibit severe class imbalance. We introduce a multi-input architecture to extend models alongside our previously proposed multi-output architecture. We compare three techniques for sampling from deep generative models, defining a transparent and fair large-scale experimental protocol and visual analysis tools. We show that general machine learning detection and visualization techniques can help address many of the challenges of the fraud detection domain. In particular, deep generative models can add value to the classification task given the imbalanced nature of the fraudulent class, at the cost of added implementation and time complexity. Promising future applications of deep generative models include missing data imputation and the sharing of synthetic data or data generators that preserve privacy constraints.
Disciplines :
Computer science
Author, co-author :
Camino, Ramiro Daniel ;  University of Luxembourg > Faculty of Science, Technology and Communication (FSTC)
Language :
English
Title :
Machine Learning Techniques for Suspicious Transaction Detection and Analysis
Defense date :
08 October 2020
Number of pages :
136
Institution :
Unilu - University of Luxembourg, Luxembourg, Luxembourg
Degree :
Docteur de l'Université du Luxembourg en Informatique
Promotor :
President :
Jury member :
Aouada, Djamila  
Fernández Slezak, Diego
Hammerschmidt, Christian
Focus Area :
Computational Sciences
FnR Project :
FNR11614300 - Advanced Market Abuse Detection With Big Data, 2017 (01/03/2017-14/10/2020) - Ramiro Daniel Camino
Available on ORBilu :
since 07 December 2020

