Data science for connected car insurance : use of trips raw telematics data for knowledge discovery and customers profiling

Spada, Enrico

Please use this identifier to cite or link to this item: http://hdl.handle.net/10362/42452

Title:	Data science for connected car insurance : use of trips raw telematics data for knowledge discovery and customers profiling
Author:	Spada, Enrico
Advisor:	Cabral, Pedro da Costa Brito; Claeys, Olivier
Keywords:	Data Science; Car Insurance; Raw Telematics Data; Clustering; Risk Knowledge
Defense Date:	13-Jul-2018
Abstract:	This report presents the data science processes designed and implemented during an internship at the Actuarial Department of Sterling Insurance (Italy). The project developed a complete data science solution, organized according to the Cross-Industry Standard Process for Data Mining (CRISP-DM). The objective is to study trips raw telematics data in depth, for the very first time, and to discover actionable knowledge that can generate value for the business. The research is based on trips raw telematics data generated over 5 months by telematics black-box devices installed in the cars of 937 customers. The data relate solely to trips, at the finest granularity: the individual geospatial coordinate sets that compose trajectories. Each timestamped GPS coordinate set is described by the average speed in the last second, heading, GPS quality, and meters travelled since the previous position. The data sources are semi-structured data stored in several flat files in their native format, batch-extracted from the data lake. The raw data are first studied extensively at the level of geospatial coordinate sets and enriched with additional open data sources through spatial join operations. Next, a complex sequence of data preparation tasks produces the final dataset, aggregated at trip level and described by 117 features. The final dataset is fed to the k-means algorithm to discover patterns in trip characteristics. Patterns are studied across the overall portfolio, regardless of driver, intentionally neglecting historical or personal information. The study concludes by deploying the clustering results to profile customers, bringing the line of business's risk knowledge about its customers to a new level.
This discovery opens up new possibilities, for example improving pricing, using the results in fraud detection, offering new services, and supporting overall risk prevention for customers.
Description:	Internship report presented as a partial requirement for obtaining a Master's degree in Information Management, specialization in Knowledge Management and Business Intelligence
URI:	http://hdl.handle.net/10362/42452
Degree:	Master's in Information Management, specialization in Knowledge Management and Business Intelligence
Appears in Collections:	NIMS - Dissertações de Mestrado em Gestão da Informação (Information Management)

Files in This Item:

File	Description	Size	Format
TGI0159.pdf		3.8 MB	Adobe PDF
