SummaryNet: two-stream convolutional networks for automatic video summarisation

Date
2020
Authors
Jappie, Ziyad
Abstract
Video summarisation is the task of automatically summarising a video sequence by extracting its “important” parts so as to give an overview of what has occurred. Solving this problem benefits a myriad of fields, such as the entertainment industry, sports, and e-learning. Video summarisation carries a distinct inherent difficulty owing to its subjectivity: there is no single correct answer, which makes tangible performance particularly hard to define and measure. This is in addition to the other difficulties associated with general video processing. We present a novel two-stream network framework for automatic video summarisation, which we call SummaryNet. SummaryNet employs a deep two-stream network to model pertinent spatio-temporal features by leveraging both RGB and optical-flow information. We use the Two-Stream Inflated 3D ConvNet (I3D) to extract high-level, semantic feature representations as inputs to our SummaryNet model. Experimental results on common benchmark datasets show that the proposed method achieves comparable or better results than state-of-the-art video summarisation methods.
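The pipeline the abstract describes can be illustrated with a minimal PyTorch sketch. It assumes the two I3D streams have already been run over the video, yielding one RGB and one optical-flow feature vector per segment; the fusion head, scoring layer, and top-k selection rule below are illustrative assumptions, not the dissertation's exact architecture.

import torch
import torch.nn as nn

class TwoStreamScorer(nn.Module):
    # Fuses per-segment RGB and optical-flow I3D features and predicts
    # an importance score in [0, 1] for each segment. (Hypothetical head,
    # not the exact SummaryNet model.)
    def __init__(self, feat_dim: int = 1024, hidden: int = 256):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Linear(2 * feat_dim, hidden),  # concatenated RGB + flow input
            nn.ReLU(),
        )
        self.score = nn.Linear(hidden, 1)

    def forward(self, rgb_feats, flow_feats):
        fused = self.fuse(torch.cat([rgb_feats, flow_feats], dim=-1))
        return torch.sigmoid(self.score(fused)).squeeze(-1)

# Stand-in features; real inputs would come from the pretrained I3D streams.
rgb = torch.randn(120, 1024)    # 120 video segments, 1024-dim RGB features
flow = torch.randn(120, 1024)   # matching optical-flow features
scores = TwoStreamScorer()(rgb, flow)
summary = scores.topk(k=18).indices  # keep roughly the top 15% of segments

The top-k rule stands in for whatever selection strategy the dissertation actually uses; the point is that summarisation reduces to scoring segments from fused two-stream features and keeping the highest-scoring ones.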
Description
A dissertation submitted to the Faculty of Science, University of the Witwatersrand, Johannesburg, in fulfilment of the requirements for the degree of Master of Science, 2020
Citation
Jappie, Ziyad (2020) SummaryNet: two-stream convolutional networks for automatic video summarisation, University of the Witwatersrand, Johannesburg, https://hdl.handle.net/10539/30207