In this paper, we investigate wavelet transform-based analysis of the audio tracks accompanying videos for automatic program genre detection. We compare classification performance using wavelet-based audio features with that using conventional features derived from Fourier and time-domain analysis, for the task of discriminating between TV programs such as news, commercials, music shows, concerts, motor racing games, and animated cartoons. Three classifiers, namely decision trees, SVMs, and k-nearest neighbours, are studied to assess how reliably our wavelet-feature-based approach performs. We also investigate the appropriate duration of the audio clip to be analysed for automatic genre determination. Our experimental results show that features derived from the wavelet transform of the audio signal separate the six video genres studied very well. We also find no significant difference in performance across the classifiers as the audio clip duration varies.
History
Event
Asian Conference on Computer Vision (5th : 2002 : Melbourne, Vic.)
Pagination
69 - 74
Publisher
Asian Federation of Computer Vision Societies
Location
Melbourne, Vic.
Place of publication
[Tokyo, Japan]
Start date
2002-01-22
End date
2002-01-25
ISBN-13
9780958025607
ISBN-10
0958025606
Language
eng
Notes
Papers will be published in Springer's Lecture Notes in Computer Science.
Publication classification
E1.1 Full written paper - refereed
Copyright notice
2002, Springer
Editor/Contributor(s)
D Suter, A Bab-Hadiashar
Title of proceedings
ACCV 2002 : Proceedings of the 5th Asian Conference on Computer Vision