Meessen, J.
Xu, L.-Q.
Macq, Benoît
[UCL]
The paper presents a novel method and software platform for remotely and interactively browsing a summary of long video sequences and for revealing the semantic links between shots and scenes in their temporal context. The solution is based on interactive navigation in a scalable mega image built from a JPEG 2000-coded, key-frame-based video summary. Each key-frame can represent an automatically detected shot, event or scene, which is then annotated using semi-automatic tools or learning methods. The presented system is compliant with the new JPEG 2000 Part 9, 'JPIP - JPEG 2000 interactivity, API and protocols,' which lends itself to working under varying transmission channel conditions such as GPRS or 3G wireless networks. While keeping the advantages of a single 2D video summary, such as limited storage cost, the flexibility offered by JPEG 2000 allows the application first to interactively highlight the key-frames corresponding to the desired content within a low-quality, low-resolution version of the full video summary. It then offers fine-grained scalability so that the user can navigate and zoom into particular scenes or events represented by the key-frames. This possibility of visualising key-frames of interest and playing back the corresponding video shots within the context of the whole sequence (e.g. an episode of a media file) enables the user to understand the temporal relations between semantically related events, actions and physical settings, providing a new way to present and search for content in video sequences.
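As an illustration of the navigation idea in the abstract, the following minimal sketch maps a key-frame index to its rectangle inside a mega image and builds a JPIP-style view-window query for it. It is not the authors' implementation: the row-by-row grid layout, the `MegaImage` class, the `jpip_query` helper, the file name `summary.jp2` and the CIF-sized key-frames are all assumptions made for the example. Only the query field names `target`, `fsiz`, `roff`, `rsiz` and `layers` are taken from the JPIP request syntax.

```python
from dataclasses import dataclass
from typing import Optional, Tuple
from urllib.parse import urlencode


@dataclass(frozen=True)
class MegaImage:
    """Hypothetical layout: key-frames tiled row by row in one large image."""
    columns: int        # key-frames per row
    frame_width: int    # width of one key-frame at full resolution (pixels)
    frame_height: int   # height of one key-frame at full resolution (pixels)
    frame_count: int    # total number of key-frames in the summary

    @property
    def rows(self) -> int:
        # Ceiling division: the last row may be only partly filled.
        return -(-self.frame_count // self.columns)

    def full_size(self) -> Tuple[int, int]:
        """Return (width, height) of the whole mega image at full resolution."""
        return self.columns * self.frame_width, self.rows * self.frame_height

    def region(self, index: int) -> Tuple[int, int, int, int]:
        """Return (x, y, width, height) of key-frame `index` at full resolution."""
        if not 0 <= index < self.frame_count:
            raise IndexError(f"key-frame {index} out of range")
        row, col = divmod(index, self.columns)
        return (col * self.frame_width, row * self.frame_height,
                self.frame_width, self.frame_height)


def jpip_query(image: MegaImage, index: int, target: str,
               layers: Optional[int] = None) -> str:
    """Build a JPIP-style query string for the window covering one key-frame."""
    x, y, w, h = image.region(index)
    full_w, full_h = image.full_size()
    params = {
        "target": target,              # name of the image on the server (assumed)
        "fsiz": f"{full_w},{full_h}",  # frame size the offsets refer to
        "roff": f"{x},{y}",            # offset of the requested region
        "rsiz": f"{w},{h}",            # size of the requested region
    }
    if layers is not None:
        params["layers"] = str(layers)  # cap on the number of quality layers
    # Keep commas literal so the value pairs stay readable.
    return urlencode(params, safe=",")


summary = MegaImage(columns=10, frame_width=352, frame_height=288, frame_count=95)
print(jpip_query(summary, 23, "summary.jp2", layers=2))
```

In this sketch a coarse overview of the whole summary would correspond to a request with a smaller `fsiz` and a low `layers` value, and zooming into one key-frame to a request like the one printed above; how the actual platform schedules such requests is not stated in the abstract.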
Bibliographic reference
Meessen, J. ; Xu, L.-Q. ; Macq, Benoît. Content browsing and semantic context viewing through JPEG 2000-based scalable video summary. In: IET Proceedings - Vision, Image & Signal Processing, Vol. 153, no. 3, p. 274-283 (2006)
Permanent URL
http://hdl.handle.net/2078.1/38352