A multi-modal transformer approach for football event classification
Video understanding has been enhanced by the use of multi-modal networks. However, recent multi-modal video analysis models have limited applicability to sports videos due to their specialised nature. This paper proposes a novel attention-based multi-modal neural network for sports event classification, featuring a multi-stage fusion training strategy. The proposed network integrates three modalities - an image sequence modality, an audio modality and a newly proposed sports formation modality - to improve sports video classification performance. Empirical results show that the proposed model outperforms the state-of-the-art transformer-based video method by 4.43% in top-1 accuracy on the SoccerNet-v2 dataset.
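The abstract describes fusing three per-modality embeddings with attention before classification. As a minimal illustrative sketch only (the paper's actual architecture, layer sizes, and multi-stage training strategy are not specified here; all function names, weights, and dimensions below are hypothetical), scaled dot-product attention over one embedding per modality could look like:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_fusion(modalities, w_q, w_k, w_v):
    """Fuse per-modality embeddings with scaled dot-product attention.

    modalities: (num_modalities, d) array, one embedding per modality
    (e.g. image sequence, audio, formation). Weights are hypothetical.
    """
    q = modalities @ w_q
    k = modalities @ w_k
    v = modalities @ w_v
    d = q.shape[-1]
    scores = softmax(q @ k.T / np.sqrt(d))  # cross-modality attention map
    fused = scores @ v                      # (num_modalities, d)
    return fused.mean(axis=0)               # pooled joint representation

rng = np.random.default_rng(0)
d = 16
# Stand-in embeddings for the three modalities named in the abstract.
image_seq, audio, formation = rng.standard_normal((3, d))
tokens = np.stack([image_seq, audio, formation])
w_q, w_k, w_v = rng.standard_normal((3, d, d))
fused = attention_fusion(tokens, w_q, w_k, w_v)
print(fused.shape)  # (16,)
```

The pooled vector would then feed a classification head over event classes; the paper's multi-stage fusion training presumably trains such fusion components in stages, but the sketch above shows only the attention-fusion step.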
Funding
- China Scholarship Council
- Loughborough University
- JADE: Joint Academic Data science Endeavour - 2
- Engineering and Physical Sciences Research Council
History
School
- Science
Department
- Computer Science
Published in
2023 IEEE International Conference on Image Processing (ICIP)
Pages
2220 - 2224
Source
2023 IEEE International Conference on Image Processing (ICIP 2023)
Publisher
IEEE
Version
- AM (Accepted Manuscript)
Rights holder
© IEEE
Publisher statement
© 2023 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Acceptance date
2023-06-21
Publication date
2023-09-11
Copyright date
2023
ISBN
9781728198354
Publisher version
Language
- en