Comparison of deep networks for gesture recognition

Date

2021-9-06

Author

Sofu, Buğra

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Gesture recognition is an important problem that has been studied for years, especially in fields such as surveillance systems, analysis of human behavior, and robotics. In this thesis, several state-of-the-art deep learning algorithms were implemented and compared in terms of model complexity and accuracy. A new approach was also proposed and compared with them. The tested algorithms fall into two main categories: hybrid approaches, which apply CNN and LSTM architectures in succession, and three-dimensional convolutional neural networks (3D-CNNs). For the hybrid approaches, we studied CNN-LSTM models and investigated the effect of different feature extractors, namely the Inception-V3 and ResNext50 models. For the ResNext50 architecture, in addition to the original network, we included an attention module called the Squeeze and Excitation (SE) block. With this new approach, accuracy increased by 21% while the number of parameters decreased, giving a less complex model than the original approach. For the 3D-CNNs, the I3D model, which can use pre-trained ImageNet weights, was applied and compared with C3D models, which cannot use ImageNet weights directly. The ability to use ImageNet weights speeds up training, since the network is initialized with ImageNet features, and can also result in a more accurate and effective model overall. A 16.5% accuracy increase was obtained for the 3D-CNN architecture when the I3D model was trained on the Kinetics dataset.
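
The Squeeze and Excitation block mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is a generic form of the technique, not the thesis implementation: the function name se_block, the weight shapes, the reduction ratio of 4, and the omission of bias terms are all assumptions made for illustration, and the abstract does not say where the block is placed inside ResNext50.

```python
import numpy as np

def se_block(features, w_reduce, w_expand):
    """Generic Squeeze-and-Excitation gating (illustrative, biases omitted).

    features: array of shape (C, H, W)
    w_reduce: array of shape (C // r, C), bottleneck projection
    w_expand: array of shape (C, C // r), projection back to C channels
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = features.mean(axis=(1, 2))                  # shape (C,)
    # Excitation: bottleneck with ReLU, then sigmoid gates in (0, 1).
    hidden = np.maximum(w_reduce @ squeezed, 0.0)          # shape (C // r,)
    gates = 1.0 / (1.0 + np.exp(-(w_expand @ hidden)))     # shape (C,)
    # Scale: reweight each channel of the input by its gate.
    return features * gates[:, None, None]

# Hypothetical sizes chosen only for the demonstration.
rng = np.random.default_rng(0)
channels, reduction = 8, 4
features = rng.standard_normal((channels, 5, 5))
w_reduce = 0.1 * rng.standard_normal((channels // reduction, channels))
w_expand = 0.1 * rng.standard_normal((channels, channels // reduction))

out = se_block(features, w_reduce, w_expand)
print(out.shape)
```

The two small projections add only about 2·C²/r weights per block, which is why channel attention of this kind is usually considered a lightweight addition to a backbone.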

Subject Keywords

Gesture Recognition, Hybrid Networks, 3D-CNNs, Two Stream Networks

URI

https://hdl.handle.net/11511/93021

Collections

Graduate School of Natural and Applied Sciences, Thesis

Citation Formats

B. Sofu, “Comparison of deep networks for gesture recognition,” M.S. - Master of Science, Middle East Technical University, 2021.