Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/162442
Title: Visual metric and semantic localization for UGV
Authors: Zhang, Handuo
Keywords: Engineering::Electrical and electronic engineering::Control and instrumentation::Robotics
Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision
Issue Date: 2022
Publisher: Nanyang Technological University
Source: Zhang, H. (2022). Visual metric and semantic localization for UGV. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/162442
Project: M-RP1A
Abstract: During the continual transition of vision-based algorithms from lab research to real-world applications, significant challenges arise, e.g. the robustness needed to adapt to complex environments and the high demands of multi-task, multi-modal learning models. For visual simultaneous localization and mapping (visual SLAM) on mobile robots specifically, two notable limitations are: (i) drift during pose estimation, especially in dynamic environments, makes the positioning system unstable; (ii) the additional loop detection consumes substantial computational resources for image retrieval and global feature geometry checking. This thesis explores approaches to visual localization on unmanned ground vehicles that address these limitations. The conventional feature extraction and matching pipeline for vision-based tasks involves three steps: (i) local feature detection and description, (ii) feature matching, and (iii) outlier rejection. The first contribution of the thesis is adding a feature selection and anticipation stage to reduce tracking drift. We explore the noise model of image features to select the subset of observed image features with the best "contribution" during data association and pose estimation across multiple frames. Conventional SLAM algorithms make a strong assumption of scene rigidity, which limits their application in challenging environments. The second part of the thesis addresses dynamic environments with moving objects. We present GMC, a motion clustering approach that serves as a lightweight dynamic object filtering method. It can distinguish moving objects from static landmarks. Based on the theory of motion coherence within a particular image area, GMC can segment dynamic objects in 3D space. With this method, we provide an efficient and robust correspondence algorithm that extracts dynamic objects from a static background.
In this way, we propose a dynamic SLAM system that runs in real time without expensive GPU processors. In contrast to GMC, the third part of the thesis turns to an object-aware, learning-based model for more general dynamic scenarios. We use object detection and tracking as points, lines, planes, etc. We utilize semantic information and extract sparse image features simultaneously to keep track of dynamic objects. The static background and the different dynamic objects are jointly optimized in a newly developed sliding-window bundle adjustment. The estimated 3D bounding boxes can provide more robust camera tracking, better scene understanding, and better map merging. The fourth part of the thesis leverages the emerging feature learning framework. It proposes a unified self-supervised model called LGDNet that generates both global and local image feature descriptors end-to-end. Global feature descriptors embed the whole image into a compact representation, which makes scene comparison easier. Local features, on the other hand, focus on the local region similarities of salient parts for structure from motion. Our proposed method directly extracts features together with descriptors that encode both local maximum responses and global context information, avoiding duplicate calculations based on different feature extraction criteria.
URI: https://hdl.handle.net/10356/162442
DOI: 10.32657/10356/162442
Schools: School of Electrical and Electronic Engineering 
Research Centres: ST Engineering-NTU Corporate Lab 
Rights: This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).
Fulltext Permission: open
Fulltext Availability: With Fulltext
Appears in Collections: EEE Theses

Files in This Item:
File: Thesis__Visual Metric and Semantic Localization.pdf
Size: 12.27 MB
Format: Adobe PDF

Page view(s): 254 (updated on Mar 29, 2024)
Download(s) 50: 109 (updated on Mar 29, 2024)

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.