Robust data representations for visual learning

Title:
Robust data representations for visual learning
Creator:
Li, Sheng (Author)
Contributor:
Fu, Yun (Advisor)
Dy, Jennifer G. (Committee member)
Wang, Lu (Committee member)
Language:
English
Publisher:
Boston, Massachusetts : Northeastern University, 2017
Date Accepted:
April 2017
Date Awarded:
May 2017
Type of resource:
Text
Genre:
Dissertations
Format:
electronic
Digital origin:
born digital
Abstract/Description:
Extracting informative representations from data is a critical task in visual learning applications, as it narrows the gap between low-level observed data and high-level semantic knowledge. Many traditional visual learning algorithms impose strong assumptions on the underlying distribution of the data. In practice, however, the data might be corrupted, contaminated with severe noise, or captured by different types of sensors, which violates these assumptions. As a result, it is of great importance to learn robust data representations that can handle noisy visual data effectively and efficiently.

Recent advances in low-rank and sparse modeling have shown promising performance in recovering clean data from noisy observations, which motivates us to develop new models for robust visual learning. This dissertation focuses on extracting mid-level feature representations from visual data such as images and videos. The research goals of this dissertation are twofold: (1) learning robust data representations from visual data by exploiting low-dimensional subspace structures; (2) evaluating the performance of the learned data representations on various image and video analytics tasks.

Three types of data representations are studied in this dissertation: graphs, subspaces, and dictionaries. First, two novel graph construction schemes are proposed by integrating low-rank modeling with graph sparsification strategies. Each sample is represented in the low-rank coding space, and the similarity measurement in this space is shown to be more robust than that in the original sample space. The proposed graphs can greatly enhance the performance of graph-based clustering and semi-supervised classification. Second, low-dimensional discriminative subspaces are learned in single-view and multi-view scenarios, respectively. The single-view robust subspace discovery model is motivated by low-rank modeling and the Fisher criterion, and it is able to accurately classify noisy images. The multi-view subspace learning model is designed for extracting compact features from multimodal time series data; it leverages a shared latent space and fuses information from multiple data views. Third, dictionaries serve as expressive bases for characterizing visual data. A non-negative dictionary with Laplacian regularization is learned to extract robust features from human motion videos, which leads to promising motion segmentation results. In addition, a robust dictionary learning method is designed to transfer knowledge from a source domain to a target domain with limited training samples.

In summary, this dissertation aims to address the challenges in processing noisy visual data captured in the real world. The proposed robust data representations have shown promising performance in a wide range of visual learning tasks, such as image clustering, face recognition, human motion segmentation, and multimodal classification.
Subjects and keywords:
big data
computer vision
data analytics
machine learning
robust representations
DOI:
https://doi.org/10.17760/D20250037
Permanent Link:
http://hdl.handle.net/2047/D20250037
Use and reproduction:
In Copyright: This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the right-holder(s). (http://rightsstatements.org/vocab/InC/1.0/)
Copyright restrictions may apply.
