Robust data representations for visual learning
Permanent URL:
http://hdl.handle.net/2047/D20250037
Dy, Jennifer G. (Committee member)
Wang, Lu (Committee member)
Recent advances in low-rank and sparse modeling have shown promising performance in recovering clean data from noisy observations, which motivates us to develop new models for robust visual learning. This dissertation focuses on extracting mid-level feature representations from visual data such as images and videos. Its research goals are twofold: (1) learning robust data representations from visual data by exploiting low-dimensional subspace structures; and (2) evaluating the learned representations on various image and video analytics tasks.
Three types of data representations are studied in this dissertation: graphs, subspaces, and dictionaries. First, two novel graph construction schemes are proposed by integrating low-rank modeling with graph sparsification strategies. Each sample is represented in the low-rank coding space, where similarity measurement proves more robust than in the original sample space. The proposed graphs can greatly enhance the performance of graph-based clustering and semi-supervised classification. Second, low-dimensional discriminative subspaces are learned in single-view and multi-view scenarios. The single-view robust subspace discovery model is motivated by low-rank modeling and the Fisher criterion, and it is able to accurately classify noisy images. The multi-view subspace learning model is designed to extract compact features from multimodal time series data; it leverages a shared latent space and fuses information from multiple data views. Third, dictionaries serve as expressive bases for characterizing visual data. A non-negative dictionary with Laplacian regularization is learned to extract robust features from human motion videos, which leads to promising motion segmentation results. In addition, a robust dictionary learning method is designed to transfer knowledge from a source domain to a target domain with limited training samples.
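The idea of building a graph in a low-rank coding space can be illustrated with a minimal sketch. It is not the dissertation's algorithm: it assumes noise-free data and uses the closed-form low-rank representation Z = V V^T obtained from a truncated SVD, followed by a plain k-nearest-neighbor sparsification. The function names (`low_rank_codes`, `knn_sparsify`, `build_graph`) and the parameters `rank` and `k` are hypothetical.

```python
import numpy as np


def low_rank_codes(X, rank):
    """Return the n x n coding matrix Z = V_r V_r^T for data X of shape (d, n).

    Columns of X are samples. This is the closed-form low-rank
    representation for noise-free data, used here as a stand-in (assumption).
    """
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    V = Vt[:rank].T  # (n, rank): top right singular vectors
    return V @ V.T


def knn_sparsify(Z, k):
    """Keep the k largest-magnitude codes per row, then symmetrize."""
    W = np.abs(Z)
    np.fill_diagonal(W, 0.0)  # no self-loops
    n = W.shape[0]
    idx = np.argsort(-W, axis=1)[:, :k]  # k strongest neighbors per row
    mask = np.zeros_like(W, dtype=bool)
    mask[np.arange(n)[:, None], idx] = True
    W = np.where(mask, W, 0.0)
    return np.maximum(W, W.T)  # undirected graph weights


def build_graph(X, rank, k):
    """Sparse similarity graph measured in the low-rank coding space."""
    return knn_sparsify(low_rank_codes(X, rank), k)
```

The resulting weight matrix could then be passed to a standard graph-based method such as spectral clustering or label propagation. Handling corrupted samples would require a robust low-rank solver in place of the plain SVD.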
In summary, this dissertation addresses the challenges of processing noisy visual data captured in the real world. The proposed robust data representations have shown promising performance in a wide range of visual learning tasks, such as image clustering, face recognition, human motion segmentation, and multimodal classification.
computer vision
data analytics
machine learning
robust representations
Copyright restrictions may apply.