NTRS - NASA Technical Reports Server

Feature extraction and classification algorithms for high dimensional data

Feature extraction and classification algorithms for high dimensional data are investigated. Sensors for Earth observation are moving toward much higher dimensional multispectral imagery than is now possible. In analyzing such high dimensional data, processing time becomes an important factor: with large increases in dimensionality and the number of classes, processing time will increase significantly. To address this problem, a multistage classification scheme is proposed which reduces the processing time substantially by eliminating unlikely classes from further consideration at each stage. Several truncation criteria are developed, and the relationship between thresholds and the error caused by the truncation is investigated.

Next, an approach to feature extraction for classification is proposed based directly on the decision boundaries. It is shown that all the features needed for classification can be extracted from decision boundaries. Because only a portion of the decision boundary is effective in discriminating between classes, the concept of the effective decision boundary is introduced. The proposed feature extraction algorithm has several desirable properties: it predicts the minimum number of features necessary to achieve the same classification accuracy as in the original space for a given pattern recognition problem, and it finds the necessary feature vectors. Unlike some previous algorithms, the proposed algorithm does not deteriorate when classes have equal means or equal covariances. In addition, the decision boundary feature extraction algorithm can be used with both parametric and non-parametric classifiers.

Finally, some problems encountered in analyzing high dimensional data are studied and possible solutions are proposed. First, the increased importance of the second order statistics in analyzing high dimensional data is recognized. By investigating the characteristics of high dimensional data, a reason is suggested for why the second order statistics must be taken into account. Given this importance, a way to represent the second order statistics is needed, and a method to visualize statistics using a color code is proposed. With statistics represented by color coding, one can easily extract and compare the first and second order statistics.
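
The multistage idea in the abstract can be illustrated with a minimal sketch. The abstract does not state the truncation criteria or the classifier, so everything specific here is an assumption made for illustration: diagonal-covariance Gaussian classes, stages defined by growing subsets of feature indices, and a truncation rule that drops any class whose log-likelihood falls more than `threshold` below the best surviving class. The names `log_likelihood` and `multistage_classify` are hypothetical and are not taken from the report.

```python
import math

def log_likelihood(x, mean, var, dims):
    """Log-density of a diagonal-covariance Gaussian over the chosen dims (assumed model)."""
    total = 0.0
    for d in dims:
        total += -0.5 * math.log(2 * math.pi * var[d])
        total -= (x[d] - mean[d]) ** 2 / (2 * var[d])
    return total

def multistage_classify(x, classes, stages, threshold):
    """Classify x in stages, discarding unlikely classes after each stage.

    classes:   {name: (mean, var)} with per-feature means and variances
    stages:    list of feature-index lists, cheapest first, full set last
    threshold: assumed truncation rule, in log-likelihood units
    Returns (label, classes still under consideration at the final stage).
    """
    survivors = list(classes)
    for i, dims in enumerate(stages):
        scores = {c: log_likelihood(x, *classes[c], dims) for c in survivors}
        best = max(scores.values())
        if i == len(stages) - 1:
            return max(scores, key=scores.get), survivors
        # Truncation: keep only classes close enough to the current best.
        survivors = [c for c in survivors if scores[c] >= best - threshold]
        if len(survivors) == 1:
            return survivors[0], survivors

# Toy data, invented for this example only.
classes = {
    "water":  ([0.0, 0.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0]),
    "soil":   ([5.0, 5.0, 5.0, 5.0], [1.0, 1.0, 1.0, 1.0]),
    "forest": ([1.0, 0.0, 2.0, 3.0], [1.0, 1.0, 1.0, 1.0]),
}
x = [0.8, 0.1, 1.9, 2.7]
label, survivors = multistage_classify(x, classes, stages=[[0], [0, 1, 2, 3]], threshold=3.0)
print(label, survivors)
```

In this toy run the first stage evaluates only one feature and removes "soil", so the full four-feature evaluation is carried out for two classes instead of three. A smaller `threshold` saves more computation but raises the chance of discarding the true class early, which is the threshold-versus-error trade-off the abstract refers to.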
Document ID
19940008550
Acquisition Source
Legacy CDMS
Document Type
Contractor Report (CR)
Authors
Lee, Chulhee
(Purdue Univ. West Lafayette, IN, United States)
Landgrebe, David
(Purdue Univ. West Lafayette, IN, United States)
Date Acquired
September 6, 2013
Publication Date
January 1, 1993
Subject Category
Computer Programming And Software
Report/Patent Number
TR-EE-93-1
NASA-CR-194298
NAS 1.26:194298
Accession Number
94N13023
Funding Number(s)
CONTRACT_GRANT: NAGW-925
Distribution Limits
Public
Copyright
Work of the US Gov. Public Use Permitted.