Optimized Decoding for Auditory Attention Detection

Ekin, Bradley Robert

Files

Ekin_washington_0250O_15973.pdf (630.84 KB)

Authors

Ekin, Bradley Robert

Abstract

Stimulus reconstruction has been shown to be an effective tool for detecting a listener's attentional focus in a multi-talker environment. Using electroencephalography (EEG), this technique learns neural decoding functions that predict a signal most similar to the temporal amplitude envelope of an attended talker's speech. By comparing this prediction to the envelope of each speech source in the environment, a decision can be made as to which source the listener is attending to. However, the conventional method for stimulus reconstruction is incomplete when applied to multi-talker environments. The standard minimum mean square error (MMSE) criterion used for learning neural decoder functions discards information about how the brain jointly encodes attended and unattended speech stimuli, information that could be used to develop more discriminative decoders for auditory attention detection. This thesis proposes how the conventional method of stimulus reconstruction can be improved by incorporating concepts from linear discriminant analysis (LDA). Using the expected neural encoding properties for all attentional stimuli, we show how reconstruction error can be minimized while simultaneously maximizing the distance between the attentional class similarity metrics used for attention detection. This thesis then proposes how stimulus reconstruction can be performed using only the spatial component of the neural response, improving computational efficiency by significantly reducing the number of neural features used for attention detection.
By using canonical correlation analysis (CCA) to relate this spatial neural response to a temporal window of stimulus lags, we show that detection accuracy comparable to traditional stimulus reconstruction can be achieved, and that accuracy improves further when concepts from LDA are adapted into this reduced-rank framework for auditory attention detection.
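
For readers unfamiliar with the conventional pipeline that the abstract takes as its starting point, the following is a minimal sketch of MMSE-style stimulus reconstruction followed by a correlation-based attention decision. It is not the thesis's implementation and does not include the LDA or CCA extensions. The function names (`lag_matrix`, `train_decoder`, `detect_attention`, `make_trial`), the ridge regularizer, the number of lags, and the synthetic EEG model are all assumptions made for illustration.

```python
import numpy as np

def lag_matrix(eeg, n_lags):
    """Stack time-lagged EEG copies: (T, C) -> (T, C * n_lags).
    Block k holds the EEG k samples after time t, because the
    neural response follows the stimulus. Missing samples are zero."""
    T, C = eeg.shape
    X = np.zeros((T, C * n_lags))
    for k in range(n_lags):
        X[:T - k, k * C:(k + 1) * C] = eeg[k:]
    return X

def train_decoder(eeg, attended_env, n_lags=8, ridge=1.0):
    """Least-squares (MMSE) decoder mapping lagged EEG to the attended
    envelope. The ridge term is an assumption added for stability."""
    X = lag_matrix(eeg, n_lags)
    Rxx = X.T @ X + ridge * np.eye(X.shape[1])
    rxy = X.T @ attended_env
    return np.linalg.solve(Rxx, rxy)

def detect_attention(eeg, envelopes, decoder, n_lags=8):
    """Reconstruct an envelope from EEG and pick the speech source whose
    envelope correlates best with the reconstruction."""
    recon = lag_matrix(eeg, n_lags) @ decoder
    scores = [np.corrcoef(recon, env)[0, 1] for env in envelopes]
    return int(np.argmax(scores)), scores

# --- Synthetic two-talker data (hypothetical model, illustration only) ---
rng = np.random.default_rng(0)
n_channels, delay = 16, 3
mix_att = rng.standard_normal(n_channels)        # strong attended encoding
mix_un = 0.3 * rng.standard_normal(n_channels)   # weaker unattended encoding

def make_trial(T, attended):
    """Return (eeg, envelopes) for one trial with the given attended talker."""
    envs = []
    for _ in range(2):
        env = np.convolve(np.abs(rng.standard_normal(T)), np.ones(8) / 8, mode="same")
        envs.append(env - env.mean())             # zero-mean for simplicity
    att, un = envs[attended], envs[1 - attended]
    eeg = (np.outer(np.roll(att, delay), mix_att)
           + np.outer(np.roll(un, delay), mix_un)
           + 0.5 * rng.standard_normal((T, n_channels)))
    return eeg, envs

train_eeg, train_envs = make_trial(4000, attended=0)
decoder = train_decoder(train_eeg, train_envs[0])

test_eeg, test_envs = make_trial(2000, attended=1)
choice, scores = detect_attention(test_eeg, test_envs, decoder)
print("detected talker:", choice, "correlations:", np.round(scores, 3))
```

Note that the decoder here is trained only on the attended envelope, which is the limitation the abstract describes: the unattended envelope never enters the training objective.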