Title: Design of loss functions and feature transformation for minimum classification error based automatic speech recognition
Name: Ratnagiri, Madhavi (author); Rabiner, Lawrence (chair); Wilder, Joseph (internal member); Marsic, Ivan (internal member); Juang, Biing-Hwang (outside member); Rutgers University, Graduate School - New Brunswick
Subject: Electrical and Computer Engineering; Speech processing systems; Pattern recognition systems
Description: An automatic speech recognition system has two main components: a front-end feature processing component followed by a model training component. A widely used algorithm for training models is Minimum Classification Error (MCE), which designs the model parameters to minimize recognition error. One approach to designing the feature processing component to minimize recognition error is to transform the features using the MCE criterion, which is most commonly used only to estimate the model parameters. Past efforts that integrated feature transformation into MCE training assumed the Hidden Markov Model state distributions were represented by diagonal covariance Gaussian mixtures when estimating the model parameters, but assumed full covariance Gaussian mixtures when estimating the feature transformation matrix. We rectify this discrepancy in assumptions, and derive and implement the MCE-based feature transformation using diagonal covariance Gaussian mixtures.

For designing the model parameters, MCE minimizes the recognition error using a standard sigmoid loss function, which is a Parzen window estimate of the Bayes risk. Using different kernels for Parzen window estimation, we developed new loss functions, viz. the Gaussian kernel based loss and the generalized Savage loss. Investigation into the recognition performance of these loss functions led to the introduction of a new large-margin loss function for MCE (LM-MCE) in which the error is minimized and the margin maximized. Minimizing the error aims to shift the decision boundary so that wrongly classified tokens move to the side of the correct class (although some of these tokens may still lie on or close to the boundary); increasing the margin between the correctly classified tokens and the decision boundary moves such tokens away from the boundary and improves the robustness of the classification. Unlike previous studies, this effort requires neither MCE training prior to maximizing the margin nor non-optimal manual increments of the margin shifts; and since this loss function is bounded, it is not susceptible to outliers. The new LM-MCE loss function has a theoretical basis in Vapnik and Chervonenkis (VC) theory and was developed using the Bayes risk formulation.
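The sigmoid loss at the core of MCE training is simple enough to illustrate directly. Below is a minimal sketch (not code from the dissertation) of the standard MCE sigmoid loss applied to a misclassification measure d, where d > 0 indicates a misclassified token; the slope gamma and the offset theta, which shifts the transition point in the spirit of margin-based variants such as LM-MCE, are illustrative parameters chosen here.

    import numpy as np

    def mce_sigmoid_loss(d, gamma=1.0, theta=0.0):
        """Smooth surrogate for the 0/1 loss: tends to 1 for large positive d
        (confidently misclassified) and to 0 for large negative d (confidently
        correct). A positive theta shifts the transition so that correctly
        classified tokens with small margins still incur loss."""
        return 1.0 / (1.0 + np.exp(-gamma * d + theta))

    # Tokens at various distances from the decision boundary (d < 0: correct).
    d = np.array([-2.0, -0.2, 0.0, 0.5])
    print(mce_sigmoid_loss(d, gamma=2.0))             # plain MCE loss
    print(mce_sigmoid_loss(d, gamma=2.0, theta=1.0))  # margin-shifted loss

Because this loss is bounded between 0 and 1, a single token far on the wrong side of the boundary cannot dominate the total loss, which is the sense in which such losses are robust to outliers.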
Note: Includes bibliographical references
Note: by Madhavi Vedula Ratnagiri
Collection: Graduate School - New Brunswick Electronic Theses and Dissertations
Organization Name: Rutgers, The State University of New Jersey
Rights: The author owns the copyright to this work.