associated with noisy labeling. We propose an iterative data-selection scheme that scores
the corpus data by forced-alignment likelihood and fuzzy string-matching score, and retrains
the acoustic model on data in order of increasing transcript noise. This yields a succession
of improved acoustic models with progressively lower error rates on a held-out test set. We show
results in terms of phoneme error rate (PER) on a large broadcast news dataset from a national …