2018 WordErrorRateEstimationforSpeec


Subject Headings: Automatic Speech Recognition (ASR) System; Word Error Rate (WER) Measure; e-WER; Large Vocabulary Continuous Speech Recognition (LVCSR) System.

Notes

Cited By

Quotes

Abstract

Measuring the performance of automatic speech recognition (ASR) systems requires manually transcribed data in order to compute the word error rate (WER), which is often time-consuming and expensive. In this paper, we propose a novel approach to estimate WER, or e-WER, which does not require a gold-standard transcription of the test set. Our e-WER framework uses a comprehensive set of features: ASR recognised text, character recognition results to complement recognition output, and internal decoder features. We report results for the two feature sets, black-box and glass-box, using 24 unseen Arabic broadcast programs. Our system achieves 16.9% WER root mean squared error (RMSE) across 1,400 sentences. The estimated overall WER (e-WER) was 25.3% for the three-hour test set, while the actual WER was 28.5%.
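
The two quantities the abstract reports, WER and the RMSE between estimated and actual WER, can be illustrated with a minimal Python sketch. It is not the authors' e-WER estimator: it assumes the conventional definition of WER as word-level edit distance divided by reference length, and the function names (`word_errors`, `wer`, `rmse`) and the sample sentences are hypothetical. The abstract does not say how per-sentence estimates are aggregated into the overall figure, so that step is left out.

```python
import math


def word_errors(reference, hypothesis):
    """Return (word-level edit distance, number of reference words)."""
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words (rolling-row Levenshtein).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1], len(ref)


def wer(reference, hypothesis):
    """Conventional WER: edit distance divided by reference length."""
    errors, n_ref = word_errors(reference, hypothesis)
    if n_ref == 0:
        raise ValueError("reference must contain at least one word")
    return errors / n_ref


def rmse(estimated, actual):
    """Root mean squared error between two equal-length sequences."""
    if len(estimated) != len(actual) or not estimated:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(e - a) ** 2 for e, a in zip(estimated, actual)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical data: one deleted word out of six reference words.
sentence_wer = wer("the cat sat on the mat", "the cat sat on mat")
# Hypothetical per-sentence WER estimates compared with actual values.
error = rmse([0.2, 0.4], [0.1, 0.6])
```

Computing `wer` needs a reference transcript, which is exactly the cost e-WER is meant to avoid; in the paper's setting only the comparison via `rmse` would use gold-standard WER values, and only for evaluation.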

References

BibTeX

@inproceedings{2018_WordErrorRateEstimationforSpeec,
  author    = {Ahmed Ali and
               Steve Renals},
  editor    = {Iryna Gurevych and
               Yusuke Miyao},
  title     = {Word Error Rate Estimation for Speech Recognition: e-WER},
  booktitle = {Proceedings of the 56th Annual Meeting of the Association for Computational
               Linguistics (ACL 2018) Volume 2: Short Papers},
  pages     = {20--24},
  publisher = {Association for Computational Linguistics},
  year      = {2018},
  url       = {https://www.aclweb.org/anthology/P18-2004/},
  doi       = {10.18653/v1/P18-2004},
}



Author: Ahmed Ali, Steve Renals
Title: Word Error Rate Estimation for Speech Recognition: e-WER
Year: 2018