Confusion Network (CN) Decoding System
A Confusion Network (CN) Decoding System is a Natural Language Processing System that can build a Word Confusion Network from the output data of automatic speech recognition or machine translation systems.
- AKA: Word Confusion Network (WCN) Decoding System.
- Context:
- It can implement a Confusion Network Decoding Algorithm to solve a Confusion Network Decoding Task.
- It can range from being a Speech Recognition Confusion Network Decoding System to being a Machine Translation Confusion Network Decoding System.
- Example(s):
- Counter-Example(s):
- See: Word Representation, Neural Encoder-Decoder Network, Natural Language Processing, Speech Recognition, Machine Translation, Directed Acyclic Graph, Open-Source Software, Moses Statistical Machine Translation System.
References
2021
- (Wikipedia, 2021) ⇒ https://en.wikipedia.org/wiki/Confusion_network Retrieved:2021-7-11.
- A confusion network (sometimes called a word confusion network or informally known as a sausage) is a natural language processing method that combines outputs from multiple automatic speech recognition or machine translation systems. Confusion networks are simple linear directed acyclic graphs with the property that each path from the start node to the end node goes through all the other nodes. The set of words represented by edges between two nodes is called a confusion set. In machine translation, the defining characteristic of confusion networks is that they allow multiple ambiguous inputs, deferring committal translation decisions until later stages of processing. This approach is used in the open source machine translation software Moses and the proprietary translation API in IBM Bluemix Watson.
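The "sausage" structure described above can be made concrete with a small sketch. In the minimal Python representation below (an illustration only, not the data structure of any particular toolkit), a confusion network is a list of confusion sets, each mapping candidate words, including the empty word, to posterior probabilities; consensus decoding then simply keeps the most probable word of each set. All words and probabilities are made up.

```python
# Minimal sketch of a word confusion network ("sausage") and consensus decoding.
# Each confusion set holds the candidate words between two adjacent nodes,
# with their posterior probabilities; "<eps>" denotes the empty word.

EPS = "<eps>"

# Hypothetical ASR output for "the cat sat": three confusion sets.
confusion_network = [
    {"the": 0.7, "a": 0.3},
    {"cat": 0.6, "cap": 0.3, EPS: 0.1},
    {"sat": 0.8, "sad": 0.2},
]

def consensus_decode(cn):
    """Pick the most probable word in every confusion set, dropping the empty word."""
    words = []
    for confusion_set in cn:
        best_word = max(confusion_set, key=confusion_set.get)
        if best_word != EPS:
            words.append(best_word)
    return " ".join(words)

print(consensus_decode(confusion_network))  # -> "the cat sat"
```

A path from the start node to the end node corresponds to choosing exactly one entry from every confusion set in order, which is why every path visits all nodes.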
2020
- (Liu et al., 2020) ⇒ Chen Liu, Su Zhu, Zijian Zhao, Ruisheng Cao, Lu Chen, and Kai Yu (2020). “Jointly Encoding Word Confusion Network and Dialogue Context with BERT for Spoken Language Understanding". In: Proceedings of the 21st Annual Conference of the International Speech Communication Association (Interspeech 2020).
- QUOTE: (...) we propose a novel BERT (Bidirectional Encoder Representations from Transformers) (...) based SLU model to jointly encode WCNs and system acts, which is named WCN-BERT SLU. It consists of three parts: a BERT encoder for jointly encoding, an utterance representation model, and an output layer for predicting semantic tuples. The BERT encoder exploits posterior probabilities of word candidates in WCNs to inject ASR confidences. Multi-head self-attention is applied over both WCNs and system acts to learn context-aware hidden states. The utterance representation model produces an utterance-level vector by aggregating final hidden vectors. Finally, we add both discriminative and generative output layers to predict semantic tuples.
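The central idea quoted above, injecting ASR confidences by combining each candidate word's posterior probability with its embedding before self-attention, can be sketched roughly as follows. This is a simplified PyTorch illustration of that one idea, not the WCN-BERT SLU implementation; the layer sizes, the flattening of the WCN into a token sequence, and the use of a generic Transformer encoder in place of BERT are all assumptions made for brevity.

```python
import torch
import torch.nn as nn

class PosteriorAwareEncoder(nn.Module):
    """Toy encoder that injects word posterior probabilities into token embeddings.

    Simplified sketch of the idea behind WCN-BERT SLU (Liu et al., 2020):
    a scalar ASR posterior per WCN candidate is projected and added to its
    embedding before multi-head self-attention. Sizes are illustrative only.
    """

    def __init__(self, vocab_size=1000, d_model=64, nhead=4, num_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.posterior_proj = nn.Linear(1, d_model)   # scalar posterior -> d_model
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=nhead,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)

    def forward(self, token_ids, posteriors):
        # token_ids: (batch, seq_len) WCN word candidates flattened into a sequence
        # posteriors: (batch, seq_len) ASR posterior probability of each candidate
        x = self.embed(token_ids) + self.posterior_proj(posteriors.unsqueeze(-1))
        return self.encoder(x)   # context-aware hidden states

# Tiny usage example with made-up token ids and posteriors.
model = PosteriorAwareEncoder()
ids = torch.tensor([[5, 17, 23, 42]])           # e.g. "the", "cat", "cap", "sat"
post = torch.tensor([[0.7, 0.6, 0.3, 0.8]])
hidden = model(ids, post)                        # shape: (1, 4, 64)
```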
2011
- (Rosti et al., 2011) ⇒ Antti-Veikko Rosti, Eugene Matusov, Jason Smith, Necip Ayan, Jason Eisner, Damianos Karakos, Sanjeev Khudanpur, Gregor Leusch, Zhifei Li, Spyros Matsoukas, Hermann Ney, Richard Schwartz, B. Zhang, and J. Zheng (2011). “Confusion Network Decoding for MT System Combination". In: Handbook of Natural Language Processing and Machine Translation. ISBN: 978-1-4419-7713-7
- QUOTE: Confusion network decoding has been very successful in combining speech-to-text (STT) outputs (...) from diverse systems using different modeling assumptions. Several modeling paradigms have been introduced in machine translation (MT) including rule-based, phrase-based, hierarchical, syntax-based and even cascades of rule-based and statistical MT systems. Building confusion networks from MT system outputs is more challenging compared to STT system outputs since the translations may have very different word orders and varying lexical choices without affecting the meaning of the sentence, whereas, the words and the word order of speech transcriptions are strictly defined by the utterance.
2008a
- (Leusch et al., 2008) ⇒ Gregor Leusch, Evgeny Matusov, and Hermann Ney (2008). "Complexity of Finding the BLEU-Optimal Hypothesis in a Confusion Network". In: Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing (EMNLP 2008).
- QUOTE: Confusion networks are a simple representation of multiple speech recognition or translation hypotheses in a machine translation system.
2008b
- (Rosti et al., 2008) ⇒ Antti-Veikko I. Rosti, Bing Zhang, Spyros Matsoukas, and Richard Schwartz (2008). “Incremental Hypothesis Alignment for Building Confusion Networks with Application to Machine Translation System Combination". In: Proceedings of the Third Workshop on Statistical Machine Translation (WMT@ACL 2008).
- QUOTE: Confusion network decoding has been the most successful approach in combining outputs from multiple machine translation (MT) systems in the recent DARPA GALE and NIST Open MT evaluations. Due to the varying word order between outputs from different MT systems, the hypothesis alignment presents the biggest challenge in confusion network decoding. This paper describes an incremental alignment method to build confusion networks based on the translation edit rate (TER) algorithm.
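A much simplified sketch of the incremental idea is given below: each new system output is aligned to the current network with a plain edit-distance alignment, used here as a stand-in for the TER alignment of Rosti et al. (which additionally handles block shifts), and each alignment decision either adds a vote to an existing confusion set, adds the empty word where the hypothesis skips a column, or opens a new column. The vote bookkeeping is kept approximate for brevity.

```python
EPS = "<eps>"

def align_and_merge(cn, hyp):
    """Merge a new hypothesis (list of words) into a confusion network.

    cn is a list of dicts mapping word -> vote count. A plain Levenshtein-style
    DP aligns hyp against the existing columns (match = word already present in
    the column); TER-style block shifts are deliberately omitted.
    """
    n, m = len(cn), len(hyp)
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0 if hyp[j - 1] in cn[i - 1] else 1
            cost[i][j] = min(cost[i - 1][j - 1] + sub,   # match / substitute
                             cost[i - 1][j] + 1,         # hyp skips this column
                             cost[i][j - 1] + 1)         # extra hyp word
    # Backtrace, building the merged network from right to left.
    merged, i, j = [], n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and
                cost[i][j] == cost[i - 1][j - 1] + (0 if hyp[j - 1] in cn[i - 1] else 1)):
            col = dict(cn[i - 1])
            col[hyp[j - 1]] = col.get(hyp[j - 1], 0) + 1
            merged.append(col)
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            col = dict(cn[i - 1])
            col[EPS] = col.get(EPS, 0) + 1            # hypothesis has no word here
            merged.append(col)
            i -= 1
        else:
            merged.append({hyp[j - 1]: 1, EPS: 1})    # new column; earlier systems
            j -= 1                                    # contribute the empty word
    return list(reversed(merged))

# Build a network incrementally from three system outputs.
cn = [{w: 1} for w in "the cat sat on the mat".split()]
cn = align_and_merge(cn, "a cat sat on the mat".split())
cn = align_and_merge(cn, "the cat sat on mat".split())
print(cn)
```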
2007
- (Koehn et al., 2007) ⇒ Philipp Koehn, Hieu Hoang, Alexandra Birch, Chris Callison-Burch, Marcello Federico, Nicola Bertoldi, Brooke Cowan, Wade Shen, Christine Moran, Richard Zens, Chris Dyer, Ondrej Bojar, Alexandra Constantin, and Evan Herbst. (2007). “Moses: Open Source Toolkit for Statistical Machine Translation". In: Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics Companion Volume Proceedings of the Demo and Poster Sessions (ACL 2007).
- QUOTE: Recently, approaches have been proposed for improving translation quality through the processing of multiple input hypotheses. We have implemented in Moses confusion network decoding as discussed in (Bertoldi and Federico 2005), and developed a simpler translation model and a more efficient implementation of the search algorithm. Remarkably, the confusion network decoder resulted in an extension of the standard text decoder.
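The remark that the confusion network decoder extends the standard text decoder can be seen from the fact that a plain sentence is just a degenerate confusion network in which every confusion set contains a single word with probability 1. The small sketch below illustrates this; it is not Moses' actual input format or API.

```python
def sentence_to_cn(sentence):
    """Turn a plain sentence into a degenerate confusion network:
    one single-entry confusion set per word, with probability 1.0."""
    return [{word: 1.0} for word in sentence.split()]

# A decoder written against this structure behaves exactly like a text decoder
# on plain input, while the same decoding loop also accepts genuinely
# ambiguous columns with several weighted alternatives.
print(sentence_to_cn("the cat sat"))
# [{'the': 1.0}, {'cat': 1.0}, {'sat': 1.0}]
```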
2005
- (Bertoldi & Federico, 2005) ⇒ Nicola Bertoldi, and Marcello Federico (2005). “A New Decoder for Spoken Language Translation Based on Confusion Networks". In: Proceedings of the Automatic Speech Recognition and Understanding Workshop (ASRU 2005).
- QUOTE: In this paper, we propose an alternative approach which lies in between. Translation is namely applied on an approximation of the original ASR word-graph, known as confusion network (...) A specific log-linear translation model and efficient decoding algorithm are proposed which take advantage of the topological properties of confusion networks. The decoder can be seen as an extension of a phrase-based beam-search algorithm (...), in that each input word can now admit a variable number of alternative hypotheses, including the empty word.
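The extension described in the quote, a beam search in which each input position admits several alternatives including the empty word, can be illustrated with the toy monotone decoder below. The word-for-word "translation table" and the scoring (word posteriors plus a single lexical log-probability) are placeholders; the actual decoder of Bertoldi & Federico combines phrase-table, language-model and other log-linear features.

```python
import math

EPS = "<eps>"

# Toy word-to-word "translation table": source word -> (target word, log prob).
# Purely illustrative; a real decoder scores full phrases and a language model.
ttable = {"the": ("el", -0.1), "a": ("un", -0.2),
          "cat": ("gato", -0.1), "cap": ("gorra", -0.3),
          "sat": ("se sentó", -0.2), "sad": ("triste", -0.4)}

def beam_decode(cn, beam_size=3):
    """Monotone beam search over a confusion network.

    Each hypothesis consumes the confusion sets left to right; choosing the
    empty word consumes an input position without producing any output.
    """
    beam = [(0.0, [])]                      # (log score, output words)
    for confusion_set in cn:
        expanded = []
        for score, out in beam:
            for word, prob in confusion_set.items():
                step = score + math.log(prob)
                if word == EPS:
                    expanded.append((step, out))              # emit nothing
                elif word in ttable:
                    tgt, lp = ttable[word]
                    expanded.append((step + lp, out + [tgt]))
        beam = sorted(expanded, key=lambda h: h[0], reverse=True)[:beam_size]
    return " ".join(beam[0][1])

cn = [{"the": 0.7, "a": 0.3},
      {"cat": 0.6, "cap": 0.3, EPS: 0.1},
      {"sat": 0.8, "sad": 0.2}]
print(beam_decode(cn))   # -> "el gato se sentó"
```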