Lee-Krahmer-Wubben Data-To-Text Generation Task
A Lee-Krahmer-Wubben (LKW) Data-To-Text Generation Task is a Data-to-Text Generation Task that generates text items using templatized data representations.
- AKA: Data-to-Text Generation via Template-based Design.
- Context:
- Task Input(s): data representations paired with text examples (words, phrases, sentences) from a corpus.
- Task Output(s): automatically generated text.
- Task Requirement(s):
- It can be solved by a Lee-Krahmer-Wubben Data-To-Text Generation Training System that implements Lee-Krahmer-Wubben Data-To-Text Generation Algorithms (the templatized design is illustrated in the sketch below).
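The templatized design can be read as a delexicalize/relexicalize round trip: attribute values in the training texts are replaced by slots, a model learns to map data to templates, and the slots are refilled at generation time. The following is a minimal Python sketch under that reading; the record format and helper names are invented for illustration and are not the authors' code.

```python
import re

def templatize(record: dict, text: str) -> str:
    """Delexicalize: replace attribute values occurring in the text
    with <field> slot markers, yielding a reusable template."""
    template = text
    for field, value in record.items():
        template = template.replace(str(value), f"<{field}>")
    return template

def fill(template: str, record: dict) -> str:
    """Relexicalize: substitute record values back into the slots."""
    return re.sub(r"<(\w+)>", lambda m: str(record[m.group(1)]), template)

record = {"team": "Ajax", "score": "3-1"}
template = templatize(record, "Ajax won the match 3-1.")
print(template)                                         # <team> won the match <score>.
print(fill(template, {"team": "PSV", "score": "2-0"}))  # PSV won the match 2-0.
```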
- Example(s):
- BLEU scores (Table 8, Lee et al., 2018):

| Corpus | Retrieval: Templates (filled) | Retrieval: Templates (unfilled) | Retrieval: Direct | SMT: Templates (filled) | SMT: Templates (unfilled) | SMT: Direct | NMT: Templates (filled) | NMT: Templates (unfilled) | NMT: Direct |
|---|---|---|---|---|---|---|---|---|---|
| Weather.gov | 63.94 | 34.52 | 69.57 | 89.29 | 36.56 | 61.92 | 89.85 | 36.93 | 78.90 |
| Prodigy-METEO | 44.47 | 27.65 | 23.66 | 39.32 | 26.15 | 30.37 | 45.03 | 26.52 | 27.82 |
| Robocup | 31.39 | 30.73 | 22.38 | 40.77 | 38.18 | 39.04 | 38.98 | 36.62 | 37.50 |
| Dutch Soccer | 2.49 | 1.65 | 4.99 | 1.64 | 0.90 | 2.10 | 1.95 | 1.23 | 1.70 |
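The numbers above are corpus-level BLEU scores on a 0-100 scale. A minimal sketch of computing such a score, assuming the third-party sacrebleu package (any BLEU implementation would do) and invented toy sentences:

```python
import sacrebleu  # pip install sacrebleu (an assumption, not the authors' setup)

# Invented toy system outputs and references, one reference per hypothesis.
hypotheses = ["cloudy with a high near 50",
              "mostly sunny with a high near 70"]
references = [["cloudy , with a high near 50",
               "mostly sunny , with a high near 70"]]  # one reference stream

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU = {bleu.score:.2f}")  # 0-100 scale, as reported in the table
```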
- Human evaluation scores (Table 9, Lee et al., 2018); each cell reports a mean rating with its standard deviation in parentheses:

| Criterion | Corpus | Retrieval: Templates | Retrieval: Direct | SMT: Templates | SMT: Direct | NMT: Templates | NMT: Direct |
|---|---|---|---|---|---|---|---|
| Fluency | Weather.gov | 4.08 (1.04) | 5.32 (0.88) | 5.24 (0.95) | 4.76 (0.79) | 5.00 (0.97) | 5.50 (1.02) |
| Fluency | Prodigy-METEO | 3.27 (1.13) | 2.81 (1.14) | 2.99 (1.16) | 3.02 (1.13) | 3.31 (1.47) | 3.27 (1.43) |
| Fluency | Robocup | 5.21 (0.99) | 5.46 (1.05) | 5.70 (0.99) | 4.82 (1.20) | 5.59 (1.04) | 5.67 (1.11) |
| Fluency | Dutch Soccer | 4.12 (0.99) | 5.33 (0.91) | 2.11 (0.97) | 1.78 (0.85) | 6.10 (0.84) | 5.73 (0.84) |
| Clarity | Weather.gov | 4.36 (1.14) | 5.52 (0.99) | 5.45 (1.02) | 5.24 (1.02) | 5.13 (1.26) | 5.69 (1.04) |
| Clarity | Prodigy-METEO | 2.94 (1.24) | 2.73 (1.26) | 2.82 (1.27) | 2.96 (1.16) | 3.25 (1.57) | 3.29 (1.47) |
| Clarity | Robocup | 5.59 (0.96) | 5.73 (1.03) | 5.96 (0.92) | 5.11 (1.22) | 5.84 (0.98) | 5.78 (1.37) |
| Clarity | Dutch Soccer | 4.85 (1.16) | 5.52 (0.90) | 2.43 (0.99) | 1.94 (0.90) | 6.10 (0.92) | 5.74 (0.83) |
| Correctness | Weather.gov | 3.34 (0.91) | 3.92 (0.90) | 2.55 (0.90) | 2.70 (1.04) | 4.03 (1.04) | 3.22 (1.26) |
| Correctness | Prodigy-METEO | 4.17 (1.22) | 3.21 (0.97) | 3.88 (1.23) | 3.72 (1.20) | 3.99 (1.18) | 3.56 (0.88) |
| Correctness | Robocup | 5.06 (1.14) | 3.83 (1.08) | 5.78 (1.08) | 5.23 (1.13) | 5.70 (1.09) | 5.68 (0.92) |
| Correctness | Dutch Soccer | 3.34 (0.91) | 3.92 (0.90) | 2.55 (0.90) | 2.70 (1.04) | 4.03 (1.04) | 3.22 (1.26) |
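Each "mean (SD)" cell aggregates per-item human judgments; a minimal sketch of that aggregation, with invented ratings:

```python
from statistics import mean, stdev

# Invented Likert-style judgments for one criterion/system/corpus cell.
ratings = [5, 6, 4, 5, 6, 5, 4, 6]
print(f"{mean(ratings):.2f} ({stdev(ratings):.2f})")  # mean (SD), as in the table
```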
- Counter-Example(s):
- See: Mention Generation Task, Text Generation Task, Natural Language Processing Task, Natural Language Generation Task, Natural Language Understanding Task, Natural Language Inference Task, Computing Benchmark Task.
References
2018a
- (Lee et al., 2018) ⇒ Chris van der Lee, Emiel Krahmer, and Sander Wubben. (2018). “Automated Learning of Templates for Data-to-text Generation: Comparing Rule-based, Statistical and Neural Methods.” In: Proceedings of the 11th International Conference on Natural Language Generation (INLG 2018). DOI:10.18653/v1/W18-6504.
- QUOTE: The current work investigated differences in output quality for data-to-text generation using ’direct’ data-to-text conversion and extended models (see figure 1). For this extended model, the input representation and the text examples in the train and development set were ’templatized’. (...)
2018b
- (Qi et al., 2018) ⇒ Ye Qi, Devendra Singh Sachan, Matthieu Felix, Sarguna Padmanabhan, and Graham Neubig. (2018). “When and Why Are Pre-Trained Word Embeddings Useful for Neural Machine Translation?” In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2018), Volume 2 (Short Papers). DOI:10.18653/v1/N18-2084.
2017
- (Klein et al., 2017) ⇒ Guillaume Klein, Yoon Kim, Yuntian Deng, Jean Senellart, and Alexander M. Rush. (2017). “OpenNMT: Open-Source Toolkit for Neural Machine Translation.” In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (ACL 2017) System Demonstrations.
2012
- (Snoek et al., 2012) ⇒ Jasper Snoek, Hugo Larochelle, and Ryan P. Adams. (2012). “Practical Bayesian Optimization of Machine Learning Algorithms.” In: Proceedings of the 25th International Conference on Neural Information Processing Systems (NIPS-2012).
2007
- (Koehn et al., 2007) ⇒ Philipp Koehn, Hieu Hoang, Alexandra Birch, Chris Callison-Burch, Marcello Federico, Nicola Bertoldi, Brooke Cowan, Wade Shen, Christine Moran, Richard Zens, Chris Dyer, Ondrej Bojar, Alexandra Constantin, and Evan Herbst. (2007). “Moses: Open Source Toolkit for Statistical Machine Translation.” In: Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics, Companion Volume: Proceedings of the Demo and Poster Sessions (ACL 2007).