Texygen Text Generation Evaluation System
A Texygen Text Generation Evaluation System is a Top-to-Down Multi-dimensional Evaluation System based on a GAN architecture that is implemented over TensorFlow.
- AKA: Texygen System.
- Context:
- It can solve a Texygen Benchmark Task.
- Software Developers: It was developed by Zhu et al. (2018).
- Resource(s): It is an Open-Source Benchmark Platform available at https://github.com/geek-ai/Texygen
- System's Architecture:
- It is a Top-to-Down Multi-dimensional Evaluation System composed of two parts:
- Utils part - it provides users with a Metrics class and an Oracle class.
- Model part - users begin the training process by interacting with the GAN class.
- Training Systems and Tools:
- an RL-based GAN synthetic data training system that uses an oracle LSTM to generate data.
- an RL-based GAN real data training system that uses real-world datasets.
- Example(s):
- …
- Counter-Example(s):
- See: Text Generation System, Natural Language Generation System, Natural Language Understanding System, Hierarchical Reinforcement Learning System, Language Model.
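The two-part layout described above (a utils part with metric classes, and a model part driven through a single GAN class) can be illustrated with a minimal sketch. This is not Texygen's actual API: the class names `Metrics`, `Nll`, and `Gan`, the methods `get_score`, `add_metric`, `evaluate`, `run_synthetic`, and `run_real`, and the toy unigram "oracle" are all assumptions made for illustration. The sketch uses plain Python rather than TensorFlow, and it omits the adversarial training loop entirely.

```python
import math
from abc import ABC, abstractmethod


class Metrics(ABC):
    """Hypothetical metric interface (utils part): a name plus a score."""

    name = "metric"

    @abstractmethod
    def get_score(self, samples):
        """Return a float score for a list of token sequences."""


class Nll(Metrics):
    """Average per-token negative log-likelihood under a toy unigram oracle.

    Assumption: the oracle is a dict mapping token -> probability. The
    oracle described in the source is an LSTM, GRU, or SRU model.
    """

    name = "nll"

    def __init__(self, token_probs):
        self.token_probs = token_probs

    def get_score(self, samples):
        total, count = 0.0, 0
        for sentence in samples:
            for token in sentence:
                total -= math.log(self.token_probs[token])
                count += 1
        return total / count


class Gan:
    """Hypothetical single entry point (model part).

    The generator, discriminator, and reward are deliberately left out;
    only the split between the two data sources is shown.
    """

    def __init__(self):
        self.metrics = []

    def add_metric(self, metric):
        self.metrics.append(metric)

    def evaluate(self, samples):
        # Score the samples with every registered metric.
        return {m.name: m.get_score(samples) for m in self.metrics}

    def run_synthetic(self, oracle_sampler, n_samples):
        # Synthetic-data mode: the data come from an oracle sampler.
        data = [oracle_sampler() for _ in range(n_samples)]
        return self.evaluate(data)

    def run_real(self, dataset):
        # Real-data mode: the data come from a user-supplied corpus.
        return self.evaluate(dataset)


gan = Gan()
gan.add_metric(Nll({"a": 0.5, "b": 0.5}))
scores = gan.run_real([["a", "b"], ["a"]])
```

Because the metric objects are registered on the entry-point class rather than built into it, a new metric can be added without touching the model code, which is one way to read the "highly decoupled" design the source describes.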
References
2018
- (Zhu et al., 2018) ⇒ Yaoming Zhu, Sidi Lu, Lei Zheng, Jiaxian Guo, Weinan Zhang, Jun Wang, and Yong Yu. (2018). “Texygen: A Benchmarking Platform for Text Generation Models.” In: Proceedings of The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval (SIGIR 2018). DOI:10.1145/3209978.3210080.
- QUOTE: Texygen is implemented over TensorFlow [1]. As shown in Fig. 1, the system consists of two parts with three major classes, highly decoupled with each other, and easy for customization.
In the utils part, we provide user Metrics class and Oracle class. The former has three subclasses designed for calculating BLEU score, NLL loss and EmbSim, while the latter one enables user to initialize three different types of Oracle: LSTM-based, GRU-based and SRU-based. The default oracle is LSTM.
In the model part, we enable users to begin the training process by only interacting with the GAN class (as a major class) without concerning about the classes for the generator, the discriminator and the reward (for RL-based GANs). Texygen also provides two different types of training processes in the GAN class: synthetic data training and real data training. The former one uses the oracle LSTM to generate data, while the latter one uses real-world datasets.
- ↑ Martin Abadi et al. 2016. TensorFlow: A System for Large-Scale Machine Learning. In OSDI, Vol. 16. 265–283.