One Billion Word Language Modelling Benchmark Task
A One Billion Word Language Modelling Benchmark Task is an NLP Benchmark Task that evaluates the performance of language modeling systems.
- AKA: 1B Word Language Modelling Benchmark.
- Context:
- Benchmark Website: https://github.com/ciprian-chelba/1-billion-word-language-modeling-benchmark
- Example(s):
- Model-combination results on the 1B Word Benchmark test set:
| Model | Perplexity |
|---|---|
| Interpolated KN 5-gram, 1.1B n-grams | 67.6 |
| All models | 43.8 |
- Results for individual language models on the 1B Word Benchmark test set (a sketch of the perplexity computation appears after this list):
| Model | Num. Params [billions] | Training Time [hours] | Training CPUs | Perplexity |
|---|---|---|---|---|
| Interpolated KN 5-gram, 1.1B n-grams (KN) | 1.76 | 3 | 100 | 67.6 |
| Katz 5-gram, 1.1B n-grams | 1.74 | 2 | 100 | 79.9 |
| Stupid Backoff 5-gram (SBO) | 1.13 | 0.4 | 200 | 87.9 |
| Interpolated KN 5-gram, 15M n-grams | 0.03 | 3 | 100 | 243.2 |
| Katz 5-gram, 15M n-grams | 0.03 | 2 | 100 | 127.5 |
| Binary MaxEnt 5-gram (n-gram features) | 1.13 | 1 | 5000 | 115.4 |
| Binary MaxEnt 5-gram (n-gram + skip-1 features) | 1.8 | 1.25 | 5000 | 107.1 |
| Hierarchical Softmax MaxEnt 4-gram (HME) | 6 | 3 | 1 | 101.3 |
| Recurrent NN-256 + MaxEnt 9-gram | 20 | 60 | 24 | 58.3 |
| Recurrent NN-512 + MaxEnt 9-gram | 20 | 120 | 24 | 54.5 |
| Recurrent NN-1024 + MaxEnt 9-gram | 20 | 240 | 24 | 51.3 |
- Counter-Example(s):
- See: Natural Language Processing System, Language Model, Machine Translation System, Text Corpus, Language Modeling Algorithm.
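The perplexities reported in the tables above are computed from per-word log-probabilities on the held-out test data (the benchmark distributes such per-word log-probabilities for its baseline n-gram models). Below is a minimal Python sketch of that computation, not part of the benchmark's own tooling; the `perplexity` function name, the log-base parameter, and the three-token example are illustrative assumptions.

```python
import math

def perplexity(log_probs, base=math.e):
    """Corpus-level perplexity from per-word log-probabilities.

    `log_probs` holds one log-probability per test-set token (including
    end-of-sentence tokens, as is standard for this benchmark); `base` is
    a parameter because toolkits differ (e.g. natural log vs. log10).
    """
    avg_neg_log_prob = -sum(log_probs) / len(log_probs)  # cross-entropy in `base` units
    return base ** avg_neg_log_prob                      # PPL = base ** cross-entropy

# Hypothetical three-token example with natural-log probabilities.
print(perplexity([-3.2, -1.7, -4.1]))
```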
References
2014
- (Chelba et al., 2014) ⇒ Ciprian Chelba, Tomáš Mikolov, Mike Schuster, Qi Ge, Thorsten Brants, Phillipp Koehn, and Tony Robinson. (2014). “One Billion Word Benchmark for Measuring Progress in Statistical Language Modeling.” In: Proceedings of the 15th Annual Conference of the International Speech Communication Association (INTERSPEECH 2014).
- QUOTE: With almost one billion words of training data, we hope this benchmark will be useful to quickly evaluate novel language modeling techniques, and to compare their contribution when combined with other advanced techniques. We show performance of several well-known types of language models, with the best results achieved with a recurrent neural network based language model. The baseline unpruned Kneser-Ney 5-gram model achieves perplexity 67.6; a combination of techniques leads to 35% reduction in perplexity, or 10% reduction in cross-entropy (bits), over that baseline. The benchmark is available as a code.google.com project; besides the scripts needed to rebuild the training/held-out data, it also makes available log-probability values for each word in each of ten held-out data sets, for each of the baseline n-gram models.
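As a quick check of the figures in the quote above, the 35% perplexity reduction and the 10% cross-entropy reduction are consistent with each other under the standard identity that cross-entropy in bits is the base-2 logarithm of perplexity. A small Python sketch, using only the perplexity values quoted above (67.6 and 43.8); everything else is illustrative:

```python
import math

# Figures from the quoted abstract: baseline unpruned KN 5-gram vs. all models combined.
ppl_baseline, ppl_combined = 67.6, 43.8

ppl_reduction = 1 - ppl_combined / ppl_baseline             # relative perplexity reduction
h_baseline = math.log2(ppl_baseline)                        # cross-entropy in bits, H = log2(PPL)
h_combined = math.log2(ppl_combined)
entropy_reduction = 1 - h_combined / h_baseline             # relative cross-entropy reduction

print(f"perplexity reduction:    {ppl_reduction:.1%}")      # ~35%
print(f"cross-entropy reduction: {entropy_reduction:.1%}")  # ~10%
```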