Character-Level Language Model Training (LM) Task
A Character-Level Language Model Training (LM) Task is a language modeling task that requires the production of a character-level language model.
- Context:
- It can be solved by a Character-Level Language Modeling System (that implements a character-level language modeling algorithm); a minimal sketch of the task appears after this list.
- Example(s):
- a character-level LM task based on the Text8 Dataset (Mahoney, 2009).
- char-rnn, developed by Karpathy (2015).
- …
- Counter-Example(s):
- a Word-Level Language Model Training Task.
- See: Language Modeling System, Natural Language Processing System, Recurrent Neural Network.
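To make the task concrete, here is a minimal, illustrative sketch of character-level language model training: a bigram character model with add-one smoothing, evaluated in bits per character (the usual metric for this task). The toy corpus and evaluation string are hypothetical stand-ins for a real dataset such as Text8.

```python
import math
from collections import defaultdict

# Toy corpus standing in for a real character-level LM dataset.
corpus = "hello world, hello character-level language modeling"

# Count character bigrams.
counts = defaultdict(lambda: defaultdict(int))
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

vocab = sorted(set(corpus))

def prob(prev, nxt):
    # Add-one (Laplace) smoothing over the character vocabulary.
    total = sum(counts[prev].values())
    return (counts[prev][nxt] + 1) / (total + len(vocab))

# Evaluate the model in bits per character on a held-out string.
text = "hello language"
bpc = -sum(math.log2(prob(p, n)) for p, n in zip(text, text[1:])) / (len(text) - 1)
print(f"bits per character: {bpc:.3f}")
```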
References
2018
- (Al-Rfou et al., 2018) ⇒ Rami Al-Rfou, Dokook Choe, Noah Constant, Mandy Guo, and Llion Jones. (2018). “Character-Level Language Modeling with Deeper Self-Attention.” In: CoRR, abs/1808.04444.
- QUOTE: Character-level modeling of natural language text is challenging, for several reasons. First, the model must learn a large vocabulary of words “from scratch”. Second, natural text exhibits dependencies over long distances of hundreds or thousands of time steps. Third, character sequences are longer than word sequences and thus require significantly more steps of computation.
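The third point in the quote is easy to see directly: over the same text, a character-level model must take several times more prediction steps than a word-level one. The sentence below is an arbitrary illustration, not an example from the paper.

```python
# A character-level model steps once per character, a word-level
# model once per token, so the character sequence is several
# times longer over the same text.
sentence = "Character-level modeling of natural language text is challenging."
print(len(sentence.split()), "word-level steps")
print(len(sentence), "character-level steps")
```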
2015
- (Karpathy, 2015) ⇒ Andrej Karpathy. (2015). “The Unreasonable Effectiveness of Recurrent Neural Networks.” Blog post, 2015-05-21. http://karpathy.github.io/2015/05/21/rnn-effectiveness/
- QUOTE: ... This post is about sharing some of that magic with you.
We’ll train RNNs to generate text character by character and ponder the question “how is that even possible?”
By the way, together with this post I am also releasing code on Github that allows you to train character-level language models based on multi-layer LSTMs. You give it a large chunk of text and it will learn to generate text like it one character at a time. You can also use it to reproduce my experiments below. But we’re getting ahead of ourselves; What are RNNs anyway? …