Character-Level Language Model (LM)

AKA: Joint Probability Function for Characters.
Context:
- It can range from being a Forward Character-Level Language Model to being a Backward Character-Level Language Model to being a Bi-Directional Character-Level Language Model.
- It can be produced by a Character-Level Language Model Training System (that solves a character-level LM training task).
- It can range from being a Unigram Character-Level Language Model to being a Bigram Character-Level Language Model to being a Trigram Character-Level Language Model to being an n-Gram Character-Level Language Model.
- It can range from being an Unsmoothed Character-Level n-Gram Model to being a Smoothed Character-Level n-Gram Model.
- …
Example(s):
- [math]\displaystyle{ f(\text{This is a phrase}) \Rightarrow LM_{3char}(\text{T}) \times LM_{3char}(\text{h} \mid \text{T}) \times LM_{3char}(\text{i} \mid \text{T, h}) \times LM_{3char}(\text{s} \mid \text{h, i}) \times \cdots \Rightarrow 0.00014 }[/math].
- a Maximum Likelihood-based Character-Level Language Model, such as an unsmoothed maximum-likelihood character-level LM.
- a Neural Network-based Character-Level Language Model.
- …
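
The trigram factorization in the first example can be sketched in code as an unsmoothed maximum-likelihood model with an optional smoothed variant. This is a minimal illustration under stated assumptions, not a reference implementation: the function names (`train_char_trigram`, `char_prob`, `sequence_log_prob`), the `BOS` padding character, and the add-k smoothing parameter `k` are all hypothetical choices made here. Padding with two start markers stands in for the shortened contexts at the start of the sequence, and the smoothing vocabulary is assumed to be the set of characters seen in training.

```python
import math
from collections import Counter

BOS = "\x02"  # hypothetical start-of-sequence padding character (an assumption)


def train_char_trigram(corpus):
    """Count character trigrams and their two-character contexts."""
    padded = BOS * 2 + corpus
    trigram_counts = Counter()
    context_counts = Counter()
    for i in range(2, len(padded)):
        context = padded[i - 2:i]
        trigram_counts[context + padded[i]] += 1
        context_counts[context] += 1
    # The vocabulary is assumed to be the characters observed in training.
    return trigram_counts, context_counts, set(corpus)


def char_prob(model, context, ch, k=0.0):
    """P(ch | context) by relative frequency; k > 0 applies add-k smoothing."""
    trigram_counts, context_counts, vocab = model
    numerator = trigram_counts[context + ch] + k
    denominator = context_counts[context] + k * len(vocab)
    return numerator / denominator if denominator > 0 else 0.0


def sequence_log_prob(model, text, k=0.0):
    """Chain-rule log-probability of text; -inf if any factor is zero."""
    padded = BOS * 2 + text
    total = 0.0
    for i in range(2, len(padded)):
        p = char_prob(model, padded[i - 2:i], padded[i], k)
        if p == 0.0:
            return float("-inf")
        total += math.log(p)
    return total


model = train_char_trigram("this is a phrase. this is a test.")
print(sequence_log_prob(model, "this is"))         # unsmoothed
print(sequence_log_prob(model, "this it", k=1.0))  # add-one smoothed
```

Summing log-probabilities rather than multiplying raw probabilities avoids numerical underflow on long strings. With `k=0.0`, any unseen trigram drives the whole sequence probability to zero, which is the practical difference between the unsmoothed and smoothed variants listed under Context.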
Counter-Example(s):
- a Word/Token-Level Language Model.
- a DNA Sequence Probability Function.
See: Text Character.

References

https://machinelearningmastery.com/develop-character-based-neural-language-model-keras/
- QUOTE: … The benefit of character-based language models is their small vocabulary and flexibility in handling any words, punctuation, and other document structure. This comes at the cost of requiring larger models that are slower to train. …

(Karpathy, 2015) ⇒ Andrej Karpathy. (2015). “The Unreasonable Effectiveness of Recurrent Neural Networks.” May 21, 2015.
- QUOTE: … Okay, so we have an idea about what RNNs are, why they are super exciting, and how they work. We’ll now ground this in a fun application: We’ll train RNN character-level language models. That is, we’ll give the RNN a huge chunk of text and ask it to model the probability distribution of the next character in the sequence given a sequence of previous characters. This will then allow us to generate new text one character at a time. …
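
The generation loop described in the quote (estimate a distribution over the next character, sample from it, append, repeat) can be sketched independently of the model that supplies the distribution. The sketch below substitutes a simple bigram count table for the RNN purely to stay self-contained; it is not Karpathy's implementation, and the names `fit_next_char_counts` and `generate` are hypothetical.

```python
import random
from collections import Counter, defaultdict


def fit_next_char_counts(text):
    """Count, for each character, which characters follow it in text."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(text, text[1:]):
        counts[prev][nxt] += 1
    return counts


def generate(counts, seed_char, length, rng=None):
    """Sample up to `length` further characters, one at a time."""
    rng = rng or random.Random()
    out = [seed_char]
    for _ in range(length):
        followers = counts.get(out[-1])
        if not followers:  # no observed continuation: stop early
            break
        chars = list(followers)
        weights = [followers[c] for c in chars]
        # Draw the next character in proportion to its observed frequency.
        out.append(rng.choices(chars, weights=weights, k=1)[0])
    return "".join(out)


counts = fit_next_char_counts("the theory of the thing is that the thaw came")
print(generate(counts, "t", 20, random.Random(0)))
```

A neural character-level model would replace the count lookup with a forward pass that outputs the next-character distribution, while the sampling loop itself would stay the same.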