Character-Level Language Model (LM)
A Character-Level Language Model (LM) is a text-string likelihood function that assigns a probability to a text string by modeling it as a sequence of text characters.
- AKA: Joint Probability Function for Characters.
- Context:
- It can range from being a Forward Character-Level Language Model to being a Backward Character-Level Language Model to being a Bi-Directional Character-Level Language Model.
- It can be produced by a Character-Level Language Model Training System (by solving a character-level LM training task).
- It can range from being a Unigram Character-Level Language Model, to being a Bigram Character-Level Language Model, to being a Trigram Character-Level Language Model, to being an n-Gram Character-Level Language Model.
- It can range from being an Unsmoothed Character-Level n-Gram Model to being a Smoothed Character-Level n-Gram Model (see the estimation sketch after this list).
- …
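As a concrete illustration of the estimation choices above, the following minimal Python sketch trains a trigram character-level LM by maximum likelihood, with an optional add-one (Laplace) smoothing parameter. The function name and toy corpus are hypothetical, not taken from any cited implementation.

```python
# Minimal sketch: a trigram character-level LM estimated by maximum
# likelihood, with optional add-one (Laplace) smoothing.
# The function name and toy corpus are illustrative assumptions.
from collections import Counter

def train_char_trigram_lm(text, smoothing=0.0):
    """Return P(char | two-character history) as nested dicts."""
    vocab = sorted(set(text))
    trigram_counts = Counter(text[i:i + 3] for i in range(len(text) - 2))
    history_counts = Counter(text[i:i + 2] for i in range(len(text) - 2))
    lm = {}
    for history, h_count in history_counts.items():
        denom = h_count + smoothing * len(vocab)
        lm[history] = {c: (trigram_counts[history + c] + smoothing) / denom
                       for c in vocab}
    return lm

corpus = "this is a phrase. this is another phrase."
unsmoothed = train_char_trigram_lm(corpus)               # MLE: unseen -> 0
smoothed = train_char_trigram_lm(corpus, smoothing=1.0)  # Laplace-smoothed
print(smoothed["th"])  # distribution over the character following "th"
```

With `smoothing=0.0` this is the unsmoothed maximum-likelihood estimate, under which unseen trigrams receive probability zero; any positive smoothing value redistributes mass to them.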
- Example(s):
- $f(\text{This is a phrase}) = LM_{3char}(\text{T}) \times LM_{3char}(\text{h} \mid \text{T}) \times LM_{3char}(\text{i} \mid \text{T, h}) \times LM_{3char}(\text{s} \mid \text{h, i}) \times \cdots \approx 0.00014$, where each factor conditions on (at most) the two preceding characters; a code sketch of this chain-rule scoring follows this list.
- a Maximum Likelihood-based Character-Level Language Model, such as an unsmoothed maximum-likelihood character-level LM.
- a Neural Network-based Character-Level Language Model.
- …
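The chain-rule factorization in the example above translates directly into code. The sketch below is a hedged illustration, assuming an unsmoothed trigram model estimated from a toy corpus and a hypothetical "~" start-of-string padding convention:

```python
# Minimal sketch: scoring a string with an unsmoothed trigram character
# LM via the chain rule, mirroring the factorization above. The "~"
# start-of-string padding and the toy corpus are illustrative choices.
from collections import Counter

corpus = "~~" + "this is a phrase. this is another phrase."
tri = Counter(corpus[i:i + 3] for i in range(len(corpus) - 2))
bi = Counter(corpus[i:i + 2] for i in range(len(corpus) - 2))

def score(text):
    """P(text) = prod_i P(c_i | c_{i-2} c_{i-1}), unsmoothed MLE."""
    prob = 1.0
    padded = "~~" + text
    for i in range(len(text)):
        history, c = padded[i:i + 2], padded[i + 2]
        prob *= tri[history + c] / bi[history] if bi[history] else 0.0
    return prob

print(score("this is a phrase"))  # a small positive probability
```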
- Counter-Example(s):
- a Word-Level Language Model, which assigns probabilities over word tokens rather than individual characters.
- See: Text Character.
References
2017
- (Brownlee, 2017) ⇒ Jason Brownlee. (2017). “How to Develop a Character-Based Neural Language Model in Keras.” In: Machine Learning Mastery Blog. https://machinelearningmastery.com/develop-character-based-neural-language-model-keras/
- QUOTE: ... The benefit of character-based language models is their small vocabulary and flexibility in handling any words, punctuation, and other document structure. This comes at the cost of requiring larger models that are slower to train. …
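In the spirit of that tutorial, a minimal Keras sketch of a character-based neural LM might look as follows. This is an assumption-laden illustration (the toy corpus, layer sizes, and hyperparameters are invented), not the tutorial's actual code; note how small the character vocabulary is compared with a word vocabulary.

```python
# Minimal sketch of a character-based neural LM in Keras, in the spirit
# of the tutorial quoted above (not its actual code; corpus, sizes, and
# hyperparameters are invented for illustration).
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense

text = "this is a phrase. this is another phrase."
chars = sorted(set(text))            # small character vocabulary
char_to_ix = {c: i for i, c in enumerate(chars)}

seq_len = 10
X = np.array([[char_to_ix[c] for c in text[i:i + seq_len]]
              for i in range(len(text) - seq_len)])
y = np.array([char_to_ix[text[i + seq_len]]
              for i in range(len(text) - seq_len)])

model = Sequential([
    Embedding(len(chars), 16),
    LSTM(64),
    Dense(len(chars), activation="softmax"),  # P(next char | history)
])
model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")
model.fit(X, y, epochs=20, verbose=0)
```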
2015b
- (Goldberg, 2015) ⇒ Yoav Goldberg. (2015). “The Unreasonable Effectiveness of Character-level Language Models (and Why RNNs Are Still Cool).” In: Blog Post.
- QUOTE: ... In what follows I will briefly describe these character-level maximum-likelihood language models, which are much less magical than RNNs and LSTMs, and show that they too can produce a rather convincing Shakespearean prose. I will also show about 30 lines of python code that take care of both training the model and generating the output. Compared to this baseline, the RNNs may seem somewhat less impressive. ...
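The sketch below loosely reconstructs the kind of model Goldberg describes (train by counting character n-grams, then generate by sampling one character at a time); it is not his code, and the corpus filename is a hypothetical placeholder.

```python
# Loose sketch of the idea Goldberg describes (not his actual code):
# train a maximum-likelihood character n-gram LM by counting, then
# generate text by sampling one character at a time.
import random
from collections import Counter, defaultdict

def train(text, order=4):
    lm = defaultdict(Counter)
    padded = "~" * order + text   # "~" as an illustrative start pad
    for i in range(len(text)):
        history, c = padded[i:i + order], padded[i + order]
        lm[history][c] += 1
    return lm

def generate(lm, order=4, n_chars=300):
    history, out = "~" * order, []
    for _ in range(n_chars):
        counts = lm[history]
        if not counts:            # dead end (e.g., corpus-final history)
            break
        c = random.choices(list(counts), weights=list(counts.values()))[0]
        out.append(c)
        history = history[1:] + c
    return "".join(out)

# "shakespeare.txt" is a hypothetical corpus file.
lm = train(open("shakespeare.txt").read())
print(generate(lm))
```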
2015a
- (Karpathy, 2015) ⇒ Andrej Karpathy. (2015). “The Unreasonable Effectiveness of Recurrent Neural Networks.” In: Blog Post, May 21, 2015.
- QUOTE: … Okay, so we have an idea about what RNNs are, why they are super exciting, and how they work. We’ll now ground this in a fun application: We’ll train RNN character-level language models. That is, we’ll give the RNN a huge chunk of text and ask it to model the probability distribution of the next character in the sequence given a sequence of previous characters. This will then allow us to generate new text one character at a time. …
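A hedged sketch of the generation loop Karpathy describes, continuing from the Keras sketch above (it reuses that sketch's hypothetical `model`, `chars`, `char_to_ix`, and `seq_len`): repeatedly feed the recent history to the model and sample from the predicted next-character distribution.

```python
# Minimal sketch of the generation loop Karpathy describes: feed the
# recent history to the model, get a next-character distribution, and
# sample one character from it. Reuses `model`, `chars`, `char_to_ix`,
# and `seq_len` from the Keras sketch above (all hypothetical names).
import numpy as np

def sample_text(model, seed, n_chars=100):
    out = list(seed)
    for _ in range(n_chars):
        x = np.array([[char_to_ix[c] for c in out[-seq_len:]]])
        probs = model.predict(x, verbose=0)[0].astype("float64")
        probs /= probs.sum()  # renormalize against float32 rounding
        out.append(chars[np.random.choice(len(chars), p=probs)])
    return "".join(out)

print(sample_text(model, "this is a "))
```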
2003
- (Peng et al., 2003) ⇒ Fuchun Peng, Dale Schuurmans, Shaojun Wang, and Vlado Keselj. (2003). “Language Independent Authorship Attribution Using Character Level Language Models.” In: Proceedings of the Tenth Conference of the European Chapter of the Association for Computational Linguistics (EACL 2003). doi:10.3115/1067807.1067843
- QUOTE: We present a method for computer-assisted authorship attribution based on character-level n-gram language models. …