Character n-Gram Feature
A Character n-Gram Feature is a text token feature that tests whether a given sequence of n contiguous characters occurs as a substring of the token.
- …
- Example(s):
- a Character 3-Gram Feature, such as f(“rko”, “Markov”) ⇒ true.
- See: Token Prefix Feature, Token Suffix Feature.
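The example above can be sketched in Python as follows. This is a minimal illustration, assuming the feature is a case-sensitive binary test for a contiguous substring; the function names `char_ngrams` and `has_char_ngram` are hypothetical and do not come from the cited references.

```python
def char_ngrams(token: str, n: int) -> list[str]:
    """Return every contiguous substring of length n in token, left to right."""
    if n <= 0:
        raise ValueError("n must be positive")
    # A token shorter than n yields an empty list.
    return [token[i:i + n] for i in range(len(token) - n + 1)]


def has_char_ngram(ngram: str, token: str) -> bool:
    """Binary feature (assumed form): is ngram a contiguous substring of token?"""
    return ngram in token


print(char_ngrams("Markov", 3))        # ['Mar', 'ark', 'rko', 'kov']
print(has_char_ngram("rko", "Markov"))  # True
```

Under the same assumption, a Token Prefix Feature or Token Suffix Feature would correspond to restricting the test to the first or last element of `char_ngrams(token, n)`.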
References
2008
- (Li & Khudanpur, 2008) ⇒ Zhifei Li, and Sanjeev Khudanpur. (2008). “Large-scale discriminative n-gram language models for statistical machine translation.” In: Proceedings of AMTA 2008.
2003
- (Klein et al., 2003) ⇒ Dan Klein, Joseph Smarr, Huy Nguyen, and Christopher D. Manning. (2003). “Named Entity Recognition with Character-level Models.” In: Proceedings of the seventh conference on Natural language learning at HLT-NAACL 2003 - Volume 4. doi:10.3115/1119176.1119204