fastText Library
A fastText Library is an open-source NLP library developed by Facebook AI Research (FAIR).
- Context:
- It can be implemented by a fastText System.
- Example(s):
- fastText, v0.9.2 [1] (~2020-04-20)
- fastText, v0.2.0 [2] (~2018-12-19)
- fastText, v0.1.0 [3] (~2017-12-01)
- …
- Counter-Example(s):
- See: PyText Framework, Word Meaning Classification.
References
2018
- https://github.com/facebookresearch/fastText#introduction
- QUOTE: fastText is a library for efficient learning of word representations and sentence classification.
2017a
- https://research.fb.com/fasttext/
- QUOTE: Understanding the meaning of words that roll off your tongue as you talk, or your fingertips as you tap out posts is one of the biggest technical challenges facing artificial intelligence researchers. But it is an essential need. Automatic text processing forms a key part of the day-to-day interaction with your computer; it’s a critical component of everything from web search and content ranking to spam filtering, and when it works well, it’s completely invisible to you. With the growing amount of online data, there is a need for more flexible tools to better understand the content of very large datasets, in order to provide more accurate classification results.
To address this need, the Facebook AI Research (FAIR) lab is open-sourcing fastText, a library designed to help build scalable solutions for text representation and classification. Our ongoing commitment to collaboration and sharing with the community extends beyond just delivering code. We know it’s important to share our learnings to advance the field, so have also published our research relating to fastText.
FastText combines some of the most successful concepts introduced by the natural language processing and machine learning communities in the last few decades. These include representing sentences with bag of words and bag of n-grams, as well as using subword information, and sharing information across classes through a hidden representation. We also employ a hierarchical softmax that takes advantage of the unbalanced distribution of the classes to speed up computation. These different concepts are being used for two different tasks: efficient text classification and learning word vector representations.
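The subword information mentioned in the quote above refers to representing each word by its character n-grams (Bojanowski et al., 2017): the word is wrapped in boundary markers so prefixes and suffixes are distinguished, and the whole wrapped word is kept as an extra feature. A minimal sketch in plain Python (an illustration of the idea only, not fastText's C++ implementation; the function name `char_ngrams` is ours):

```python
def char_ngrams(word, n_min=3, n_max=6):
    """Extract the character n-grams used as subword features.

    The word is wrapped in '<' and '>' boundary markers so that
    prefix and suffix n-grams differ from inner ones; the full
    wrapped word is also kept as its own feature.
    """
    wrapped = f"<{word}>"
    ngrams = set()
    for n in range(n_min, n_max + 1):
        for i in range(len(wrapped) - n + 1):
            ngrams.add(wrapped[i:i + n])
    ngrams.add(wrapped)  # special sequence for the whole word
    return ngrams

# Example from the paper: trigrams of "where"
print(char_ngrams("where", 3, 3))
# → {'<wh', 'whe', 'her', 'ere', 're>', '<where>'}
```

In the actual model, a word's vector is the sum of the vectors of its n-grams, which lets fastText produce embeddings for words never seen during training.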
2017b
- (Bojanowski et al., 2017) ⇒ Piotr Bojanowski, Edouard Grave, Armand Joulin, and Tomas Mikolov. (2017). “Enriching Word Vectors with Subword Information.” In: Transactions of the Association for Computational Linguistics, 5.
2017c
- (Joulin et al., 2017) ⇒ Armand Joulin, Edouard Grave, Piotr Bojanowski, and Tomas Mikolov. (2017). “Bag of Tricks for Efficient Text Classification.” In: Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2017), Volume 2: Short Papers.