gensim Text Analysis Package
A gensim Text Analysis Package is a Python-based NLP library.
- Context:
- It can include a word2vec-like System [1].
- It can support gensim-based Programs.
- It can (typically) support Text Analytics Tasks.
- Example(s):
- gensim 3.4.0 [2] (~2018-03-01).
- gensim 2.2.0 [3] (~2017-06-21).
- gensim 0.13.1 [4] (~2016-06-24).
- …
- Counter-Example(s):
- See: MediaWiki XML Dump Parser.
References
2016
- https://pypi.python.org/pypi/gensim/0.13.1
- QUOTE:
- All algorithms are memory-independent w.r.t. the corpus size (can process input larger than RAM, streamed, out-of-core),
- Intuitive interfaces
- easy to plug in your own input corpus/datastream (trivial streaming API)
- easy to extend with other Vector Space algorithms (trivial transformation API)
- Efficient multicore implementations of popular algorithms, such as online Latent Semantic Analysis (LSA/LSI/SVD), Latent Dirichlet Allocation (LDA), Random Projections (RP), Hierarchical Dirichlet Process (HDP) or word2vec deep learning.
- Distributed computing: can run Latent Semantic Analysis and Latent Dirichlet Allocation on a cluster of computers.
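The "memory-independent" and "trivial streaming API" points in the quote above rest on one convention: a corpus is any iterable that yields one document at a time as a sparse bag-of-words vector, i.e. a list of (token_id, count) pairs. The following is a minimal sketch of that idea using only the Python standard library. It does not call gensim, and the names `StreamedCorpus` and `build_vocabulary` are hypothetical, chosen here for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, List, Tuple


def build_vocabulary(lines: Iterable[str]) -> Dict[str, int]:
    """Assign an integer id to each distinct lowercase token, in first-seen order."""
    token2id: Dict[str, int] = {}
    for line in lines:
        for tok in line.lower().split():
            token2id.setdefault(tok, len(token2id))
    return token2id


class StreamedCorpus:
    """Hypothetical helper: yields one sparse bag-of-words vector per document.

    Only the document currently being processed is held in memory, so the
    source can be a file object or generator larger than RAM.
    """

    def __init__(self, lines: Iterable[str], token2id: Dict[str, int]) -> None:
        self.lines = lines
        self.token2id = token2id

    def __iter__(self) -> Iterator[List[Tuple[int, int]]]:
        for line in self.lines:
            # Count known tokens by id; unknown tokens are skipped.
            counts = Counter(
                self.token2id[tok]
                for tok in line.lower().split()
                if tok in self.token2id
            )
            yield sorted(counts.items())


# Toy in-memory source; in practice this could be an open file, read line by line.
documents = [
    "human machine interface",
    "graph of trees",
    "graph minors trees graph",
]
token2id = build_vocabulary(documents)
corpus = StreamedCorpus(documents, token2id)
vectors = list(corpus)  # materialized here only to inspect the toy result
```

An object shaped like `corpus` (an iterable of lists of (id, count) pairs) is the kind of input a streaming transformation can consume one document at a time. The vocabulary pass here is a simplification, assumed for the sketch, that requires the source to be iterable twice.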
- QUOTE: