Document Corpus
A Document Corpus is a document set that serves as a corpus database and is composed of one or more corpus documents.
- AKA: Document Collection.
- Context:
  - It can (typically) be a Text Corpus.
  - …
- Example(s):
- Counter-Example(s):
- See: Recording Collection, Document-Oriented Database System.
References
2009
- (Yao et al., 2009) ⇒ Limin Yao, David Mimno, and Andrew McCallum. (2009). “Efficient Methods for Topic Model Inference on Streaming Document Collections.” In: Proceedings of ACM SIGKDD Conference (KDD-2009). doi:10.1145/1557019.1557121
  - QUOTE: … With today's large-scale, constantly expanding document collections, it is useful to be able to infer topic distributions for new documents without retraining the model.