Large Text Document Corpus

From GM-RKB

A Large Text Document Corpus is a corpus that is a large dataset (one that requires significant resources to be processed by a machine but can still fit in large memory banks).

Context:
- It can fit into the memory of a very large computer.
- It can range from being a Relatively Large Corpus to being a Very Large Corpus.
Example(s):
- a Large Text Corpus, such as the Genia Corpus, a TREC Corpus, or the KDD-2009 Abstracts Corpus.
- …
Counter-Example(s):
- a Small Corpus, such as the kdd09cma1 Corpus.
- any Large Corpora, such as a Web Snapshot.
See: Information Extraction Task, PubMed Corpus.
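
Illustration: the "can fit in large memory banks" criterion from the definition can be checked roughly by comparing a corpus's total size against a memory budget. The following is a minimal sketch under stated assumptions, not a definitive test: the function names, the `overhead_factor` parameter, and any budget value passed in are hypothetical, and on-disk size is only a proxy for the in-memory footprint.

```python
import os


def corpus_size_bytes(paths):
    """Return the total on-disk size, in bytes, of the given corpus files."""
    return sum(os.path.getsize(p) for p in paths)


def fits_in_memory(paths, memory_budget_bytes, overhead_factor=1.0):
    """Return True if the corpus is estimated to fit within the memory budget.

    overhead_factor is a hypothetical multiplier for the extra space that
    in-memory structures (tokens, indexes) may need beyond the raw text.
    """
    estimated = corpus_size_bytes(paths) * overhead_factor
    return estimated <= memory_budget_bytes
```

Under this sketch, a corpus for which `fits_in_memory` returns False for every available machine would fall outside the definition, as in the Web Snapshot counter-example.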

Concept