Inverted Index Data Structure

From GM-RKB

An Inverted Index Data Structure is an index data structure where each Index Data Record contains a set of labels and a pointer to a data record in a tabular data structure.



References

2011

  • http://en.wikipedia.org/wiki/Inverted_index
    • QUOTE: In computer science, an inverted index (also referred to as postings file or inverted file) is an index data structure storing a mapping from content, such as words or numbers, to its locations in a database file, or in a document or a set of documents. The purpose of an inverted index is to allow fast full text searches, at a cost of increased processing when a document is added to the database. The inverted file may be the database file itself, rather than its index. It is the most popular data structure used in document retrieval systems,[1] used on a large scale for example in search engines. Several significant general-purpose mainframe-based database management systems have used inverted list architectures, including ADABAS, DATACOM/DB, and Model 204.

      There are two main variants of inverted indexes: A record level inverted index (or inverted file index or just inverted file) contains a list of references to documents for each word. A word level inverted index (or full inverted index or inverted list) additionally contains the positions of each word within a document.[2] The latter form offers more functionality (like phrase searches), but needs more time and space to be created.
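
The two variants described in the quote above can be illustrated with a minimal Python sketch. It is not taken from the quoted source: the function names (`tokenize`, `build_record_level`, `build_word_level`, `phrase_search`), the whitespace tokenizer, and the three sample documents are all assumptions chosen for illustration.

```python
from collections import defaultdict


def tokenize(text):
    # Naive tokenizer (an assumption): lowercase, split on whitespace.
    return text.lower().split()


def build_record_level(docs):
    """Record-level index: word -> set of ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def build_word_level(docs):
    """Word-level index: word -> {doc_id: [positions of the word]}."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, word in enumerate(tokenize(text)):
            index[word][doc_id].append(pos)
    return index


def phrase_search(word_index, phrase):
    """Return ids of documents in which the words of `phrase` occur consecutively."""
    words = tokenize(phrase)
    if not words or any(w not in word_index for w in words):
        return set()
    # Only documents that contain every word can contain the phrase.
    candidates = set(word_index[words[0]])
    for w in words[1:]:
        candidates &= set(word_index[w])
    hits = set()
    for doc_id in candidates:
        # Try each position of the first word as the start of the phrase.
        for start in word_index[words[0]][doc_id]:
            if all(start + i in word_index[w][doc_id]
                   for i, w in enumerate(words)):
                hits.add(doc_id)
                break
    return hits


# Hypothetical sample documents, keyed by document id.
docs = {
    1: "it is what it is",
    2: "what is it",
    3: "it is a banana",
}
record_index = build_record_level(docs)
word_index = build_word_level(docs)
```

With these sample documents, `record_index["what"]` is `{1, 2}`, while `word_index["it"][1]` is `[0, 3]`. The stored positions are what make `phrase_search(word_index, "what is it")` possible: it returns `{2}`, even though document 1 also contains all three words. The record-level index alone cannot distinguish the two cases, which matches the trade-off noted in the quote: the word-level form supports phrase searches but stores more data.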

2007