Information Retrieval Algorithm

AKA: IR Technique.
Context:
- It can involve the conversion of a Document into a Document Vector.
- It can make use of a TF-IDF Ranking Function.
- It can make use of:
  - an Inverted Index.
  - a Statistical Language Model.
  - a Word Vector Space Model.
  - a Knowledge Base.
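
The Context items above (a Document Vector, a TF-IDF Ranking Function, an Inverted Index) can be illustrated together with a minimal sketch. Everything named here is an assumption made for illustration: the whitespace tokenizer, the raw-count term frequency, the `log(N / df)` form of IDF, the three toy documents, and the helper names `build_index`, `doc_vector`, and `rank`. Real systems typically add normalization, smoothing, and compressed postings lists.

```python
import math
from collections import Counter, defaultdict

# Toy corpus (hypothetical data, for illustration only).
docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "compression and indexing of documents",
]

def tokenize(text):
    # Assumption: lowercase whitespace tokenization is good enough here.
    return text.lower().split()

def build_index(documents):
    # Inverted index: term -> {doc_id: term frequency in that document}.
    index = defaultdict(dict)
    for doc_id, text in enumerate(documents):
        for term, tf in Counter(tokenize(text)).items():
            index[term][doc_id] = tf
    return index

def idf(index, term, n_docs):
    # One common IDF variant: log(N / df); 0.0 for unseen terms.
    df = len(index.get(term, {}))
    return math.log(n_docs / df) if df else 0.0

def doc_vector(index, doc_id, n_docs):
    # Sparse TF-IDF Document Vector: term -> weight.
    return {
        term: postings[doc_id] * idf(index, term, n_docs)
        for term, postings in index.items()
        if doc_id in postings
    }

def rank(index, query, n_docs):
    # Score each document by summing tf * idf over the query terms,
    # touching only the postings lists of those terms.
    scores = defaultdict(float)
    for term in tokenize(query):
        weight = idf(index, term, n_docs)
        for doc_id, tf in index.get(term, {}).items():
            scores[doc_id] += tf * weight
    # Highest score first; ties broken by document id.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

index = build_index(docs)
ranking = rank(index, "cat mat", len(docs))
```

With this toy corpus, the query "cat mat" ranks document 0 above document 1, because "mat" occurs in only one document and therefore carries a higher IDF weight than "cat"; document 2 shares no query term and is not scored at all.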
Example(s):
- a Word Vector Similarity Distance IR Algorithm.
- a Web Search Algorithm.
- …
Counter-Example(s):
See: Information Retrieval Discipline.

References

(Meadow et al., 2007) ⇒ Charles T. Meadow, Bert R. Boyce, Donald H. Kraft, and Carol L. Barry. (2007). “Text Information Retrieval Systems, 3rd edition.” Emerald Group Publishing. ISBN:0123694124

(Amitay et al., 2004) ⇒ Einat Amitay, Nadav Har'El, Ron Sivan, and Aya Soffer. (2004). “Web-a-where: geotagging web content.” In: Proceedings of the 27th ACM SIGIR Conference. (SIGIR 2004). http://dx.doi.org/10.1145/1008992.1009040

(Croft & Lafferty, 2003) ⇒ W. Bruce Croft, and John D. Lafferty, editors. (2003). “Language Modeling for Information Retrieval.” Kluwer Academic.
- In the past several years a new framework for information retrieval has emerged that is based on statistical language modeling. The approach differs from traditional probabilistic approaches in interesting and subtle ways, and is fundamentally different from vector space methods. It is striking that the language modeling approach to information retrieval was not proposed until the late 1990s; however, until recently the information retrieval and language modeling research communities were somewhat isolated.

(Shah et al., 2002) ⇒ Urvi Shah, Tim Finin, Anupam Joshi, R. Scott Cost, and James Matfield. (2002). “Information Retrieval on the Semantic Web.” In: Proceedings of the 11th International Conference on Information and Knowledge Management (CIKM 2002). doi:10.1145/584792.584868

(Witten et al., 1999) ⇒ Ian H. Witten, Alistair Moffat, and Timothy C. Bell. (1999). “Managing Gigabytes: compressing and indexing documents and images, 2nd Edition.” Morgan Kaufmann.
- Publisher's abstract: In this fully updated second edition of the highly acclaimed Managing Gigabytes, authors Witten, Moffat, and Bell continue to provide unparalleled coverage of state-of-the-art techniques for compressing and indexing data. Whatever your field, if you work with large quantities of information, this book is essential reading--an authoritative theoretical resource and a practical guide to meeting the toughest storage and access challenges. It covers the latest developments in compression and indexing and their application on the Web and in digital libraries. It also details dozens of powerful techniques supported by mg, the authors' own system for compressing, storing, and retrieving text, images, and textual images. mg's source code is freely available on the Web.
- Up-to-date coverage of new text compression algorithms such as block sorting, approximate arithmetic coding, and fat Huffman coding

(Salton et al., 1975) ⇒ Gerard M. Salton, A. Wong, and C. Yang. (1975). “A Vector Space Model for Automatic Indexing.” In: Communications of the ACM, 18(11).