Document Subject Indexing Task
A Document Subject Indexing Task is a document classification task that requires the mapping of a Document to one or more Subject Index Terms (from a Subject Heading List).
- AKA: Subject Heading Cataloging.
- Context:
- Input: a Document.
- Output: one or more Subject Headings (whose Labels may not appear as a Word Mention in the Document).
- It can support a Corpus Subject Index Creation Task.
- …
- Example(s):
- A Document may be cataloged as being about “Security” even though the term “Security” is not mentioned in the document.
- See: Single-Document Term Extraction Task, Indexing Task.
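The input/output contract above can be illustrated with a minimal sketch of assignment from a Subject Heading List. Everything named here is a hypothetical assumption made for illustration: the `SUBJECT_HEADINGS` vocabulary, its cue terms, the `index_document` function, and the `min_cues` threshold. It is not a method described by the sources below; it only shows how a heading such as “Security” could be assigned without its label occurring in the document.

```python
import re

# Hypothetical Subject Heading List: each heading maps to cue terms that
# suggest it. Both the headings and the cues are invented for illustration.
SUBJECT_HEADINGS = {
    "Security": {"firewall", "encryption", "password", "malware"},
    "Networking": {"router", "firewall", "bandwidth", "packet"},
    "Cooking": {"recipe", "oven", "simmer"},
}


def tokenize(text):
    """Return the set of lowercase alphabetic tokens in the text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def index_document(text, headings=SUBJECT_HEADINGS, min_cues=2):
    """Assign every heading that has at least `min_cues` cue terms in the text.

    The heading label itself does not need to occur in the document.
    """
    tokens = tokenize(text)
    assigned = [
        heading
        for heading, cues in headings.items()
        if len(tokens & cues) >= min_cues
    ]
    return sorted(assigned)


doc = "Rotate every password regularly and keep encryption keys behind the firewall."
print(index_document(doc))  # "Security" is assigned although the word never occurs
```

A practical indexer would more likely rely on expert judgment or a trained classifier than on cue-term overlap; the overlap rule is used here only because it keeps the mapping from Document to Subject Headings easy to inspect.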
References
2015
- (Wikipedia, 2015) ⇒ http://en.wikipedia.org/wiki/subject_indexing Retrieved:2015-7-8.
- Subject indexing is the act of describing or classifying a document by index terms or other symbols in order to indicate what the document is about, to summarize its content or to increase its findability. In other words, it is about identifying and describing the subject of documents. Indexes are constructed, separately, on three distinct levels: terms in a document such as a book; objects in a collection such as a library; and documents (such as books and articles) within a field of knowledge.
Subject indexing is used in information retrieval especially to create bibliographic databases to retrieve documents on a particular subject. Examples of academic indexing services are Zentralblatt MATH, Chemical Abstracts and PubMed. The index terms were mostly assigned by experts but author keywords are also common.
The process of indexing begins with any analysis of the subject of the document. The indexer must then identify terms which appropriately identify the subject either by extracting words directly from the document or assigning words from a controlled vocabulary.[1] The terms in the index are then presented in a systematic order.
Indexers must decide how many terms to include and how specific the terms should be. Together this gives a depth of indexing.
- [1] F. W. Lancaster (2003): "Indexing and Abstracting in Theory and Practice". Third edition. London, Facet. ISBN 1-85604-482-3. Page 6.
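The quoted passage describes indexing by extracting words directly from the document, choosing how many terms to include, and presenting the terms in a systematic order. The following is a rough sketch of that extraction route only, under stated assumptions: the `STOPWORDS` list, the `extract_index_terms` function, the frequency-based ranking, and the use of a `depth` parameter to stand for the number of terms are all hypothetical choices, not something the source prescribes. It does not model term specificity, the other component of indexing depth.

```python
import re
from collections import Counter

# Hypothetical stopword list; a real indexer would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "on", "by"}


def extract_index_terms(text, depth=3):
    """Extract up to `depth` index terms directly from the document text.

    Terms are ranked by frequency (ties broken alphabetically), and the
    selected terms are returned in alphabetical order.
    """
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]
    counts = Counter(words)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return sorted(term for term, _ in ranked[:depth])


sample = "Indexing of documents: indexing terms describe documents, and terms aid retrieval."
print(extract_index_terms(sample, depth=2))
print(extract_index_terms(sample, depth=4))
```

Raising `depth` yields a more exhaustive index at the cost of admitting less central terms, which is the trade-off the indexer has to weigh.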
2005
- (Anderson & Pérez-Carballo, 2005) ⇒ James D. Anderson, and José Pérez-Carballo. (2005). “Information Retrieval Design: principles and options for information description, organization, display, and access in information retrieval databases, digital libraries, and indexes.” Ometeca Institute.
- subject cataloging, subject indexing. Whereas descriptive cataloging and descriptive indexing focus on the surface features of texts and documents, subject cataloging and subject indexing focus on analysis, description and indexing of the content, purpose or meaning of messages, in other words, the topics or subjects of messages and texts. The description of certain non-topical features of messages, texts and documents is frequently included in subject cataloging and indexing as well. Examples include special audiences (books for children), special formats (poetry, fiction, dictionaries, periodicals, statistics), special aspects or approaches (history, case studies), special media (film, video recordings, audio recordings, world-wide web), etc. The goal is to identify and provide access to all important topics and features. The challenge, of course, is figuring out what is, or will be, important for future users!