Data-Item Annotation Task
A Data-Item Annotation Task is an annotation task for data items (a semantic analysis task that requires adding one or more annotation items to an artifact to demarcate its structure).
- Context:
- Input: a Data Item.
- Optional: Annotation Guidelines to ensure consistency and reliability across annotators and datasets.
- Output: an Annotated Artifact.
- Task Performance Metrics: Accuracy, Precision, Recall, F-Measure, Annotator Effort.
- It can (often) be a member of a Data Annotation Process (by implementing a data annotation algorithm).
- It can (often) be performed by a Data Annotator.
- It can be supported by a Data Annotation System (based on an annotation framework).
- It can range from being a Text Annotation Task (e.g. document annotation, wikitext annotation, software code annotation) to being an Image Annotation Task to being an Audio Annotation Task to being a Multimedia Annotation Task.
- It can range from being a Manual Data Annotation Task (with a human annotator) to being a Computer-Assisted Data Annotation Task (an interactive annotation system) to being a Fully-Automated Annotation Task (solved by a fully-automated annotation system).
- It can range from being a Structural Data Annotation Task (of syntactic structure) to being a Semantic Data Annotation Task (of semantic structure).
- It can range from being a Simple Data Annotation Task to being a Complex Data Annotation Task.
- It can range from being an Independent Data Annotation Task to being a Collaborative Data Annotation Task.
- It can range from being a Coarse Annotation Task to being a Granular Annotation Task.
- It can support a Data Curation Task.
- It can involve Metadata Creation.
- …
- Example(s):
- By Data Type:
- Text Annotation Task:
- Image Annotation Task: labeling objects in images, image segmentation
- Audio Annotation Task: transcribing speech, annotating sound events
- Video Annotation Task: annotating events, object tracking
- By Annotation Purpose:
- Tagging Task: assigning keywords/tags for retrieval or categorization
- Syntactic Annotation Task: POS Tagging for marking up parts of speech
- Semantic Annotation Task: Concept Mention Annotation, Knowledge Annotation
- Chatbot Answer Scoring Task: evaluating chatbot responses
- By Annotation Methodology:
- Manual Annotation Task: using human annotators
- Computer-Assisted Annotation Task: employing interactive annotation systems
- Fully-Automated Annotation Task: using fully-automated annotation systems
- Dataset Annotation Task: creating/enhancing machine learning datasets
- ...
- Domain-Specific Data Annotation Task, such as:
- …
- Counter-Example(s):
- Content Moderation.
- Physical-Item Annotation.
- Syntactic Parsing, which involves analyzing the syntactic structure of sentences rather than annotating.
- Editing Tasks, focused on correcting language rather than adding metadata or labels.
- Data Cleaning Tasks, which involve removing or correcting inaccuracies in data without necessarily adding annotations.
- See: Complex-Input Classification, Data Annotation, Machine Learning, Natural Language Processing.
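The input, output, and performance metrics listed in the Context above can be illustrated with a minimal sketch. It assumes a text data item, annotation items stored as labeled character spans, and exact-match scoring against a gold-standard annotation. The names `Annotation`, `AnnotatedArtifact`, and `score` are hypothetical and chosen only for illustration; real annotation systems use their own schemas and often apply looser matching criteria.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Annotation:
    """One annotation item: a labeled character span (hypothetical schema)."""
    start: int
    end: int
    label: str


@dataclass
class AnnotatedArtifact:
    """Task output: the original data item plus its annotation items."""
    data_item: str
    annotations: set = field(default_factory=set)


def score(predicted: AnnotatedArtifact, gold: AnnotatedArtifact) -> dict:
    """Exact-match precision, recall, and F-measure (F1) against gold annotations."""
    true_positives = len(predicted.annotations & gold.annotations)
    n_pred = len(predicted.annotations)
    n_gold = len(gold.annotations)
    precision = true_positives / n_pred if n_pred else 0.0
    recall = true_positives / n_gold if n_gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Illustrative data only: one data item, a gold annotation, and an
# annotator's output that mislabels the second span.
text = "Marie Curie worked in Paris."
gold = AnnotatedArtifact(
    text, {Annotation(0, 11, "PERSON"), Annotation(22, 27, "LOCATION")}
)
predicted = AnnotatedArtifact(
    text, {Annotation(0, 11, "PERSON"), Annotation(22, 27, "PERSON")}
)

result = score(predicted, gold)
print(result)
```

Storing annotations separately from the data item (a standoff representation) leaves the original item unchanged, which makes it straightforward to compare several annotators' outputs over the same item.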
References
2024
- https://www.theguardian.com/technology/article/2024/jul/06/mercy-anita-african-workers-ai-artificial-intelligence-exploitation-feeding-machine
- NOTES:
- Data annotation involves reviewing and labeling large volumes of data under strict performance targets and deadlines.
- Content moderators face exposure to disturbing and graphic content, resulting in severe psychological impacts.
- There is intense supervision and surveillance, with limited support for mental health and well-being.
2011
- (Wikipedia - Annotation, 2009) ⇒ http://en.wikipedia.org/wiki/Annotation
- For DNA annotation, a previously unknown sequence representation of genetic material is enriched with information relating genomic position to intron-exon boundaries, regulatory sequences, repeats, gene names and protein products. This annotation is stored in genomic databases such as Mouse Genome Informatics, FlyBase, and WormBase. Educational materials on some aspects of biological annotation from this year's Gene Ontology annotation camp and similar events are available at the Gene Ontology website. The National Center for Biomedical Ontology (www.bioontology.org) develops tools for automated annotation of database records based on the textual descriptions of those records.
In the digital imaging community the term annotation is commonly used for visible metadata superimposed on an image without changing the underlying master image, such as sticky notes, virtual laser pointers, circles, arrows, and black-outs (cf. redaction).
… legal publishers such as Thomson West and Lexis Nexis publish annotated versions of statutes, providing information about court cases that have interpreted the statutes. Both the federal United States Code and state statutes are subject to interpretation by the courts, and the annotated statutes are valuable tools in legal research.
In linguistics, annotations include comments and metadata; these non-transcriptional annotations are also non-linguistic. A collection of texts with linguistic annotations is known as a corpus (plural corpora). The Linguistic Annotation Wiki describes tools and formats for creating and managing linguistic annotations.
2009
- (WordNet, 2009) ⇒ http://wordnetweb.princeton.edu/perl/webwn?s=annotation
- S: (n) annotation, annotating (the act of adding notes)
- …
2009
- http://en.wiktionary.org/wiki/annotate
- To add annotation
2009
- http://en.wiktionary.org/wiki/annotation#Noun
- the process of writing such comment or commentary
- a critical or explanatory commentary or analysis.
- a comment added to a text
2006
- (Burkhardt et al., 2006) ⇒ Kyle Burkhardt, Bohdan Schneider, Jeramia Ory. (2006). “A Biocurator Perspective: Annotation at the Research Collaboratory for Structural Bioinformatics Protein Data Bank.” In: PLoS Computational Biology, 2(10). doi:10.1371/journal.pcbi.0020099
- QUOTE: The goal of annotation is to make each entry not only self-consistent but also consistent with the rest of the archive. To this end, annotators help authors represent their data in the best possible way. Annotators routinely review the incoming data and perform many standard inspections (see Box 1).
Box 1. Annotators Work to Represent PDB Data in the Best Possible Way by:
- Reviewing entry for self-consistency
- Matching given title to structure
- Correcting format errors in data and coordinates
- Checking sequence using BLAST [13]
- Inserting sequence database reference
- Providing protein name and synonyms
- Checking scientific name of the source organism
- Confirming chemical consistency between ligand name and the 3-D coordinates
- Adding information describing the biological assembly
- Checking entry visually
- Generating validation reports
- Finding citation references with PubMed [14]