Natural Language Processing (NLP) System Benchmark Task

A Natural Language Processing (NLP) System Benchmark Task is an AI benchmarking task that evaluates the performance of NLP systems.

Context:
- Task Input: an NLP Dataset.
- Task Output: the NLP System's processed data (e.g. translated or POS-tagged data).
- Task Requirement(s):
  - Benchmark Datasets,
  - Benchmark Performance Metrics,
  - an ML Training System (optional),
  - an NLP Baseline System,
  - NLP Competing System(s).
- It can (typically) be supported by an NLP Benchmark Corpus.
- It can (often) be an input to an NLP System Evaluation Task.
- ...
- It can range from being a Traditional NLP Benchmark (of rule-based NLP tasks) to being a Deep Learning NLP Benchmark (of neural network-based NLP tasks).
- It can range from being a Single-Task NLP Benchmark (of a focused linguistic task) to being a Multi-Task NLP Benchmark (of diverse linguistic tasks).
- It can range from being a Low-Resource NLP Benchmark (of limited data scenarios) to being a High-Resource NLP Benchmark (of large-scale data scenarios).
- It can range from being a Monolingual NLP Benchmark (of single language processing) to being a Multilingual NLP Benchmark (of cross-lingual processing).
- ...
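The task requirements listed above (a benchmark dataset, a performance metric, a baseline system, and one or more competing systems) can be sketched as a minimal harness. This is an illustrative sketch only: the toy POS-tagging dataset, the two systems, and the names `token_accuracy` and `run_benchmark` are all hypothetical and do not come from any real benchmark.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical types: a dataset is a list of (tokens, gold_tags) pairs,
# and an NLP system is any function from tokens to predicted tags.
Dataset = List[Tuple[List[str], List[str]]]
System = Callable[[List[str]], List[str]]

# Toy benchmark dataset (illustrative only, not a real corpus).
BENCHMARK_DATASET: Dataset = [
    (["the", "dog", "barks"], ["DET", "NOUN", "VERB"]),
    (["a", "cat", "sleeps"], ["DET", "NOUN", "VERB"]),
]


def baseline_system(tokens: List[str]) -> List[str]:
    """Baseline system: tag every token as NOUN."""
    return ["NOUN"] * len(tokens)


def competing_system(tokens: List[str]) -> List[str]:
    """Competing system: a toy rule-based tagger (assumed for illustration)."""
    determiners = {"the", "a", "an"}
    tags = []
    for token in tokens:
        if token in determiners:
            tags.append("DET")
        elif token.endswith("s"):
            tags.append("VERB")
        else:
            tags.append("NOUN")
    return tags


def token_accuracy(system: System, dataset: Dataset) -> float:
    """Benchmark performance metric: fraction of tokens tagged correctly."""
    correct = 0
    total = 0
    for tokens, gold_tags in dataset:
        predicted = system(tokens)
        correct += sum(p == g for p, g in zip(predicted, gold_tags))
        total += len(gold_tags)
    return correct / total if total else 0.0


def run_benchmark(systems: Dict[str, System], dataset: Dataset) -> Dict[str, float]:
    """Score every system on the same dataset with the same metric."""
    return {name: token_accuracy(system, dataset) for name, system in systems.items()}


results = run_benchmark(
    {"baseline": baseline_system, "competitor": competing_system},
    BENCHMARK_DATASET,
)
for name, score in results.items():
    print(f"{name}: {score:.3f}")
```

Scoring all systems on one shared dataset with one shared metric is what makes the results comparable; a real benchmark would replace the toy data and metric with its published dataset splits and official scoring script.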
Example(s):
- Syntactic Parsing Benchmark Task such as: Penn Treebank Projects.
- Semantic Relation Mention Recognition Benchmark Task such as: PPLRE Project or CPROD1 Task.
- Named Entity Recognition Benchmark Tasks.
- Coreference Resolution Benchmark Tasks.
- Question Answering Benchmark Tasks.
- Document Summarization Benchmark Task such as: DUC-2005 summarization.
- CoNLL Tasks.
- Natural Language Understanding (NLU) Benchmark Tasks, such as:
  - a GLUE Benchmark (Wang et al., 2018),
  - a SuperGLUE Benchmark (Wang et al., 2019),
  - an RTE Challenge (Bentivogli et al., 2017).
- a Linguistic Semantic Analysis Benchmark Task such as:
  - Semantic Textual Similarity Benchmark (STS-B) Task.
- Domain-Specific NLP Benchmarks, such as:
  - Law NLP Benchmarks.
- Machine Translation Benchmark Tasks such as: WMT Competition.
- Sentiment Analysis Benchmark Tasks such as: SemEval Sentiment Analysis Task.
- Text Classification Benchmark Tasks such as: Reuters-21578 Dataset.
- Language Model Benchmark Tasks such as: WikiText Benchmark and HaluEval.
- …
Counter-Example(s):
See: Natural Language Processing System, NLP Benchmark Diagnostic Dataset, Natural Language Translation Task, Natural Language Inference System, Lexical Entailment, Syntactic Parsing System, Morphological Analysis System, Word Sense Disambiguation, SemEval Task, LRE Map, Transfer Learning in NLP, Few-Shot Learning in NLP, Explainable AI in NLP.

References

2023

(Wikipedia, 2023) ⇒ https://en.wikipedia.org/wiki/Natural-language_programming Retrieved:2023-11-12.
- Natural-language programming (NLP) is an ontology-assisted way of programming in terms of natural-language sentences, e.g. English. A structured document with Content, sections and subsections for explanations of sentences forms an NLP document, which is actually a computer program. Natural language programming is not to be mixed up with natural language interfacing or voice control where a program is first written and then communicated with through natural language using an interface added on. In NLP the functionality of a program is organised only for the definition of the meaning of sentences. For instance, NLP can be used to represent all the knowledge of an autonomous robot. Having done so, its tasks can be scripted by its users so that the robot can execute them autonomously while keeping to prescribed rules of behaviour as determined by the robot's user. Such robots are called transparent robots [1] as their reasoning is transparent to users and this develops trust in robots. Natural language use and natural-language user interfaces include Inform 7, a natural programming language for making interactive fiction, Shakespeare, an esoteric natural programming language in the style of the plays of William Shakespeare, and Wolfram Alpha, a computational knowledge engine, using natural-language input. Some methods for program synthesis are based on natural-language programming. [2]

[1] Development of reliable and trustworthy robots. “transparent robots”
[2] Desai, Aditya, et al. “Program synthesis using natural language.” Proceedings of the 38th International Conference on Software Engineering. ACM, 2016.

