2023 GPT-NER: Named Entity Recognition via Large Language Models

From GM-RKB

Subject Headings: GPT-NER, NER Algorithm.

Notes

  • It transforms the NER task into a text generation task adapted for LLMs by generating output with marked entity boundaries rather than sequence labels.
  • It uses special tokens like "@@" and "##" to mark entity spans in the generated text, which is an easier format for LLMs to produce.
  • It introduces a self-verification stage to alleviate LLMs' tendency to overconfidently predict entities by asking if extracted entities belong to the labeled category.
  • It achieves comparable performance to supervised BERT baselines on 5 NER datasets, demonstrating LLMs can match heavily engineered models.
  • It shows much stronger performance compared to supervised models in low-resource scenarios with little labeled data, illustrating usefulness for real-world applications.
  • It has limitations, such as still lagging behind state-of-the-art supervised models on complex nested NER cases and being constrained by GPT-3's context length.
  • It makes a compelling case for transforming structured prediction problems into text generation tasks solvable by LLMs with appropriate reformatting and self-regulation techniques.
  • It uses the GPT-3 davinci-003 model as the backbone large language model for all experiments, relying on its text-generation and in-context few-shot learning abilities to adapt it to named entity recognition through the proposed GPT-NER approach.
  • It compares with the state-of-the-art ACE+document-context model achieving 94.6 F1 on CoNLL 2003 as a key benchmark.
  • It compares with BERT-MRC+DSC which obtains the highest 93.88 F1 score on the OntoNotes 5.0 dataset.
  • It uses BINDER, the current SOTA with 88.7 F1 on ACE2004, as a target baseline during evaluation.
  • It compares against the top-performing BINDER again for ACE2005, which has an 89.5 F1 state-of-the-art mark.
  • It aims to match the strong 83.75 F1 score of BERT-MRC, the existing best result reported on the GENIA dataset.
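The span-marking reformulation described in the notes above can be sketched with two small helpers. The function names `mark_entities` and `parse_entities` are illustrative, not from the paper; they show how a sequence-labeling target can be rendered as marked text for an LLM to generate, and how the marked output is decoded back into entity mentions.

```python
import re

def mark_entities(sentence: str, entities: list[str]) -> str:
    """Wrap each entity mention in @@ ... ## markers (the GPT-NER output format)."""
    for ent in entities:
        # Replace only the first occurrence; a fuller version would track offsets.
        sentence = sentence.replace(ent, f"@@{ent}##", 1)
    return sentence

def parse_entities(marked: str) -> list[str]:
    """Recover entity surface forms from a model's @@ ... ## marked output."""
    return re.findall(r"@@(.+?)##", marked)

demo = mark_entities("Columbus is a city", ["Columbus"])
# demo == "@@Columbus## is a city"
assert parse_entities(demo) == ["Columbus"]
```

In practice one such prompt-and-parse pass is run per entity type, so the markers never need to encode the category itself.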

Cited By

Quotes

Abstract

Despite the fact that large-scale Language Models (LLMs) have achieved SOTA performances on a variety of NLP tasks, their performance on NER is still significantly below supervised baselines. This is due to the gap between the two tasks: NER is a sequence-labeling task in nature, while LLMs are text-generation models. In this paper, we propose GPT-NER to resolve this issue. GPT-NER bridges the gap by transforming the sequence-labeling task into a generation task that can be easily adapted by LLMs, e.g., the task of finding location entities in the input text "Columbus is a city" is transformed to generating the text sequence "@@Columbus## is a city", where the special tokens @@ and ## mark the entity to extract. To efficiently address the "hallucination" issue of LLMs, where LLMs have a strong inclination to over-confidently label NULL inputs as entities, we propose a self-verification strategy that prompts LLMs to ask themselves whether the extracted entities belong to a labeled entity tag.

We conduct experiments on five widely adopted NER datasets, and GPT-NER achieves comparable performances to fully supervised baselines, which is, to the best of our knowledge, the first time this has been achieved. More importantly, we find that GPT-NER exhibits a greater ability in the low-resource and few-shot setups: when the amount of training data is extremely scarce, GPT-NER performs significantly better than supervised models. This demonstrates the capabilities of GPT-NER in real-world NER applications where the number of labeled examples is limited.
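As a rough illustration of the self-verification stage described in the abstract, the sketch below builds a yes/no prompt asking the LLM whether an extracted entity actually belongs to the labeled category. The exact template wording is an assumption for illustration, not the paper's verbatim prompt.

```python
def verification_prompt(sentence: str, entity: str, entity_type: str) -> str:
    """Build a self-verification query for one extracted entity.

    Hypothetical template: the paper's exact wording may differ. The LLM's
    yes/no answer decides whether the candidate entity is kept or dropped.
    """
    return (
        f"The task is to verify whether the word is a {entity_type} entity "
        f"extracted from the given sentence.\n"
        f"The input sentence: {sentence}\n"
        f'Is the word "{entity}" in the input sentence a {entity_type} entity? '
        f"Please answer with yes or no."
    )

prompt = verification_prompt("Columbus is a city", "Columbus", "location")
```

Filtering candidates through such a prompt is what counters the over-confident labeling of NULL inputs that the abstract calls "hallucination".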

References


Fei Wu, Jiwei Li, Shuhe Wang, Xiaofei Sun, Xiaoya Li, Rongbin Ouyang, Tianwei Zhang, Guoyin Wang. (2023). "GPT-NER: Named Entity Recognition via Large Language Models." doi:10.48550/arXiv.2304.10428