2016 AutomaticEntityRecognitionandTy

From GM-RKB

Subject Headings: Semi-Supervised NER.

Notes

Cited By

Quotes

Abstract

In today's computerized and information-based society, individuals are constantly presented with vast amounts of text data, ranging from news articles, scientific publications, and product reviews to a wide range of textual information from social media. To extract value from these large, multi-domain pools of text, it is of great importance to gain an understanding of entities and their relationships. In this tutorial, we introduce data-driven methods to recognize typed entities of interest in massive, domain-specific text corpora. These methods can automatically identify token spans as entity mentions in documents and label their fine-grained types (e.g., people, products, and food) in a scalable way. Since these methods do not rely on annotated data, predefined typing schemas, or hand-crafted features, they can be quickly adapted to a new domain, genre, or language. Using real datasets that span various genres (e.g., news articles, discussion forum posts, and tweets), domains (general vs. biomedical), and languages (e.g., English, Chinese, Arabic, and even low-resource languages like Hausa and Yoruba), we demonstrate how these typed entities aid in knowledge discovery and management.

References


2016 AutomaticEntityRecognitionandTy
Author: Heng Ji, Xiang Ren, Ahmed El-Kishky, Jiawei Han
Title: Automatic Entity Recognition and Typing in Massive Text Data
DOI: 10.1145/2882903.2912567
Year: 2016