2012 ASemiSupervisedApproachtoExtrac

Subject Headings: Multi-Token Entity Mention.

Notes

Cited By

Quotes

Author Keywords

Abstract

The paper describes a semi-supervised approach to extracting multiword units that belong to a specific semantic class of entities. The approach uses a small set of seed words representing the target class, and calculates distributional similarity between the candidate and seed words. We adapt a well-known document ranking function, BM25, to the task of calculating similarity between vectors of context features representing seed words and candidate words, and perform a systematic comparison to a number of distributional similarity measures. We then introduce a method for ranking multiword units by the likelihood of belonging to the target semantic class. The task used for evaluation is extraction of restaurant dish names from the corpus of 157,865 restaurant reviews.
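The idea of scoring a candidate word against a seed word with a BM25-style function can be illustrated with a minimal sketch. This is not the paper's formulation, which the abstract does not give: it simply applies the standard BM25 formula, treating the seed's context features as the "query" and the candidate's context-feature vector as the "document". The function name `bm25_similarity`, the parameter defaults `k1=1.2` and `b=0.75`, the smoothed IDF variant, and the toy feature counts are all assumptions made for illustration.

```python
import math
from collections import Counter


def bm25_similarity(seed, candidate, vectors, k1=1.2, b=0.75):
    """Score `candidate` against `seed` with the standard BM25 formula.

    `seed` and `candidate` are Counters mapping context features to counts;
    `vectors` is the list of all word vectors, used for IDF and average length.
    """
    n = len(vectors)
    avg_len = sum(sum(v.values()) for v in vectors) / n
    cand_len = sum(candidate.values())
    score = 0.0
    for feature in seed:
        tf = candidate.get(feature, 0)
        if tf == 0:
            continue  # features absent from the candidate contribute nothing
        # Number of word vectors that contain this feature.
        df = sum(1 for v in vectors if feature in v)
        # Smoothed IDF (the +1 inside the log keeps it positive).
        idf = math.log((n - df + 0.5) / (df + 0.5) + 1.0)
        # Length-normalised term-frequency saturation.
        norm = tf + k1 * (1 - b + b * cand_len / avg_len)
        score += idf * tf * (k1 + 1) / norm
    return score


# Hypothetical context-feature counts; not data from the paper.
toy_vectors = {
    "pizza": Counter({"ate_the": 3, "delicious": 2, "ordered": 1}),
    "pasta": Counter({"ate_the": 2, "delicious": 1, "ordered": 2}),
    "waiter": Counter({"rude": 2, "asked_the": 3, "ordered": 1}),
    "table": Counter({"reserved": 2, "asked_the": 1}),
}
all_vectors = list(toy_vectors.values())
seed = toy_vectors["pizza"]

# Rank the non-seed words by their similarity to the seed word.
ranking = sorted(
    (w for w in toy_vectors if w != "pizza"),
    key=lambda w: bm25_similarity(seed, toy_vectors[w], all_vectors),
    reverse=True,
)
print(ranking)
```

With several seed words, one simple option (again an assumption, not the paper's method) would be to sum or average the per-seed scores before ranking candidates.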

References

Key: 2012 ASemiSupervisedApproachtoExtrac
Author: Olga Vechtomova
Title: A Semi-supervised Approach to Extracting Multiword Entity Names from User Reviews
DOI: 10.1145/2379307.2379309