Toponym Normalization Algorithm

A Toponym Normalization Algorithm is an Entity Mention Normalization Algorithm that can solve a Toponym Normalization Task and be applied by a Toponym Normalization System.

AKA: Toponym Mention Normalization Algorithm.
Context:
- It can be a Domain Dependent Algorithm by making use of:
  - a Similarity Function that involves distance calculations based on Longitude and Latitude Data Values, and Background Knowledge that the Earth is fairly Round.
See: Subcellular Location Normalization Task.
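
The distance-based Similarity Function mentioned in the Context can be illustrated with a minimal sketch. It assumes the Earth is a sphere with a mean radius of 6371 km and uses the haversine great-circle formula. The function names `haversine_km` and `nearest_candidate`, the tuple layout of the candidates, and the idea of ranking candidates by distance to a single anchor coordinate are illustrative assumptions, not a method taken from the references below.

```python
import math

# Assumption: the Earth is treated as a sphere with its mean radius.
EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def nearest_candidate(candidates, anchor):
    """Pick the candidate referent closest to an anchor coordinate.

    candidates: list of (name, lat, lon) tuples (hypothetical gazetteer rows).
    anchor: (lat, lon) of a location already resolved in the same text.
    """
    anchor_lat, anchor_lon = anchor
    return min(
        candidates,
        key=lambda c: haversine_km(c[1], c[2], anchor_lat, anchor_lon),
    )


# Two referents for the ambiguous toponym "Cambridge" (approximate coordinates).
cambridge = [
    ("Cambridge, England", 52.2053, 0.1218),
    ("Cambridge, Massachusetts", 42.3736, -71.1097),
]
boston = (42.3601, -71.0589)
best = nearest_candidate(cambridge, boston)
```

In this sketch a smaller distance is read as higher similarity, so a mention of "Cambridge" in a text anchored near Boston is mapped to the Massachusetts referent. A real system would combine such a score with other evidence, such as the population data mentioned in (Leidner, 2007).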

References

2007

(Leidner, 2007) ⇒ Jochen L. Leidner. (2007). “Toponym Resolution in Text: Annotation, Evaluation and Applications of Spatial Grounding of Place Names.” PhD Thesis, The University of Edinburgh.
- This thesis investigates how referentially ambiguous spatial named entities can be grounded, or resolved, with respect to an extensional coordinate model robustly on open-domain news text by collecting a repertoire of linguistic heuristics and extra-linguistic knowledge sources such as population. I then investigate how to combine these sources of evidence to obtain a superior method. Noise effects introduced by the named entity tagging that toponym resolution relies on are also studied. While few attempts have been made to solve toponym resolution, these were either not evaluated, or evaluation was done by manual inspection of system output instead of creating a re-usable reference corpus. A systematic comparison leads to an inventory of heuristics and other sources of evidence. In order to carry out a comparative evaluation procedure, an evaluation resource is required, so a reference gazetteer and an associated novel reference corpus with human-labelled referent annotation were created for this thesis, to be used to benchmark a selection of the reconstructed algorithms and a novel re-combination of the heuristics catalogued in the inventory. Performance of the same resolution algorithms is compared under different conditions, namely applying it to the output of human named entity annotation and automatic annotation using an existing Maximum Entropy sequence tagging model.

2003

(Li et al., 2003) ⇒ Huifeng Li, Rohini K Srihari, Cheng Niu, and Wei Li. (2003). “InfoXtract Location Normalization: a hybrid approach to geographic references in information extraction.” In: András Kornai and Beth Sundheim, editors, HLT-NAACL 2003 Workshop: Analysis of Geographic References.
(Leidner et al., 2003) ⇒ JL Leidner, G Sinclair, B Webber. (2003). “Grounding spatial named entities for information extraction and question answering.” In: Proceedings of the HLT-NAACL 2003 Workshop on Analysis of Geographic References.

2002

(Li et al., 2002) ⇒ Huifeng Li, Rohini K Srihari, Cheng Niu, and Wei Li. (2002). “Location Normalization for Information Extraction.” In: Nineteenth International Conference on Computational Linguistics (COLING 2002).
