2005 UsingSemNetworksForGeoInfRetr: Difference between revisions

* ([[2005_UsingSemNetworksForGeoInfRetr|Leveling et al., 2005]]) [[author::Johannes Leveling]], [[author::Sven Hartrumpf]], and [[author::Dirk Veiel]]. ([[year::2005]]). “Using Semantic Networks for Geographic Information Retrieval.” In: [[journal::Proceedings of Cross-Language Evaluation Forum]] ([[CLEF]] 2005). [http://dx.doi.org/10.1007/11878773 doi:10.1007/11878773]


<B>Subject Headings:</B> [[Semantic Network]], [[GeoCLEF Task]], [[Question-Answering System]], [[Toponym Record Set]].


== Notes ==
* Associated with http://www.clef-campaign.org/2005/working_notes/workingnotes2005/veiel05.pdf


== Cited By ==
* ~12 http://scholar.google.com/scholar?q=%22Using+Semantic+Networks+for+Geographic+Information+Retrieval%22+2006


== Quotes ==


=== Abstract ===
This paper describes [[our work]] for the participation at the [[GeoCLEF task]] of [[CLEF (2005)]]. We employ multilayered extended semantic networks for the representation of [[background knowledge]], queries, and documents for [[geographic information retrieval (GIR)]]. In [[our approach]], geographic concepts from the query network are expanded with concepts which are semantically connected via topological, directional, and proximity relations. We started with an existing geographic knowledge base [[represented as]] a semantic network and expanded it with concepts automatically extracted from the GEOnet Names Server.


Several experiments for GIR on German documents have been performed: a baseline corresponding to a traditional information retrieval approach; a variant expanding thematic, temporal, and geographic descriptors from the semantic network representation of the query; and an adaptation of a question answering (QA) algorithm based on semantic networks. The second experiment is based on a representation of the natural language description of a topic as a semantic network, which is achieved by a deep linguistic analysis. The semantic network is transformed into an intermediate representation of a database query explicitly representing thematic, temporal, and local restrictions. This experiment showed the best performance with respect to mean average precision: 10.53% using the topic title and description. The third experiment, adapting a QA algorithm, uses a modified version of the QA system InSicht. The system matches deep [[semantic representation]]s of queries or their equivalent or similar variants to semantic networks for document sentences.
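
The query expansion step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a toy network stored as typed edges; the place names, the relation labels, and the functions <code>build_index</code> and <code>expand</code> are hypothetical and are not taken from the paper or from MultiNet.

```python
from collections import defaultdict

# Hypothetical toy network: (concept, relation type, related concept).
# The data and relation labels are illustrative assumptions only.
EDGES = [
    ("Bavaria", "topological", "Munich"),      # Munich lies within Bavaria
    ("Bavaria", "topological", "Nuremberg"),
    ("Bavaria", "proximity", "Austria"),       # neighbouring region
    ("Bavaria", "directional", "Thuringia"),   # lies to the north
    ("Munich", "topological", "Schwabing"),
]


def build_index(edges):
    """Map each concept to its outgoing (relation, target) pairs."""
    index = defaultdict(list)
    for source, relation, target in edges:
        index[source].append((relation, target))
    return index


def expand(concept, index, relations, max_depth=1):
    """Return the concept plus every concept reachable within max_depth
    hops over edges whose relation type is in `relations`."""
    seen = {concept}
    frontier = [concept]
    for _ in range(max_depth):
        next_frontier = []
        for node in frontier:
            for relation, target in index.get(node, []):
                if relation in relations and target not in seen:
                    seen.add(target)
                    next_frontier.append(target)
        frontier = next_frontier
    return seen


if __name__ == "__main__":
    index = build_index(EDGES)
    print(sorted(expand("Bavaria", index, {"topological"})))
    print(sorted(expand("Bavaria", index, {"topological", "proximity"}, max_depth=2)))
```

Restricting the relation types and the hop count keeps the expanded query from drifting too far from the original geographic concept; how the authors bounded their expansion is not stated in the quoted text.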


{{#ifanon:|
{{#ifanon:|
* grounding entities (i.e. connecting them to the model) and interpreting coordinates.


After identifying [[toponym]]s in queries and documents, coordinates can be assigned to them. In GIR, assigning a relevance score to a document for a given query typically involves calculating the distance between
* <i>Toponyms in different languages.</i> The translation of [[toponym]]s plays an important role even for monolingual retrieval when different and external information resources are integrated. In gazetteers, mostly English names are used.
* <i>Name variants.</i> The same geographic object can be referenced by endonymic names, exonymic names, and historical names. An endonym is a local name for a geographic entity, e.g. “Wien”, “Köln”, and “Milano”. An exonym is a place name in a certain language for a geographic object that lies outside the region where this language has an official status; for example, “Vienna” and “Cologne” are the English exonyms for “Wien” and “Köln”, respectively, and “Mailand” is the German exonym for “Milano”. Examples of historical names or traditional names are “New Amsterdam” for “New York” and “Cöllen” for “Köln”. For GIR, name variants should be conflated.
* <i>Composite names.</i> Composite names or complex named entities consist of two or more words. Frequently, appositions are considered to be a part of a name. For example, there is no need for the translation of the word “mount” in “Mount Cook”, but “Insel” is typically translated in the expression “Insel Sylt”/“island of Sylt”. For NER, certain rules have to be established how composite names are normalized. In some composite names, two or more [[toponym]]s (geographic names) are employed in reference to a single entity, e.g. “Frankfurt/Oder” or “Haren (Ems)”. While additional [[toponym]]s in a context allow for a better disambiguation, such composite names require a normalization, too.
* <i>Semantic relations between [[toponym]]s and related concepts.</i> In GIR, semantic relations between [[toponym]]s and related concepts are often ignored. Concepts related to a toponym such as the language, inhabitants of a place, properties (adjectives), or phrases (“former Yugoslavia”) are not considered in geographic tagging. For example, the toponym “Scotland” can be inferred for occurrences of “Scottish”, “Scotsman”, or “Scottish districts”.
* <i>Temporal changes in toponyms.</i> Not all geographic concepts are static. For example, wars and treaties affect what a geographic name represents, e.g. “the EU” refers to a different region after its expansion. This is an indication that temporal and spatial restrictions should not be discussed separately.
* <i>Metonymic usage.</i> Toponyms are used ambiguously. For example, “Libya” occurs in the news corpus as a reference to the “Libyan government” (as in “Libya stated that . . . ”).


=== 5. Conclusion and Outlook ===
The MultiNet paradigm offers representational means useful for GIR. [[We]] successfully employed semantic networks to uniformly represent queries, documents, and geographic [[background knowledge]] and to connect to external resources like GNS data. Three different approaches have been investigated: a baseline corresponding to a traditional IR approach; a variant expanding thematic, temporal, and geographic descriptors from the MultiNet representation of the query; and an adaptation of InSicht, a QA algorithm based on semantic networks. The diversity of [[our approach]]es looks promising for a combined system. [[We]] will continue research in the problem areas described in Sect. 2: improving NER, connecting semantic networks and databases, expanding geographic [[background knowledge]], and investigating the role of semantic relations in geographic queries. In IR, there are methods that successfully treat polysemy and synonymy for terms. It remains to be analyzed whether such methods successfully treat polysemy and synonymy for [[toponym]]s in GIR, too.


}}
}}
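
The conflation of name variants called for in the quoted notes can be sketched as a lookup from every known variant to one canonical name. The table below reuses only the examples quoted above; the choice of the endonym as canonical form and the functions <code>build_lookup</code> and <code>conflate</code> are assumptions for illustration, not the authors' implementation.

```python
# Hypothetical variant table built from the quoted examples. A real system
# would derive such a mapping from a gazetteer rather than hard-code it.
VARIANTS = {
    "Köln": ["Cologne", "Cöllen"],
    "Wien": ["Vienna"],
    "Milano": ["Mailand"],
    "New York": ["New Amsterdam"],
}


def build_lookup(variants):
    """Map the case-folded form of every name to its canonical name."""
    lookup = {}
    for canonical, names in variants.items():
        for name in [canonical, *names]:
            lookup[name.casefold()] = canonical
    return lookup


def conflate(toponym, lookup):
    """Return the canonical name for a known variant, else the input unchanged."""
    return lookup.get(toponym.strip().casefold(), toponym)


if __name__ == "__main__":
    lookup = build_lookup(VARIANTS)
    for name in ["Cologne", "Mailand", "New Amsterdam", "Hagen"]:
        print(name, "->", conflate(name, lookup))
```

A flat lookup of this kind ignores ambiguity and temporal change, both of which the quoted list names as separate problems.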


== References ==
* Jones, C.B., Purves, R., Ruas, A., Sanderson, M., Sester, M., van Kreveld, M.J., Weibel, R.: Spatial information retrieval and geographical ontologies – an overview of the SPIRIT project. In: SIGIR 2002. (2002) 387–388
* Kunze, C., Wagner, A.: Anwendungsperspektiven des GermaNet, eines lexikalisch-semantischen Netzes für das Deutsche. In Lemberg, I., Schröder, B., Storrer, A., eds.: Chancen und Perspektiven computergestützter Lexikographie. Volume 107 of Lexicographica Series Maior. Niemeyer, Tübingen, Germany (2001) 229–246
* Hartrumpf, S.: Hybrid Disambiguation in Natural Language Analysis. Der Andere Verlag, Osnabrück, Germany (2003)
* Helbig, H.: Knowledge Representation and the Semantics of Natural Language. Springer, Berlin (2006)
* Leveling, J., Hartrumpf, S.: University of Hagen at CLEF 2004: Indexing and translating concepts for the GIRT task. In Peters, C., Clough, P., Jones, G.J.F., Gonzalo, J., Kluck, M., [[Bernardo Magnini]], eds.: Multilingual Information Access for Text, Speech and Images: 5th Workshop of the Cross-Language Evaluation Forum, CLEF 2004. Volume 3491 of LNCS. Springer, Berlin (2005) 271–282
* Gey, F., Larson, R., Sanderson, M., Joho, H., Clough, P., Petras, V.: GeoCLEF: the CLEF 2005 cross-language geographic information retrieval track overview. This volume
* Fonseca, F.T., Egenhofer, M.J., Agouris, P., Câmara, G.: Using ontologies for integrated geographic information systems. Transactions in Geographic Information Systems 6(3) (2002) 231–257
* Hammer, S., Dickmeiss, A., Levanto, H., Taylor, M.: Zebra – User’s Guide and Reference, Copenhagen. (2005)
* Leveling, J.: University of Hagen at CLEF 2003: Natural language access to the GIRT4 data. In Peters, C., Gonzalo, J., Braschler, M., Kluck, M., eds.: Comparative Evaluation of Multilingual Information Access Systems: 4th Workshop of the Cross-Language Evaluation Forum, CLEF 2003. Volume 3237 of LNCS. Springer, Berlin (2004) 412–424
* Hartrumpf, S.: Question answering using sentence parsing and semantic network matching. In Peters, C., Clough, P., Jones, G.J.F., Gonzalo, J., Kluck, M., [[Bernardo Magnini]], eds.: Multilingual Information Access for Text, Speech and Images: 5th Workshop of the Cross-Language Evaluation Forum, CLEF 2004. Volume 3491 of LNCS. Springer, Berlin (2005) 512–521


__NOTOC__

Latest revision as of 21:24, 17 December 2020

{| class="wikitable"
! !! Author !! volume !! Date Value !! title !! type !! journal !! titleUrl !! doi !! note !! year
|-
| 2005 UsingSemNetworksForGeoInfRetr
| Johannes Leveling<br>Sven Hartrumpf<br>Dirk Veiel
|
|
| Using Semantic Networks for Geographic Information Retrieval
|
|
|
| 10.1007/11878773
|
|
|}