2002 TimelyAndNonIntrusiveDocAnnot: Difference between revisions

From GM-RKB
(ContinuousReplacement)
Tag: continuous replacement
m (Text replacement - "ments]]" to "ment]]s")
 


=== 1. Introduction ===
The [[Research Project Goal|effort]] behind the [[Semantic Web|Semantic Web (SW)]] is to [[Semantic Annotation Task|add content]] to [[Web Document|web document]]s in order to access [[knowledge]] instead of [[unstructured material]], allowing knowledge to be managed in an automatic way. Much work is done on (1) the [[definition]] of [[standards for representation of knowledge]] (e.g. [[XML]], [[RDF]], [[OIL]]), (2) the [[definition]] of [[structures for knowledge organization]] (e.g. [[ontologies]]) and (3) the [[population]] of such [[knowledge structure]]s. (1) and (2) provide the necessary infrastructure for the [[Semantic Web]]. (3) requires methodologies for creating semantically enriched documents. It is reasonable to expect users to manually annotate new documents up to a certain degree, but annotation is a slow, time-consuming process that involves high costs. Therefore it is vital for the Semantic Web to produce automatic or semi-automatic methods for extracting information from web-related documents, either to help annotate new documents or to extract additional information from existing unstructured or partially structured documents. In this context, Information Extraction from texts (IE) is one of the most promising areas of Human Language Technologies for the Semantic Web. IE is an automatic method for locating important facts in electronic documents for subsequent use, e.g. for annotating documents or for information storing (such as populating an ontology with instances). In this perspective IE is the perfect support for knowledge identification and extraction from Web documents as it can, for example, provide support in document annotation either in an automatic way (unsupervised extraction of information) or in a semi-automatic way (e.g. as support for human annotators in locating relevant facts in documents, via information highlighting).


}}
}}

Latest revision as of 04:37, 24 June 2024

Subject Headings: Semantic Annotation Task.

Notes

Cited By

2003

Quotes

Abstract

The process of document annotation for the Semantic Web is complex and time consuming, as it requires a great deal of manual annotation. Information extraction from texts (IE) is a technology used by some of the most recent systems for actively supporting users in the process and reducing the burden of annotation. The integration of IE systems in annotation tools is quite a new development and in our opinion there is still the necessity of thinking about the impact of the IE system in the process of annotation. In this paper we discuss two main requirements for active annotation: timeliness and tuning of intrusiveness. Then we present and discuss a model of interaction that addresses the two issues and Melita, an annotation framework that implements such methodology.


2002 TimelyAndNonIntrusiveDocAnnot
Author: Fabio Ciravegna, Alexiei Dingli, Daniela Petrelli, Yorick Wilks
Title: Timely and Non-Intrusive Active Document Annotation via Adaptive Information Extraction
Journal: Proceedings of the Workshop on Semantic Authoring Annotation and Knowledge Management at European Conference on Artificial Intelligence
Title URL: http://ftp.informatik.rwth-aachen.de/Publications/CEUR-WS/Vol-100/Fabio Ciravegna-et-al.pdf
Year: 2002