WordFreak System
A WordFreak System is a text annotation framework.
- AKA: WordFreak, WordFreak Annotation System.
- Context:
- It is (typically) written in the Java Programming Language and supports human and automatic annotation of linguistic data in English, Chinese and Arabic.
- It can solve a WordFreak Annotation Task by implementing a WordFreak Annotation Algorithm.
- It can use Active Learning for solving linguistic annotation sub-tasks.
- It is an Extensible System.
- It consists of the following system components:
- a WordFreak Visualization System, which lets users view and perform annotations (Viewers) and provides annotators with Choosers;
- a WordFreak Task Definition System, a two-tiered system for defining new annotation task requirements;
- a WordFreak Automatic Annotator System that integrates Sentence Boundary Detectors, POS Taggers, Parsers, and Coreference Resolvers.
- Example(s):
- a WordFreak Version 2.2.3 - the default system's plug-in that can be used to perform a variety of annotation tasks and read files in WordFreak and Penn Treebank file formats;
- a WordFreak ACE Plugin - a system's plug-in that can be used to perform an ACE Coreference Annotation Task and read files in ACE Pilot 2.0.1 File Format;
- a WordFreak MUC Plugin - a system's plug-in that can be used to perform MUC Coreference Annotation Tasks and read files in the MUC SGML file format;
- a WordFreak OpenNLP Plugin - a system's plug-in that uses automatic taggings for sentence boundary detection, tokenization, POS tagging, syntactic chunking, full parsing, and name finding.
- as described in Morton & LaCivita (2003).
- …
- Counter-Example(s):
- See: OpenNLP System, GATE System, UIMA System.
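The component description above (automatic annotators plugged into a shared framework, with annotations kept separate from the text) can be sketched in Java. This is a minimal illustration only: `Span`, `AutoAnnotator`, `NaiveSentenceDetector`, `WhitespaceTokenizer` and `PluginPipelineSketch` are hypothetical names invented for this example and are not WordFreak's actual API, and the two annotators are deliberately naive stand-ins for real sentence detectors and tokenizers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical types for illustration only; NOT WordFreak's actual API.

/** A stand-off annotation: a typed label over a character span [start, end). */
class Span {
    final int start;
    final int end;
    final String type;

    Span(int start, int end, String type) {
        this.start = start;
        this.end = end;
        this.type = type;
    }
}

/** Hypothetical plug-in interface for an automatic annotator. */
interface AutoAnnotator {
    List<Span> annotate(String text);
}

/** Naive sentence boundary detector: ends a sentence at '.', '!' or '?'. */
class NaiveSentenceDetector implements AutoAnnotator {
    @Override
    public List<Span> annotate(String text) {
        List<Span> spans = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '.' || c == '!' || c == '?') {
                spans.add(new Span(start, i + 1, "sentence"));
                start = i + 1;
                // Skip whitespace so the next sentence starts at a non-space character.
                while (start < text.length() && Character.isWhitespace(text.charAt(start))) {
                    start++;
                }
                i = start - 1;
            }
        }
        if (start < text.length()) {
            spans.add(new Span(start, text.length(), "sentence"));
        }
        return spans;
    }
}

/** Naive tokenizer: every maximal run of non-whitespace characters is a token. */
class WhitespaceTokenizer implements AutoAnnotator {
    @Override
    public List<Span> annotate(String text) {
        List<Span> spans = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            while (i < text.length() && Character.isWhitespace(text.charAt(i))) i++;
            int start = i;
            while (i < text.length() && !Character.isWhitespace(text.charAt(i))) i++;
            if (i > start) spans.add(new Span(start, i, "token"));
        }
        return spans;
    }
}

class PluginPipelineSketch {
    /** Runs each plug-in over the text and collects all stand-off annotations. */
    static List<Span> run(String text, List<AutoAnnotator> plugins) {
        List<Span> all = new ArrayList<>();
        for (AutoAnnotator plugin : plugins) {
            all.addAll(plugin.annotate(text));
        }
        return all;
    }

    public static void main(String[] args) {
        String text = "WordFreak is a tool. It is extensible.";
        List<AutoAnnotator> plugins =
            Arrays.<AutoAnnotator>asList(new NaiveSentenceDetector(), new WhitespaceTokenizer());
        for (Span s : run(text, plugins)) {
            System.out.println(s.type + " [" + s.start + "," + s.end + ") "
                + text.substring(s.start, s.end));
        }
    }
}
```

Because each annotation only stores offsets into the text, the source text is never modified, and a new annotator can be added by implementing the one-method interface.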
References
2010
- (WordFreak SourceForge, 2010) ⇒ http://wordfreak.sourceforge.net
- WordFreak is a java based linguistic annotation tool designed to support human, and automatic annotation of linguistic data as well as employ active-learning for human correction of automatically annotated data.
2009
- (Wilcock, 2009) ⇒ Graham Wilcock. (2009). “Introduction to Linguistic Annotation and Text Analytics.” In: Synthesis Lectures on Human Language Technologies. Morgan & Claypool. doi:10.2200/S00194ED1V01Y200905HLT003 ISBN:1598297384
- QUOTE: There are many tools that can be used for linguistic annotation. We will use WordFreak (http://wordfreak.sourceforge.net/), a Java-based linguistic annotation tool designed to support both human and automatic annotation of linguistic data. WordFreak is briefly described by its developers Thomas Morton and Jeremy LaCivita in (Morton and LaCivita 2003). There is no user manual, so we will give detailed examples here.
We use WordFreak in order to gain practical experience of doing linguistic annotations by hand. That’s the only way to learn the difficulties involved in making decisions in linguistic annotations. Later, when we use statistical NLP tools, we will appreciate the speed and power of automatic annotations, by contrast with manual annotations.
(...)WordFreak creates stand-off XML annotations. We will describe the format and see examples in the following sections. Note that GATE and WordFreak deal with existing annotations differently.
2003
- (Morton & LaCivita, 2003) ⇒ Thomas Morton, and Jeremy LaCivita. (2003). “WordFreak: An Open Tool for Linguistic Annotation.” In: Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology: Demonstrations - Volume 4. doi:10.3115/1073427.1073436
- ABSTRACT: WordFreak is a natural language annotation tool that has been designed to be easy to extend to new domains and tasks. Specifically, a plug-in architecture has been developed which allows components to be added to WordFreak for customized visualization, annotation specification, and automatic annotation, without re-compilation. The APIs for these plug-ins provide mechanisms to allow automatic annotators or taggers to guide future annotation to supports active learning. At present WordFreak can be used to annotate a number of different types of annotation in English, Chinese, and Arabic including: constituent parse structure and dependent annotations, and ACE named-entity and coreference annotation. The Java source code for WordFreak is distributed under the Mozilla Public License 1.1 via SourceForge at: http://wordfreak.sourceforge.net. This site also provides screenshots, and a web deployable version of WordFreak.
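The abstract's remark that automatic annotators can guide future annotation to support active learning can be illustrated with a small Java sketch. It assumes one common strategy, uncertainty sampling, in which the items the tagger is least confident about are shown to the human annotator first; the source does not say which strategy WordFreak uses. `Proposal` and `ActiveLearningSketch` are hypothetical names invented here, not part of WordFreak, and the confidence values are made-up illustration data.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration of uncertainty sampling; NOT WordFreak's actual API.

/** An automatically proposed label with the tagger's confidence in [0, 1]. */
class Proposal {
    final String token;
    final String label;
    final double confidence;

    Proposal(String token, String label, double confidence) {
        this.token = token;
        this.label = label;
        this.confidence = confidence;
    }
}

class ActiveLearningSketch {
    /** Returns up to k proposals for human review, least confident first. */
    static List<Proposal> selectForReview(List<Proposal> proposals, int k) {
        List<Proposal> sorted = new ArrayList<>(proposals);
        sorted.sort(Comparator.comparingDouble((Proposal p) -> p.confidence));
        int limit = Math.max(0, Math.min(k, sorted.size()));
        return new ArrayList<>(sorted.subList(0, limit));
    }

    public static void main(String[] args) {
        List<Proposal> proposals = new ArrayList<>();
        // Made-up confidences, for illustration only.
        proposals.add(new Proposal("the", "DT", 0.99));
        proposals.add(new Proposal("bank", "NN", 0.55));
        proposals.add(new Proposal("runs", "VBZ", 0.60));
        for (Proposal p : selectForReview(proposals, 2)) {
            System.out.println(p.token + "/" + p.label + " confidence=" + p.confidence);
        }
    }
}
```

The input list is copied before sorting, so the tagger's original output order is preserved for any viewer that still needs it.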