2000 LTTTT
- (Grover et al., 2000) ⇒ Claire Grover, Colin Matheson, Andrei Mikheev, Marc Moens. (2000). “LT TTT - A flexible tokenisation tool.” In: Proceedings of LREC Conference (LREC 2000).
Subject Headings: Surface Word Segmentation Task, Tokenization Algorithm, Tokenization System, LT TTT Tokenization System.
Notes
- Superseded by the LT TTT2 System.
- URL: http://www.ltg.ed.ac.uk/software/ttt/ (no longer valid)
Quotes
Abstract
We describe LT TTT, a recently developed software system which provides tools to perform text tokenisation and mark-up. The system includes ready-made components to segment text into paragraphs, sentences, words and other kinds of token but, crucially, it also allows users to tailor rule-sets to produce mark-up appropriate for particular applications. We present three case studies of our use of LT TTT: named-entity recognition (MUC-7), citation recognition and mark-up and the preparation of a corpus in the medical domain. We conclude with a discussion of the use of browsers to visualise marked-up text.
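The abstract describes two things: ready-made segmentation into paragraphs, sentences and words, and user-tailored rule-sets that emit mark-up. As a rough illustration of that idea only (not LT TTT's actual rule language or API), the following Python sketch applies an ordered list of regular-expression rules and wraps each recognised token in a simple XML-style element; the rule names, patterns, and the `<w>` element are all hypothetical.

```python
import re

# Hypothetical sketch of rule-based tokenisation with inline mark-up,
# in the spirit of the abstract's description; this is NOT the LT TTT API.

# Ordered rules: the first pattern that matches at the current position wins.
# Rule names become the value of the "type" attribute in the output mark-up.
RULES = [
    ("abbrev", re.compile(r"(?:e\.g\.|i\.e\.|etc\.|Dr\.|Prof\.)")),
    ("number", re.compile(r"\d+(?:\.\d+)?")),
    ("word",   re.compile(r"[A-Za-z]+(?:-[A-Za-z]+)*")),
    ("punct",  re.compile(r"[.,;:!?()\"'-]")),
]

def tokenise(text):
    """Yield (token_type, token_string) pairs by applying the rules in order."""
    pos = 0
    while pos < len(text):
        if text[pos].isspace():          # skip whitespace between tokens
            pos += 1
            continue
        for name, pattern in RULES:
            match = pattern.match(text, pos)
            if match:
                yield name, match.group(0)
                pos = match.end()
                break
        else:                            # no rule matched: emit a single character
            yield "other", text[pos]
            pos += 1

def mark_up(text):
    """Wrap each token in a <w> element carrying its rule name, XML-style."""
    return " ".join(f'<w type="{t}">{s}</w>' for t, s in tokenise(text))

if __name__ == "__main__":
    print(mark_up("Dr. Grover released LT TTT in 2000, e.g. for MUC-7."))
```

Because the rules are tried in order, application-specific behaviour (for example, keeping abbreviations or citation strings as single tokens) can be obtained simply by adding or reordering rules, which mirrors the tailoring of rule-sets that the abstract emphasises.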
References
- Claire Grover, Andrei Mikheev, and Colin Matheson. (1999). “LT TTT Version 1.0: Text Tokenisation Software.” http://www.ltg.ed.ac.uk/software/ttt/
| | Author | volume | Date Value | title | type | journal | titleUrl | doi | note | year |
|---|---|---|---|---|---|---|---|---|---|---|
| 2000 LTTTT | Andrei Mikheev, Marc Moens, Claire Grover, Colin Matheson | | 2000 | LT TTT - A flexible tokenisation tool | | | http://www.ltg.ed.ac.uk/papers/00tttlrec.pdf | | | 2000 |