2004 ImprovingThePerfOfDictProtNER
- (Tsuruoka & Tsujii, 2004) ⇒ Yoshimasa Tsuruoka, Jun'ichi Tsujii. (2004). “Improving the Performance of Dictionary-based Approaches in Protein Name Recognition.” In: Journal of Biomedical Informatics, 37(6).
Subject Headings: Protein NER, Dictionary-based Algorithm.
Notes
Cited By
- (Cohen, 2005) ⇒ Aaron M. Cohen. (2005). “Unsupervised gene/protein named entity normalization using automatically extracted dictionaries.” In: Proceedings of the ACL-ISMB Workshop on Linking Biological Literature, Ontologies and Databases.
- Tsuruoka and Tsujii recently studied the use of dictionary-based approaches for protein name recognition (Tsuruoka and Tsujii, 2004), although they did not evaluate the normalization performance. They applied a probabilistic term variant generator to expand the dictionary, and a Bayesian contextual filter with a sub-sentence window size to classify the terms in the GENIA corpus as likely to represent protein names. Overall they obtained a precision of 71.1%, at a recall of 62.3% and an F-measure of 66.6%. Tsuruoka and Tsujii did not make use of curated database information, and instead split the GENIA corpus into training and test data sets of 1800 and 200 abstracts respectively, and extracted the tagged protein names from the training set to use as a dictionary. These results compare well to, being a bit below, other non-dictionary based methods applied to the GENIA corpus (Lee et al., 2004, Zhou et al., 2004).
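The pipeline described in the passage above (expand a dictionary with term variants, match it against text, then score with precision, recall and F-measure) can be illustrated with a minimal sketch. This is not the authors' implementation: `generate_variants` is a hypothetical rule-based stand-in for their probabilistic variant generator, the Bayesian contextual filter is omitted entirely, and all function names and the toy data are assumptions made for illustration.

```python
import re


def generate_variants(term):
    """Hypothetical stand-in for a variant generator: hyphen/space rewrites only."""
    variants = {
        term,
        term.replace("-", " "),   # "NF-kappa B" -> "NF kappa B"
        term.replace("-", ""),    # "IL-2"       -> "IL2"
        term.replace(" ", "-"),   # "NF-kappa B" -> "NF-kappa-B"
    }
    return {v.lower() for v in variants if v}


def build_dictionary(names):
    """Expand every seed name into its lower-cased variants."""
    expanded = set()
    for name in names:
        expanded |= generate_variants(name)
    return expanded


def tag(text, dictionary, max_len=4):
    """Greedy longest-match lookup over token n-grams (up to max_len tokens)."""
    tokens = re.findall(r"[A-Za-z0-9\-]+", text)
    found = []
    i = 0
    while i < len(tokens):
        end = None
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            if " ".join(tokens[i:i + n]).lower() in dictionary:
                end = i + n
                break
        if end is None:
            i += 1
        else:
            found.append(" ".join(tokens[i:end]))
            i = end
    return found


def precision_recall_f(predicted, gold):
    """Precision, recall and F-measure (harmonic mean) over sets of mentions."""
    predicted, gold = set(predicted), set(gold)
    tp = len(predicted & gold)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Toy usage: the seed names play the role of names extracted from a training set.
dictionary = build_dictionary(["NF-kappa B", "IL-2"])
mentions = tag("Activation of NF kappa B and IL2 was observed.", dictionary)
```

Without the variant expansion, neither "NF kappa B" nor "IL2" would match the seed entries exactly, which is the recall problem that dictionary expansion targets. A contextual filter would then be applied to the candidate mentions to recover precision.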
Quotes
Abstract
References
 | Author | volume | Date Value | title | type | journal | titleUrl | doi | note | year
---|---|---|---|---|---|---|---|---|---|---
2004 ImprovingThePerfOfDictProtNER | Jun'ichi Tsujii, Yoshimasa Tsuruoka | 37(6) | | Improving the Performance of Dictionary-based Approaches in Protein Name Recognition | | Journal of Biomedical Informatics | http://dx.doi.org/10.1016/j.jbi.2004.08.003 | 10.1016/j.jbi.2004.08.003 | | 2004