1998 TaggingInflectiveLanguagesPredi

Subject Headings: Morphological Analysis Task; Morphological Parsing Task

Notes

Cited By

Quotes

Abstract

The major obstacle in morphological (sometimes called morpho-syntactic, or extended POS) tagging of highly inflective languages, such as Czech or Russian, is - given the resources possibly available - the tagset size. Typically, it is in the order of thousands. Our method uses an exponential probabilistic model based on automatically selected features. The parameters of the model are computed using simple estimates (which makes training much faster than when one uses Maximum Entropy) to directly minimize the error rate on training data. The results obtained so far not only show good performance on disambiguation of most of the individual morphological categories, but they also show a significant improvement on the overall prediction of the resulting combined tag over a HMM-based tag n-gram model, using even substantially less training data.
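
As an illustration of the model form mentioned in the abstract, the following is a minimal sketch of an exponential (log-linear) model that disambiguates a single morphological category. It is not the authors' implementation: the feature templates, the weight values, the category (grammatical case), and the function name `exp_model_distribution` are all assumptions made for this example, and the weights are set by hand rather than estimated from training data.

```python
import math


def exp_model_distribution(candidates, context, weights):
    """Return p(y | context) over the candidate values of one category.

    The model has the exponential form p(y | x) = exp(sum_i w_i * f_i(y, x)) / Z(x),
    where each f_i is a binary feature. Here a feature is a hypothetical triple
    (context_key, context_value, predicted_value) that fires when both the
    context and the predicted value match.
    """
    scores = {}
    for y in candidates:
        # Sum the weights of all features that fire for this (y, context) pair.
        total = sum(
            w
            for (ctx_key, ctx_val, value), w in weights.items()
            if value == y and context.get(ctx_key) == ctx_val
        )
        scores[y] = math.exp(total)
    z = sum(scores.values())  # normalization over the candidate values only
    return {y: s / z for y, s in scores.items()}


# Hypothetical, hand-set weights (illustrative only, not taken from the paper).
weights = {
    ("prev_pos", "preposition", "locative"): 1.2,
    ("prev_pos", "preposition", "nominative"): -2.0,
    ("next_pos", "verb", "nominative"): 0.7,
}

# Hypothetical context and candidate case values for one ambiguous word form.
context = {"prev_pos": "preposition", "next_pos": "noun"}
candidates = ["nominative", "accusative", "locative"]

dist = exp_model_distribution(candidates, context, weights)
best = max(dist, key=dist.get)
print(best, dist)
```

Normalizing only over the candidate values offered for the word keeps the computation small even when the full tagset runs to thousands of combined tags, because each category is scored separately over a handful of values.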



References

Jan Hajič, and Barbora Hladká. (1998). "Tagging Inflective Languages: Prediction of Morphological Categories for a Rich, Structured Tagset." doi:10.3115/980845.980927