2006 MultilingualDependencyAnalysisw
- (McDonald et al., 2006) ⇒ Ryan McDonald, Kevin Lerman, and Fernando Pereira. (2006). “Multilingual Dependency Analysis with a Two-stage Discriminative Parser.” In: Proceedings of the Tenth Conference on Computational Natural Language Learning.
Subject Headings: Correlated Label, Dependency Parse.
Notes
Cited By
- http://scholar.google.com/scholar?q=%22Multilingual+dependency+analysis+with+a+two-stage+discriminative+parser%22+2006
- http://dl.acm.org/citation.cfm?id=1596276.1596317&preflayout=flat#citedby
Quotes
Abstract
We present a two-stage multilingual dependency parser and evaluate it on 13 diverse languages. The first stage is based on the unlabeled dependency parsing models described by McDonald and Pereira (2006) augmented with morphological features for a subset of the languages. The second stage takes the output from the first and labels all the edges in the dependency graph with appropriate syntactic categories using a globally trained sequence classifier over components of the graph. We report results on the CoNLL-X shared task (Buchholz et al., 2006) data sets and present an error analysis.
3. Label Classification
The simplest labeler would be to take as input an edge [math]\displaystyle{ (i, j) \in y }[/math] for sentence x and find the label with highest score,
[math]\displaystyle{ l_{(i,j)} = \arg\max_{l} \, s(l, (i,j), y, x) }[/math]
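The per-edge rule above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `label_edges_independently`, the label set, and the lookup-table scorer `toy_score` (standing in for s(l, (i, j), y, x)) are all hypothetical, and the dependence on the tree y and sentence x is assumed to be folded into the scorer.

```python
from typing import Callable, Dict, List, Tuple

Edge = Tuple[int, int]  # (head index i, dependent index j)


def label_edges_independently(
    edges: List[Edge],
    labels: List[str],
    score: Callable[[str, Edge], float],
) -> Dict[Edge, str]:
    """For every edge, pick the label with the highest score (per-edge argmax)."""
    return {edge: max(labels, key=lambda l: score(l, edge)) for edge in edges}


# Hypothetical scores standing in for a trained model.
TOY_SCORES = {
    ("SBJ", (2, 1)): 2.0, ("OBJ", (2, 1)): 0.5,
    ("SBJ", (2, 3)): 1.0, ("OBJ", (2, 3)): 1.5,
}


def toy_score(label: str, edge: Edge) -> float:
    """Look up a score; unseen (label, edge) pairs score 0."""
    return TOY_SCORES.get((label, edge), 0.0)


result = label_edges_independently([(2, 1), (2, 3)], ["SBJ", "OBJ"], toy_score)
print(result)
```

Each edge is decided in isolation here, so the labels of neighbouring edges cannot influence the choice.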
Doing this for each edge in the tree would produce the final output. Such a model could easily be trained using the provided training data for each language. However, it might be advantageous to know the labels of other nearby edges. For instance, if we consider a head [math]\displaystyle{ x_i }[/math] with dependents [math]\displaystyle{ x_{j_1}, \ldots, x_{j_M} }[/math], it is often the case that many of these dependencies will have correlated labels. To model this we treat the labeling of the edges [math]\displaystyle{ (i, j_1), \ldots, (i, j_M) }[/math] as a sequence labeling problem,
[math]\displaystyle{ (l_{(i,j_1)}, \ldots, l_{(i,j_M)}) = \bar{l} = \arg\max_{\bar{l}} \, s(\bar{l}, i, y, x) }[/math]
We use a first-order Markov factorization of the score
[math]\displaystyle{ \bar{l} = \arg\max_{\bar{l}} \sum_{m=2}^{M} s(l_{(i,j_m)}, l_{(i,j_{m-1})}, i, y, x) }[/math]
in which each factor is the score of labeling the adjacent edges [math]\displaystyle{ (i, j_m) }[/math] and [math]\displaystyle{ (i, j_{m-1}) }[/math] in the tree y. We attempted higher-order Markov factorizations but they did not improve performance uniformly across languages and training became significantly slower.
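With a first-order factorization, the highest-scoring label sequence for the dependents of one head can be found by Viterbi-style dynamic programming. The sketch below assumes this standard decoding; the quoted passage does not say which decoder the authors used. The names `viterbi_labels` and `toy_pair_score` and the label set are hypothetical, and `pair_score(m, prev, cur)` stands in for the factor that scores adjacent edges m−1 and m (0-based positions among the head's dependents), with the head, tree, and sentence assumed to be folded into it.

```python
from typing import Callable, Dict, List


def viterbi_labels(
    num_dependents: int,
    labels: List[str],
    pair_score: Callable[[int, str, str], float],
) -> List[str]:
    """Return the label sequence maximizing the sum of adjacent-pair scores."""
    if num_dependents == 0:
        return []
    # best[l]: best total score of a labeling of dependents 0..m that ends in l.
    # The sum starts at the second dependent, so all scores start at 0.
    best: Dict[str, float] = {l: 0.0 for l in labels}
    backpointers: List[Dict[str, str]] = []
    for m in range(1, num_dependents):
        new_best: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for cur in labels:
            # Best previous label given that dependent m gets label `cur`.
            prev = max(labels, key=lambda p: best[p] + pair_score(m, p, cur))
            new_best[cur] = best[prev] + pair_score(m, prev, cur)
            pointers[cur] = prev
        best = new_best
        backpointers.append(pointers)
    # Follow back-pointers from the best final label.
    sequence = [max(labels, key=lambda l: best[l])]
    for pointers in reversed(backpointers):
        sequence.append(pointers[sequence[-1]])
    sequence.reverse()
    return sequence


def toy_pair_score(m: int, prev_label: str, cur_label: str) -> float:
    """Hypothetical factor scores: penalize two adjacent subjects."""
    if prev_label == "SBJ" and cur_label == "SBJ":
        return -5.0
    if m == 1 and prev_label == "SBJ" and cur_label == "OBJ":
        return 2.0
    if m == 2 and cur_label == "ADV":
        return 1.0
    return 0.0


best_labels = viterbi_labels(3, ["SBJ", "OBJ", "ADV"], toy_pair_score)
print(best_labels)
```

Decoding costs O(M·L²) for M dependents and L labels. With a single dependent the sum is empty and every label ties, so this sketch returns the first label in the list; a practical system would presumably add boundary symbols or single-edge features to break that tie.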
References
| | Author | volume | Date Value | title | type | journal | titleUrl | doi | note | year |
|---|---|---|---|---|---|---|---|---|---|---|
| 2006 MultilingualDependencyAnalysisw | Ryan T. McDonald, Fernando Pereira, Kevin Lerman | | | Multilingual Dependency Analysis with a Two-stage Discriminative Parser | | | | | | |