2006 MultilingualDependencyAnalysisw

(McDonald et al., 2006) ⇒ Ryan McDonald, Kevin Lerman, and Fernando Pereira. (2006). “Multilingual Dependency Analysis with a Two-stage Discriminative Parser.” In: Proceedings of the Tenth Conference on Computational Natural Language Learning.

Subject Headings: Correlated Label, Dependency Parse.

Notes

Cited By

Quotes

Abstract

We present a two-stage multilingual dependency parser and evaluate it on 13 diverse languages. The first stage is based on the unlabeled dependency parsing models described by McDonald and Pereira (2006) augmented with morphological features for a subset of the languages. The second stage takes the output from the first and labels all the edges in the dependency graph with appropriate syntactic categories using a globally trained sequence classifier over components of the graph. We report results on the CoNLL-X shared task (Buchholz et al., 2006) data sets and present an error analysis.

3. Label Classification

The simplest labeler would be to take as input an edge [math]\displaystyle{ (i, j) \in y }[/math] for sentence x and find the label with highest score,

[math]\displaystyle{ l_{(i,j)} = \arg\max_{l} \; s(l, (i,j), y, x) }[/math]

Doing this for each edge in the tree would produce the final output. Such a model could easily be trained using the provided training data for each language. However, it might be advantageous to know the labels of other nearby edges. For instance, if we consider a head [math]\displaystyle{ x_i }[/math] with dependents [math]\displaystyle{ x_{j_1}, \ldots, x_{j_M} }[/math], it is often the case that many of these dependencies will have correlated labels. To model this we treat the labeling of the edges [math]\displaystyle{ (i, j_1), \ldots, (i, j_M) }[/math] as a sequence labeling problem,

[math]\displaystyle{ (l_{(i,j_1)}, \ldots, l_{(i,j_M)}) = \bar{l} = \arg\max_{\bar{l}} \; s(\bar{l}, i, y, x) }[/math]

We use a first-order Markov factorization of the score

[math]\displaystyle{ \bar{l} = \arg\max_{\bar{l}} \; \sum_{m=2}^{M} s(l_{(i,j_m)}, l_{(i,j_{m-1})}, i, y, x) }[/math]

in which each factor is the score of labeling the adjacent edges [math]\displaystyle{ (i, j_m) }[/math] and [math]\displaystyle{ (i, j_{m-1}) }[/math] in the tree y. We attempted higher-order Markov factorizations but they did not improve performance uniformly across languages and training became significantly slower.
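As an illustration of how the arg max over a first-order factorization can be computed, the sketch below decodes the labels of one head's dependents with the standard Viterbi recurrence. The excerpt above does not say which decoding procedure the authors used, so Viterbi is an assumption here, chosen because it is exact for first-order chains. The names `viterbi_label_siblings`, `pair_score`, `toy_score`, and the label set are hypothetical; `pair_score(m, cur, prev)` stands in for the factor that scores two adjacent edge labels, with the head i, tree y, and sentence x assumed to be captured by the closure.

```python
from typing import Callable, List, Sequence


def viterbi_label_siblings(
    num_deps: int,
    labels: Sequence[str],
    pair_score: Callable[[int, str, str], float],
) -> List[str]:
    """Return the label sequence maximizing the sum of adjacent-pair scores.

    pair_score(m, cur, prev) is a hypothetical stand-in for the factor
    s(l_(i,j_m), l_(i,j_{m-1}), i, y, x); m is the 0-based position of the
    current dependent among the head's dependents.
    """
    if num_deps == 0:
        return []
    # best[m][l]: best score of labeling dependents 0..m with dependent m labeled l.
    # The sum starts at the second dependent, so the first position scores 0.
    # (With a single dependent every label ties and the first one is returned.)
    best = [{label: 0.0 for label in labels}]
    back = []  # back[m-1][l]: best previous label when dependent m is labeled l
    for m in range(1, num_deps):
        scores, pointers = {}, {}
        for cur in labels:
            prev = max(labels, key=lambda p: best[-1][p] + pair_score(m, cur, p))
            scores[cur] = best[-1][prev] + pair_score(m, cur, prev)
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # Follow back-pointers from the best final label.
    sequence = [max(labels, key=lambda label: best[-1][label])]
    for pointers in reversed(back):
        sequence.append(pointers[sequence[-1]])
    return sequence[::-1]


# Toy scores (invented for illustration, not taken from the paper).
LABELS = ["SBJ", "OBJ", "NMOD"]


def toy_score(m: int, cur: str, prev: str) -> float:
    if prev == "SBJ" and cur == "OBJ":
        return 2.0   # a subject followed by an object is rewarded
    if prev == "SBJ" and cur == "SBJ":
        return -1.0  # two adjacent subjects are penalized
    if cur == "NMOD":
        return 0.5
    return 0.0


decoded = viterbi_label_siblings(3, LABELS, toy_score)
```

Decoding costs O(M·|L|²) per head for M dependents and |L| labels. A second-order factorization would multiply the state space by another factor of |L|, which is consistent with the slower training reported for higher-order models.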
