Predictor Feature Ablation Study
A Predictor Feature Ablation Study is an empirical analysis task that measures the contribution of predictor features to predictive performance, typically by removing one feature (or feature group) at a time and re-evaluating the model.
- …
- Counter-Example(s):
- See: Feature Selection Task.
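The procedure described above can be sketched as follows. This is a minimal illustration using scikit-learn on a synthetic dataset; the feature-group names and column grouping are hypothetical, chosen only to mirror the "remove one group at a time" protocol from the quoted studies.

```python
# Minimal sketch of a feature ablation study: score the full feature set,
# then re-score after removing each feature group and report the delta.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in dataset (12 features, binary target).
X, y = make_classification(n_samples=500, n_features=12, random_state=0)

# Hypothetical grouping of the 12 columns into three feature groups.
feature_groups = {
    "lexical": [0, 1, 2, 3],
    "orthographic": [4, 5, 6, 7],
    "distributional": [8, 9, 10, 11],
}

def score(columns):
    """Mean cross-validated accuracy using only the given columns."""
    model = LogisticRegression(max_iter=1000)
    return cross_val_score(model, X[:, columns], y, cv=5).mean()

all_columns = sorted(c for cols in feature_groups.values() for c in cols)
full_score = score(all_columns)
print(f"full feature set : {full_score:.3f}")

# Ablation: remove one group at a time and measure the performance drop.
for name, cols in feature_groups.items():
    kept = [c for c in all_columns if c not in cols]
    ablated = score(kept)
    print(f"without {name:15s}: {ablated:.3f} (delta {ablated - full_score:+.3f})")
```

A negative delta suggests the removed group was contributing to performance; a near-zero or positive delta suggests it was redundant or harmful under this setup.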
References
2011
- (Gimpel et al., 2011) ⇒ Kevin Gimpel, Nathan Schneider, Brendan O'Connor, Dipanjan Das, Daniel Mills, Jacob Eisenstein, Michael Heilman, Dani Yogatama, Jeffrey Flanigan, and Noah A. Smith. (2011). “Part-of-Speech Tagging for Twitter: Annotation, Features, and Experiments.” In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies.
- QUOTE: … We also show feature ablation experiments, each of which corresponds to removing one category of features from the full set ...
2008
- (Burkett & Klein, 2008) ⇒ David Burkett, and Dan Klein. (2008). “[www.aclweb.org/anthology-new/D/D08/D08-1092.pdf Two Languages Are Better Than One (for Syntactic Parsing)].” In: Proceedings of the Conference on Empirical Methods in Natural Language Processing.
- QUOTE: To verify that all our features were contributing to the model’s performance, we did an ablation study, removing one group of features at a time. Table 2 shows the [math]\displaystyle{ F_1 }[/math] scores on the bilingual development data resulting from training with each group of features removed.[1] Note that though head word features seemed to be detrimental in our rapid training setup, earlier testing had shown a positive effect, so we reran the comparison using our full training setup, where we again saw an improvement when including these features.
Table 2: Feature ablation study. [math]\displaystyle{ F_1 }[/math] on dev set after training with individual feature groups removed. Performance with individual baseline parsers included for reference.
- ↑ We do not have a test with the basic alignment features removed because they are necessary to compute a0(t, t0).