2011 BoundedCoordinateDescentforBiol


Subject Headings:

Notes

Cited By

Quotes

Author Keywords

Abstract

We present a framework for discriminative sequence classification where linear classifiers work directly in the explicit high-dimensional predictor space of all subsequences in the training set (as opposed to kernel-induced spaces). This is made feasible by employing a gradient-bounded coordinate-descent algorithm for efficiently selecting discriminative subsequences without having to expand the whole space. Our framework can be applied to a wide range of loss functions, including binomial log-likelihood loss of logistic regression and squared hinge loss of support vector machines. When applied to protein remote homology detection and remote fold recognition, our framework achieves comparable performance to the state-of-the-art (e.g., kernel support vector machines). In contrast to state-of-the-art sequence classifiers, our models are simply lists of weighted discriminative subsequences and can thus be interpreted and related to the biological problem - a crucial requirement for the bioinformatics and medical communities.
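The core idea in the abstract, selecting one discriminative subsequence at a time without expanding the whole feature space, can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions not stated in the abstract: features are binary presence indicators of contiguous substrings, the loss is the logistic loss with labels in {+1, -1}, the update uses a fixed step size, and candidate substrings are capped at a hypothetical `max_len`. The pruning rule relies on a simple fact: any extension of a substring `s` can only occur in sequences that contain `s`, so the absolute gradient of every extension is at most the larger of the positive-class and negative-class weight sums for `s`. All function names (`best_coordinate`, `fit`, `score`) are illustrative.

```python
import math

def example_weights(margins, labels):
    # Per-example logistic-loss gradient magnitude: 1 / (1 + exp(y_i * m_i)).
    return [1.0 / (1.0 + math.exp(y * m)) for m, y in zip(margins, labels)]

def logistic_loss(margins, labels):
    # Sum over examples of log(1 + exp(-y_i * m_i)).
    return sum(math.log1p(math.exp(-y * m)) for m, y in zip(margins, labels))

def best_coordinate(seqs, labels, weights, alphabet, max_len):
    """Depth-first search over substrings for the largest |gradient|."""
    best_sub, best_grad = None, 0.0
    stack = list(alphabet)
    while stack:
        s = stack.pop()
        pos = sum(w for x, y, w in zip(seqs, labels, weights) if y == 1 and s in x)
        neg = sum(w for x, y, w in zip(seqs, labels, weights) if y == -1 and s in x)
        grad = neg - pos  # d(loss)/d(beta_s) for binary presence features
        if abs(grad) > abs(best_grad):
            best_sub, best_grad = s, grad
        # Bound: no extension of s can have |gradient| above max(pos, neg),
        # so the subtree is skipped when it cannot beat the current best.
        if len(s) < max_len and max(pos, neg) > abs(best_grad):
            stack.extend(s + a for a in alphabet)
    return best_sub, best_grad

def fit(seqs, labels, alphabet, max_len=3, n_iter=20, step=0.5):
    """Greedy coordinate descent; returns {substring: weight}."""
    model = {}
    margins = [0.0] * len(seqs)
    for _ in range(n_iter):
        weights = example_weights(margins, labels)
        sub, grad = best_coordinate(seqs, labels, weights, alphabet, max_len)
        if sub is None:  # every gradient is zero, nothing left to improve
            break
        delta = -step * grad  # fixed step (an assumption of this sketch)
        model[sub] = model.get(sub, 0.0) + delta
        margins = [m + delta if sub in x else m for m, x in zip(margins, seqs)]
    return model

def score(model, seq):
    # Linear score: sum of weights of the model substrings present in seq.
    return sum(w for sub, w in model.items() if sub in seq)

# Toy data (made up for illustration): the motif "AB" marks the positive class.
toy_seqs = ["CABC", "ABCC", "CCAB", "CCCC", "ACBC", "BACC"]
toy_labels = [1, 1, 1, -1, -1, -1]
toy_model = fit(toy_seqs, toy_labels, alphabet="ABC")
print(sorted(toy_model.items(), key=lambda kv: -abs(kv[1])))
```

The fitted model is just a dictionary of weighted substrings, which mirrors the interpretability point made in the abstract: each entry can be read directly as a motif together with the direction and strength of its association with the positive class.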

References


Georgiana Ifrim, and Carsten Wiuf. (2011). "Bounded Coordinate-descent for Biological Sequence Classification in High Dimensional Predictor Space." doi:10.1145/2020408.2020519