Feature Selection Algorithm


A Feature Selection Algorithm is a dimensionality reduction algorithm that can be implemented by a feature selection system (to solve a feature selection task).
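
As an illustrative sketch of the task (not taken from the references below), a simple filter-style feature selection algorithm scores every candidate feature against the target and keeps only the top-scoring ones; the function select_top_k_features below is a hypothetical name for such a routine.

    import numpy as np

    def select_top_k_features(X, y, k):
        """Return the indices of the k features most correlated (in absolute value) with y."""
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float)
        Xc = X - X.mean(axis=0)                # center each feature column
        yc = y - y.mean()                      # center the target
        # absolute Pearson correlation of each column with the target
        scores = np.abs(Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc) + 1e-12)
        return np.argsort(scores)[::-1][:k]

    # usage: keep the 2 most informative of 5 noisy features
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 5))
    y = 3.0 * X[:, 1] - 2.0 * X[:, 4] + rng.normal(scale=0.1, size=100)
    print(select_top_k_features(X, y, k=2))    # typically selects columns 1 and 4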



References

2007

  • (Zhao & Liu, 2007) ⇒ Zheng Zhao, and Huan Liu. (2007). “Spectral feature selection for supervised and unsupervised learning.” In: Proceedings of the 24th International Conference on Machine learning (ICML 2007).
    • Feature selection aims to reduce dimensionality for building comprehensible learning models with good generalization performance. Feature selection algorithms are largely studied separately according to the type of learning: supervised or unsupervised. This work exploits intrinsic properties underlying supervised and unsupervised feature selection algorithms, and proposes a unified framework for feature selection based on spectral graph theory. The proposed framework is able to generate families of algorithms for both supervised and unsupervised feature selection. And we show that existing powerful algorithms such as ReliefF (supervised) and Laplacian Score (unsupervised) are special cases of the proposed framework. To the best of our knowledge, this work is the first attempt to unify supervised and unsupervised feature selection, and enable their joint study under a general framework. Experiments demonstrated the efficacy of the novel algorithms derived from the framework.
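As a rough illustration of the unsupervised special case mentioned above, the Laplacian Score can be sketched as follows (a simplified rendering, not the paper's spectral framework itself; the graph construction below, an RBF-weighted k-nearest-neighbor affinity matrix, is an assumption of this sketch). Lower scores indicate features that better preserve the local structure of the sample graph.

    import numpy as np

    def laplacian_score(X, n_neighbors=5, sigma=1.0):
        """Score each column of X by how well it preserves kNN-graph locality (lower is better)."""
        X = np.asarray(X, dtype=float)
        n = X.shape[0]
        # RBF affinities on pairwise squared Euclidean distances
        d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
        S = np.exp(-d2 / (2.0 * sigma ** 2))
        np.fill_diagonal(S, 0.0)
        # keep only each point's n_neighbors strongest affinities, then symmetrize
        drop = np.argsort(S, axis=1)[:, :-n_neighbors]
        keep = np.ones_like(S, dtype=bool)
        np.put_along_axis(keep, drop, False, axis=1)
        S = np.where(keep, S, 0.0)
        S = np.maximum(S, S.T)
        D = np.diag(S.sum(axis=1))         # degree matrix
        L = D - S                          # graph Laplacian
        ones = np.ones(n)
        scores = []
        for r in range(X.shape[1]):
            f = X[:, r]
            # remove the degree-weighted mean, then compare Laplacian vs. degree quadratic forms
            f_t = f - (f @ D @ ones) / (ones @ D @ ones) * ones
            scores.append((f_t @ L @ f_t) / (f_t @ D @ f_t + 1e-12))
        return np.array(scores)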

2004

  • (Dy & Brodley, 2004) ⇒ J. G. Dy, and C. E. Brodley. (2004). “Feature Selection for Unsupervised Learning.” In: Journal of Machine Learning Research, 5.

2003

  • (Guyon & Elisseeff, 2003) ⇒ Isabelle M. Guyon, and André Elisseeff. (2003). “An Introduction to Variable and Feature Selection.” In: The Journal of Machine Learning Research, 3.
    • … we summarize the steps that may be taken to solve a feature selection problem in a check list (We caution the reader that this check list is heuristic. The only recommendation that is almost surely valid is to try the simplest things first.)
    • 1. Do you have domain knowledge? If yes, construct a better set of “ad hoc” features.
    • 2. Are your features commensurate? If no, consider normalizing them.
    • 3. Do you suspect interdependence of features? If yes, expand your feature set by constructing conjunctive features or products of features, as much as your computer resources allow you (see example of use in Section 4.4).
    • 4. Do you need to prune the input variables (e.g. for cost, speed or data understanding reasons)? If no, construct disjunctive features or weighted sums of features (e.g. by clustering or matrix factorization, see Section 5).
    • 5. Do you need to assess features individually (e.g. to understand their influence on the system or because their number is so large that you need to do a first filtering)? If yes, use a variable ranking method (Section 2 and Section 7.2); else, do it anyway to get baseline results.
    • 6. Do you need a predictor? If no, stop.
    • 7. Do you suspect your data is “dirty” (has a few meaningless input patterns and/or noisy outputs or wrong class labels)? If yes, detect the outlier examples using the top ranking variables obtained in step 5 as representation; check and/or discard them.
    • 8. Do you know what to try first? If no, use a linear predictor. Use a forward selection method (Section 4.2) with the “probe” method as a stopping criterion (Section 6) or use the ℓ0-norm embedded method (Section 4.3). For comparison, following the ranking of step 5, construct a sequence of predictors of same nature using increasing subsets of features. Can you match or improve performance with a smaller subset? If yes, try a non-linear predictor with that subset.
    • 9. Do you have new ideas, time, computational resources, and enough examples? If yes, compare several feature selection methods, including your new idea, correlation coefficients, backward selection and embedded methods (Section 4). Use linear and non-linear predictors. Select the best approach with model selection (Section 6).
    • 10. Do you want a stable solution (to improve performance and/or understanding)? If yes, subsample your data and redo your analysis for several “bootstraps” (Section 7.1).
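
To make steps 5 and 8 of this checklist concrete, here is a minimal greedy forward-selection sketch in Python (an illustration only, not the paper's procedure: the “probe” stopping criterion of Section 6 is replaced by a fixed feature budget, and scikit-learn's LogisticRegression and cross_val_score stand in for the generic linear predictor; the function forward_select and its parameters are hypothetical).

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    def forward_select(X, y, max_features=5, cv=5):
        """Greedily add the feature whose inclusion most improves CV accuracy."""
        selected, remaining = [], list(range(X.shape[1]))
        best_score = -np.inf
        while remaining and len(selected) < max_features:
            round_best, round_score = None, best_score
            for j in remaining:
                cols = selected + [j]
                score = cross_val_score(LogisticRegression(max_iter=1000),
                                        X[:, cols], y, cv=cv).mean()
                if score > round_score:
                    round_best, round_score = j, score
            if round_best is None:   # no remaining feature improves the score: stop early
                break
            selected.append(round_best)
            remaining.remove(round_best)
            best_score = round_score
        return selected, best_score

Calling forward_select(X, y, max_features=3) on a binary-labelled dataset returns the greedily chosen column indices and the final cross-validated accuracy; rerunning it on bootstrap resamples of the data gives a rough stability check in the spirit of step 10.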
