Huber Regression System
A Huber Regression System is a Robust Regression System that implements a Huber Regression Algorithm to solve a Huber Regression Task.
- AKA: Huber Regressor, Huber Regression Estimator.
- …
- Example(s):
- Counter-Example(s):
- See: Regression Analysis Task, Random Variable, L2-norm.
References
2017
- (Scikit Learn, 2017) ⇒ http://scikit-learn.org/stable/modules/linear_model.html#huber-regression Retrieved:2017-09-17
- QUOTE: The HuberRegressor is different to Ridge because it applies a linear loss to samples that are classified as outliers. A sample is classified as an inlier if the absolute error of that sample is lesser than a certain threshold. It differs from TheilSenRegressor and RANSACRegressor because it does not ignore the effect of the outliers but gives a lesser weight to them. The loss function that HuberRegressor minimizes is given by
[math]\displaystyle{ \underset{w, \sigma}{\min\,} {\sum_{i=1}^n\left(\sigma + H_m\left(\frac{X_{i}w - y_{i}}{\sigma}\right)\sigma\right) + \alpha {||w||_2}^2} }[/math]
where
[math]\displaystyle{ H_m(z) = \begin{cases} z^2, & \text {if } |z| \lt \epsilon, \\ 2\epsilon|z| - \epsilon^2, & \text{otherwise} \end{cases} }[/math]
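The quoted definitions can be evaluated directly. The following is a minimal sketch in plain Python, not scikit-learn's implementation: the function names `huber_h` and `huber_objective` are hypothetical, the default epsilon of 1.35 is taken from the quote, and the model has no intercept, as in the quoted formula.

```python
def huber_h(z, epsilon=1.35):
    """H_m(z): quadratic when |z| < epsilon, linear otherwise."""
    if abs(z) < epsilon:
        return z * z
    return 2.0 * epsilon * abs(z) - epsilon ** 2


def huber_objective(X, y, w, sigma, epsilon=1.35, alpha=0.0):
    """Value of the quoted objective for given weights w and scale sigma.

    X is a list of feature rows, y a list of targets (hypothetical layout).
    """
    total = 0.0
    for row, target in zip(X, y):
        prediction = sum(x_j * w_j for x_j, w_j in zip(row, w))
        z = (prediction - target) / sigma
        total += sigma + huber_h(z, epsilon) * sigma
    # L2 penalty on the weights
    return total + alpha * sum(w_j * w_j for w_j in w)


# The second sample is far from the line y = x, so it falls in the linear branch.
X = [[1.0], [2.0]]
y = [1.0, 10.0]
value = huber_objective(X, y, w=[1.0], sigma=1.0)
```

The two branches of `huber_h` meet at |z| = epsilon, where both evaluate to epsilon squared, so the loss is continuous across the inlier/outlier threshold.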
(...)
It is advised to set the parameter epsilon to 1.35 to achieve 95% statistical efficiency.
The HuberRegressor differs from using SGDRegressor with loss set to huber in the following ways.
HuberRegressor is scaling invariant. Once epsilon is set, scaling X and y down or up by different values would produce the same robustness to outliers as before, as compared to SGDRegressor, where epsilon has to be set again when X and y are scaled.
HuberRegressor should be more efficient to use on data with small number of samples while SGDRegressor needs a number of passes on the training data to produce the same robustness.
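The scaling-invariance claim can be illustrated with a small sketch in plain Python. It assumes alpha = 0 and that multiplying y by a constant c also multiplies the residuals and the jointly optimized scale sigma by c. The helper names are hypothetical. The fixed-threshold mask is a simplified stand-in for a Huber loss that compares epsilon with the raw residual, not a reproduction of SGDRegressor.

```python
def huber_h(z, epsilon):
    """H_m(z) from the quoted definition."""
    if abs(z) < epsilon:
        return z * z
    return 2.0 * epsilon * abs(z) - epsilon ** 2


def scaled_objective(residuals, sigma, epsilon):
    """Quoted objective with alpha = 0, written in terms of residuals X_i w - y_i."""
    return sum(sigma + huber_h(r / sigma, epsilon) * sigma for r in residuals)


def inlier_mask_scaled(residuals, sigma, epsilon):
    """Inlier test on residuals divided by sigma (the quoted formulation)."""
    return [abs(r / sigma) < epsilon for r in residuals]


def inlier_mask_fixed(residuals, epsilon):
    """Inlier test on raw residuals (assumed fixed-threshold variant)."""
    return [abs(r) < epsilon for r in residuals]


epsilon = 1.35
sigma = 1.0
residuals = [0.5, -1.0, 8.0]
c = 100.0  # hypothetical rescaling of y, e.g. a change of units
scaled_residuals = [c * r for r in residuals]

# With sigma rescaled alongside the data, the same samples remain inliers.
before = inlier_mask_scaled(residuals, sigma, epsilon)
after = inlier_mask_scaled(scaled_residuals, c * sigma, epsilon)

# With a fixed threshold on raw residuals, every sample now looks like an outlier.
fixed_before = inlier_mask_fixed(residuals, epsilon)
fixed_after = inlier_mask_fixed(scaled_residuals, epsilon)

# The objective is simply multiplied by c, so its minimizer rescales with the data.
ratio = (scaled_objective(scaled_residuals, c * sigma, epsilon)
         / scaled_objective(residuals, sigma, epsilon))
```

Under these assumptions the sigma-normalized test classifies the same samples as inliers before and after rescaling, while the raw-residual test does not unless epsilon is rescaled as well.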