Huber Loss Function
A Huber Loss Function is a Loss Function that is used for robust regression and that is less sensitive to outliers than a Square Loss Function.
- Context:
- It can be mathematically defined as: $L_{\delta}=\left\{\begin{array}{cc}\frac{1}{2}(y-\hat{y})^{2} & \text{if } |y-\hat{y}| \le \delta \\ \delta\left(|y-\hat{y}|-\frac{1}{2}\delta\right) & \text{otherwise}\end{array}\right.$, where $\delta$ is the threshold separating the quadratic regime from the linear one (see the worked example below).
- Example(s):
- …
- Counter-Example(s):
- an Exponential Loss Function,
- a Hinge-Loss Function (as used by SVMs),
- a Kullback-Leibler Loss Function,
- a Logistic Loss Function,
- a Savage Loss Function,
- a Square Loss Function,
- a Tangent Loss Function.
- See: Squared Error Function, Cross-Entropy Measure, Mean Absolute Error, Mean Squared Error, Outlier.
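As a worked illustration of the definition above (the numbers are chosen for this example and do not come from the cited sources): with $\delta=1$, a residual of $0.5$ falls in the quadratic regime and incurs a loss of $\frac{1}{2}(0.5)^{2}=0.125$, while a residual of $3$ falls in the linear regime and incurs $1\cdot\left(3-\frac{1}{2}\right)=2.5$, far less than the squared-error penalty of $\frac{1}{2}(3)^{2}=4.5$.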
References
2021a
- (Wikipedia, 2021) ⇒ https://en.wikipedia.org/wiki/Huber_loss Retrieved:2021-03-07.
- QUOTE: In statistics, the Huber loss is a loss function used in robust regression, that is less sensitive to outliers in data than the squared error loss. A variant for classification is also sometimes used.
(...) The Huber loss function describes the penalty incurred by an estimation procedure $f$. Huber (1964) defines the loss function piecewise by: $L_\delta(a) = \begin{cases} \frac{1}{2}a^{2} & \text{for } |a| \le \delta, \\ \delta\left(|a| - \frac{1}{2}\delta\right) & \text{otherwise.} \end{cases}$ This function is quadratic for small values of $a$, and linear for large values, with equal values and slopes of the different sections at the two points where $|a| = \delta$. The variable $a$ often refers to the residuals, that is, to the difference between the observed and predicted values, $a = y - f(x)$, so the former can be expanded to: $L_\delta(y, f(x)) = \begin{cases} \frac{1}{2}(y - f(x))^{2} & \text{for } |y - f(x)| \le \delta, \\ \delta\,|y - f(x)| - \frac{1}{2}\delta^{2} & \text{otherwise.} \end{cases}$ (Compared to Hastie et al., the loss is scaled by a factor of ½, to be consistent with Huber's original definition given earlier.)
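The "equal values and slopes" claim can be checked directly at the boundary: at $|a|=\delta$ the quadratic branch gives $\frac{1}{2}\delta^{2}$ and the linear branch gives $\delta\left(\delta-\frac{1}{2}\delta\right)=\frac{1}{2}\delta^{2}$, while the slope of the quadratic branch, $a$, has magnitude $\delta$ there, matching the slope $\pm\delta$ of the linear branch, so the two pieces join smoothly.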
2021b
- (ML Glossary, 2021) ⇒ https://ml-cheatsheet.readthedocs.io/en/latest/loss_functions.html Retrieved:2021-03-06.
- QUOTE: Typically used for regression. It’s less sensitive to outliers than the MSE as it treats error as square only inside an interval. $L_{\delta}=\left\{\begin{array}{cc}\frac{1}{2}(y-\hat{y})^{2} & \text{if } |y-\hat{y}|<\delta \\ \delta\left(|y-\hat{y}|-\frac{1}{2}\delta\right) & \text{otherwise}\end{array}\right.$
Code
import numpy as np

def Huber(yHat, y, delta=1.0):
    # Quadratic penalty for residuals within delta, linear penalty outside,
    # matching the piecewise definition above.
    return np.where(np.abs(y - yHat) < delta,
                    0.5 * (y - yHat) ** 2,
                    delta * (np.abs(y - yHat) - 0.5 * delta))
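A brief usage sketch (the input arrays and expected output are illustrative, not taken from the cheatsheet):

y = np.array([1.0, 4.0])
yHat = np.array([0.5, 1.0])
print(Huber(yHat, y))  # [0.125 2.5]: the small residual (0.5) is squared, the large one (3.0) is penalized only linearly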