Linear Regression Bias
A Linear Regression Bias is the constant-term coefficient of a linear regression function: the value the function takes when all input variables are zero, i.e. the point at which the fitted line (or hyperplane) intercepts the y-axis.
- AKA: Intercept Term, Offset Term, Bias Term.
- Example(s):
- [math]\displaystyle{ y'=b+w_1x_1+w_2x_2+\cdots+w_nx_n }[/math], where [math]\displaystyle{ b }[/math] is the bias.
- …
- Counter-Example(s):
- See: Unbiased Estimator, Biased Estimator, Risk, Sample Standard Deviation, Statistics, Estimator, Expected Value, Median, Consistent Estimator, Unbiased Estimation of Standard Deviation, Loss Function, Mean Squared Error, Shrinkage Estimator.
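As an illustrative sketch (the weights, bias, and input values below are made up), the bias [math]\displaystyle{ b }[/math] in [math]\displaystyle{ y'=b+w_1x_1+\cdots+w_nx_n }[/math] is exactly what the model predicts when every input variable is zero:

```python
import numpy as np

# Hypothetical weights and bias for a 3-feature linear model (illustrative values).
w = np.array([2.0, -1.0, 0.5])  # w_1, ..., w_n
b = 4.0                         # the bias (intercept / offset) term

def predict(x):
    """y' = b + w_1*x_1 + ... + w_n*x_n"""
    return b + np.dot(w, x)

# At the origin the weighted sum vanishes, so the prediction equals the bias.
y_at_origin = predict(np.array([0.0, 0.0, 0.0]))  # -> 4.0
y_prime = predict(np.array([1.0, 2.0, 3.0]))      # 4.0 + 2.0 - 2.0 + 1.5 = 5.5
```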
References
2018
- (Google ML Glossary, 2018) ⇒ (2018). "bias". In: Machine Learning Glossary https://developers.google.com/machine-learning/glossary/ Retrieved: 2018-05-13.
- QUOTE: An intercept or offset from an origin. Bias (also known as the bias term) is referred to as [math]\displaystyle{ b }[/math] or [math]\displaystyle{ w_0 }[/math] in machine learning models. For example, bias is the [math]\displaystyle{ b }[/math] in the following formula:
[math]\displaystyle{ y'=b+w_1x_1+w_2x_2+\cdots+w_nx_n }[/math]
Not to be confused with prediction bias.
2017a
- (Scikit Learn, 2017) ⇒ http://scikit-learn.org/stable/modules/linear_model.html Retrieved: 2017-07-30.
- QUOTE: The following are a set of methods intended for regression in which the target value is expected to be a linear combination of the input variables. In mathematical notation, if [math]\displaystyle{ \hat{y} }[/math] is the predicted value:
[math]\displaystyle{ \hat{y}(w, x) = w_0 + w_1 x_1 + \cdots + w_p x_p }[/math]
Across the module, we designate the vector [math]\displaystyle{ w = (w_1,\cdots, w_p) }[/math] as coef_ and [math]\displaystyle{ w_0 }[/math] as intercept_.
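The coef_/intercept_ split above can be reproduced with a plain least-squares fit by prepending a column of ones to the design matrix, so that [math]\displaystyle{ w_0 }[/math] is estimated alongside the other weights (the data below are made-up illustrative values):

```python
import numpy as np

# Fit y = w_0 + w_1*x_1 + w_2*x_2 by ordinary least squares, mirroring
# scikit-learn's split of the solution into coef_ (w_1..w_p) and
# intercept_ (w_0). Data values are illustrative.
X = np.array([[0.0, 1.0], [1.0, 1.0], [2.0, 3.0], [3.0, 5.0]])
y = 1.5 + X @ np.array([2.0, -0.5])           # exact linear target, intercept 1.5

A = np.hstack([np.ones((X.shape[0], 1)), X])  # prepend a column of ones for w_0
w_full, *_ = np.linalg.lstsq(A, y, rcond=None)

intercept_ = w_full[0]   # w_0, the bias term
coef_ = w_full[1:]       # (w_1, ..., w_p)
```

Because the target here is exactly linear, least squares recovers the intercept 1.5 and the weights (2.0, -0.5) up to floating-point error.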
2017b
- (Quadrianto & Buntine, 2017) ⇒ Novi Quadrianto and Wray L. Buntine (2017). "Linear Regression". In: Sammut C., Webb G.I. (eds) Encyclopedia of Machine Learning and Data Mining. Springer, Boston, MA
- QUOTE: Note that the function is a linear function of the weight vector [math]\displaystyle{ w }[/math]. The simplest form of the linear parametric model is when [math]\displaystyle{ \phi(x_i)=x_i\,\in \mathbb{R}^d }[/math], that is, the model is also linear with respect to the input variables, [math]\displaystyle{ f(x_i ) : = w_0 + w_1x_{i1} + \cdots + w_d x_{id} }[/math]. Here the weight [math]\displaystyle{ w_0 }[/math] allows for any constant offset in the data. With general basis functions such as polynomials, exponentials, sigmoids, or even more sophisticated Fourier or wavelets bases, we can obtain a regression function which is nonlinear with respect to the input variables although still linear with respect to the parameters.
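The basis-function point above can be sketched with a polynomial basis (the quadratic target below is an illustrative assumption): the model is nonlinear in the input [math]\displaystyle{ x }[/math] yet still linear in the parameters, so the same least-squares machinery applies and [math]\displaystyle{ w_0 }[/math] remains the constant offset.

```python
import numpy as np

# With a polynomial basis phi(x) = (1, x, x^2), the model
# f(x) = w_0 + w_1*x + w_2*x^2 is nonlinear in x but linear in w.
x = np.linspace(-2.0, 2.0, 9)
y = 1.0 - 3.0 * x + 0.5 * x**2            # exact quadratic target (illustrative)

Phi = np.vander(x, N=3, increasing=True)  # columns: 1, x, x^2
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
# w recovers (1.0, -3.0, 0.5); w[0] is again the constant offset w_0
```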
2009
- (Wikipedia, 2009) ⇒ http://en.wikipedia.org/wiki/Simple_linear_regression
- Given a sample [math]\displaystyle{ (Y_i, X_i), \, i = 1, \ldots, n }[/math], the regression model is given by
- [math]\displaystyle{ Y_i = a + bX_i + \varepsilon_i }[/math]
- Where [math]\displaystyle{ Y_i }[/math] is the dependent variable, [math]\displaystyle{ a }[/math] is the [math]\displaystyle{ y }[/math]-intercept, [math]\displaystyle{ b }[/math] is the gradient or slope of the line, [math]\displaystyle{ X_i }[/math] is the independent variable, and [math]\displaystyle{ \varepsilon_i }[/math] is a random term associated with each observation.
- The linear relationship between the two variables (i.e. dependent and independent) can be measured using a correlation coefficient e.g. the Pearson Product Moment Correlation Coefficient.
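The intercept [math]\displaystyle{ a }[/math] and slope [math]\displaystyle{ b }[/math] of simple linear regression have a closed form, and the Pearson coefficient measures the linear relationship mentioned above; a minimal sketch with made-up sample values:

```python
import numpy as np

# Closed-form simple linear regression Y = a + b*X: slope b is the sample
# covariance of X and Y over the variance of X; the intercept a then follows
# from the sample means. Sample values are illustrative.
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.0, 4.1, 5.9, 8.2, 9.8])

b = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
a = Y.mean() - b * X.mean()   # the y-intercept

r = np.corrcoef(X, Y)[0, 1]   # Pearson product-moment correlation coefficient
```

For these values the slope works out to 1.97 and the intercept to 0.09, with r close to 1 since the points are nearly collinear.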