Simple Linear Regression Task
A Simple Linear Regression Task is a linear regression task with a single predictor variable and a single response variable.
- AKA: Univariate Linear Regression Task.
- Context:
- Task Input: a Numerically-Labeled Training Dataset.
- [math]\displaystyle{ \mathbf{Y} }[/math], a continuous dataset of response variable values.
- [math]\displaystyle{ \mathbf{X} }[/math], a continuous dataset of predictor variable values.
- (Optional) [math]\displaystyle{ \boldsymbol\beta }[/math], regression coefficients initial guess.
- Task Output:
- [math]\displaystyle{ \hat{\boldsymbol\beta}=\{\hat{\beta}_0,\hat{\beta}_1 \} }[/math], the estimated linear model parameters.
- [math]\displaystyle{ \hat{y}(\beta,x) }[/math], predicted values (the Fitted Linear Function), a continuous dataset.
- [math]\displaystyle{ \sum_{i=1}^n||\hat{y}_i - y_i||^2 }[/math], the sum of squared errors (a scalar measure of fit).
- [math]\displaystyle{ \sigma_x,\sigma_y,\rho_{X,Y},\ldots }[/math], standard deviations, correlation coefficient, standard error of estimate, and other statistical information about the fitted parameters.
- Task Requirements
- A Simple Linear Regression System to find an optimal solution for:
- [math]\displaystyle{ y_i=\beta_0+\beta_1x_i+\varepsilon_i\quad }[/math] for [math]\displaystyle{ \quad i=1,\cdots,n \; }[/math]
- That is,
[math]\displaystyle{ \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} = \begin{pmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots\\ 1 & x_n \\ \end{pmatrix}\begin{pmatrix} \beta_0 \\ \beta_1 \\ \end{pmatrix}+\begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{pmatrix} }[/math]
- [math]\displaystyle{ \beta_0 }[/math] is called the intercept.
- by estimating the best-fitting [math]\displaystyle{ \boldsymbol\beta }[/math] parameters that optimize an objective function of the form:
[math]\displaystyle{ E(f)=\sum _{i=1}^{n}L(y_{i},\beta_0+\beta_1 x_i) }[/math]
where [math]\displaystyle{ L(\cdot) }[/math] is an error function that may be derived from a loss function or a likelihood function.
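The following is a minimal sketch (assuming Python with NumPy, a squared-error loss, and the hypothetical helper names fit_simple_ols and predict) of how such a system can compute the closed-form least-squares estimates of [math]\displaystyle{ \beta_0 }[/math] and [math]\displaystyle{ \beta_1 }[/math]:
```python
import numpy as np

def fit_simple_ols(x, y):
    """Estimate (beta0, beta1) minimizing the sum of squared errors
    E = sum_i (y_i - beta0 - beta1 * x_i)^2."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_bar, y_bar = x.mean(), y.mean()
    # Closed-form least-squares solution for the single-predictor case.
    beta1 = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
    beta0 = y_bar - beta1 * x_bar
    return beta0, beta1

def predict(beta0, beta1, x):
    """The fitted linear function y_hat(beta, x) = beta0 + beta1 * x."""
    return beta0 + beta1 * np.asarray(x, dtype=float)
```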
- Example(s):
- A numerical experiment resulted in the four [math]\displaystyle{ (x, y) }[/math] data points [math]\displaystyle{ \{(1, 6), (2, 5), (3, 7), (4, 10)\} }[/math]; find a line [math]\displaystyle{ y=\beta_0+\beta_1 x }[/math] that best fits these four points,
e.g. ⇒ [math]\displaystyle{ y=3.5+1.4x }[/math] (see the sketch after this list).
- …
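A short numerical check of the example above, sketched with NumPy (np.polyfit with degree 1 performs an ordinary least-squares line fit):
```python
import numpy as np

# The four (x, y) data points from the example above.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([6.0, 5.0, 7.0, 10.0])

# Degree-1 polyfit returns the slope first, then the intercept.
beta1, beta0 = np.polyfit(x, y, deg=1)
print(beta0, beta1)            # ~3.5 and ~1.4, i.e. y = 3.5 + 1.4x

residuals = y - (beta0 + beta1 * x)
print(np.sum(residuals ** 2))  # sum of squared errors, ~4.2
```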
- Counter-Example(s):
- See: Univariate Linear Regression Algorithm, Linear Model, System of Linear Equations.
References
2017a
- (Wikipedia, 2017) ⇒ https://en.wikipedia.org/wiki/Simple_linear_regression Retrieved:2017-8-6.
- In statistics, simple linear regression is a linear regression model with a single explanatory variable. That is, it concerns two-dimensional sample points with one independent variable and one dependent variable (conventionally, the x and y coordinates in a Cartesian coordinate system) and finds a linear function (a non-vertical straight line) that, as accurately as possible, predicts the dependent variable values as a function of the independent variables.
The adjective simple refers to the fact that the outcome variable is related to a single predictor.
It is common to make the additional hypothesis that the ordinary least squares method should be used to minimize the residuals (vertical distances between the points of the data set and the fitted line). Under this hypothesis, the accuracy of a line through the sample points is measured by the sum of squared residuals, and the goal is to make this sum as small as possible. Other regression methods that can be used in place of ordinary least squares include least absolute deviations (minimizing the sum of absolute values of residuals) and the Theil–Sen estimator (which chooses a line whose slope is the median of the slopes determined by pairs of sample points). Deming regression (total least squares) also finds a line that fits a set of two-dimensional sample points, but (unlike ordinary least squares, least absolute deviations, and median slope regression) it is not really an instance of simple linear regression, because it does not separate the coordinates into one dependent and one independent variable and could potentially return a vertical line as its fit.
The remainder of the article assumes an ordinary least squares regression.
In this case, the slope of the fitted line is equal to the correlation between [math]\displaystyle{ y }[/math] and [math]\displaystyle{ x }[/math] corrected by the ratio of standard deviations of these variables. The intercept of the fitted line is such that it passes through the center of mass [math]\displaystyle{ (\bar{x}, \bar{y}) }[/math] of the data points.
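A small numerical sketch (assuming NumPy and simulated data) of the relationship quoted above: under ordinary least squares the slope equals the sample correlation scaled by the ratio of standard deviations, and the fitted line passes through the center of mass:
```python
import numpy as np

# Simulated (x, y) data with a known linear trend plus noise.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 2.0 + 0.5 * x + rng.normal(scale=0.3, size=200)

slope, intercept = np.polyfit(x, y, deg=1)

# Slope equals the sample correlation times the ratio of standard deviations.
r_xy = np.corrcoef(x, y)[0, 1]
assert np.isclose(slope, r_xy * y.std() / x.std())

# The fitted line passes through the center of mass (x_bar, y_bar).
assert np.isclose(y.mean(), intercept + slope * x.mean())
```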
2017b
- (Scikit Learn, 2017) ⇒ http://scikit-learn.org/stable/modules/linear_model.html Retrieved: 2017-07-30.
- QUOTE: The following are a set of methods intended for regression in which the target value is expected to be a linear combination of the input variables. In mathematical notion, if [math]\displaystyle{ \hat{y} }[/math] is the predicted value.
[math]\displaystyle{ \hat{y}(w, x) = w_0 + w_1 x_1 + \cdots + w_p x_p }[/math]
Across the module, we designate the vector [math]\displaystyle{ w = (w_1,\cdots, w_p) }[/math] as coef_ and [math]\displaystyle{ w_0 }[/math] as intercept_.
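A minimal usage sketch of this scikit-learn interface (assuming scikit-learn is installed); with a single predictor column, coef_ holds the one slope [math]\displaystyle{ w_1 }[/math] and intercept_ holds [math]\displaystyle{ w_0 }[/math]:
```python
import numpy as np
from sklearn.linear_model import LinearRegression

# A single predictor column, so this is a simple linear regression fit.
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([6.0, 5.0, 7.0, 10.0])

model = LinearRegression().fit(X, y)
print(model.intercept_)                  # w_0, ~3.5
print(model.coef_)                       # (w_1,), ~[1.4]
print(model.predict(np.array([[5.0]])))  # ~[10.5]
```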
2013
- http://cran.r-project.org/doc/manuals/r-release/R-intro.html#Formulae-for-statistical-models
- The template for a statistical model is a linear regression model with independent, homoscedastic errors :[math]\displaystyle{ y_i = \sum_{j=0}^p \beta_j x_{ij} + e_i, \, i = 1, \cdots, n, }[/math] where the [math]\displaystyle{ e_i }[/math] are [math]\displaystyle{ NID(0, \sigma^2) }[/math]. In matrix terms this would be written :[math]\displaystyle{ y = \mathbf{X} \beta + e }[/math] where [math]\displaystyle{ y }[/math] is the response vector, [math]\displaystyle{ \mathbf{X} }[/math] is the model matrix or design matrix and has columns [math]\displaystyle{ x_0, x_1, \cdots, x_p }[/math], the determining variables. Very often [math]\displaystyle{ x_0 }[/math] will be a column of ones defining an intercept term.
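A sketch (assuming NumPy) of this matrix formulation, with [math]\displaystyle{ x_0 }[/math] as a column of ones for the intercept and [math]\displaystyle{ \beta }[/math] obtained by least squares:
```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([6.0, 5.0, 7.0, 10.0])

# Design (model) matrix X with a leading column of ones for the intercept term.
X = np.column_stack([np.ones_like(x), x])

# Solve min ||y - X beta||^2 in the least-squares sense.
beta, sse, rank, _ = np.linalg.lstsq(X, y, rcond=None)
print(beta)  # ~[3.5, 1.4]
print(sse)   # residual sum of squares, ~[4.2]
```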
2011a
- (Allain, 2011) ⇒ Rhett Allain. (2011). “Linear Regression by Hand.” In: Wired, 2011-01-16
- QUOTE: It only makes sense. I did linear regression in google docs and I did it for python. But what if you have neither of those? Can you do it by hand? Why yes. Suppose I take the same data from the pylab example and I imagine trying to add a linear function to represent that data. ...
... you have to make up some criteria for choosing the best line. Commonly, it is chosen to pick the line such that the value of the sum of [math]\displaystyle{ d^2 }[/math] is minimized. ... typically, the horizontal variable is your independent variable – so these might be some set values. The vertical data is typically the one with the most error (but not always). ...
There. That is the basic form of linear regression by hand. Note that there ARE other ways to do this – more complicated ways (assuming different types of distributions for the data). Also, the same basic idea is followed if you want to fit some higher order polynomial. Warning, it gets complicated (algebraically) real quick.
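A plain-Python sketch of the "by hand" recipe described above (no libraries; the function name fit_by_hand is an illustrative choice), minimizing the sum of the squared vertical distances [math]\displaystyle{ d^2 }[/math]:
```python
# Fit y = b0 + b1*x "by hand": minimize the sum of squared vertical distances d^2.
def fit_by_hand(xs, ys):
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Closed-form solution of the minimization for the one-predictor case.
    b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) \
        / sum((x - x_bar) ** 2 for x in xs)
    b0 = y_bar - b1 * x_bar
    return b0, b1

b0, b1 = fit_by_hand([1, 2, 3, 4], [6, 5, 7, 10])
print(b0, b1)  # 3.5 1.4
```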
2011b
- (Quadrianto & Buntine, 2011) ⇒ Novi Quadrianto and Wray L. Buntine (2011). "Linear Regression" In: (Sammut & Webb, 2011) pp 747-750.
- QUOTE: (1): Linear regression is an instance of the Regression problem which is an approach to modeling a functional relationship between input variables [math]\displaystyle{ x }[/math] and an output/response variable [math]\displaystyle{ y }[/math]. In linear regression, a linear function of the input variables is used, and more generally a linear function of some vector function of the input variables [math]\displaystyle{ \phi(x) }[/math] can also be used. The linear function estimates the mean of [math]\displaystyle{ y }[/math] (or more generally the median or a quantile).
- QUOTE: (2): Formally, in a regression problem, we are interested in recovering a functional dependency [math]\displaystyle{ y_i = f(x_i ) +\epsilon_i }[/math] from [math]\displaystyle{ N }[/math] observed training data points [math]\displaystyle{ \{(x_i , y_i )\}_{i = 1}^N }[/math], where [math]\displaystyle{ y_i\,\in \mathbb{R} }[/math] is the noisy observed output at input location [math]\displaystyle{ x_i\,\in\, \mathbb{R}^d }[/math]. For the linear parametric technique, we tackle this regression problem by parameterizing the latent regression function f() by a parameter [math]\displaystyle{ w\,\in\,\mathbb{R}^H }[/math], that is, [math]\displaystyle{ f(x_i ) := \langle \phi(x_i ), w\rangle }[/math] for [math]\displaystyle{ H }[/math] fixed basis functions [math]\displaystyle{ \{\phi_h (x_i )\}_{h = 1}^H }[/math]. Note that the function is a linear function of the weight vector [math]\displaystyle{ w }[/math]. The simplest form of the linear parametric model is when [math]\displaystyle{ \phi(x_i)=x_i\,\in \mathbb{R}^d }[/math], that is, the model is also linear with respect to the input variables, [math]\displaystyle{ f(x_i ) := w_0 + w_1x_{i1} + \cdots + w_d x_{id} }[/math]. Here the weight [math]\displaystyle{ w_0 }[/math] allows for any constant offset in the data. With general basis functions such as polynomials, exponentials, sigmoids, or even more sophisticated Fourier or wavelets bases, we can obtain a regression function which is nonlinear with respect to the input variables although still linear with respect to the parameters.
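A sketch (assuming NumPy and simulated data) of the general linear-in-the-parameters model described above, using polynomial basis functions [math]\displaystyle{ \phi_h(x) = x^h }[/math] so the fitted function is nonlinear in [math]\displaystyle{ x }[/math] but linear in the weight vector [math]\displaystyle{ w }[/math]:
```python
import numpy as np

# Simulated data: a nonlinear target observed with noise.
rng = np.random.default_rng(1)
x = np.linspace(-1.0, 1.0, 50)
y = np.sin(np.pi * x) + rng.normal(scale=0.1, size=x.size)

# Basis expansion phi(x) = (1, x, x^2, x^3): nonlinear in x, linear in w.
Phi = np.column_stack([x ** h for h in range(4)])

# Least-squares estimate of the weight vector w.
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)

# Fitted regression function f(x) = <phi(x), w>.
y_hat = Phi @ w
print(w)
```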