Kernel Regression Task
A Kernel Regression Task is a Nonparametric Regression Task that is based on Kernel Density Estimation.
- Context:
- Task Input:
an N-observed Numerically-Labeled Training Dataset [math]\displaystyle{ D=\{(x_1,y_1,z_1,\ldots),(x_2,y_2,z_2,\ldots),\cdots,(x_n,y_n,z_n,\ldots)\} }[/math] that can be represented by:
- [math]\displaystyle{ \mathbf{Y} }[/math], a continuous response variable dataset;
- [math]\displaystyle{ \mathbf{X} }[/math], a continuous predictor variable dataset.
- Task Output:
- [math]\displaystyle{ Y^* }[/math], the predicted response variable values.
- Task Requirements:
- It requires estimating the conditional expectation [math]\displaystyle{ \operatorname{E}(Y | X) }[/math] using a kernel density estimation system to find the optimal nonparametric regression function of the form:
[math]\displaystyle{ f(x)=\sum_{i=1}^n\alpha_i\kappa(x_i,x) }[/math]
that solves [math]\displaystyle{ y_i=f(x_i)+ \epsilon_i }[/math] (a minimal sketch is given after this list).
- It may require a regression diagnostic test to determine the goodness of fit of the regression model.
- It can be solved by a Kernel Regression System that implements a Kernel Regression Algorithm.
- It can range from being a Locally Weighted Regression Task to being a Nonlinear Regression Task.
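The following is a minimal sketch of such a Kernel Regression System (in Python with NumPy), using the Nadaraya-Watson form of the kernel-density-weighted conditional mean; the Gaussian kernel, the bandwidth value, and all names are illustrative assumptions rather than part of the task definition:
<syntaxhighlight lang="python">
import numpy as np

def nadaraya_watson(x_train, y_train, x_query, h=1.0):
    """Estimate E(Y | X = x) at each query point with a Gaussian kernel of bandwidth h."""
    # Pairwise squared distances between query points and training points.
    d2 = (x_query[:, None] - x_train[None, :]) ** 2
    # Gaussian kernel weights K_h(x - x_i), the kernel-density building block.
    w = np.exp(-d2 / (2.0 * h ** 2))
    # Kernel-weighted average of the responses: the estimate of E(Y | X).
    return (w @ y_train) / w.sum(axis=1)

# Usage: recover E(Y|X) from noisy samples of y = sin(x) + noise.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 2.0 * np.pi, 100)
y = np.sin(x) + 0.1 * rng.normal(size=100)
x_star = np.linspace(0.0, 2.0 * np.pi, 9)
y_star = nadaraya_watson(x, y, x_star, h=0.3)  # Y*: predicted response values
</syntaxhighlight>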
- Example(s):
- Counter-Example(s):
- See: Conditional Expectation, Non-Parametric, Random Variable, Nonparametric Regression, Gaussian Process Regression.
References
2017a
- (Wikipedia, 2017) ⇒ https://en.wikipedia.org/wiki/Kernel_regression Retrieved:2017-8-27.
- Kernel regression is a non-parametric technique in statistics to estimate the conditional expectation of a random variable. The objective is to find a non-linear relation between a pair of random variables X and Y.
In any nonparametric regression, the conditional expectation of a variable [math]\displaystyle{ Y }[/math] relative to a variable [math]\displaystyle{ X }[/math] may be written: [math]\displaystyle{ \operatorname{E}(Y | X) = m(X) }[/math] where [math]\displaystyle{ m }[/math] is an unknown function.
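For reference (an addition, not part of the quoted passage), a standard kernel-density-based choice for [math]\displaystyle{ m }[/math] is the Nadaraya-Watson estimator, where [math]\displaystyle{ K_h }[/math] is a kernel with bandwidth [math]\displaystyle{ h }[/math]:
[math]\displaystyle{ \widehat{m}_h(x)=\frac{\sum_{i=1}^n K_h(x-x_i)\,y_i}{\sum_{i=1}^n K_h(x-x_i)} }[/math]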
2017b
- (Zhang, 2017) ⇒ Xinhua Zhang (2017). "Kernel Methods". In: "Encyclopedia of Machine Learning and Data Mining", pp. 690-695.
- QUOTE: Kernel Function Classes
Many machine learning algorithms can be posed as functional minimization problems, and the candidate function set is chosen as the RKHS. The main advantage of optimizing over an RKHS originates from the representer theorem.
Theorem 2 (representer theorem) Denote by [math]\displaystyle{ \Omega:[0,\infty) \to \mathbb{R} }[/math] a strictly monotonically increasing function, by [math]\displaystyle{ \mathcal{X} }[/math] a set, and by [math]\displaystyle{ c:(\mathcal{X}\times \mathbb{R}^2)^n \to \mathbb{R} \cup \{\infty\} }[/math] an arbitrary loss function. Then each minimizer [math]\displaystyle{ f \in \mathcal{H} }[/math] of the regularized risk functional
[math]\displaystyle{ c((x_1, y_1, f(x_1)),\cdots, (x_n, y_n, f(x_n)))+\Omega(\parallel f \parallel^2_\mathcal{H}) }[/math]
admits a representation of the form
[math]\displaystyle{ f(x)=\sum_{i=1}^n\alpha_i\kappa(x_i,x) \quad\quad(1) }[/math]
The representer theorem is important in that although the optimization problem is in an infinite-dimensional space [math]\displaystyle{ \mathcal{H} }[/math], the solution is guaranteed to lie in the span of [math]\displaystyle{ n }[/math] particular kernels centered on the training points.
The objective (1) is composed of two parts: the first part measures the loss on the training set [math]\displaystyle{ \{x_i,y_i\}_{i=1}^n }[/math], which depends on [math]\displaystyle{ f }[/math] only via its value at [math]\displaystyle{ x_i }[/math]. The second part is the regularizer, which encourages a small RKHS norm of [math]\displaystyle{ f }[/math]. Intuitively, this regularizer penalizes the complexity of [math]\displaystyle{ f }[/math] and prefers smooth [math]\displaystyle{ f }[/math]. When the kernel [math]\displaystyle{ \kappa }[/math] is translation invariant, i.e., [math]\displaystyle{ \kappa(x_1, x_2) = h (x_1-x_2) }[/math], Smola et al. (1998) showed that [math]\displaystyle{ \parallel f \parallel^2 }[/math] is related to the Fourier transform of [math]\displaystyle{ h }[/math], with more penalty imposed on the high-frequency components of [math]\displaystyle{ f }[/math].
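As a concrete illustration of the representer theorem (a hypothetical sketch, not an example from the quoted entry): with squared loss and [math]\displaystyle{ \Omega(\parallel f \parallel^2_\mathcal{H})=\lambda \parallel f \parallel^2_\mathcal{H} }[/math], the coefficients [math]\displaystyle{ \alpha_i }[/math] in Eq. (1) solve the linear system [math]\displaystyle{ (K+\lambda I)\alpha = y }[/math] with [math]\displaystyle{ K_{ij}=\kappa(x_i,x_j) }[/math], which is kernel ridge regression; the Gaussian kernel and all names below are assumptions for illustration:
<syntaxhighlight lang="python">
import numpy as np

def gaussian_kernel(a, b, gamma=1.0):
    """kappa(a, b) = exp(-gamma * ||a - b||^2): a translation-invariant kernel."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d2)

def fit_kernel_ridge(x_train, y_train, lam=1e-2, gamma=1.0):
    """Solve (K + lam*I) alpha = y: the minimizer guaranteed by the representer theorem."""
    K = gaussian_kernel(x_train, x_train, gamma)
    return np.linalg.solve(K + lam * np.eye(len(x_train)), y_train)

def predict(x_train, alpha, x_query, gamma=1.0):
    """Evaluate f(x) = sum_i alpha_i * kappa(x_i, x) at the query points, per Eq. (1)."""
    return gaussian_kernel(x_query, x_train, gamma) @ alpha

# Usage with 2-D inputs (shapes: x_train (n, d), y_train (n,)).
rng = np.random.default_rng(0)
x_train = rng.normal(size=(50, 2))
y_train = np.sin(x_train[:, 0]) + x_train[:, 1] ** 2
alpha = fit_kernel_ridge(x_train, y_train, lam=0.1, gamma=0.5)
y_hat = predict(x_train, alpha, x_train[:5], gamma=0.5)
</syntaxhighlight>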
2015
- (Chiu et al., 2015) ⇒ Chiu, S. J., Allingham, M. J., Mettu, P. S., Cousins, S. W., Izatt, J. A., & Farsiu, S. (2015). "Kernel regression based segmentation of optical coherence tomography images with diabetic macular edema". Biomedical optics express, 6(4), 1172-1194. DOI: 10.1364/BOE.6.001172
- ABSTRACT: We present a fully automatic algorithm to identify fluid-filled regions and seven retinal layers on spectral domain optical coherence tomography images of eyes with diabetic macular edema (DME). To achieve this, we developed a kernel regression (KR)-based classification method to estimate fluid and retinal layer positions. We then used these classification estimates as a guide to more accurately segment the retinal layer boundaries using our previously described graph theory and dynamic programming (GTDP) framework. We validated our algorithm on 110 B-scans from ten patients with severe DME pathology, showing an overall mean Dice coefficient of 0.78 when comparing our KR + GTDP algorithm to an expert grader. This is comparable to the inter-observer Dice coefficient of 0.79. The entire data set is available online, including our automatic and manual segmentation results. To the best of our knowledge, this is the first validated, fully-automated, seven-layer and fluid segmentation method which has been applied to real-world images containing severe DME.