Symmetrization Lemma


A Symmetrization Lemma (also known as the Basic Lemma) is a lemma that bounds the expectation of a nondecreasing, convex function of the supremum of an empirical process over a class of measurable functions by the corresponding expectation for a symmetrized (Rademacher) process over the same class.



References

2020

  • (Wikipedia) ⇒ https://en.wikipedia.org/wiki/Vapnik–Chervonenkis_theory#Symmetrization Retrieved: 2020-03-12.
    • The majority of the arguments for bounding the empirical process rely on symmetrization, maximal and concentration inequalities, and chaining. Symmetrization is usually the first step of the proofs, and since it is used in many machine learning proofs on bounding empirical loss functions (including the proof of the VC inequality which is discussed in the next section), it is presented here.

      Consider the empirical process:

      [math]\displaystyle{ f \mapsto (\mathbb{P}_n - P)f = \dfrac{1}{n} \sum_{i = 1}^n (f(X_i) - Pf) }[/math]

      It turns out that there is a connection between the empirical process and the following symmetrized process:

      [math]\displaystyle{ f \mapsto \mathbb{P}^0_n f = \dfrac{1}{n} \sum_{i = 1}^n \varepsilon_i f(X_i) }[/math]

      Here [math]\displaystyle{ \varepsilon_1, \ldots, \varepsilon_n }[/math] are i.i.d. Rademacher signs, taking the values ±1 with probability 1/2 each, independently of the data. The symmetrized process is a Rademacher process, conditionally on the data [math]\displaystyle{ X_i }[/math]. Therefore, it is a sub-Gaussian process by Hoeffding's inequality.

      Lemma (Symmetrization). For every nondecreasing, convex [math]\displaystyle{ \Phi: \mathbb{R} \to \mathbb{R} }[/math] and class of measurable functions [math]\displaystyle{ \mathcal{F} }[/math],

      [math]\displaystyle{ \mathbb{E} \Phi (\|\mathbb{P}_n - P\|_{\mathcal{F}}) \leq \mathbb{E} \Phi \left(2 \left \|\mathbb{P}^0_n\right \|_{\mathcal{F}} \right) }[/math]

      The proof of the Symmetrization lemma relies on introducing independent copies of the original variables [math]\displaystyle{ X_i }[/math] (sometimes referred to as a ghost sample) and replacing the expectation [math]\displaystyle{ Pf }[/math] on the left-hand side by an average over these copies. After an application of Jensen's inequality, different signs can be introduced (hence the name symmetrization) without changing the expectation. Because of its instructive nature, a sketch of the proof is given below, followed by a numerical illustration of the lemma.
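The following is a condensed sketch of the ghost-sample argument described in the quoted passage; the notation matches the lemma above, and measurability details are omitted. Let [math]\displaystyle{ Y_1, \ldots, Y_n }[/math] be an independent copy ("ghost sample") of [math]\displaystyle{ X_1, \ldots, X_n }[/math]. Since [math]\displaystyle{ Pf = \mathbb{E}\left[\tfrac{1}{n}\textstyle\sum_{i=1}^n f(Y_i) \mid X_1, \ldots, X_n\right] }[/math], Jensen's inequality for the nondecreasing, convex [math]\displaystyle{ \Phi }[/math] gives

[math]\displaystyle{ \mathbb{E}\, \Phi \left(\|\mathbb{P}_n - P\|_{\mathcal{F}}\right) \leq \mathbb{E}\, \Phi \left(\sup_{f \in \mathcal{F}} \left|\dfrac{1}{n} \sum_{i=1}^n \left(f(X_i) - f(Y_i)\right)\right|\right) }[/math]

Because each difference [math]\displaystyle{ f(X_i) - f(Y_i) }[/math] is symmetric about zero, multiplying the [math]\displaystyle{ i }[/math]-th term by an independent Rademacher sign [math]\displaystyle{ \varepsilon_i }[/math] does not change the distribution of the right-hand side. The triangle inequality together with the monotonicity and convexity of [math]\displaystyle{ \Phi }[/math] then splits the symmetrized difference into two identically distributed halves:

[math]\displaystyle{ \mathbb{E}\, \Phi \left(\sup_{f \in \mathcal{F}} \left|\dfrac{1}{n} \sum_{i=1}^n \varepsilon_i \left(f(X_i) - f(Y_i)\right)\right|\right) \leq \dfrac{1}{2}\, \mathbb{E}\, \Phi \left(2 \left\|\mathbb{P}^0_n\right\|_{\mathcal{F}}\right) + \dfrac{1}{2}\, \mathbb{E}\, \Phi \left(2 \left\|\mathbb{P}^0_n\right\|_{\mathcal{F}}\right) = \mathbb{E}\, \Phi \left(2 \left\|\mathbb{P}^0_n\right\|_{\mathcal{F}}\right) }[/math]

which is the claimed bound.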
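As a concrete sanity check, the minimal Python sketch below estimates both sides of the inequality by Monte Carlo in a toy setting. The choices here are illustrative assumptions, not part of the quoted source: [math]\displaystyle{ P }[/math] is uniform on [0, 1], [math]\displaystyle{ \mathcal{F} }[/math] consists of indicators of half-lines on a finite grid of thresholds, [math]\displaystyle{ \Phi }[/math] is the identity, and all function and variable names are made up for the example.

import numpy as np

# Toy illustration of the symmetrization lemma (all choices are assumptions
# for the example, not taken from the quoted source):
#   P   = Uniform[0, 1], so P f_t = t for f_t(x) = 1{x <= t}
#   F   = {f_t : t in a finite grid of thresholds}
#   Phi = identity (nondecreasing and convex)
rng = np.random.default_rng(0)
n = 100                                  # sample size
ts = np.linspace(0.05, 0.95, 19)         # thresholds defining the class F


def sup_empirical_process(x):
    """||P_n - P||_F = sup_t | (1/n) sum_i 1{X_i <= t} - t |."""
    emp = (x[:, None] <= ts[None, :]).mean(axis=0)      # P_n f_t for each t
    return np.max(np.abs(emp - ts))


def sup_symmetrized_process(x, eps):
    """||P_n^0||_F = sup_t | (1/n) sum_i eps_i 1{X_i <= t} |."""
    vals = (eps[:, None] * (x[:, None] <= ts[None, :])).mean(axis=0)
    return np.max(np.abs(vals))


# Monte Carlo estimates of both sides of
#   E Phi(||P_n - P||_F) <= E Phi(2 ||P_n^0||_F)
reps = 2000
lhs, rhs = [], []
for _ in range(reps):
    x = rng.uniform(size=n)                  # data X_1, ..., X_n ~ P
    eps = rng.choice([-1.0, 1.0], size=n)    # i.i.d. Rademacher signs
    lhs.append(sup_empirical_process(x))
    rhs.append(2.0 * sup_symmetrized_process(x, eps))

print(f"E Phi(||P_n - P||_F)  ~ {np.mean(lhs):.4f}")
print(f"E Phi(2 ||P_n^0||_F)  ~ {np.mean(rhs):.4f}")

With these choices the left-hand estimate is the expected (grid-restricted) Kolmogorov–Smirnov-type deviation of the empirical distribution function, and the printed right-hand estimate should dominate it, consistent with the lemma.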

2017

2010