Negative Log-Likelihood
A Negative Log-Likelihood is a non-negative function based on the negation of the logarithm of a likelihood function.
- AKA: Negated Log-Likelihood.
- Context:
- It can be calculated as the negation of a log-likelihood ratio, [math]\displaystyle{ LLR = \ln \left( \frac{P(I \vert D)/P(\sim I \vert D)}{P(I)/P(\sim I)} \right), }[/math] where [math]\displaystyle{ P(I\vert D) }[/math] and [math]\displaystyle{ P(\sim I\vert D) }[/math] are the frequencies of interactions observed in the given dataset (D) between annotated genes sharing benchmark associations (I) and not sharing associations (~I), respectively, while [math]\displaystyle{ P(I) }[/math] and [math]\displaystyle{ P(\sim I) }[/math] represent the prior expectations (the total frequencies of all benchmark genes sharing the same associations and not sharing associations, respectively). A computational sketch of this ratio appears after the definition block below.
- …
- Counter-Example(s):
- See: Discriminative Model, Generative Model, Negative Log-Likelihood Classification Loss, Classification Loss.
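The following is a minimal Python sketch of the log-likelihood ratio above and its negation; the frequency values are hypothetical and only illustrate the arithmetic.

```python
# A minimal sketch with hypothetical frequencies: compute the log-likelihood
# ratio LLR = ln( (P(I|D)/P(~I|D)) / (P(I)/P(~I)) ) and its negation.
import math

def llr(p_i_given_d, p_not_i_given_d, p_i, p_not_i):
    """Log-likelihood ratio of observed association odds to prior odds."""
    observed_odds = p_i_given_d / p_not_i_given_d
    prior_odds = p_i / p_not_i
    return math.log(observed_odds / prior_odds)

# Hypothetical inputs: associations appear twice as often in D as the prior expects.
score = llr(0.40, 0.60, 0.25, 0.75)
print(score)    # positive: dataset D is enriched for shared associations
print(-score)   # the negated (negative log-likelihood) form
```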
References
2011
- https://quantivity.wordpress.com/2011/05/23/why-minimize-negative-log-likelihood/
- QUOTE:
Why is minimizing the negative log likelihood equivalent to maximum likelihood estimation (MLE)? Or, equivalently, in Bayesian-speak:
Why is minimizing the negative log likelihood equivalent to maximum a posteriori probability (MAP), given a uniform prior?
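As a hedged illustration of this equivalence, here is a minimal Python sketch (synthetic Gaussian data; the function and variable names are my own choices) in which numerically minimizing the negative log-likelihood recovers the closed-form maximum likelihood estimates:

```python
# Minimal sketch: minimizing the Gaussian negative log-likelihood
# reproduces the analytic MLE (sample mean and 1/N standard deviation).
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=1.5, size=1000)  # synthetic sample

def neg_log_likelihood(params, x):
    """-log L(mu, sigma | x) for i.i.d. Gaussian data; sigma = exp(log_sigma) > 0."""
    mu, log_sigma = params
    sigma = np.exp(log_sigma)
    return (0.5 * np.sum(((x - mu) / sigma) ** 2)
            + x.size * (log_sigma + 0.5 * np.log(2 * np.pi)))

result = minimize(neg_log_likelihood, x0=np.array([0.0, 0.0]), args=(data,))
mu_hat, sigma_hat = result.x[0], np.exp(result.x[1])

print(mu_hat, data.mean())          # numerically ~ equal
print(sigma_hat, data.std(ddof=0))  # numerically ~ equal (MLE uses 1/N)
```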
2009
- http://www8.tfe.umu.se/forskning/Control_Systems/Courses/System_Identification/chapter7.pdf
- QUOTE: Negative log likelihood function: Suppose [math]\displaystyle{ Y_N = [y_{(1)}, \ldots, y_{(N)}] }[/math] is a Gaussian random vector such that [math]\displaystyle{ E\{Y_N\} = 0 }[/math] and [math]\displaystyle{ E\{Y_N Y_N^T\} = R_N(\theta) }[/math].
- 1. Show that the negative log likelihood function when [math]\displaystyle{ Y_N }[/math] is observed is [math]\displaystyle{ V(Y_N; \theta) = \text{const} + \frac{1}{2} \log \det R_N(\theta) + \frac{1}{2} Y_N^T R_N^{-1}(\theta) Y_N. }[/math]
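A minimal Python sketch of evaluating this [math]\displaystyle{ V(Y_N; \theta) }[/math] follows; the one-parameter covariance model [math]\displaystyle{ R_N(\theta) = \theta I }[/math] is a hypothetical choice for illustration:

```python
# Minimal sketch: V(Y_N; theta) = const + 1/2 log det R_N(theta)
#                               + 1/2 Y_N^T R_N(theta)^{-1} Y_N
import numpy as np

def gaussian_nll(y, R):
    """Negative log-likelihood of a zero-mean Gaussian vector y with covariance R."""
    sign, logdet = np.linalg.slogdet(R)        # numerically stable log det
    assert sign > 0, "covariance must be positive definite"
    quad = y @ np.linalg.solve(R, y)           # y^T R^{-1} y without forming R^{-1}
    const = 0.5 * y.size * np.log(2 * np.pi)
    return const + 0.5 * logdet + 0.5 * quad

# Hypothetical covariance model R_N(theta) = theta * I:
theta = 2.0
y_n = np.array([0.5, -1.0, 0.3])
print(gaussian_nll(y_n, theta * np.eye(y_n.size)))
```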
2008
- (Lin et al., 2008) ⇒ Chih-Jen Lin, Ruby C. Weng, and S. Sathiya Keerthi. (2008). “Trust Region Newton Method for Logistic Regression.” In: The Journal of Machine Learning Research, 9.
- QUOTE: The logistic regression model is useful for two-class classification. Given data [math]\displaystyle{ \mathbf{x} }[/math] and weights [math]\displaystyle{ (\mathbf{w}, b) }[/math], it assumes the following probability model: [math]\displaystyle{ P(y = \pm 1 \mid \mathbf{x}, \mathbf{w}) = \frac{1}{1 + \exp(-y(\mathbf{w}^T \mathbf{x} + b))}, }[/math] where [math]\displaystyle{ y }[/math] is the class label. If the training instances are [math]\displaystyle{ \mathbf{x}_i }[/math], [math]\displaystyle{ i = 1, \ldots, l }[/math] and the labels are [math]\displaystyle{ y_i \in \{1, -1\} }[/math], one estimates [math]\displaystyle{ (\mathbf{w}, b) }[/math] by minimizing the negative log-likelihood: [math]\displaystyle{ \min_{\mathbf{w}, b} \sum_{i=1}^{l} \log \left(1 + e^{-y_i(\mathbf{w}^T \mathbf{x}_i + b)}\right) }[/math]
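To make the quoted objective concrete, here is a minimal Python sketch (synthetic data; the generating weights and 10% label-noise rate are arbitrary assumptions) that estimates [math]\displaystyle{ (\mathbf{w}, b) }[/math] by minimizing this negative log-likelihood:

```python
# Minimal sketch: minimize sum_i log(1 + exp(-y_i (w^T x_i + b))) over (w, b).
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))                               # instances x_i
y = np.where(X @ np.array([1.5, -2.0]) + 0.3 > 0, 1, -1)    # labels in {1, -1}
flip = rng.random(y.size) < 0.1                             # label noise keeps the
y = np.where(flip, -y, y)                                   # data non-separable

def neg_log_likelihood(params, X, y):
    w, b = params[:-1], params[-1]
    margins = y * (X @ w + b)
    return np.sum(np.logaddexp(0.0, -margins))  # stable log(1 + exp(-m))

result = minimize(neg_log_likelihood, x0=np.zeros(X.shape[1] + 1), args=(X, y))
w_hat, b_hat = result.x[:-1], result.x[-1]
print(w_hat, b_hat)  # recovers the generating direction up to noise and scale
```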