1998 StatisticalDataAnalysis
- (Cowan, 1998) ⇒ Glen Cowan. (1998). “Statistical Data Analysis.” Oxford University Press. ISBN:0198501560
Subject Headings: Statistical Data Analysis, ML Estimator, Tikhonov Regularization.
Notes
- Book Homepage: http://www.pp.rhul.ac.uk/~cowan/sda/
- PDF File available at: http://www.hep.phys.soton.ac.uk/~belyaev/papers/books/statistics/Glen_Cowan-Statistical_Data_Analysis.pdf
Cited By
Quotes
Book Overview
This book is a guide to the practical application of statistics to data analysis in the physical sciences. It is primarily addressed at students and professionals who need to draw quantitative conclusions from experimental data. Although most of the examples are taken from particle physics, the material is presented in a sufficiently general way as to be useful to people from most branches of the physical sciences. The first part of the book describes the basic tools of data analysis: concepts of probability and random variables, Monte Carlo techniques, statistical tests, and methods of parameter estimation. The last three chapters then develop more advanced statistical ideas, focusing on interval estimation, characteristic functions, and correcting distributions for the effects of measurement errors (unfolding).
Chapter 1 Fundamental concepts 1
1.1 Probability and random variables 1
1.2 Interpretation of probability 4
1.2.1 Probability as a relative frequency 4
1.2.2 Subjective probability 5
1.3 Probability density functions 7
1.4 Functions of random variables 13
1.5 Expectation values 16
1.6 Error propagation 20
1.7 Orthogonal transformation of random variables 22
Chapter 2 Examples of probability functions 26
2.1 Binomial and multinomial distributions 26
2.2 Poisson distribution 29
2.3 Uniform distribution 30
2.4 Exponential distribution 31
2.5 Gaussian distribution 32
2.6 Log-normal distribution 34
2.7 Chi-square distribution 35
2.8 Cauchy (Breit--Wigner) distribution 36
2.9 Landau distribution 37
Chapter 3 The Monte Carlo method 40
3.1 Uniformly distributed random numbers 40
3.2 The transformation method 41
3.3 The acceptance--rejection method 42
3.4 Applications of the Monte Carlo method 44
Chapter 4 Statistical tests 46
4.1 Hypotheses, test statistics, significance level, power 46
4.2 An example with particle selection 48
4.3 Choice of the critical region using the Neyman--Pearson lemma 50
4.4 Constructing a test statistic 51
4.4.1 [[Linear test statistic]]s, the Fisher discriminant function 51
4.4.2 Nonlinear test statistics, neural networks 54
4.4.3 Selection of input variables 56
4.5 Goodness-of-fit tests 57
4.6 The significance of an observed signal 59
4.7 Pearson's chi^2 test 61
Chapter 5 General concepts of parameter estimation 64
5.1 Samples, estimators, bias 64
5.2 Estimators for mean, variance, covariance 66
Chapter 6 The method of maximum likelihood 70
6.1 ML estimators 70
6.2 Example of an ML estimator: an exponential distribution 72
6.3 Example of ML estimators: mu and sigma^2 of a Gaussian 74
6.4 Variance of ML estimators: analytic method 75
6.5 Variance of ML estimators: Monte Carlo method 76
6.6 Variance of ML estimators: the RCF bound 76
6.7 Variance of ML estimators: graphical method 78
6.8 Example of ML with two parameters 80
6.9 Extended maximum likelihood 83
6.10 Maximum likelihood with binned data 87
6.11 Testing goodness-of-fit with maximum likelihood 89
6.12 Combining measurements with maximum likelihood 92
6.13 Relationship between ML and Bayesian estimators 93
Chapter 7 The method of least squares 95
7.1 Connection with maximum likelihood 95
7.2 Linear least-squares fit 97
7.3 Least squares fit of a polynomial 98
7.4 Least squares with binned data 100
7.5 Testing goodness-of-fit with chi^2 103
7.6 Combining measurements with least squares 106
7.6.2 Determining the covariance matrix 112
Chapter 8 The method of moments 114
Chapter 9 Statistical errors, confidence intervals and limits 118
9.1 The standard deviation as statistical error 118
9.2 Classical confidence intervals (exact method) 119
An alternative (and often equivalent) method of reporting the statistical error of a measurement is with a confidence interval, which was first developed by Neyman [Ney37]. Suppose as above that one has [math]\displaystyle{ n }[/math] observations of a random variable [math]\displaystyle{ x }[/math] which can be used to evaluate an estimator [math]\displaystyle{ \hat{\theta}(x_1,...,x_n) }[/math] for a parameter [math]\displaystyle{ \theta }[/math], and that the value obtained is [math]\displaystyle{ \hat{\theta}_{obs} }[/math]. …
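As an illustration of the classical construction in the simple case of a Gaussian distributed estimator (the case treated in Section 9.3), the following is a minimal sketch, not taken from the book. The function name classical_interval and the parameter names are illustrative, and the estimator's standard deviation is assumed known.

```python
# Minimal sketch (not from the book) of the classical (Neyman) confidence
# interval for a Gaussian distributed estimator theta_hat with known
# standard deviation sigma_theta; all names here are illustrative.
from scipy.stats import norm

def classical_interval(theta_hat_obs, sigma_theta, alpha=0.1587, beta=0.1587):
    """Central interval [a, b] defined by
    P(theta_hat >= theta_hat_obs | theta = a) = alpha  and
    P(theta_hat <= theta_hat_obs | theta = b) = beta.
    For a Gaussian estimator these conditions invert to simple quantile shifts."""
    a = theta_hat_obs - sigma_theta * norm.ppf(1.0 - alpha)
    b = theta_hat_obs + sigma_theta * norm.ppf(1.0 - beta)
    return a, b

# Example: alpha = beta = 0.1587 gives the familiar +/- 1 sigma interval
# (68.3% central confidence level), here (4.0, 6.0).
print(classical_interval(theta_hat_obs=5.0, sigma_theta=1.0))
```

With alpha = beta ≈ 0.1587 this reproduces the usual central 68.3% interval [math]\displaystyle{ [\hat{\theta}_{obs} - \sigma, \hat{\theta}_{obs} + \sigma] }[/math].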
9.3 Confidence interval for a Gaussian distributed estimator 123
9.4 Confidence interval for the mean of the Poisson distribution 126
9.5 Confidence interval for correlation coefficient, transformation of parameters 128
9.6 Confidence intervals using the likelihood function or chi^2 130
9.7 Multidimensional confidence regions 132
9.8 Limits near a physical boundary 136
9.9 Upper limit on the mean of Poisson variable with background 139
Chapter 10 Characteristic functions and related examples 143
10.1 Definition and properties of the characteristic function 143
10.2 Applications of the characteristic function 144
10.3 The central limit theorem 147
10.4 Use of the characteristic function to find the p.d.f. of an estimator 149
10.4.1 Expectation value for mean lifetime and decay constant 150
10.4.2 Confidence interval for the mean of an exponential random variable 151
Chapter 11 Unfolding 153
11.1 Formulation of the unfolding problem 154
11.2 Inverting the response matrix 159
11.3 The method of correction factors 164
11.4 General strategy of regularized unfolding 165
11.5 Regularization functions 167
11.5.1 Tikhonov regularization 167
A commonly used measure of smoothness is the mean value of the square of some derivative of the true distribution. This technique was suggested independently by Phillips [Phi62] and Tikhonov [Tik63, Tik77], and is usually called Tikhonov regularization. If we consider the p.d.f. [math]\displaystyle{ f_{true}(y) }[/math] before being discretized as a histogram, then the regularization function is :[math]\displaystyle{ S[f_{true}(y)] = - … (11.41) }[/math] where the integration is over all allowed values of [math]\displaystyle{ y }[/math]. The minus sign comes from the convention taken here that we maximize [math]\displaystyle{ \phi }[/math] as defined by (11.40). That is, greater [math]\displaystyle{ S }[/math] corresponds to more smoothness. (Equivalently one can of course minimize a combination of regularization and log-likelihood functions with the opposite sign; this convention as well is often encountered in the literature.)
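As a rough illustration, and assuming the common choice of the second derivative as the smoothness measure, a discretized version of this regularization function can be sketched as below; this is not the book's code, and the function name tikhonov_S and the example histograms are hypothetical.

```python
# Minimal sketch (not the book's code) of a discretized Tikhonov regularization
# function: the true distribution is a histogram mu = (mu_1, ..., mu_M), its
# second derivative is approximated by finite differences, and S is the negative
# sum of their squares, so larger S corresponds to a smoother histogram.
import numpy as np

def tikhonov_S(mu):
    """Return S[mu] = -sum_i (mu_{i+1} - 2 mu_i + mu_{i-1})^2,
    a discrete analogue of -integral (f'')^2 dy."""
    second_diff = mu[2:] - 2.0 * mu[1:-1] + mu[:-2]
    return -np.sum(second_diff ** 2)

# Example: a smooth (linear) histogram scores S = 0, a jagged one scores lower.
smooth = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
jagged = np.array([1.0, 5.0, 1.0, 5.0, 1.0])
print(tikhonov_S(smooth), tikhonov_S(jagged))   # 0.0  -192.0
```

In a regularized unfolding fit, a term of this kind is combined with the log-likelihood, weighted by a regularization parameter, in the function [math]\displaystyle{ \phi }[/math] of (11.40) that is then maximized.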
11.5.2 Regularization functions based on entropy 169
11.5.3 Bayesian motivation for the use of entropy 170
11.5.4 Regularization function based on cross-entropy 173
11.6 Variance and bias of the estimators 173
11.7 Choice of the regularization parameter 177
11.8 Examples of unfolding 179
11.9 Numerical implementation 184