2008 ADictionaryOfStatistics COO

From GM-RKB






Cook, (Ralph) Dennis: (1944- ; b. Williston, ND) American statistician. After a BS at Northern Montana College, Cook gained his PhD in 1971 from Kansas State U and joined the faculty of Applied Statistics at U Minnesota. His research interests include *experimental design, *regression diagnostics, and dimension reduction. He was the *COPSS *Fisher Lecturer in 2005.


Cook's Statistic: See REGRESSION DIAGNOSTICS.


Cophenetic Correlation: The correlation between the distances at which a pair of observations are joined in a *dendrogram and the *dissimilarity (or similarity) values for that pair. It is a measure of how successful *cluster analysis has been in partitioning the *data.


COPSS: The Committee of Presidents of Statistical Societies (COPSS) aims to promote the common interests of the societies involved, which are the *American Statistical Association, the *Institute of Mathematical Statistics, the *Statistical Society of Canada, and the two North American branches of the *International Biometrics Society. Its most prestigious prize is the *Fisher Lectureship.


Copula: A function that relates a joint *cumulative distribution function to the distribution functions of the individual variables. If the individual distribution functions are known, but the joint distribution is unknown, then a copula can be used to suggest a suitable form for the joint distribution.

Let F be the multivariate distribution function for the random variables [math]\displaystyle{ X_1, X_2, ..., X_n }[/math], and let the cumulative distribution function of [math]\displaystyle{ X_j }[/math] be [math]\displaystyle{ F_j }[/math] (for all [math]\displaystyle{ j }[/math]). Define random variables [math]\displaystyle{ U_1, U_2, ..., U_n }[/math] by [math]\displaystyle{ U_j = F_j(X_j) }[/math] for each [math]\displaystyle{ j }[/math], so that the marginal distribution of each [math]\displaystyle{ U_j }[/math] has a continuous uniform distribution in the interval (0, 1). Assume that for each value [math]\displaystyle{ u_j }[/math] there is a unique value [math]\displaystyle{ x_j = F_j^{-1}(u_j) }[/math], and let the joint cumulative distribution function of [math]\displaystyle{ U_1, U_2, ..., U_n }[/math] be C. Then

[math]\displaystyle{ C(u_1, u_2, ..., u_n) = P(U_j \lt u_j \text{ for all } j) = F\{F_1^{-1}(u_1), F_2^{-1}(u_2), ..., F_n^{-1}(u_n)\} }[/math]

for all [math]\displaystyle{ u_1, u_2, ..., u_n }[/math] in (0, 1), since [math]\displaystyle{ U_j \lt u_j }[/math] if and only if [math]\displaystyle{ X_j \lt F_j^{-1}(u_j) }[/math]. The function C is called the copula. An equivalent equation to the above is

[math]\displaystyle{ C\{F_1(x_1), F_2(x_2), ..., F_n(x_n)\} = F(x_1, x_2, ..., x_n), }[/math]

for all [math]\displaystyle{ x_1, x_2, ..., x_n }[/math], where [math]\displaystyle{ u_j = F_j(x_j) }[/math] for each [math]\displaystyle{ j }[/math]. Sklar's theorem, formulated by Abe Sklar of the Illinois Institute of Technology and published in 1959, states that, for a given F, there is a unique C such that this equation holds. Note that it may well be that it is not possible to express the inverse functions [math]\displaystyle{ F_j^{-1} }[/math] in a simple form (an example is the multivariate normal distribution).

Assuming that the copula and the marginal distribution functions are differentiable, the corresponding result for probability density functions is that

[math]\displaystyle{ f(x_1, x_2, ..., x_n) = c\{F_1(x_1), F_2(x_2), ..., F_n(x_n)\}f_1(x_1)f_2(x_2) \cdots f_n(x_n). }[/math]

The trivial case where [math]\displaystyle{ c\{F_1(x_1), F_2(x_2), ..., F_n(x_n)\} = 1 }[/math] corresponds to the case where the [math]\displaystyle{ n }[/math] [math]\displaystyle{ X }[/math]-variables are *independent. Thus the copula encapsulates the interdependencies between the X-variables and is therefore also known as the dependence function. If [math]\displaystyle{ c(u_1, u_2, ..., u_n) }[/math] is the joint probability density function of [math]\displaystyle{ U_1, U_2, ..., U_n }[/math] then

[math]\displaystyle{ c(u_1, u_2, ..., u_n) = f(x_1, x_2, ..., x_n)/ \{f_1(x_1)f_2(x_2) ..., f_n(x_n)\} }[/math]

where [math]\displaystyle{ x_j = F_j^{-1}(u_j) }[/math] for each [math]\displaystyle{ j }[/math].
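To illustrate the defining relation numerically, here is a minimal Python sketch (not part of the dictionary) that evaluates a bivariate Gaussian copula, [math]\displaystyle{ C(u_1, u_2) = F\{F_1^{-1}(u_1), F_2^{-1}(u_2)\} }[/math], using SciPy; the correlation value 0.6 is an arbitrary assumption made for the example.

<syntaxhighlight lang="python">
from scipy.stats import norm, multivariate_normal

# Bivariate normal F with standard normal margins and an assumed correlation.
rho = 0.6
F = multivariate_normal(mean=[0.0, 0.0], cov=[[1.0, rho], [rho, 1.0]])

def gaussian_copula(u1, u2):
    """C(u1, u2) = F(F1^{-1}(u1), F2^{-1}(u2)) for the bivariate normal F."""
    x1, x2 = norm.ppf(u1), norm.ppf(u2)   # invert the marginal distribution functions
    return F.cdf([x1, x2])

# Joint probability that both uniform variables are below 0.5.
print(gaussian_copula(0.5, 0.5))   # exceeds 0.25, reflecting the positive dependence
</syntaxhighlight>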


Cornish, Edmund Alfred: (1909-73; b. Perth, Australia; d. Adelaide, Australia) Australian statistician and biometrician. Cornish studied agricultural biochemistry at U Melbourne, graduating in 1931. His first job was at an agricultural research institute, where he was confronted by the need for statistics. His earliest work was concerned with the 23-year rainfall cycle in Adelaide (related to the well-known sunspot cycle). In 1937, at his own expense, he visited Sir Ronald Fisher in England. This visit initiated fundamental work on approximations to *distributions. Subsequently he headed the CSIRO, Australia’s largest scientific and industrial research agency. He was President of the *IBS from 1970 to 1972.


Cornish-Fisher Expansion: A form of the *Edgeworth expansion, introduced by *Cornish and Sir Ronald Fisher in 1937. In its most-used inverse form, it relates the *cumulative distribution function of a normal distribution to some distribution of interest.

Denote the 100p *percentiles of the normal distribution and of the distribution of interest by [math]\displaystyle{ x_p }[/math] and [math]\displaystyle{ u_p }[/math], respectively, and let [math]\displaystyle{ k_r }[/math] be the rth *cumulant of the distribution of interest. Then, for all [math]\displaystyle{ p }[/math],

[math]\displaystyle{ x_p = u_p + \frac{1}{6}(u_p^2 - 1)k_3 + \frac{1}{24}(u_p^3 - 3u_p)k_4 - \frac{1}{36}(2u_p^3 - 5u_p)k_3^2 }[/math]
[math]\displaystyle{ + \frac{1}{120}(u_p^4 - 6u_p^2 + 3)k_5 - \frac{1}{24}(u_p^4 - 5u_p^2 + 2)k_3k_4 }[/math]
[math]\displaystyle{ + \frac{1}{324}(12u_p^4 - 53u_p^2 + 17)k_3^3 + \frac{1}{720}(u_p^5 - 10u_p^3 + 15u_p)k_6 - \cdots . }[/math]
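The leading terms of the expansion are easy to evaluate; the following minimal Python sketch (not part of the dictionary) computes the right-hand side truncated after the [math]\displaystyle{ k_3^2 }[/math] term, for an assumed value of [math]\displaystyle{ u_p }[/math] and assumed cumulants.

<syntaxhighlight lang="python">
def cornish_fisher_rhs(u, k3, k4):
    """Evaluate the expansion above, keeping only the k3, k4, and k3^2 terms."""
    return (u
            + (u**2 - 1) * k3 / 6
            + (u**3 - 3 * u) * k4 / 24
            - (2 * u**3 - 5 * u) * k3**2 / 36)

# Assumed inputs for illustration: u = 1.645 with cumulants k3 = 0.5, k4 = 0.3.
print(cornish_fisher_rhs(1.645, 0.5, 0.3))
</syntaxhighlight>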

Corrected moment: See MOMENT.


Correlated variables: *Variables that display a non-zero correlation. Correlated variables are not statistically independent.


correlation: A general term used to describe the fact that two (or more) *variables are related. Galton, in 1869, was probably the first to use the term in this way (as 'co-relation'). Usually the relation is not precise. For example, we would expect a tall person to weigh more than a short person of the same build, but there will be exceptions. Although the word 'correlation' is used loosely to describe the existence of some general relationship, it has a more specific meaning in the context of linear relations between variables (See CORRELATION COEFFICIENT).


correlation coefficient, population: A measure of the *linear dependence of one numerical random variable on another. The phrase "coefficient of correlation" was apparently originated by *Edgeworth in 1892. It is usually denoted by [math]\displaystyle{ \rho }[/math] (rho). The value of [math]\displaystyle{ \rho }[/math], which lies between -1 and 1, inclusive, is defined as the ratio of the *covariance to the square root of the product of the variances of the *marginal distributions of the individual variables:

[math]\displaystyle{ \rho = {Cov(X, Y) \over \sqrt{Var(X)Var(Y)} }. }[/math]

If the correlation coefficient between the random variables [math]\displaystyle{ X }[/math] and [math]\displaystyle{ Y }[/math] is equal to 1 or -1 then this implies that [math]\displaystyle{ Y = a + bX }[/math], where [math]\displaystyle{ a }[/math] and [math]\displaystyle{ b }[/math] are constants. If [math]\displaystyle{ b }[/math] is positive then [math]\displaystyle{ \rho = 1 }[/math], and if [math]\displaystyle{ b }[/math] is negative then [math]\displaystyle{ \rho = - 1 }[/math]. The converse statements are also true.

If [math]\displaystyle{ X }[/math] and [math]\displaystyle{ Y }[/math] are completely unrelated (i.e. are *independent) then [math]\displaystyle{ \rho = 0 }[/math]. If [math]\displaystyle{ \rho = 0 }[/math] then [math]\displaystyle{ X }[/math] and [math]\displaystyle{ Y }[/math] are said to be uncorrelated. However, [math]\displaystyle{ \rho }[/math] is concerned only with linear relationships, and the fact that [math]\displaystyle{ \rho = 0 }[/math] does not imply that [math]\displaystyle{ X }[/math] and [math]\displaystyle{ Y }[/math] are independent.


correlation coefficient, sample (Product-Moment Correlation Coefficient; Pearson's Correlation Coefficient): If the [math]\displaystyle{ n }[/math] pairs of values of random variables [math]\displaystyle{ X }[/math] and [math]\displaystyle{ Y }[/math] in a random sample are denoted by [math]\displaystyle{ (x_1, y_1), (x_2, y_2), ..., (x_n, y_n) }[/math], the sample correlation coefficient [math]\displaystyle{ r }[/math] is given by

[math]\displaystyle{ r = {S_{xy} \over \sqrt{S_{xx}S_{yy} } } }[/math]

where

[math]\displaystyle{ S_{xy} = \sum_{j=1}^{n}x_jy_j - \frac{1}{n}\bigg (\sum_{j=1}^{n}x_j \bigg ) \bigg (\sum_{j=1}^{n}y_j \bigg ), }[/math]
[math]\displaystyle{ S_{xx} = \sum_{j=1}^{n}x_j^2 - \frac{1}{n}\bigg( \sum_{j=1}^{n}x_j \bigg )^2, }[/math]

and [math]\displaystyle{ S_{yy} }[/math] is defined analogously to [math]\displaystyle{ S_{xx} }[/math]. If the sample means are denoted by [math]\displaystyle{ \bar x }[/math] and [math]\displaystyle{ \bar y }[/math], alternative definitions are

[math]\displaystyle{ S_{xy} = \sum_{j=1}^{n}x_jy_j - n\bar x\bar y, }[/math]
[math]\displaystyle{ S_{xx} = \sum_{j=1}^{n}x_j^2 - n\bar x^2 }[/math]

The coefficient [math]\displaystyle{ r }[/math] can take any value from - 1 to 1, inclusive. When increasing values of one variable are accompanied by generally increasing values of the other variable then [math]\displaystyle{ r \gt 0 }[/math] and the variables are said to display positive correlation. If [math]\displaystyle{ r \lt 0 }[/math] then the variables display negative correlation. The idea of correlation was put forward by Galton in 1869, and it was Galton who was the first to denote it by the symbol r in 1888. The formulae given here were introduced by Karl Pearson in 1896.

The sample correlation coefficient [math]\displaystyle{ r }[/math] is an estimate of the population correlation coefficient [math]\displaystyle{ \rho }[/math]. Correlation is closely linked to *linear regression. If the least squares regression lines of [math]\displaystyle{ y }[/math] on [math]\displaystyle{ x }[/math] and of [math]\displaystyle{ x }[/math] on [math]\displaystyle{ y }[/math] for the sample [math]\displaystyle{ (x_1, y_1), (x_2, y_2), ..., (x_n, y_n) }[/math] are, respectively, [math]\displaystyle{ y = a + bx }[/math] and [math]\displaystyle{ x = c + dy }[/math], then [math]\displaystyle{ r^2 = bd }[/math].

In a hypothesis test, to test for significant evidence of a linear relationship between [math]\displaystyle{ X }[/math] and [math]\displaystyle{ Y }[/math] we compare the null hypothesis that [math]\displaystyle{ \rho = 0 }[/math] with the alternative hypothesis that [math]\displaystyle{ \rho \neq 0 }[/math], rejecting the null hypothesis if [math]\displaystyle{ |r| }[/math] is too large. Tables of critical values, under the assumption of normality, are given in Appendix XIII. If this assumption cannot be made, then a rank correlation coefficient may be preferred.

See also COEFFICIENT OF DETERMINATION.

(Figure) Sample correlation coefficient. When the correlation between two variables is positive, the values of one variable generally rise as the values of the other variable rise. The correlation is negative if the values of one variable generally rise as the values of the other fall.
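The formulas above translate directly into code; here is a minimal Python sketch (not part of the dictionary) that computes [math]\displaystyle{ r }[/math] from [math]\displaystyle{ S_{xy} }[/math], [math]\displaystyle{ S_{xx} }[/math], and [math]\displaystyle{ S_{yy} }[/math] for made-up data and checks the result against NumPy's built-in product-moment calculation.

<syntaxhighlight lang="python">
import numpy as np

# Made-up paired observations for illustration.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

n = len(x)
Sxy = np.sum(x * y) - np.sum(x) * np.sum(y) / n
Sxx = np.sum(x ** 2) - np.sum(x) ** 2 / n
Syy = np.sum(y ** 2) - np.sum(y) ** 2 / n

r = Sxy / np.sqrt(Sxx * Syy)
print(r)                        # sample correlation coefficient
print(np.corrcoef(x, y)[0, 1])  # the built-in calculation gives the same value
</syntaxhighlight>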


correlation matrix: A square symmetric *matrix in which the element in row [math]\displaystyle{ j }[/math] and column [math]\displaystyle{ k }[/math] is equal to the correlation coefficient between random variables [math]\displaystyle{ X_j }[/math] and [math]\displaystyle{ X_k }[/math]. The diagonal elements are each equal to 1.


correlogram: See AUTOCORRELATION.


correspondence analysis: A technique, originating in the 1930s, that results in the display of *data from a *contingency table in a *scatter diagram that includes points representing row categories and points representing column categories. If row points are positioned near each other in the diagram then this implies that the patterns of *counts along those rows are very similar. The same applies for groups of column points. If a row point and a column point are positioned close to one another then this implies a positive *association between the two. The calculations involved resemble those for *principal components analysis.


count: A synonym for *frequency.


countable (Denumerable): A set is countable if its members can be listed as a finite or infinite sequence, [math]\displaystyle{ x_1, x_2, ... }[/math]. The rational numbers are countable, but the irrational numbers, even in a finite interval, are not.


counting process: A stochastic process, [math]\displaystyle{ X_1, X_2, ..., }[/math] in which [math]\displaystyle{ X_t }[/math] is the number of *events (for some definition of an event) that have occurred by time [math]\displaystyle{ t }[/math]. One example is a *Poisson process.


coupon-collecting distribution: See ARFWEDSON DISTRIBUTION.


covariance: The covariance of two random variables is the difference between the expected value of their product and the product of their separate expected values. For random variables [math]\displaystyle{ X }[/math] and [math]\displaystyle{ Y, }[/math]

[math]\displaystyle{ Cov(X, Y) = E(XY) - E(X) \times E(Y) }[/math]

If [math]\displaystyle{ X }[/math] and [math]\displaystyle{ Y }[/math] are *independent then [math]\displaystyle{ Cov(X, Y) = 0 }[/math]. However, if [math]\displaystyle{ Cov(X, Y) = 0 }[/math] then [math]\displaystyle{ X }[/math] and [math]\displaystyle{ Y }[/math] may not be independent. A useful result is

[math]\displaystyle{ Var(aX + bY) = a^2Var(X) + 2abCov(X, Y) + b^2Var(Y), }[/math]

where Var denotes variance, and a and b are constants. The term 'covariance' was used by Sir Ronald Fisher in 1930. See also CORRELATION.
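As a quick numerical check of the definition and of the variance identity above, here is a minimal Python sketch (not part of the dictionary) using simulated data; the coefficients and sample size are arbitrary choices.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100_000)
y = 0.5 * x + rng.normal(size=100_000)   # a variable that depends on x
a, b = 2.0, 3.0                          # arbitrary constants

# Cov(X, Y) = E(XY) - E(X)E(Y), estimated from the sample.
cov_xy = np.mean(x * y) - np.mean(x) * np.mean(y)

lhs = np.var(a * x + b * y)
rhs = a ** 2 * np.var(x) + 2 * a * b * cov_xy + b ** 2 * np.var(y)
print(cov_xy)     # close to the true covariance of 0.5
print(lhs, rhs)   # the two sides of the variance identity agree approximately
</syntaxhighlight>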


covariance matrix: Alternative name for variance-covariance matrix.


covariance models: See ANOCOVA.


covariate (Control Variable): A *variable that has an effect that is of no direct interest. The analysis of the variable of interest is made more accurate by controlling for variation in the covariate.


covariogram: See AUTOCORRELATION.


Cox, Sir David Roxbee: (1924- ; b. Birmingham, England) English statistician knighted for his services to Statistics. Cox was an undergraduate at Cambridge U and gained his doctorate at Leeds U. After employment in the Royal Aircraft Establishment and the Wool Industries Research Establishment, he joined the Statistics faculty at Cambridge U. In 1955 he moved to Birkbeck College, London, and in 1966 he was appointed Professor of Statistics at IC. He was Warden of Nuffield College, Oxford from 1989 to 1994. He was Editor of *Biometrika from 1965 to 1991. He was the *IMS *Rietz Lecturer in 1973 and *Wald Lecturer in 1990. In 1989 he was the *COPSS *Fisher Lecturer. He was President of the *Bernoulli Society in 1979 and of the *International Statistical Institute (ISI) in 1995. He was President of the *RSS in 1980, having received its *Guy Medal in Silver in 1961, and its Guy Medal in Gold in 1973. He was elected FRS in 1973 and was knighted in 1985. He was elected to membership of NAS in 1998 and to Honorary Life Membership of the *IBS in 2001. He is also an Honorary Life Member of the ISI.


Cox, Gertrude Mary: (1900-78; b. Dayton, IA; d. Durham, NC) American biometrician. Cox initially intended to be a deaconess in the Methodist Episcopal Church and did not start her studies at Iowa State College until 1927. By 1933, however, she was a faculty member specializing in *experimental design. In 1940 she became head of the new Department of Experimental Statistics in the School of Agriculture at North Carolina State College. She was a pioneer in the use of computer programs, with her staff developing many of the early *SAS algorithms. She was joint author, with William *Cochran, of the statistical classic Experimental Designs. In 1956, she was President of the *ASA. She was founding Editor of the journal *Biometrics in 1945, remaining Editor until 1955. She was President of the *IBS in 1968, having been elected an Honorary Life Member in 1964. She was elected to the NAS in 1975.


Cox-Mantel test: A non-parametric test for comparing two *survival curves, which results from the work of Sir David *Cox and Nathan *Mantel. Denote the total number of deaths in the second group by [math]\displaystyle{ D_2 }[/math], and let the ordered survival times of the combined group be [math]\displaystyle{ t_{(1)} \lt t_{(2)} \lt … \lt t_{(k)} }[/math]. The test statistic, [math]\displaystyle{ C }[/math], is given by [math]\displaystyle{ C = U / \sqrt{I} }[/math], where

[math]\displaystyle{ U = D_2 - \sum_{j=1}^{k}m_{(j)}p_{(j)} }[/math]
[math]\displaystyle{ I = \sum_{j=1}^{k} {m_{(j)}(d_{(j)} - m_{(j)}) \over d_{(j)} - 1} p_{(j)}(1 - p_{(j)} ) }[/math]

and [math]\displaystyle{ m_{(j)} }[/math] is the number of survival times equal to [math]\displaystyle{ t_{(j)} }[/math], [math]\displaystyle{ d_{(j)} }[/math] is the total number of individuals who died (or were censored) at time [math]\displaystyle{ t_{(j)} }[/math], and [math]\displaystyle{ p_{(j)} }[/math] is the proportion of these who were in the second group. If the differences between the survival curves are attributable to random variation then [math]\displaystyle{ C }[/math] has a standard normal distribution.


Cox Process (Doubly Stochastic Point Process): A *Poisson process, introduced in 1955 by Sir David *Cox, in which the mean is not constant but varies randomly in space or time.


Cox Regression Model: A *model, proposed by Sir David *Cox in 1972, for the lifetime of, for example, industrial components or medical patients. The model has the form

[math]\displaystyle{ ln\{h(t)\} = \alpha + x'\boldsymbol{\beta}, }[/math]

where [math]\displaystyle{ h(t) }[/math] is the hazard rate at time [math]\displaystyle{ t }[/math], [math]\displaystyle{ \alpha }[/math] is a *parameter, [math]\displaystyle{ \boldsymbol{\beta} }[/math] is a column *vector of slope parameters, and [math]\displaystyle{ x' }[/math] is a row vector of values of *background variables.


Cox-Snell [math]\displaystyle{ R^2 }[/math]: See COEFFICIENT OF DETERMINATION.


Cox-Snell Residuals: Residuals introduced in 1968 by Sir David *Cox and E. Joyce Snell, for assessing the validity of a *survivor function that has been proposed for a set of survival *data. The value of the survivor function depends on the time, [math]\displaystyle{ t }[/math], and on one or more *parameters, estimated by the *vector [math]\displaystyle{ \boldsymbol{\hat\theta} }[/math]. The Cox-Snell residual [math]\displaystyle{ r_j }[/math], corresponding to time [math]\displaystyle{ t_j }[/math], is given by

[math]\displaystyle{ r_j = - \ln \{S(t_j; \boldsymbol{\hat\theta}) \} }[/math]

where [math]\displaystyle{ S(t_j; \boldsymbol{\hat\theta}) }[/math] is the value of the estimated survivor function at time [math]\displaystyle{ t_j }[/math]. If the model is correct, then the residuals should have an *exponential distribution with mean 1.


[math]\displaystyle{ C_p }[/math]: See MALLOWS [math]\displaystyle{ C_p }[/math]


Cramér, Carl Harald: (1893-1985; b. Stockholm, Sweden; d. Stockholm, Sweden) Swedish mathematical statistician who spent his entire career at Stockholm U. He entered as a student in 1912 and retired as its President in 1961. His research centred on probability, risk theory, and the mathematical underpinnings of Statistics. His best known work is Mathematical Methods of Statistics, published in 1945. He was the 1953 *IMS Rietz Lecturer. He received the *Guy Medal in Gold of the *RSS in 1972, and was elected to the NAS in 1984.


Cramér-Rao Inequality; Cramér-Rao Lower Bound: See FISHER'S INFORMATION.


Cramér's V: See CHI-SQUARED TEST.


Cramér-Von Mises Test: An alternative to the Kolmogorov-Smirnov test for testing the *hypothesis that a set of *data come from a specified *continuous distribution. The *test was suggested independently by *Cramér in 1928 and *von Mises in 1931. The test statistic [math]\displaystyle{ W }[/math] (sometimes written as [math]\displaystyle{ W^2 }[/math]) is formally defined by

[math]\displaystyle{ W = n \int_{- \infty}^\infty \{F_n(x) - F_0(x) \}^2 f_0(x)dx, }[/math]

where [math]\displaystyle{ F_0(x) }[/math] is the distribution function specified by the null hypothesis, [math]\displaystyle{ F_n(x) }[/math] is the sample distribution function, and [math]\displaystyle{ f_0(x) = F_0'(x) }[/math]. In practice the statistic is calculated using

[math]\displaystyle{ W = \frac{1}{12n} + \sum_{j=1}^{n} \bigg(t_j - {2j - 1 \over 2n } \bigg)^2, }[/math]

where

[math]\displaystyle{ t_j = F_0(x_{(j)}) }[/math]

and [math]\displaystyle{ x_{(j)} }[/math] is the jth ordered observation, [math]\displaystyle{ x_{(1)} \leq x_{(2)} \leq … \leq x_{(n)} }[/math].

The test has been adapted for use with discrete random variables, for cases where *parameters have to be estimated from the data, and for comparing two samples. A modification leads to the *Anderson-Darling test.
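The practical formula above is simple to compute; here is a minimal Python sketch (not part of the dictionary) that tests made-up data against a standard normal null distribution, using SciPy's built-in routine only as a cross-check.

<syntaxhighlight lang="python">
import numpy as np
from scipy.stats import norm, cramervonmises

rng = np.random.default_rng(1)
x = np.sort(rng.normal(size=50))   # made-up sample, in ascending order

n = len(x)
t = norm.cdf(x)                    # t_j = F_0(x_(j)) under the null hypothesis
j = np.arange(1, n + 1)
W = 1 / (12 * n) + np.sum((t - (2 * j - 1) / (2 * n)) ** 2)

print(W)
print(cramervonmises(x, 'norm').statistic)   # same value from the library routine
</syntaxhighlight>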


Craps: A game played with two dice. A roll totalling 2, 3, or 12 is a loss. A total of 7 or 11 is a win. If any other total ([math]\displaystyle{ t }[/math], say) is obtained then the dice are rolled repeatedly until either another total of [math]\displaystyle{ t }[/math] is obtained (win), or a 7 is obtained (lose). The probability of a win is [math]\displaystyle{ 244 / 495 \approx 0.493 }[/math], very close to, but (for a betting person) worryingly below, a half.
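A minimal Python simulation sketch (not part of the dictionary) estimates the win probability and can be compared with the exact value 244/495.

<syntaxhighlight lang="python">
import random

def craps_win():
    """Play one game of craps; return True for a win."""
    roll = lambda: random.randint(1, 6) + random.randint(1, 6)
    t = roll()
    if t in (7, 11):
        return True
    if t in (2, 3, 12):
        return False
    while True:              # roll until the point t reappears or a 7 is thrown
        r = roll()
        if r == t:
            return True
        if r == 7:
            return False

games = 200_000
wins = sum(craps_win() for _ in range(games))
print(wins / games, 244 / 495)   # simulated estimate versus exact probability
</syntaxhighlight>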


Credibility Theory: When alternative views of the future are presented (for example, views concerning the total claims to be met by an insurance company in the next year) then there is a need to take a weighted average of these views. Credibility theory is concerned with the optimal development of the weights to be used.


Critical Path Analysis: A method of analysis aimed at scheduling a set of tasks so that they are all completed in the shortest possible overall time. The difficulty is that the various tasks take different lengths of time to complete and that each task cannot be started until certain prerequisite other tasks have been completed. See also NETWORK FLOW PROBLEM.


Critical Region (Rejection Region): The set of values of the *statistic, in a hypothesis test, which lead to rejection of the null hypothesis. The phrase was introduced by *Neyman and Egon *Pearson in 1933.


Critical Value: An end point of a critical region. In a hypothesis test, comparison of the value of a test statistic with the appropriate critical value determines the result of the test. For example, 1.96 is the critical value for a two-tailed test in the case of a normal distribution and a 5% significance level; thus if the test statistic [math]\displaystyle{ z }[/math] is such that [math]\displaystyle{ |z| \gt 1.96 }[/math] then the alternative hypothesis is accepted in preference to the null hypothesis.


Cronbach, Lee Joseph: (1916-2001; b. Fresno, CA; d. Palo Alto, CA) American psychologist. Cronbach graduated from Fresno State College in 1934, gaining his MS from UCB in 1934 and his PhD in educational psychology from U Chicago in 1940. After posts at several universities, he joined the faculty at Stanford U in 1964, where he became Professor of Education. His classic book Essentials of Psychological Testing was published in 1949, with a 5th edition in 1990. He was elected to the NAS in 1974.


Cronbach's Alpha: See RELIABILITY.


Cross-Classification: See CONTINGENCY TABLE.


Cross-Correlation: The correlation between a selected set of *data and the corresponding set displaced in time and/or space.


Crossed Design: An *experimental design in which every *level of one *variable occurs in combination with every level of every other variable. An example is a *randomized block design, in which each *treatment occurs once within each block. See also NESTED DESIGN.


Cross-Fratar Procedure: See DEMING-STEPHAN ALGORITHM.


Crossover Trial: An *experimental design in which each *experimental unit is used with each *treatment being studied. The simplest crossover trial uses two groups of experimental units (e.g. hospital patients), 1 and 2, and two treatments (e.g. medicines), A and B. The trial uses two equal-length time periods. In the first period, group 1 is assigned treatment A and group 2 is assigned treatment B. In the second period, the assignments are reversed. Balaam’s design uses four groups of patients to compare two treatments in two periods. The assignments are AA, AB, BA, and BB. However, a complication in all crossover trials, unless there is a protracted gap between the two periods, is that the treatment allocated in the first period may continue to have an effect (the carry-over effect) during the second period. More complex allocations aim to estimate these effects. An example, involving two treatments, four groups, and three time periods, is designed to help with the estimation of the carry-over effects:

[math]\displaystyle{ \begin{matrix} & \text{Group 1} & \text{Group 2} & \text{Group 3} & \text{Group 4} \\ \hline \text{Time 1} & A & A & B & B\\ \text{Time 2} & B & B & A & A\\ \text{Time 3} & A & B & A & B \end{matrix} }[/math]

The analysis of such a design is not simple; from the statistician’s viewpoint (though not the patient's) it would be preferable to minimize the carry-over effects by allowing an interval (the wash-out period) between successive treatments.


Cross-Sectional Study: A study of a human population by means of a sample that includes representatives of all sections of society. An alternative to a Longitudinal Study.


Cross-Tabulation: See CONTINGENCY TABLE.


Cross-Validation: A method of assessing the accuracy and validity of a statistical *model. The available data set is divided into two parts. Modelling of the data uses one part only. The model selected for this part is then used to predict the values in the other part of the data. A valid model should show good predictive accuracy.
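Here is a minimal Python sketch (not part of the dictionary) of the two-part scheme just described: a straight line is fitted to one half of some made-up data and its predictive accuracy is measured on the other half.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(0, 10, size=100)
y = 3.0 + 2.0 * x + rng.normal(scale=1.0, size=100)   # made-up data

# Divide the data set into a fitting part and a validation part.
idx = rng.permutation(100)
fit_idx, val_idx = idx[:50], idx[50:]

# Model the fitting part only (least squares straight line).
slope, intercept = np.polyfit(x[fit_idx], y[fit_idx], 1)

# Use the selected model to predict the values in the other part of the data.
pred = intercept + slope * x[val_idx]
rmse = np.sqrt(np.mean((y[val_idx] - pred) ** 2))
print(rmse)   # close to the noise standard deviation of 1 for a valid model
</syntaxhighlight>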


Cube Law: An empirical law relating to the outcome of two-party multiple constituency elections. It states that if the votes gained by the parties are in the ratio [math]\displaystyle{ p }[/math] to [math]\displaystyle{ (1 - p) }[/math], then the numbers of constituencies won by the parties will be in the ratio [math]\displaystyle{ p^3 }[/math] to [math]\displaystyle{ (1 - p)^3 }[/math]. A slight majority of votes (e.g. [math]\displaystyle{ p }[/math] = 55%) leads to a much larger imbalance in constituencies won, since [math]\displaystyle{ 0.55^3/(0.55^3 + 0.45^3) \approx 65 \text{%} }[/math]. In the United Kingdom the law worked well for many years, though more recently the imbalance has been less extreme.


Cubic Regression Model: See MULTIPLE REGRESSION MODEL.


Cumulant-Generating Function (CGF): See CUMULANT.


Cumulant: An alternative to a moment as part of a summary of the form of a *distribution. If the moment-generating function of a distribution exists, then its natural logarithm also exists and is called the cumulant-generating function (cgf). The coefficient of [math]\displaystyle{ t^r/r! }[/math] in the *Taylor expansion of the cgf is called the rth cumulant and is denoted by [math]\displaystyle{ \kappa_r }[/math] (where [math]\displaystyle{ \kappa }[/math] is kappa). The cumulants can be expressed in terms of the central moments, and vice versa. In particular, denoting the mean by [math]\displaystyle{ \mu }[/math], and the central moments by [math]\displaystyle{ \mu_2, \mu_3, ..., }[/math]

[math]\displaystyle{ \kappa_1 = \mu }[/math]
[math]\displaystyle{ \kappa_2 = \mu_2 }[/math]
[math]\displaystyle{ \kappa_3 = \mu_3 }[/math]
[math]\displaystyle{ \kappa_4 = \mu_4 - 3\mu_2^2 }[/math]
[math]\displaystyle{ \kappa_5 = \mu_5 - 10\mu_3\mu_2 }[/math]

For an example of the application of cumulants, see CORNISH-FISHER EXPANSION.
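As a numerical illustration of these relations, here is a minimal Python sketch (not part of the dictionary) that converts the central moments of the exponential distribution with mean 1 into its cumulants, which are known to be [math]\displaystyle{ \kappa_r = (r-1)! }[/math].

<syntaxhighlight lang="python">
# Mean and central moments of the exponential distribution with mean 1.
mu = 1.0
mu2, mu3, mu4, mu5 = 1.0, 2.0, 9.0, 44.0

k1 = mu
k2 = mu2
k3 = mu3
k4 = mu4 - 3 * mu2 ** 2
k5 = mu5 - 10 * mu3 * mu2

print(k1, k2, k3, k4, k5)   # 1, 1, 2, 6, 24, i.e. (r - 1)! as expected
</syntaxhighlight>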


Cumulative Distribution Function (CDF; Distribution Function): The function F, for a random variable X, defined for all real values of x by

[math]\displaystyle{ F(x) = P(X \leq x). }[/math]

Clearly, [math]\displaystyle{ F(-\infty) = 0 }[/math] and [math]\displaystyle{ F(\infty) = 1 }[/math], where [math]\displaystyle{ F(-\infty) }[/math] and [math]\displaystyle{ F(\infty) }[/math] are the limits of [math]\displaystyle{ F(x) }[/math] as x tends to [math]\displaystyle{ -\infty }[/math] and [math]\displaystyle{ \infty }[/math], respectively. This function is a non-decreasing function such that if [math]\displaystyle{ x_2 \gt x_1 }[/math] then [math]\displaystyle{ F(x_2) \geq F(x_1) }[/math]. If [math]\displaystyle{ X }[/math] is a continuous random variable then [math]\displaystyle{ F }[/math] is a continuous function, and conversely. If [math]\displaystyle{ X }[/math] has probability density function [math]\displaystyle{ f }[/math] then

[math]\displaystyle{ F(x) = \int_{-\infty}^{x} f(t)dt }[/math]

and [math]\displaystyle{ f(x) = F'(x), }[/math] where [math]\displaystyle{ F'(x) }[/math] denotes the derivative of [math]\displaystyle{ F(x). }[/math]

A useful property of F is that, for any value of [math]\displaystyle{ x }[/math] there is a corresponding value [math]\displaystyle{ u, 0 \leq u \leq 1 }[/math], such that

[math]\displaystyle{ u = F(x) }[/math].

In the case where F is a continuous and increasing function for [math]\displaystyle{ a\leq x \leq b }[/math], the random variable U defined by [math]\displaystyle{ U = F(X) }[/math] has a continuous uniform distribution on the interval [math]\displaystyle{ 0 \leq u \leq 1 }[/math], and, for a given value of U, the corresponding value of [math]\displaystyle{ X }[/math] is given by [math]\displaystyle{ F^{-1}(U) }[/math]. See SIMULATION.
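This property is the basis of inverse-transform simulation; here is a minimal Python sketch (not part of the dictionary) generating values from the exponential distribution with mean 1, for which [math]\displaystyle{ F(x) = 1 - e^{-x} }[/math] and hence [math]\displaystyle{ F^{-1}(u) = -\ln(1-u) }[/math].

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(3)
u = rng.uniform(size=100_000)   # U has a continuous uniform distribution on (0, 1)

x = -np.log(1.0 - u)            # X = F^{-1}(U) for the exponential distribution

print(x.mean(), x.var())        # both close to 1, as required for this distribution
</syntaxhighlight>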

In the case of a discrete random variable, the distribution function is a step function, in which the step at [math]\displaystyle{ x_j }[/math] is [math]\displaystyle{ P(X = x_j) }[/math], and [math]\displaystyle{ F(x) \rightarrow F(x_j) }[/math] as [math]\displaystyle{ x \rightarrow x_j }[/math] from above, but [math]\displaystyle{ F(x) \rightarrow F(x_{j-1}) }[/math] as [math]\displaystyle{ x \rightarrow x_j }[/math] from below.


Cumulative Frequency: For a sample of numerical data the cumulative frequency corresponding to a number x is the total number of observations that are [math]\displaystyle{ \leq x }[/math].


Cumulative Frequency Polygon: A diagram representing grouped numerical data in which *cumulative frequency is plotted against upper class boundary, and the resulting points are joined by straight line segments to form a polygon. The polygon starts at the point on the x-axis corresponding to the lower class boundary of the lowest class.

(Figure) Cumulative frequency polygon. The diagram refers to the distances between the breeding colony and the point of recovery for a group of razorbills. The outline is typical, with slow increases at the left and right ends of the polygon indicating the scarcity of the corresponding values.


Cumulative Odds Ratio: A function used in models for *ordinal variables. Let [math]\displaystyle{ j }[/math] denote a possible value of the ordinal variable [math]\displaystyle{ Y }[/math], and let [math]\displaystyle{ E_1 }[/math] and [math]\displaystyle{ E_2 }[/math] be two possible events. The ratio [math]\displaystyle{ R_j }[/math], given by

[math]\displaystyle{ R_j = {P(Y \leq j|E_1) \over P(Y \gt j| E_1)}\bigg /{P(Y \leq j|E_2) \over P(Y \gt j| E_2)}, }[/math]

is called a cumulative *odds ratio. If [math]\displaystyle{ E_1 }[/math] and [math]\displaystyle{ E_2 }[/math] represent different values for a *vector of explanatory variables, the model [math]\displaystyle{ R_j = R }[/math], where [math]\displaystyle{ R }[/math] is a constant, is a proportional-odds model.
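Here is a minimal Python sketch (not part of the dictionary) computing [math]\displaystyle{ R_j }[/math] from two made-up sets of category probabilities for an ordinal variable with four levels.

<syntaxhighlight lang="python">
import numpy as np

# Made-up probabilities P(Y = j | E) for j = 1, ..., 4 under the two events.
p_e1 = np.array([0.40, 0.30, 0.20, 0.10])
p_e2 = np.array([0.20, 0.30, 0.30, 0.20])

def cumulative_odds_ratio(j):
    """R_j = odds of Y <= j under E1 divided by the same odds under E2."""
    c1, c2 = p_e1[:j].sum(), p_e2[:j].sum()   # P(Y <= j) under each event
    return (c1 / (1 - c1)) / (c2 / (1 - c2))

for j in (1, 2, 3):
    print(j, cumulative_odds_ratio(j))
</syntaxhighlight>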


Cumulative Probability: For a discrete random variable [math]\displaystyle{ X }[/math] that can take only the ordered values [math]\displaystyle{ x_{(1)} \lt x_{(2)} \lt ..., }[/math] the cumulative probability of a value less than or equal to [math]\displaystyle{ x_{(k)} }[/math] is

[math]\displaystyle{ \sum_{j=1}^{k} P(X = x_{(j)}). }[/math]

Cumulative Probability Function: An alternative term for the *cumulative distribution function in the case of a discrete random variable.


Cumulative Relative Frequency: *Cumulative frequency divided by total sample size.


Cumulative Relative Frequency Diagram (Cumulative Relative Frequency Graph): A *cumulative frequency polygon or a *step diagram in which the vertical axis has been scaled by dividing by the sample size so that the maximum *ordinate value is 1.


Cumulative Sum Chart: See QUALITY CONTROL.


Cup: A name for the [math]\displaystyle{ \cup }[/math] symbol denoting *union.


Curse Of Dimensionality: An expression introduced by Richard *Bellman in 1961 that describes the difficulty of obtaining accurate *estimates when there are many *parameters to be estimated simultaneously.


Curvilinear Regression Model: See MULTIPLE REGRESSION MODEL.


Curvilinear Relation: A relation between two *variables that is not linear but appears as a curve when the relation is graphed.


Cusum Chart: See QUALITY CONTROL.


Cut: See NETWORK FLOW PROBLEM.


Cut Vector: See RELIABILITY THEORY.


Cuzick Trend Test: Suppose that subject [math]\displaystyle{ j }[/math] (belonging to a particular group) has a value that is ranked [math]\displaystyle{ r_j }[/math] amongst the [math]\displaystyle{ n }[/math] values being considered. The [math]\displaystyle{ G }[/math] groups are supposed to be ordered (for example, the classes in a school) and a score is awarded to each according to its position in that order. Typically these are the scores 1, 2, ..., G. The test statistic is the sum, over all subjects, of the product of the rank and the group score.
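Here is a minimal Python sketch (not part of the dictionary) computing the test statistic from made-up data in three ordered groups; the standardization required to obtain a p-value is omitted.

<syntaxhighlight lang="python">
import numpy as np
from scipy.stats import rankdata

# Made-up observations and the scores of their ordered groups (1, 2, 3).
values = np.array([3.1, 2.4, 4.0, 5.2, 4.8, 6.1, 7.0, 6.5])
scores = np.array([1, 1, 1, 2, 2, 2, 3, 3])

ranks = rankdata(values)     # rank of each value among all n observations
T = np.sum(ranks * scores)   # sum over subjects of rank times group score
print(T)
</syntaxhighlight>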


Cycle: A repeating pattern in a time series; examples are annual and daily patterns.


Cyclic Data: *Data consisting of directions or times in which the measurement scale is cyclic (after 23.59 comes 00.00, after 359° comes 0°, after 31 December comes 1 January). Special techniques are required for summarizing and modelling all types of cyclic data. For example, the *histogram is replaced by the *circular histogram or the *rose diagram, and the principal distribution used to model the data is not the normal distribution but the *von Mises distribution.

For observations [math]\displaystyle{ \theta_1, \theta_2, ..., \theta_n, }[/math] the variability of the data is measured by the concentration, [math]\displaystyle{ \bar R }[/math], defined by

[math]\displaystyle{ \bar R = \frac{1}{n} \sqrt{\bigg(\sum_{j=1}^{n}\sin\theta_j \bigg)^2 + \bigg(\sum_{j=1}^{n}\cos\theta_j \bigg)^2}. }[/math]

Thus [math]\displaystyle{ n\bar R = R }[/math], where R is the length of the resultant vector.

(Figure) Cyclic data. Each data item is represented by a unit vector. The diagram shows the first two such vectors, and also the last two. The resultant vector, with length [math]\displaystyle{ n\bar R }[/math], connects the start of the first unit vector to the end of the last unit vector. The direction of the resultant vector is [math]\displaystyle{ \bar\theta }[/math].

The circular mean, [math]\displaystyle{ \bar \theta }[/math], is defined only when [math]\displaystyle{ \bar R \neq 0 }[/math] and is then the angle [math]\displaystyle{ 0° \lt \bar \theta \lt 360° }[/math] such that

[math]\displaystyle{ n \bar R \sin (\bar\theta) = \sum_{j=1}^{n}\sin\theta_j, \qquad n \bar R \cos (\bar\theta) = \sum_{j=1}^{n}\cos\theta_j. }[/math]

One way of representing cyclic data is to regard each observation as a move of length 1 unit in the stated direction. The complete sequence of such moves, taken in any order, will end at a finishing point that is a distance [math]\displaystyle{ n\bar R }[/math] from the start. The direction of this finishing point from the start will be the angle [math]\displaystyle{ \bar\theta }[/math].
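Here is a minimal Python sketch (not part of the dictionary) computing the concentration and the circular mean for made-up angles measured in degrees, using the arctan2 function to place the mean direction in the correct quadrant.

<syntaxhighlight lang="python">
import numpy as np

theta = np.deg2rad([350.0, 10.0, 20.0, 340.0, 5.0])   # made-up cyclic data

S, C = np.sin(theta).sum(), np.cos(theta).sum()
R_bar = np.sqrt(S ** 2 + C ** 2) / len(theta)          # concentration R-bar
theta_bar = np.degrees(np.arctan2(S, C)) % 360         # circular mean in [0, 360)

print(R_bar, theta_bar)   # mean direction near 0 degrees, not the arithmetic mean of 145
</syntaxhighlight>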


Cyclic Permutation: A rearrangement of an ordered list in which items from the end of the list are successively moved to the start. For example, the cyclic permutations of the letters UPTON are NUPTO, ONUPT, TONUP, and PTONU, but neither PUTON nor NOTUP.
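Here is a minimal Python sketch (not part of the dictionary) that generates the cyclic permutations of a word, reproducing the UPTON example above.

<syntaxhighlight lang="python">
def cyclic_permutations(word):
    """All rearrangements obtained by repeatedly moving the last item to the front."""
    perms = []
    current = word
    for _ in range(len(word) - 1):
        current = current[-1] + current[:-1]
        perms.append(current)
    return perms

print(cyclic_permutations("UPTON"))   # ['NUPTO', 'ONUPT', 'TONUP', 'PTONU']
</syntaxhighlight>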
