Gauss-Markov theorem

This article is not about Gauss-Markov processes.

In statistics, the Gauss-Markov theorem states that in a linear model in which the errors have expectation zero, are uncorrelated, and have equal variances, the best linear unbiased estimators of the coefficients are the least-squares estimators. More generally, the best linear unbiased estimator of any linear combination of the coefficients is its least-squares estimator. The errors are not assumed to be normally distributed, nor are they assumed to be independent (only uncorrelated, a weaker condition), nor are they assumed to be identically distributed (only homoscedastic, a weaker condition, defined below).

More explicitly, and more concretely, suppose we have

<math>Y_i=\beta_0+\beta_1 x_i+\varepsilon_i</math>

for i = 1, ..., n, where β₀ and β₁ are non-random but unobservable parameters, x_i are non-random and observable, ε_i are random, and so Y_i are random. (We set x in lower-case because it is not random, and Y in capital because it is random.) The random variables ε_i are called the "errors". The Gauss-Markov assumptions state that

<math>{\rm E}\left(\varepsilon_i\right)=0,</math>
<math>{\rm var}\left(\varepsilon_i\right)=\sigma^2<\infty,</math>

(i.e., all errors have the same variance; that is "homoscedasticity"), and

<math>{\rm cov}\left(\varepsilon_i,\varepsilon_j\right)=0</math>

for <math>i\not=j</math>; that is "uncorrelatedness." A linear unbiased estimator of β₁ is a linear combination

<math>c_1Y_1+\cdots+c_nY_n</math>

in which the coefficients c_i are not allowed to depend on the coefficients β₀ and β₁, since those are not observable, but are allowed to depend on the x_i, since those are observable, and whose expected value remains β₁ even if the values of β₀ and β₁ change. (The dependence of the coefficients on the x_i is typically nonlinear; the estimator is linear in that which is random, namely the Y_i; that is why this is "linear" regression.) The mean squared error of such an estimator is

<math>{\rm E}\left((c_1Y_1+\cdots+c_nY_n-\beta_1)^2\right),</math>

i.e., it is the expectation of the square of the difference between the estimator and the parameter to be estimated. (The mean squared error of an estimator coincides with the estimator's variance if the estimator is unbiased; for biased estimators the mean squared error is the sum of the variance and the square of the bias.) The best linear unbiased estimator is the one with the smallest mean squared error. The "least-squares estimators" of β₀ and β₁ are the functions <math>\widehat{\beta}_0</math> and <math>\widehat{\beta}_1</math> of the Ys and the xs that make the sum of squares of residuals

<math>\sum_{i=1}^n\left(Y_i-\widehat{Y}_i\right)^2=\sum_{i=1}^n\left(Y_i-\left(\widehat{\beta}_0+\widehat{\beta}_1 x_i\right)\right)^2</math>

as small as possible.
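As a rough numerical illustration, the following Python sketch writes the least-squares slope as a linear combination of the Y_i and compares its variance with that of one other linear unbiased estimator of β₁. The competitor (Y_n − Y_1)/(x_n − x_1), the design points, the parameter values, and all function names are assumptions made for this example only. The errors are drawn from a uniform distribution to emphasize that normality is not required.

```python
import random


def least_squares_weights(x):
    """Weights w_i such that sum(w_i * Y_i) is the least-squares slope."""
    mean_x = sum(x) / len(x)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return [(xi - mean_x) / sxx for xi in x]


def endpoint_weights(x):
    """Weights of the illustrative competitor (Y_n - Y_1) / (x_n - x_1)."""
    span = x[-1] - x[0]
    return [-1.0 / span] + [0.0] * (len(x) - 2) + [1.0 / span]


def is_unbiased_for_slope(c, x, tol=1e-9):
    """E(sum c_i Y_i) = beta_0 * sum(c_i) + beta_1 * sum(c_i * x_i), so the
    estimator is unbiased for beta_1 for all parameter values exactly when
    sum(c_i) = 0 and sum(c_i * x_i) = 1."""
    total = sum(c)
    weighted = sum(ci * xi for ci, xi in zip(c, x))
    return abs(total) < tol and abs(weighted - 1.0) < tol


def estimator_variance(c, sigma2):
    """Variance of sum(c_i * Y_i) for uncorrelated errors with variance sigma2."""
    return sigma2 * sum(ci ** 2 for ci in c)


def least_squares_slope(x, y):
    """Slope that minimizes the sum of squared residuals (closed form)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


# Hypothetical design points and error variance, chosen for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
sigma2 = 1.0

c_ls = least_squares_weights(x)
c_end = endpoint_weights(x)
var_ls = estimator_variance(c_ls, sigma2)
var_end = estimator_variance(c_end, sigma2)

# Simulated data with hypothetical beta_0 = 2, beta_1 = 0.5 and
# zero-mean, non-normal (uniform) errors.
rng = random.Random(0)
y = [2.0 + 0.5 * xi + rng.uniform(-1.0, 1.0) for xi in x]
slope_from_weights = sum(ci * yi for ci, yi in zip(c_ls, y))

print(is_unbiased_for_slope(c_ls, x), is_unbiased_for_slope(c_end, x))
print(var_ls, var_end)
print(slope_from_weights, least_squares_slope(x, y))
```

With these design points both sets of weights satisfy the unbiasedness conditions, and the variances come out at about 0.1 for least squares and 0.125 for the endpoint estimator, consistent with the theorem. A single comparison of this kind is only an illustration, not a proof.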

The main idea of the proof is that the least-squares estimators are uncorrelated with every linear unbiased estimator of zero, i.e., with every linear combination

<math>a_1Y_1+\cdots+a_nY_n</math>

whose coefficients do not depend upon the unobservable β₀ and β₁ but whose expected value remains zero regardless of how the values of β₀ and β₁ change.
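
For the slope in the simple model above, this idea can be sketched as follows. The symbols w_i, <math>\bar{x}</math> and <math>S_{xx}</math> are notation introduced here for illustration and do not appear elsewhere in this article. Write the least-squares estimator of the slope as

<math>\widehat{\beta}_1=\sum_{i=1}^n w_iY_i,\qquad w_i=\frac{x_i-\bar{x}}{S_{xx}},\qquad \bar{x}=\frac{1}{n}\sum_{i=1}^n x_i,\qquad S_{xx}=\sum_{i=1}^n\left(x_i-\bar{x}\right)^2.</math>

Any linear estimator <math>c_1Y_1+\cdots+c_nY_n</math> that is unbiased for the slope must satisfy

<math>\sum_{i=1}^n c_i=0,\qquad \sum_{i=1}^n c_ix_i=1,</math>

and the weights w_i satisfy the same two conditions. Setting <math>a_i=c_i-w_i</math> therefore gives a linear unbiased estimator of zero:

<math>\sum_{i=1}^n a_i=0,\qquad \sum_{i=1}^n a_ix_i=0.</math>

Under the Gauss-Markov assumptions, the covariance between the least-squares estimator and this estimator of zero is

<math>{\rm cov}\left(\sum_{i=1}^n w_iY_i,\ \sum_{i=1}^n a_iY_i\right)=\sigma^2\sum_{i=1}^n w_ia_i=\frac{\sigma^2}{S_{xx}}\left(\sum_{i=1}^n a_ix_i-\bar{x}\sum_{i=1}^n a_i\right)=0,</math>

so that

<math>{\rm var}\left(\sum_{i=1}^n c_iY_i\right)=\sigma^2\sum_{i=1}^n\left(w_i+a_i\right)^2=\sigma^2\sum_{i=1}^n w_i^2+\sigma^2\sum_{i=1}^n a_i^2\ \geq\ \sigma^2\sum_{i=1}^n w_i^2={\rm var}\left(\widehat{\beta}_1\right),</math>

with equality only when every a_i is zero, that is, only when the competing estimator is the least-squares estimator itself.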

See also linear regression.

All Wikipedia text is available under the terms of the GNU Free Documentation License
