Principal components analysis

In statistics, principal components analysis (PCA) is a transform used for reducing the dimensionality of a dataset while retaining the characteristics that contribute most to its variance. In signal processing it is called the (discrete) Karhunen-Loève transform. It is also called the Hotelling transform.

The first principal component $\mathbf{w}_1$ of a dataset $\mathbf{x}$, assumed to have zero mean, can be defined as

$\mathbf{w}_1  = \arg\max_{\Vert \mathbf{w} \Vert = 1} E\left\{ \left( \mathbf{w}^T \mathbf{x}\right)^2 \right\}$

Given the first $k - 1$ components, the $k$-th component can be found by subtracting the first $k - 1$ principal components from $\mathbf{x}$:
$\mathbf{\hat{x}}_{k - 1}  = \mathbf{x} - \sum_{i = 1}^{k - 1} \mathbf{w}_i \mathbf{w}_i^T \mathbf{x}$

and then finding the first principal component of this new dataset:
$\mathbf{w}_k  = \arg\max_{\Vert \mathbf{w} \Vert = 1} E\left\{ \left( \mathbf{w}^T \mathbf{\hat{x}}_{k - 1} \right)^2 \right\}.$
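
The deflation procedure above can be sketched in Python with NumPy. This is a minimal illustration, not a reference implementation: the function names `first_component` and `pca_by_deflation` are hypothetical, the data matrix `X` is assumed to hold one observation per row, and power iteration is swapped in as one possible way to approximate the arg max, which the definition itself does not prescribe.

```python
import numpy as np


def first_component(X, n_iter=500, seed=0):
    """Approximate argmax_{||w||=1} E{(w^T x)^2} by power iteration.

    X holds one observation per row and is assumed to be zero-mean.
    Power iteration is an assumption of this sketch, not part of the
    definition of the principal component.
    """
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(X.shape[1])
    w /= np.linalg.norm(w)
    for _ in range(n_iter):
        # X.T @ (X @ w) is proportional to E{x x^T} w for zero-mean data.
        v = X.T @ (X @ w)
        norm = np.linalg.norm(v)
        if norm == 0.0:
            break  # no variance left in the data
        w = v / norm
    return w


def pca_by_deflation(X, k):
    """Return the first k principal components as the rows of an array."""
    X_hat = X - X.mean(axis=0)  # enforce the zero-mean assumption
    components = []
    for _ in range(k):
        w = first_component(X_hat)
        components.append(w)
        # Deflate: x_hat <- x_hat - w w^T x_hat, applied to every row.
        # For orthonormal components this equals subtracting all
        # components found so far from the original x.
        X_hat = X_hat - np.outer(X_hat @ w, w)
    return np.array(components)
```

Calling `pca_by_deflation(X, 2)` on an array `X` of shape `(n, d)` returns an array of shape `(2, d)` whose rows are unit vectors. The sign of each component is arbitrary, since $\mathbf{w}$ and $-\mathbf{w}$ give the same value of the objective.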


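A simpler way to calculate the components $\mathbf{w}_i$ uses the covariance matrix of $\mathbf{x}$, the measurement vector. For zero-mean $\mathbf{x}$, $E\left\{ \left( \mathbf{w}^T \mathbf{x} \right)^2 \right\} = \mathbf{w}^T \mathbf{C} \mathbf{w}$, where $\mathbf{C} = E\left\{ \mathbf{x} \mathbf{x}^T \right\}$ is the covariance matrix, so the maximizing unit vectors are eigenvectors of $\mathbf{C}$. The eigenvectors with the largest eigenvalues correspond to the directions along which the dataset has the greatest variance. Finally, the original measurements are projected onto the reduced vector space spanned by those eigenvectors.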
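
The covariance-based calculation can be sketched as follows, again in Python with NumPy. The function name `pca_eig` and the one-observation-per-row layout of `X` are assumptions made for illustration. `np.linalg.eigh` is used because the covariance matrix is symmetric; it returns eigenvalues in ascending order, so they are reordered here.

```python
import numpy as np


def pca_eig(X, k):
    """PCA via eigendecomposition of the sample covariance matrix.

    X holds one observation per row. Returns a tuple of
    (W, eigenvalues, Z): the top-k eigenvectors as the columns of W,
    their eigenvalues in descending order, and the projected data Z.
    """
    Xc = X - X.mean(axis=0)               # center each measured variable
    C = np.cov(Xc, rowvar=False)          # sample covariance, shape (d, d)
    eigvals, eigvecs = np.linalg.eigh(C)  # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:k] # indices of the k largest
    W = eigvecs[:, order]
    Z = Xc @ W                            # project onto the reduced space
    return W, eigvals[order], Z
```

With this sketch, the sample variance of each column of `Z` equals the corresponding eigenvalue, which is one way to check that the projection keeps the directions of greatest variance.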

Closely related is the method of empirical orthogonal functions (EOF).

Another method of dimension reduction is a self-organizing map.

All Wikipedia text is available under the terms of the GNU Free Documentation License
