Principal Component Analysis

Sample Covariance Matrix

Convert the matrix a to mean deviation form and calculate its sample covariance matrix.

(See Lay, Section 7.5, pp. 489-491.)

In[1]:=

Write a procedure to calculate the sample mean of a matrix and use it to calculate the sample mean of a.

In[4]:=

In[5]:=

Out[6]//MatrixForm=
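The contents of the input cells are not preserved here, so the following is only a minimal sketch of what such a procedure might look like. It assumes, as in Lay, that each column of the matrix is one observation. The name sampleMean and the 2 x 4 matrix a below are hypothetical stand-ins, not the original notebook's definitions or data.

```mathematica
(* Hypothetical 2 x 4 data matrix: each column is one observation. *)
(* This is NOT the matrix used in the original notebook. *)
a = {{1, 3, 5, 7}, {4, 2, 8, 6}};

(* Sample mean = average of the columns, i.e. the mean of each row. *)
sampleMean[m_?MatrixQ] := Mean /@ m

sampleMean[a] // MatrixForm   (* column vector with entries 4 and 5 *)
```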

Write a procedure to calculate the mean deviation form of a matrix and use it to calculate the mean deviation form of a.

In[7]:=

In[8]:=

In[9]:=

Out[10]//MatrixForm=
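A possible sketch of this step, under the same assumptions (columns are observations; the name meanDeviationForm and the matrix a are hypothetical): subtract the sample mean from every column.

```mathematica
(* Hypothetical data, not the original notebook's matrix. *)
a = {{1, 3, 5, 7}, {4, 2, 8, 6}};

(* Subtracting the row means threads over the rows, so each column *)
(* has the sample mean vector removed from it. *)
meanDeviationForm[m_?MatrixQ] := m - Mean /@ m

b = meanDeviationForm[a];
b // MatrixForm   (* rows {-3, -1, 1, 3} and {-1, -3, 3, 1} *)
```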

Check: b should have zero sample mean.

In[12]:=

Out[13]//MatrixForm=
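One way to run this check, sketched on a hypothetical mean-deviation matrix b (not the original data): every row of b should average to zero.

```mathematica
(* Hypothetical matrix already in mean deviation form. *)
b = {{-3, -1, 1, 3}, {-1, -3, 3, 1}};

Mean /@ b // MatrixForm   (* column vector of zeros *)
```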

Write a procedure to calculate the sample covariance matrix of a given matrix and use it to calculate the sample covariance matrix of a.

In[14]:=

In[15]:=

Out[16]//MatrixForm=
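A sketch of such a procedure, assuming the definition in Lay: if b is the mean deviation form of a p x N matrix of observations, the sample covariance matrix is b . Transpose[b] / (N - 1). The name sampleCovariance and the data are hypothetical.

```mathematica
(* Hypothetical data, not the original notebook's matrix. *)
a = {{1, 3, 5, 7}, {4, 2, 8, 6}};

sampleCovariance[m_?MatrixQ] := Module[
  {b = m - Mean /@ m,            (* mean deviation form *)
   n = Last[Dimensions[m]]},     (* number of observations (columns) *)
  (b . Transpose[b])/(n - 1)]

s = sampleCovariance[a];
s // MatrixForm                  (* rows {20/3, 4} and {4, 20/3} *)

(* Cross-check: the built-in Covariance expects observations in rows. *)
s == Covariance[Transpose[a]]    (* True *)
```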

Principal Components

Bring forward the sample covariance matrix from the previous example.

In[17]:=

Out[17]//MatrixForm=

We seek to diagonalize the matrix s.

Calculate the eigendata for s.

In[18]:=

Out[18]=
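As an illustration only, the eigendata of a hypothetical covariance matrix s (not the one from the original notebook) can be obtained with Eigensystem, which returns a list of eigenvalues followed by a list of corresponding eigenvectors.

```mathematica
(* Hypothetical sample covariance matrix. *)
s = {{20/3, 4}, {4, 20/3}};

(* vals[[i]] is an eigenvalue; vecs[[i]] is a matching eigenvector (a row). *)
{vals, vecs} = Eigensystem[s];

(* Verify s . v == lambda v for each pair. *)
And @@ Table[s . vecs[[i]] == vals[[i]] vecs[[i]], {i, Length[vals]}]   (* True *)
```

For exact input the eigenvectors are not normalized, so they still need to be scaled to unit length before they can form an orthogonal matrix.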

Construct d, the diagonal matrix of eigenvalues in decreasing order.

In[19]:=

Out[20]//MatrixForm=
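A minimal sketch of this construction on a hypothetical s: sort the eigenvalues explicitly rather than relying on the order in which they are returned.

```mathematica
(* Hypothetical sample covariance matrix. *)
s = {{20/3, 4}, {4, 20/3}};

(* Eigenvalues in decreasing order, placed on the diagonal. *)
d = DiagonalMatrix[Sort[Eigenvalues[s], Greater]];
d // MatrixForm   (* diagonal entries 32/3 and 8/3 *)
```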

Construct the orthogonal matrix p, whose columns are normalized (unit) eigenvectors of s, listed in the same order as the eigenvalues in d.

In[21]:=

Out[22]//MatrixForm=
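One way this might be done, sketched on a hypothetical s: reorder the eigenvectors to match the decreasing eigenvalues, normalize each one, and place them as columns.

```mathematica
(* Hypothetical sample covariance matrix. *)
s = {{20/3, 4}, {4, 20/3}};

{vals, vecs} = Eigensystem[s];
order = Reverse[Ordering[vals]];              (* positions, largest eigenvalue first *)

(* Unit eigenvectors as the columns of p. *)
p = Transpose[Normalize /@ vecs[[order]]];
p // MatrixForm

(* The columns are orthonormal, so p . Transpose[p] is the identity. *)
Simplify[p . Transpose[p]] == IdentityMatrix[2]   (* True *)
```

The sign of each eigenvector is arbitrary, so individual columns of p may differ by a factor of -1 from a hand computation.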

Check the diagonalization: Transpose[p] . s . p should equal d.

In[23]:=

Out[23]=
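A sketch of the check on a hypothetical s, rebuilding d and p so the snippet runs on its own:

```mathematica
(* Hypothetical sample covariance matrix. *)
s = {{20/3, 4}, {4, 20/3}};

{vals, vecs} = Eigensystem[s];
order = Reverse[Ordering[vals]];
d = DiagonalMatrix[vals[[order]]];
p = Transpose[Normalize /@ vecs[[order]]];

(* Because p is orthogonal, its inverse is its transpose. *)
Simplify[Transpose[p] . s . p] == d   (* True *)
```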

The columns of p are the principal components of the data.

In[24]:=

In[26]:=

In[27]:=

Out[27]=

Use the principal components to define and relate the variables X and Y.

X represents the original data. Y represents the transformed data.

The change-of-basis matrix p is orthogonal, so its inverse is its transpose: X = p . Y and Y = Transpose[p] . X.

In[30]:=

Out[33]//MatrixForm=
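The change of variable can be illustrated on a hypothetical mean-deviation matrix b (not the original data), transforming every observation at once and then mapping back.

```mathematica
(* Hypothetical data in mean deviation form; columns are observations. *)
b = {{-3, -1, 1, 3}, {-1, -3, 3, 1}};
s = (b . Transpose[b])/3;                    (* N - 1 = 3 *)

{vals, vecs} = Eigensystem[s];
p = Transpose[Normalize /@ vecs[[Reverse[Ordering[vals]]]]];

y = Transpose[p] . b;      (* Y = Transpose[p] . X for every column *)
Simplify[p . y] == b       (* X = p . Y recovers the data: True *)
```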

The matrix d is the covariance matrix for the transformed data.

Because d is diagonal, the covariance of y1 and y2 is zero: y1 and y2 are uncorrelated.

In[34]:=

Out[34]//MatrixForm=
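A sketch that confirms this on hypothetical data: the sample covariance matrix of the transformed observations should come out equal to d, with zero off-diagonal entries.

```mathematica
(* Hypothetical data in mean deviation form; columns are observations. *)
b = {{-3, -1, 1, 3}, {-1, -3, 3, 1}};
s = (b . Transpose[b])/3;

{vals, vecs} = Eigensystem[s];
order = Reverse[Ordering[vals]];
d = DiagonalMatrix[vals[[order]]];
p = Transpose[Normalize /@ vecs[[order]]];

y = Transpose[p] . b;                 (* transformed data *)
sy = Simplify[(y . Transpose[y])/3];  (* its sample covariance matrix *)
sy == d                               (* True *)
```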

In[35]:=

Calculate the percentage of the total variance contained in the first principal component.

Tr[d] is the trace of the matrix d, the sum of its diagonal entries. It equals the total variance of the data.

In[39]:=

Out[39]=

Out[40]=

The first principal component accounts for 98.4% of the variance in this data.
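The same calculation can be sketched on a hypothetical d (not the matrix that produced the 98.4% figure): divide the largest eigenvalue by the trace.

```mathematica
(* Hypothetical diagonal matrix of eigenvalues in decreasing order. *)
d = DiagonalMatrix[{32/3, 8/3}];

(* Share of the total variance carried by the first principal component. *)
100 d[[1, 1]]/Tr[d]   (* 80 *)
```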

Created by Mathematica (April 6, 2005)