# SuanShu, a Java numerical and statistical library

com.numericalmethod.suanshu.stats.pca

## Interface PCA

• All Known Implementing Classes:
PCAbyEigen, PCAbySVD

public interface PCA
Principal Component Analysis (PCA) is a mathematical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of uncorrelated variables called principal components. The number of principal components is less than or equal to the number of original variables. This transformation is defined in such a way that the first principal component has as high a variance as possible (that is, accounts for as much of the variability in the data as possible), and each succeeding component in turn has the highest variance possible under the constraint that it be orthogonal to (uncorrelated with) the preceding components. Principal components are guaranteed to be independent only if the data set is jointly normally distributed.
• K. V. Mardia, J. T. Kent and J. M. Bibby, "Multivariate Analysis," London, Academic Press, 1979.
• W. N. Venables and B. D. Ripley, "Modern Applied Statistics with S," New York, Springer-Verlag, 2002.
• Wikipedia: Principal component analysis
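As a concrete illustration of the description above, the following sketch runs PCA by hand on two variables, where the eigen decomposition of the 2x2 sample covariance matrix has a closed form. The class `Pca2dSketch` and its methods are hypothetical names invented for this example; they are not part of SuanShu and do not show how `PCAbyEigen` or `PCAbySVD` work internally. The divisor `n - 1` in the covariance is also an assumption.

```java
/** Hypothetical, self-contained illustration; none of these names come from SuanShu. */
final class Pca2dSketch {

    /** Sample covariance matrix (assumed divisor n - 1) of n observations of 2 variables. */
    static double[][] covariance(double[][] data) {
        int n = data.length;
        double m0 = 0, m1 = 0;
        for (double[] row : data) {
            m0 += row[0];
            m1 += row[1];
        }
        m0 /= n;
        m1 /= n;
        double s00 = 0, s01 = 0, s11 = 0;
        for (double[] row : data) {
            double a = row[0] - m0, b = row[1] - m1;
            s00 += a * a;
            s01 += a * b;
            s11 += b * b;
        }
        double d = n - 1;
        return new double[][] {{s00 / d, s01 / d}, {s01 / d, s11 / d}};
    }

    /** Eigenvalues of a symmetric 2x2 matrix, largest first: the component variances. */
    static double[] eigenvalues(double[][] s) {
        double mid = (s[0][0] + s[1][1]) / 2;
        double radius = Math.hypot((s[0][0] - s[1][1]) / 2, s[0][1]);
        return new double[] {mid + radius, mid - radius};
    }

    /** Unit eigenvectors as columns; column 0 belongs to the largest eigenvalue. */
    static double[][] loadings(double[][] s) {
        double b = s[0][1];
        double x, y;
        if (b == 0) {
            // Already diagonal: the coordinate axes are the principal directions.
            x = s[0][0] >= s[1][1] ? 1 : 0;
            y = 1 - x;
        } else {
            // Solves (s00 - lambda1) * x + b * y = 0 for the largest eigenvalue.
            x = b;
            y = eigenvalues(s)[0] - s[0][0];
        }
        double norm = Math.hypot(x, y);
        x /= norm;
        y /= norm;
        // The second eigenvector of a symmetric matrix is perpendicular to the first.
        return new double[][] {{x, -y}, {y, x}};
    }

    public static void main(String[] args) {
        double[][] data = {{2, 1}, {1, 2}, {-2, -1}, {-1, -2}};
        double[][] s = covariance(data);
        double[] lambda = eigenvalues(s);
        double[][] w = loadings(s);
        System.out.println("component variances: " + lambda[0] + ", " + lambda[1]);
        System.out.println("first loading: (" + w[0][0] + ", " + w[1][0] + ")");
    }
}
```

For this small data set the two variables are strongly correlated, so the first component (along the diagonal direction) carries most of the variance and the second, orthogonal component carries the remainder.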
• ### Method Summary

| Modifier and Type | Method | Description |
| --- | --- | --- |
| Vector | cumulativeProportionVar() | Gets the cumulative proportion of overall variance explained by the principal components. |
| Vector | loading(int i) | |
| Matrix | loadings() | |
| Vector | mean() | Gets the sample means that were subtracted. |
| int | nFactors() | Gets the number of variables in the original data. |
| int | nObs() | Gets the number of observations in the original data; sample size. |
| Vector | proportionVar() | Gets the proportion of overall variance explained by each of the principal components. |
| double | proportionVar(int i) | Gets the proportion of overall variance explained by the i-th principal component. |
| Vector | scale() | Gets the scalings applied to each variable. |
| Matrix | scores() | Gets the scores of supplied data on the principal components. |
| double | sdPrincipalComponent(int i) | Gets the standard deviation of the i-th principal component. |
| Vector | sdPrincipalComponents() | Gets the standard deviations of the principal components (i.e., the square roots of the eigenvalues of the correlation (or covariance) matrix). |
| Matrix | X() | Gets the (possibly centered and/or scaled) data matrix X used for the PCA. |
• ### Method Detail

• #### nObs

int nObs()
Gets the number of observations in the original data; sample size.
Returns:
nObs, the number of observations in the original data
• #### nFactors

int nFactors()
Gets the number of variables in the original data.
Returns:
nFactors, the number of variables in the original data
• #### mean

Vector mean()
Gets the sample means that were subtracted.
Returns:
the sample means of each variable in the original data
• #### scale

Vector scale()
Gets the scalings applied to each variable.
Returns:
the scalings applied to each variable in the original data
• #### X

Matrix X()
Gets the (possibly centered and/or scaled) data matrix X used for the PCA.
Returns:
the (possibly centered and/or scaled) data matrix X
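The relationship between `mean()`, `scale()` and `X()` can be sketched with plain arrays: subtract each column's mean, then divide by a per-column scaling. The class `CenterScaleSketch` is a hypothetical helper written for this illustration. It assumes the scaling is the sample standard deviation with divisor `n - 1` (the usual choice for correlation-based PCA); the page does not state which scaling SuanShu applies.

```java
/** Hypothetical illustration of centering and scaling; not SuanShu code. */
final class CenterScaleSketch {

    /** Column means of an n-by-p data matrix. */
    static double[] columnMeans(double[][] data) {
        int n = data.length, p = data[0].length;
        double[] mean = new double[p];
        for (double[] row : data) {
            for (int j = 0; j < p; j++) {
                mean[j] += row[j];
            }
        }
        for (int j = 0; j < p; j++) {
            mean[j] /= n;
        }
        return mean;
    }

    /** Column sample standard deviations (assumed divisor n - 1). */
    static double[] columnSds(double[][] data, double[] mean) {
        int n = data.length, p = mean.length;
        double[] sd = new double[p];
        for (double[] row : data) {
            for (int j = 0; j < p; j++) {
                double d = row[j] - mean[j];
                sd[j] += d * d;
            }
        }
        for (int j = 0; j < p; j++) {
            sd[j] = Math.sqrt(sd[j] / (n - 1));
        }
        return sd;
    }

    /** Returns a new matrix with each column centered by mean[j] and divided by sd[j]. */
    static double[][] standardize(double[][] data, double[] mean, double[] sd) {
        double[][] x = new double[data.length][mean.length];
        for (int i = 0; i < data.length; i++) {
            for (int j = 0; j < mean.length; j++) {
                x[i][j] = (data[i][j] - mean[j]) / sd[j];
            }
        }
        return x;
    }

    public static void main(String[] args) {
        double[][] data = {{1, 10}, {2, 20}, {3, 30}};
        double[] mean = columnMeans(data);
        double[] sd = columnSds(data, mean);
        double[][] x = standardize(data, mean, sd);
        System.out.println(java.util.Arrays.deepToString(x));
    }
}
```

After standardizing, both columns have the same spread, so a variable measured in large units no longer dominates the components.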
• #### sdPrincipalComponents

Vector sdPrincipalComponents()
Gets the standard deviations of the principal components (i.e., the square roots of the eigenvalues of the correlation (or covariance) matrix).
Returns:
the standard deviations of the principal components
• #### sdPrincipalComponent

double sdPrincipalComponent(int i)
Gets the standard deviation of the i-th principal component.
Parameters:
i - an index, counting from 1
Returns:
the standard deviation of the i-th principal component
• #### loadings

Matrix loadings()
Returns:
• #### loading

Vector loading(int i)
Parameters:
i - an index, counting from 1
Returns:
• #### proportionVar

Vector proportionVar()
Gets the proportion of overall variance explained by each of the principal components.
Returns:
the proportion of overall variance explained by each of the principal components
• #### proportionVar

double proportionVar(int i)
Gets the proportion of overall variance explained by the i-th principal component.
Parameters:
i - an index, counting from 1
Returns:
the proportion of overall variance explained by the i-th principal component
• #### cumulativeProportionVar

Vector cumulativeProportionVar()
Gets the cumulative proportion of overall variance explained by the principal components.
Returns:
the cumulative proportion of overall variance explained by the principal components
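The three variance accessors are related by simple arithmetic on the component standard deviations: square each one to get a variance, divide by the total, and take running sums. The sketch below shows that arithmetic on a plain array. `VarianceProportionSketch` is a hypothetical class for illustration only; it mirrors the method names of this interface, including the 1-based index `i`, but it is not the library's implementation.

```java
/** Hypothetical illustration of the variance proportions; not SuanShu code. */
final class VarianceProportionSketch {

    /** Proportion of total variance per component, given component standard deviations. */
    static double[] proportionVar(double[] sdev) {
        double total = 0;
        for (double s : sdev) {
            total += s * s; // variance = squared standard deviation
        }
        double[] p = new double[sdev.length];
        for (int k = 0; k < sdev.length; k++) {
            p[k] = sdev[k] * sdev[k] / total;
        }
        return p;
    }

    /** Proportion for the i-th component, with i counting from 1 as in the interface. */
    static double proportionVar(double[] sdev, int i) {
        return proportionVar(sdev)[i - 1];
    }

    /** Running sum of the proportions; the last entry is 1 up to rounding. */
    static double[] cumulativeProportionVar(double[] sdev) {
        double[] p = proportionVar(sdev);
        double[] c = new double[p.length];
        double running = 0;
        for (int k = 0; k < p.length; k++) {
            running += p[k];
            c[k] = running;
        }
        return c;
    }

    public static void main(String[] args) {
        double[] sdev = {2.0, 1.0, 1.0}; // made-up standard deviations
        System.out.println(java.util.Arrays.toString(proportionVar(sdev)));
        System.out.println(java.util.Arrays.toString(cumulativeProportionVar(sdev)));
    }
}
```

A common use of the cumulative vector is to pick the smallest number of components whose cumulative proportion exceeds a chosen threshold.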
• #### scores

Matrix scores()
Gets the scores of supplied data on the principal components. The signs of the columns of the scores are arbitrary.
Returns:
the scores of supplied data on the principal components
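To see why the column signs of the scores are arbitrary, consider scores computed as the product of the data matrix and a loading matrix whose columns are the principal directions. Negating a loading column negates the matching score column but leaves its variance unchanged, so both versions are equally valid. The sketch below assumes this product form; `ScoresSketch` and its methods are hypothetical names, and the data and loadings are a made-up two-variable example rather than output from SuanShu.

```java
/** Hypothetical illustration of scores and their sign ambiguity; not SuanShu code. */
final class ScoresSketch {

    /** Scores as the matrix product x * loadings (loadings stored column-wise). */
    static double[][] scores(double[][] x, double[][] loadings) {
        int n = x.length, p = loadings.length, k = loadings[0].length;
        double[][] t = new double[n][k];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < k; j++) {
                for (int v = 0; v < p; v++) {
                    t[i][j] += x[i][v] * loadings[v][j];
                }
            }
        }
        return t;
    }

    /** Copy of the loading matrix with one column negated. */
    static double[][] flipSign(double[][] loadings, int column) {
        double[][] w = new double[loadings.length][];
        for (int v = 0; v < loadings.length; v++) {
            w[v] = loadings[v].clone();
            w[v][column] = -w[v][column];
        }
        return w;
    }

    /** Sample variance (divisor n - 1) of one column of a matrix. */
    static double sampleVariance(double[][] m, int column) {
        double mean = 0;
        for (double[] row : m) {
            mean += row[column];
        }
        mean /= m.length;
        double ss = 0;
        for (double[] row : m) {
            double d = row[column] - mean;
            ss += d * d;
        }
        return ss / (m.length - 1);
    }

    public static void main(String[] args) {
        double r = Math.sqrt(0.5);
        double[][] x = {{2, 1}, {1, 2}, {-2, -1}, {-1, -2}}; // already centered
        double[][] w = {{r, -r}, {r, r}};                    // columns are unit loadings
        double[][] t = scores(x, w);
        double[][] tFlipped = scores(x, flipSign(w, 0));
        System.out.println(sampleVariance(t, 0) + " vs " + sampleVariance(tFlipped, 0));
    }
}
```

Because of this ambiguity, comparisons between two PCA results should be made up to a sign per column, for example by comparing absolute values or by fixing a sign convention first.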