My PCA looks like this. It doesn't use NVARS*NVARS memory, so if you have a massive number of dimensions it will be much lighter. Note that you should prescale/demean X first.
Xr is the reduced output and odim is the number of output dimensions. It's a pretty direct translation of the description on Wikipedia =). It would be nice if this was an option in alglib, since it is way more efficient if you don't care about the basis vectors. (Edit: ... and I'm posting in a dead thread.)
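For anyone wondering why the basis vectors can be skipped, here is a short sketch of the reasoning. The notation is my own assumption: X is the demeaned n x p data matrix, k is odim, and the SVD is the thin one.

```latex
\begin{aligned}
X &= U \Sigma V^{\top}, \qquad V^{\top} V = I \\
X V &= U \Sigma V^{\top} V = U \Sigma \\
X_r &= X V_k = U_k \Sigma_k
      \quad\text{(first $k$ columns of $U$, each scaled by its singular value)}
\end{aligned}
```

The columns of V are the principal axes, so the projected data X V is just U scaled column by column by the singular values. That means neither V nor the p x p covariance matrix has to be formed if all you want is the reduced output.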
Code:
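#include <algorithm>
#include <iostream>
#include <stdexcept>
#include "linalg.h"  // ALGLIB: alglib::rmatrixsvd

// Project X (rows = samples, cols = variables, already demeaned/prescaled)
// onto its first odim principal components. The scores are U*diag(W), so
// the basis vectors (VT) are never computed.
void pca(const alglib::real_2d_array& X, alglib::real_2d_array& Xr, int odim)
{
    alglib::real_2d_array U;
    alglib::real_2d_array VT;
    alglib::real_1d_array W;

    // With uneeded=1, U only has min(rows, cols) columns.
    if (odim < 0 || odim > std::min(X.rows(), X.cols()))
        throw std::invalid_argument("pca: odim out of range");

    std::cout << "svd: " << X.rows() << "x" << X.cols() << "...";
    // uneeded=1: first min(rows, cols) columns of U; vtneeded=0: skip VT;
    // additionalmemory=2: let ALGLIB use extra memory for speed.
    if (!alglib::rmatrixsvd(X, X.rows(), X.cols(), 1, 0, 2, W, U, VT))
        throw std::runtime_error("pca: SVD did not converge");
    std::cout << " done" << std::endl;

    std::cout << "odims = " << odim << "...";
    Xr.setlength(X.rows(), odim);
    for (int rr = 0; rr < X.rows(); rr++) {
        for (int cc = 0; cc < odim; cc++) {
            // Singular values in W come sorted, largest first.
            Xr(rr, cc) = U(rr, cc) * W[cc];
        }
    }
    std::cout << " Done" << std::endl;
}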
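Since the function assumes the input is already prescaled/demeaned, here is a minimal sketch of that step. It is only an illustration: `Matrix` and `standardize_columns` are hypothetical names of mine (not part of ALGLIB), it works on a plain `std::vector` matrix rather than `alglib::real_2d_array`, and dividing by the sample standard deviation is just one possible choice of prescaling.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of ALGLIB: rows = samples, cols = variables.
using Matrix = std::vector<std::vector<double>>;

// Subtract each column's mean and, if scale is true, divide the column by
// its sample standard deviation (n - 1 in the denominator).
void standardize_columns(Matrix& X, bool scale = true)
{
    if (X.empty()) return;
    const std::size_t rows = X.size();
    const std::size_t cols = X[0].size();

    for (std::size_t c = 0; c < cols; ++c) {
        double mean = 0.0;
        for (std::size_t r = 0; r < rows; ++r) mean += X[r][c];
        mean /= static_cast<double>(rows);

        // Centre the column and accumulate the sum of squares.
        double ss = 0.0;
        for (std::size_t r = 0; r < rows; ++r) {
            X[r][c] -= mean;
            ss += X[r][c] * X[r][c];
        }

        if (!scale || rows < 2) continue;
        const double sd = std::sqrt(ss / static_cast<double>(rows - 1));
        if (sd == 0.0) continue;  // constant column: leave it as zeros
        for (std::size_t r = 0; r < rows; ++r) X[r][c] /= sd;
    }
}
```

You would run something like this on your data first, then copy the result into the `real_2d_array` you pass to `pca`. If your variables are already on comparable scales, passing `scale = false` only demeans.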