jysung100 wrote:
1) It says I have to have at least (number of independent variables + 1) data points in order to run linear regression.
Is that due to the implementation? Or is it how linear regression normally works?
When I have fewer data points than variables, is there a way to run it?
It is an artificial restriction, for two reasons: a) the fast cross-validation algorithm, which is called by linreg, can't be used when npoints<nvars+1, and b) regression quality decreases when you have more variables than points.
It is hard to remove from ALGLIB at this point, because the assumption that npoints>=nvars+1 is widely used across the linreg package.
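Since the restriction can't be lifted, the simplest thing is to check it yourself before calling the library. Here is a minimal sketch; the helper name enough_points_for_linreg is my own invention for illustration and is not part of ALGLIB:

```cpp
#include <cstddef>

// Hypothetical helper (not an ALGLIB function): returns true when the
// dataset satisfies the npoints >= nvars + 1 requirement described above.
bool enough_points_for_linreg(std::size_t npoints, std::size_t nvars) {
    return npoints >= nvars + 1;
}
```

For example, enough_points_for_linreg(11, 10) is true, while enough_points_for_linreg(10, 10) is false, so the second dataset would be rejected.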
jysung100 wrote:
2) I am running it with c++ in 32bit ubuntu 10.10, what's the size limit of real_2d_array ?
My work involves up to 500,000 independent variables. Would it be able to handle that?
ALGLIB can work with arrays up to 2GB on 32-bit systems. Array size on 64-bit systems is unlimited. But you should keep in mind that:
a) ALGLIB allocates temporaries - for example, it can allocate a copy of your array to avoid modifying the values being passed. So I don't think that it will work on a 32-bit system with input data larger than some fraction of 2GB (say, 0.5GB).
b) linreg allocates an array whose size is nvars*nvars. So you can't use it with nvars larger than, say, 5000.
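To put numbers on the nvars*nvars point, here is a rough back-of-the-envelope sketch. It assumes 8-byte doubles and counts only one dense array, ignoring any other temporaries; the function names are mine, not ALGLIB's:

```cpp
#include <cstdint>

// Assumption: each array element is an 8-byte double.
constexpr std::uint64_t kBytesPerDouble = 8;

// Bytes needed for one dense rows x cols array of doubles.
std::uint64_t dense_bytes(std::uint64_t rows, std::uint64_t cols) {
    return rows * cols * kBytesPerDouble;
}

// True when a single nvars x nvars array fits into the given byte budget.
bool square_fits(std::uint64_t nvars, std::uint64_t budget_bytes) {
    return dense_bytes(nvars, nvars) <= budget_bytes;
}
```

With nvars=5000 this gives 200,000,000 bytes (about 200MB), which fits within a 0.5GB budget. With nvars=500,000 it gives 2,000,000,000,000 bytes (about 2TB), far beyond what a 32-bit process can address.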
BTW, how many points do you have?