forum.alglib.net
http://forum.alglib.net/

lrbuild linear regression
http://forum.alglib.net/viewtopic.php?f=2&t=282

Author:  jysung100 [ Mon Jan 24, 2011 9:56 pm ]
Post subject:  lrbuild linear regression

I have two questions.

1) It says I need at least (number of independent variables + 1) data points in order to run linear regression.
Is that due to the implementation, or is it how linear regression normally works?
When I have fewer data points than variables, is there a way to run it?

2) I am running it in C++ on 32-bit Ubuntu 10.10. What is the size limit of real_2d_array?
My work involves up to 500,000 independent variables. Would it be able to handle that?

Thanks

Author:  Sergey.Bochkanov [ Tue Jan 25, 2011 7:24 am ]
Post subject:  Re: lrbuild linear regression

jysung100 wrote:
1) It says I need at least (number of independent variables + 1) data points in order to run linear regression.
Is that due to the implementation, or is it how linear regression normally works?
When I have fewer data points than variables, is there a way to run it?

It is an artificial restriction, kept for two reasons: a) the fast cross-validation algorithm, which is called by linreg, can't be used when npoints<nvars+1, and b) regression quality decreases when you have more variables than points.

It is hard to remove from ALGLIB at this point, because the assumption that npoints>=nvars+1 is widely used across the linreg package.
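Since the restriction is built into the package, one option is to check it yourself before calling lrbuild. The snippet below is only a minimal sketch: has_enough_points is a hypothetical helper, not part of ALGLIB, and it simply restates the npoints>=nvars+1 condition described above.

```cpp
#include <cstddef>

// Hypothetical helper (NOT an ALGLIB function): returns true when the
// dataset satisfies the condition npoints >= nvars + 1.
// Written as npoints > nvars, which is equivalent for integers and
// avoids overflow in nvars + 1.
bool has_enough_points(std::size_t npoints, std::size_t nvars) {
    return npoints > nvars;
}
```

For example, has_enough_points(11, 10) is true, while has_enough_points(10, 10) is false, so a dataset with 10 variables would need at least 11 points.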

jysung100 wrote:
2) I am running it in C++ on 32-bit Ubuntu 10.10. What is the size limit of real_2d_array?
My work involves up to 500,000 independent variables. Would it be able to handle that?

ALGLIB can work with arrays up to 2GB on 32-bit systems. Array size on 64-bit systems is unlimited. But you should keep in mind that:
a) ALGLIB allocates temporaries. For example, it can allocate a copy of your array to avoid modifying the values being passed. So I don't think it will work on a 32-bit system with input data larger than some fraction of 2GB (say, 0.5GB).
b) linreg allocates an array whose size is nvars*nvars. So you can't use it with nvars larger than, say, 5000.
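To put numbers on point b), here is a rough back-of-the-envelope sketch. It assumes 8-byte doubles and reads the 2GB limit as 2^31 bytes. Both are assumptions, and the function names are hypothetical, not ALGLIB calls.

```cpp
#include <cstdint>

// Assumption: each matrix element is an 8-byte double.
const std::uint64_t kBytesPerDouble = 8;
// Assumption: the "2GB" limit on 32-bit systems means 2^31 bytes.
const std::uint64_t kTwoGiB = 2147483648ULL;

// Bytes needed for a dense nvars x nvars matrix of doubles.
std::uint64_t dense_square_bytes(std::uint64_t nvars) {
    return nvars * nvars * kBytesPerDouble;
}

// True when a single nvars x nvars matrix would fit under the 2GB limit,
// ignoring any temporaries or other allocations.
bool fits_in_32bit_limit(std::uint64_t nvars) {
    return dense_square_bytes(nvars) <= kTwoGiB;
}
```

Under these assumptions, nvars=5000 needs 200,000,000 bytes (about 0.2GB), which fits. With nvars=500,000 the same array would need 2,000,000,000,000 bytes (about 2TB), far beyond the 32-bit limit.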

BTW, how many points do you have?

All times are UTC