# forum.alglib.net

ALGLIB forum

 All times are UTC


Post subject: Calculating p-values for ALGLIB's linear model?
Posted: Tue Jul 04, 2023 5:25 am

Joined: Thu Aug 23, 2018 5:09 am
Posts: 5
I've been using the Linear Regression functionality in dataanalysis.cs (linearmodel) to perform multiple linear regression on my data. I've been able to obtain the coefficients and R² for the output model, but I can't figure out how to calculate the p-values or t-statistics for said model's coefficients.

The outputs I'm aware of are those that can be unpacked with lrunpack():

Quote:
/*************************************************************************
Unpacks coefficients of linear model.

INPUT PARAMETERS:
LM - linear model in ALGLIB format

OUTPUT PARAMETERS:
V - coefficients, array[0..NVars]
constant term (intercept) is stored in the V[NVars].
NVars - number of independent variables (one less than number
of coefficients)

-- ALGLIB --
*************************************************************************/

...as well as those stored in LRReport:

Quote:
/*************************************************************************
* C - covariation matrix, array[0..NVars,0..NVars].
C[i,j] = Cov(A[i],A[j])
* RMSError - root mean square error on a training set
* AvgError - average error on a training set
* AvgRelError - average relative error on a training set (excluding
observations with zero function value).
* CVRMSError - leave-one-out cross-validation estimate of
generalization error. Calculated using fast algorithm
with O(NVars*NPoints) complexity.
* CVAvgError - cross-validation estimate of average error
* CVAvgRelError - cross-validation estimate of average relative error

All other fields of the structure are intended for internal use and should
not be used outside ALGLIB.
*************************************************************************/

But I don't see which of these outputs could relate to t-statistics or p-values, if any. Could anyone help me understand if ALGLIB can output either of these values, and if not, could anyone explain how I might go about calculating these values for myself using the provided information?


Post subject: Re: Calculating p-values for ALGLIB's linear model?
Posted: Thu Jul 20, 2023 1:53 am

Joined: Thu Aug 23, 2018 5:09 am
Posts: 5
So I ended up figuring this out with the help of a friend.

The covariation matrix in LRReport (C) gives you the standard error of each coefficient: take the square root of each diagonal entry, [0,0], [1,1], ... [n,n]. The t-statistic of each coefficient is then the coefficient minus its value under the null hypothesis (0 in most cases), divided by its standard error. The StudentTDistribution function (https://www.alglib.net/specialfunctions/distributions/student.php) then gives the integral of the t-distribution from minus infinity up to a given value, which here is the absolute value of the t-statistic. It also needs the degrees of freedom, which is the number of samples minus the number of estimated coefficients (so if your model includes an intercept, count it as well). The result of this function is easily converted into either a one-tailed or a two-tailed p-value.
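For anyone who prefers formulas, here is a rough summary of the steps above. The symbols are my own notation, not ALGLIB's: $C$ is the covariation matrix from LRReport, $\hat\beta_i$ is the i-th coefficient from lrunpack(), $\beta_i^{0}$ is its value under the null hypothesis, $N$ is the number of samples, $k$ is the number of estimated coefficients (assumed here to include the intercept), and $F_{\nu}$ is the cumulative Student t distribution with $\nu$ degrees of freedom, which is what StudentTDistribution returns.

```latex
\begin{align*}
\mathrm{SE}_i &= \sqrt{C_{ii}} \\
t_i &= \frac{\hat\beta_i - \beta_i^{0}}{\mathrm{SE}_i} \\
\nu &= N - k \\
p_i^{\text{one-tailed}} &= 1 - F_{\nu}\left(|t_i|\right) \\
p_i^{\text{two-tailed}} &= 2\left(1 - F_{\nu}\left(|t_i|\right)\right)
\end{align*}
```

Taking $|t_i|$ before evaluating $F_{\nu}$ keeps the one-tailed value at or below 0.5, so doubling it always gives a valid two-tailed p-value.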

Here's a slapped-together C# code block demonstrating how to accomplish this, in case anyone else (like me) has issues understanding the process. My project requires the results to be output to lists, though it could very easily be modified to output arrays instead.

Code:

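using System;
using System.Collections.Generic;

public void GetStatisticsOfCovariationMatrix(double[] coefs, double[,] covariationMatrix, int numSamples, int numVariables, out List<double> standardErrors, out List<double> tStatistics, out List<double> pValues)
{
    //The standard error of each coefficient is the square root of the matching diagonal entry
    standardErrors = new List<double>();
    for (int i = 0; i <= covariationMatrix.GetUpperBound(0); i++)
    {
        standardErrors.Add(Math.Sqrt(covariationMatrix[i, i]));
    }

    double nullHypothesisCoefficient = 0; //This may need to be changed depending on what your null hypothesis is

    //The t-statistic is the distance of each coefficient from the null hypothesis, in standard errors
    tStatistics = new List<double>();
    for (int i = 0; i < standardErrors.Count; i++)
    {
        tStatistics.Add((coefs[i] - nullHypothesisCoefficient) / standardErrors[i]);
    }

    //numVariables should count every estimated coefficient (including the intercept, if the model has one)
    pValues = new List<double>();
    for (int i = 0; i < tStatistics.Count; i++)
    {
        double tIntegral = alglib.studenttdistribution(numSamples - numVariables, Math.Abs(tStatistics[i]));
        //The p value for one tail of the t-distribution
        double p1 = 1 - tIntegral;
        //The p value for both tails of the t-distribution
        double p2 = p1 * 2;
        pValues.Add(p2); //Add p1 here instead if you want one-tailed p-values
    }
}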
