forum.alglib.net http://forum.alglib.net/ 

Calculating pvalues for ALGLIB's linear model? http://forum.alglib.net/viewtopic.php?f=2&t=4572 
Author:  Rammiloh [ Tue Jul 04, 2023 5:25 am ] 
Post subject:  Calculating pvalues for ALGLIB's linear model? 
I've been using the Linear Regression functionality in dataanalysis.cs (linearmodel) to perform multiple linear regression on my data. I've been able to obtain the coefficients and R2 for the output model, but I can't figure out how to calculate the p-values or t-statistics for said model's coefficients.

The outputs I'm aware of are those that can be unpacked with lrunpack():

Quote:
/*************************************************************************
Unpacks coefficients of linear model.

INPUT PARAMETERS:
    LM      -   linear model in ALGLIB format

OUTPUT PARAMETERS:
    V       -   coefficients, array[0..NVars]
                constant term (intercept) is stored in the V[NVars].
    NVars   -   number of independent variables (one less than number
                of coefficients)

  -- ALGLIB --
     Copyright 30.08.2008 by Bochkanov Sergey
*************************************************************************/

...as well as those stored in LRReport:

Quote:
/*************************************************************************
LRReport structure contains additional information about linear model:
* C             -   covariation matrix, array[0..NVars,0..NVars].
                    C[i,j] = Cov(A[i],A[j])
* RMSError      -   root mean square error on a training set
* AvgError      -   average error on a training set
* AvgRelError   -   average relative error on a training set (excluding
                    observations with zero function value).
* CVRMSError    -   leave-one-out cross-validation estimate of
                    generalization error. Calculated using fast algorithm
                    with O(NVars*NPoints) complexity.
* CVAvgError    -   cross-validation estimate of average error
* CVAvgRelError -   cross-validation estimate of average relative error

All other fields of the structure are intended for internal use and should
not be used outside ALGLIB.
*************************************************************************/

But I don't see which of these outputs could relate to t-statistics or p-values, if any.
Could anyone help me understand if ALGLIB can output either of these values, and if not, could anyone explain how I might go about calculating these values for myself using the provided information? 
Author:  Rammiloh [ Thu Jul 20, 2023 1:53 am ] 
Post subject:  Re: Calculating pvalues for ALGLIB's linear model? 
So I ended up figuring this out with the help of a friend.

The covariation matrix in LRReport (C) is used to calculate the standard error of each of the coefficients. Taking the square root of the diagonal values at [0,0], [1,1], ... [n,n] of the matrix gives you the standard error of each coefficient.

The t-statistic of each coefficient can then be calculated by subtracting the null-hypothesis value of the coefficient (0 in most cases) from the coefficient and dividing the result by its standard error.

We can then use the StudentTDistribution function (https://www.alglib.net/specialfunctions/distributions/student.php) to get the integral of the t-distribution. This also requires the degrees of freedom, which is the number of samples minus the number of variables. The result of this function can easily be converted into either a one-tailed or a two-tailed p-value.

Here's a slapped-together C# code block demonstrating how to accomplish this, in case anyone else (like me) has issues understanding the process. My project requires the results to be output to lists, though it could very easily be modified to output arrays instead.
Code:
public void GetStatisticsOfCovariationMatrix(double[] coefs, double[,] covariationMatrix, int numSamples, int numVariables, out List<double> standardErrors, out List<double> tStatistics, out List<double> pValues)
{
    standardErrors = new List<double>();
    for (int i = 0; i <= covariationMatrix.GetUpperBound(0); i++)
    {
        standardErrors.Add(Math.Sqrt(covariationMatrix[i, i]));
    }

    double nullHypothesisCoefficient = 0; //This may need to be changed depending on what your null hypothesis is
    tStatistics = new List<double>();
    for (int i = 0; i < standardErrors.Count; i++)
    {
        tStatistics.Add((coefs[i] - nullHypothesisCoefficient) / standardErrors[i]);
    }

    pValues = new List<double>();
    for (int i = 0; i < tStatistics.Count; i++)
    {
        double tIntegral = alglib.studenttdistribution(numSamples - numVariables, Math.Abs(tStatistics[i]));

        //The p value for one tail of the t-distribution
        double p1 = 1 - tIntegral;

        //The p value for both tails of the t-distribution
        double p2 = p1 * 2;

        pValues.Add(p2);
    }
}
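For reference, the steps described in the post can be written out as formulas. This is only a sketch in notation that is not from the original post: \hat\beta_i is the i-th fitted coefficient, \beta_{i,0} its null-hypothesis value, C the covariation matrix from LRReport, n the number of samples, k the number of variables, and F_\nu the cumulative Student t-distribution with \nu degrees of freedom (the integral the post obtains from studenttdistribution).

```latex
\begin{align*}
\mathrm{SE}(\hat\beta_i) &= \sqrt{C_{ii}} \\
t_i &= \frac{\hat\beta_i - \beta_{i,0}}{\mathrm{SE}(\hat\beta_i)} \\
\nu &= n - k \\
p_i^{\text{one-tailed}} &= 1 - F_\nu\bigl(|t_i|\bigr) \\
p_i^{\text{two-tailed}} &= 2\,\bigl(1 - F_\nu(|t_i|)\bigr)
\end{align*}
```

One thing to check for your own model: the usual textbook convention for ordinary least squares counts every estimated coefficient in k, including the intercept. Since lrunpack() returns NVars + 1 coefficients for a model with a constant term, k = NVars + 1 would follow that convention. That is an assumption about how numVariables should be passed, not something stated in the post.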