forum.alglib.net
http://forum.alglib.net/

Getting rid of outliers.
http://forum.alglib.net/viewtopic.php?f=2&t=427

Author:  intern [ Wed Sep 07, 2011 11:21 am ]
Post subject:  Getting rid of outliers.

Hi.
I have a series of data stored as a double array. I am using ALGLIB to find the mean and standard deviation like this:
alglib.samplemoments(array, out values[0], out values[1], out values[2], out values[3]);
Console.WriteLine("MEAN: " + values[0]);
alglib.sampleadev(array, out values[4]);
Console.WriteLine("Deviation: " + values[4]);

I just store them in an array of doubles called values[5] so I can retrieve them whenever I want.
What I want to do is get rid of outliers.
I found this page: http://www.wikihow.com/Reject-Outliers-in-Data which helped me understand half of the process.
Can someone help me understand step three? Is there a method in ALGLIB that can do this step for me?

Thanks in advance.

Author:  Sergey.Bochkanov [ Wed Sep 07, 2011 1:09 pm ]
Post subject:  Re: Getting rid of outliers.

Yes, ALGLIB can do it.

Take a look at http://en.wikipedia.org/wiki/Normal_cum ... n_function
The integral you need can be calculated using the special function erf(), also called the error function. ALGLIB computes it with a dedicated subroutine: http://www.alglib.net/translator/man/ma ... orfunction
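
The relation between erf() and the normal cumulative distribution function fits in a few lines. This is only a sketch: it assumes ALGLIB for .NET is referenced in the project, and the class and method names (NormalHelper, NormalCdf) are made up for illustration. Only alglib.errorfunction comes from the library.

```csharp
using System;

static class NormalHelper
{
    // Standard normal CDF expressed through the error function:
    //   Phi(z) = 0.5 * (1 + erf(z / sqrt(2)))
    // NormalCdf is a hypothetical helper name; alglib.errorfunction is ALGLIB's erf().
    public static double NormalCdf(double z)
    {
        return 0.5 * (1.0 + alglib.errorfunction(z / Math.Sqrt(2.0)));
    }
}
```

NormalHelper.NormalCdf(0.0) gives 0.5, and NormalHelper.NormalCdf(1.96) is close to 0.975. The argument z is a standardized value, i.e. a distance from the mean measured in standard deviations.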

Author:  intern [ Thu Sep 08, 2011 7:41 am ]
Post subject:  Re: Getting rid of outliers.

To be honest, I am not the strongest mathematician. Thanks to the smart people who come up with functions that take care of this for us.

Could you maybe explain to me how I would use the function that you pointed me toward?

Let's say we have an array called myArray[] that holds ten values: 1, 2, 3, 40, 5, 6, -30, 8, 9, 10.
I call alglib.samplemoments(myArray, out values[0], out values[1], out values[2], out values[3]); and alglib.sampleadev(myArray, out values[4]);
on that array to find the mean and deviation. The mean is now in values[0] and the deviation is in values[4].

As I understand it from the material you provided, I take each point in myArray and call errorfunction(double x) on it?
Is that how it is done? Where do the mean and deviation come in?

Could you maybe show me with simple code how you would use it?

Thanks
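
One way the pieces could fit together is sketched below. It rests on several assumptions. The wikiHow procedure is not reproduced in this thread, so the sketch assumes a Chauvenet-style criterion: standardize each point with the mean and standard deviation, use erf() to get the probability of a deviation at least that large, and reject the point if fewer than 0.5 such points would be expected among n samples. It also assumes ALGLIB for .NET is referenced. The names OutlierSketch and RejectOutliers are hypothetical. Note that the second output of alglib.samplemoments is the variance, so the standard deviation is its square root, while alglib.sampleadev returns the average absolute deviation, which is a different quantity.

```csharp
using System;
using System.Collections.Generic;

static class OutlierSketch
{
    // Hypothetical helper: returns the points that pass a Chauvenet-style test.
    // The criterion itself is an assumption, not something stated in the thread.
    public static double[] RejectOutliers(double[] data)
    {
        if (data.Length == 0) return new double[0];

        double mean, variance, skewness, kurtosis;
        alglib.samplemoments(data, out mean, out variance, out skewness, out kurtosis);

        // Standard deviation is the square root of the variance.
        double sd = Math.Sqrt(variance);
        if (sd == 0.0) return (double[])data.Clone(); // all points equal, nothing to reject

        var kept = new List<double>();
        foreach (double x in data)
        {
            // This is where the mean and deviation come in: they turn x into a z-score.
            double z = Math.Abs(x - mean) / sd;

            // Two-sided tail probability P(|Z| > z) for a standard normal variable.
            double p = 1.0 - alglib.errorfunction(z / Math.Sqrt(2.0));

            // Keep the point if at least half a point this extreme is expected in n samples.
            if (data.Length * p >= 0.5) kept.Add(x);
        }
        return kept.ToArray();
    }
}
```

Calling OutlierSketch.RejectOutliers(myArray) on the ten example values should keep 1, 2, 3, 5, 6, 8, 9, 10 and drop 40 and -30, since both lie a little more than two standard deviations from the mean of 5.4. The 0.5 threshold is the conventional choice for this kind of test and can be tightened or loosened.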

All times are UTC