forum.alglib.net

ALGLIB forum
 Post subject: Linear discriminant analysis (LDA)
PostPosted: Sun Jun 13, 2010 7:31 pm 
Offline

Joined: Sun Jun 13, 2010 6:28 pm
Posts: 5
Hello,

I have downloaded ALGLIB for VBA and implemented the code in my project. It works well so far. Thanks a lot for your effort.

However, after running the “Linear discriminant analysis (LDA)” routine I was not able to predict the group, because I’m not sure how to build and use the classification function. I tried multiplying the variable values by the linear combination coefficients, but I don’t know how to interpret the results.

I would therefore highly appreciate it if someone could help me. How do I use the linear combination coefficients for group prediction? Please explain in detail.

Thanks for your effort in advance
David


 Post subject: Re: Linear discriminant analysis (LDA)
PostPosted: Mon Jun 14, 2010 6:59 am 
Offline
Site Admin

Joined: Fri May 07, 2010 7:06 am
Posts: 927
Do you have a binary classification problem (two classes) or a multi-class one?

In the first case you just have to project the data onto the line determined by the coefficients you've found. Class #1 will fall on one side and class #2 on the other. You can use the undocumented function DSOptimalSplit2() from the bdss unit, which can find the split for you. Here is the description of its parameters:

Code:
Optimal binary classification

The algorithm finds the optimal (=with minimal cross-entropy) binary partition.
Internal subroutine.

INPUT PARAMETERS:
    A       -   array[0..N-1], variable
    C       -   array[0..N-1], class numbers (0 or 1).
    N       -   array size

OUTPUT PARAMETERS:
    Info    -   completion code:
                * -3, all values of A[] are same (partition is impossible)
                * -2, one of C[] is incorrect (<0, >1)
                * -1, incorrect parameters were passed (N<=0).
                *  1, OK
    Threshold-  partition boundary. Left part contains values which are
                strictly less than Threshold. Right part contains values
                which are greater than or equal to Threshold.
    PAL, PBL-   probabilities P(0|v<Threshold) and P(1|v<Threshold)
    PAR, PBR-   probabilities P(0|v>=Threshold) and P(1|v>=Threshold)
    CVE     -   cross-validation estimate of cross-entropy

  -- ALGLIB --
     Copyright 22.05.2008 by Bochkanov Sergey


If you have a multi-class classification problem, you have to project your data onto the top NClasses-1 eigenvectors obtained by LDA. Then you should use these values as inputs for some classification algorithm. There is no easy way to interpret such data when NClasses>2, so LDA is mostly used as a preprocessing tool.
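The multi-class recipe above can be sketched in plain Python. This is only an illustration: the projection vectors, the toy data and the nearest-centroid classifier are all assumptions made for the example (the thread does not name a classifier), and none of the function names come from ALGLIB.

```python
# Sketch: project onto NClasses-1 directions, then classify in the reduced space.
# VECTORS stands in for the eigenvectors returned by LDA; the values are made up.
VECTORS = [(1.0, 0.0, 0.0), (0.0, 1.0, 1.0)]  # 3 classes -> 2 directions


def project(row, vectors):
    """Dot product of one sample with each projection vector."""
    return tuple(sum(x * v for x, v in zip(row, vec)) for vec in vectors)


def fit_centroids(points, labels):
    """Mean of the projected points of each class."""
    sums, counts = {}, {}
    for p, y in zip(points, labels):
        acc = sums.setdefault(y, [0.0] * len(p))
        for i, v in enumerate(p):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: tuple(s / counts[y] for s in acc) for y, acc in sums.items()}


def predict(point, centroids):
    """Class whose centroid is closest (squared Euclidean distance)."""
    return min(
        centroids,
        key=lambda y: sum((a - b) ** 2 for a, b in zip(point, centroids[y])),
    )


train_x = [(0, 0, 0), (1, 0, 1), (5, 0, 0), (6, 1, 0), (0, 5, 5), (1, 6, 4)]
train_y = [0, 0, 1, 1, 2, 2]

projected = [project(row, VECTORS) for row in train_x]
centroids = fit_centroids(projected, train_y)
predicted = predict(project((5, 1, 0), VECTORS), centroids)
print(predicted)
```

Any classifier that accepts numeric inputs could replace the nearest-centroid step; it is used here only because it fits in a few lines.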


 Post subject: Re: Linear discriminant analysis (LDA)
PostPosted: Mon Jun 14, 2010 5:35 pm 
Offline

Joined: Sun Jun 13, 2010 6:28 pm
Posts: 5
I have a two-class problem.

Let’s assume I have two variables (A and B). The linear combination coefficients, i.e. the output of the “Linear discriminant analysis (LDA)”, are 0.9 for variable A and 0.3 for variable B.

I would like to predict the group for the following data point: A = 10; B = 12. How do I use the function “DSOptimalSplit2()” with these parameters? I assume that the following proposal wouldn’t work:

INPUT PARAMETERS:
A - array[0..N-1], variable -> 10, 12
C - array[0..N-1], class numbers (0 or 1). -> ??
N - array size -> 1

Thanks in advance
David


 Post subject: Re: Linear discriminant analysis (LDA)
PostPosted: Tue Jun 15, 2010 5:37 am 
Offline
Site Admin

Joined: Fri May 07, 2010 7:06 am
Posts: 927
You should:
* generate A[] by calculating the dot product of every row of your training set with (0.9,0.3)
* fill C[] with the class numbers
* call DSOptimalSplit2()
* now you have the threshold value and the distribution of classes on the left and right sides
* then you just calculate 0.9*10+0.3*12 = 9+3.6 = 12.6 and see which side of the threshold it falls on

LDA is just a preprocessing method; you still have to train a linear/nonlinear classifier after applying LDA.
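The steps above can be sketched in plain Python. Note the assumptions: best_threshold() is a hypothetical stand-in that minimizes the number of training errors, whereas DSOptimalSplit2() is described as minimizing cross-entropy, so the two may pick different thresholds. The training data is invented for the example.

```python
# Sketch of the project -> split -> classify workflow (not ALGLIB code).

def project(rows, w):
    """Dot product of every training row with the coefficient vector w."""
    return [sum(x * c for x, c in zip(row, w)) for row in rows]


def best_threshold(a, c):
    """Hypothetical helper: try midpoints between sorted distinct values and
    keep the one with the fewest errors when each side predicts its majority."""
    values = sorted(set(a))
    if len(values) < 2:
        raise ValueError("all values are the same; no split is possible")
    best = None
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2.0
        left = [ci for ai, ci in zip(a, c) if ai < t]
        right = [ci for ai, ci in zip(a, c) if ai >= t]
        errors = (min(left.count(0), left.count(1))
                  + min(right.count(0), right.count(1)))
        if best is None or errors < best[0]:
            best = (errors, t)
    return best[1]


w = (0.9, 0.3)  # coefficients from the question
train_x = [(2.0, 3.0), (3.0, 1.0), (4.0, 4.0),      # made-up class 0 rows
           (11.0, 9.0), (12.0, 14.0), (9.0, 13.0)]  # made-up class 1 rows
train_c = [0, 0, 0, 1, 1, 1]

a = project(train_x, w)                # step 1: A[]
threshold = best_threshold(a, train_c) # steps 2-4: C[] and the split
score = 0.9 * 10 + 0.3 * 12            # step 5: new point, about 12.6
side = "right" if score >= threshold else "left"
print(threshold, score, side)
```

Which class the chosen side corresponds to still has to be read from the class distribution on that side.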


 Post subject: Re: Linear discriminant analysis (LDA)
PostPosted: Tue Jun 15, 2010 5:25 pm 
Offline

Joined: Sun Jun 13, 2010 6:28 pm
Posts: 5
Thanks!

Is the right side always class 0 and the left side always class 1? If not, how can I determine which side contains which class?


 Post subject: Re: Linear discriminant analysis (LDA)
PostPosted: Tue Jun 15, 2010 7:05 pm 
Offline
Site Admin

Joined: Fri May 07, 2010 7:06 am
Posts: 927
Quote:
Is the right side always class 0 and the left side always class 1?

No. You can examine the PAL/PBL ratio to see which class dominates the left side. If it is >1, the left side is class 0 (A in the parameter names); if it is <1, it is class 1 (B). However, this only means that A is more likely than B there. It doesn't mean that everything on the left is A: the classes may be inseparable (or only nonlinearly separable) and may overlap. In such a case DSOptimalSplit2() will try to find the best split possible, but sometimes even the best split can't separate the two classes well enough.
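As a rough illustration of reading the sides, the sketch below estimates the four probabilities as raw class frequencies on each side of a given threshold. That is an assumption for the example; how DSOptimalSplit2() computes PAL, PBL, PAR and PBR internally is not stated here. The data and the threshold of 5.0 are made up, and the classes deliberately overlap.

```python
# Sketch: frequency estimates of P(class | side) for a fixed threshold.

def side_probabilities(a, c, threshold):
    """Return (pal, pbl, par, pbr) as class frequencies on each side."""
    left = [ci for ai, ci in zip(a, c) if ai < threshold]
    right = [ci for ai, ci in zip(a, c) if ai >= threshold]

    def probs(labels):
        n = len(labels)
        if n == 0:
            return (0.0, 0.0)
        return (labels.count(0) / n, labels.count(1) / n)

    return probs(left) + probs(right)


a = [1.0, 2.0, 3.0, 4.0, 6.0, 7.0, 8.0, 9.0]  # projected values (made up)
c = [0, 0, 1, 0, 1, 1, 0, 1]                  # overlapping classes

pal, pbl, par, pbr = side_probabilities(a, c, 5.0)
left_class = 0 if pal > pbl else 1   # PAL/PBL > 1 means class 0 dominates
right_class = 0 if par > pbr else 1
print(pal, pbl, par, pbr, left_class, right_class)
```

Here the left side is mostly class 0 but still contains a class 1 sample, which is exactly the "more likely, not certain" situation described above.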

Furthermore, DSOptimalSplit2() doesn't take into account the difference between kinds of misclassification. It assumes that both classes are equally important. Counterexample: class 0 is worth +$1 to you, and class 1 is worth -$1000. You'll want to avoid #1 as much as possible, but DSOptimalSplit2() assumes that misclassifying #0 as #1 is as bad as misclassifying #1 as #0.
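One way to account for unequal costs is to choose the threshold that minimizes total misclassification cost on the training data. The sketch below is a hypothetical helper written for this example, not an ALGLIB function; the cost matrix layout cost[true][predicted] and the data are assumptions.

```python
# Sketch: cost-sensitive split for two classes (not part of ALGLIB).

def cost_sensitive_split(a, c, cost):
    """cost[t][p] is the cost of predicting p when the true class is t.
    Returns (total_cost, threshold, left_label, right_label)."""
    values = sorted(set(a))
    if len(values) < 2:
        raise ValueError("all values are the same; no split is possible")
    best = None
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2.0
        sides = ([ci for ai, ci in zip(a, c) if ai < t],
                 [ci for ai, ci in zip(a, c) if ai >= t])
        total, labels = 0.0, []
        for side in sides:
            # Cost of labelling the whole side as class 0 or as class 1.
            side_costs = [sum(cost[ci][p] for ci in side) for p in (0, 1)]
            p = 0 if side_costs[0] <= side_costs[1] else 1
            labels.append(p)
            total += side_costs[p]
        if best is None or total < best[0]:
            best = (total, t, labels[0], labels[1])
    return best


a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]  # projected values (made up)
c = [0, 1, 0, 0, 0, 1, 1, 1]

equal = cost_sensitive_split(a, c, [[0, 1], [1, 0]])
# Mistaking class 1 for class 0 costs 1000; the opposite mistake costs 1.
skewed = cost_sensitive_split(a, c, [[0, 1], [1000, 0]])
print(equal, skewed)
```

With equal costs the split lands at 5.5; with the skewed costs it moves to 1.5, so that no class 1 sample ends up on the side labelled class 0.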


 Post subject: Re: Linear discriminant analysis (LDA)
PostPosted: Tue Jun 15, 2010 9:22 pm 
Offline

Joined: Sun Jun 13, 2010 6:28 pm
Posts: 5
Thank you very much again! My code works well.

Is there any chance to take into account weighted classes in ALGLIB?


 Post subject: Re: Linear discriminant analysis (LDA)
PostPosted: Wed Jun 16, 2010 6:09 am 
Offline
Site Admin

Joined: Fri May 07, 2010 7:06 am
Posts: 927
Quote:
Is there any chance to take into account weighted classes in ALGLIB?

Yes, I think it will be added in 2.7.0, which is expected to be released in July. I want to document (at least partially) the bdss unit, particularly DSOptimalSplit2(), and to add support for binary problems with cost matrices. This idea was inspired by your question, BTW ;)


 Post subject: Re: Linear discriminant analysis (LDA)
PostPosted: Wed Jun 16, 2010 8:26 pm 
Offline

Joined: Sun Jun 13, 2010 6:28 pm
Posts: 5
Sounds good.

I ran LDA in ALGLIB and compared my results with the SPSS output. Both the functions and the results seem to differ. What are the differences between the two approaches?


 Post subject: Re: Linear discriminant analysis (LDA)
PostPosted: Thu Jun 17, 2010 9:58 am 
Offline
Site Admin

Joined: Fri May 07, 2010 7:06 am
Posts: 927
ALGLIB implements the simplest form of LDA: Fisher's LDA without regularization or other improvements. SPSS supports many other methods under the common name "discriminant analysis". So a difference between ALGLIB output and SPSS output doesn't mean that there is an error in ALGLIB.
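For reference, the textbook two-class Fisher direction is w proportional to Sw^-1 (m1 - m0), where Sw is the pooled within-class scatter matrix and m0, m1 are the class means. The sketch below computes it for two variables in plain Python. It is not ALGLIB's code, the data is invented, and the unit-length scaling at the end is just one possible convention: the direction is only defined up to a scale factor, so two packages can report different coefficients for the same direction.

```python
# Sketch: two-class Fisher direction for two variables (not ALGLIB code).
import math


def fisher_direction_2d(class0, class1):
    """Return the unit-length vector proportional to Sw^-1 (m1 - m0)."""
    def mean(rows):
        n = len(rows)
        return (sum(r[0] for r in rows) / n, sum(r[1] for r in rows) / n)

    def scatter(rows, m):
        sxx = sum((r[0] - m[0]) ** 2 for r in rows)
        sxy = sum((r[0] - m[0]) * (r[1] - m[1]) for r in rows)
        syy = sum((r[1] - m[1]) ** 2 for r in rows)
        return (sxx, sxy, syy)

    m0, m1 = mean(class0), mean(class1)
    s0, s1 = scatter(class0, m0), scatter(class1, m1)
    sxx, sxy, syy = (s0[i] + s1[i] for i in range(3))  # pooled scatter Sw
    det = sxx * syy - sxy * sxy
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    dx, dy = m1[0] - m0[0], m1[1] - m0[1]
    # Multiply the explicit 2x2 inverse of Sw by the mean difference.
    wx = (syy * dx - sxy * dy) / det
    wy = (-sxy * dx + sxx * dy) / det
    norm = math.hypot(wx, wy)
    return (wx / norm, wy / norm)


class0 = [(1, 2), (2, 1), (3, 3), (2, 2)]  # made-up samples
class1 = [(6, 5), (7, 7), (8, 6), (7, 6)]

w = fisher_direction_2d(class0, class1)
print(w)
```

Multiplying w by any positive constant gives the same ordering of projected points, so only the threshold has to be rescaled along with it.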

