forum.alglib.net

ALGLIB forum
 All times are UTC

Post subject: Linear discriminant analysis (LDA)
Posted: Sun Jun 13, 2010 7:31 pm

Joined: Sun Jun 13, 2010 6:28 pm
Posts: 5
Hello,

I have downloaded ALGLIB for VBA and implemented the code in my project. It works well so far. Thanks a lot for your effort.

However, after running the linear discriminant analysis (LDA) I was not able to predict the group, because I’m not sure how to build and use the discriminant function. I tried multiplying the variable values by the linear combination coefficients, but I don’t know how to interpret the result.

I would therefore highly appreciate some help. How do I use the linear combination coefficients to predict the group? Please explain in detail.

David

Post subject: Re: Linear discriminant analysis (LDA)
Posted: Mon Jun 14, 2010 6:59 am

Joined: Fri May 07, 2010 7:06 am
Posts: 915
Do you have a binary classification problem (two classes) or a multi-class one?

In the first case you just have to project the data onto the line determined by the coefficients you've found. Class #1 will be on one side, class #2 on the other. You can use the undocumented function DSOptimalSplit2() from the bdss unit, which can make the split for you. Here is a description of its parameters:

Code:
Optimal binary classification

Algorithm finds optimal (=with minimal cross-entropy) binary partition.
Internal subroutine.

INPUT PARAMETERS:
A       -   array[0..N-1], variable
C       -   array[0..N-1], class numbers (0 or 1).
N       -   array size

OUTPUT PARAMETERS:
Info    -   completion code:
            * -3, all values of A[] are same (partition is impossible)
            * -2, one of C[] is incorrect (<0, >1)
            * -1, incorrect parameters were passed (N<=0).
            *  1, OK
Threshold-  partition boundary. Left part contains values which are
            strictly less than Threshold. Right part contains values
            which are greater than or equal to Threshold.
PAL, PBL-   probabilities P(0|v<Threshold) and P(1|v<Threshold)
PAR, PBR-   probabilities P(0|v>=Threshold) and P(1|v>=Threshold)
CVE     -   cross-validation estimate of cross-entropy

-- ALGLIB --
Copyright 22.05.2008 by Bochkanov Sergey

If you have a multi-class classification problem, you have to project your data onto the top NClasses-1 eigenvectors obtained by LDA. Then you should use the projected values as inputs for some classification algorithm. There is no easy way to interpret such data when NClasses>2, so LDA is mostly used as a preprocessing tool.
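To illustrate the multi-class case, here is a minimal Python sketch of "project, then classify". Everything in it is an assumption made for illustration: the two projection vectors and the training rows are made up (they are not ALGLIB output), and the nearest-centroid rule is just one simple choice for the "some classification algorithm" step.

```python
# Sketch only: hypothetical data and projection vectors, not ALGLIB output.

def project_rows(rows, vectors):
    """Project each row onto every vector (one dot product per vector)."""
    return [[sum(x * v for x, v in zip(row, vec)) for vec in vectors]
            for row in rows]

def class_centroids(points, labels):
    """Mean projected point of every class."""
    sums, counts = {}, {}
    for p, y in zip(points, labels):
        acc = sums.setdefault(y, [0.0] * len(p))
        for i, value in enumerate(p):
            acc[i] += value
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

def nearest_centroid(point, centroids):
    """Class whose centroid has the smallest squared distance to point."""
    return min(centroids,
               key=lambda y: sum((p - q) ** 2
                                 for p, q in zip(point, centroids[y])))

# Three classes, so NClasses-1 = 2 projection vectors (hypothetical values).
vectors = [(1.0, 0.0, 0.5), (0.0, 1.0, -0.5)]
train = [(0, 0, 0), (1, 0, 0),      # class 0
         (5, 0, 2), (6, 1, 2),      # class 1
         (0, 6, 0), (1, 7, 2)]      # class 2
labels = [0, 0, 1, 1, 2, 2]

centroids = class_centroids(project_rows(train, vectors), labels)
new_point = project_rows([(5.5, 0.5, 1.0)], vectors)[0]
predicted = nearest_centroid(new_point, centroids)
print(predicted)
```

Any other classifier could take the place of nearest_centroid(); the point is only that it is trained on the projected values, not on the raw variables.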

Post subject: Re: Linear discriminant analysis (LDA)
Posted: Mon Jun 14, 2010 5:35 pm

Joined: Sun Jun 13, 2010 6:28 pm
Posts: 5
I have a two-class problem.

Let’s assume I have two variables (A and B). The linear combination coefficients output by the linear discriminant analysis (LDA) are 0.9 for variable A and 0.3 for variable B.

I would like to predict the group for the following data point: A = 10; B = 12. How do I use the function DSOptimalSplit2() with these values? I assume that the following would not work:

INPUT PARAMETERS:
A - array[0..N-1], variable -> 10, 12
C - array[0..N-1], class numbers (0 or 1). -> ??
N - array size -> 1

David

Post subject: Re: Linear discriminant analysis (LDA)
Posted: Tue Jun 15, 2010 5:37 am

Joined: Fri May 07, 2010 7:06 am
Posts: 915
You should:
* generate A[] by calculating the dot product of each row of your training set with (0.9, 0.3)
* fill C[] with the class numbers
* call DSOptimalSplit2()
* now you have the threshold value and the distribution of classes on the left and right sides
* then you just calculate 0.9*10+0.3*12 = 9+3.6 = 12.6 and see on which side of the threshold it falls

LDA is just a preprocessing method; you still have to train a linear or nonlinear classifier after applying it.
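The steps above can be sketched in Python as follows. Note the assumptions: the training rows are made up, and best_threshold() is a simple stand-in that minimizes training errors. It is not DSOptimalSplit2(), which minimizes cross-entropy, and it only keeps the same convention that values strictly less than the threshold go left.

```python
# Sketch only: best_threshold() is a hypothetical stand-in for the split step.

def project(rows, w):
    """Dot product of each training row with the coefficient vector w."""
    return [sum(x * c for x, c in zip(row, w)) for row in rows]

def best_threshold(a, c):
    """Midpoint threshold with the fewest training errors.

    Values < threshold fall on the left, values >= threshold on the right.
    Each side predicts its majority class, so either side may be class 0.
    """
    values = sorted(set(a))
    if len(values) < 2:
        raise ValueError("all projected values are equal; no split possible")
    best = None
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2.0
        left = [ci for ai, ci in zip(a, c) if ai < t]
        right = [ci for ai, ci in zip(a, c) if ai >= t]
        errors = (min(left.count(0), left.count(1))
                  + min(right.count(0), right.count(1)))
        if best is None or errors < best[0]:
            best = (errors, t)
    return best[1]

w = (0.9, 0.3)                       # coefficients from this thread
train = [(2.0, 3.0), (3.0, 1.0), (4.0, 4.0),       # hypothetical class 0 rows
         (14.0, 9.0), (12.0, 15.0), (16.0, 11.0)]  # hypothetical class 1 rows
labels = [0, 0, 0, 1, 1, 1]

a = project(train, w)                # this is A[]; labels is C[]
threshold = best_threshold(a, labels)

score = 0.9 * 10 + 0.3 * 12          # the new point A = 10, B = 12
side = "left" if score < threshold else "right"
print(threshold, score, side)
```

With this made-up training set the new point lands on the right side; which class that side represents still has to be read off from the class distribution on each side.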

Post subject: Re: Linear discriminant analysis (LDA)
Posted: Tue Jun 15, 2010 5:25 pm

Joined: Sun Jun 13, 2010 6:28 pm
Posts: 5
Thanks!

Is the right side always class 0 and the left side always class 1? If not, how can I determine which side contains which class?

Post subject: Re: Linear discriminant analysis (LDA)
Posted: Tue Jun 15, 2010 7:05 pm

Joined: Fri May 07, 2010 7:06 am
Posts: 915
Quote:
Is the right side always class 0 and the left side always class 1?

No. You can examine the PAL/PBL ratio to see which class dominates the left side. If it is >1, the left side is A (class 0). However, this only means that A is more likely than B there. It doesn't mean that everything on the left is A: the classes may be inseparable (or only nonlinearly separable) and may overlap. In that case DSOptimalSplit2() will try to find the best split possible, but sometimes even the best split can't separate the two classes well enough.
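As a rough illustration of reading the sides, the sketch below computes plain empirical class frequencies on each side of a given threshold and names them after the PAL/PBL/PAR/PBR outputs. This is an assumption for illustration only: the data and the threshold are made up, and ALGLIB may estimate these probabilities differently.

```python
# Sketch only: empirical frequencies on hypothetical data, not ALGLIB output.

def side_probabilities(a, c, threshold):
    """Class frequencies left (v < threshold) and right (v >= threshold)."""
    left = [ci for ai, ci in zip(a, c) if ai < threshold]
    right = [ci for ai, ci in zip(a, c) if ai >= threshold]

    def frac(side, cls):
        # Undefined (NaN) when a side is empty.
        return side.count(cls) / len(side) if side else float("nan")

    return {"PAL": frac(left, 0), "PBL": frac(left, 1),
            "PAR": frac(right, 0), "PBR": frac(right, 1)}

# Overlapping classes: no threshold separates them perfectly.
a = [1.0, 2.0, 3.0, 4.0, 6.0, 7.0, 8.0, 9.0]
c = [0, 0, 1, 0, 1, 1, 0, 1]

p = side_probabilities(a, c, threshold=5.0)
# Compare instead of dividing, so PBL = 0 does not cause a division by zero.
left_class = 0 if p["PAL"] > p["PBL"] else 1
right_class = 0 if p["PAR"] > p["PBR"] else 1
print(p, left_class, right_class)
```

Here the left side is mostly class 0, but a quarter of it is still class 1, which is the overlap situation described above.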

Furthermore, DSOptimalSplit2() doesn't take into account the difference between kinds of misclassification. It assumes that both classes are equally important. Counterexample: class 0 is +$1 for you, and class 1 is -$1000. You'll want to avoid #1 as much as possible, but DSOptimalSplit2() assumes that misclassifying #0 as #1 is as bad as misclassifying #1 as #0.
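To show what unequal costs would change, here is a small Python sketch that picks the threshold with the lowest total misclassification cost. It is not an ALGLIB function: the name cost_weighted_threshold(), the data, and the assumption that class 0 lies on the left (lower values) are all made up for illustration.

```python
# Sketch only: a hypothetical cost-aware split, not part of ALGLIB.

def cost_weighted_threshold(a, c, cost_0_as_1, cost_1_as_0):
    """Return (threshold, total_cost) with the lowest total cost.

    Assumes class 0 is predicted for values < threshold and class 1 otherwise.
    """
    values = sorted(set(a))
    candidates = ([values[0] - 1.0]
                  + [(lo + hi) / 2.0 for lo, hi in zip(values, values[1:])]
                  + [values[-1] + 1.0])
    best_t, best_cost = None, None
    for t in candidates:
        cost = 0.0
        for ai, ci in zip(a, c):
            predicted = 0 if ai < t else 1
            if ci == 0 and predicted == 1:
                cost += cost_0_as_1
            elif ci == 1 and predicted == 0:
                cost += cost_1_as_0
        if best_cost is None or cost < best_cost:
            best_t, best_cost = t, cost
    return best_t, best_cost

a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
c = [0, 0, 1, 0, 0, 1, 1, 1]

equal = cost_weighted_threshold(a, c, cost_0_as_1=1, cost_1_as_0=1)
weighted = cost_weighted_threshold(a, c, cost_0_as_1=1, cost_1_as_0=1000)
print(equal, weighted)
```

With equal costs the split accepts one class 1 point on the left. Once mistaking class 1 for class 0 costs 1000, the threshold moves left so that no class 1 point is accepted, at the price of rejecting two class 0 points.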

Post subject: Re: Linear discriminant analysis (LDA)
Posted: Tue Jun 15, 2010 9:22 pm

Joined: Sun Jun 13, 2010 6:28 pm
Posts: 5
Thank you very much again! My code works well.

Is there any chance to take into account weighted classes in ALGLIB?

Post subject: Re: Linear discriminant analysis (LDA)
Posted: Wed Jun 16, 2010 6:09 am

Joined: Fri May 07, 2010 7:06 am
Posts: 915
Quote:
Is there any chance to take into account weighted classes in ALGLIB?

Yes, I think it will be added to 2.7.0, which is expected to be released in July. I want to document (at least partially) the bdss unit, particularly DSOptimalSplit2(), and to add support for binary problems with cost matrices. This idea was inspired by your question, BTW ;)

Post subject: Re: Linear discriminant analysis (LDA)
Posted: Wed Jun 16, 2010 8:26 pm

Joined: Sun Jun 13, 2010 6:28 pm
Posts: 5
Sounds good.

I carried out the LDA in ALGLIB and compared my results with the SPSS output. It seems that both the functions and the results are different. What are the differences between the two approaches?

Post subject: Re: Linear discriminant analysis (LDA)
Posted: Thu Jun 17, 2010 9:58 am

Joined: Fri May 07, 2010 7:06 am
Posts: 915
ALGLIB implements the simplest form of LDA: Fisher's LDA without regularization or other improvements. SPSS supports many other methods under the common name "discriminant analysis". So a difference between the ALGLIB output and the SPSS output doesn't mean that there is an error in ALGLIB.
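For reference, the textbook two-class Fisher direction is w proportional to Sw^-1 (m1 - m0), where m0 and m1 are the class means and Sw is the within-class scatter matrix. The Python sketch below computes it for two variables on made-up data. It is not ALGLIB's code, and the unit-length scaling is an arbitrary choice: the direction is only defined up to scale and sign, so coefficients from two packages can differ in scale or sign and still describe the same line.

```python
# Sketch only: textbook two-class Fisher direction for 2 variables,
# on hypothetical data. Not ALGLIB's implementation.

def mean(rows):
    n = len(rows)
    return [sum(r[i] for r in rows) / n for i in range(2)]

def scatter(rows, m):
    """2x2 scatter matrix: sum of outer products of deviations from m."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - m[0], r[1] - m[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def fisher_direction(class0, class1):
    """Unit-length w proportional to Sw^-1 (m1 - m0)."""
    m0, m1 = mean(class0), mean(class1)
    s0, s1 = scatter(class0, m0), scatter(class1, m1)
    sw = [[s0[i][j] + s1[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    d = [m1[0] - m0[0], m1[1] - m0[1]]
    # Explicit inverse of the 2x2 matrix sw applied to d.
    w = [(sw[1][1] * d[0] - sw[0][1] * d[1]) / det,
         (-sw[1][0] * d[0] + sw[0][0] * d[1]) / det]
    norm = (w[0] ** 2 + w[1] ** 2) ** 0.5
    return [w[0] / norm, w[1] / norm]

class0 = [(1, 2), (2, 1), (2, 3), (3, 2)]
class1 = [(6, 5), (7, 4), (7, 6), (8, 5)]

w = fisher_direction(class0, class1)
proj0 = [w[0] * x + w[1] * y for x, y in class0]
proj1 = [w[0] * x + w[1] * y for x, y in class1]
print(w, max(proj0), min(proj1))
```

When comparing two packages, it usually helps to rescale both coefficient vectors to the same length and sign before concluding that the results differ.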
