forum.alglib.net http://forum.alglib.net/ |
Problem with random forest http://forum.alglib.net/viewtopic.php?f=2&t=3829 |
Page 1 of 1 |
Author: | antonysoldatov [ Wed Jan 03, 2018 1:07 pm ] |
Post subject: | Problem with random forest |
Hello! I have a problem using the random forest methods. Here is how I build the forest:
Code:
double[,] xy = new double[,] {{0, 0, 0, 0, 0, 255, 0, 0, 0}, {0, 0, 0, 0, 0, 0, 0, 0, 1}, {0, 255, 0, 0, 0, 0, 0, 0, 2}, {0, 0, 0, 0, 255, 0, 0, 0, 3}, {0, 0, 0, 0, 0, 0, 0, 255, 4}};
alglib.dfbuildrandomdecisionforestx1(xy, 8, 5, 5, 50, 3, 0.6, out info, out df, out rep);
When I then try to use it like this:
Code:
double[] x = new double[]{0, 0, 0, 0, 0, 0, 0, 0};
alglib.dfprocess(df, x, ref y);
I get a wrong classification result: {0.005, 0.49, 0, 0.505, 0}. The highest probability is the 4th value (0.505), but it should be the 2nd value (the input is an all-zero array, which is class 1). Please help me solve this problem. Thank you! |
Author: | Sergey.Bochkanov [ Wed Jan 03, 2018 4:59 pm ] |
Post subject: | Re: Problem with random forest |
Hi! Random forests are (no surprise!) randomized constructs. They randomly try many different classification schemes, with different variables being selected and different random datasets being generated for training. In particular, it is very likely that roughly 40% of your random trees will be trained without instances of class #2. And your toy dataset is not well suited for randomized methods: drop just one variable (say, the last one), and you cannot reliably distinguish between instances of classes #2 and #4. So it is completely normal to get such results on such a small toy dataset. Try training on a larger dataset, with noise added to the inputs. |
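The two effects described in this reply can be checked by hand on the five rows from the question. The sketch below is plain Python and does not call ALGLIB. It assumes (this is an assumption, not something stated in the thread) that each tree is trained on round(r * npoints) distinct rows, which gives 3 of 5 rows for r = 0.6. The helper names are hypothetical. Classes are identified by the 0-based labels in the last column of xy.

```python
from fractions import Fraction
from itertools import combinations

# Feature rows copied from the question; the last column of xy is the label.
X = [
    [0, 0, 0, 0, 0, 255, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 255, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 255, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 255],
]
labels = [0, 1, 2, 3, 4]


def fraction_of_subsets_missing(row_index, n_rows, r):
    """Share of subsets of size round(r * n_rows) that do not contain row_index.

    Assumption: a tree sees round(r * n_rows) distinct rows, all subsets equally likely.
    """
    size = round(r * n_rows)
    subsets = list(combinations(range(n_rows), size))
    missing = sum(1 for subset in subsets if row_index not in subset)
    return Fraction(missing, len(subsets))


def indistinguishable_pairs(rows, row_labels, dropped_column):
    """Label pairs whose rows become identical once dropped_column is removed."""
    reduced = [
        tuple(value for j, value in enumerate(row) if j != dropped_column)
        for row in rows
    ]
    return [
        (row_labels[a], row_labels[b])
        for a, b in combinations(range(len(rows)), 2)
        if reduced[a] == reduced[b]
    ]


# Share of 3-row subsamples that never see the all-zero row (label 1).
missing_share = fraction_of_subsets_missing(labels.index(1), len(X), 0.6)

# Rows that collide when the last variable (column index 7) is not used.
collisions = indistinguishable_pairs(X, labels, dropped_column=7)

print(missing_share)
print(collisions)
```

Under that assumption, 4 of the 10 possible three-row subsamples leave out the all-zero row, which matches the "roughly 40%" figure. Without the last variable, the row labeled 4 is identical to the all-zero row labeled 1, so a tree that does not split on that variable cannot separate the two.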
Author: | antonysoldatov [ Thu Jan 04, 2018 5:28 am ] |
Post subject: | Re: Problem with random forest |
Thank you for your reply and your advice! |
All times are UTC |