For ALGLIB's Multinomial Logistic Regression:
Looking at the source code for the function "mnltrainh", I see that training ends only when both of the following conditions are met: a) the function "spdmatrixcholeskysolve" returns true, and b) the function "logit_mnlmcsrch" returns certain status values (related to convergence tolerances, etc.).
In general, the "training loop" inside "mnltrainh" consists of: i) calculating the Hessian, ii) calculating the gradient, and iii) performing a multidimensional line search to move toward the optimal parameter values (a rough sketch of this structure follows below).
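For reference, here is a minimal sketch of the damped-Newton structure I understand this loop to follow. This is my own simplified reconstruction (for the binary case, in Python), NOT ALGLIB's actual code; the function name "train_logit_newton" and all details are hypothetical. The Cholesky solve and the backtracking search stand in for the roles I believe "spdmatrixcholeskysolve" and "logit_mnlmcsrch" play.

```
# Minimal damped-Newton loop for binary logistic regression.
# A simplified reconstruction of the structure described above,
# NOT ALGLIB's mnltrainh; all names here are hypothetical.
import numpy as np

def train_logit_newton(X, y, max_iters=50, tol=1e-9):
    n, d = X.shape
    w = np.zeros(d)

    def loss(w):
        z = X @ w
        # Negative log-likelihood: log(1 + exp(z)) - y*z, written stably.
        return np.sum(np.logaddexp(0.0, z) - y * z)

    for _ in range(max_iters):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))       # predicted probabilities
        g = X.T @ (p - y)                         # i) gradient
        H = X.T @ ((p * (1.0 - p))[:, None] * X)  # ii) Hessian (SPD in the regular case)

        # Newton step via Cholesky; this mirrors the role of
        # "spdmatrixcholeskysolve" -- if factorization fails, H is
        # not positive definite and the solver reports failure.
        try:
            L = np.linalg.cholesky(H)
        except np.linalg.LinAlgError:
            break
        step = -np.linalg.solve(L.T, np.linalg.solve(L, g))

        # iii) Backtracking (Armijo) line search along the Newton
        # direction; this plays the role of "logit_mnlmcsrch".  It only
        # guarantees the LOSS does not increase at each accepted step.
        f0, t = loss(w), 1.0
        while loss(w + t * step) > f0 + 1e-4 * t * (g @ step) and t > 1e-12:
            t *= 0.5

        w_new = w + t * step
        if abs(loss(w_new) - f0) < tol:
            return w_new
        w = w_new
    return w
```

Note that in this sketch the line search only enforces a sufficient decrease in the loss; whether that implies the weight-space distance to the optimum also shrinks is exactly what I am asking about below.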
Questions:

1) Is it true that after each iteration of the above loop, the network weights move closer to the "optimal" weights? That is, is the distance between the current network weights and the optimal network weights strictly decreasing with each iteration?

2) Does the answer to question 1 depend in any way on the return values of "spdmatrixcholeskysolve" or "logit_mnlmcsrch"? That is, if I terminate the training loop at an arbitrary iteration, regardless of those return values, am I guaranteed that the current network weights are "better" than the previous ones?
Thank you!