Now I understand you. I was talking about built-in support for parallel computation, i.e. the ability to split the Cholesky decomposition of a single large matrix across several cores. ALGLIB does not have this ability yet.
You are talking about running independent tasks in parallel, with the tasks spawned manually by you. ALGLIB is 100% compatible with this usage pattern, so in theory you should see a x24 speedup (compared with a 2-core system). I do not know why that is not the case.
Maybe the .NET framework cannot spread computations efficiently across so many cores? Or maybe your problems are very small (say, Cholesky of a 10x10 matrix), so that the parallelization overhead is comparable to the cost of solving the problem itself?
Can you test your parallelization framework with some dummy code (say, an empty loop with 10^9 iterations)? Do you get a x24 speedup in that case? I want to understand whether the problem is tied to the fact that your parallel code calls ALGLIB, or whether it shows up with any code, ALGLIB-dependent or not.
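Something like the following could serve as the dummy test. It is only a rough sketch: I am assuming C# and the standard Parallel.For from System.Threading.Tasks, since I do not know which framework you actually use to spawn your tasks, and the names ScalingProbe, BusyLoop and Measure are made up for illustration. The loop accumulates a sum instead of being truly empty, so that the JIT compiler cannot throw it away.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

public static class ScalingProbe
{
    // CPU-bound dummy work: no ALGLIB calls, no allocations, no shared state.
    // The sum is returned so the loop cannot be discarded as dead code.
    public static long BusyLoop(long iterations)
    {
        long acc = 0;
        for (long i = 0; i < iterations; i++)
            acc += i;
        return acc;
    }

    // Runs `tasks` independent copies of BusyLoop on at most `workers` threads
    // and returns the elapsed wall-clock time in seconds.
    public static double Measure(int workers, int tasks, long iterations)
    {
        var options = new ParallelOptions { MaxDegreeOfParallelism = workers };
        long sink = 0;
        var sw = Stopwatch.StartNew();
        Parallel.For(0, tasks, options, i =>
        {
            // Consume the result so that the work is observable.
            Interlocked.Add(ref sink, BusyLoop(iterations));
        });
        sw.Stop();
        return sw.Elapsed.TotalSeconds;
    }

    public static void Main()
    {
        int cores = Environment.ProcessorCount;
        const long iterations = 1000000000; // 10^9 per task, as suggested above

        // Short warm-up so JIT compilation and thread start-up are not timed.
        Measure(cores, cores, 1000000);

        // Same total amount of work in both runs: one task per core.
        double serial = Measure(1, cores, iterations);
        double parallel = Measure(cores, cores, iterations);

        Console.WriteLine("cores: {0}", cores);
        Console.WriteLine("1 worker:   {0:F2} s", serial);
        Console.WriteLine("{0} workers: {1:F2} s", cores, parallel);
        Console.WriteLine("speedup:    x{0:F1}", serial / parallel);
    }
}
```

Build it in Release mode and run it outside the debugger. The single-worker run does all the tasks one after another, so it may take a minute or so on a machine with many cores. If the reported speedup is close to the number of cores here but not in your real code, the problem is somehow related to the ALGLIB calls (or to the size of the problems being solved). If it is poor even here, the issue is in the parallelization layer or in the machine itself.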