How large are your matrices? Most ALGLIB algorithms are optimized for moderately sized or large data, where the allocation penalty is small compared with the cost of processing the data itself. I suppose that you call ALGLIB functions many times on very small amounts of data. Is that right?
I agree that an efficient algorithm should use dynamic allocations as infrequently as possible. Algorithms implemented within the last several years (the optimizers, for example) cache as much as possible, but some older algorithms (linear algebra, for example) still do not store dynamically allocated arrays between calls.
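Just to make the idea concrete, here is a minimal sketch of what "storing temporaries between calls" means from a user's point of view. It is not ALGLIB internals - the names Workspace and process_block are placeholders for illustration only:
Code:
// Generic illustration of caching a dynamically allocated workspace
// between calls instead of reallocating it on every call.
#include <vector>
#include <cstddef>

struct Workspace
{
    std::vector<double> tmp;          // scratch buffer reused across calls

    void reserve(std::size_t n)       // grow only when a larger problem arrives
    {
        if(tmp.size()<n)
            tmp.resize(n);
    }
};

void process_block(const double *data, std::size_t n, Workspace &ws)
{
    ws.reserve(n);                    // no allocation on repeated same-size calls
    for(std::size_t i=0; i<n; i++)
        ws.tmp[i] = data[i]*2.0;      // placeholder for the real computation
}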
However, this has only a moderate influence on their performance. I've run experiments with rmatrixsvd() - it achieves quite good speedup with 4 threads even on small matrices:
Code:
MATRIX SIZE    SPEEDUP
        512       3.89
         64       3.88
         32       3.69
         16       2.94
All timings were done on my Intel Core2 with 4 cores and 4 worker threads. You can see that the speedup decreases for small matrices (due to the dynamic allocation penalty), but even at N=32 calls to rmatrixsvd() have only moderate dynamic allocation overhead. So I suppose that either you are working with very small matrices (below N=32), or the performance deterioration you've seen has other causes.
The attachment contains the source code I used for this test. ALGLIB 3.3.0 was used.
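If you want to reproduce the measurement quickly, here is a rough single-threaded sketch of the same idea. The attached program is more complete and compares 1 vs. 4 worker threads; this cut-down version only measures per-call cost through the public rmatrixsvd() interface, and constants like the matrix size and round count are arbitrary:
Code:
#include <cstdio>
#include <ctime>
#include <cstdlib>
#include "linalg.h"

using namespace alglib;

int main()
{
    const int n = 32;        // small matrix, where allocation overhead matters most
    const int rounds = 1000; // arbitrary number of repetitions

    // fill a random n x n matrix
    real_2d_array a;
    a.setlength(n, n);
    for(int i=0; i<n; i++)
        for(int j=0; j<n; j++)
            a[i][j] = 2.0*rand()/(double)RAND_MAX-1.0;

    real_1d_array w;
    real_2d_array u, vt;

    // time repeated SVD calls (full U and VT requested, additional memory allowed)
    clock_t t0 = clock();
    for(int k=0; k<rounds; k++)
        rmatrixsvd(a, n, n, 2, 2, 2, w, u, vt);
    clock_t t1 = clock();

    printf("N=%d: %.3f ms per call\n", n, 1000.0*(t1-t0)/CLOCKS_PER_SEC/rounds);
    return 0;
}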
In any case, you've raised an interesting question, and I think that the ability to store temporaries between calls to linear algebra functions (and some other functions too) will be added in one of the next ALGLIB releases (3.4.0 is almost ready and there is not much time left before release, so it may have to wait for 3.5.0).