twkie wrote:
As examples of 'C' only development environments:
Here's the link to the CUDA development page.
http://developer.nvidia.com/object/gpucomputing.html
This is the link to FPGA development.
http://www.directinsight.co.uk/products ... loper.html
I've heard of CUDA. But, unfortunately, it has many limitations; one of the biggest is the "no recursion" rule. I think it is impossible to use a "general" library on CUDA, i.e. everyone has to reinvent the wheel once more, this time in CUDA fashion. I expected more from Intel Larrabee, but it is unclear when Intel will bring it to market.
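To illustrate the kind of rework the "no recursion" rule forces, here is a minimal sketch (plain C, purely hypothetical example; real CUDA device code would look different) of a recursive routine next to the iterative rewrite a recursion-free target would require:

```c
#include <assert.h>

/* Recursive version: n-th Fibonacci number, for comparison.
   A "no recursion" target cannot compile this form. */
static int fib_recursive(int n)
{
    return n < 2 ? n : fib_recursive(n - 1) + fib_recursive(n - 2);
}

/* Iterative rewrite: same result, no call-stack recursion,
   so it would be acceptable on a recursion-free target. */
static int fib_iterative(int n)
{
    int a = 0, b = 1, i;
    for (i = 0; i < n; i++) {
        int t = a + b;
        a = b;
        b = t;
    }
    return a;
}
```

For a toy function like this the rewrite is trivial, but for a library built around recursive algorithms it means restructuring the code, which is exactly the "reinvent the wheel in CUDA fashion" problem.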
twkie wrote:
BTW, are you planning to put in some pseudo random number generator?
....
Also, have you looked at the Sandia's optimizer?
https://share.sandia.gov/news/resources ... /acro.html.
Thanks for the suggestions, I'll add them to the issue tracker. I've always been interested in taking input from users.
As for my plans, in the near future I want to:
* add more optimization algorithms (optimization with general constraints, preconditioned optimization)
* add more interpolation/fitting algorithms (constrained Levenberg-Marquardt fitting, 2/3-dimensional interpolation/fitting)
* add support for parallel computations
* make better use of SSE instructions and low level assembly in general
I've already thought about pseudo-random numbers, although no specific algorithms have been planned yet. I think such functionality will have to wait until mid-2011.
twkie wrote:
With plain C code - I make it a point NEVER to dynamically (re)allocate memory within core numerical code.
....
In this way, I can control memory usage (very critical) in some applications, and I can reuse memory for repeated calls etc. I work with very time critical/sensitive systems and I ensure, whenever possible, that dynamically allocated memory is all reserved at the start up and no spurious allocation/deallocation is performed during runs.
You have a point here. But it is impossible for ALGLIB internals to completely avoid dynamic memory allocation. ALGLIB is designed in such a way that a solver (or another complex algorithm) may allocate memory internally, and it is impossible to tell in advance how much memory it will need. Passing a "large enough temporary array" for temporary allocations is not a good solution, because: 1) it is hard to tell what size is "large enough", and 2) it is hard to distinguish temporary allocations from non-temporary ones. The package would have to be completely reimplemented to solve these problems.
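For readers following the discussion, here is a minimal sketch of the preallocation pattern twkie describes: all memory is reserved once at startup and reused across repeated calls, so the time-critical inner routine never touches malloc/free. The names (solver_state, solver_prepare, solver_run) are illustrative only and not part of any real ALGLIB API:

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct {
    double *work;   /* preallocated scratch buffer */
    size_t  n;      /* problem size the buffer was sized for */
} solver_state;

/* One-time setup: the only place that allocates. */
static int solver_prepare(solver_state *s, size_t n)
{
    s->work = malloc(n * sizeof(double));
    s->n = n;
    return s->work != NULL;
}

/* Time-critical routine: reuses s->work, performs no allocation.
   The computation here (doubling and summing) is just a stand-in
   for real numerical work. */
static double solver_run(solver_state *s, const double *x)
{
    double sum = 0.0;
    size_t i;
    for (i = 0; i < s->n; i++) {
        s->work[i] = x[i] * 2.0;   /* scratch usage */
        sum += s->work[i];
    }
    return sum;
}

/* One-time teardown, after all runs are finished. */
static void solver_free(solver_state *s)
{
    free(s->work);
    s->work = NULL;
}
```

This works when the workspace size is known up front; the difficulty described above is precisely that ALGLIB's solvers cannot always predict their memory needs before they start.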
twkie wrote:
If you use the ae_matrix, and some simple accessor functions like double *ae_at(ae_matrix, unsigned int, unsigned int) that looks in there and provides access, you can almost do a find/replace.
What do you think about another solution? An "interface" matrix type, which would act as a mediator between user-allocated arrays and ALGLIB-managed arrays:
Code:
struct ae_interface_matrix
{
    double *p_items;   /* user-owned storage, row-major */
    int rows, cols;
};
ae_interface_matrix matrix_from_pointer(double *p, int rows, int cols);
....
double a[10][10];
int p[10];
alglib_rmatrixlu(matrix_from_pointer(&a[0][0], 10, 10), 10, 10, vector_from_pint(p, 10));
This type would represent a user-managed array with additional size information. It would be created before calls to ALGLIB functions and passed by value, but it would be lightweight, so I hope the performance penalty won't be too high. The benefits are: 1) no need to allocate/free ALGLIB-managed arrays (although complex structures still need alloc/free calls); arrays are managed by the user alone; and 2) the ability to check that the user passed an array which is large enough (i.e. to handle the situation when we want to return 10 elements but the array has room for only 9).
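To make the idea concrete, here is a self-contained sketch of how the wrapper and the size check might look. This is a simplified illustration, not the actual ALGLIB API: storage is assumed to be a user-owned row-major block, and matrix_can_hold/matrix_at are hypothetical helper names:

```c
#include <stddef.h>

typedef struct {
    double *p_items;  /* user-owned storage, row-major */
    int rows, cols;
} ae_interface_matrix;

/* Wrap a caller-provided block; the struct is lightweight and can
   be passed by value without a noticeable performance penalty. */
static ae_interface_matrix matrix_from_pointer(double *p, int rows, int cols)
{
    ae_interface_matrix m;
    m.p_items = p;
    m.rows = rows;
    m.cols = cols;
    return m;
}

/* The size check the wrapper makes possible: a callee that needs to
   write an r x c result can verify the destination is big enough. */
static int matrix_can_hold(ae_interface_matrix m, int r, int c)
{
    return m.rows >= r && m.cols >= c;
}

/* Element access into the wrapped storage. */
static double *matrix_at(ae_interface_matrix m, int i, int j)
{
    return &m.p_items[(size_t)i * m.cols + j];
}
```

Since the wrapper holds only a pointer and two ints, building it at each call site costs almost nothing, while the library gains enough information to reject undersized destination arrays instead of writing out of bounds.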
What do you think about such a solution?