forum.alglib.net

ALGLIB forum
 Post subject: PLINQ/Task vs Rbf
PostPosted: Thu Nov 15, 2012 11:54 am 

Joined: Fri May 13, 2011 12:09 pm
Posts: 7
Hi, is it possible (i.e. is it thread-safe) to evaluate an RBF model in parallel in the C# version of ALGLIB, for example:
Code:
var res = points.AsParallel().Select(x =>
{
  double[] result; // rbfcalc returns the output vector through an out array
  alglib.rbfcalc(model, x, out result);
  return result;
});


 Post subject: Re: PLINQ/Task vs Rbf
PostPosted: Fri Nov 16, 2012 7:01 am 
Site Admin

Joined: Fri May 07, 2010 7:06 am
Posts: 927
No, it is not thread-safe.

Each rbfcalc() call involves a search for the nearest neighbors of a point. That search uses fields of the model as temporaries, and they are modified during the search. So two parallel calls to rbfcalc() on the same model will run two searches simultaneously, both sharing the same internal buffers.


 Post subject: Re: PLINQ/Task vs Rbf
PostPosted: Fri Nov 16, 2012 7:02 am 
Site Admin

Joined: Fri May 07, 2010 7:06 am
Posts: 927
BTW, this does not mean that you can't call rbfcalc() on two completely different structures at the same time. You just can't use one RBF model from two threads.
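If only one model instance is available and it is reached from several threads, one simple way to respect the restriction above is to serialize access with a lock. The sketch below is a minimal illustration, not part of ALGLIB: `LockedModel` and `Use` are hypothetical names, and the `int[]` in `Main` is only a stand-in for a model with mutable internal state. It makes the calls safe but does not make them faster, because only one thread evaluates at a time.

```csharp
using System;
using System.Linq;

// Hypothetical wrapper: lets many threads share one non-thread-safe object
// by allowing only one of them to touch it at a time.
public sealed class LockedModel<TModel>
{
    private readonly object gate = new object();
    private readonly TModel model;

    public LockedModel(TModel model)
    {
        this.model = model;
    }

    // Runs f on the wrapped object while holding the lock.
    public TOut Use<TOut>(Func<TModel, TOut> f)
    {
        lock (gate)
        {
            return f(model);
        }
    }
}

public static class LockDemo
{
    public static void Main()
    {
        // Stand-in for a model whose evaluation mutates internal state.
        var shared = new LockedModel<int[]>(new int[1]);

        // 1000 parallel "evaluations"; the lock keeps the updates consistent.
        Enumerable.Range(0, 1000).AsParallel().ForAll(i => shared.Use(a => a[0]++));

        Console.WriteLine(shared.Use(a => a[0]));
    }
}
```

With ALGLIB, the delegate passed to `Use` would wrap the `alglib.rbfcalc(model, x, out y)` call shown earlier in the thread.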


 Post subject: Re: PLINQ/Task vs Rbf
PostPosted: Fri Nov 16, 2012 5:06 pm 

Joined: Fri May 13, 2011 12:09 pm
Posts: 7
Sergey.Bochkanov wrote:
No, it is not thread-safe.

I have a very large dataset (~100,000 points, sometimes even ~1,000,000).
Can I still achieve a significant performance gain if I somehow make a copy
of the RBF model structure? Is it possible to make a shallow copy of the fields that don't
change during evaluation of the model and a deep copy of the fields that do change?


 Post subject: Re: PLINQ/Task vs Rbf
PostPosted: Sat Nov 17, 2012 8:06 am 
Site Admin

Joined: Fri May 07, 2010 7:06 am
Posts: 927
You can copy the structure by serializing it and then performing several unserializations from the same source. However, your task is really huge: you will need more than 100 MB just to store the kd-tree search structure used by the RBF model. So you may face serious issues with memory size, bus bandwidth and CPU cache size if you try to parallelize computations on a multi-core computer. However, it may be worth trying :)

Regarding "smart" copying: it is possible, and it should work, but I can't say in a few words which fields should be copied and which should not. There are two main places where memory is consumed: the rbfmodel fields that store centers/weights, and the internals of the kd-tree search structure. You can examine the ALGLIB source and determine which fields are changed only when the structure is initialized. Those fields can be shared between different copies of the model.
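The copy-per-thread idea described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's own mechanism: `PerThreadModel.Evaluate` and `ScratchModel` are hypothetical names, and `ScratchModel` only imitates a model that writes to internal scratch storage so that the example runs without ALGLIB. The comment block shows how the two delegates might be wired to ALGLIB, assuming `alglib.rbfserialize(model, out s)` and `alglib.rbfunserialize(s, out m)` have these signatures in your version (check the reference manual).

```csharp
using System;
using System.Linq;
using System.Threading;

public static class PerThreadModel
{
    // Evaluates all points in parallel. Each worker thread lazily creates its
    // own model copy via makeCopy, so no internal buffers are shared.
    public static TOut[] Evaluate<TModel, TIn, TOut>(
        Func<TModel> makeCopy,
        Func<TModel, TIn, TOut> evaluate,
        TIn[] points)
    {
        using (var local = new ThreadLocal<TModel>(makeCopy))
        {
            return points
                .AsParallel()
                .AsOrdered() // keep results in the same order as the inputs
                .Select(p => evaluate(local.Value, p))
                .ToArray();
        }
    }
}

// Stand-in for a model that is not thread-safe: Calc writes to a scratch
// buffer owned by the instance before producing its result.
public sealed class ScratchModel
{
    private readonly double[] scratch = new double[1];

    public double[] Calc(double[] x)
    {
        scratch[0] = 0;
        foreach (double v in x) scratch[0] += v * v;
        return new[] { scratch[0] };
    }
}

public static class CopyDemo
{
    public static void Main()
    {
        double[][] points = Enumerable.Range(0, 1000)
            .Select(i => new double[] { i, 1.0 })
            .ToArray();

        // Assumed ALGLIB wiring (signatures not verified here):
        //   string s; alglib.rbfserialize(model, out s);
        //   makeCopy: () => { alglib.rbfmodel m; alglib.rbfunserialize(s, out m); return m; }
        //   evaluate: (m, x) => { double[] y; alglib.rbfcalc(m, x, out y); return y; }
        double[][] res = PerThreadModel.Evaluate<ScratchModel, double[], double[]>(
            () => new ScratchModel(),
            (m, x) => m.Calc(x),
            points);

        Console.WriteLine(res.Length);
    }
}
```

Each worker thread holds a full copy of the model, so with the kd-tree size mentioned above the memory cost grows with the number of threads PLINQ uses; `WithDegreeOfParallelism` can cap that number if memory is tight.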


 Post subject: Re: PLINQ/Task vs Rbf
PostPosted: Sat Nov 17, 2012 10:10 am 

Joined: Fri May 13, 2011 12:09 pm
Posts: 7
Thanks. By the way, are there any plans to parallelize ALGLIB? In my opinion, the lack of parallel versions of the algorithms will be a huge drawback of ALGLIB in the future, for example if you have a Core i7 CPU and 32 GB of RAM...


 Post subject: Re: PLINQ/Task vs Rbf
PostPosted: Sat Nov 17, 2012 12:28 pm 
Site Admin

Joined: Fri May 07, 2010 7:06 am
Posts: 927
Yes, we plan to release a multicore version of ALGLIB in the first months of 2013. The lack of one really is a huge drawback, because many important algorithms can benefit greatly from parallelization. We've already implemented a framework for scheduling tasks across cores (for various reasons we do not want to use the SMP features of .NET 4 or OpenMP), and it is now in the testing phase.


 Post subject: Re: PLINQ/Task vs Rbf
PostPosted: Sun Nov 18, 2012 1:40 pm 

Joined: Fri May 13, 2011 12:09 pm
Posts: 7
Hi again, one more question, this time about memory consumption: I'm trying to run the following test code on Win7 x64 with 4 GB of RAM, and the program crashes with an out-of-memory error somewhere during model construction:
Code:
            alglib.rbfmodel model;
            int N = 1000*400;

            // long avoids silent int overflow if N is raised much further
            long expected_mem = 250L * N * (sizeof(double) + 2 * sizeof(int));
            Console.WriteLine( "Expected memory consumption = {0:N}", expected_mem );

            int nx = 3;
            int ny = 1;
            alglib.rbfcreate( nx, ny, out model );

            var rnd = new Random();
            var data = new double[N, nx + ny];
            var pts = new double[N][];
            var res = new double[N][];

            for ( int i = 0; i < N; ++i ) {
                data[i, 0] = -1 + rnd.NextDouble() * 2;
                data[i, 1] = -1 + rnd.NextDouble() * 2;
                data[i, 2] = -1 + rnd.NextDouble() * 2;
                data[i, 3] = Math.Sin( data[i, 0] ) * Math.Cos( data[i, 1] ) * data[i, 2];

                pts[i] = new double[nx];
                pts[i][0] = -1 + rnd.NextDouble() * 2;
                pts[i][1] = -1 + rnd.NextDouble() * 2;
                pts[i][2] = -1 + rnd.NextDouble() * 2;

                res[i] = new double[ny];
            }

            Console.WriteLine( "Data initialized, {0:N} points", N );

            var mem_0 = GC.GetTotalMemory( true );
            var s = Stopwatch.StartNew();

            alglib.rbfreport rep;
            alglib.rbfsetpoints( model, data );
            alglib.rbfbuildmodel( model, out rep );

            Console.WriteLine( "Model built, time = {0:N} ms", s.ElapsedMilliseconds );
            GC.Collect();
            var mem_1 = GC.GetTotalMemory( true );
            Console.WriteLine( "Allocated {0:N} bytes", mem_1 - mem_0 );
           
            s = Stopwatch.StartNew();
            for ( int j = 0; j < N; ++j ) {
                alglib.rbfcalc( model, pts[j], out res[j] );
            }
            Console.WriteLine( "Computation done! time = {0:N} ms", s.ElapsedMilliseconds );

            Console.ReadLine();


As far as I understand, 4 GB of RAM should be sufficient to create a model from 400,000 points? Or not?


 Post subject: Re: PLINQ/Task vs Rbf
PostPosted: Mon Nov 19, 2012 4:54 pm 
Site Admin

Joined: Fri May 07, 2010 7:06 am
Posts: 927
Hello! I have not had enough time to solve this issue today, but I will investigate it tomorrow and report the results in this topic.


 Post subject: Re: PLINQ/Task vs Rbf
PostPosted: Tue Nov 20, 2012 7:11 am 
Site Admin

Joined: Fri May 07, 2010 7:06 am
Posts: 927
The problem is that the algorithm needs a lot of memory for its internal calculations. You have 400,000 points, and each of them has many neighbors whose influence must be accounted for. In your setting it allocates a double[] array with 500,000,000 elements to store the weights matrix. The .NET Framework has an upper limit on array size (2 GB), so you cannot allocate such a large array under .NET even when you compile for a 64-bit architecture.
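The arithmetic behind this answer can be checked with a few lines. The sketch below only multiplies the element count quoted in the post by the size of a double and compares it with the 2 GB limit; `ArraySizeCheck` and its members are hypothetical helper names, and the small array header is ignored.

```csharp
using System;

public static class ArraySizeCheck
{
    // The 2 GB per-array limit quoted in the post above.
    public const long LimitBytes = 2L * 1024 * 1024 * 1024;

    // Payload size in bytes of a double[] with the given element count.
    public static long BytesForDoubles(long count)
    {
        return count * sizeof(double);
    }

    // True if an array of that many doubles stays within the limit.
    public static bool FitsInOneArray(long count)
    {
        return BytesForDoubles(count) <= LimitBytes;
    }

    public static void Main()
    {
        long elements = 500000000L; // element count quoted in the post

        Console.WriteLine("{0:N0} bytes needed, limit is {1:N0} bytes",
            BytesForDoubles(elements), LimitBytes);
        Console.WriteLine("Fits in one array: {0}", FitsInOneArray(elements));
    }
}
```

500,000,000 doubles need 4,000,000,000 bytes, roughly twice the limit, so the allocation fails regardless of how much RAM is installed. On .NET Framework 4.5 and later, 64-bit processes can opt in to larger arrays through the `gcAllowVeryLargeObjects` configuration element; whether that is enough for this particular model is not something the thread confirms.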

