I am attempting to write a program that performs two different kinds of interpolation, and for these I am using the ALGLIB library. Unfortunately, I am having problems with the length of time one of them takes to execute.

The first sort of interpolation reads in a CSV file that mainly contains 0 values, with some points scattered throughout. The aim is to replace all the 0 values with sensible estimates derived from the given points.

This is the main interpolation function. Prior to this, the file will have been read into a 2D array, and the points selected out of it and stored in the real_2d_array ‘XY’.
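For illustration only, the point-selection step might look roughly like the sketch below. This is not the actual code from the program: `KnownPoint` and `CollectKnownPoints` are hypothetical names, and a `std::vector` stands in for the `real_2d_array` so that the sketch compiles without ALGLIB. The coordinate order (row first, then column) is an assumption chosen to match the `rbfcalc2(model, y, x)` call used later.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical type: one known sample, stored as (row, column, value).
struct KnownPoint {
    double row;
    double col;
    double value;
};

// Collect every non-zero cell of a row-major grid.
// Assumption: a value of exactly 0 means "no data", as in the CSV files described.
std::vector<KnownPoint> CollectKnownPoints(const std::vector<std::vector<double>>& grid)
{
    std::vector<KnownPoint> points;
    for (std::size_t y = 0; y < grid.size(); ++y) {
        for (std::size_t x = 0; x < grid[y].size(); ++x) {
            if (grid[y][x] != 0.0) {
                points.push_back({static_cast<double>(y), static_cast<double>(x), grid[y][x]});
            }
        }
    }
    return points;
}
```

Each collected entry would then become one row of ‘XY’, with the two coordinates in the same order that is later passed to the model when reading values back.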

**Code:**

/* --------------------------------------------------------------------------
   Function Name: SurfaceInterpolation
   Description: This function will interpolate our data.
                It uses Alglib's rbfsetalgomultilayer to
                estimate the point values in the file.
   Input Parameters: None
   Return Value: None
   -------------------------------------------------------------------------- */
void SurfaceInterpolation(void)
{
    //These are the calculated parameters for the algorithm call.
    double Radius = 0 ;
    alglib::ae_int_t Layers = 0 ;

    //Debug - diagnostics
    TRACE("Entered function.\n") ;

    //We want to create the model
    alglib::rbfcreate(2, 1, model);

    //Debug - diagnostics
    TRACE("created model.\n") ;

    //We now need to set the XY points in our model.
    alglib::rbfsetpoints(model, XY) ;

    //Debug - diagnostics
    TRACE("set points.\n") ;

    // After we've configured model, we should rebuild it -
    // it will change coefficients stored internally in the
    // rbfmodel structure.
    alglib::rbfreport rep ;

    //We now want to call rbfsetalgomultilayer but we need to calculate the parameters for this.
    //The first parameter is the model, which we have set up.
    //The second parameter is the radius base.
    //The third parameter is the number of layers.
    //The fourth parameter is lambdaV, which for now can be ignored and left as the default (hence not included)

    //The radius must be "several times" larger than the average distance between points
    Radius = 3 * AverageDistance ;

    //Debug - diagnostics
    TRACE("set radius : %f.\n", Radius) ;

    //The number of layers must be so that RLast = RBase/2^(NLayers-1) will be smaller than the typical distance between points.
    //For now, this is just double the radius.
    Layers = LAYER_VALUE ;

    //Debug - diagnostics
    TRACE("set layers : %i.\n", (int)Layers) ;

    //Now let's set up the model with this algorithm.
    alglib::rbfsetalgomultilayer(model, Radius, Layers) ;

    //Debug - diagnostics
    TRACE("set algorithm.\n") ;

    //And rebuild the model so the changes take effect.
    alglib::rbfbuildmodel(model, rep);

    //Debug - diagnostics
    TRACE("built model.\n") ;

    //Now we want to update our grid with the new values
    UpdateGrid() ;

    //Debug - diagnostics
    TRACE("updated grid.\n") ;
}
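The layer rule quoted in the comments (RLast = RBase/2^(NLayers-1) must be smaller than the typical distance between points) could be computed rather than fixed by a constant. The following is a minimal sketch of that arithmetic only, assuming the typical distance is already known; `LayersForRadius` is a hypothetical helper and not part of ALGLIB.

```cpp
// Hypothetical helper: smallest layer count for which
// RLast = radiusBase / 2^(layers - 1) is strictly below typicalDistance.
int LayersForRadius(double radiusBase, double typicalDistance)
{
    // Guard against meaningless input; a single layer is the minimum.
    if (radiusBase <= 0.0 || typicalDistance <= 0.0) {
        return 1;
    }

    int layers = 1;
    double radiusLast = radiusBase;  // with one layer, RLast equals RBase

    // Each extra layer halves the radius of the final layer.
    while (radiusLast >= typicalDistance) {
        radiusLast /= 2.0;
        ++layers;
    }
    return layers;
}
```

With a base radius of three times the average distance, as in the function above, this gives 3 layers, since 3/2^2 = 0.75 of the average distance.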

The second interpolation method reads in a file which mainly consists of points, with various clusters of 0 values. The aim here is to "fill in" the zero values, given the surrounding values. Again, prior to this, the file will have been read into a 2D array, and the points selected out of it and stored in the real_2d_array ‘XY’.

**Code:**

/* --------------------------------------------------------------------------
   Function Name: PointsInterpolation
   Description: This function will interpolate our data.
                It uses Alglib's rbfsetalgoqnn to estimate
                the point values in the file.
   Input Parameters: None
   Return Value: None
   -------------------------------------------------------------------------- */
void PointsInterpolation(void)
{
    //We want to create the model
    alglib::rbfcreate(2, 1, model) ;

    //We now need to set the XY points in our model.
    alglib::rbfsetpoints(model, XY) ;

    // After we've configured model, we should rebuild it -
    // it will change coefficients stored internally in the
    // rbfmodel structure.
    alglib::rbfreport rep ;
    alglib::rbfsetalgoqnn(model) ;
    alglib::rbfbuildmodel(model, rep) ;

    //Now we want to update our grid with the new values
    UpdateGrid() ;
}

The final function call in each case, to UpdateGrid, extracts the values from the created model back into the 2D array, Grid[][], as the file will be saved again to a CSV.

**Code:**

/* --------------------------------------------------------------------------
   Function Name: UpdateGrid
   Description: This function will populate the grid array
                with the new values from the interpolation
                model. It will also ensure they are not negative.
   Input Parameters: None
   Return Value: None
   -------------------------------------------------------------------------- */
void UpdateGrid(void)
{
    double OurTempValue = 0 ; //Retrieved value from model
    double LowestValue = 1 ;  //Lowest number from model

    //We want to loop through all our points, updating them from the model.
    for (int y = 0; y < GRID_Y; y++)
    {
        for (int x = 0; x < GRID_X; x++)
        {
            //We now want to retrieve our new value from the model,
            //and save it to our temp variable.
            OurTempValue = alglib::rbfcalc2(model, y, x) ;
            Grid[y][x] = OurTempValue ;

            //Here we are checking if we have a negative number, because we are looking for the lowest.
            if(OurTempValue < 0)
            {
                //Yep, it's negative! But is it less than our previously found negative?
                if(OurTempValue < LowestValue)
                {
                    //Indeed it is, so let's add this as our lowest now.
                    LowestValue = OurTempValue ;
                }
            }
        }
    }

    //Now we want to add the lowest negative value to all values so let's make a call to do that,
    //but only if we found any.
    if(LowestValue < 1)
    {
        RaiseValues(LowestValue * -1) ; //* -1 to make it positive
    }
}

The purpose of the last function call is to shift all the values in the array, so we don't get any negative ones. It's not crucial what the actual value is, just how they relate to one another (so -1 and 3 are just as OK as 1 and 5, providing it is consistent across the file).
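The body of RaiseValues is not shown here. As an illustration of the shift being described, a minimal sketch might look like the following; `ShiftToNonNegative` is a hypothetical stand-in that finds the lowest value and applies the offset in one go, and it uses a `std::vector` grid rather than the real Grid[][] array.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for the "find lowest, then raise" step:
// if any value is negative, add the same offset to every cell so the
// lowest value becomes 0. Differences between cells are preserved.
void ShiftToNonNegative(std::vector<std::vector<double>>& grid)
{
    // Start at 0 so that a grid with no negatives is left untouched.
    double lowest = 0.0;
    for (const auto& row : grid) {
        for (double v : row) {
            lowest = std::min(lowest, v);
        }
    }

    if (lowest < 0.0) {
        for (auto& row : grid) {
            for (double& v : row) {
                v -= lowest;  // subtracting a negative raises the value
            }
        }
    }
}
```

For example, a grid holding -1 and 3 would end up holding 0 and 4, which keeps the same difference between the two values.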

Both of the interpolation functions appear to work; however, the *second* method (mainly values with a few zero entries) takes an awfully long time to execute. (There is currently no code optimisation turned on.)

The files are quite large (they may be up to 500 by 500 values).

I have tested the two functions: the first takes seconds to complete, while the second takes around 8 minutes to execute. Both files are 476 columns by 141 rows, but the number of rows could be much larger.

Obviously, this is far too long, and with the aid of the diagnostics I know that the areas taking the longest are **building the model** and **updating the grid by extracting the values from the model**.

However, UpdateGrid does not take a long time to execute with the first process.
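To put numbers on the two slow stages rather than relying on when the trace messages appear, each step could be wrapped in a small timer. This is only a sketch using the standard `<chrono>` header; `SecondsTaken` is a hypothetical helper name and is not something the program currently contains.

```cpp
#include <chrono>

// Hypothetical helper: run one step and return how long it took, in seconds.
// steady_clock is used because it is not affected by system clock changes.
template <typename Callable>
double SecondsTaken(Callable&& step)
{
    const auto start = std::chrono::steady_clock::now();
    step();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

It would be called with a lambda wrapping the step of interest, for instance one that calls rbfbuildmodel and another that calls UpdateGrid, with the returned figures passed to TRACE.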

Would anyone be able to help point out if I am on the right lines with these functions, and what I may be doing wrong that makes them take so long? Are they just not supposed to be used for such large data sets, or have I misunderstood something?

Thanks in advance for any help :)

Kat