forum.alglib.net
http://forum.alglib.net/

Error: APSERVAreDistinct: internal error (not sorted)
http://forum.alglib.net/viewtopic.php?f=2&t=4405
Page 1 of 1

Author:  tobiasho [ Sun Dec 05, 2021 3:15 pm ]
Post subject:  Error: APSERVAreDistinct: internal error (not sorted)

Hi everybody,

I am encountering the error "APSERVAreDistinct: internal error (not sorted)" using the function spline1dbuildakima() for 1D interpolation of data. Basically, I am interpolating multiple signals one after another, with all of them using the same spline nodes (in my case "timestamps"). They get called right after another:

Code:
   

alglib::real_1d_array pos_east, pos_north, pos_yaw, vel_mag, vel_angle, vel_yaw, timestamps;
alglib::spline1dinterpolant pos_east_int, pos_north_int, pos_yaw_int, vel_mag_int, vel_angle_int, vel_yaw_int;

void trajInterpolation(trajENU& traj) {

    for (int i = 0; i < 50; i++) {
       timestamps[i] = traj.timestamps[i];
       pos_east[i] = traj.pos_enu_east_m[i];
       pos_north[i] = traj.pos_enu_north_m[i];
       pos_yaw[i] = traj.ori_enu_psi_rad[i];
       vel_mag[i] = traj.vel_enu_mag_m_s[i];
       vel_angle[i] = traj.vel_enu_angle_rad[i];
       vel_yaw[i] = traj.ori_vel_enu_psi_rad_s[i];
    }

    alglib::spline1dbuildakima(timestamps, pos_east, pos_east_int);
    alglib::spline1dbuildakima(timestamps, pos_north, pos_north_int);
    alglib::spline1dbuildakima(timestamps, pos_yaw, pos_yaw_int);
    alglib::spline1dbuildakima(timestamps, vel_mag, vel_mag_int);
    alglib::spline1dbuildakima(timestamps, vel_angle, vel_angle_int);
    alglib::spline1dbuildakima(timestamps, vel_yaw, vel_yaw_int);

}

main () {
   
    [...]

    timestamps.setlength(50);
    pos_east.setlength(50);
    pos_north.setlength(50);
    pos_yaw.setlength(50);
    vel_mag.setlength(50);
    vel_angle.setlength(50);
    vel_yaw.setlength(50);

    [...]

    // A task is started, that calls trajInterpolation(trajENU& traj) cyclically
   
}



From what I understand, the error occurs, when the x-values for the spline (in my case "timestamps") are not sorted when passed to spline1dbuildakima. However, when the error occurs, the timestamps are sorted and distinct (example data that actually cause the error in my setup below):

1235908840.4490000000
1235908840.5490000000
1235908840.6490000000
1235908840.7490000000
1235908840.8490000000
1235908840.9490000000
1235908841.0490000000
1235908841.1490000000
1235908841.2490000000
1235908841.3490000000
1235908841.4490000000
1235908841.5490000000
1235908841.6490000000
1235908841.7490000000
1235908841.8490000000
1235908841.9490000000
1235908842.0490000000
1235908842.1490000000
1235908842.2490000000
1235908842.3490000000
1235908842.4490000000
1235908842.5490000000
1235908842.6490000000
1235908842.7490000000
1235908842.8490000000
1235908842.9490000000
1235908843.0490000000
1235908843.1490000000
1235908843.2490000000
1235908843.3490000000
1235908843.4490000000
1235908843.5490000000
1235908843.6490000000
1235908843.7490000000
1235908843.8490000000
1235908843.9490000000
1235908844.0490000000
1235908844.1490000000
1235908844.2490000000
1235908844.3490000000
1235908844.4490000000
1235908844.5490000000
1235908844.6490000000
1235908844.7490000000
1235908844.8490000000
1235908844.9490000000
1235908845.0490000000
1235908845.1490000000
1235908845.2490000000
1235908845.3490000000

Furthermore, the error is not reproducible in a deterministic way. I am working on embedded hardware and the function trajInterpolation(trajENU& traj) is called cyclically with a relatively high frequency (50 Hz). When supplying precisely identical input data (trajENU& traj) two things can be observed:

    The error is thrown after a varying number of iterations. trajENU& traj is changed by my program on a periodic basis, but for this test I always provide the same sequence of data. The error sometimes occurs very early in the sequence, the next time that part obviously works fine and the error occurs later on in the sequence. Sometimes the error doesn't occur at all (using the exact same data sequence every time!).

    When the error is thrown, it doesn't happen during the same call of alglib::spline1dbuildakima() inside trajInterpolation(trajENU& traj). As shown above, I am calling alglib::spline1dbuildakima() multiple times in each iteration with different y-values. Usually, the first couple of calls work before it crashes. However, "timestamps" does not change between these calls so I don't understand why the error isn't thrown during the first call (assuming "timestamps" contains non-sorted data).

I even verified the content of x->ptr.p_double inside ae_bool aredistinct() function in alglibinternal.cpp when the error is thrown and it's identical to the timestamps shown earlier.


I am quite lost on what's causing this behavior and any help is mightily appreciated. If I missed to provide any important information for understanding, please let me know.



Edit:
I narrowed the issue down further by extending the ae_bool aredistinct function in the following way:

Code:
/*************************************************************************
This function checks that all values from X[] are distinct. It does more
than just usual floating point comparison:
* first, it calculates max(X) and min(X)
* second, it maps X[] from [min,max] to [1,2]
* only at this stage actual comparison is done

The meaning of such check is to ensure that all values are "distinct enough"
and will not cause interpolation subroutine to fail.

NOTE:
    X[] must be sorted by ascending (subroutine ASSERT's it)

  -- ALGLIB --
     Copyright 02.12.2009 by Bochkanov Sergey
*************************************************************************/
ae_bool aredistinct(/* Real    */ ae_vector* x,
     ae_int_t n,
     ae_state *_state)
{
    double a;
    double b;
    ae_int_t i;
    ae_bool nonsorted;
    ae_bool result;


    ae_assert(n>=1, "APSERVAreDistinct: internal error (N<1)", _state);
    if( n==1 )
    {
       
        /*
         * everything is alright, it is up to caller to decide whether it
         * can interpolate something with just one point
         */
        result = ae_true;
        return result;
    }
    a = x->ptr.p_double[0];
    b = x->ptr.p_double[0];
    nonsorted = ae_false;
    for(i=1; i<=n-1; i++)
    {
        a = ae_minreal(a, x->ptr.p_double[i], _state);
        b = ae_maxreal(b, x->ptr.p_double[i], _state);
        nonsorted = nonsorted||ae_fp_greater_eq(x->ptr.p_double[i-1],x->ptr.p_double[i]);
       
        if (ae_fp_greater_eq(x->ptr.p_double[i-1],x->ptr.p_double[i]) == true) {
           std::cout << std::fixed;
           std::cout << "x->ptr.p_double[i-1]: " << x->ptr.p_double[i-1] << std::endl;
           std::cout << "x->ptr.p_double[i]: " << x->ptr.p_double[i] << std::endl;
        }
    }

    ae_assert(!nonsorted, "APSERVAreDistinct: internal error (not sorted)", _state);
    for(i=1; i<=n-1; i++)
    {
        if( ae_fp_eq((x->ptr.p_double[i]-a)/(b-a)+1,(x->ptr.p_double[i-1]-a)/(b-a)+1) )
        {
            result = ae_false;
            return result;
        }
    }
    result = ae_true;
    return result;
}


The added part

Code:
        if (ae_fp_greater_eq(x->ptr.p_double[i-1],x->ptr.p_double[i]) == true) {
           std::cout << std::fixed;
           std::cout << "x->ptr.p_double[i-1]: " << x->ptr.p_double[i-1] << std::endl;
           std::cout << "x->ptr.p_double[i]: " << x->ptr.p_double[i] << std::endl;
        }


outputs:

x->ptr.p_double[i-1]: 1235908880.616000
x->ptr.p_double[i]: 1235908880.716000


Clearly, ptr.p_double[i-1] isn't >= ptr.p_double[i] and when I write a program for my target hardware that is only performing the comparison with the numbers above, "false" is returned. So the issue only occurs when the code is running within a more complex project with a higher frequency. Any ideas, why ae_fp_greater_eq returns true in such a case?

Author:  Sergey.Bochkanov [ Mon Dec 06, 2021 3:05 pm ]
Post subject:  Re: Error: APSERVAreDistinct: internal error (not sorted)

Hi!

ae_fp_greater_eq() is very simple function, it is just a wrapper around >= operator (with two volatile locals used to workaround bugs related to legacy 80-bit floating point FPUs). There is simply no place for an error to creep in. I don't say that ALGLIB is 100% error free, but this particular piece of code simply has no place for errors. It is complete surprise for me that you nevertheless has this error.

Would you like to tell me about specific hardware being used to execute the program? Especially, CPU type?

Author:  tobiasho [ Mon Dec 06, 2021 3:21 pm ]
Post subject:  Re: Error: APSERVAreDistinct: internal error (not sorted)

Hi Sergey,

thanks for your reply! I am working on a Xilinx Zynq Ultrascale+ ZU3EG using an ARM Cortex R5 realtime core and FreeRTOS.
In the meantime, I was able to narrow down the issue further. By calling ae_fp_greater_eq() with hardcoded values in a endless loop, I have a chance of about 2*10⁻9 that the functions returns true instead of false. Interestingly, I buffered the result of the comparison in a global variable (result_tmp) - when the return value (result) is true, result_tmp is correctly false. So it appears the issue is not located in the comparison but in the process of returning the value from the function.

I'm getting a feeling this is more of a hardware bug than anything else ... Never encountered it anywhere else though.

Code:
bool result_tmp;

ae_bool ae_fp_greater_eq(double v1, double v2){

/* IEEE-strict floating point comparison */

volatile double x = v1;
volatile double y = v2;
result_tmp = x>=y;
return x>=y;

}

main() {

while(1) {
    double x = 1235908880.616;
    double y = 1235908880.716;
    ae_bool result = ae_fp_greater_eq(x, y);
}

}

Author:  Sergey.Bochkanov [ Mon Dec 06, 2021 10:06 pm ]
Post subject:  Re: Error: APSERVAreDistinct: internal error (not sorted)

Given what you already found, 100% hardware bug.

However, it is really interesting why it did not appear anywhere else. Is it possible to take a look at disassembly of the modified ae_fp_greater_eq?

Page 1 of 1 All times are UTC
Powered by phpBB © 2000, 2002, 2005, 2007 phpBB Group
http://www.phpbb.com/