forum.alglib.net

ALGLIB forum
It is currently Thu Mar 28, 2024 6:49 pm

All times are UTC


Forum rules


1. This forum can be used for discussion of both ALGLIB-related and general numerical analysis questions
2. This forum is English-only - postings in other languages will be removed.



Post new topic Reply to topic  [ 4 posts ] 
Author Message
 Post subject: Error: APSERVAreDistinct: internal error (not sorted)
PostPosted: Sun Dec 05, 2021 3:15 pm 
Offline

Joined: Sun Dec 05, 2021 2:12 pm
Posts: 2
Hi everybody,

I am encountering the error "APSERVAreDistinct: internal error (not sorted)" using the function spline1dbuildakima() for 1D interpolation of data. Basically, I am interpolating multiple signals one after another, with all of them using the same spline nodes (in my case "timestamps"). They get called right after another:

Code:
   

alglib::real_1d_array pos_east, pos_north, pos_yaw, vel_mag, vel_angle, vel_yaw, timestamps;
alglib::spline1dinterpolant pos_east_int, pos_north_int, pos_yaw_int, vel_mag_int, vel_angle_int, vel_yaw_int;

void trajInterpolation(trajENU& traj) {

    for (int i = 0; i < 50; i++) {
       timestamps[i] = traj.timestamps[i];
       pos_east[i] = traj.pos_enu_east_m[i];
       pos_north[i] = traj.pos_enu_north_m[i];
       pos_yaw[i] = traj.ori_enu_psi_rad[i];
       vel_mag[i] = traj.vel_enu_mag_m_s[i];
       vel_angle[i] = traj.vel_enu_angle_rad[i];
       vel_yaw[i] = traj.ori_vel_enu_psi_rad_s[i];
    }

    alglib::spline1dbuildakima(timestamps, pos_east, pos_east_int);
    alglib::spline1dbuildakima(timestamps, pos_north, pos_north_int);
    alglib::spline1dbuildakima(timestamps, pos_yaw, pos_yaw_int);
    alglib::spline1dbuildakima(timestamps, vel_mag, vel_mag_int);
    alglib::spline1dbuildakima(timestamps, vel_angle, vel_angle_int);
    alglib::spline1dbuildakima(timestamps, vel_yaw, vel_yaw_int);

}

main () {
   
    [...]

    timestamps.setlength(50);
    pos_east.setlength(50);
    pos_north.setlength(50);
    pos_yaw.setlength(50);
    vel_mag.setlength(50);
    vel_angle.setlength(50);
    vel_yaw.setlength(50);

    [...]

    // A task is started, that calls trajInterpolation(trajENU& traj) cyclically
   
}



From what I understand, the error occurs, when the x-values for the spline (in my case "timestamps") are not sorted when passed to spline1dbuildakima. However, when the error occurs, the timestamps are sorted and distinct (example data that actually cause the error in my setup below):

1235908840.4490000000
1235908840.5490000000
1235908840.6490000000
1235908840.7490000000
1235908840.8490000000
1235908840.9490000000
1235908841.0490000000
1235908841.1490000000
1235908841.2490000000
1235908841.3490000000
1235908841.4490000000
1235908841.5490000000
1235908841.6490000000
1235908841.7490000000
1235908841.8490000000
1235908841.9490000000
1235908842.0490000000
1235908842.1490000000
1235908842.2490000000
1235908842.3490000000
1235908842.4490000000
1235908842.5490000000
1235908842.6490000000
1235908842.7490000000
1235908842.8490000000
1235908842.9490000000
1235908843.0490000000
1235908843.1490000000
1235908843.2490000000
1235908843.3490000000
1235908843.4490000000
1235908843.5490000000
1235908843.6490000000
1235908843.7490000000
1235908843.8490000000
1235908843.9490000000
1235908844.0490000000
1235908844.1490000000
1235908844.2490000000
1235908844.3490000000
1235908844.4490000000
1235908844.5490000000
1235908844.6490000000
1235908844.7490000000
1235908844.8490000000
1235908844.9490000000
1235908845.0490000000
1235908845.1490000000
1235908845.2490000000
1235908845.3490000000

Furthermore, the error is not reproducible in a deterministic way. I am working on embedded hardware and the function trajInterpolation(trajENU& traj) is called cyclically with a relatively high frequency (50 Hz). When supplying precisely identical input data (trajENU& traj) two things can be observed:

    The error is thrown after a varying number of iterations. trajENU& traj is changed by my program on a periodic basis, but for this test I always provide the same sequence of data. The error sometimes occurs very early in the sequence, the next time that part obviously works fine and the error occurs later on in the sequence. Sometimes the error doesn't occur at all (using the exact same data sequence every time!).

    When the error is thrown, it doesn't happen during the same call of alglib::spline1dbuildakima() inside trajInterpolation(trajENU& traj). As shown above, I am calling alglib::spline1dbuildakima() multiple times in each iteration with different y-values. Usually, the first couple of calls work before it crashes. However, "timestamps" does not change between these calls so I don't understand why the error isn't thrown during the first call (assuming "timestamps" contains non-sorted data).

I even verified the content of x->ptr.p_double inside ae_bool aredistinct() function in alglibinternal.cpp when the error is thrown and it's identical to the timestamps shown earlier.


I am quite lost on what's causing this behavior and any help is mightily appreciated. If I missed to provide any important information for understanding, please let me know.



Edit:
I narrowed the issue down further by extending the ae_bool aredistinct function in the following way:

Code:
/*************************************************************************
This function checks that all values from X[] are distinct. It does more
than just usual floating point comparison:
* first, it calculates max(X) and min(X)
* second, it maps X[] from [min,max] to [1,2]
* only at this stage actual comparison is done

The meaning of such check is to ensure that all values are "distinct enough"
and will not cause interpolation subroutine to fail.

NOTE:
    X[] must be sorted by ascending (subroutine ASSERT's it)

  -- ALGLIB --
     Copyright 02.12.2009 by Bochkanov Sergey
*************************************************************************/
ae_bool aredistinct(/* Real    */ ae_vector* x,
     ae_int_t n,
     ae_state *_state)
{
    double a;
    double b;
    ae_int_t i;
    ae_bool nonsorted;
    ae_bool result;


    ae_assert(n>=1, "APSERVAreDistinct: internal error (N<1)", _state);
    if( n==1 )
    {
       
        /*
         * everything is alright, it is up to caller to decide whether it
         * can interpolate something with just one point
         */
        result = ae_true;
        return result;
    }
    a = x->ptr.p_double[0];
    b = x->ptr.p_double[0];
    nonsorted = ae_false;
    for(i=1; i<=n-1; i++)
    {
        a = ae_minreal(a, x->ptr.p_double[i], _state);
        b = ae_maxreal(b, x->ptr.p_double[i], _state);
        nonsorted = nonsorted||ae_fp_greater_eq(x->ptr.p_double[i-1],x->ptr.p_double[i]);
       
        if (ae_fp_greater_eq(x->ptr.p_double[i-1],x->ptr.p_double[i]) == true) {
           std::cout << std::fixed;
           std::cout << "x->ptr.p_double[i-1]: " << x->ptr.p_double[i-1] << std::endl;
           std::cout << "x->ptr.p_double[i]: " << x->ptr.p_double[i] << std::endl;
        }
    }

    ae_assert(!nonsorted, "APSERVAreDistinct: internal error (not sorted)", _state);
    for(i=1; i<=n-1; i++)
    {
        if( ae_fp_eq((x->ptr.p_double[i]-a)/(b-a)+1,(x->ptr.p_double[i-1]-a)/(b-a)+1) )
        {
            result = ae_false;
            return result;
        }
    }
    result = ae_true;
    return result;
}


The added part

Code:
        if (ae_fp_greater_eq(x->ptr.p_double[i-1],x->ptr.p_double[i]) == true) {
           std::cout << std::fixed;
           std::cout << "x->ptr.p_double[i-1]: " << x->ptr.p_double[i-1] << std::endl;
           std::cout << "x->ptr.p_double[i]: " << x->ptr.p_double[i] << std::endl;
        }


outputs:

x->ptr.p_double[i-1]: 1235908880.616000
x->ptr.p_double[i]: 1235908880.716000


Clearly, ptr.p_double[i-1] isn't >= ptr.p_double[i] and when I write a program for my target hardware that is only performing the comparison with the numbers above, "false" is returned. So the issue only occurs when the code is running within a more complex project with a higher frequency. Any ideas, why ae_fp_greater_eq returns true in such a case?


Top
 Profile  
 
 Post subject: Re: Error: APSERVAreDistinct: internal error (not sorted)
PostPosted: Mon Dec 06, 2021 3:05 pm 
Offline
Site Admin

Joined: Fri May 07, 2010 7:06 am
Posts: 903
Hi!

ae_fp_greater_eq() is very simple function, it is just a wrapper around >= operator (with two volatile locals used to workaround bugs related to legacy 80-bit floating point FPUs). There is simply no place for an error to creep in. I don't say that ALGLIB is 100% error free, but this particular piece of code simply has no place for errors. It is complete surprise for me that you nevertheless has this error.

Would you like to tell me about specific hardware being used to execute the program? Especially, CPU type?


Top
 Profile  
 
 Post subject: Re: Error: APSERVAreDistinct: internal error (not sorted)
PostPosted: Mon Dec 06, 2021 3:21 pm 
Offline

Joined: Sun Dec 05, 2021 2:12 pm
Posts: 2
Hi Sergey,

thanks for your reply! I am working on a Xilinx Zynq Ultrascale+ ZU3EG using an ARM Cortex R5 realtime core and FreeRTOS.
In the meantime, I was able to narrow down the issue further. By calling ae_fp_greater_eq() with hardcoded values in a endless loop, I have a chance of about 2*10⁻9 that the functions returns true instead of false. Interestingly, I buffered the result of the comparison in a global variable (result_tmp) - when the return value (result) is true, result_tmp is correctly false. So it appears the issue is not located in the comparison but in the process of returning the value from the function.

I'm getting a feeling this is more of a hardware bug than anything else ... Never encountered it anywhere else though.

Code:
bool result_tmp;

ae_bool ae_fp_greater_eq(double v1, double v2){

/* IEEE-strict floating point comparison */

volatile double x = v1;
volatile double y = v2;
result_tmp = x>=y;
return x>=y;

}

main() {

while(1) {
    double x = 1235908880.616;
    double y = 1235908880.716;
    ae_bool result = ae_fp_greater_eq(x, y);
}

}


Top
 Profile  
 
 Post subject: Re: Error: APSERVAreDistinct: internal error (not sorted)
PostPosted: Mon Dec 06, 2021 10:06 pm 
Offline
Site Admin

Joined: Fri May 07, 2010 7:06 am
Posts: 903
Given what you already found, 100% hardware bug.

However, it is really interesting why it did not appear anywhere else. Is it possible to take a look at disassembly of the modified ae_fp_greater_eq?


Top
 Profile  
 
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 4 posts ] 

All times are UTC


Who is online

Users browsing this forum: Bing [Bot] and 58 guests


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Search for:
Jump to:  
cron
Powered by phpBB © 2000, 2002, 2005, 2007 phpBB Group