0
Answered

Extremely slow interp1

m.b 4 years ago updated by Pavel Holoborodko 4 years ago 4

Hi,

Could you please advise why interp1 with mp is so slow compared to original matlab interp1? belwo is the example I'm running. Thank you for your kind help

mohsen


% Grid length and useful vectors
N = 200;
kp = mp(linspace(1e-6,5,N))';
n = mp( ones(1,2)); nn = mp(ones(N,1));


% parameters & given values
gam = 2; beta = 0.96;
R = 1.05; w0 = 3.44;
prob = mp([0.9500, 0.0500; 0.6750, 0.3250]);


w = mp(sort( w0*( .995 + randn(N,1)/7.5 ) ));
W = [w, 0*nn];


% initialize
Cp = mp(R)*kp*n; k = Cp;


% convergence criterion
Cp_crit = 1;


% Interpolate until convergence
ii = 0; tic

while Cp_crit > 1e-3
Cp0 = Cp;
EMUp = (Cp.^(-mp(gam)))*prob';
C = (mp(beta*R)*EMUp).^(-1/mp(gam));
k = ( kp*n + C - W)/mp(R);

% Use the relation between C and k to ropose a new vector Cp corresponding to kp
for i=1:2
% Update the function by interpolation.
Cp(:,i) = interp1(k(:,i), C(:,i), kp,'linear','extrap');
end

Cp_crit = max(max(abs(Cp0-Cp)./(1+abs(Cp))));
ii = ii+ 1;
end
toc
disp(Cp_crit)


Under review

Dear Mohsen,


You are absolutely right - tests show that interp1 is unacceptably slow!

The slowest code is index lookup of interval where x belongs while interpolation.

Now it is simple linear search and of course should be improved.


Basically we have to implement histcounts, which is not trivial.


Will try to improve this in near days and let you know about the result.

Completed

Dear Mohsen,


Please download & install new version of toolbox: http://goo.gl/pMXV3

Now it has much faster interp1.


Also I have updated your script with small corrections to be more mp-fiendly:

rng('default');

% Grid length and useful vectors
N = 50;
kp = linspace(mp('1e-6'),mp(5),N)';
n = mp(ones(1,2));


% parameters & given values
gam = mp(2); beta = mp('0.96');
R = mp('1.05'); w0 = mp('3.44');
prob = mp('[0.9500, 0.0500; 0.6750, 0.3250]');


w = sort( w0*( mp('.995') + randn(N,1)/mp('7.5')));
W = [w, mp(zeros(N,1))];


% initialize
Cp = R*kp*n; k = Cp;


% convergence criterion
Cp_crit = 1;


% Interpolate until convergence
ii = 0;


while Cp_crit > 1e-3
    
    start = tic;
    Cp0 = Cp;
    EMUp = (Cp.^(-gam))*prob'; 
    C = (beta*R*EMUp).^(-1/gam); 
    k = (kp*n + C - W)/R; 


    % Use the relation between C and k to ropose a new vector Cp corresponding to kp
    for i=1:2
        % Update the function by interpolation.
        Cp(:,i) = interp1(k(:,i), C(:,i), kp,'linear','extrap');
    end


    Cp_crit = max(max(abs(Cp0-Cp)./(1+abs(Cp)))); 
    ii = ii+ 1;
    stop = toc(start);


    fprintf('iteration = %d\ttime = %8.4f sec\n',ii,stop);
end


disp(Cp_crit)
Older version has following timings for N=50:

>> interp1_test_mp
iteration = 1    time =   0.9576 sec
iteration = 2    time =   0.9761 sec
iteration = 3    time =   1.0043 sec
iteration = 4    time =   1.0012 sec
iteration = 5    time =   1.0255 sec
iteration = 6    time =   1.0131 sec
iteration = 7    time =   1.0001 sec
iteration = 8    time =   0.9991 sec
iteration = 9    time =   1.0098 sec
iteration = 10    time =   1.0682 sec
iteration = 11    time =   1.0019 sec
iteration = 12    time =   1.0133 sec
iteration = 13    time =   1.0193 sec
iteration = 14    time =   1.0049 sec
iteration = 15    time =   1.0054 sec
iteration = 16    time =   1.0192 sec
iteration = 17    time =   1.0178 sec
iteration = 18    time =   1.0121 sec
Cp_crit = 
    0.0003433297868733290059017411707557392
New one is ~25 times faster:

>> interp1_test_mp
iteration = 1    time =   0.0436 sec
iteration = 2    time =   0.0449 sec
iteration = 3    time =   0.0454 sec
iteration = 4    time =   0.0429 sec
iteration = 5    time =   0.0429 sec
iteration = 6    time =   0.0435 sec
iteration = 7    time =   0.0421 sec
iteration = 8    time =   0.0424 sec
iteration = 9    time =   0.0427 sec
iteration = 10    time =   0.0431 sec
iteration = 11    time =   0.0431 sec
iteration = 12    time =   0.0419 sec
iteration = 13    time =   0.0420 sec
iteration = 14    time =   0.0416 sec
iteration = 15    time =   0.0421 sec
iteration = 16    time =   0.0420 sec
iteration = 17    time =   0.0422 sec
iteration = 18    time =   0.0423 sec
Cp_crit = 
    0.0003433297868733290059017411707557392
But, still it might take a lot of time for N=200 to converge. Probably you need to re-design your algorithm to converge in less iterations.


If you are Ok - I will make this thread public, it might be useful for other users.

Hi Pavel,

Thank you very much for the prompt and kind reply. The newer version works better, so thank you very much for the time and effort you put in this. The improvement is surely there, yet accelerating the execution time of the command would be very welcome.

Please go ahead and make the thread public if you think it may help other users.

Best,

Mohsen

Answered

Dear Mohsen,

Later today we will release new version of toolbox with even faster interp1. Now interp1 hot-spots are implemented directly using C++ and thus it is ~100 times faster overall compared to original version.


However, I am sure that the issue is not related speed, but rather to the properties of the method you use in your script. Is convergence guaranteed for all N?


For some N it converges very quickly, for others - no convergence even after thousands of iterations.


N=79:

>> interp1_test_mp
iteration = 1    crit = 6.966185e-01    time =   0.0154 sec
iteration = 2    crit = 1.328338e-01    time =   0.0167 sec
iteration = 3    crit = 4.058389e-01    time =   0.0207 sec
iteration = 4    crit = 9.628735e-01    time =   0.0136 sec
iteration = 5    crit = 7.685783e-02    time =   0.0138 sec
iteration = 6    crit = 5.120528e-02    time =   0.0138 sec
iteration = 7    crit = 6.838371e-02    time =   0.0135 sec
iteration = 8    crit = 4.207984e-02    time =   0.0134 sec
iteration = 9    crit = 4.638199e-02    time =   0.0281 sec
iteration = 10    crit = 2.107455e-02    time =   0.0167 sec
iteration = 11    crit = 1.724844e-02    time =   0.0173 sec
iteration = 12    crit = 5.805996e-03    time =   0.0172 sec
iteration = 13    crit = 2.771057e-03    time =   0.0164 sec
iteration = 14    crit = 1.432664e-03    time =   0.0149 sec
iteration = 15    crit = 1.115638e-03    time =   0.0156 sec
iteration = 16    crit = 1.431466e-03    time =   0.0173 sec
iteration = 17    crit = 1.182536e-03    time =   0.0148 sec
iteration = 18    crit = 7.638381e-04    time =   0.0145 sec
Cp_crit = 
    0.0007638381016874159474483705763685007
N=80:

>> interp1_test_mp
iteration = 1    crit = 6.966185e-01    time =   0.0136 sec
iteration = 2    crit = 1.707076e-01    time =   0.0126 sec
iteration = 3    crit = 1.819746e-01    time =   0.0126 sec
....
iteration = 3644    crit = 1.867263e-02    time =   0.0130 sec
iteration = 3645    crit = 1.840743e-02    time =   0.0134 sec
....
Updated version will be available in a few hours.