Slow vectorized operations compared to Matlab double

Joe 10 months ago updated by Pavel Holoborodko 10 months ago 2

I've noticed that vectorized operations on advanpix mp data structures are extremely slow compared to Matlab. For example, consider the following code, where I initialize 1000x1000 sparse random matrices using both Matlab double and advanpix double data structures:

A_matlab = sprand(1000,1000,.2);
A_advanpix = mp(sprand(1000,1000,.2),16);

I then time the operation of "zeroing out" the first 100 rows in each:

tic; A_matlab(1:100,:) = 0; toc
Elapsed time is 0.008518 seconds.

tic; A_advanpix(1:100,:) = 0; toc
Elapsed time is 2.534518 seconds.

Of course, this naturally gets much worse as the size of A increases. Is this behavior expected or am I using the advanpix data structures incorrectly?

Thank you!


Under review

Yes, unfortunately, this is expected behavior, at least for now. MATLAB doesn't allow overloading sparse data types, so we must use workarounds to add our own sparse matrix type. This comes with copy overhead in each operation (even in indexed access).

That is why, for maximum speed, please try to avoid element-wise access operations for sparse matrices (assignment/reading).

For example, when assembling a sparse matrix, generate the arrays of non-zeros and their indices separately, and then convert them into a sparse matrix using the command "sparse".
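
As a rough sketch of that advice applied to the row-zeroing example from the question: instead of assigning into the matrix, extract the non-zeros as triplets, drop the unwanted ones, and rebuild the matrix with a single call to "sparse". The variable names (n, keep, B) are only illustrative, and the sketch uses plain MATLAB doubles so it runs without the toolbox. The final conversion is commented out and simply reuses the mp call from the question. Whether this is actually faster for mp matrices has not been measured here.

```matlab
% Sketch: zero the first 100 rows of a sparse matrix without indexed assignment.
n = 1000;
A = sprand(n, n, 0.2);      % plain MATLAB double sparse matrix, as in the question

% Extract the non-zeros as (row, column, value) triplets.
[i, j, v] = find(A);

% Keep only the entries that lie outside the rows to be zeroed.
keep = i > 100;

% Rebuild the matrix in one call to sparse instead of modifying it in place.
B = sparse(i(keep), j(keep), v(keep), n, n);

% Convert once at the end (call copied from the question; needs the Advanpix toolbox):
% B_advanpix = mp(B, 16);
```

The idea is to do all index manipulation on ordinary arrays and to touch the sparse type only once, at construction time.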


Since September, we have been working on new ideas to reduce the copy overhead further (we are now using undocumented MATLAB functions, etc.). I hope this will allow us to reduce timings in future versions.