Uw opmerkingen

What arguments you use for besselk? Real, complex (what quadrant)?

Now I am working on building efficient rational approximations for besselk for real arguments (quadruple case). Should be fast.
These issues have been already fixed in Windows version - you can download it from our website the usual way.
Updates for GNU Linux & Mac OSX follow (I know you probably use the GNU Linux version).
Dear Owen,

Both bugs are confirmed and will be fixed in next updates.
Thank you very much for reporting!

Until then, you can try using the following workarounds:

A. For 'mp' to 'single' conversion:

Replace the following line in mp.m (line #1146):
r =  mpimpl(9000,x);
with
r = single(mpimpl(9000,x));
B. For 'single' to 'mp':

Please insert this code after #667 line in mp.m:
           if nargin == 1
            if isa(x,'single'), x = double(x); end;
           end;
The current algorithm for besselk is consecutive and it focuses more on accuracy than speed.
I am not sure how to speed it up quickly. Will review it for new ideas.

Any chance you could try to divise computations of besselk by high-level threads? Using parfor/smpd for example?

(If you are computing matrix elements with besselk, probably you can compute different parts of the matrix in different workers/threads?)

P.S.
Performance of parallel algorithms (e.g. matrix multiplication) can be boosted up much further. But unfortunately there is a nasty bug in Intel threads runtime I am using for parallelism. They use simple spin-lock for 200msec at the end of every thread - which consumes a lot of CPU power! For example, if we use 8 threads, then possible overhead is 8*200msec = 1.6sec!! This alone makes parallelism inefficient for pieces of code which run under 1.6sec - rules out the mid-sized matrices.

If I reduce this delay - their runtime freezes in cases when only 2-cores are under load.
Now I am working on a custom thread manager among other things.
Matrix multiplication should run in parallel (we added this almost from the beginning - several years ago).

Please let me know the matrix size, precision, toolbox & MATLAB & OS versions and CPU model.
Your question made me to revise the parallel optimization of eigensolver for symmetric/Hermitian matrices.

As a result, today we have released new version of toolbox which solves the symmetric eigenproblem by ~3.4 times faster.
So that eig(A) for symmetric 1Kx1K matrix now takes around 9 seconds on the same CPU.

Larger matrices receive higher speed-up ratio. Actually now solver scales almost linearly to number of cores.
Btw, newest version has some improvements in matrix multiplication in quadruple precision.
You might notice some speed-up for large matrices (~10-15%).
This is one of OOP rules in MATLAB. It does the conversion automatically without asking the toolbox and we have no control over this behavior (and it is always been like this).

In case of scalar operands - MATLAB casts to more powerful or custom type.
In case of arrays - the scalar RHS entity is converted to LHS type.

Actually it is a good solution. Otherwise, the whole array A would be converted to 'double' in the case:

A=mp(1:10)
A(1,1)=10;
% A becomes 'double' !

This would be much more unexpected behavior, isn't it?