Pavel Holoborodko / Profile - Comments / Multiprecision Computing Toolbox for MATLAB

detailed documentation regarding parallel and multi-thread computation with MCT

The thing is that MathWorks doesn't really share information with MEX developers, and they change how the parallel toolbox works from version to version. For example, in March we stumbled on a previously unknown issue: if code uses several parfor loops (with the toolbox called inside them), the second loop cannot run because the MATLAB engine resets the path variable for the workers in the second parfor loop. The toolbox then cannot be found, or cannot even read its own files, since it is no longer on the path.

There are a lot of quirks like this, depending on the MATLAB version, etc. We usually learn about new issues from user reports.

The toolbox's MEX is just an ordinary DLL/SO, and MATLAB developers can use it in any way they desire, keeping the hard issues in their wrappers for MEX. Each system thread or process can use MEX, but I don't know how they implemented it.

I would suggest just trying things, e.g. at least checking which precision is used in each parallel worker.
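
A minimal sketch of such a check, assuming a parallel pool can be started and that mp.Digits() called without arguments returns the current precision. The variable names are illustrative, and parfor iterations are not guaranteed to land on every worker, so this only samples the pool:

```matlab
% Sketch: sample the precision that parallel workers actually use.
nProbes = 8;                          % illustrative number of iterations
digitsSeen = zeros(1, nProbes);
parfor i = 1:nProbes
    % Assumption: mp.Digits() with no arguments returns the current precision.
    digitsSeen(i) = mp.Digits();
end
disp(digitsSeen);                     % compare with mp.Digits() on the client
```

If the reported values differ from the client's setting, the workers are running with their own (probably default) precision.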

yesterday at 8:37 a.m.

detailed documentation regarding parallel and multi-thread computation with MCT

If a non-default precision is used, then mp.Digits() needs to be called inside parfor or other parallel code to make sure each worker instance runs with the same precision.
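
As a rough sketch of this advice (the precision value and variable names are illustrative, and returning mp values from workers to the client is assumed to work in your MATLAB version):

```matlab
% Sketch: repeat the precision setting on every worker.
targetDigits = 50;                    % illustrative non-default precision
mp.Digits(targetDigits);              % setting for the client session
results = cell(1, 4);
parfor i = 1:4
    mp.Digits(targetDigits);          % same setting, applied on the worker
    results{i} = sqrt(mp(i));         % computed with the worker's precision
end
```

Calling mp.Digits at the top of the loop body is cheap insurance, since it is hard to know which worker executes which iteration.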

yesterday at 7:36 a.m.

detailed documentation regarding parallel and multi-thread computation with MCT

You are right, but we have ~1000 functions in the toolbox, and each has a lot of variations in input arguments, outputs, etc., which are already covered by MATLAB's documentation (say, Y=fft(X) or s=svd(A) are the same). It would require significant effort to maintain separate web pages with documentation that would mostly duplicate MATLAB's. I am not sure it would be a wise thing to do (?).

When showing code to others, maybe we can just say "same code as MATLAB's but running in extended precision"?

yesterday at 7:34 a.m.

detailed documentation regarding parallel and multi-thread computation with MCT

I just saw your message; the spam filter hides email notifications from the forum.

The mp.NumberOfThreads function has documentation:

        % mp.NumberOfThreads Sets maximum number of threads to use in computations enabled with multi-core parallelism.
        %
        %   N = 0 (default):
        %   Sets number of threads = number of real hardware cores in the system.
        %   Each thread is pinned to execute on particular hardware core for best
        %   performance. This is the optimal strategy for most users, who run
        %   one instance of toolbox at a time.
        %   
        %   N ~= 0:
        %   Pushes toolbox to use exactly N cores, taking into account
        %   hyper-threaded cores as well. No thread affinity is applied.
        %   
        %   This is useful if you run several toolbox instances (e.g. with parfor). 
        %   In this case compute number of threads as:
        %   
        %      N = total_number_of_cores / number_of_matlab_workers
        %   
        %   Returns current setting if called without arguments.
        %

A usage example and more information are in mpstartup.m:

% Set up maximum number of threads to use in computations enabled with multi-core parallelism:
%
% mp.NumberOfThreads(N) with:
%
% N = maxNumCompThreads [default]
%
% Use MATLAB's default setting. This is probably optimal strategy for most of the users,
% especially if user already controls number of computational threads by built-in MATLAB's commands.
%
% Usually MATLAB assigns number of computational threads equal to number
% of physical cores of CPU. Also each thread is pinned to one physical
% core using thread affinity control. This is optimal for most heavy
% mathematical computations.
%
% Multi-threading in toolbox/MATLAB is based on OpenMP framework.
% OpenMP allows flexible control on number of threads, affinity,
% timings, thread scheduling, etc. using OS environment variables.
% See OpenMP specifications for details: https://www.openmp.org/specifications/
%
% On Windows we rely on Intel OpenMP which allows even more detailed
% tuning with (KMP_) environment variables: https://software.intel.com/en-us/node/522775
%
% For example, OpenMP configuration for 16-cores/32-threads Intel Core i9 7960X
% might look like (for Windows):
%
% OMP_NUM_THREADS = 16
% KMP_AFFINITY = explicit,proclist=[0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30]
%
% Here we restrict OpenMP to use only 16 computational threads and attach every thread
% to distinct physical CPU core ( = to use logical cores with even IDs).
%
% N = 0:
% Use all available CPU cores in the system (including logical cores).
% Pushes CPU utilization to the limit. This mode is beneficial for computations
% that are massively parallel, e.g. for matrix multiplication.
% In other cases speed can actually go down. Please test this mode for your
% particular code to see if it provides any speed-up.
%
% Use with caution since this mode might slow down all other applications running on the computer.
% Suitable for users who run one instance of toolbox at a time (no parfor).
%
% N ~= 0:
% Toolbox will use exactly N cores (including logical cores). No thread affinity is applied.
%
% This is useful if you run several toolbox instances (e.g. with
% parfor) and want to balance the load among all workers.
%
% In this case compute number of threads as:
%
% N = total_number_of_cores / number_of_matlab_workers
%
% NOTE. The 'maxNumCompThreads' was declared deprecated starting from R2011 up to R2014.
% But status of this command was restored later and now it is valid command.
%
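
The per-worker formula above might be applied roughly as follows. This is only a sketch under assumptions: it takes maxNumCompThreads on the client as the core count (usually physical cores, so a conservative choice), and it assumes mp.NumberOfThreads has to be set inside each worker because every worker runs its own toolbox instance. The variable names are illustrative.

```matlab
% Sketch: N = total_number_of_cores / number_of_matlab_workers
pool = gcp('nocreate');               % existing pool, or empty if none is open
if isempty(pool)
    pool = parpool;                   % start a pool with the default profile
end
nWorkers   = pool.NumWorkers;
totalCores = maxNumCompThreads;       % client value, usually the physical core count
N = max(1, floor(totalCores / nWorkers));

parfor i = 1:nWorkers
    mp.NumberOfThreads(N);            % assumption: set per worker instance
    % ... toolbox computations for chunk i go here ...
end
```

The max(1, ...) guard keeps at least one thread per worker when the pool has more workers than cores.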

yesterday at 7:03 a.m.

nchoosek, binomial coefficient

I have checked and tested: such vectorization won't be faster than a loop unless the input arrays are really long (5M-10M). We can make long arrays faster with parallelism, but short arrays won't benefit at all.

Please use a loop or arrayfun:

B=arrayfun(@(n,k)nchoosek(n,k),mp(2:4),mp(1:3))
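
For comparison, the loop variant of the same call could look like this (a sketch; it assumes an mp array can be preallocated with mp(zeros(...)) and indexed element-wise):

```matlab
% Sketch: element-wise binomial coefficients with an explicit loop.
n = mp(2:4);
k = mp(1:3);
B = mp(zeros(size(n)));               % preallocate the mp result array
for i = 1:numel(n)
    B(i) = nchoosek(n(i), k(i));      % C(2,1), C(3,2), C(4,3)
end
```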

1 year ago

nchoosek, binomial coefficient

Could you please elaborate a bit more on how this should work and what output should be generated?

This case is ambiguous and can be treated differently: one to one, or one to all. Also, this would make the function incompatible with MATLAB's built-in.
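
The two readings can be illustrated with plain double inputs and MATLAB's built-in scalar nchoosek. This is only a sketch of the ambiguity, not a proposal for the toolbox interface; the variable names are illustrative:

```matlab
n = 4:6;
k = 1:3;

% One to one: pair n(i) with k(i).
oneToOne = arrayfun(@nchoosek, n, k);     % 1-by-3 result

% One to all: every n against every k.
[N, K] = ndgrid(n, k);
oneToAll = arrayfun(@nchoosek, N, K);     % 3-by-3: rows follow n, columns follow k
```

The same inputs give a vector in one reading and a matrix in the other, which is why the expected output needs to be specified.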

1 year ago

randn() precision limited to 55 digits

This is not an option either. Putting the slowness aside (applying the seed/state is a non-trivial process), there is another reason:

The random state must be reset to default on MATLAB startup (when the user starts MATLAB or starts working with the toolbox).

MATLAB doesn't allow differentiating startup vs. 'clear all' events. These events must be treated differently, but we can choose only one. We chose the most flexible option, the "startup" behavior, and let the user take care of the random state as they wish. The user just needs to call mp.RandState when needed (as it was designed).

This doesn't lead to any slowdowns, undocumented hacks/hooks into the MATLAB kernel, etc.

Update: I edited the comment after posting it (to remove a logically incorrect statement).

1 year ago

randn() precision limited to 55 digits

@"Matlab’s built-in random number generator never treats “clear all” as a command to reset or reseed its generator."

MATLAB doesn't notify toolboxes (or MEX modules) when the user calls the 'clear' function on them, so the toolbox has no way to react to this event in some special way (e.g. to store the seed and pass it to the next call). The 'clear' just stops the toolbox from executing, unloads it and clears its memory. The next call to the toolbox starts it fresh from the default state.

Please send a request to MathWorks to allow flexible handling of the 'clear' function in 3rd party toolboxes/MEX.

Currently the toolbox provides two ways of using random number generators:

(1) Using the standard MATLAB functions, e.g. mp(rand(..)). The seed functionality is controlled by MATLAB, but the precision is limited to that of double-precision numbers.

(2) Generating full-precision random numbers requires special syntax (rand(..,'mp')) and a separate seed control function (mp.RandState). The separate seed control function is needed because of MATLAB's restrictions on overloading the 'rng' and 'clear' functions (either it is not possible, or no notification mechanism is provided to handle the events).

The user can choose either (1) or (2). In case of (2), manual control of the seed is required using the special function mp.RandState.
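
Side by side, the two options might be used like this (a sketch only; the seed value and matrix size are illustrative, and the arguments of mp.RandState are not shown in this thread, so that call is left out rather than guessed):

```matlab
% (1) Standard MATLAB generator: seeded by MATLAB, double precision only.
rng(0);                               % MATLAB controls the seed
A = mp(rand(2));                      % doubles converted to mp afterwards

% (2) Full-precision generator: special syntax, separate state control.
% The state is managed through mp.RandState (see its help for the arguments).
B = rand(2, 'mp');
```

Note that A carries only about 16 significant random digits even though it is stored as mp, while B is generated at the toolbox's current precision.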

1 year ago

randn() precision limited to 55 digits

The 'clear all' wields full power over the MEX, as it just kills all the MEX modules (including the mp-toolbox).
The mp.RandState function is provided exactly for such cases, when precise and explicit control over the generator is needed. Please use it.

1 year ago

nchoosek, binomial coefficient

Jon, I have just released a new version of the toolbox, 5.3.6.15927.

It no longer uses MATLAB's implementation of nchoosek.

Instead, nchoosek is now implemented directly in the toolbox core.

>> nchoosek(mp(76), mp(32))
ans = 
    2695592391875730827550

Please update your environment.

1 year ago
