0
Under review

backslash

dattorro 3 days ago updated by Pavel Holoborodko 21 hours ago 3

Backslash, applied to an overdetermined system, is expected deterministic; i.e, no variability in solution.

But repeated calls to backslash, for data vL provided in the link, is variable and of infinite dynamic range.

clear all load vL % vL = double(vL); % Amat = double(Amat); xi = Amat\vL;  %backslash error is nonzero when differenced with same backslash later on for i=1:10    notabug = Amat\vL;    sum(abs(xi - notabug)) end

ans = 0

ans = 0
ans = 0
ans = 3.312601991721914126549721394248086e-06
ans = 3.271854404182205550123018444983506e-05
ans = 0
ans = 8.904811628089073569772503832159206e-06
ans = 8.904811628089073569772503832159206e-06
ans = 8.904811628089073569772503832159206e-06
ans = 3.312601991721914126549721394248086e-06

Uncommenting the double cast always produces ans=0 but reveals Amat to be an overdetermined rank deficient matrix.

MATLAB Version: 9.14.0.2489007 (R2023a) Update 6
Operating System: Microsoft Windows 10 Pro Version 10.0 (Build 19043)

CPU: AMD 3995WX

Advanpix Multiprecision Computing Toolbox Version 5.2.8.15537

https://www.convexoptimization.com/vL.mat

Under review

Jon, thank you very much for the report!

Generally, some degree of variability of the solution is to be expected (due to matrix conditioning, multi-threading, flow of divide-and-conquer/bulge chasing computations, etc.).

The condition number of a matrix is log10(cond(Amat)) = ~21, roughly meaning that last ~21 digits are expected to be lost during computation of the solution.


However, even if we use extended precision, the last ~21 digits of the computed solution are not accurate (=can vary every time we solve the system). What counts is the relative accuracy of the system delivered by the solution. 

I added few changes to your script:

clear all

ndigits=34;
mp.Digits(ndigits);

load vL
vL = mp(vL,ndigits);
Amat=mp(Amat,ndigits);

%vL = double(vL);
%Amat = double(Amat);

n0 = norm(Amat,1)*norm(vL, 1); % PH: norms to compute the relative accuracy

xi = Amat\vL;  %backslash error is nonzero when differenced with same backslash later on

for i=1:10
    notabug = Amat\vL;
    
    acc = norm(Amat*notabug-vL,1) / n0;     % PH: system rel. accuracy w.r.t solution vector.
    xacc = norm(xi - notabug,1)/norm(xi,1); % PH: rel. difference from the first solution.
    [m,idx] = max(abs(xi - notabug));
    
    fprintf("rel.acc(sys) = %.5e, rel.acc(x) = %.5e, [xi(%d) = %.8f, notabug(%d) = %.8f, rel.acc = %.5e]\n",acc,xacc,idx,xi(idx),idx,notabug(idx),abs(xi(idx)-notabug(idx))/abs(xi(idx)));
    
    %sum(abs(xi - notabug))
endv

Here is its results for precisions: 34, 50, 100 and 500 decimal digits.

% ndigits = 34
rel.acc(sys) = 1.55559e-15, rel.acc(x) = 2.54932e-19, [xi(40) = 33835.88114407, notabug(40) = 33835.88114407, rel.acc = 3.10975e-19]
rel.acc(sys) = 1.55559e-15, rel.acc(x) = 1.24177e-10, [xi(41) = 5382.66065303, notabug(41) = 5382.66064798, rel.acc = 9.37861e-10]
rel.acc(sys) = 1.55559e-15, rel.acc(x) = 8.72727e-11, [xi(41) = 5382.66065303, notabug(41) = 5382.66065016, rel.acc = 5.33988e-10]
rel.acc(sys) = 1.55559e-15, rel.acc(x) = 1.04572e-10, [xi(41) = 5382.66065303, notabug(41) = 5382.66064912, rel.acc = 7.26616e-10]
rel.acc(sys) = 1.55559e-15, rel.acc(x) = 8.19852e-11, [xi(41) = 5382.66065303, notabug(41) = 5382.66064979, rel.acc = 6.01897e-10]
rel.acc(sys) = 1.55559e-15, rel.acc(x) = 1.23477e-10, [xi(41) = 5382.66065303, notabug(41) = 5382.66064801, rel.acc = 9.32217e-10]
rel.acc(sys) = 1.55559e-15, rel.acc(x) = 1.23477e-10, [xi(41) = 5382.66065303, notabug(41) = 5382.66064801, rel.acc = 9.32217e-10]
rel.acc(sys) = 1.55559e-15, rel.acc(x) = 1.24177e-10, [xi(41) = 5382.66065303, notabug(41) = 5382.66064798, rel.acc = 9.37861e-10]
rel.acc(sys) = 1.55559e-15, rel.acc(x) = 7.31347e-11, [xi(41) = 5382.66065303, notabug(41) = 5382.66065016, rel.acc = 5.33995e-10]
rel.acc(sys) = 1.55559e-15, rel.acc(x) = 1.24177e-10, [xi(41) = 5382.66065303, notabug(41) = 5382.66064798, rel.acc = 9.37861e-10]

% ndigits = 50
rel.acc(sys) = 1.55559e-15, rel.acc(x) = 3.92138e-30, [xi(40) = 33835.88114392, notabug(40) = 33835.88114392, rel.acc = 3.26593e-30]
rel.acc(sys) = 1.55559e-15, rel.acc(x) = 9.09178e-27, [xi(41) = 5382.66065164, notabug(41) = 5382.66065164, rel.acc = 6.17458e-26]
rel.acc(sys) = 1.55559e-15, rel.acc(x) = 5.67413e-27, [xi(41) = 5382.66065164, notabug(41) = 5382.66065164, rel.acc = 3.50551e-26]
rel.acc(sys) = 1.55559e-15, rel.acc(x) = 5.65223e-27, [xi(41) = 5382.66065164, notabug(41) = 5382.66065164, rel.acc = 3.38773e-26]
rel.acc(sys) = 1.55559e-15, rel.acc(x) = 5.21849e-27, [xi(41) = 5382.66065164, notabug(41) = 5382.66065164, rel.acc = 3.40105e-26]
rel.acc(sys) = 1.55559e-15, rel.acc(x) = 3.92138e-30, [xi(40) = 33835.88114392, notabug(40) = 33835.88114392, rel.acc = 3.26593e-30]
rel.acc(sys) = 1.55559e-15, rel.acc(x) = 6.09511e-27, [xi(41) = 5382.66065164, notabug(41) = 5382.66065164, rel.acc = 4.11961e-26]
rel.acc(sys) = 1.55559e-15, rel.acc(x) = 5.21855e-27, [xi(41) = 5382.66065164, notabug(41) = 5382.66065164, rel.acc = 3.40105e-26]
rel.acc(sys) = 1.55559e-15, rel.acc(x) = 6.12574e-27, [xi(41) = 5382.66065164, notabug(41) = 5382.66065164, rel.acc = 4.11961e-26]
rel.acc(sys) = 1.55559e-15, rel.acc(x) = 4.69512e-27, [xi(41) = 5382.66065164, notabug(41) = 5382.66065164, rel.acc = 2.96222e-26]

% ndigits = 100
rel.acc(sys) = 1.55559e-15, rel.acc(x) = 5.74485e-81, [xi(41) = 5382.66065164, notabug(41) = 5382.66065164, rel.acc = 4.02796e-80]
rel.acc(sys) = 1.55559e-15, rel.acc(x) = 1.42945e-79, [xi(41) = 5382.66065164, notabug(41) = 5382.66065164, rel.acc = 9.44583e-79]
rel.acc(sys) = 1.55559e-15, rel.acc(x) = 5.76549e-81, [xi(41) = 5382.66065164, notabug(41) = 5382.66065164, rel.acc = 3.97130e-80]
rel.acc(sys) = 1.55559e-15, rel.acc(x) = 1.42444e-77, [xi(40) = 33835.88114392, notabug(40) = 33835.88114392, rel.acc = 1.04581e-77]
rel.acc(sys) = 1.55559e-15, rel.acc(x) = 1.42410e-77, [xi(40) = 33835.88114392, notabug(40) = 33835.88114392, rel.acc = 1.14507e-77]
rel.acc(sys) = 1.55559e-15, rel.acc(x) = 1.57315e-79, [xi(41) = 5382.66065164, notabug(41) = 5382.66065164, rel.acc = 1.02690e-78]
rel.acc(sys) = 1.55559e-15, rel.acc(x) = 1.42394e-77, [xi(40) = 33835.88114392, notabug(40) = 33835.88114392, rel.acc = 1.14487e-77]
rel.acc(sys) = 1.55559e-15, rel.acc(x) = 1.99963e-77, [xi(41) = 5382.66065164, notabug(41) = 5382.66065164, rel.acc = 1.18875e-76]
rel.acc(sys) = 1.55559e-15, rel.acc(x) = 1.99972e-77, [xi(41) = 5382.66065164, notabug(41) = 5382.66065164, rel.acc = 1.18875e-76]
rel.acc(sys) = 1.55559e-15, rel.acc(x) = 1.47611e-77, [xi(40) = 33835.88114392, notabug(40) = 33835.88114392, rel.acc = 1.15290e-77]

% ndigits = 500
rel.acc(sys) = 1.55559e-15, rel.acc(x) = 2.88681e-477, [xi(41) = 5382.66065164, notabug(41) = 5382.66065164, rel.acc = 1.15224e-476]
rel.acc(sys) = 1.55559e-15, rel.acc(x) = 3.48807e-477, [xi(41) = 5382.66065164, notabug(41) = 5382.66065164, rel.acc = 2.16831e-476]
rel.acc(sys) = 1.55559e-15, rel.acc(x) = 2.88681e-477, [xi(41) = 5382.66065164, notabug(41) = 5382.66065164, rel.acc = 1.15224e-476]
rel.acc(sys) = 1.55559e-15, rel.acc(x) = 0.00000e+00, [xi(1) = 0.00000000, notabug(1) = 0.00000000, rel.acc = 0.00000e+00]
rel.acc(sys) = 1.55559e-15, rel.acc(x) = 0.00000e+00, [xi(1) = 0.00000000, notabug(1) = 0.00000000, rel.acc = 0.00000e+00]
rel.acc(sys) = 1.55559e-15, rel.acc(x) = 2.88681e-477, [xi(41) = 5382.66065164, notabug(41) = 5382.66065164, rel.acc = 1.15224e-476]
rel.acc(sys) = 1.55559e-15, rel.acc(x) = 2.88681e-477, [xi(41) = 5382.66065164, notabug(41) = 5382.66065164, rel.acc = 1.15224e-476]
rel.acc(sys) = 1.55559e-15, rel.acc(x) = 0.00000e+00, [xi(1) = 0.00000000, notabug(1) = 0.00000000, rel.acc = 0.00000e+00]
rel.acc(sys) = 1.55559e-15, rel.acc(x) = 0.00000e+00, [xi(1) = 0.00000000, notabug(1) = 0.00000000, rel.acc = 0.00000e+00]
rel.acc(sys) = 1.55559e-15, rel.acc(x) = 2.88681e-477, [xi(41) = 5382.66065164, notabug(41) = 5382.66065164, rel.acc = 1.15224e-476]

First column is the system accuracy w.r.t current solution vector, second - current solution vector difference from the first one. Lastly in square brackets we have the current solution vector element with max difference from the first vector and its relative difference.


The relative accuracy of all the solutions are the same (1.55559e-15), this is clearly the maximum achievable accuracy for the system. 


The solution vectors differ from the first one by last ~21 decimal digits. This coincides with the matrix conditioning (as we expected). The solution vector elements with max. difference also differs by the same (very small) margin from the first vector.

 

Overall, this perfectly matches the expected behavior of matrix solution.


***

Could you please send me references to MATLAB's requirements for solutions of overdetermined system to be bit-by-bit deterministic? In this particular case the MATLAB doesn't use any advanced QR algorithms/multithreading and probably this leads to exactly the same solution each time. But, mathematically it is not guaranteed or expected/justified. 


In toolbox we use advanced algorithms for the case (multi-threadfinfg + D&C + bulge chasing with different execution paths each time). Nonetheless the solutions provide the same relative accuracy and differ from each other by expected margin (which is log10(cond(A)) = ~21 digits). This is how it is expected to behave. Let me know your comments.

Accuracy (precision) of solution is not in question.  To say that two different multiprecision solutions are equivalent, because they fall within some calculable error tolerance, does not address the reproducibility problem at issue.  Mathworks certainly hasn't asserted that a small neighborhood of linear system solutions are equivalent.  One could indeed ask: Where does Matlab documentation state that linear system solution by mldivide() is NOT bit by bit deterministic?  The fact that double precision solution (of an overdetermined rank deficient linear system) is reproducible leads to an expectation of multiprecision reproducibility for MCT.

MCT can indeed produce an identical result (for the overdetermined rank deficient system in my seminal post) on a subsequent run for the very same problem:

ans = 3.312601991721914126549721394248086e-06    % different results

ans = 0                                                                              % identical results

This means that every neighborhood of equivalent solutions must include {0}.  "...multi-threading, flow of divide-and-conquer/bulge chasing computations" implies that MCT may have inadvertently introduced stochastics into the mldivide algorithm.  Nondeterministic MCT behavior implies departure from Matlab algorithms.  

MATLAB’s mldivide is expected deterministic; reproducible in floating point computations under identical conditions.  Repeated runs of the same problem in the same environment will yield the same bitwise result unless something in the environment changes; e.g, CPU type, number of threads, BLAS/LAPACK library, or MATLAB release:

mldivide (backslash)

https://www.mathworks.com/help/matlab/ref/mldivide.html

(From “Algorithms” and “Tips” sections):

“The mldivide function [...] performs a least-squares solution if A is rectangular. [...] The results of mldivide are sensitive to the condition of A. However, the same inputs on the same architecture produce the same outputs.”

Although MathWorks does not explicitly say “bitwise identical” in that reference, it emphasizes (under the same hardware and same MATLAB version) that mldivide provides reproducible results; i.e, the same result each time.

  • MATLAB’s deterministic algorithms, including linear algebra operations via LAPACK/BLAS, will generate consistent results across repeated runs when run under the same environment.
  • Floating-point arithmetic is subject to round-off error but, in absence of changes such as multithreading differences or CPU differences, you’ll get identical results bit by bit.

Floating Point Results Between Computers

https://www.mathworks.com/support/tech-notes/1100/1108.html

  • This tech note covers in detail why floating-point results may differ when hardware, compiler optimizations, or operating system changes, but not for repeat runs under the exact same conditions.
  • The takeaway is that if none of those external factors changes, MATLAB’s solver routines produce the same outputs for the same inputs.

Moore-Penrose Pseudoinverse (pinv)

https://www.mathworks.com/help/matlab/ref/pinv.html

  • When mldivide solves rank deficient or rectangular problems, it uses LAPACK routines that effectively compute the Moore-Penrose pseudoinverse (or a variant thereof) under the hood.
  • The pinv documentation states it returns a specific deterministic matrix A† so repeated calls with the same A are consistent.

Determinism

  • Same Environment: If you run x = A\b; for the same matrix A and vector b, on the same hardware, with the same version of MATLAB, and the same parallelization/threading settings, the results are bit for bit identical each time.
  • Potential Sources of Variation: Changes in CPU architecture, changes to number of threads or parallelization settings, or updating to a new MATLAB release (which might switch to a newer version of BLAS/LAPACK) can alter order of floating point operations and thus yield slight numerical differences.

As long as all conditions remain fixed, the backslash operator is deterministic for the same input problem—even in the overdetermined rank deficient case—and will yield the exact same solution for each call.

Jon, Happy New Year!

We have just released new version 5.3.6.15902: https://bit.ly/3mvxXqg
Now, its least-squares solver generates bitwise deterministic results every time.

Would appreciate it if you would update me on your test results.


Repeated runs of the same problem in the same environment will yield the same bitwise result unless something in the environment changes; e.g, CPU type, number of threads,...

I think with more advanced algorithms and many-core CPUs this will change in the future.

For example. Even though we use the same number of threads in a solver - it might split the overall work in different number  of jobs and distribute them differently among cores on each call. This might happen because some CPU cores might be busy with other processes on the computer. More flexible/adaptive algorithms take this into account (as we did in least-square solver before 5.3.6).

Thank you for all your support and contributions!

Best wishes for the New Year!

Pavel