How to compute the sum of squares of outer products of two matrices minus a common matrix in Matlab?










3














Suppose there are three n * n matrices X, Y, S. How can I quickly compute the following scalar b?



b = 0;
for i = 1:n
b = b + sum(sum((X(i,:)' * Y(i,:) - S).^2));
end


The computational cost is O(n^3). There is a fast way to compute the sum of outer products of the rows of two matrices. Specifically, the matrix C



C = zeros(n);
for i = 1:n
C = C + X(i,:)' * Y(i,:);
end


can be calculated without a for loop as C = X.'*Y, which is much faster in practice. Is there a similarly fast way to compute b?
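
For reference, here is a self-contained version of the baseline loop; the size n and the random test matrices below are only an illustrative assumption:

n = 200;                          % arbitrary test size
X = rand(n); Y = rand(n); S = rand(n);

b = 0;
for i = 1:n
    b = b + sum(sum((X(i,:)' * Y(i,:) - S).^2));   % one n-by-n outer product per iteration
end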










matlab matrix-multiplication






edited Nov 12 at 11:55

























asked Nov 12 at 11:30









messcode

186















  • did you mean X(i,:)'*Y(i,:) ? otherwise there is no mention of Y
    – bla
    Nov 12 at 11:52











  • yes, thanks for pointing out the error. I have corrected it.
    – messcode
    Nov 12 at 11:57
















2 Answers

















4














You can use:



X2 = X.^2;
Y2 = Y.^2;
S2 = S.^2;
b = sum(sum(X2.' * Y2 - 2 * (X.' * Y ) .* S + n * S2));


Given your example



b=0;
for i = 1:n
b = b + sum(sum((X(i,:).' * Y(i,:) - S).^2));
end


We can first bring the summation out of the loop:



b=0;
for i = 1:n
b = b + (X(i,:).' * Y(i,:) - S).^2;
end
b=sum(b(:))


Knowing that we can write (a - b)^2 as a^2 - 2*a*b + b^2



b=0;
for i = 1:n
b = b + (X(i,:).' * Y(i,:)).^2 - 2.* (X(i,:).' * Y(i,:)) .*S + S.^2;
end
b=sum(b(:))


And we know that (a * b) ^ 2 is the same as a^2 * b^2:



X2 = X.^2;
Y2 = Y.^2;
S2 = S.^2;
b=0;
for i = 1:n
b = b + (X2(i,:).' * Y2(i,:)) - 2.* (X(i,:).' * Y(i,:)) .*S + S2;
end
b=sum(b(:))
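
A quick numeric sanity check of that entrywise identity for an outer product (the test vectors are arbitrary, purely for illustration):

u = rand(4,1); v = rand(1,5);          % arbitrary column/row vectors
norm((u*v).^2 - (u.^2)*(v.^2))         % ~0 up to rounding: each entry of u*v is a product of two scalars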


Now we can compute each term separately:



 b = sum(sum(X2.' * Y2 - 2 * (X.' * Y ) .* S + n * S2));
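
Spelling out the index bookkeeping behind this collapse (a sketch; X_{ik} denotes X(i,k), and X2 = X.^2, Y2 = Y.^2, S2 = S.^2):

b = \sum_{i=1}^{n} \sum_{k,j} \bigl( X_{ik} Y_{ij} - S_{kj} \bigr)^2
  = \sum_{k,j} \Bigl[ \underbrace{\sum_{i} X_{ik}^{2} Y_{ij}^{2}}_{(X2^{\top} Y2)_{kj}}
    \, - \, 2\, S_{kj} \underbrace{\sum_{i} X_{ik} Y_{ij}}_{(X^{\top} Y)_{kj}}
    \, + \, n\, S_{kj}^{2} \Bigr]

so each bracketed term is a single n-by-n matrix, and the outer double sum is the sum(sum(...)) above.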


Here is the result of a test in Octave that compares my method, the two methods provided by @AndrasDeak, and the original loop-based solution, for inputs of size 500*500:



===rahnema1 (B)===
Elapsed time is 0.0984299 seconds.

===Andras Deak (B2)===
Elapsed time is 7.86407 seconds.

===Andras Deak (B3)===
Elapsed time is 2.99158 seconds.

===Loop solution===
Elapsed time is 2.20357 seconds


n=500;
X= rand(n);
Y= rand(n);
S= rand(n);

disp('===rahnema1 (B)===')
tic
X2 = X.^2;
Y2 = Y.^2;
S2 = S.^2;
b=sum(sum(X2.' * Y2 - 2 * (X.' * Y ) .* S + n * S2));
toc
disp('===Andras Deak (B2)===')
tic
b2 = sum(reshape((permute(reshape(X, [n, 1, n]).*Y, [3,2,1]) - S).^2, 1, []));
toc
disp('===Andras Deak (B3)===')
tic
b3 = sum(reshape((reshape(X, [n, 1, n]).*Y - reshape(S.', [1, n, n])).^2, 1, []));
toc
disp('===Loop solution===')
tic
b=0;
for i = 1:n
b = b + sum(sum((X(i,:)' * Y(i,:) - S).^2));
end
toc
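
As a quick check that all approaches agree numerically (an illustrative addition, not part of the timing; it reuses the variables left by the script above):

b1 = sum(sum(X2.' * Y2 - 2 * (X.' * Y) .* S + n * S2));
max(abs([b1, b2, b3] - b))   % tiny relative to b, i.e. only floating-point rounding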





edited Nov 12 at 13:20
answered Nov 12 at 12:56
rahnema1


















  • 1




    Very nice! -- I guess sum(reshape(...)) is more expensive than sum(sum(...))? Is this because of the overhead in the reshape function? -- To avoid the double sum, in Octave you can do sum((...)(:)), and in MATLAB R2018b you can now do sum(...,'all'), which both are quite a bit more elegant than the double sum, and should be faster.
    – Cris Luengo
    Nov 12 at 16:17










  • Thanks. I don't know exactly which of sum-reshape or sum-sum performs better. Here I used sum-sum to reflect the original code, and it may be more readable. sum((...)(:)) may complicate readability and sum(...,'all') uses the ugly string parameter passing. Comparing them, I think the difference in speed for the current problem is negligible. Maybe we need a more elegant function/operator?
    – rahnema1
    Nov 12 at 16:40











  • The double sum requires an intermediate array, so in principle it does more work. But yes, I'm sure it's not a measurable difference in this case. I agree with your readability comments. A new operator would be awesome. Something like sumall=@(x)sum(x(:)). But that does imply 20 or so new operators (meanall, stdall, maxall, etc. etc.), you can't do one and not all the others. :)
    – Cris Luengo
    Nov 12 at 16:44


















3














You probably can't reduce the time complexity, but you can use vectorization to get rid of the loop and make use of low-level code and caching as much as possible. Whether it's actually faster depends on your dimensions, so you need to run some timing tests to see if it's worth it:



% dummy data
n = 3;
X = rand(n);
Y = rand(n);
S = rand(n);

% vectorize
b2 = sum(reshape((permute(reshape(X, [n, 1, n]).*Y, [3,2,1]) - S).^2, 1, []));

% check
b - b2 % close to machine epsilon i.e. zero


What happens is that we insert a new singleton dimension in one of the arrays, ending up with an array of size [n, 1, n] against one with [n, n], the latter being implicitly the same as [n, n, 1]. The overlapping first index corresponds to the i in your loop, the remaining two indices correspond to the matrix indices of the dyadic product you have for each i. Then we permute the indices in order to put the "i" index last, so that we can again broadcast the result with S of (implicit) size [n, n, 1]. What we then have is an array of size [n, n, n] where the first two indices are matrix indices in your original and the last one corresponds to i. We then just have to take the square and sum each term (instead of summing twice I reshaped the array to a row and summed once).
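
To make the shapes concrete, here is a small sketch of the intermediate arrays (dummy data; the names A and B are only for illustration):

n = 3; X = rand(n); Y = rand(n); S = rand(n);
A = reshape(X, [n, 1, n]) .* Y;   % size [n, n, n], A(i,j,k) = X(i,k)*Y(i,j)
B = permute(A, [3, 2, 1]);        % size [n, n, n], B(k,j,i) = X(i,k)*Y(i,j), so page i equals X(i,:)'*Y(i,:)
size(B - S)                       % S of size [n, n] broadcasts along the third dimension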



A slight variation of the above transposes S instead of the 3d array which might be faster (again, you should time it):



b3 = sum(reshape((reshape(X, [n, 1, n]).*Y - reshape(S.', [1, n, n])).^2, 1, []));


In terms of performance, reshape is free (it only reinterprets data, it doesn't copy), but permute/transpose will often lead to a performance hit when data gets copied.






edited Nov 12 at 12:16
answered Nov 12 at 11:59
Andras Deak





















