{"id":5774,"date":"2015-05-22T09:01:20","date_gmt":"2015-05-22T08:01:20","guid":{"rendered":"http:\/\/www.walkingrandomly.com\/?p=5774"},"modified":"2015-07-02T18:37:43","modified_gmt":"2015-07-02T17:37:43","slug":"matlab-vectorisation-is-a-double-edged-sword","status":"publish","type":"post","link":"https:\/\/walkingrandomly.com\/?p=5774","title":{"rendered":"MATLAB: Vectorisation is a double-edged sword"},"content":{"rendered":"<p><strong>Update: 2nd July 2015\u00a0<\/strong>The <a href=\"https:\/\/github.com\/mikecroucher\/WalkingRandomly\/tree\/master\">code in github<\/a> has moved on a little since this post was written so I changed the link in the text below to <a href=\"https:\/\/github.com\/mikecroucher\/WalkingRandomly\/tree\/6991a1652a79b373a1d4fa512a10112e0545eeca\">the exact commit that produced the results discussed here<\/a>.<\/p>\n<p>Imagine that you are a very new <a href=\"http:\/\/uk.mathworks.com\/products\/matlab\/\">MATLAB<\/a> programmer and you have to create an N x N matrix called A where A(i,j) = i+j<br \/>\nYour first attempt at a solution might look like this<\/p>\n<pre>N=2000\r\n% Generate a N-by-N matrix where A(i,j) = i + j;\r\nfor ii = 1:N\r\n     for jj = 1:N\r\n         A(ii,jj) = ii + jj;\r\n     end\r\n end<\/pre>\n<p>On my current machine (Macbook Pro bought in early 2015), the above loop takes <strong>2.03 seconds.<\/strong> You might think that this is a long time for something so simple and complain that MATLAB is slow. The person you complain to points out that you should preallocate your array before assigning to it.<\/p>\n<pre>N=2000\r\n% Generate a N-by-N matrix where A(i,j) = i + j;\r\nA=zeros(N,N);\r\nfor ii = 1:N\r\n     for jj = 1:N\r\n         A(ii,jj) = ii + jj;\r\n     end\r\n end<\/pre>\n<p>This now takes <strong>0.049 seconds<\/strong> on my machine &#8211; a speed up of over 41 times! MATLAB suddenly doesn&#8217;t seem so slow after all.<\/p>\n<p>Word gets around about your problem, however, and seasoned MATLAB-ers see that nested loop, make a funny face twitch and start muttering &#8216;vectorise, vectorise, vectorise&#8217;. Emails soon pile in with <a href=\"http:\/\/uk.mathworks.com\/help\/matlab\/matlab_prog\/vectorization.html\">vectorised<\/a> solutions<\/p>\n<pre>% Method 1: MESHGRID.\r\n[X, Y] = meshgrid(1:N, 1:N);\r\nA = X + Y;<\/pre>\n<p>This takes <strong>0.025 seconds<\/strong> on my machine &#8212; a healthy speed-up on the loop-with-preallocation solution. You have to understand the meshgrid command, however, in order to understand what&#8217;s going on here. It&#8217;s still clear (to me at least) what its doing but not as clear as the nice,obvious double loop. Call me old fashioned but I like loops&#8230;I understand them.<\/p>\n<pre>% Method 2: Matrix multiplication.\r\nA = (1:N).' * ones(1, N) + ones(N, 1) * (1:N);<\/pre>\n<p>This one is MUCH harder to read but you don&#8217;t worry about it too much because at <strong>0.032 seconds<\/strong> it&#8217;s slower than meshgrid.<\/p>\n<pre>% Method 3: REPMAT.\r\nA = repmat(1:N, N, 1) + repmat((1:N).', 1, N);<\/pre>\n<p>This one appears to be interesting! <strong>At 0.009 seconds<\/strong>, it&#8217;s the fastest so far &#8211; by a healthy amount!<\/p>\n<pre>% Method 4: CUMSUM.\r\nA = cumsum(ones(N)) + cumsum(ones(N), 2);<\/pre>\n<p>Coming in at <strong>0.052 seconds<\/strong>, this cumsum solution is slower than the preallocated loop.<\/p>\n<pre>% Method 5: BSXFUN.\r\nA = bsxfun(@plus, 1:N, (1:N).');<\/pre>\n<p>Ahhh, <a href=\"http:\/\/uk.mathworks.com\/help\/matlab\/ref\/bsxfun.html\">bsxfun<\/a> or &#8216;The Widow-maker function&#8217; as I sometimes refer to it. Responsible for some of the fastest and most unreadable vectorised MATLAB code I&#8217;ve ever written. In this case, it brings execution time down to <strong>0.0045 seconds<\/strong>.<\/p>\n<p>Whenever I see something that can be vectorised with a repmat, I try to figure out if I can rewrite it as a bsxfun. The result is usually horrible to look at so I tend to keep the original loop commented out above it as an explanation! This particular example isn&#8217;t too bad but bsxfun can quickly get hairy.<\/p>\n<p><strong>Conclusion<\/strong><\/p>\n<p>Loops in MATLAB aren&#8217;t anywhere near as bad as they used to be thanks to advances in <a href=\"http:\/\/en.wikipedia.org\/wiki\/Just-in-time_compilation\">JIT compilation<\/a> but it can often pay, speed-wise, to switch to vectorisation. The price you often pay for this speed-up is that vectorised code can become very difficult to read.<\/p>\n<p>If you&#8217;d like the code I ran to get the timings above, <a href=\"https:\/\/github.com\/mikecroucher\/WalkingRandomly\/tree\/6991a1652a79b373a1d4fa512a10112e0545eeca\">it&#8217;s on githu<\/a>b (link refers to the exact commit used for this post)\u00a0.\u00a0Here&#8217;s the output from the run I referred to in this post.<\/p>\n<pre>Original loop time is 2.025441\r\nPreallocate and loop is 0.048643\r\nMeshgrid time is 0.025277\r\nMatmul time is 0.032069\r\nRepmat time is 0.009030\r\nCumsum time is 0.051966\r\nbsxfun time is 0.004527<\/pre>\n<ul>\n<li>MATLAB Version: 2015a<\/li>\n<li>Early 2015 Macbook Pro with 16Gb RAM<\/li>\n<li>CPU: 2.8Ghz quad core i7 Haswell<\/li>\n<\/ul>\n<p>This post is based on a demonstration given by Mathwork&#8217;s Ken Deeley during a recent session at The University of Sheffield.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Update: 2nd July 2015\u00a0The code in github has moved on a little since this post was written so I changed the link in the text below to the exact commit that produced the results discussed here. Imagine that you are a very new MATLAB programmer and you have to create an N x N matrix [&hellip;]<\/p>\n","protected":false},"author":3,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[53,4,11,7],"tags":[],"class_list":["post-5774","post","type-post","status-publish","format-standard","hentry","category-making-matlab-faster","category-math-software","category-matlab","category-programming"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/p3swhs-1v8","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/walkingrandomly.com\/index.php?rest_route=\/wp\/v2\/posts\/5774","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/walkingrandomly.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/walkingrandomly.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/walkingrandomly.com\/index.php?rest_route=\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/walkingrandomly.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=5774"}],"version-history":[{"count":7,"href":"https:\/\/walkingrandomly.com\/index.php?rest_route=\/wp\/v2\/posts\/5774\/revisions"}],"predecessor-version":[{"id":5805,"href":"https:\/\/walkingrandomly.com\/index.php?rest_route=\/wp\/v2\/posts\/5774\/revisions\/5805"}],"wp:attachment":[{"href":"https:\/\/walkingrandomly.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=5774"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/walkingrandomly.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=5774"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/walkingrandomly.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=5774"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}