{"id":3635,"date":"2011-06-18T16:53:40","date_gmt":"2011-06-18T15:53:40","guid":{"rendered":"http:\/\/www.walkingrandomly.com\/?p=3635"},"modified":"2011-06-18T16:53:40","modified_gmt":"2011-06-18T15:53:40","slug":"making-mathematica-faster-with-compile","status":"publish","type":"post","link":"https:\/\/walkingrandomly.com\/?p=3635","title":{"rendered":"Making Mathematica faster with Compile"},"content":{"rendered":"<p>Over at Sol Lederman&#8217;s fantastic new blog, <a href=\"http:\/\/playingwithmathematica.com\/2011\/06\/17\/friday-fun-6\/#more-413\">Playing with Mathematica<\/a>, he shared some code that produced the following figure.<\/p>\n<p style=\"text-align: center;\"><img decoding=\"async\" class=\"aligncenter\" src=\"https:\/\/www.walkingrandomly.com\/images\/mathematica8\/sol_image1.png\" alt=\"Sol's image\" \/><\/p>\n<p>Here&#8217;s Sol&#8217;s code with an AbsoluteTiming command thrown in.<\/p>\n<pre>f[x_, y_] := Module[{},\r\n  If[\r\n   Sin[Min[x*Sin[y], y*Sin[x]]] &gt;\r\n    Cos[Max[x*Cos[y],\r\n       y*Cos[x]]] + (((2 (x - y)^2 + (x + y - 6)^2)\/40)^3)\/\r\n      6400000 + (12 - x - y)\/30, 1, 0]\r\n  ]\r\n\r\n\u00a0AbsoluteTiming[\r\n\\[Delta] = 0.02;\r\nrange = 11;\r\nxyPoints = Table[{x, y}, {y, 0, range, \\[Delta]}, {x, 0, range, \\[Delta]}];\r\nimage = Map[f @@ # &amp;, xyPoints, {2}];\r\n]\r\nRotate[ArrayPlot[image, ColorRules -&gt; {0 -&gt; White, 1 -&gt; Black}], 135 Degree]<\/pre>\n<p>This took 8.02 seconds on the laptop I am currently working on (Windows 7 AMD Phenom II N620 Dual core at 2.8Ghz).  <em>Note that I am only measuring how long the calculation itself took and am ignoring the time taken to render the image and define the function<\/em>.<\/p>\n<p><strong>Compiled functions make Mathematica code go faster<\/strong><\/p>\n<p><a href=\"http:\/\/www.wolfram.com\/mathematica\/\">Mathematica<\/a> has a Compile function which does exactly what you&#8217;d expect&#8230;it produces a compiled version of the function you give it (if it can!).  Sol&#8217;s function gave it no problems at all.<\/p>\n<pre>f = Compile[{{x, _Real}, {y, _Real}}, If[\r\n    Sin[Min[x*Sin[y], y*Sin[x]]] &gt;\r\n     Cos[Max[x*Cos[y],\r\n        y*Cos[x]]] + (((2 (x - y)^2 + (x + y - 6)^2)\/40)^3)\/\r\n       6400000 + (12 - x - y)\/30, 1, 0]\r\n   ];\r\n\r\nAbsoluteTiming[\r\n \\[Delta] = 0.02;\r\n range = 11;\r\n xyPoints =\r\n  Table[{x, y}, {y, 0, range, \\[Delta]}, {x, 0, range, \\[Delta]}];\r\n image = Map[f @@ # &amp;, xyPoints, {2}];\r\n ]\r\nRotate[ArrayPlot[image, ColorRules -&gt; {0 -&gt; White, 1 -&gt; Black}],\r\n 135 Degree]<\/pre>\n<p>This simple change takes computation time down from 8.02 seconds to 1.23 seconds which is a 6.5 times speed up for hardly any extra coding work. Not too shabby!<\/p>\n<p><strong>Switch to C code to get it even faster<\/strong><\/p>\n<p>I&#8217;m not done yet though!  By default the Compile command produces code for the so-called Mathematica Virtual Machine but recent versions of Mathematica allow us to go even further.<\/p>\n<p>Install Visual Studio Express 2010 (and the Windows 7.1 SDK if you are running 64bit Windows) and you can ask Mathematica to convert the function to low level C code, compile it and produce a function object linked to the resulting compiled code.  Sounds complicated but is a snap to actually do.  Just add<\/p>\n<pre>CompilationTarget -&gt; \"C\"<\/pre>\n<p>to the Compile command.<\/p>\n<pre>f = Compile[{{x, _Real}, {y, _Real}},\r\n   If[Sin[Min[x*Sin[y], y*Sin[x]]] &gt;\r\n     Cos[Max[x*Cos[y],\r\n        y*Cos[x]]] + (((2 (x - y)^2 + (x + y - 6)^2)\/40)^3)\/\r\n       6400000 + (12 - x - y)\/30, 1, 0]\r\n   , CompilationTarget -&gt; \"C\"\r\n   ];\r\n\r\nAbsoluteTiming[\\[Delta] = 0.02;\r\n range = 11;\r\n xyPoints =\r\n  Table[{x, y}, {y, 0, range, \\[Delta]}, {x, 0, range, \\[Delta]}];\r\n image = Map[f @@ # &amp;, xyPoints, {2}];]\r\nRotate[ArrayPlot[image, ColorRules -&gt; {0 -&gt; White, 1 -&gt; Black}],\r\n 135 Degree]<\/pre>\n<p>On my machine this takes calculation time down to 0.89 seconds which is 9 times faster than the original.<\/p>\n<p><strong>Making the compiled function listable<\/strong><\/p>\n<p>The current compiled function takes just one x,y pair and returns a result.<\/p>\n<pre>In[8]:= f[1, 2]\r\n\r\nOut[8]= 1<\/pre>\n<p>It can&#8217;t directly accept a list of x values and a list of y values.  For example for the two points (1,2) and (10,20) I&#8217;d like to be able to do f[{1, 10}, {2, 20}] and get the results {1,1}.  However what I end up with is an error<\/p>\n<pre>f[{1, 10}, {2, 20}]\r\n\r\nCompiledFunction::cfsa: Argument {1,10} at position 1 should be a machine-size real number. &gt;&gt;<\/pre>\n<p>To fix this I need to make my compiled function listable which is as easy as adding<\/p>\n<pre>RuntimeAttributes -&gt; {Listable}<\/pre>\n<p>to the function definition.<\/p>\n<pre>f = Compile[{{x, _Real}, {y, _Real}},\r\n   If[Sin[Min[x*Sin[y], y*Sin[x]]] &gt;\r\n     Cos[Max[x*Cos[y],\r\n        y*Cos[x]]] + (((2 (x - y)^2 + (x + y - 6)^2)\/40)^3)\/\r\n       6400000 + (12 - x - y)\/30, 1, 0]\r\n   , CompilationTarget -&gt; \"C\", RuntimeAttributes -&gt; {Listable}\r\n   ];<\/pre>\n<p>So now I can pass the entire array to this compiled function at once.  No need for Map.<\/p>\n<pre>AbsoluteTiming[\r\n \\[Delta] = 0.02;\r\n range = 11;\r\n xpoints = Table[x, {x, 0, range, \\[Delta]}, {y, 0, range, \\[Delta]}];\r\n ypoints = Table[y, {x, 0, range, \\[Delta]}, {y, 0, range, \\[Delta]}];\r\n image = f[xpoints, ypoints];\r\n ]\r\nRotate[ArrayPlot[image, ColorRules -&gt; {0 -&gt; White, 1 -&gt; Black}],\r\n 135 Degree]<\/pre>\n<p>On my machine this gets calculation time down to 0.28 seconds, a whopping 28.5 times faster than the original.  Rendering time is becoming much more of an issue than calculation time in fact!<\/p>\n<p><strong>Parallel anyone?<\/strong><\/p>\n<p>Simply by adding<\/p>\n<pre>Parallelization -&gt; True<\/pre>\n<p>to the Compile command I can parallelise the code using threads.  Since I have a dual core machine, this might be a good thing to do.  Let&#8217;s take a look<\/p>\n<pre>f = Compile[{{x, _Real}, {y, _Real}},\r\n   If[\r\n    Sin[Min[x*Sin[y], y*Sin[x]]] &gt;\r\n     Cos[Max[x*Cos[y],\r\n        y*Cos[x]]] + (((2 (x - y)^2 + (x + y - 6)^2)\/40)^3)\/\r\n       6400000 + (12 - x - y)\/30, 1, 0]\r\n   , RuntimeAttributes -&gt; {Listable}, CompilationTarget -&gt; \"C\",\r\n   Parallelization -&gt; True];\r\n\r\nAbsoluteTiming[\r\n \\[Delta] = 0.02;\r\n range = 11;\r\n xpoints = Table[x, {x, 0, range, \\[Delta]}, {y, 0, range, \\[Delta]}];\r\n ypoints = Table[y, {x, 0, range, \\[Delta]}, {y, 0, range, \\[Delta]}];\r\n image = f[xpoints, ypoints];\r\n ]\r\nRotate[ArrayPlot[image, ColorRules -&gt; {0 -&gt; White, 1 -&gt; Black}],\r\n 135 Degree]<\/pre>\n<p>The first time I ran this it was SLOWER than the non-threaded version coming in at 0.33 seconds.  Subsequent runs varied and occasionally got as low as 0.244 seconds which is only a few hundredths of a second faster than the original.<\/p>\n<p>If I make the problem bigger, however, by decreasing the size of Delta then we start to see the benefit of parallelisation.<\/p>\n<pre>AbsoluteTiming[\r\n \\[Delta] = 0.01;\r\n range = 11;\r\n xpoints = Table[x, {x, 0, range, \\[Delta]}, {y, 0, range, \\[Delta]}];\r\n ypoints = Table[y, {x, 0, range, \\[Delta]}, {y, 0, range, \\[Delta]}];\r\n image = f[xpoints, ypoints];\r\n ]\r\nRotate[ArrayPlot[image, ColorRules -&gt; {0 -&gt; White, 1 -&gt; Black}],\r\n 135 Degree]<\/pre>\n<p>The above calculation (sans rendering) took 0.988 seconds using a parallelised version of f and 1.24 seconds using a serial version.  Rendering took significantly longer!  As a comparison lets put a Delta of 0.01 in the original code:<\/p>\n<pre>f[x_, y_] := Module[{},\r\n  If[\r\n   Sin[Min[x*Sin[y], y*Sin[x]]] &gt;\r\n    Cos[Max[x*Cos[y],\r\n       y*Cos[x]]] + (((2 (x - y)^2 + (x + y - 6)^2)\/40)^3)\/\r\n      6400000 + (12 - x - y)\/30, 1, 0]\r\n  ]\r\n\r\n AbsoluteTiming[\r\n\\[Delta] = 0.01;\r\nrange = 11;\r\nxyPoints = Table[{x, y}, {y, 0, range, \\[Delta]}, {x, 0, range, \\[Delta]}];\r\nimage = Map[f @@ # &amp;, xyPoints, {2}];\r\n]\r\nRotate[ArrayPlot[image, ColorRules -&gt; {0 -&gt; White, 1 -&gt; Black}], 135 Degree]<\/pre>\n<p>The calculation time (again, ignoring rendering time) took 32.56 seconds and so our C-compiled, parallel version is almost 33 times faster!<\/p>\n<p><strong>Summary<\/strong><\/p>\n<ul>\n<li>The Compile function can make your code run significantly faster by compiling it for the Mathematica Virtual Machine (MVM).\u00a0 Note that not every function is suitable for compilation.<\/li>\n<li>If you have a C-compiler installed on your machine then you can switch from the MVM to compiled C-code using a single option statement.\u00a0 The resulting code is even faster<\/li>\n<li>Making your functions listable can increase performance.<\/li>\n<li>Parallelising your compiled function is easy and can lead to even more speed but only if your problem is of a suitable size.<\/li>\n<li>Sol Lederman has a <a href=\"http:\/\/playingwithmathematica.com\/\">very cool Mathematica blog<\/a> &#8211; check it out!\u00a0 The code that inspired this blog post originated there.<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Over at Sol Lederman&#8217;s fantastic new blog, Playing with Mathematica, he shared some code that produced the following figure. Here&#8217;s Sol&#8217;s code with an AbsoluteTiming command thrown in. f[x_, y_] := Module[{}, If[ Sin[Min[x*Sin[y], y*Sin[x]]] &gt; Cos[Max[x*Cos[y], y*Cos[x]]] + (((2 (x &#8211; y)^2 + (x + y &#8211; 6)^2)\/40)^3)\/ 6400000 + (12 &#8211; x &#8211; [&hellip;]<\/p>\n","protected":false},"author":3,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[52,4,8,41,7],"tags":[],"class_list":["post-3635","post","type-post","status-publish","format-standard","hentry","category-making-mathematica-faster","category-math-software","category-mathematica","category-parallel-programming","category-programming"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/p3swhs-WD","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/walkingrandomly.com\/index.php?rest_route=\/wp\/v2\/posts\/3635","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/walkingrandomly.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/walkingrandomly.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/walkingrandomly.com\/index.php?rest_route=\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/walkingrandomly.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=3635"}],"version-history":[{"count":16,"href":"https:\/\/walkingrandomly.com\/index.php?rest_route=\/wp\/v2\/posts\/3635\/revisions"}],"predecessor-version":[{"id":3651,"href":"https:\/\/walkingrandomly.com\/index.php?rest_route=\/wp\/v2\/posts\/3635\/revisions\/3651"}],"wp:attachment":[{"href":"https:\/\/walkingrandomly.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=3635"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/walkingrandomly.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=3635"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/walkingrandomly.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=3635"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}