{"id":2678,"date":"2010-06-26T10:07:42","date_gmt":"2010-06-26T09:07:42","guid":{"rendered":"http:\/\/www.walkingrandomly.com\/?p=2678"},"modified":"2010-06-26T10:08:58","modified_gmt":"2010-06-26T09:08:58","slug":"a-free-version-of-the-pdist-command-for-matlab","status":"publish","type":"post","link":"https:\/\/walkingrandomly.com\/?p=2678","title":{"rendered":"A free version of the pdist command for MATLAB"},"content":{"rendered":"<p>MATLAB contains a function called <a href=\"http:\/\/www.mathworks.de\/access\/helpdesk\/help\/toolbox\/stats\/pdist.html\">pdist<\/a> that calculates the &#8216;Pairwise distance between pairs of objects&#8217;.  Typical usage is<\/p>\n<pre>X=rand(10,2);\r\ndists=pdist(X,'euclidean');\r\n<\/pre>\n<p>It&#8217;s a nice function but the problem with it is that it is part of the <a href=\"http:\/\/www.mathworks.com\/products\/statistics\/\">Statistics Toolbox<\/a> and that costs extra.  I was recently approached by a user who needed access to the pdist function but all of the statistics toolbox license tokens on our network were in use and this led to the error message<\/p>\n<pre>??? License checkout failed.\r\nLicense Manager Error -4\r\nMaximum number of users for Statistics_Toolbox reached.\r\nTry again later.\r\nTo see a list of current users use the lmstat utility or contact your License Administrator<\/pre>\n<p>One option, of course, is to buy more licenses for the statistics toolbox but there is another way.  
You may have heard of <a href=\"http:\/\/www.gnu.org\/software\/octave\/\">GNU Octave<\/a>, a free, open-source MATLAB-like program that has been in development for many years.\u00a0 Well, Octave has a sister project called <a href=\"http:\/\/octave.sourceforge.net\/\">Octave-Forge<\/a> which aims to provide a set of free toolboxes for Octave.\u00a0 It turns out that not only does Octave-Forge contain a statistics toolbox but that toolbox also contains a pdist function.\u00a0 I wondered how hard it would be to take Octave-Forge&#8217;s pdist function and modify it so that it ran on MATLAB.<\/p>\n<p>Not very!\u00a0 There is a script called <a href=\"http:\/\/octave.sourceforge.net\/oct2mat\/index.html\">oct2mat<\/a> that is designed to automate some of the translation but I chose not to use it &#8211; I prefer to get my hands dirty you see.\u00a0 I named the resulting function octave_pdist to make it clear that you are using an Octave function rather than a MATLAB function.\u00a0 This may matter if one or the other turns out to have bugs.\u00a0 It appears to work rather nicely:<\/p>\n<pre>dists_oct = octave_pdist(X,'euclidean');\r\n% let's see if it agrees with the stats toolbox version\r\nall( abs(dists_oct-dists)&lt;1e-10)\r\n\r\nans =\r\n     1\r\n<\/pre>\n<p>Let&#8217;s look at timings on a slightly bigger problem.<\/p>\n<pre>&gt;&gt; X=rand(1000,2);\r\n&gt;&gt; tic;matdists=pdist(X,'euclidean');toc\r\nElapsed time is 0.018972 seconds.\r\n&gt;&gt; tic;octdists=octave_pdist(X,'euclidean');toc\r\nElapsed time is 6.644317 seconds.\r\n<\/pre>\n<p>Uh-oh!  The Octave version is about 350 times slower (for this problem) than the MATLAB version.  Ouch!   
As far as I can tell, this isn&#8217;t down to my dodgy porting efforts; the original Octave pdist really does take that long on my machine (Ubuntu 9.10, Octave 3.0.5).<\/p>\n<p>This was far too slow to be of practical use and we didn&#8217;t want to be modifying algorithms, so we ditched this function and went with the <a href=\"http:\/\/www.nag.co.uk\/numeric\/MB\/start.asp\">NAG Toolbox for MATLAB<\/a> instead (<a href=\"http:\/\/www.nag.co.uk\/numeric\/MB\/manual_22_1\/pdf\/G03\/g03ea.pdf\">routine g03ea<\/a> in case you are interested) since Manchester effectively has an infinite number of licenses for that product.<\/p>\n<p>If, however, you&#8217;d like to play with my MATLAB port of Octave&#8217;s pdist then download it below.<\/p>\n<ul>\n<li> <a href=\"https:\/\/www.walkingrandomly.com\/images\/downloads\/octave_pdist.m\">octave_pdist.m<\/a> makes use of some functions in the excellent <a href=\"http:\/\/biosig-consulting.com\/matlab\/NaN\/\">NaN Toolbox<\/a> so you will need to download and install that package first.<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>MATLAB contains a function called pdist that calculates the &#8216;Pairwise distance between pairs of objects&#8217;. Typical usage is X=rand(10,2); dists=pdist(X,&#8217;euclidean&#8217;); It&#8217;s a nice function but the problem with it is that it is part of the Statistics Toolbox and that costs extra. 
I was recently approached by a user who needed access to the pdist [&hellip;]<\/p>\n","protected":false},"author":3,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[4,11,32,7],"tags":[],"class_list":["post-2678","post","type-post","status-publish","format-standard","hentry","category-math-software","category-matlab","category-open-source","category-programming"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/p3swhs-Hc","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/walkingrandomly.com\/index.php?rest_route=\/wp\/v2\/posts\/2678","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/walkingrandomly.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/walkingrandomly.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/walkingrandomly.com\/index.php?rest_route=\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/walkingrandomly.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=2678"}],"version-history":[{"count":14,"href":"https:\/\/walkingrandomly.com\/index.php?rest_route=\/wp\/v2\/posts\/2678\/revisions"}],"predecessor-version":[{"id":2745,"href":"https:\/\/walkingrandomly.com\/index.php?rest_route=\/wp\/v2\/posts\/2678\/revisions\/2745"}],"wp:attachment":[{"href":"https:\/\/walkingrandomly.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=2678"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/walkingrandomly.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=2678"},{"taxonomy"
:"post_tag","embeddable":true,"href":"https:\/\/walkingrandomly.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=2678"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}