{"id":5479,"date":"2014-06-16T21:48:08","date_gmt":"2014-06-16T20:48:08","guid":{"rendered":"http:\/\/www.walkingrandomly.com\/?p=5479"},"modified":"2014-06-16T21:58:36","modified_gmt":"2014-06-16T20:58:36","slug":"reproducing-matlab-random-numbers-in-python","status":"publish","type":"post","link":"https:\/\/walkingrandomly.com\/?p=5479","title":{"rendered":"Reproducing MATLAB random numbers in Python"},"content":{"rendered":"<p>When porting code between MATLAB and Python, it is sometimes useful to produce the exact same set of random numbers for testing purposes. \u00a0Both Python and MATLAB currently use the <a href=\"http:\/\/en.wikipedia.org\/wiki\/Mersenne_twister\">Mersenne Twister<\/a> generator by default so one assumes this should be easy&#8230;and it is&#8230;provided you use the generator in <a href=\"http:\/\/www.numpy.org\">Numpy<\/a> and avoid the seed 0.<\/p>\n<p><strong>Generate some random numbers in MATLAB<\/strong><\/p>\n<p>Here, we generate the first 5 numbers for 3 different seeds in MATLAB. Our aim is to reproduce these in Python.<\/p>\n<pre>&gt;&gt; format long\r\n&gt;&gt; rng(0)\r\n&gt;&gt; rand(1,5)'\r\n\r\nans =\r\n\r\n   0.814723686393179\r\n   0.905791937075619\r\n   0.126986816293506\r\n   0.913375856139019\r\n   0.632359246225410\r\n\r\n&gt;&gt; rng(1)\r\n&gt;&gt; rand(1,5)'\r\n\r\nans =\r\n\r\n   0.417022004702574\r\n   0.720324493442158\r\n   0.000114374817345\r\n   0.302332572631840\r\n   0.146755890817113\r\n\r\n&gt;&gt; rng(2)\r\n&gt;&gt; rand(1,5)'\r\n\r\nans =\r\n\r\n   0.435994902142004\r\n   0.025926231827891\r\n   0.549662477878709\r\n   0.435322392618277\r\n   0.420367802087489<\/pre>\n<p><strong>Python&#8217;s default random module<\/strong><\/p>\n<p><a href=\"https:\/\/docs.python.org\/2\/library\/random.html\">According to the documentation<\/a>,Python&#8217;s random module uses the Mersenne Twister algorithm but the implementation seems to be different from MATLAB&#8217;s since the results are different. \u00a0Here&#8217;s the output from a fresh ipython session:<\/p>\n<pre>In [1]: import random\r\n\r\nIn [2]: random.seed(0)\r\n\r\nIn [3]: [random.random() for _ in range(5)]\r\nOut[3]: \r\n[0.8444218515250481,\r\n 0.7579544029403025,\r\n 0.420571580830845,\r\n 0.25891675029296335,\r\n 0.5112747213686085]\r\n\r\nIn [4]: random.seed(1)\r\n\r\nIn [5]: [random.random() for _ in range(5)]\r\nOut[5]: \r\n[0.13436424411240122,\r\n 0.8474337369372327,\r\n 0.763774618976614,\r\n 0.2550690257394217,\r\n 0.49543508709194095]\r\n\r\nIn [6]: random.seed(2)\r\n\r\nIn [7]: [random.random() for _ in range(5)]\r\nOut[7]: \r\n[0.9560342718892494,\r\n 0.9478274870593494,\r\n 0.05655136772680869,\r\n 0.08487199515892163,\r\n 0.8354988781294496]<\/pre>\n<p><strong>The Numpy random module<\/strong><\/p>\n<p><a href=\"http:\/\/docs.scipy.org\/doc\/numpy\/reference\/routines.random.html\">Numpy&#8217;s random module<\/a>, on the other hand, seems to use an identical implementation to MATLAB for seeds other than 0. In the below, notice that for seeds 1 and 2, the results are identical to MATLAB&#8217;s. For a seed of zero, they are different.<\/p>\n<pre>In [1]: import numpy as np\r\n\r\nIn [2]: np.set_printoptions(suppress=True)\r\n\r\nIn [3]: np.set_printoptions(precision=15)\r\n\r\nIn [4]: np.random.seed(0)\r\n\r\nIn [5]: np.random.random((5,1))\r\nOut[5]: \r\narray([[ 0.548813503927325],\r\n       [ 0.715189366372419],\r\n       [ 0.602763376071644],\r\n       [ 0.544883182996897],\r\n       [ 0.423654799338905]])\r\n\r\nIn [6]: np.random.seed(1)\r\n\r\nIn [7]: np.random.random((5,1))\r\nOut[7]: \r\narray([[ 0.417022004702574],\r\n       [ 0.720324493442158],\r\n       [ 0.000114374817345],\r\n       [ 0.30233257263184 ],\r\n       [ 0.146755890817113]])\r\n\r\nIn [8]: np.random.seed(2)\r\n\r\nIn [9]: np.random.random((5,1))\r\nOut[9]: \r\narray([[ 0.435994902142004],\r\n       [ 0.025926231827891],\r\n       [ 0.549662477878709],\r\n       [ 0.435322392618277],\r\n       [ 0.420367802087489]])<\/pre>\n<p><strong>Checking a lot more seeds<\/strong><\/p>\n<p>Although the above interactive experiments look convincing, I wanted to check a few more seeds. All seeds from 0 to 1 million would be a good start so I wrote a MATLAB script that generated 10 random numbers for each seed from 0 to 1 million and saved the results as a .mat file.<\/p>\n<p>A subsequent Python script loads the .mat file and ensures that numpy generates the same set of numbers for each seed. It outputs every seed for which Python and MATLAB differ.<\/p>\n<ul>\n<li><a href=\"https:\/\/www.walkingrandomly.com\/images\/matlab\/pyrand\/generate_matlab_randoms.m\">generate_matlab_randoms.m<\/a><\/li>\n<li><a href=\"https:\/\/www.walkingrandomly.com\/images\/matlab\/pyrand\/python_randoms.py\">python_randoms.py<\/a><\/li>\n<\/ul>\n<p>On my mac, I opened a bash prompt and ran the two scripts as follows<\/p>\n<pre>matlab -nodisplay -nodesktop -r \"generate_matlab_randoms\"\r\npython python_randoms.py<\/pre>\n<p>The output was<\/p>\n<pre>MATLAB file contains 1000001 seeds and 10 samples per seed\r\nRandom numbers for seed 0 differ between MATLAB and Numpy<\/pre>\n<p><strong>System details<\/strong><\/p>\n<ul>\n<li><span style=\"line-height: 13px;\">Late 2013 Macbook Air<\/span><\/li>\n<li>MATLAB 2014a<\/li>\n<li>Python 2.7.7<\/li>\n<li>Numpy 1.8.1<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>When porting code between MATLAB and Python, it is sometimes useful to produce the exact same set of random numbers for testing purposes. \u00a0Both Python and MATLAB currently use the Mersenne Twister generator by default so one assumes this should be easy&#8230;and it is&#8230;provided you use the generator in Numpy and avoid the seed 0. [&hellip;]<\/p>\n","protected":false},"author":3,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[4,11,7,31,63],"tags":[],"class_list":["post-5479","post","type-post","status-publish","format-standard","hentry","category-math-software","category-matlab","category-programming","category-python","category-random-numbers"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/p3swhs-1qn","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/walkingrandomly.com\/index.php?rest_route=\/wp\/v2\/posts\/5479","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/walkingrandomly.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/walkingrandomly.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/walkingrandomly.com\/index.php?rest_route=\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/walkingrandomly.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=5479"}],"version-history":[{"count":8,"href":"https:\/\/walkingrandomly.com\/index.php?rest_route=\/wp\/v2\/posts\/5479\/revisions"}],"predecessor-version":[{"id":5487,"href":"https:\/\/walkingrandomly.com\/index.php?rest_route=\/wp\/v2\/posts\/5479\/revisions\/5487"}],"wp:attachment":[{"href":"https:\/\/walkingrandomly.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=5479"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/walkingrandomly.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=5479"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/walkingrandomly.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=5479"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}