Reproducing MATLAB random numbers in Python

June 16th, 2014 | Categories: math software, matlab, programming, python, random numbers | Tags:

When porting code between MATLAB and Python, it is sometimes useful to produce the exact same set of random numbers for testing purposes. Both Python and MATLAB currently use the Mersenne Twister generator by default so one assumes this should be easy…and it is…provided you use the generator in Numpy and avoid the seed 0.

Generate some random numbers in MATLAB

Here, we generate the first 5 numbers for 3 different seeds in MATLAB. Our aim is to reproduce these in Python.

>> format long
>> rng(0)
>> rand(1,5)'

ans =

   0.814723686393179
   0.905791937075619
   0.126986816293506
   0.913375856139019
   0.632359246225410

>> rng(1)
>> rand(1,5)'

ans =

   0.417022004702574
   0.720324493442158
   0.000114374817345
   0.302332572631840
   0.146755890817113

>> rng(2)
>> rand(1,5)'

ans =

   0.435994902142004
   0.025926231827891
   0.549662477878709
   0.435322392618277
   0.420367802087489

Python’s default random module

According to the documentation,Python’s random module uses the Mersenne Twister algorithm but the implementation seems to be different from MATLAB’s since the results are different. Here’s the output from a fresh ipython session:

In [1]: import random

In [2]: random.seed(0)

In [3]: [random.random() for _ in range(5)]
Out[3]: 
[0.8444218515250481,
 0.7579544029403025,
 0.420571580830845,
 0.25891675029296335,
 0.5112747213686085]

In [4]: random.seed(1)

In [5]: [random.random() for _ in range(5)]
Out[5]: 
[0.13436424411240122,
 0.8474337369372327,
 0.763774618976614,
 0.2550690257394217,
 0.49543508709194095]

In [6]: random.seed(2)

In [7]: [random.random() for _ in range(5)]
Out[7]: 
[0.9560342718892494,
 0.9478274870593494,
 0.05655136772680869,
 0.08487199515892163,
 0.8354988781294496]

The Numpy random module

Numpy’s random module, on the other hand, seems to use an identical implementation to MATLAB for seeds other than 0. In the below, notice that for seeds 1 and 2, the results are identical to MATLAB’s. For a seed of zero, they are different.

In [1]: import numpy as np

In [2]: np.set_printoptions(suppress=True)

In [3]: np.set_printoptions(precision=15)

In [4]: np.random.seed(0)

In [5]: np.random.random((5,1))
Out[5]: 
array([[ 0.548813503927325],
       [ 0.715189366372419],
       [ 0.602763376071644],
       [ 0.544883182996897],
       [ 0.423654799338905]])

In [6]: np.random.seed(1)

In [7]: np.random.random((5,1))
Out[7]: 
array([[ 0.417022004702574],
       [ 0.720324493442158],
       [ 0.000114374817345],
       [ 0.30233257263184 ],
       [ 0.146755890817113]])

In [8]: np.random.seed(2)

In [9]: np.random.random((5,1))
Out[9]: 
array([[ 0.435994902142004],
       [ 0.025926231827891],
       [ 0.549662477878709],
       [ 0.435322392618277],
       [ 0.420367802087489]])

Checking a lot more seeds

Although the above interactive experiments look convincing, I wanted to check a few more seeds. All seeds from 0 to 1 million would be a good start so I wrote a MATLAB script that generated 10 random numbers for each seed from 0 to 1 million and saved the results as a .mat file.

A subsequent Python script loads the .mat file and ensures that numpy generates the same set of numbers for each seed. It outputs every seed for which Python and MATLAB differ.

On my mac, I opened a bash prompt and ran the two scripts as follows

matlab -nodisplay -nodesktop -r "generate_matlab_randoms"
python python_randoms.py

The output was

MATLAB file contains 1000001 seeds and 10 samples per seed
Random numbers for seed 0 differ between MATLAB and Numpy

System details

Late 2013 Macbook Air
MATLAB 2014a
Python 2.7.7
Numpy 1.8.1

Walking Randomly