## Randomness in MATLAB : Are you ruining your results due to your choice of syntax?

September 26th, 2012

Pop quiz: What does the following line of MATLAB code do?

rand('state',10)

If you said ‘It changes the seed of the random number generator to 10’ you get half a point.

‘Only half a point!?’ I hear you say accusingly ‘but it says so in my book [for example, 1-3], why not a full point?’

You only get a full point if you’d said something like ‘It changes the seed of the random number generator to 10 and it also changes the random number generator from the high quality, default Mersenne Twister generator to a lower quality legacy random number generator.

rand('seed',10)

This behaves in a very similar manner– it changes both the seed and the type of the underlying generator. However, the random number generator it switches to this time is an even older one that was introduced as far back as MATLAB version 4.  It is not very good at all by modern standards!

A closer look

>> RandStream.getGlobalStream
ans =
mt19937ar random stream (current global stream)
Seed: 0
NormalTransform: Ziggurat

mt1993ar refers to a particular variant of the Mersenne Twister algorithm— an industry strength random number generator that’s used in many software packages and simulations.  It’s been the default generator in MATLAB since 2007a.  Change the seed using the modern (since 2011a), recommended syntax and ask again:

>> rng(10)
>> RandStream.getGlobalStream
ans =
mt19937ar random stream (current global stream)
Seed: 10
NormalTransform: Ziggurat

This is behaving exactly as you’d expect, you ask it to change the seed and it changes the seed…nothing more, nothing less. Now, let’s use the older syntax

>> rand('state',10)
>> RandStream.getGlobalStream
ans =
legacy random stream (current global stream)
RAND algorithm: V5 (Subtract-with-Borrow), RANDN algorithm: V5 (Ziggurat)

The random number generator has completely changed!   We are no longer using the Mersenne Twister algorithm, we are now using a ‘subtract with borrow’ [see reference 4 for implementation details] generator which has been shown to have several undesirable issues [5-7].

Let’s do it again but this time using the even older ‘seed’ version:

>> rand('seed',10)
>> RandStream.getGlobalStream
ans =
legacy random stream (current global stream)
RAND algorithm: V4 (Congruential), RANDN algorithm: V5 (Ziggurat)

Now, this random number generator is ancient by computing standards.  It also has a relatively tiny period of only 2 billion or so.  For details see [4]

Why this matters

Now, all of this is well documented so you may wonder why I am making such a big issue out of it.  Here are my reasons

• I often get sent MATLAB code for the purposes of code-review and optimisation.  I see the old seeding syntax a LOT and the program’s authors are often blissfully unaware of the consequnces.
• The old syntax looks like all it should do is change the seed.  It doesn’t!  Before 2007a, however, it did!
• The old syntax is written in dozens of books because it was once the default, correct syntax to use.
• Many users don’t read the relevent section of the MATLAB documentation because they have no idea that there is a potential issue.  They read a book or tutorial..it says to use rand(‘state’,10) so they do.
• MATLAB doesn’t use the old generators by default any more because they are not very good [4-7]!
• Using these old generators may adversely affect the quality of your simulation.

The bottom line

Don’t do either of these to change the seed of the default generator to 10:

rand('state',10)
rand('seed',10)

rng(10)

Only if you completely understand and accept the consequences of the older syntax should you use it.

References

1. ‘MATLAB – A practical introduction to programming and problem solving’, 2009,Stormy Attaway

2. MATLAB Guide (Second Edition), 2005, Desmond Higham and Nicholas Higham

3. Essential MATLAB for Engineers and Scientists (Fourth Edition), 2009, Hahn and Valentine

4. Numerical Computing with MATLAB, 2004, Cleve Moler (available online)

5.  Why does the random number generator in MATLAB fail a particular test of randomness? The Mathworks, retreived 26th September 2012

6. A strong nonrandom pattern in Matlab default random number generator, 2006, Petr Savicky, retreived 26th September 2012

7.  Learning Random Numbers: A Matlab Anomaly, 2008, Petr Savicky and Marko Robnik-Šikonja, Applied Artificial Intelligence, Vol22 issue 3, pp 254-265

Other posts on random numbers in MATLAB

## Some Basic Stock Analysis with Mathematica 8

September 21st, 2012

One of my favourite investment news sites is The Motley Fool which frequently run articles such as 10 Shares Trading Near 52 week lows and 15 Shares Trading Near 52 week Highs.  The idea behind such filtering is to seek out shares that have done particularly badly (or well) over the last year and then subject them to further analysis in order to find opportunities.  Thanks to Mathematica’s FinancialData command, it is rather easy to generate these lists yourself whenever you like.

15 Shares Trading Near 52 Week Highs

The original article selected the 15 largest cap shares from the FTSE All Share Index that were trading within 3% of their 52 week high at the time of publication.  Let’s see how to do that using Mathematica.

The following code returns the tickers of all shares from the FTSE All Share Index that are trading within 3% of their 52 week high.

percentage = 3;
all52weekHighs =
Select[FinancialData["^FTAS", "Members"],
Abs[FinancialData[#, "FractionalChangeHigh52Week"]] < (percentage/100.) &];

The variable all52weekHighs contains a list of stock tickers (e.g. LLOY.L) that meet our criteria.  The next thing to do is to find the market cap of each one:

all52WeekHighsWithCaps =
Map[{#, FinancialData[#, "MarketCap"]} &, all52weekHighs];

This works fine for most shares. LloydsTSB for example returns {“LLOY.L”, 2.7746*10^10} at the time of writing but the MarketCap query fails for some tickers. For example, the Market Cap for HSL.L is not available and we get {“HSL.L”, Missing[“NotAvailable”]}.  Let’s discard these by insisting that we only consider stocks that have a numeric market cap.

Goodall52WeekHighsWithCaps =
Select[all52WeekHighsWithCaps, NumberQ[#[[2]]] &];

We sort the list according to MarketCap:

sorted = Sort[Goodall52WeekHighsWithCaps, #1[[2]] > #2[[2]] &];

Let’s prettify the list a little by iterating over all tickers and replacing the ticker with the associated stock name. Also, let’s divide the market cap by 1 million to make it more readable

finallist =
Map[{FinancialData[#[[1]], "Name"], #[[2]]/1000000} &, sorted];

Now, you may be wondering why I haven’t been showing you the output of these commands. This is simply because even this final list is rather large at 118 entries at the time of writing

Length[finallist]

118

The original article only considered the top 15 sorted by Market Cap so let’s show those. Market Caps are given in millions.

top15 = finallist[[1;;15]]//Grid

HSBC Holdings PLC    94159.
National Grid    24432.
Prudential PLC    21775.
Centrica PLC    17193.
Rolls Royce Group    16363.
WPP Plc    10743.
Experian PLC    10197.
Old Mutual PLC    8400.
Legal & General Group PLC    8036.
Wolseley PLC    7955.
Standard Life    6662.
J Sainsbury plc    6401.
Aggreko PLC    6391.
Land Securities Group PLC    6180.
British Land Co PLC    4859.

and we are done.

## 90th Carnival of Mathematics

September 16th, 2012

Welcome to the 90th edition of the Carnival of Mathematics and the first one hosted by me since I handed over the administrative reigns to the good people of aperiodical.  The CoM is a great way to read about and promote mathematical blogging and has been running for over 5 years.  Hosted on a different blog each month, it covers the entire mathematical spectrum from simple mucking around with numbers right up to cutting edge research.

Writers can submit their own posts for inclusion in a carnival if they like and anyone can submit any mathy post that they’ve found interesting– ideally, something written over the last month or so to keep it fresh.

If you want to keep up with the CoM, head over to its twitter feed or the dedicated page at aperiodical.

Trivia

Neat Stuff

Computation

Puzzles and Games

• Shecky R brings us Mind Wrenching – A self-referential logic puzzle that will give your brain cells a workout.
• Brent Yorgey has been visualizing winning strategies for “nim-like” games and says ‘This is a post about visualizing winning strategies for certain games where players take turns removing counters from two piles.  The games make for fun games to actually play, and analyzing them can get quite interesting!’

Funnies

Art and Mathematics

Books

Tricks and Tactics

Topology

• Mark Dominus of The Universe of Discourse gives us three posts this month: A two parter on topology and set theory (Click here for part 1 and here for part 2).

Teaching

• Dan McQuillan gives us On Trigonometric Nostalgia and says ‘This is a post about fostering a problem-solving mentality in a world where we do not even understand how our own tools work. It superimposes our nostalgia for the world we used to know with our innate curiosity, which still exists. Basic trigonometry is still fun and still relevant. Indeed, one can always ask questions and calculate!’
• Frederick Koh takes on the dot product in Understanding MATTERS (7) saying ‘This dot product concept involving parallel vector planes is rather fundamental, yet a handful of my students are unable to figure out how things exactly work. Hence I have decided to pen this detailed explanation in the hope that it will benefit not just my charges, but other math learners as well.’
• Augustus Van Dusen has written the first in an upcoming series of posts that will prove properties of logarthmic and exponential functions. Augustus says ‘This particular post will focus on the properties of logarithmic functions of real variables. Students in advanced placement calculus in high school and beginning college students who are not math majors are the intended audience.’

Comment

Not the only game in town

The Carnival of Mathematics isn’t the only mathematical blog carnival that’s doing the tour.  There’s also the fantastic monthly Math Teachers at Play.

End

That’s it for the 90th Edition.  Past editions written by me include 80, 76, 74 and 73 among others.  For future editions keep an eye on @carnivalofmath and aperiodical.com

## Send your submissions for 90th Carnival of Math

September 4th, 2012

I’m hosting the 90th Carnival of Mathematics very soon.  If you have written (or read) a mathematics blog article over the last month and want to give it more attention, why not make a submission? The deadline for submissions is 10th September.

## A Month of Math Software – August 2012

September 2nd, 2012

Welcome to the August edition of A Month of Math Software– a regular series of articles where I share what’s shiny and new in the world of mathematical software.  If you like what you see and want more, take a browse through the archives.  If you have some news that you think should go into next month’s edition, contact me to tell me all about it so I can tell the world.

This edition includes lots of new releases, blog posts and news about mathematics on mobile devices…enjoy!

Mobile Mathematics

August was a very big month for mobile mathematical applications with the following releases

General purpose mathematics

Do numerical computing using….Javascript!

• The Numeric Javascript library has been updated to version 1.2.2. The main new feature is linear programming– the function is numeric.solveLP()

Mathematical software libraries

GPU Programming

GPU stands for Graphical Processing Unit but these days you can get a GPU to do a lot more than just graphics.  You could think of them as essentially massively parallel math  co-processors that can make light work of certain operations.

• Jacket is a commercial GPU Processing add-on for MATLAB.  In recent blog posts, the Jacket developers discuss SAR Image Formation Algorithms on the GPU and Option Pricing.
• CULA is a set of commercial GPU-accelerated linear algebra libraries.  CULA-Dense is, as you might expect, for dense matrices and is now at version 15.    CULA-Sparse is at version S3.  I can’t find a what’s new document but the main change seems to be the addition of support for NVIDIA’s Kepler architecture.  The CULA library can be called from C, C++, Fortran, MATLAB, and Python and is free for individual academic use.
• GPULib is a commercial software library enabling GPU-accelerated calculations for IDL.  In a recent blog post, one of GPULib’s developers has been experimenting with OpenCL support.

Statistics