Parallelisation: Is MATLAB doing it wrong?

May 12th, 2009 | Categories: math software, matlab

In the old days of computing, life was relatively simple as far as processor speed was concerned.  If your current computer ran at 1GHz and you were about to buy one that ran at 2GHz then, as a rough rule of thumb, you expected it to be about twice as fast.  Of course there was more to it than that (there always is) and other things such as the amount of memory and hard disk speed also had an impact on how fast you got the results of your calculations.  It also didn’t help that not all GHz were created equal – two chips that ran at the same clock speed but with different architectures could exhibit radically different calculation performance.

I did say life was ‘relatively simple’ not just ‘simple’.

These days we still have all of the above to worry about, but we also have the added complication that a typical modern computer contains not one but two processors (or, strictly speaking, two processor cores).  It’s almost like having two machines for the price of one, and higher-end desktops can contain as many as 8.  Which is a lot!

Back in the old, single-processor days we could take our slow code from an old machine, put it on a new machine and it would go faster.  That was it – no worries, no hassle.  If only life were so simple today…

These days we need to actually rewrite our code so that it makes better use of all the idle processors lying around in our newfangled machines or, to put it more technically, we need to parallelize our code.  Sometimes this is simple, often it’s not, but one thing is for sure: the environment we are programming in has to provide a set of parallel programming commands to help us do this extra work.
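To make "parallelize" a little more concrete, here is a minimal sketch in Python (chosen purely for illustration; it is not MATLAB, Mathematica or Maple code). It splits a sum of squares into independent chunks and hands each chunk to a separate worker process using the standard-library `concurrent.futures` module. The function names `split_range`, `partial_sum`, `serial_sum` and `parallel_sum` are made up for this example, and the problem is deliberately trivial.

```python
from concurrent.futures import ProcessPoolExecutor


def split_range(n, parts):
    """Split range(n) into `parts` contiguous (start, stop) chunks."""
    step, extra = divmod(n, parts)
    chunks, start = [], 0
    for i in range(parts):
        # The first `extra` chunks get one additional element each.
        stop = start + step + (1 if i < extra else 0)
        chunks.append((start, stop))
        start = stop
    return chunks


def partial_sum(bounds):
    """Sum of squares over one chunk; each call is independent of the others."""
    start, stop = bounds
    return sum(k * k for k in range(start, stop))


def serial_sum(n):
    """The original single-processor version, for comparison."""
    return partial_sum((0, n))


def parallel_sum(n, workers=2):
    """Farm the chunks out to `workers` processes and combine the results."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum, split_range(n, workers)))


if __name__ == "__main__":
    n = 1_000_000
    # Both versions must agree; only the wall-clock time should differ.
    print(serial_sum(n) == parallel_sum(n, workers=2))
```

The point is that the rewrite is not free: someone has to decide how to cut the work up and how to put the partial results back together, which is exactly the extra work the programming environment needs to help with.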

Old style languages such as C and Fortran have had this sort of stuff for years in the form of OpenMP (compiler directives for shared-memory machines) and MPI libraries such as Open MPI, but mathematical programming languages such as Mathematica and Maple have only recently gotten in on the act.

Version 7 of Mathematica and Version 13 of Maple* both included a set of tools for enabling parallel programming as part of their basic installs.  In other words – you don’t need to pay any extra money to get these programs to use your multi-processor computers to their full potential.

MATLAB, on the other hand, expects you to pay extra to be able to parallelize your code by requiring you to purchase the Parallel Computing Toolbox.  It’s not cheap either!  Of course, Mathworks are free to charge whatever they like for their products, but I don’t think that this particular policy is doing them any favours.

It’s not all bad – certain functions in basic MATLAB make full use of your multicore processors.  As of release 2009a, for example, the Fast Fourier Transform function (fft) will perform much faster on multi-processor machines, but if you want to parallelize your own code then you still have to buy the Parallel Computing Toolbox.  Your alternatives include messing around with batch processing (if your problem can be solved this way) or using another package such as Mathematica or Maple, both of which have full parallel support out of the box.
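The batch-processing route mentioned above usually means splitting a job into independent pieces and starting one ordinary MATLAB process per piece. A rough sketch of the bookkeeping, written in Python, might look like the following. The MATLAB function `run_chunk` is hypothetical (you would write it yourself so that it processes cases `first` to `last` and saves its results to disk), and the command line assumes a Linux install where the `-nodisplay` and `-r` startup options are available, so check it against your own version.

```python
def chunk_bounds(n_cases, n_jobs):
    """Return 1-based inclusive (first, last) case numbers for each job."""
    step, extra = divmod(n_cases, n_jobs)
    bounds, first = [], 1
    for i in range(n_jobs):
        last = first + step + (1 if i < extra else 0) - 1
        if last >= first:  # skip empty jobs when n_jobs > n_cases
            bounds.append((first, last))
        first = last + 1
    return bounds


def matlab_commands(n_cases, n_jobs, func="run_chunk"):
    """Build one MATLAB command line per job.

    `func` names a hypothetical user-written MATLAB function taking
    (first, last); it is not part of MATLAB itself.
    """
    return [
        f'matlab -nodisplay -r "{func}({first},{last}); exit"'
        for first, last in chunk_bounds(n_cases, n_jobs)
    ]


if __name__ == "__main__":
    # Only print the commands; launching them (by hand, from a shell
    # script, or on different machines) is left to the reader.
    for cmd in matlab_commands(n_cases=100, n_jobs=4):
        print(cmd)
```

Each command runs in its own MATLAB process, so this only helps when the pieces do not need to talk to one another, which is the "if your problem can be solved this way" caveat.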

At my University we have several hundred network licenses for MATLAB itself and only 2 or 3 licenses for the Parallel Computing Toolbox.  That doesn’t mean that people aren’t interested in parallel MATLAB.  Quite the opposite: they simply do not want to have to pay extra to parallelize code on desktops and will go to great lengths (all legal, I hasten to add) to avoid the extra cost.

Unfortunately for Mathworks, one of the easiest options for someone starting on a new project is simply to use one of their competitors, and this is starting to happen.  Manchester has a site license for Mathematica and I saw a big spike in interest after version 7 thanks to the parallel programming features it contains.

In summary I think that The Mathworks would do themselves a great favour by including the parallel toolbox as part of their basic package.

Any thoughts?

*Maple actually had parallel support before version 13, but it has been vastly improved in this latest version.

  1. Stephen
    May 13th, 2009 at 23:03

    I couldn’t agree more. It took a big effort just to get them to support compiling the parallel toolbox.

  2. Brett
    July 28th, 2009 at 23:36

    I also agree. I think that TMW will eventually offer parallel computing for free, but they see it as a luxury for now and want to take advantage of those willing to pay for it, while those who aren’t willing to pay now probably don’t “need” it anyway. :)

  3. John Butcher
    January 21st, 2010 at 15:35

    Hi Mike,

    On a related topic, I’m getting some weird behaviour from MATLAB. I have a mini cluster of machines that I have set up in order to parallelise some simulations I run. This doesn’t make use of the Parallel Computing Toolbox, but uses bash scripts instead to send jobs to different machines. The weird thing is that when I start MATLAB on the worker machines, some MATLAB instances appear to use both cores (I forgot to mention, they are dual core), as top reveals the CPUs at around 200%. For some machines, however, it appears that only one core ever gets used, with the other sitting there idle for most of the time. Which machines are affected also changes on the odd occasion. Have you heard of this problem before? Any ideas on how to solve it? I want to use both cores to speed up my simulations.

    Many thanks

    John

    On another note, I must say I enjoy reading your blog – it’s full of great material which is very relevant (esp. MATLAB) to what I do!

  4. Mike
    January 21st, 2010 at 16:01

    Hi John

    Glad you enjoy the blog and thanks for saying hi.

    My gut feeling would be that your script contains some MATLAB functions that get automatically parallelized. When these functions are being evaluated you will see full dual-core utilisation which will drop down to only one core afterwards.
    Although I do not think it is complete yet, the following post lists many of the functions that automatically get distributed across multiple cores:

    http://www.walkingrandomly.com/?p=1894

    Certain routines that use BLAS and LAPACK might be multi-core aware too – I haven’t figured these out yet.

    Cheers,
    Mike

  5. John Butcher
    January 21st, 2010 at 16:18

    Thanks Mike,

    I thought this at first, but the code is the same for all machines, and after looking at top on each machine for a while, some stay at 200% for the duration, while the problem ones only stay at 100%. So I don’t think it is this.

    I’ll keep investigating! Thanks for your reply

  6. Mike
    January 21st, 2010 at 17:08

    John

    I’ll be interested to hear of the outcome. Is the email address you gave when writing your comments valid?

    Something else that sounds obvious but might be worth checking – Are you using the same version of MATLAB on all machines? Is Multicore support turned off somehow on the problem ones?

    Cheers,
    Mike

  7. John Butcher
    January 21st, 2010 at 18:11

    Yeah, the email address is valid – it’s my uni one. I’m using the same version, yes. I’ll check re multicore support. I’ll be in touch.