## Archive for June, 2012

Welcome to the June 2012 edition of A Month of Math Software. The archive of all previous editions is here. Feel free to contact me if you have news that you’d like included in next month’s edition.

**A list of packages based upon SuiteSparse**

- Tim Davis has put together a list of projects and packages that rely on the various sparse matrix codes he’s written or co-authored over the years (UMFPACK, CHOLMOD, SuiteSparseQR, KLU, CSparse, and the ordering codes AMD and COLAMD). The list is very impressive but is incomplete. Drop Tim a line if you work on a project that uses his codes.

**Python based software**

- Sage, the superb computer algebra system based on Python and dozens of other open source projects, has seen a minor update to 5.0.1. Take a look at the changelog for the new stuff.
- Pandas, the Python Data Analysis Library, is now at release 0.8.0. This is a major update over the old 0.7.3 and significantly improves time series analysis in pandas.

**Try a spreadsheet that isn’t Excel**

- Gnumeric, the free spreadsheet that’s part of the GNOME office suite, was updated to 1.11.4 in June. Find out what’s new here.
- A new version of LibreOffice (version 3.5.4) was released at the end of May but I missed it in May’s update. LibreOffice is the fork of OpenOffice that everyone should be using these days and includes a great spreadsheet program.

**Math-o-mir – The mathematical notepad**

- Danijel Gorupec contacted me to tell me of a new release of his free software, Math-o-mir. According to Danijel, ‘The Math-o-mir is a software tool designed to write and edit mathematical equations as easily as possible. The goal was to achieve the same level of simplicity as with pencil and a sheet of paper. It is designed to act as your personal math notepad where you can write down your quick and informal calculations.’ I haven’t had a chance to try it myself but it looks interesting. See what’s new in the 1.7 release over at Danijel’s blog.

**Gnu Regression, Econometrics and Time-series Library**

- Version 1.9.9 of the Gnu Regression, Econometrics and Time-series Library, gretl, was released at the beginning of June. As the name suggests, gretl is mainly of use to econometricians and it has a long pedigree. Its first release was in 2000 and it has since been discussed in several journal articles (see below). See what’s new in version 1.9.9 here.
- Using gretl for Monte Carlo experiments – Journal of Applied Econometrics 2010
- GRETL 1.6.0 and its numerical accuracy – Journal of Applied Econometrics 2007
- Teaching undergraduate econometrics with GRETL – Journal of Applied Econometrics 2006
- GRETL: Econometric Software for the GNU generation – Journal of Applied Econometrics 2003

**Numerical Libraries**

- Magma, the GPU accelerated linear algebra library, has been updated to version 1.2.1. See what’s new at http://icl.cs.utk.edu/magma/news/news.html?id=295
- The HSL numerical library has seen some updates. Go to http://www.hsl.rl.ac.uk/changes.html for the details.

**Blogs, Bits and Bobs**

- Planet Octave is now up and running. This is a collection of blogs related to Octave and is currently made up of posts from the Google Summer of Code students. Some great stuff on the Just in Time (JIT) compiler development, Octave GUI and more.
- PTC’s Mathcad blog turns 3 years old. Head over and take a look at their 5 most popular posts. If you can’t afford Mathcad, why not try the free Mathcad clone, SMath Studio, which saw a new release this month.
- Geogebra now has a blog. The June newsletter is also worth checking out. Wikipedia tells us that ‘GeoGebra is an interactive geometry, algebra, and calculus application, intended for teachers and students.’
- Version 0.97 of Eureqa/Formulize has been released. If you don’t know what this is, take a look at the video. In a nutshell, you plug in data and it figures out the formula that best models it.

In a recent tweet, Cliff Pickover told the world that

727272727272727272727272727272727272727272727272727272727272727272727272727272727272727272727272727 is prime. That’s a nice looking prime and it took my laptop 1/100th of a second to confirm using Mathematica 8.

PrimeQ[727272727272727272727272727272727272727272727272727272727272727\
272727272727272727272727272727272727] // AbsoluteTiming

Out[1]= {0.0100000, True}
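If you don’t have a Mathematica licence to hand, you can confirm the result in plain Python. The sketch below is my own, not part of any library: a standard Miller-Rabin routine which, with a fixed set of witness bases, is a strong probable-prime test rather than a primality proof.

```python
# Miller-Rabin probable-prime test in plain Python. The function name and
# witness bases are my own choices; this is a probable-prime test, not a proof.
def is_probable_prime(n):
    small_primes = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)
    if n < 2:
        return False
    for p in small_primes:          # quick trial division first
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2^s with d odd
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for a in small_primes:          # use the small primes as witness bases
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False            # a witnesses that n is composite
    return True

pickover = 727272727272727272727272727272727272727272727272727272727272727272727272727272727272727272727272727
print(is_probable_prime(pickover))  # True
```

On my machine this also takes a small fraction of a second, thanks to Python’s built-in modular exponentiation.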

Can anyone else suggest some nice looking primes?

It is possible to write many integers as the sum of the cubes of three integers. For example

99 = (-5)^3 + 2^3 + 6^3

A more complicated example is

91 = (-67134)^3 + (-65453)^3 + 83538^3

Your task is to find integers x, y and z such that

33 = x^3 + y^3 + z^3

Hint: This is not a trivial problem and will (probably) require the use of a computer. Extra credit will be given if you also post your source code.
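To show the shape of the problem, here is a naive search sketch of my own (the function name and its interface are mine, not from any library). It recovers z from the remainder once x and y are fixed, which is enough for easy targets like 99 but hopeless for 33, whose solution, if one exists, lies far beyond any range a double loop can cover.

```python
# Naive search for n = x^3 + y^3 + z^3: loop over x and y with |x|,|y| <= limit,
# then recover z from the remainder via a rounded cube root.
def sum_of_three_cubes(n, limit):
    for x in range(-limit, limit + 1):
        for y in range(x, limit + 1):        # y >= x avoids duplicate pairs
            r = n - x**3 - y**3
            z = round(abs(r) ** (1 / 3.0)) if r else 0
            if r < 0:
                z = -z
            # The float cube root can be off by one, so check neighbours too
            for c in (z - 1, z, z + 1):
                if x**3 + y**3 + c**3 == n:
                    return (x, y, c)
    return None

print(sum_of_three_cubes(99, 10))  # finds a valid triple such as (-5, 2, 6)
```

Serious attempts at 33 use much cleverer number-theoretic sieving than this, as the reference in Update 3 below describes.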

**Update 3:** If you are serious about attempting to crack this problem (and some people seem to be, judging from the comments), the following reference may help. Its bibliography includes other jumping-off points for this problem.

- Michael Beck, Eric Pine, Wayne Tarrant and Kim Yarbrough Jensen, **New integer representations as the sum of three cubes**, Math. Comp. **76** (2007), 1683-1690 (Link)

**Update 2:** This problem is, in fact, a long-standing unsolved problem in number theory. If my memory serves me correctly, it is mentioned in the book Unsolved Problems in Number Theory.

**Update 1:** 33 is very VERY difficult so why not use 16 as a warm-up problem? The search space is MUCH smaller.

I recently got access to a shiny new (new to me at least) set of compilers, the Portland PGI compiler suite, which comes with a great set of technologies to play with including AVX vector support, CUDA for x86 and pragma-based GPU acceleration. So naturally, it wasn’t long before I wondered if I could use the PGI suite to compile MATLAB mex files. The bad news is that The MathWorks don’t support the PGI compilers out of the box, but that leads to the good news: I get to dig down and figure out how to add support for unsupported compilers.

In what follows I made use of **MATLAB 2012a** on **64bit Windows 7** with **Version 12.5 of the PGI Portland Compiler Suite**.

In order to set up a C mex compiler in MATLAB you execute the following:

mex -setup

which causes MATLAB to execute a Perl script at **C:\Program Files\MATLAB\R2012a\bin\mexsetup.pm**. This script scans the directory **C:\Program Files\MATLAB\R2012a\bin\win64\mexopts** looking for Perl scripts with the extension .stp and running whatever it finds. Each .stp file looks for a particular compiler. After all .stp files have been executed, a list of compilers found gets returned to the user. When the user chooses a compiler, the corresponding .bat file gets copied to the directory returned by MATLAB’s prefdir function. This sets up the compiler for use. All of this is nicely documented in the **mexsetup.pm** file itself.

So, I’ve had my first crack at this and the results are the following two files.

These are crude, and there’s probably lots missing/wrong, but they seem to work. Copy them to **C:\Program Files\MATLAB\R2012a\bin\win64\mexopts**. The location of the compiler is hard-coded in pgi.stp so you’ll need to change the following line if your compiler location differs from mine:

my $default_location = "C:\\Program Files\\PGI\\win64\\12.5\\bin";

Now, when you do **mex -setup**, you should get an entry **PGI Workstation 12.5 64bit 12.5 in C:\Program Files\PGI\win64\12.5\bin**, which you can select as normal.

**An example compilation and some details.**

Let’s use the PGI compiler to build the following very simple mex file, mex_sin.c, which does little more than take the elementwise sine of the input matrix.

#include <math.h>
#include "mex.h"

void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    double *in, *out;
    int rows, cols, outsize;
    int i;

    /* Get pointer to input matrix */
    in = mxGetPr(prhs[0]);

    /* Get rows and columns of input */
    rows = mxGetM(prhs[0]);
    cols = mxGetN(prhs[0]);

    /* Create output matrix */
    outsize = rows * cols;
    plhs[0] = mxCreateDoubleMatrix(rows, cols, mxREAL);

    /* Assign pointer to the output */
    out = mxGetPr(plhs[0]);

    for (i = 0; i < outsize; i++) {
        out[i] = sin(in[i]);
    }
}

Compile using the -v switch to get verbose information about the compilation

mex mex_sin.c -v

You’ll see that the compiled mex file is actually a renamed .dll file that was compiled and linked with the following flags

pgcc -c -Bdynamic -Minfo -fast
pgcc -Mmakedll=export_all -L"C:\Program Files\MATLAB\R2012a\extern\lib\win64\microsoft" libmx.lib libmex.lib libmat.lib

The switch **-Mmakedll=export_all** is actually not supported by PGI, which makes this whole setup doubly unsupported! However, I couldn’t find a way to export the required symbols without modifying the mex source code, so I lived with it. Maybe I’ll figure out a better way in the future. Let’s try the new function out:

>> a=[1 2 3];
>> mex_sin(a)
Invalid MEX-file 'C:\Work\mex_sin.mexw64': The specified module could not be found.

The reason for the error message is that a required PGI .dll file, pgc.dll, is not on my system path, so I need to do the following in MATLAB:

setenv('PATH', [getenv('PATH') ';C:\Program Files\PGI\win64\12.5\bin\']);

This fixes things

>> mex_sin(a)
ans =
    0.8415    0.9093    0.1411

**Performance**

I took a quick look at the performance of this mex function using my quad-core Sandy Bridge laptop. I highly doubted that I was going to beat MATLAB’s built-in sin function (which is highly optimised and multithreaded) with so little work, and I was right:

>> a=rand(1,100000000);
>> tic;mex_sin(a);toc
Elapsed time is 1.320855 seconds.
>> tic;sin(a);toc
Elapsed time is 0.486369 seconds.

That’s not really a fair comparison, though, since I am purposely leaving multithreading out of the PGI mex equation for now. It’s much fairer to compare the exact same mex file built with different compilers, so let’s do that. I created three compiled mex routines from the source code above using the three compilers installed on my laptop and performed a very crude timing test as follows:

>> a=rand(1,100000000);
>> tic;mex_sin_pgi(a);toc    %PGI 12.5 run 1
Elapsed time is 1.317122 seconds.
>> tic;mex_sin_pgi(a);toc    %PGI 12.5 run 2
Elapsed time is 1.338271 seconds.
>> tic;mex_sin_vs(a);toc     %Visual Studio 2008 run 1
Elapsed time is 1.459463 seconds.
>> tic;mex_sin_vs(a);toc     %Visual Studio 2008 run 2
Elapsed time is 1.446947 seconds.
>> tic;mex_sin_intel(a);toc  %Intel Compiler 12.0 run 1
Elapsed time is 0.907018 seconds.
>> tic;mex_sin_intel(a);toc  %Intel Compiler 12.0 run 2
Elapsed time is 0.860218 seconds.

PGI did a little better than Visual Studio 2008 but was beaten by Intel. I’m hoping that I’ll be able to get more performance out of the PGI compiler as I learn more about the compilation flags.

**Getting PGI to make use of SSE extensions**

Part of the output of the **mex mex_sin.c -v** compilation command is the following notice:

mexFunction: 23, Loop not vectorized: data dependency

This notice is a result of the **-Minfo** compilation switch and indicates that the PGI compiler can’t determine whether the **in** and **out** arrays overlap. If they don’t overlap then it would be safe to unroll the loop and make use of SSE or AVX instructions to make better use of my Sandy Bridge processor, which should hopefully speed things up a little.

As the programmer, I am sure that the two arrays don’t overlap, so I need to give the compiler a hand. One way to do this would be to modify the **pgi.dat** file to include the compilation switch **-Msafeptr**, which tells the compiler that arrays never overlap anywhere. Since that may not always be true, I decided to be more cautious and make use of the restrict keyword instead. That is, I changed the mex source code so that

double *in,*out;

becomes

double * restrict in,* restrict out;

Now when I compile using the PGI compiler, the notice from -Minfo becomes

mexFunction:
    23, Generated 3 alternate versions of the loop
        Generated vector sse code for the loop
        Generated a prefetch instruction for the loop

which demonstrates that the compiler is much happier! So, what did this do for performance?

>> tic;mex_sin_pgi(a);toc
Elapsed time is 1.450002 seconds.
>> tic;mex_sin_pgi(a);toc
Elapsed time is 1.460536 seconds.

This is slower than when SSE instructions weren’t being used which isn’t what I was expecting at all! If anyone has any insight into what’s going on here, I’d love to hear from you.

**Future Work**

I’m happy that I’ve got this compiler working in MATLAB but there is a lot to do including:

- Tidy up the pgi.dat and pgi.stp files so that they look and act more professionally.
- Figure out the best set of compiler switches to use; it is almost certain that what I’m using now is sub-optimal since I am new to the PGI compiler.
- Get OpenMP support working. I tried the **-Mconcur** compilation flag, which auto-parallelised the loop, but it crashed MATLAB when I ran it. This needs investigating.
- Get PGI accelerator support working so I can offload work to the GPU.
- Figure out why the SSE version of this function is slower than the non-SSE version.
- Figure out how to determine whether or not the compiler is emitting AVX instructions. The documentation suggests that if the compiler is called on a Sandy Bridge machine, and if vectorisation is possible, then it will produce AVX instructions, but AVX is not mentioned in the output of -Minfo. Nothing changes if you explicitly set the target to Sandy Bridge with the compiler switch **-tp sandybridge-64**.

Look out for more articles on this in the future.

**Related WalkingRandomly Articles**

- Which MATLAB functions make use of multithreading?
- Using Intel’s SPMD Compiler (ispc) with MATLAB on Linux
- Parallel MATLAB with OpenMP mex files
- MATLAB mex functions using the NAG C Library

**My setup**

- Laptop model: Dell XPS L702X
- CPU: Intel Core i7-2630QM @ 2GHz, software overclockable to 2.9GHz. 4 physical cores but 8 virtual cores in total thanks to Hyperthreading.
- GPU: GeForce GT 555M with 144 CUDA cores. Graphics clock: 590MHz. Processor clock: 1180MHz. 3072MB DDR3 memory.
- RAM: 8GB
- OS: Windows 7 Home Premium 64 bit.
- MATLAB: 2012a
- PGI Compiler: 12.5

A friend recently asked me how to optimally tile the shape below on the plane such that no two instances touch.

Off the top of my head I suggested considering the minimal enclosing circle and packing hexagonally, but he thinks one could do better. I thought I’d ask on here to see what others might come up with.

As the resident MATLAB support guy (among other things) at The University of Manchester, I often get asked which language MATLAB is written in. Many people seem to argue over whether or not it is Fortran or C with the prevailing wisdom being ‘It was originally written in Fortran for the free version 1 and then was re-written in C when it went commercial’.

Well, I am not sure about the history but, looking at modern MATLAB, the question should be which **languages**, rather than which **language**! The list I have so far is:

- C (many built in compiled mex functions are written in C)
- C++ (MATLAB makes use of the Boost C++ libraries; thanks to M in the comments for this one)
- CIL (Common Intermediate Language, formerly known as MSIL; the Windows version of MATLAB uses this for various .NET stuff. Thanks to ‘pmcs’ in the comments for this one)
- NVIDIA CUDA (routines in the GPU part of the Parallel Computing Toolbox)
- Bash shell script (On Linux, the startup ‘binary’ is a shell script)
- Fortran (MATLAB uses the MKL and I’m fairly sure this is written in Fortran)
- Java (Many of the tricks in Yair Altman’s excellent book, Undocumented Secrets of MATLAB-Java Programming, make use of the Java elements in MATLAB)
- MATLAB (many MATLAB functions are written in .m code)
- Perl (Many mex-related scripts are written in Perl)
- Windows batch files (I’ve seen some simple .bat files as part of the mex machinery)

Welcome to the slightly delayed May edition of A Month of Math Software. The archives can be found at https://www.walkingrandomly.com/?cat=47 and if you have any news for next month then feel free to contact me.

**Wolfram SystemModeler**

Wolfram Research have released a new product called SystemModeler, a model-based design and simulation tool that occupies the same space as products such as The MathWorks’ Simulink, Maplesoft’s MapleSim, and the free Scilab add-on, Xcos.

**Mathematical libraries**

- The Numerical Algorithms Group (NAG) released a new version of the commercial NAG Library for SMP and multicore. This is a version of their Fortran library which includes a few hundred routines that have been optimised for multicore processors. The latest release, Mark 23, adds an extra 116 routines (some of them have been written up in detail) bringing the total number of functions to 1704. A total of 337 have been parallelised to make use of multicore processors. I played with this library when it was at Mark 22.
- Version 1.2 of Magma, the linear algebra library for NVIDIA GPUs, has been released. The OpenCL version, clMAGMA, has also seen an update in version 0.2.

**General Purpose Mathematics**

- Version 5.0 of Sage was released on 14th May; go to the changelog to see what’s new. Sage is a superb free alternative to Mathematica, Maple or MATLAB based on the Python programming language and over 100 other open source projects. If you’ve never heard of Sage, these videos serve as a good introduction.
- Version 5.27 of Maxima has been released. Maxima is a free computer algebra system that’s available for all major operating systems and it is very capable. Features include symbolic calculus, linear algebra, plotting and arbitrary precision arithmetic. There is a post here on Walking Randomly on how to use it to plot direction fields for 1st order ODEs.
- A new version of SMath Studio, the free Mathcad clone, is now available. Version 0.94.4535 adds a couple of new features to a great program; Linux and Windows versions are available.
- The Euler Math Toolbox (a free, MATLAB-like language) is now at version 15.6. See http://euler.rene-grothmann.de/versions/version-15.html for the new stuff.

**Data Analysis**

- RStudio, the fantastic free Integrated Development Environment (IDE) for the statistical programming language R, has been updated to version 0.96. New features include code folding (click here for a video), lots of new Sweave features (Sweave is a tool that allows you to embed the R code for complete data analyses in LaTeX documents) and more. See this blog post for all of the details.
- Exelis have updated their commercial Interactive Data Language (IDL). See what’s new in version 8.2 at http://www.exelisvis.com/ProductsServices/IDL/VersionUpdate.aspx. I also recently learned of a free implementation of the IDL language, The GNU Data Language, which last saw a release in February of this year.
- A new beta of NCL (NCAR Command Language) is now available. NCL is a free interpreted language designed specifically for scientific data analysis and visualisation. There is an extensive what’s new page which gives the details of version 6.1.0-beta.
- The Perl Data Language (PDL) has seen a minor update to 2.4.11.

**Python**

- Numpy 1.6.2 has been released. Numpy is the fundamental package for scientific computing with Python. This is a bug-fix release with details available in the README.txt file.

**Mobile math software**

- Someone is working on Maxima for Android. I’ve not tried it yet.
- Gnuplot has been ported to Android by Corbin Champion, the developer who is also trying to bring Octave to the Android world (There is still time to donate at the time of writing). Look for AddiPlot in Google Play.