Archive for September, 2013
I support scientific applications at The University of Manchester (see my LinkedIn profile if you’re interested in the details) and part of my job involves working on code written by researchers in a variety of languages. When I say ‘variety’ I really mean it – MATLAB, Mathematica, Python, C, Fortran, Julia, Maple, Visual Basic and PowerShell are some languages I’ve worked with this month for instance.
Having to juggle the semantics of so many languages in my head sometimes leads to momentary confusion when working on someone’s program. For example, I’ve been doing a lot of Python work recently but this morning I was hacking away on someone’s MATLAB code. Buried deep within the program, it would have been very sensible to be able to do the equivalent of this:
a=rand(3,3) a = 0.8147 0.9134 0.2785 0.9058 0.6324 0.5469 0.1270 0.0975 0.9575 >> [x,y,z]=a(:,1) Indexing cannot yield multiple results.
That is, I want to be able to take the first column of the matrix a and broadcast it out to the variables x,y and z. The code I’m working on uses MUCH bigger matrices and this kind of assignment is occasionally useful since the variable names x,y,z have slightly more meaning than a(1,3), a(2,3), a(3,3).
The only concise way I’ve been able to do something like this using native MATLAB commands is to first convert to a cell. In MATLAB 2013a for instance:
>> temp=num2cell(a(:,1)); >> [x y z] = temp{:} x = 0.8147 y = 0.9058 z = 0.1270
This works but I think it looks ugly and introduces conversion overheads. The problem I had for a short time is that I subconsciously expected multiple assignment to ‘Just Work’ in MATLAB since the concept makes sense in several other languages I use regularly.
from pylab import rand a=rand(3,3) [a,b,c]=a[:,0]
a = RandomReal[1, {3, 3}] {x,y,z}=a[[All,1]]
a=rand(3,3); (x,y,z)=a[:,1]
I’ll admit that I don’t often need this construct in MATLAB but it would definitely be occasionally useful. I wonder what other opinions there are out there? Do you think multiple assignment is useful (in any language)?
While on the train to work this morning, I wondered which English words have the most anagrams that are also valid English words. So, I knocked up few lines of Mathematica code and came up with 4 sets of 7:
{{"ates", "east", "eats", "etas", "sate", "seat", "teas"}, {"pares", "parse", "pears", "rapes", "reaps", "spare", "spear"}, {"capers", "crapes", "pacers", "parsec", "recaps", "scrape", "spacer"}, {"carets", "caster", "caters", "crates", "reacts", "recast", "traces"}}
So, according to my program (and Mathematica’s dictionary), the best you can do is 7. I’m not going to post the code until later because I don’t want to influence the challenge which is ‘Write a program in your language of choice that queries a dictionary to find which English words have the most anagrams that are also valid English words.’
Last week I gave a live demo of the IPython notebook to a group of numerical analysts and one of the computations we attempted to do was to solve the following linear system using Numpy’s solve command.
Now, the matrix shown above is singular and so we expect that we might have problems. Before looking at how Numpy deals with this computation, lets take a look at what happens if you ask MATLAB to do it
>> A=[1 2 3;4 5 6;7 8 9]; >> b=[15;15;15]; >> x=A\b Warning: Matrix is close to singular or badly scaled. Results may be inaccurate. RCOND = 1.541976e-18. x = -39.0000 63.0000 -24.0000
MATLAB gives us a warning that the input matrix is close to being singular (note that it didn’t actually recognize that it is singular) along with an estimate of the reciprocal of the condition number. It tells us that the results may be inaccurate and we’d do well to check. So, lets check:
>> A*x ans = 15.0000 15.0000 15.0000 >> norm(A*x-b) ans = 2.8422e-14
We seem to have dodged the bullet since, despite the singular nature of our matrix, MATLAB has able to find a valid solution. MATLAB was right to have warned us though…in other cases we might not have been so lucky.
Let’s see how Numpy deals with this using the IPython notebook:
In [1]: import numpy from numpy import array from numpy.linalg import solve A=array([[1,2,3],[4,5,6],[7,8,9]]) b=array([15,15,15]) solve(A,b) Out[1]: array([-39., 63., -24.])
It gave the same result as MATLAB [See note 1], presumably because it’s using the exact same LAPACK routine, but there was no warning of the singular nature of the matrix. During my demo, it was generally felt by everyone in the room that a warning should have been given, particularly when working in an interactive setting.
If you look at the documentation for Numpy’s solve command you’ll see that it is supposed to throw an exception when the matrix is singular but it clearly didn’t do so here. The exception is sometimes thrown though:
In [4]: C=array([[1,1,1],[1,1,1],[1,1,1]]) x=solve(C,b) --------------------------------------------------------------------------- LinAlgError Traceback (most recent call last) in () 1 C=array([[1,1,1],[1,1,1],[1,1,1]]) ----> 2 x=solve(C,b) C:\Python32\lib\site-packages\numpy\linalg\linalg.py in solve(a, b) 326 results = lapack_routine(n_eq, n_rhs, a, n_eq, pivots, b, n_eq, 0) 327 if results['info'] > 0: --> 328 raise LinAlgError('Singular matrix') 329 if one_eq: 330 return wrap(b.ravel().astype(result_t)) LinAlgError: Singular matrix
It seems that Numpy is somehow checking for exact singularity but this will rarely be detected due to rounding errors. Those I’ve spoken to consider that MATLAB’s approach of estimating the condition number and warning when that is high would be better behavior since it alerts the user to the fact that the matrix is badly conditioned.
Thanks to Nick Higham and David Silvester for useful discussions regarding this post.
Notes
[1] – The results really are identical which you can see by rerunning the calculation after evaluating format long in MATLAB and numpy.set_printoptions(precision=15) in Python
Along with a colleague, I’ve been playing around with Anaconda Python recently and am very impressed with it. At the time of writing, it is at version 1.7 and comes with Python 2.7.5 by default but you can install Python 3.3 using their conda package manager. After you’ve installed Anaconda, just start up a Windows command prompt (cmd.exe) and do
conda update conda conda create -n py33 python=3.3 anaconda
It will chug along for a while, downloading and installing packages before leaving you with a Python 3.3 environment that is completely separated from the default 2.7.5 environment. All you have to do to activate Python 3.3 is issue the following command at the Windows command prompt
activate py33
To demonstrate that the standard anaconda build remains untouched, launch cmd.exe, type ipython and note that you are still using Python 2.7.5
Microsoft Windows [Version 6.1.7601] Copyright (c) 2009 Microsoft Corporation. All rights reserved. C:\Users\testuser>ipython Python 2.7.5 |Anaconda 1.7.0 (64-bit)| (default, Jul 1 2013, 12:37:52) [MSC v.1500 64 bit (AMD64)] Type "copyright", "credits" or "license" for more information. IPython 1.0.0 -- An enhanced Interactive Python. ? -> Introduction and overview of IPython's features. %quickref -> Quick reference. help -> Python's own help system. object? -> Details about 'object', use 'object??' for extra details. In [1]:
Exit out of ipython and activate the py33 environment before launching ipython again. This time, note that you are using Python 3.3.2
In [1]: exit() C:\Users\testuser>activate py33 Activating environment "py33"... [py33] C:\Users\testuser>ipython Python 3.3.2 |Anaconda 1.7.0 (64-bit)| (default, May 17 2013, 11:32:27) [MSC v.1500 64 bit (AMD64)] Type "copyright", "credits" or "license" for more information. IPython 1.0.0 -- An enhanced Interactive Python. ? -> Introduction and overview of IPython's features. %quickref -> Quick reference. help -> Python's own help system. object? -> Details about 'object', use 'object??' for extra details. In [1]: