## Archive for the ‘CUDA’ Category

**Updated January 4th 2011**

It is becoming increasingly common for programmers to make use of GPUs (Graphics Processing Units) to speed up their programs substantially. There are three major low-level programming frameworks that allow you to do this in languages such as C: CUDA, OpenCL and Microsoft DirectCompute. Of these three, CUDA is the most developed, but it only works on NVIDIA graphics cards.

I am often asked if the major commercial math packages support GPU computing, and I find myself writing the same summary email over and over again. So, here is a very brief breakdown of what is currently on offer. I plan to expand the information on this page over time, so if you have any information about GPU computing in these packages then let me know.

**MATLAB**

Core MATLAB contains no support for GPU computing, but several organizations (including The MathWorks themselves) have produced add-on toolboxes that provide it:

- Jacket – This is a product from a company called AccelerEyes and is possibly the most advanced and well developed GPU solution for MATLAB currently available. As of version 2.0 it supports both OpenCL and CUDA frameworks.
- The MathWorks’ Parallel Computing Toolbox (PCT) – If you want to do your MATLAB GPU computing the officially supported way then this is the product you need. As a bonus, it also allows you to make better use of the multicore processor that almost certainly resides in your machine. Like many of the offerings on this page, only the CUDA framework is supported, so you are out of luck if you don’t have an NVIDIA graphics card. Even if you do have an NVIDIA graphics card, you might still be out of luck since the PCT only supports cards that have compute level 1.3 or above (i.e. only cards that support double precision).
- CULA – This is a set of GPU-accelerated linear algebra libraries utilizing the NVIDIA CUDA parallel computing architecture, and it has a MATLAB interface.
- GPUmat – This product is **completely free** but is less developed than the commercial offerings above. Again, it is CUDA only.
- OpenCL toolbox – The only OpenCL solution for MATLAB I could find. It is free but development seems to have stalled.

**Mathematica**

Mathematica 8 has support for both CUDA and OpenCL built in, so there is no need for any add-ons. Furthermore, it supports both single and double precision GPUs, so you can experiment with GPU computing on older, cheaper cards.

**Maple**

Maple has had some CUDA-only GPU support since version 14. On the face of it, the CUDA package only appears to contain one accelerated function – matrix-matrix multiplication – but when you load it, many functions that use matrix-matrix multiplication internally are accelerated too. I’ve never found a definitive list of such functions though.

**Mathcad**

Mathcad 15 and Mathcad Prime have no support for GPU enhanced computing.

Christmas isn’t all that far away, so I thought that it was high time that I wrote my Christmas list for mathematical software developers and vendors. All I want for Christmas is…

### Mathematica

- A built-in ternary plot function would be nice
- Ship Workbench with the main product please
- An iPad version of Mathematica Player

### MATLAB

- Merge the Parallel Computing Toolbox with core MATLAB. Everyone uses multicore these days but only a few can feel the full benefit in MATLAB. The rest are essentially second-class MATLAB citizens muddling by with a single core (most of the time).
- Make the MEX interface thread safe so I can more easily write parallel MEX files

### Maple

- More CUDA accelerated functions please. I was initially excited by your CUDA package but then discovered that it only accelerated one function (Matrix Multiply). CUDA accelerated Random Number Generators would be nice along with fast Fourier transforms and a bit more linear algebra.

### Mathcad

- Release Mathcad Prime.
- Mac and Linux versions of Mathcad. Maple, Mathematica and MATLAB have versions for all 3 platforms so why don’t you?

### NAG Library

- Produce vector versions of functions like g01bk (Poisson distribution function). They might not be needed in Fortran or C code but your MATLAB toolbox desperately needs them.
- A Mac version of the MATLAB toolbox. I’ve got users practically begging for it :)
- A NAG version of the MATLAB gamfit command

### Octave

- A just in time compiler. Yeah, I know, I don’t ask for much huh ;)
- A faster pdist function (statistics toolbox from Octave Forge). I recently discovered that the current one is rather slow.

### SAGE Math

- A Locator control for the interact function. I still have a bounty outstanding for the person who implements this.
- A fully featured, native Windows version. I know about the VM solution and it isn’t suitable for what I want to do (which is to deploy it on around 5000 University Windows machines to introduce students to one of the best open source maths packages).

### SMath Studio

- An Android version please. Don’t make it free – you deserve some money for this awesome Mathcad alternative.

### SpaceTime Mathematics

- The fact that you give the Windows version away for free is awesome, but registration is a pain when you are dealing with mass deployment. I’d love to deploy this to my University’s Windows desktop image but the per-machine registration requirement makes it difficult. Most large developers who require registration come up with an alternative mechanism for enterprise-wide deployment. You ask schools with more than 5 machines to link back to you. I want to put it on a few thousand machines and I would happily link back to you from several locations if you’ll help me with some sort of volume license. I’ll also give internal (and external if anyone is interested) seminars at Manchester on why I think SpaceTime is useful for teaching mathematics. Finally, I’d encourage other UK University applications specialists to evaluate the software too.
- An Android version please.

How about you? What would you ask your favourite mathematical software developers for this Christmas?

One of the new features in MATLAB 2010b that’s getting me very excited is the CUDA-based GP-GPU (General Purpose computation on Graphics Processing Units) integration that’s become available in the Parallel Computing Toolbox. As soon as I had MATLAB 2010b installed on my CUDA-capable laptop (Dell XPS M1330 with a GeForce 8400M GS) running Ubuntu, I wanted to try out as much of this new functionality as my low-end hardware would allow. I’ve installed and played with CUDA on this machine in the past, so I fired up MATLAB 2010b and issued the following command to ask MATLAB how many CUDA-enabled devices it thought I had on my system:

    gpuDeviceCount
    ??? Error using ==> feval
    The CUDA driver was found, but it is too old. The CUDA driver on your system is version: 3. The required CUDA version is: 3.1 or greater.

The practical upshot of the above error message is that I needed to upgrade my NVIDIA graphics driver, which was at version 195.36.24. I went for the latest version which, at the time of writing, is version 256.53. I did this from the **NVIDIA-Linux-x86-256.53.run** file, which I got directly from NVIDIA, and all I’ll say about the process is that it ruined an otherwise perfectly good Sunday morning. I did get there in the end though!

So, I had the shiniest version of the graphics driver up and running. Time to fire up MATLAB again:

    >> gpuDeviceCount
    Warning: The device selected (device 1, "GeForce 8400M GS") does not have sufficient compute capability to be used. Compute capability 1.3 (or greater) is required, the selected device has compute capability 1.1.
    > In deviceCount at 7
      In GPUDevice.GPUDevice>GPUDevice.count at 27
      In gpuDeviceCount at 10

Or, to put it another way, “**Please insert new laptop to continue.**” With that, my MATLAB/CUDA experiments are brought to an end.
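Both of the messages quoted above boil down to comparing a dotted version number against a minimum. Here is a minimal sketch of that comparison, written in Python rather than MATLAB. It is my own illustration and not how the Parallel Computing Toolbox actually performs the check; the function names `parse_version` and `meets_minimum` are hypothetical, and I am assuming that a bare "3" means "3.0". Comparing integer tuples avoids the trap of a float or string comparison, which would wrongly rank "1.10" below "1.3".

```python
def parse_version(text):
    """Turn a dotted version string such as "1.3" or "3" into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def meets_minimum(found, required):
    """Return True if version `found` is at least `required`.

    Assumption: a shorter version is padded with zeros, so "3" is read as "3.0".
    """
    a, b = parse_version(found), parse_version(required)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


# Values taken from the two MATLAB messages quoted in this post.
driver_ok = meets_minimum("3", "3.1")    # CUDA driver 3 vs. required 3.1
device_ok = meets_minimum("1.1", "1.3")  # GeForce 8400M GS vs. required 1.3
print(driver_ok, device_ok)              # prints: False False
```

The driver problem could be fixed with an upgrade; the compute capability is a property of the hardware, so no amount of software updating will turn the second `False` into `True`.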

Can anyone recommend a reasonably priced laptop that contains a CUDA capable graphics card at compute level 1.3 or above?

**Update (16th September 2010):** Several people have emailed me to defend The MathWorks’ design decision so I’d like to make something very clear: I completely agree with The MathWorks in their insistence on CUDA compute level 1.3 or above. As one correspondent pointed out, this ensures that not only does the hardware support double precision but it is also IEEE-compliant, and IEEE-compliance is a good thing! This blog post was never meant to criticize The MathWorks over this; I wrote it partly to ensure that more people are aware of the requirements and partly because I needed to vent over the sudden obsolescence of my relatively new laptop.