December 6th, 2013 | Categories: Maple, math software | Tags:

A question I get asked a lot is ‘How can I do nonlinear least squares curve fitting in X?’ where X might be MATLAB, Mathematica or a whole host of alternatives.  Since this is such a common query, I thought I’d write up how to do it for a very simple problem in several systems that I’m interested in.

This is the Maple version. For other versions, see the list below.

The problem

You have the following 10 data points

xdata = -2,-1.64,-1.33,-0.7,0,0.45,1.2,1.64,2.32,2.9
ydata = 0.699369,0.700462,0.695354,1.03905,1.97389,2.41143,1.91091,0.919576,-0.730975,-1.42001

and you’d like to fit the function

 F(p_1,p_2,x) = p_1 \cos(p_2 x)+p_2 \sin(p_1 x)

using nonlinear least squares.  Your starting guesses for the parameters are p1=1 and p2=0.2

For now, we are primarily interested in the following results:

  • The fit parameters
  • Sum of squared residuals

Future updates of these posts will show how to get other results such as confidence intervals. Let me know what you are most interested in.

Solution in Maple

Maple’s interface is very user friendly and it uses nonlinear optimization routines from The Numerical Algorithms Group (NAG) under the hood. Here’s the code to get the parameter values and the sum of squares of residuals:

with(Statistics):

xdata := Vector([-2, -1.64, -1.33, -.7, 0, .45, 1.2, 1.64, 2.32, 2.9], datatype = float):
ydata := Vector([.699369, .700462, .695354, 1.03905, 1.97389, 2.41143, 1.91091, .919576,  
  -.730975, -1.42001], datatype = float):
NonlinearFit(p1*cos(p2*x)+p2*sin(p1*x), xdata, ydata, x, initialvalues = [p1 = 1, p2 = .2], 
   output = [parametervalues, residualsumofsquares])

which gives the result

[[p1 = 1.88185090465902, p2 = 0.700229557992540], 0.05381269642]

Various other outputs can be specified in the output vector such as:

  • solutionmodule
  • degreesoffreedom
  • leastsquaresfunction
  • parametervalues
  • parametervector
  • residuals
  • residualmeansquare
  • residualstandarddeviation
  • residualsumofsquares.

The meanings of the above are mostly obvious.  In those cases where they aren’t, you can look them up in the documentation link below.

Further documentation

The full documentation for Maple’s NonlinearFit command is at http://www.maplesoft.com/support/help/Maple/view.aspx?path=Statistics%2fNonlinearFit

Notes

I used Maple 17.02 on 64bit Windows to run the code in this post.

December 6th, 2013 | Categories: Julia, programming | Tags:

A question I get asked a lot is ‘How can I do nonlinear least squares curve fitting in X?’ where X might be MATLAB, Mathematica or a whole host of alternatives.  Since this is such a common query, I thought I’d write up how to do it for a very simple problem in several systems that I’m interested in.

This is the Julia version. For other versions, see the list below.

The problem

You have the following 10 data points

xdata = -2,-1.64,-1.33,-0.7,0,0.45,1.2,1.64,2.32,2.9
ydata = 0.699369,0.700462,0.695354,1.03905,1.97389,2.41143,1.91091,0.919576,-0.730975,-1.42001

and you’d like to fit the function

 F(p_1,p_2,x) = p_1 \cos(p_2 x)+p_2 \sin(p_1 x)

using nonlinear least squares.  Your starting guesses for the parameters are p1=1 and p2=0.2

For now, we are primarily interested in the following results:

  • The fit parameters
  • Sum of squared residuals

Future updates of these posts will show how to get other results such as confidence intervals. Let me know what you are most interested in.

Julia solution using Optim.jl

Optim.jl is a free Julia package that contains a suite of optimisation routines written in pure Julia.  If you haven’t done so already, you’ll need to install the Optim package

Pkg.add("Optim")

The solution is almost identical to the example given in the curve fitting demo of the Optim.jl readme file:

using Optim

model(xdata,p) = p[1]*cos(p[2]*xdata)+p[2]*sin(p[1]*xdata)

xdata = [-2,-1.64,-1.33,-0.7,0,0.45,1.2,1.64,2.32,2.9]
ydata = [0.699369,0.700462,0.695354,1.03905,1.97389,2.41143,1.91091,0.919576,-0.730975,-1.42001]

beta, r, J = curve_fit(model, xdata, ydata, [1.0, 0.2])
# beta = best fit parameters
# r = vector of residuals
# J = estimated Jacobian at solution

@printf("Best fit parameters are: %f and %f\n",beta[1],beta[2])
@printf("The sum of squares of residuals is %f\n",sum(r.^2.0))

The result is

Best fit parameters are: 1.881851 and 0.700230
The sum of squares of residuals is 0.053813

Notes

I used Julia version 0.2 on 64bit Windows to run the code in this post.

December 6th, 2013 | Categories: math software, programming, python | Tags:

A question I get asked a lot is ‘How can I do nonlinear least squares curve fitting in X?’ where X might be MATLAB, Mathematica or a whole host of alternatives.  Since this is such a common query, I thought I’d write up how to do it for a very simple problem in several systems that I’m interested in.

This is the Python version. For other versions, see the list below.

The problem

You have the following 10 data points

xdata = -2,-1.64,-1.33,-0.7,0,0.45,1.2,1.64,2.32,2.9
ydata = 0.699369,0.700462,0.695354,1.03905,1.97389,2.41143,1.91091,0.919576,-0.730975,-1.42001

and you’d like to fit the function

 F(p_1,p_2,x) = p_1 \cos(p_2 x)+p_2 \sin(p_1 x)

using nonlinear least squares.  Your starting guesses for the parameters are p1=1 and p2=0.2

For now, we are primarily interested in the following results:

  • The fit parameters
  • Sum of squared residuals

Future updates of these posts will show how to get other results such as confidence intervals. Let me know what you are most interested in.

Python solution using scipy

Here, I use the curve_fit function from scipy

import numpy as np
from scipy.optimize import curve_fit

xdata = np.array([-2,-1.64,-1.33,-0.7,0,0.45,1.2,1.64,2.32,2.9])
ydata = np.array([0.699369,0.700462,0.695354,1.03905,1.97389,2.41143,1.91091,0.919576,-0.730975,-1.42001])

def func(x, p1,p2):
  return p1*np.cos(p2*x) + p2*np.sin(p1*x)

popt, pcov = curve_fit(func, xdata, ydata,p0=(1.0,0.2))

The variable popt contains the fit parameters

array([ 1.88184732,  0.70022901])

We need to do a little more work to get the sum of squared residuals

p1 = popt[0]
p2 = popt[1]
residuals = ydata - func(xdata,p1,p2)
fres = sum(residuals**2)

which gives

0.053812696547933969
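Since confidence intervals are promised for a future update, here is a taster of one common way to approximate them with scipy. This sketch is my own addition rather than part of the original recipe: curve_fit also returns the covariance matrix pcov, and the square roots of its diagonal estimate the parameter standard errors. The 1.96 multiplier below is a rough normal-theory approximation; a more careful interval would use the t-distribution.

```python
import numpy as np
from scipy.optimize import curve_fit

xdata = np.array([-2, -1.64, -1.33, -0.7, 0, 0.45, 1.2, 1.64, 2.32, 2.9])
ydata = np.array([0.699369, 0.700462, 0.695354, 1.03905, 1.97389,
                  2.41143, 1.91091, 0.919576, -0.730975, -1.42001])

def func(x, p1, p2):
    return p1 * np.cos(p2 * x) + p2 * np.sin(p1 * x)

# Repeat the fit; pcov is the estimated covariance matrix of the parameters
popt, pcov = curve_fit(func, xdata, ydata, p0=(1.0, 0.2))

# Parameter standard errors from the diagonal of pcov
perr = np.sqrt(np.diag(pcov))

# Rough 95% intervals (normal approximation, not a t-interval)
lower = popt - 1.96 * perr
upper = popt + 1.96 * perr
print(popt, perr)
```

Compare the resulting intervals with the exact ones that Mathematica's NonlinearModelFit reports for the same problem; they should be in the same ballpark.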
December 6th, 2013 | Categories: math software, mathematica, programming | Tags:

A question I get asked a lot is ‘How can I do nonlinear least squares curve fitting in X?’ where X might be MATLAB, Mathematica or a whole host of alternatives.  Since this is such a common query, I thought I’d write up how to do it for a very simple problem in several systems that I’m interested in.

This is the Mathematica version. For other versions, see the list below.

The problem

You have the following 10 data points

xdata = -2,-1.64,-1.33,-0.7,0,0.45,1.2,1.64,2.32,2.9
ydata = 0.699369,0.700462,0.695354,1.03905,1.97389,2.41143,1.91091,0.919576,-0.730975,-1.42001

and you’d like to fit the function

 F(p_1,p_2,x) = p_1 \cos(p_2 x)+p_2 \sin(p_1 x)

using nonlinear least squares.  Your starting guesses for the parameters are p1=1 and p2=0.2

For now, we are primarily interested in the following results:

  • The fit parameters
  • Sum of squared residuals

Future updates of these posts will show how to get other results such as confidence intervals. Let me know what you are most interested in.

Mathematica Solution using FindFit
FindFit is the basic nonlinear curve fitting routine in Mathematica

xdata={-2,-1.64,-1.33,-0.7,0,0.45,1.2,1.64,2.32,2.9};
ydata={0.699369,0.700462,0.695354,1.03905,1.97389,2.41143,1.91091,0.919576,-0.730975,-1.42001};

(*Mathematica likes to have the data in the form {{x1,y1},{x2,y2},..}*)
data = Partition[Riffle[xdata, ydata], 2];

FindFit[data, p1*Cos[p2 x] + p2*Sin[p1 x], {{p1, 1}, {p2, 0.2}}, x]

Out[4]:={p1->1.88185,p2->0.70023}

Mathematica Solution using NonlinearModelFit
You can get a lot more information about the fit using the NonlinearModelFit function.

(*Set up data as before*)
xdata={-2,-1.64,-1.33,-0.7,0,0.45,1.2,1.64,2.32,2.9};
ydata={0.699369,0.700462,0.695354,1.03905,1.97389,2.41143,1.91091,0.919576,-0.730975,-1.42001};
data = Partition[Riffle[xdata, ydata], 2];

(*Create the NonlinearModelFit object*)
nlm = NonlinearModelFit[data, p1*Cos[p2 x] + p2*Sin[p1 x], {{p1, 1}, {p2, 0.2}}, x];

The NonlinearModelFit object contains many properties that may be useful to us. Here’s how to list them all:

nlm["Properties"]

Out[10]= {"AdjustedRSquared", "AIC", "AICc", "ANOVATable", \
"ANOVATableDegreesOfFreedom", "ANOVATableEntries", "ANOVATableMeanSquares", \
"ANOVATableSumsOfSquares", "BestFit", "BestFitParameters", "BIC", \
"CorrelationMatrix", "CovarianceMatrix", "CurvatureConfidenceRegion", "Data", \
"EstimatedVariance", "FitCurvatureTable", "FitCurvatureTableEntries", \
"FitResiduals", "Function", "HatDiagonal", "MaxIntrinsicCurvature", \
"MaxParameterEffectsCurvature", "MeanPredictionBands", \
"MeanPredictionConfidenceIntervals", "MeanPredictionConfidenceIntervalTable", \
"MeanPredictionConfidenceIntervalTableEntries", "MeanPredictionErrors", \
"ParameterBias", "ParameterConfidenceIntervals", \
"ParameterConfidenceIntervalTable", \
"ParameterConfidenceIntervalTableEntries", "ParameterConfidenceRegion", \
"ParameterErrors", "ParameterPValues", "ParameterTable", \
"ParameterTableEntries", "ParameterTStatistics", "PredictedResponse", \
"Properties", "Response", "RSquared", "SingleDeletionVariances", \
"SinglePredictionBands", "SinglePredictionConfidenceIntervals", \
"SinglePredictionConfidenceIntervalTable", \
"SinglePredictionConfidenceIntervalTableEntries", "SinglePredictionErrors", \
"StandardizedResiduals", "StudentizedResiduals"}

Let’s extract the fit parameters, 95% confidence intervals and residuals

{params, confidenceInt, res} = 
 nlm[{"BestFitParameters", "ParameterConfidenceIntervals", "FitResiduals"}]

Out[22]= {{p1 -> 1.88185, 
  p2 -> 0.70023}, {{1.8186, 1.9451}, {0.679124, 
   0.721336}}, {-0.0276906, -0.0322944, -0.0102488, 0.0566244, 
  0.0920392, 0.0976307, 0.114035, 0.109334, 0.0287154, -0.0700442}}

The parameters are given as replacement rules. Here, we convert them to pure numbers

{p1, p2} = {p1, p2} /. params

Out[38]= {1.88185,0.70023}

Although only a few decimal places are shown, p1 and p2 are stored in full double precision. You can see this by converting to InputForm

InputForm[{p1, p2}]

Out[43]//InputForm=
{1.8818508498053645, 0.7002298171759191}

Similarly, let’s look at the 95% confidence interval, extracted earlier, in full precision

confidenceInt // InputForm

Out[44]//InputForm=
{{1.8185969887307214, 1.9451047108800077}, 
 {0.6791239458086734, 0.7213356885431649}}

Calculate the sum of squared residuals

resnorm = Total[res^2]

Out[45]= 0.0538127

Notes
I used Mathematica 9 on Windows 7 64bit to perform these calculations

November 26th, 2013 | Categories: math software, mathematica, python, raspberry pi | Tags:

As soon as I heard the news that Mathematica was being made available completely free on the Raspberry Pi, I just had to get myself a Pi and have a play.  So, I bought the Raspberry Pi Advanced Kit from my local Maplin Electronics store, plugged it to the kitchen telly and booted it up.  The exercise made me miss my father because the last time I plugged a computer into the kitchen telly was when I was 8 years old; it was Christmas morning and dad and I took our first steps into a new world with my Sinclair Spectrum 48K.

How to install Mathematica on the Raspberry Pi

Future Raspberry Pis will have Mathematica installed by default but mine wasn’t new enough so I just typed the following at the command line

sudo apt-get update && sudo apt-get install wolfram-engine

On my machine, I was told

The following extra packages will be installed:
  oracle-java7-jdk
The following NEW packages will be installed:
  oracle-java7-jdk wolfram-engine
0 upgraded, 2 newly installed, 0 to remove and 1 not upgraded.
Need to get 278 MB of archives.
After this operation, 588 MB of additional disk space will be used.

So, it seems that Mathematica needs Oracle’s Java and that’s being installed for me as well. The combination of the two is going to use up 588MB of disk space, which makes me glad that I have an 8GB SD card in my pi.

Mathematica version 10!

On starting Mathematica on the pi, my first big surprise was the version number.  I am the administrator of an unlimited academic site license for Mathematica at The University of Manchester and the latest version we can get for our PCs at the time of writing is 9.0.1.  My free pi version is at version 10!  The first clue is the installation directory:

/opt/Wolfram/WolframEngine/10.0

and the next clue is given by evaluating $Version in Mathematica itself

In[2]:= $Version

Out[2]= "10.0 for Linux ARM (32-bit) (November 19, 2013)"

To get an idea of what’s new in 10, I evaluated the following command on Mathematica on the Pi

Export["PiFuncs.dat",Names["System`*"]]

This creates a PiFuncs.dat file containing the list of functions in the System context on the version of Mathematica on the pi. I transferred this over to my Windows PC and imported it into Mathematica 9.0.1 with

pifuncs = Flatten[Import["PiFuncs.dat"]];

Get the list of functions from version 9.0.1 on Windows:

winVer9funcs = Names["System`*"];

Finally, find out what’s in pifuncs but not winVer9funcs

In[16]:= Complement[pifuncs, winVer9funcs]

Out[16]= {"Activate", "AffineStateSpaceModel", "AllowIncomplete", \
"AlternatingFactorial", "AntihermitianMatrixQ", \
"AntisymmetricMatrixQ", "APIFunction", "ArcCurvature", "ARCHProcess", \
"ArcLength", "Association", "AsymptoticOutputTracker", \
"AutocorrelationTest", "BarcodeImage", "BarcodeRecognize", \
"BoxObject", "CalendarConvert", "CanonicalName", "CantorStaircase", \
"ChromaticityPlot", "ClassifierFunction", "Classify", \
"ClipPlanesStyle", "CloudConnect", "CloudDeploy", "CloudDisconnect", \
"CloudEvaluate", "CloudFunction", "CloudGet", "CloudObject", \
"CloudPut", "CloudSave", "ColorCoverage", "ColorDistance", "Combine", \
"CommonName", "CompositeQ", "Computed", "ConformImages", "ConformsQ", \
"ConicHullRegion", "ConicHullRegion3DBox", "ConicHullRegionBox", \
"ConstantImage", "CountBy", "CountedBy", "CreateUUID", \
"CurrencyConvert", "DataAssembly", "DatedUnit", "DateFormat", \
"DateObject", "DateObjectQ", "DefaultParameterType", \
"DefaultReturnType", "DefaultView", "DeviceClose", "DeviceConfigure", \
"DeviceDriverRepository", "DeviceExecute", "DeviceInformation", \
"DeviceInputStream", "DeviceObject", "DeviceOpen", "DeviceOpenQ", \
"DeviceOutputStream", "DeviceRead", "DeviceReadAsynchronous", \
"DeviceReadBuffer", "DeviceReadBufferAsynchronous", \
"DeviceReadTimeSeries", "Devices", "DeviceWrite", \
"DeviceWriteAsynchronous", "DeviceWriteBuffer", \
"DeviceWriteBufferAsynchronous", "DiagonalizableMatrixQ", \
"DirichletBeta", "DirichletEta", "DirichletLambda", "DSolveValue", \
"Entity", "EntityProperties", "EntityProperty", "EntityValue", \
"Enum", "EvaluationBox", "EventSeries", "ExcludedPhysicalQuantities", \
"ExportForm", "FareySequence", "FeedbackLinearize", "Fibonorial", \
"FileTemplate", "FileTemplateApply", "FindAllPaths", "FindDevices", \
"FindEdgeIndependentPaths", "FindFundamentalCycles", \
"FindHiddenMarkovStates", "FindSpanningTree", \
"FindVertexIndependentPaths", "Flattened", "ForeignKey", \
"FormatName", "FormFunction", "FormulaData", "FormulaLookup", \
"FractionalGaussianNoiseProcess", "FrenetSerretSystem", "FresnelF", \
"FresnelG", "FullInformationOutputRegulator", "FunctionDomain", \
"FunctionRange", "GARCHProcess", "GeoArrow", "GeoBackground", \
"GeoBoundaryBox", "GeoCircle", "GeodesicArrow", "GeodesicLine", \
"GeoDisk", "GeoElevationData", "GeoGraphics", "GeoGridLines", \
"GeoGridLinesStyle", "GeoLine", "GeoMarker", "GeoPoint", \
"GeoPolygon", "GeoProjection", "GeoRange", "GeoRangePadding", \
"GeoRectangle", "GeoRhumbLine", "GeoStyle", "Graph3D", "GroupBy", \
"GroupedBy", "GrowCutBinarize", "HalfLine", "HalfPlane", \
"HiddenMarkovProcess", "IgnoringInactive", "ImageApplyIndexed", \
"ImageCollage", "ImageSaliencyFilter", "Inactivate", "Inactive", \
"IncludeAlphaChannel", "IncludeWindowTimes", "IndefiniteMatrixQ", \
"IndexedBy", "IndexType", "InduceType", "InferType", "InfiniteLine", \
"InfinitePlane", "InflationAdjust", "InflationMethod", \
"IntervalSlider", "JuliaSetIterationCount", "JuliaSetPlot", \
"JuliaSetPoints", "KEdgeConnectedGraphQ", "Key", "KeyDrop", \
"KeyExistsQ", "KeyIntersection", "Keys", "KeySelect", "KeySort", \
"KeySortBy", "KeyTake", "KeyUnion", "KillProcess", \
"KVertexConnectedGraphQ", "LABColor", "LinearGradientImage", \
"LinearizingTransformationData", "ListType", "LocalAdaptiveBinarize", \
"LocalizeDefinitions", "LogisticSigmoid", "Lookup", "LUVColor", \
"MandelbrotSetIterationCount", "MandelbrotSetMemberQ", \
"MandelbrotSetPlot", "MinColorDistance", "MinimumTimeIncrement", \
"MinIntervalSize", "MinkowskiQuestionMark", "MovingMap", \
"NegativeDefiniteMatrixQ", "NegativeSemidefiniteMatrixQ", \
"NonlinearStateSpaceModel", "Normalized", "NormalizeType", \
"NormalMatrixQ", "NotebookTemplate", "NumberLinePlot", "OperableQ", \
"OrthogonalMatrixQ", "OverwriteTarget", "PartSpecification", \
"PlotRangeClipPlanesStyle", "PositionIndex", \
"PositiveSemidefiniteMatrixQ", "Predict", "PredictorFunction", \
"PrimitiveRootList", "ProcessConnection", "ProcessInformation", \
"ProcessObject", "ProcessStatus", "Qualifiers", "QuantityVariable", \
"QuantityVariableCanonicalUnit", "QuantityVariableDimensions", \
"QuantityVariableIdentifier", "QuantityVariablePhysicalQuantity", \
"RadialGradientImage", "RandomColor", "RegularlySampledQ", \
"RemoveBackground", "RequiredPhysicalQuantities", "ResamplingMethod", \
"RiemannXi", "RSolveValue", "RunProcess", "SavitzkyGolayMatrix", \
"ScalarType", "ScorerGi", "ScorerGiPrime", "ScorerHi", \
"ScorerHiPrime", "ScriptForm", "Selected", "SendMessage", \
"ServiceConnect", "ServiceDisconnect", "ServiceExecute", \
"ServiceObject", "ShowWhitePoint", "SourceEntityType", \
"SquareMatrixQ", "Stacked", "StartDeviceHandler", "StartProcess", \
"StateTransformationLinearize", "StringTemplate", "StructType", \
"SystemGet", "SystemsModelMerge", "SystemsModelVectorRelativeOrder", \
"TemplateApply", "TemplateBlock", "TemplateExpression", "TemplateIf", \
"TemplateObject", "TemplateSequence", "TemplateSlot", "TemplateWith", \
"TemporalRegularity", "ThermodynamicData", "ThreadDepth", \
"TimeObject", "TimeSeries", "TimeSeriesAggregate", \
"TimeSeriesInsert", "TimeSeriesMap", "TimeSeriesMapThread", \
"TimeSeriesModel", "TimeSeriesModelFit", "TimeSeriesResample", \
"TimeSeriesRescale", "TimeSeriesShift", "TimeSeriesThread", \
"TimeSeriesWindow", "TimeZoneConvert", "TouchPosition", \
"TransformedProcess", "TrapSelection", "TupleType", "TypeChecksQ", \
"TypeName", "TypeQ", "UnitaryMatrixQ", "URLBuild", "URLDecode", \
"URLEncode", "URLExistsQ", "URLExpand", "URLParse", "URLQueryDecode", \
"URLQueryEncode", "URLShorten", "ValidTypeQ", "ValueDimensions", \
"Values", "WhiteNoiseProcess", "XMLTemplate", "XYZColor", \
"ZoomLevel", "$CloudBase", "$CloudConnected", "$CloudDirectory", \
"$CloudEvaluation", "$CloudRootDirectory", "$EvaluationEnvironment", \
"$GeoLocationCity", "$GeoLocationCountry", "$GeoLocationPrecision", \
"$GeoLocationSource", "$RegisteredDeviceClasses", \
"$RequesterAddress", "$RequesterWolframID", "$RequesterWolframUUID", \
"$UserAgentLanguages", "$UserAgentMachine", "$UserAgentName", \
"$UserAgentOperatingSystem", "$UserAgentString", "$UserAgentVersion", \
"$WolframID", "$WolframUUID"}

There we have it, a preview of the list of functions that might be coming in desktop version 10 of Mathematica courtesy of the free Pi version.

No local documentation

On a desktop version of Mathematica, all of the Mathematica documentation is available on your local machine by clicking on Help->Documentation Center in the Mathematica notebook interface.  On the pi version, it seems that there is no local documentation, presumably to keep the installation size down.  You get to the documentation via the notebook interface by clicking on Help->Online Documentation which takes you to http://reference.wolfram.com/language/?src=raspi

Speed vs my laptop

I am used to running Mathematica on high specification machines and so naturally the pi version felt very sluggish, particularly when using the notebook interface.  With that said, I found it very usable for general playing around.  I was curious, however, about the speed of the pi version compared to the version on my home laptop and so created a small benchmark notebook that did three things:

  • Calculate pi to 1,000,000 decimal places.
  • Multiply two 1000 x 1000 random matrices together
  • Integrate sin(x)^2*tan(x) with respect to x

The comparison is going to be against my Windows 7 laptop which has a quad-core Intel Core i7-2630QM. The procedure I followed was:

  • Start a fresh version of the Mathematica notebook and open pi_bench.nb
  • Click on Evaluation->Evaluate Notebook and record the times
  • Click on Evaluation->Evaluate Notebook again and record the new times.

Note that I use the AbsoluteTiming function instead of Timing (detailed reason given here) and I clear the system cache (detailed reason given here).   You can download the notebook I used here.  Alternatively, copy and paste the code below

(*Clear Cache*)
ClearSystemCache[]

(*Calculate pi to 1 million decimal places and store the result*)
AbsoluteTiming[pi = N[Pi, 1000000];]

(*Multiply two random 1000x1000 matrices together and store the \
result*)
a = RandomReal[1, {1000, 1000}];
b = RandomReal[1, {1000, 1000}];

AbsoluteTiming[prod = Dot[a, b];]

(*calculate an integral and store the result*)
AbsoluteTiming[res = Integrate[Sin[x]^2*Tan[x], x];]

Here are the results. All timings in seconds.

Test                  Laptop Run 1   Laptop Run 2   RaspPi Run 1   RaspPi Run 2   Best Pi/Best Laptop
Million digits of Pi  0.994057       1.007058       14.101360      13.860549      13.9434
Matrix product        0.108006       0.074004       85.076986      85.526180      1149.63
Symbolic integral     0.035002       0.008000       0.980086       0.448804       56.1

From these tests, we see that Mathematica on the pi is around 14 to 1149 times slower than on my laptop. The huge difference between the pi and laptop for the matrix product stems from the fact that, on the laptop, Mathematica is using Intel’s Math Kernel Library (MKL).  The MKL is extremely well optimised for Intel processors and will be using all 4 of the laptop’s CPU cores along with extra tricks such as AVX operations.  I am not sure what is being used on the pi for this operation.

I also ran the standard BenchmarkReport[] on the Raspberry Pi.  The results are available here.

Speed vs Python

Comparing Mathematica on the pi to Mathematica on my laptop might have been a fun exercise for me but it’s not really fair on the pi which wasn’t designed to perform against expensive laptops. So, let’s move on to a more meaningful speed comparison: Mathematica on pi versus Python on pi.

When it comes to benchmarking on Python, I usually turn to the timeit module.  This time, however, I’m not going to use it and that’s because of something odd that’s happening with sympy and caching.  I’m using sympy to calculate pi to 1 million places and for the symbolic calculus.  Check out this ipython session on the pi

pi@raspberrypi ~ $ SYMPY_USE_CACHE=no
pi@raspberrypi ~ $ ipython
Python 2.7.3 (default, Jan 13 2013, 11:20:46)
Type "copyright", "credits" or "license" for more information.

IPython 0.13.1 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.

In [1]: import sympy
In [2]: pi=sympy.pi.evalf(100000) #Takes a few seconds
In [3]: %timeit pi=sympy.pi.evalf(100000)
100 loops, best of 3: 2.35 ms per loop

In short, I have asked sympy not to use caching (I think!) and yet it is caching the result.  I don’t want to time how quickly sympy can get a result from the cache so I can’t use timeit until I figure out what’s going on here. Since I wanted to publish this post sooner rather than later I just did this:

import sympy
import time
import numpy

start = time.time()
pi=sympy.pi.evalf(1000000)
elapsed = (time.time() - start)
print('1 million pi digits: %f seconds' % elapsed)

a = numpy.random.uniform(0,1,(1000,1000))
b = numpy.random.uniform(0,1,(1000,1000))
start = time.time()
c=numpy.dot(a,b)
elapsed = (time.time() - start)
print('Matrix Multiply: %f seconds' % elapsed)

x=sympy.Symbol('x')
start = time.time()
res=sympy.integrate(sympy.sin(x)**2*sympy.tan(x),x)
elapsed = (time.time() - start)
print('Symbolic Integration: %f seconds' % elapsed)

Usually, I’d use time.clock() to measure things like this but something *very* strange is happening with time.clock() on my pi–something I’ll write up later.  In short, it didn’t work properly and so I had to resort to time.time().

Here are the results:

1 million pi digits: 5535.621769 seconds
Matrix Multiply: 77.938481 seconds
Symbolic Integration: 1654.666123 seconds

The result that really surprised me here was the symbolic integration since the problem I posed didn’t look very difficult. Sympy on pi was thousands of times slower than Mathematica on pi for this calculation!  On my laptop, the calculation times between Mathematica and sympy were about the same for this operation.

That Mathematica beats sympy for 1 million digits of pi doesn’t surprise me too much since I recall attending a seminar a few years ago where Wolfram Research described how they had optimized the living daylights out of that particular operation. Nice to see Python beating Mathematica by a little bit in the linear algebra though.
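Returning to the caching mystery above: one possible explanation, and it is only my assumption, is that typing SYMPY_USE_CACHE=no on its own at a bash prompt sets a local shell variable; without export it never reaches the ipython process, so sympy still caches. Setting the variable from within Python before sympy is first imported side-steps the shell entirely:

```python
import os

# sympy appears to read SYMPY_USE_CACHE at import time, so the variable must
# be in the environment before the first 'import sympy'. Typing
# SYMPY_USE_CACHE=no at a bash prompt without 'export' only sets a local
# shell variable, which child processes such as ipython never see.
os.environ['SYMPY_USE_CACHE'] = 'no'

# import sympy   # uncommented in a real session, after the line above
print(os.environ['SYMPY_USE_CACHE'])
```

Equivalently, at the shell you would write export SYMPY_USE_CACHE=no before launching ipython.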

 

October 24th, 2013 | Categories: programming, Science, Scientific Software, software deployment | Tags:

One of my favourite parts of my job at The University of Manchester is the code optimisation service that my team provides to researchers there.  We take code written in languages such as  MATLAB, Python, Mathematica and R and attempt (usually successfully) to make them go faster.  It’s sort of a cross between training and consultancy, is a lot of fun and we can help a relatively large number of people in a short time.  It also turns out to be very useful to the researchers as some of my recent testimonials demonstrate.

Other researchers, however, need something more.  They already have a nice piece of code with several users and a selection of papers.  They also have a bucket-load of ideas about how to turn this nice code into something amazing.  They have all this good stuff and yet they find that they are struggling to get their code to the next level.  What these people need is some quality time with a team of research software engineers.

Enter the Software Sustainability Institute (SSI), an organisation that I have been fortunate enough to have a fellowship with throughout 2013.  These guys are software-development ninjas, have extensive experience with working with academic researchers and, right now, they have an open call for projects.  It’s free to apply and, if your application is successful, all of their services will be provided for free.  If you’d like an idea of the sort of projects they work on, take a look at the extensive list of past projects.

So, if you are a UK-based researcher who has a software problem, and no one else can help, maybe you can hire* the SSI team.

*for free!


October 10th, 2013 | Categories: Julia, matlab, programming, python | Tags:

From the wikipedia page on Division by Zero: “The IEEE 754 standard specifies that every floating point arithmetic operation, including division by zero, has a well-defined result”.

MATLAB supports this fully:

>> 1/0
ans =
   Inf
>> 1/(-0)
ans =
  -Inf
>> 0/0
ans =
   NaN

Julia is almost there, but doesn’t handle the signed zero correctly (this is using Version 0.2.0-prerelease+3768 on Windows)

julia> 1/0
Inf

julia> 1/(-0)
Inf

julia> 0/0
NaN

Python throws an exception. (Python 2.7.5 using IPython shell)

In [4]: 1.0/0.0
---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)
 in ()
----> 1 1.0/0.0

ZeroDivisionError: float division by zero

In [5]: 1.0/(-0.0)
---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)
 in ()
----> 1 1.0/(-0.0)

ZeroDivisionError: float division by zero

In [6]: 0.0/0.0
---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)
 in ()
----> 1 0.0/0.0

ZeroDivisionError: float division by zero
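NumPy is worth a mention here: unlike core Python, its floating point types follow IEEE 754 and return inf, -inf and nan, emitting a RuntimeWarning instead of raising an exception (np.errstate can silence the warning). A quick sketch:

```python
import numpy as np

# Element-wise float division follows IEEE 754: no exception is raised.
# np.errstate suppresses the divide-by-zero and invalid-value warnings.
with np.errstate(divide='ignore', invalid='ignore'):
    res = np.array([1.0, 1.0, 0.0]) / np.array([0.0, -0.0, 0.0])

print(res)  # the three elements are inf, -inf and nan
```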

Update:
Julia does do things correctly, provided I make it clear that I am working with floating point numbers:

julia> 1.0/0.0
Inf

julia> 1.0/(-0.0)
-Inf

julia> 0.0/0.0
NaN
October 9th, 2013 | Categories: Android, Mobile Mathematics | Tags:

I got a Samsung Galaxy Note 3 yesterday and, since I’m so interested in compute performance, I ran various benchmarks on it.  Here are the results.

AnTuTu Benchmark Overall score was 35,637.  The screenshot below shows the comparison, given by the App, between my device and the Samsung Galaxy note 2 (my previous phone).

AnTuTu on Note3

Linpack Pro for Android – This was the app I used when I compared mobile phones to 1980s supercomputers back in 2010.  My phone at the time, a HTC Hero, managed just 2.3 Megaflops.  The Samsung Note 3, on the other hand, managed as much as 1074 Mflops. That’s quite an increase over the space of just 3 years!  I found that the results of this app are quite inconsistent with individual runs ranging from 666 to 1074 Mflops.

RgbenchMM – I’ve written about this benchmark, developed by Rahul Garg, before.  It gives more consistent results than Linpack Pro and my Note 3 managed as much as 4471 Mflops!

Notes

  • The device was plugged in to the mains at the time of performing the benchmarks.  I rebooted the device after running each one.
  • There are at least two types of Note 3 in circulation that I know of – a quad-core and an octa-core.  According to CPU-Z, mine has a quad-core Qualcomm Snapdragon 800 with a top speed of 2.27 GHz.
  • Samsung have been accused of benchmark cheating by some.  See, for example, this post from The Register.
September 25th, 2013 | Categories: Julia, mathematica, matlab, programming, python | Tags:

I support scientific applications at The University of Manchester (see my LinkedIn profile if you’re interested in the details) and part of my job involves working on code written by researchers in a variety of languages.  When I say ‘variety’ I really mean it – MATLAB, Mathematica, Python, C, Fortran, Julia, Maple, Visual Basic and PowerShell are some languages I’ve worked with this month for instance.

Having to juggle the semantics of so many languages in my head sometimes leads to momentary confusion when working on someone’s program.  For example, I’ve been doing a lot of Python work recently but this morning I was hacking away on someone’s MATLAB code.  Buried deep within the program, it would have been very sensible to be able to do the equivalent of this:

a=rand(3,3)

a =
    0.8147    0.9134    0.2785
    0.9058    0.6324    0.5469
    0.1270    0.0975    0.9575

>> [x,y,z]=a(:,1)

Indexing cannot yield multiple results.

That is, I want to be able to take the first column of the matrix a and broadcast it out to the variables x, y and z. The code I’m working on uses MUCH bigger matrices and this kind of assignment is occasionally useful since the variable names x, y, z have slightly more meaning than a(1,1), a(2,1), a(3,1).

The only concise way I’ve been able to do something like this using native MATLAB commands is to first convert to a cell. In MATLAB 2013a for instance:

>> temp=num2cell(a(:,1));
>> [x y z] = temp{:}

x =
    0.8147

y =
    0.9058

z =
    0.1270

This works but I think it looks ugly and introduces conversion overheads. The problem I had for a short time is that I subconsciously expected multiple assignment to ‘Just Work’ in MATLAB since the concept makes sense in several other languages I use regularly.

Python:

from pylab import rand
a=rand(3,3)
[x,y,z]=a[:,0]

Mathematica:

a = RandomReal[1, {3, 3}]
{x,y,z}=a[[All,1]]

Julia:

a=rand(3,3);
(x,y,z)=a[:,1]

I’ll admit that I don’t often need this construct in MATLAB but it would definitely be occasionally useful. I wonder what other opinions there are out there? Do you think multiple assignment is useful (in any language)?
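For what it’s worth, the Python version is easy to check with a deterministic matrix (numpy assumed; the star-unpacking form at the end is my addition and needs Python 3):

```python
import numpy as np

# A deterministic 3x3 matrix so the unpacked values are predictable:
# [[0, 1, 2],
#  [3, 4, 5],
#  [6, 7, 8]]
a = np.arange(9.0).reshape(3, 3)

# Broadcast the first column out to three named variables
x, y, z = a[:, 0]

# Python 3 star-unpacking: first element and 'the rest'
first, *rest = a[:, 0]
print(x, y, z, first, rest)
```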

September 18th, 2013 | Categories: Problem of the week, programming | Tags:

While on the train to work this morning, I wondered which English words have the most anagrams that are also valid English words. So, I knocked up a few lines of Mathematica code and came up with 4 sets of 7:

{{"ates", "east", "eats", "etas", "sate", "seat", "teas"},
{"pares", "parse", "pears", "rapes", "reaps", "spare", "spear"}, 
{"capers", "crapes", "pacers", "parsec", "recaps", "scrape", "spacer"},
{"carets", "caster", "caters", "crates", "reacts", "recast", "traces"}}

So, according to my program (and Mathematica’s dictionary), the best you can do is 7.  I’m not going to post the code until later because I don’t want to influence the challenge which is ‘Write a program in your language of choice that queries a dictionary to find which English words have the most anagrams that are also valid English words.’