When software licensing cripples scientific throughput
I work for The University of Manchester where, among other things, I assist in the support of various high-performance and high-throughput computing systems. Exchanges such as the following are, sadly, becoming all too commonplace:
Researcher: “Hi, I have an embarrassingly parallel research problem that needs a lot of compute resource. Can you help?”
Support: “Almost certainly. You could have access to our 2500-core Condor pool, or maybe our 2000-core HPC system, or any number of smaller systems depending on the department you are in. Let’s meet to discuss your requirements in more detail.”
Researcher: “Sounds great. I am using [insert expensive commercial package here], could we install that on your systems?”
Support: “Not unless you pay a HUGE amount of money because you’ll need dozens or maybe hundreds of licenses. The licenses will cost more than our machines! Could you use [insert open source equivalent here] instead?”
Researcher: “A member of your team suggested that about 2 years ago but [insert expensive commercial package here] is easier to use, looks pretty and a single license didn’t seem all that expensive. It’ll take me ages to convert to [insert open source equivalent here]. Instead of splitting the job up and spreading it around lots of machines, can’t I just run it on a faster machine?”
Support: “Sorry but parallelism is the only real game in town when it comes to making stuff faster these days. I’m afraid that you’ll have to convert to [insert open source equivalent here], open your chequebook or wait until 2076 for your results to complete on your workstation.”
The moral of the story is that if you want your compute to scale, you need to ensure that your licensing scales too.
This is definitely an argument for more flexible licensing. If vendors define a “system” as a single core rather than as the set of machines carrying out a job, then hundreds of thousands of people using dual- or quad-core computers are probably breaking license terms every day. Whilst I’m in favour of abandoning the traditional license model altogether, I’m fairly doubtful that will happen; in the meantime, licenses need to be clarified as covering a system carrying out a single job made up of tasks, which may be run in parallel without limit.
Besides agreeing with the article, I’d add that there are some programs which simply have no open source equivalent, e.g. PSE’s gPROMS.
For MATLAB I’ve gotten around this by using the MATLAB Compiler.
“For MATLAB I’ve gotten around this by using the MATLAB Compiler.”
Me too! It forms the basis of running MATLAB on our Condor pool, for instance.
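For anyone wondering what the MATLAB Compiler route looks like in practice, here is a minimal sketch; the function name myjob and the argument chunk_id are made up for illustration. You compile the function once, on a machine that has a Compiler license, and the resulting standalone executable runs on the Condor nodes against the freely redistributable MATLAB Component Runtime (MCR) instead of checking out a MATLAB license on every node.

% myjob.m – hypothetical per-task function; each Condor job runs one chunk.
% Compiled executables receive their arguments as strings, hence str2double.
function myjob(chunk_id)
  n = str2double(chunk_id);
  data = rand(1e6, 1);                          % stand-in for the real workload
  result = sum(data);
  save(sprintf('result_%d.mat', n), 'result');  % one output file per chunk
end

% At the MATLAB prompt (needs the Compiler license once, on one machine):
mcc -m myjob.m

The Condor submit description then just transfers the compiled executable (and either ships the MCR or points at a pre-installed copy) and passes a different chunk_id to each job; no further MATLAB licenses are consumed however many jobs run at once.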
I am trapped in Mathematica. I love the expressiveness, hate the slow speed, hate the slow interface with C++ and Java, hate the fact that many of my friends cannot run my programs. Poor me. – Hein (http://artent.net/blog/)
Being an open-source user myself, I’ll tell you: specialized open-source software, at least in my field (adaptive optics / control), is crappy and nowhere near commercial tools. Octave is a joke: no profiler, ugly error messages (have you ever tried to debug anything bigger than a 4-liner in Octave!?), a huge number of bugs (imread cannot load TIFF – are you kidding me?!)… Scilab is a mess, and Scicos (Xcos?) cannot perform the simplest control simulation from a standard Control Theory course (it breaks down with an algebraic loop error). Not to mention that the interface of Xcos is the ugliest I’ve ever seen.
Don’t get me wrong, but I don’t have time to waste on open source. If I need a license for commercial software, I will persuade my University to buy it. The time wasted on Octave just isn’t worth it. I tolerate Linux on my desktop, and use Debian to run mostly MATLAB and LaTeX. I use Linux not because it is the best, but because everything else is even worse (Win8, anyone?).
@Virens.
Octave does have a profiler: http://www.gnu.org/software/octave/doc/interpreter/Profiler-Example.html
imread does support TIFF according to this: http://www.gnu.org/software/octave/doc/interpreter/Loading-and-Saving-Images.html#doc-imread
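For what it’s worth, here is a minimal Octave sketch of both points; the workload and the file name example.tif are just placeholders, and reading the TIFF assumes your Octave build includes image format support:

% Profiler (documented at the link above):
profile on;
x = rand(1000);                  % placeholder computation to measure
y = fft(x);
profile off;
profshow(profile('info'));       % table showing where the time went

% Reading a TIFF:
img = imread('example.tif');     % assumes example.tif exists locally
size(img)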