Linting Condor Submission Scripts

July 19th, 2012 | Categories: Condor, Guest posts, programming

This is a guest post by colleague and friend, Ian Cottam of The University of Manchester.

I recently implemented a lint program for Condor (unsurprisingly called: condor_lint), analogous to the famous one for C of days of yore. That is, it points out the fluff in a Condor script and suggests improvements. It is based on our local knowledge of how our Condor Pool is set up, here in Manchester, and also reflects recent changes to Condor.
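As a rough illustration of the idea (and not the actual rules condor_lint applies, which encode local knowledge of the Manchester pool), a lint-style checker can be as simple as scanning the submit description file for a checklist of keywords and printing advice about anything that is missing. The checks, messages and script below are hypothetical examples in Python:

    #!/usr/bin/env python
    # Illustrative sketch only: a few generic checks on a Condor submit
    # description file. The real condor_lint encodes local knowledge of
    # the Manchester pool; the rules and wording below are made up.
    import sys

    ADVICE = {
        "log": "no 'log' line: you will have no job event log to debug with",
        "notification": "consider 'notification = Never' to avoid e-mail floods from large clusters",
        "request_memory": "no 'request_memory': the matchmaker cannot see your job's real needs",
    }

    def lint(path):
        # Record which submit-description keywords actually appear.
        seen = set()
        with open(path) as f:
            for line in f:
                line = line.strip()
                if not line or line.startswith("#") or "=" not in line:
                    continue
                seen.add(line.split("=", 1)[0].strip().lower())
        # Report anything on our (hypothetical) checklist that is missing.
        problems = [msg for key, msg in ADVICE.items() if key not in seen]
        for msg in problems:
            print("condor_lint: %s" % msg)
        return len(problems)

    if __name__ == "__main__":
        sys.exit(1 if lint(sys.argv[1]) else 0)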

Why I did it is interesting and may have wider applicability. Everything it reports is already written up on our extensive internal web site, which users rarely read. I suspect the usual modus operandi of our users is to find, or be given, a Condor script relevant to their domain and make the minimum modifications that make it ‘work’. Subsequently, its basic structure is never updated (apart from referencing new data files, etc.).

To be fair, that’s what we all do — is it not?

Ignoring our continually updated documentation means that a user’s job may make poor use of the Condor Pool, affecting others and costing real money (in wasted energy) through what we call “bad throughput”.

Now, although we always run the user’s job if Condor’s basic condor_submit command accepts it, we first automatically run condor_lint. This directly tells them any “bad news” and also, in many cases, gives them the link to the specific web page that explains the issue in detail.
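For illustration, the submission flow might be wrapped along these lines: a hypothetical Python wrapper that runs condor_lint first, points the user at the documentation if it has anything to say, and then always passes the file on to the real condor_submit. Only the command names are taken from the post; the exit-code convention and wording are assumptions, not a description of our actual setup.

    #!/usr/bin/env python
    # Hypothetical wrapper showing the submission flow described above:
    # run condor_lint first, point the user at the documentation if it
    # complains, then always hand the file to the real condor_submit.
    # The exit-code convention for condor_lint is an assumption.
    import subprocess
    import sys

    submit_file = sys.argv[1]

    # The lint pass prints its advice straight to the user's terminal.
    if subprocess.call(["condor_lint", submit_file]) != 0:
        print("See the local Condor web pages for details of the issues above.")

    # The job is always submitted if condor_submit itself accepts the file.
    sys.exit(subprocess.call(["condor_submit", submit_file]))

In practice a wrapper like this would presumably be installed in place of (or aliased to) condor_submit, so that users get the advice without changing their habits.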

Clearly, even such “in your face” advice can still be ignored, but we are starting to see improvements.

Obviously such an approach is not limited to Condor, and we would be interested in hearing of “lint approaches” with other systems.

Links

Other WalkingRandomly posts by Ian Cottam

  1. July 20th, 2012 at 18:12

    Mike –

    This is reminiscent of what we’ve done with the MATLAB Code Analyzer (formerly, MLint). A quick review of the history of how we’ve encouraged writing good MATLAB code sheds some light on ways to approach getting folks to write better code:

    * Embed best practices and tips in the documentation.
    This is a great start, as it makes information available on how to code well. Unfortunately, the people who need it most may be the least likely to read it!

    * Embed best practices and tips in a Lint-style code checker report, which people can run any time they want (if they know about it).
    Much better, since you get advice specific to the code that you are writing. The ability to generate reports is also helpful in gauging how well an organization as a whole is coding. The same problem occurs, though, in that most users didn’t discover the ability to run this report (and, again, the ones you really want to see it are the least likely to).

    * Embed code checker directly in the editor.
    This was a great step forward, as it drastically increased the visibility of the code analyzer. I’d often run into other MATLAB users who would take great pride in making sure their files were all “green” (no code analyzer warnings). This is just great – almost makes coding well a fun game! Still two problems, though: 1) hard to make sense out of cryptic (I mean, “precise”) 1-liners and to know what action to take. 2) A whole bunch of users totally ignored the squiggles and glowing boxes. We found that users who didn’t understand the concept of code analysis often didn’t put the energy into figuring out what these things meant. At least nobody seemed to mind them being there.

    * Offer to automatically fix issues.
    People love this one. Not only do we tell them what’s wrong, but we offer to fix it for them with a simple click. It’s almost magic seeing your code transformed into something better.

    * Include additional information along with 1-liners flagging issues.
    This was another big step forward – instead of trying to come up with the best single sentence to communicate what the issue is, we can put together a few paragraphs explaining the issue and recommending what to do about it. This helped a lot more users understand the value of those orange squiggles.

    Each one of these additions continued to chip away at the problem. It’s definitely an art to try to find the right amount of intrusiveness for a given class of users working on a given class of problems. You are justified in taking a slightly more invasive approach to chatting with your users about the issues in their code than we’d be justified in doing for most MATLAB users most of the time. That said, I’m all ears if there are great ideas floating around about how to help MATLAB users write better code (without them knowing that they were being helped!).

    Cheers,
    – scott

  2. Ian Cottam
    July 24th, 2012 at 15:28

    Thanks Scott: very informative and in agreement with our way of thinking!
    (I’m not a MATLAB man myself, but as you know many people at our University are.)
    cheers
    -Ian