Scientific programming does not compute

An article in Nature explains why scientists need to pay more attention to basic principles of software engineering.
As a result, codes may be riddled with tiny errors that do not cause the program to break down, but may drastically change the scientific results that it spits out. One such error tripped up a structural-biology group led by Geoffrey Chang of the Scripps Research Institute in La Jolla, California. In 2006, the team realized that a computer program supplied by another lab had flipped a minus sign, which in turn reversed two columns of input data, causing protein crystal structures that the group had derived to be inverted. Chang says that the other lab provided the code with the best intentions, and "you just trust the code to do the right job". His group was forced to retract five papers published in Science, the Journal of Molecular Biology and Proceedings of the National Academy of Sciences, and now triple checks everything, he says.—Zeeya Mirali, Nature 467:775—777 (2010)
Spotted by Shriram Krishnamurthi.



Causal commutative arrows and their optimization

Causal commutative arrows and their optimization, by Hai Liu, Eric Cheng, and Paul Hudak, ICFP 2009.

Extends arrows with two widely used concepts, loop and init. Both concepts arise quite often. Proves a normalisation theorem that reduces any program to a single use of loop, and show that in practice this can yield a speedup of two orders of magnitude. Lovely result!

As the paper notes in passing, 'loop' is closely related to the concept of 'trace' from category theory. Phil Scott is currently visiting Edinburgh, and I'm hoping he, Jeff Eggers, and I might explore the relation betwixt 'trace' and 'loop'.

An compelling example of the value of 'loop' and 'init' is the definition of integral:

integral = loop (arr (\ (v,i) -> i + dt * v) >>> init 0 >>> arr (\ i -> (i,i)))

This definition is valid for a stream of events, where each event is separated uniformly by time dt. An intriguing question is how to define integral in a similar style when the time between events is not uniform? Or for continuous behaviours rather than discrete events?


Low floors, low ceilings, and Alice

Matthias Felleisen just introduced me to the phrase "low floor, low ceiling", which characterises precisely the worry about Kodu I expressed in my previous post: languages with a low overhead when getting started may offer low payback later.

As an example of this effect, he suggested the following critique of the innovative Alice programming environment: Through the Looking Glass: Teaching CS0 with Alice. (Thanks to Shriram Krishnamurthi for the precise reference.) I don't consider Alice in the same bowdlerized category as Kodu, so I was surprised to see that "low floor, low ceiling" might still apply.

Meanwhile, Matthias, Shriram, and others offer Teach Scheme/Reach Java as an alternative, offering a curriculum based on widely-used professional programming languages to high school and middle school children. (Addendum: Matthias and Shriram inform me that the TeachScheme website is out of date, and the preferred alternative is Program by Design.)


Pea vs. Papert; Kodu; Bootstrap

Logo, and following on from it Scratch, Alice, and Kodu, promotes an implicit claim that in the right environment children can learn programming almost by osmosis, and that absorbing this skill improves other skills, such as planning. These implicit claims were challenged twenty-five years ago by Roy D. Pea, who discovered that school children exposed to Logo could still be confused by recursion after writing recursive programs, and that learning Logo did not improve their planning ability. Here is an overall summary, here is a paper specifically on recursion, and here is Pea's reply to an attack by Papert.

I found the second paper (Children's mental models of recursive LOGO programs) particularly interesting, as it helps to understand some of the issues my first-year students may face in learning recursion. I'm woefully ignorant of the literature on the psychology of programming, and need to learn more. Thanks to Matthias Felleisen and Emmanuel Schanzer for pointing me at this work.

What sparked my interest was a keynote on Kodu at POPL, by Matthew MacLaurin of MSR. The laudable goal of Kodu is to permit young children to program video games, as MacLaurin put it, 'to let them know the computer can do anything'. And Kodu can achieve quite spectacular results with very simple programs, but the programming model needs to be very simple to achieve this (a list of condition-action pairs). Because Kodu is so special purpose, I worry that instead of teaching children that 'computer can do anything', it will teach them that 'computers will let you build exactly what they are designed to let you build, and anything else is very difficult'. Pea's studies are orthogonal to this concern, so I'm left wondering whether Kodu will lead to a new generation of programmers or a new generation turned off by computing.

Meanwhile, Schanzer is running Bootstrap, an outgrowth of TeachScheme. "Bootstrap is a standards-based curriculum for middle-school students, which teaches them to program their own videogames using purely algebraic and geometric concepts." The start-up costs are higher than Kodu, but once started students have the full power of Scheme/Racket at their command. Seems preferable to me, but there may well be room for both in the world. More study needed! Not to mention more resources for getting proper computing (not just ICT) into schools at an early age.

This page is powered by Blogger. Isn't yours?