Sunday 13 April 2008

Summer of Code 2007 results and experience

Inferno projects in 2007 ended up within the Plan 9 from Bell Labs organisation. You can read about it in last year's blog. I was mentor for three of those projects: SPKI infrastructure for Inferno (Katie Reynolds/katelyn); Venti-like system in Limbo for Inferno, with added Rabin fingerprinting (Mechiel Lukkien/mjl); and a port of Inferno to the Nintendo DS (Noah Evans). The students were all talented, which was just as well, since otherwise acting as mentor for three projects (and helping a bit on some others) would have been quite impossible. The projects were modular, so that timing and expectations could be adjusted fairly easily as the summer wore on, and there was something to show for it all early on.

Here is an edited version of a post I made elsewhere just after the programme ended, describing our experience of GSoC 2007.

I had three students to mentor and they all produced work that is being included in the organisation's distributions. One of the projects (SPKI) finally produced code for some ideas I had originally intended to implement three years ago, but had to put aside for lack of time. That success in turn is leading to a significant change to ancient code and mechanisms in one of our systems, mainly by deleting code from its kernels. (So in a way, for us it was the Google Summer of Anti-code, which seems good to me.)

Those three projects were all quite hard. The SPKI one required installing two related but different operating systems and a large application suite, and then writing code in both C (for one part) and a concurrent programming language the student had never seen before (Limbo, for other parts). Another project required writing an archival storage subsystem broadly based on an existing design (Venti) but including some new techniques. In the third project, the student got the Inferno operating system settled as a native kernel on a new platform — first on an emulator, then on the hardware — without previous experience of doing kernel ports.
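
The point of the Rabin fingerprinting in that storage project is worth a word: block boundaries are chosen from the data itself rather than at fixed offsets, so a small insertion near the start of a file does not shift and invalidate every block after it. The following is a minimal sketch in C of such content-defined chunking; it is not mjl's code, it uses a simple polynomial rolling hash as a stand-in for a true Rabin fingerprint over GF(2), and the window size, mask and chunk limits are arbitrary values chosen for illustration.

	/* Content-defined chunking sketch: cut a chunk where the rolling
	 * hash of the last Window bytes has its low Maskbits bits set.
	 * Illustrative only; constants and hash are not from the real code. */
	#include <stdio.h>
	#include <stdint.h>

	enum {
		Window   = 48,		/* bytes in the rolling window */
		Minchunk = 2048,	/* never cut a chunk smaller than this */
		Maxchunk = 65536,	/* force a cut at this size */
		Maskbits = 13		/* controls the expected chunk size */
	};

	static const uint64_t Mult = 0x100000001B3ULL;	/* arbitrary odd multiplier */

	/* print the boundaries of the chunks found in buf[0..n) */
	static void
	chunk(const uint8_t *buf, long n)
	{
		uint64_t h, weight, mask;
		long i, start, len;

		/* weight = Mult^(Window-1): coefficient of the oldest byte in the window */
		weight = 1;
		for(i = 0; i < Window-1; i++)
			weight *= Mult;

		mask = (1ULL<<Maskbits) - 1;
		h = 0;
		start = 0;
		for(i = 0; i < n; i++){
			if(i >= Window)
				h -= weight * buf[i-Window];	/* drop the oldest byte */
			h = h*Mult + buf[i];			/* bring in the new byte */
			len = i+1 - start;
			if(len >= Maxchunk || (len >= Minchunk && (h & mask) == mask)){
				printf("chunk at %ld, length %ld\n", start, len);
				start = i+1;
			}
		}
		if(start < n)
			printf("chunk at %ld, length %ld\n", start, n-start);
	}

	int
	main(void)
	{
		static uint8_t buf[1<<20];
		long n;

		n = fread(buf, 1, sizeof buf, stdin);
		chunk(buf, n);
		return 0;
	}

Because the boundaries depend only on nearby bytes, two versions of a file that differ by a small insertion still share most of their chunks, which is what makes the fingerprinting worthwhile in a Venti-like store.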

Despite the relative difficulty, the projects all worked out well, because the students settled down, did the work, and kept it up. Sometimes I would receive e-mail starting "This is probably a silly question, but ...", and not only was it not silly, it revealed some long-standing flaw, deficiency or confusion in the system or its documentation, or both. The students have also expressed interest in continuing to contribute to the underlying systems, time and graduate study permitting.

None of this would have happened without GSoC.

The style of interaction was different for each student. E-mail was used quite a bit, partly because of timezone differences, and partly because I prefer it: it offers a chance to think about the questions or responses. One of the students also used Google chat with me at critical points, to good effect. Interaction generally increased after the mid-term evaluation, partly because the more complex stages had been reached, but mainly because I felt guilty about not spending as much time on the programme as I had originally intended, owing to a change in my own workload. Generally I found it similar to supervising a student project, but with the added complications of not being able to assess the student's ability as easily, of having to cope with time zones, and of deadlines in the day job.

Another GSoC organisation suggested that detailed specifications might improve the outcome.
One of my three projects was defined in considerable detail (including external references), another was fairly obvious in scope, and the third had an overall aim but no real detail. Thus, from my own experience, I cannot conclude that the level of specification is critical to success, one way or the other. "It depends."

What else might help? All three projects were committing source code well before the mid-term point. Each of the projects had distinct stages identified, and all were deliberately or inherently open-ended, making it easy to add or remove items depending on actual progress. Insisting on a fairly regular supply of code (or at least design material and discussion) helps to highlight writer's block.

Now, the Plan 9 organisation as a whole had 13 projects and 5 failed the final evaluation, which is not so much too high in absolute terms as too high too late (we ought to have failed more at mid-term). But then again, there were 8 successful projects, and none of those would have happened without GSoC. It's a bit like one of those tabloid newspaper articles bemoaning that "20% of people do/believe/hope some horrible thing" when a reasonable response might be that 80% of people do not. Our students succeeded more often than not. Still, we all hate wasting time and money on failure.

I later read through the original applications, the mid-term reviews and the final reviews, to see whether there was anything to distinguish the set that succeeded from the set that did not. I was reminded that during the application competition I thought both the quality of the GSoC applicants and the coherence of their applications were good. We had well over 100 applications, and only a few were spam or no-hopers. The ones we accepted that subsequently failed still looked plausible to me.

For the projects I was going to mentor, I was particular about assessing the students' portfolios, which included any sort of code they'd previously written, ideally on their own, perhaps as a student project. I would not have declined to mentor someone who was relatively inexperienced, but I would never have agreed to mentor them on an ambitious project.

Still, only one of the three that I did mentor had any relevant experience of their chosen project area, so the previous work wasn't directly applicable, and they still had quite a bit of work to do. On the other hand, at least one project that failed had an apparently experienced student who could point to previous, plausible, and even relevant work. As a rule, though, we'd probably pay special attention to evident capability in future.

All but two of the failed projects were reasonable ones that could be expected to be done (to an adequate level) in the time available. We ought, however, to have failed the three simpler projects at mid-term for lack of progress. One of those would have been especially good fun, and it had a good mentor, who helped agree a simplified project at mid-term, but the student simply did not seem to come to grips with it at all. I eventually realised that all the projects that did fail had a worried mentor at mid-term, and the ones that succeeded looked fine to their mentors (even when the students were worried about progress).

One pleasant surprise for me was that the native kernel port to the Nintendo DS succeeded.
One rule of thumb I originally had for GSoC projects was "never suggest doing a device driver, let alone a kernel port", but although one port did fail, one worked. The reason for avoiding drivers and ports is that although they seem like fine projects, we know from experience that it is quite difficult to help debug them at a distance, even when both parties ostensibly have the same hardware, and that seemed a big risk for a short summer project. The Nintendo port has attracted an enthusiastic group, and work continues in a project on Google Code.

Our rather loose (in several ways) organisation received a big benefit overall from participating in GSoC: there is now more code in public and, just as important, more visibility and more participants in the larger project.
