The REPL in .NET
Wednesday, June 22, 2005 posted by Benjamin Pollack
For the past few weeks, I’ve been working almost all of the time in C++. Every once in a while, someone files a new bug against the Reflector, and I go in and work in C# for an hour, but by and large it’s all C++. Not only that, but the C++ code I’m working with is fairly poorly designed; it’s very hard for me to check in more than a couple hundred lines per day, even on a very good day, simply because so much of my time is spent trying to figure out how exactly to modify the code we’ve got to do what I want.
Just as a random example of what I mean, one of the applications currently does not display its main window until after a lengthy connection procedure. I want it to display its main window immediately. This ends up being a very, very complicated change due to a bunch of assumptions in the code about when the window gets displayed, but I’m not even focusing on that. I’m focusing on figuring out what code to modify. There are variables called m_hwnd and m_hwnd1 and WNDPROCs named WndProc and WndProc1 that represent entirely different things. Moreover, just for fun, though m_hwnd and m_hwnd1 are different window classes and represent radically different functionality, their respective WNDCLASSes are given the same WNDPROC. To fix that problem, the programmer changes the WNDPROC on a per-window basis eight lines later by calling SetWindowLong and passing in the actual WNDPROC. Why? I don’t know. There are no comments in that method. Even if there were, this obfuscation makes it needlessly difficult to figure out which window is which. After carefully reading through a printout of the code for 40 minutes yesterday evening and scribbling notes all over it, I renamed the variables m_hwndViewport and m_hwndMain. But really, why weren’t the variables just given sensible names to begin with?
That’s actually not what drives me most insane about working in C++, though. What drives me nuts is the lack of a REPL.
REPL stands for read-evaluate-print loop, and refers to an interface that takes a line of programming code, executes it, prints a result, and immediately returns to wait for the next line. It’s basically just a fancy name for a command line. Most dynamic languages have a REPL, or something similar: Ruby has irb, the interactive Ruby interpreter; Python fires up in interactive mode by default; and so on.
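To make the read, evaluate, print, and loop steps concrete, here is a minimal sketch of a REPL written in Python. It is only an illustration, not how irb or Python's own interactive mode is implemented, and the names repl and run_demo are made up for this example. It handles single-line input only.

```python
import io


def repl(source, sink):
    """Read lines from `source`, evaluate each one, and print results to `sink`."""
    env = {}  # names defined by one line stay visible to later lines
    for line in source:
        line = line.strip()
        if not line:
            continue
        try:
            try:
                # Expressions produce a value, which is echoed back.
                result = eval(compile(line, "<repl>", "eval"), env)
                if result is not None:
                    print(repr(result), file=sink)
            except SyntaxError:
                # Statements (assignments, imports, ...) are run for their effect.
                exec(compile(line, "<repl>", "exec"), env)
        except Exception as exc:
            # A failing line reports its error and the loop carries on.
            print(f"{type(exc).__name__}: {exc}", file=sink)


def run_demo():
    """Feed three lines through the loop and return everything it printed."""
    out = io.StringIO()
    repl(io.StringIO("x = 6\nx * 7\n1 / 0\n"), out)
    return out.getvalue()
```

Calling repl(sys.stdin, sys.stdout) (after import sys) would turn the same loop into a bare-bones interactive session. In the demo, the assignment prints nothing, the second line prints 42, and the third reports a ZeroDivisionError without ending the loop.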
The thing is that a language based around a REPL actually changes how you go about developing code. When you work on C++ (or any language without a good interpreter), generally your development procedure goes like this:
- Write a core piece of functionality
- Litter it with print statements if that’s your thing
- Compile (this still takes 30 seconds for our unoptimized 400 kB program)
- Fire it up under a debugger
- If the program starts behaving oddly, put in some breakpoints
- Start stepping through code when you hit one, inspecting variable contents and paying careful attention to the call stack
- Discover suddenly what you did wrong
- Quit the running program and make some trivial modifications
- Recompile (another 30 seconds) and try again
There are three problems here: firstly, you keep having these breaks where you need to recompile the program to integrate your change. Sure, they’re “short,” but it’s enough to break your flow for a few moments and make you lose track of exactly where you were going with your train of thought.
Secondly, there’s no quick check for ensuring that your fix actually works; you have to make your best attempt to patch the code based on your understanding of the error and start the process over again. If you analyzed the problem correctly, your chance of getting it right is very, very high, but you still may have made a fairly bad mistake if you even slightly misunderstood the error domain. (Tyler and I, for example, had an incredibly taxing couple of hours yesterday simply trying to figure out which program was causing one of our bugs, and we misdiagnosed the issue several times in the process.)
Thirdly, you can’t easily write and test just a small amount of code. In most languages, your only option if you want to do this is to write a test suite for your class protocol, which takes a lot of time during development and probably isn’t necessary for a lot of smaller stuff.
Developing with a language centered on a REPL is a totally different experience. Here’s basically what developing in Smalltalk looks like, for example (though Common Lisp and Scheme work basically the same way):
- Add or modify a few methods in a couple of classes
- Open up a Workspace (Smalltalk’s REPL) and print out the results of some arbitrary code that tests what you just wrote
- If you hit a bug, you can easily inspect any value in the entire system to find the error
- Once you find it, make the change, massage any “damaged” data back to pristine state by hand using the Workspace, and then resume execution where the breakpoint happened to see whether your fix worked
- Repeat
Notice: there’s no compile phase, it’s trivial to work on small pieces of code, and hitting a bug does not mean rapidly switching gears, but instead means you get to try your fix in the running code right at this very second, and you get immediate feedback on whether your fix was correct. The fourth step has been rechristened fix-and-continue and has appeared in a more limited way in some more traditional languages, such as Java 1.4, C# in .NET 2.0, and C and C++ in Apple’s versions of GCC, but the other parts of the process simply don’t really exist.
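A rough flavor of that fourth step can be had in any language with a live session. The sketch below uses Python rather than Smalltalk, and the Account class and its bug are invented purely for illustration. It does not resume from a breakpoint the way a Smalltalk debugger can; it only shows a method being replaced while existing objects stay alive, followed by repairing the damaged data by hand.

```python
class Account:
    """Hypothetical class with a deliberate bug in withdraw()."""

    def __init__(self, balance):
        self.balance = balance

    def withdraw(self, amount):
        # Bug: adds the amount instead of subtracting it.
        self.balance += amount
        return self.balance


acct = Account(100)                  # a live object we would rather not rebuild
broken_balance = acct.withdraw(30)   # 130, which is clearly wrong


def fixed_withdraw(self, amount):
    """Corrected version, written while the session is still running."""
    self.balance -= amount
    return self.balance


Account.withdraw = fixed_withdraw    # patch the class in place
acct.balance = 100                   # massage the damaged data back by hand
fixed_balance = acct.withdraw(30)    # 70, using the very same object
```

The existing acct instance picks up the corrected method because Python looks methods up on the class at call time, so nothing has to be restarted or rebuilt to check the fix.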
The good news is that you can approach a REPL style of development in .NET by using an interpreter. Although I don’t quite buy that .NET is language-neutral (write in any language, as long as it’s object-oriented and has no more or fewer features than C#!), that mantra has brought forth tools like IronPython, a Python implementation that runs on .NET and takes full advantage of .NET’s toolchain. When I was working on the Reflector, the fast compile speed of C#, combined with IronPython’s ability to load assemblies on the fly, meant I got very, very close to REPL-style development using traditional tools. I would write some code in C#, hit compile, jump to IronPython, reload the assembly, and check out the code I’d just written. When I found bugs, I’d run some code against the problem classes, try out a fix or two in Python, then move that fix to C# when I was convinced I had properly diagnosed the problem. The constant shift between Python and C# was a little bizarre, but eventually my brain just got used to the concept that debugging was in Python and new code was in C#. The ability to so rapidly test and diagnose errors played a big part in letting me get the Reflector done as quickly as I did and actually having the result work.
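The edit, reload, and poke-at-it rhythm described above can be approximated in plain Python, with importlib.reload standing in for reloading a freshly compiled assembly. This is an analogue only: it does not use IronPython or any .NET API, and the module name shapes_demo and its area function are hypothetical.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def edit_reload_demo():
    """Write a buggy module, try it, fix it on disk, reload it, try again."""
    old_flag = sys.dont_write_bytecode
    sys.dont_write_bytecode = True   # avoid stale cached bytecode between quick edits
    workdir = Path(tempfile.mkdtemp())
    module_path = workdir / "shapes_demo.py"   # hypothetical module
    sys.path.insert(0, str(workdir))
    try:
        # First draft, with a bug: the area is computed as w + h.
        module_path.write_text("def area(w, h):\n    return w + h\n")
        importlib.invalidate_caches()
        shapes = importlib.import_module("shapes_demo")
        before = shapes.area(3, 4)

        # Fix the source on disk, then reload it in the same session.
        module_path.write_text("def area(w, h):\n    return w * h  # fixed\n")
        shapes = importlib.reload(shapes)
        after = shapes.area(3, 4)
        return before, after
    finally:
        # Leave the interpreter as we found it.
        sys.path.remove(str(workdir))
        sys.modules.pop("shapes_demo", None)
        sys.dont_write_bytecode = old_flag
```

In an interactive session the two halves would be typed by hand with an editor in between; the function only bundles them so the whole cycle can be run in one go. Here the call returns the buggy answer and the corrected one as a pair, (7, 12).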
If you’ve not tried REPL-style development before, but are afraid of or don’t have time for languages like Smalltalk and Common Lisp that are built around a REPL, I’d still strongly encourage you to give a tool like IronPython a try. IronPython 0.7.5 runs on .NET 2.0 beta and is available from Microsoft. IronPython 0.6 runs on .NET 1.1 and is what I’m using here. And, if you like it, consider trying out one of the REPL-based languages. I have a hunch you’ll find it highly addicting.