Forging Your Own Tools

As hinted at on the micro blog (you didn't know there was a micro blog? It's here.), I've been thinking about some permacomputing stuff again and realized that I haven't written about one of my big side diversions over the last year: writing the worst possible version of various pieces of software I use. If there end up being talks at the local permacomputing meetup, this might be the prelude to one, though the one time I've been we just hung out and chatted for a while. Of course this ties in with the old software aesthetic post too; pretty much all of these programs would probably be totally understandable with, I dunno, 30 minutes of study?

Back in the ancient days, so the myth goes, one of the last things you did as an apprentice graduating into a journeyman was forge your own tools. Personally I suspect that this is a bit of historical flattening; I could see it for, say, a blacksmith, but I can't imagine a carpenter taking over a smith's shop to make a saw or a drawing knife after spending the last several years learning how to work wood.

In the case of software though, especially for developing an eye for writing it yourself, I think it's important. Some of this comes from how I grew up: my dad was never a software developer, but he has worked with computers for the vast majority of his life (and would do things like help other students cheat on their programming homework), and he would always encourage me to clone software I was curious about. Generally speaking, the programs we think of as actual programs are usually too big, embodying months or years of work for one person, but it's rare that you can't at least get a useful subset knocked out over a weekend. In doing so you'll either make yourself a useful little tool or, at worst, gain a bit more appreciation for the software you actually use. Who knows, maybe you'll even learn enough to make a less-awful version of the tool down the line.

The most important part of making the worst version of a piece of software you're curious about is figuring out what you can cut, or more importantly, what actually matters to you. It's been tossed around since at least the 80's that any given person probably only makes use of what, 20-30% of any given program's features? That's a lot of stuff you're carrying around for nothing, but the original needs that 100% (or something close to it) because that 20-30% differs per person. I could easily write an entire book without any graphics at all, or only with figures, but your photographic portfolio is going to be boring without them and won't need much figure drawing. So step 1 is just to discard the features that you definitely don't use. That's easy. After that is the hard part: figuring out how you can cheat on the features that you do need. This is where we get into the example portion.

tl.cgi

This is probably the smallest of my worst series: only a handful of k of shell hooked up to the plan9 httpd. The motivation (aside from just wanting an excuse to spend a few minutes throwing something together) was basically that I've gotten rid of the vast majority of my social media and wanted somewhere to put super low effort stuff that's not worth a whole new web page or gopher entry. All it does is give a chronological list of posts as they exist in the posts directory. It has no admin interface, it doesn't do anything else, no paging, just posts.
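The whole idea fits in a few lines. Here's a minimal sketch of that shape of thing in POSIX sh (the real tl.cgi runs under plan9 httpd and I'm guessing at the details; the function name, output format, and timestamp-as-filename layout are my assumptions):

```shell
#!/bin/sh
# Sketch of a tl.cgi-style timeline: print every post in a directory,
# newest first. Post filenames are timestamps, so a reverse name sort
# is reverse-chronological order -- no database, no index, no paging.
timeline() {
    dir=$1
    printf 'Content-Type: text/plain\n\n'
    for f in $(ls -r "$dir"); do   # timestamps have no spaces, so this is safe
        echo "== $f =="
        cat "$dir/$f"
        echo
    done
}
```

No accounts, no state, no write path: the "database" is just whatever files happen to be in the directory when the request comes in.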

Why just posts? Well, if I allow posting through the web, now I need to worry about accounts, authorization, input sanitization, the script needing write access to the filesystem, and all of these other complexities. Or I can just use my usual method of publishing files to the website (pushing them over a networked fs). I have a script that reads from stdin until EOF, then puts the post in the right folder with the timestamp as the filename.
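That publishing helper is about as small as a program gets. A sketch of the idea, with the caveat that the directory layout and date format here are my guesses, not the actual script:

```shell
#!/bin/sh
# Sketch of the publish-from-stdin helper: read a post until EOF,
# write it into the posts directory named by the current time.
# Epoch seconds as the filename is an assumption; any sortable
# timestamp format works the same way.
post() {
    dir=$1
    name=$(date +%s)
    cat > "$dir/$name"
    echo "$dir/$name"
}
```

Because the web-facing side only ever reads, all the scary write-path problems live on this end, behind whatever authentication the networked filesystem already does.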

tvc

The biggest of these projects, and the one which probably got the most use, was tvc: terrible version control. It's also the only one of these that was the serious result of frustration: I've had the problem in my head for years now that git is, bluntly, way too much version control for practically anybody. Unfortunately git is all that anybody uses, and as for the alternatives, most are too complicated in that they're trying to do things on the same sort of scale as git; they want to handle a Linux. Most are also dead. Mercurial's highest profile user these days is (aside from Mercurial) XEmacs, maybe? Bazaar has been officially abandoned for 10 years (and I remember the drama on the emacs mailing list when I was in uni about how it'd been practically abandoned years before, spawning the change to git), and Darcs has always been a research project not taken very seriously outside of the Haskell community. Fossil might be worth looking at if it didn't come with a wiki and a web server and version control and bug tracking and a forge.

So we take a step back and look at what we care about, right? Figure out the problems we want to solve. I'm a solo developer (or part of a small handful of people) developing relatively small projects. It would be, quite frankly, remarkable if a project hit 4 digits of commits; it's highly unusual if it hits 3. It's highly unlikely that someone malicious is going to be messing with my repository history. Nobody (so far as I know) uses any of the software I've written for leisure, so I'm happy to hand-craft releases with my own gpg signing. I do have one political opinion to throw in: I don't see any point in a centralized version control system, at least not for my projects. I looked into using CVS before, and it was clunky for what I needed.

Alright, that's a lot that I don't need. What do I actually need then? Well, I need to make commits, move around in history, and make diffs. That's pretty much it. A push/pull facility was also on my mind at this point, but I punted on it.

To fast-forward a bit: with assumptions that small, I can get away with sequentially numbered commits instead of hashes. Because the repos are small, I can afford to copy the whole tracked tree into each commit. With the whole tree in each commit I can trivially diff between any two points in history rather than try to manage diffs between diffs. Checking out becomes trivial: delete all the currently-tracked files in the workspace and copy out from the commit. With the assumption of one developer, conflicts aren't much of a worry. When I went to implement push/pull, I just put all the heavy lifting onto rsync. The end result was a remarkably lightweight and simple version control system. When I ported it to plan 9, I didn't even need push/pull implemented, as the filesystem is inherently networked; I could just put the repo on the network if I wanted, or manage it with copying.
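To make the trade-offs concrete, here's a toy sketch of those three operations in POSIX sh. The function names and repo layout are my illustration, not the real tvc, but the mechanics match what's described above: sequential numbers instead of hashes, a full tree copy per commit, checkout as delete-and-copy:

```shell
#!/bin/sh
# Toy version-control sketch in the spirit of tvc (names and layout
# are assumptions). A repo is just a directory of numbered commits,
# each holding a complete copy of the tracked tree.

tvc_commit() {   # tvc_commit <repo> <workdir> -> prints new commit number
    repo=$1; tree=$2
    last=$(ls "$repo" 2>/dev/null | sort -n | tail -n 1)
    next=$(( ${last:-0} + 1 ))            # sequential names, no hashing
    mkdir -p "$repo/$next"
    cp -R "$tree/." "$repo/$next/"        # whole tree per commit: fine for small repos
    echo "$next"
}

tvc_checkout() { # tvc_checkout <repo> <commit> <workdir>
    repo=$1; n=$2; tree=$3
    rm -rf "$tree"; mkdir -p "$tree"      # wipe the workspace, copy the snapshot out
    cp -R "$repo/$n/." "$tree/"
}

tvc_diff() {     # tvc_diff <repo> <commit-a> <commit-b>
    repo=$1; a=$2; b=$3
    diff -r "$repo/$a" "$repo/$b"         # full trees make diffing any two commits trivial
}
```

Push/pull in this scheme really is just rsync (or plain cp on plan 9): since a commit is a directory of files, synchronizing repos is synchronizing directories, and all the hard parts are someone else's problem.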

Finishing up: it's worth making bad versions of tools you use. It's fun, you learn stuff, and yours might even be better for your specific needs. Sometimes it really is like that old, vaguely remembered Terry Davis quote: they say when things get big they don't get bad, they get worse. That goes both ways: when things get small they don't just get good, they get better.
