Wombat NNTP progress

I’ve been making some progress on Wombat, my NNTP server. I got XHDR, LIST OVERVIEW.FMT, and XOVER all implemented, so all the NNTP clients that I have work with it. The interesting thing is XHDR and XOVER do basically the same thing, except that XOVER is much more efficient. Instead of sending one header at a time for a range of articles, XOVER sends all the standard headers for a range of articles all at once. I can see why NNTP clients want it implemented, but it also makes me wonder why newer NNTP clients would bother with XHDR (MaxNews and some others do).
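To make the difference concrete, here is a minimal sketch of what each command sends back for the same articles. The field order for XOVER follows RFC 2980 (subject, author, date, message-id, references, byte count, line count, separated by tabs); the function names and the dictionary layout are made up for illustration and are not Wombat's actual code.

```python
# Required overview fields, in the order RFC 2980 lists them for XOVER.
OVERVIEW_FIELDS = ["Subject", "From", "Date", "Message-ID",
                   "References", "Bytes", "Lines"]


def _clean(value):
    """Tabs and line breaks would break the tab-separated format."""
    return str(value).replace("\t", " ").replace("\r", "").replace("\n", " ")


def xover_line(article_number, headers):
    """One XOVER response line: the article number plus every overview field."""
    fields = [str(article_number)]
    for name in OVERVIEW_FIELDS:
        # A missing header becomes an empty field (two adjacent tabs).
        fields.append(_clean(headers.get(name, "")))
    return "\t".join(fields)


def xhdr_line(article_number, headers, name):
    """One XHDR response line: the article number plus a single header."""
    return f"{article_number} {_clean(headers.get(name, ''))}"


if __name__ == "__main__":
    article = {"Subject": "Hello", "From": "a@example.com",
               "Date": "6 Oct 2006 12:00:00 GMT", "Message-ID": "<1@example.com>",
               "Bytes": 120, "Lines": 3}
    print(xhdr_line(4, article, "Subject"))  # one header per request
    print(xover_line(4, article))            # everything in one line
```

A client that wants subject, author, and date for a range of articles needs three XHDR round trips but only one XOVER.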

I also implemented some extended headers like Xref, Lines, and Bytes, since every client I tested with wanted them. Both Lines and Bytes are also required to be sent with the XOVER headers (XOVER has required and optional headers). Now, Xref and Lines were fairly easy to implement. Xref just lists each group on the server that the article appears in, along with its article number in that group. Lines is simply the number of lines in the body of the article. But Bytes, oy.
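Roughly, the two easy ones can be computed like this. This is a sketch under assumptions: the Xref layout (host name first, then space-separated group:number pairs) is the one I understand RFC 1036 to describe, and the function names and host name are hypothetical.

```python
def xref_value(host, placements):
    """Build an Xref value from (group, article_number) pairs.

    Assumed layout: "<host> <group>:<number> <group>:<number> ...".
    """
    pairs = " ".join(f"{group}:{number}" for group, number in placements)
    return f"{host} {pairs}"


def lines_value(body):
    """Count the lines in the article body, passed in as raw bytes."""
    return len(body.splitlines())


if __name__ == "__main__":
    print(xref_value("wombat", [("com.orderndev.general", 4),
                                ("com.orderndev.test", 9)]))
    print(lines_value(b"first line\r\nsecond line\r\n"))
```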

RFC 2980 says XOVER requires Bytes, but nowhere does it say what the value is. Only by interrogating an existing news server was I able to guess that it’s most likely the total number of bytes in the article. But I’m not sure; I could very easily be wrong. I googled for it several times and came up with nothing. The other thing about Bytes is that it isn’t really stored in the headers. That would alter the size of the article. So it’s kept outside the article. Me, I just added another long long field to the database and jammed it in there.

In related news, I’ve found that only Unison, NewsWatcher, and Thunderbird are really worth testing against. The others that I mentioned last time are so flaky I have to track down whether it’s a bug in my server or a bug in their client. Of the good ones, NewsWatcher is actually my favorite despite its age. It seems to have the most robust and flexible NNTP client implementation. It’s about the only news client that notices when groups are added or removed on the server, and it lets me select certain useful options (e.g. whether it uses XHDR or XOVER to get header information). Very useful for testing.

Unison makes it easy to upload files (it automatically segments them, encodes them, etc) so I played around with that for a while. I found some bugs in Wombat that way. Namely, that I was treating all text as UTF-8 encoded, when it wasn’t. That meant yEnc encoded files got corrupted and couldn’t be properly rebuilt. I found out that using ISO Latin 1 for the encoding resulted in non-corrupted files, but that’s still a big assumption to make. In the end, however, I ended up just treating the headers as text and the body as a big bag of bits.
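Here is a small demonstration of the corruption, plus a sketch of the "headers as text, body as bits" split. The helper name is hypothetical, and decoding the headers as ASCII with replacement characters is my assumption for the sketch, not necessarily what Wombat does.

```python
# Every possible byte value, standing in for a yEnc-encoded body.
payload = bytes(range(256))

# Strict UTF-8 decoding fails outright on bytes like 0x80.
try:
    payload.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Lenient UTF-8 decoding "works" but silently replaces the invalid bytes.
lossy = payload.decode("utf-8", errors="replace").encode("utf-8")

# Latin-1 maps all 256 byte values to characters, so it round-trips.
latin1_round_trip = payload.decode("latin-1").encode("latin-1")


def split_article(raw):
    """Split a raw article into header text and an untouched body.

    Only the part before the first blank line is decoded; the body
    stays as bytes, so no text encoding is ever applied to it.
    """
    head, _, body = raw.partition(b"\r\n\r\n")
    return head.decode("ascii", errors="replace"), body


if __name__ == "__main__":
    print("strict UTF-8 decodes:", utf8_ok)
    print("lossy UTF-8 preserved bytes:", lossy == payload)
    print("Latin-1 preserved bytes:", latin1_round_trip == payload)
```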

I’m still using Core Data as the back end, although a little less so now. The other thing I found out from uploading files from Unison was that large articles make the database huge, quick. Also seeing that I was treating most of the article as a binary blob, keeping it in the database wasn’t so useful. So I modified Wombat to keep the headers in the database, but write the article to an external file. I kept what I think is a traditional directory layout for articles. An article posted as the 4th article to com.orderndev.general gets path: com/orderndev/general/4.txt. Unlike traditional systems though, I don’t hard link from other group directories it was cross posted to. I just put the relative path to the article file in the database.
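The mapping from group name and article number to a relative path is simple enough to sketch. The function name is hypothetical, and the check for empty or slash-containing components is an extra precaution I added for the example.

```python
from pathlib import PurePosixPath


def article_path(group, number):
    """Map a group name and article number to a relative file path.

    Each dot-separated component of the group becomes a directory.
    """
    parts = group.split(".")
    if any(part == "" or "/" in part for part in parts):
        raise ValueError(f"bad group name: {group!r}")
    return str(PurePosixPath(*parts, f"{number}.txt"))


if __name__ == "__main__":
    print(article_path("com.orderndev.general", 4))
```

For a cross-posted article, only one such file would exist; the database row for each group would point at the same relative path.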

I’m struggling with using Core Data with threads. It has some lock and unlock methods on NSManagedObjectContext with documentation vaguely stating that you should use them, maybe, if you feel like it. Unfortunately, I occasionally get random crashes when I have multiple threads touching the object context, the in memory part of the database. I have already put locks around code anytime I create an entity, modify an entity, or retrieve an entity. I haven’t put locks around accessing attributes of entities, although it looks like I will have to. I just wish there was some good documentation for this.

Meanwhile, I’m still progressing through RFC 2980 and RFC 1036 and getting the standard stuff implemented. Yukon, ho!

NNTP, Core Data, and Wombats

For my new programming side project I’ve started writing an NNTP server. Like many young children, I once looked up into the night sky and wondered, “what would it be like to write my own news server?” I’ve read the RFC before, but never got around to actually implementing one, mainly because writing my own database never really appealed to me. I’m just crazy like that.

That’s where Core Data comes in. It’s an Apple data modeling technology that wraps SQLite. I’ve been looking for an excuse to learn this very cool technology, and this seemed as good an excuse as any. I have to say, Core Data is very easy to use. The only problem I had was I kept wanting to design database tables instead of designing an object model. (Where’s my primary and foreign keys??) Way too much MySQL beforehand.

I’ve decided to code name this ill-advised project “Wombat,” for several reasons. First, Wombat is mentioned a few times in the examples in RFC 977. Second, it’s a really dumb name, and what other kind of a name would you give to an NNTP server that only runs on Mac OS X 10.4 and higher? Plus Wombats look ornery, and if this server is anything, it’s ornery.

I’ve actually got RFC 977, which is supposed to be the NNTP standard, implemented now, along with a couple of extensions. After I got most of the commands implemented, I decided to try my server with some Mac OS X native newsreaders, just to see them interact.

Hahahaha!

That’s the sound of all the Mac OS X newsreaders not knowing what the heck to do with a news server that implements the NNTP standard. Or perhaps, more accurately, news servers that only implement the standard.

I tried several NNTP readers: Panic’s Unison, Mozilla’s Thunderbird, the venerable MT-NewsWatcher, and some lesser known ones like Newsflash, OSXnews, MaxNews, and Xnntp. Most of them could get the list of groups and how many articles there were, but that was it.

You see, there’s another RFC, called “Common NNTP Extensions” that describes many ad-hoc extensions that servers started implementing that are not in the standard (RFC 977). Well, it turns out every reader I could find requires at least some of these extensions to be implemented. Namely, most readers want XOVER, LIST OVERVIEW.FMT, and XHDR implemented. Basically, those commands help the reader retrieve header information (subject, from, etc) en masse, and far more efficiently than the standard allows.

I guess what was more shocking to me was that none of the clients fell back to the standard commands when The Wombat started kicking back “500” codes (command not implemented). The readers either treated my response like a bug in the server (that’s what they told the user) or just assumed The Wombat returned an empty data set. I’m guessing that the common extensions have been around for long enough (documented in 2000) and are widely implemented enough that all readers simply assume they’re going to be there.
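For what it’s worth, the fallback I was hoping to see is not complicated. Here is a sketch of a client that tries XOVER and drops back to plain RFC 977 HEAD requests on a 500. The `send` callable is hypothetical: it stands in for whatever sends a command on a connection with a group already selected and returns the numeric status code plus the response lines.

```python
def fetch_subjects(send, first, last):
    """Return {article_number: subject}, preferring XOVER when available."""
    code, lines = send(f"XOVER {first}-{last}")
    if code == 224:  # overview information follows
        subjects = {}
        for line in lines:
            fields = line.split("\t")
            subjects[int(fields[0])] = fields[1]  # field 1 is the subject
        return subjects
    if code != 500:  # anything other than "command not recognized"
        raise RuntimeError(f"unexpected XOVER response: {code}")

    # Fallback: one standard HEAD request per article. Slow, but it works.
    subjects = {}
    for number in range(first, last + 1):
        code, lines = send(f"HEAD {number}")
        if code != 221:  # e.g. 423, no such article number
            continue
        for line in lines:
            name, _, value = line.partition(":")
            if name.strip().lower() == "subject":
                subjects[number] = value.strip()
    return subjects
```

The cost is one round trip per article instead of one per range, which is presumably why nobody bothers.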

There were also problems with the couple of extension commands I did implement. Namely I wanted a way to require authorization, so I implemented the AUTHINFO command, as described in the “Common NNTP Extensions” RFC. Apparently I’m the only one reading from that document.

The way AUTHINFO works is the client can issue commands like it normally would, but at any time the server can kick back a response that essentially says “Sorry, you have to authenticate before you do that.” At that point the client gives a username and password, the server authenticates it, and the client goes on its merry way. That’s the way it’s supposed to work anyway.

Unison, on the other hand, wants to start shoving AUTHINFO commands at the server at random intervals. This is contrary to the RFC, which says the client should never initiate authentication, but only provide it when requested. The RFC also says that if the client offers AUTHINFO commands when not requested, the server is supposed to reject them. Foolishly believing the spec, that’s how I implemented The Wombat. Well, if you ever turn Unison down, you hurt its little feelings and it starts talking about you behind your back to the user. Stuff like: “the server doesn’t like you anymore. It rejected your username and password.” Unfortunately, the RFC doesn’t have a response for “you dumb client, you’re already logged in as that user!” Of course, I ended up modifying The Wombat to accommodate Unison’s pushy style of authentication.
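Both behaviors fit in a small state machine. This is only a sketch, not The Wombat’s code: the class, the `accept_unsolicited` flag, and the plain-text password table are all invented for the example. The reply codes (480, 381, 281, 482) are the ones RFC 2980 defines for AUTHINFO, but using 482 for an unrequested AUTHINFO is my own choice here.

```python
class AuthSession:
    """Per-connection AUTHINFO USER/PASS handling."""

    def __init__(self, credentials, accept_unsolicited=True):
        self.credentials = credentials            # {username: password}
        self.accept_unsolicited = accept_unsolicited
        self.challenged = False                   # have we sent a 480?
        self.pending_user = None
        self.user = None

    def handle(self, line):
        """Return a reply line, or None to dispatch the command normally."""
        parts = line.split(None, 2)
        if len(parts) == 3 and parts[0].upper() == "AUTHINFO":
            return self._authinfo(parts[1].upper(), parts[2])
        if self.user is None:
            self.challenged = True
            return "480 Authentication required"
        return None

    def _authinfo(self, kind, value):
        # Strict mode: refuse credentials the server never asked for.
        if not (self.challenged or self.accept_unsolicited):
            return "482 Authentication rejected"
        if kind == "USER":
            self.pending_user = value
            return "381 More authentication information required"
        if kind == "PASS" and self.pending_user is not None:
            user, self.pending_user = self.pending_user, None
            if self.credentials.get(user) == value:
                self.user = user
                self.challenged = False
                return "281 Authentication accepted"
        return "482 Authentication rejected"
```

With `accept_unsolicited=False`, a client that re-sends AUTHINFO after logging in gets a 482, which is exactly the case that upset Unison. With the default, the server just accepts the credentials again.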

So at the end of the day I have a fully implemented NNTP server, with respect to RFC 977, and no reader will work with it. I guess I have some work to do.

Respect for the testers

Sure it’s a lot of fun to tease them and try to make their lives miserable, but really, if we didn’t have testers who would we, as engineers, have to torment? The marketing people? Please, they’re not even self aware enough to know that we’re doing it, and that’s no fun. Testers, on the other hand, are not only fun to use as scapegoats, but they also provide an important service for the product.

Namely one I never want to do.

Despite that, I found myself doing exactly that recently. My WordPress plugin is now code complete, but is in need of testing. I looked around the apartment for suitable candidates, but the lizards around here are so small they cannot even depress the keys on the keyboard despite jumping on them. That’s how this unmentionable task fell to me: I had to write and execute test plans.

GAAAHHHH!!!!

Now writing test plans and such is something that I learned about in college. At the time I thought “Bah! This is nice for reference and all, but I’m never going to use this. I’m an engineer! I create the bugs, not find them!” Oh, how wrong I was. While in college, I also worked as an intern. Although I was supposed to be working on developing internal tools, I often got pulled into doing QA work. (Note for the inexperienced: the QA department is always understaffed. Hide behind the nearest potted plant if the QA manager ever comes within ten feet of your cubicle.) It was a never ending battle: me trying to escape QA work, the QA manager pulling me back in, and the other engineers laughing at me the entire time.

Testing and quality assurance work is never fun. When writing a test plan, you have to think of all the possible ways that a feature can break, and make sure all the different angles are covered. But that’s balanced by the fact that you can’t test everything so you have to be smart about what you test. That way you get the maximum possible coverage for the least amount of work. After you write the mind-numbingly boring test plan, some unlucky bloke has to run it. The experience is much like putting a portable drill to your temple and pressing really hard.

I’ve actually managed to get the test plans for my plugin written now. I found that writing them myself was a good exercise. I had to change my attitude from “how do I make this work?” to “how do I crush this pathetic excuse for software, and send the developer running home to his mommy?” I found several bugs just by thinking through how to test the different features. I also found that there were features that weren’t as usable as they should have been, since I hadn’t been looking at them from the point of view of the user, but from that of an engineer. All of this, and I hadn’t even run the test plan. Good stuff.

I’m not looking forward to running my test plans. I have to run them at least three times: once on Safari, once on Firefox, and once on my arch-nemesis, Internet Explorer. May God have mercy on me.

I say all of this to show that I respect the testers and quality assurance people out there. Sure I go through this each time I have to do some sort of testing myself, or a tester finds a bug that I wouldn’t have caught myself, but it bears repeating. Testers are there to make the engineers look good. Unless the tester wants your parking spot. Then they’re probably trying to get you fired so they can have a shorter walk to the building.