I’ve been making some progress on Wombat, my NNTP server. I got XHDR, LIST OVERVIEW.FMT, and XOVER all implemented, so all the NNTP clients that I have now work with it. The interesting thing is that XHDR and XOVER do basically the same thing, except that XOVER is much more efficient. Instead of sending one header at a time for a range of articles, XOVER sends all the standard headers for a range of articles at once. I can see why NNTP clients want it implemented, but it also makes me wonder why newer NNTP clients would bother with XHDR (MaxNews and some others do).
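To make the difference concrete, here is a rough Python sketch of how the two responses could be built from the same stored headers. It is not Wombat's code: the function names, the in-memory article list, and the field order are my assumptions (the order follows the overview fields as I read them in RFC 2980, and "(none)" is what I understand XHDR returns for a missing header).

```python
# Assumed overview field order: subject, author, date, message-id,
# references, byte count, line count.
OVERVIEW_FIELDS = ["Subject", "From", "Date", "Message-ID",
                   "References", "Bytes", "Lines"]


def xhdr_lines(header, articles):
    """One response line per article, carrying a single header value."""
    return [f"{number} {headers.get(header, '(none)')}"
            for number, headers in articles]


def xover_lines(articles):
    """One tab-separated response line per article, carrying every overview field."""
    lines = []
    for number, headers in articles:
        # Tabs inside a value would break the field separation, so flatten them.
        fields = [headers.get(name, "").replace("\t", " ")
                  for name in OVERVIEW_FIELDS]
        lines.append("\t".join([str(number)] + fields))
    return lines


# Hypothetical sample data: (article number, headers) pairs.
articles = [
    (1, {"Subject": "Hello", "From": "a@example.com", "Bytes": "120", "Lines": "3"}),
    (2, {"Subject": "Re: Hello", "From": "b@example.com", "Bytes": "200", "Lines": "5"}),
]
```

With XHDR a client needs one request per header it cares about; with XOVER a single request over the same range returns everything in `OVERVIEW_FIELDS`.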

I also implemented some extended headers like Xref, Lines, and Bytes, since every client I tested with wanted them. Both Lines and Bytes are also required in the XOVER output (XOVER has required and optional headers). Xref and Lines were fairly easy to implement. Xref just lists each group on the server that the article appears in, along with its article number in that group. Lines is simply the number of lines in the body of the article. But Bytes, oy.
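A small Python sketch of those two easy ones, under my own assumptions: the Xref value is the server name followed by space-separated group:number pairs, the body uses CRLF line endings, and the helper names and the host name are made up for illustration.

```python
def xref_value(server_name, placements):
    """Build an Xref value from (group, article number) pairs."""
    pairs = " ".join(f"{group}:{number}" for group, number in placements)
    return f"{server_name} {pairs}"


def line_count(body: bytes) -> int:
    """Count the lines in an article body that uses CRLF line endings."""
    if not body:
        return 0
    # A final line without a trailing CRLF still counts as a line.
    return body.count(b"\r\n") + (0 if body.endswith(b"\r\n") else 1)
```

For an article crossposted to two groups, `xref_value("news.example.com", [("com.orderndev.general", 4), ("com.orderndev.test", 9)])` gives `news.example.com com.orderndev.general:4 com.orderndev.test:9`.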

RFC 2980 says XOVER requires Bytes, but nowhere does it say what the value is. Only by interrogating an existing news server was I able to guess that it’s most likely the total number of bytes in the article. But I’m not sure; I could very easily be wrong. I googled for it several times and came up with nothing. The other thing about Bytes is that it isn’t really stored in the headers, since that would alter the size of the article. So it’s kept outside the article. Me, I just added another long long field to the database and jammed it in there.
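Under that guess, the computation is about as simple as it gets. This Python sketch assumes the count covers the header block, the blank separator line, and the body exactly as they go over the wire; that is my reading of what the other server reported, not something the RFC spells out, and the function name is made up.

```python
def article_byte_count(raw_headers: bytes, body: bytes) -> int:
    """Guess at Bytes: header block + blank separator line + body.

    Assumes raw_headers already ends with the CRLF of its last header line.
    """
    separator = b"\r\n"  # the empty line between headers and body
    return len(raw_headers) + len(separator) + len(body)
```

The result is computed once when the article is posted and stored next to the article rather than inside it, so adding it never changes the number being measured.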

In related news, I’ve found that only Unison, NewsWatcher, and Thunderbird are really worth testing against. The others that I mentioned last time are so flaky that I have to track down whether each problem is a bug in my server or a bug in their client. Of the good ones, NewsWatcher is actually my favorite despite its age. It seems to have the most robust and flexible NNTP client implementation. It’s about the only news client that notices when groups are added or removed on the server, and it lets me select certain useful options (e.g. whether it uses XHDR or XOVER to get header information). Very useful for testing.

Unison makes it easy to upload files (it automatically segments them, encodes them, etc.) so I played around with that for a while. I found some bugs in Wombat that way. Namely, I was treating all text as UTF-8 encoded when it wasn’t. That meant yEnc encoded files got corrupted and couldn’t be properly rebuilt. I found out that using ISO Latin 1 for the encoding resulted in non-corrupted files, but that’s still a big assumption to make. In the end, I just treated the headers as text and the body as a big bag of bits.
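Here is a Python sketch of why that happens and what the final approach looks like. The function name is made up, decoding the headers as Latin 1 is my assumption (the point is only that the body is never decoded at all), and folded header lines are ignored to keep it short.

```python
def split_article(raw: bytes):
    """Split a raw article into (headers as text, body as untouched bytes)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    headers = {}
    # Assumption: decode only the header block; Latin 1 never fails on any byte.
    for line in head.decode("latin-1").split("\r\n"):
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return headers, body


# Every possible byte value, standing in for a yEnc encoded payload.
payload = bytes(range(256))

# Latin 1 maps each byte to one character, so a round trip is lossless...
latin1_round_trip = payload.decode("latin-1").encode("latin-1")

# ...while strict UTF-8 rejects byte sequences that are not valid UTF-8.
try:
    payload.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

headers, body = split_article(b"Subject: test\r\nFrom: a@example.com\r\n\r\n" + payload)
```

Keeping the body as bytes sidesteps the question of which encoding the poster used, which is the part that Latin 1 only papers over.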

I’m still using Core Data as the back end, although a little less so now. The other thing I found out from uploading files from Unison was that large articles make the database huge, fast. Also, since I was treating most of the article as a binary blob, keeping it in the database wasn’t so useful. So I modified Wombat to keep the headers in the database, but write the article to an external file. I kept what I think is a traditional directory layout for articles. An article posted as the 4th article to com.orderndev.general gets the path com/orderndev/general/4.txt. Unlike traditional systems, though, I don’t hard link to it from the directories of the other groups it was crossposted to. I just put the relative path to the article file in the database.
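The mapping from group and article number to a relative path is easy to sketch in Python. The function name and the validation are my own additions for illustration; only the resulting layout comes from the description above.

```python
from pathlib import PurePosixPath


def article_path(group: str, number: int) -> PurePosixPath:
    """Map a group name and article number to a relative file path."""
    components = group.split(".")
    # Reject empty components so names like "a..b" cannot produce odd paths.
    if not all(components):
        raise ValueError(f"invalid group name: {group!r}")
    return PurePosixPath(*components) / f"{number}.txt"
```

So `article_path("com.orderndev.general", 4)` yields `com/orderndev/general/4.txt`, and a crossposted article just has that one relative path recorded for every group it appears in.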

I’m struggling with using Core Data with threads. NSManagedObjectContext has lock and unlock methods, with documentation vaguely stating that you should use them, maybe, if you feel like it. Unfortunately, I occasionally get random crashes when I have multiple threads touching the object context, the in-memory part of the database. I have already put locks around the code any time I create an entity, modify an entity, or retrieve an entity. I haven’t put locks around accessing attributes of entities, although it looks like I will have to. I just wish there were some good documentation for this.

Meanwhile, I’m still progressing through RFC 2980 and RFC 1036 and getting the standard stuff implemented. Yukon, ho!