Archive for July, 2006

Digg vs Netscape: The plight of the users

andy on Jul 31st 2006

Lately there’s been a lot of hoopla and fussing about Netscape’s proposal to attract and keep top Digg submitters by offering them money. Kevin Rose of Digg and others argue that people contribute to Digg because they enjoy it. On the other hand, Jason Calacanis of Netscape argues that the top submitters make the site, and therefore they should be paid for their work, and please, for the love of Pete, pay attention to him for once. This debate has gone on ad nauseam, but has yet to end. Mainly because Calacanis hasn’t run out of dumb things to say, and I haven’t had a chance to get my two cents in.

What people have failed to ask so far is: what does the consumer think? There are countless internet users who have not had the chance to weigh in on this all-important issue, some of whom have actually heard of these two sites. In order to facilitate this discussion, I will speak for the consumers. No need to thank me, I accept large cash donations.

The first question on every consumer’s mind is: “who are we talking about again?” Despite the very public sparring of two unknown geeks on their personal blogs, not many people have heard of these sites. So the first step in helping consumers weigh in is to let them know what they’re fighting about, and to make fun of some websites that have more visitors than me.

Digg is a relatively new Web 2.0 company*. Despite their name and logo, they will not install a new underground pool for you, so you can put the phone down now. Digg is a website that allows users to see what stories are currently “hot” in the technology world. Although this might sound similar to Slashdot or something else you won’t care about, I can assure you it is totally different in who gets all the money. Unlike the totalitarian regime imposed on Slashdot by CmdrTaco (Is he really a commander? Is he really a taco?), users of Digg can actually submit and vote on stories they like. This is the allegedly “social” part of the website, which should give you an inkling of how much SPF they require in their sunscreen. If a lot of people “digg” a story, then it gets moved to the front page. Instead of having stories decided by an editor, they are decided on by people who like to click shiny, gradient filled buttons and see their names on the front page of a geek website.

The Netscape name has been around for longer than most people can remember, let alone tolerate. Throughout its history, people have been amazed at Netscape’s ability to create products they desperately don’t need. Whether it be a web browser, server software, a web portal, a low cost ISP, or a rip off of Digg, people have responded resoundingly with the click of their web browsers. And they have said: “Didn’t these guys used to have a browser?” In Netscape’s recent bid to get attention, they have cloned Digg, and created many innovations based on what was already there. Their main innovations being: changing the word “Digg” to “Vote” and not having any users.

The next question most consumers will have is: “what’s in it for me?” Consumers are busy emailing that latest chain letter from Bill Gates to their grandmother, and will want to know why either site deserves their attention. Sure Netscape wants to pay the top submitters, but how about the people who just read these sites or frequently ignore them? How much are they worth? Sadly, not much research has been done in this area. I could only find one article, titled YouTube, Digg, MySpace: How much is a non-paying ‘user’ worth?. I would have found more, but it was the only one posted on Digg. Readers of this article will be impressed by the author’s ability to quote herself repeatedly, not answer the question that she herself posed, and to refer to consumers who might just click on a banner ad as “freeloaders.”

It is clear from this article that more research needs to be done. Consumers cannot make such important decisions as which free news-aggregation web site to visit on so little information. I volunteer to take this on, and I can assure you I have the integrity of John C Dvorak. Up until this point neither Digg nor Netscape has been giving me any money. I propose that they both start giving me money, and then I can decide if I’m more likely to visit their site or even click on banner ads or not. For phase two, they could give me more money, and then I judge if that increases their chances any. This is vital to the scientific process.

Finally, customers will want to know “Does the row over Digg and Netscape mean we’re going to see a nerd slap fight?” Yes, but only in the virtual sense. I’m guessing both Kevin and Jason bruise easily and will avoid a physical confrontation and direct sunlight. Although, Jason might have other plans since his recent declaration of love for Kevin. Who knows, maybe this is just the prelude to Netscape being the latest clone of Match.com.

* Web 2.0 means they don’t make any actual “money”.

Filed in Amusing | Comments Off

wchar_t: Unsafe at any size

andy on Jul 28th 2006

One of today’s fads in software engineering is supporting multiple languages. It used to be that each language or script had its own code point system (or encoding), with each code point representing a different character. For reasons of convenience, the various scripts were incompatible, they could not be identified by simply looking at the code points, and an identifier describing which script the text was in was not allowed in the same zip code. Sometimes this caused problems with engineers who had weak constitutions; was that ‘c’ or a ‘¥’? Experienced programmers knew the correct answer was to cycle through all the known scripts, interpreting the text with each in turn, and ask the user to tell them when they could read it or saw the hidden picture. These were the earliest known captchas.

The Unicode Consortium was unhappy with this because they were not the cause of the mass confusion, as a result of being late to the party. They devised a scheme in which each character had its own unique code point. They also allocated enough code points to represent all the characters of a lot of different languages. They even added a unique byte sequence at the start of any Unicode text to mark it as Unicode. And thus, all was well and good as long as you didn’t mind having text that took four times more space than usual, and wasted three out of four bytes. The Unicode Consortium at first wasn’t interested in fixing this problem until they realized they could use it to add more “features” (read: confusion). The Consortium begat UTF-8 and UTF-16 in order to fill this need. UTF-8 encoding allowed most characters to be encoded in 8 bits, with the rest as escape sequences, and UTF-16 allowed most characters to be encoded in 16 bits.
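
To put numbers on that size tradeoff, here is a minimal sketch of how a single code point turns into UTF-8 bytes or UTF-16 code units. The function names `encode_utf8` and `encode_utf16` are made up for illustration, and the code assumes it is handed a valid code point (no surrogates, nothing above U+10FFFF).

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Encode one code point as UTF-8 (1 to 4 bytes).
// Assumption: cp is a valid Unicode scalar value.
inline std::string encode_utf8(std::uint32_t cp) {
    std::string out;
    if (cp < 0x80) {                       // ASCII: one byte, unchanged
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {               // two bytes
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {             // three bytes
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                               // four bytes
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Encode one code point as UTF-16 (1 or 2 sixteen-bit units).
// Assumption: cp is a valid Unicode scalar value.
inline std::vector<std::uint16_t> encode_utf16(std::uint32_t cp) {
    if (cp < 0x10000) {
        return { static_cast<std::uint16_t>(cp) };
    }
    cp -= 0x10000;                         // split the rest into a surrogate pair
    return { static_cast<std::uint16_t>(0xD800 | (cp >> 10)),
             static_cast<std::uint16_t>(0xDC00 | (cp & 0x3FF)) };
}
```

With these, ‘c’ (U+0063) is one byte in UTF-8 and ‘¥’ (U+00A5) is two, while UTF-16 spends a single 16-bit unit on each and needs two units only for code points above U+FFFF.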

Originally people implemented these types in C by using unsigned char (UTF-8), unsigned short (UTF-16), or unsigned int (UTF-32). At the time of adoption of Unicode both Win32 and the Mac Toolbox used UTF-16. It was a nice tradeoff between size and efficiency. For most characters they were only wasting one byte (as opposed to three bytes in UTF-32), but could still assume most characters were just 16 bits (as opposed to UTF-8 which escaped anything not ASCII). Life was good.

Like most standards committees, the C/C++ standards committee was bent on death and destruction. They saw people were using this newfangled Unicode, and that it was almost sufficiently confusing. The standards committee wanted to advocate this confusion while adding even more. To achieve their demented objective, they introduced wchar_t and std::wstring. But which encoding of Unicode did it use: UTF-16 or UTF-32? BUHAHAHAHA! In their greatest show of leadership to date, the standards committee refused to say. It would be a surprise, and they would hate to spoil a surprise. wchar_t was defined to be more than a byte but no larger than a jet liner.
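
If you want to see the surprise for yourself, a small sketch like the following asks the compiler what it decided. The helper names `wchar_bits` and `units_for_astral_char` are hypothetical; the only standard pieces are `sizeof`, `CHAR_BIT`, and `std::wstring`.

```cpp
#include <climits>
#include <cstddef>
#include <string>

// Width of wchar_t, in bits, for whatever compiler builds this file.
constexpr std::size_t wchar_bits() {
    return sizeof(wchar_t) * CHAR_BIT;
}

// How many wchar_t units the compiler uses to store U+1F600, a code point
// that does not fit in 16 bits. Typically 1 where wchar_t is 32 bits wide
// and 2 (a surrogate pair) where it is 16 bits wide.
inline std::size_t units_for_astral_char() {
    const std::wstring text = L"\U0001F600";
    return text.size();
}
```

The same source line therefore produces strings of different lengths depending on the platform, which is exactly the kind of thing that makes shared string code fall over.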

With this new edict in hand, compiler and library writers quickly got to work. Instead of following each other’s lead, they each implemented wchar_t and its supporting libraries as they saw fit. Some saw the benefit of making wchar_t UTF-16. Others wanted it to be UTF-32. And thus, the standards committee bided their time.

Since both Windows and Mac OS (Classic) had adopted UTF-16 already, the compiler makers implemented wchar_t as UTF-16. But this was just a trap, meant to ensnare hard working cross platform engineers. Engineers who worked on software that ran on Windows and MacOS started using wchar_t. It was easy and worked well. A little too well.

Meanwhile, Unix vendors had decided that wasting one byte was insufficient, and that wasting three bytes per character was definitely the way to go. Besides, it’s not like anyone on Unix was using Unicode for anything other than Klingon.

The trap was sprung in 1996 when Apple purchased NeXT and its Unix based operating system. Like all good traps no one realized what had happened for several more years. It wouldn’t be until 2001 when Mac OS X was released and Steve Jobs started after developers with cattle prods to get them to port to Mac OS X. Unfortunately for the standards committee, some developers continued to use the old developer tools, like CodeWarrior, and old executable formats, like CFM/PEF, that implemented wchar_t as UTF-16. But the standards committee was patient. They knew they would prevail in the end.

Apple would turn out to be the instrument of the standards committee. They continued to improve Xcode until it was good enough to actually build most of their own sample code. At the same time, Metrowerks finally won its game of Russian Roulette, and stopped development of CodeWarrior. Apple delivered the final blow when they announced they were moving to the Intel architecture and that they had the only compiler that supported it. A compiler with a secret.

There were screams of anguish when it dawned on engineers the cruel trick Apple and the standards committee had played. Mac OS X, being a Unix variant, had implemented wchar_t as UTF-32! All the cross platform code, code that used to work on Windows and Mac, no longer worked. Apple felt their pain, and issued this technical note, which essentially says: “instead of using wchar_t, which used to be cross platform before we destroyed it, use CFStringRef, which is not cross platform, has never been, and never will be. P.S. This is really your own fault for ever using wchar_t. Suckers.”

At the time that this was happening, I happened to work for Macromedia (now Adobe). Being the most important company that implements Flash, some of the Apple execs came down and talked to the Mac engineers at Macromedia. When the appropriate time came, I sprang into action demanding to know what would be done about wchar_t. There was stunned silence. “What’s wchar_t?” was the first answer. After explaining it, the next answer was “We don’t implement that.” After pointing them to their own documentation, the next answer was “Oh. Huh. Well, why did you use it? We don’t use that crap. Use CFString instead!” After slamming my head against the table, I attempted to explain wchar_t was used everywhere in our codebase, and CFString wasn’t cross platform. “Sure it is! It works on both Mac OS 9 and Mac OS X!”

The solution in the end for those duped into using wchar_t, is to go back and use unsigned short instead. Unfortunately, that means doing a lot of find and replace (find: wchar_t replace: char16_t, where char16_t is typedef’d to unsigned short) and then re-implementing the wchar_t library (including wstring) for the new type. Yep. Reimplement the wchar_t library. The lucky jumped into a pit of rabid ice weasels, where they were torn limb from limb. The unlucky had to repurpose all the old CodeWarrior MSL code to re-implement the wchar_t library as a char16_t library.
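
For what it’s worth, C++11 later made char16_t a built-in type (so the typedef name above now collides with a keyword) and added std::u16string to go with it. That is a different route from the find and replace described here, and it was not available at the time. A minimal sketch, assuming a C++11 compiler and well-formed UTF-16 input; `count_code_points` is a hypothetical helper, not a library function.

```cpp
#include <cstddef>
#include <string>

// Count Unicode code points in UTF-16 text. A code point above U+FFFF is
// stored as a surrogate pair (two units), so count every unit except the
// trailing (low) surrogates.
inline std::size_t count_code_points(const std::u16string& text) {
    std::size_t count = 0;
    for (char16_t unit : text) {
        const bool is_low_surrogate = (unit >= 0xDC00 && unit <= 0xDFFF);
        if (!is_low_surrogate) {
            ++count;
        }
    }
    return count;
}
```

Because the element type is 16 bits wide on every mainstream compiler, a string like `u"A\U0001F600"` holds three units and two code points regardless of what the platform decided about wchar_t.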

The moral of the story is: don’t trust the standards committee. Especially on standards that aren’t really defined or when they start snickering behind your back. Usually that means they stuck a note on your back that says “Standardize me.” I’m not sure why that’s funny, but they think it’s hilarious. If you need to use a Unicode encoding, use UTF-8. You can just use char and std::string for that.

Besides, who doesn’t speak English?
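
As a rough illustration of the UTF-8 advice, here is a sketch of what working with UTF-8 in a plain std::string looks like. The helper name `utf8_code_points` is invented for this example, and it assumes the string already holds well-formed UTF-8.

```cpp
#include <cstddef>
#include <string>

// Count the characters (code points) in a UTF-8 encoded std::string.
// Continuation bytes have the bit pattern 10xxxxxx; every other byte
// starts a new character.
inline std::size_t utf8_code_points(const std::string& text) {
    std::size_t count = 0;
    for (unsigned char byte : text) {
        if ((byte & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}
```

The catch is that `size()` reports bytes, not characters: the two-character string "¥c" occupies three bytes in UTF-8, so anything that cares about characters has to walk the bytes like this.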

Filed in Macintosh, Programming | 5 responses so far

Survey of Cross Platform GUI Architectures

andy on Jul 27th 2006

I’m a Mac guy, but I’ve spent a large part of my time working for large software companies. That means that I usually work on a cross platform product that supports both Mac and Windows. The upshot of that is each product has to have a platform layer of some kind so that the correct Windows API’s are called on Windows and the correct Mac API’s are called on the Mac. In my tenure I’ve encountered several different ways of tackling this problem. Some worked very well, while others made me want to quit and become a mime.

The interesting thing to note here is that historically the Mac and Windows API’s weren’t all that different. Sure the Mac Toolbox calls started out in Pascal, but most apps use the C bindings these days. Both the Toolbox and Win32 have very similar ideas about windows, menus, and events, especially when comparing the Win32 API with the Carbon Toolbox. There’s definitely not a one to one mapping for everything, but it wasn’t all that hard to see where you could write a compatibility layer and support both.

There are a few different techniques/architectures for building cross platform frameworks. For this discussion I’ll ignore entire platforms like Java and Adobe Flash which employ virtual machines. I’ll limit discussion to natively compiled applications.

At the end of each section, I’ll rate the architecture. They’ll be evaluated based on the cost of maintenance, the initial cost of development, and the user experience they advocate.

  1. Emulation

    This is where you attempt to make one platform look like the other. For example you implement all the Mac Toolbox API’s that you use on Windows using the Win32 API. That way, you can program to the Mac API’s and it will just “magically” work on Windows. However, given platform popularity this usually means implementing MFC or Win32 on the Mac. Note that this isn’t emulation like what emulators do, but simply implementing a platform API on a different platform.
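
    A toy sketch of the idea, with every name invented for illustration: `EmuHWND`, `EmuCreateWindow`, and `EmuSetWindowText` imitate the shape of a Windows-style API, while `NativeWindow` is a stand-in for whatever window object the host platform really provides. A real emulation layer would call the actual Mac toolkit where this one just stores a string.

    ```cpp
    #include <map>
    #include <string>

    // Stand-in for a native window object on the host platform (hypothetical).
    struct NativeWindow {
        std::string title;
    };

    // Emulated Windows-style handle: an opaque number that the layer maps
    // to a native window.
    using EmuHWND = int;

    // Table owned by the emulation layer, mapping handles to native windows.
    inline std::map<EmuHWND, NativeWindow>& window_table() {
        static std::map<EmuHWND, NativeWindow> table;
        return table;
    }

    // Windows-flavored entry point: create a window, hand back a handle.
    inline EmuHWND EmuCreateWindow(const char* title) {
        const EmuHWND handle = static_cast<EmuHWND>(window_table().size()) + 1;
        window_table()[handle] = NativeWindow{title};
        return handle;
    }

    // Windows-flavored entry point implemented in terms of the native object.
    // Returns false for an unknown handle.
    inline bool EmuSetWindowText(EmuHWND handle, const char* text) {
        auto it = window_table().find(handle);
        if (it == window_table().end()) {
            return false;
        }
        it->second.title = text;
        return true;
    }
    ```

    Application code only ever sees the `Emu...` functions. Multiply these two functions by every call the application uses and you have the size of the job.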

    This technique is unfortunately popular. There are some assumptions that go along with this:

    • The only thing required to develop for a platform is writing to the appropriate API.
    • If you emulate MFC on the Mac you can hire cheap MFC programmers and get a Mac version for free. (I’ve had managers tell me this with a straight face.)
    • It is cheaper and easier to maintain an emulation layer for a platform than the other techniques.
    • The platform on which the emulation is taking place is the only platform that suffers.
    • There is no other way to port an existing Windows application to another platform.

    It turns out every one of these assumptions is wrong.

    The first assumption assumes that the emulation layer is going to take care of everything and a Windows programmer will never have to know anything about the Mac. Unfortunately I can guarantee the emulation layer won’t be able to take care of everything and the Windows programmers will have to know something about the Mac. They will have to at least be able to build it on the Mac and figure out if their feature actually works there. But more than likely they’ll also have to know some basic UI guidelines and a general idea of how Mac users expect things to work. This includes the correct menu layout, correct dialog control layout, and the correct modifier keys to use.

    The second assumption implies that you can get away with only hiring Windows programmers and they will never ever have to touch the Mac code. Ha! The Mac emulation layer will have to be upgraded and maintained throughout the entire lifecycle of the product, almost daily. The emulation layer definitely is not a write-once-use-many-times sort of thing. You’ll start using new parts of the Win32 API that need to be implemented on the Mac, find bugs in the current implementation, and need to take advantage of new technology on the Mac.

    The conclusion to draw from the first two assumptions is any engineer will have to know or learn some basics about the Mac. They will at least need to be able to read some basic Mac GUI code and debug it.

    Thirdly, many people look at Emulation compared to, say, Core/Edge and draw the conclusion that emulation is half as expensive because the framework only exists on the Mac (in Core/Edge the framework exists on both platforms). They completely forget about the size of the framework. In Core/Edge you typically have a thin framework on each platform, whereas in Emulation you have a heavy framework on the Mac. Sure you don’t have anything on Windows, but you’re spending a lot more time on the Mac just maintaining the framework there.

    The fourth assumption people just seem to forget about. I mean, if you’re emulating Windows on the Mac, only the Mac port is going to suffer, right? Nope. Your cheap Windows engineer is going to start writing some code using some newfangled Win32 API and realize that it’s not implemented on the Mac emulation layer. Oops. They’d implement it on the Mac themselves, but you only hired cheap Windows programmers, remember? At this point, the engineer has to decide if they will ditch the new Win32 API’s or write a Windows only feature. So the Windows app suffers too because it becomes difficult to use the new Windows API’s. In other words, you’re really hurting both the Mac and Windows ports and the users are going to notice that.

    The last assumption is that there’s no other way to support another platform. If you’ve got a mature, existing Windows program it’s not going to be pleasant to port it no matter what option you choose. However, as I outline next, Emulation has a lot of hidden costs that end up making it much more expensive than the other options.

    Emulation is also non-incremental. You have to get a substantial amount of the framework written before you can start using it. This means there’s a large up front cost. In my experience it takes about ten or more engineers for about a year to implement something reasonably resembling MFC on the Mac. It varies depending on how much Mac expertise you have. There’s also a large cost of upgrading this framework because you can’t upgrade one window at a time to use HIViews (or whatever); they all have to be upgraded at once.

    The cost of adding a platform (e.g. Linux) is the same as the initial development. That’s because the platform being emulated (e.g. Windows) isn’t abstracted out in any way.

    Finally, the Emulation framework is never complete. No matter how hard you work you’ll never get the Mac to look and behave like Windows even at the API level. It simply can’t be done. There are controls that the Mac has, that Windows doesn’t, and vice-versa. There are highlight and system colors that don’t have counterparts. This will trip up your Windows engineers who expect the Mac to behave like Windows. Since they don’t have any experience on the Mac they don’t know what to do, and you’re stuck.

    This is by far the worst way to approach cross platform development, and you should avoid it at all costs. I’ve worked on some large scale projects that used this approach and it was horrendous. The Windows programmers were encouraged to be clueless about the Mac and to hold up that development. It became difficult to do anything with the emulation layer except tread water. This led to frustration and burn out. In other words, this approach is a great way to lose all your Mac engineers.

    Maintenance: D
    Initial Cost: F
    User experience: C

  2. Bridge Pattern

    This is what Qt does. You write an entire application framework that entirely abstracts out all platform details, and provide a completely consistent cross platform interface. As a result, this architecture tends to like to do everything manually. Qt does not use native controls, but draws the controls itself. It is yet another API set to learn but it’s really the only one you need to know (almost). You write to this one API and it works on all the platforms.
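
    Here is a stripped-down sketch of the pattern, under the assumption that the framework draws its own controls. None of this is Qt’s actual API: `Painter`, `Button`, and `RecordingPainter` are hypothetical names, and the recording class stands in for a real per-platform drawing back end.

    ```cpp
    #include <string>
    #include <utility>
    #include <vector>

    // Implementor side of the bridge: the few drawing primitives that each
    // platform has to supply.
    class Painter {
    public:
        virtual ~Painter() = default;
        virtual void fillRect(int x, int y, int w, int h) = 0;
        virtual void drawText(int x, int y, const std::string& text) = 0;
    };

    // Abstraction side: one Button class for every platform. The framework
    // draws it by hand instead of asking the OS for a native button.
    class Button {
    public:
        Button(std::string label, int x, int y, int w, int h)
            : label_(std::move(label)), x_(x), y_(y), w_(w), h_(h) {}

        void paint(Painter& painter) const {
            painter.fillRect(x_, y_, w_, h_);
            painter.drawText(x_ + 4, y_ + 4, label_);
        }

    private:
        std::string label_;
        int x_, y_, w_, h_;
    };

    // Stand-in back end that records calls instead of drawing (hypothetical).
    class RecordingPainter : public Painter {
    public:
        std::vector<std::string> calls;

        void fillRect(int, int, int, int) override {
            calls.push_back("fillRect");
        }
        void drawText(int, int, const std::string& text) override {
            calls.push_back("drawText:" + text);
        }
    };
    ```

    Application code only talks to `Button`, and porting to a new platform means writing one more `Painter`. The price is that the button looks and behaves however the framework draws it, not however the OS would.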

    The benefit is obvious — cross platform consistency. You can just write to the framework and it will just magically work on every platform. Since the framework does everything manually, there is a lot of control to force the consistency.

    There are a few disadvantages to this approach. First and foremost, it’s usually expensive. Either you have to buy a framework that does this (like Qt) or develop it yourself. Unfortunately this approach doesn’t really lend itself to an incremental implementation. To do the most basic thing you need a fairly complete application framework on each of the platforms.

    Second, the framework might not be as native as you’d like on each of the platforms. Qt has this problem. To ensure the messaging and event system works completely the same on each platform, the framework will often reimplement large parts of the native platform. This makes it easier to program against, but hurts the user because it doesn’t feel completely native.

    Lastly, the framework is yet another API to learn. Both your Mac programmers and Windows programmers have to learn it, if they do not already know it. If, for whatever reason, you need to add a custom control, then you have to start digging around in platform specific code. In that case, the bridge abstraction does not help you. Therefore, having someone fluent in Qt but not Mac and/or Windows (or vice versa) isn’t useful.

    Maintenance: B
    Initial Cost: F
    User experience: C

  3. Core/Edge

    This is a fairly simple idea. You write all your application logic in Standard C++ (or whatever common language and library). This is the “core” part, and should make up the majority of the code. In a MVC architecture, it should optimally be the model and controller parts. You also write core interfaces (class and function declarations) to access the platform specific code, such as the UI (aka the view part). The platform specific code, called the edge code, is written using a native framework or API’s.

    This strategy has several advantages. First, it gives the user the native feel and behavior of the platform because you end up using the native framework or API. Core/Edge doesn’t favor one platform over the other, but treats each platform equally. That means each can use the appropriate technologies without having to emulate the other platform. It also means if one platform has a technology that another doesn’t, it’s easy to use it without hurting the other platform.

    Second, it can be implemented incrementally. Since you’re using the native API’s you just start building the app on each platform as you normally would. As you start getting into common logic, or model classes, or controller classes, you simply switch to Core code. That means you write in your standard language with its standard library. (By standard I mean the language that is implemented on both platforms, like C++.) Typically your files will be broken in Core files and Edge files. If Core code needs to call back into Edge code, it calls into a Core interface. The Core interface only takes types defined by the standard language or the Core itself. The Core interface is then implemented by Edge code.
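
    The Core-calls-Edge arrangement described above can be sketched roughly like this. All the class names are hypothetical, and `RecordingView` is only a stand-in: a real Edge class would be written against Carbon, Cocoa, or Win32 and would live in a platform specific file.

    ```cpp
    #include <cctype>
    #include <cstddef>
    #include <string>
    #include <vector>

    // ---- Core: portable code, standard C++ types only ----

    // Core interface: declared in Core, implemented by Edge code.
    class IDocumentView {
    public:
        virtual ~IDocumentView() = default;
        virtual void showTitle(const std::string& title) = 0;
        virtual void showWordCount(std::size_t count) = 0;
    };

    // Core controller: shared logic that talks to the UI only through the
    // Core interface.
    class DocumentController {
    public:
        explicit DocumentController(IDocumentView& view) : view_(view) {}

        void open(const std::string& title, const std::string& text) {
            view_.showTitle(title);
            view_.showWordCount(countWords(text));
        }

        // Counts runs of non-whitespace characters.
        static std::size_t countWords(const std::string& text) {
            std::size_t words = 0;
            bool inWord = false;
            for (unsigned char ch : text) {
                const bool space = std::isspace(ch) != 0;
                if (!space && !inWord) {
                    ++words;
                }
                inWord = !space;
            }
            return words;
        }

    private:
        IDocumentView& view_;
    };

    // ---- Edge: one implementation per platform ----

    // Stand-in Edge class that records what it was asked to display.
    class RecordingView : public IDocumentView {
    public:
        std::vector<std::string> log;

        void showTitle(const std::string& title) override {
            log.push_back("title:" + title);
        }
        void showWordCount(std::size_t count) override {
            log.push_back("words:" + std::to_string(count));
        }
    };
    ```

    Notice that `IDocumentView` mentions nothing but standard types, so the controller compiles unchanged on every platform, and each platform is free to implement the view with whatever native widgets it likes.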

    The disadvantage with this approach is that it is easy to degenerate into the Edge/Edge approach. Doing this means you end up duplicating a lot of controller logic on each platform, or perhaps even view code. Some care has to be taken in ensuring the maximum amount of code is written such that it is Core.

    Another disadvantage is GUI resources. Since a native framework is used, each platform is going to have a duplicate set of resources. For a given dialog, it will have to be re-layed out for each platform. If a control needs to be added, it will have to be done once for each platform. This actually also provides an upside, which is each platform can use the native layout.

    In my experience, Core/Edge is the best architecture to use. It has a relatively low cost, high code reuse, and a very native feel.

    Maintenance: B
    Initial Cost: A
    User experience: A

  4. Edge/Edge

    This is just what it sounds like. Usually there is some common application logic that is shared (as in Core/Edge) but the majority of UI controller code is just duplicated on each platform. There typically is not much effort to make sure any code compiles on both platforms, but each platform is built separately from fairly different codebases.

    The one advantage of this approach is that the application will feel very native, which the user will appreciate. The programmers for one platform can concentrate on making it highly integrated into that OS, and make use of new technologies.

    The major downside is very little code is leveraged or reused. That means most code ends up being duplicated, once for each platform. That can mean the application gets prohibitively expensive to maintain, and features cost a lot more to implement.

    Maintenance: D
    Initial Cost: B
    User experience: A

This list is not meant to be comprehensive, but to outline the architectures I have experienced. Applications can also use hybrid approaches that mix and match the architectures.

The future of cross platform architectures is going to be interesting. Recently the Mac and Windows API’s have been seriously diverging. Apple is pushing Cocoa, its Objective-C application framework. Not only is it in a different language, it’s much higher level than the Carbon API’s and has a different object model. Meanwhile Microsoft is pushing C# and .Net. Once again, a different language and a different object model. It is no longer as obvious how to abstract out both API’s.

As a result, the Emulation architecture, and to a lesser extent the Bridge architecture, become a lot harder to implement. Cocoa is a much higher level than MFC, so implementing MFC in terms of it would be near impossible. The Bridge architecture would also have to find some high level abstractions to properly encapsulate both .Net and Cocoa. In reality, it would probably have to continue what it is doing now: using very low level API’s and doing everything manually. Conversely, both Core/Edge and Edge/Edge would be able to support both Cocoa and .Net. Since the framework is application specific and incremental, its abstractions can be changed to encapsulate the new native frameworks with relative ease.

I will freely admit I am very biased when it comes to cross platform GUI architectures. But that bias comes from me being forced to use each of these in large software systems. From the perspective of this Mac engineer, the Core/Edge architecture is by far the best.

Filed in Macintosh, Programming | 3 responses so far

Photographic Evidence

andy on Jul 26th 2006

In order to prove that I exist, I have posted photos — in case you thought the photo to the left was doctored. I can assure you someone that homely does, in fact, exist.

OK, that’s not true, I just posted them so I could play with some new software. It’s the thought that counts, but apparently not as much as getting to play with new toys.

In the photo albums you’ll find pictures of myself and the very lovely Elaine, pictures of what my family might have looked like several years ago, and pictures of cats that I do not own. If you find the pictures too harsh to look at, simply put your thumb over my photo. That’s what I do.

Filed in Personal, Toys | One response so far

Witnessing God

andy on Jul 25th 2006

In the quagmire that is my brain, I recently went down a couple of paths of thought that led me to one conclusion about what else I should be writing in this blog.

The first train of thought was actually pretty short. I was simply noticing the categories that I have in this blog, and how many posts of each kind I have. (Yes, I know I have a lot of spare time.) Even though I have several in the Christianity category, they’re all about the recent missions trip I was on. There’s nothing wrong with that, but it sent me down the path of “is that all God has done in my life recently?” I don’t think those few posts are very representative of what God has done for me. I began musing about what I could possibly write about. The problem is: I’m not a teacher, I’m not a pastor or even a deacon, I’ve never been to seminary or been ordained. I definitely don’t feel comfortable or qualified for writing essays about theology or philosophy. Not being a teacher or inspirational speaker, I don’t think I could write posts about uplifting stories or Bible studies or devotions. So what is it that I could do?

The second train of thought got started during one of my quiet times. I was thinking about evangelism, and how I could grow in that area. The missions trip was the only conscious effort I’ve had recently towards evangelism, and I wanted to change that. But being an introverted self-employed software engineer, I spend most of my time at home. Regular social activities usually involve Christian friends or people from church. I don’t meet that many non-Christian people because of this, so who am I to be a witness to?

That’s where these two trains bumped into each other. The easiest way to evangelize to people is to tell them what God has done for you. All I need to do is tell about how I’ve seen God working in my life. I can leave the teaching up to the experts. As for whom I can be a witness to, I can be a witness to whoever reads this blog. I honestly don’t think there are all that many people who read this blog, but it is an audience, and it’s still probably larger than the number of people I talk to daily.

So from time to time I’m going to post about what God has done in my life. The first step is to post my testimony, which I’ve already done. I’m not sure how frequently I’ll end up posting, but I’m hoping with a certain regularity. If I’m feeling frisky, I’ll not only post what God has done for me, but what I learned from the experience and how I think it applies to my life. I’m hoping this forces me to be more aware and conscious of how God is moving in my life.

Filed in Christianity, Writing | 6 responses so far
