The Bug of Quality

One of the perks of being a contractor is that I get to see all of the different ways companies handle the software process, including the quality aspect of a product. This exposes me to different schools of thought on how to improve and maintain quality, and lets me evaluate them on their effectiveness. It is quite the learning experience.

As you might expect, everyone manages the quality assurance process differently. However, companies can be broken down into two basic groups: those who know what they’re doing, and those who would better serve society by violating child labor laws while manufacturing some shoddily made rusty springs targeted as toys for preschoolers.

I’ll let you decide which group your company falls in.

Kicking the Bee’s Nest

Most companies in the latter group place a lot of emphasis on the quantity of bugs as a measure of how well the Quality Assurance group is doing. The result of this is an overflow of worthless bugs, and an irate quality assurance group who doesn’t want to talk to anyone from whom they can’t steal bugs. This can lead to ridiculous extremes where time is wasted just so QA can have another bug attributed to them.

In one instance I wrote a bug describing how a program’s updater functionality needed to display a progress indicator. I included steps for reaching the place in the program where the progress dialog should be, the actual results (no updater), and the expected results (a progress dialog). I even included what should be in the progress dialog. The QA assigned as the owner did not like the fact that she did not get credit for the bug. So my bug was deep-sixed, and a new bug was given to me (credited to the QA) with simply the phrase: “The updater should show progress.”

Wow.

This just goes to show that measuring by quantity means QA is going to try to game the system. That is, they’re going to find the easiest-to-spot bugs and write those up. This leads to a lot of typos and graphical nits being written up, while the really bad, functional bugs are ignored, because finding those requires an investment of time. In other words, the quality of the software goes down while the number of bugs being written goes up. And management never has a true picture of the software quality.

Moreover, an emphasis on quantity means that bugs are going to have incomplete information. Engineers will either have to send the bug back to QA to complete it, which wastes time, or figure out the missing information themselves. This is problematic because the engineer’s time costs more than the QA’s time, at least in any company I’ve ever worked for.

Too much emphasis on the quantity of bugs will give you the wrong results. Actually, let me take that a step further: any emphasis on the quantity of bugs will give you bad results.

Grading quality assurance people by how many bugs they write up is like grading engineers by how many lines of code they write. The quantity of bugs has absolutely no correlation to the quality of software, or even the performance of a particular QA. Sixty cosmetic bugs don’t indicate a bigger problem than six crashing bugs. Remember, nothing improves in the software unless you can actually do something with the bugs that are written up.

Quality is the Queen Bee

To use an overused phrase: It’s about quality, not quantity. So what is a quality bug? Well, to start off with, let’s define what a bug is:

A bug is a description of a defect in the software.

Although that is an accurate description of what a bug is, it doesn’t help us determine what a quality bug is and isn’t. To do that we need to remind ourselves of the purpose of a bug:

The purpose of a bug is to reliably identify flaws in the software, and ultimately allow the software to be improved in some way.

I’d like to point out that “improving the software in some way” does not necessarily mean fixing the bug. It might mean providing a workaround, or just documenting the bug’s existence so management knows about it.

In order to do anything with a bug, it has to be reproducible. This is probably the most important aspect of a bug. If the engineer is going to fix it, he or she needs to be able to see what’s going on. For this reason, it is important that each and every bug have precise steps on how to reproduce it. I can’t tell you how many bugs I have been assigned that say “feature Foo does not work,” with no steps on how to reach feature Foo, which is invariably an obscure feature only accessible by navigating through no fewer than three menu options, six modal dialogs, and the Secret Squirrel handshake.

Furthermore, a bug should contain any additional information that will help reproduce it. For example, a screenshot of the dialog that’s all wonked, or the actual file that QA used to cause the crash they’re reporting. Anything that helps the reader of the bug reproduce it improves the quality.

Next, and although this might seem obvious, the bug needs to state what’s wrong. A lot of people in the QA profession seem to think that after giving steps, the defect will be obvious. Unfortunately, often it’s not. Because of this, every bug must contain the actual results of executing the steps, and the results QA wanted to see (the expected results).

For example, I once received a bug that described, in excruciating detail, how to access a certain dialog. I easily repeated the steps, and stared at the dialog for half an hour. I couldn’t figure out what was wrong with it. The wording was correct, the UI was laid out fine, and it had all the functionality of its Windows counterpart. After sending the bug back to the QA, I found out that she simply didn’t want the dialog to exist at all. I’m still trying to figure out how I was supposed to know that.

Bugs should be self-contained. Anyone who looks at the bug should not have to go digging through various documents and file servers just to replicate the bug, or determine what it’s about. Once, I received a bug that simply said:

Program violates Section VII.A.2.c of the spec.

My jaw just dropped when I read this. First, I didn’t have access to said spec, and assuming I did, I wasn’t about to go read it just to figure out what the author was talking about. Mainly because it still wouldn’t tell me how to reproduce the bug or in what way it violated that portion of the spec. The author of the bug had determined all of this information, and should have recorded it in the bug. Wasting the reader’s time is the mark of a low quality bug.

Probably the easiest way for me, as an engineer, to determine who on the QA team is merely adequate, and who is a superstar, is to see how well they whittle down the required steps to reproduce a bug. The merely adequate will simply use the first set of steps they found to reproduce the bug. The really exceptional QA will go a step further and determine which steps are actually required, and which can be removed and still have the bug reproduce reliably.

The benefit of having the minimal number of steps to reproduce is twofold. First, it saves the reader time when reproducing the bug. Second, it helps zero in on which piece of functionality in the program is actually broken. That’s beneficial to me as the engineer, because I can narrow down my search for the bug in the codebase. It’s beneficial to the manager, because he or she has a better idea of which components in the software are stable and which need more work.

Back to the Hive

In order to “measure” how well Quality Assurance is doing, some companies have completely lost the point of writing a bug. The point of a bug is to reliably identify a defect in the software, so that it may be addressed. If the bug can’t reliably do that, it is not worth anything, and in fact, has a negative worth because of the time wasted creating it and the time trying to reproduce it.

In order to get worthwhile bugs, the emphasis must be on quality, with no attention paid to quantity. Quality demands that bugs always contain certain information, such as steps to reproduce, actual results, and expected results. This allows the development team to have a better understanding of the quality of the product, and to take the proper action.

Java vs Objective-C: The Smackdown

I’m in the process of translating an incomplete Java program I wrote into an Objective-C program. It’s a simple interpreter for a very simple kid programming language. At the time, it had been a while since I had dabbled in Java, and I thought it would be fun to get back into it. After all, it had “generics” and whatnot now, and other features that C++ programmers claimed it needed to be a “real” language. And what could be more fun than the overuse of angle brackets, incomprehensible error messages about type conflicts, and sounding all smart rambling on about “generics?”

Gooey Troubles

The Java programming went well until I had to start implementing the user interface for the interpreter. Now, the language uses lots of graphics and such, because it’s supposed to be visual, and thus more fun for the kiddies. Although I remembered Java as being a nicely designed language, with a good runtime library, I had forgotten how bad the GUI programming was. Perhaps I was simply repressing the traumatic memories.

First off, there are a lot more choices for GUI programming in Java now than there were the last time I did Java programming. Back in the day, it was AWT (Abstract Window Toolkit) or nothing. AWT wasn’t horrible, assuming you were a masochist, but you couldn’t precisely position things, and the controls never looked native. Oh, and it was insanely slow too.

Fortunately, today you have more than one choice when it comes to UI toolkits. There’s still AWT, but Sun also introduced Swing (as in, swing from a rope) to replace AWT, and IBM came up with SWT, which apparently stands for “Swearing With Toolkits.” Now, when I say Sun replaced AWT with Swing, I actually mean they confused the hell out of me. Because Swing doesn’t actually replace AWT. It kinda supplements it in some places, while in others outright replaces AWT classes. But good luck figuring out which class does what. After about an hour with the Swing documentation I decided it’d be easier to just implement the entire UI in ASCII art. Oh, and if you were worried about the non-native control look in AWT, don’t worry, Sun kept that feature for Swing.

Then there’s SWT, which apparently stands for Surely the Wrong Thing. It was created by IBM for the Eclipse project, because AWT and Swing were so awful. But don’t worry, IBM bested Sun at its own game and managed to create something that was even worse. The Mac version of SWT was written by a Mac-hating chipmunk. It uses old-style Carbon controls and QuickDraw. That’s right, not even HIViews and CoreGraphics, much less Cocoa. I’m hoping that since Adobe has an Eclipse plugin now, they’ll rewrite SWT/Mac to be something decent, much like what they did with Premiere.

There’s also the Java Cocoa bindings from Apple. They’re not all that up-to-date though, and Apple has said they’re not going to be supported any more. Furthermore, if I’m going to be writing to Cocoa, why am I writing this program in Java?

So that’s where I stopped with the Java project. I decided that I didn’t really need all this pain to do what should be basic GUI programming.

Learning from the pain

I have a confession to make: I like Java. Or, more precisely, I want to like Java. I like the simple language design. It’s easy to read and write. It comes with a great runtime library: built-in threads and sockets and all sorts of good stuff. I think there are a few things Objective-C can learn from Java, other than how to be tasty.

Probably the biggest thing I noticed when converting this Java program into Objective-C was the garbage collection. What used to be a one-line assignment in Java ended up being three or four lines in Objective-C to make sure everything was retained and released properly. I found that there was more Objective-C code than Java code, primarily because I needed to manually manage memory. When I write straight Objective-C code, I don’t really think about it, because where I started, in C++, memory management is even more manual. But doing the conversion really drove the point home: I’m wasting a lot of thought cycles and code on memory management. I also found that I suddenly had to worry about cyclic references, which I didn’t have to worry about in Java.
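
To make that concrete, here’s the kind of setter I was writing over and over (a sketch, assuming an NSString ivar named name; in Java the equivalent is a single assignment):

- (void) setName:(NSString*)newName {
	[newName retain];	// take ownership of the new value first...
	[name release];	// ...then give up ownership of the old one
	name = newName;
}

Retaining before releasing also happens to make the setter safe when the new value is the old value, which is one more thing Java never makes me think about.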

I know Objective-C 2.0 is going to have garbage collection, but it can’t get here soon enough for me. This conversion process only confirms that the feature is well overdue, and shows just how insufficient simple reference counting is.

Speaking of memory management, why is it that I have to manually call alloc and init? And inside my init, why do I have to manually call my super class and check self for nil? Why do I have to call my super class in dealloc? Maybe I’m delirious from all the reference counting I’ve been doing, but it seems to me like that’s all boilerplate code. Boilerplate that the compiler should be taking care of, not lazy ol’ me.
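
For reference, the boilerplate in question, repeated in nearly every class I write:

- (id) init {
	self = [super init];	// manually call the super class...
	if ( self == nil )	// ...and manually check for nil
		return nil;
	// actual initialization goes here
	return self;
}

- (void) dealloc {
	// release ivars here
	[super dealloc];	// manually call the super class again
}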

Other than memory management, my other gripe is exceptions. Objective-C doesn’t unwind the stack per se; it essentially just restores the stack and registers to a previous location, much like a thread context switch. In other words, it does a setjmp() and longjmp(). That means things get left dangling, memory leaks (’cause, you know, no garbage collection), and it’s generally harder to clean things up and get back to a known, stable state. I know exceptions still work this way for historical reasons, but they have to move on at some point. Currently the exception handling is pretty much worthless, which is probably why Apple still recommends NSErrors instead of exceptions. Exceptions aren’t worthless, just the way Objective-C implements them is.
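
Here’s a minimal sketch of why that hurts (stepThatMightRaise is a made-up method that raises an NSException): the raise effectively longjmp()s past the rest of the method, so the release never runs.

- (void) doWork {
	NSMutableArray* scratch = [[NSMutableArray alloc] init];
	[self stepThatMightRaise];	// if this raises, control jumps past us...
	[scratch release];	// ...and this line never executes, so scratch leaks
}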

No, it really was that painful

Now, lest you think I’ve gone all soft on Java, I have plenty of other gripes about Java. These didn’t come rushing back until I started the conversion to Objective-C.

Enums. Seriously, Sun, what up with that? One of the easiest programming concepts to understand, and one that’s equally easy to implement, and it’s completely not there. What, did you think I’d like to list out every constant, along with its type and value, instead of the compiler inferring the type and automatically generating a value for me? I may be a masochist, but I’m not that much of a masochist.
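
For comparison, here’s what I get for free in C, and therefore in Objective-C; the compiler picks the underlying type and assigns the values:

typedef enum {
	ShapeCircle,	// 0
	ShapeSquare,	// 1
	ShapeTriangle	// 2
} Shape;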

I’m still trying to figure out the mule-headed reasoning behind the “no free functions” rule. Is the thought that it would force users to write object-oriented code? As Sun should be well aware, you can write really bad non-object-oriented code using just classes. Just look at MFC. Furthermore, just because someone writes a free function, that doesn’t make the code not object oriented. This failure of the Java language just feels like a religious argument to me.

Java really strives to be easy to read and understand. This led to the rule that object parameters are passed by reference (strictly speaking, object references are themselves passed by value), while scalars are always passed by value. While this is somewhat easy to understand (see kids, this arbitrary type is reference, while this other arbitrary type is value), it doesn’t help real programmers. Because sometimes, I just have the crazy urge to retrieve the results of a function that’s done some work for me. I had more than a few classes that were just wrappers around out parameters.
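
For contrast, a quick Objective-C sketch of the sort of out parameter Java can’t express for scalars (the method name here is made up purely for illustration):

- (BOOL) parseQuantity:(NSString*)text intoValue:(int*)outValue {
	int value = [text intValue];
	if ( outValue != NULL )
		*outValue = value;	// scalar result comes back through the pointer
	return ( value > 0 );	// while the return value reports success
}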

And don’t even get me started on that stupid exception specification checking. Morons.

Ending the pain

Comparing Java to Objective-C is interesting to me because Java was influenced by Objective-C. All in all, almost every construct in Java has a direct equivalent in Objective-C, which makes the conversion process fairly straightforward. However, while Java has continued to evolve, Objective-C hasn’t. Apple is pushing forward with Objective-C 2.0 improvements, but the language has been dormant for many years. Hopefully, Apple will continue the improvements, and Objective-C can catch up.

As for Java, they just need to scrap all their previous GUI attempts and start over again. Until then, I don’t care how much support for “generics” or other modern language features Sun adds, I can’t take it seriously as a language or a platform.

Objective-C 3.0

My previous post about Key-Value Coding and being able to dynamically add transient attributes to NSManagedObject got me thinking. Why can’t I dynamically add methods? I don’t mean just on NSManagedObjects, I mean on NSObjects. I realize that categories allow that somewhat, but only as a fixed group of methods and only at compile time.
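
For reference, this is what a category gives me (the category and method names here are just illustrative); the set of methods is fixed when the code is compiled:

@interface NSObject (Logging)
- (void) logDescription;
@end

@implementation NSObject (Logging)
- (void) logDescription {
	NSLog( @"%@", [self description] );
}
@end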

Anyway, this train of thought got me to thinking about what I’d like to see in the next version of Objective-C. Since 2.0 is pretty much a done deal, I’m thinking forward to Objective-C 3.0. I also ended up thinking about what it would take to implement the proposed features, in hopes it would give me a better idea as to what was feasible and what wasn’t. That said, I could totally be smoking crack, and none of this may be feasible.

Anonymous Functions

In the new Objective-C, I want at least some basic anonymous function support. To use JavaScript as an example:

this.foo = function() { alert('foo called'); }

For the JavaScript uninitiated, this adds a function called “foo” to the current object. In Objective-C, it would end up looking like:

[self addMethod: @function() { NSLog(@"foo called"); }
	forSelector:@selector(foo)];

As shown, this should be easily implementable because the anonymous function only touches global symbols. How to access globals doesn’t change from a normal function to an anonymous function.

Accessing instance variables in anonymous functions

Things get more interesting, and more difficult to implement, if you allow the code block to touch instance variables and methods. For example:

@interface Foo : NSObject {
	int count;
}
- (float) cost;
@end

@implementation Foo

- (id) init {
	[self addMethod: @function() {
		NSLog(@"%d items at %f each",
			self->count, [self cost]);
		}
		forSelector:@selector(display)];
	return self;
}

@end

In the above example, accessing the method cost should be pretty trivial to implement inside of an anonymous function. self is just a parameter to the function, and the compiler just needs to send a message to it.

Unfortunately, accessing the ivar, count, off of self, is probably impossible to implement. When the anonymous function is compiled, it doesn’t know which class it is attached to. This is a problem because in C, structure accesses are just byte offsets from the address of the structure. On any given class, count could be anywhere within the structure. Since the anonymous function doesn’t know which class to look at, it doesn’t know what byte offset to generate.
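
In plain C terms (an illustrative struct, not a real Objective-C ivar layout):

#include <stddef.h>

struct Example {
	double balance;
	int count;
};

int fetchCount(struct Example* e) {
	// e->count is really "base address plus a fixed byte offset"
	return *(int*)((char*)e + offsetof(struct Example, count));
}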

Fortunately, it looks like Objective-C 2.0 is going to add property support, which is basically syntactic sugar. Properties automatically generate the methods needed to access or mutate them. This brings properties into the realm of feasibility inside of anonymous functions, because they can be treated as method calls.
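
Roughly, based on what has been shown of the 2.0 syntax so far (the details may well change before it ships):

@interface Foo : NSObject {
	int count;
}
@property int count;
@end

@implementation Foo
@synthesize count;	// compiler generates -count and -setCount:
@end

Since self.count then compiles down to an accessor call, an anonymous function would only ever need to send messages, never compute byte offsets.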

Accessing local variables in anonymous functions

Although the ability to access and use member variables inside of anonymous functions is interesting, it’s also important to be able to access local variables. For example:


- (id) init {
	int count = 20;

	[self addMethod: @function() {
		NSLog(@"%d items", count);
		}
		forSelector:@selector(display)];

	[self foo]; // indirectly calls the anonymous function
	return self;
}

- (void) foo {
	[self display]; // invoke the added method
}

I should point out immediately that this example is an extremely contrived case, and doesn’t show how useful accessing locals from anonymous functions is. I need to introduce another concept before the usefulness becomes apparent. So for now, I’ll assume that it’s important, and just delve into how this might be implemented.

In the example given, count is a local in init, and it is used in the anonymous function. The compiler needs a mechanism to “bind” the local count to the anonymous function such that every time the function is invoked it can locate the local in memory. The compiler cannot assume that the anonymous function will always be invoked from the function that it was created in, but might instead be invoked by a function further down the stack. e.g. init calls foo and foo calls display. This fact rules out the possibility of passing in the local as a parameter to the anonymous function.

Normally locals are accessed by an offset from the base of a function’s stack frame. So if the code in the anonymous function can find the right stack frame for init, it can easily find the local count. It is also easy for code to walk up the call stack, and thus iterate the stack frames.

However, there is nothing currently in the stack frame that would uniquely identify a function. So the compiler would have to generate a context id for each function that defines an anonymous function, and generate code to store it in the stack frame. At runtime, the anonymous function would iterate up the stack frame until it found the correct context id, and then use the corresponding stack frame pointer as a base for the local’s byte offset.

The problem with this implementation is that I don’t remember there being a space in the stack frame to store a context id. It would have to be at a fixed byte offset from the base of the stack frame in order to be easily found. This is based on my memory of the PowerPC calling conventions; I unfortunately don’t remember my Intel calling conventions from my college days.

If the context id can’t be stored directly in the stack frame, then it’s possible to create a parallel stack which contains only the context ids. It would have to be stored as pairs of context ids and stack frame pointers. The downside of this approach is that care would have to be taken to ensure thread safety.
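
A rough sketch of what that parallel stack might look like (every name here is invented; a real implementation would be compiler-generated, and the storage would have to be made per-thread):

typedef struct {
	int contextID;	// identifies the defining function
	void* frameBase;	// that function's stack frame pointer
} ContextEntry;

static ContextEntry contextStack[64];
static int contextTop = 0;

// Pushed on entry to any function that defines an anonymous function,
// popped again on exit.
static void pushContext(int contextID, void* frameBase) {
	contextStack[contextTop].contextID = contextID;
	contextStack[contextTop].frameBase = frameBase;
	++contextTop;
}

static void popContext(void) {
	--contextTop;
}

// Called from inside the anonymous function to find its defining frame.
static void* frameForContext(int contextID) {
	int i;
	for ( i = contextTop - 1; i >= 0; --i )
		if ( contextStack[i].contextID == contextID )
			return contextStack[i].frameBase;
	return NULL;	// the defining function has already returned
}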

Finally, there’s an obvious problem with the lifetime of local data. After init returns, count ceases to exist, and any call to display will result in a crash. Although I was at first tempted to try to fix this, I actually think it’s OK to allow the crash at runtime. This wouldn’t fly in the interpreted language world (like Ruby), but Objective-C isn’t interpreted. Objective-C programmers know they shouldn’t return pointers to locals, so creating anonymous functions that reference locals that will go out of scope shouldn’t be foreign or surprising to them.

Closures

Of course, the whole reason I bring up anonymous functions is because I want closures. Let’s take an example from Ruby:

array.each { |element| print element }

For those unfamiliar with Ruby, some explanation is required. array is, um, an array, and each is an instance method on array. each walks each element in the array and calls the anonymous function attached to it, which is everything inside the curly braces, passing in the element. Parameters to the anonymous function are declared inside the pipe characters.

In Objective-C, the code would look something like:

NSArray* array = ...;
[array each] @function (id element) {
	NSLog( @"element: %@", element );
}

Where the implementation of each would look like:

- (void) each {
	NSEnumerator* enumerator = [self objectEnumerator];
	id element = nil;
	while ( element = [enumerator nextObject] )
		[closure yield:element];
}

Most of the above code is self-explanatory. The new bit is the closure object. Like the self object, it is a hidden parameter passed into the method, except that it points to the anonymous function. The yield method takes a variable argument list and actually invokes the anonymous function.

The interesting thing about the above code is that it implies anonymous functions can be encapsulated by objects. I’m not sure how feasible this is, however. An alternate syntax might be:

- (void) each {
	NSEnumerator* enumerator = [self objectEnumerator];
	id element = nil;
	while ( element = [enumerator nextObject] )
		@yield(element);
}

The only gotcha to the new syntax is that it doesn't allow the called method to determine if there's a closure attached, and change its behavior. Why you might want this can be illustrated with an NSMutableArray initWithCapacity: method:

...
- (id) initWithCapacity:(unsigned) size {
	...
	if ( closure == nil ) {
		// Normal initialization with capacity
	} else {
		for(unsigned i = 0; i < size; ++i)
			[self addObject: [closure yield:i]];
	}
	...
}
...

NSMutableArray* uninitializedArray = [[NSMutableArray alloc] initWithCapacity:20];
NSMutableArray* initializedArray = [[NSMutableArray alloc] initWithCapacity:20]
	@function(unsigned index) {
		return [NSNumber numberWithUnsignedInt: index * 2];
	}

As you can see, initWithCapacity: changes behavior depending on whether an anonymous function is attached. If there isn't one, then it simply allocates an array of the given size. If there is a closure, then it calls the function to generate each element.

So it's advantageous inside of a method to know whether there is an attached anonymous function. If @yield is used, then you obviously lose this ability. There are ways around this, such as introducing a new hidden BOOL parameter, or another code flow construct. However, they aren't quite as elegant.

Examples

Earlier I mentioned that it is an interesting feature to allow anonymous functions to access local variables. This becomes more apparent when using closures. For example:

unsigned sum = 0;
[array each] @function(NSNumber* element) {
	sum += [element unsignedIntValue];
}

You also don't have to write separate functions for threads. With closures you could write something like:

[NSThread spawn] @function() {
	NSLog(@"Hello World from a thread.");
}

I can take my earlier example of dynamically adding a method and make it simpler with closures:

[self addMethod: @selector(foo)] @function() {
	NSLog(@"foo called");
}

Syntax

If you go back to the Ruby example, you'll notice that it has a much nicer syntax. Anonymous functions look like normal code blocks in Ruby. I don't think that would be reproducible in Objective-C because of the backwards compatibility with C. Furthermore, Objective-C likes to prepend an '@' to all of its additions to C. I'm not sure if that's for technical reasons or purely aesthetic ones, but it's unlikely that Objective-C would drop it for new features. In an attempt to clean up the sum example, it could look like:

unsigned sum = 0;
[array each] @ {
	@| NSNumber* element |

	sum += [element unsignedIntValue];
}

However, I'm not sure how much of an improvement this really is. I had to retain the '@' character, so the anonymous function still doesn't look like a normal code block. What's more, the use of pipe characters to declare parameters will probably look foreign to most people used to C. Using the C function style parameter list will probably feel more intuitive to more people.

Properties

Although I've focused on methods for the entire article, there's no reason why the same thing couldn't apply to properties.

For example:

[self addProperty: @"cost"] @function() {
	return 30.0;
}

NSLog(@"cost %f", self.cost); // prints 30.0

Of course, there's no reason why the property has to be backed by a method. It could be as simple as adding an entry to an internal NSMutableDictionary. In fact, if the caller doesn't provide a closure, then addProperty: could just add a dictionary entry:

[self addProperty: @"cost"] ;

And to initialize it with a default value:

[self addProperty: @"cost" withValue: [NSNumber numberWithFloat:30.0] ] ;

Conclusion

When I started writing this article, I was just thinking about the direction that Apple was going with Core Data. I didn't intend to end up talking about adding closures to Objective-C, but that's the logical progression. Despite all the rambling on my part, I didn't even touch other interesting features, such as built-in concurrency support. I suppose that's another post for another day.

Objective-C is unique in that it combines the low-level functionality of C with some really high-level features, such as the upcoming garbage collection, that aren't commonly found in compiled languages. The high-level features really improve the productivity of programmers, but they haven't been updated in a while (Objective-C 2.0 notwithstanding). I'm not sure if Apple will implement these features, or, if they do, whether it will be in the way I've implied, but I do hope that the language will continue to progress.