Big Is Ugly
Week of July 29, 2002
Dispelling a myth.
One of the common misconceptions of the mission of Radsoft Laboratories is that it implies an elite group of intensely focused technicians poring over source code as through a microscope to find ways to scrape minuscule pieces of flesh from already naked bones. Nothing could be further from the truth.
No one ever connected with Radsoft - there have been many outstanding contributors through the years - has ever worried about bloat or executable size in the final product. Such issues simply have not existed in this context. What has happened is that the rest of the world has increasingly distanced itself from what once were - and still are at Radsoft - considered rather self-evident principles of software engineering.
Perhaps the foremost principle of any engineering endeavour is to never lose one's head, to never suspend one's common sense; yet the history of computer science is rife with examples of schools of thought that were ultimately abandoned because they were totally wacky. And anyone possessing common sense will affirm that waste must be calculated and factored into profit - and that only if that waste be held to an acceptable margin can the profit be taken seriously.
In arguing for the use of C rather than assembler in the coding of operating systems, Bell Labs software guru-grandpa Brian Kernighan weighed the advantages against the trade-offs and offered the conclusion that the idea was workable. Kernighan produced statistics that showed a 10% waste when coding in C compared to assembler, which also corresponded to a 10% decrease in operational efficiency.
This was acceptable, said Kernighan, because the rewards were so great. Software had replaced hardware as the most expensive component in computer systems; the proliferation of new processor architectures made it difficult and expensive to port software to new platforms; and the risk for grave error increased as new code had to continually be rewritten from scratch.
Today all sophisticated operating systems and most device drivers are written in C. Dave Cutler, architect of VMS and Microsoft NT, was an avid fan of C, and initially wanted to rewrite VMS in it (his proposal was rejected). Cutler wrote his follow-up Prism in C (and took the source code cross-town to the Microsoft campus where he continued development as an employee of Bill Gates). He devised a system interface to devices that allowed - some would say forced - driver writers to use the same language. And he isolated the entire platform-dependent interface in a single, trivially small module: HAL, the hardware abstraction layer.
And Unix is of course written in C, and Unix is the reigning operating system of the day (some would say it is the only operating system).
Thus the accuracy of Brian Kernighan's original postulate is far beyond question. Kernighan was right: it was definitely worth the 10% waste. And today, with the increased efficacy of compilers, that 10% is undoubtedly much lower.
Not all new ideas are as intelligent. The 1980's saw a slew of ideas pop up like poison mushrooms, all based on the modular school of programming brought on a decade earlier. Elixirs such as Warnier-Orr and Jackson Structured Programming were nothing more than get-rich-quick schemes by unsavoury individuals who knew how to exploit typical IT management. No one talks of Warnier-Orr today, and after going bankrupt in England, Jackson was bought up for one pound sterling by a Swedish firm which succeeded in prolonging the agony for another few years before even that idea went tits up.
Out in California things were moving towards the graphical interface. The Xerox Palo Alto Research Center (PARC) had begun work on a system called Smalltalk, and the engineers in the Smalltalk project immediately recognised the need for further modularisation of system code.
A simple example will illustrate their discovery. Imagine a dialog with several entry fields, where you the user are expected to input data into each field. As you type your input, the data will appear. The dialog has the responsibility of not only accepting your input, but also of retaining it and displaying it. If your dialog is temporarily covered by another window, it must find the text you have entered and display it again when the other window is gone.
The prospect of loading new code modules to handle each and every entry field in the system - not to mention the plethora of push buttons, check boxes, radio buttons, static text label fields, scroll bars, combo boxes - and not to mention common behavioural denominators such as minimise, maximise and close buttons, caption bars, window borders and frames, etc etc etc - was obviously inadmissible. PARC needed to factor out this logic.
What followed was a kind of 'object orientation', wherein the various components of the graphical desktop were regarded as objects, and structurally held together by their behavioural code. While each instance of an entry field would display its own text, the code to accept this text, retain it, and display it would be centralised and instantiated only once, no matter how many entry fields were in use.
Even hard-core Windows programs such as Radsoft's are in this sense 'object oriented': each window is an instance of a 'class' which defines its behaviour (and appearance), and all instances are managed by the same code, instantiated (loaded into the computer's memory) only once.
Smalltalk went on to revolutionise the world of computer science. Although Steve Jobs already had plans for a GUI before he was formally introduced to PARC's work, he was no doubt strengthened in his resolve by what he saw there (he also hired away PARC's most formidable engineers). The Macintosh led to similar systems for the PC and other platforms, and the world never looked back. And these systems were designed to conserve resources - not to waste them.
Coding a GUI application is not a trivial task. Estimates are that well over half the code deals with the interface, while what is left contains the actual logic - the developer's workload doubles. And the underlying programming principle - what is known as 'event-driven programming' - is startling, not to say insurmountable, to many developers. The application programming interface (API) can be more complicated by a factor of twenty or more. Where developers rarely had to consult their manuals, they suddenly needed extensive (preferably onscreen) documentation, as most of their code uses system functions which can never be committed to memory.
And still the developer must build an adequate, working overall picture of how the system works. The developer must know, for any given requirement detail, what controls to use, where the undocumented features are, where the documentation is incorrect - the developer must become an adept in a new black art. This is a target most IT managers refuse to believe their subordinates can reach.
When GUIs - primarily through the sudden success of MS Windows - broke on the computing scene, IT managers were quick to announce, yea to boast, that they were going to transform their departments and migrate their software to the new platform in a matter of months. When this proved impossible, they panicked and searched for an alternative to vanilla systems development, and found it in 16-bit Visual Basic.
Visual Basic, the lovechild of Bill Gates (who reputedly knows a bit of BASIC and nothing else), was merciless with machines of that era. Loading a single VB program could sink the most stalwart machine. Clearly, Visual Basic was not the answer, yet there was no alternative.
Other development ideas gradually surfaced. Borland and Microsoft touted proprietary class libraries, other software companies developed their own systems, and many of these companies gravitated towards C++. Using C++, both Borland and Microsoft were able to ease the GUI programming learning curve. Borland's system was highly abstracted, while Microsoft's was basically a 'one-to-one' mapping onto their existing API.
Yet no one bothered to calculate the waste. Program skeletons created by either Borland's or Microsoft's system were horrendous, abysmally bloated monsters. The Borland development environment itself could sink a box. And the calculated waste for Microsoft's offering was 1000% - one hundred times more than the acceptable 10% Brian Kernighan had cited twenty years earlier.
IT managers welcomed the new systems, while a few not totally asleep individuals reacted. Accepting a 10% trade-off - after carefully considering the alternatives - was one thing; mindlessly swallowing a 1000% waste was quite another.
The availability of Visual Basic - and later Delphi - also made it possible for amateurs to try their hand at programming. To be sure, their efforts would never be ratified by a professional programming department, but the programs would often work, and amateurs could not be expected to understand the underlying principles (and their flagrant violation).
As computer hardware became more powerful, IT managers reconsidered the VB and Delphi options, and more often than not allowed their use once again. Visual Basic and Delphi became corporate standards; use of even C++ was often severely frowned on - not to mention hard-core programming such as Radsoft's.
The problem with programs that are written efficiently and work well is that someone sooner or later will have to work with them, and the developers who write them are too expensive to ever be on staff. Because IT managers long ago gave up on the idea of properly written programs, programs that are properly written cannot ever be used. If staff only know Visual Basic, then all programs must be written in Visual Basic. If an expert consultant is brought in to write one Ferrari program in C, and if that program later needs minor adjustments, no one on staff will be able to do it.
C++ and the class libraries of Borland and Microsoft - and especially the development environments - are severely lacking. While C++ is far from a loveable language, its inherent inefficiency need not be so bad. Microsoft's development environment in particular suffers from gross sloppiness. Akin to the medieval surgeon, Microsoft tosses junk into programming projects - junk never needed and never used.
Developers forced to use Microsoft's development system must learn where these inefficiencies lie. Most do not know, and many do not care. Executable sizes go ballistic. And suddenly people write to Radsoft, asking how to do the impossible.
And while object orientation is today thriving at Apple, where the superior NeXTSTEP/Objective-C system is in use, it is otherwise on the ropes, with visionaries now predicting its imminent downfall. But whatever the overnight self-made gurus come up with this time, they would serve us better by taking the time to calculate waste before they take that first giant step. They would profit by remembering the foremost principle of any engineering endeavour.