Rant of the Week




WinFS Is Dead - Long Live WinFS

Week of August 28, 2004

Microsoft have announced the demise of their WinFS file system - again. Is it really gone?

The Seattle Post-Intelligencer reports that WinFS is no more. Speculation is that the Redmond beast wants to concentrate more on security. But is that really the issue? To get the background on this one it is necessary to go back - to go way way back - to the dawn of civilisation. And if you, dear reader, think this rant has lost its way, just hold on. In time all will become clear.

Once upon a time - over twenty years ago in fact - there was very little. Anywhere. There was only the IBM PC. This PC ran in text mode (mode 7) on a monochrome display adapter (MDA) which put on screen whatever sat in video memory at segment address b000. The screen was a succession of character and attribute pairs - 80 columns by 25 rows, with two bytes for each place on that map, 4,000 bytes in all. That was reality.

Under the bonnet - down below - was something called MS-DOS or PC-DOS. The rage of the age was a version called 2.11. It had nuances of Unix in it. But it was basically only the basics and little more than a 'hardware interface'.

You could program video memory, which was a big boon - Unix terminals struggled with the same thing and foundered. All you had to do was kick the adapter into text mode and start addressing things up at segment b000 (later segment b800, in mode 3, for the colour graphics adapters CGA and EGA). One character, one attribute, one character, one attribute - that was it.
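For the curious, the character-and-attribute layout can be sketched in a few lines of Python. This is only a simulation: a bytearray stands in for the adapter's memory, and the names `video`, `cell_offset` and `put_text` are made up for illustration.

```python
COLS, ROWS = 80, 25        # text-mode screen dimensions
BYTES_PER_CELL = 2         # one character byte followed by one attribute byte

# Stand-in for the adapter's memory (an assumption: a real PC mapped this
# at a fixed segment rather than in an ordinary buffer).
video = bytearray(COLS * ROWS * BYTES_PER_CELL)

def cell_offset(row: int, col: int) -> int:
    """Byte offset of the character at (row, col); its attribute is the next byte."""
    return (row * COLS + col) * BYTES_PER_CELL

def put_text(row: int, col: int, text: str, attr: int = 0x07) -> None:
    """Write text as character/attribute pairs; 0x07 is the normal attribute."""
    for i, ch in enumerate(text):
        off = cell_offset(row, col + i)
        video[off] = ord(ch) & 0xFF   # character byte
        video[off + 1] = attr         # attribute byte

put_text(0, 0, "Hi")
```

The whole screen comes to 80 x 25 x 2 = 4,000 bytes, and the second row starts at byte offset 160.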

Along comes a guy who sells a program to Philippe Kahn - the program was called 'SideKick'. SideKick was cool and some of it was brilliantly programmed. SideKick was the first famous 'TSR' - terminate and stay resident - program. It loaded but did not leave memory. It just disappeared from the screen. It found memory in the computer where it could hide. It polled the keyboard buffer, looking for a magic combination such as 'shift-shift' (both shift keys pressed at the same time) whereupon it would pop up on the screen over everything else that was already there.

SideKick used 'attributed' character strings - something Radsoft picked up on right away and incorporated into all their MS-DOS development libraries. It also used proprietary storage areas to pull off a very neat trick.

While working in almost any program at all, you could hit 'shift-shift' to call up SideKick and then select an area of the screen and have SideKick copy it. Then you hit 'shift-shift' again to make SideKick disappear, exited the program you were running, started a new program, hit 'shift-shift' again, and pasted SideKick's data in.

The clipboard was born.

Conferences were soon held by all the TSR vendors to arrive at a method for creating 'law and order' up there in the regions where TSRs liked to live, but the point was what they'd done with the clipboard. Another corporation was soon to show them all.

Lotus Development Corporation came out of nowhere and sold software for a staggering $45 million in their first year - and all for a product called '1-2-3'. 1-2-3 was the first 'office package' for the PC - and it used a clipboard mechanism.

You could copy and paste data between the applets in 1-2-3. A new technology was born.

Around 1990 it became clear Microsoft Windows would indeed be a winner, and Windows had its clipboard. Things might not have looked too spiffy back then, but it worked - mostly. The Windows clipboard wasn't perhaps the greatest at mixing data formats, but there was another issue that had to take precedence.

The idea of the clipboard is very simple. As SideKick showed, you simply keep memory of your own somewhere out of reach of the other applications. When someone wants to 'copy' something to the clipboard, you put that something in your memory area. When someone else - or the same application - wants to paste something in, you give them back the contents of that memory area.
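Reduced to its bones, the mechanism just described might look like the sketch below. The `Clipboard` class and its `copy` and `paste` methods are hypothetical names for illustration, not anything SideKick or Windows actually exposed.

```python
class Clipboard:
    """A private holding area that outlives whichever program wrote to it."""

    def __init__(self) -> None:
        self._data = None            # nothing copied yet

    def copy(self, data: bytes) -> None:
        # Keep our own copy, so the source program can change or exit freely.
        self._data = bytes(data)

    def paste(self):
        # Hand back whatever was last copied (None if nothing was).
        return self._data

clipboard = Clipboard()
clipboard.copy(b"A1: 42")        # 'program one' copies a piece of its screen
pasted = clipboard.paste()       # 'program two' pastes it in later
```

Note that nothing happens here unless somebody calls `copy` and then `paste` - which is exactly the manual part.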

The essence of this clipboard mechanism is that it is manual: it requires user intervention. The user must select the data to be copied, invoke the specific copy command, switch to the target app and denote the target location in that app, then invoke the paste command. The wizards at Microsoft sought a way of doing this without user intervention. They called their invention 'dynamic data exchange' (DDE).

DDE was very simply an automated clipboard mechanism. Told where the data would come from and who the target would be, it set up what was in effect a communications channel between two distinct running processes - and the 'pipe' they used for these communications was nothing but a clipboard.
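The 'automated clipboard' idea can be sketched as a toy, with the caveat that this is a model of the concept and not of the real DDE message protocol. The `Exchange` class, its `link` and `publish` methods and the item name `sheet!A1` are all invented for illustration.

```python
from collections import defaultdict
from typing import Callable

class Exchange:
    """Toy shared holding area that pushes new data to interested targets."""

    def __init__(self) -> None:
        self._values = {}                    # item name -> latest data
        self._links = defaultdict(list)      # item name -> target callbacks

    def link(self, item: str, on_change: Callable[[str], None]) -> None:
        """A target asks to be told whenever the named item changes."""
        self._links[item].append(on_change)
        if item in self._values:             # deliver what is already there
            on_change(self._values[item])

    def publish(self, item: str, value: str) -> None:
        """The source 'copies'; every linked target is 'pasted' automatically."""
        self._values[item] = value
        for callback in self._links[item]:
            callback(value)

exchange = Exchange()
received = []
exchange.link("sheet!A1", received.append)   # the target sets up the channel once
exchange.publish("sheet!A1", "42")           # the source changes its data
exchange.publish("sheet!A1", "43")           # no user intervention needed
```

After the two `publish` calls the target has seen both values without anyone selecting, copying, switching or pasting.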

Things got unwieldy after a time with DDE, so Microsoft wrote something they called the DDEML - the DDE Management Library. This was supposed to make DDE operations simpler. Most vendors found it made things more difficult, but that's not the point.

The point is some freakos in Redmond saw another extrapolation of this paradigm.

Up to now they'd only been dealing in transfers of static data; what if you could embed 'intelligent memory' from one application into another? Say you had a word processing document - could you in fact embed a part of a spreadsheet in it? And if so, could you perhaps - cross yer fingers - get the spreadsheet program to automatically take care of that little do-thingy in the word processing document? And even further: could you make it so that if a user changed the source of that spreadsheet do-thingy on disk, the word processing document would automatically reflect those changes too?

That's a lot.

Object Linking and Embedding (OLE) was born. OLE ran on Windows 3.1. It ran using DDE as a sort of 'carrier wave'. It could both take care of automated 'copy and paste' and also link in relevant objects based on actual disk files. If the contents of these files changed, the documents they were found in would change too.

Everybody thought this technology was astounding. Minds boggled. The OLE team kept going.

So far we're talking about documents, the OLE people said. What if we could extrapolate this to apply to anything anywhere? OLE2 - Object Linking and Embedding version 2 - was born.

OLE2 was based on something called 'COM'. 'COM' stands for 'component object model' and yes, it has a counterpart in the world of Unix called 'CORBA'. As the documentation went to pains to remind everyone all the time, COM was a binary interface - it wasn't program code. It simply delineated how objects might work together in memory.

But 'simply' was not a good operative word here. The original OLE2 spec was over 1,000 printed pages and the third party vendors went through the roof. How could they be expected to work with something so hopelessly exaggerated? But Microsoft would not budge. Some vendors tried their hand at it, but acceptance was not really there.

Kraig Brockschmidt, author of the original Windows calculator, was put in charge of the OLE2 project at Microsoft. Kraig programmed in C and nothing else - and OLE2 was written exclusively in C++. Kraig refused to learn C++. He would head the OLE2 team anyway.

Kraig wrote a book about OLE2 with the immortal opening phrase:

This is a book about fish.

Kraig did an admirable job of trying to explain this unwieldy technology and of addressing a number of concerns third party vendors had. One of the major concerns was about 'bloat on disk'.

OLE2 documents were sloppy. They contained objects from other documents, which in turn could contain objects from yet more documents. And so forth. Partly in the interests of speeding things up, certain concessions had to be made. It certainly wasn't efficient to save all this stuff every time it had to go to disk. If only a small part of a document were edited, it made more sense to save only that part. And OLE2 was never any sort of speed demon anyway, so this became essential.

What the OLE2 team came up with was something called 'storages and streams'. These could in some sense be equated to directories and files - except they were all in the same physical file all the time. A 'storage' was a location for information about streams, and a 'stream' was the equivalent of a file - some kind of data defining an object in the document.

If only one object in the document was edited, then only that stream had to be modified. The rest of the file - all its other streams, all its storages - could remain as they were.
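A toy version of the idea, to make it concrete. This is a sketch under loud assumptions: the real OLE2 on-disk format is far more involved, and the `CompoundFile` class, its methods and the stream paths below are invented for illustration.

```python
class CompoundFile:
    """Toy 'file system within a file': one buffer plus a directory of streams."""

    def __init__(self) -> None:
        self.data = bytearray()      # the single physical file
        self.directory = {}          # 'storage': path -> (offset, length)

    def add_stream(self, path: str, payload: bytes) -> None:
        """Append a new stream and record where it lives."""
        self.directory[path] = (len(self.data), len(payload))
        self.data += payload

    def read_stream(self, path: str) -> bytes:
        offset, length = self.directory[path]
        return bytes(self.data[offset:offset + length])

    def rewrite_stream(self, path: str, payload: bytes) -> None:
        """Overwrite one stream in place; every other byte is left alone."""
        offset, length = self.directory[path]
        if len(payload) > length:
            raise ValueError("sketch only handles same-size or smaller rewrites")
        self.data[offset:offset + len(payload)] = payload
        self.directory[path] = (offset, len(payload))

doc = CompoundFile()
doc.add_stream("doc/text", b"Hello world")
doc.add_stream("doc/sheet", b"A1=1")
doc.rewrite_stream("doc/sheet", b"A1=2")     # edit only the embedded sheet
```

Only the four bytes belonging to `doc/sheet` are touched; the text stream in front of it stays exactly where and as it was.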

[If you at this point think you know where all this is leading, and if you call to mind a certain quote from Shakespeare, then you are on the right track.]

The problem with storages and streams is that they tend to fragment. If a user deletes an object in a document, the OLE2 idea doesn't have an easy way of freeing the disk space its stream occupied for future use. Everything is more or less 'staked out' on the first major save and wherever everything is, that's where it stays. If things are added to the file, that's easy: just extend it. But when objects are freed it's very difficult to reclaim the space previously used.
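The arithmetic of the problem is easy to show with a toy that tracks only offsets and lengths. It assumes the simplest possible policy (new streams are always appended, deleted ones just lose their directory entry); the `AppendOnlyFile` class and its method names are invented for illustration and say nothing about how OLE2 actually allocates space.

```python
class AppendOnlyFile:
    """Toy container that only tracks where each stream sits in the file."""

    def __init__(self) -> None:
        self.size = 0                # physical size of the file in bytes
        self.streams = {}            # name -> (offset, length)

    def add(self, name: str, length: int) -> None:
        """Adding is easy: just extend the file."""
        self.streams[name] = (self.size, length)
        self.size += length

    def delete(self, name: str) -> None:
        """Deleting drops the directory entry; the bytes stay where they are."""
        del self.streams[name]

    def dead_bytes(self) -> int:
        """Space in the file that no live stream refers to any more."""
        return self.size - sum(length for _, length in self.streams.values())

    def compact(self) -> None:
        """The 'defrag': slide every live stream down to close the holes."""
        offset = 0
        by_position = sorted(self.streams.items(), key=lambda kv: kv[1][0])
        for name, (_, length) in by_position:
            self.streams[name] = (offset, length)
            offset += length
        self.size = offset

doc = AppendOnlyFile()
doc.add("text", 1000)
doc.add("chart", 4000)
doc.add("table", 500)
doc.delete("chart")          # leaves a 4,000-byte hole in the middle
doc.add("photo", 2000)       # goes on the end, not into the hole
```

At this point the file is 7,500 bytes on disk with only 3,500 bytes of live data in it. Calling `doc.compact()` gets the 4,000 bytes back, but only by rewriting the position of every stream after the hole - which is why defragging such a file is possible but painful.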

So Kraig Brockschmidt set about in his book trying to show everyone that things were not all that bleak, that OLE2 storages and streams could in fact be defragged - albeit with difficulty. Kraig's program was written for Windows 3.1 and not for the coming 32-bit platforms. The documentation said it would in fact not work on these new platforms. The shame of it was it didn't even work on the 16-bit platform it was written to work on.

OLE2 storages and streams technology lies at the bottom of a lot of Microsoft 'technology'. The algorithms used to manage the Registry (and bloat it) are all based on OLE2 storages and streams.

So is WinFS.

WinFS - or 'OFS' (object file system) as it was once called - is nothing other than a glorified OLE2 storage and stream. It establishes a 'file system within a file'. When things are complicated enough already on disk, Microsoft go - and in typical fashion - introduce yet another level of complexity nobody wants and nobody needs and then...

Precisely: OFS was slated for 'Cairo', the original plan for Windows NT 4.0. Voices were hushed world-wide. Vendors held their collective breath. Such a file system could spell doom for the foundering platform. But as things turned out, OFS was shelved, vendors broke out the champagne, Bill Gates announced Cairo was never an operating system anyway, and everyone got back to work.

Some ideas never die. The ones we notice the most are of course the bad ones. WinFS - the latter-day incarnation of OFS, OLE2 storages and streams on disk as a true file system within a file - is such a bad idea. That it's now been shelved yet again must send waves of rejoicing far and wide.

But it's been gone before, just like Freddy, and it could be back again at any time.

Don't get too complacent.