Long Live the Registry
Week of November 26, 2005
Microsoft Windows Vista was to finally dispense with the Registry, but it's proved impossible. The Registry will be with Windows users for the foreseeable future.
This is a fatal blow.
Microsoft were to finally morph the Registry into something better with their coming Windows Vista. This is no longer in their plans. Dispensing with the Registry proved impossible. Windows users may very well have been dealt a fatal blow.
What is the Registry?
The Registry is a repository for configuration data of all kinds. Before looking at what's in the Registry and how the Registry works, it might be good to look at how an ideal system would work.
All modern Internet connected systems have to be multiuser. This has been noticed and proven too many times over. Being multiuser means configuration settings have to be user-specific. Being multiuser means users have different privilege levels. All users are able to manage their own settings, but not necessarily those of others. Sun Microsystems are planning on introducing additional 'granularity' into this concept for a future release of their OS.
Configuration settings may be generally divided into a number of 'domains'.
- System Domain. These are settings that come from the operating system itself. The operating system may be booted locally or through a network; in either case these settings belong to the system domain.
- Network Domain. If the machine is connected to a network, settings found on this network are applied next.
- Local Domain. These are settings which are common to the entire local machine and pertain to all users of the local machine.
- User Domain. These are settings which pertain only to the current user.
- Command Line. The user is able to override user domain settings on the command line for the launch of a process. These settings will normally not alter the user domain settings beyond the execution of the associated process.
In the above scenario, the system will take the given domains in the above order, letting each successive domain override settings previously found.
As an example, postulate the operating system itself sets the default font as Helvetica, the local network sets the font as Monaco, the local machine sets the font as Lucida, the user sets the font as Arial, and on the command line the user alters this to WingDings.
For the process being launched, the system font would be WingDings; without the command line domain being used, the font would be Arial; without user specific settings it would be Lucida; without a local machine setting it would be Monaco; and without a connection to a network it would be Helvetica. The settings override in the above order.
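The override order can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular operating system implements it; the dictionary names and the resolve helper are invented for the example, and the font values are the ones postulated above.

```python
from collections import ChainMap

# Hypothetical settings for each domain, listed lowest priority first.
system_domain = {"font": "Helvetica"}
network_domain = {"font": "Monaco"}
local_domain = {"font": "Lucida"}
user_domain = {"font": "Arial"}
command_line = {"font": "WingDings"}


def resolve(*domains):
    """Combine domains given in override order (later domains win)."""
    # ChainMap searches its maps from left to right, so the order is
    # reversed to make the last domain listed the first one consulted.
    return ChainMap(*reversed(domains))


full = resolve(system_domain, network_domain, local_domain,
               user_domain, command_line)
no_command_line = resolve(system_domain, network_domain, local_domain,
                          user_domain)
no_user = resolve(system_domain, network_domain, local_domain)

print(full["font"])             # WingDings
print(no_command_line["font"])  # Arial
print(no_user["font"])          # Lucida
```

Nothing is copied or merged permanently: the command line layer exists only for the lookup of one process, which is why it leaves the user domain settings untouched afterwards.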
Given the above architecture, it becomes obvious that these settings are spread out around the local machine and the network. An ideal format for configuration files is XML so that even if a dedicated editor is offered, the user can always fall back on a plain text editor to view, edit, and repair these files as necessary.
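To make the point about plain text concrete, here is a small sketch of what such a settings file might look like and how a program could read it. The element names, attribute names, and the three value types are assumptions made up for this example; no real system's file format is being described.

```python
import xml.etree.ElementTree as ET

# Hypothetical user domain settings file. It is ordinary text, so any
# plain text editor can open it and repair it.
SETTINGS_XML = """\
<settings domain="user">
  <setting name="font" type="string">Arial</setting>
  <setting name="fontSize" type="integer">12</setting>
  <group name="editor">
    <setting name="wrapLines" type="boolean">true</setting>
  </group>
</settings>
"""


def _convert(kind, text):
    """Turn the stored text into a typed value (types are assumed)."""
    if kind == "integer":
        return int(text)
    if kind == "boolean":
        return text == "true"
    return text


def _collect(node, prefix, out):
    """Walk nested groups, building dotted names such as 'editor.wrapLines'."""
    for child in node:
        name = prefix + child.get("name")
        if child.tag == "group":
            _collect(child, name + ".", out)
        else:
            out[name] = _convert(child.get("type"), child.text)


def load_settings(xml_text):
    result = {}
    _collect(ET.fromstring(xml_text), "", result)
    return result


settings = load_settings(SETTINGS_XML)
print(settings)
```

Groups can be nested to any depth, and each value carries its own type, two things the older formats discussed later could not offer.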
It's also important that specific hardware configuration issues not be placed in the above domain system. Hardware and software have to be kept apart. Ordinary maintenance issues and incidents of user interference should not be able to jeopardise the low level stability of the system (and the network).
That at any rate is the ideal situation. Let's now compare this with Microsoft Windows by going back to the first years of its market success and seeing where Microsoft have taken things since then.
The first incarnation of the Windows Registry was called the 'Registration Database', found in versions 3.1 and forward. This 'database' admitted of one hierarchical key known as 'HKEY_CLASSES_ROOT' or 'HKCR'. The 'classes' the name referred to were the 'CLSID' classes, an extrapolation of the modular thinking of Multics, the system which preceded Unix in the operating system arena.
Modules are given 128-bit identifiers and information is kept in a repository so other modules can use them.
Also in this key - and used much more at the time - were the associations between file extensions, file types, and editors and viewers. A file ending with the extension 'TXT' might be classified as a 'txtfile' [sic] and 'txtfile' in turn set to be viewed and edited by 'NOTEPAD.EXE'.
The reason the extension is not tied directly to an application is that more than one file type might use the same editor. For example, 'LOG' files might also be edited by Notepad.
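The two-step lookup is easy to picture as a pair of tables. The sketch below uses plain Python dictionaries with invented names; it mimics the indirection described above and is not a reading of the actual Registry.

```python
import os.path

# Hypothetical excerpt of the two tables: extension -> file type,
# and file type -> program used to view and edit it.
EXTENSION_TO_TYPE = {".txt": "txtfile", ".log": "txtfile"}
TYPE_TO_PROGRAM = {"txtfile": "NOTEPAD.EXE"}


def program_for(filename):
    """Return the program associated with a file, or None if unknown."""
    extension = os.path.splitext(filename)[1].lower()
    file_type = EXTENSION_TO_TYPE.get(extension)
    return TYPE_TO_PROGRAM.get(file_type)


print(program_for("README.TXT"))  # NOTEPAD.EXE
print(program_for("setup.log"))   # NOTEPAD.EXE
print(program_for("photo.bmp"))   # None
```

Because both extensions point at the same 'txtfile' entry, changing the editor for that one file type changes it for 'TXT' and 'LOG' files alike.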
When Microsoft unveiled Windows NT in the early 1990s, the significance of HKCR became apparent: the Registry divided settings into two major categories, local machine and users. (The astute reader might wonder if this is true, as a Registry editor will display more keys, but the others are simply aliases to these two.)
- HKEY_LOCAL_MACHINE (HKLM) - All settings pertaining to the local machine. HKEY_CLASSES_ROOT is sorted under this key as an alias to HKLM\Software\Classes.
- HKEY_USERS (HKU) - All settings pertaining to all users of the local machine. HKEY_CURRENT_USER is sorted here.
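The alias idea can be shown with a short path rewriting function. This is a simplified model that follows the description above and does not touch a real Registry; the security identifier used for the current user is a made-up placeholder, and the function name is invented for the example.

```python
# Placeholder identifier for the logged-in user (not a real SID).
CURRENT_USER_SID = "S-1-5-21-EXAMPLE-1001"

# Alias root -> the location it stands for, per the description above.
ALIASES = {
    "HKEY_CLASSES_ROOT": r"HKEY_LOCAL_MACHINE\Software\Classes",
    "HKEY_CURRENT_USER": "HKEY_USERS\\" + CURRENT_USER_SID,
}


def resolve_alias(path):
    """Rewrite a path that starts with an alias root to its real location."""
    root, separator, rest = path.partition("\\")
    return ALIASES.get(root, root) + separator + rest


print(resolve_alias(r"HKEY_CLASSES_ROOT\.txt"))
print(resolve_alias(r"HKEY_CURRENT_USER\Software"))
print(resolve_alias(r"HKEY_LOCAL_MACHINE\System"))
```

Every path, however it is written, ends up under one of the two real roots.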
The Windows Registry architecture has no counterpart to the command line domain. Registry settings cannot be altered on a per-program basis for troubleshooting or other purposes.
The advantages of using the Registry outweighed the disadvantages - at least when using the Windows platform - because the earlier system with INI files left so much to be desired.
INI files were pure text files of limited size (less than 64 KB) with a linear architecture incapable of storing anything but pure text (as opposed to XML, which can carry encoded binary data and supports an arbitrarily deep nested hierarchy).
INI files could only store strings; on reading, the system would only return strings or interpret the strings as integers. Developers wishing to write 'numbers' to INI files had to provide their own functions to do so.
INI files could not handle floating point numbers, arrays, dictionaries, booleans, or raw binary data. Having no hierarchy (and no on disk location to establish this hierarchy) INI files could not be user specific.
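The 'strings only' limitation is easy to demonstrate. The sketch below uses Python's standard configparser module purely as a stand-in for an INI reader; it is not the Windows API the article refers to, and the section and key names are invented.

```python
import configparser

# A hypothetical INI file: one flat level of sections, each holding
# key=value pairs that are stored as text.
INI_TEXT = """\
[Display]
Font=Arial
FontSize=12
Scale=1.25
"""

parser = configparser.ConfigParser()
parser.read_string(INI_TEXT)
display = parser["Display"]

# Every value comes back as a string, whatever it was meant to be.
raw_size = display["FontSize"]
print(type(raw_size).__name__, repr(raw_size))  # str '12'

# Turning text into numbers is left to the caller.
font_size = int(raw_size)
scale = float(display["Scale"])
print(font_size + 1, scale * 2)  # 13 2.5
```

There is also nowhere to put a setting inside another setting: a section cannot contain a section, so anything resembling an array or a dictionary has to be faked with naming conventions.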
Moving to the Registry meant using the 'HKEY_CURRENT_USER' alias which meant developers didn't need to create their own settings architecture; they simply looked at settings for the 'current user' to read and write what they wanted.
On Disk Architecture
The Windows NT Registry is spread out in a number of on disk files called 'hives'. The user specific hive is in one location; other aspects of local machine settings are in others. The Windows directory 'repair' contains most of these files as per the user's last backup of the Registry.
The hive files are 'binary': they are not readable by an ordinary plain text editor. They require use of a dedicated editor.
If the hive files become corrupt - a single byte misalignment is all it takes - the operating system may be unable to boot. The dedicated editor will be incapable of correcting the error.
Clearly the concentration of all hardware and system settings in a small number of on disk locations makes the Windows Registry an Achilles heel.
Where user domain settings stored in plain text (XML) format are accessible to users when and where the need arises and spread out in application specific files so users can simply delete them and go back to factory defaults if something goes wrong, editing any settings in the Registry is something that is directly discouraged - and with reason.
The dedicated Registry editor does not have 'undo', 'save', and 'do not save' options. All modifications are irreversible and take place in realtime. The risk of damage to the operating system is ever present. Normal users will normally not use the dedicated editor or otherwise access the Registry.
Even to the adept user the Registry, as depicted with the dedicated editor, will appear nearly unfathomable and unconquerable. Organised in a hierarchy reminiscent of a file system, it is nevertheless impervious to overview. Coupled with the Multics idea of class identifiers (and interface identifiers and so forth), it has its 'scare factor' even for the experienced administrator.
Because the Registry is so difficult to navigate and makes an overview impossible, it becomes the perfect hiding place for malware and all sorts of 'devious' activity. The ordinary user is never a match for the dedicated hacker; the hacker will know how to manipulate the Registry; the user will never find the hacker out, even with considerable skills.
Where the disk based domain system for configuration settings relies on the security model of the file systems to protect the system, the network, and the local machine from accidents, misuse, and abuse, the Registry model uses very few restrictions on total access. Most any Windows user can access - and damage - most any area of the Registry.
The Registry has areas not visible to even system administrators. These areas are owned by the built-in 'SYSTEM' account. Even the administrators might not be able to access them and modify them, and may even be incapable of seeing them.
But Windows administrators can assume ownership of 'SYSTEM' resources, after which they can be modified (and damaged) as any other Registry settings.
The Windows security model does not allow resources to be 'returned' to the 'SYSTEM' account ('SYSTEM' is never an actual login user). Once the resources are exposed they cannot be protected again.
Conflicts of Use
The sorting of HKEY_CLASSES_ROOT under HKEY_LOCAL_MACHINE has several rather undesirable consequences. As HKCR determines what programs will be used to open what files, the settings become machine specific rather than user specific. This is counterproductive.
If Harry wants a double-click on his DOC files to open them in his licensed copy of MS Word, he is free to do so; but if Sally prefers OpenOffice and wants to change the association, Harry's workplace will be affected as well. Sally will break Harry's coupling with MS Word.
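The difference between a machine-wide table and per-user overrides can be sketched as follows. The dictionaries and user names are illustrative only; the second half shows the kind of layering an ideal system would use, not anything Windows is claimed to do here.

```python
from collections import ChainMap

# Machine-wide model: Harry and Sally share one and the same table.
machine_associations = {".doc": "MS Word"}
harry_shared = machine_associations
sally_shared = machine_associations

sally_shared[".doc"] = "OpenOffice"
print(harry_shared[".doc"])  # OpenOffice: Harry's setting changed too

# Per-user model: each user gets a private layer over machine defaults.
machine_defaults = {".doc": "MS Word"}
harry = ChainMap({}, machine_defaults)
sally = ChainMap({}, machine_defaults)

# ChainMap writes only to its first mapping, the user's private layer.
sally[".doc"] = "OpenOffice"
print(harry[".doc"])  # MS Word
print(sally[".doc"])  # OpenOffice
```

In the second model Sally's preference lives only in her own layer, so the machine defaults and Harry's workplace are left alone.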
Although the system pays homage to the idea of multiuser machines, it does not implement this satisfactorily.
[Strict division of permanent storage space into user specific areas is another matter. It is not covered here.]
The Foreseeable Future
Clearly the Registry represents a great part of all the 'evils' for which Windows has become notorious over the years: it is an Achilles heel, it does not have any domain hierarchy, it is mostly inaccessible to users while being wide open to hackers, it is the perfect hiding place for malware settings, it is not user specific as it should be on a modern multiuser computer.
Windows was never conceived as a multiuser system anyway. Its heritage comes from MS-DOS where the issue of who was using the computer was moot. The only security came from a lock on the front of the box or on the door to the room where the computer was kept.
The Internet changed all that: suddenly only the systems designed as multiuser from the get-go could survive. Wide open systems became sitting ducks. Malware came through the gaping security holes and hid itself in the wide open unprotected Registry.
While it is clear that too much has to change with its basic architecture to make Windows secure in this new Millennium, having the Registry relegated to the trash heap and replaced by a safer system would have meant a significant improvement in user security.
But as things stand, the attacks will continue with Vista, the hackers and professional gangs will continue to break in and hide where users can't find them, and the system will continue to crash, hang, and - as everyone is aware by now - grow sluggish over time.