More Disk Space™   


Technology Highlights   
Operating System: Macintosh OS 6.x - 7.x
Development Language: C and 68000 assembly language
Compiler: Symantec Think C
Lead Programmer: Justin Gray
Programmers: Justin Gray, Daniel Kopyc
Toward eDisk™ 
White Paper by Ron Hovingh 

The need for more storage space on Macintosh hard disks has never been greater. What Macintosh user does not yearn for a larger or additional hard drive — given today’s huge graphics files, swollen System Folders and bigger, more sophisticated software packages?

This white paper traces the history of compression utilities and hard disk storage on the Macintosh. It then explains how eDisk works to accomplish a significant evolutionary leap in hard disk capacity.

Most utilities designed to relieve cramped hard disks have been based on file compression. Although compression schemes and user interfaces have varied, file-compression software essentially rewrites file data so that it takes up less space on a hard disk.
Typically, compression utilities find a way to abbreviate redundant information ("6e" rather than "eeeeee") and express even complex patterns of data more efficiently. Enhancements to this basic premise have mainly involved "tighter" compression methods, as well as more user control and convenience.
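
The "6e" abbreviation above is the idea behind run-length encoding, the simplest member of this family. The following is a minimal sketch of that idea only; it is not the algorithm used by any Alysis product, and the function names are chosen here for illustration.

```python
from itertools import groupby


def rle_encode(text):
    """Collapse each run of identical characters into '<count><char>'."""
    return "".join(f"{len(list(run))}{char}" for char, run in groupby(text))


def rle_decode(encoded):
    """Reverse rle_encode. Assumes the original text contained no digits."""
    out = []
    count = ""
    for ch in encoded:
        if ch.isdigit():
            count += ch              # accumulate a multi-digit run length
        else:
            out.append(ch * int(count))
            count = ""
    return "".join(out)


print(rle_encode("eeeeee"))  # 6e
```

Note that text without repetition grows under this scheme ("abc" becomes "1a1b1c"), which is one reason practical compressors use more elaborate pattern matching.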

To put the latest and most dramatic developments into an instructive context, a brief retrospective is in order. 

Earliest storage capacity enhancements 
The oldest capacity enhancements required the Macintosh user to first create a special "archive" file, then compress data into that archive. As the name "archive" implies, compressed files then had to be retrieved (decompressed) to become usable once again. To make these manual steps less cumbersome, later versions added pull-down menus as a shortcut for "compress" and "decompress" commands. Yet another interface allowed the user to double-click the archive file just once to decompress it, although this remained a separate and additional step before actually using the file. 
Benefits of simple file compression include the ability to store more files on a given disk, as well as shorter times for modem or network transfers. However, drawbacks include the need to either decompress the files or distribute decompression software to make them usable by others. 

Evolution of "transparent" compression 
More recent utilities have promised "transparent" operation, with compressed files or applications looking and behaving as if they were normal. They decompress "on the fly" with little or no delay when accessed. Thus, the user is freed from consciously performing decompression as a separate, intermediate step before actually using the file. However, these utilities have generally not proved to be the perfect solution to inadequate disk space. Instead of delivering on the promise of "invisible" operation, some have interfered with indexing or optimization tools by changing file types. Many rely on extensions or control panels that use precious RAM on a full-time basis, conflict with other software, or render compressed files and applications unusable in their absence. Different packages also provide varying degrees of user control and flexibility over their compression operation. 

Despite advances with "transparent operation," there still appear to be two camps of Macintosh users: those who use file-compression utilities and enjoy their benefits, and those who have been disappointed by their limitations. Clearly, there is room for innovation beyond compressing files so that more of them fit on a given Macintosh hard disk. 

eDisk™ and the driver-level revolution 
A new approach on the Macintosh scene addresses storage capacity from the opposite direction. Instead of merely shrinking files, driver-level compression increases capacity of the hard disk on which files are stored.  
This is made possible by overcoming inefficiencies built into the original Macintosh Hierarchical File System (HFS). Standard HFS storage performance was adequate for the small hard disks (under 33.5 megabytes) that existed when it originated. However, its degree of "wasted space" has increased along with the size of hard disks. 

Partly by regaining this wasted space, eDisk™ can safely increase a given hard disk’s capacity by a factor of two, three or even four times. 
Thus, an 80-megabyte hard disk can be "seen" by the Macintosh as a 160-meg or even larger disk, with little or no price to pay in terms of performance, portability, or simplicity. 

Advantages of driver-level compression are rooted mainly in its separation from all but one operation: accessing the hard disk. This all but eliminates opportunities for conflicts and incompatibilities — which are almost countless at the file level but virtually nonexistent at the disk level. 
In order to illustrate how eDisk™ works, a look at basic Macintosh HFS principles is necessary. 

Wasted space with normal storage techniques 
Regardless of size, normal Macintosh hard disks are divided into 65,536 "Allocation Blocks." These represent "addresses" of the individual storage sections. These 65,536 identically sized Allocation Blocks will each grow larger with the size of the hard disk.  
On small hard disks, the smallest possible Allocation Block is 512 bytes (1/2 kilobyte). On disks larger than 33.5 megabytes, however, the total capacity divided by 65,536 Allocation Blocks will produce considerably larger minimum spaces. For example, the smallest Allocation Block on an 80-megabyte disk is 1,536 bytes. One single byte of data will "fill" that Allocation Block and allow nothing else in that space. 
Minimum Allocation Block size on a 1.7-gigabyte disk is 25,940 bytes (25 kilobytes). The inefficiency of this approach emerges routinely — whenever data to be written does not match up exactly with the Allocation Block size (or a multiple thereof). 
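
The block sizes quoted above can be reproduced with a short calculation. This is a sketch under one stated assumption: that block sizes are rounded up to the next multiple of 512 bytes. That rounding yields the 1,536-byte figure for an 80-megabyte disk, while the 25,940-byte figure for a 1.7-gigabyte disk appears to be the plain quotient before any rounding. The function names are illustrative.

```python
MAX_BLOCKS = 65_536   # number of Allocation Blocks cited in the text
SECTOR = 512          # smallest possible Allocation Block, in bytes


def raw_block_size(disk_bytes):
    """Capacity divided by the block count, with no rounding."""
    return disk_bytes / MAX_BLOCKS


def allocation_block_size(disk_bytes):
    """Assumed rule: round the quotient up to a whole number of 512-byte sectors."""
    sectors = -(-disk_bytes // (MAX_BLOCKS * SECTOR))   # ceiling division
    return max(1, sectors) * SECTOR


print(allocation_block_size(20 * 1024 * 1024))   # 512
print(allocation_block_size(80 * 1024 * 1024))   # 1536
print(round(raw_block_size(1_700_000_000)))      # 25940
print(allocation_block_size(1_700_000_000))      # 26112
```

The threshold of 33.5 megabytes falls out of the same arithmetic: 65,536 blocks of 512 bytes is 33,554,432 bytes.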

On an 80-meg disk, a 3,100-byte file will not fit within one or even two 1,536-byte Allocation Blocks. Therefore, it is assigned THREE blocks representing 4,608 bytes of space. It thus wastes most of the third block, which cannot store anything else. The same effect is more dramatic on a 1.7-gigabyte disk. Storing four ordinary 3K files will render 100K of disk space unusable for anything but those four files! This is akin to pouring four small glasses of water into four separate five-gallon pails, simply because no smaller sized containers are available. Each Allocation Block "container" is 1/65,536th the size of the hard disk, no matter what size of container this math will produce. 
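
The arithmetic in this example can be checked with a few lines. This is a minimal sketch; the block sizes are the ones quoted in the text, and the helper names are illustrative.

```python
def allocated_bytes(file_bytes, block_bytes):
    """Space a file ties up when it must occupy whole Allocation Blocks."""
    blocks = -(-file_bytes // block_bytes)   # ceiling division
    return blocks * block_bytes


def slack_bytes(file_bytes, block_bytes):
    """Bytes reserved for the file but not holding any of its data."""
    return allocated_bytes(file_bytes, block_bytes) - file_bytes


# 3,100-byte file on an 80-meg disk with 1,536-byte blocks
print(allocated_bytes(3100, 1536))   # 4608
print(slack_bytes(3100, 1536))       # 1508

# Four 3K files on a 1.7-gigabyte disk with 25,940-byte blocks
print(4 * allocated_bytes(3 * 1024, 25_940))   # 103760, roughly 100K
```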

To a degree, partitioning a hard disk into two or more "volumes" can reduce the wasteful storage tendencies of the standard HFS. Because each partition volume is treated as a separate and smaller disk, the size of its Allocation Blocks is also reduced accordingly. (For the sake of simplicity here, however, "disk" will refer to hard disks or partition volumes within them.) 

Using the water metaphor, smaller partitions may create Allocation Blocks that are more like gallon jugs instead of 5-gallon pails for those small glasses of water. Although making Allocation Blocks uniformly smaller will tend to reduce wasted space, it is still not ideal. Again using the water metaphor, you will still need to dedicate two entire gallon jugs for storing a gallon PLUS JUST ONE MORE DROP of water. 

Custom-sized storage spaces with eDisk 
With eDisk, the arbitrary figure of 65,536 Allocation Blocks no longer applies at the disk level. Neither must all blocks be of uniform size.  
Rather, with eDisk, storage blocks can vary in both size and number in order to accomplish a "tighter fit" with the particular disk and the particular files stored on it. Although the general principle and eDisk’s relationship to other Macintosh operations will be explained, exactly how eDisk creates and uses these customized block sizes is proprietary technology developed by Alysis Software Corporation. The net result with eDisk is that a "one-gallon" quantity of data is placed neatly in a "one-gallon" storage container, while a one-pint space can be created to store a one-pint file efficiently and drip-sized data only uses a drip’s worth of disk space.  

In contrast to an earlier example, with eDisk a 3,100-byte file would use exactly 3,100 bytes of space on an 80-meg hard disk, rather than 4,608 bytes (three 1,536-byte Allocation Blocks). This effect can be multiplied by each of hundreds or thousands of files. On larger disks with larger Allocation Blocks, the difference with eDisk can be even more dramatic. What’s more, eDisk compresses the data on the fly, then creates custom-sized spaces for it to multiply the efficiency. 

Writing data from program to disk 
To grasp how eDisk safely accomplishes both data compression and more efficient use of storage space on disk, its place in the Macintosh system must be understood. Ordinarily, three levels of software are involved when you "save" a document, duplicate a file with the Macintosh Finder, or otherwise write data to a hard disk. They are the Finder (or application), Macintosh File Manager, and hard disk software. 

Let’s again use a 3,100-byte word-processor file as an example: 
• 1. The application (e.g., MacWrite, Microsoft Word) gauges the size of the data to be written to the hard disk. If the application was used to make revisions in the document, these changes exist temporarily in RAM. This new data in RAM must be added to the document size as it is written to the disk. 
• 2. The application tells the File Manager, which is part of the Macintosh operating system, that 3,100 bytes must be written to the disk. Also specified is where the document should be displayed in terms of a folder (or directory) by the Hierarchical File System.  
• 3. The File Manager receives the 3,100 bytes of data in seven blocks of 512 bytes, one block at a time. In turn, it relays these blocks to the hard disk driver software (e.g., Silverlining, ETC Tools). The File Manager also specifies where to write the data on the hard disk.  
• 4. The hard disk software writes the data (strings of "0" and "1") into various Allocation Blocks on the hard drive. It blindly follows the File Manager’s orders on the "address" of each sector to be used. It sends the disk’s read/write heads to those places accordingly. 

How does the File Manager know where there’s empty space on the disk? It has a map: a "volume control block" supplied by the hard disk software when the disk mounts. The File Manager stores this directory in memory. It begins with the hard disk’s name and fans out into subdirectories, much like the Macintosh folders-and-files desktop metaphor that users see. 
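
The hand-off in step 3, where 3,100 bytes travel as seven 512-byte blocks, can be sketched as follows. This is an illustration only: the function name is hypothetical, and zero-padding the final block is an assumption made here, not a documented File Manager behavior.

```python
BLOCK = 512  # bytes per block passed to the disk driver


def split_into_blocks(data, block_size=BLOCK):
    """Cut data into fixed-size blocks, zero-padding the final one (assumed)."""
    blocks = []
    for start in range(0, len(data), block_size):
        chunk = data[start:start + block_size]
        blocks.append(chunk.ljust(block_size, b"\x00"))
    return blocks


document = bytes(3100)                  # stand-in for a 3,100-byte document
blocks = split_into_blocks(document)
print(len(blocks))                      # 7
print(len(blocks) * BLOCK)              # 3584 bytes handed to the driver
```

Six blocks carry 3,072 bytes, so the seventh holds only the remaining 28 bytes of the document.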

How eDisk™ works with the Mac and hard disk driver 
eDisk software is positioned between the Macintosh File Manager and the hard disk’s driver software. It plays part of the role of each one. The File Manager thinks it is passing the data and addresses to disk driver software. In turn, the hard disk software believes it is taking orders directly from the File Manager. 

While it seems "business as usual" from both directions, in fact each side is dealing with eDisk software between them.  
eDisk translates the File Manager’s inefficient orders before passing them on to the hard disk software. From the other direction, eDisk interprets the hard disk software’s vision of a 40-meg drive and tells the File Manager that it’s really an 80-meg drive.  
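
The "middleman" arrangement described here can be modeled in miniature. The sketch below is a toy under loud assumptions: RawDriver and ExpandingShim are hypothetical names, the read/write interface is invented for illustration, and zlib stands in for the proprietary Alysis compression. It does not model real File Manager or driver calls.

```python
import zlib


class RawDriver:
    """Hypothetical stand-in for hard disk driver software."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.store = {}                      # address -> bytes on "disk"

    def write(self, address, payload):
        self.store[address] = payload

    def read(self, address):
        return self.store[address]


class ExpandingShim:
    """Offers the same interface as the driver while sitting in front of it."""

    def __init__(self, driver, factor=2):
        self.driver = driver
        # The caller is told the disk is `factor` times its physical size.
        self.capacity_bytes = driver.capacity_bytes * factor

    def write(self, address, data):
        self.driver.write(address, zlib.compress(data))   # compress on the way down

    def read(self, address):
        return zlib.decompress(self.driver.read(address)) # expand on the way up


driver = RawDriver(40 * 1024 * 1024)
disk = ExpandingShim(driver, factor=2)
disk.write(0, b"eeeeee" * 500)
print(disk.capacity_bytes // (1024 * 1024))   # 80
print(disk.read(0) == b"eeeeee" * 500)        # True
```

Because both classes expose the same three members, code written against the driver works unchanged against the shim, which is the property the text describes as "business as usual" from both directions.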

This translation begins when the disk is first mounted, and continues as its size, structure and data addresses are updated in the volume control block. Advantages of eDisk’s unique position in this chain-of-command are numerous: 

• It allows Macintosh users to continue choosing hard disk software with special features they want for partitioning a disk into multiple volumes, enhancing read/write speed, setting aside Virtual Memory blocks and so on. Unlike an older Macintosh driver-level utility promising "transparent" operation, eDisk does not insist on replacing the hard disk driver with itself. It does not alter catalog information or low-level disk structures. 

• Because it is situated on the disk driver side of the File Manager rather than file side, eDisk is two very important steps away from the highly complex file-level operations, where incompatibilities can be rampant. 
Previous so-called "whole disk compression" utilities actually continue to work at the file level, just like the earliest "archive" compression tools. When the Finder or an application attempts to send file data to the File Manager to be written to disk, "transparent" compressors must intercept this data before it reaches the File Manager. 

These file-level utilities can scramble the contents of files into compression coding, change file sizes (sometimes artificially), and even alter file types before the File Manager gets them. This can interfere with disk optimizing and file-recovery programs, as well as indexing and searching programs like OnLocation and GOfer. While successful if done well, file-level compression typically requires considerable use of extensions and control panels to "patch" normal system operations. What works fine today may "break" with the next version of the application whose data they intercept and restructure, or with an update of the operating system to which they relay altered data.  
In addition to consuming precious RAM, these file-level compression tools may conflict with extensions or other software attempting to use memory or execute patches in competing areas. What’s more, the altered file data on disk is rendered useless if the compression utility is turned off, absent, or not working correctly between the operating system and application. 
In contrast, eDisk compression operates at the driver level so it is not prone to the usual incompatibilities that file-level compression utilities try to deal with.

A hard disk expanded with eDisk can be hooked up to any Macintosh without installing or reinstalling any software. The eDisk compressor continues its translation role between the hard drive software and the new Mac’s File Manager, whether it’s System 6.0.x or 7.x.x. In fact, you can even reinitialize ("erase") the hard disk without reinstalling eDisk or losing its enhanced capacity. Even on a "blank" hard disk, eDisk will remain intact as part of its overhead structure. (Reformatting, however, is among the ways to return to the disk’s unexpanded status.) 

Designed as an "install and forget" expansion tool, the simplicity and ease of using eDisk make it ideal for Macintosh beginners and experts alike.  
There’s no additional risk to the safety of data with eDisk. In fact, it enhances safety with redundant tagging of data on disk. It is immune to file-level system viruses and poorly written INIT or application software, since it operates on the other side of the File Manager. 

How does eDisk™ expand the disk? 
You’ve no doubt heard people say this of some item’s value: "It’s worth whatever someone will pay for it." Likewise, a hard disk’s capacity is whatever you can safely store on it. It’s OK for eDisk to tell the File Manager that it’s dealing with an 80-meg expanded disk instead of the original 40-meg disk, because it actually serves as one.  How does eDisk "get away with it"? For starters, it recovers space that is normally lost to inefficiencies in the Macintosh filing system, as described earlier. Thousands of files are stored in precise, custom-sized spaces, rather than partially wasting the space in each of thousands of Allocation Blocks.

Beyond that, eDisk will also compress data as it’s passed from the File Manager to the hard disk driver software. On the fly, it employs the fast and highly efficient compression algorithms developed by Alysis Software Corporation with its leading-edge Resource Compressor, SuperDisk!™ and More Disk Space™ products.  To describe this 1-2 punch in a simplified way: 

• eDisk can take the data for "eeeeee" and create an exactly "eeeeee"-sized space for it. This is already much more efficient than the standard Macintosh Hierarchical File System.  

• Secondly, eDisk can also compress "eeeeee" into "6e," then create an exactly "6e"-sized space for this smaller version of the data. 
The powerful synergy of this eDisk combination makes it possible to use hard disks at two, three or even four times their standard capacity.  
File compression alone cannot accomplish this. Here are examples of the limits for file-level compression on a standard disk, as depicted in the earlier graphic: 

• With 1,536-byte Allocation Blocks on an 80-meg hard disk, a 6,700-byte file will require five blocks and thus tie up 7,680 bytes of space. (It thus wastes about 980 bytes of space that eDisk could make usable.) 

• Despite compressing the same 6,700-byte file to half its size (3,350 bytes), it would occupy three 1,536-byte blocks on the 80-meg drive — wasting 82% of the space in the third block. Although this is better than occupying five blocks, it’s not as efficient as what eDisk can accomplish with a single 3,350-byte storage space customized for that compressed file. 

• On much larger hard disks, with Allocation Blocks of 25k or even higher, compressing data with standard file-level utilities may have no effect at all. A 20k file compressed to 10k will still occupy all of a 25k Allocation Block. 
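
The three examples above reduce to the same ceiling calculation. The sketch below reproduces the figures; it treats an eDisk custom-sized space as exactly the size of the compressed data and ignores any bookkeeping overhead, which is an assumption for the sake of illustration.

```python
def hfs_bytes(file_bytes, block_bytes):
    """Space used when a file must occupy whole Allocation Blocks."""
    blocks = -(-file_bytes // block_bytes)   # ceiling division
    return blocks * block_bytes


BLOCK_80MB = 1536
original, compressed = 6700, 3350

print(hfs_bytes(original, BLOCK_80MB))               # 7680 (five blocks)
print(hfs_bytes(original, BLOCK_80MB) - original)    # 980 bytes wasted
print(hfs_bytes(compressed, BLOCK_80MB))             # 4608 (three blocks)

used_in_last = compressed % BLOCK_80MB               # 278 bytes in the third block
wasted_share = (BLOCK_80MB - used_in_last) / BLOCK_80MB
print(round(wasted_share * 100))                     # 82 (percent)

# Large disk: a 20k file and its 10k compressed form both take one 25k block
BLOCK_25K = 25 * 1024
print(hfs_bytes(20 * 1024, BLOCK_25K) == hfs_bytes(10 * 1024, BLOCK_25K))  # True
```

With a custom-sized space, the compressed file would occupy 3,350 bytes rather than 4,608, a further saving of 1,258 bytes on this one file.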

About eDisk™ performance 
The astute reader may have already realized another potential payoff from the eDisk system — speed! The eDisk-enhanced hard disk tends to "read" data in long, continuous strings at one address, rather than skip around inefficiently. eDisk’s effect on performance will depend in part on the hard disk’s access speed ("seek time") and data-transfer rate capacity. 

Choices made by the user also help determine whether an eDisk-enhanced disk will be faster, comparable or slightly slower than a standard disk. During installation, the user can choose "Fast," "Faster" or "Fastest" compression levels, depending on priorities for speed versus tighter compression. What’s more, disk performance can be greatly enhanced by an eDisk option called "Delayed Write." 

In much the same way as a "RAM disk" works to speed up an application, eDisk uses a "Delayed Write" cache to accumulate data in RAM rather than always writing it to disk immediately. This cache is managed with intelligence to improve disk performance. eDisk "flushes" data from the cache (writes it to disk) when the adjustable cache is filled, or after a certain amount of time has passed with no user activity. Speed, compression and expansion are also improved because one large disk write is more efficient than several smaller writes of the same data. 
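
A write-back cache of the kind described can be sketched briefly. This is not eDisk's implementation: the class name, the default sizes, the tick method, and the injectable clock are all assumptions made for illustration. The two flush conditions are the ones named in the text, a full cache or a period of inactivity.

```python
import time


class DelayedWriteCache:
    """Accumulates writes in memory and flushes them as one larger write."""

    def __init__(self, flush_to_disk, capacity_bytes=64 * 1024,
                 idle_seconds=2.0, clock=time.monotonic):
        self.flush_to_disk = flush_to_disk   # callable that performs the real write
        self.capacity_bytes = capacity_bytes
        self.idle_seconds = idle_seconds
        self.clock = clock
        self.pending = []
        self.pending_bytes = 0
        self.last_activity = clock()

    def write(self, data):
        self.pending.append(data)
        self.pending_bytes += len(data)
        self.last_activity = self.clock()
        if self.pending_bytes >= self.capacity_bytes:   # cache is full
            self.flush()

    def tick(self):
        """Call periodically; flushes once the idle period has elapsed."""
        if self.pending and self.clock() - self.last_activity >= self.idle_seconds:
            self.flush()

    def flush(self):
        if self.pending:
            self.flush_to_disk(b"".join(self.pending))  # one combined write
            self.pending = []
            self.pending_bytes = 0


disk_writes = []
cache = DelayedWriteCache(disk_writes.append, capacity_bytes=1024)
for _ in range(3):
    cache.write(b"x" * 400)      # third write crosses 1,024 bytes and flushes
print(len(disk_writes))          # 1
```

Three small writes reach the disk as a single 1,200-byte write, which is the efficiency gain the paragraph describes. The trade-off, common to any write-back cache, is that data still in RAM has not yet reached the disk.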

Beyond 2x expansion 
During installation, the user decides whether to have eDisk assume a 2x, 3x or 4x expansion factor in multiplying the hard disk’s capacity. A 2x factor for expansion of disk capacity is easily achievable, thanks to eDisk’s data compression plus storage spaces that are efficiently customized for each file. The 3x and 4x choices permitted by eDisk are unique but not unrealistic, especially when the disk is primarily used to store text, graphic, database and page-layout files. Compressibility of data will determine to which level the disk remains expanded. A 40MB disk may expand 3x to 120MB, then dip in size to 115MB or less if it primarily holds "dense" data such as system software, fonts, or other files that are already compressed. 

Even if the nature of the files does not permit a 3x or 4x factor, the Macintosh will still display accurate "on disk" sizes. Remaining free space will be automatically adjusted to reflect results for data already on the disk. eDisk’s use of accurate figures will allow uncomplicated backups and other movement of files from an expanded disk to an ordinary disk.  
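
The way a reported capacity might "dip" can be illustrated with a simple model. The white paper does not state how eDisk computes these figures, so the formula below is purely an assumption: remaining physical space is credited at the chosen expansion factor, while data already stored counts at its true size. All names are hypothetical, and sizes are in megabytes.

```python
def reported_sizes(physical_total, physical_used, logical_stored, factor):
    """Assumed model: (total, free) as the Macintosh might display them."""
    free = (physical_total - physical_used) * factor   # optimistic estimate
    total = logical_stored + free
    return total, free


# Empty 40MB disk at 3x
print(reported_sizes(40, 0, 0, 3))          # (120, 120)

# 2.5MB of already-compressed data that does not shrink at all
print(reported_sizes(40, 2.5, 2.5, 3))      # (115.0, 112.5)

# 30MB of text that compresses 3:1 into 10MB of physical space
print(reported_sizes(40, 10, 30, 3))        # (120, 90)
```

Under this model the total stays at 120MB as long as stored data compresses at the assumed factor, and it falls when "dense" data consumes physical space faster than expected.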


Copyright 1998 by Alysis Software Corporation. More Disk Space, SuperDisk!, eDisk, Compatibility INIT, the Alysis Resource Compressor, the Complete Delete, the Alysis Disk Expander, the Alysis Installer, and Safety Belt are trademarks of Alysis Software Corporation. DPI-On-The-Fly and the IPM are trademarks of NEC Technologies, Inc.