eDisk™
Technology Highlights
Operating System: Macintosh OS 6.x - 7.x
Development Language: C and 68000 assembly language
Compiler: Symantec Think C
Lead Programmer: Justin Gray
Programmers: Justin Gray, Daniel Kopyc
Toward eDisk™
EXPANDING CAPACITY
OF A MACINTOSH HARD DISK
White Paper by Ron Hovingh
Introduction
The need for more storage space on
Macintosh hard disks has never been greater. What Macintosh user does not
yearn for a larger or additional hard drive — given today’s huge graphics
files, swollen System Folders and bigger, more sophisticated software packages?
This white paper traces the history
of compression utilities and hard disk storage on the Macintosh. It then
explains how eDisk works to accomplish a significant evolutionary leap
in hard disk capacity.
Most utilities designed to relieve
cramped hard disks have been based on file compression. Although compression
schemes and user interfaces have varied, file-compression software will
essentially rewrite file data so it tends to take up less space on a hard
disk.
Typically, compression utilities
find a way to abbreviate redundant information ("6e" rather than "eeeeee")
and express even complex patterns of data more efficiently. Enhancements
to this basic premise have mainly involved "tighter" compression methods,
as well as more user control and convenience.
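As a minimal illustration of that abbreviation idea, the Python sketch below applies simple run-length encoding to this paper's "eeeeee" example. It is an assumption-level example only, not the method used by any particular utility, and the function name run_length_encode is hypothetical.

```python
from itertools import groupby


def run_length_encode(text):
    """Abbreviate each run of repeated characters as <count><character>."""
    return "".join(f"{len(list(run))}{char}" for char, run in groupby(text))


print(run_length_encode("eeeeee"))  # 6e
```

Real utilities use far more elaborate schemes. This simple form cannot even represent text that already contains digits without ambiguity.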
To put the latest and most dramatic
developments into an instructive context, a brief retrospective is in order.
Earliest storage capacity enhancements
The oldest capacity enhancements
required the Macintosh user to first create a special "archive" file, then
compress data into that archive. As the name "archive" implies, compressed
files then had to be retrieved (decompressed) to become usable once again.
To make these manual steps less cumbersome,
later versions added pull-down menus as a shortcut for "compress" and "decompress"
commands. Yet another interface allowed the user to double-click the archive
file just once to decompress it, although this remained a separate and
additional step before actually using the file.
Benefits of simple file compression
include the ability to store more files on a given disk, as well as shorter
times for modem or network transfers. However, drawbacks include the need
to either decompress the files or distribute decompression software to
make them usable by others.
Evolution of "transparent" compression
More recent utilities have promised
"transparent" operation — with compressed files or applications looking
and behaving as if they are normal. They decompress "on the fly" with little
or no delay when accessed. Thus, the user is freed from consciously performing
decompression as a separate, intermediate step before actually using the
file.
However, in general these utilities
have not proved a perfect solution to inadequate disk space. Instead
of delivering on the promise of "invisible" operation, some have interfered
with indexing or optimization tools by changing file types. Many rely on
extensions or control panels that use precious RAM on a full-time basis,
conflict with other software, or render compressed files and applications
unusable in their absence. Different packages also provided varying degrees
of user control and flexibility over their compression operation.
Despite advances with "transparent
operation," there still appear to be two camps of Macintosh users: those
who use file-compression utilities and enjoy their benefits, and those
who have been disappointed by their limitations. Clearly, there is room
for innovation beyond compressing files so more of them would fit on a
given Macintosh hard disk.
eDisk™ and the driver-level revolution
A new approach on the Macintosh scene
addresses storage capacity from the opposite direction. Instead of merely
shrinking files, driver-level compression increases capacity of the hard
disk on which files are stored.
This is made possible by overcoming
inefficiencies built into the original Macintosh Hierarchical File System
(HFS). Standard HFS storage performance is adequate for the small hard
disks (under 33.5 megabytes) that existed when it originated. However,
its degree of "wasted space" has increased along with the size of hard
disks.
Partly by regaining this wasted space,
eDisk™ can safely increase a given hard disk’s capacity by a factor of
two, three or even four.
Thus, an 80-megabyte hard disk can
be "seen" by the Macintosh as a 160-meg or even larger disk, with little
or no price to pay in terms of performance, portability, or simplicity.
Advantages of driver-level compression
are rooted mainly in its separation from all but one operation: accessing
the hard disk. This all but eliminates opportunities for conflicts and
incompatibilities — which are almost countless at the file level but virtually
nonexistent at the disk level.
In order to illustrate how eDisk™
works, a look at basic Macintosh HFS principles is necessary.
Wasted space with normal storage
techniques
Regardless of size, normal Macintosh
hard disks are divided into 65,536 "Allocation Blocks." These represent
"addresses" of the individual storage sections. These 65,536 identically
sized Allocation Blocks will each grow larger with the size of the hard
disk.
On small hard disks, the smallest
possible Allocation Block is 512 bytes (1/2 kilobyte). On
disks larger than 33.5 megabytes, however, the total capacity divided by
65,536 Allocation Blocks will produce considerably larger minimum spaces.
For example, the smallest Allocation Block
on an 80-megabyte disk is 1,536 bytes. A single byte of data will "fill"
that Allocation Block and allow nothing else in that space.
Minimum Allocation Block size on
a 1.7-gigabyte disk is 25,940 bytes (25 kilobytes). The
inefficiency of this approach emerges routinely — whenever data to be written
does not match up exactly with the Allocation Block size (or a multiple
thereof).
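The block sizes quoted above can be approximated with a short Python sketch. It assumes decimal megabytes and gigabytes, and it assumes that block sizes are rounded up to whole multiples of 512 bytes, which reproduces the 1,536-byte figure. The 25,940-byte figure corresponds to the unrounded quotient. The function names are hypothetical.

```python
MAX_BLOCKS = 65_536  # number of Allocation Blocks cited in this paper
SECTOR = 512         # smallest Allocation Block size cited in this paper


def raw_block_size(disk_bytes):
    """Disk capacity divided by the block count, rounded up to a whole byte."""
    return -(-disk_bytes // MAX_BLOCKS)


def sector_rounded_block_size(disk_bytes):
    """Assumption: a block is the smallest multiple of 512 bytes that lets
    65,536 blocks cover the whole disk."""
    sectors_per_block = -(-disk_bytes // (MAX_BLOCKS * SECTOR))
    return max(1, sectors_per_block) * SECTOR


print(sector_rounded_block_size(20_000_000))  # 512
print(sector_rounded_block_size(80_000_000))  # 1536
print(raw_block_size(1_700_000_000))          # 25940
```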
On an 80-meg disk, a 3,100-byte file
will not fit within one or even two 1,536-byte Allocation Blocks. Therefore,
it is assigned THREE blocks representing 4,608 bytes of space. It thus
wastes most of the third block, which cannot store anything else.
The same effect is more dramatic on a
1.7-gigabyte disk. Storing four ordinary 3K files will render 100K of disk
space unusable for anything but those four files! This
is akin to pouring four small glasses of water into four separate five-gallon
pails, simply because no smaller sized containers are available. Each Allocation
Block "container" is 1/65,536th the size of the hard disk, no matter what
size of container this math will produce.
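The arithmetic behind these examples can be checked with a small Python sketch. It assumes only what the text states: a file always occupies a whole number of Allocation Blocks. The round figure of 25 kilobytes (25 x 1,024 bytes) stands in for the 1.7-gigabyte disk's block size, and the function names are hypothetical.

```python
def blocks_needed(file_bytes, block_bytes):
    """Whole Allocation Blocks required to hold a file."""
    return -(-file_bytes // block_bytes)  # ceiling division


def space_used(file_bytes, block_bytes):
    """Disk space tied up by the file, including the unused tail."""
    return blocks_needed(file_bytes, block_bytes) * block_bytes


def wasted(file_bytes, block_bytes):
    """Bytes in the last block that nothing else can use."""
    return space_used(file_bytes, block_bytes) - file_bytes


# The 3,100-byte file on an 80-meg disk with 1,536-byte blocks.
print(blocks_needed(3100, 1536))  # 3
print(space_used(3100, 1536))     # 4608
print(wasted(3100, 1536))         # 1508

# Four 3K files on a disk with 25K blocks.
print(4 * space_used(3 * 1024, 25 * 1024) // 1024)  # 100
```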
To a degree, partitioning a hard disk
into two or more "volumes" can reduce the wasteful storage tendencies of
the standard HFS. Because each partition volume is treated as a separate
and smaller disk, the size of its Allocation Blocks is also reduced accordingly.
(For the sake of simplicity here, however, "disk" will refer to hard disks
or partition volumes within them.)
Using the water metaphor, smaller
partitions may create Allocation Blocks that are more like gallon jugs
instead of 5-gallon pails for those small glasses of water. Although
making Allocation Blocks uniformly smaller will tend to reduce wasted space,
it is still not ideal. Again using the water metaphor, you will still need
to dedicate two entire gallon jugs for storing a gallon PLUS JUST ONE MORE
DROP of water.
Custom-sized storage spaces with
eDisk
With eDisk, the arbitrary figure
of 65,536 Allocation Blocks no longer applies at the disk level. Neither
must all blocks be of uniform size.
Rather, with eDisk storage blocks
can vary in both size and number in order to accomplish a "tighter fit"
with the particular disk and the particular files stored on it.
Although the general principle and eDisk’s
relationship to other Macintosh operations will be explained, exactly how
eDisk creates and uses these customized block sizes is proprietary technology
developed by Alysis Software Corporation.
The net result with eDisk is that a "one-gallon" quantity of data is placed
neatly in a "one-gallon" storage container, while a one-pint space can
be created to store a one-pint file efficiently and drip-sized data only
uses a drip’s worth of disk space.
In contrast to an earlier example,
with eDisk a 3,100-byte file would use exactly 3,100 bytes of space on
an 80-meg hard disk, rather than 4,608 bytes (three 1,536-byte Allocation
Blocks). This effect can be multiplied
by each of hundreds or thousands of files. On larger disks with larger
Allocation Blocks, the difference with eDisk can be even more dramatic.
What’s more, eDisk compresses the data
on the fly, then creates custom-sized spaces for it to multiply the efficiency.
Writing data from program to disk
To grasp how eDisk safely accomplishes
both data compression and more efficient use of storage space on disk,
its place in the Macintosh system must be understood.
Ordinarily,
three levels of software are involved when you "save" a document, duplicate
a file with the Macintosh Finder, or otherwise write data to a hard disk.
They are the Finder (or application), Macintosh File Manager, and hard
disk software.
Let’s again use a 3,100-byte word-processor
file as an example:
• 1. The application (e.g., MacWrite,
Microsoft Word) gauges the size of the data to be written to the hard disk.
If the application was used to make revisions in the document, these changes
exist temporarily in RAM. This new data in RAM must be added to the document
size as it is written to the disk.
• 2. The application tells the File
Manager, which is part of the Macintosh operating system, that 3,100 bytes
must be written to the disk. Also specified is where the document should
be displayed in terms of a folder (or directory) by the Hierarchical File
System.
• 3. The File Manager receives the
3,100 bytes of data in seven blocks of 512 bytes, one block at a time.
In turn, it relays these blocks to the hard disk driver software (e.g.,
Silverlining, ETC Tools). The File Manager also specifies where to write
the data on the hard disk.
• 4. The hard disk software writes
the data (strings of "0" and "1") into various Allocation Blocks on the
hard drive. It blindly follows the File Manager’s orders on the "address"
of each sector to be used. It sends the disk’s read/write heads to those
places accordingly.
How does the File Manager know where there’s empty space on the disk? It has a map:
a "volume control block" supplied by the hard disk software when the disk
mounts. The File Manager stores this directory in memory. It begins with
the hard disk’s name and fans out into subdirectories — much like the Macintosh
folders-and-files desktop metaphor that users see.
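Step 3 above can be pictured with a brief Python sketch that cuts a 3,100-byte document into 512-byte blocks. Zero-padding the final block is an assumption made for illustration, and split_into_blocks is a hypothetical name, not a File Manager routine.

```python
BLOCK_SIZE = 512


def split_into_blocks(data, block_size=BLOCK_SIZE):
    """Cut data into block_size pieces, zero-padding the last one (assumed)."""
    blocks = []
    for start in range(0, len(data), block_size):
        chunk = data[start:start + block_size]
        blocks.append(chunk.ljust(block_size, b"\x00"))
    return blocks


document = bytes(3100)             # stand-in for the word-processor file
blocks = split_into_blocks(document)
print(len(blocks))                 # 7
```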
How eDisk™ works with the Mac and
hard disk driver
eDisk software is positioned between
the Macintosh File Manager and the hard disk’s driver software. It plays
part of the role of each one. The File Manager thinks it is passing the
data and addresses to disk driver software. In turn, the hard disk software
believes it is taking orders directly from the File Manager.
While it seems "business as usual"
from both directions, in fact each side is dealing with eDisk software
between them.
eDisk translates the File Manager’s
inefficient orders before passing them on to the hard disk software. From
the other direction, eDisk interprets the hard disk software’s vision of
a 40-meg drive and tells the File Manager that it’s really an 80-meg drive.
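The "in-between" position described here can be sketched in Python as a wrapper object. Everything in this sketch is hypothetical: PlainDriver and ExpandingLayer are invented names, zlib merely stands in for the proprietary Alysis compression, and the real product's internals are not described in this paper.

```python
import zlib


class PlainDriver:
    """Hypothetical stand-in for ordinary hard disk driver software."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.stored = {}  # address -> bytes actually kept on "disk"

    def write(self, address, data):
        self.stored[address] = data

    def read(self, address):
        return self.stored[address]


class ExpandingLayer:
    """Hypothetical layer between the caller and the driver.

    The caller sees a larger disk; the driver sees only compressed data.
    """

    def __init__(self, driver, factor=2):
        self.driver = driver
        self.factor = factor

    @property
    def capacity_bytes(self):
        # Report the expanded size to whoever asks.
        return self.driver.capacity_bytes * self.factor

    def write(self, address, data):
        self.driver.write(address, zlib.compress(data))

    def read(self, address):
        return zlib.decompress(self.driver.read(address))


driver = PlainDriver(40_000_000)
layer = ExpandingLayer(driver, factor=2)
print(layer.capacity_bytes)  # 80000000
```

Neither side needs to change in this arrangement: the caller uses the same read and write calls it would use with the plain driver, and the driver simply stores whatever bytes it is handed.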
This translation begins when the disk
is first mounted, and continues as its size, structure and data addresses
are updated in the volume control block. Advantages of eDisk’s unique position
in this chain-of-command are numerous:
• It allows Macintosh users to continue
choosing hard disk software with special features they want for partitioning
a disk into multiple volumes, enhancing read/write speed, setting aside
Virtual Memory blocks and so on. Unlike an older Macintosh driver-level
utility promising "transparent" operation, eDisk does not insist on replacing
the hard disk driver with itself. It does not alter catalog information
or low-level disk structures.
• Because it is situated on the disk
driver side of the File Manager rather than file side, eDisk is two very
important steps away from the highly complex file-level operations, where
incompatibilities can be rampant.
Previous so-called "whole disk compression"
utilities actually continue to work at the file level, just like the earliest
"archive" compression tools. When the Finder or an application attempts
to send file data to the File Manager to be written to disk, "transparent"
compressors must intercept this data before it reaches the File Manager.
These file-level utilities can scramble
the contents of files into compression coding, change file sizes (sometimes
artificially), and even alter file types before the File Manager gets them.
This can interfere with disk optimizing and file-recovery programs, as
well as indexing and searching programs like OnLocation and GOfer.
While successful if done well, file-level
compression typically requires considerable use of extensions and control
panels to "patch" normal system operations. What works fine today may "break"
with the next version of the application whose data they intercept and
restructure, or with an update of the operating system to which they relay
altered data.
In addition to consuming precious
RAM, these file-level compression tools may conflict with extensions or
other software attempting to use memory or execute patches in competing
areas. What’s more, the altered file data on disk is rendered useless if
the compression utility is turned off, absent, or not working correctly
between the operating system and application.
In contrast, eDisk compression operates
at the driver level so it is not prone to the usual incompatibilities that
file-level compression utilities try to deal with.
A hard disk expanded with eDisk can
be hooked up to any Macintosh without installing or reinstalling any software.
The eDisk compressor continues its translation role between the hard drive
software and the new Mac’s File Manager, whether it’s System 6.0.x or 7.x.x.
In fact, you can even reinitialize ("erase")
the hard disk without reinstalling eDisk or losing its enhanced capacity.
Even on a "blank" hard disk, eDisk will remain intact as part of its overhead
structure. (Reformatting, however, is among the ways to return to the disk’s
unexpanded status.)
Designed as an "install and forget"
expansion tool, the simplicity and ease of using eDisk make it ideal for
Macintosh beginners and experts alike.
There’s no additional risk to the
safety of data with eDisk. In fact, it enhances safety with redundant tagging
of data on disk. It is immune to file-level system viruses and poorly written
INIT or application software, since it operates on the other side of the
File Manager.
How does eDisk™ expand the disk?
You’ve no doubt heard people say
this of some item’s value: "It’s worth whatever someone will pay for it."
Likewise, a hard disk’s capacity is whatever you can safely store on it.
It’s OK for eDisk to tell the File Manager that it’s dealing with an 80-meg
expanded disk instead of the original 40-meg disk, because it actually
serves as one.
How does eDisk "get away with it"? For starters, it recovers space that is normally
lost to inefficiencies in the Macintosh filing system, as described earlier.
Thousands of files are stored in precise, custom-sized spaces, rather than
partially wasting the space in each of thousands of Allocation Blocks.
Beyond that, eDisk will also compress
data as it’s passed from the File Manager to the hard disk driver software.
On the fly, it employs the fast and highly efficient compression algorithms
developed by Alysis Software Corporation with its leading-edge Resource
Compressor, SuperDisk!™ and More Disk Space™ products. To
describe this 1-2 punch in a simplified way:
• eDisk can take the data for "eeeeee"
and create an exactly "eeeeee"-sized space for it. This is already much
more efficient than the standard Macintosh Hierarchical File System.
• Secondly, eDisk can also compress
"eeeeee" into "6e," then create an exactly "6e"-sized space for this smaller
version of the data.
The powerful synergy of this eDisk
combination makes it possible to use hard disks at two, three or even four
times their standard capacity.
File compression alone cannot accomplish
this. Here are examples of the limits for file-level compression on a standard
disk, as depicted in the earlier graphic:
• With 1,536-byte Allocation Blocks
on an 80-meg hard disk, a 6,700-byte file will require five blocks and
thus tie up 7,680 bytes of space. (It thus wastes about 980 bytes of space
that eDisk could make usable.)
• Even when the same 6,700-byte
file is compressed to half its size (3,350 bytes), it still occupies three 1,536-byte blocks
on the 80-meg drive, wasting 82% of the space in the third block. Although
this is better than occupying five blocks, it’s not as efficient as what
eDisk can accomplish with a single 3,350-byte storage space customized
for that compressed file.
• On much larger hard disks, with
Allocation Blocks of 25k or even higher, compressing data with standard
file-level utilities may have no effect at all. A 20k file compressed to
10k will still occupy all of a 25k Allocation Block.
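The three examples above can be verified with a short Python sketch. It assumes only that a file on a standard disk occupies whole Allocation Blocks, and takes "25k" to mean 25 x 1,024 bytes. The function names are hypothetical.

```python
def fixed_block_space(file_bytes, block_bytes):
    """Space a file ties up when it must occupy whole Allocation Blocks."""
    blocks = -(-file_bytes // block_bytes)  # ceiling division
    return blocks * block_bytes


def last_block_waste_percent(file_bytes, block_bytes):
    """Share of the final block left unused, as a percentage."""
    used = file_bytes % block_bytes
    if used == 0:
        return 0.0
    return 100.0 * (block_bytes - used) / block_bytes


# Uncompressed 6,700-byte file, 1,536-byte blocks.
print(fixed_block_space(6700, 1536))                # 7680
print(fixed_block_space(6700, 1536) - 6700)         # 980

# Same file compressed to 3,350 bytes.
print(fixed_block_space(3350, 1536))                # 4608
print(round(last_block_waste_percent(3350, 1536)))  # 82

# 20k file compressed to 10k on a disk with 25k blocks: no gain.
print(fixed_block_space(20 * 1024, 25 * 1024))      # 25600
print(fixed_block_space(10 * 1024, 25 * 1024))      # 25600
```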
About eDisk™ performance
The astute reader may have already
realized another potential payoff from the eDisk system — speed! The eDisk-enhanced
hard disk tends to "read" data in long, continuous strings at one address,
rather than skip around inefficiently. eDisk’s effect on performance will
depend in part on the hard disk’s access speed ("seek time") and data-transfer
rate capacity.
Choices made by the user also help
determine whether an eDisk-enhanced disk will be faster, comparable or
slightly slower than a standard disk. During installation, the user can
choose "Fast," "Faster" or "Fastest" compression levels, depending on priorities
for speed versus tighter compression. What’s
more, disk performance can be greatly enhanced by an eDisk option called
"Delayed Write."
In much the same way as a "RAM disk"
works to speed up an application, eDisk uses a "Delayed Write" cache to
accumulate data in RAM rather than always writing it to disk immediately.
This cache is managed with intelligence
to improve disk performance. eDisk "flushes" data from the cache (writes
it to disk) when the adjustable cache is filled, or after a certain amount
of time has passed with no user activity. Speed,
compression and expansion are also improved because one large disk write
is more efficient than several smaller writes of the same data.
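The flushing rules described for "Delayed Write" can be sketched in Python. This is a simplified illustration under stated assumptions, not eDisk's actual cache: the class name, the default sizes and the tick method (meant to be called periodically) are all hypothetical.

```python
import time


class DelayedWriteCache:
    """Collects writes in memory and sends them to disk in one piece."""

    def __init__(self, write_to_disk, capacity_bytes=64 * 1024,
                 idle_seconds=2.0, clock=time.monotonic):
        self.write_to_disk = write_to_disk    # callable taking bytes
        self.capacity_bytes = capacity_bytes  # adjustable cache size
        self.idle_seconds = idle_seconds      # quiet period before a flush
        self.clock = clock
        self.pending = []
        self.pending_bytes = 0
        self.last_activity = clock()

    def write(self, data):
        """Hold the data in RAM; flush only once the cache is full."""
        self.pending.append(data)
        self.pending_bytes += len(data)
        self.last_activity = self.clock()
        if self.pending_bytes >= self.capacity_bytes:
            self.flush()

    def tick(self):
        """Flush if data is waiting and nothing has happened for a while."""
        idle = self.clock() - self.last_activity
        if self.pending and idle >= self.idle_seconds:
            self.flush()

    def flush(self):
        """Write everything held so far as one large disk write."""
        if self.pending:
            self.write_to_disk(b"".join(self.pending))
            self.pending = []
            self.pending_bytes = 0
```

Passing the clock in as a parameter is a design convenience: it lets the idle-time rule be exercised without actually waiting.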
Beyond 2x expansion
During installation, the user decides
whether to have eDisk assume a 2x, 3x or 4x expansion factor in multiplying
the hard disk’s capacity. A 2x factor for expansion of disk capacity is
easily achievable, thanks to eDisk’s data compression plus storage spaces
that are efficiently customized for each file. The
3x and 4x choices permitted by eDisk are unique but not unrealistic, especially
when the disk is primarily used to store text, graphic, database and page-layout
files. Compressibility of data will determine to which level the disk remains
expanded. A 40-megabyte disk may expand 3x to 120 megabytes, then dip in size to 115 megabytes
or less if it primarily holds "dense" data such as system software, fonts,
or other files that are already compressed.
Even if the nature of the files does
not permit a 3x or 4x factor, the Macintosh will still display accurate
"in disk" sizes. Remaining free space will be automatically adjusted to
reflect results for data already on the disk. eDisk’s use of accurate figures
will allow uncomplicated backups and other movement of files from an expanded
disk to an ordinary disk.