Lucid Nonsense


HFS+ Compression in Snow Leopard

Thursday, 17 December 2009

One feature that made a relatively stealthy appearance in Snow Leopard was filesystem-level compression, one of the tricks that allowed Apple to reduce the overall footprint of the release. This HFS+ Compression, detailed in John Siracusa’s excellent Snow Leopard article on Ars Technica, uses zlib to compress the file data, which is then moved into the file’s resource fork. All of the file access APIs in Snow Leopard have been updated so that the compression is transparent to users and applications: compressed files are automatically decompressed when required. Reducing latency is presumably a big reason why Apple’s system (by default) only compresses files under 20MiB in size; the time taken to decompress larger files could start to become noticeable.
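To make the idea concrete, here is a minimal Python sketch of the same scheme: compress with zlib when a file is under the size limit, and decompress transparently on read. It is only an illustration, not Apple’s implementation. The names StoredFile, store, load and should_compress are my own, and the only things taken from the description above are zlib and the 20MiB figure.

```python
import zlib
from typing import NamedTuple

# Assumption: mirrors the "under 20MiB" default described above.
SIZE_LIMIT = 20 * 1024 * 1024


class StoredFile(NamedTuple):
    """Hypothetical stand-in for a file as it sits on disk."""
    compressed: bool  # True if payload holds zlib-compressed bytes
    payload: bytes


def should_compress(size_bytes: int, limit: int = SIZE_LIMIT) -> bool:
    """Only non-empty files below the size limit are candidates."""
    return 0 < size_bytes < limit


def store(data: bytes) -> StoredFile:
    """Compress the data with zlib if it is under the size limit."""
    if should_compress(len(data)):
        return StoredFile(True, zlib.compress(data))
    return StoredFile(False, data)


def load(stored: StoredFile) -> bytes:
    """Give callers the original bytes, whichever way they were stored."""
    if stored.compressed:
        return zlib.decompress(stored.payload)
    return stored.payload


if __name__ == "__main__":
    text = b"All work and no play makes Jack a dull boy.\n" * 2000
    stored = store(text)
    print(len(text), "bytes stored as", len(stored.payload), "bytes")
    assert load(stored) == text  # the round trip is lossless
```

The point of the load function is that callers never need to know which way the data was stored, which is roughly what the updated file access APIs do for applications.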

As mentioned in the Ars Technica article, this filesystem compression is not backwards compatible. Viewing a file compressed via HFS+ Compression in 10.5 or earlier will just show a zero-sized file: the data is “hidden” in the resource fork, and there’s no (easy) way to access the compressed data. This lack of backwards compatibility may be the reason that Apple haven’t exposed this feature to users; they have given no simple way to tell the system to compress part of the filesystem.

There is one way to compress files in the filesystem: using ditto. In Snow Leopard, on an HFS+ disk, this command copies the folder ~/Documents/Archive to ~/Documents/Archive_compressed, compressing the data as it goes:

ditto --hfsCompression ~/Documents/Archive ~/Documents/Archive_compressed

The Archive_compressed folder is still “live”: it can be browsed and its files will look exactly the same, but in the background the system will be decompressing them whenever you open them.
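Because the files look identical, it can be handy to check which ones actually ended up compressed. The Python sketch below looks for the BSD file flag UF_COMPRESSED, which Python’s stat module documents as marking a file that is stored compressed. I’m assuming that ditto sets this flag on the files it compresses. The function names are my own, and on systems without file flags (Linux, for example) the check simply reports False.

```python
import os
import stat

# UF_COMPRESSED is the BSD file flag for "stored compressed"; fall back to its
# numeric value in case this Python build does not export the constant.
UF_COMPRESSED = getattr(stat, "UF_COMPRESSED", 0x20)


def flags_mark_compressed(st_flags: int) -> bool:
    """True if the given file flags include UF_COMPRESSED."""
    return bool(st_flags & UF_COMPRESSED)


def is_hfs_compressed(path: str) -> bool:
    """Report whether the file at path carries the compressed flag."""
    st = os.lstat(path)
    # st_flags only exists on BSD-derived systems such as Mac OS X.
    return flags_mark_compressed(getattr(st, "st_flags", 0))


def compressed_files(root: str):
    """Yield the paths of all flagged files below root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if is_hfs_compressed(path):
                yield path
```

Calling list(compressed_files(os.path.expanduser("~/Documents/Archive_compressed"))) should then give the files that were compressed, while anything skipped (files over the size limit, for instance) is left out.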

Unfortunately this compression system uses a private API, so there isn’t a straightforward way for anyone to develop an application that gives users access to this functionality. Fortunately, someone has managed to make a tool that does just that. As detailed in this MacRumors thread, brkirch managed to reverse engineer the system calls ditto was making, and has created afsctool. This tool lets you specify different folders to compress and the settings used, and also lets you view details about which files have been compressed and, interestingly, how much space has been saved by the compression.

There’s extra overhead incurred in the background compression and decompression of these files, so you wouldn’t want to use compression across your entire system (and with the default settings only small files are compressed anyway), but for infrequently used files it’s definitely interesting. When I tested the compression level achievable with a number of 30-40GB folders of general files with a bit of a design tilt (including a fair number of JPEGs, which won’t compress much further), I saw reductions in disk space usage of around 30%. That’s not bad. Bear in mind that if you’re testing this with your own files, the Finder will always show the uncompressed sizes when you do a “Get Info” on a folder. The best option is to use afsctool with the -v switch, which handily shows the exact percentage saving the compression is making. The command is of the form:

afsctool -v ~/Documents/Archive_compressed
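If you don’t have afsctool to hand, a rough figure can be worked out by comparing the size applications see with the space actually allocated on disk. The Python sketch below does that with os.lstat. It assumes that the allocated block count (st_blocks) reflects the compressed size on an HFS+ volume, and the function names are my own. It is not how afsctool calculates its numbers, so expect small differences.

```python
import os
import stat
import sys


def folder_sizes(root):
    """Return (logical_bytes, allocated_bytes) for regular files under root."""
    logical = 0
    allocated = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            if not stat.S_ISREG(st.st_mode):
                continue  # skip symlinks, sockets and other special files
            logical += st.st_size            # the size applications see
            allocated += st.st_blocks * 512  # st_blocks counts 512-byte units
    return logical, allocated


def percent_saved(logical, allocated):
    """Percentage of the logical size that is not taking up disk space."""
    if logical <= 0:
        return 0.0
    return 100.0 * (logical - allocated) / logical


if __name__ == "__main__" and len(sys.argv) > 1:
    logical, allocated = folder_sizes(sys.argv[1])
    print("logical size:  %d bytes" % logical)
    print("size on disk:  %d bytes" % allocated)
    print("space saved:   %.1f%%" % percent_saved(logical, allocated))
```

As a worked example, a folder with a logical size of 40GB that occupies 28GB on disk gives 100 × (40 − 28) / 40 = 30%. The figure can come out slightly negative for a folder of small uncompressed files, because allocation is rounded up to whole blocks, and hard-linked files are counted once per link.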

It would be nice if Apple gave more direct access to this feature, but in the meantime afsctool is useful, and it may be that this is simply a stop-gap feature until Apple’s new filesystem hits the streets.

