
In 1977, NASA launched the two Voyager spacecraft, which went on to become the longest-running space mission to date and the only one whose spacecraft have travelled beyond our solar system.
Onboard the Voyager spacecraft were many interesting instruments designed to scan, in the highest detail possible at the time, the planetary bodies the spacecraft would fly by. But one tiny revolution remained hidden. With so much data being collected, and the Voyager spacecraft so far away, it was almost impossible to send everything back. What followed was the invention of the Rice algorithms, something that continues to be used today.
Before you get confused: no, the algorithms have nothing to do with rice (the food). They are named after Robert F. Rice, an award-winning scientist at NASA, and they describe a new method of compressing data, especially images.
These days, most images are saved in a digital format, meaning they must be processed by a computer. To turn a real scene into an image on a computer, the data representing the image (e.g., the position of each pixel and the color value at that position) must be stored. For images with lots of detail (high resolution), this becomes cumbersome as the file sizes grow huge.
As many enthusiastic photographers will already know, image formats can be split into lossy formats and lossless formats. Lossy formats (such as the default JPEG option) aim to recreate an image that looks like the original, but not necessarily pixel-for-pixel identical; a few pixels here and there may not be 100% right. Lossless formats aim to reconstruct the image exactly. However, as mentioned previously, their file sizes can be prohibitively large. There is therefore a preference for lossless formats that compress more compactly. This is especially true when the network bandwidth is slow but the image must be recreated exactly for scientific reasons.
Prior to "Rice", most scientists used a technique involving Huffman tables. In this process, the string of image data is converted into a Huffman tree based on how frequently different parts of the data (each a leaf node) appear within the image. The tree branches are then labeled (like roads), and the path to each leaf node can be described by those labels; using the resulting table, the data can be recreated perfectly. The catch is that the Huffman table must be customised for each image, since the optimal tree differs from image to image. For example, an image of a galaxy may have lots of black pixels due to the backdrop of space.
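To make the idea concrete, here is a minimal sketch of Huffman coding in Python. It is not the exact coder the scientists used, just an illustration of the process described above: count symbol frequencies, repeatedly merge the two rarest subtrees, then label the branches 0 and 1 so that each symbol's code is the path to its leaf.

```python
import heapq
from collections import Counter
from itertools import count

def huffman_codes(data):
    """Build a Huffman code table for the symbols in `data`.

    More frequent symbols end up closer to the root of the tree,
    so they get shorter bit strings.
    """
    freq = Counter(data)
    if len(freq) == 1:  # edge case: only one distinct symbol
        return {next(iter(freq)): "0"}
    tiebreak = count()  # makes heap entries comparable
    # Heap entries are (frequency, tiebreaker, tree), where a tree
    # is either a bare symbol or a (left, right) pair of subtrees.
    heap = [(f, next(tiebreak), sym) for sym, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)   # rarest subtree
        f2, _, right = heapq.heappop(heap)  # second rarest
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (left, right)))
    _, _, tree = heap[0]
    # Walk the tree, labeling left branches 0 and right branches 1;
    # the label sequence down to each leaf is that symbol's code.
    codes = {}
    stack = [(tree, "")]
    while stack:
        node, path = stack.pop()
        if isinstance(node, tuple):
            stack.append((node[0], path + "0"))
            stack.append((node[1], path + "1"))
        else:
            codes[node] = path
    return codes

codes = huffman_codes("abracadabra")
# 'a' appears 5 times out of 11, so it gets the shortest code
```

Because no code is a prefix of any other, a decoder can walk the bit stream and recover the data perfectly, which is exactly why the table must be shipped (or rebuilt) for each image.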
One such Rice algorithm used by NASA, which requires no Huffman tables at all, was shown to produce compression results comparable to JPEG with custom Huffman tables. It is also simpler to implement: the Rice algorithm needs only one pass through the image data, whereas custom lossless JPEG requires two.
The Rice algorithm runs the data through 12 different optimal code options and selects the one that gives the best compression for the data at hand; this keeps performance efficient as the characteristics of the data vary.
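The core of each code option is the Golomb-Rice split: a sample is divided by a power of two, the quotient is sent in unary and the remainder in binary. The sketch below (my own illustration, not NASA's flight implementation, which also includes preprocessing and special low-entropy modes) shows that split, plus a `best_k` helper that mimics the "try several code options, keep the shortest" selection step.

```python
def rice_encode(value, k):
    """Golomb-Rice code a non-negative integer with parameter k.

    The value is split into a quotient (sent in unary: q ones then
    a zero) and a k-bit remainder (sent in plain binary).
    """
    q, r = value >> k, value & ((1 << k) - 1)
    remainder = format(r, "b").zfill(k) if k else ""
    return "1" * q + "0" + remainder

def best_k(values, k_options=range(12)):
    """Pick, from a set of candidate code options, the parameter
    that yields the shortest total encoding for a block of samples.
    """
    def total_bits(k):
        return sum(len(rice_encode(v, k)) for v in values)
    return min(k_options, key=total_bits)

# Example: with k=2, the value 9 splits into quotient 2 ("110" in
# unary-plus-terminator) and remainder 1 ("01" in two bits).
print(rice_encode(9, 2))  # -> "11001"
```

Small values produce short codes when `k` is small, while large values favor a larger `k`; choosing `k` per block of samples is what lets the coder adapt in a single pass, without ever building or transmitting a table.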
Later on, this algorithm was loaded onto specially designed hardware and spun out from NASA by Advanced Hardware Architectures. Some of the hardware was even protected by a patent (US20060115164A1) that is still active today.
Today, variants such as the Golomb-Rice algorithms continue to be used in lossless JPEG and other formats, for example in the medical profession.
So, the next time you see a spacecraft whizzing overhead, you never know what new “Earth-bound” technologies could be onboard!
