German Archivists Go Retro for Data Preservation
July 9, 2006
The data pool in the universe is continuously growing. More and more information is being saved digitally on CDs and DVDs -- but these thin silver disks have a lifespan of only about five years. That may be sufficient for the average user at home or work, but archives and libraries face the threat of massive loss.
At the end of June, specialists gathered for a colloquium in the library at the University of Stuttgart. They met to conclude the so-called ARCHE project, which sought a long-term solution for storing digital data -- and found one.
If only we had digital parchment
Emperor Constantine was extremely forward-thinking when he ordered that 10,000 disintegrating papyrus rolls be copied by hand onto "modern" parchment in the year 330. Parchment lasts for approximately 1,000 years. This ancient means of data preservation would be exemplary today, if we knew how to save digital data on parchment.
"We always get laughed at when we say we save digital data on microfilm. But at the moment we see that as the only automated way to process large quantities of data and still be able to have analog data storage with the highest possible information density," said Thomas Wendel, who is responsible for archive operations at the University of Stuttgart's library.
With a 500-year lifespan, color microfilm is only half as durable as Constantine's parchment, but it lasts 100 times longer than CDs and DVDs. For this reason, archivists, librarians and researchers are also making use of microfilm for digital data as part of a project called ARCHE. The name references Noah's biblical ark, because the project is meant to preserve valuable data -- instead of animal species -- for posterity.
Access for at least half a millennium
Wendel digitized slides, books and documents from the National Archive in Baden-Württemberg as a data source for the ARCHE project. Every photographed page has a corresponding digital table containing the information that would normally be found in a card catalogue.
"Our goal is not to compile data, but to find a way to make it available for as long as possible -- even the colorfastness. We're hoping for 500 years," said Wendel.
Low-tech solutions to high-tech problems
The cornerstone of the ARCHE project is a laser that is capable of writing digital data onto color microfilm. It was developed at the Fraunhofer Institute for Physical Measurement Techniques in Freiburg, which specializes in saving digital data on film. Findings from the institute have been applied in movies that rely on computer animation, such as "Jurassic Park," "Nemo" and "Ice Age."
With digital data saved on analog film "you have directly readable information that is independent from an operating system and from any kind of machine that might not be available in five years," said Wolfgang Riedel, project manager at the institute. "Who can still read a floppy disk?"
Accessing data on microfilm doesn't require here-today-gone-tomorrow technology: In a pinch, microfilm can be read with a loupe or microscope.
Digital-to-analog machine for the masses
"One strip of microfilm is 40 x 45 millimeters, no bigger than a slide. But on this format you can store about two million characters per color, and you have three colors that can be written on top of each other if you want, which makes sense when dealing with digital information," said Riedel.
That means six million characters -- or about eleven 300-page books -- on a piece of film no bigger than a passport photo. Color filters allow the layers of data to be read separately.
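The capacity figures quoted above can be checked with a few lines of arithmetic. The following Python sketch uses only the numbers given in the article; the characters-per-page value is derived from them and is not a figure stated by the researchers.

```python
# Figures quoted in the article
CHARS_PER_COLOR_LAYER = 2_000_000  # "about two million characters per color"
COLOR_LAYERS = 3                   # three colors written on top of each other
BOOKS_PER_FRAME = 11               # "about 11 300-page books"
PAGES_PER_BOOK = 300

# Total capacity of one 40 x 45 mm frame
chars_per_frame = CHARS_PER_COLOR_LAYER * COLOR_LAYERS

# Implied characters per printed page (derived, not stated in the article)
chars_per_page = chars_per_frame / (BOOKS_PER_FRAME * PAGES_PER_BOOK)

print(f"Characters per frame: {chars_per_frame:,}")
print(f"Implied characters per page: {chars_per_page:.0f}")
```

The result, roughly 1,800 characters per page, is in the range of an ordinary printed page, so the "eleven books" comparison is at least plausible.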
Starting in July, the company MicroArchive Systems, together with the Fraunhofer researchers, will be adding microfilm data preservation to its product list. Devices for the process will go on the market in the fall at the earliest.
"There are still situations that call for black-and-white," said David Gubler, manager of MicroArchive Systems. "We have an all-purpose instrument that can write in black-and-white and grayscale for the mass market, but also in color for the high-end sector."
The automated ArchivLaser can write about one terabyte (one trillion bytes) of data on 600 meters (2,000 feet) of film every day. In contrast, a person can only scan about 2,250 pages per day, so digitization is much more time-consuming than saving onto analog film.
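To put the ArchivLaser's daily output in more familiar terms, the quoted figures can be converted into data per meter of film and an average write rate. This is a rough sketch that assumes the device runs around the clock, which the article does not state.

```python
# Figures quoted in the article
BYTES_PER_DAY = 1_000_000_000_000  # one terabyte (one trillion bytes)
FILM_METERS_PER_DAY = 600

# Assumption: the laser writes continuously for 24 hours
SECONDS_PER_DAY = 24 * 60 * 60

bytes_per_meter = BYTES_PER_DAY / FILM_METERS_PER_DAY
bytes_per_second = BYTES_PER_DAY / SECONDS_PER_DAY

print(f"Data per meter of film: {bytes_per_meter / 1e9:.2f} GB")
print(f"Average write rate: {bytes_per_second / 1e6:.1f} MB/s")
```

Under that assumption, each meter of film holds somewhat less than two gigabytes, and the average write rate is on the order of ten megabytes per second.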
A mere drop in the data ocean
The next step, however, is to produce a device that can re-digitize data from the microfilm -- a device that won't be obsolete in several hundred years.
In spite of the analog conversion project, considerable losses of analog and digital data are expected, as the daily addition to the vast sea of data is 100 times bigger than the largest library in the world (the Library of Congress in Washington). Even if everything could be stored for a million years, how would posterity ever have the time to sift through it all?