O J Way Oren Rosenthal Austin, TX

Archiving Part I: Maintaining Primary Source Information

Posted in Technology by OJWay on May 23rd, 2008

Appalachian Folk Singer Mary Lomax
Note: I’m dividing this post into parts one and two. The inspiration for this post is a New Yorker article called “The Last Verse” by Burkhard Bilger about musicologists making field recordings of lost blues artists in the Georgia mountains. Part II is available here.


The Digital Memory Crisis

The Information Age has made long-term archiving much more difficult. It’s ironic. At first glance it’s easier than ever to store documents on a flash drive, on Yahoo! Briefcase, or with one-touch backup. But easy backup can leave people with a false sense of security, because stored digital information faces serious long-term hazards.

  1. Degradation of the storage media itself. Acidic paper and magnetic tape are the most obvious examples, but did you know that even CD-Rs can be counted on to last only 5 years or so? For example, *reportedly* 10-20% of the data from the Viking missions to Mars has been lost to degradation of the magnetic tape. (BTW, this has fueled conspiracy theories that NASA is suppressing information about life on Mars.) *I can’t find any authoritative confirmation of this figure though, so if you can find one please comment.*
  2. The obsolescence of media technologies often prevents the retrieval of digital information. As a vinyl record collector I know this phenomenon pretty darned well. A good example of this was the near-loss of data from the 1960 census that was stored in a format that could only be read by vintage UNIVAC tape drives. Good luck getting hold of one of those! It took several years to get that data back.

Taken together, the Council on Library and Information Resources calls these two hazards the Digital Memory Crisis. But there’s a third hazard as well:

  3. The needle in the haystack problem. The vast quantities of data being produced are often routinely backed up without regard to their long-term usefulness. The really useful stuff is saved alongside the dross of the information age, and the knowledge of where to find it may be forgotten after only a few days.

Way Back Machine

9-track tape with protection ring

So let me reminisce about my days at Raytheon. I was working on a radar system, but it was already well along its development path. It had been commissioned in the mid-80s, but it called for technology that was “tried and true” even back then, and presumably bug-free. That had a lot of disadvantages, but a cool part was that I learned to use some older technologies. We wrote code in Fortran (a language of the ’50s) for VAX computers (machines of the ’70s) running the VMS operating system (introduced in 1977).

These systems used 9-track tape, a half-inch-wide magnetic tape that came on huge reels that must have been nearly a foot in diameter. Capacity: 170 MB. The tape “drives” were mounted on the mainframe machines, which by the way were each the size of a refrigerator, often positioned next to each other and taking up a whole wall of the room. You had to mount the tape reel onto a hub and thread the end of the tape into a take-up reel. (My friend and former colleague Matt reminds me that we had a fancy vacuum-powered tape threader that sucked in the end of the tape so we didn’t have to attach it by hand.) The tapes were rewritable, but I don’t remember how you’d find the location of your stored data, or what the protocol was to rewrite. If you remember, please comment. The tapes came with a ring in the middle, which you can see in the picture, and when you removed it the tape was no longer writable.

Every quarter or so, or maybe it was every major release, we’d need to document the code for the sake of traceability, just in case we ever needed to revert to a last known good. So it was often my job to work those old tape drives and create the backups, a process known as “vaulting”. We’d always make two copies, one to keep on site, and another that would get stored in some kind of vault. There were dozens of those tapes hanging on racks in the computer room, which was pretty big in order to accommodate those massive VAX mainframes.

As for the vault, I never knew where it was, probably in the Pentagon, but in my imagination it was in some bunker buried deep within the Rocky Mountains. Anyway, those code backups are probably unreadable now, unfindable, and irrelevant.

WSBI (pronounced wĭz’bē)

On the other hand, the data obtained from the operation of the radar was pretty useful, and may even be useful today. The radar would take soundings of different layers of the ionosphere (the ionized upper region of the Earth’s atmosphere). As part of the prototype testing, we operated the radar in Amchitka, Alaska, and 20-year-old data obtained with what was then cutting-edge technology at such an extreme latitude could be of interest to climatologists today.

It was certainly of interest to us then. The best task I was ever assigned at Raytheon was a special project to review the data and find the best examples of seasonal ionospheric changes. I’d review a type of image called a WSBI (wide sweep backscatter ionogram) which looked something like this picture here. In this example, the image on the left shows three very distinct layers, but the image on the right is more typical and shows two that aren’t so distinct. Except our WSBIs were truly beautiful, in glorious colors against a deep blue background.

We used these pictures in training manuals, and they were also used in some scientific report that was presented in Australia. Perhaps the pictures survive, but who knows if we’ll ever see that raw data again. Maybe - maybe - they did end up storing it in the vault, where the data quality has degraded over the 16 years of storage on old magnetic tape (Hazard #1). Or maybe it will never be looked at again because who knows where you can find a 9-track tape drive, a machine that runs VMS, and the code that can read the data and turn it into WSBIs (Hazard #2). Or maybe they’ll never find those tapes because they’re filed away amongst the dozens of quarterly backups of long-ago superseded radar system code (Hazard #3). Or maybe it’s useless anyway.

If there are any climatologists out there who could use this data, here’s a place to start. Good luck.

P.S.

If you’re looking to save the important stuff, print it out and hang onto the hardcopy. That’s what I told all my clients when I was teaching computer basics to senior citizens. Today we can still easily read:

  • The Declaration of Independence from 1776: 232 years old

  • Shakespeare’s Plays from 1623: 385 years old

  • Gutenberg Bibles from 1454: 48 copies in existence on vellum and hemp-based paper, 554 years old

  • The Magna Carta of 1215: four known original copies still in existence, 793 years old
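
The ages in the list above are just subtraction from the year of this post (2008). Here is a minimal sketch of that arithmetic in Python, using only the dates listed above; the names `POST_YEAR`, `DOCUMENTS` and `age_in_years` are made up for illustration:

```python
# Year this post was written; the ages in the list are relative to it.
POST_YEAR = 2008

# (document, year) pairs copied from the list above.
DOCUMENTS = [
    ("The Declaration of Independence", 1776),
    ("Shakespeare's Plays", 1623),
    ("Gutenberg Bibles", 1454),
    ("The Magna Carta", 1215),
]


def age_in_years(year, as_of=POST_YEAR):
    """Return how many years old a document from `year` is as of `as_of`."""
    return as_of - year


for name, year in DOCUMENTS:
    print(f"{name} ({year}): {age_in_years(year)} years old")
```

Change `POST_YEAR` to the current year to see how much older the hardcopies have gotten since.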
