X-Message-Number: 5558
Date: Wed, 10 Jan 1996 08:20:38 -0800 (PST)
From: Joseph Strout <>
Subject: Re: Data Storage

Again, many thanks for your comments on my archiving suggestion.  First, 
I'd like to respond to some comments from Edgar Swank (#5554):

> There is at least
> one commercial service that does this for $40 per CDROM, although that
> would not include data preparation.

This is a good start, but it is not quite enough.  A service like this 
might serve as a subcontractor for another organization, though.  The 
characteristics we need (in addition to just creating CD-ROMs or 
whatever) are:

1. Safe storage: if we all keep our own CD-ROMs, they will probably get 
lost when we die (family members may not recognize their significance, 
etc.).  Or our house could burn down, they might get stolen, etc.  Better 
to have the company keep them in a safe -- or better yet, two safes, 
separated by a few thousand miles.

2. Long-term storage: Ideally, you could choose to (1) pay a monthly fee, 
or (2) pay a bigger up-front sum, for which they'll store your data 
"forever".  (If the service is associated with a cryonics arrangement, of 
course, the latter would be implied and probably covered by other fees.)

3. Active maintenance: the biggest losses of data are not through media
degradation, but through obsolescence -- we (as a society) forget how to
read old formats.  NASA has miles of tape that nothing can read anymore
(though, as I recall, they're attempting to decipher & upgrade them). 
This can only be avoided through (1) storing complete data on the format
in the same safe as the media, and (2) copying to new media whenever the
old becomes obsolete.  An interesting article on this problem appeared in
Scientific American recently: Rothenberg, J. Ensuring the longevity of
digital documents. Jan. 1995, vol.272, (no.1):24-9. 

4. Cryonics company involvement: I want my cryonics provider to at least 
know where that data is, and how they can get it when they need it.  
Better still would be for them to actually keep it for me, as CI does 
with some info now.


As you point out,

> OTOH, CDROM recorders for use with PC's now start in price around
> $1000, so anyone with a PC system already and especially if he already
> has a CDROM recorder for other purposes might want to assist on this.

...so rather than pay the above company forty bucks a disc, an alert 
cryonics provider might just invest in their own recorder.


Then in message #5556, Perry Metzger writes:

> 1) Once you get to a small enough level, "physical" and "chemical" are
>    the same thing.

Yes, but the CD-ROM pits aren't on a level that small.  I don't have the 
figures, but I have the impression that they're at least a few microns 
wide and deep, which is MUCH bigger than a molecular scale.  Can anyone 
give us the exact numbers?

> 2) CD-ROMs ... decay very nicely. Among other
>    things, degradation of the plastic and glues that surround the
>    pitted metal surface occur, as well as fun things like
>    photodegradation of the surface itself.

Photodegradation is greatly reduced by keeping them in the dark.  =)  
(This is a problem to which film is prone too, I might point out.)

>    Some early CDs sold at the beginning of the CD era have
>    become useless because of glues decaying or opaquing.

I think you're assuming that the data is gone when you can't put the disk 
in your home CD reader and read it.  In fact, however, the data is almost 
certainly there; you'd just need a microscope to see it, and more 
sophisticated automation to read it.  It would take a long time for the 
pits to disappear completely, I think.

> 3) Properly developed and fixed b&w photographic negatives have a
>    *demonstrated* lifetime of at least six or seven decades, and in
>    some cases a century or more.
> 4) I have, in the past, routinely used thirty and fourty year old
>    microfiche with no noticeable decay other than that from use.

This is not comparable, because you're not attempting to read this data 
digitally.  If you did, they would not have demonstrated nearly this 
lifetime.  Instead, you look at them visually, as *analog* image data, 
which your brain (being very good at such things) can interpret despite a 
great deal of fading and noise.  With digital data, fading and noise mean 
"loss" as far as an ordinary reader is concerned.  So you're not judging 
the two by the same criteria at all.

Don't get me wrong: microfilm is not a bad idea either.  But I'd be more 
comfortable with digital storage, since we can expect it to be lossless 
(even after multiple copying), which is not true for the analog storage 
(esp. images) normally used on film.  Also, there is no standard 
format for storing sounds on film.

> The whole question is always "how much data do you want to store, and
> how much are you willing to let it degrade".

Add, "How much are you willing to pay."  This is why I didn't suggest 
chiseled stone or engraved metal, though I share your respect for these 
media. 

> I recommend going back to the future. Specially
> printed text in special OCR fonts on low acid paper. Special bar codes
> that are REALLY BIG and trivial to write scanners for.

I like this idea... in fact, rather than bar code, one could develop a 
robust and simple dot code which would encode any binary data equally 
well.  (Indeed, such a system was in use once; computer magazines printed 
the code in the sidebar, and if you had a reader, you could read the 
code directly into your computer.)  As long as the format was clearly 
defined and then left unchanged, this would do.
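
To make the dot-code idea concrete, here is a minimal sketch in Python.
It is NOT the format those magazines used; the layout (one byte per row,
8 data dots plus 1 parity dot, "#" for a dark dot and "." for a blank)
and all function names are assumptions chosen only for illustration.

```python
def encode_row(byte):
    """Return 9 dots for one byte: 8 data bits (MSB first) plus even parity."""
    bits = [(byte >> shift) & 1 for shift in range(7, -1, -1)]
    bits.append(sum(bits) % 2)  # parity dot makes the count of dark dots even
    return "".join("#" if b else "." for b in bits)


def decode_row(row):
    """Decode one 9-dot row; raise ValueError if the parity check fails."""
    bits = [1 if ch == "#" else 0 for ch in row]
    if len(bits) != 9 or sum(bits) % 2 != 0:
        raise ValueError("damaged row: " + row)
    value = 0
    for b in bits[:8]:
        value = (value << 1) | b
    return value


def encode(data):
    """Encode arbitrary bytes as a list of dot rows (hypothetical layout)."""
    return [encode_row(b) for b in data]


def decode(rows):
    """Recover the original bytes from a list of dot rows."""
    return bytes(decode_row(r) for r in rows)


if __name__ == "__main__":
    for row in encode(b"Hi"):
        print(row)
```

The parity dot only detects a single damaged dot in a row; it cannot
repair it.  A real format would want stronger protection and alignment
marks, but even this much is simple enough to describe fully on one page.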

> Unfortunately, it seems that people want to store lots of "bulky"
> information -- video, pictures, vast and bulky records, etc.

Yes, I think this is important.  At 72 dots per inch (we use such low res 
to reduce degradation), we fit about 50K on a page.  It would take a 
couple of file drawers to hold all my archival data this way.
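
The 50K figure can be checked with a little arithmetic.  The sketch
below assumes one bit per dot, no overhead for error checking, and a
printable area of 7.5 x 10 inches on letter paper; those numbers, and
the 650 MB archive size used for comparison (roughly one CD-ROM), are
my assumptions, not measurements.

```python
import math


def bytes_per_page(width_in, height_in, dpi):
    """Raw capacity of one page at one bit per dot, with no overhead."""
    dots = int(width_in * dpi) * int(height_in * dpi)
    return dots // 8


def pages_needed(total_bytes, page_bytes):
    """Number of pages required to hold total_bytes."""
    return math.ceil(total_bytes / page_bytes)


if __name__ == "__main__":
    page = bytes_per_page(7.5, 10.0, 72)   # assumed printable area
    archive = 650 * 1024 * 1024            # hypothetical archive size
    print("bytes per page:", page)
    print("pages for archive:", pages_needed(archive, page))
```

Under these assumptions a page holds 48,600 bytes, which agrees with
"about 50K".  Any error-checking dots would reduce that further.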

> Magnetic media have a known bad track record on this. Mag tapes
> recorded in the 1950s have in many cases decayed to uselessness. This
> is a well known problem. Paper isn't dense enough.

Agreed.  Magnetic changes are even more volatile than chemical ones.

> 1) Film, microfilm, and microfiche, and you accept the problems of
>    analog media.

Agreed.

> 2) Put the stuff on multiple redundant machine readable storage media,
>    use fiendish and expensive error correcting codes that would not
>    normally be used, and read and re-record the information onto the
>    most survivable known archival media every couple of years.

Yes, but I don't think this is as hard as you make it sound.  A simple 
checksum per block of data would suffice to detect errors.  Keep two 
copies of the data, and when an error is detected (due to a checksum 
mismatch), make a fresh copy from the good one.  Change your media type 
every ten years or so, when a new format becomes standard.
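
As a rough sketch of what I mean, assuming CRC-32 as the "simple
checksum" and a 2048-byte block (both arbitrary choices made for
illustration, as are the function names):

```python
import zlib

BLOCK_SIZE = 2048  # assumed block size in bytes (hypothetical choice)


def split_blocks(data, size=BLOCK_SIZE):
    """Split raw bytes into fixed-size blocks (the last may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def checksums(blocks):
    """Compute a CRC-32 checksum for each block."""
    return [zlib.crc32(b) for b in blocks]


def repair(copy_a, copy_b, sums):
    """Check both copies against the stored checksums, block by block.

    A block that fails in one copy is overwritten from the other copy.
    Returns the indices of blocks that failed in BOTH copies.
    """
    lost = []
    for i, expected in enumerate(sums):
        ok_a = zlib.crc32(copy_a[i]) == expected
        ok_b = zlib.crc32(copy_b[i]) == expected
        if ok_a and not ok_b:
            copy_b[i] = copy_a[i]
        elif ok_b and not ok_a:
            copy_a[i] = copy_b[i]
        elif not ok_a and not ok_b:
            lost.append(i)
    return lost


if __name__ == "__main__":
    original = bytes(range(256)) * 40
    blocks = split_blocks(original)
    sums = checksums(blocks)
    copy_a, copy_b = list(blocks), list(blocks)
    copy_a[2] = b"\x00" * len(copy_a[2])  # simulate damage to one copy
    print("unrecoverable blocks:", repair(copy_a, copy_b, sums))
```

The checksums themselves have to be stored with the same care as the
data.  If the same block goes bad in both copies, this scheme can only
report the loss, which is the case the heavier error correcting codes
are meant to cover.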

> Who knows if anyone will have good enough data to reconstruct the
> recording formats on other types of media, anyway...

That's why you need to define the format explicitly, and include all the
technical detail needed to build a reader.  (And how would you store this
information?  On high-quality paper!)  Of course, if we actively maintain
the data, this won't be necessary. 

,------------------------------------------------------------------.
|    Joseph J. Strout           Department of Neuroscience, UCSD   |
|               http://www-acs.ucsd.edu/~jstrout/  |
`------------------------------------------------------------------'
