The Network is the Archive (or should be)

Throughout history man has tried to figure out how to save his works forever, through schemes that usually involved some kind of violence done to some kind of very hard medium. Stone tablets, for example.

With the dawn of the digital age this need hasn’t diminished, but the digital storage mediums devised so far have lifespans far less than forever. You’re lucky if that CD you burned 5 years ago is still readable.

The ease, convenience, and economy of digital media have undermined our focus on making the things we create last forever.

I was reading an Ask Slashdot thread about this issue today, and though I read an awful lot of that thread, nobody seemed to be suggesting the answer that immediately occurred to me.

At the same time that it sabotages our archival efforts by being untrustworthy, the digital age offers us the opportunity to create a truly Forever Archive. Or at least it has since the network was invented.

What prevents a physical, local archive from lasting forever is its very physicality. It’s a popular tenet among techy types that if you really want to back something up you keep at least three copies:

  • One you’re using on a daily basis. The working copy.
  • One on a separate hard drive or burned to CD/DVD.
  • One on a hard drive at some other location.

Really hardcore types make one or more of these copies part of a “RAID array”: basically a storage system where the information is spread out over more than one hard disk and kept redundantly, so if any one of the hard drives fails your information isn’t lost.
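For the curious, here is a minimal sketch of how an array can survive a dead disk. Mirroring simply keeps whole copies; this example shows the other common trick, XOR parity, where one extra disk holds enough information to rebuild any single lost one. The three "disks" and their contents are made up for illustration, and real arrays work on fixed-size blocks with far more bookkeeping.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings together."""
    return bytes(x ^ y for x, y in zip(a, b))


def parity(blocks):
    """XOR all blocks together; the result can stand in for any one lost block."""
    return reduce(xor_bytes, blocks)


# Three hypothetical data disks, each holding one 4-byte block.
disks = [b"my m", b"om's", b" jpg"]
parity_disk = parity(disks)

# Pretend disk 1 died: XOR the survivors with the parity disk to rebuild it.
rebuilt = parity([disks[0], disks[2], parity_disk])
print(rebuilt == disks[1])
```

The catch is the same as with any local array: it only covers one failure at a time, and all the disks still live in the same house.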

This 3-copy scheme protects you from your hard drive crashing and your house burning down, but it doesn’t protect you from the media going obsolete, *all* your backups dying because you were too lazy to maintain them, the company that hosts your offsite data going out of business, or a nuclear war. Depending on the location of your offsite backup, it may not even protect you from a snowstorm.

Not to mention it’s kind of expensive.

The solution is to remove the archive’s physicality. Put the archive in cyberspace. Make it fuzzy, a cloud, an abstract idea that magically results in a concrete actuality of a JPG of my mom through the power of software and algorithms. An Internet RAID array.

Use the principle of distributed computing (for example, Folding@Home or SETI@Home), but rather than using idle CPU power, use unused hard drive space. I’ve got lots, and I bet most people have at least 10GB to spare somewhere.

Take 100,000 participating computers across the world and make a large disk out of them, and put your data on there– redundant and encrypted. Nobody on the network can see what part of the array they have on their computer, but everyone who contributes hard disk space gets some fraction of that amount of storage on the Forever Archive. This would give me…

What I Really Want

  1. A way to store information that will always be retrievable. Natch.
  2. Not to have to maintain hard drives or archives any more. I don’t mind keeping a local backup, for convenience. But archiving? What a pain, and I can’t personally achieve a true archive without great expense.
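Mechanically, the "large disk made out of many computers" idea above might look something like this. It is only a sketch under my own assumptions: files are split into chunks, each chunk is named by its SHA-256 hash, and each chunk goes to several peers chosen by rendezvous hashing (one standard technique for this, not something any existing network is known here to use). The peer names, the chunk size, and the function names are all hypothetical, and the encryption step is left out because Python's standard library has no cipher.

```python
from __future__ import annotations

import hashlib

CHUNK_SIZE = 4  # bytes; absurdly small so the example stays readable


def split_chunks(data: bytes, size: int = CHUNK_SIZE) -> list[bytes]:
    """Cut the data into fixed-size pieces (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def chunk_id(chunk: bytes) -> str:
    """Name a chunk by its hash, so a holder sees only an opaque identifier."""
    return hashlib.sha256(chunk).hexdigest()


def pick_peers(cid: str, peers: list[str], replicas: int) -> list[str]:
    """Rendezvous hashing: rank every peer by hash(peer, chunk id), keep the top few."""
    if replicas > len(peers):
        raise ValueError("not enough peers for the requested number of replicas")
    ranked = sorted(peers, key=lambda p: hashlib.sha256(f"{p}:{cid}".encode()).digest())
    return ranked[:replicas]


def place(data: bytes, peers: list[str], replicas: int = 3) -> dict[str, list[str]]:
    """Map each chunk id to the peers that should hold a copy of that chunk."""
    # A real system would encrypt each chunk before hashing and placing it.
    return {chunk_id(c): pick_peers(chunk_id(c), peers, replicas) for c in split_chunks(data)}


peers = ["alice", "bob", "carol", "dave", "erin"]  # hypothetical participants
plan = place(b"a picture of my mom", peers)
for cid, holders in plan.items():
    print(cid[:8], holders)
```

The nice property of this kind of placement is that nobody needs a central directory: anyone who knows the chunk id and the peer list can work out where the copies should be.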

In a nutshell:

Everybody who wants in designates some amount of space on their hard drive to be used for the network. In return you get space to use on the network: some fraction of what you donate, to account for redundancy and encryption. I would personally be willing to contribute 10GB of a hard drive in order to get 2GB of storage that would never die.
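To put rough numbers on that trade, here is a back-of-the-envelope sketch. It assumes the only overhead is redundancy, so giving 10GB to get 2GB works out to every byte being stored five times. The function name is made up, and the assumption that all 100,000 participants donate exactly 10GB each is mine, purely to make the arithmetic concrete.

```python
def usable_quota_gb(donated_gb: float, replication_factor: int) -> float:
    """Storage earned if every stored byte is kept `replication_factor` times."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    return donated_gb / replication_factor


# Donate 10GB, get 2GB back: that implies five copies of everything.
my_share_gb = usable_quota_gb(10, 5)

# Assumption: 100,000 participants who each donate 10GB.
network_raw_gb = 100_000 * 10
network_usable_gb = usable_quota_gb(network_raw_gb, 5)

print(my_share_gb, network_usable_gb)
```

Whether five copies is enough when home computers come and go all day is an open question; the point is only that the exchange rate falls straight out of the redundancy you pick.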

This kind of (sort of) already exists in the form of Freenet. From Wikipedia:

Freenet is a decentralized, censorship-resistant distributed data store originally designed by Ian Clarke. Freenet aims to provide freedom of speech through a peer-to-peer network with strong protection of anonymity. Freenet works by pooling the contributed bandwidth and storage space of member computers to allow users to anonymously publish or retrieve various kinds of information.

This sounds just like what I want! Except not really. All of that business about “censorship-resistant” and “anonymous” means that only creeps and hooligans want to use the thing. Focusing on the anti-establishment aspects of the cloud completely destroys its credibility. More quotes from the article:

Freenet’s founders argue that only with true anonymity comes true freedom of speech, and that what they view as the beneficial uses of Freenet outweigh its negative uses.

One analysis of Freenet files conducted in the year 2000 (before Freenet had proper support for web pages and chat) claimed that the top 3 types of files contained in Freenet were text (37 %), audio (21 %), and images (14 %). 59 % of all the text files were drug-related, 71 % of all audio files were rock music, and 89 % of all images were pornographic.

So, great. *eyeroll*

Most people using Freenet seem to want to host their illegally downloaded content on this network, in order to hide it from the authorities. And the founders of the system want it to serve as a way for Chinese dissidents to fight their government (which I’m in favor of, by the way). As long as the network is encrypted and I am protected from prosecution, I don’t care that my computer is being used to store that crap as long as I get my share. Besides, I don’t want a Forever Archive in order to store porn– I want to make sure I will never lose this picture:

My Mom

Therein lies the rub:

If Freenet is full of crap, there’s no way it’s going to catch on to the extent I need it to in order to serve as a Forever Archive. Average people will read the mission statement and think it does something other than what they want, or they will be turned off by the attitude or the bad press.

I really suspect that’s the case, because it’s been 8 years in development and Freenet still hasn’t achieved a 1.0 release. A small band of brothers fighting the good fight, trying to bring their technology to fruition despite no resources and no support. (Or rather no supporters who are actually willing to pay for things, for the most part. Yeah, go ahead, flame away.)

Besides, with their focus on anonymity and Sticking it To the Man (which I mostly support, by the way), their effort will go into those aspects of the technology rather than into making the network truly bulletproof, in the sense that what you put in you can always get out. Without looking further into it, what I’ve seen so far definitely doesn’t seem to be designed to serve as…

The Eternal Archive
(or “Cloud Archive” or “Forever Archive” or “Joe” or whatever)

So I think something else is called for. A peer-to-peer system that will serve this function, and which is truly a mainstream application that doesn’t devolve to the lowest common pornominator for freaking once on the Internet for crying out loud, with the following requirements as its foundation:

  1. Data put in will stay in and be retrievable in its original form forever or until the submitter takes it out.
  2. Data put in should exist several times across the network, to guard against loss. This is redundancy.
  3. Data in the system will be encrypted. Nobody will use it if other people can just paw through their information. This isn’t necessarily anonymous, just encrypted.
  4. Speed is not an issue, or at least not in the top ten. If it takes a day to retrieve 1GB, so be it, as long as that data is guaranteed to be uncorrupted.
  5. Network-agnostic. It would be ideal to have a group of software applications that allow parts of the Forever Archive to be placed on any kind of network: BitTorrent, Napster, Windows Sharing, intranets, Token Ring, whatever. Each computer in the system runs a little bit of the archive, and even if France gets nuked the archive survives intact.
  6. Focused on preserving information for users, not on ideology. Unless “preserving information” is an ideology, which in some cases, like the Internet Archive, it is.
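Requirements 1, 2, and 4 fit together in a simple way, sketched below under my own assumptions: remember a SHA-256 digest of the data when it goes in, and on the way out accept the first replica whose digest still matches. The function names and the sample data are hypothetical; a missing replica is represented as `None`.

```python
from __future__ import annotations

import hashlib
from typing import Optional


def digest(data: bytes) -> str:
    """Fingerprint recorded when the data is first stored."""
    return hashlib.sha256(data).hexdigest()


def retrieve(expected_digest: str, replicas: list[Optional[bytes]]) -> bytes:
    """Return the first replica that is present and matches the recorded digest."""
    for copy in replicas:
        if copy is not None and digest(copy) == expected_digest:
            return copy
    raise LookupError("no intact replica found")


original = b"mom.jpg bytes"
recorded = digest(original)

# One peer has vanished, one returns damaged data, one still has a good copy.
replicas = [None, b"mom.jpg bytez", original]
recovered = retrieve(recorded, replicas)
print(recovered == original)
```

This is slow if you have to try many peers, which is exactly why speed sits so low on the list: checking every copy is cheap compared with getting back a corrupted picture.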

You probably wouldn’t want to store really sensitive information in the cloud, but even *moderately* sensitive information would be safe for the average person. Diary entries, test scores, everything outside financial information and passwords– which are transient anyway.

If you want a “trusted” Forever Archive, meaning one where your data is stored only on computers that you or people you know own, you can already use Freenet in its recently-introduced “darknet” mode, which spreads the cloud over only those computers you designate as trusted. No one else can get to the information. But for a Dark Net you have to have those trusted computers on the network in order to use them. Which has been a problem for Freenet during the development of version 0.7:

For much of the development process of Freenet 0.7, there was no Opennet mode, so that users would have to find Darknet connections. This was partly because it simply wasn’t implemented, but partly because of developers’ hopes that a true F2F [Friend-to-Friend] network would emerge.

However, this did not work out, because in practice most users didn’t know anyone else using Freenet, so had to use an IRC channel or a Frost board to find total strangers to connect to.

In order for the archive to truly be forever, it’s gotta be spread out and redundant. The Freenet Darknet mode doesn’t seem to quite meet that need for me personally, since nobody else I know is going to care. I want this to be so popular it’s in applications by default. A plugin for iTunes, every BitTorrent client, every eMule client, Windows, Firefox, and my cat. I would totally donate a cat to become part of the Forever Archive.

Now that I’ve put the idea out there into the ether, I have no doubt that somebody somewhere will build it. Just like all those Robert Cringely ideas over the years…


