"Internet history is fragile. This archive is making sure it doesn't disappear" PBS NewsHour 1/2/2017
SUMMARY: What's online doesn't necessarily last forever. Content on the Internet is revised and deleted all the time. Hyperlinks “rot,” and with them goes history, lost in space. With that in mind, Brewster Kahle set out to develop the Internet Archive, a digital library with the mission of preserving all the information on the World Wide Web, for all who wish to explore. Jeffrey Brown reports.
WILLIAM BRANGHAM (NewsHour): We're increasingly cataloging our lives online, Facebook, Twitter, Instagram, seemingly endless YouTube videos. It's a digital-first and often digital-only world.
The advantages of this unlimited digital storage seem obvious, but how permanent are some of those records? How do we preserve digital history?
Jeffrey Brown has the story, part of our ongoing series Culture at Risk.
JEFFREY BROWN (NewsHour): So, this is like an ancient temple come to life in a modern age.
BREWSTER KAHLE, Founder, Internet Archive: In a modern day. It's a Greek-style building, which we loved, because the whole idea is the Library of Alexandria reborn now.
JEFFREY BROWN: It's an ancient idea, to gather and preserve the world's knowledge. But now that library will look like this.
These stacks of servers, Brewster Kahle told me recently, represent a 20-year and running effort to build a kind of digital library, and to essentially back up the ever-expanding World Wide Web.
BREWSTER KAHLE: In one of these would be 100 years of a channel of television. Or this much is all of the words in the Library of Congress.
We need to be able to preserve our digital history.
JEFFREY BROWN: Kahle was an early Internet entrepreneur who in 1996 founded the Internet Archive, a nonprofit that operates out of an old Christian Science church in San Francisco.
It was designed to address a fundamental flaw in the original creation of the World Wide Web by Tim Berners-Lee in 1989.
BREWSTER KAHLE: The wonder of it is, it's very, very simple. Anybody could go and set up a web server on their computer and make it available to the world.
Unfortunately, it's too simple. It's fragile, that if something happens to that piece of equipment, that Web site just, blink, is gone.
JEFFREY BROWN: If it's online, it lives forever, right? Well, no.
Kahle says the average lifespan of a Web page is just 92 days. Information is altered and deleted all the time for all kinds of reasons.
A 2013 Harvard study, for example, found that half the hyperlinks in Supreme Court cases, today's equivalent of footnotes, are broken, a phenomenon known as link rot. Government agencies remove documents, and companies fail, and with them the sites they host. Think of GeoCities, Yahoo! Video, and, more recently, the news site Gawker.
ABBY SMITH RUMSEY, Author, “When We Are No More”: People mistake the fact that the Internet is ubiquitous with the fact that it's permanent.
JEFFREY BROWN: Abby Smith Rumsey is the author of “When We Are No More: How Digital Memory Is Shaping Our Future.”
She began her scholarly career studying how information was purposely deleted in the totalitarian Soviet system. These days, she thinks, we have a new kind of storage and retrieval problem.
ABBY SMITH RUMSEY: It isn't permanent at all. And, in fact, the thing about digital technology is, you can inscribe something onto a computer, but you can't put it on a shelf and expect to pick it out at random at 50, let alone 500, years, and be able to read it.
In fact, you won't have the hardware or the software to do that. So, it's very fragile, indeed.