The most extensive Internet archive is the Internet Archive Wayback Machine. Many people don't know about it because it doesn't advertise widely. A non-profit Internet library founded by Silicon Valley millionaire Brewster Kahle in 1996, the Internet Archive aims to record and archive for posterity every webpage (and its contents) out there in Cyberland. Kahle's Internet Archive has collected copies of tens of billions of pages in the past ten years.
A major advantage of the Internet Archive is that it's so inclusive. If you want to locate and read a defunct site or establish a beginning date for your own site (especially useful if someone is violating your copyright by plagiarizing you on their page), you can look it up on the Wayback Machine on the Internet Archive's main page. All you need is the URL.
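The lookup described above can also be done from a script. What follows is a minimal sketch, not the Internet Archive's own tooling. It assumes the Wayback Machine's public availability endpoint (`https://archive.org/wayback/available`) and the replay address pattern `https://web.archive.org/web/<timestamp>/<url>`. The function names and the sample response are hypothetical, made up for illustration. The code only builds addresses and reads an already-downloaded response, so it makes no network calls itself.

```python
from urllib.parse import urlencode

# Assumed public Wayback Machine addresses (not taken from the article).
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"
REPLAY_PREFIX = "https://web.archive.org/web"


def availability_query(url, timestamp=None):
    """Build the address that asks whether `url` has an archived copy.

    `timestamp` is an optional YYYYMMDD string; the service is assumed to
    return the snapshot closest to that date.
    """
    params = {"url": url}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urlencode(params)


def replay_url(url, timestamp):
    """Build the address that replays `url` as captured near `timestamp`."""
    return f"{REPLAY_PREFIX}/{timestamp}/{url}"


def closest_snapshot(response):
    """Pull (snapshot_url, timestamp) out of a decoded JSON response.

    Returns None when the response reports no available snapshot.
    """
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"], closest["timestamp"]
    return None


# Hypothetical response, shaped like the JSON the endpoint is assumed to return.
sample_response = {
    "archived_snapshots": {
        "closest": {
            "available": True,
            "url": "http://web.archive.org/web/19970101000000/http://example.com/",
            "timestamp": "19970101000000",
            "status": "200",
        }
    }
}

print(availability_query("example.com", "19970101"))
print(replay_url("http://example.com/", "19970101"))
print(closest_snapshot(sample_response))
```

To establish a beginning date for your own site, you would request the earliest date you care about and read the timestamp of the closest snapshot that comes back. Fetching the query address (for example with `urllib.request.urlopen`) and decoding the JSON is left out here so the sketch runs offline.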
Another major Internet archive is Google Groups, which took over the former Deja News library of Usenet group messages. Deja News began in 1995 and wound down in 2000. Its shopping service was sold to eBay, and its Usenet archive went to Google in 2001. Google continues to maintain the archive for those accessing Usenet. It holds millions of messages from tens of thousands of Usenet groups like alt.spanking, going back to the early 1990s.
Another large and popular Internet archive, with more recent discussion threads, is Yahoo Groups, an archive of self-contained listservs. Yahoo Groups not only archives every message, including those of groups that switched to Yahoo from another service, but also makes those messages searchable on the Internet unless the moderator or owner makes them private.
You can also look up old versions of webpages in a more ephemeral and scattershot Internet archive: search engine caches, most notably those in Google and Yahoo. These are temporary copies of webpages that you can access in case the current version is down. A cached copy can last for months after a webpage goes down for good. To access it, look for the "cache" link right after the URL of the search hit. Web caches are especially useful for finding a blog entry that has been moved or even deleted. You can also use them to find a webpage that has recently been erased, though the webmaster may take steps to block a cache copy or have it erased afterward. Also, very new webpages won't have cached copies until the search engine's web spiders have crawled and recorded them. This can take up to a week or so.
Another type of Internet archive is the private or subject-related database like Internet Movie Database (IMDB), especially one with long-term and extensive discussion boards. But these may turn out to be ephemeral, since their mission statements are based on business concerns, not an intent to archive information for posterity. Wikipedia is another type of Internet archive: the encyclopedia that seeks to inform and/or educate. Associated Content, a collection of articles on consumer subjects, is a similar type of Internet archive.
There are always gaps in an Internet archive. People or groups can have their messages or pages deleted, though this is not that easy. Also, password-protected pages (like Yahoo groups with private message archives) are not recorded by long-term projects like Kahle's Internet Archive. These types of pages will remain poorly represented in the historical record that Kahle's group hopes to leave to the future. Meanwhile, administrators of non-Usenet groups, lists and blogs can delete any messages they choose from whatever Internet archive they control. This is a far more common practice on public bulletin boards (as at IMDB), or blogs where the illusion of anonymity is thin, than on private groups where administrators might be more interested in keeping a complete record of messages for their members.
Obviously, all of this archiving raises some concerns in terms of security and personal privacy. If something you said when you were an anarchist twenty-year-old is still up when you're forty and working for a corporate bigwig, this can throw a monkey wrench in your career plans. It is possible to have something erased from an Internet archive, but it can be difficult and you have to know where you posted something in the first place - and who has the page now.
But these concerns about Internet archives are also based on a fallacy: that the Internet ever promised anything but the random and indiscriminate spread of information pretty much anywhere for as long as the 'Net lasted. The Internet never really was private. People sitting in front of their screens at home invented the illusion of privacy in their own minds. Cyberspace doesn't work that way. It's not so much like chatting in your house as standing, whispering, in a marketplace full of microphones. And you can't take it back because you did say it. So, be careful what you whisper; in an Internet archive, it could echo for a long time.
Published by Paula R. Stiles
A 42-year-old American, I've taught fish-farming in Africa, run a rescue squad in Vermont and done a PhD in Scotland. You can find my published articles in history and both SF and Fantasy stories at: http://...
- "Internet Archive Wayback Machine" (www.archive.org/)
- "Google Groups" (groups.google.com/)
- "Yahoo Groups homepage" (groups.yahoo.com/)
- "Internet Movie Database" (www.imdb.com)
- "Deja News-Wikipedia" (en.wikipedia.org/wiki/Deja_News)
- Mieszkowski, Katharine. "Dumpster diving on the Web." Salon.com (Nov. 2, 2001) (archive.salon.com/tech/feature/2001/11/02/wayback/)
- Lasica, J.D. "The Net Never Forgets." Salon.com (Nov. 25, 1998) (archive.salon.com/21st/feature/1998/11/25feature.html)
- Anything posted to the Internet can persist long after it's been deleted.
- Internet Archive has been recording publicly accessible webpages and their contents since 1996.
- Google Groups records and stores all Usenet discussions past and current.



