ASCII by Jason Scott

Jason Scott's Weblog

That Time I Put BITSAVERS into ARCHIVE.ORG

Everybody needs a new year’s resolution. Most of them also need to follow through. I am going to try both.

It’s a simple one, too: “By the end of 2013, change online computer history forever.” I’ve been working at the Internet Archive for nearly two years now, every day better than the last, with memories and happiness among the finest I’ve known. Now it’s time to secure the rigging and sail into the sunrise.

To that end, I am concocting several grandiose projects intended to present the maximum amount of computer history data in the best and most accessible ways that I can, and to ensure it’s within arm’s reach for research, knowledge and reference. Everyone has subjects and specialties they particularly enjoy – this one is mine.

So let’s begin.

I am shoving over 25,000 manuals, reference sheets, catalogs, code listings and books into the Internet Archive from the directories of the Bitsavers Collection. As we speak, I’m past 2,000 of them, and hundreds are coming in every few hours.

Bitsavers is a brilliant scanning project that has been a dark horse wonder on the internet for years now. It’s the hard work of multiple good people including Al Kossow, who is the curator at the Computer History Museum in Mountain View, California. With millions of pages scanned from intimidating numbers of sources, Bitsavers is a vital resource, well maintained, and extensively mirrored.

It was only a matter of time, then, before scripts I’d written were used to ingest all of Bitsavers into a collection at the Internet Archive. It’s only a few dozen gigabytes, right?

ingestor, the script I wrote that does it, will eventually be one of a number of public tools I’ll provide to help people bring bulk-upload projects up to speed. A number of Archive Team members already use these tools. They’re fast, the error checking is a-ok, and once you have a bunch of a certain type of file, you just sit back and watch the Magnificent Contraption (my name for the Internet Archive’s processing infrastructure) in awe.

The Internet Archive is, at its heart, a reading machine – a place where the data can be experienced (audio, video and books) by downloading or streaming all sorts of media. Bitsavers has, to its credit, heavily prioritized acquiring scans and data over presenting it all in a cute little package. Combining these two forces results in an unstoppable library.

Here are some documents to check out, to see what I’m talking about:

And, I promise you, there’s a lot more in here.

Is it buried? A little. I’ve got plans on how to fix that as well. But for now, I’ll be shoving these documents in as fast as my scripts can send them and the Magnificent Contraption can OCR/Convert them.

Again, I had nothing to do with the scanning and arrangements of these wonderful documents. I’m just putting them into another framework, another place. And I hope that the toil and effort taken by the Bitsavers volunteers can get even wider recognition.

Stop in, browse around. You might be surprised what you find.

And things are just getting started.


Categorised as: computer history


2 Comments

  1. Thank you Jason for doing this! Great job! I was made aware of the news here: https://plus.google.com/u/0/107049823915731374389/posts/Z8qx726whhF

    Feel free to pop in and discuss it.

  2. […] Jason Scott, who works at Archive.org, is currently putting loads of old tech manuals and documents from Bitsavers.org online, including this fantastic manual for a TEC terminal from 1968: The Man Machine Interface. After the click, a double-page spread from it with retrotech porn in high-res. […]