ASCII by Jason Scott

Jason Scott's Weblog

The BITSAVERS Renewal

The Bitsavers-Internet Archive bridge continues to be a wild success.


(I mentioned this whole thing in an announcement a couple of years ago.)

If you’ve not been aware, there’s this amazing, amazing project undertaken by a small handful of individuals to go through stacks of computer documentation and just flat-up digitize it all. No muss, no fuss, no flamboyance, no showtunes. In go endless booklets, manuals, blueprints and spec sheets, and out come PDFs. By the truckload, it can feel like.

The only problem, and it’s a minor polish considering the gargantuan amount of effort put into it, is that the bitsavers site is decidedly spartan. It’s just directory after directory, without any real way to browse the things. You kind of have to know what you want, and you have to download PDFs to see if they’re what you’re seeking. Again, it’s a minor complaint considering the hard, heavy lifting of grabbing this material and digitizing it is being done, dependably, for years on end.

Some time ago, I leveraged being a bitsavers mirror to write some code that would absorb newly digitized items into the Internet Archive’s collections. By doing this, the PDFs get previews, some amount of word analysis, and an online reader. Win-win.

That was some time ago. The collection is now well past 28,000 individual items, and if you go digging through them, there are gems aplenty. And, literally, many, many more.

This has been running pretty smoothly for years, but even the best of automated scripts break over time, and I sat down and did some much-needed maintenance.

I finally wrote the script that needed to be written some time ago – it goes through what actually got uploaded and what missed out for a billion reasons over the past few years. It turns out, between broken connections, system downtimes, and the many pieces that could go wrong, over 4,000 files had been skipped over. Those are populating as we speak.
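For the curious, the heart of a reconciliation pass like that is a set difference: list what the mirror holds, list what actually landed, and subtract. Here's a minimal sketch of that idea in Python. It is not the actual script; the function names, the directory layout, and the sample paths are all made up for illustration, and it assumes you already have the list of uploaded paths from somewhere.

```python
import tempfile
from pathlib import Path


def mirror_pdfs(mirror_root):
    """Return the relative POSIX paths of every PDF under the mirror directory."""
    root = Path(mirror_root)
    return {
        p.relative_to(root).as_posix()
        for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() == ".pdf"
    }


def find_skipped(mirror_paths, uploaded_paths):
    """Return, sorted, the mirror paths that are absent from the uploaded set."""
    return sorted(set(mirror_paths) - set(uploaded_paths))


if __name__ == "__main__":
    # Build a throwaway mirror with hypothetical paths so the sketch runs anywhere.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for rel in ("dec/pdp11/handbook.pdf", "ibm/360/principles.pdf"):
            target = root / rel
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_bytes(b"%PDF-")

        # Pretend only one of the two files ever made it up.
        uploaded = {"dec/pdp11/handbook.pdf"}

        for rel in find_skipped(mirror_pdfs(root), uploaded):
            print("needs upload:", rel)
```

The nice property of doing it this way is that the pass doesn't care why a file was skipped. Broken connection, downtime, whatever: if it's on the mirror and not in the uploaded set, it goes back in the queue.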

In total, it looks like in about a week’s time, the number of items in the bitsavers collection will go from 28,000 to at least 32,000. That’s books, magazines, brochures, articles… and due to the highly focused work of the bitsavers folks (primarily Al Kossow, who does most of the scanning), the material is very agnostic – it’s not someone who loves Apple scanning nothing but Apple, or a person who likes videogames, or even someone who thinks it should only be brochures or manuals that get scanned. It’s everything, literally everything that sat near or around computers.

Closing the air gap from stacks of paper in a warehouse to digital files is a long, boring, intense road. I’m so glad someone is doing this, and I hope the collection, as it stands on the Internet Archive and the other mirrors, is of a lot of use to a lot of people for a very long time.

Categorised as: computer history | Internet Archive
