Hello, everyone. My name is Jason Scott, and my position is “free-range Archivist” at the Internet Archive (archive.org). I’m here as a private individual, and not as a spokesman for the Archive in any way.
I’ve been working there for about two years, and I used the Archive and interacted with its staff for many years before that. I’m now a full-time employee and have been busy bringing several hundred terabytes of data into their stacks. Along the way I’ve been adding all sorts of material, ranging from videos, magazines, and books, all the way through to software, website snapshots, and scientific papers.
However, my main interests in life seem to center around computer history, especially home computer history of the 1970s and 1980s. To that end, I’ve made a documentary about computer bulletin board systems, as well as a documentary on text adventures. I’ve uploaded most of the raw interview footage of these films to archive.org as well.
As a few people noted here (and elsewhere), I’ve begun uploading collections of program images named in the TOSEC format to archive.org. These are being added as large ZIP files, which work better within the archive.org item framework. A .ZIP browser built into the system allows per-image references. A good example of this system in place is here: http://archive.org/details/Camputers_Lynx_TOSEC_2012_04_23
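To give a rough idea of what "per-image references" inside one large ZIP means, here is a minimal Python sketch that lists the individual files stored in an archive. This is not how the archive.org ZIP browser is implemented; it uses only the standard zipfile module, and the function names and the sample file names are invented for illustration.

```python
import io
import zipfile


def list_images(zip_bytes: bytes) -> list[str]:
    """Return the sorted names of the files stored in a ZIP archive."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        # Skip directory entries so only actual image files are listed.
        return sorted(
            info.filename for info in archive.infolist() if not info.is_dir()
        )


def build_sample_zip() -> bytes:
    """Build a tiny in-memory ZIP with made-up file names (assumption)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("Games/Sample Game (1984)(Some Publisher).tap", b"\x00")
        archive.writestr("Applications/Sample Tool (1983)(-).tap", b"\x00")
    return buffer.getvalue()


if __name__ == "__main__":
    for name in list_images(build_sample_zip()):
        print(name)
```

Each name returned this way is the kind of path a browser could expose as a separate reference within the single uploaded item.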
In the case of these items, I’m standardizing on the date of the set for that platform, with this first set being moved right up to the date of the collection I acquired (2012-04-23). As updates are done, I’ll make new items. (I realize this means lots of redundancy in the image collections, but space is not a problem at the Archive, and it’s easier to just have multiple items and move people forward over time.) It’s all still in rough shape and will be refined in the future.
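The date-based naming can be sketched in a few lines of Python. This is only a guess at the pattern, inferred from the single example identifier above (Camputers_Lynx_TOSEC_2012_04_23); the function name is hypothetical and the real naming may differ for other platforms.

```python
from datetime import date


def item_identifier(platform: str, set_date: date) -> str:
    """Build an item name of the form <Platform>_TOSEC_<YYYY_MM_DD>.

    Assumption: spaces in the platform name become underscores, as in
    the one example item linked above.
    """
    platform_part = "_".join(platform.split())
    return f"{platform_part}_TOSEC_{set_date:%Y_%m_%d}"


if __name__ == "__main__":
    print(item_identifier("Camputers Lynx", date(2012, 4, 23)))
    # prints "Camputers_Lynx_TOSEC_2012_04_23"
```

Because the date is part of the name, an updated set for the same platform gets a new item instead of overwriting the old one, which matches the "make new items" approach described above.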
I don’t wish to pull anyone’s energy or time away from the TOSEC work being done – I just knew this project was going to gain attention over here, and I wanted it known I was the person doing this. Now for the why.
Huge organizations, museums and archives and libraries alike, have begun taking an interest in preserving software or aspects of software. In some cases they wish to preserve the items (say, a boxed commercial program) while in others, they find themselves desperately in need of older software (say, a copy of a word processing program or spreadsheet) to allow them to look at old files that have been donated to them. These organizations are often slow, are constantly hindered in their actions by management or administrative concerns or standards, and are often forced to make lesser-of-two-evils decisions when it comes to the software being preserved.
TOSEC, meanwhile, has run a decade-plus massive worldwide effort to agnostically save as much of this software as possible. TOSEC has, without question, blown past any professional effort in terms of the size and breadth of the software they’ve quantified and described. It is a stunning achievement. I have brought professional archivists near to tears showing them the work TOSEC has done.
So I’ve put it on the Archive. I realize there are concerns and debates about this effort, and I understand them. The Internet Archive is a non-profit library with worldwide servers dedicated to bringing humanity’s knowledge to as much of the world as possible. We are best known for the Wayback Machine, but we have also scanned over 2 million books and put most of them online, as well as thousands of movies, hundreds of thousands of music tracks, and an extensive collection of television news programs from around the world. Every 90 seconds, the Archive adds a new book (see http://statusboard.archive.org), and many, many new files are uploaded every day, of all types.
I respect the TOSEC effort, and hope to mirror as much of it as will shake out over the next couple months and years at the Archive. It’s a bold experiment, to be sure, but I believe very strongly that computer history needs to move forward and software must be treated like the culturally relevant artifact it is.
I’m reachable at firstname.lastname@example.org for comments and questions.