ASCII by Jason Scott

Jason Scott's Weblog

That Time Archive Team Decided to Back Up The Internet Archive —

It’s inevitable that Archive Team would try to archive the hand that archives it.

We dump a lot of data into the Internet Archive – hundreds of gigabytes a day. And the Archive itself has a goodly amount of petabytes in its stacks. Thanks to a series of articles and appearances, the Archive’s getting some pretty good general attention. Lots of it. People are amazed, filled with wonder, impressed.

They also tend to ask the same set of questions. Some of them tend to deal with the archive’s “backup plan” or various off-the-cuff engineering questions. It’s natural, I suppose. The Internet Archive definitely has engineering and backup plans; let’s get that straight.

But the idea intrigued me, just because I like the notion of there being data that people recognize is precious (“digital heritage” is still a new and not universal concept), and of the inherent power people felt in the Archive Team downloading projects being applied to storing away additional copies of collections elsewhere, not bound by geography, politics or censorship.

So, I kind of launched into the idea of an experiment to back up the Internet Archive. Here’s the initial essay and random thoughts about it. (It’s not required reading.)

What followed then was a miniature storm, with a bunch of people weighing in about how such a thing “should” be done, how impossible it was, good people will die on the beach, etc.

But after a couple weeks of poking at the project with a stick, a working prototype came into being. We’ve been working on it, here and there, ever since, and right now, roughly 10 terabytes of Internet Archive materials are now backed up in at least three geographically separated areas around the world.

More thoughts after the short list of relevant information I wanted you to have.

  • Again, it must be stressed, this is not an Internet Archive Project. Engineers and admins at Internet Archive work all day to make the site resilient. This is 100% separate.
  • We have 47 people/clients helping at the moment. We’re ready to take on many, many more.
  • Here is a page showing the current status of the project. You can see how we add more data, and how we have people worldwide contributing.
  • As the project absorbs and verifies the 3 additional copies of the collections, additional collections are being added. So the more people, the better.
  • If you’re packing a few hundred gigabytes of disk space (or more!) connected to the Internet, and it’s mounted on a machine running a Unix/Linux variant, read up here.
  • The disk space you contribute need not be permanent – if you need it back, you can delete data in stages and the system will deal with it. We just want to use space you weren’t using anyway.

Again, the startup document for getting git-annex going on your system is located here.
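For a sense of what participating involves, the general shape of a git-annex contribution looks something like the commands below. This is a generic sketch of ordinary git-annex usage, not the project’s actual instructions – the repository URL and description are placeholders, and the startup document above is the authoritative source.

```shell
# Generic sketch of joining a git-annex backup swarm.
# The URL below is a placeholder -- use the one in the startup document.
git clone https://example.com/IA.BAK.git ia.bak
cd ia.bak

# Register this clone so git-annex can track which copies live here.
git annex init "my-backup-node"

# Pull down content up to the disk space you want to donate;
# --auto respects the repository's preferred-content settings.
git annex get --auto

# Need the space back later? git-annex refuses to drop content
# if it can't verify that enough other copies exist elsewhere.
git annex drop --auto
```

The appeal of git-annex for this job is that “how many copies exist, and where” is tracked metadata, which is what lets the project verify three geographically separated copies rather than just hope for them.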

Some thoughts.

First, the resistance and anger from some quarters when I brought this up was unexpected, although looking back, I guess it was inevitable. The idea that it might be done “wrong” in some way, that some flawed attempt to back up the data would be worse than the status quo, seems to be endemic. Regardless, I strongly believe you need something done before you can improve it, so we’re pushing on.

Next, the way to back up the Internet Archive is not to back up the entire Internet Archive – it’s to move forward, incrementally, playing the game of “what is in here that’s almost nowhere else, and the world would be rather poorer for it being gone?” In that way, we go for more of the “historical usenet” and “old time radio recordings” than, say, a random 1990s dance music collection. That said, as things go on, and if this experiment is successful, the dance music will get gathered up as well.

Finally, what I like about this experiment is the amount of learning that goes into it. I like being on the ground, asking the questions that need to be asked – how big exactly is the whole thing? What sort of problems occur when you’re tracking petabytes of data to back up? How much disk space is floating out there, unused, looking for a purpose, even if only temporary? What constitutes vital digital heritage? Finding out answers to those questions, getting the answers down, talking about what the whole thing means – that’s where learning comes from.

IA.BAK – it’s the best thing you could be doing with unused disk space.


Behold the Emularity —


When, exactly, is the right time to introduce some new program or technology? What if, as is often the case, it is not quite 100% done or fully tested? Do you hold back until every corner has been sanded and every surface polished to a shine? Or do you put it out, splinters and smelling of the sawmill, and let it be finished in the open?

Well, now you know my opinion.

Announcing, everybody, The Emularity, once a simple name for a post on this weblog, but now its own piece of open-licensed software/scripts. I’ve been working with our lead coder on this project, Dan Brooks, for a number of months on this thing, and while it has a fairly spectacular and intimidating learning curve, I think it’s set up well enough to justify handing it out.

Consider The Emularity to be in a Beta state.

If you’re not in the mood to play around with a still smoking script package, then hold back and wait a little while until the Emularity is in better shape as a distributable. Oh, make no mistake, it works and works well. But there’s still some pieces to iron out and documentation to write, along with examples to help you.

Have I dropped enough caveats and warnings? Great.

The Emularity came about because while the Internet Archive has this wonderful collection of tens of thousands of software packages (and it will continue to grow), the fact is that the methods and programs used to bring in-browser emulation to you shouldn’t be locked away. They should be freed up, given away, and provided for others to build on, improve, debug and assist with the next level of adoption: worldwide.

With this software (and a bit of noodling), you can now run emulation for anything, anything, in the realm of JSMESS, JSMAME and EM-DOSBOX, which are the three main Javascript emulators running at the Internet Archive. In fact, as the Emularity page shows, we have provided three sets of javascript emulators, already compiled and waiting, that you can just drop right in along with the acquired software and just have everything run.

Let’s say it again:

We Just Made In-Browser Emulation Ubiquitous, Persistent, and Easy.

People are figuring out how we did the work with the in-browser emulation at the Archive and there’s certainly been a handful of people and sites that went off and cloned things and tinkered enough to get them running on their own machines. Bravo. Always wanted that.

But The Emularity is the attempt to make it so simple, so very effortless, to install an emulator for the software/console/machine of your choice and just have it run – in a local file directory, or a website, or file server in your school or office. Nothing less, in other words, than turning emulation into something that you can do as simply as you might drop a .ZIP file into a directory.


Because it’s time.



After a year plus of life with JSMESS and its variants/cousins, it’s time to increase the audience even further than it has been. To make it so that contributions to Emularity, or the JSMESS/JSMAME projects, are rewarded with your work being able to spread worldwide. It’s no secret – our projects need people, smart people, who can look at the work we’re doing with our loader and the .js files that the emulators consist of, and go “Oh, I know how to make that run twice as fast/twice as responsive” and contribute those improvements.

I’m asking people to come in, work with the Emularity, complain about this aspect or that aspect, improve it, and contribute back the improvements. I want people to wander over to the emulators themselves and help us find ways to speed them up, to optimize the code/compiling so the system requirements for the best sound and performance are much lower than they’ve been up to this point.

It’s working. I want it to work better. And I want it, just like all the rest of the volunteers who’ve given countless hours to this project, to work everywhere.

Let’s do this thing.

Let’s celebrate the dawn of the Emularity.

Lazy Game Reviews: The Lazier Response —

In January of 2015, Lazy Game Reviews, a site dedicated to reviews of retro and historic video games (especially of the DOS stripe), reviewed the Internet Archive’s MS-DOS Games Collection that was making a bit of news back then.

It was…. mixed. Here’s a link to the video of the review.

I didn’t see much point in responding at the time, especially in the Hellmouth of Youtube Comments, but we’ve had a few months with this MS-DOS emulation out there (and a year of console/other computer emulation before that) and so maybe it’s just time for me to respond to some of the more salient points in the review, in no particular order.

So, here they are, lazier responses to a lazy review.

  • “I already use DOSBOX and have most of this.” – The purpose of doing in-browser emulation is not to compete with people who are the types of personalities to go through installing emulators on their home system, or who collect scads of old programs to play in those emulators. What it does do is provide instant access to old programs, for a massive variety of platforms (not just MS-DOS, but over 25 in the Internet Archive system, not counting the arcade games), and most importantly, keep access to the most obscure and edge-case programs that even the most strident hoarder would think twice about keeping around. Yes, you can play a DOOM clone or a famous role-playing game, but you can also try out a pregnancy calculator, a US Savings Note Valuation program, or an Exercise Bike Companion. If the standard emulator user was keeping those on their hard drive, I’ll dine on Roast Chapeau tonight.
  • “Other sites have done this.” – Other sites tend to have Java plugins, Chrome or Firefox extensions, or similar requirements to install extra “stuff” into the browser. Spoiler alert: People are doing that less and less, having been ultra-burned in the past. And in many cases, these sites provide you with rich, beautiful ads to accompany your program use and browsing. Some even do some roundabout amount of charging. The goal from the beginning was “one click, and you’re there” with the Internet Archive experience, and we’ve generally done that. It’s a little difference, but it does make the world.
  • “Some of the games are broken.” – Some were too fast, some didn’t work in the emulator quite as expected, and some just crashed. Over time, I’ve repaired what people have brought to my attention, or removed the program if it’s just not working deep into operation. We have 3,400 programs in the MS-DOS collection alone – it needs people reviewing them to know what’s up. Since the review was written, things are much better and will continue to improve.
  • “There seems to be zero quality control.” – Guilty as charged, in so far as I simply snapped in a wide variety of at-reach software examples to put into the collection. Over time, some have been swapped out and some have been improved, including machine settings, or discovering superior versions of items. The entire collection was snapped into place by, essentially, a single person. Again, it’s gotten better in the months since, as feedback has come in.
  • “How long will this be up?” – A while to come, apparently.

People who read my stuff on this site have heard this before, but the fundamental intention with all the JSMESS/EM-DOSBOX in-browser emulation was to turn the experience of computing into a truly embeddable, referenced object. To say to you, as I can do right here, that there’s this completely weird pixels simulator someone made over a decade ago, and then have you click it and you are right there trying it out instantly, is the new world that I’m talking about bringing in.

Or maybe this interesting use of 167k of data might impress you. Or running a benchmark within the emulator, on itself. Or running a very early PC demo.

The point is, that’s what this is all about… not just about how better or worse it can play a specific game, or who is doing what out there in the realm of providing a free arcade. It’s about making software playable. Really old software. Really playable.

And if you want to define the last 3 years of work on this, it’s been anything but lazy.

The Springtime of Internet Archive V2.0 —

Here’s what Internet Archive did really well for a decade and a half: provide old webpages, and give fast and simple access to millions of items ranging across all kinds of media.

Here’s what it did not do well: change the website.

For years and years, the site looked very much the same. You can verify this, well, on the Internet Archive’s wayback machine itself. Here’s the website in 2004. Here it is in 2014, ten years later. (Go back earlier and you realize it’s different looking only to reflect the fact that people didn’t have monitors that went over 1024×768.)

Now, if you’re going to have to choose between “be pretty and follow website trends” and “save the goddamned data”, I think the choice made is pretty obvious, and that’s what the site did. And did well! You can look at data uploaded long before Flickr, before Youtube, before Facebook.

But believe me, it always galled me when people would link to things at the Archive, amazing and wonderful things, and they felt they had to apologize for the look of the place. The look, and more importantly the back-end structure of the presentation of the look, was locked in the past. Let me rush to say that gazillions of patches and upgrades were applied to the site over the years by the very talented development staff, but there hadn’t been a bottom-up redesign, one intended to reflect the modern realm of technology and information.

In 2014, that changed (really, it started in 2013), and the redesign of what’s called “Version 2” or “Beta” became available to the masses.

Now it’s generally available. In some places, it’s the default.

Now I’m going to tell you to use it all the time.


For a while, I’d swap between Version 1 and Version 2 as my needs required. As of this year I stopped using Version 1 entirely.

The story of creating and designing V2 is not mine to tell. Much more involved and talented people were responsible for this, and I’m sure they’ll tell the tale as time permits.

No, I’m just going to tell you, again, to use it all the time.

Getting to it is easy – either you’re offered the chance to “Try the Beta”, or you just visit and it will swap you into the new interface. (You can “exit” it in the upper right, if it’s all too much for you.)

What hits you, first, is how much more visual it is. Yes, there’s settings to go back to a “list” mode, and there’s places where there’s “just” some files so you don’t get pretty previews or informative screens. But for movies, music, software… you can see, in large tiles, what’s going down in a collection. And individual items, be they newspapers or television, have great preview frames that tell you instantly what’s down there.

Going over to the MS-DOS Games Software Collection, for example, you are literally beset upon by the wild colors and names of the MS-DOS era. Tapping a picture, you find yourself looking at a preview of the program, and a clear button to start the fun. It’s really easy.

Searches are more robust and allow clicking around to what you’re looking for. The site is responsive, allowing you to move to different widths (and platforms) and have the site adjust to the aspects of your browser. The Archive now knows what social media is. And the thing is…. it’s not done.

It’s not done by a long shot. And that’s the real magic, for me, knowing the work that went in to make this the case.

You see, the back-end was completely rewritten from the ground up while not touching the data that it’s presenting. The whole codebase shot up into the present day. And with all that time and innovation has come the ability for new features and modifications to arrive with 24 hour turnaround, instead of a nightmarish ballet of negotiating legacy code. It’s a new dawn.

I have really had a great time working at the Archive for the years I’ve been there (yes, it’s been years now!) and with this newest interface, things are just a solid joy to work with. Sure, it’s meant a lot more work to make sure everything has good previews and that descriptions are everywhere, but that’s the kind of work that I love doing, so things got nice indeed.

I’ve gushed enough. V2 is the future. And the future is rosy indeed.


Embed-able Computers are a Thing. —

This either works for you or it doesn’t.

If it works, a copy of Burgertime for DOS is now in your browser, clickable from my entry. If it doesn’t… well, no Burgertime for you. (Unless you visit the page.)

There’s a “share this” link in the new interface for sharing these in-browser emulations in web pages, weblogs and who knows what else.

The world’s getting weirder. Enjoy the ride.

Scan, No Scan (and a Cube in the mix) —

This week has been spent sorting through the Information Cube, that insane 40x8x8 shipping container in my back yard, and packing up magazines by the thousand to go away.

The reason this is happening is because of an arrangement I made last year with the Strong Museum of Play (which is also the International Center for the History of Electronic Games). In this arrangement, they are going to be the new physical caretakers of the magazines I’ve been collecting (and been donated) for 30 years, and in return, I’m not going to die under a pile of crates of magazines that collapsed on me while I was trying to find “that issue with the ad for 9600 baud modems”. Fair enough.


I have negotiated a number of terms regarding this material, which would be a bit involved to go into right now. But I still have access to it, and I will have first refusal if they decide to de-duplicate and send items to other archives. In an ideal world, I’d inventory all the magazines before they went up to Rochester, but in an ideal world I’d not be doing this in the dead of winter.


The archivists at the Strong have given me a large list from my own rough inventory of items, and Kevin Driscoll and I have been going through the crates and piles, cleaning things up as we go, and moving the relevant magazines into crates destined to go north.


The Strong is a good place. And the important thing is, we’re over the hump with regard to computer materials – after taking a bit of a nap on them, institutions now realize they want these endless journals of computer programming and products. And after talking to a lot of places, I settled on the Strong as the place to take my magazines and hold them. They’re well funded, they take care of their stuff, and believe me, it’s a hundred times better than the Cube.


(National Geographics, however, are a curse, and I’m working hard to find some place or dealer to get them to. The world does not hurt for National Geographics.)



Anyway, here’s the REAL point of this post.

Every time I mention ANY of this, EVERY single time, and from EVERY quarter, comes the exact same theme of comments and the same question framed every which way, but coming down to this:

Where are the Scans of these, and Where Can I Read Them?

The Scans, the Scans, the Scans. On one level, it’s very encouraging that people have an interest in the material, and that they now recognize the value of access to computer materials online. Without a doubt, that’s what the place I work for is dedicated to and it’s the central thesis of the Internet Archive’s existence.

But there’s a missing part in there: the actual, physical, person-intense process of scanning.

It is not an interesting job. It is definitely not enjoyable. And it’s endless, an infinite process of gathering items, getting them into machines, and then doing whatever level of quality assurance comes afterwards to make sure they’re functional.

Scanning, say, a brochure from a 1978 computer user group, which is a couple pages total, is at least a functionally rewarding and quick turnaround. Scanning what is ultimately 50,000 pages of PC Magazine is not. And believe me, it’s 50,000 at least.

There have been heroic, involved projects to scan magazines over the years. One of the leaders is Bombjack, but there are many others. And when these scans come into my radar, I get them into the Internet Archive’s stacks as fast as possible. Easy access, and quick reading, and the online reading capability are important things, and I’m proud we’re able to support these projects with that.

But I think people need to realize how time-intensive (and sometimes money-intensive) scanning is. I have a scanner in my house, but I also worked very hard on documentaries and emulation in the last couple of years, and shutting myself in a dark room with the scanner, adding materials, is something I’ve chosen to avoid. Because I’m not in an urban center (and therefore can afford the space to have a Cube in the first place), people are not really able to get out here. I have had a volunteer who’s done some books, and he’s been great to do so, but I can’t have someone here day in, day out right now.

Meanwhile, people clamor for the scans. The scans, the scans, the scans. Why are you holding them back, goes the hue and cry.

One particular edge-case fellow has decried for years, years on end, that I am “holding back” one particular magazine (one of which there are significant copies out there, trust me) that he believes I am intentionally not scanning, for reasons he can only construct as evil and self-centered, keeping it from the community. I realize he’s an edge case, but he doesn’t exactly inspire me to tramp out to the backyard.

So, rest assured, this is a situation that’s on my mind. It’s a victory – people realize this is a real and valuable thing. But it’s also the first step of a mountain – scanning magazines and materials is intense, intense work.

My proposed solution is simple.

First, get as many items safe and sound up into the Strong Museum and the ICHEG Archives, so that they’re taken care of, reference-able (with actual catalog numbers, so you can ask for June of 1984 and get it). This first run-through with the Strong has been huge, and I have a theory they actually want more, but we just haven’t synchronized inventories yet.

Next, work with the Internet Archive as a fundraising situation to get both a book scanner and paid volunteers to scan material at the Strong. Lots of scans. LOTS of them. These folks up there are friends, and scanning, like I said, is terrible work, and terrible work should be compensated wherever possible.

The Archive will gladly host the resulting scans. So as items get scanned, they’ll show up in the appropriate collections. The work will be done.

I believe there’s two types of scans that are possible in the contemporary world – “good enough” scans that give you all the information and insight you could want, and “artisanal” scans that are for specific one-off images or pages that have value and merit on their own. Obviously some materials deserve artisanal effort all the way down, but many, many don’t (Conference proceedings come to mind – black and white pages, never-ending, with no illustrations.)

It’s a huge project. I think the first big step was no longer locking items away in the Cube, like a bomb shelter, from the ravages of cleaned-out basements and dying enthusiasts. I think that step is far from over and I expect to continue to be sent items.

The next step, after ensuring their proper place among the stacks of art and works in the world, is to bring them to you digitally.

When I make the call for funding, I hope people answer.



Taxes —

So, this:



…is the culmination of a back-burnered fear tornado I’ve been dealing with in the last couple of years – mostly making that final adult transition into tracking my finances in a meaningful way.

To that end, it’s a $25,000 IRS tax bill for my 2012 finances, which were rather off-the-rails odd as far as the IRS goes. (I’ve already paid for every other year, except this last one, which I will pay before April 15th).

People generally keep their finances private, but I’m not generally people, so I thought I’d quickly touch on what this all is.


When I got offered the job of shooting the DEFCON Documentary, I came up with a number for being paid (outside of the budget of the movie). That amount was how much credit card debt I owed, plus extra for other debts. And man, did I owe a lot of credit card companies a lot of money – $19,000, to be exact. And that number was not really going down very quickly.

By that point, of course, a payment plan had been organized with the credit card firms, but the terms were pretty onerous and they were still charging some amount of interest. One wrong move, and they had the power to do some ridiculous stuff. I wanted out of the credit card world, and so this DEFCON money was money that would really help me.

I got the payment, and immediately wrote out three massive checks to the companies (two of them were Bank of America at that point – the third was Discover). And I knew I was walking away from the dragon when I intentionally overpaid Discover by $500, and the interest accrued since the last bill very nearly overtook that $500. As a bonus, I told them to close the accounts, all of them; both banks gave me the hard sell to stay, and then Discover didn’t actually close the account – 3 months later I got a bill from them. Oh, the noise I made. Again, this was at the end of writing three checks, three fat checks, for the full amount owed, and then some.

So, credit cards. Out of my life. Gone. Owe them nothing. Yay.


But in filing all my taxes in my own, stupid way, I had made a huge, huge mistake: I didn’t properly list the DEFCON money. Frankly, it was out of my mind – they’d given me some 1099 stuff (they’re a legit organization), and I hadn’t listed it.

This, it turns out, really angers the IRS.

So when they sent me a tax bill set that was something in the $40,000 range, that effectively got my attention. In the intervening time, I hired an accountant, went through ALL my finances on every side, and found write-offs as well as repairs, and generally re-arranged how I do this whole income thing. I should also note that I actually had a refund in 2013, and I suspect 2014 will be the “normal” amount a fellow pays for his salary and other income in the world. I’m not letting this happen again.


So, there’s pros and cons to sharing this – I’m sure I’ll get some level of jibes, or references. But be clear – I did this to myself. Nobody snowed me, I wasn’t told one thing and given another, and I certainly didn’t find myself at the mercy of some clever manipulation of The Rules to get me here. This was a legit mess-up and 2015 will have a non-trivial percentage of my time being spent paying this off.

I make a decent salary, and I knew this was coming, hence my request that any speaking engagements cover lodging/travel – I just can’t afford it for now. And I definitely won’t be covering too many dinners. Also, ketchup in water makes soup.

Life won’t be exactly a hardship, but it does cut me down in terms of leeway. No problem – I’ve weaned myself off of most expensive habits and I really enjoy working on stuff in my house and visiting people who are near me. The saturation may be a little down, but the year won’t be grey.

But that’s not really the reason I’m sharing this.

The reason I’m even going into this is because there’s a situation that affects some of the communities I run in, be they hacker or tinkerer or otherwise “smarty-smart” groups. And that’s the two big fallacies about finances: that if you’re smart in how you do programming or tech or math, then surely you should be a taxes/finances genius…. and that if you don’t execute brilliantly along your monetary aspects, it’s a shameful situation you must hide and never bring up.

I call bullshit on both.

I call this out and I step forward because in mentioning, conversationally with friends and associates, the stress of this situation, people would reveal that they too had experienced run-ins with the IRS and general finances – either due to taking on too much debt, or misreporting income, or not understanding how penalties and liabilities affected their income… you name it. Some had had wages garnished. Others had transferred funds to various places. Someone who would rather I not tell you anything else about them was signing checks over to his wife so that the money wouldn’t be immediately garnished over mess-ups from 1998.

Hiring an accountant, a good one that walked through things with me and gave me the homework assignments (with guidance) that allowed me to re-factor all my finances into something sensible and boil it down to a single bill… that was the best decision I made in all of this. He helped me go through years of materials, and we took a $37,000 bill down to $20,000, which cost about $1,500. I may be financially dumb in some ways, but paying $1,500 to get back $17,000 is a heck of a deal. I wish I had more deals like that.

You are not stupid. And you are not alone.


Stepping out from a miasma of concern and fear regarding my finances into one of understanding and clear, present goals has been a breath of fresh air. It was, truly, one of my remaining huge emotional fears and concerns. And while having a $25,000 bill might not count as being out of the woods, it’s a single number. I can work with a single number. I understand a single number. So that, among all my other goals this year, is to get that number gone. Here’s hoping.

Again! You are not stupid! And you are not alone!

(Some percentage of people might kick into some “help Jason’s tax bill” mode, and I’ll just say – buy documentaries if you’d like, get me speaking engagements with honorariums, or just relax – I’m sound of mind and body, I will earn my way out of it.)



The BITSAVERS Renewal —

The Bitsavers-Internet Archive bridge continues to be a wild success.


(I mentioned this whole thing with an announcement from a couple years ago.)

If you’ve not been aware, there’s this amazing, amazing project undertaken by a small handful of individuals to go through stacks of computer documentation and just flat-up digitize it all. No muss, no fuss, no flamboyance, no showtunes. In go endless booklets, manuals, blueprints and spec sheets, and out come PDFs. By the truckload, it can feel like.

The only problem, and it’s a minor quibble considering the gargantuan amount of effort put into it, is that the bitsavers site is decidedly spartan. It’s just directory after directory, without any real way to browse the things. You kind of have to know what you want, and you have to download PDFs to see if they’re what you’re seeking. Again, it’s a minor complaint considering that the hard, heavy lifting of grabbing this material and digitizing it has been done, dependably, for years on end.

Some time ago, I leveraged being a bitsavers mirror to write some code that would absorb newly digitized items into the Internet Archive’s collections. By doing this, the PDFs get previews, some amount of word analysis, and an online reader. Win-win.
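The bridge script itself isn’t published here, but the core of any mirror-to-archive absorb job is deriving a stable item identifier from each file’s path, so re-runs are idempotent: the same PDF always maps to the same item. A minimal sketch of that step – the `bitsavers_` prefix and the sanitization rules are my own illustration, not the actual scheme the script uses:

```python
import re

def path_to_identifier(path: str, prefix: str = "bitsavers_") -> str:
    """Map a mirror file path to a stable, archive-friendly identifier.

    Illustrative scheme only -- the real bridge script's naming
    rules may differ.
    """
    stem = path.rsplit(".", 1)[0]                   # drop the file extension
    flat = stem.replace("/", "")                    # flatten directory levels
    cleaned = re.sub(r"[^A-Za-z0-9_.-]", "", flat)  # keep identifier-safe chars
    return prefix + cleaned
```

Because the mapping is deterministic, re-running the absorb job over the whole mirror is cheap: already-uploaded files map to the same identifier and can simply be skipped.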

That was some time ago. The collection was well past 28,000 individual items. Items that, if you find them, have gems aplenty:

And, literally, many, many more.

This has been running pretty smoothly for years, but even the best of automated scripts break over time, and I sat down and did some much-needed maintenance.

I finally wrote the script that needed to be written some time ago – it goes through what actually got uploaded and what missed out for a billion reasons over the past few years. It turns out, between broken connections, system downtimes, and the many pieces that could go wrong, over 4,000 files had been skipped over. Those are populating as we speak.
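At its core, a reconciliation pass like that is a set difference: derive the item identifier each mirror file should have become, and flag the files whose identifier never showed up on the other side. Here is a minimal sketch of the idea — the identifier scheme and file names are made up for illustration; the actual script’s naming rules aren’t described in the post:

```python
def make_id(path):
    # Hypothetical identifier scheme: flatten a mirror path into the
    # kind of item name an uploader might have derived from it.
    return "bitsavers_" + path.replace("/", "_").removesuffix(".pdf")

def find_skipped(mirror_files, uploaded_ids):
    """Return the mirror files whose derived item never made it up."""
    uploaded = set(uploaded_ids)
    return [f for f in mirror_files if make_id(f) not in uploaded]

# Illustrative data: two files on the mirror, one item already uploaded.
mirror = ["dec/pdp11/handbook1973.pdf", "apple/lisa/owners_guide.pdf"]
already_up = ["bitsavers_dec_pdp11_handbook1973"]
print(find_skipped(mirror, already_up))
# → ['apple/lisa/owners_guide.pdf']
```

The point of doing it this way is that it doesn’t matter *why* an upload was missed — broken connection, downtime, anything — the pass only compares the two inventories and re-queues whatever fell through.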

In total, it looks like in about a week’s time, the number of items in the bitsavers collection on the Internet Archive will go from 28,000 to at least 32,000. That’s books, magazines, brochures, articles… and due to the highly focused work of the bitsavers folks (primarily Al Kossow, who does most of the scanning), the material is very agnostic – it’s not someone who loves Apple scanning nothing but Apple, or a person who likes videogames, or even someone who thinks it should only be brochures or manuals that get scanned. It’s everything, literally everything that sat near or around computers.

Closing the air gap from stacks of paper in a warehouse to digital files is a long, boring, intense road. I’m so glad someone is doing this, and I hope the collection as it stands on the Internet Archive and the other mirrors is of a lot of use to a lot of people for a very long time.

The Emularity —

Last week, on the heels of the DOS emulation announcement, one of the JSMESS developers, James Baicoianu, got Windows 3.11 running in a window with Javascript.


That’s impressive enough in its own right – it’s running inside the EM-DOSBOX system, since Windows 3.x was essentially a very complicated program running inside DOS. (When Windows 95 came out, a big deal was made by Gates and Co. that it was the “end” of the DOS prompt, although they were seriously off by a number of years.)

It runs at a good clip, and it has the stuff you’d expect to be in there.

Bai, tinkerer that he is, was not quite content with that. He wanted this operating system, sitting inside of a browser and running in Javascript, to connect with the outside world.

That took him about 3 days.

That’s Netscape 1.0N, released in December of 1994, running inside Windows 3.11, released in August of 1993, running inside of Google Chrome 39.0.2171.99 m, released about a week ago, on a Windows 7 PC, released in 2009.

And it’s connected to TEXTFILES.COM.

Windows 3.11 definitely works, and all the icons in there click through to the actual programs actually working. You can open solitaire and minesweeper, you can fire up MS-DOS, you can play with the calculator or play audio, and you can definitely boot up Netscape and NCSA Mosaic, or mIRC 2.5a, or ping/traceroute to your heart’s content.

The world these Mosaic and Netscape browsers wake up in is very, very different. Websites, on the whole, and due to the way this is being done, don’t work.

It turns out a number of fundamental aspects of The Web have changed since this time. There are modifications to the stream that can be done to get around some of this, and we’ll have screenshots when that happens. But for now, the connections are generally pretty sad looking.

To connect to the outside world, the Windows 3.11 instance is running Trumpet Winsock, one of the original TCP/IP stacks for Windows, which uses a long-forgotten (but probably still in use here and there) protocol called SLIP to “dial a modem” (actually, connect to a server) and transfer data to a PPP node (really just a standard web connection).

This means that somewhere, this instance needs to be connected to a proxy server, which assigns a 10.x.x.x address to the “Windows” machine and then forwards the connections through. Basically, the world’s weirdest, most hipster ISP on the face of the earth.
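The “forwards the connections through” half of that proxy can be sketched, very loosely, as a plain TCP relay: accept a connection from the emulated machine, open a matching connection to the real destination, and copy bytes in both directions. This is only an illustration of the relay idea — the ports and hosts are hypothetical, and the real setup additionally has to speak the serial-line framing and hand out the 10.x.x.x address, none of which is shown here:

```python
import socket
import threading

def pipe(src, dst):
    """Copy bytes one direction until the source side closes."""
    while True:
        data = src.recv(4096)
        if not data:
            break
        dst.sendall(data)
    dst.close()

def forward(listen_port, upstream_host, upstream_port):
    """Accept one client and relay its traffic to an upstream host."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind(("127.0.0.1", listen_port))
    server.listen(1)
    client, _ = server.accept()
    upstream = socket.create_connection((upstream_host, upstream_port))
    # One thread per direction, like the proxy's bidirectional relay.
    threading.Thread(target=pipe, args=(client, upstream), daemon=True).start()
    pipe(upstream, client)
```

Everything the emulated browser sees as “the Internet” funnels through a relay shaped roughly like this, which is also why the proxy is the natural place to rewrite the stream for those ancient browsers later.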

In other words, this is janky and imperfect and totally a hack.

But it works.

It took about three weeks from when I decided we needed to go with EM-DOSBOX in addition to JSMESS to handle DOS programs until we had it up on the site and going out to millions. It has taken two weeks after that for this situation to arrive.

Contrast that with how it took poor Justin De Vesine, working hard with Justin Kerk and a host of other contributors, eight months to get JSMESS’s first machine (a ColecoVision) to run at 14% of normal speed inside a browser, for one cartridge.

Welcome to the Emularity, where the tools, processes and techniques developed over the past few years mean we’re going to be iteratively improving the whole process quicker and quicker, and we’ll be absorbing more and more aspects of historical computer information.

Now the stage is set – the number of programs that can be run inside the browser is going to increase heavily over time. The actions that can be done against these programs, like where they can be pulled from or pushed out to, will also increase.

What becomes the priority (as it has been for some time) is tracking down as much of the old software as possible, especially the stuff that doesn’t sell itself like games or graphics do. I’m talking about educational, business, and utility software that risks dropping down between the cracks. I’m talking about obscure operating systems and OS variants that fell out of maintenance and favor. And I’m most certainly talking about in-process versions of later released works, which could stand to be seen in their glory, halfway done, and full of possibilities.

Documentation for the software just skyrocketed in value – we had Bai reading 1995 books on PPP troubleshooting to get things going. MS-DOS programs on the Internet Archive will need links to manuals to become more useful (this is coming). And just grabbing context will continue to be a full-time job, hopefully split among a group of people who are as passionate as the folks I’ve been lucky enough to come into contact with so far.

I can entertain debates about the worthiness of this whole endeavor as an abstract anytime anyone wants. But the flywheel’s in motion. It’s not going to slow down.

We’re there.


I should have known this was click-juice. Welcome everyone. To speak specifically to folks who “just want to try it”, I ask for patience in terms of this being available to try – it’s still so new and fragile, and frankly, it doesn’t help to have thousands of people hit on the thing, go crazy when it acts weird, and complain bitterly.

If you’re new to the Javascript Emulator party we’ve been throwing for the last year, may I humbly suggest visiting the Internet Archive’s Console Living Room, Software Library and Arcade? With over 25,000 items to try, there’s plenty to keep your attention before the next generation of stuff becomes playable.


The Haze of Possibility —

The game Grand Theft Auto V will send you on a large variety of missions and quests, many of which will require travel between the southern, more urban locations and the northern, nature-filled expanses. A number of roads and paths connect these areas, including mountainous trails, single-lane routes and a number of highways. One of these is the Great Ocean Highway, a multi-lane affair that swings along the west side of the island and into the city.


The day in GTA V is shorter than “real time”, so there’s a good chance you will catch yourself on this highway in the sunset or sunrise, the twilight hours, the golden time. This popular highway is filled with cars making their way north and south, driving at various speeds into the city or away from it. Among them will be you, often caked with blood or carrying passengers to the next quest, or both.

In the mile or so before you get into the city, the view ahead of you stretches far into the distance, and besides the lights of buildings and skyscrapers intermingling with the stars, comes the boardwalk, bright and beautiful, a hazy line stretching into the ocean. At the end of this horizon line of glowing activity is the blur of the ferris wheel and roller coaster at the pier’s end, turning ever so slightly, twinkling into the darkness.


That view, that hazy view into the boardwalk, has a lot of special, internal meaning to me. It’s a view that promises, just over the horizon, the fun and excitement I wished I was part and parcel of. I didn’t always have the free agency to find my own fun – sometimes I’d be somewhere because the family or the group I was with was on its way to another function, one not as fun, and seeing the glow and promise of things in the distance might be all I got, and I’d craft a dream and hope from it.

Grand Theft Auto V is a marvelous thing, technically. The world is full and real. When I run into the edges of it, the places the engine and the construction fall short, I am so far deep into a situation that my complaint seems ridiculous: Emergency services fail to arrive when I explode cars at busy intersection. Strip club appears to have no roof access to service air conditioning, requiring scaling from other nearby buildings. Specific style of taco truck seems unnecessarily prominent around city.

But far and away, the most brilliant, beautiful thing is the lighting, which runs the gamut from sun showers to cold nights, and from summertime-quality days to buzzing, foggy evenings. Remembering the process of generating a functional, single-color sprite on my Atari 800 by calculating the binary behind each set of 8-16 pixels, it has been a privilege to witness how far the technology has come in my lifetime.

That a moment in the game, a simple one of driving into the city on the Great Ocean Highway, music blasting on the radio, could call me back to a feeling and emotion that has been a part of my life since the beginning… well, that’s maybe the highest compliment I can pay.