ASCII by Jason Scott

Jason Scott's Weblog

The MS-DOS Flood (And the High Flight of V2)

A couple days later from my entry above.

This has already significantly dwarfed the Internet Arcade. Thousands of people are hitting the site a minute, playing MS-DOS programs, writing reviews or keymappings or just freaking out about seeing this whole collection available and playable.

The choice to send everyone to the Version 2 beta interface has also been nice – lots of useful, interesting feedback, good comments, and people really working out the rough spots. The system got a fix-up during this and runs notably faster. The whole Internet Archive infrastructure is getting the workout it deserves.

A long time ago, I wrote about Opinion Spectrum Collapse Disorder, where if enough online attention is aimed at a thing, whatever that thing may be, the resulting flood of commentary will quickly expand out, like a gas, into every conceivable tangent, subject or opinion, until the whole thing is kind of meaningless for “this is what is being talked about”. Celebrity gossip stories seem to attract this particularly: talk about a celebrity buying coffee at a supermarket when they are anti-coffee, and the comments will quickly blow out into the celebrity, the notion of celebrity, something the celebrity said or did 10 years ago, coffee, supermarkets, Ayn Rand, and cat ownership. It’s just the way things are, now. I don’t know if there’s a vaccine for that. OH NO I SAID VACCINE

So yes, people are coming to the Archive in droves to get their MS-DOS on. I think the number is either over a million already or will be soon, and it may be more. The Archive (very admirably) does not log users in any meaningful way, so it’s not like I’m going to be able to run Webalizer on the stats directory. I suspect that, like the arcade, we’ll be into multiple millions before the week is over. Enjoy.

And I think the feedback from the Beta/Version 2 interface has been a spectacular boon – it’s a direction the Archive is going in, and the massive stacks of information it maintains will be that much easier to go through. The journey to the upgrade isn’t over – but it’s bringing a lot of people into contact with the Archive’s information that might have missed out on it before. The bounty was always there, and folks inside and out are happy to enjoy it.

The rest of this entry is me just answering a bunch of questions for people and reporters, because both camps just kind of make stuff up when they’re not confronted with actual facts. The answering will be in the form of an essay, not a FAQ.

Actually, speaking of a FAQ, here is the Internet Archive FAQ for the MS-DOS emulator. That answers a lot of the technical questions, or what seems to slow people up. So if you’re just here because SPACE BLORTS isn’t working for you, I think that’ll help. Otherwise, let’s just throw out some thoughts I have and some information people seem to want.

Internet Archive Employees, Weekly Engineering Meeting


It is adorable that press and some bloggers have turned the narrative into “Jason Scott did this”. I am a top-hatted loudmouth for a very, very large number of people who have poured decades of human years into what you see when you start the emulator and it goes “boop”. This includes the dozens of employees of the Internet Archive who keep things running, the developers at the Archive who have made the emulator work in the infrastructure, the developers and volunteers working on the JSMESS and EM-DOSBOX emulators and their porting to Javascript, the Emscripten developers (many) who have made the system work incredibly in the last couple of years, and the emulator authors (now we’re up into the hundreds) who have slaved, tirelessly and with scant recognition, for coming up on two decades to save the software by saving the running of it in simulated/emulated hardware.

I have a strong presence and personality, and a very strong will, so keeping my eyes on the focused goal over the past three years has been something of a contribution, but I’ve written a minuscule amount of code, mostly in the form of scripts. I’ve watched people max their brains out trying to make a system do something it was never designed to do, and watched others figure out how to wend deep into code to make completely indifferent and goliath codebases interact.

I am fearful to write out a list of names right now and miss people, but maybe sometime soon I’ll do an entry where I just talk about all the people and the worlds that have collided to do all this. It’ll be quite long.



Some quick technical things. This weblog is absolutely packed with entries describing the JSMESS system that the vast majority of systems on the Internet Archive use to emulate consoles, computers and arcade games. Just use the search box and look for “JSMESS”; you’ll be rewarded with the years of entries on the subject, including pleas, triumphs and tribulations. I don’t feel the need to repeat that here – if you want it, it’s there.

What I do want to say is a few words about the differences between JSMESS and EM-DOSBOX, which is what the MS-DOS systems work on.

I was not happy about adding an exception-based emulation system into the mix, but JSMESS isn’t quite ready for MS-DOS yet, and I think it’s a ways away from it. When it works, it’ll work really nice, and maybe we’ll even switch over to it, but for now, it started to become a real problem in the timeline of the Archive’s software collection that we were emulating the ZX Spectrum and the Atari 800 and Apple II, but not the massive juggernaut that has been decades of MS-DOS compatible programs. So I made that decision and a bunch of us put our faces to the grindstone and that’s what got announced – EM-DOSBOX.

A random anonymous Slashdot poster claims they did critical work before the implementation of EM-DOSBOX, but the EM-DOSBOX repository appears to be a straight fork from DOSBOX, so the jury’s out. I didn’t have any direct involvement in DOSBOX or EM-DOSBOX, but I’m fairly sure the lineage is direct (although there have been other javascript ports of DOSBOX in the past). If the claim is true, step forward and take your glory, soldier.

EM-DOSBOX is a very fast, very slick porting of DOSBOX, using the Emscripten compiler to bring the code into the browser. It implements some tricks which even JSMESS can’t do quite yet, and which might lead to speedups in the future along that line. I can run some EM-DOSBOX emulations at full-speed on my phone, which is sufficient for me to throw up my hands and say the future’s here.

The biggest difference between EM-DOSBOX and JSMESS is that JSMESS wants disk images, gathered magnetic copies of the original booting medium. (An example on the Apple II would be SPACESHIPS.DSK, where this is a “Spaceships” Apple II floppy portrayed as sectors that JSMESS reads like a disk). EM-DOSBOX is quite happy with a .ZIP file of all the files needed to run. It’s rather convenient because the vast majority of MS-DOS materials are sitting in .ZIP files (or .ISO files if they’re CD-ROMs, but we’re not doing those at the moment). The only downside to this is that you have to tell EM-DOSBOX what the “starter file” is in a collection. That’s something that can’t easily be automated, so I have to go into each .zip and do a best guess of what probably makes it go. AUTOEXEC.BAT’s a good guess, as well as PROGRAM.EXE where PROGRAM is the name of the program being run. But I get caught out all the time, and with thousands of programs, how the hell did I deal with that?
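For the curious, that best-guess step can be sketched in a few lines of Python. This is not the script actually used at the Archive; the function names, the ranking order and the SPACE BLORTS-style file names are all assumptions made up for illustration. It simply prefers AUTOEXEC.BAT, then a file named after the program, then anything else runnable.

```python
import io
import zipfile
from pathlib import PurePosixPath

# Extensions MS-DOS can launch directly.
RUNNABLE = (".bat", ".exe", ".com")


def guess_starter(zip_file, program_name):
    """Return the best-guess starter file inside a ZIP, or None.

    Hypothetical heuristic: AUTOEXEC.BAT first, then PROGRAM.EXE/.COM/.BAT
    (PROGRAM being the program's name), then any other runnable file.
    """
    with zipfile.ZipFile(zip_file) as zf:
        names = [n for n in zf.namelist() if not n.endswith("/")]
    runnable = [n for n in names
                if PurePosixPath(n).suffix.lower() in RUNNABLE]
    if not runnable:
        return None

    def rank(name):
        path = PurePosixPath(name)
        if path.name.lower() == "autoexec.bat":
            score = 0
        elif path.stem.lower() == program_name.lower():
            score = 1
        else:
            score = 2
        # Ties go to files nearer the top of the archive, then alphabetically.
        return (score, len(path.parts), name.lower())

    return min(runnable, key=rank)


def make_zip(names):
    """Build an in-memory ZIP with empty files, for trying the heuristic out."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in names:
            zf.writestr(name, b"")
    buf.seek(0)
    return buf


if __name__ == "__main__":
    demo = make_zip(["README.TXT", "SETUP.EXE", "BLORTS.EXE"])
    print(guess_starter(demo, "blorts"))
```

A guess like this is only a starting point. It will still pick SETUP.EXE or an installer often enough that something has to actually run the program and look at the result.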



The answer was the Screen Shotgun, the automatic player-and-screenshotter script pile I’ve spent a while putting together over the last year. I’ve written lots of entries about THAT process as well, stalwart inquisitors – just search for “shotgun” in the weblog search box.

These days, I have a version of the script that can start up on an Amazon EC2 instance and I can run any amount of systems at 14 cents an hour to generate screenshots as fast as possible. These emulator-playing robots serve multiple functions, but the big ones are that they can take a bunch of screenshots of the programs and remove the doubles, and they can serve as a basic Quality Assurance team, because if the game doesn’t work, the screenshots are going to show it.
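As a rough illustration of the "remove the doubles" step, here is a minimal Python sketch. It assumes doubles are byte-identical image files, which may not be how the Screen Shotgun really compares them, and both function names are hypothetical. The second function shows one crude way the same data could double as a QA signal: if every capture is identical, the program probably never got going.

```python
import hashlib


def remove_doubles(shots):
    """Keep the first screenshot for each distinct byte content.

    `shots` is an iterable of (name, image_bytes) pairs; the return value
    is the list of names worth keeping, in their original order.
    """
    seen = set()
    unique = []
    for name, data in shots:
        digest = hashlib.sha256(data).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(name)
    return unique


def looks_stuck(shots):
    """Crude QA check (an assumption): all captures were the same image."""
    shots = list(shots)
    return len(shots) > 1 and len(remove_doubles(shots)) == 1


if __name__ == "__main__":
    captured = [("001.png", b"title"), ("002.png", b"title"), ("003.png", b"level 1")]
    print(remove_doubles(captured))
    print(looks_stuck(captured))
```

Exact hashing misses near-duplicates, such as a blinking cursor, so a real pipeline would likely want a fuzzier comparison.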

Hey, look, 2000 MS-DOS Screenshots taken by robots. That’ll hold you for a while.

The most annoying difference this time was that the MS-DOS programs switch up their resolution, with VGA vs CGA and so on, and there’s not a locked specific “size” of the canvas. A few kind folks on twitter took me under their wing, with various solutions, and I chose one that was very, very flexible, removing the need to hand-type-in resolutions going forward for future machines. The power of the crowd, they say.

A number of people popping in during both this and the other major software release events have said something along the lines of “what’s the point”, or “this is a terrible thing”, and, as anyone working on a project knows, it’s kind of a waste to engage people who are stopping by with that position. Obviously, if I spent years on it, I thought it was something worth doing. I’ve tried to articulate why on multiple occasions. It either sticks or it doesn’t.

But the fundamentals are pretty straightforward:

  • Software is history. It is culture, it is process, it is a signpost.
  • Software also has a ridiculous half-life. A year or two, with few exceptions.
  • Software, unlike many other media, has a massive barrier to using it.
  • This barrier does nothing but grow over time. Over enough time, insurmountable.
  • Coming up with solutions like in-browser emulators promises to lower barriers.
  • If people can’t see or interact with software, it disappears. Utterly.
  • This project is an attempt to lower that barrier, ideally to nothing.

After these statements comes a lot of endless debate and discussion, and I’d rather be sitting in some room with a microphone arguing with someone than dealing with e-mails and weblog entries, but I think, much like my work with Archive Team, that once the core subject being debated is deleted or gone (or effectively gone), the conversation essentially ends.

What this army of volunteers and collaborators have done with javascript emulators is we’ve stopped the conversation from ending. We’ve essentially jammed the door open. It’s here now – we’re in a world where computer and software history runs in browsers, and that’s that. It’s refinement and iteration here on out. If my time or priority takes me away, the work will continue. That’s done, that’s a fact. Nothing will take that away, now.

Going forward, I expect debate in places where this effort collides with commercial interests, but I can tell you one thing I’ve learned, maybe something intuitive to people who work in the space of preservation or curation, but still worth mentioning:

In the realm of historical objects that have started their cultural/contextual journey as products, there are a specific subset that have gotten the attention, effort, quality preservation and presentation befitting a worthy project to do so. But that’s it. The vast, vast, vast majority of commercial products, including hardware and software, sink without a trace, or end up in attics or collections with only glancing references made. Maybe a completist providing a text listing of “what there was”, or a wrap-in of a few more-obscure B-list examples, but then it’s this giant, yawning abyss.

I figured, in the example of Arcade Games, that The Nerds would have long ago gutted, garrotted and filleted the history of all arcade machines enough to tell us who worked on it, what it consists of in terms of gameplay, or the usual love one sees showered on a Pac-Man or a Mario. No. No, not at all – there were (and are) games for which we have the ability to emulate a ROM dump someone did as long as 20 years ago, and a photo of a machine sitting neglected in a warehouse. Nothing else. And this is arcade games, where one could argue there was a perfect storm of subject interest, intensity of fanbase, and overlap with the most curatorial personalities in terms of maintaining information. Nope. Sometimes the games exist as a single found circuit board and everything else was imparted from that for the purpose of an emulator.

Now extend that to software, and it’s much, much worse. Now extend it to non-gaming software, especially custom software, and it’s even worse than that.

While people wring their hands over the preservation of Doom, a hundred thousand programs are at risk of being locked away only to people willing to burn hours to spellcast them into function in private geek caves.

This is a real and functional problem, and it has real consequences. This set of projects is, at some level, a strike back against that. It’s worth the doubt and the derision.


Finally, the little prayer I always say.

I’m very lucky – I’m working for a wonderful non-profit library, a dream of a lifetime. It allows me to work on projects like this, to maintain the effort to keep the project focused, and to have a platform to make the project available. That’s due to the forward thinking of Brewster Kahle and my various managers at the Archive, that they let this maniac onboard to work on things. I never forget to be grateful.




Categorised as: computer history | Internet Archive | jason his own self
