ASCII by Jason Scott

Scanning a WHAT — August 13, 2011

I have done my first weblog posting on the Internet Archive Weblog: Scanning a Braille Playboy.

I could probably count on half of one hand the number of group weblogs I’ve posted on, and I can count on one finger the number of times any institution or company I’ve ever worked for has let me post directly to their weblog with something I’ve written. I hope I’ve done the place right. Go ahead and find out the story of the Braille Playboy, which is in fact about how the Internet Archive scans and adds a new book to their collection every ninety seconds. Isn’t that amazing?

Great wagon my star is hitched to.


DEFCON 19 — August 1, 2011

As per my usual methodology, I’ll be attending the DEFCON hacking and security conference in Las Vegas, Nevada, starting this upcoming Wednesday and going through to Sunday. With a couple exceptions, I’ve been there every year since 1999, so it’s kind of a thing I do. I also happen to love Las Vegas, so going down there for a good part of a week works well for me. DEFCON has one of the largest sets of people who like the work I do, all in one place, and it’s nice to connect with folks who normally I only communicate with via e-mail.

Also as per my usual methodology, I am speaking at same DEFCON, and my talk is, not surprisingly, about all the work and events with Archive Team.

From the description of the talk in the DEFCON program:

“For the last few years, historian and archivist Jason Scott has been involved with a loose, rogue band of data preservation activists called The Archive Team. As major sites with brand recognition and the work of millions announce short-notice shutdowns of their entire services, including Geocities, Friendster, and Yahoo Video, Archive Team arrives on the scene to duplicate as much as they possibly can for history before all the data is wiped forever. To do this, they have been rude, crude and far outside the spectrum of polite requests to save digital history, and have used a variety of techniques to retrieve and extract data that might have otherwise been unreachable. Come for the rough-and-tumble extraction techniques and teamwork methods, stay for the humor and ranting.”

So yeah, that.

I know it’s weird that I have to say this, but it’s been proven that I miss out if I don’t: I go to this conference (and other conferences) to meet people, to talk about projects, and to engage with folks. I go much less for the speaking sessions, and I certainly do not go to be off in a corner, unapproachable, while you wonder if you should “bother” me with saying hello. Come up, say hello. It’s what I’m there for.

The conference is at a new hotel this year, the Rio, which is off the strip and has this strange little Brazilian theme it conclusively messes up, but most importantly does not have the vibe of a dying old man coughing up the last of his life while staring at you suspiciously, like the Riviera had. And most notably of all, DEFCON has secured the use of the Penn and Teller theater for the conference, which will be a place a smaller set of folks will be doing presentations from.

I will be one of them. Nicest room I’ve played.

See everyone there.

Update: It went amazing. Video should become available in coming months.


The Twitters — July 16, 2011

It occurs to me that some people who read my weblog might not know my twitter feed. I’ve been there for years, but it doesn’t come up here all that much. Perhaps not surprisingly, the feed’s name is @textfiles. I’m rangy, weird, abrasive, occasionally helpful and informative. You know, like usual. Check it out.


Floppy Disks: It’s Too Late — July 12, 2011

Someone has to break it to you, and that person is me.

It’s over. You waited too long. You procrastinated or made excuses or otherwise didn’t think about it or care. You didn’t do anything and it’s too late now.

I’m talking about Floppy Disks. And I mean the five-and-a-quarter (5 1/4″) floppy disks that actually are somewhat floppy and which are long and flat and which were the mainstay of home computing for well up and over a decade, back then. A decade, I hope I’ve made clear, that means quite a bit to me. And the history, the thoughts and dreams and knowledge and information that people put onto those floppy disks with a grinding noise and a large LED lighting up and flickering? It’s gone. Three-and-a-half-inch floppy disks, which are not really floppy at all and which got a real heyday in the waning years of the 1980s and the 1990s? Not as gone as the fives but definitely in bed in the ward that things go into but don’t really come out of, but which you can still visit, if you remember to.

If you still have boxes of floppies sitting in your attic or basement or grandparents’ place or wherever else, I’m telling you the days of it being a semi-dependable storehouse are over. It’s been too long, too much, and you’ve asked too much of what the floppies were ever designed to do. If you or someone helping you gets data off of it, then it’s luck and chance, not engineering and proper expectation. A lot of promises were made back then, very big promises about the dependability, and by most standards, those promises came out pretty darn good – it has often been the case of extracting data from floppies long after the company that wrote the software, that made the computer, that manufactured the disk drive parts, and manufactured the disk have gone into the Great Not Here. You could be a totally different person, with people who you helped create running around your feet and many years younger than these floppies, and you could pull data off them to show the little people what their parent was up to so long ago. Maybe even get them excited about their turn at the screen and keyboard when the time came. It was like getting two sodas for one buck out of the soda machine. Cool!

No longer. Edge cases exist, and will always exist, but the ship is sinking; it’s not seaworthy. With some perseverance and faced against all the odds stacked against you, something might get out of these poor black squares, but I would not count on it.

Why am I telling you this?

I am telling you this because I am grabbing you by the fucking collar and shaking very hard because it is obvious you need to be shaken very hard and told that this is it. This is the endgame for floppies. We went over the hump, and the chances of rescue are slim to none now, but there are still chances. It’s a chance that needs to be taken now.

If you have an archive or cache or hoard of floppies, you need to get in touch with me. I will help get the data off of them for you, whatever piecemeal amount is still thriving on there. We’ll get errors up the wazoo, and some of them will be simply unreasonable, but it has to be done, I have to try.

Archiving history is now my full time job. Let me tell you how much I love that. I love it THIS MUCH.

So I’m throwing myself into the fire. I have people who have said they’ll step forward and help this happen. We can transfer the data off the floppies, get a hold of history before it goes into the zero device. Get in touch with me.

And please, one other group.

There are libraries, archives and collections out there with floppies. They probably never got funding or time to take the data off – there’s a great chance the floppies are considered plain old acquisition items and objects, like books or a brooch or a duvet cover. They’re not. They’re temporary storage spaces for precious data that has faded beyond retrieval. If nobody got around to pulling that information off, then a fundamental goal of many of these places dissolved under their noses and they’ve failed. I’m willing to forgive and forget, myself, if we can just ferret out these caches and help the items get into a more stable state. (As an aside, the conclusions of this study are wrong, although I appreciate the effort.)

Help me with this, before it’s too late. Because it is too late.

Help me now.

Update: There’s now a page on the Archive Team Wiki that I have created to give people options and information about the transfer of floppy disks into a more modern storage location. Please read or contribute.


To There and Back Again – And More Stuff! — July 8, 2011

I’ve been a little busy, doing a little travelling for the summer.

When I get back home this weekend, I’ll be kicking things into full gear with my work with Archive Team, Archive.org, GDC, and other projects. But let me mention a few things worth going into that I did while globetrotting.

The Arcade Manuals Archive has been getting the love from the growing Metadata Warriors (and there’s always room for more warriors, by the way). If you track the RSS feed for the collection, you can see it grow. Since I find arcade manuals massively fascinating, it’s like a fountain of awesomeness for me. If it’s not quite as fascinating for you, you can still look it over and see how things are going. And if that’s too much work, hey, Nintendo Light Gun maintenance manual. Who can argue with a light gun manual?

Enough GET LAMP interviews have been rendered, uploaded, and described that it was finally time to announce the GET LAMP Interviews Archives are now a “thing” on archive.org. Ten are up as of this writing and I’ve got a machine rendering high-definition, video-noise-reduced clip sets for you to watch at your leisure, assuming you want more thoughts from the people I had in the movies. Perhaps Jim Aspnes, the creator of TinyMUD, jumps out at you, but there’s also Lance Micklus, who was a software author and who ran a company during the opening age of the home computer industry. And more! What a great deal!

And the Bitsavers collection is bringing up the rear, although I expect it’ll pick up some steam soon. Until it gets more, though, it’ll have to make do with such items as the 2nd West Coast Computer Faire Proceedings, or the instructions for the Ohio Scientific 8K BASIC-in-ROM. And occasionally thumbing through the Commodore Component catalog.

So, more to come, after I land back in the US of A in a couple days. I’m refreshed, energized and ready to go.


The Metadata Mania — June 26, 2011

OK, now that I’ve gone pro with the archiving and I’ve been at it pretty hardcore for a few months, my programmer brain has kicked in and I’m trying to find inefficiencies and kill them so we end up with a ton of cool stuff online.

I already know what I am and what I’m about with all this. I’ve known for some time: I’m the weird little widget, the fucked-up little strange item that is going to link a lot of people who haven’t linked up before. That said, there’s always the danger of being called, by each set of people, someone who is a poor replacement for them. And I am! I am an enjoyable archivist but not a full replacement for a “real” archivist and I’m a home computer enthusiast but there’s some people who truly pants me over compared to what I know. But my mission and foreseeable future will be spent linking all these sets together, and that’s how it’s going to be.

So now that we understand each other, let’s get to work. Let’s get to work, in fact, on the biggest single problem I’ve found.

It’s metadata. Metadata is the slowdown.

If you don’t really know what metadata is, that’s cool, it’s a kind of weird concept. In simple terms, metadata is information about other data. If you have a pile of Apple II Floppy Disk images, then the fact they’re Apple II floppy disk images is the metadata, as well as what’s on them, when you transferred them, who owned them… anything that’s not the data itself is the metadata.
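To make the split concrete, here’s a minimal sketch in Python. Everything in it is made up for illustration: the DiskImage class, the field names, and the zero-filled stand-in for the disk contents are assumptions, not anything from a real collection.

```python
from dataclasses import dataclass, field


@dataclass
class DiskImage:
    data: bytes                                   # the data itself: raw bytes of the image
    metadata: dict = field(default_factory=dict)  # everything *about* those bytes


image = DiskImage(
    data=bytes(143360),  # placeholder: zero-filled stand-in, not a real disk
    metadata={
        "platform": "Apple II",
        "media": "floppy disk image",
        "transferred": "2011-06-26",      # hypothetical transfer date
        "owner": "unknown",               # hypothetical
        "contents": "not yet described",  # the part a human has to fill in
    },
)

print(len(image.data), "bytes of data,", len(image.metadata), "metadata fields")
```

The bytes are the same whether or not anyone ever fills in the dictionary. The dictionary is the only part that tells the next person what they’re looking at.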

See, it’s not that hard to add lots of data somewhere – see cd.textfiles.com where I have three million files, and yes I am going to port cd.textfiles.com to archive.org and yes I am going to be adding scans of the discs themselves and the ISOs and oh yeah, that’s right baby we’re going big with that. But the three million files, left alone, wouldn’t be interesting or more accurately would be very interesting but be flooded by all the other files not in your immediate interest path for whatever you’re looking up.

Metadata, you see, is really a love note – it might be to yourself, but in fact it’s a love note to the person after you, or the machine after you, where you’ve saved someone that amount of time to find something by telling them what this thing is. Lives have been absorbed getting metadata, and so there’s an entire field of computer study about this idea, and making your machines do the hard work for you. Google’s got some interest in this, I heard. If you could completely generate Metadata, life would be pretty awesome. But you can’t. Not really. Not completely.

So let me announce another collection I’m working on. The Bitsavers Archive. Yeah, that’s right, that one. The one that people have been scanning, donating, and working on for over 15 years. I’m going to import it into archive.org. I got permission from them and we’re going in.

But again, Metadata becomes the immediate issue. I’ve written a script that lets me point at a bitsavers asset, say An Apple I User Manual, and type in a title and then the description and date for the item, and then the script does the rest – upload it, generate a metadata .xml file the archive.org system uses, and check in the item so it can go into the collection. Truly, fire and forget. Everything that can be automated is now automated. But the fact remains, I had to title the item, and then write a description. That’s the big holdup.
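For the curious, the “generate a metadata .xml file” step might look something like the sketch below. This is not the actual script: the function name, the element names, the identifier and the collection name are all assumptions for illustration, and the upload and check-in steps are left out entirely.

```python
import xml.etree.ElementTree as ET


def build_meta_xml(identifier, title, description, date, collection):
    """Build a small metadata XML document for one item.

    The element names used here are assumed for illustration only.
    """
    root = ET.Element("metadata")
    fields = [
        ("identifier", identifier),
        ("title", title),
        ("description", description),
        ("date", date),
        ("collection", collection),
    ]
    for tag, value in fields:
        ET.SubElement(root, tag).text = value
    return ET.tostring(root, encoding="unicode")


xml_text = build_meta_xml(
    identifier="example-apple-i-manual",  # hypothetical identifier
    title="An Apple I User Manual",
    description="Typed in by a human. This is the slow part.",
    date="unknown",                       # placeholder, filled in by hand
    collection="example-collection",      # hypothetical collection name
)
print(xml_text)
```

Notice that the code is the easy part. The three strings a person has to type (title, description, date) are where the time goes.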

See, bitsavers.org has 19,000 items in its collection. They are not described in the manner they really need to be to be useful. Someone, me or other people, will need to describe them, before they should be checked in to archive.org. Sure, I could slam all 19,000 in WITHOUT descriptions, but then very little has been accomplished – imagine stacks of a library where all the back of the books have been covered with white-out. Not useful at all.

Similarly, the arcade manual archive I created to test out my scripts and acquisition approaches is a wild success, except in numbers. The 362 manuals in there, though, are all described nicely, thanks to a team of volunteers who stepped forward to do the reading and typing necessary. They’re credited on the front page for the work they did. Notably, though, I have 4,000 more manuals to add. I’ll be adding them as I go for a little while, but it’s just going to take a huge amount of time, even with every step but “describe” automated.

So, it basically comes down to me asking you, if the idea interests you, to sign up to be a metadata warrior, someone who will work with me to describe these items. I’ll help you find something personally interesting – some people really dig reading old computer manuals, others care about arcade stuff, and I’m sure there’s even more in what I’m adding to keep your attention up. I’m not asking you to do everything – even if you add a handful of stuff, that’s more than was there a week previous – and that’s helping.

And in case the thought occurs to you, Archive Team is not really the best thing for this – this is about long-term presentation, not saving burning data from assholes. When you help me with metadata, we’re helping make available really cool stuff that has been saved but which needs a nice tag on the outside so later people and generations can know what’s in there.

The e-mail is metadata@textfiles.com and I suspect I’ll have a mailing list to discuss this in the future.


Available for Speaking Engagements — June 25, 2011

Here’s the deal.

Like a lot of other people, I incurred some huge debt in the last 5-6 years, which then started to expand under the weight of compound interest, and which I then spent a lot of time on the phone negotiating to minimal interest and reasonable payoffs over time. It has been going well, but with the recent changes in my life, it’s starting to cause trouble – right now easily half my income flies back out again to debt repayment. This is a problem of my own doing, and one I am fine living with, but part of living with something is trying to solve it, and to solve it in a way you can continue to live with, and with yourself.

I have no interest in industrializing Sockington into something that will shoot some shekels my way – people have fairly unrealistic views of what the benefits of making a product based around the cat are, and of the subsequent ruination of what is a very special thing for me. I’ve run the numbers, it’s not worth it.

I am, technically, employed by two different organizations at the moment, one full-time and one doing part-time contractor work. The second will fade out sooner rather than later. And, anyway, like I said, the debt is just crushing things even with what appears to be two streams of income. Oh, and for anyone checking against the documentary sales – basically all of that is going into a different debt repayment – the musicians and the kind person who lent money for the world to get 1000 more BBS Documentaries.

Great, now you know more about me and money than anything. But let me explain one way I am going to try and get out of it.

Ah yes, speaking engagements. This is where I would come to an event, some presentation-laden schedule, and Bring It for the purposes of education, entertainment and whatever else. I’m willing to speak on events great and small, situations old and new, and tune the presentation for the audience.

I will, of course, continue to do events that I choose to for free or the amount they pay: I have presentations coming up for DEFCON, DerbyCon, and maybe a couple others. But this is less about the places that I commonly visit to do my thing, and more about that all-important notice to other places that desire speakers that I am available.

I’ve spoken to rooms with 5 people, and rooms with 5,000 people. I do not get nervous on stage. I have given profanity-laced standup routines based on actual events:

That Awesome Time I Was Sued for Two Billion Dollars

And I can do totally clean, straightforward descriptions of situations, such as this Wikipedia critique given in London:

Mythapedia

A presentation I gave on HOW to give presentations is one of the links DEFCON and others point to for other speakers to see:

The Presentation Presentation

And I can provide other examples and references from organizers and colleagues as needed.

The reason I bring up all this hat-in-hand stuff and re-iterate what a lot of my readers already know about my presentations and presentation style is that I am asking you to feel free to bridge me out to places you never thought I’d go. I have encountered, here and there, cases where people are excited at the prospect of getting someone like me to present at their events, but they’re nervous or unsure if I’d even want to talk with them, or they’re looking for my representation, and so on. What I’m saying is, don’t hesitate – I’m available.

We’ll see what comes of all this.


Experiment Successful! Two More GET LAMP Interviews — June 20, 2011

So the initial experiment with Chuck Benton turns out to have been pretty successful – the resulting video is clear, clean, well-mic’d, and provides you pretty much all the relevant statements on the subject of GET LAMP-related ideas. Someone mentioned a cut-off here and there, and unfortunately it’s been too many years to know why I might have done that, but it would never have been to avoid having a clip with something unusual/terrible – it would have been that he stopped and moved onto a new subject.

There is video noise reduction going on, and this slows the whole rendering process absolutely dramatically – it takes something like 4-6 hours for the system to render a 15-20 minute set of clips. As it is, with me in Australia and doing other things, making my machine do this via a VNC connection has very little personal pain of my main machine being tied up, so I’m moving ahead with whatever is low-hanging fruit.
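To put a number on “dramatically”, here’s a quick back-of-the-envelope calculation using the rough figures above. The slowdown function is just a helper made up for this sketch.

```python
def slowdown(render_hours, clip_minutes):
    """How many minutes of rendering it takes per minute of finished video."""
    return render_hours * 60 / clip_minutes


best_case = slowdown(4, 20)   # shortest render, longest clip set
worst_case = slowdown(6, 15)  # longest render, shortest clip set

print(f"Rendering runs {best_case:.0f}x to {worst_case:.0f}x slower than real time")
```

So every minute of finished interview costs somewhere between 12 and 24 minutes of machine time, which is why it’s nice that the machine can grind away unattended.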

Therefore, I’m happy to point you to two more interviews uploaded:

Warren Robinett was the creator of Adventure for the Atari 2600, and in doing so, he pioneered all sorts of advances in gameplay for videogames. After a year or two, we finally got things working and we did a very short interview in San Jose – almost all of this material is on the DVD in either the main movie or in a couple bonus features I put him in. In fact, it was so short I remember us finishing and him going “Really? That’s it?” – but that was all I needed and I wasn’t going to grill him on random Atari history in this context. I was just pleased to get what we got.

John Romero contacted me about being in GET LAMP, because he’d done contractor work in his early days as a programmer, and one of those contract jobs was Infocom. I ended up giving him not too much notice, on a Sunday, and he personally let me into the building where he was working and we did the interview just before I drove at top speed to catch a flight out of California. There’s all this bullshit around this guy from the people whose contribution to gaming was to ensure Frito-Lay stayed in business, but I have now interacted with him multiple times both with this film and elsewhere and he is a fantastic dude, open and generous and informed.

I’ll have more interviews up soon – right now I’m rendering Scott Adams, and the machine has told me it will take 26 hours. Good thing I don’t have to be there! With Scott Adams done, I’ll create a GET LAMP interview collection on archive.org and you can keep track of that.

Hooray for history!


The First GET LAMP Interview Experiment — June 15, 2011

During the shooting of the GET LAMP documentary, I generated what appears to be my stylistic mass of footage – over 120 hours of people talking about text adventures, early home computer software industry, inter-company politics, and a range of things about writing interactively. It was a huge range of subjects and of course only a tiny sliver got into anything on the GET LAMP DVDs – probably 3-4 hours in total.

My hope is to release almost all of it on archive.org, with a Creative Commons license, so people can listen to them, learn some additional stuff, and provide a direct-source historical record of events happening – after all, these are testimonials as well as discussions.

Just to set expectations for people waiting for specific folks from the interview list – some people had final approval of what was put in the film (they couldn’t change it, mind you – they just had a yes or no to finally appearing) and as such, I can’t exactly go ahead and just drop all their answers out in the wild, unless I check with them first. I’ll do my best to provide them with the proposed collection of clips and get a sign-off, but for now I’ll go for the low-hanging fruit and just go for people of direct historical interest who were fine with all of it going up.

As I learned the hard way with the BBS Documentary raw footage, people don’t exactly want the actual, full, unbroken interviews – my questions are repetitive across multiple sessions, I sometimes launch into stories or other tricks to bring out statements, and you generally get 20 minutes of “the good stuff” out of an hour-long tape. So, I’m doing an experiment this time – providing “cooked” interviews, where you are ONLY getting a set of clips consisting of 1. the subject’s answers 2. which at the time of editing I thought complete and relevant to the final work. This cuts things down dramatically. I have also applied some minimal noise reduction against the footage so that it compresses better and plays well, which should help as well.

So here’s the first in what I hope will be nearly all of them: Chuck Benton, late of On-Line Systems/Sierra On-line, creator of Softporn Adventure, which was later remixed into the Leisure Suit Larry franchise, and who also did a couple other great ports/works, like Frogger and B.C.’s Quest for Tires.

You’ll hear small bits of my voice, but otherwise I’m absent. The whole thing runs 30 minutes, less than the hour I ran with him originally. This was my first interview conducted with the new equipment, and the waterfall in the back, I decided, was too much to use the footage. (It turns out I could have included it, because I got MUCH better at post-processing.)

So here we go, check it out, and here’s hoping I can get many more to you this year.


The Hardcore Computist Collection — June 10, 2011

If it’s not obvious already, one of the major parts of my work with the Internet Archive involves going after those very subjects, items and collections that would fit perfectly within its hallowed walls but for which nobody with the items, or at archive.org, has made the connection. I’m the connection. And I’m connecting.

A lot of my recent effort has just been learning the archive.org back-end, which is a little strange but very geared towards keeping things around forever. A lot of machines do a lot of tasks, and attempts are made to give you a lot of options to get things out of the “stacks”. It’s really, really good for books and documents, which was the first primary item the site had after webpages – you can get versions of documents for reading online, downloading in PDF form, pulling into your Kindle or what have you. Music, also pretty good. Video? Working on it, getting better all the time (just this past week, more options for video showed up). Ironically, this is the opposite approach to pretty much all commercial sites that harbor media – they all hide the original and show you derivatives. Archive.org ensures you always can get the original and does its best with derivatives. So learn this system I have, and the learning continues.

Here, then, is my newest collection to present, one that has had a personal side and interest for many years: The Hardcore Computist Collection. Everything I’m going to say is to give context of this collection and how things got to where they are, so feel free to just go ahead and click that link and while away an afternoon.

As things shake out and the Apple II becomes a footnote to the contemporary world that thinks of Apple as a music and exotic hardware design company, I think it’s a great idea to capture as much of that earlier machine’s history and related culture before it all flattens out completely. The Apple II had a huge following for the time, is still being used for experimentation and fun, and continues to amaze and delight with the simplicity of its approach combined with the complexity of the ideas it inspires. To the goals of preservation, there’s a lot of scanning, interviews, software and hardware being assembled to capture it. I’ve been involved in some of that, but I’m a tiny, tiny needle in the haystack of all these folks. What I’m mostly good at, I guess, is bringing attention to aspects of this effort and bringing them the visibility and regard they deserve.

So it goes, then, with Hardcore Computist, one of the more amazing things to rise out of the wild success of the Apple II and industry and customer base around it.

It was, depending on who you talk to and in what context, a blow for computer users’ rights far in advance of issues twenty years away, or a software pirate’s journal, or a strange technical magazine dedicated to the liberation of understanding about the software being written for this amazing home computer. It might be all of them. What it unquestionably was, however, was a professionally printed magazine dedicated to removing copy protection schemes from commercial software for the Apple II.

The opening issue of the first incarnation of the magazine, Hardcore Computing, lays it out straight. The magazine, says publisher Charles R. Haight, is a salvo in a battle, a silent battle between the “establishment” Apple II magazines, and the users. The users, you see, had been stripped of the right to legal backups of their software, software they bought, and even though strides were being made to allow users to de-protect and duplicate their copies, magazines were refusing to write about this or publish ads for tools to help said users. This, to Haight, was censorship, plain and simple, and Hardcore Computing would fix that.

What followed is ten years of amazing publishing – dozens of issues, all written clearly, stridently, and never turning away from the nuts and bolts of hacking away at software programs to tame them and make them do what you want. It is, in many ways, a very beautiful work, with minimal but striking artwork (horses and space seem to be a recurring theme) and the occasional single-color paper tint. Over the years, color does make an appearance, but as the Apple II’s fortunes fade, so does Computist’s and the magazine shifts down to a newsletter, and then ultimately some scattered galley proofs of issues never published. It had several names, multiple focuses, and oh man, the ads. It’s worth it just for the ads.

That I know about this magazine, and that people were able for years to read this magazine online, was the work of one Mike Maginnis, who scanned all the copies of the magazine to the best of his abilities and began offering the images from these scans online, through a variety of sites. After finding out about his work, I mirrored it on textfiles.com, with a different interface, but didn’t add much of anything to it, and frankly I had it looking pretty and absolutely worthless for browsing. Fast forward, then, and I have taken the latest revisions of Mike’s scans and put them on archive.org to make permanent. Mike was all for it.

While I know the collection is not 100% complete, and some scans are fuzzy, this is an important step in the right direction – the issues are now protected under the auspices of a library, and can be checked out by anyone who wants to head over there and read them online, or download the original PDFs, or anything else that might grab their fancy. It provides, along with Mike’s site and collection, a non-profit registered-library home dedicated to its preservation, and that’s pretty darn cool.

To help with the entering of descriptions and metadata about these issues, I called out generally and a bunch of people answered the call. So please, a very big thank you to Heather Bowden, Louise Pichel, Lewis Collard, and Colin Djukic, who downloaded issues and typed out a bunch of relevant information so these issues would catch into search engines and make the future patrons of this library happy with how easy it is to find something.

This doesn’t mark the end of preserving Hardcore Computist and its related publications, though – it’s just a new stage. I’ve given it a new platform to thrive, but it won’t thrive without contributions from other historians and collectors who know a lot more about this world than I do. I’d love to see items added, descriptions puffed out, better replacement scans over time for items that could use them. It’d be a great blow against the tides of entropy and amnesia that affect all things now made, in many ways, obsolete or seemingly so.

Enjoy the read. And back up your software!
