ASCII by Jason Scott

Jason Scott's Weblog

The Full BBS Documentary Interviews are Going Online —

This year, the full 250 hours of interviews I conducted for the BBS Documentary are going online at the Internet Archive.

There’s already a collection of them up, from when I first set out to do this. Called “The BBS Documentary Archive”, it’s currently 32 items from various interviews, including a few clip farms and full interviews of a bunch of people who sat with me back in the years of 2002-2004 to talk about all manner of technology and bulletin board history.

That collection, as it currently stands, is a bit of an incomplete mess. Over the course of this project, it will become a lot less so. I’ll be adding every minute of tape I can recover from my storage, as well as fixing up metadata where possible. Naturally you will be asked to help as well.

A bit of background for people coming into this cold: I shot a movie called “BBS: The Documentary” which ended up being an eight-episode mini-series. It tried to be the first and ultimately the last large documentary about bulletin board systems, those machines hooked up to phone lines that lived far and wide from roughly 1978-2000s. They were brilliant and weird and they’re one of the major examples of life going online. They laid the foundation for a population that used the Internet and the Web, and I think they’re terribly interesting.

I was worried that we were going to never get The Documentary On BBSes and so I ended up making it. It’s already 10 years and change since the movie came out, and there’s not been another BBS Documentary, so I guess this is it. My movie was very North American-centric and didn’t go into blistering detail about Your Local BBS Scene, and some people resented that, but I stand by both decisions; just getting the whole thing done required a level of effort and energy I’m sure I’m not capable of any more.

Anyway, I’m very proud of that movie.

I’m also proud of the breadth of interviews – people who pioneered BBSes in the 1970s, folks who played around in scenes both infamous and obscure, and experts in areas of this story that would never, ever have been interviewed by any other production. This movie has everything: Vinton Cerf (co-creator of the Internet) along with legends of FidoNet like Tom Jennings and Ken Kaplan and even John Madill, who drew the FidoNet dog logo. We’ve got ANSI kids and Apple II crackers and writers of a mass of the most popular BBS software packages. The creator of .QWK packets and multiple members of the Cult of the Dead Cow. There’s so much covered here that I just think would never, ever be immortalized otherwise.

And the movie came out, and it sold really well, and I open licensed it, and people discover it every day and play it on YouTube or pull out the package and play the original DVDs. It’s a part of culture, and I’m just so darn proud of it.

Part of the reason the movie is watchable is because I took the 250 hours of footage and made it 7.5 hours in total. Otherwise… well….

…unless, of course, you’re a maniac, and you want to watch me talking with people about subjects decades in the past and either having it go really well or fall completely apart. The shortest interview is 8 minutes. The longest is five hours. There’s legions of knowledge touched on in these conversations, stuff that can be a starting point for a bunch of research that would otherwise be out of options to even find what the words are.

Now, a little word about self-doubt.

When I first started uploading hours of footage of BBS Documentary interviews to the Internet Archive, I was doing it from my old job, and I had a lot going on. I’d not done much direct work with Internet Archive and didn’t know anything going on behind the scenes or how things worked or frankly much about the organization in any meaningful amount. I just did it, and sent along something like 20 hours of footage. Things were looking good.

Then, reviews.

Some people started writing a few scathing responses to the uploads, pointing out how rough they were, my speech patterns, the interview style, and so on. Somehow, I let that get into my head, and so, with so much else to do, I basically walked away from it.

12 years later (12 years!) I’m back, and circumstances have changed.

I work for the Archive, I’ve uploaded hundreds of terabytes of stuff, and the BBS documentary rests easily on its laurels of being a worthwhile production. Comments by randos about how they wish I’d done some prettify-ing of the documentary “raw” footage don’t even register. I’ve had to swim upstream through a cascade of poor responses to things I’ve done in public since then – they don’t get at me. It took some time to get to this place of comfort, which is why I bring it up. For people who think of me as some bulletproof soul, let it be known that “even I” had to work up to that level, even when sitting on something like BBS Documentary and years of accomplishment. And those randos? Never heard from them again.

The interview style I used in the documentary raw footage should be noted because it’s deliberate: they’re conversations. I sometimes talk as much as the subjects. It quickly became obvious that people in this situation of describing BBS history would have aspects that were crystal clear, but would also have a thousand little aspects lost in fuzzy clouds of memory. As I’d been studying BBSes intensely for years at this point, it would often take me telling them some story (and often the same stories) to trigger a long-dormant tale that they would fly with. In many cases, you can see me shut up the second people talk, because that was why I was talking in the first place. I should have known people might not get that, and I shouldn’t have listened to them so long ago.

And from these conversations come stories and insights that are priceless. Folks who lived this life in their distant youth have all sorts of perspectives on this odd computer world and it’s just amazing that I have this place and collection to give them back to you.

But it will still need your help.

Here’s the request.

I lived this stupid thing; I really, really want to focus on putting a whole bunch of commitments to bed. Running the MiniDV recorder is not too hard for me, and neither is the basic uploading process, which I’ve refined over the years. But having to listen to myself for hundreds of hours using whatever time I have on earth left… it doesn’t appeal to me at all.

And what I really don’t want to do, beyond listening to myself, is enter the endless amount of potential metadata, especially about content. I might be inspired to here and there, especially with old friends or interviews I find joyful every time I see them again. But I can’t see myself doing this for everything and I think metadata on “subjects covered” and “when was this all held” is vital for the collection to be useful. So I need volunteers to help me. I run a Discord server that communicates with people collaborating with me and I have a bunch of other ways to be reached. I’m asking for help here – turning this all into something useful beyond just existing is a vital step that I think everyone can contribute to.

If you think you can help with that, please step forward.

Otherwise… step back – a lot of BBS history is about to go online.

 


The Undiscovered —

There’s a bit of a nuance here; this entry is less about the specific situation I’m talking about, than about the kind of situation it is.

I got pulled into this whole thing randomly, when someone wrote me to let me know it was going along. Naturally, I fired into it all with all cylinders, but after a while, I figured out very good people were already on it, by days, and so I don’t actually have to do much of anything. That works for me.

It went down like this.

MOS Technology designed the 6502 chip which was in a mass of home computers in the 1970s and 1980s. (And is still being sold today.) The company, founded in 1969, was purchased in 1976 by Commodore (they of the 64 and Amiga) and became their chip production arm. A lot of the nitty gritty details are in the Wikipedia page for MOS. This company, now a subsidiary, lived a little life in Pennsylvania throughout the 1980s as part of the Commodore family. I assume people went to work, designed things, parked in the parking lot, checked out prototypes, responded to crazy Commodore administration requests… the usual.

In 1994, Commodore went out of business and its pieces were bought by various groups. In the case of the MOS Technology building, it was purchased by various management and probably a little outside investment, and became a new company, called GMT Microelectronics. GMT did whatever companies like that do, until 2001, when they were shut down by the Environmental Protection Agency because it turns out they kind of contaminated the groundwater and didn’t clean it up very well.

Then the building sat, a memory to people who cared about the 6502 (like me), to former employees, and probably nobody else.

Now, welcome to 2017!

The building has gotten a new owner who wants to turn the property into something useful. To do this, they basically have to empty it, raze the building to the ground, clean the ground, and then build a new building. Bravo, developer. Remember, this building has sat for 16 years, unwanted and unused.

The sign from the GMT days still sits outside, unchanged and just aged from when the building was once that business. Life has certainly gone on. By the way, these photos are all from Doug Crawford of the Vintage Computer Federation, who took this tour in late 2017.

Inside, as expected, it is a graffiti and firepit shitshow, the result of years of kids and others camping out in the building’s skeletal remains and probably whiling away the weekends hanging out.

And along with these pleasant scenes of decay and loss are some others involving what Doug thought were “Calcium Deposits” and which I personally interpret as maybe I never need to set foot in this building at any point in my future life and probably will have to burn any clothing I wear should I do so.

But damn if Doug didn’t make the journey into this environmentally problematic deathtrap to document it, and he even brought a guest of some renown related to Commodore history: Bil Herd, one of the designers of the Commodore 128.

So, here’s what I want to get to: In this long-abandoned building, decades past prime and the province of trespassers and neglect, there turns out to have been quite a bit of Commodore history lying about.

There’s unquestionably some unusually neat items here – old printed documentation, chip wafers, and those magnetic tapes of who knows what; maybe design or something else that needed storage.

So here’s the thing; the person who was cleaning up this building for demolishing was put into some really weird situations – he wanted people to know this was here, and maybe offer it up to collectors, but as the blowback happened from folks when he revealed he’d been throwing stuff out, he was thrown into a defensive position and ultimately ended up sticking with looking into selling it, like salvage.

I think there’s two lessons here:

  1. There’s no question there’s caches of materials out there, be they in old corporate offices, warehouses, storerooms, or what have you, that are likely precious windows into bygone technology. There’s an important lesson in not assuming “everything” is gone and maybe digging a bit deeper. That means contacting places, inquiring with simple non-badgering questions, and being known as someone interested in some aspect of history so people might contact you about opportunities going forward.
  2. Being a shouty toolbox about these opportunities will not improve the situation.

I am lucky enough to be offered a lot of neat materials in a given month; people contact me about boxes, rooms and piles that they’re not sure what the right steps are. They don’t want to be lectured or shouted at; they want ideas and support as they work out their relationship to the material. These are often commercial products now long-gone and there’s a narrative that old automatically means “payday at auction” and that may or may not be true; but it’s a very compelling narrative, especially when times are hard.

So much has been saved and yes, a lot has been lost. But if the creators of the 6502 can have wafers and materials sitting around for 20 years after the company closed, I think there’s some brightness on the horizon for a lot of other “lost” materials as well.


Jason Scott Talks His Way Out of It: A Podcast —

Next week I start a podcast.

There’s a Patreon for the podcast with more information here.

Let me unpack a little of the thinking.

Through the last seven years, since I moved back to NY, I’ve had pretty variant experiences of debt or huge costs weighing me down. Previously, I was making some serious income from a unix admin job, and my spending was direct but pretty limited. Since then, even with full-time employment (and I mean, seriously, a dream job), I’ve made some grandiose mistakes with taxes, bills and tracking down old obligations that mean I have some notable costs floating in the background.

Compound that with a new home I’ve moved to with real landlords that aren’t family and a general desire to clean up my life, and I realized I needed some way to make extra money that will just drop directly into the bill pit, never to really pass into my hands.

How, then, to do this?

I work very long hours for the Internet Archive, and I am making a huge difference in the world working for them. It wouldn’t be right or useful for me to take on any other job. I also don’t want to be doing something like making “stuff” that I sell or otherwise speculate into some market. Leave aside I have these documentaries to finish, and time has to be short.

Then take into account that I can no longer afford to drop money going to anything other than a small handful of conferences that aren’t local to me (the NY-CT-NJ Tri-State area), and that people really like the presentations I give.

So, I thought, how about me giving basically a presentation once a week? What if I recorded me giving a sort of fireside chat or conversational presentation about subjects I would normally give on the road, but make them into a downloadable podcast? Then, I hope, everyone would be happy: fans get a presentation. I get away from begging for money to pay off debts. I get to refine my speaking skills. And maybe the world gets something fun out of the whole deal.

Enter a podcast, funded by a Patreon.

The title: Jason Scott Talks His Way Out of It, my attempt to write down my debts and share the stories and thoughts I have.

I announced the Patreon on my 47th birthday. Within 24 hours, about 100 people had signed up, paying some small amount (or not small, in some cases) for each published episode. I had a goal of $250/episode to make it worthwhile, and we passed that handily. So it’s happening.

I recorded a prototype episode, and that’s up there, and the first episode of the series drops Monday. These are story-based presentations roughly 30 minutes long apiece, and I will continue to do them as long as it makes sense to.

Public speaking is something I’ve done for many, many years, and I enjoy it, and I get comments that people enjoy them very much. My presentation on That Awesome Time I Was Sued for Two Billion Dollars has passed 800,000 views on the various copies online.

I spent $40 improving my sound setup, which should work for the time being. (I already had a nice microphone and a SSD-based laptop which won’t add sound to the room.) I’m going to have a growing list of topics I’ll work from, and I’ll stay in communication with the patrons.

Let’s see what this brings.

One other thing: Moving to the new home means that a lot of quality of life issues have been fixed, and my goal is to really shoot forward finishing those two documentaries I owe people. I want them done as much as everyone else! And with fewer looming bills and debts in my life, it’ll be all I want to do.

So, back the new podcast if you’d like. It’ll help a lot.


The Bounty of the Ted Nelson Junk Mail —

At the end of May, I mentioned the Ted Nelson Junk Mail project, where a group of people were scanning in boxes of mailings and pamphlets collected by Ted Nelson and putting them on the Internet Archive. Besides the uniqueness of the content, it was also unique in that we were trying to set it up to be self-sustaining from volunteer monetary contributions, and to compensate the scanners doing the work.

This entire endeavor has been wildly successful.

We are well past 18,000 pages scanned. We have taken in thousands in donations. And we now have three people scanning and one person entering metadata.

Here is the spreadsheet with transparency and donation information.

I highly encourage donating.

But let’s talk about how this collection continues to be amazing.

Always, there are the pure visuals. As we’re scanning away, we’re starting to see trends in what we have, and everything seems to go from the early 1960s to the early 1990s, a 30-year scope that encompasses a lot of companies and a lot of industries. These companies are trying to thrive in a whirlpool of competing attention, especially in certain technical fields, and they try everything from humor to class to rudimentary fear-and-uncertainty plays in the art.

These are exquisitely designed brochures, in many cases – obviously done by a firm or with an in-house group specifically tasked with making the best possible paper invitations and with little expense spared. After all, this might be the only customer-facing communication a company could have about its products, and might be the best convincing literature after the salesman has left or the envelope is opened.

Scanning at 600dpi has been a smart move – you can really zoom in and see detail, find lots to play with or study or copy. Everything is at this level, like this detail about a magnetic eraser that lets you see the lettering on the side.

Going after these companies for gender roles or other out-of-fashion jokes almost feels like punching down, but yeah, there’s a lot of it. Women draped over machines, assumptions that women will be doing the typing, and clunky humor about fulfilling your responsibilities as a (male) boss abounds. Cultural norms regarding what fears reigned in business or how companies were expected to keep on top of the latest trends are baked in there too.

The biggest obstacle going forward, besides bringing attention to this work, is going to be one of findability. The collection is not based on some specific subject matter other than what attracted Ted’s attention over the decades. He tripped lightly among aerospace, lab science, computers, electronics, publishing… nothing escaped his grasp, especially in technical fields.

If people are looking for pure aesthetic beauty, that is, “here’s a drawing of something done in a very old way” or “here are old fonts”, then this bounty is already, at 1,700 items, a treasure trove that could absorb weeks of your time. Just clicking around to items that on first blush seem to have boring title pages will often expand into breathtaking works of art and design.

I’m not worried about that part, frankly – these kind of sell themselves.

But there’s so much more to find among these pages, and as we’re now up to so many examples, it’s going to be a challenge to get researching folks to find them.

We have the keywording active, so you can search for terms like monitor, circuit, or hypercard and get more specific matches without concentrating on what the title says or what graphics appear on the front. The Archive has a full-text search, and so people looking for phrases will no doubt stumble into this collection.

But how easily will people even think to know about a wristwatch for the Macintosh from 1990, a closed circuit camera called the Handy Looky… or this little graphic, nestled away inside a bland software catalog:

…I don’t know. I’ll mention that this is actually twitter-fodder among archivists, who are unhappy when someone is described as “discovering” something in the archives, when it was obvious a person cataloged it and put it there.

But that’s not the case here. Even Kyle, who’s doing the metadata, is doing so in a descriptive fashion, and on a rough day of typing in descriptions, he might not particularly highlight unique gems in the pile (he often does, though). So, if you discover them in there, you really did discover them.

So, the project is deep, delightful, and successful. The main consideration of this is funding; we are paying the scanners $10/hr to scan and the metadata is $15/hr. They work fast and efficiently. We track them on the spreadsheet. But that means a single day of this work can cause a notable bill. We’re asking people on twitter to raise funds, but it never hurts to ask here as well. Consider donating to this project, because we may not know for years how much wonderful history is saved here.
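To get a feel for how quickly that bill adds up, here is a rough back-of-the-envelope sketch. The hourly rates and the headcount (three people scanning, one on metadata) come from this post; the eight-hour day is purely an assumption for illustration, and the real shifts may well be shorter.

```python
# Rates and headcount are taken from the post; HOURS_PER_DAY is an assumption.
SCANNER_RATE = 10      # dollars per hour, per scanner
METADATA_RATE = 15     # dollars per hour, per metadata person
SCANNERS = 3
METADATA_PEOPLE = 1
HOURS_PER_DAY = 8      # assumed; not stated anywhere in the post


def daily_cost(hours=HOURS_PER_DAY, scanners=SCANNERS,
               metadata_people=METADATA_PEOPLE):
    """Return the estimated dollar cost of one day of scanning plus metadata."""
    hourly_total = scanners * SCANNER_RATE + metadata_people * METADATA_RATE
    return hours * hourly_total


if __name__ == "__main__":
    # 8 hours * (3 * $10 + 1 * $15) = $360 under the assumptions above
    print(daily_cost())
```

Under those assumptions a full week of work runs to five times that, which is why the donations matter.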

Please share the jewels you find.


4 Months! —

It’s been 4 months since my last post! That’s one busy little Jason summer, to be sure.

Obviously, I’m still around, so no heart attack lingering or problems. My doctor told me that my heart is basically healed, and he wants more exercise out of me. My diet’s continued to be lots of whole foods, leafy greens and occasional shameful treats that don’t turn into a staple.

I spent a good month working with good friends to clear out the famous Information Cube, sorting out and mailing/driving away all the contents to other institutions, including the Internet Archive, the Strong Museum of Play, the Vintage Computer Federation, and parts worldwide.

I’ve moved homes, no longer living with my brother after seven up-and-down years of siblings sharing a house. It was time! We’re probably not permanently scarred! I love him very much. I now live in an apartment with very specific landlords with rules and an important need to pay them on time each and every month.

To that end, I’ve cut back on my expenses and will continue to, so it’s the end of me “just showing up” to pretty much any conferences that I’m not being compensated for, which will of course cut things down in terms of Jason appearances you can find me at.

I’ll still be making appearances as people ask me to go, of course – I love travel. I’m speaking in Amsterdam in October, and I’ll be an Emcee at the Internet Archive that month as well. So we’ll see how that goes.

What that means is more media ingestion work, and more work on the remaining two documentaries. I’m going to continue my goal of clearing my commitments before long, so I can choose what I do next.

What follows will be (I hope) lots of entries going deep into some subjects and about what I’m working on, and I thank you for your patience as I was not writing weblog entries while upending my entire life.

To the future!


Ted Nelson’s Junk Mail (and the Archive Corps Pilot) —

I’ve been very lucky over the past few months to dedicate a few days here and there to helping legend Ted Nelson sort through his archives. We’ve known each other for a bunch of years now, but it’s always a privilege to get a chance to hang with Ted and especially to help him with auditing and maintaining his collection of papers, notes, binders, and items. It also helps that it’s in pretty fantastic shape to begin with.

Along with sorting comes some discarding – mostly old magazines and books; they’re being donated wherever it makes sense to. Along with these items were junk mail that Ted got over the decades.

About that junk mail….

After glancing through it, I requested to keep it and take it home. There was a lot of it, and even going through it with a cursory view showed me it was priceless.

There’s two kinds of people in the world – those who look at ephemera and consider it trash, and those who consider it gold.

I’m in the gold camp.

I’d already been doing something like this for years, myself – when I was a teenager, I circled so many reader service cards and pulled in piles and piles of flyers and mailings from companies so fleeting or so weird, and I kept them. These became digitize.textfiles.com and later the reader service collection, which encapsulates digitize.textfiles.com completely. There’s well over a thousand pages in that collection, which I’ve scanned myself.

Ted, basically, did what I was doing, but with more breadth, more variety, and with a few decades more time.

And because he was always keeping an eye out on many possibilities for future fields of study, he kept his mind (and mailbox) open to a lot of industries. Manufacturing, engineering, film-making, printing, and of course “computers” as expressed in a thousand different ways. The mail dates from the 1960s through to the mid 2000s, and it’s friggin’ beautiful.

Here’s where it gets interesting, and where you come in.

There’s now a collection of scanned mail from this collection up at the Internet Archive. It’s called Ted Nelson’s Junk Mail and you can see the hundreds of scanned pages that will soon become thousands and maybe tens of thousands of scanned pages.

They’re separated by mailing, and over time the metadata and the contents will get better, increase in size, and hopefully provide decades of enjoyment for people.

The project is being coordinated by Kevin Savetz, who has hired a temp worker to scan in the pages across each weekday, going through the boxes and doing the “easy” stuff (8.5×11 sheets) which, trust me, is definitely worth going through first. As they’re scanned, they’re uploaded, and (for now) I am running scripts to add them as items to the Junk Mail collection.

The cost of doing this is roughly $80 a day, during which hundreds of pages can be scanned. We’re refining the process as we go, and expect it to get even more productive over time.

So, here’s where Archive Corps comes in; this is a pilot program for the new idea of Archive Corps, which is providing a funnel for all the amazing stuff out there to get scanned. If you want to see more stuff come from the operation that Kevin is running, he has a paypal address up at k@savetz.com – the more you donate the more days we are able to have the temp come in to scan.

I’m very excited to watch this collection grow, and see the massive variety of history that it will reveal. A huge thank-you to Ted Nelson for letting me take these items, and a thank-you to Kevin Savetz for coordinating.

Let’s enjoy some history!


A Lot of Doing —

If you follow this weblog, you saw there was a pause of a couple months. I’ve been busy! Better to do than to talk about doing.

A flood of posts are coming – they reflect accomplishments and thoughts of the last period of time, so don’t be freaked out as they pop up in your life very quickly.

Thanks.


Please Help Us Track Down Apple II Collections —

Please spread this as far as possible – I want to reach folks who are far outside the usual channels.

The Summary: Conditions are very, very good right now for easy, top-quality, final ingestion of original commercial Apple II Software and if you know people sitting on a pile of it or even if you have a small handful of boxes, please get in touch with me to arrange for the disks to be imaged: apple@textfiles.com.

The rest of this entry says this in much longer, hopefully compelling fashion.

We are in a golden age for Apple II history capture.

For now, and it won’t last (because nothing lasts), an incredible amount of interest and effort and tools are all focused on acquiring Apple II software, especially educational and engineering software, and ensuring it lasts another generation and beyond.

I’d like to take advantage of that, and I’d like your help.

Here’s the secret about Apple II software: Copy Protection Works.

Copy protection, that method of messing up easy copying from floppy disks, turns out to have been very effective at doing what it is meant to do – slow down the duplication of materials so a few sales can eke by. For anything but the most compelling, most universally interesting software, copy protection did a very good job of ensuring that only the approved disks that went out the door are the remaining extant copies for a vast majority of titles.

As programmers and publishers laid logic bombs and coding traps and took the brilliance of watchmakers and used it to design alternative operating systems, they did so to ensure people wouldn’t take the time to actually make the effort to capture every single bit off the drive and do the intense and exacting work to make it easy to spread in a reproducible fashion.

They were right.

So, obviously it wasn’t 100% effective at stopping people from making copies of programs, or so many people who used the Apple II wouldn’t remember the games they played at school or at user-groups or downloaded from AE Lines and BBSes, with pirate group greetings and modified graphics.

What happened is that pirates and crackers did what was needed to break enough of the protection on high-demand programs (games, productivity) to make them work. They used special hardware modifications to “snapshot” memory and pull out a program. They traced the booting of the program by stepping through its code and then snipped out the clever tripwires that freaked out if something wasn’t right. They tied it up into a bow so that instead of a horrendous 140 kilobyte floppy, you could have a small 15 or 20 kilobyte program instead. They even put multiple cracked programs together on one disk so you could get a bunch of cool programs at once.

I have an entire section of TEXTFILES.COM dedicated to this art and craft.

And one could definitely argue that the programs (at least the popular ones) were “saved”. They persisted, they spread, they still exist in various forms.

And oh, the crack screens!

I love the crack screens, and put up a massive pile of them here. Let’s be clear about that – they’re a wonderful, special thing and the amount of love and effort that went into them (especially on the Commodore 64 platform) drove an art form (demoscene) that I really love and which still thrives to this day.

But these aren’t the original programs and disks, and in some cases, not the originals by a long shot. What people remember booting in the 1980s were often distant cousins to the floppies that were distributed inside the boxes, with the custom labels and the nice manuals.


On the left is the title screen for Sabotage. It’s a little clunky and weird, but it’s also something almost nobody who played Sabotage back in the day ever saw; they only saw the instructions screen on the right. The reason for this is that there were two files on the disk, one for starting the title screen and then the game, and the other was the game. Whoever cracked it long ago only did the game file, leaving the rest as one might leave the shell of a nut.

I don’t think it’s terrible these exist! They’re art and history in their own right.

However… the mistake, which I completely understand making, is to see programs and versions of old Apple II software up on the Archive and say “It’s handled, we’re done here.” You might be someone with a small stack of Apple II software, newly acquired or decades old, and think you don’t have anything to contribute.

That’d be a huge error.

It’s a bad assumption because there’s a chance the original version of one of these programs, unseen since it was sold, is sitting in your hands. It’s a version different than the one everyone thinks is “the” version. It’s precious, it’s rare, and it’s facing the darkness.

There is incredibly good news, however.

I’ve mentioned some of these folks before, but there is now a powerful alliance of very talented developers and enthusiasts who have been pouring an enormous amount of skills into the preservation of Apple II software. You can debate if this is the best use of their (considerable) skills, but here we are.

They have been acquiring original commercial Apple II software from a variety of sources, including auctions, private collectors, and luck. They’ve been duplicating the originals at the bit level, then going in and “silent cracking” the software so that it can be played on an emulator or via the web emulation system I’ve been so hot on, with no change in operation except that it no longer fails due to copy protection.

With a “silent crack”, you don’t take the credit, you don’t make it about yourself – you just make it work, and work entirely like it did, without yanking out pieces of the code and program to make it smaller for transfer or to get rid of a section you don’t understand.

Most prominent of these is 4AM, who I have written about before. But there are others, and they’re all working together at the moment.

These folks, these modern engineering-minded crackers, are really good. Really, really good.

They’ve been developing tools from the ground up that are focused on silent cracks, on optimizing the process, and on allowing dozens, sometimes hundreds, of floppies to be evaluated automatically, reducing the workload. And they’re fast about it, especially when dealing with a particularly tough problem.

Take, for example, the efforts required to crack Pinball Construction Set, and marvel not just that it was done, but that a generous and open-minded article was written explaining exactly what was being done to achieve this.

This group can be handed a stack of floppies, image them, evaluate them, and find which have not yet been preserved in this fashion.

But there’s only one problem: They are starting to run out of floppies.

I should be clear that there’s plenty left in the current stack – hundreds of floppies are being processed. But I also have seen the effort chug along and we’ve been going through direct piles, then piles of friends, and then piles of friends of friends. We’ve had a few folks from outside the community bring stuff in, but those are way more scarce than they should be.

I’m working with a theory, you see.

My theory is that there are large collections of Apple II software out there. Maybe someone’s dad had a store long ago. Maybe someone took in boxes of programs over the years and they’re in the basement or attic. I think these folks are living outside the realm of the “Apple II Community” that currently exists (and which is a wonderful set of people, be clear). I’m talking about the difference between a fan club for surfboards and someone who has a massive set of surfboards because his dad used to run a shop and they’re all out in the barn.

A lot of what I do is put groups of people together and then step back to let the magic happen. This is a case where this amazingly talented group of people are currently a well-oiled machine – they help each other out, they are innovating along this line, and Apple II software is being captured in a world-class fashion, with no filtering being done because it’s some hot ware that everyone wants to play.

For example, piles and piles of educational software have returned from potential oblivion, because it’s about the preservation, not the title. Wonderfully done works are being brought back to life and are playable on the Internet Archive.

So like I said above, the message is this:

Conditions are very, very good right now for easy, top-quality, final ingestion of original commercial Apple II Software and if you know people sitting on a pile of it or even if you have a small handful of boxes, please get in touch with me to arrange for the disks to be imaged. apple@textfiles.com.

I’ll go on podcasts or do interviews, or chat with folks on the phone, or trade lots of e-mails discussing details. This is a very special time, and I feel the moment to act is now. Alliances and communities like these do not last forever, and we’re in a peak moment of talent and technical landscape to really make a dent in what are likely acres of unpreserved titles.

It’s 4am and nearly morning for Apple II software.

It’d be nice to get it all before we wake up.

 


Sandpapering Screenshots —

The collection I talked about yesterday was subjected to the Screen Shotgun, which does a really good job of playing the items, capturing screenshots, and uploading them into the item to allow people to easily see, visually, what they’re in for if they boot them up.

In general, the screen shotgun does the job well, but not perfectly. It doesn’t understand what it’s looking at, at all, and the method I use to decide the “canonical” screenshot is inherently shallow – I choose the largest filesize, because that tends to be the most “interesting”.
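The "largest file wins" rule is about as simple as it sounds. Here is a minimal sketch in Python. It is not the Screen Shotgun's actual code, and the function names and the .png pattern are assumptions made purely for illustration.

```python
from pathlib import Path


def largest_by_size(sizes: dict[str, int]) -> str:
    """Return the name whose recorded file size is largest."""
    if not sizes:
        raise ValueError("no screenshots to choose from")
    return max(sizes, key=sizes.get)


def pick_canonical(directory: str, pattern: str = "*.png") -> str:
    """Pick the biggest screenshot file in a directory (assumed layout)."""
    sizes = {str(p): p.stat().st_size for p in Path(directory).glob(pattern)}
    return largest_by_size(sizes)
```

The appeal is that it needs no understanding of the image at all, which is also exactly why it can be fooled.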

The bug in this is that if you have, say, these three screenshots:

…it’s going to choose the first one, because those middle-of-loading graphics for an animated title screen have tons of little artifacts, and the filesize is bigger. Additionally, the second is fine, but it’s not the “title”, the recognized “welcome to this program” image. So the best choice turns out to be the third.

I don’t know why I’d not done this sooner, but while waiting for 500 disks to screenshot, I finally wrote a program to show me all the screenshots taken for an item, and declare a replacement canonical title screenshot. The results have been way too much fun.

It turns out, doing this for Apple II programs in particular, where it’s removed the duplicates and is just showing you a gallery, is beautiful:

Again, the all-text “loading screen” in the middle, which is caused by blowing program data into screen memory, wins the “largest file” contest, but literally any other of the screens would be more appropriate.

This is happening all over the place: crack screens win over the actual main screen, the mid-loading noise of Apple II programs wins over the final clean image, and so on.
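The review step described above boils down to two things: collapse identical screenshots into a gallery, then let a human overrule the automatic pick. A rough sketch of that idea follows, assuming screenshots are held in memory as name-to-bytes pairs; the function names are hypothetical and this is not the program I actually wrote.

```python
import hashlib
from typing import Optional


def unique_screenshots(shots: dict[str, bytes]) -> list[str]:
    """Keep the first screenshot of each distinct image, in name order."""
    seen: set[str] = set()
    unique: list[str] = []
    for name in sorted(shots):
        digest = hashlib.sha1(shots[name]).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(name)
    return unique


def final_title(shots: dict[str, bytes], automatic: str,
                human: Optional[str] = None) -> str:
    """Use the human's choice when one is given, else the automatic pick."""
    choice = human if human is not None else automatic
    if choice not in shots:
        raise KeyError(choice)
    return choice
```

Exact-byte hashing only removes literal duplicates; two frames that differ by a single pixel will both show up in the gallery, which is fine when a person is doing the final pick anyway.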

Working with tens of thousands of software programs, primarily alone, means that I’m trying to find automation wherever I can. I can’t personally boot up each program and do the work needed to screenshot/describe it – if a machine can do anything, I’ll make the machine do it. People will come to me with fixes or changes if the results are particularly ugly, but it does leave a small amount that no amount of automation is likely to catch.

If you watch a show or documentary on factory setups and assembly lines, you’ll notice they can’t quite get rid of people along the entire line, especially the sign-off. Someone has to keep an eye to make sure it’s not going all wrong, or, even more interestingly, a table will come off the line and you see one person giving it a quick run-over with sandpaper, just to pare down the imperfections or missed spots of the machine. You still did an enormous amount of work with no human effort, but if you think that’s ready for the world with no final sign-off, you’re kidding yourself.

So while it does mean another hour or two looking at a few hundred screenshots, it’s nice to know I haven’t completely automated away the pleasure of seeing some vintage computer art, for my work, and for the joy of it.


Thoughts on a Collection: Apple II Floppies in the Realm of the Now —

I was connected with The 3D0G Knight, a long-retired Apple II pirate/collector who had built up a set of hundreds of floppy disks acquired from many different locations and friends decades ago. He generously sent me his entire collection to ingest into a more modern digital format, as well as the Internet Archive’s software archive.

The floppies came in a box without any sort of sleeves for them, with what turned out to be roughly 350 of them removed from “ammo boxes” by 3D0G from his parents’ house. The disks all had labels of some sort, and a printed index came along with it all, mapped to the unique disk ID/Numbers that had been carefully put on all of them years ago. I expect this was months of work at the time.

Each floppy is 140k of data on each side, and in this case, all the floppies had been single-sided and clipped with an additional notch with a hole punch to allow the second side to be used as well.

Even though they’re packed a little strangely, there was no damage anywhere, nothing bent or broken or ripped, and all the items were intact. It looked to be quite the bonanza of potentially new vintage software.

So, this activity is at the crux of the work going on with both the older software on the Internet Archive and what I’m doing with web browser emulation and increasing easy access to the works of old. The most important thing, over everything else, is to close the air gap – get the data off these disappearing floppy disks and into something online where people or scripts can benefit from them and research them. Almost everything else – scanning of cover art, ingestion of metadata, pulling together the history of a company or cross-checking what titles had which collaborators… that has nowhere near the expiration date of the magnetized coated plastic disks going under. This needs us and it needs us now.

The way that things currently work with Apple II floppies is to separate them into two classes: Disks that Just Copy, and Disks That Need A Little Love. The Little Love disks, when found, are packed up and sent off to one of my collaborators, 4AM, who has the tools and the skills to get data off particularly tenacious floppies, as well as doing “silent cracks” of commercial floppies to preserve what’s on them as best as possible.

Doing the “Disks that Just Copy” is a mite easier. I currently have an Apple II system on my desk that connects via USB-to-serial connection to my PC. There, I run a program called Apple Disk Transfer that basically turns the Apple into a Floppy Reading Machine, with a pretty interface and everything.

Apple Disk Transfer (ADT) has been around a very long time and knows what it’s doing – a floppy disk with no trickery on the encoding side can be ripped out and transferred to a “.DSK” file on the PC in about 20 seconds. If there’s something wrong with the disk in terms of being an easy read, ADT is very loud about it. I can do other things while reading floppies, and I end up with a whole pile of filenames when it’s done. The workflow, in other words, isn’t so bad as long as the floppies aren’t in really bad shape. In this particular set, the floppies were in excellent shape, except when they weren’t, and the vast majority fell into the “excellent” camp.

The floppy drive that sits at the middle of this looks like some sort of nightmare, but it helps to understand that with Apple II floppy drives, you really have to have the cover removed at all times, because you will be constantly checking the read head for dust, smudges, and so on. Unscrewing the whole mess and putting it back together for looks just doesn’t scale. It’s ugly, but it works.

It took me about three days (while doing lots of other stuff) but in the end I had 714 .dsk images pulled from both sides of the floppies, which works out to 357 floppy disks successfully imaged. Another 20 or so are going to get a once over but probably are going to go into 4am’s hands to get final evaluation. (Some of them may in fact be blank, but were labelled in preparation, and so on.) 714 is a lot to get from one person!

As mentioned, an Apple II 5.25″ floppy disk image is pretty much always 140k. The names of the floppies are mine, taken off the label, or added based on glancing inside the disk image after it’s done. For a quick glance, I use either an Apple II emulator called Applewin, or the fantastically useful Apple II disk image investigator Ciderpress, which is frankly the gold standard for what should be out there for every vintage disk/cartridge/cassette image. As might be expected, labels don’t always match contents. C’est la vie.
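That 140k figure falls out of the usual disk geometry. A quick sketch, assuming the standard 16-sector layout (35 tracks of 16 sectors, 256 bytes per sector); older 13-sector disks and nonstandard or copy-protected formats will not fit this arithmetic, and the function name is made up for illustration.

```python
TRACKS = 35
SECTORS_PER_TRACK = 16
BYTES_PER_SECTOR = 256

# 35 * 16 * 256 bytes, which is exactly 140 * 1024.
EXPECTED_DSK_BYTES = TRACKS * SECTORS_PER_TRACK * BYTES_PER_SECTOR


def is_standard_dsk_size(size_in_bytes: int) -> bool:
    """True when a file is the size of a plain 16-sector disk image."""
    return size_in_bytes == EXPECTED_DSK_BYTES
```

A size check like this is a cheap first sanity test on a freshly transferred batch: anything that is not the expected size deserves a second look before it goes any further.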

As for the contents of the disks themselves, this comes down to what the “standard collection” was for an Apple II user in the 1980s who wasn’t afraid to let their software library grow utilizing less than legitimate circumstances. Instead of an elegant case of shiny, professionally labelled floppy diskettes, we get a scribbled, messy, organic collection of all range of “warez” with no real theme. There’s games, of course, but there’s also productivity, utilities, artwork, and one-off collections of textfiles and documentation. Games that were “cracked” down into single-file payloads find themselves with 4-5 other unexpected housemates and sitting behind a menu. A person spending the equivalent of $50-$70 per title might be expected to have a relatively small and distinct library, but someone who is meeting up with friends or associates and duplicating floppies over a few hours will just grab bushels of strange.

The result of the first run is already up on the Archive: A 37 Megabyte .ZIP file containing all the images I pulled off the floppies. 

In terms of what will be of relevance to later historians, researchers, or collectors, that zip file is probably the best way to go – it’s not munged up with the needs of the Archive’s structure, and is just the disk images and nothing else.

This single .zip archive might be sufficient for a lot of sites (go git ‘er!) but as mentioned infinite times before, there is a very strong ethic across the Internet Archive’s software collection to make things as accessible as possible, and hence there are nearly 500 items in the “3D0G Knight Collection” besides the “download it all” item.

The rest of this entry talks about why it’s 500 and not 714, and how it is put together, and the rest of my thoughts on this whole endeavor. If you just want to play some games online or pull a 37mb file and run, cackling happily, into the night, so be it.

The relatively small number of people who have exceedingly hard opinions on how things “should be done” in the vintage computing space will also want to join the folks who are pulling the 37mb file. Everything else done by me after the generation of the .zip file is in service of the present and near future. The items that number in the hundreds on the Archive that contain one floppy disk image and interaction with it are meant for people to find now. I want someone to have a vague memory of a game or program once interacted with, and if possible, to find it on the Archive. I also like people browsing around randomly until something catches their eye and to be able to leap into the program immediately.

To those ends, and as an exercise, I’ve acquired or collaborated on scripts to do the lion’s share of analysis on software images to prep them for this living museum. These scripts get it “mostly” right, and the rough edges they bring in from running are easily smoothed over by a microscopic amount of post-processing manual attention, like running a piece of sandpaper over a machine-made joint.

Again, we started out with 714 disk images. The first thing done was to run them against a script that has hash checksums for every exposed Apple II disk image on the Archive, which now number over 10,000. Doing this dropped the “uniquely new” disk images from 714 to 667.
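The check against already-known images can be sketched in a few lines. This is an illustration only, not the script I run: the choice of SHA-1, the in-memory dictionaries, and the function names are all assumptions.

```python
import hashlib


def sha1_hex(data: bytes) -> str:
    """Hex SHA-1 digest of a disk image's raw bytes."""
    return hashlib.sha1(data).hexdigest()


def find_new_images(images: dict[str, bytes], known: set[str]) -> list[str]:
    """Return names of images whose hash is not already known.

    Duplicates inside the incoming batch are also dropped, keeping the
    first one in name order.
    """
    seen = set(known)
    fresh: list[str] = []
    for name in sorted(images):
        digest = sha1_hex(images[name])
        if digest not in seen:
            seen.add(digest)
            fresh.append(name)
    return fresh
```

In this run, a pass like that is what took the pile from 714 images to 667, meaning 47 were byte-for-byte copies of something already accounted for.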

Next, I grouped disk images that are part of the same product into one item: if a paint program has two floppy disk images for each of the sides of its disk, those become a single item. In one or two cases, the program spans multiple floppies, so 4-8 (and in one case, 14!) floppy images become a single item. Doing this dropped the total from 667 to 495 unique items. That’s why the number is significantly smaller than the original total.
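Folding sides and multi-disk sets into one item is mostly a naming exercise. The sketch below assumes a hypothetical filename convention such as "Title (Side A).dsk" or "Title (Disk 2 Side B).dsk"; the real names came off handwritten labels and needed human judgment, so treat the pattern and function names as invented for the example.

```python
import re
from collections import defaultdict

# Hypothetical suffixes: "(Side A)", "(Disk 2)", "(Disk 2 Side B)".
_SUFFIX = re.compile(
    r"\s*\((?:disk\s*\d+\s*)?(?:side\s*[ab12])?\)\s*$", re.IGNORECASE
)


def product_key(filename: str) -> str:
    """Strip the extension and any side/disk suffix, then normalize case."""
    stem = filename.rsplit(".", 1)[0]
    return _SUFFIX.sub("", stem).strip().lower()


def group_images(filenames: list[str]) -> dict[str, list[str]]:
    """Map each product key to the sorted list of images that share it."""
    groups: dict[str, list[str]] = defaultdict(list)
    for name in filenames:
        groups[product_key(name)].append(name)
    return {key: sorted(names) for key, names in groups.items()}
```

Each key in the result would become one item on the Archive, with all of its disk images attached.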

Let’s talk for a moment about this.

Using hashes and comparing them is the roughest of rough approaches to de-duplicating software items. I do it with Apple II images because they tend to be self contained (a single .dsk file) and because Apple II software has a lot of people involved in it. I’m not alone by any means in acquiring these materials and I’m certainly not alone in terms of work being done to track down all the unique variations and most obscure and nearly lost packages written for this platform. If I was the only person in the world (or one of a tiny sliver) working on this I might be super careful with each and every item to catalog it – but I’m absolutely not; I count at least a half-dozen operations involved in Apple II floppy image ingestion.

And as a bonus, it’s a really nice platform. When someone puts their heart into an Apple II program, it rewards them and the end user as well – the graphics can be charming, the program flow intuitive, and the whole package just gleams on the screen. It’s rewarding to work with this corpus, so I’m using it as a test bed for all these methods, including using hashes.

But hash checksums are seriously not the be-all for this work. Anything can make a hash different – an added file, a modified bit, or a compilation of already-on-the-archive-in-a-hundred-places files that just happen to be grouped up slightly differently than others. That said, it’s not overwhelming – you can read about what’s on a floppy and decide what you want pretty quickly; gigabytes will not be lost and the work to track down every single unique file has potential but isn’t necessary yet.

(For the people who care, the Internet Archive generates three different hashes (md5, crc32, sha1) and lists the size of the file – looking across all of those for comparison is pretty good for ensuring you probably have something new and unique.)
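If you want to produce the same four values locally for comparison, Python's standard library covers all of them. A small sketch follows; the dictionary layout and function name are my own choices for illustration, not the Archive's metadata format.

```python
import hashlib
import zlib


def fingerprint(data: bytes) -> dict[str, object]:
    """Size plus md5, crc32, and sha1 of a file's bytes, as hex strings."""
    return {
        "size": len(data),
        "md5": hashlib.md5(data).hexdigest(),
        "crc32": format(zlib.crc32(data) & 0xFFFFFFFF, "08x"),
        "sha1": hashlib.sha1(data).hexdigest(),
    }


def probably_same_file(a: bytes, b: bytes) -> bool:
    """True only when every one of the four values matches."""
    return fingerprint(a) == fingerprint(b)
```

Requiring all four to agree is overkill for honest duplicates, but it costs almost nothing on a 140k file.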

Once the items are up there, the Screen Shotgun whips into action. It plays the programs in the emulator, takes screenshots, leafs off the unique ones, and then assembles it all into a nice package. Again, not perfect but left alone, it does the work with no human intervention and gets things generally right. If you see a screenshot in this collection, a robot did it and I had nothing to do with it.

This leads, of course, to scaring out which programs are a tad not-bootable, and by that I mean that they boot up in the emulator and the emulator sees them and all, but the result is not that satisfying:

On a pure accuracy level, this is doing exactly what it’s supposed to – the disk wasn’t ever a properly packaged, self-contained item, and it needs a boot disk to go in the machine first before you swap the floppy. I intend to work with volunteers to help with this problem, but here is where it stands.

The solution in the meantime is a Java program modified by Kevin Savetz, which analyzes the floppy disk image and prints all the disk information it can find, including the contents of BASIC programs and textfiles. Here’s a non-booting disk where this worked out. The result is that this all gets ingested into the search engine of the Archive, and so if you’re looking for a file within the disk images, there’s a chance you’ll be able to find it.
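To give a flavor of what "printing the disk information" involves, here is a much smaller sketch in Python that lists the file catalog of a disk image. It is not Kevin's Java program and does nothing like as much. It assumes a standard DOS 3.3 disk stored in DOS sector order (the usual .dsk layout), with the table of contents on track 17; ProDOS disks, Pascal disks, and anything copy-protected will not parse this way.

```python
SECTOR_SIZE = 256
SECTORS_PER_TRACK = 16
IMAGE_SIZE = 35 * SECTORS_PER_TRACK * SECTOR_SIZE
VTOC_TRACK = 17          # DOS 3.3 keeps its table of contents here
ENTRY_SIZE = 35          # bytes per catalog entry
ENTRIES_PER_SECTOR = 7
FIRST_ENTRY_OFFSET = 0x0B
FILE_TYPES = {0x00: "T", 0x01: "I", 0x02: "A", 0x04: "B"}


def read_sector(image: bytes, track: int, sector: int) -> bytes:
    """Return one 256-byte sector from a DOS-ordered image."""
    offset = (track * SECTORS_PER_TRACK + sector) * SECTOR_SIZE
    return image[offset:offset + SECTOR_SIZE]


def list_catalog(image: bytes) -> list[tuple[str, str, int]]:
    """Return (name, type letter, length in sectors) for each file."""
    if len(image) != IMAGE_SIZE:
        raise ValueError("not a standard 140k disk image")
    vtoc = read_sector(image, VTOC_TRACK, 0)
    track, sector = vtoc[1], vtoc[2]      # first catalog sector
    visited: set[tuple[int, int]] = set()
    files: list[tuple[str, str, int]] = []
    while track != 0 and (track, sector) not in visited:
        visited.add((track, sector))      # guard against looping chains
        cat = read_sector(image, track, sector)
        if len(cat) < SECTOR_SIZE:
            break                         # pointer ran off the image
        for i in range(ENTRIES_PER_SECTOR):
            start = FIRST_ENTRY_OFFSET + i * ENTRY_SIZE
            entry = cat[start:start + ENTRY_SIZE]
            if entry[0] in (0x00, 0xFF):  # never used, or deleted
                continue
            # Names are 30 characters stored with the high bit set.
            name = bytes(b & 0x7F for b in entry[3:33]).decode("ascii").rstrip()
            kind = FILE_TYPES.get(entry[2] & 0x7F, "?")  # high bit = locked
            length = entry[33] | (entry[34] << 8)
            files.append((name, kind, length))
        track, sector = cat[1], cat[2]    # link to next catalog sector
    return files
```

Even a bare listing like this is useful as search fodder: a disk that refuses to boot can still tell you it holds an Applesoft program called HELLO and a dozen text files.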

Once the robots have their way with all the items, I can go in and fix a few things, like screenshots that went south, or descriptions and titles that don’t reflect what actually boots up. The amount of work I, a single person, have to do is therefore reduced to something manageable.

I think this all works well enough for the contemporary vintage software researcher and end user. Perhaps that opinion is not universal.

What I can say, however, is that the core action here – of taking data away from a transient and at-risk storage medium and putting it into a slightly less transient, less at-risk storage medium – is 99% of the battle. To have the will to do it, to connect with the people who have these items around and to show them it’ll be painless for them, and to just take the time to shove floppies into a drive and read them, hundreds of times… that’s the huge mountain to climb right now. I no longer have particularly deep concerns about technology failing to work with these digital images, once they’re absorbed into the Internet. It’s this current time, out in the cold, unknown and unloved, that they’re the most at risk.

The rest, I’m going to say, is gravy.

I’ll talk more about exactly how tasty and real that gravy is in the future, but for now, please take a pleasant walk in the 3D0G Knight’s Domain.