ASCII by Jason Scott

The Bounty of the Ted Nelson Junk Mail — September 10, 2017

At the end of May, I mentioned the Ted Nelson Junk Mail project, where a group of people were scanning in boxes of mailings and pamphlets collected by Ted Nelson and putting them on the Internet Archive. Besides the uniqueness of the content, it was also unique in that we were trying to set it up to be self-sustaining from volunteer monetary contributions, and to compensate the scanners doing the work.

This entire endeavor has been wildly successful.

We are well past 18,000 pages scanned. We have taken in thousands in donations. And we now have three people scanning and one person entering metadata.

Here is the spreadsheet with transparency and donation information.

I highly encourage donating.

But let’s talk about how this collection continues to be amazing.

Always, there are the pure visuals. As we’re scanning away, we’re starting to see trends in what we have, and everything seems to go from the early 1960s to the early 1990s, a 30-year scope that encompasses a lot of companies and a lot of industries. These companies are trying to thrive in a whirlpool of competing attention, especially in certain technical fields, and they try everything from humor to class to rudimentary fear-and-uncertainty plays in the art.

These are exquisitely designed brochures, in many cases – obviously done by a firm or with an in-house group specifically tasked with making the best possible paper invitations and with little expense spared. After all, this might be the only customer-facing communication a company could have about its products, and might be the best convincing literature after the salesman has left or the envelope is opened.

Scanning at 600dpi has been a smart move – you can really zoom in and see detail, find lots to play with or study or copy. Everything is at this level, like this detail about a magnetic eraser that lets you see the lettering on the side.

Going after these companies for gender roles or other out-of-fashion jokes almost feels like punching down, but yeah, there’s a lot of it. Women draped over machines, assumptions that women will be doing the typing, and clunky humor about fulfilling your responsibilities as a (male) boss abounds. Cultural norms regarding what fears reigned in business or how companies were expected to keep on top of the latest trends are baked in there too.

The biggest obstacle going forward, besides bringing attention to this work, is going to be one of findability. The collection is not based on some specific subject matter other than what attracted Ted’s attention over the decades. He tripped lightly among aerospace, lab science, computers, electronics, publishing… nothing escaped his grasp, especially in technical fields.

If people are looking for pure aesthetic beauty, that is, “here’s a drawing of something done in a very old way” or “here are old fonts”, then this bounty is already, at 1,700 items, a treasure trove that could absorb weeks of your time. Just clicking around to items that on first blush seem to have boring title pages will often expand into breathtaking works of art and design.

I’m not worried about that part, frankly – these kind of sell themselves.

But there’s so much more to find among these pages, and as we’re now up to so many examples, it’s going to be a challenge to get researching folks to find them.

We have the keywording active, so you can search for terms like monitor, circuit, or hypercard and get more specific matches without concentrating on what the title says or what graphics appear on the front. The Archive has a full-text search, and so people looking for phrases will no doubt stumble into this collection.

But how easily will people even think to know about a wristwatch for the Macintosh from 1990, a closed circuit camera called the Handy Looky.. or this little graphic, nestled away inside a bland software catalog:

…I don’t know. I’ll mention that this is actually twitter-fodder among archivists, who are unhappy when someone is described as “discovering” something in the archives, when it was obvious a person cataloged it and put it there.

But that’s not the case here. Even Kyle, who’s doing the metadata, is doing so in a descriptive fashion, and on a rough day of typing in descriptions, he might not particularly highlight unique gems in the pile (he often does, though). So, if you discover them in there, you really did discover them.

So, the project is deep, delightful, and successful. The main consideration of this is funding; we are paying the scanners $10/hr to scan and the metadata is $15/hr. They work fast and efficiently. We track them on the spreadsheet. But that means a single day of this work can cause a notable bill. We’re asking people on twitter to raise funds, but it never hurts to ask here as well. Consider donating to this project, because we may not know for years how much wonderful history is saved here.

Please share the jewels you find.


4 Months! — September 9, 2017

It’s been 4 months since my last post! That’s one busy little Jason summer, to be sure.

Obviously, I’m still around, so no heart attack lingering or problems. My doctor told me that my heart is basically healed, and he wants more exercise out of me. My diet’s continued to be lots of whole foods, leafy greens and occasional shameful treats that don’t turn into a staple.

I spent a good month working with good friends to clear out the famous Information Cube, sorting out and mailing/driving away all the contents to other institutions, including the Internet Archive, the Strong Museum of Play, the Vintage Computer Federation, and parts worldwide.

I’ve moved homes, no longer living with my brother after seven up-and-down years of siblings sharing a house. It was time! We’re probably not permanently scarred! I love him very much. I now live in an apartment with very specific landlords with rules and an important need to pay them on time each and every month.

To that end, I’ve cut back on my expenses and will continue to, so it’s the end of me “just showing up” to pretty much any conferences that I’m not being compensated for, which will of course cut things down in terms of Jason appearances you can find me at.

I’ll still be making appearances as people ask me to go, of course – I love travel. I’m speaking in Amsterdam in October, and I’ll be an emcee at the Internet Archive that same month. So we’ll see how that goes.

What that means is more media ingestion work, and more work on the remaining two documentaries. I’m going to continue my goal of clearing my commitments before long, so I can choose what I do next.

What follows will be (I hope) lots of entries going deep into some subjects and about what I’m working on, and I thank you for your patience as I was not writing weblog entries while upending my entire life.

To the future!


Ted Nelson’s Junk Mail (and the Archive Corps Pilot) — May 31, 2017

I’ve been very lucky over the past few months to dedicate a few days here and there to helping legend Ted Nelson sort through his archives. We’ve known each other for a bunch of years now, but it’s always a privilege to get a chance to hang with Ted and especially to help him with auditing and maintaining his collection of papers, notes, binders, and items. It also helps that it’s in pretty fantastic shape to begin with.

Along with sorting comes some discarding – mostly old magazines and books; they’re being donated wherever it makes sense to. Along with these items was junk mail that Ted got over the decades.

About that junk mail….

After glancing through it, I requested to keep it and take it home. There was a lot of it, and even going through it with a cursory view showed me it was priceless.

There’s two kinds of people in the world – those who look at ephemera and consider it trash, and those who consider it gold.

I’m in the gold camp.

I’d already been doing something like this for years, myself – when I was a teenager, I circled so many reader service cards and pulled in piles and piles of flyers and mailings from companies so fleeting or so weird, and I kept them. These became digitize.textfiles.com and later the reader service collection, which encapsulates digitize.textfiles.com completely. There’s well over a thousand pages in that collection, which I’ve scanned myself.

Ted, basically, did what I was doing, but with more breadth, more variety, and with a few decades more time.

And because he was always keeping an eye out on many possibilities for future fields of study, he kept his mind (and mailbox) open to a lot of industries. Manufacturing, engineering, film-making, printing, and of course “computers” as expressed in a thousand different ways. The mail dates from the 1960s through to the mid 2000s, and it’s friggin’ beautiful.

Here’s where it gets interesting, and where you come in.

There’s now a collection of scanned mail from this collection up at the Internet Archive. It’s called Ted Nelson’s Junk Mail and you can see the hundreds of scanned pages that will soon become thousands and maybe tens of thousands of scanned pages.

They’re separated by mailing, and over time the metadata and the contents will get better, increase in size, and hopefully provide decades of enjoyment for people.

The project is being coordinated by Kevin Savetz, who has hired a temp worker to scan in the pages across each weekday, going through the boxes and doing the “easy” stuff (8.5×11 sheets) which, trust me, is definitely worth going through first. As they’re scanned, they’re uploaded, and (for now) I am running scripts to add them as items to the Junk Mail collection.

The cost of doing this is roughly $80 a day, during which hundreds of pages can be scanned. We’re refining the process as we go, and expect it to get even more productive over time.

So, here’s where Archive Corps comes in; this is a pilot program for the new idea of Archive Corps, which is providing a funnel for all the amazing stuff out there to get scanned. If you want to see more stuff come from the operation that Kevin is running, he has a paypal address up at k@savetz.com – the more you donate the more days we are able to have the temp come in to scan.

I’m very excited to watch this collection grow, and see the massive variety of history that it will reveal. A huge thank-you to Ted Nelson for letting me take these items, and a thank-you to Kevin Savetz for coordinating.

Let’s enjoy some history!


A Lot of Doing — May 28, 2017

If you follow this weblog, you saw there was a pause of a couple months. I’ve been busy! Better to do than to talk about doing.

A flood of posts are coming – they reflect accomplishments and thoughts of the last period of time, so don’t be freaked out as they pop up in your life very quickly.

Thanks.


Please Help Us Track Down Apple II Collections — March 20, 2017

Please spread this as far as possible – I want to reach folks who are far outside the usual channels.

The Summary: Conditions are very, very good right now for easy, top-quality, final ingestion of original commercial Apple II Software and if you know people sitting on a pile of it or even if you have a small handful of boxes, please get in touch with me to arrange the disks to be imaged. apple@textfiles.com.

The rest of this entry says this in much longer, hopefully compelling fashion.

We are in a golden age for Apple II history capture.

For now, and it won’t last (because nothing lasts), an incredible amount of interest and effort and tools are all focused on acquiring Apple II software, especially educational and engineering software, and ensuring it lasts another generation and beyond.

I’d like to take advantage of that, and I’d like your help.

Here’s the secret about Apple II software: Copy Protection Works.

Copy protection, that method of messing up easy copying from floppy disks, turns out to have been very effective at doing what it is meant to do – slow down the duplication of materials so a few sales can eke by. For anything but the most compelling, most universally interesting software, copy protection did a very good job of ensuring that only the approved disks that went out the door are the remaining extant copies for a vast majority of titles.

As programmers and publishers laid logic bombs and coding traps and took the brilliance of watchmakers and used it to design alternative operating systems, they did so to ensure people wouldn’t take the time to actually make the effort to capture every single bit off the drive and do the intense and exacting work to make it easy to spread in a reproducible fashion.

They were right.

So, obviously it wasn’t 100% effective at stopping people from making copies of programs, or so many people who used the Apple II wouldn’t remember the games they played at school or at user-groups or downloaded from AE Lines and BBSes, with pirate group greetings and modified graphics.

What happened is that pirates and crackers did what was needed to break enough of the protection on high-demand programs (games, productivity) to make them work. They used special hardware modifications to “snapshot” memory and pull out a program. They traced the booting of the program by stepping through its code and then snipped out the clever tripwires that freaked out if something wasn’t right. They tied it up into a bow so that instead of a horrendous 140 kilobyte floppy, you could have a small 15 or 20 kilobyte program instead. They even put multiple cracked programs together on one disk so you could get a bunch of cool programs at once.

I have an entire section of TEXTFILES.COM dedicated to this art and craft.

And one could definitely argue that the programs (at least the popular ones) were “saved”. They persisted, they spread, they still exist in various forms.

And oh, the crack screens!

I love the crack screens, and put up a massive pile of them here. Let’s be clear about that – they’re a wonderful, special thing and the amount of love and effort that went into them (especially on the Commodore 64 platform) drove an art form (demoscene) that I really love and which still thrives to this day.

But these aren’t the original programs and disks, and in some cases, not the originals by a long shot. What people remember booting in the 1980s were often distant cousins to the floppies that were distributed inside the boxes, with the custom labels and the nice manuals.

On the left is the title screen for Sabotage. It’s a little clunky and weird, but it’s also something almost nobody who played Sabotage back in the day ever saw; they only saw the instructions screen on the right. The reason for this is that there were two files on the disk, one for starting the title screen and then the game, and the other was the game. Whoever cracked it long ago only did the game file, leaving the rest as one might leave the shell of a nut.

I don’t think it’s terrible these exist! They’re art and history in their own right.

However… the mistake, which I completely understand making, is to see programs and versions of old Apple II software up on the Archive and say “It’s handled, we’re done here.” You might be someone with a small stack of Apple II software, newly acquired or decades old, and think you don’t have anything to contribute.

That’d be a huge error.

It’s a bad assumption because there’s a chance the original version of one of these programs, unseen since it was sold, is sitting in your hands. It’s a version different than the one everyone thinks is “the” version. It’s precious, it’s rare, and it’s facing the darkness.

There is incredibly good news, however.

I’ve mentioned some of these folks before, but there is now a powerful alliance of very talented developers and enthusiasts who have been pouring an enormous amount of skill into the preservation of Apple II software. You can debate if this is the best use of their (considerable) skills, but here we are.

They have been acquiring original commercial Apple II software from a variety of sources, including auctions, private collectors, and luck. They’ve been duplicating the originals on a bits level, then going in and “silent cracking” the software so that it can be played on an emulator or via the web emulation system I’ve been so hot on, and not have any change in operation, except for not failing due to copy protection.

With a “silent crack”, you don’t take the credit, you don’t make it about yourself – you just make it work, and work entirely like it did, without yanking out pieces of the code and program to make it smaller for transfer or to get rid of a section you don’t understand.

Most prominent of these is 4AM, who I have written about before. But there are others, and they’re all working together at the moment.

These folks, these modern engineering-minded crackers, are really good. Really, really good.

They’ve been developing tools from the ground up that are focused on silent cracks, on optimizing the process, and on allowing dozens, sometimes hundreds of floppies to be evaluated automatically, reducing the workload. And they’re fast about it, especially when dealing with a particularly tough problem.

Take, for example, the efforts required to crack Pinball Construction Set, and marvel not just that it was done, but that a generous and open-minded article was written explaining exactly what was being done to achieve this.

This group can be handed a stack of floppies, image them, evaluate them, and find which have not yet been preserved in this fashion.

But there’s only one problem: They are starting to run out of floppies.

I should be clear that there’s plenty left in the current stack – hundreds of floppies are being processed. But I also have seen the effort chug along and we’ve been going through direct piles, then piles of friends, and then piles of friends of friends. We’ve had a few folks from outside the community bring stuff in, but those are way more scarce than they should be.

I’m working with a theory, you see.

My theory is that there are large collections of Apple II software out there. Maybe someone’s dad had a store long ago. Maybe someone took in boxes of programs over the years and they’re in the basement or attic. I think these folks are living outside the realm of the “Apple II Community” that currently exists (and which is a wonderful set of people, be clear). I’m talking about the difference between a fan club for surfboards and someone who has a massive set of surfboards because his dad used to run a shop and they’re all out in the barn.

A lot of what I do is put groups of people together and then step back to let the magic happen. This is a case where this amazingly talented group of people are currently a well-oiled machine – they help each other out, they are innovating along this line, and Apple II software is being captured in a world-class fashion, with no filtering being done because it’s some hot ware that everyone wants to play.

For example, piles and piles of educational software has returned from potential oblivion, because it’s about the preservation, not the title. Wonderfully done works are being brought back to life and are playable on the Internet Archive.

So like I said above, the message is this:

Conditions are very, very good right now for easy, top-quality, final ingestion of original commercial Apple II Software and if you know people sitting on a pile of it or even if you have a small handful of boxes, please get in touch with me to arrange the disks to be imaged. apple@textfiles.com.

I’ll go on podcasts or do interviews, or chat with folks on the phone, or trade lots of e-mails discussing details. This is a very special time, and I feel the moment to act is now. Alliances and communities like these do not last forever, and we’re in a peak moment of talent and technical landscape to really make a dent in what are likely acres of unpreserved titles.

It’s 4am and nearly morning for Apple II software.

It’d be nice to get it all before we wake up.


Sandpapering Screenshots — March 16, 2017

The collection I talked about yesterday was subjected to the Screen Shotgun, which does a really good job of playing the items, capturing screenshots, and uploading them into the item to allow people to easily see, visually, what they’re in for if they boot them up.

In general, the screen shotgun does the job well, but not perfectly. It doesn’t understand what it’s looking at, at all, and the method I use to decide the “canonical” screenshot is inherently shallow – I choose the largest filesize, because that tends to be the most “interesting”.

The bug in this is that if you have, say, these three screenshots:

…it’s going to choose the first one, because those middle-of-loading graphics for an animated title screen have tons of little artifacts, and the filesize is bigger. Additionally, the second is fine, but it’s not the “title”, the recognized “welcome to this program” image. So the best choice turns out to be the third.
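To make that failure concrete, here is a minimal Python sketch of the "largest file wins" rule. This is not the actual Screen Shotgun code; the function name, filenames, and byte counts are all invented for illustration.

```python
from typing import Iterable, Tuple


def pick_canonical(screenshots: Iterable[Tuple[str, int]]) -> str:
    """Return the filename of the screenshot with the largest file size."""
    return max(screenshots, key=lambda item: item[1])[0]


# Hypothetical screenshots for one item: (filename, size in bytes).
shots = [
    ("loading_noise.png", 9200),  # mid-load artifacts compress poorly
    ("instructions.png", 4100),   # fine, but not the title
    ("title.png", 3600),          # the one a person would pick
]

chosen = pick_canonical(shots)
print(chosen)
```

The noisy mid-load frame wins purely on size, which is why a human pass over the gallery is still worth the time.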

I don’t know why I’d not done this sooner, but while waiting for 500 disks to screenshot, I finally wrote a program to show me all the screenshots taken for an item, and declare a replacement canonical title screenshot. The results have been way too much fun.

It turns out, doing this for Apple II programs in particular, where it’s removed the duplicates and is just showing you a gallery, is beautiful:

Again, the all-text “loading screen” in the middle, which is caused by blowing program data into screen memory, wins the “largest file” contest, but literally any other of the screens would be more appropriate.

This is happening all over the place: crack screens win over the actual main screen, the mid-loading noise of Apple II programs wins over the final clean image, and so on.

Working with tens of thousands of software programs, primarily alone, means that I’m trying to find automation wherever I can. I can’t personally boot up each program and do the work needed to screenshot/describe it – if a machine can do anything, I’ll make the machine do it. People will come to me with fixes or changes if the results are particularly ugly, but it does leave a small amount that no amount of automation is likely to catch.

If you watch a show or documentary on factory setups and assembly lines, you’ll notice they can’t quite get rid of people along the entire line, especially the sign-off. Someone has to keep an eye to make sure it’s not going all wrong, or, even more interestingly, a table will come off the line and you see one person giving it a quick run-over with sandpaper, just to pare down the imperfections or missed spots of the machine. You still did an enormous amount of work with no human effort, but if you think that’s ready for the world with no final sign-off, you’re kidding yourself.

So while it does mean another hour or two looking at a few hundred screenshots, it’s nice to know I haven’t completely automated away the pleasure of seeing some vintage computer art, for my work, and for the joy of it.


Thoughts on a Collection: Apple II Floppies in the Realm of the Now — March 15, 2017

I was connected with The 3D0G Knight, a long-retired Apple II pirate/collector who had built up a set of hundreds of floppy disks acquired from many different locations and friends decades ago. He generously sent me his entire collection to ingest into a more modern digital format, as well as the Internet Archive’s software archive.

The floppies came in a box without any sort of sleeves for them, with what turned out to be roughly 350 of them removed from “ammo boxes” by 3D0G from his parents’ house. The disks all had labels of some sort, and a printed index came along with it all, mapped to the unique disk ID/Numbers that had been carefully put on all of them years ago. I expect this was months of work at the time.

Each floppy is 140k of data on each side, and in this case, all the floppies had been single-sided and clipped with an additional notch with a hole punch to allow the second side to be used as well.

Even though they’re packed a little strangely, there was no damage anywhere, nothing bent or broken or ripped, and all the items were intact. It looked to be quite the bonanza of potentially new vintage software.

So, this activity is at the crux of the work going on with both the older software on the Internet Archive and what I’m doing with web browser emulation and increasing easy access to the works of old. The most important thing, over everything else, is to close the air gap – get the data off these disappearing floppy disks and into something online where people or scripts can benefit from them and research them. Almost everything else – scanning of cover art, ingestion of metadata, pulling together the history of a company or cross-checking what titles had which collaborators… that has nowhere near the expiration date of the magnetized coated plastic disks going under. This needs us and it needs us now.

The way that things currently work with Apple II floppies is to separate them into two classes: Disks that Just Copy, and Disks That Need A Little Love. The Little Love disks, when found, are packed up and sent off to one of my collaborators, 4AM, who has the tools and the skills to get data off particularly tenacious floppies, as well as doing “silent cracks” of commercial floppies to preserve what’s on them as best as possible.

Doing the “Disks that Just Copy” is a mite easier. I currently have an Apple II system on my desk that connects via USB-to-serial connection to my PC. There, I run a program called Apple Disk Transfer that basically turns the Apple into a Floppy Reading Machine, with pretty interface and everything.

Apple Disk Transfer (ADT) has been around a very long time and knows what it’s doing – a floppy disk with no trickery on the encoding side can be ripped out and transferred to a “.DSK” file on the PC in about 20 seconds. If there’s something wrong with the disk in terms of being an easy read, ADT is very loud about it. I can do other things while reading floppies, and I end up with a whole pile of filenames when it’s done. The workflow, in other words, isn’t so bad as long as the floppies aren’t in really bad shape. In this particular set, the floppies were in excellent shape, except when they weren’t, and the vast majority fell into the “excellent” camp.

The floppy drive that sits at the middle of this looks like some sort of nightmare, but it helps to understand that with Apple II floppy drives, you really have to have the cover removed at all times, because you will be constantly checking the read head for dust, smudges, and so on. Unscrewing the whole mess and putting it back together for looks just doesn’t scale. It’s ugly, but it works.

It took me about three days (while doing lots of other stuff) but in the end I had 714 .dsk images pulled from both sides of the floppies, which works out to 357 floppy disks successfully imaged. Another 20 or so are going to get a once over but probably are going to go into 4am’s hands to get final evaluation. (Some of them may in fact be blank, but were labelled in preparation, and so on.) 714 is a lot to get from one person!

As mentioned, an Apple II 5.25″ floppy disk image is pretty much always 140k. The names of the floppy are mine, taken off the label, or added based on glancing inside the disk image after it’s done. For a quick glance, I use either an Apple II emulator called Applewin, or the fantastically useful Apple II disk image investigator Ciderpress, which is frankly the gold standard for what should be out there for every vintage disk/cartridge/cassette image. As might be expected, labels don’t always match contents. C’est la vie.
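For the curious, the 140k figure falls out of simple arithmetic, assuming the standard 16-sector layout (35 tracks, 16 sectors per track, 256 bytes per sector). A small Python sketch of a sanity check along those lines follows; the helper name is made up, and this is not part of ADT or any of the tools mentioned here.

```python
TRACKS = 35
SECTORS_PER_TRACK = 16
BYTES_PER_SECTOR = 256

# 35 * 16 * 256 = 143,360 bytes, which is exactly 140 * 1024.
EXPECTED_SIZE = TRACKS * SECTORS_PER_TRACK * BYTES_PER_SECTOR


def looks_like_standard_dsk(data: bytes) -> bool:
    """True when the image is the size of one standard 16-sector disk side."""
    return len(data) == EXPECTED_SIZE


print(EXPECTED_SIZE, EXPECTED_SIZE // 1024)
```

An image that comes back at some other size is a hint that something odd happened in the transfer, or that the disk is not in the standard format.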

As for the contents of the disks themselves; this comes down to what the “standard collection” was for an Apple II user in the 1980s who wasn’t afraid to let their software library grow utilizing less than legitimate circumstances. Instead of an elegant case of shiny, professionally labelled floppy diskettes, we get a scribbled, messy, organic collection of all range of “warez” with no real theme. There’s games, of course, but there’s also productivity, utilities, artwork, and one-off collections of textfiles and documentation. Games that were “cracked” down into single-file payloads find themselves with 4-5 other unexpected housemates and sitting behind a menu. A person spending the equivalent of $50-$70 per title might be expected to have a relatively small and distinct library, but someone who is meeting up with friends or associates and duplicating floppies over a few hours will just grab bushels of strange.

The result of the first run is already up on the Archive: A 37 Megabyte .ZIP file containing all the images I pulled off the floppies.

In terms of what will be of relevance to later historians, researchers, or collectors, that zip file is probably the best way to go – it’s not munged up with the needs of the Archive’s structure, and is just the disk images and nothing else.

This single .zip archive might be sufficient for a lot of sites (go git ‘er!) but as mentioned infinite times before, there is a very strong ethic across the Internet Archive’s software collection to make things as accessible as possible, and hence there are nearly 500 items in the “3D0G Knight Collection” besides the “download it all” item.

The rest of this entry talks about why it’s 500 and not 714, and how it is put together, and the rest of my thoughts on this whole endeavor. If you just want to play some games online or pull a 37mb file and run, cackling happily, into the night, so be it.

The relatively small number of people who have exceedingly hard opinions on how things “should be done” in the vintage computing space will also want to join the folks who are pulling the 37mb file. Everything else done by me after the generation of the .zip file is in service of the present and near future. The items that number in the hundreds on the Archive that contain one floppy disk image and interaction with it are meant for people to find now. I want someone to have a vague memory of a game or program once interacted with, and if possible, to find it on the Archive. I also like people browsing around randomly until something catches their eye and to be able to leap into the program immediately.

To those ends, and as an exercise, I’ve acquired or collaborated on scripts to do the lion’s share of analysis on software images to prep them for this living museum. These scripts get it “mostly” right, and the rough edges they bring in from running are easily smoothed over by a microscopic amount of post-processing manual attention, like running a piece of sandpaper over a machine-made joint.

Again, we started out with 714 disk images. The first thing done was to run them against a script that has hash checksums for every exposed Apple II disk image on the Archive, which now number over 10,000. Doing this dropped the “uniquely new” disk images from 714 to 667.
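The idea behind that first pass can be sketched in a few lines of Python. To be clear, this is an illustration and not the script actually used: the function names are invented, SHA-1 is just one reasonable choice of hash, and the "known" set stands in for whatever list of Archive checksums the real script consults.

```python
import hashlib
from typing import Dict, Set


def sha1_hex(data: bytes) -> str:
    """Hex SHA-1 digest of a disk image's raw bytes."""
    return hashlib.sha1(data).hexdigest()


def filter_new_images(images: Dict[str, bytes], known_hashes: Set[str]) -> Dict[str, bytes]:
    """Keep only the images whose hash is not already in known_hashes."""
    return {
        name: data
        for name, data in images.items()
        if sha1_hex(data) not in known_hashes
    }


# Toy data: two tiny stand-ins for disk images, one already "on the Archive".
images = {"games1.dsk": b"A" * 10, "utils7.dsk": b"B" * 10}
known = {sha1_hex(b"A" * 10)}

fresh = filter_new_images(images, known)
print(sorted(fresh))
```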

Next, I concatenated disk images that are part of the same product into one item: if a paint program has two floppy disk images for each of the sides of its disk, those become a single item. In one or two cases, the program spans multiple floppies, so 4-8 (and in one case, 14!) floppy images become a single item. Doing this dropped the total from 667 to 495 unique items. That’s why the number is significantly smaller than the original total.
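As a rough illustration of that grouping step, here is a Python sketch that folds disk images into items by title. It assumes a made-up naming convention where sides are marked with a suffix like "(Side A)" or "(Disk 2)"; the real filenames and the real process are not described in that detail here, so treat every name below as hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List

# Hypothetical convention: "Title (Side A).dsk", "Title (Disk 2).dsk".
SIDE_SUFFIX = re.compile(r"\s*\((?:side|disk)\s+\w+\)$", re.IGNORECASE)


def group_into_items(filenames: Iterable[str]) -> Dict[str, List[str]]:
    """Map each product title to the list of disk images that belong to it."""
    items: Dict[str, List[str]] = defaultdict(list)
    for name in filenames:
        stem = name[:-4] if name.lower().endswith(".dsk") else name
        title = SIDE_SUFFIX.sub("", stem)
        items[title].append(name)
    return dict(items)


names = [
    "Paint Program (Side A).dsk",
    "Paint Program (Side B).dsk",
    "Utilities.dsk",
]
items = group_into_items(names)
print(len(names), "images ->", len(items), "items")
```

Three images collapse into two items here, the same kind of shrinkage that took the real set from 667 images to 495 items.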

Let’s talk for a moment about this.

Using hashes and comparing them is the roughest of rough approaches to de-duplicating software items. I do it with Apple II images because they tend to be self contained (a single .dsk file) and because Apple II software has a lot of people involved in it. I’m not alone by any means in acquiring these materials and I’m certainly not alone in terms of work being done to track down all the unique variations and most obscure and nearly lost packages written for this platform. If I was the only person in the world (or one of a tiny sliver) working on this I might be super careful with each and every item to catalog it – but I’m absolutely not; I count at least a half-dozen operations involved in Apple II floppy image ingestion.

And as a bonus, it’s a really nice platform. When someone puts their heart into an Apple II program, it rewards them and the end user as well – the graphics can be charming, the program flow intuitive, and the whole package just gleams on the screen. It’s rewarding to work with this corpus, so I’m using it as a test bed for all these methods, including using hashes.

But hash checksums are seriously not the be-all for this work. Anything can make a hash different – an added file, a modified bit, or a compilation of already-on-the-archive-in-a-hundred-places files that just happen to be grouped up slightly differently than others. That said, it’s not overwhelming – you can read about what’s on a floppy and decide what you want pretty quickly; gigabytes will not be lost and the work to track down every single unique file has potential but isn’t necessary yet.

(For the people who care, the Internet Archive generates three different hashes (md5, crc32, sha1) and lists the size of the file – looking across all of those for comparison is pretty good for ensuring you probably have something new and unique.)
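As a rough illustration of that kind of comparison, here is a minimal Python sketch. It is not the script used for this collection; the function names and the idea of feeding it a dictionary of filenames and raw bytes are assumptions made for the example. It treats two images as the same only when size, crc32, md5 and sha1 all match.

```python
import hashlib
import zlib


def fingerprint(data: bytes) -> tuple:
    """Return (size, crc32, md5, sha1) for the raw bytes of one disk image."""
    return (
        len(data),
        format(zlib.crc32(data) & 0xFFFFFFFF, "08x"),
        hashlib.md5(data).hexdigest(),
        hashlib.sha1(data).hexdigest(),
    )


def split_new_and_known(candidates: dict, known: set) -> tuple:
    """Split {name: bytes} into (new, duplicates) against known fingerprints.

    Hypothetical helper: 'known' stands in for whatever list of
    fingerprints already exists. Images that repeat inside the batch
    are counted as duplicates after their first appearance.
    """
    seen = set(known)
    new, dupes = [], []
    for name in sorted(candidates):
        fp = fingerprint(candidates[name])
        if fp in seen:
            dupes.append(name)
        else:
            seen.add(fp)
            new.append(name)
    return new, dupes
```

As the surrounding paragraphs say, a match here only proves two files are byte-for-byte identical; a single changed bit produces a "new" image, so the output is a starting point and not a verdict.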

Once the items are up there, the Screen Shotgun whips into action. It plays the programs in the emulator, takes screenshots, leafs off the unique ones, and then assembles it all into a nice package. Again, not perfect but left alone, it does the work with no human intervention and gets things generally right. If you see a screenshot in this collection, a robot did it and I had nothing to do with it.

This leads, of course, to scaring out which programs are a tad not-bootable, and by that I mean that they boot up in the emulator and the emulator sees them and all, but the result is not that satisfying:

On a pure accuracy level, this is doing exactly what it’s supposed to – the disk wasn’t ever a properly packaged, self-contained item, and it needs a boot disk to go in the machine first before you swap the floppy. I intend to work with volunteers to help with this problem, but here is where it stands.

The solution in the meantime is a Java program modified by Kevin Savetz, which analyzes the floppy disk image and prints all the disk information it can find, including the contents of BASIC programs and textfiles. Here’s a non-booting disk where this worked out. The result is that this all gets ingested into the search engine of the Archive, and so if you’re looking for a file within the disk images, there’s a chance you’ll be able to find it.

Once the robots have their way with all the items, I can go in and fix a few things, like screenshots that went south, or descriptions and titles that don’t reflect what actually boots up. The amount of work I, a single person, have to do is therefore reduced to something manageable.

I think this all works well enough for the contemporary vintage software researcher and end user. Perhaps that opinion is not universal.

What I can say, however, is that the core action here – of taking data away from a transient and at-risk storage medium and putting it into a slightly less transient, less at-risk storage medium – is 99% of the battle. To have the will to do it, to connect with the people who have these items around and to show them it’ll be painless for them, and to just take the time to shove floppies into a drive and read them, hundreds of times… that’s the huge mountain to climb right now. I no longer have particularly deep concerns about technology failing to work with these digital images, once they’re absorbed into the Internet. It’s this current time, out in the cold, unknown and unloved, that they’re the most at risk.

The rest, I’m going to say, is gravy.

I’ll talk more about exactly how tasty and real that gravy is in the future, but for now, please take a pleasant walk in the 3D0G Knight’s Domain.

The Followup — March 14, 2017

Writing about my heart attack garnered some attention. I figured it was only right to fill in later details and describe what my current future plans are.

After the previous entry, I went back into the emergency room of the hospital I was treated at, twice.

The first time was because I “felt funny”; I just had no grip on “is this the new normal” and so just to understand that, I went back in and got some tests. They did an EKG, a blood test, and let me know all my stats were fine and I was healing according to schedule. That took a lot of stress away.

Two days later, I went in because I was having a marked shortness of breath, where I could not get enough oxygen in and it felt a little like I was drowning. Another round of tests, and one of the cardiologists mentioned a side effect of one of the drugs I was taking was this sort of shortness/drowning. He said it usually went away and the company claimed 5-7% of people got this side effect, but that they observed more like 10-15%. They said I could wait it out or swap drugs. I chose swap. After that, I’ve had no other episodes.

The hospital thought I should stay in Australia for 2 weeks before flying. Thanks to generosity from both MuseumNext and the ACMI, my hosts, that extra AirBnB time was basically paid for. MuseumNext also worked to help move my international flight ahead the weeks needed; a very kind gesture.

Kind gestures abounded, to be clear. My friend Rochelle extended her stay from New Zealand to stay an extra week; Rachel extended hers to match my new departure date. Folks rounded up funds and sent them along, which helped cover some additional costs. Visitors stopped by the AirBnB when I wasn’t really taking any walks outside, to provide additional social contact.

Here is what the blockage looked like, before and after. As I said, roughly a quarter of my heart wasn’t getting any significant blood and somehow I pushed through it for nearly a week. The insertion of a balloon and then a metal stent opened the artery enough for the blood flow to return. Multiple times, people made it very clear that this could have finished me off handily, and mostly luck involving how my body reacted was what kept me going and got me in under the wire.

From the responses to the first entry, it appears that a lot of people didn’t know heart attacks could be a lingering, growing issue and not just a bolt of lightning that strikes in the middle of a show or while walking down the street. If nothing else, I’m glad that it’s caused a number of people to be aware of how symptoms present themselves, as well as getting people to check their cholesterol, which I didn’t see as a huge danger compared to other factors, and which turned out to be significant indeed.

As for drugs, I’ve got a once a day waterfall of pills for blood pressure, cholesterol, heart healing, anti-clotting, and my long-handled annoyances of gout (which I’ve not had for years thanks to the pills). I’m on some of them for the next few months, some for a year, and some forever. I’ve also been informed I’m officially at risk for another heart attack, but the first heart attack was my hint in that regard.

As I healed, and understood better what was happening to me, I got better remarkably quickly. There is a single tiny dot on my wrist from the operation, another tiny dot where the IV was in my arm at other times. Rachel gifted a more complicated Fitbit to replace the one I had, with the new one tracking sleep schedule and heart rate, just to keep an eye on it.

A day after landing back in the US, I saw a cardiologist at Mt. Sinai, one of the top doctors, who gave me some initial reactions to my charts and information: I’m very likely going to be fine, maybe even better than before. I need to take care of myself, and I was. If I was smoking or drinking, I’d have to stop, but since I’ve never had alcohol and I’ve never smoked, I’m already ahead of that game. I enjoy walking, a lot. I stay active. And as of getting out of the hospital, I am vegan for at least a year. Caffeine’s gone. Raw vegetables are in.

One might hesitate putting this all online, because the Internet is spectacularly talented at generating hatred and health advice. People want to help – it comes from a good place. But I’ve got a handle on it and I’m progressing well; someone hitting me up with a nanny-finger-wagging paragraph and 45 links to change-your-life-buy-my-book.com isn’t going to help much. But go ahead if you must.

I failed to mention it before, but when this was all going down, my crazy family of the Internet Archive jumped in, everyone from Dad Brewster through to all my brothers and sisters scrambling to find me my insurance info and what they had on their cards, as I couldn’t find mine. It was something really late when I first pinged everyone with “something is not good” and everyone has been rather spectacular over there. Then again, they tend to be spectacular, so I sort of let that slip by. Let me rectify that here.

And now, a little bit on health insurance.

I had travel insurance as part of my health insurance with the Archive. That is still being sorted out, but a large deposit had to be put on the Archive’s corporate card as a down-payment during the sorting out, another fantastic generosity, even if it’s technically a loan. I welcome the coming paperwork and nailing down of financial brass tacks for a specific reason:

I am someone who once walked into an emergency room with no insurance (back in 2010), got a blood medication IV, stayed around a few hours, and went home, generating a $20,000 medical bill in the process. It got knocked down to $9k over time, and I ended up being thrown into a low-income program they had that allowed them to write it off (I think). That bill could have destroyed me, financially. Therefore, I’m super sensitive to the costs of medical care.

In Australia, it is looking like the heart operation and the 3 day hospital stay, along with all the tests and staff and medications, are going to round out around $10,000 before the insurance comes in and knocks that down further (I hope). In the US, I can’t imagine that whole thing being less than $100,000.

The biggest culture shock for me was how little any of the medical staff, be they doctors or nurses or administrators, cared about the money. They didn’t have any real info on what things cost, because pretty much everything is free there. I’ve been equating it to asking a restaurant where the best toilets are to use a few hours after your meal – they might have some random ideas, but nobody’s really thinking that way. It was a huge factor in my returning to the emergency room so willingly; each visit, all-inclusive, was $250 AUD, which is even less in US dollars. $250 is something I’ll gladly pay for peace of mind, and I did, twice. The difference in the experience is remarkable. I realize this is a hot button issue now, but chalk me up as another person for whom a life-changing experience could come within a remarkably close distance of being an influence on where I might live in the future.

Dr. Sonny Palmer, who did insertion of my stent in the operating room.

I had a pile of plans and things to get done (documentaries, software, cutting down on my possessions, and so on), and I’ll be getting back to them. I don’t really have an urge to maintain some sort of health narrative on here, and I certainly am not in the mood to urge any lifestyle changes or preach a way of life to folks. I’ll answer questions if people have them from here on out, but I’d rather be known for something other than powering through a heart attack, and maybe, with some effort, I can do that.

Thanks again to everyone who has been there for me, online and off, in person and far away, over the past few weeks. I’ll try my best to live up to your hopes about what opportunities my second chance at life will give me.

The Other Half — February 24, 2017

On January 19th of this year, I set off to California to participate in a hastily-arranged appearance in a UCLA building to talk about saving climate data in the face of possible administrative switchover. I wore a fun hat, stayed in a nice hotel, and saw an old friend from my MUD days for dinner. The appearance was a lot of smart people doing good work and wanting to continue with it.

While there, I was told my father’s heart surgery, which had some complications, was going to require an extended stay and we were running out of relatives and companions to accompany him. I booked a flight for seven hours after I’d arrive back in New York to go to North Carolina and stay with him. My father has means, so I stayed in a good nearby hotel room. I stayed with him for two and a half weeks, booking ten to sixteen hour days to accompany him through a maze of annoyances, indignities, smart doctors, variant nurses ranging from saints to morons, and generally ensure his continuance.

In the middle of this, I had a non-movable requirement to move the manuals out of Maryland and send them to California. Looking through several possibilities, I settled with: Drive five hours to Maryland from North Carolina, do the work across three days, and drive back to North Carolina. The work in Maryland had a number of people helping me, and involved pallet jacks, forklifts, trucks, and crazy amounts of energy drinks. We got almost all of it, with a third batch ready to go. I drove back the five hours to North Carolina and caught up on all my podcasts.

I stayed with my father another week and change, during which I dented my rental car, and hit another hard limit: I was going to fly to Australia. I also, to my utter horror, realized I was coming down with some sort of cold/flu. I did what I could – stabilized my father’s arrangements, went into the hotel room, put on my favorite comedians in a playlist, turned out the lights, drank 4,000mg of Vitamin C, banged down some orange juice, drank Mucinex, and covered myself in 5 blankets. I woke up 15 hours later in a pool of sweat and feeling like I’d crossed the boundary with that disease. I went back to the hospital to make sure my dad was OK (he was), and then prepped for getting back to NY, where I discovered almost every flight for the day was booked due to so many cancelled flights the previous day.

After lots of hand-wringing, I was able to book a very late flight from North Carolina to New York, and stayed there for 5 hours before taking a 25 hour two-segment flight through Dubai to Melbourne.

I landed in Melbourne on Monday the 13th of February, happy that my father was stable back in the US, and prepping for my speech and my other commitments in the area.

On Tuesday I had a heart attack.

We know it happened then, or began to happen, because of the symptoms I started to show – shortness of breath, a feeling of fatigue and an edge of pain that covered my upper body like a jacket. I was fucking annoyed – I felt like I was just super tired and needed some energy, and energy drinks and caffeine weren’t doing the trick.

I met with my hosts for the event I’d do that Saturday, and continued working on my speech.

I attended the conference for that week, did a couple interviews, saw some friends, took some nice tours of preservation departments and discussed copyright with very smart lawyers from the US and Australia.

My heart attack continued, blocking off what turned out to be a quarter of my bloodflow to my heart.

This was annoying me but I didn’t know it was, so according to my fitbit I walked 25 miles, walked up 100 flights of stairs, and maintained hours of exercise to snap out of it, across the week.

I did a keynote for the conference. The next day I hosted a wonderful event for seven hours. I asked for a stool because I said I was having trouble standing comfortably. They gave me one. I took rests during it, just so the DJ could get some good time with the crowds. I was praised for my keeping the crowd jumping and giving it great energy. I’d now been having a heart attack for four days.

That Sunday, I walked around Geelong, a lovely city near Melbourne, and ate an exquisite meal at Igni, a restaurant whose menu basically has one line to tell you you’ll be eating what they think you should have. Their choices were excellent. Multiple times during the meal, I dozed a little, as I was fatigued. When we got to the tram station, I walked back to the apartment to get some rest. Along the way, I fell to the sidewalk and got up after resting.

I slept off more of the growing fatigue and pain.

The next day I had a second exquisite meal of the trip at Vue Le Monde, a meal that lasted from about 8pm to midnight. My partner Rachel loves good meals and this is one of the finest you can have in the city, and I enjoyed it immensely. It would have been a fine last meal. I’d now been experiencing a heart attack for about a week.

That night, I had a lot of trouble sleeping. The pain was now a complete jacket of annoyance on my body, and there was no way to rest that didn’t feel awful. I decided medical attention was needed.

The next morning, Rachel and I walked 5 blocks to a clinic, found it was closed, and walked further to the RealCare Health Clinic. I was finding it very hard to walk at this point. Dr. Edward Petrov saw me, gave me some therapy for reflux, found it wasn’t reflux, and got concerned, especially as having my heart checked might cost me something significant. He said he had a cardiologist friend who might help, and he called him, and it was agreed we could come right over.

We took a taxi over to Dr. Georg Leitl’s office. He saw me almost immediately.

He was one of those doctors that only needed to take my blood pressure and check my heart with a stethoscope for 30 seconds before looking at me sadly. We went to his office, and he told me I could not possibly get on the plane I was leaving on in 48 hours. He also said I needed to go to Hospital very quickly, and that I had some things wrong with me that needed attention.

He had his assistants measure my heart and take an ultrasound, wrote something on a notepad, put all the papers in an envelope with the words “SONNY PALMER” on it, and drove me personally over in his car to St. Vincent’s Hospital.

Taking me up to the cardiology department, he put me in the waiting room of the surgery, talked to the front desk, and left. I waited 5 anxious minutes, and then was brought into a room with two doctors, one of whom turned out to be Dr. Sonny Palmer.

Sonny said Georg thought I needed some help, and I’d be checked within a day. I asked if he’d seen the letter with his name on it. He hadn’t. He went and got it.

He came back and said I was going to be operated on in an hour.

He also explained I had a rather blocked artery in need of surgery. Survival rate was very high. Nerve damage from the operation was very unlikely. I did not enjoy phrases like survival and nerve damage, and I realized what might happen very shortly, and what might have happened for the last week.

I went back to the waiting room, where I tweeted what might have been my possible last tweets, left a message for my boss Alexis on the slack channel, hugged Rachel tearfully, and then went into surgery, or potential oblivion.

Obviously, I did not die. The surgery was done with me awake, and involved making a small hole in my right wrist, where Sonny (while blasting Bon Jovi) went in with a catheter, found the blocked artery, installed a 30mm stent, and gave back the blood to the quarter of my heart that was choked off. I listened to instructions on when to talk or when to hold myself still, and I got to watch my beating heart on a very large monitor as it got back its function.

I felt (and feel) legions better, of course – surgery like this rapidly improves life. Fatigue is gone, pain is gone. It was also explained to me what to call this whole event: a major heart attack. I damaged the heart muscle a little, although that bastard was already strong from years of high blood pressure and I’m very young comparatively, so the chances of recovery to the point of maybe even being healthier than before are pretty good. The hospital, St. Vincent’s, was wonderful – staff, environment, and even the food (including curry and afternoon tea) were a delight. My questions were answered, my needs met, and everyone felt like they wanted to be there.

It’s now been 4 days. I was checked out of the hospital yesterday. My stay in Melbourne was extended two weeks, and my hosts (MuseumNext and ACMI) paid for basically all of the additional AirBNB that I’m staying at. I am not cleared to fly until the two weeks is up, and I am now taking six medications. They make my blood thin, lower my blood pressure, cure my kidney stones/gout, and stabilize my heart. I am primarily resting.

I had lost a lot of weight and I was exercising, but my cholesterol was a lot worse than anyone really figured out. The drugs and lifestyle changes will probably help knock that back, and I’m likely to adhere to them, unlike a lot of people, because I’d already been on a whole “life reboot” kick. The path that follows is, in other words, both pretty clear and going to be taken.

Had I died this week, at the age of 46, I would have left behind a very bright, very distinct and rather varied life story. I’ve been a bunch of things, some positive and negative, and projects I’d started would have lived quite neatly beyond my own timeline. I’d have also left some unfinished business here and there, not to mention a lot of sad folks and some extremely quality-variant eulogies. Thanks to a quirk of the Internet Archive, there’s a little statue of me – maybe it would have gotten some floppy disks piled at its feet.

Regardless, I personally would have been fine on the accomplishment/legacy scale, if not on the first-person/relationships/plans scale. That my Wikipedia entry is going to have a different date on it than February 2017 is both a welcome thing and a moment to reflect.

I now face the Other Half, whatever events and accomplishments and conversations I get to engage in from this moment forward, and that could be anything from a day to 100 years.

Whatever and whenever that will be, the tweet I furiously typed out on cellphone as a desperate last-moment possible-goodbye after nearly a half-century of existence will likely still apply:

“I have had a very fun time. It was enormously enjoyable, I loved it all, and was glad I got to see it.”

Now That’s What I Call Script-Assisted-Classified Pattern Recognized Music — December 25, 2016

Merry Christmas; here is over 500 days (12,000 hours) of music on the Internet Archive.

Go choose something to listen to while reading the rest of this. I suggest either something chill or perhaps this truly unique and distinct ambient recording.

Let’s be clear. I didn’t upload this music, I certainly didn’t create it, and actually I personally didn’t classify it. Still, 500 Days of music is not to be ignored. I wanted to talk a little bit about how it all ended up being put together in the last 7 days.

One of the nice things about working for a company that stores web history is that I can use it to do archaeology against the company itself. Doing so, I find that the Internet Archive started soliciting “the people” to begin uploading items en masse around 2003. This is before YouTube, and before a lot of other services out there.

I spent some time tracking dates of uploads, and you can see various groups of people gathering interest in the Archive as a file destination in these early 00’s, but a relatively limited set all around.

Part of this is that it was a little bit of a non-intuitive effort to upload to the Archive; as people figured it all out, they started using it, but a lot of other people didn’t. Meanwhile, YouTube and other also-rans came into being and they picked up a lot of the “I just want to put stuff up” crowd.

By 2008, things start to take off for Internet Archive uploads. By 2010, things take off so much that 2008 looks like nothing. And now it’s dozens or hundreds of multimedia uploads a day through all the Archive’s open collections, not to count others who work with specific collections they’ve been given administration of.

In the case of the general uploads collection of audio, which I’m focusing on in this entry, the number of items is now at over two million.

This is not a sorted, curated, or really majorly analyzed collection, of course. It’s whatever the Internet thought should be somewhere. And what ideas they have!

Quality is variant. Finding things is variant, although the addition of new search facets and previews has made it better over the years.

I decided to do a little experiment: slight machine-assisted “find some stuff” sorting. Let it loose on 2 million items in the hopper, see what happens. The script was called Cratedigger.

Previously, I did an experiment against keywording on texts at the archive – the result was “bored intern” level, which was definitely better than nothing, and in some cases, that bored intern could slam through a 400 page book and determine a useful word cloud in less than a couple seconds. Many collections of items I uploaded have these word clouds now.

It’s a little different with music. I went about it this way with a single question:

Hey, uploader – could you be bothered to upload a reference image of some sort as well as your music files? Welcome to Cratediggers.

Cratediggers is not an end-level collection – it’s a holding bay to do additional work, but it does show the vast majority of people would upload a sound file and almost nothing else. (I’ve not analyzed quality of description metadata in the no-image items – that’ll happen next.) The resulting ratio of items-in-uploads to items-for-cratediggers is pretty striking – less than 150,000 items out of the two million passed this rough sort.
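To make the rough sort concrete, here is a minimal Python sketch of the "did the uploader include a reference image?" test. It is not the actual Cratedigger script; the extension lists, the function names, and the assumption that each item arrives as a plain list of uploader-supplied filenames are all made up for illustration.

```python
import os

# Assumed extension sets for the sketch; a real run would likely need more.
AUDIO_EXTS = {".mp3", ".flac", ".ogg", ".wav"}
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".gif"}


def has_reference_image(filenames: list) -> bool:
    """True if the item has at least one audio file and at least one image."""
    exts = {os.path.splitext(name)[1].lower() for name in filenames}
    return bool(exts & AUDIO_EXTS) and bool(exts & IMAGE_EXTS)


def cratedig(items: dict) -> list:
    """Given {identifier: [filenames]}, return identifiers that pass the sort."""
    return sorted(
        identifier
        for identifier, filenames in items.items()
        if has_reference_image(filenames)
    )
```

The test is deliberately crude: it says nothing about whether the image is any good, only that someone bothered to include one, which is the whole point of the question above.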

The Bored Audio Intern worked pretty OK. By simply sending a few parameters, The Cratediggers Collection ended up building on itself by the thousands without me personally investing time. I could then focus on more specific secondary scripts that do things in an even more lazy manner, ensuring laziness all the way down.

The next script allowed me to point to an item in the cratediggers collection and say “put everything by this uploader that is in Cratediggers into this other collection”, with “this other collection” being spoken word, sermons, or music. In general, a person who uploaded music that got into Cratediggers generally uploaded other music. (Same with sermons and spoken word.) It worked well enough that as I ran these helper scripts, they did amazingly well. I didn’t have to do much beyond that.
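The "everything by this uploader" step can be sketched in a few lines of Python. This is an illustration only, not the helper script itself: the mapping of identifiers to uploaders and the function name are hypothetical, and the actual move into the target collection is left out.

```python
def items_by_same_uploader(seed_id: str, cratediggers: dict) -> list:
    """Return every identifier in the holding bay uploaded by the seed's uploader.

    'cratediggers' is an assumed {identifier: uploader} mapping for items
    already in the holding collection. The seed item itself is included.
    """
    uploader = cratediggers[seed_id]
    return sorted(
        identifier
        for identifier, who in cratediggers.items()
        if who == uploader
    )
```

One human judgment ("this item is music") then gets applied to every other item from the same person, which is why the approach depends on uploaders tending to stick to one kind of material.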

As of this writing, the music collection contains over 400 solid days of music. They are absolutely genre-busting, ranging from industrial and noise all the way through beautiful jazz and acapella. There are one-of-a-kind rock and acoustic albums, and simple field recordings of live events.

And, ah yes, the naming of this collection… Some time ago I took the miscellaneous texts and writings and put them into a collection called Folkscanomy.

After trying to come up with the same sort of name for sound, I discovered a very funny thing: you can’t really attach any two words involving sound together and not already have some company or manufacturer using it as a name. Trust me.

And that’s how we ended up with Folksoundomy.

What a word!

The main reason for this is I wanted something unique to call this collection of uploads that didn’t imply they were anything other than contributed materials to the Archive. It’s a made-up word, a zesty little portmanteau that is nowhere else on the Internet (yet). And it leaves you open for whatever is in them.

So, about the 500 days of music:

Absolutely, one could point to YouTube and the mass of material being uploaded there as being superior to any collection sitting on the archive. But the problem is that they have their own robot army, which is a tad more evil than my robotic bored interns; you have content scanners that have both false positives and strange decorations, you have ads being put on the front of things randomly, and you have a whole family of other small stabs and jabs towards an enjoyable experience getting in your way every single time. Internet Archive does not log you, require a login, or demand other handfuls of your soul. So, for cases where people are uploading their own works and simply want them to be shared, I think the choice is superior.

This is all, like I said, an experiment – I’m sure the sorting has put some things in the wrong place, or we’re missing out on some real jewels that didn’t think to make a “cover” or icon to the files. But as a first swipe, I moved 80,000 items around in 3 days, and that’s more than any single person can normally do.

There’s a lot more work to do, but that music collection is absolutely filled with some beautiful things, as is the whole general Folksoundomy collection. Again, none of this is me, or some talent I have – this is the work of tens of thousands of people, contributing to the Archive to make it what it is, and while I think the Wayback Machine has the lion’s share of the Archive’s world image (and deserves it), there’s years of content and creation waiting to be discovered for anyone, or any robot, that takes a look.
