GODADDY SOPA BLAH

So, very quickly. SOPA is just the latest in a line of really stupid laws that are intended to change the very nature of online life (along with a lot of aspects of offline life) to bring the Internet in line with the “real world”, i.e., Shit.

It was made by people trying to fundamentally change how this internet thing works, in ways that it can’t possibly. Granted, a lot of people have given up internet for internet-like things, but bear in mind that a single cellphone, that is, one individual’s cellphone, running 4G, has greater bandwidth than the Internet Backbone did in the early 1990s, and you see how far we’ve gone in so short a time.

A lot of people are talking about how SOPA (the Stop Online Piracy Act) is a piece of crap, and it is crap. I have neither the interest nor the taste for going deeply into that, because people who are much better at being all legal-wrangly-nutty can do it. No, I only want to speak to one thing, and even that is mostly in the realm of preservation, my big passion these days, and by “these days” I mean “that I’ve been alive”.

When what we think of as “Domain Names” started up, it was a volunteer side-effort of registering names, one done by hand and totally unreliable in terms of turnaround. You can say what you want about what came next, but those were kind of the Bad Old Days. If a domain was offensive, or they were busy that week, or anything else, you had to basically hope the forces mixed together and you got your domain name. The process of changing domain names, of doing a lot of other domain-related transactions, was weird, slow and stupid. Somewhere around there, I got my COW.NET domain, which I still have.

Network Solutions were slow-moving, unresponsive, dull assholes. Network Solutions also had a de facto monopoly, and once they started charging for domain name registration, you got better response, and they got a fuckton of money from domain name sales, and domains weren’t cheap. Let’s be clear about that: $50 a year.

A decent enough showing of how weird those pre-money times were is in this 1993 Wired article.  Joshua Quittner’s a bit of a toolbox but the article serves the function, so there you go. Wild and wooly, slow, and unpredictable. And after the monopoly kicked in, it was wallet-rape city – remember, Verisign bought Network Solutions in 2000 for 21 BILLION DOLLARS.

So imagine when the monopoly was broken, and a chance arose for someone, especially someone like me who’d been doing domain names for nearly a decade, to get domains much cheaper, that is, $8 a year. Well fuck yeah! Thus I and others started going to these other domain registrars, doing our best to make sure they were in some way legitimate. I went with two: EasyDNS for stuff I cared about, Go Daddy for stuff I didn’t quite care about.

So, EasyDNS is fucking perfect. Let’s leave it at that.

Go Daddy was mostly a case of being cheap, with an interface that was somewhat easier to use, especially compared to Network Solutions. Network Solutions had done some sketchy shit in the past, in one case utterly breaking DNS. At the time, if someone had put a hammer in my hand and given me a free flight to their offices, we would have had quite the news story. In this environment, anything looked better. EasyDNS was expensive (but awesome!), so the domains I only somewhat cared about went to Go Daddy.

ANYWAY

DNS and domain name garbage are like funerals and busted water heaters. You don’t want to deal with them, when you run into problems it’s usually under duress, and when it’s all over you stop thinking about it until the next time. So it has always been with me and Go Daddy.

Most of the time, with Go Daddy for me, it’s been “Oh, I need to register something hilarious (or somewhat hilarious – I’ve owned INAPPROPRIATELYDRESSED.COM or DISRESPECTCOPYRIGHT.ORG and many other things of that ilk), I don’t want to spend any money, I don’t care too much…. OK, off to Go Daddy.” Once I’m there, I’m reminded how much of their business is trickery, deception, misleading user interface, endless endless endless endless add-ons and attempts to make more money from you, and finally a shit-ball storage of your stuff. But in the end, the domain registers, it “works”, and I’m done, and I can go on making the joke site or whatever.

Somewhere in there, Go Daddy went from “bargain basement generic registrar” to “sleazeball make-ads-that-piss-people-off jingoistic hey look at me fuck you pussies registrar”. Now, as someone who did contract work for ROTTEN.COM as a writer and who uses “Fuck” as an adjective, I’m content with anyone being all controversy-and-tits and putting a stake in the ground, with business being gained or lost by those clear and present actions. It’s called “taking a stand”. T-Shirt Hell, which makes offensive t-shirts, had this schtick for years and has always kept that schtick – great. So it was with Go Daddy.

See, but now things have come to a head. It turned out that not only was Go Daddy happy to put their name behind SOPA, which is a hell of a restricting, dangerous, and censoring law, but they’d helped to write some of it and, even more offensively, were exempted from it. In other words, they’d found a way to be as offensively liberty-crushing in law as their ads and their posts and declarations were liberty-defending. In other words, hypocrites.

So, a bunch of people, including myself, are beginning to leave Go Daddy in droves. I have about 20-30 domains with them, and they’re all leaving. This process, you will not be surprised to hear, is somewhat laborious, with Go Daddy throwing ALL sorts of things in the way, including spectacularly crappy and misleading tricks (you unlock a domain to allow transfer by clicking on a menu called “Locking” and then unclicking a box that says “lock domains” and then hitting the button), and then a waiting period. Plus, I know better than to do all my domains through a process at once without testing it, so I’m only doing one minor domain first, going through the waiting period and then making sure it’s all kosher, and then I will do the rest. Go Daddy may call me about this – I have a “celebrity” domain which they have a specific call center number devoted to. Really. And best of all, it’s Sockington.

But when they call, they can take a flying fucking leap. We’re done.

ANYWAY

When the shit rained down from the world over the SOPA thing, Go Daddy thought they would have their legal counsel explain, point by point, why they were going to say Fuck You and keep supporting SOPA. They wrote a pretty massive weblog entry, actually.

Once people really kicked in, moving tens of thousands of domains off Go Daddy, well, then the fun began, and Go Daddy announced they were “reversing” their position, and that they still saw a need for certain protections, but SOPA was apparently not it, and oh fucking god please stop leaving us in such massive droves and please we’ll do anything you want goddamnit we have children ACTUAL KIDS HERE that need clothing and shelter and we went too far.

First of all, the best part was they’d still written the law, and were still exempt, and were still officially supporting it. All they’d done is made a new weblog entry to try and placate the mouth-breathers, the utter morons they think their customers are who think the tits-and-controversy image was fucking awesome and just wait for them to no longer care about this and we can all go back to the upsells and the deception.

So, in that way, they DELETED THE WEBLOG ENTRY DEFENDING SOPA.

And so, here we are, here I am, to say, FUCK YOU, GODADDY.

Here’s your lame-ass defense, permanently enshrined. Go suck a banana. My domains are leaving you as soon as possible. I hope everyone leaves. Go into the ground, put a plastic bag over your head, and play astronaut. You’re done.

The original weblog entry you hid:

And here it is as a .zip file. (A huge thanks to Vitorio Miliano for sending this along.)

Anyway, back to my regularly scheduled Merry Christmas.

The Flood Never Ended (And a Pledge Drive)

Still lovin’ the job at the Internet Archive.  I’m starting to forget I ever worked anywhere else and all those times I wasn’t enjoying myself.

(I actually enjoyed myself a lot at the various jobs I used to have, but it was rarely because of the job itself.)

I last posted that I’d added some materials to the archive back in September.  That list of periodicals and other materials is way out of date, kids. Let’s do a quick update.

So there’s another thousand magazine issues for you to paw through.

“What, is that it?” you say. Archivist, Please

How about some French-language computer magazines? I got a huge ingestion of those a while back, and I’ve been steadily adding them the last couple of months. They include:

There’s plenty more to add (over 100 different runs) but that’s ongoing.  Spanish and German collections are arriving as well.

But who the hell wants to read, you say. What you want is some sort of software.

Yeah, on that as well. In the Shareware CD Archive I’ve been curating, I took the thing from an embarrassing 35 CD-ROMs to the current count of roughly 761 CD-ROMs, including a massive collection of FreeBSD installation CD-ROMs courtesy of a donor from the Noisebridge hackerspace. They were going to be turned into wall art, and someone on their list said “Maybe swing those by Jason, first?” so here we are with a pretty much complete set of CD-ROMs from FreeBSD version 2.0 up through 5.4 – a motherlode of Unix and programming history.

With this latest batch, it is my firm belief that archive.org is now the largest collection of historical shareware on the internet. I would love to be proven wrong, just so I can make things right the only way I know how, by absorbing even more into the archives.

The full GET LAMP Interviews are still coming in, although they tend to hose the machine that’s doing the rendering, due to the High-Def and the noise reduction and all the rest. But they are getting done! Interviews were added for David Shaw,  Lucian Smith, and the one and only Don Woods.  Additionally, all the footage I shot in the cave that Adventure is based on is now online in a big pile, and the High-Def version of the MC Frontalot video I shot snuck on one evening.

Other dumps include the 2010 @Party Demoparty Footage, the ROFLcon Summit presentations including this one with me and Brewster Kahle of Internet Archive, and terabytes and terabytes of Yahoo! Video.

Wow, STILL not satisfied? Fine, I whip out the best for last.

The DNA Lounge in San Francisco makes webcasts available of performances going on at the club. All the performances. All the time. Since they re-opened in 2002.  Well, people who care have been saving those webcasts. They sent the webcasts to me, on a hard drive.

So here you go: Over 2,000 performances of acts at the DNA Lounge over the last 10 years. This is over 10,000 hours of music, spoken-word, DJs, breakdowns, triumphs and musical madness. Ten thousand hours.

While you’re eagerly browsing the acts and checking out the years,  let me now make an appeal to you.

The Internet Archive is amazing. Besides the massive amount of data I just dumped there, there are many other groups adding untold quantities of books, sounds, video and whatnot. Top among them is the Internet Archive itself, which by my calculation is adding a new digitized book to the site every 90 seconds. Seriously. They’re adding that many, that fast. To do this, they have a very small staff, and the costs of the archive, while a massive bargain for what it does, still mean that they have to always be on the lookout for new donations, new underwriters, all that stuff that comes along with providing this service, a service that includes the unique and amazing Wayback Machine.

So this year, the Archive is trying a pledge drive. Here’s the pledge drive page.  Donations to the archive are potentially tax deductible depending on where you live.

I just threw over 25 terabytes of material at you. Try throwing 25 bucks back.

And thanks.

Javascript Hero: Change Computer History Forever

Besides adding thousands of items to archive.org and uploading terabytes of data (I’m at 28 terabytes of data uploaded since May of this year), I’ve also been working on a bunch of fronts to bring a whole raft of knowledge and history into the browseable, usable world. Trust me, a lot is getting in there. Allow me to both reveal the next step in this grand arch plan, and put a call out for people to help.

To review the Grand Arch Plan, which has been going on for 30 years:

Step One: Begin collecting computer history. I started this step when I was 9, pulling together printouts, cassettes, later floppy disks, and hardware.

Step Two: Put it all up on the Web. I started this step when I was 28, creating textfiles.com and consistently adding to both that collection and related collections.

Step Three: Absorb the human stories. This is what BBS Documentary, GET LAMP and the next three documentaries are for. This has resulted in hundreds of hours of footage of people talking about computer history, almost all of which I am putting online into the collections begun in step two.

And now the next step:

Step Four: Ubiquity. Make it possible to get to all of computer history from everywhere, wherever feasible. Do what it takes to make it feasible.

I’m well into this step, having affiliated myself with one of the largest public data collections in the world and giving them massive piles of materials from the first three steps. Everything is open, everything is on fast pipes, everything is easy to pull down and do what you want with it. It’s going very, very well.

But on the whole I am primarily dealing with artifacts and not experience. A number of people have done some good work to bring in experience of computer history, most notably the Emulator People. In fact, if you don’t go too crazy on the rococo specifics of the accuracy of emulators, they do really really well to take you from “I wonder what it was like to play Choplifter” to “Wow, I am playing Choplifter”. And as someone sitting in the channels of several emulation projects, I will tell you they are all getting better, every single day – improvements in speed, accuracy, flexibility and expandability.

So here is what I’d like to do.

I want to help port the MESS and MAME emulators to Javascript.

Without sounding too superlative, I think this will change computer history forever. The ability to bring software up and running into any browser window will enable instant, clear recall and reference of the computing experience to millions. Setting up images that provide walkthroughs of specific computer history/reference will allow playing and recall of all manner of things online from the last 50 years (the MESS emulator has support for the 1960 PDP-1). I am more than willing to engage in debate over this – but my hope is that you’re past this and going “but how is that even possible?”

It’s possible. Javascript has become unbelievably powerful. Here’s some stuff you may not know Javascript has been able to do so far:

  • Linux. Specifically, a javascript emulation of PC hardware, with an entire Linux OS running on top of it.
  • H.264 – They’ve now implemented an H.264 codec in Javascript.
  • PDF Reading. The pdf.js reader will allow you to read PDFs in anything with Javascript support.
  • Apple II – Gil Megidish has implemented an Apple II emulator in Javascript, which you can play games in.

My strong belief is the emulator people should focus on emulation, and the javascript people on javascript – that javascript should just be one of the ports of MESS and MAME to accompany all the other ports. I feel like there are emulation people who are really focused on the proper accuracy and reliability issues, and Javascript people who are really good at taking accurate, reliable code and making it work in Javascript. In fact, I suspect it’s very easy – we just need someone focused on it.

I’m focused on it. It’s what I do. It’s what I’ve been doing for 30 years.

I am right here. I can be reached at jscott@archive.org or jason@textfiles.com and we can get started making an ad-hoc group to work on it. I can answer questions and talk to anyone. This is priority one for me.

Hope to hear from you.

Jason Scott: Shareware Calvacade

There’s fast, there’s ultra-fast, and then there’s the speed at which Adrian “IronGeek” Crenshaw has rendered out and uploaded the full talks from his first annual Derbycon hacker and security conference. As it was, and due to an extremely silly scheduling conflict, I could only attend the first day of the conference, and because of a series of late flights and missed connections, I got in so late on Thursday that during Friday I had to take a 2 hour nap just to be functioning for my 7pm talk.

But regardless, I got my chance to present a new speech, Jason Scott’s Shareware Calvacade, and he has it up on youtube.  Here you go:

Here’s a direct link.

This really is just a fun little speech, mostly providing an overview of the history of Shareware, some wild tangents, and some weird images of the computing past. It’s not infused with the weight of responsibility or an overarching theme – it was meant to be a pleasant post-dinner (or pre-dinner) collection of Neat Crap, meant to inspire people to my big works coming down the pike from my Internet Archive work. I hope it does that, as well as allow me to scream at an Acer Laptop and tell the Worst X-Box Live Joke Ever.

One thing I do want everyone to bring out of it is how I’m looking for more material for CD-ROMs and software in general! Don’t hesitate to contact me if you think you have some lying around and want it to live again as an exhibit or archive.

I’d like to thank the Derbycon folks for an amazing time, even if I truncated it, and to congratulate them on their wild success (the convention was sold out, and filled to the brim with awesome folks).  Next year, I’m there the whole way through!

 

 

A Cloud of Opinion

As is often the case with Reddit, some random user randomly linked to one of my weblog entries. In this case, the Fuck the Cloud entry. And as is often the case with Reddit these days, it smeared any previous record of hits I ever got on the weblog, ever, since I started doing this, and I had my top reading day ever: 41,000 users in a 24 hour period (and another 6,000 the next day).

And as is typical, people found ways to discuss every possible interpretation of the entry and every possible interpretation of everything not the story: the color scheme, my sex life, my age, my resume, my own use of “cloud-like” services, you name it. Opinion Spectrum Collapse Disorder – I coined it!

Much more interesting was a little rumble of second-wave folks finding me and addressing me, ones who missed the whole thing the first time and maybe missed me all this time, and who came to me for more. And of that, there was WebProNews.

WebProNews would not normally be the type of entity I would either browse, or even think about – caked with ads, resembling a horse-racing call-sheet more than a website, this place creates tons of news stories with a perky host, and posts almost every day, giving you ads and sponsor links galore while providing content.  But for some reason, when they asked about talking Cloud with me, I said yes.

I did it over Skype, which was damned convenient, although maybe with my hair such a mess I should have worn a hat.  On the other end was a lady with a notepad and a green screen, whose name was Abby Johnson (No, not that Abby Johnson), and damn if she wasn’t one of the best interviewers I’ve ever had, save for Kevin Poulsen.  She asked all sorts of good questions, gave good followup responses, and took the conversation all over the place.

So here, in a rare show, I link you to two versions of the same video on WebProNews:

Instead of Ranty-Go-Bragh Jason which is what we usually get related to the Cloud, this is thoughtful, measured Jason, a rare sight indeed, like two unicorns chained together with goblin gold. I figure faithful readers deserve to see it. If you just want the video, you can click on the little chain at the bottom of the video window and grab the whole thing, like I did.

Great job, Abby. It’s rare indeed that I link to a site like that, but such respect deserves my respect.

P.S. Fuck the Cloud.

The Floodwaters Rise

So this is what my job is and that’s pretty goddamned great.

I am currently in a project to gather up scanned copies of every computer magazine or newsletter that has gone out of print. That’s what I’ve mostly been up to, and that’s where I am.

As of this writing, I have put up the following magazines and newsletters.

Computer Magazines

Computer Newsletters

This is over 2,500 issues of magazines. It’s a little harder to calculate page counts, but I believe we’re somewhere on the order of a quarter million pages (250,000) uploaded in the last seven days. When I’m productive, I’m productive.

Let’s get things clear – I am not the person who scanned these magazines, not the person who collected them (in a few cases, I’ve been sent copies of magazines from this list or which should be on this list, but I didn’t scan them). I’m just someone who has gone out and gathered these from a huge number of sites that have one or two magazines, or huge piles of newsletters and magazines, and I’m purely a middleman. A very, very active middleman.

The Internet Archive backend is amazing and I have been learning a lot of ways to work with it. Specifically, I wrote a script called ingestor which can be handed PDFs of magazines and feeds them into the right place, the right collection, with the bit of information needed to make each one slightly useful, i.e. title, date, and grouping. From there, the even more amazing deriving mechanism of Internet Archive converts the PDF into EPUB, Kindle-compatible, DAISY, DjVu and whatever else it thinks it’s capable of. It also provides a kick-ass online reader so your device or browser can just start immediately reading the document. The whole thing is hardcore and intense and when I hand off the PDF of a magazine issue, I truly feel that issue is saved, preserved, and relevant.
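For the curious, here is a minimal sketch in Python of the kind of bookkeeping a script like that has to do before anything gets uploaded. This is not the actual ingestor: the filename convention, the `describe_issue` function, the field names and the collection name are all assumptions made up for illustration, and the upload step itself is left out entirely.

```python
import re
from datetime import date
from pathlib import Path

# Hypothetical filename convention, assumed for this sketch only:
#   <Title_With_Underscores>_Issue_<number>_<YYYY>-<MM>.pdf
FILENAME_PATTERN = re.compile(
    r"^(?P<title>.+)_Issue_(?P<issue>\d+)_(?P<year>\d{4})-(?P<month>\d{2})$"
)


def describe_issue(pdf_path, collection):
    """Build a metadata dict for one magazine PDF from its filename alone."""
    stem = Path(pdf_path).stem
    match = FILENAME_PATTERN.match(stem)
    if match is None:
        raise ValueError(f"filename does not follow the assumed convention: {stem}")

    title = match["title"].replace("_", " ")
    issue = int(match["issue"])
    # date() raises ValueError on an impossible month, which catches typos.
    published = date(int(match["year"]), int(match["month"]), 1)

    # Field names are illustrative, not a real archive schema.
    return {
        "identifier": f"{match['title'].lower()}-issue-{issue:03d}",
        "title": f"{title} Issue {issue}",
        "date": published.strftime("%Y-%m"),
        "collection": collection,
    }


if __name__ == "__main__":
    print(describe_issue("scans/Your_Commodore_Issue_012_1985-09.pdf",
                         "example-magazine-collection"))
```

The point of doing it this way is that everything a script can know comes from the file's name and where it was found, which is exactly why the result is only "slightly useful" until a human adds more.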

But the work isn’t done. Here’s what I hope gets done, what I think is left to do.

When a magazine issue is up, say, this issue of Your Commodore magazine, you only have a small amount of obvious information that a script could yank in: Title, Issue Number, Month, Date. What should be up there is at least a copy of the table of contents in the description, which will help searchers greatly. What would be a dream would be a set of metadata pairs like the UPC code, the editor’s name, the page count, and the rest. I have little hope that we’ll get the dream, but the stuff that should get up there will ideally become the province of interested parties.
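To make the gap between "what a script can yank in" and "what should be up there" a bit more concrete, here is a rough Python sketch of turning a hand-entered table of contents plus a few metadata pairs into a plain-text description. The `build_description` helper, the sample contents and the sample pairs are all hypothetical, invented for this example; none of it is anything archive.org defines.

```python
def build_description(contents, extra=None):
    """Render a table of contents and optional key/value pairs as plain text.

    contents: iterable of (page_number, heading) tuples, in any order.
    extra:    optional dict of additional metadata pairs (editor, page count...).
    """
    lines = ["Table of Contents"]
    # Sort by page number so the entries read in print order.
    for page, heading in sorted(contents):
        lines.append(f"  p. {page}: {heading}")
    if extra:
        lines.append("")
        for key in sorted(extra):
            lines.append(f"{key}: {extra[key]}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Invented sample data, not taken from any real issue.
    sample_contents = [(30, "Type-in program listing"), (4, "Editorial")]
    sample_extra = {"editor": "A. Example", "pages": 100}
    print(build_description(sample_contents, sample_extra))
```

Nothing here is hard. The hard part is that somebody has to open the issue and type in the contents, which is the job I am hoping interested parties will take on.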

Besides the scanning that the army of anonymous or not-so-anonymous groups are doing, I’d like to see the descriptions and tables of contents get swapped back and forth between us – archive.org having a copy, the scanning or hosting site having a copy. If you look at some of the magazines, such as Compute! Magazine, you can see indexes and descriptions are already in. That’s because work was done on another site – I’ve given that site, atarimagazines.com, credit for doing so. I believe intensely in giving credit for it, and making those indexes generally available. I hope people consider adopting a shorter-run magazine, and doing all the describing – the cool part is you can read up on this magazine and learn everything about it, reading every issue and getting steeped in the history. But we’ll see how this all pans out.

The work continues. My intention here is to no less than utterly change the state of access to this vital aspect of computer history, this collection of programs, advertisements, images, and information related to home computers. I hope the payoff is worth the effort.

Get reading!

After the Flood

There’s good and bad in the Star Trek franchise, but certainly The Next Generation had some mighty subtle portrayals buried in among the episodes, especially as things got a bunch of seasons in. One that has always struck me, personally, is a short scene in the beginning of an episode called The Chase, where the captain’s old mentor in archaeology tries to convince him to take a leave of absence to go on a hunt for something or other; the rest of the episode concerns this, and isn’t relevant to what I’m saying. What is relevant is where he directs the captain’s attention to an artifact on the table and asks him to check it out.

I’m in no mood to discuss Star Trek canon, but just want to point out that Patrick Stewart, on the right, is an excellent actor. The scene called for him to be told that this artifact, rare enough as it is, was also intact, containing smaller statues inside and therefore unbelievably rare. To show his character reacting to this fact, Stewart put a lot into his performance – his incredulity that the statue was there at all, and then a wide-eyed releasing of breath to realize it’s complete, that he is holding in his hands one of the rarest (and most precious) things he could want to find, especially as an archaeologist. It is, like I said, remarkably subtle and way too subtle for what the scene calls for, which is mere motivation to later avenge the professor’s death and go on a chase through the galaxy.

But that feeling, of encountering something thought long gone and realizing you not only can see this thing that once was, but see it complete and whole is one of the strongest motivations I’ve had in the last 15 years of all this collecting materials of online life.  I’ll see some site with a handful of badly scanned remembrances of something no longer active, and a few cramped words, and then we’re expected to be happy this is all there is left. Well, maybe we’re not happy and we wish there was so much more. And if I have to be the one to assemble the much more, so be it.

When I bring something online or assemble something from distant or obscure sources, I’m doing it for someone who is not me. I’m doing it for a person who has a whisper of a memory, be it a turn of phrase or a hacker’s handle or a BBS name or even some sort of image, like a toy robot or a smiling girl holding up some product in an ad. I want them to find, as easily as possible, that very thing, and to get not just exactly what they sought, but the entire pantheon and context of what they were searching for.

So, remember Compute! magazine?

 

Boy, I sure do. Compute! was one of the biggies, one of the magazines you’d see down in the magazine rack at the bookstore and which was filled with bright, happy pages promising the world if you just typed in one of the programs, or which gave an optimistic outlook on how much wonderful stuff you could do with computers. If you subscribed to it (or merely bought every issue, like I tended to), then Compute! was a centerpiece of your computing experience at that time.

Well, here you go.  Here’s every single issue of Compute! Magazine on archive.org.

Go ahead, browse around. You can open any issue, read it online, or download a PDF or kindle/e-reader-ready version, and look back on that awesome time, with breathless ads and helpful tips and ideas on what the next best thing was going to be, with the added advantage of knowing when they were right and wrong. Smile with knowing delight as someone predicts the future, and wince when they get so close but not close enough. The fact that nearly every person ended up becoming a self-contained GPS, communications and information hub wasn’t really on the horizon, so in many cases they’re amazingly off, assuming that machines would continue to be tethered to desks and the phone system would be a constant thorn in the side with its zone calls and strange mechanics.

You know, Compute! had a spin-off magazine? It was called the Compute! Gazette, and focused directly on Commodore machines. (Commodore 64, 128, Amiga, and so on.) It was also really well-put-together and some folks subscribed just to the Gazette if they were particularly Commodore-oriented. (Although most just ended up buying both magazines.)

Well, good news, because here’s every single issue of Compute! Gazette.  Again, all browsable, all readable, all downloadable, every single issue. There were about ninety issues of Compute! Gazette, and something in the 170 issue range for Compute!, depending on how you count the specials and the one-off issues.  Or maybe Atari computers were more your style, which is fine because here’s every issue of Antic Magazine, an Atari-oriented periodical, and there were eighty-five of them in all.

Gosh, well, maybe Ham Radio was more your deal, and just over this corner, we can see every single issue of Ham Radio Magazine, all two hundred and sixty-eight issues.

Go more obscure? Diehard, the Magazine for Commodore 8bitters was a mid 1990s phenomenon, all 23 issues there. If you weren’t a computer user in the UK, you probably never encountered the full run of Big K Magazine.  I guess that’s about it… no, wait, here’s 85 issues of Your Commodore magazine, can’t forget that one.

Folks, I just dumped five hundred magazines into your lap.

Let’s be clear – I didn’t scan any of them; an army of volunteers, fans, historians and romantics did so, over a number of years. The indexing, generally, didn’t come from me (I did write some up), but came from sites such as atarimagazines.com that have been doing this whole “bringing in the magazines” thing for years before I focused on it. What’s different, mostly, is now these items are in a library, the archive.org archives, where they can be read and experienced in one place, in a uniform interface, although I will be the first to admit the interface can be much better at some of the other sites.  I merely wrote scripts to ingest them into archive.org and did the occasional call for volunteers to help describe the issues without descriptions, a task that continues to the present.

I got these 500 magazines up in about 3 days. 72 hours. Imagine what things are going to be like by the end of the year. An exoskeleton, indeed.

Let’s step back, though, and this is what I actually want to talk about.

 

The often-automatic and frankly entirely valid question that comes from encountering, say, a 500-issue online stack of 1980s computer and technology magazines is “Why are you doing this? What purpose could this serve?” And my general answer has always been “Get the fuck out of the way, we’re losing precious items while we dawdle and diminish”, and while that is definitely still the case and my fight goes on to rescue lost data and artifacts, the question’s relevance and merit begin to leak into the margins of my work.

How much stuff is too much? I don’t have an answer for that and I don’t really see encountering “too much” in terms of acquisition, ever, where computer history is concerned. I’ve been in a room with basically every train set piece the Lionel Company ever made and that’s a lot of trains, folks, and I don’t see anything wrong with that. I don’t have an upper limit for the actual containing of items. But I do have concerns about the capacity of the potential audience to gain any amount of context or relevance onto the items.

So, being all Archive Team and stuff has gotten me running with a large social group of archivists, and that’s been a rather interesting experience. Archivists are not, professionally, a particularly bold bunch. Oh, off hours they are fucking insane but within the confines of the kinds of institutions that tend to hire archivists, being someone like, say, me, isn’t a recipe for long-range job security, so there’s a lot of contrast between my outlook/expressions and “the industry”. But often I’ve used different terms for things the “industry” has been dealing with for years, maybe decades, so forgive me if I keep doing that, forever, because I don’t tend to go to many archivist cons and I believe there’s something like a dozen librarian bloggers who would tie my shoelaces together if they saw me waiting near a train platform.

So like every industry, they have the usual things that linger in the background that drive everyone nuts but the solutions are rather difficult, and then there’s someone standing up one day going “pie is bad!” to much hmmmphing and rumbling, accompanied by “pie is good!” arguments and “why are you all arguing about pie” hand-wringing from the sidelines.  Every group does this, there’s no shame in it. But the issues they discuss relevant to my growing pile of magazines and items are metadata and curation.

When John Flansburgh of They Might Be Giants worked in publishing (the basement of Condé Nast), he would often encounter strange images that then got incorporated into the band’s album artwork and flyers. To a Flansburgh, curation and metadata have no relevance – he would just go through everything as a matter of work, and then come in with “this is weird” and set it aside. Life provides a lot of opportunity for stumbling, endlessly. Nobody should be worried about the Flansburghs.

But if there’s someone who looks at a massive stack of magazines, online or off, and the question is “When did Synapse Software begin advertising in these issues and when did they stop, and what products did they advertise from their catalog?” then metadata becomes critical. Especially if the questions keep coming; a one-off crazy search is one thing (Pixar will do things like find every jailbreak scene, or every bank robbery scene, for reference material, from the corpus of known films), but if this is all you’re doing, all the time, then metadata is the difference between getting things done and focusing all your energy on convincing people with requests that they don’t actually want to make that request, i.e. Asshole Librarian Approach.

Curation, the flip side, is presenting a smaller portion of an archive or container in such a way to focus on a subject and bring clarity to an audience or yourself – images of consumer-based robots, perhaps, or video games in which someone has been taken hostage. A curator wants metadata badly, but depending on the quality of the curation or its needs, they can get by with a minimal amount. The quality of the curator, by the way, leads to low-hanging fruit getting put out there (oh boy, another image of “Pac Man” to illustrate any sort of video game) while also bringing in a breathtaking amount of amazing work when done right, like Jambe Davdar’s beyond belief series on Star Wars, which is one of the most masterful commentary/annotations of a film series you will ever see. But the amount of work involved… well, that’s the secret pain of the curator.

Again, you’ve been handed 500 magazines. What’s your response? Do you smile, grab your iPad, find a quiet part of the house, and read? Or do you make a fist and pump it because you’re finally going to be able to type in that program way back when that you played as a kid? Or does just the fact that it exists comprise your entire interest in it, that you want to know that it’s out there, somewhere, being held “just in case”?

Or maybe you look at the pile and go “But we need to mark who wrote for what article! We need to list the companies in each issue! We need…” and what you do next tells me if you’re a complainer or a doer. I don’t have time for complainers, frankly. I’m too busy doing.

This flood is rising. It’s been around for a lot of other subjects and I’m going to help bring the flood to what we call “classic home computing”. Steady yourselves and decide what you’re going to be to it.

Laughing by Telegraph

We had a little discussion on one of my mailing lists about the origin of the phrase “LOL”, which has a couple claimed fathers – but I thought I’d throw this into the mix. Reprinted from The Annals of Hygiene, Volume 6, published by the Pennsylvania State Board of Health.  Official date of publication: 1891.

Laughing by Telegraph.

Telegraph operators lead a highly monotonous life, and are entitled to all the diversion they can extract from the unemotional machine over which they preside. A laugh transmitted over the wires cannot be of a very infectious nature, but it can be accomplished, nevertheless. When an operator becomes lonely, says the Indianapolis News, and his sounders are clicking out messages not intended for him, he calls up some friend and opens a conversation. This, of course, cannot be continued long before something “funny” is said. It then becomes the duty of the operator to laugh, which he does by making four dots, then one dot and a dash, thus: . . . .  . —, spelling ha. Thus, to all jokes he replies h-a, h-a. From the same authority we learn that surprise or incredulity, as well as amusement, can be conveyed by a few clicks; thus four dots followed by two dashes make the expression “hm,” the precise meaning of which, in any given instance, is to be judged, no doubt, by the context.
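
For the curious, the operator’s laugh is easy enough to reproduce. What follows is only a rough sketch in Python, and it assumes nothing beyond the three letters the passage above spells out (h as four dots, a as a dot and a dash, m as two dashes). The table and the function name are mine, made up for illustration, and this is in no way a complete Morse alphabet.

```python
# Only the three letters described in the 1891 passage; NOT a full Morse table.
# The names MORSE and telegraph are hypothetical, chosen just for this sketch.
MORSE = {"h": "....", "a": ".-", "m": "--"}

def telegraph(word):
    """Encode a word letter by letter, putting a space between letters."""
    return " ".join(MORSE[letter] for letter in word.lower())

print(telegraph("ha"))  # .... .-
print(telegraph("hm"))  # .... --
```

Any letter outside those three raises a KeyError, which seems about right for a table that only knows how to laugh and go “hm.”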

And from a book entitled Historical Sketch of the Electric Telegraph, published in 1852:

To expedite transmission, the communications are made as brief as possible, by the elision of letters, and syllables, and sometimes of half a word; besides which, many conventional signs are made use of. ‘We have,’ says Mr. Walker, ‘a signal for the period or full stop and for paragraphs; and we have one for underlining words. And we have many very valuable special signals. There is also a signal among the clerks for laughing, and one for the whistle of astonishment.’

A certain phrase of interest to the contemporary internet set appears in The New England Farmer, 1869:

There sat the little fellow, busy with his blocks, and in reality not heeding a word of what was being said. But no sooner did the pause come than he turned round, and rolling on the floor, laughing as though his little sides would burst, shouted: “Go right on! that’s just such as I like to hear every day!”

If you look hard enough, everything’s already there.

With Friends Like These: Archive Team Saves Friendster

Let’s get the numbers out of the way immediately before they’re misreported. Archive Team rescued, roughly, 20 percent of all the profiles on Friendster.  This took us many months and reflects the sheer mass of Friendster’s data, well into the 70-80 terabyte range, which sat behind custom software and which they summarily deleted from their user accounts.  We got maybe 10-15 terabytes of it, as best we could.

If you were a fireman going into a burning home, you’d grab the people, then the pets, and then you’d let the fucker burn because no fireman or person’s life is worth the risk. If you were the person in the house, however, you’d probably go after the other people and then grab family photos and maybe that sack of gold under the dresser and perhaps that Picasso you hang over the coat rack. In other words, you prioritize.

Now, instead of a fire, imagine it’s the Bagger 288 moving over the landscape towards an apartment complex with 112,000,000 apartments in it, and there’s this asshole shouting over a megaphone thanking everyone for living in their apartments for so long, and Archive Team is there with hacked-together go-karts flying down hallways and grabbing as much as they can with robot arms.

IT WAS EXACTLY LIKE THAT.

As a result, we prioritized ourselves – going after the first millions, then scattershot through the whole range of IDs, grabbing representative examples of profiles, up through to the deadly end, where we had Friendster accounts of our own (it took having profiles to acquire some information, and a couple brave Bothans who wrote custom scripts to yank out the un-yank-out-able). As a result, we got pretty unique directory structures, including blogs, photos, “shout-outs” and messages, all available to the general public. Oh yeah, Friendster killed tens of thousands of blogs, and millions of photos, and… great job, people.

This one was much more of a learning experience than Geocities, and the next time we’re faced with such a huge place going down in flames, we’ll be sure to use what we learned. Until then, we have these blocks of data.

We had one soul go through the entire userbase and groups of Friendster to produce this social graph/database of all the membership inter-relations, for the purposes of historical and academic study. Those aren’t too bad in size.

But the main data… that’s pretty big stuff.

But enough introduction: may I present the first million Friendster accounts.  There are still gaps as I put together the in-flooding collections from dozens of team members, which will take some time – the way the archive.org system works, I can inject the missing files into the gaps as time goes on. Believe me, though, with 112GB of data, there’s plenty to check through as it is.

I know it might seem we should be proud of our work, but to be honest, I just consider it all with a blanket of sadness. It’s terrible this is happening – it’s awful that years of work are being destroyed in an ill-advised, greedy, misdirected moment. For years after this, people will go to check on their Friendster account, find it wiped, and find the whole thing has been sold down the river and is unable to help them get any of it back. With luck, Archive Team got it, but there’s no promise of that, and the form we have it in will be like finding out someone burned your place down (that metaphor again!) and Archive Team has a generic cardboard refugee box that we jammed some percentage of your stuff into and stuck into a massive shelf. That’s no replacement for the platform of expression and self and network of friends you had. None at all.

But it’s all we got.

How many more times?

A Human Moment in Spam (Update: Nope, Still Machine)

On the post I did about Floppy Disks, that’s gotten a lot of attention in its own right, there was a comment the weblog software marked as suspicious, and possible spam.  The content itself was rather innocuous and in theme:

I will always keep a few floppies and a 3.5 inch drive. Can’t install older versions of XP or 2003 without them to load custom drivers at setup.

I also have the 5.25 inch drive and disks (just in case).

In 25 years some government secret on a floppy will need to be read and I’ll have the last drive and hardware to read it. Then I’ll be rich. Mark my words.

But yes, it was definitely spam – the website attached to the user, their e-mail address, and the name of the user were all references to a specific fruit that appears to be the new thing garnering the need for attention and weblink juice to get its message of “please fucking buy me” out to as large an audience as possible. It is, without question, spam.

There’s a range of commentary I make where I am probably not saying something new to a lot of people, and I get a lot of response of “well duh”, but I still feel the need to mention it, for historical context later.

Spam, as you know, is the idea of unsolicited commercial speech being thrust upon audiences without their consent, assent, knowledge or awareness. I’m going to leave it at that general definition for the purposes of what I’m talking about.  Originally a reference to endless messages on Usenet (and emulating the Monty Python Spam Sketch), it expanded out to e-mail as e-mail became more prevalent. From e-mail, where it got the most hand-wringing and programmatic work to divest it away, it has followed, at various paces, to every single platform in which any number of people congregate. Spam has cropped up in Twitter, Facebook, online games, IRC chat, Flickr, wikis, and places infinite.

Someone solved the spam problem a while ago: go after the tiny handful of banks that the spammers take payment from.  But, knowing how to solve something and solving it are two different plateaus on a very large landscape with dangers and silliness fraught between.

So instead, let me focus on what I’m observing here, in this specific situation, and why I think it needs to be made into a historical note.

A person made this. For me.

It was weird to have machines scripting spam in the 1990s beyond the standard “and take this pre-written pitch and shove it in as many newsgroups/e-mail addresses as possible”. You wouldn’t have it, say, look at the writing of a newsgroup, re-arrange the last major topics talked about, plagiarize items from a year or two previous, and compose a “new” message that turned out to be links to spam. If Usenet had survived into the later 2000s, it would have. And that’s what I’ve been seeing – scripts that knock on my doors with composed text taken from the weblog itself. So there’s been advancement there.

But this… I checked. This is a valid comment. They’re saying something reasonable and clear, something that would be a perfectly fine comment about floppy disks, their perspective on them, and what they might have said if they were some pleasing retro-nerd happy I was talking about 5.25″ floppies. It was composed for real, for this weblog posting.

This is an enormous human cost, compared to just scripting once and blowing it out across a thousand million weblogs.  It means, and I knew it was coming to this, that hiring humans to do things so bone dull has become enough of a commodity that spammers are employing this to get their messages injected into the conversation. If my software didn’t show me the spammy e-mail and website attached to the posting, I’d have approved it.

I don’t like it at all.  I don’t know where this is going to lead. But it’s worth noting, for the future.

Update: It turns out this was a spam script, which took one of the comments on a Reddit link to my entry and then refashioned it into my commenting structure. Brilliance. So, we’re still in machine land. I am actually rather relieved. There are some things human beings just shouldn’t be doing.