The MUD Archives: It’s About Goddamned Time

OK, just a few days into my Archiving position at Archive.org, I think we’ve had enough time before I start initiating new projects.

In Archive.org’s realm, you can add “items”, which can be movies or books or software or what have you. You can then put those items into a “collection”, which is where you can declare all these disparate items as being related in some way. You can also put a bunch of collections into a greater, meta-collection. In other words, we have plenty of space, plenty of ways to classify, and there are lots of items you can put in.

At this point people who know of me from the last ten years or so might not know that I was a co-founder and administrator of a MUSH, a MUD Variant, called TinyTIM. There’s still a site for it, and it’s still up, 21 years after being founded. Longevity, we has it.

While I had a storied and checkered history with MUDs, I do think it was worth the years of my life I spent doing it, and I can personally attest there’s a rich enough tapestry of artifacts, lore, events and people that it is an absolutely valid “thing”. It’s something worth writing books about. It’s something worth making a documentary about. And it is certainly and completely a critical thing to archive.

Wikipedia is quickly showing itself to have an unexpected use as a measuring tool: it is an early-warning system for finding out what knowledge is falling out of favor and is in danger of being forgotten or lost. At one point, they started attacking the demoscene pages, anything with groups or events, and desperate attempts to keep the articles about Demoscene-related subjects revealed obvious things, like how Wikipedia Encourages Bureaucratic Assholes, but also less-obvious things, like how there were at best spotty examples of specific demoscene information in a greater social context. If you knew where to look, you could find these artifacts and stories, but since Wikipedia only wants an ever-shifting set of “legitimate” sources for the viability of a subject, a bunch of stuff was deleted.

Well, they started doing it with MUDs. And MUDs, trust me, are as real and vital a subject to our computing history as many others. MUDs were virtual spaces dealing with the wonder and misery of human nature long before these started showing up in what we currently call social media. They were mass communities that spanned the globe, led to relationships and hatreds, and influenced a generation of computer users in how they would think about these computers as tools and barriers. So, Wikipedia’s assery is history’s gain, because the fire is lit.

It’s time for a MUD archive.

Now, I will happily take on this mantle. I am your go-to guy for collecting MUD history, lore, data, stories, you name it. I want photos. Recordings. Listings. Source Code. Webgrabs. Lectures. Heck, databases and .tar files sitting in the back of your hard drives. I will help you get them on the archive, get them into the collection, ask you the questions you need to answer and do the lifting. I am here for you.

Let’s not let this die. I conducted interviews with MUD’s co-founder Richard Bartle and TinyMUD’s creator Jim Aspnes as part of GET LAMP. I will ensure these get onto the Internet Archive. I will provide any materials I can related to my own MUD life. I will start contacting archives out there and see about copies getting on the Archive.

It’s time. Let’s go. Spread the word. My e-mail address, and aww yeah do I love saying this, is jscott@archive.org.

Let the work begin.

The Splendiferous Story of Archive Team

I figured I’d show you what I wrote as notes for my presentation at the Internet Archive.
The audio from this presentation is here. (20 min, 28 MB).

PRESENTATION BY JASON SCOTT
PERSONAL DIGITAL ARCHIVING CONFERENCE 2011
FEBRUARY 24, 2011
INTERNET ARCHIVE, SAN FRANCISCO HEADQUARTERS

There is nothing more tiring than an activist. They’re boring in
conversation, hard to have within earshot, and there’s a sense, coming
back to them later, of nothing having changed, because they’re saying
the same things again, and again. They’ve got this single dimensionality
about them. It’s just… tiring.

My name is Jason Scott, and I am an activist. I’m an activist about a
bunch of things, but today I’m going to talk about being an activist
concerning digital heritage.

Before I fill the full fifteen minutes, all I can say in my defense is
that even though I’m an activist, even though I can keep a shrill
one-note symphony going for way too long about the subjects I care
about, I also have a sense of humor.

I have a cat named Sockington. At this moment, my cat has nearly 1.5
million followers on twitter. He’s been featured in magazines,
newspapers, television and has fan art made about him. In the past few
years he’s been discussed during morning drive radio, and just this past
week, he was a question in a quiz in the Ladies’ Home Journal. He’s been
bombarded with endorsement deals and offers of representation, all of
which he ignores. He is, after all, a cat.

So if nothing else, you can say you met the guy who has the most popular
animal account on twitter. But I hope you’ll remember the rest of what I
have to say.

I’ve been a collector for years. (I’ve learned, over time, that there’s
places to call yourself an “archivist” and places not to, and a room
full of archivists who spent a lot of time and money on degrees and
training is not one of those places). As a collector, I had already made
it a point of going after marginalized data, the textfiles and message
bases of dial-up bulletin board systems of the 1970s, 80s, and 90s. It
turns out that in many cases, I had one of the only copies of some of
this, pulled from printouts or late-night visits in my teens, and my
immediate urge was to share and provide them for as many people as
possible.

I could go into this further, but there’s no time. Let me just leave it
at that – history is something I care about, that I learn from, and
which, when I acquire it, I work as hard as I can to share to the
maximum amount of people.

This all became a site called textfiles.com, which continues to house a
terabyte of bulletin board system history, as well as a range of other
interesting collections that have fallen into my hands. From this, I
moved into documentary filmmaking, hosting events, and giving
presentations. And with that came being known as someone who would take
old stuff off others’ hands, and with THAT came being contacted when old
stuff is in danger.

Obviously, a phrase like “old stuff” means a wild amount of things, and
in my case, I mean consumer-grade home computer stuff. Old floppies, old
machines, piles of magazines, printouts. I became the go-to guy for this
- half of what I acquire comes from a group of people discussing what to
do with old material, and someone says “Jason Scott”, and one of my RSS
feeds gets triggered, and I show up an hour later asking what I can do
to help. I’m like an EMT for computer history.

And while I’m defining things, let me say what I mean by “Danger”. I
mean danger of deletion, a danger of being lost, a danger that a piece
of history, with its value unrecognized and a lack of interest in what
it might mean, might just be lost forever. That kind of danger.

And what happened in the last decade or so, is that an awful lot of
computer history is in danger. A lot of it has been deleted. In fact, if
you step back and look at it, the loss of data has moved to epidemic
proportions. I use the term epidemic specifically here; I mean that
there is a mental condition to accept the loss of data as the price of
doing business with computers. And beyond that, the expectation that
data will be lost, and the spreading of this idea to the point that data
loss becomes no big thing.

Well, it’s a big thing. It’s a huge thing. It’s so terrible that I don’t
even know how to frame it half the time. So let’s start with what I
guess could be called my awakening to the problem.

The shutdown of a site called AOL Hometown, which was actually a bunch
of previous sites put together, was only the most recent of occasional
shutdowns of user-content sites. They gave two months notice, and then
completely deleted the data, with no recourse for people to get it back.
Two months, for a site that had been up for a decade. In a lot of cases,
e-mail was out of date. Or it went to an address people didn’t use. And
when it was gone, it was gone.

Something about this really cranked me out. I guess it was that sense
that all this stuff people had made online was being wiped away as if it
all meant nothing, all that writing, all that creating, all those sites
that, even though nobody was maintaining them, still had information
other people were referring to. It was all just… gone.

I said, in an angry entry on my weblog, that there ought to be a team of
people who could rescue this data, who could swoop in and grab a copy
before it was all gone, before some decision from nowhere wiped it out.
Some sort of Archive Team.

Well, people took me seriously, and within a short time, dozens of
people offered to be around to help save sites. And so we formed
archiveteam.org, we made some fun logos, and we waited.

And then Yahoo! announced they were closing Geocities. And by announced,
I mean they quietly stuck a side mention of it in a FAQ answer buried in
the support pages. But regardless, Geocities was being shut down.
Geocities!

The reactions I saw from websites and press were awful. “Good riddance.”
they’d say. “The blink tag is dead. Who needs crappy animated GIFs and
MIDIs in the background and webrings, don’t get us started about
webrings.”

But I think what they lost was that Geocities arrived in roughly 1995,
and was, for hundreds of thousands of people, their first experience
with the idea of a webpage, of a full-color, completely controlled
presentation on anything they wanted. For some people, their potential
audience was greater for them than for anyone in the entire history of
their genetic line. It was, to these people, breathtaking.

This is a site created by a mother to commemorate her lost son, who died
as an infant. What struck me, if you look at the dates, is that he died
in 1983, a full 15 years before Geocities came along, and her feelings
were still strong in two ways – she wanted to keep his memory alive, and
she saw Geocities as the way to do it. Wiped away completely with the
shutdown.

Again, I don’t have time to walk through other examples of user
creations worth studying or considering, but rest assured, there’s
plenty. And with an arbitrary, vicious, heartless move, Yahoo! shut
Geocities down.

But not before some copies got made.

Archive Team used dozens of people on hundreds of IPs, imitating other
search engines and utilizing a whole bunch of tricks, and we duplicated
as much of Geocities as we could. There were other parallel efforts and
those are appreciated, but we got 900 gigabytes of Geocities. We have no
idea what percentage of Geocities we got, but all I know is that
Geocities fits, continues to fit, on a hard drive the size of a pack of
cards.

In the time since, we occasionally get contacted by people who watched
the Geocities shutdown happen, watched their own sites get shut down, and
tone-deaf policies and lack of response meant they sat back and watched
it happen, feeling entirely helpless. In one case, a widow had her
husband, a veteran, upload all his photos from his enlisted years into a
Geocities account, then die off, and never give her the password. She
could browse the site, but she couldn’t change it. Imagine that horror
as she watched the site come down, to have her husband die again. And
imagine the letter we got when she got it all back again.

We took our copy of Geocities, that 900 gigabyte collection, and we
compressed it down to 645 gigabytes. 645! And then we did what I think
any reasonable person would do – we released it as a torrent.

That torrent will be fully seeded by the end of the week, and a few
dozen people will have Geocities to study, to research, to work with.
And a half-dozen USB drives recently went out to waiting and grateful
people as well.

This got a lot of attention, a lot of press – I read a lot of articles
and listened to a lot of podcasts about what Archive Team represents,
what it means, and the rest. And here, as I start to wrap up, is what I
think needs to be understood.

New York City is on the verge of banning smoking in public places. It
may or may not pass, but previously, a few decades earlier, it was
considered impolite while you were smoking in a restaurant to blow the
smoke into a baby’s face. I lived in Waltham, Massachusetts for years -
and decades before I lived there, you could tell what the next year’s
fashion colors would be by the colors of the dyes in the Charles River.
My point is, things were a certain way once. People who did things then
were just following the general order, and to do differently would be
strange. Friendly, or accommodating to an unexpected degree, but
strange.

Right now, we live in a world where the wholesale destruction of a place
like Geocities is a punchline, a tossed-off puff piece. The natural
order of doing business. It IS the natural order of doing business.

The current natural order of things for hosting user-generated content
is this: Disenfranchise. Demean. Delete.

Disenfranchise. Cut off any amount of support or awareness by users of
their environment and what they are putting their lives into.

Demean. When a site falls out of favor, act like it’s an electronic
ghetto, not worth consideration as a valid entity. Think Friendster,
Orkut, MySpace, Geocities and a dozen others. Say their name in the
company of people who understand the technical issues, and they snort.
For a lot of people, these sites are parties, and the party is over.

Delete. Give a random amount of warning, and I mean, it really is
completely arbitrary and made up, and then delete, with no recourse,
nobody to ask for a copy, nobody to contact to retrieve your lost data,
your husband’s history, your child’s photos. I’ve seen periods as long
as a year and as short as 48 hours. There’s nothing, no standardization,
no agreed upon procedure for decommissioning these sites. It’s all just
being made up as it goes along.

Somewhere around now, people start using phrases with me like business,
profit, how the world works. This isn’t about business. This is about
understanding that user data is a trust, a heritage, history. And
because we’ve turned it into just another thing just as millions and
millions are going online, the disasters will keep coming.

So until this gets straightened out, before we stop blowing smoke in
babies’ faces, we have ad-hoc solutions like Archive Team.

Archive Team doesn’t ask. It takes. It takes and it dupes and it saves.
Sometimes, it’s been cheered as it does so. Sometimes it’s been
ridiculed, criticized, threatened. But this isn’t a party, or a
nightclub, trying to be the new popular thing and the new way to pump
your fist and act like you did something. We’re getting stuff done.

As I speak here, dozens of people are downloading Yahoo! Video, which
announced late last year that it was closing on March 15th.
Specifically, they announced they were deleting all user-generated
content, but keeping the general site. We’ve been coordinating
bandwidth, disk space, and how to get the most data out in the most
efficient manner. We expect the resulting collection will be 25
terabytes of data. Perhaps that sounds like a lot now, but you can buy 2
terabyte drives for $80 on special. It is, in fact, not a lot. So we’re
doing it.

Besides the scraping of millions of Delicious users, a small subset of
Archive Team has formed URL Team, dedicated to pulling down the content
of URL shorteners. URL shorteners may be one of the worst ideas, one of
the most backward ideas, to come out of the last five years. In very
recent times, per-site shorteners, where a website registers a smaller
version of its hostname and provides a single small link for a more
complicated piece of content within it... those are fine. But these
general-purpose URL shorteners, with their shady or fragile setups and
utter dependence upon them, well. If we lose TinyURL or bit.ly, millions
of weblogs, essays, and non-archived tweets lose their meaning.
Instantly. To someone in the future, it’ll be like everyone from a
certain era of history, say ten years of the 18th century, started
speaking in a one-time pad of cryptographic pass phrases. We’re doing
our best to stop it. Some of the shorteners have been helpful, others
have been hostile. A number have died. We’re going to release torrents
on a regular basis of these spreadsheets, these code breaking
spreadsheets, and we hope others do too.
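
For the technically inclined, here is a minimal sketch of how one of
those code-breaking spreadsheets could be put to use. The two-column
layout (short code, then full URL), the sample rows, and the function
names are all assumptions made up for illustration, not URL Team's
actual format.

```python
import csv
import io
from typing import Dict, Optional
from urllib.parse import urlparse

# Hypothetical sample of a rescued mapping: short code, then original URL.
SAMPLE_DUMP = """abc123,http://www.example.com/a/very/long/path?with=parameters
xyz789,http://www.example.org/another/long/page.html
"""


def load_mapping(dump_text: str) -> Dict[str, str]:
    """Parse 'code,url' rows into a lookup table, skipping malformed rows."""
    reader = csv.reader(io.StringIO(dump_text))
    return {row[0]: row[1] for row in reader if len(row) == 2}


def expand(short_url: str, mapping: Dict[str, str]) -> Optional[str]:
    """Return the long URL behind a short link, or None if it was never saved."""
    code = urlparse(short_url).path.lstrip("/")
    return mapping.get(code)


if __name__ == "__main__":
    table = load_mapping(SAMPLE_DUMP)
    # The shortener's own server is not contacted; only the saved table is used.
    print(expand("http://short.example/abc123", table))
```

The point of the design is that the lookup never touches the shortener
itself, so the links stay readable even after the service has died.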

I’m glad to have made your acquaintance. It’s been a fun ride. Come
along. And if you find yourself in a position of making a few key
decisions about user-generated content, and exporting, and retention or
shutdown policies, I’m always available to chat.

Or, you could just follow my cat.

Thank you.

 

This is Fairly Big News

Let’s just cut to the excellent, ass-kicking chase.

This month, I was offered and accepted a job.  The job’s title is “Archivist” and the office is here:

Internet Archive

That is in fact the main office of the Internet Archive, yes, that Internet Archive, and yes, this has really happened.

You’re either indifferent, delighted, or reeling, so let me fill in some details. I will continue to be primarily located in NY state, with occasional trips out to SF to work on projects or meet with people locally. None of my current online projects, like textfiles.com or Archive Team, are going to disappear or become owned by the Internet Archive, although the opportunities for collaboration and mirroring just went up exponentially. It primarily means I can stop thinking about where my next meal is coming from or living in terror of getting sick, and start thinking about where the next cache of must-save data and computer history is. I am still going to make documentaries. I am going to continue to be GDC’s historian/archivist, helping them save a bunch of computer history. I’m just going to be doing what I’ve been doing, but doing more of it.

If I had to describe it in a one-liner, I’ve just gotten a powerful exoskeleton for the archive projects I’ve been up to for the last few years. I mean, seriously, look the fuck out.

I finally got to meet and talk with the folks from the Archive during my trip out to speak at the Personal Digital Archiving conference, which was hosted in their Sanctuary. Oh, you haven’t seen the Internet Archive Sanctuary?

Sanctuary at Internet Archive

See, these people have style. I am a fan of style, a huge fan of style. After touring the place, meeting the folks, and just finding out what was up… well, let me say, let me honestly say, I couldn’t find a single thing to complain about. This is me we’re talking about. Nothing! They do everything right! They approach it fantastically! The goals and mandates and awesomeness are so pervasive you start to wonder where the Oompa-Loompas are, and which room has the everlasting gobstoppers. It feels like home.

There are still details to work out, parameters to figure out, all the sort of things that collaboration brings.  I will not be speaking for the Archive – I am, as said, an archivist and I have my hands in a lot of projects that don’t necessarily overlap and the Archive has a whole bunch of projects that I am not necessarily involved in. That said, I hope a thousand wonderful things bloom from this, and when I ferret out and do fieldwork to bring in history, I will have some pretty ass-kicking tools and resources at my disposal.

So there you go, that’s about as big as it gets. I can’t wait to get started in earnest.

But before I do.

So in September of 2009, I decided I was done with having a job I tolerated to make it fund the stuff I loved. I decided my job would be something I loved. I was in no position to make that happen alone. I reached out, in a Kickstarter campaign, to raise funds to finish GET LAMP and keep me going while I got my life in order.

342 people (plus a dozen others through other means) stepped forward and made that goal a reality. It gave me the time to finish the movie, begin showing it, move, and begin my new life in earnest.  Because of all those people, I was able to accept work for the Game Developers Conference and do archive work for them.  And I am now about to turn to the Internet Archive on a dime and immediately accept an offer to work with them.  So let me be clear about that: those people changed my life. Permanently.

Other, more concrete details will arrive in time. Until then, well, let’s go with FUCK YEAH.

Get Lamp @ Alpha One

GET LAMP San Francisco Showing Tonight

OUT OF NOWHERE comes a viewing.

Well, OK, they scheduled this a while ago but for some reason I don’t tend to announce these showings on this weblog. But, why not.

I am in San Francisco, and the San Francisco Bay Interactive Fiction Users’ Group is having a showing of GET LAMP, with me attending for a Q&A.

Here’s the details on where to RSVP for the showing.

Consider coming by, if you can.

Archive Team Yahoo Video Final Push (and a rousing speech)

What, two Archive Team posts in a row? Well, it comes down to several factors:

  • I’ve been travelling for well over a week and change
  • A lot of the posts in the hopper are essays and rants not quite out of the oven
  • This whole yahoo video download is very important
  • I am attending GDC in my capacity of Historian, and that is an all-day thing

So many times I’ve gone out to some location for a while for a conference and all I have to show are a few photographs and the stated fact I went to the conference. This time, I’ve been also talking with people at night, and also working really hard with the swelled-ranks Archive Team to download the Yahoo Video juggernaut.

To recap: Yahoo are fucks. Wait, let me try again.

Yahoo! are about to delete all user-generated content on Yahoo! Video and that is really busting my crank, as well as the crank of a lot of people that have joined Archive Team to rescue it. We’re now to the point that the whole process is pretty smooth, and we’re getting into the end-time amount of stuff left to do, but we need your help, UNIX-knowing person with a server having more than 500GB free. Oh, you know who you are.

For anyone who can’t join in the fun, let me post this speech I gave at the Personal Digital Archiving conference last week. There’s both an MP3 and a text script. Bear in mind the audio does not match the script – I can’t help but improvise. I am sure they will have a video for watching later, although there’s no slides, so all you’re missing is my crazy gesturing and hat.

Why is it echo-y? Well, the Internet Archive has the most awesome speaking room ever:

They moved recently into an old Christian Science hall and the servers and offices are all scattered within this incredible building, now redesigned to use the heat from the servers to heat the building.

Anyway, here’s the pitch about the Yahoo Video final push:

We’ve been downloading like crazy. There are 9.3 million user accounts/spaces on Yahoo! Video. We’ve scraped the user information and user photos from over 9 million of those. We’re expecting the remainder to go down very quickly.

Meanwhile, we’ve downloaded roughly 7 terabytes of Yahoo Video and are downloading a bunch more, from whatever 4 million of those 9.3 million accounts uploaded. A lot is coming in, and this is due to the work of dozens of people pitching in where they can as we have a raucous time.

The generous folks at rsync.net have donated a month’s storage of over 6 terabytes for video for a holding spot while folks rush to get videos somewhere so they can download more.

Yahoo deletes ALL these files from the site (it will likely continue as a directory or paid-content version) on March 15. That’s less than two weeks. It’s going to be close. We need people with UNIX, Bandwidth, and Disk. If you have that, please come to #archiveteam on the EFnet IRC Network and join up, or talk to us, or whatever you’d like to do.

Or, send us cash. Literally. The way we’re doing this is to put the videos on pairs of 2 terabyte drives. Those drives are relatively cheap but cost money. It costs us roughly $180 including fast shipping to get every 2 terabytes (again, a pair of them; data shouldn’t go to a single drive). Anything you want to donate to this cause will help us buy more drives. PayPal to jason@textfiles.com, marking clearly that this is toward drive space.
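
If you want to see what that adds up to, here is a rough back-of-the-envelope sketch. The $180 per mirrored pair and the 7 terabytes downloaded so far come from this post; the 25 terabyte figure is the estimate from my speech notes. The function name is made up, and the math ignores filesystem overhead and price changes, so treat it as a ballpark rather than a budget.

```python
import math
from typing import Tuple

PAIR_COST_USD = 180    # one mirrored pair of 2 TB drives, fast shipping included
PAIR_CAPACITY_TB = 2   # usable space per pair, since both drives hold the same data


def drive_budget(total_tb: float) -> Tuple[int, int]:
    """Return (pairs_needed, cost_in_usd) to hold total_tb terabytes, mirrored."""
    pairs = math.ceil(total_tb / PAIR_CAPACITY_TB)
    return pairs, pairs * PAIR_COST_USD


if __name__ == "__main__":
    for label, size_tb in [("downloaded so far", 7), ("expected total", 25)]:
        pairs, cost = drive_budget(size_tb)
        print(f"{label}: {size_tb} TB -> {pairs} pairs, about ${cost}")
```

Rounding up to whole pairs matters here: half a pair of drives is not something you can order.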

OR you might be or know an institution who would like a copy of what we’re downloading, and can provide us an array of disk space to send you a copy. These are going to be upwards of a few million .flv files, along with .html files describing them and user account information besides. Maybe you’re an academic institution or a research facility or whatever. Would you like 4 years of a self-driven sociology and history project run by millions of folks? Sure you would. Write me or come on the IRC channel.

There, I’ve made my pitch. Within a few days, it will become harder and harder to give folks blocks of things to do, although we expect we will be splitting up in-process tasks as we lift away the hundreds of videos an hour we’re currently bringing in. It’s a huge management headache but we seem to have it all going well. We’re learning a lot about a huge volunteer team project, too.

Enjoy the speech. Give some cash. Give some time.