Well, nothing like coming back from a holiday trip to find yourself slashdotted. Actually, there’s really nothing like finding yourself slashdotted back in 2001. Nowadays, not so much; a mere six thousand visitors over a couple days with a few more crawling in days after that.  Web servers have gotten pretty good at what they do and Slashdot has changed over the years, not representing the bottleneck and hit-gun it used to be.
What hasn’t changed at all is that awesome level of Opinion Tourist that accompanies the wave of page views and commentary. Within a short time, both on the Slashdot site itself and in the comments section of my entry, came a gaggle of responses ranging from insightful to, shall we say, distracted.
This jury of my sort-of-peers arrived as a result of this weblog entry, in which I describe the absolutely shitty way that AOL Hometown was shot behind the shed, and the deeper meaning and ramifications of this act with the amount of shared data resources we now depend on, not to mention the cultural/anthropological loss.
Positive responses were enjoyable/informative to read, as anybody agreeing with you tends to be an enjoyable read. Non-positive responses can be lumped into the following general headings:
- Allow me to explain that I keep backups. I am awesome.
- HOLY CRAP YOU SAID LAW I HATE LAW NOT SURE WHAT ELSE YOU’RE SAYING BUT HOLY SHITFUCK YOU SAID LAW
- If The Service Is Free, No Compassion For Thee
- Your comparison to evictions is terrible, based on a number of criteria that I mostly made up
- I sure do hates me some AOL Hometown and the People Who Use It
- Watch as if by magic I say something indicating that not only did I fail to read your weblog posting or consider what was discussed, I apparently didn’t even read the slashdot summary or, possibly, my screen.
Getting hung up on the solution set is a classic problem of a left-brained person: they see the issue being discussed and then a proposed solution, and instead of acknowledging the problem, they start re-engineering and insulting the solution, pointing out where the problem is. So a lot of people got waylaid by me saying “law”, some using the example of “I do free hosting, I would never be able to do this” or “How dare you consider passing a law, that’s like shoving radioactive rods up the nose of children.”
Let’s talk terms here. A guy who is giving free hosting to a couple buddies or even a business or two, especially if he’s not actually incorporated or a business, is a fucking couch. Obviously when we’re discussing the liability of hotel chains your lame RV parked over in the Wal-Mart down the street doesn’t count with regard to Innkeeper Legislation. OK? OK.
Similarly, I happen to think law is the solution in this case, because I am convinced the stakes are so high, with data being so critical to our infrastructure, not to mention our history. You may differ, and that’s why some places require almost no formal education to become a teacher while other places subject you to massive interview and evaluation cycles to get anywhere near a group of students alone. The Spectrum of Opinion, welcome to it. The fundamental discussion is whether this data retention is of sufficient importance to warrant attention.
One brain surgeon or two explained to me that real-live eviction law doesn’t allow for maintaining of space for an orderly exit. I beg to differ.
And as for “free”, I think we’re going to have a few rounds of root beers over whether a place, like Google, that browses through your e-mail via robots and uses it to generate statistically relevant advertisements on your page, or places like Flickr that do in fact have advertisements for seeing your content and charge you on top of that for additional features, or places like Ustream that have profit-sharing and used to do indirect advertisement but now overlay ads on your content, are “free”. Some people confuse “no money down” with free and that’s why they’re getting fucking kicked out of their houses, finding themselves at the mercy and procedures of actual eviction law.
But then again, people are people, and they turned a lot of this into a discussion of AOL Hometown specifically, and AOL Hometown policy, and what’s “right” and “good” and what they “had” or “didn’t” do, as if this wave of “sorta” was going to be convincing under any harsh light.
So let us be crystal clear about what the situation was with AOL Hometown’s shutdown; this was not a case of catastrophic disk failure or a revelation of bad data integrity practice. We’ve had plenty of these, most recently with a site called journalspace.com that was primarily mirroring but not backing up data and had a loss, permanently. This happens all the time and is really sad and we all get to stand here in the future and point fingers and giggle at the past, but it is not the situation being discussed here. AOL chose to shut down the site and pulled the data from being accessible; the webserver was disconnected and the URL redirected to a new location, where a smug little weblog posting was all that remained to mark its passing. The fact that you utilize a multi-tier self-started backup operations paradigm across geographically variant hoo-ha is not what we’re talking about, Poindexter.
This was a case where someone or a group of someones made a decision to take the site down, and by take down, they chose to rip it down posthaste, with a specific amount of “warning time”, and accompanied by a flawed, scattershot attempt to mail everyone associated with the sites, and then doing a massive, global redirect of many tens of thousands of “sites” to a single weblog posting. This procedure happened because of money issues, most certainly, and likely not out of a sense of evil or meanness, but it also happened in an environment where this approach was considered legitimate and valid. This is the heart of what I’m trying to get to: they saw absolutely nothing wrong with this.
They could have spun the data off to a separate firm. They could have contacted archive.org about a transfer of sites. They could have alerted the media in the manner that, say, some entities have posted public notices about auctions or bankruptcies. They could have made the timetable six months instead of four weeks. And, once the data was down, they could have provided a read-only, FTP-only, or otherwise non-browsable accessibility point that a person with the proper credentials could retrieve said data from for months afterward, just like (for another real world example) a commercial entity will tell you that the material you used to keep with them is now being kept in a new location and with proper ID, you can retrieve it. They don’t just burn the fucking building down and then put up a sign saying “We burned the building! Thanks for visiting!”
But right now, this approach is considered so inherently A-OK that a good percentage of people writing responses or nib-nabbery paragraphs about this situation totally skipped over it. Not only was the situation acceptable, it was beyond that – it’s considered normal. One of the core points I’m trying to make is that it shouldn’t be considered normal. As an archivist, this horrifies me. As a historian, it saddens me. And as a fifteen-year user of what people now call The Web, it infuriates me.
I happen to think it specifically being AOL Hometown is beside the point, but some people have decided to focus on it being AOL Hometown and ignore the larger issues, and never let it be said I can’t drill down to the specific from the general. So let’s go enjoy a history lesson.
September, 1998. The internet is still new enough that Jon Postel will not be dead for another month. Google has just been incorporated as a for-profit company, Paypal has been founded, and MySQL has been introduced. A 25 gigabyte hard drive is about to be announced by IBM. And does America Online have a deal for you!
If you are a member of AOL, your dashboard has a new service announced, called AOL Hometown. All you need to do is tell your AOL dashboard to pull in your site and they will double your disk capacity. For the majority of people pulling over to the service, this is an offer that’s almost impossible to refuse: people crow at the expansion of their sites up to 12 megabytes of disk space. Remember, though, that you can’t just be any old schmuck to sign up; you have to already be an AOL member, and it’s provided as part of your service with AOL. As you can see in this message, it definitely costs money, and isn’t even cost competitive, in a world where you can get an extra 10 megs of hosting space for a dollar, “like you would use that much space to begin with”.
A lot of people hate AOL. AOL caused the September that never ended a mere five years previous, and even though Netscape and Microsoft were in the middle of what was called The Browser Wars and had done their part to turn the previously-more-technical Internet into a graphical interface, AOL was the leader in the ratio of preparation-to-deployment for users. People were being shoved onto the Internet at large and being given very little in the way of direction, but they all knew that one of the best values in the world was having A Web Page.
With A Web Page, you see, you could create a full-color, hyperlinked, beautiful page about any subject you wanted. At a time when color printing could cost you a dollar a sheet of paper, you could have a full-color presentation available all over the world. Perhaps a person who now carries a music machine with 80 gigabytes can’t envision this, but this technology was amazing, vast, and falling into the hands of people who wouldn’t have ever composed a newsletter, or even a diary. While it probably would have been great if everybody had been given Netcom accounts and made to learn about HTML the hard way, the trend was towards easier and easier methodology, and most importantly, some very non-technical people were given a voice.
AOL itself merged with Time Warner in a deal announced in 2000. Its fortunes rose and declined. Through this, AOL Hometown was shifted around, customer service experienced deep and wide-ranging changes, and over time it became harder to get customer service, to make sure you were notified of changes, and to be given news of your website, one which you might not have changed for half a decade but which contained information, hopes, dreams, history.
The approach by which AOL deleted AOL Hometown was haphazard, obviously ad-hoc, and internally inconsistent. If you try and see the page where they explained to people how they might export data, you will find that this page has been deleted and backdated.
While four weeks seems like more than enough time for warning to some folks, the question rests: why? There are some very specific rules with the retention of financial data for a licensed company. Logs of Yahoo Searches are currently kept for 90 days, but that’s just Yahoo making stuff up based on social pressure; there’s no law in effect regarding this. Google simply says they will maintain your deleted mail for a limited time; there’s no policy with regard to how they would bail out Gmail data to you if they went under or shut down the service.
What? Google? Pshaw you say. Google is forever! Well, just look no further than their shutting down of Lively, a goodbye message that clearly states that “thousands” of chat rooms, locations and avatars were built by users, and which represented probably hundreds if not thousands of hours of work. And what was the export policy by Google? Well, please take a gander at their announcement: “We’d encourage all Lively users to capture your hard work by taking videos and screenshots of your rooms.”
FUCKING AWESOME. If a place that encouraged you to come in and place items you’d created, and was suddenly going to shut down, told you that you could use a point-and-click camera at the window to capture your “hard work”, you would be setting motherfuckers on fire.
The point is not “Google”, not this or that, but a general malaise and terrible lack of ethic with regard to this work. It’s considered a normal thing to ask if a discontinued product will be open sourced. They might say no, but it’s considered quite normal to ask. Similarly, it’s totally within the realm of reason to throw ideas of Creative Commons at any generated artwork; places like Flickr and now Wikipedia kind of force-funnel you down that road. You can choose not to, but it’s weird and a special case.
By the way…
Need further proof that AOL is just making stuff up like everyone else? Throughout November and halfway through December, all the files were still accessible. You could log in via an FTP site and download your files. It is believed they had 25 servers with this data, and they decided to delete them all at once, with no retention. AOL never explained this. Why should they? There’s no social stigma other than being told they suck, which the remaining employees are quite used to hearing. There’s certainly no law on retention or accessibility of this data. There’s just the chortling echoes of technically-savvy website owners or people sitting on equally-shaky footing wiping their brow and being glad this sort of thing will never happen to them.
Until it does.
So what do we do?
Well, let me give a personal example.
Through one of the weblogs I browse, I found out a website called podango.com (a podcast hosting site) was going down. The word had gone out to subscribers of the service that the company was going to be going through some rough times, much as a hedgehog being thrown into a blender was in for some tough times, and maybe you should get your shit off our servers immediately. In line with what I’ve been talking about, they gave everyone five days at the end of December 2008 to do it. Five days. Five days versus four weeks; what’s the goddamn difference? Technically savvy people given less than a week, over Christmas, to figure out how their data was going to be transferred, to figure out how to get RSS feeds transferred. Some people came back from holidays and found all their shit gone. Didn’t check e-mail during Christmas? Sorry, podcaster!
So what did I do?
I fucking downloaded it.
Check this out, kids:
30506     applephoneshow.podango.com
8242      caseclosed.podango.com
4         developer.podango.com
58280632  download.podango.com
5916      gildersleeve.podango.com
4         image.podango.com
14        insidepodango.com
4         my.podango.com
20        sites.google.com
2835048   supernova.podango.com
18128     suspense.podango.com
4         www.ilifezone.podango.com
8512904   www.podango.com
# find . -print | grep '[Mm][Pp]3' | wc -l
4080
What you’re looking at is about 70 gigabytes of data from podango.com, lock stock and barrel. Over 4000 distinct episodes of podcasts. It took my machine five solid days to do it, but I downloaded all of that lame site. Do I have a favorite podcast on there? No. Did I know someone with a podcast on there? No.
I did it because I had the means (disk space), the motive (the sense of history and the recognition that this was historically relevant work representing thousands of hours) and the opportunity (a fast connection and five days before they were to die). A back-of-envelope calculation tells me I just rescued 41 days of podcast, along with all relevantly hosted images, show descriptions and XML data.
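For anyone who wants to check the arithmetic, here is a rough sketch that adds up the listing above. It assumes the sizes are in 1 KiB blocks (the usual default for GNU du; I never said which flags were used), and it divides the 41 days by the 4080 MP3s only to see whether the per-episode average looks sane. The names `DU_LISTING` and `total_kib` are made up for this example.

```python
# Per-directory sizes copied from the listing above.
# ASSUMPTION: units are 1 KiB blocks (GNU du default); the post does not say.
DU_LISTING = """\
30506 applephoneshow.podango.com
8242 caseclosed.podango.com
4 developer.podango.com
58280632 download.podango.com
5916 gildersleeve.podango.com
4 image.podango.com
14 insidepodango.com
4 my.podango.com
20 sites.google.com
2835048 supernova.podango.com
18128 suspense.podango.com
4 www.ilifezone.podango.com
8512904 www.podango.com
"""

EPISODES = 4080     # MP3 count reported by the find | grep | wc pipeline
RESCUED_DAYS = 41   # the back-of-envelope figure quoted in the post


def total_kib(listing: str) -> int:
    """Sum the first column of du-style output, skipping blank lines."""
    return sum(int(line.split()[0]) for line in listing.splitlines() if line.strip())


total = total_kib(DU_LISTING)
total_gib = total / 1024 ** 2        # KiB -> GiB (binary units)
total_gb = total * 1024 / 1e9        # KiB -> GB (decimal units)
minutes_per_episode = RESCUED_DAYS * 24 * 60 / EPISODES

print(f"{total} KiB = {total_gib:.1f} GiB = {total_gb:.1f} GB")
print(f"{minutes_per_episode:.1f} minutes per episode on average")
```

Under that block-size assumption the total lands between 66 and 72 gigabytes depending on whether you count in binary or decimal units, which squares with "about 70", and 41 days spread over 4080 files works out to something under a quarter hour per episode.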
This one will pay back immediately; people are already contacting me, profusely thanking me.
So what am I saying here?
We need the A-Team.

ARCHIVE TEAM would be like CERT (the Computer Emergency Response Team) used to be, where it was a bunch of disparate people working together to solve a problem in a nimble and networked fashion. They’d find out a site was going down, and they’d get to work.
They’d go to a site, spider the living crap out of it, reverse engineer what they could, and then put it all up on archive.org or another hosting location, so people could grab things they needed. Fuck the EULAs and the clickthroughs. This is history, you bastards. We’re coming in, a team of multiples, and we will utilize Tor and scripting and all manner of chicanery and we will dupe the hell out of your dying, destroyed, losing-the-big-battle website and save it for the people who were dumb enough to think you’d last. Or the people who, finding you’d been around forever, had the utter gall to not be near their computers during your self-created, arbitrary sunset period.
Archive Team would also help publicize your demise in their mailings and discussions, getting the word out to a greater audience that you were dying. If law isn’t the answer, vigilante teams of mad archivists are the answer.
I really don’t have the time to formalize this, so feel free to take up my torch and run around setting barns on fire. I’ll stay on #archiveteam on EFnet for a while, in case people want a place to hang out. Set up a bot. Set up a way to communicate things are dying or that a site needs reverse engineering to yank the crap out. Find out where a mirror ended up, or what needed said mirror. Let’s do this thing.
Anyway, so there’s my clarification.
UPDATE: Holy crap, Archive Team is now real. Check it out.