Been a Little Busy.
About to start posting here again. Thanks for the patience.
I had really good success when I put out the call for the Archive Team, so let’s try that again, with an entirely new idea.
I would like to declare November 2012 the very first Let’s Just Solve the Problem Month.
Here’s how it works, and what problem I want to solve.
As that sexy pontificator Clay Shirky has said on several occasions, instead of getting hung up on whether Wikipedia is great or not, realize that Wikipedia represents a massive expenditure of energy recovered from not watching television. Not only that, but Wikipedia is one of what could be many different things that benefit the world. All you need is a dash of organization, a clear set of principles, and off you go.
I buy into this.
I also buy into the idea behind National Novel Writing Month, which has at its core the belief that everyone has at least one (incredibly shitty, possibly unreadable, Vogon-level-quality) novel inside them, and that by setting aside one month of being encouraged, forced, guilted and tortured, you will blow out one 50,000-word novel in that time. What happens next is up to you – burn it and move on, take it aside and polish it until you’re the next JK Rowling (or Hunter S. Thompson), or whatever tickles your fancy. But at the end, YOU WROTE A NOVEL BEFORE YOU DIED. Not bad.
What I know to be true is that there are a number of “problems” out there that need to be solved, that need one single thing to push them from “impossible” to “solved”, or, at least, “1.0”. And that one thing is a lot of human thinking. Often rote, often boring, but necessary, to slam that thing out.
So since I got to come up with this idea, let me declare the first month, November 2012, to be SOLVE THE FILE FORMAT PROBLEM MONTH.
Here’s the problem, in more detail:
In the last couple centuries, we’ve created a number of self-encapsulated data sets, or “files”. Be they letters, programs, tapes, stamped foil, piano rolls, you name it. And while many of those data sets are self-evident, a fuck-ton are not. They’re obscure. They’re weird. And worst of all, many of them are the vital link to scores of historical information.
Everyone knows this problem. It’s why old novelists cry that they can’t pull their first novel out of WordPerfect. It’s why someone who used U-matic tapes to record the first meetings of a famous protest group goes “oh well”. It’s why, in all things, someone looks at anything older than five years, and goes “bye”, figuring there’s nothing they can do.
And I’ve had to listen to the mewings about this problem for at least 20 years now, in various forms. A lot. And then the person lights up about maybe solving this problem, and then dims and says “well, we can’t really solve the problem”. Because they know – it’d take an army of people to do it.
Let’s make that goddamned army.
And before I give you a battle plan, let me say: This will solve a major issue. This will give thousands, later millions, access to a whole range of materials now shut off from them. Stuff made after 2012 will be scrutinized to see whether it makes its means of access clear. Stuff made before? We’ll have docs, or a thread, or even a few first steps towards understanding what it was. People writing modern software will be able to make filters or plugins that use these standards – it’ll drop from being a needless rathole to a simple matter of writing a perl library or a javascript routine to pull the data in and make it work with the new thing, as sketched below. That will be very helpful indeed.
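To make that concrete, here’s a rough sketch, in shell, of what “solved” looks like for the WordPerfect case above. The tools shown really do exist today (wpd2text ships with the libwpd tools) precisely because somebody already did this work for that one format – treat the exact output as illustrative:

```
# Identify the mystery file, then pull the text back out of it.
# (wpd2text comes with libwpd's tools; exact output will vary.)
$ file novel.wpd
novel.wpd: (Corel/WP) WordPerfect document

$ wpd2text novel.wpd > novel.txt
```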
Battle Plan:
Now, if you just read all this, let out a big “pffffff” and are having your fingers twitching with the urge to write about how this is all impossible, just get the fuck out now. The project doesn’t need you, now or ever. Just enjoy the summer, grasshopper, and come knocking on the ant’s door in December when we’re at 1.0.
So who’s with me? SEE YOU IN NOVEMBER.
So, Archive Team has become very good at downloading sites before they go down. Way good. Some – someone trying to make a point, like me – might almost venture too good.
Take, for example, Tabblo, a photo-sharing site that HP stabbed in the neck because it wasn’t getting the same internal love as Snapfish, which HP also owns. Instead of doing the reasonable thing and providing users of Tabblo with an upgrade/switch path to Snapfish, HP simply told people it was going down and thanked them for existing. Except, of course, HP sent that notice to out-of-date e-mail addresses, or it got caught in spam folders, or people didn’t check that folder for a month – because when Tabblo went down, people came out of the woodwork screaming, many of them finding out Tabblo was gone when they tried to upload new photos of their kids to accompany the old ones. Nothing like finding out your child’s entire early life in photos is dead and gone for no obvious reason.
Enter Archive Team’s download of Tabblo. We got the news about Tabblo shutting down a bit late – about a week before the shutdown. We activated our machines, ran our awesome new clients, and 39 people downloaded all of Tabblo in 36 hours. Then, we did a quality check, found some hundreds of broken/timed-out downloads, and re-ran them, cleaning THOSE up in just a few hours more. If you had a photo up on public in Tabblo, Archive Team grabbed a copy, and we’ve been making them available back to people crying about their lost photos – and getting grateful responses.
Tabblo worked out to just about a terabyte and change of data. Not small, but not terrifying either – just a lot of individual pictures, which we tucked away on archive.org. In fact, we tend to tuck a LOT of data on the Internet Archive’s servers, and they’ve been very generous.
But we crossed a line with the brand new MobileMe download. The Archive Team MobileMe download is over 275 terabytes. That’s over a quarter of a petabyte.
Even in today’s crazy storage amounts, this is significant. A lot of drives at the Internet Archive are storing MobileMe for future generations, and that was space allocated for a lot of other just as worthy projects and creations. Nobody’s said anything to me, but I feel this is going to strain the nice situation and relationship, just like staying at someone’s house and you park six cars out front. It’s just not the best thing to do.
So, I’d like to begin arranging a Kickstarter to donate money to the Internet Archive. And I’d like your help.
Yes, we could just keep printing the Internet Archive donation link, and people most certainly use that… but folks often like to contribute as part of an “event” or a goal. So that’s what I’ll be doing at some point soon – putting together a Kickstarter to have people contribute towards a goal, and then that money (minus fees) goes to the Internet Archive. I checked, and it appears your donation would be tax deductible, which is a nice bonus.
The way I’d like to get help is this.
Kickstarter works best when there’s prizes or rewards at various amounts. Donate $100, get a hat, donate $1000, get a dinner with someone famous, etc. I’d like to have some of those. I’m going to chat up people about this, see if I can get friends and old contacts to make themselves available, provide tutorials or dinners or items, and so on.
If you think you have some item, or action, or contribution that could be used as a prize, please contact me. I’m shooting to have this kickstarter begin in August, so we can bundle up some good prizes, get things going, shoot the video, and so on.
Help Archive Team be a slightly better houseguest, although we’ll never be able to explain away what we left in the fridge.
Thanks.
When you overshoot your documentaries as much as I do, you end up with a lot of perfectly good, probably movie-worthy footage that just doesn’t make it into the final cut. When I edit down the dozens of hours of interviews, I have thousands of clips of people saying on-topic, coherent things, all labelled, and then I build the movie out of that. It works well, with my subjects completing each other’s sentences and a smooth narrative going from one topic to another. But that leaves hours and hours of interview unseen.
My solution, which I started with the BBS Documentary, is to post up the interviews for everyone to see. I do this thanks to the Internet Archive, which has the space and bandwidth to provide the gigabytes of interview I have on my hard drives.
Recently, we finally set up a situation where I fedex a drive into the Archive from my home in New York, meaning that I can dump in hundreds of gigabytes at once and not wait for a California trip or an anaemic cablemodem to do the work.
So, here we go: 50+ interviews from GET LAMP are now up. They all have some level of description of the circumstances of the interview, including why I interviewed the person. It even includes interviews for which no footage showed up in the movie. (So far, that’s Keith Nemitz, Richard Hewison, and Paul Meyers.)
For the insightful, motivating portion of the audience who fat-finger out a “WHAT TOOK U SO LONG” on the release of these, it mostly came down to time and opportunity. The previous desktop machine, which GET LAMP was cut on, could take between 10 and 20 hours to render out a video-noise-reduced, full-hour interview. The new machine does the same in less than an hour. Also, I was working on a lot of other priorities before this, so it took a while to set aside time for putting these interviews together. So that’s why. GET LAMP was released in August of 2010, and it’s now roughly a year and a half since then, so it’s not too bad. But since the interviews were done in 2006-2008, it does mean that up to six years have gone by since these nice people sat down and talked to me. I’m sure their lives and opinions have shifted and changed since some maniac wanted to talk to them extensively about text adventures.
If you’re not sure where to start, I suggest starting with Dave West, the lead of the cave expedition filmed for GET LAMP. Then if you want some clearly-stated nostalgia, Rob Griffiths goes over what it was like to be at the beginning of text adventures as a young player. From there, you can just click around, listen to people. I uploaded the HD files as MPEG2 renders, and then the system redid them into H.264 versions. I suggest looking at or downloading the H.264 versions, unless you’re looking to cut a new version of the movie – then you can just use the original MPEG2.
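(If you’re curious what that re-render amounts to, it’s essentially a transcode like the sketch below – the flags are standard ffmpeg, but the particular settings are my assumption, not the Archive’s actual derive recipe.)

```
# Squeeze a streamable H.264 version out of an MPEG2 master.
# Settings are assumptions, not archive.org's actual derive recipe.
ffmpeg -i interview-master.mpg \
       -c:v libx264 -crf 21 -preset medium \
       -c:a aac -b:a 160k \
       interview.mp4
```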
These interviews are just the answers given by the people, not my questions. Why? Well, two reasons, one bitter, one not.
BITTER: With the BBS Documentary, I started uploading the full hours, just going from one end to the other, dumping out the entire tapes. What I got, frankly, was a lot of shit from a lot of cheeto-spitters about my style of interview, one which results in very good footage. It comes down to the fact that these aren’t just interviews about a new app someone just came out with, or what people think of a current political situation – these are intense mining situations where the interviewee and I are working really hard to bring back memories that have been absolutely dormant and untouched for, in some cases, decades. I tell jokes, I give them common-ground stories about myself or what I’ve heard… really, whatever it takes to get the footage, the emotion, the sense of “being there”. That’s how the sausage is made, and I got tired of people ripping into the interview footage and just stopped uploading the fuckers. I don’t want to walk away from that approach for GET LAMP, so this new presentation, sans Jason, is underway. I am sure I’ll get back to the BBS Documentary and do it this way as well.
NOT BITTER: Going back and re-ripping the full interviews would delay this project another couple of years. Besides, the months I spent pulling the gems from these interviews and putting together clips I thought were GET LAMP worthy mean that the clips should be able to stand on their own. Cases where I would say “Was it worth the trouble to code this feature into your game” and the person would answer simply “yes it was” are not present in the final clips – the clips had to impart information and be useful in the film. That filter means these interviews are pure fact – not chumming around, not going off into wild tangents about where it’s good to eat around here or what happened down the street last week. And they’re not filled with me telling the same stories over and over to help dredge footage. They’re “cooked”, not “raw”.
In some cases, this means the person says something, the clip stops, and the next clip plays with an overlap as it goes to a new idea. Minor hiccups aside, I think they’re very watchable, and informative.
So enjoy!
There’s still copies of GET LAMP available for sale, with gold coin included.
UPDATE: This kickstarter hit $20,000, meaning Joey will be working on this new project for a whole year. I can just imagine the possibilities as it matures and gets done.
Over the course of this year I will occasionally push you to kickstarter projects related to things I think are important to Archive Team‘s goals and mission. I’m not as likely to write these for just any project I like – if you want to check THAT sort of thing out, it’s probably a better idea to just follow me on Kickstarter and see what I’m up to.
Today’s suggested project is GIT-ANNEX ASSISTANT, by Joey Hess. At this moment, there’s about 11 days left to get in on it. He asked for $3000, he has about $13,000, and he wants $20,000. I want him to have it too. This is why.
To understand the power of GIT ANNEX, you need to understand GIT. To understand the power of the GIT ANNEX ASSISTANT, you need to understand neither. Let’s do what we can.
GIT is, at its heart, a version control system, allowing multiple people or entities to work with code and ensuring nobody steps on anybody, changes can be reversed, and progression, rather than mistakes and screaming, is the order of the day. Programmers love git. They love it enough that a site that does nothing but make git repositories public, called GITHUB, is hugely popular. But maybe you’re not a programmer, or you work alone and don’t need this. Fine. Stick with me.
Normally, with GIT, you need to “check in” everything you’re coding into a GIT repository. This is fine when we’re talking about a few dozen or hundred megabytes of code. But, and we’re making a jump here – GIT isn’t really just for coders and developers. GIT can track any data. And that data can be photos, videos, music, CD-ROM images, you name it. And with that comes the power of tracking changes, of sharing updates, and of undoing silly things done by others or by yourself. But you really don’t want some poor GIT repository to hold every single piece of data being tracked – not when you’re using, say, GITHUB or another public repository unwilling to deal with that level of stuff and preferring to stick to the “keeping track of how it is being handled” job. Well, GIT ANNEX does this very thing – instead of the repository having to keep a copy of, say, every file on all your hard drives, you can instead use GIT ANNEX to say “just keep track of the description of these files” and that’s all it’ll do.
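Here’s a minimal sketch of that in action – these are real git-annex commands, though the filenames and the repository description are made up:

```
# Start an annex; the repository tracks checksums and locations,
# not the file contents themselves.
git init ~/media && cd ~/media
git annex init "laptop"

git annex add concert.avi      # checked in as a pointer, not the bytes
git commit -m "add concert footage"

git annex whereis concert.avi  # which repositories hold the actual content?
git annex get concert.avi      # fetch the bytes from wherever they live
git annex drop concert.avi     # delete locally; annex verifies first that
                               # enough other copies exist
```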
Am I losing you? Keep with it.
I am not a developer, not a programmer. I don’t use GIT at the moment for any of my projects (although plenty of Archive Team members do, and I and many others benefit from its use). What excites me about GIT ANNEX is how it fundamentally tracks the backup and availability of any data you own, and allows you to share data with a large or small audience, ensuring that the data survives.
This is where we get to GIT ANNEX ASSISTANT.
GIT ANNEX ASSISTANT, built on top of these rock-solid building blocks that already exist and are in use around the world, will bridge that final, critical gap from programmer-nerd-wonkery to next-generation-data-availability for the end user. It will enable you to tell this graphical interface to “just do it”, and ignore all the amazing stuff underneath that’s making sure things work for you.
If you’re stuck back on the “woah, hold on, what exactly is GIT ANNEX there, bucko” phase, then feel free to browse around the GIT ANNEX site. Joey Hess has done an amazing job explaining this utility.
But GIT ANNEX ASSISTANT will allow you to use all this cloud technology for what it promises, instead of what it delivers.
Now Hold Up.
Did the dude who wrote Fuck The Cloud just endorse cloud storage?
Well yes, yes I did.
Because GIT ANNEX (and therefore GIT ANNEX ASSISTANT) makes it so that you can make sure your data is on a bunch of clouds, and local storage too, meaning that you don’t have to trust any of those fuckers, including yourself. Fuck the Cloud before it Fucks you! GIT ANNEX ASSISTANT will be tracking things for you, asking you to plug in drives when needed, and retrieving your data.
Oh, but even better. It allows you to declare data in such a way that a bunch of you can share file repositories, not unlike Dropbox or an FTP site, so that everyone gets access to the data, changes are tracked, and so on.
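In git-annex terms, that sharing looks something like this sketch (the remote address and paths are hypothetical):

```
# Clone a shared annex, swap location information, and pull down only
# the content you actually want on this machine.
git clone ssh://example.org/~/media shared-media && cd shared-media
git annex sync                 # exchange what-file-is-where metadata
git annex get photos/2012/     # fetch just that directory's content
```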
Joey is fantastically cheap. I went over this whole project and his lifestyle with him, and $20k gets him full-time on this project for a full year. He’ll be going through the list and coding up greater platform access, improving GIT ANNEX ASSISTANT to be even easier, and a whole host of related projects. This is a fantastic bargain.
As I write this, he has $13,000 and 11 days to get to his goal. I am endorsing him and telling you this will help the world get away from obfuscated, cheap-ass and unregulated cloud storage, and back to you being in control of your shit without your shit taking control of you.
Here’s a link to the kickstarter page. He has my full endorsement.
Thanks.
The best projects cause a lot of people to shrug or go “oh, that’s nice” and then a much smaller percentage of people to drop whatever they’re carrying and stagger forward in disbelief. Here’s one of those.
For some people, just bringing up the phrase “The Internet Underground Music Archive”, or “IUMA”, summons memories of all sorts of cool bands who were technically savvy enough to make their music available online for a small but growing audience. Through the IUMA main site, you’d browse hundreds, later thousands of bands, and within those bands would be thousands, later hundreds of thousands, of songs. It was huge, and it was long lived – founded in 1992, we’re talking a site that brought you music through .AIFF, .MP2, RealAudio and MP3 formats because it wasn’t clear which of those would be dominant.
Trust me – IUMA was the place to be, one of the rock-solid sites on the net, as powerful and as well-known a force as Hotwired/Wired, Suck, Drudge, Salon, The Well. A mainstay.
If you want an awesome overview of what the hell IUMA was, please zip on over to this YouTube video of a CNN story on IUMA from 1994 – all the founders are there, the power and strangeness of audio in the 1990s is there, and best of all, the sky is obviously still the limit and clouds have not yet appeared. (There’s a similar story from 1994, but it’s on MTV and even by 1994 MTV news is pretty awful.)
So what happened? Well, like a lot of other such endeavors, it was sold in the dot-com boom years, Y2K edition, where it fell under the retarded purview of multiple owners, and then entered a state of living death through the mid 2000s until undergoing a sad, dull little shutdown around 2006.
With that initial 2000 buyout came increased resources and bandwidth and reach, no doubt, so a lot of bands signed up with the place, dropping example tracks and then asking you to buy their CDs or come to their concerts. For many bands, the entity that ate IUMA’s lunch as a profitable venture was likely MySpace, with others being sucked directly into things like iTunes and Amazon and so on. IUMA didn’t keep up, and now it’s just a distant memory.
Well, until now.
Six years ago, John Gilmore (yes, That John Gilmore) saw that IUMA was in a zombie state and very unlikely to ever get out of the ICU. So he grabbed a copy of all he could – which wasn’t all of it, of course, but it was a hell of a lot of it. He stored it on some backup tapes, and as the site went down, disappeared, and faded into the mists of memory, he looked for a chance to have someone get a copy up somewhere. I was that person.
What I just spent most of the last week doing was taking some dupes of John’s backup tapes, writing scripts to ingest them into archive.org’s servers, and now I am going to tell you that I put IUMA back up.
Oh, you are in for a treat and a hell of a lot of modern musical history just got saved. This is over 45,000 bands and artists, and over 680,000 tracks of music. That number sounds made up, but I’m not kidding – six hundred and eighty thousand songs are in this collection. I did a back-of-the-google-calculator check and came up with 243 days of music – solid, 24-hour days of songs. You could leave it running now and look up in 2013 as your playlist ran out.
Where possible and where they were grabbed, I added descriptions from the HTML files for the site. Pictures do not appear to have been saved due to a quirk of the download (they kept the photos on a weird server) – but you have enough to go on. Compare, for example, this page from the wayback machine and what I got ingested into the site. Not perfect, but something.
This should all be considered 1.0 – if I find more ways to pull in information properly, I’ll do so. And naturally I’ll ensure the original, before-jason-messed-with-it data is stored safely away so the next set of folks can try better techniques to get it back.
It once was lost, and now it is found.
Earlier this year, I was hired by DEFCON to do a DEFCON documentary.
DEFCON, if you’ve missed out, is a “hacker conference” that has been held in Las Vegas since its beginning, and this year is the 20th. That’s big enough news that the organization wanted to capture it in some way, and they hired me to do it. I’ve been a DEFCON attendee on and off for about 12 of those years, and I’ve made the occasional documentary about technical subjects, so here we are.
My Kickstarter backers knew about this back then as well, and I also extended an offer to them to have people be in my crew, to see what my documentary approach looks like close up. Six people went for it, and they’ll be filming the main conference along with me, using lightweight rigs that will enable a lot of flexibility. We did a test run of the equipment setup at the MoCCA festival a couple weeks ago. Here’s what I look like with the rig:
If it doesn’t look like all that, it’s not meant to look like all that. But this rig can record 8 hours of high-definition (1080p) footage, can move among crowds with ease, and can be run by a single person in a pinch. The cameras these days are also so lightweight and computationally flexible with regard to lighting needs that a separate light rig won’t be needed for where we’re shooting. I chose this because I know that shoving one of these bastards into a hacker’s face does not result in a very delighted hacker:
(Naturally, I’m also shooting with my DSLR setups for a bunch of things along this event, but you trade a really nice image for a notable reduction in flexibility and ease in capturing what are fleeting moments.)
Hired is the word of note here. This is an exercise in something I’m being paid to do, to capture DEFCON to the best of my ability and hand over finished “stuff” for the conference organizers, including a film to be shown at DEFCON itself and a later, more elaborate production to be available a few months after. I wouldn’t have taken this on along with the other three documentaries if I didn’t think I could piggyback all the productions together, and that’s what I’m up to. All my work benefits from this.
Some thoughts on this whole endeavor follow, along with a plea for assistance. Look for the bullet points if you just want the plea.
I’ve been in this “fuckin’ with computers” deal for 30 years now and the nascent dregs of hackerdom in the “fuckin’ with computers in ways the manual and the laws perhaps don’t encourage” for slightly less than that. If there’s cred needed, I have some level of cred, at least as an observer and as someone who brings respect and understanding to the whole idea of “Hacking” and what that has come to mean to many different people over many different years. Through my websites and archives, I’ve brought a lot of old stuff back from the dead and I’ve spent a good number of years interviewing and getting to know a whole range of people and their ideas about not just “hacking” but “computing” and “online” and all the stuff in between. I’m a good guy for this project, I promise you.
So in that frame of self-aggrandizement, I’ll also happily cop to ignorance on a variety of topics, especially specific ones. I avoided going to anything like a “Hacker Con” until 1999 because I didn’t wish to be targeted. (Once TEXTFILES.COM went up, I started going, since I figured I was basically a known target after that, so why hide?) Despite the calls to learn Assembly, C, VMS, PrimOS throughout my teens, I just didn’t go there – I still don’t consider myself much of a coder or developer type, sticking with shell scripts and pre-made applications by others. I don’t know it all. I don’t have every frame of a potential documentary planned out beforehand.
Unless they have a correction to issue, I was chosen by DEFCON to do this documentary project because I’m part of the culture in some fashion, a DEFCON regular, a guy who has actually shipped shit, and someone who works thoroughly to a fault. That’s the gist of the conversation that went down.
So in that characterization, I am approaching this documentary like I did with the others, with the exception of a much more intense recording of the event itself, utilizing outside crew for the first time and having to organize said teams throughout the event. Other than the run and gun nightmare that that will be, everything else is me doing what I’ve always done: researching the context, sending out messages, doing phone calls, acquiring images and video and sounds, and amassing all the pieces to make something that captures DEFCON as best as I think I can.
So, what exactly is DEFCON?
Well, approach any of the tens of thousands of people who’ve been to DEFCON over the two decades and you’ll get a unique story and description. This is, frankly, the case with any large-scale event or conference – it’s the same for Origins or PAX or CES or Comic-Con and all those perspectives converge from wild directions with thematic similarities. So in the case of DEFCON, you’re going to get that as well.
If there’s anything unique about the DEFCON situation, it’s that it is, literally, a grouping of individualists, a pack of leaders, a regulated and protected anarchy. This tension is not just viewable here and there – it’s endemic. It has always been the case, and it always will be the case. From this arises an enormous amount of tension, derision, weird feelings – that’s something I have on my radar.
And like all such events, there are “eras”, different times when different people were the prominent figures, and the influence of the various hotels overlaid different aspects to the proceedings as well. This fog of influence and perspective is present throughout the idea of DEFCON, and I’m working on trying to capture some bit of it.
For some people, the whole idea of a documentary is stupid. Sorry. Conversation ends there.
For everyone else, I’d like to reach out to ask for assistance acquiring pieces of DEFCON lore and legend, artifacts and history. Here’s the bullet points:
This is going to be a hell of a project. I’ll have more updates, but I wanted this all out here. If you can help in any way, great. I want something that does the event justice, or at least results in something people can point at to explain to others what it’s about.
Onward!
I officially started work at the Internet Archive over a year ago.
Let’s remove any tension – it has been a fantastic year, in which I have gotten more done in the way of preservation and computer history work than in my entire previous 40 years combined. The Internet Archive seems to like me, I really like them, and I’m staying.
Here I am with the boss:
Brewster and I are two rather different people, and although the Venn diagram of our interests does not intersect everywhere or even close to it, what we do share in terms of goals and passions is very similar. There’s no hidden agenda with this guy – the headquarters in SF isn’t secretly a meth lab, we’re not actually some lobbying group or anti-whatever think tank trying to destroy anything. There is only the Mission: to make as much human knowledge available as universally as possible, and to preserve all manner of knowledge as reliably as possible.
Oh, there are occasional office flare-ups and disagreements and I’m sure some clenched fists, and not every day is an endless buffet of awesome, but every single person in this organization understands the Mission, and pretty much 100% of the disagreement is about how best to achieve that mission with what resources there are (or how to gain new resources). That’s a rather refreshing change from, oh, let’s say, every other goddamn place I’ve worked at, where the goals of some people are “get to retirement age” combined with others who mostly have signed up for “do absolutely nothing until you either get bored and leave or get fired”. That’s not going on here. It was a shocking office culture to run into: everyone just kind of pressing in towards the overarching mission without being waylaid by one group trying to undermine the others for some bonzo reason unrelated to what the place was going for. Again: people leave, people join this place, but they all understand that dream, that plan, that hope. Maybe this happens elsewhere, but not in my previous lines of work. So that’s somewhat mind-blowing on a daily basis.
I will feel really stupid if I start listing out names of co-workers and then miss some, so I will tell you that I have someone who is a “handler” for me, and she is far and away one of the best bosses I’ve ever had – and I’ve had a few really, really great bosses in my time. I have people I sit with when I’m in San Francisco who are brilliant, hardworking people (again, all aimed at this goal) who do stunning work. We have communication channels where various groups talk, and it’s like shoving your face into a Brilliance Fountain 24/7. I’m not making this stuff up to butter anyone up.
Remember, this isn’t people all sitting around figuring out how to monetize farting or who are blowing up paradigms with slide-scale infradoobles using Ruby on Crack combined with Hibbledoo Middleware. This is a non-profit online library providing petabytes (petabytes!) of data to millions in the most efficient way possible. Speaking of which…
One of the job descriptions/goals for me was “bring in data”.
I just checked the internal tracker to see how I’ve been doing on the upload front. Very well, apparently – I have uploaded 120 terabytes of data. That’s into 82,438 individual items, which could be anything from texts or songs up through to .tar files of web captures. When I started, I said my goal was to upload a terabyte of data a month. As I am apparently doing ten times that amount, I’ll consider that goal met.
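None of that goes in by hand through a web form, naturally – it’s all scripts. Today you could do much the same with the Archive’s ia command-line tool; here’s a sketch of the shape of it, with the identifiers, collection and metadata entirely made up (my actual injection scripts are homegrown equivalents):

```
# Bulk-inject a directory of scanned magazines, one archive.org item apiece.
# Identifiers, collection and metadata below are illustrative only.
for pdf in scans/*.pdf; do
    item="$(basename "$pdf" .pdf)"
    ia upload "$item" "$pdf" \
       --metadata="mediatype:texts" \
       --metadata="collection:computermagazines" \
       --metadata="title:$item"
done
```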
I’ve brought in so much “stuff”, in fact, that it would be nearly impossible for me to tell you all of it. Let’s throw out some highlights.
I was asked to look into bringing in software. So, I started out with CD-ROM shareware discs, not dissimilar to what I have with cd.textfiles.com. Well, that has been a wild success. I am ready to declare that The Internet Archive hosts the largest collection of shareware on the Internet. Seriously. First, there’s over 1,100 CD-ROMs and DVD-ROMs contained in the CD-ROM collection. But oh, it gets better. You see, functionality was added this year to allow you to browse inside the ISO images. Feast your eyes inside this CD-ROM, for example. You just add a slash at the end of the ISO image reference and there you are. But let’s go even further than that: Let’s take a GIF file of Winter from 1991: http://archive.org/download/SoMuchSharewareV1_918/SoMuchSharewareV1_1991.iso/GIFS/WINTER2H.GIF
You see how you can reference a file inside a CD-ROM image in a permanent URL that can be pulled from anywhere? That’s why, as far as I’m concerned, The Internet Archive now has well over four million shareware programs, artworks and documents online. At least. That’s a game changer. And this year? We’re going to double it.
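And because it’s all just URLs, anything that speaks HTTP can pull a single file out of a disc image without fetching the whole CD-ROM – a quick sketch using the example above:

```
# List a disc's contents by adding a trailing slash to the ISO reference:
curl "http://archive.org/download/SoMuchSharewareV1_918/SoMuchSharewareV1_1991.iso/"

# Grab one file from inside the image; no mounting, no full download:
curl -O "http://archive.org/download/SoMuchSharewareV1_918/SoMuchSharewareV1_1991.iso/GIFS/WINTER2H.GIF"
```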
Computer magazines. Lots and lots and lots of computer magazines. Out of print, fondly remembered and otherwise obscure magazines on a range of technical subjects, currently the province of attics and basements and long-unopened warehouses and a smattering of living spaces – now up and readable.
This collection of computer magazines, as well as a smaller Spanish-language set, constitutes 30 years of technical publication and well over a thousand individual issues, many of those in the hundreds-of-pages range, which means there’s a lot of history squished into all this data. I’ve already been informed of university and high school classes out there using these issues to bring up discussions of history or to point out aspects of computer technology that have shifted or changed. Some of the issues have indexes already (the Compute! Magazine collection is a shining example) and I hope more will get them over time. I’ve got lots more issues to add, too.
Manuals! Damn, do I love getting manuals up where people don’t have to search like crazy to find them. It actually saves the environment to some small degree, since people will happily buy older equipment knowing they can get the manual easily and make use of the item. So manuals are a big deal:
Arcade manuals. DEC manuals. Synthesizer manuals. Commodore manuals. Whenever I track down a cache of these or get sent them, they go up. I want to be able to have someone grab any piece of equipment new or old and understand what exactly everything does on it, and maybe even the why.
Audio! Video! 59,000 open-licensed albums. 2,100 nights of live and club music. Hours of GET LAMP raw interviews. A complete port of 10 years of Jesse Thorn’s The Sound of Young America. Bit by Bit. There are many other such audio and video projects where I use scripts to get them into the archive as collections – part of my work has been writing stuff to inject massive amounts of data into archive.org’s servers so that the uploading itself is the least of the issues. Which brings us to:
FUCK YEAH, ARCHIVE TEAM. I can’t begin to really describe how much data Archive Team has brought in – so many people working together to take snapshots of important things that are being shut down with poor or no notice, as well as proactive “panic downloads” where we recognize things are on the outs and we grab the best copy we can.
Like the Internet Archive itself, Archive Team’s collections are not always meant to be short-term beneficial and in fact are pretty clunky – 50gb .tar files and the like. What they are meant to be is raw material for later efforts and rescue of lost data – the panic downloads are basically someone stepping in at the present time and running the duper just before a whole range of data disappears forever. Some of it will be absorbed into the Wayback machine. Some will be filleted for their GIFs or mp3s or who knows what else. And still others will result in data, meaningful first-generation data about how people used computers or how solutions were found to old problems. Or maybe we’ll just laugh at the hair.
I’d go off more on Archive Team, but I’m scheduled for something like a half-dozen speaking engagements around the world this year related to it, so I’ll probably just link to those talks when they come out. Actually, here’s a talk I gave about it a couple months ago, which is hosted at, and took place at, the Internet Archive.
As we speak, Archive Team is uploading something like 25 gigabytes an hour into the Internet Archive. Chew on that for a bit. So many good people, so much good work, on both sides of the wire.
This is getting a bit long, and I’ll split more out into entries this year to give context and meaning, but the upshot is that this has been a very successful year, a lot of amazing things are happening and continue to happen, and every single waking moment I spend related to this “job” is what I’ve always wanted to do.
And that’s pretty nice. Thanks for taking the gamble, Brewster!
Over two and a half years ago, I wrote a statement on Sockington selling out, where I basically said the following:
“I am not going to sell Socks out. Period. Drag your “proposal” or “touching base” or “big idea” or “possibility” to your trash icon, or I’ll kindly take the time to do it for you. The store is closed. It was never open.”
I also felt I had to clarify what “selling out” meant, and what I came up with was:
For me, “Selling Out” for something like Socks comes when the cat or myself are doing things we would never do on our own, and people give us money to convince us to do this. Oh, they may couch it as “paying for your time and effort” or “to help with your maintenance costs”, but it’s taking cash to do something otherwise never happening.
So in that spirit and knowing that, I will inform you I was contacted a month or two ago by a company that wanted me to sell Socks out. Do a promoted tweet, as they call it. As usual, I toy with these guys, so I needled them along and asked how much. $5000/tweet, they said.
Hmm.
Sockington (the actual grey cat) is now (we think) about seven. Penny (the actual orange cat) is now easily ten. Penny and Socks have both seen years of very nice health – no major issues, no overnight visits, you name it. They’re kept on a good diet and kept indoors and exercised and they’re in a rather beautiful home outside of Boston, a place of stairs to climb and rooms to run and general Cat Heaven.
But cats don’t live forever and let me tell you, being involved in the whole Sockington Twitter thing has told me an awful lot about how quickly things go with them. One day they’re purring in your lap and wondering when the next mealtime is, and the next day they’re very, very sadly meowing and you go to the Vet and the Vet suddenly drops in your lap some sort of terrible decision. And that decision is often one that, bluntly, translates to “For three thousand dollars, your cat will live for at least months and probably years, or we can kill it.”
So, call it the years piling on, but I had this rough idea in my head that it might be kind of sweet for Socks and Penny (and Tweetie, the third cat, who needed an actual $1500 surgery a year or two ago) to have a couple tweets that ensured that the only decision in a future medical malady was how fast they could get back on their feet. Call it a wavering, a moment of weakness, maybe for some of you a “cold dose of reality”. But what the heck, it wouldn’t hurt to talk to these people.
It took a while, weeks really, but they eventually came back with actual things to sign. And, of course, because everyone in that sort of business is a bait-and-switch scumbag, it suddenly went from “$5000/tweet” to “$2500/two tweets”. And if that still sounds awesome: the tweets would be scripted by someone else, specifically to sell a product, a product that had nothing to do with cats.
I could quickly see how this was going to go. With a non-disclosure agreement attached, and more requirements than I could shake a stick at, it was going to be months of e-mails to get the actual money – money which, again, falls into the “I sort of have access to that kind of money” level and which, like any such deal, was a loss of principle and meaning in the name of some hocked-together justification. It’ll heal my cats!!!!!!!!
So while there was this tiny, tiny percentage of me going “hmm”, most of me got sickened by the entire enterprise within a day or two, and as the actual documents piled in, that pretty much settled it. So no, Socks continues to not push house cleaning products on his twitter feed. He continues to be a very strange, very odd little cat.
And as for my little bait-and-switch scumbag marketing guys, who wanted to make a cat tweet about some household product for a few bucks, and were willing to jerk me around for eyeballs? It turns out I don’t do that.
But what I will do is drop all the documents and contracts related to the deal.
So here we go, here’s everything they sent me.
Oh, it’s all in there – the kind of stuff you sign your life away with as “talent” in a promotion, the willingness to get involved in secrecy, in acting like oh, one day Socks decided to start tweeting about some sort of product, in a way unlike himself, and right off to those delicious million eyeballs.
Somewhere down there is this line:
Talent agrees that if Talent commits a material breach of any provision of this Agreement or at any time fails or refuses to fulfill Talent’s obligations hereunder, then Marketer or Agency may terminate this Agreement and Talent will not be entitled to any compensation. Talent further agrees that if Talent should die, or fail to fulfill Talent’s obligations hereunder due to illness, injury or accident so that, in Marketer’s or Agency’s judgment, Talent’s disability will preclude Talent from rendering the Services described above, then Marketer or Agency may terminate this Agreement and Talent will not be entitled to any compensation.
Since I didn’t sign, and I have told you about it, I guess that constitutes a “material pre-breach”, where I’ve already taken a big ol’ dump on the whole prospect of the cat being turned into a mouthpiece for said products. Oh well. I’ll get over it.
Enjoy the glimpse into how bad it can get.
And I’ll pet Sockington for you.
So, last year I had this dream to help encourage the porting of MESS (the Multi-Emulator Super System) to JavaScript. We used a ColecoVision emulator in the system because MESS had a compilation option to build just that system (called mess-tiny), so we could focus on the main problems. We did it, and this past week we had a working ColecoVision emulator (currently slow, no sound, but playable in many browsers). That took five months.
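The rough shape of the build, for the curious: compile MESS’s small single-system target with Emscripten’s toolchain, so the C core comes out the other side as JavaScript plus an HTML page that hosts it. The commands below are a heavily simplified sketch – the file names and flags are assumptions, not the project’s actual build scripts:

```
# Heavily simplified sketch of compiling mess-tiny to JavaScript.
# File names and flags are assumptions, not the real build scripts.
emmake make SUBTARGET=tiny         # build the Coleco-only MESS target
emcc messtiny.bc -O2 \
     -o messtiny.html \
     --preload-file coleco.zip     # bundle the BIOS/ROM into the page
```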
Within ONE DAY of getting this all working, the team got the port working for a second platform: the Magnavox Odyssey 2 (1978). I don’t have all the details, but we may now be capable of making this work for all 326 emulated computer platforms in MESS. This is why I wanted this system to go forward – once we worked out the bugs, the effort would leapfrog like crazy. (A special shout-out to DF Justin for getting this O2 emulation working so quickly.)
So, now that it can do the Odyssey 2, it was trivial for it to emulate a very famous piece of software: Munchkin/K.C. Munchkin, a program that was pulled off the shelves because of the first “Look and Feel” legal battle in software. Here, then, is an example page of how one might use this program in the future – a page where you read about the Odyssey 2, about K.C. Munchkin, and then you try K.C. Munchkin.
Now, we still have a long way to go on some things – for example, it ALWAYS tries to start the game when you reload, and we still have no sound, and it’s still very slow. But I hope that demonstration page shows what I’ve been shooting for – a world where you read about a piece of software, learn about the context of it, and then take it for a spin, right there, and try some things out. Just like we do with movies, with music, with documents.
That’s pretty amazing.