ASCII by Jason Scott

Jason Scott's Weblog

The Tourist, The Researcher, The Maniac

This year was primarily about software, specifically collecting and emulating it.

That went well.

Asked what my goals or projects are for work for the next year (2015), I have resolved that they will be cleaning up the mountain of material in the Internet Archive’s stacks and making it easier to go through.

Naturally, I’ll be continuing to help with Archive Team, improving JSMESS, and acquiring tons of crap to throw into the hard drives as well, but I really want to take advantage of the tools and time to really dig deep into the Internet Archive’s wonder as a collection. It is a huge, huge set of materials and I’m positive people would have minds blown if they really came into contact with the true size and breadth of the material.

So that’s how it’s going to go in 2015.

But, as usual, I can’t wait and I’m starting now.


First, understanding the mass of material already in the Internet Archive that has come flooding in over the past few years is a task in itself. We don’t have a few monks chanting – we have 4,700 chants by a group of monks in France. Or maybe you want your sermons in English. Well, that’s fine, here’s a set of 21,000 of them. Or religion bums you out – that’s OK, take your mind outside and look at 8,000 books from the library of the New York Botanical Garden. Or go back in time to see a hundred episodes of Starcade, the game show for video gaming.

I can do this all day. There are thousands of groupings of thousands of items down there. It goes in many directions, and it goes as deep as your tolerance can go. Books? The Internet Archive’s contributors added 1,000 books yesterday, and 1,000 the day before that as well. (It’s been a good couple of months.) Audio, don’t get me started. Newspapers, tens of thousands of issues of newspapers!

Stuff. We have it.


Before I talk further, let me also say I’m not the only one thinking about this, especially within the organization. There is a very intense effort underway to completely redo the Internet Archive’s interface. If you visit you will go into the Beta Interface, which you can see is coming along swimmingly. (There’s a button in the upper right to turn it off and return to the current interface.) A team of people spent a year dealing with both the backend and the frontend aspects of this new design. It has a way to go, but it actually brings this whole issue to the forefront, since it’s going to become a lot easier to browse the collection.

But for my little bit, I want to find ways to get a better hold on the millions of items, move aside spam and broken items, and group up really cool and similar things without me spending every waking hour doing it.

But before I even do that, it’s time to talk about who I think the audience for the Internet Archive’s media riches is in the first place.

Hence, the Tourist, the Researcher, the Maniac.


The Tourist washes up on land, pushed in by the general tools of contemporary discovery – popular links and search results. The Tourist can be very compelling, because whatever dragged a Tourist to your site probably dragged a lot of other Tourists. They can almost seem like a coherent movement, or a congregation. But they’re not, any more than one or two methods out of a building results in an actual orderly line. They’re a mob, a gawking fog of incoherent attendance, a group who has come in because there was a blinking sign somewhere, and having found something similar to what they want, have smiled and gone away, or they’ve stuck around just enough to pass judgement on somewhere they’ve never been and will likely never return again.

You can tell a Tourist because they go back to comment at the place they heard about the link, and the commentary they make tends to be along the lines of “I have a lot to say about the way about how nothing impresses me.”

Tourists can definitely become recurring users, of course. But their introduction to the place will almost always be “here was a thing you might like”, and they run in and out, because these are busy people, and there’s a lot of things to like.

All is fine and good with Tourists, unless your site or the site attracting Tourists thinks Tourists are the best people to acquire, what with their bloating numbers and all. They’re not – they’re basically sugar. Quick high, bad aftereffects, no long-term benefit.


The Researcher has come into your life and your site because they’re Looking For Something. Hopefully you have it.

Their requests are usually somewhat strange, but they’re also really flexible about what they find. They browse, they look up similar items, and they try to do some of your sorting job for you. On a good day, they send some of the results of their work over – “I have a list of items that need to go into a different collection” is a common one the Internet Archive gets in the inbox.

Researchers often want something that is not quite what search engines can provide. A query can be something like “I want all the magazines that reviewed the Apple III”, or “I want to look at movies that feature skateboards or skateboard parks.” You can feel like everything is an incomplete patchwork of metadata and fairy dust after a Researcher walks around. But they’re not trying to be mean or disruptive – they just really appreciate the place and want to get the most out of it.

Researchers hunt in solitary or in small packs.


The Maniac sounds like who you’re doing this all for.

You are not.

The Maniac comes in, and never leaves.

The Maniac has a dream inside their head of how the world should be and you are failing at it, right now, even while you’re not doing anything.

The Maniac has lots of ideas for the site, how things should be done, but they’re really a bundle of unfunded mandates and issues that affect a very, very small number of people. As a bonus, they become anti-ambassadors, posting frequently and viciously in other realms about how your place wouldn’t suck if you only decided to use the sdfjkhsfsdjfh standard or you would just allow Unicode in usernames. I’d say The Maniac hasn’t done anything, but they probably have done something and by Ramthar’s Crowbar you will hear about it, a lot, as explanation of why they get to come in and take a dump in the potbelly stove.

DPI matters a lot to the Maniac, as does impeccable presentation and completeness. These, in themselves, are excellent things to have, if you can get them. But Maniacs are very, very quick to flip out if any of an internal checklist isn’t met. Maniacs flip out a lot. It’s what they do.

Maniacs are almost always solitary, although they can give the impression they travel in packs – but to be honest, every Maniac thinks all the rest of the Maniacs are tourists.


So, there’s a fourth group, called The Future – but going too insanely over the top to appease people that don’t exist with goals and ideals you can’t possibly anticipate is not today’s exercise. So you keep out there in the hazy realms of Tomorrow, you crazy Futures.

I think the sweet spot is the Researchers. You want to push Tourists to consider becoming Researchers if the mood takes them, and you want Maniacs to either tamp down into becoming Researchers, or go spill their hot oil of hate out from another castle.

Make the site make sense, start culling out items that are spam into places that won’t discourage browsing, and help build collections that sing. That’s 2015 for me – Researchers welcome, Tourists a delight, Maniacs sorry about the sdfjkhsfsdjfh being sub-par – I’m sure we’ll get to it soon.

Now, let’s get some sorting done.



Categorised as: Internet Archive


  1. Michael Bluth says:

    “…or go spill their hot oil of hate out from another castle.”

    Jason, you produce these gems at an alarmingly high rate. It’s freaking awesome.

  2. Nate says:

    What is that odd briefcase thing? C64 with “Hustler” cassette tape and key overlay in Swedish???