ASCII by Jason Scott

Jason Scott's Weblog

Paper and Geocities Update

I’ll combine these two while I’m here, since they’re just updates on ongoing projects.

PAPER

I have a lot of magazines. Did I mention that? A lot. A whole lot. Being made to split them out and get them into special bins instead of scattered among my belongings like sesame seeds means I’m starting to see the size of the pile. Right now, if I had to guess, I would say I have between 3500 and 5000 magazines in the house. Bagged. In bins. As I’m going through them, I’m kind of amazed at the variety – I tend to keep everything. And not everything is magazines, either.

Therefore, I’ve split off the catalogs and “ephemera” – stuff I thought would be good to save before it disappeared completely. These will be cataloged at a later date. And when I say ephemera, I really do mean all sorts of randomness: convention programs, tickets, posters, how-to booklets, handouts, you name it. Some has meaning, some doesn’t, but all of it made me think it deserved a chance to resurface a few years down the line.

So now the paper.textfiles.com site has a few more improvements: additional stuff cataloged (600 issues), descriptions of some properties put in (temporary, not researched), and I fixed a columnation issue that made it split “Summer 2006” into two issues.

GEOCITIES

Archive Team is still downloading Geocities; no surprise there. Right now the canonical collection is about 530 GB, and covers, basically, a metric ton of stuff. I don’t have hard stats at the moment; generating them takes way too long.

There are now several branches working on this. One branch is using the archive.org crawler. Some of us are downloading things directly. I have crazy scripts doing crazy things. We’re setting up a service where you can see if we have a copy of a given URL in the archive. And so it goes.

I intend to have us keep downloading until we either run out of things to download or Geocities is shut down. And believe me, once you start responding mentally to the URL “geocities.com” when reading stuff online, you realize how many things were using Geocities as their central information repository, for better or worse. Dude, shit is going to break when Geocities shuts down. Just to warn you.

I have an incredible group of people helping me and everybody’s getting a big hug when we move to the next phase: sitting on top of the pile and going “WOOOOO HOOOOOOOO”. It’s a ways off, though.


Categorised as: computer history | housecleaning | jason his own self


6 Comments

  1. disambiguated says:

    Will you write a post giving us some examples of what will break? There are several lessons here, IMHO.

  2. ross says:

    HEY I’M BACK ON planet earth. funk soul brotha.

  3. Earle Martin says:

    Are there any ideas/plans for setting up some kind of search mechanism for the GeoCities material when you’ve rescued as much as you can? Or will you be letting Google take care of that when it gets online? I ask because I have a 1995-vintage site in there somewhere that I lost track of most of a decade ago, and I’d love to find it again.

  4. asciigod says:

    Jason, hope this isn’t too off topic but I’m kind of fascinated how you keep all these projects organized in your head. As in, how do you develop a plan of attack, deciding which project to work on when and how you judge your progress. I also work on some similar projects (on a vastly smaller scale thank God), and often have difficulty keeping it all together. If you’ve already covered this topic in a previous entry, could you please point me in the right direction? Otherwise, I’ll see if you have anything to say on the subject here.