ASCII by Jason Scott

Jason Scott's Weblog

The Big Time —

Another important milestone:

The BBS Documentary is now available on Amazon.

The entry says that the DVD has Region 1 encoding; that’s not true, and I’ve sent in a correction. The DVDs have no Region 1 encoding.

Now, people who know of the BBS Documentary through this website and through the main BBS Documentary website might say “Well, that’s kind of odd; it costs more than on your website”. And here we get into the weirdness of selling through a big-name retailer like Amazon.

Basically, I make less money than I do selling the documentary myself. Significantly less. Yes, even selling the documentary for more on Amazon, I make less. But the balance of this is that it’s on friggin’ Amazon.

People like you who have known of me and my project for years (and in some cases, years and years) have the score down already; you either bought it through my site or intend to buy it through my appearances at various conferences, or maybe you’re just going to download it when you have the chance. But this makes you one segment of the audience. Another segment is going to hear of this project by word of mouth, from the radio or from a message base or in a conversation at a bar or what have you. These people will, naturally, go home and either go to Google and look for “Documentary BBS” or something, or they’ll go to Amazon.

Other people put more trust in the Amazon brand name than a guy in the middle of who-knows-where. They’ll want the Amazon guarantee and refundability and whatever else they want from Amazon. So for those people, I have the DVDs for sale that way.

I don’t LOSE money selling through Amazon; I just make significantly less. But the trade-off is that I reach a lot more people this way. You see my issues. Also, no autographs through Amazon, for what that’s worth.

Note that Amazon doesn’t have some special “Jason Batch”, or anything; they’re waiting for my boxes just like you are, if you already ordered.

Oh, and I added it to my Wish List. That felt great.


Text goes In, Text goes out —

Among my various directories are contributions sent to me by folks that I haven’t completely gone through yet. Generally, when something comes in from someone, I have scripts that let me integrate the new data into my archives. They find doubles, they let me describe what the files are, and then they put them into the right directories and re-build the directories with the new information. People wonder how I do so much, and I always respond “with scripts”, because I use them extensively.
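(If you’re curious what “find doubles” means in practice, it’s conceptually nothing fancier than comparing checksums against the archive. A minimal sketch of the idea, not my actual scripts, with made-up /incoming and /archive paths:)

#!/bin/sh
#
# FINDDOUBLES (sketch): flag incoming files whose checksums already
# exist somewhere in the archive. The paths are placeholders.
INCOMING=/incoming
ARCHIVE=/archive
# Checksum everything already in the archive.
find "$ARCHIVE" -type f -exec md5sum {} \; | sort > /tmp/archive.md5
# Any incoming file whose checksum shows up in that list is a double.
find "$INCOMING" -type f -exec md5sum {} \; | while read SUM FILE
do
  if grep -q "^$SUM " /tmp/archive.md5
  then
    echo "DOUBLE: $FILE"
  fi
done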

My inbox was originally shoved away in a weird place, but I decided that I should probably leave the files that are not yet sorted in a public place, so people who find something they need or who want to do some looking themselves can do so.

So, tonight, I introduce to you:

The TEXTFILES.COM INBOX.

This little weird site will have all the archives I’m currently on tap to sort through when I have time. The main reasons that I don’t go through an archive are usually that there’s a ton of same-named-but-not-the-same files, which raise suspicion and need a much closer look, or that there are files which have a place on the site but I’m not sure where (NFO files are an example of this; they’re going to get their own site soon). In other cases, I have been sent programs, and in yet others, I am grabbing a copy of a website and haven’t sorted the ugliness yet.
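(The “same-named-but-not-the-same” check is the same kind of small script, by the way. A rough sketch, with placeholder paths, of how the suspicious ones get flagged:)

#!/bin/sh
#
# SUSPECTS (sketch): list inbox files that share a name with an archived
# file but do not share its contents. Paths are made up for illustration.
INBOX=/inbox
ARCHIVE=/archive
find "$INBOX" -type f | while read NEWFILE
do
  NAME=`basename "$NEWFILE"`
  OLDFILE=`find "$ARCHIVE" -type f -name "$NAME" | head -1`
  # Same name, different contents? That one needs a closer look.
  if [ -n "$OLDFILE" ] && ! cmp -s "$NEWFILE" "$OLDFILE"
  then
    echo "SUSPECT: $NEWFILE differs from $OLDFILE"
  fi
done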

It’s weird stuff, but there you go: if you feel the need to browse through my inbox as if you were visiting the textfiles.com offices or had an internship or something, now you know what it’s like. And yes, it’s about 165 megabytes.


Days of Delay —

Date: Wed, 11 May 2005 11:50:35 -0700
To: ‘Jason Scott’
Subject: Scheduling delays …

Hey Jason – Wanted to let you know that current capacity is causing delays for DVD production.

Here’s the latest scheduling information that we have – looks like the replication of the discs is now scheduled to be complete tue/wed of next week (5/17-18) – once that’s completed – the discs get forwarded to the assembly queue, and the manual assembly and final pack-out scheduling would have the first discs probably shipping-out the end of next week.

Now, I can request that the plant ship you about 500 each of the BULK discs and 500 each of the DVD-Digi’s and Slipcases, and you could assemble them yourselves – I would, of course, back-out the assembly costs that we have in our pricing for those 500…

Sending bulk items to you would get you materials at least a few days before you would receive the assembled first batch.

Let me know if you’re interested in this option, and I’ll see what the plant says.

None of this would be as problematic or concerning for me if I hadn’t implied, committed, and otherwise promised people this whole thing would be going out the door in December of 2004. It has been very troubling for me how much I misjudged. This is a small delay, but it has come on top of a dozen small delays. I apologize to everyone who has ordered for the incredible delay in finally shipping.

It’s real, it’s actual; I’ve shown one or two episodes to groups of folks, so it’s not vaporware or anything, but man, what an annoying time delay. I’ve learned a lot about DVD production, mostly that it is expensive and time-intensive. Now I know why they say a film is going to DVD and then everyone sits on their hands for 3 months.

A bunch of people, upon finding out that I was willing to autograph these, have asked me to autograph their copy. I have no problems doing this at all, so don’t be afraid to ask.

I’ve been working on the website, with a few new subpages going live soon, allowing you to download music from the films, get an introduction to BBSes, and so on.


Packing Boxes Arrive, DVDs in about a Week —

1,000 packing boxes arrived at my house today. This is what $800 worth of packing boxes looks like:

It’s starting to dawn on me more than it did before just how much stuff I ordered to my home. I finally sat down and did the numbers… at least one room of my house is going to be nothing but storage of DVDs, with another couple rooms being the assembly and autographing and mailing locations. I’ve basically committed to turning my home into a factory.

Now, the stuff people care about:

I received mail from my DVD place. They tell me that the printer has me in for the printing around next Thursday-Friday. I have told them Saturday Next-Day delivery is a-ok with me, but there’s a chance the stuff won’t go out until Monday. So I’m now saying we can expect this stuff within 9 days.

This is a lot of stuff. Egad.


Swastikipedia —

Some days, I feel like I should have never written anything about Wikipedia, positive or negative. Like many cults, it has extreme members or well-meaning folks who do not understand what they are part of, and who take me on personally and then fall back into the ranks should I respond poorly. Some of them, should I respond within the confines of Wikipedia, point to the rules of discourse on Wikipedia and how I am breaking them.

Fine. It is not hard to post here and have people reference my ideas; Wikipedia now sends hundreds of folks to my site on a regular basis, all wanting a look at the strange fellow who does not love Wikipedia. I wave to you, from behind my glass.

But I am not really the “Anti-Wikipedia Guy”. I like to think I have more important things to do. Wikipedia will not live or die by my words, so I will not waste words better spent on the betterment of my own sites for the sake of proving my own thoughts to people who fundamentally disagree with me.

But I can spare a few words.

I was asked… well, demanded, really, to show an example of my general belief that “a low barrier leads to crap”, which has been misinterpreted in a number of ways (and really, my entire essay has been misinterpreted, but that’s the way of life online). The argument, which is flawed, is that if I can’t find an article on Wikipedia that is poorly written, my contentions are false. Well, that depends on what you think my contentions are.

Therefore, I will rest my case on a single entry: That of the Swastika.

Here, contained in one entry, is everything that I have issues with regarding the implementation of Wikipedia as it currently stands with its rules. A person could look at the first version and then the latest one, see how big and fluffy and full of photos the latest one is, and go “success!”.

But dig deeper under the surface of this entry, and then you start to see the cracks in this “success”.

With over 1,500 edits done to this entry over its three-year lifespan, the process of becoming even slightly familiar with the editing pattern could be a full day’s work. I spent some time with it and my analysis is nowhere near complete, but here are some interesting points along its journey.

The Swastika entry starts its life in March of 2002. By the end of 2002 it has gotten 11 edits, mostly minor nips and tucks trying to get a grip on what exactly it is a symbol of and how to format the image.

In January of 2003, someone coming from an IP address makes a selection of changes over the course of a few days. His revision history shows someone who was big in 2002 and 2003 and then faded away (or he got an account, but it’s strange he would feel no need for an account for nearly half a year and then suddenly decide he needed one). It also highlights one of my issues: without asking people to at least register in some way before making changes, it devalues all the other people willing to be tracked and cited when working on entries. It’s not like it costs money or that you can’t have a billion accounts… it just makes it that much more disheartening when your stuff is changed by someone who you hope is on a static, non-shared IP address.

By July of 2003 there have now been roughly 30 edits to the Swastika entry, resulting in a bit of change but basically the same information.

And something happens in July of 2003. It gets over 50 edits during that month from roughly 15 different people. And then the troubles begin.

If you start going through the edits, one by one, and only a maniac would at this point, you see points raised, links created, statements made, and then slowly, over time, they’re removed.

A link between the Nazi symbol and Socialism is put up, and later, someone called “Nlight” calls it “presumed nonsense” and removes it. Why? Who the heck is Nlight? Well, someone who couldn’t take it anymore, apparently. But if you go look back at his older entries about himself, you see he’s a computer geek from the northwest. Why did he remove the link between socialism and Nazism? Because he felt like it. Because he “presumed” it was “nonsense”, according to the edit. So now the socialist guy has to become a content defender, pulling back his socialism link with a citation for it. But now here comes Rasmus_Faber, about 20 minutes later, to undo the socialist guy’s work and return the entry to the version without the link. What is called a “revert war” then occurs, with Socialist Guy trying desperately to keep his entirely valid Socialist Party link about the Swastika alive while Rasmus Faber (who is, as his page says, a software engineer) repeatedly stops his changes from staying.

Throughout “The Battle of January 31”, the changes go back and forth between Socialist Guy, Rasmus Faber, Nlight, and Mrdice, who, as far as can be surmised, simply jumps into the melee to “help” Nlight’s valiant attempt to not link Socialism with its use by the Nazi Party. (Mrdice, by the way, gives up on editing Wikipedia in early 2004, leaving behind a legacy of zip-and-run edits where he accuses, demands, dictates and runs away, with none of that boring, time-wasting need to show any authority or reputation with his subject.)

And lo and behold, that little nugget of information is lost: the work of four people at odds with each other, scattered all over the world, fighting over what might actually be a real fact.

The story of the swastika’s entry continues after this, for over 1,200 edits. Dozens of people are involved, lots of facts are lost, many are gained… and you would be hard, hard-pressed to show why many of these folks should be editing the Swastika entry in the first place. Calling this “open source” and comparing it to programming projects is borderline insane: open-source programming projects have a core team with goals in mind that they state clearly, who then decide what gets in and what does not get in. Sometimes this works, and sometimes it does not, but people with anonymous IPs can’t just come in and fundamentally redo the graphics code on the program and then disappear, never to be seen again.

This is what I mean; you have a brick house that, from a distance, looks decently enough like a house that people say “see, community building works”. But what isn’t obvious on the surface is how many times those bricks have been pulled apart, reassembled, replaced, shifted, modified, and otherwise fiddled with for no good reason other than battling an endless army of righteous untrained bricklayers who decided to put a window there… no, there… wait, no window at all. If you declare the final brick house a “victory” while ignoring the astounding toll of human labor required to get it so, then you are not understanding why I consider Wikipedia a failure.

And all of this wouldn’t be important at all, if we didn’t start to see the Wikipedia definitions propagating throughout the internet, becoming something you get automatically on a lookup from Trillian, or something Yahoo uses as a way to get facts. That goes beyond scary… it borders on negligent.

Now, if you’ll excuse me, I have a documentary website to take care of. It’s waiting for me, and nothing gets done unless I work on it… which is just fine with me.


A Quiet Transaction —

Every day, fifteen thousand people visit textfiles.com. That number is a little hard to fathom for me, although I try. My statistics program tells me that across a month, it works out to roughly a quarter of a million unique visitors from around the world. A quarter of a million.

I field about 200-300 emails a month about the site, ranging from dewy-eyed wonder to seething anger. That means roughly one-tenth of one percent of the people who are on my site communicate with me.

This is fine with me. In fact, it’s more than fine.

I am saddened when I hear of sites that are popular, that get a lot of visitors who come because the site offers something unique or at least alluring, and whose owners then turn around and consider this not a gift, not a wonder, but a field waiting to be harvested.

I’ve sat through the emotional paragraphs, the insistent screeds, the angry rants indicating that they have the right to treat their audience as a series of floating coins in the air; to bounce around and snatch them like a game of Mario Brothers. I’m sure they hear the little “ding” noise while they do it, too.

Children learn by watching what others do, and I come into contact with young people who see that their sites must have banners, must have ads, must ask for PayPal donations, because that’s how the world works, because that’s how they are told the world works.

Make no mistake, I like money. I like money a lot. In fact, go ahead and send me money, tons of it. I’ll swim around in it like Scrooge McDuck and spit out the occasional gold watch.

But money does something to you, when you start to get it in tiny amounts from your site. It makes you change; it makes you look at things a little harder, consider things a little differently. Should I discuss this subject to get more hits? Should I not talk about that subject because it’ll drive my page ranking down and cause fewer donations? Suddenly, you’re no longer running a site… you’re running a storefront, a dingy amateurish storefront with a few glittering items in the window desperately trying to drag folks off the street long enough for it to register with the ever-seeing camera you’ve installed that will throw out coins if the person stays in your store long enough.

No, thanks.

There is a plugin program for the Firefox web browser called Greasemonkey. Greasemonkey is going to shoot a lot of this approach to a website in the head. Greasemonkey shoots out a tendril into your dingy storefront, smashes the camera, rips the advertisements in half and grabs your shiny baubles, all in about a millisecond and while other tendrils are doing the same thing all up and down the street.

The tendrils that shoot into textfiles.com will do work, but not very much. The biggest “my fault” complaint I get, besides various concerns about content, is the green-and-white color scheme, which I solved for people some time ago. My site doesn’t assault, doesn’t demand, doesn’t declare… it just offers you the world as I have collected it.

Tens of thousands of people come to my site. Sometimes they come for one file, skipping my welcome screen, directory, explanations and context, just to directly yank their specific target and disappear forever. Sometimes people come and go and never know they were on my site. I don’t brand the textfiles and I don’t use javascript trickery to detain folks like drunks in a cell until they are subjected to their required ad-watching. (With Greasemonkey on the job, they wouldn’t be able to anyway.)

Make no mistake, I used to brand textfiles. I proudly wrote my script, made it brand the textfiles with where they came from, added a demand they visit me, insisted they know who I was and how great I was and how lucky they were to be getting this file from me.

I was also 13.

It was 20 years ago.

I also used to smash mailboxes.

I grew up.

A quiet transaction, that’s what I give. A silent, non-judgemental transfer of information from human being to human being, via machines designed to do so as quickly and fully as possible, with no data lost, no aspect removed. It is not flashy, it is not lucrative, it is not judged.

It is a miracle.


Last Check Disc Approved —

I received the last check disc in the shower.

Actually, I was showering when FedEx arrived with a stack of check discs of the first DVD of the set, which close-watching fans know was the remaining disc to go through the approval process. As was proven by Disc 3, submitting a dual-layer DVD instead of a couple of DLT tapes was the trick, and the whole thing came out working just as I had both expected and hoped.

So that’s it, the final piece of the puzzle has been locked into place and the printing will begin. I am paying to have a subset of the discs shipped to my house so I can get them out to people absolutely as soon as possible.

It is relatively surreal to watch the DVDs now; they’re all done, I can make no more changes, and the urge to nip and tuck is somewhat maddening. It’s also entirely unnecessary. These things are good. I will, of course, be encouraging disinterested third parties to write about the documentary. I’ll just hunker down in my lair and wait for the onslaught.


Podcast Work —

I figure it’s worth it to describe each step along the way of this Podcast collecting thing, in case people are actually going to use this as a template for their own collecting, or for some other weird purpose like trying to discover my “secret sauce” for all this.

When I’m not emotionally invested in a collection and don’t get a particular joy out of mulling over each new item (like I do with my “last straw” collection), then I want to script as much as I possibly can, lest I lose personal interest and the collection languish as a result.

In the case of the podcasts, we’re in luck, since, as mentioned before, the items in the collection are doing their damnedest to be collected: they end up providing these nice little RSS feeds for me.

So the structure I currently have is:

NAME OF PODCAST/.url           (the URL of the RSS feed)
NAME OF PODCAST/.xml/          (directory of grabbed copies of the RSS feed)
NAME OF PODCAST/filename.mp3
NAME OF PODCAST/filename.mp3
NAME OF PODCAST/filename.mp3
[...]

The name of the directory functions as the name of the feed. If this ends up being insufficient, I’ll put a .fullname file inside that will override the name for the purpose of reports.
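(If I do end up adding that, the report-side lookup is trivial. A sketch of what I mean, with the function name being purely illustrative:)

#!/bin/sh
#
# Sketch: the display name for a podcast directory is the contents of the
# .fullname override if it exists, otherwise just the directory name.
feedname()
{
  if [ -f "$1/.fullname" ]
  then
    cat "$1/.fullname"
  else
    basename "$1"
  fi
}

feedname "/podcasts/podcasts/NAME OF PODCAST"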

Here’s the full source of the PODLIST script, which creates a file with a list of all the podcast directories and their feed names:

#!/bin/sh
#
# PODLIST: Go and build up the feed list from the podcast directory.
cd /podcasts/podcasts || exit 1
# Start with an empty list.
: > PODCASTLIST
for each in *
do
  # Each podcast directory keeps its RSS feed URL in a .url file.
  FOF=`cat "$each/.url"`
  echo "$FOF ($each)" >> PODCASTLIST
done

…nothing too big. So you end up with a file called PODCASTLIST which lists all the podcasts. I use this as a quick reference when I add new feeds; if the new feed matches anything in that list, I already have it.
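(The quick-reference check itself is nothing more than a grep against that file; the feed URL here is obviously made up:)

# A hit means I already have this feed.
grep -i "http://example.com/rss.xml" /podcasts/podcasts/PODCASTLIST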

I need a script to download a podcast, checking for the .url file. This is that, the core of the downloading:

#!/usr/local/bin/bash
#
# PODSUCKER: Jason Scott's Script for Pulling Podcasts
#
# Based off of BASHPODDER and its Secret Sauce
# Originally by Linc 10/1/2004
# Find the original script at
# http://linc.homeunix.org:8080/scripts/bashpodder
#
# Modified by James Rayner, iphitus@gmail.com
# www.iphitus.tk

PODCASTDIR=/podcasts/podcasts
cd "$PODCASTDIR"
# Use the current directory as the location of all the directories;
# an optional argument limits the run to directories with that prefix.
for MUSICBIN in "${1}"*
do
  cd "$PODCASTDIR"
  echo "[%] $MUSICBIN"
  # Is there a .url file? If not, then this isn't really a podcast.
  if [ -f "$MUSICBIN/.url" ]
  then
    cd "$MUSICBIN"
    # Keep dated copies of the feed itself in a .xml directory.
    if [ ! -d ".xml" ]
    then
      mkdir .xml
    fi
    podcast="`cat .url`"
    timedate=`date '+%Y%m%d%H%M%S'`
    wget --output-document=".xml/$timedate" "$podcast"
    # Pull the feed again and strip out every url="..." attribute.
    file=$(wget --tries=5 --append-output=.retrieval.log -q \
      "$podcast" -O - | tr '\r' '\n' | tr \' \" | sed -n \
      's/.*url="\([^"]*\)".*/\1/p')

    # Download each enclosure; -nc means never clobber a file I already have.
    for url in $file
    do
      echo "    $url"
      wget -nc "$url"
    done
  fi
done

Scary if you’ve never seen Bourne Shell script in action; stupid if you use Perl, weird if you use Bourne. Basically, I go into a directory, grab the URL, pull a copy to store locally, then re-pull it and pull out all the mp3 files referenced and download them. WGET does the cool thing of making sure I don’t re-download a file I already have.
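(In case the "${1}"* bit isn’t obvious: running it with no argument walks every podcast directory, and handing it a prefix narrows things down. The podcast name in this example is invented.)

# Pull everything:
/podcasts/podsucker.sh
# Pull only the directories starting with "Some Podcast":
/podcasts/podsucker.sh "Some Podcast"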

Why do I pull the stuff TWICE? Well, I could probably get the information from the pulled file, but regardless, I keep a copy of the file in the .xml directory so that, down the line, any information stored in the .xml file that isn’t directly related to mp3 filenames is saved for history. So I’m trying to think ahead here.
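(If I ever decide the second fetch is a waste, the same extraction would run against the copy I just saved. A sketch of that variation, sitting inside the podcast directory and reusing the $timedate name from above:)

# Variation (sketch): pull the enclosure URLs out of the saved feed copy
# instead of hitting the feed a second time.
file=$(cat ".xml/$timedate" | tr '\r' '\n' | tr \' \" | sed -n \
  's/.*url="\([^"]*\)".*/\1/p')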

Then we need a script to add new unique feeds. Here we go:

#!/bin/sh
#
# ADDPODCAST - Add a unique podcast, if it isn't already in the list
#              addpodcast [directory name] [RSS feed URL]

if [ "$1" ]
then
  POG="$1"
  POF="$2"
  echo "Checking for $POF.."
else
  echo "What is the RSS feed URL for this podcast:"
  read POF
fi

if [ ! "$(grep -i "$POF" /podcasts/podcasts/PODCASTLIST)" ]
then
  if [ "$2" ]
  then
    echo "Calling it $1.."
  else
    echo "It's a newbie!"
    echo "What is the name of this directory?"
    read POG
  fi

  mkdir "/podcasts/podcasts/$POG"
  echo "$POF" > "/podcasts/podcasts/$POG/.url"
  echo "$POF" >> "/podcasts/podcasts/PODCASTLIST"
  echo "Added. Now sucking it down."
  /podcasts/podsucker.sh "$POG"
else
  echo "Already got it, chief:"
  grep -i "$POF" /podcasts/podcasts/PODCASTLIST
fi

A lot is going on here. First of all, it looks very weird because it can be called with arguments at the command line, or, if no command-line options are given, it asks you to supply them.

If the RSS feed you tell it to use is already taken, it fails out. Otherwise, it creates the directory, gives it a .url file with the RSS feed’s URL, and then calls the podcast grabbing script (Podsucker).
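(So, in practice, it goes one of two ways; the directory name and URL here are invented:)

# Interactive: it asks for the feed URL, then the directory name.
./addpodcast
# Or hand it everything at once: directory name first, RSS feed URL second.
./addpodcast "Some Podcast" "http://example.com/rss.xml"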

By now a couple of things should be clear (besides that I write really sloppy code): you create a bunch of little tiny scripts that do one thing well, and then have each script call the others, so you can concentrate on the little tasks instead of trying to make a monolithic script from hell. And you try to handle a bunch of error contingencies.

I attended a talk about programming held by Tom Jennings of FidoNet (I’d interviewed him a year earlier, but also attended this talk) and one of his big statements during his talk was that 95 percent of a programmer’s work is handling errors. Design, workflow, and structure are only a small part of the job; the rest is trying to handle the stupidity of man or the unexpected contingencies. He’s quite right.

So, we have:

– The thing that will download a podcast’s files
– The thing that will list all the podcasts we have
– The thing that will add a new podcast directory and call the above

And so if I let this stuff run, it will do a very good job, without my efforts, of downloading all the podcasts I find.
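(The “without my efforts” part is just cron doing the running for me. A sample crontab entry; the schedule and log location are nothing more than an example:)

# Pull every feed at 4am and keep a log of the run.
0 4 * * * /podcasts/podsucker.sh >> /podcasts/podsucker.log 2>&1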

So, where do I get these podcasts from? Well, I have to scrape other sites. “Scrape” isn’t my favorite verb to describe the process, but it’s what I do. Here we get into insane magic mojo.

I give unto you, now, my one questionable script. Given a number, it will go to the site PODCASTDIRECTORY.COM and pull down the URL of the XML feed for that podcast, as well as its official name, and then run the addpodcast script.

#!/bin/sh
#
# GRAB from the podcastdirectory.com site.

wget -O beets "http://www.podcastdirectory.com/podcasts/index.php?iid=$1"
# Line 240 of the fetched page carries both the podcast's name and its feed link.
FLAM=`head -240 beets | tail -1 | sed 's/.*<p><b>//g' | sed 's/<\/b><p>.*//g'`
FLIM=`head -240 beets | tail -1 | sed 's/.*<p><a href=//g' | sed 's/><span class=.*//g' | sed "s/'//g"`
rm beets
./addpodcast "$FLAM" "$FLIM"

Crazy, huh. A small number of lines, but thanks to the scripts I previously worked on, it will go to podcastdirectory, get the name and RSS feed URL, and then add it to my collection if it doesn’t exist. I only pull a few K (and no images) from podcastdirectory, so I don’t feel overly bad. And it’s for the good of history, anyway.
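(For what it’s worth, the way it gets used is one ID at a time, or a quick loop over a range of them; the script name and ID numbers below are arbitrary:)

# One directory entry:
./grab 1234
# Or a sweep over a range of entry IDs:
for i in `seq 1000 1100`
do
  ./grab $i
done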

So there we go, a little insight. Next time, I’ll go into some of the other issues of this process.

In case you’re keeping track at home, I am currently pulling down 1,299 feeds and have 17,059 files to show for it. 202 gigabytes of podcasts. And I’m just getting warmed up.
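(Numbers like those come straight out of a couple of one-liners against the collection directory; roughly:)

# Feeds, downloaded files, and total disk usage, respectively.
ls -d /podcasts/podcasts/*/ | wc -l
find /podcasts/podcasts -name '*.mp3' | wc -l
du -sh /podcasts/podcasts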


Delays and Ankles —

I have been informed that the final check disc (the remaining piece I need to sign off on before we see the duplication happen) will not arrive until Monday. Ow! This is the nature of things: I end up in the queue of a big printer, and the re-submission of materials screwed up the timeline. So Monday, ideally, is when I sign off and the machine goes into full motion.

Meanwhile, I’m placing the orders for the hundreds of mailing boxes and am beginning to print out the labels for all the outgoing packages. So, again, very close.

Speaking of Ow, I have successfully twisted my ankle. Being laid up here on my couch, in a good amount of pain with drugs and ice, I once again realize what I usually forget: I’m not the type of person who likes sitting still. I have to be doing something. It’s quite maddening.


Quite the Offer —

I made the following offer on the news page of the documentary site. I might as well include it here as well.

“For those of you watching the calendar carefully, you know that I am going to end up with a technical win, but not a real win. The documentary will be printing before the end of the month, but by the time the printed DVDs get to me and I send them out to all of you, it will likely slip past May 1st. I am therefore making the following offer.”

“If you pre-ordered a copy, and by that, I mean ordered copies any time before now, you can choose to be notified by a personal phone call from me.”

“E-mail your mailing address (so I can match it up to your order) your phone number, and a good time to reach you (include time zone) and I will personally call you to tell you your DVD set has gone out the door.”