ASCII by Jason Scott

Jason Scott's Weblog

All The Podcasts: An Update —

Well, it’s been a little while since I announced that I was setting off to archive all the podcasts. I figured I’d take some time to let you know how all that’s been going.

Obviously, my documentary and then my regular duties have been taking priority over the project, so the initial goal was just to get a sense of how big the whole thing really is, what would be involved, and what effect that would have on my resources. To do this, I started using a program called Doppler, which is a multi-threaded podcast downloader that was also wired into the ipodder directory, enabling me to click and say “gimme all”, and then “keep all forever”, and so on. I figured if I opened the floodgates, I’d get a good chunk of the stuff.

Now, as it turned out, Doppler wasn’t up to this task. I hardly blame the developers for this situation; how many of their users would be expected to be downloading 1,200 feeds? So I started to run into problems of corruption, doubled feeds, and other annoyances. I ended up having to write a Perl program that went through the Doppler configuration files and cleaned them. It was picking stuff up, but only sort of.

After a couple months of jamming Doppler on full, and keeping in mind I was just doing this in the background, low-priority compared to my documentary work, I had about 1,200 podcasts and about 11,000 files, equalling 150 gigabytes of data. Large, but not stunning. If you play the game of saying that mp3s are about a megabyte a minute (which is very, very rough and doesn’t take a lot of factors into account), I have downloaded over 104 solid days of podcasts.
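For anyone who wants to check the math, here is a minimal sketch of that back-of-the-envelope estimate. The one-megabyte-per-minute figure is the rough rule of thumb from the paragraph above; treating a gigabyte as 1,000 megabytes is my assumption, and the function name is made up for illustration.

```python
# Rough rule of thumb from the post: about 1 megabyte of MP3 per minute of audio.
MB_PER_MINUTE = 1.0


def listening_days(total_gigabytes, mb_per_gb=1000):
    """Estimate continuous listening time in days for a pile of MP3s.

    mb_per_gb=1000 is an assumption (decimal gigabytes).
    """
    minutes = total_gigabytes * mb_per_gb / MB_PER_MINUTE
    return minutes / (60 * 24)


print(f"{listening_days(150):.1f} days")  # prints "104.2 days"
```

Swap in a different bitrate assumption and the answer moves a lot, which is why the estimate is only good for a sense of scale.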

So let’s step aside a moment and go over where I am in the collecting process.

I had at this point collected a large amount of audio files, but hardly a comprehensive collection. Some of the podcast sets are incomplete, just representing what was in recent XML pages, and not going back far enough. Some of them are doubled. Others are weird, broken files, not really podcasts; some people didn’t implement them correctly and they don’t have any audio files at all, just a JPEG file and a PDF file and nothing else. In other words, it’s definitely big, but it’s also a big mess.

This is that critical juncture I mentioned in the previous entry. My collection is neither convenient nor complete; neither well-sorted nor easily browsable. It is, basically, a huge crap-pile of audio files.

Obviously, I will press on. But here, after taking stock of my initial collecting, I have begun the process of re-doing things right.

First of all, I’ve had to stop using Doppler. It was fun and easy to use and a nice client, but I was using it on a Windows box to access a Samba share presented on my FreeBSD file machine (which has a couple terabytes of disk space, in case you’re wondering where I’m putting all this). This was tying up two machines for no good reason, and also was adding a ton of network traffic that didn’t need to be there (pulling it to the Windows box and then throwing it all back at the FreeBSD box).

So now I switched to bashpodder. Actually, that’s not really true. I switched to taking bashpodder and IMMEDIATELY dropping its transmission and rewriting it basically from scratch. So I use the “secret sauce” line that yanks all the mp3s out of an XML file, but the rest of it all is using my directory structure and approach, and is using whatever URL is stored in the directory with the mp3s as the place to check the feeds… and it’s also now grabbing a copy of the XML file and archiving it as well.
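To make the idea concrete, here is a rough sketch in Python of the two things described above: pulling the audio URLs out of a feed, and keeping a copy of the feed XML next to the files. This is not bashpodder's actual code and not my script; the function names, the `feed-<stamp>.xml` file name, and the sample feed are all made up for illustration.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def enclosure_urls(feed_xml):
    """Return the url attribute of every <enclosure> element in an RSS feed."""
    root = ET.fromstring(feed_xml)
    return [e.get("url") for e in root.iter("enclosure") if e.get("url")]


def archive_feed(feed_dir, feed_xml, stamp):
    """Save a copy of the feed XML next to the audio files (hypothetical naming)."""
    target = Path(feed_dir) / f"feed-{stamp}.xml"
    target.write_text(feed_xml, encoding="utf-8")
    return target


# A tiny invented feed, just to show the shape of the data.
SAMPLE_FEED = """<rss version="2.0"><channel>
  <title>Example Show</title>
  <item><title>Episode 1</title>
    <enclosure url="http://example.com/ep1.mp3" type="audio/mpeg" length="1"/></item>
  <item><title>Episode 2</title>
    <enclosure url="http://example.com/ep2.mp3" type="audio/mpeg" length="1"/></item>
</channel></rss>"""

print(enclosure_urls(SAMPLE_FEED))
```

Real feeds are messier than this (broken XML, enclosures that are not audio at all), so a production version would need error handling around the parse step.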

The reason behind this last move is that I’m finding that a lot of the XML feeds have all sorts of important information in them, like descriptions, text, ideas and explanatory paragraphs that are not anywhere else. I have no way, right now, to match them up with the file, but maybe someday. Or someone else will do it. Either way, it will enable me to keep the whole collection somewhat sane. This is a case where I am Collecting for the Future, not for my own information/education. I may never read these archived XML files again; but they’ll at least be somewhat near the MP3 files they reference, so someone studying that particular podcast can see information about the MP3s that would otherwise be lost.

The new directory structure allows another situation which I knew was coming but wasn’t handling yet: splitting the collection among more than one drive. Right now, the whole collection is mirrored, but in point of fact we were heading happily towards the 240gb limit of the hard drive, which meant we were going to have to use multiple drives, and I am currently avoiding RAID. So this new structure means that the information on how to get the mp3s is located within the directory with the mp3s themselves, so multiple scripts can be running. I’ve written the scripts so you can say things like “check and download all the feeds starting with the letter ‘A’” so the scripts don’t bother each other.
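The “starting with the letter” trick can be sketched like this. It assumes one directory per feed under some collection root, which matches the description above, but the function name and layout details are my illustration rather than the real scripts.

```python
from pathlib import Path


def feeds_starting_with(collection_root, letter):
    """List feed directories whose names start with the given letter.

    Each worker script can take a different letter, so no two workers
    ever touch the same feed directory.
    """
    prefix = letter.lower()
    return sorted(
        d
        for d in Path(collection_root).iterdir()
        if d.is_dir() and d.name.lower().startswith(prefix)
    )
```

Run one copy with `"a"`, another with `"b"`, and so on; because everything a worker needs lives inside the feed's own directory, the same call works no matter which drive the collection root sits on.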

So now we have everything a little cleaner. But! In cleaning up the situation, I found rapidly that I was missing a ton of podcasts. As might be expected, my half-hearted “download everything” selection was working, generally, but missing a lot of feeds with broken links or weird syntax, or otherwise not making sense to Doppler. So, with my new scripts in place, I’ve started downloading. And downloading… and downloading.

As of right now, I have downloaded 20 gigabytes of podcasts on this day alone. And I’m nowhere near done. This is because I’ve started the second phase in this type of collecting; hand-checking my feeds to be accurate, and filling in gaps where needed. So now I don’t just get the last 5 mp3s of a feed; I go back and get ALL of them.

To help me, I’ve written a script that allows me to add a podcast to the directory. This script will check to see if the feed is already in use anywhere else by any other feed (meaning it’s a double if we were to add it), and then, if it isn’t, creates the directory for the feed, puts the URL in there for later script use, and then issues an immediate download of the feed. All of the things it does can be called at the command line, so I can have a script that gets a bunch of “new” podcasts, and adds the unique new ones AND downloads them immediately. Good stuff.
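The logic of that add script is simple enough to sketch. This is a stand-in written in Python, not the script itself: the `feed.url` file name, the function names, and the `download` callback are assumptions made for the example.

```python
from pathlib import Path

URL_FILE = "feed.url"  # hypothetical name for the per-directory URL file


def known_urls(collection_root):
    """Collect the feed URL stored in every feed directory."""
    return {
        p.read_text().strip()
        for p in Path(collection_root).glob(f"*/{URL_FILE}")
    }


def add_feed(collection_root, name, url, download=None):
    """Add a feed unless its URL is already in the collection.

    Returns True if the feed was added, False if it was a double.
    `download` is an optional callable that receives the new directory.
    """
    url = url.strip()
    if url in known_urls(collection_root):
        return False
    feed_dir = Path(collection_root) / name
    feed_dir.mkdir(parents=True)  # raises FileExistsError on a name clash
    (feed_dir / URL_FILE).write_text(url + "\n")
    if download is not None:
        download(feed_dir)
    return True
```

Because `add_feed` just returns True or False, a batch job can loop over a list of “new” podcasts and let the doubles fall out on their own.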

Let’s step aside even further as I talk a little bit more about my opinion about podcasts.

As said in the previous entry, I do not consider podcasts particularly new or revolutionary. That said, I am happy that people are making such an effort to record themselves and then slave away to make those recordings available to the largest number of people they can get the attention of. To an anthropologist, it’s like this huge self-service oral history project. Maybe they’re talking about tech issues or news items of the day or other “disposable” subjects, but not always, and there’s a lot more information in these than you might expect.

One of my favorite ironic works is Maciej Cegłowski’s “Audioblogger Manifesto”, which was created to attempt to point out the folly of audio weblogs, and how the bandwidth of information was nothing compared to text blogs. Of course, Maciej didn’t intend this to be ironic, but guess what, it ended up being just that. (You can read a transcript of his speech at his site as well.)

While he goes on about how audio files are a step back, how they remove the advantages of hypertext and force your audience down a single path without the ability to skip around or add additional information, you’re hearing music play in the background. In other words, you are getting two simultaneous streams of information in your ears. You can hear Maciej’s voice dripping with either sarcasm or stilted emotion, depending on your point of view. He talks about all the disadvantages of spoken word without pointing out all the advantages, like how his pauses and expression come across in ways they wouldn’t with the written word; this is why, for example, many authors going back many years have gone on speaking tours, reading from their own works; you never saw Mark Twain show up to one of his many engagements, point to a book of Huck Finn, and say “Just shut up and read.”

He is, essentially, a man sitting on the porch watching a new airplane fly over, going “nothin’ wrong with walkin’. Don’t like flyin’”. Or a guy seeing new antibiotics going “Make the kid walk it off.” In other words, even now, 8 months after he wrote his words, as Google and other sites add text search of audio files and people take radio shows (which have been around for decades by the way, and aren’t mentioned in Maciej’s timeline) and put them in podcasts so you choose what you want to hear when you want to hear it… we are seeing the things he claimed were the disadvantages of audioblogging turned around into advantages. His own joke is on himself.

Speaking of misguidedness, for all my liking the audio form as a means of expression, I contend quite heavily that there are a lot fewer podcasts than people are trying to puff up in the growing Podcast Industry. This is because, for example, I don’t count shows that basically play pre-recorded music in a set order. That knocks out a good amount of them. I also don’t count shows that are basically re-run professional productions, like FM/AM talk radio shows. I am COLLECTING these, make no mistake, but you’re basically photocopying already-existing material in both cases and then making them available on demand. That’s a little different than sitting down for an hour or half-hour regularly and talking about a set of subjects.

There’s also what I call the 4 Month Death Wall. This wall exists, basically, in all projects, not just podcasts. I’ve seen it in Zines, BBSes, high school bands, relationships, gardening projects and anything else that requires constant or semi-constant attention. You get to a point where you’ve been doing something for a while, it takes some amount of your time, and then it bumps up against the rest of your life. You make a choice at that juncture; and the vast majority of people choose to shitcan it.

After four months, you will have been doing a lot of work on your show. You will have been spending a lot of time before it preparing, and a lot of time afterwards cleaning up. And for a lot of people, the thrill has gone. At that point, they tend to hang up their mike, sit back, and get their life back. This contributes to what they now call “churn”.

I don’t see how the churn in podcasting will be any different. And that solves two problems for me: having a constantly growing collection grow exponentially, and keeping track of a specific site. I suspect as time goes on, my collection will have more and more dead feeds, missing any new updates until finally they 404 out and I have, trapped in amber, a little bit of online history.

One final thing before I finish this “small” update. This is easily the largest amount of data I’ve ever tried bringing in at once. It’s really stretching my mental muscles trying to keep track of everything, handle issues of bandwidth and sorting information via scripts. It is, if nothing else, an incredibly beneficial exercise. I feel like, doing this, I could go on to collect most anything. So there’s always that.


Documentary: Disc Art —

This is what the documentary disc art looks like:

Good stuff! It’s all done in Freehand, and that’s how the printers want it. I had to relearn Freehand from my nascent days as an art kid back in the early 1990s, when I was all about nothing else but cartooning, sketching, and using weird art programs to get what I wanted.

The design is simple because the box it goes in is complicated. I don’t need to mark the thing up with a billion little text boxes and warnings and copyright fru-fru; the box does that. Plus, somewhere, SOMEWHERE on the box, we needed that insane “Countdown” font that was on so much “computer” stuff in the 1970’s, just to reference it. Using it for the disc number made the most sense.

So there we go. I am officially out of the creative side, and now into the pure “after” work: talking to people, getting interviews, and generally letting the world know my little project is here.


Documentary: Disc #3 is working, Disc #1 Resubmitted —

Well, that’s some excellent news. Disc #3, which was submitted using the new DVD-R method, worked absolutely perfectly. It works both in my DVD players and in my computers, and the DVD-ROM section is browsable and working fine. I even listened a little to a few of my speeches, which are included in the DVD-ROM section. So, no problems.

For the folks at home, that means:

DVD #1: UNFINISHED
DVD #2: FINISHED
DVD #3: FINISHED

Now, I’ve burned out a working, functioning version of DVD #1 and am fedexing it out to the printers today, along with a $1750 check to re-do the glass master, and so on. Since it works fine in everything, and the process worked fine for #3 (which was many times more complicated than #1), things look good.

Artwork goes to them today for the Discs themselves; it’s rather simple, because I think complicated overwrought disc art looks silly. So it’s basically the outline of a phone and what’s on the Disc. Simple, easy.

As my free time is returning, I have more time to write essays and work on various projects, so expect more of that.


My Most Precious Collection —

I have a number of in-transit projects I’m always working on. Some of them are in very good shape, others are in states of disrepair and decay. It’s always a constant game to decide what needs my time and what can go another few months without my meddling. But in terms of in-progress sites, the winner is “The Last Straw”, which I have been working on for seven years.

I have kept it to myself, but I figure maybe people want to help me find additions to this precious, precious, collection. So let me explain what it is.

There comes a time in the lifespan of any creative project when it reaches a critical point; will it continue or will it be disbanded? This juncture is sometimes the case of being long in the tooth, or in danger of being obsoleted, or maybe, just maybe, because the maintainer/creator of the project has gone completely utterly batshit loco insane.

It is from this final set of circumstances that we will sometimes see The Last Straw Letter, a piece of correspondence or final note from this person that is intended, once and for all, to burn any and all bridges related to the project. In fact, not only must every bridge be burnt, but the roads leading to the bridge must be laced with landmines and the landscape around the roads limed in such a way so that nothing will grow there again.

It is, to me, the mind at full bore: the complete and total insanity, the love of the project mirror-universed into loathing hate; the kicks to the groin and pokes to the eye of everyone who ever (in the author’s eyes) abused or ruined the project with their actions or inactions.

I present to you, my tiny, tiny collection: the last straw.

If you must only read one, read the one that I consider the classic, the true and final standard by which to base the Last Straws of the future: The Nathan Mates Letter.

In this letter, Nathan Mates announces his withdrawal from the Apple II development community. In doing so, he attacks the members, the other developers, the pirates, and everything else in his path. That, itself, is fascinating in a car-wreck sort of way… but when he reveals that he has low-level formatted his hard drives to ensure no remaining piece of his work exists? That is, truly, sublime. It is a level of hate that one rarely sees in life, and it is laid bare here.

The other parts of the collection are nearly as classic; all of them should be read as if the writer is literally shaking their fist at the sky; at least, that’s how I do it. The bitterness, the gritted teeth, the flaring lips and eyes… it’s like candy.

Call it a guilty pleasure. Call it strange, weird that I would take such interest in this. But there you go, there it is. The Last Straw.


Disc #1 Re-Done, going to Printer —

The two options for submitting a DVD to the printer for duplication are DLT (Digital Linear Tape) or an actual burnt DVD. Dual-Layer DVDs (which is what my DVDs are) are somewhat new in terms of burning them at home, and while my printer indicated that they were sure they could take it in that format, I was skeptical about being one of the first. Of course, once the Ulead program died on the third Disc (which has a DVD-ROM section with thousands of photographs), I ended up sending it in that format anyway. But I sent the other two Discs in DLT format.

Well, I’m going to send the first in DVD format as well, since that allows me to test the medium before it goes out.

And it does work, perfectly, when I put it on this DVD DL. I’ve now tested it again, gone through it again, and it’s all fine. I even fixed an (extremely minor) captioning issue that was bothering me but which I was going to let go. So the thing’s a little better than when I was originally going to duplicate it. I guess that’s the positive side. I would have preferred not paying $1750 for that change, however.

(By the way, one of my buddies brought up donation in the comments of my previous entry about this DVD business. I wouldn’t suggest that. In fact, I would request you not do that. This thing is being sold, it is a business, and if you feel like “donating”, you should just buy a copy of it. In the early days of the production, back in 2001, I actually did put out a call for people to donate, and a bunch of people did; all those people are getting copies as a result, regardless of what they donated. It pays to be generous!)

I get the check disc for disc #3 tomorrow. If it checks out, that will leave just one disc left to approve before the printing begins in earnest. On the scale of a 4-year production, a week or two is close to margin of error, but still, I don’t like these little delays.


Documentary: Disc #1 fails, Must be Resubmitted —

Well now.

As you might have picked up, the process of doing the three DVDs inside the documentary set works as follows:


  • Create Content
  • Create DVD Master
  • Plant creates DVD “Glass Master” from DVD Master
  • Plant presses 5 example discs (“Check Discs”)
  • Jason plays Check Discs, approves
  • Plant makes one hundred billion discs
  • Discs go in packages and out to customers

All well and good. Disc #2 went through this process and is waiting at the “one hundred billion discs” step for the other two DVDs to arrive. So it’s done.

Disc #3 is still going through the “Presses 5 example discs” step.

Disc #1 came today, and I played it.

It is busted.

Oh, not just busted, “busted” is not the word to describe the situation. “Super Mecha Busted 3000” comes closer. It’s not just half-baked; it shot out of the oven and set the Sous Chef on fire and is somewhere in the main dining hall eating everyone’s hair. It is, basically, unacceptable. I will have to re-master and go through the process again, adding 5 days to the timeline.

Of course, I’d rather have 5 of this broken disc than 5,000. It will, however, add roughly two grand to the cost of printing. So I am not happy, in the general sense.

I would like, at this point, to thank Ulead Software, creator of Ulead DVD Workshop 2, for creating a piece of software that, should you want to make a nice DVD of photos of your cats, you’re in luck, but if you are in the mood to create a DVD that consistently works in, you know, DVD players, you’re going to go through the trials of Hercules. I would also like to issue fair warning to these folks that I will, from this point on, carry a pile of cream pies with me at all times and will be prepared to hit them in the face with said pies should we meet.

This is, of course, why one takes on “check discs” in the first place, so that these problems are found before the mistake becomes so out of control and rampant that my years of work suffer. But still, I thought I’d not be facing this stuff for a while and now I know what I’m doing for the next couple of days. It sucks.

While everyone’s waiting, I’d like to go ahead and share something special. When I opened to pre-orders, people who pre-ordered were able to submit a paragraph of text that would be placed on the DVDs for others to see. They are at the root level of the DVD-ROM portion of Disc #3. These paragraphs are in both HTML and text formats.

Here is the text version of the paragraphs.

A big thank you to all the people who sent me paragraphs, who believed in the project enough to pay with real money before a product was ready to ship, and for their patience.

We’re close. This was a setback, but a minor one.

It occurs to me that some people would want a more technical version of what went wrong. I will summarize this way: “Ulead is extremely poor, more than any software should be in 2005, at handling changes in digital assets it is referencing, including obvious changes in length or format. Because this disc configuration was the oldest and had the most changes, inherent flaws in Ulead’s mastering software made it unable to properly track MPEG and subtitle files accurately, leading to lost chapters, dropped subtitles, and missing graphics. They are poop.”


A Bright Day of Shovelware —

With the documentary’s day-to-day production out of my life, I can focus on getting back to a number of projects that were seriously back-burnered. The office I work in here has a bunch of large metal racks that basically function like a huge, piled “to-do” list and the list had gotten very physically large.

One of them is very simple and very satisfying; integrating another 30+ CDs into the cd.textfiles.com website. I don’t talk about that website much, because it’s basically a minefield, but a minefield I’m running for a bunch of good reasons.

The site contains CDs of shareware. Mostly BBS-era, but a few that postdate the heyday of dial-up BBSes. And some isn’t shareware. But they are CDs. Lots of them. Over 90 gigabytes of them, in fact. There’s a bunch of doubled files, but I’ve decided to let that go in favor of having the included CDs maintain their structure.

The Shovelware CD era came and went pretty quickly; about seven years from about 1991 to 1998, with a few before and after. It’s actually a situation that crops up again and again in subcultures and societies: people coming along and grabbing what they can, then repackaging and selling it back to the people they got it from. Basically, these are CDs containing programs, graphics, songs, and whatever other files the company could pull from dozens of BBSes, repackaged onto the then-cool CD-ROM medium, and then offered to BBS Sysops and users as a way to quickly get their hands on a bunch of files to offer for download. Because little regard was held for quality and a lot was held for quantity, these became known as “Shovelware” instead of “Shareware”, since the companies were basically trying to fill the nearly-700mb capacity of a CD-ROM so they could use the file count as a selling point. In doing so, they would grab nearly any cruft that any BBS had to offer that anyone would ever potentially want.

In other words, they unexpectedly became historians.

By grabbing any and all files, we end up with a range of captured aspects of BBSes that would have otherwise never existed: old versions of programs, writings from BBSes and services long gone, and even collected messages and photos from users who thought they were sharing things to a small group but who were in fact destined to share them with the world, forever. This is where I come in.

For quite some time, I have been trying to get my hands on as many shareware CDs as I can. Any. All. From anyone. Hand-done ones, commercial ones, forgotten ones, and for all sorts of platforms: Atari, Amiga, Palmpilot, PC, UNIX… whatever they have. I’m not picky. I’m whatever the exact and total opposite of picky is.

If, in fact, you have any shareware CDs, consider copying them and sending me copies. I’ll integrate them into the site as fast as I can. And that is, in fact, fast.

See, while on one side we have a bunch of old programs and demos written for platforms not in as much use, and with operating system changes that make any programs kind of not work…. we also have on these discs an entire snapshot of history at that time. The way people wrote, the priorities in their documentation, the references to the world around them…. what I find is that, like a library, you can go through the whole thing 100 times with 100 different agendas and find 100 different examples without overlapping. With over 800,000 files and counting, my Important Research Topic of the day likely exists to some level on these discs.

One day, I might be looking for a bunch of old BBSes. Another day, artpacks. Yet another, I might be trying to prove something existed before a company patented the (simple) idea. I’ve done all of these things at one time or another. And it all uses the same material.

There’s so much to learn here, and as I pile on CD after CD, growing my collection by 650mb a hit, I know that this will function, in some cases, as the only way to get to this information on the Internet. It’s a shame it’s not in more places.

This is part of what I am doing as a goal; putting up so much stuff that people are getting things out of it that I never dreamed of. Life shouldn’t be spent trying to accommodate a range of set possibilities; it should be lived so that possibilities you couldn’t dream of are welcome and find a home.


I Paid, You Get It for Free —

I mentioned on April 2 that there was an auction for a CD-ROM of 20 megabytes of message bases, and how I tried to get it donated but the seller was not interested. Well, I bought it. I paid $7.50 plus $2.00 shipping, so $9.50. For BBS messages.

And, as the proud new owner of this clump of text, I give it to you.

It compressed to 6 megabytes. You know, because it’s text and all. Text that shouldn’t have been sold in the first place, but it was and there you go. That’s $9.50 I absolutely hated spending.


Some Other Documentary Meta-Things —

This goes outside the news, so it’s here and not there. Another reason to keep this weblog in a reader or otherwise check it. That and, of course, my penchant for dropping entirely weird links.

I had considered for some time that when I got my check discs back, I would put a couple sets of the check discs up for auction, with the proceeds going to the Electronic Frontier Foundation. I am likely to do this, but only after all the DVD sets have shipped; otherwise, I am screwing over people who threw their money at me in a show of faith and respect months ago. So, expect an auction for them, likely signed and with original printed art, in the future.

I will be adding a series of dates when this documentary is showing or I am available to answer questions about it at showings. If people wish to book me or otherwise contact me about being at festivals or other such events, I am more than willing to discuss it. I’m jason@textfiles.com.

Similarly, I am available for pretty much any interview, radio show, or other such news/media situation, if people wish to talk to me. I’m pretty open to most any such venue or group.

I have been putting together trailers for the episodes, and other informative footage. That will be on there soon.

I thought I had more!


Documentary Update: #2 Done —

The first of three check discs arrived. Actually, the first SET of check discs arrived, of what
will be three sets. This is disc #2. Am I being confusing?


Let’s start again. The way that it works with DVDs is that you submit the raw data to the printer,
then they make a “glass master”, and then do a limited (very limited) run of 5 discs from that
master. They send it to you, and you play with it on your DVD players, make sure it runs, and then
say “Yes, I approve.” Then it goes down the line and they make thousands of copies from it. If you
don’t like it (because they did it wrong) then you tell them and they redo it. If you don’t like
it because YOU did something wrong, then you send them NEW raw data, and pay some money, in my case
about $1500.


I have three DVDs in my set. What I chose to do was to have them do just one of the three discs, so
if the fundamental problem was “they just don’t work”, I would only have to see the problem and
fix it with ONE disc, not THREE. So I’d be out $1500, not $4500. This adds a few days to the process,
but you see where I’m coming from, I hope.


So now that I like this one (I’ve been playing the disc in my players all day), I will be
approving the creation of the two other check discs. I will also be sending out another
check to the printer that will be the equivalent of buying a Chevy Aveo and pushing it off
a cliff. Although, I suppose, I do get the DVD sets for my trouble.


It’s kind of surreal to hold the DVDs. No weird DVD-R trickery, no purple side, no other stuff. It’s
an actual fully-made DVD. The production company I run that made this documentary is called
Bovine Ignition Systems. This is Disc #2. So on the inside of the discs, it says
BOVINEIS1B. Moo. I didn’t have any say in that, otherwise I’d have made some neat BBS reference.


One other nice feature is that since the DVD players aren’t dealing with a whacky burned
DVD-R and have an actual machine with pits and angles built in, it loads a ton faster and works
better. The Layer Break is there, in ARTSCENE, during a specific shot, but on half my players it’s
not there (they do the right thing) and the rest pause for varying amounts of time. I had no choice on this
disc, because it has 3 full episodes and a ton of bonus materials… in fact, Disc #2 uses
7.9 gigabytes of a total 8. That’s pushing it. But that’s what I do!