FTP’s Bright Sunset and Frozen Night
FTP is kind of over.
Now, don’t get mad at me for telling you this. It’s not like I’m the one killing it, and I’m certainly one of its biggest fans. It’s a really mature technology that does exactly what it’s supposed to do. It is flexible, pumps through almost anything, and has features that do everything you probably want to do with file transfer.
And when I say over, I don’t mean obsolete. And I certainly don’t mean unused in the present day. Many things still use FTP as a method of transferring files and providing access to all sorts of material.
But it’s quite obvious this isn’t the way of the future. Companies and individuals that utilize constantly changing data, or data that needs to be distributed, utilize a whole other variety of technologies. Many of these are web-based, while others are special protocols that blow out over the web into devices and phones. If it’s starting up, and it needs to get you some data, it is probably not using FTP. If it is using FTP, it’s probably not telling you it’s using FTP. And people who need to get things done don’t reach out for FTP.
But more markedly than FTP the protocol, it is FTP sites in particular that are really on the way out.
Coming as I did from the bulletin board system era, a well-populated BBS might have a few minor text files, followed in the early 90s by a CD-ROM in a drive, and maybe topping out with a stack of CD-ROMs for a few hundred megabytes of accessible disk space. There were a notable handful of massive bulletin board systems that had much more data, but these were unbelievably rare and often cost a monthly fee.
Compare, then, the experience of an FTP site on the growing pre-web Internet, which would have thousands or tens of thousands of files that dwarfed anything you could get through a BBS. The names of these sites were extremely obscure, reflecting host names of systems within departments and shoved into some dark worthless corner of a science lab.
Even though the name didn’t tell you what the contents were, these sites became so populated and so important that their names became synonymous with what they held. ftp.luth.se. ccnga.uwaterloo.ca. ftp.funet.fi. ftp.uni-stuttgart.de. Trust me – you know what these things were for, and what they held. They were stunning in their power and they were the true online libraries of their time.
This summertime of the FTP site dominated the 90s and ran into the early 2000s. Support files, drivers, game demos, hilarious films, browser executables, pictures, you name it. Many of them became fairly organized shambles, containing thousands of files on some obscure aspect of online life. There were even websites to help people navigate these FTP sites and find what you needed. A number of FTP search engines existed, although they generally required you to know the filename rather than what the files contained or represented.
Many of these FTP sites did their best to join the World Wide Web and its unique needs, littering themselves with index files and HTML overlays. The Gopher format allowed superior methods of browsing the information. But as Gopher fell out of favor, and the major browsers stopped supporting it, it became another doomed way to navigate.
Finally, we had the experience of various FTP sites going down, and mirrors of those sites becoming subdirectories of the remaining ones. This Russian nesting doll situation has grown ludicrous, to the point of some sites being amalgamations of dozens of previous ones. Besides a few other navigational headaches, this also means that the loss of a single FTP site in the modern era could actually be the death knell for hundreds.
For the past couple of years, but really picking up intensity in the past few months, Archive Team has been aggressively downloading these FTP sites. We are not pursuing only at-risk FTP sites – we are in fact considering all FTP sites to be at risk at this point. If it’s on FTP, it’s probably doomed.
The archive.org FTP site collection is now in the hundreds of gigabytes and is growing constantly.
Naturally, when somebody puts up a block of data like this, it doesn’t take too long for the “you missed a spot” nerds to show up and start critiquing randomly.
The most valid and yet invalid argument is that currently these FTP sites exist on archive.org as massive tar archives or zips. “Give us our FTP sites back like they used to be,” cry the people who cry about such things. Well, sorry. That ship has sailed. I think that ship is on fire. Oh well, that ship is actually now burning other ships.
Instead, they should be considered what they are: cryogenic capsules of masses of data, waiting for the sort of curation, extensibility, and data mining efforts that so many of our computers are becoming so good at. They can be split apart, refactored into new ideas, or even pulled back into some mega FTP site of the future. By making the clearest, least fiddled-with archives of these FTP sites, we give the future infinite options. Anything else would be kind of silly.
We are ramping up faster and faster to do this. FTP sites die quiet deaths – a letter to faculty, a dropped connection. They go quite gently into the night.
Buried on these sites is proof of versions of software that others claim never existed, unique pieces of art that lived only on bulletin boards but were pulled up as refugees in the early 1990s, and even one-of-a-kind pieces of code that might otherwise have disappeared. This is to say nothing of the drivers, support documentation, configuration programs, and other parts related to hardware and software now not just obsolete but potentially forgotten. The value is obvious.
As I and others spend this new year gathering up all of this data, I look forward to projects coming along that utilize it or reference it. That’s kind of why it’s done. Sure, I myself would love the ultimate FTP site providing me every piece of the 1990s computer world for reference and utility. But you have to have the data, to make it pretty.
Combined with the efforts to grab every piece of CD-ROM plastic that has ever existed, I hope the plans are clear. The world has lost so much of what has come before. But it won’t this time.
Not on my watch.
Categorised as: computer history | Internet Archive
You have such an awesome job.
Man, you’re right… I had all but forgotten about my nights poking about on ftp in the 1990s. The early days of the WWW, but this was still a thing I did — you could find software, interesting documents, ascii art and strange pr0n and funny .gifs and even stranger things, mostly arranged in subdirectories… oddly mostly by file type rather than topic.
Being an especially novice computer user at the time, I only had the dimmest understanding of what I was actually doing. I found an FTP app called Fetch somewhere on a file server on campus, and it had built-in connection profiles to a bunch of large sites, and I just started randomly connecting to them one at a time, almost like reading a dictionary or phonebook out of boredom, and tried to see what I could find. It was like sifting through someone else’s attic, only it was full of filing cabinets, filled with the most random interesting things. And you had no clue what something was, other than what its file name, type, and size told you. There was no preview, just a tantalizing line of text telling you vague promises of what it was. And then there were the privileged subdirectories… who knew what was in them? Sometimes there’d be too many users and you couldn’t get on, or the connection would die in the middle of a transfer and you’d watch the progress status report ever-diminishing throughput. Or you’d get booted for using too much bandwidth or leeching or using someone else’s account… if only someone back then would have taught me how to get in contact with the administrators of those sites. And I didn’t really know about IRC or anything. It probably would have been a lot of fun to chat with other people who used those sites. Especially the uploaders. I always just browsed anonymously… The internet was Free back then!
Later on, I found FTP useful for finding freeware and shareware without having to deal with all the BS clickbait mazes of HTTP-based download sites like CNET and Tucows and the like. It was way easier to go to ftp.softwarecompany.com and find stuff, especially outdated versions that did something cool that they removed in some later version. But I was always hunting for something specific back then, never just browsing idly and discovering wonders.
I haven’t just browsed ftp in ages, and I really only use it now in managing my own sites. Lol, I just googled fetch, and it still exists?! I use filezilla these days.
Thanks for the memories. And the preservation.
“The names of these sites were extremely obscure, reflecting host names of systems within departments and shoved into some dark worthless corner of a science lab.”
I’m a huge fan of your work, Jason, so please allow me to make one correction, which actually strengthens your point. Laboratory space is anything but worthless; it tends to be hotly contested.
As you point out, a lot of these FTP servers are machines owned by research groups (a lot of them physics, my personal academic briar patch). The reason that they’re so long-lived at all is that they don’t live in the University’s front-line server rooms, where hardware gets replaced every few years. Each is instead a machine owned by a research group that gets re-purposed by a young professor or, more likely, a graduate student; then it becomes a file server for other things, and then its role expands because it’s a convenient place to stick files where they’ll always be available.
These systems often sit in the corners of laboratory space, which I assure you, is immensely valuable. It’s always hotly contested because people need space with industrial power and lighting and network for their research that allows them to have equipment in a half-assembled state while students work on it. This sort of space also has the excellent characteristic that generally people will go greatly out of their way to not touch a piece of equipment if it’s plugged in and running, in case it’s taking or logging or monitoring data in some way for someone’s research (or dissertation experiment).
While the group is active and working, this usually means that such servers are long-lived and well protected. But funding changes, priorities change, and at some point some young new keen professor needs research space and one of the labs needs to be cleaned out. Someone forms a space committee that feels it needs to oversee what all the labs are doing, at which point a winnowing out of research space happens and actual time is taken to figure out which chunks of space are actively being used. At that point, a graduate student who inherited an old server from another graduate student may not have the clout to preserve it, and since they’re just trying to get their dissertation done, they let it go because they have their own battles. (I’ve fought and mostly lost this battle many times as a graduate student and as a professional.)
So you’re absolutely right, Jason. These old pillars of the FTP world are crumbling. But it’s not because the science lab space is worthless, it’s because it’s hotly contested. So assuming that the average PhD graduate student “generation” is about 6 years, someone who’s now finishing their degree might have inherited a prominent FTP server from the graduate student before them who started it about 12 years ago, and may well not care to spend the time to move it onto yet another physical machine to keep it going.
(A side note–I don’t even remember what it stored any more, but I do remember that in the early FTP days of the internet, wustl.edu was one of the big early sites in the days of gopher. Many (more than ten) years later I realized that Washington University in Saint Louis is where my parents-in-law went to college.)
It was meant to be a turn of phrase regarding how they would shove the machines into spaces that human beings and other equipment wouldn’t fit, but I’ll happily take it as an invitation for a first-person account!
FTP is pretty much dead as something used by end-users to obtain files, but it still has its uses for developers needing to put things on servers that were developed on local machines. (Though the popularity of content-management platforms means that more content now is created directly on the server through a web interface, but there are still lots of things I maintain, both personally and at work, that are still sent up to servers by FTP.)
Yes, FTP, another technology with what is, by today’s standards, a big problem: it did its job effectively. A few commands and the file goes to your computer. No dozens of spammy download buttons, no site/OS-specific download managers (someone should reverse-engineer them, for archiving reasons of course 🙂 ) or advertisement pages before the file link.
But it’s still used in some intranet applications, as other systems have big incompatibilities between implementations, especially in different operating systems.
I remember some good FTP sites in the 90s. Gaming fans will probably remember 3drealms; I also remember a server called “Monash” which hosted palmtop applications in one of its subdirectories. In Poland, Commodore 64 scene fans had the Elysium server (I even have a copy).
This is great. All those old FTP domain names are so evocative: wustl.edu! Names I learned before I even understood the geography of the U.S. I remember the cyberpunk thrill of knowing my 386 was connected to an FTP site at the University of Mexico, and the time my college roommate spent the night downloading GIFs from the Playboy FTP archive. (He claimed he was doing it to learn high-tech skills for an FBI internship.)
I am assuming you are doing the same for Gopher sites? I remember gophering from my freshman dorm room to a server in Canada and running a program from an AI lab, a conversation sim.
When the WWW became popular a few months later, my first reaction was that it seemed dumb and redundant when we could just grab what we needed from FTP servers.
FTP suffered like TELNET, sending cleartext passwords over an ever-more-spied-on internet link. But yes, the wealth of pre-web information available on those sites was staggering, if not sufficiently organized and double-clickable.
Standard FTP certainly had that problem, but SFTP has become a widespread solution to that problem, much like SSH took over from TELNET. There was a period where there were competing encrypted FTP standards, but SFTP seems to have won out (the other standards are still extant, though more and more rarely used).
I agree with Mr. Jason– the issue that marginalized FTP was a difficulty in tagging files with metadata. Filenames do not good metadata make.
I’ve been thinking about this same problem a lot lately. I’m trying to use a Raspberry Pi as a music server. Most of the applications I’ve found for it (Ampache, Jinzora, a few others) are sort of heavy for an RPi. They also tend to lock you into a narrow set of client applications, and are universally horrible at handling updates to the music collection (“You added a new album! Great! Wait two hours while I rebuild the metadata from scratch!”). I’m thinking about a good alternative, and it would be a two-part thing: the first part is a program that runs on the server to compile the music collection’s file list and metadata. It sends that data along as a single SQLite file to part 2: the client-side music player. The client-side music player then reads the SQLite file and lets the user build a playlist with it. Once the playlist is built, it translates that into a list of files and grabs those files from the server.
Once I get that going, I need to find a widely-used, reliable, relatively lightweight Secure File Transfer Protocol.
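The server-side half of that design can be sketched pretty compactly. Here’s a minimal, hypothetical version using only Python’s standard library — the function names (`build_catalog`, `read_catalog`), the schema, and the extension filter are all my own assumptions, not anything from an existing project:

```python
import os
import sqlite3

def build_catalog(music_root, db_path):
    """Walk music_root and record every audio file's relative path, size,
    and modification time in a single SQLite file the client can download."""
    conn = sqlite3.connect(db_path)
    conn.execute("""CREATE TABLE IF NOT EXISTS tracks (
                        path  TEXT PRIMARY KEY,
                        size  INTEGER,
                        mtime REAL)""")
    conn.execute("DELETE FROM tracks")  # full rebuild; the catalog is tiny
    for dirpath, _, filenames in os.walk(music_root):
        for name in filenames:
            # Assumed extension filter; adjust for your collection.
            if not name.lower().endswith((".mp3", ".flac", ".ogg")):
                continue
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, music_root)
            st = os.stat(full)
            conn.execute("INSERT INTO tracks VALUES (?, ?, ?)",
                         (rel, st.st_size, st.st_mtime))
    conn.commit()
    conn.close()

def read_catalog(db_path):
    """Client side: read the downloaded catalog and return the relative
    paths available for playlist-building and fetching."""
    conn = sqlite3.connect(db_path)
    paths = [row[0] for row in
             conn.execute("SELECT path FROM tracks ORDER BY path")]
    conn.close()
    return paths
```

Because the catalog is one flat file, an update is just a re-walk and a re-download of a small database, rather than a server-side metadata rebuild — which is the whole point of the design described above.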
Old protocols never die; they get shoved into the backend.
Here’s an interesting article on why FTP Must Die, which serves as an introduction (a biased, pissy, snarky introduction) to the protocol: http://mywiki.wooledge.org/FtpMustDie
It’s always interesting to look at the things that got our predecessors angry. Some religious groups are only known through the writings of the people who fought against them, for example.
I *still* remember the IP of the earliest FTP sites I mined extensively, because there was no proper TCP/IP stack for DOS (hey, it was 1994/5…). nic.switch.ch, 18.104.22.168. You would log in, you switched directories by carefully typing commands on the terminal and… hey! Doom levels. Undocumented Windows system calls. Clever Pascal tricks (SWAG, anyone?). Free pen-paper role-f*cking-playing games, in the glory of ASCII text, ready to be printed.
20 years later, I still use FTP… but as a mere system tool, not an information retrieval system. Yet I look at these times with a small tear of nostalgia, since I was 20 years younger.
Keep up the AWESOME work. It’s one of the legacies to pass on to our sons.
How can I help and where do I sign up?
I have terabytes of unused storage and a machine that runs 24/7. Downloading porn can wait.
Given the obscurity of FTP sites, I can’t think of an effective way of single-handedly automating the process.
If you could tell me how I might be able to help, I would really appreciate it.
(I barely remember dial up so you’ll have to forgive my lack of knowledge around this issue, but that’s one of the reasons I’m so interested.)
I’m really happy that this is being done as I learned this lesson the hard way. There used to be a great ftp site hosted by UK ISP Demon which had a load of great win31 era software and drivers. I went back to it about a year ago and it had only just shut down – I wish I had archived it earlier, but I took it for granted. Best of luck for your archiving task for these remaining FTP sites.