People, it’s time.
Actually, it was time probably 5 years ago, but better late than never. If you believe Clay Shirky, we can keep burning collaborative energy nearly forever, but let’s not waste much more than we already have, shall we? It’s like how bananas used to taste different and then we broke bananas and had to get new bananas to replace them.
As an information activist, I like it when stuff “happens” that brings into sharp focus a bunch of issues at once. It can get hella dreary explaining countless levels of intra-specific concepts that eventually deploy a meager payload of only relative interest to the masses. But one good clusterfuck, well, that’s worth a week of seminars.
We just had that happen with Wikipedia. This is a very simple introduction to what happened, but it’s still a fuckload of work to read all that. So, I’ll summarize thusly:
Fox News decided they were going to do a story on how much porn and adult material is on Wikipedia, in the form of images. As part of this, they started contacting Wikimedia/Wikipedia donors, specifically the big-ticket ones instead of the peons. This scared the fuck out of Jimbo Wales, so he started deleting hundreds and hundreds of images, any that might possibly cause any raised eyebrows anywhere in the western world. By the time people caught onto it, he’d done an enormous amount of deletion, and after a lot of infighting and debating, a percentage of the images are back, some are calling Jimbo a hero, and some admins are resigning in protest. Meanwhile, Fox News was able to release a story instead claiming victory for helping to purge Wikipedia of adult material.
So, listen. We could point out the problem is Jimbo Wales, and yes, he’s kind of a problem, in the way that your crazy perv uncle is a problem – you can’t predict him, he’s sometimes a lot of laughs but other times he’s creepy as all get-out. We could point out a problem in how the masses of admins and Wikipedia users reacted to this, with their endless discussions and finger-pointing and votes and whatever other bureaucratic bullshit they like to burrow into. We could even take the idea that Wikipedia itself is to blame, with the editing and the unrealistic goals and the claims to be censorship-free and all that.
But that’s not the problem. The problem is a different one, and it’s a problem we solved a long, long time ago.
Wikipedia is fucking centralized.
It’s on a bunch of servers that serve code to each other, most definitely. It has shared resources and it goes out to a couple datacenters and it’s got some level of redundancy, so it’s not on one server. That’s not what’s meant by centralized. I mean that one entity controls it, one entity has fiat, that entity makes decisions and the decisions lead to policy, actual policy, like policy in the code and the construction and the implementation of right and wrong, and it’s all centralized. That’s why one vandal, Jimbo, was able to do so much damage, so quickly. That’s why it took dozens, maybe hundreds to undo it. That’s a problem. It’s not a good thing.
Now, in the way that all entities that centrally control the domain will tell you the domain being centralized is awesome, I’m sure the Wikimedia Foundation and Wikipedia’s actual rulers will tell you this is good, that it means that Wikipedia is protected and it will live on, and those delicious, delicious donors will be able to do a one-stop-shop and give their dollars to a place and give Wikipedia a building and everything is great. That is what controlling entities do. It’s not evil, it’s not bad – it’s just their nature. It’s like getting mad at a dog that bites you – the dog bites things, and you dipped your hand in meat sauce 2 minutes ago. Don’t get angry at the dog – either stop dipping your hand in meat sauce, or don’t go near the dog for some time after you dip your hand in meat sauce.
If every Jeep Cherokee stopped working for 15 minutes at the same time all over the country and it turned out it was because a server crashed in Texas, we’d be concerned, right? We’d start asking some questions. When Boston’s single-point-of-waterpipe broke and the city had no clean water for a couple days, people started talking about redundancy. You know, once you have a pretty clear indication that something is wrong, you start to talk about solutions.
So let’s talk about solutions.
We’re lucky – the Wikipedia “problem” I’m talking about was solved years ago. It was called Usenet.
Usenet was a major solution to a problem it didn’t even know it had. It was founded in the beginning era of the general Internet, the network of networks where things were going on in all directions and there were almost no guarantees and almost no idea what it was going to all be about. All that was known was it had potential and could be really cool and really powerful. So Usenet went through a number of iterations and a bunch of fights and a whole lot of events, and guess what – it got shit done.
Now, let’s give a moment for people to say Usenet didn’t work, or that it’s caked with spam and bullshit, and it’s broken and a terrible model to base Wikipedia on, since Wikipedia works.
Bullllllllllllllllllllllllllllllllllllllllllllshit. Bullshit on all levels.
Usenet worked fine. It had a bunch of security issues because it assumed if you were big-cheese enough to run a Usenet node you were probably mature and sane enough not to do crazy insane things, an assumption that became less valid as more people showed up to the party and the barriers to entry dropped like pants at said party. And without a doubt, a lot of Usenet got overrun with spam, but a lot of Usenet had functionality to deal with spam, including on both the reader and the server side. It was a known problem. Also, it was an attention crash issue – once people left servers running without maintenance, spam increased, just like entries on Wikipedia increase in spam and problems when they lose the attention of folks. Stuff can sit on Wikipedia for days, months, years – until it’s fixed. Compared to the situation that Wikipedia has a single point of failure in terms of presentation and control, Usenet’s functionality concerns and issues are a rounding error. Usenet worked fine. Usenet works fine. It’s not the big hot thing – but the servers go up, and they stay up.
Critically, Usenet was decentralized. Different Usenet servers gave changes to each other – they provided articles to each other, some sections had moderation, others did not. The protocol was designed to assume there would be dozens, hundreds of places around the internet, some of them accessing only a few times a day or week – and the changes would go to them as resources permitted. As a result, some servers had a wide range of postings, and a long retention rate – others could barely keep up and not using them for a few days meant you missed stuff. People wrote indexing and archiving utilities to cleave off what was needed and let a person seeking information find it. Backups happened, that years later resulted in us having decades-old saves of Usenet articles. Seriously, this is good stuff. We learned a lot.
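The propagation idea above can be boiled down to a toy model. This is a minimal sketch, not real NNTP: the `Node` class, its methods, and the message IDs are all made up for illustration. The only thing it is meant to show is that servers hand each other whatever the other side is missing, so an article can reach a server that never talked to the place it was posted.

```python
class Node:
    """A hypothetical news server holding articles keyed by message ID."""

    def __init__(self, name):
        self.name = name
        self.articles = {}  # message_id -> article body

    def post(self, message_id, body):
        """Accept a new article locally."""
        self.articles[message_id] = body

    def sync_from(self, peer):
        """Copy every article the peer has that we lack; return how many."""
        missing = [mid for mid in peer.articles if mid not in self.articles]
        for mid in missing:
            self.articles[mid] = peer.articles[mid]
        return len(missing)


# Three servers in a chain: c never talks to a directly.
a, b, c = Node("a"), Node("b"), Node("c")
a.post("msg-1", "first post")
b.sync_from(a)  # b checks in whenever it gets around to it
c.sync_from(b)  # c picks the article up second-hand
```

Syncing is idempotent here: run it again and nothing new moves, which is what lets a server that only connects a few times a week catch up whenever resources permit.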
With Wikipedia, we forgot it all.
Now, not to say that it’s all Wikipedia’s fault – webservers worked in this “one server gives out the info, oops, that one server is gone” way as well. FTP servers, also, worked in this way. To one way of thinking, Wikipedia was just following the trend, the dominant paradigm.
But for both FTP and webservers, mirroring mitigated the situation: the various FTP servers or webservers had ways to check on their “masters” and update themselves accordingly. If the “master” disappeared or was overrun, the mirrors were right there to save shit. And again, mirrors weren’t just for when the master was down for repairs. Censorship, shutdowns, fights, politics… all of these disasters were reduced in scope with Usenet, with mirroring.
Wikipedia forgot that little bit, that not all disasters are code-based, not all downtimes are hardware based.
I therefore propose DistriWiki, a set of protocols and MediaWiki extensions that push out compressed snapshot differences of Wikipedia’s content and which allow mirror MediaWikis to receive these changes and make decisions based on them.
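To make “compressed snapshot differences” a little more concrete, here is one way it could look, using nothing but the Python standard library. This is a sketch under assumptions: the function names `make_delta` and `apply_delta` and the JSON delta format are invented for this example, and no such DistriWiki protocol exists yet. The master computes what changed between two revisions, compresses it, and a mirror rebuilds the new revision from its own old copy.

```python
import difflib
import json
import zlib


def make_delta(old, new):
    """Return a compressed delta that turns the text `old` into `new`."""
    old_lines = old.splitlines(keepends=True)
    new_lines = new.splitlines(keepends=True)
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    ops = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # The mirror already has these lines; just point at them.
            ops.append(["copy", i1, i2])
        else:
            # replace / insert / delete: ship only the new lines (maybe none).
            ops.append(["insert", new_lines[j1:j2]])
    return zlib.compress(json.dumps(ops).encode("utf-8"))


def apply_delta(old, delta):
    """Rebuild the new text on a mirror from its old copy plus a delta."""
    old_lines = old.splitlines(keepends=True)
    out = []
    for op in json.loads(zlib.decompress(delta).decode("utf-8")):
        if op[0] == "copy":
            out.extend(old_lines[op[1]:op[2]])
        else:
            out.extend(op[1])
    return "".join(out)


old_rev = "Usenet is a discussion system.\nIt is decentralized.\n"
new_rev = "Usenet is a discussion system.\nIt is decentralized.\nIt still works.\n"
delta = make_delta(old_rev, new_rev)
rebuilt = apply_delta(old_rev, delta)  # equals new_rev
```

The mirror gets to look at the delta before applying it, which is the whole point: receiving a change and obeying a change are two separate steps.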
Imagine a world where this happens.
Imagine a world where the main Wikipedia would issue a deletion out to servers around the world, and some would follow it, and some would not? Each mirror would carry its own set of rules, like “do not automatically delete any article that is more than 100 days old” or “do not delete this subset of articles under any condition”.
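Those two example rules are simple enough to write down. The following is a minimal sketch, assuming a mirror knows each article's title and creation date; `Article`, `MirrorPolicy`, and `accepts_deletion` are hypothetical names for illustration, not part of MediaWiki.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Article:
    title: str
    created: datetime


@dataclass
class MirrorPolicy:
    # "do not automatically delete any article that is more than 100 days old"
    protect_older_than: timedelta = timedelta(days=100)
    # "do not delete this subset of articles under any condition"
    never_delete: set = field(default_factory=set)

    def accepts_deletion(self, article, now):
        """Decide whether this mirror follows a deletion issued upstream."""
        if article.title in self.never_delete:
            return False
        if now - article.created > self.protect_older_than:
            return False
        return True


now = datetime(2010, 5, 10)
policy = MirrorPolicy(never_delete={"Protected article"})
old_article = Article("Old article", now - timedelta(days=365))
new_article = Article("New article", now - timedelta(days=3))
follows_old = policy.accepts_deletion(old_article, now)  # False: too old to auto-delete
follows_new = policy.accepts_deletion(new_article, now)  # True: mirror goes along with it
```

A different mirror would simply ship a different `MirrorPolicy`, and the upstream deletion becomes a suggestion instead of an order.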
Imagine a world where these little Wikipedia mirrors have their own subsets of Wikipedia space that are different from Wikipedia, where thoughts other than the grey goo consensus of Wikipedia rule the day; where a separate “article space” exists there, which can be shared on other Wikipedias at will – demoscene space, muppet space, all the crap that Wikia offers in a commercial setting, except now being done by various vendors and non-profits, and not reliant on a single point of political failure?
Imagine these Wiki variants existing:
- PuritanWiki: Nothing with anything adult-oriented ends up being covered – people can send their kids to browse it for education, with whatever other nanny-tistic approaches they want, and not be worried that their children will ever discover other people have genitals or how they can protect themselves from pregnancy any other way but never having sex, and they’ll never figure out who Hugh Hefner is.
- ScienceWiki: Perfect for people trying to find out about scientific information without having every single link end up somewhere between Deep Space Nine and Red Dwarf.
- FandomWiki: Every single last piece of every last pop culture world lives and breathes and may be the stupidest thing you can imagine, but people who want this are in heaven. Wikipedia may have long ago deleted every reference to every fake element in your favorite sci-fi show, but it lives on in this space.
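Each of the variants above is, at bottom, a filter over the same pool of articles. Here is a rough sketch of that idea, assuming every article carries a set of category tags; the tags, the sample articles, and the `select_articles` helper are all invented for this example.

```python
# Each hypothetical variant is just a rule about which category tags it accepts.
VARIANTS = {
    "PuritanWiki": lambda cats: "adult" not in cats,
    "ScienceWiki": lambda cats: "science" in cats and "fiction" not in cats,
    "FandomWiki": lambda cats: "fiction" in cats,
}


def select_articles(variant, articles):
    """Return the sorted titles a given variant would choose to carry."""
    keep = VARIANTS[variant]
    return sorted(title for title, cats in articles.items() if keep(cats))


# Made-up sample pool: title -> category tags.
POOL = {
    "Photosynthesis": {"science"},
    "Dilithium (Star Trek)": {"fiction"},
    "Hugh Hefner": {"adult", "biography"},
}

puritan = select_articles("PuritanWiki", POOL)
science = select_articles("ScienceWiki", POOL)
fandom = select_articles("FandomWiki", POOL)
```

Nobody has to agree on one rule. The same feed goes out to everyone, and each mirror keeps what fits its own idea of what a wiki should be.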
Please don’t tell me this is technically impossible. Go back to school. It is not just technically feasible, it’s nearly trivial. There’s been so much advancement in compression, difference tracking, and network protocol hub-dubbery that this is the kind of project that could be done in beta by CS students as a final project. It would have scale and bugtracking issues, but it would work. Don’t even tell me that in a world where people can use GMail as a filesystem or jam with people in realtime or use processing over the web or any of a thousand other miracles we see every week, we can’t handle this. We can. The hurdles are political and mindset-related. Wikimedia isn’t going to want this – it’s more work and lessens control for their non-profit, without realizing that with collaborative networking comes competitive quality, and they merely have to maintain being the best to stay ahead and validate the millions. Jimbo Wales will be against it, because Jimbo is in it for Jimbo and something that takes control out from under Jimbo is not going to serve Jimbo. And Jimbo doesn’t like that. But these are minor hurdles in many ways, too.
If this had been in place, Jimbo doing this deletion binge would have been a minor setback – the mirrors would have retained or not retained the pictures and information, and the choices made by someone in the heat of fear for his little PR outlook would have been ignored or followed – but you know which servers I’d have wanted to be on. As further information fads infect Wikipedia (“oh my god, we need to delete everything about gay people or the United Arab Emirates will stop donating money”) , this decentralized, mirroring, robust and variant ecosystem of interconnected Wikis would resist them, like the diseases they are.
Look up the history. Or don’t, and trust me.