An in-depth look at how Wikipedia's technical infrastructure evolved from a single server to a global platform, exploring the human stories behind the code, the social dynamics of open-source development, and the ongoing tension between community governance and centralized control.
The story of Wikipedia's technical evolution is not merely one of servers and software, but of human ambition, trust, and the complex interplay between code and community. From its origins as a hobby project on a single server to its current status as a global platform operating across multiple continents, Wikipedia's infrastructure has been shaped by thousands of contributors whose individual efforts, as former Deputy Director Erik Möller noted, created an environment where "brilliant people found a niche in which they individually made incredibly impactful contributions."
{{IMAGE:2}}
The Foundations of Trust
In the earliest days, access was remarkably informal. Brion Vibber, the Wikimedia Foundation's first employee, recalls that "if you showed up and put in good work helping out, you might well hear 'yes' to getting some fairly direct access to things because that was the only way we were going to get anyone to do it!" This assumption of good faith extended to the source code itself, maintained in a CVS repository on Sourceforge.net. Gabriel Wicke, later Principal Software Engineer, explains that "getting revision control access... was about winning trust with whoever set up accounts." Tim Starling, current Principal Software Architect, received CVS access from Lee Daniel Crocker as soon as he expressed interest, though there was no pre-commit review. Since the code on the server wasn't automatically updated, commits were at least in theory reviewed before going live, a process that would be unthinkable in today's development environment.
The path to root access, however, was more arduous. Starling remembers "a ridiculously long and painful period between getting shell access and getting root, like six months." During this time, he had read/write access to the database, could edit code, and view server logs, but root privileges remained elusive. Domas Mituzas, former system administrator and WMF board member, took a more personal route: "It took me a bus trip to Berlin and sleeping on a German Wikipedian's couch and meeting everyone... that eased everyone into the idea of giving me root."
Scaling Challenges and Creative Solutions
By 2003, Wikipedia's growth had outpaced its infrastructure. Running on only two servers, the site rendered every page view from scratch. A "then-fancy 64-bit Opteron DB server upgrade helped briefly, until it started crashing," Wicke recalls. The site was often down, and it was clear that any further growth would quickly consume whatever hardware had just been added.
{{IMAGE:3}}
Wicke proposed adding caching using Squid, posting his proposal to the community and receiving feedback from Jimmy Wales, Vibber, and Jens Frank. He then prototyped the integration on his own servers, which would serve the main site by February 5, 2004. "There were issues of course, like missing purges for images or transclusions," he says. "Those were fairly quickly resolved or worked around, and there was a lot of tolerance given the preceding phase of poor site availability."
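The "missing purges" Wicke mentions point at the core bookkeeping problem of putting a cache like Squid in front of the wiki: an edit invalidates not just the edited page, but every page that renders it, such as articles transcluding an edited template. A minimal, purely illustrative sketch of that bookkeeping (not Wicke's actual code; the function name and URL scheme are invented for this example):

```python
def urls_to_purge(title, transclusions, base="https://en.wikipedia.org/wiki/"):
    """Return every cached URL invalidated by an edit to `title`.

    `transclusions` maps a page name to the pages that transclude it.
    In a real setup, each returned URL would receive an HTTP PURGE
    request so the cache drops its stale copy.
    """
    affected = {title}
    # Pages that include the edited page render stale content too.
    affected.update(transclusions.get(title, []))
    return sorted(base + t.replace(" ", "_") for t in affected)
```

Forgetting the second step, purging the transcluding pages, is exactly the kind of bug that shows up as pages mysteriously serving outdated content.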
The true test came on February 25, 2004, when Wikipedia was featured in the German news program Tagesthemen. Watching the live traffic stats, Wicke and others "got all excited on IRC when the site briefly went from ~25 [requests per second] to around 1,500 without falling over."
The Hardware Journey
Before 2004, Wikipedia's servers were managed by Bomis, a dot-com startup. Jason Richey, a Bomis employee, managed the hardware and would occasionally log in to restart things or fix downtime. Sometimes this involved literally going the extra mile—he lived in Los Angeles while the servers were in San Diego. Starling remembers Richey "having to drive 4 hours to San Diego to fix downtime caused by a really simple problem, like a broken hard disk."
{{IMAGE:4}}
Some tasks required his intervention that today seem unthinkable. "My favorite memory from the very early Bomis days is that if you wanted to upload an image, you emailed a guy named Jason who would helpfully place it on the server for you," Möller says.
In 2004, Wikipedia moved to a datacenter in Tampa, Florida, to be closer to Wales's new home. "I believe Wales helped to rack the first batch of servers in Tampa," Starling says. A year later, the Board named Mituzas the Hardware Officer, putting him in charge of, as he describes it, "placing servers in a shopping cart and then asking Wales to pay for them."
Instead of ordering servers one by one, Mituzas tried a more exponential approach—buy 20, then 40, then 80. "Each time we'd land those batches the site would get much snappier and within a few weeks we'd have more users to fill all the capacity," he says. "We bought cheap servers that needed hands in datacenters to do anything with them, but we had the capacity to survive the growth."
When others in the WMF's leadership wanted to use some funding to pay other bills, Mituzas pointed out that if the site wasn't up, there wouldn't be any other bills to pay. The challenging part of the role was being the first to grab Wales when he came online; paying close attention to IRC notifications was key, because otherwise the servers would never get ordered.
Growing Pains and Contingency Planning
Between rapid growth in traffic, insufficient technical resources, and constant feature implementation, Wikipedia was down rather frequently. Mituzas recalls one outage when nearly all of the developers met for the first time in Berlin. "Kate, who wasn't at the meeting, deployed Lucene-based search that nobody knew about and thus we were trying to understand why is Java running on our servers and why is it taking everything down."
Other times, developers created contingency plans in response to real-world events. During Hurricane Charley, Starling (who lived in Australia at the time) made off-site backups of Wikipedia's data as if there was a chance of Tampa being flattened. "I guess it's nice that we were making off-site backups of private data for the first time ever," he says.
In 2005, tripped circuit breakers took the site down. It took developers a full day to restore editing after the primary database became corrupt. Starling recalls that Wales quipped that "downtime is our most profitable product."
The Deployment Revolution
Changes were often queued up for months at a time before being deployed to the wikis in a single release. Eventually, a breaking point was reached. In early 2011, developers scheduled a six-hour window to attempt to deploy the 1.17 upgrade. "If we couldn't fix things within those 6 hours, we would roll back to 1.16," explains Roan Kattouw, current Principal Software Engineer.
The first two attempts failed, with the site going down at times and major issues that couldn't be fixed quickly enough. It took about three tries to get it working "successfully"—meaning the site was up and stable without critical issues. "One of the issues I remember us encountering was all redirects being broken on French-language wikis," Kattouw says. "Today, that kind of issue would be considered a major problem and a train blocker, but that day it was so far down the priority list that we left it broken for about 12 hours."
Soon after, developers began working on "heterogeneous deployment," allowing for progressive deployments. "This way we can deploy a new version to only a few small wikis first, and work out the kinks before deploying it to larger wikis," Kattouw explains. "We were able to accelerate this over time, and nowadays the deployment train runs every week, with major wikis getting new changes only two days after the first test wikis get them."
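The scheme Kattouw describes can be sketched in a few lines: wikis are bucketed into groups, and each group receives the new version on a later day. The group contents and two-day spacing below are illustrative, not the Foundation's actual configuration:

```python
# Progressive ("train") deployment sketch: small test wikis get a new
# MediaWiki version first, the big Wikipedias last, so problems surface
# where the blast radius is smallest.
ROLLOUT_GROUPS = [
    ["test.wikipedia.org", "mediawiki.org"],    # day 0: test wikis
    ["commons.wikimedia.org", "wikidata.org"],  # day 1: non-Wikipedia sites
    ["en.wikipedia.org", "de.wikipedia.org"],   # day 2: the large Wikipedias
]

def version_for(wiki, days_since_cut, new="1.17", old="1.16"):
    """Which version a wiki runs, given days since the new branch was cut."""
    for day, group in enumerate(ROLLOUT_GROUPS):
        if wiki in group:
            return new if days_since_cut >= day else old
    return old  # unknown wikis stay on the old version
```

A bug like the broken French-wiki redirects would now be caught while only the first group runs the new code, and the train halted before it reached the larger wikis.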
Expanding Functionality and Controversy
Wikipedia originally ran on UseModWiki, written in Perl. Magnus Manske, a biochemistry student at the time, wrote a new wiki engine in PHP to allow for adding more Wikipedia-specific functionality. The "PHP script," as it was known, added features like namespaces, user preferences, and watchlists. It would be officially named "MediaWiki" when it was rewritten by Lee Daniel Crocker.
{{IMAGE:5}}
Other features taken for granted today, like an autogenerated table of contents and section editing, were controversial when initially introduced. "As I recall the table of contents feature was a bit more contentious (no pun intended), mostly because of the automagic behavior," Möller says. "With section editing, the first visual design was a bit cluttered, but I think most people could fairly quickly see the appeal."
In other cases, editors forced developers to add features. Carl Fürstenberg, a Wikipedia administrator, created {{qif}}, which allowed for conditional logic in templates. "At one point I realized that the parameter expansion logic could be 'misused' to create a way to inject boolean logic into the, at the time, limited template syntax," he says.
The developers weren't pleased. Starling wrote to the wikitech-l mailing list that he "caved in and wrote a few reasonably efficient parser functions... that should replace most uses of {{qif}}, and improve the efficiency of similar templates." Fürstenberg says he never expected {{qif}} to be used so widely. "I think I first realized it had become widely used when it had to be protected as any edit to it halted Wikipedia for a while."
In his 2006 mailing list post, Starling blamed the 2003 introduction of templates and the MediaWiki namespace, saying he didn't understand "what a Pandora's box" he opened. But that functionality was key to enabling one of MediaWiki's greatest strengths: localization.
The Localization Challenge
Niklas Laxström, founder of translatewiki.net and a WMF Staff Software Engineer, originally submitted translations via Bugzilla, then worked up his courage to ask Vibber to deploy them for him, sometimes breaking the Finnish Wikipedia because he forgot a semicolon.
"It was no wonder then, that many opted to do translations in the Wikipedias itself using Special:AllMessages," Laxström says. "There was no risk of syntax errors and changes were live immediately, as opposed to potentially taking many months as deployments were few and far between."
Laxström started modifying Special:AllMessages to make translation easier but didn't feel those changes were acceptable to go back into MediaWiki, so he hosted them on his own wiki. Today, nearly all localization of Wikipedia's interface is done via translatewiki.net, rather than on individual wikis. He credits Raimond Spekking with managing MediaWiki's localization process for over a decade. "He checks the changes to mark the translations outdated where necessary, he renames messages and performs other maintenance activities. He exports translation updates multiple times per week. He does this so well that it can feel like magic."
Divesting Power and the Technocracy Question
Early versions of Wikipedia's software gave immense power to developers. Only developers could block users, promote new administrators, rename users, and so on. Seeing this as a problem, in 2004 Starling wrote an email to the Wikipedia-l mailing list titled "Developers should mind their own business," proposing that certain user rights be split into a separate group.
"Wikipedia should not be a technocracy, ruled by those with knowledge of computer systems," he wrote. "Wikipedia should be a democracy. Those in power should be accountable to the community at large, and ideally selected from and by the community at large."
Today, Starling describes that shift in power as having been a big deal at the time. "I was very conscious of the fact that I was designing a social system," he says. "As you can guess from that email, I was uncomfortable with the fact that the power to do so had somehow fallen to me, but I wanted to get it right."
He credits Sunir Shah, founder of MeatballWiki, for discussing "that change with me at length, as well as other changes at the interface of social policy and technical design."
It's unclear how much lasting impact this change had, given the WMF's rise to the top of the Wikimedia power structure, in large part because it controls the majority of developers and servers. In 2014, Möller instituted "superprotect," which allowed the WMF to protect a page so that even administrators could not edit it. "[Möller's] idea was that it would be used in cases of conflict between the Foundation and the community, as a softer alternative to de-sysopping," Starling says. "When that conflict came, [Möller] asked me to make the necessary group rights changes. I said that I was uncomfortable putting my name to that on the wiki, so he found someone else to press the button."
Starling has a simple conclusion as to why the WMF has risen to the top of the power structure: Wikipedia lacks leadership. "I would like to see an elected editorial board with the mandate and courage to make major policy changes," he says. "Without such a body, WMF necessarily fills in the power vacuum, although it is too timid to do so effectively, especially on any question relating to the content."
The Most Controversial Move in Wikipedia History
Derek Ramsey originally wanted to create an article about a town he knew but couldn't come up with anything more than a couple of sentences. "So I came up with a solution: find a public domain mass data set that allowed a (somewhat) stub useful article to be created," he says. He imported census data tables into a MySQL database on his own Linux computer, correlated the data with other geographic sources, and after cleaning and validation, generated more than 3,000 text files for articles about United States counties, adding them to Wikipedia by hand.
"This was extremely tedious and slow, but effective," Ramsey says. "However, there were 33,832 cities, and that would have taken an order of magnitude longer to complete."
He first wrote a Java program to read each article and make HTTP requests to post it to Wikipedia, later adding features like error checking, error correction, throttling, and pauses for human verification. The bot increased the number of articles on Wikipedia by 40%; Andrew Lih later called it "the most controversial move in Wikipedia history" in his 2009 book The Wikipedia Revolution.
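The structure of such a bot loop is simple enough to sketch. Ramsey's program was Java and its internals aren't public here, so everything below, the function names, the retry-once policy, the checkpoint interval, is invented for illustration; only the general features (error correction, throttling, human verification pauses) come from his description:

```python
import time

def run_bot(articles, post, delay=0.0, check_every=500, pause=lambda: None):
    """Post (title, text) pairs via `post`; return titles that still failed.

    `post(title, text)` returns True on success. Failures get one retry
    (error correction); every `check_every` posts, `pause()` is called so
    a human can spot-check results; `delay` throttles request rate.
    """
    failed = []
    for n, (title, text) in enumerate(articles, start=1):
        ok = post(title, text)
        if not ok:
            ok = post(title, text)  # one simple retry
        if not ok:
            failed.append(title)
        if n % check_every == 0:
            pause()                 # wait for human verification
        time.sleep(delay)           # throttle to be kind to the servers
    return failed
```

As Ramsey notes, the end result of each request was identical to a manual edit; the controversy was purely about the scale and speed automation made possible.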
"I was bold and ignored all rules. You could still do that back then," Ramsey says. "After all, if I could edit articles manually, what difference did it make if I did the same thing automatically? It saved me time, but the end result was identical."
Out of all the controversy around mass article creation came two key things that Wikipedia still uses today. First, Ramsey created {{cite web}}, which most references on Wikipedia today use. "I wanted a generic way for Wikipedians to cite their sources easily, since prior to this the only citations made were manual and inconsistent," he says. "This was necessary because it was before the devs created native reference support in the Wikimedia software."
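A typical use of the template looks like this; the values are invented for illustration, and the parameter set has grown considerably since Ramsey's original version, so only a few core fields are shown:

```
<ref>{{cite web |url=https://example.org/report |title=Example report |website=Example.org |access-date=2006-05-01}}</ref>
```

Wrapping the citation in a template rather than free-form text is what made references machine-readable and consistently formatted across millions of articles.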
Second, Ramsey worked with other Wikipedians to develop an early bot policy. The initial version contained the contradictory statement, "In general bots are generally frowned [upon]." "I thought the concern was overblown, but the consensus was demanding that something be done to address the perceived issues," Ramsey says. "It was my desire to get ahead of the issue before draconian measures shut it all down, so I created bot policy as a sort of compromise. I figured it was better to do that than to have all bots banned wholesale."
Soon after, users started running bots and scripts under administrator accounts, dubbed "adminbots," to significant controversy. "There was a lot of hysteria surrounding adminbots on the English Wikipedia but a few people quietly ran them, as far back as like 2005," says Max McBride, a bot operator who previously ran adminbots.
McBride described people's attitudes on adminbots as "beyond reason" and suggested they were based on some sort of jealousy. "Like a random script gets admin rights and an admin gets two admin accounts, but not lots of regular users," he says. "I think that bred and fed some opposition."
Strong Security and the HTTPS Migration
Unlike many other prominent websites, Wikipedia hasn't suffered an embarrassing public security incident in which its users' private data was exposed. Some of this is because little private data is collected in the first place, but there has also been a strong culture of security since the beginning.
"In the early days, the worst case scenario was irreversible destruction of large amounts of user work, since we didn't have the resources to make frequent backups," Starling says. "I spent a lot of time doing security reviews, and informed by that work, I wrote policies and improved our APIs and conventions."
He also credited Vibber with making key policy decisions for other MediaWiki installations (disabling uploads by default and keeping the source tree non-web-writable) that ensured MediaWiki didn't become a "constant source of botnet nodes like some other PHP web applications."
Until native HTTPS support was rolled out as an opt-in option in 2011, readers and editors could only visit the site securely through a special secure.wikimedia.org gateway. Then in 2013, whistleblower Edward Snowden revealed that the NSA was targeting Wikipedia users visiting the site over the default, unencrypted HTTP protocol.
Ryan Lane, former WMF Operations Engineer, says the Snowden leaks made switching to HTTPS by default a priority. "We knew some governments were spying on their users (the great firewall of China was well known for this, and they were sharing this tech with other governments), but the Snowden leaks showed that the government was explicitly targeting Wikipedia users," he says.
Kattouw worked on the MediaWiki side of the HTTPS change, allowing for protocol-relative URLs to be used. "I think the [Site Reliability Engineering] people who worked on the HTTPS migration deserve more credit," he says. "That was a much more difficult migration than many people thought."
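Protocol-relative URLs were the trick that let the same stored and cached wikitext work under both schemes: a link written as `//upload.wikimedia.org/...` inherits whichever protocol the current page was loaded over. A tiny illustrative helper (not MediaWiki's actual implementation) shows the transformation:

```python
def protocol_relative(url):
    """Strip the scheme from an absolute URL, leaving '//host/path'.

    The browser then resolves the URL with the same scheme (HTTP or
    HTTPS) as the page embedding it, avoiding mixed-content warnings.
    """
    for scheme in ("https:", "http:"):
        if url.startswith(scheme + "//"):
            return url[len(scheme):]
    return url  # already relative, or not an absolute http(s) URL
```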
The politics involved in making the switch weren't the usual WMF-versus-community kind; they were actual global politics. "For example, Russian Wikipedia asked us to implement HTTPS only (for all users, not just signed-in users) as soon as possible, as they wanted to head off Russian legislation that would have enabled per-page censorship, and it would have forced the government to choose between blocking all of Wikipedia, which was politically difficult, or dropping their aim at per-page censorship," Lane says. "This is why Russian Wikipedia got support before any other wiki (and more extensive support, at that). Chinese Wikipedia, on the other hand, asked us to delay rollout, as the Chinese government was already doing per-page censorship, and had previously blocked all of Wikipedia a number of times."
There's one large exception to this focus on security: the ability for users to create custom scripts and styles and share them with other users on the wiki. In web development, this is typically known as a cross-site scripting vulnerability, but for Wikipedia it was a feature.
Fürstenberg created one of the most popular user scripts, Twinkle. He said it started as a helper for himself to "...conduct anti-vandalism and maintenance easier, from the point of reverting quickly, to the tedious task of filing reports to different sections. It pretty much boiled over from there."
Looking back, Vibber thinks the idea of user scripts is great, but implemented incorrectly. He said there are two primary problems: Running someone else's malicious code can lead to your account being taken over, and script code accesses internal data and methods that aren't going to stay stable, potentially breaking over time. "Both can be solved by using a sandboxed environment (probably a suitable iframe)," Vibber says. "I think there's a lot of cool stuff that can be built on top of this method, with full-on APIs for accessing an editor state as a plugin, or whatever."
Missed Opportunities and the Speed of Trust
At Wikimania 2012 and then in a Signpost op-ed, then-WMF Senior Designer Brandon Harris presented the "Athena Project," outlining a vision for what Wikipedia should look like in 2015. Suffice it to say, that vision was never fully implemented, and Harris says he could write a book about what went wrong.
"I'd say the primary reason was the fact that Foundation had a stellar lack of focus and a muddled leadership direction which allowed for lower-level political infighting to thrive," he says.
The reaction to Harris's proposal was generally mixed to negative, but that's what Harris was hoping for. "A thing a lot of people—even professional designers—don't understand about the design process is that only 10% of it is actually 'designing,'" he says. "Most of it is marketing: You have to understand the market you're designing for to know what to design and you have to convince folk that your design solves the problem. You may have to sell the idea that the problem even exists!"
Part of the purpose of his proposal was an exercise in re-examining the entire interface, something he said neither the WMF nor community do enough of. "Look at what happens, every day! Nothing has changed since 2015," Harris says. "The Foundation still doesn't know how to sell its ideas and it keeps trying to fix the same problems with the same tepid changes to the toolchain. The community still doesn't know how to govern itself and still keeps using the same broken processes to inadequately deal with the same issues."
McBride wrote an op-ed response to Harris, titled "Wikimedians are rightfully wary," expressing concerns about previous software deployments that didn't live up to their promise, like FlaggedRevs, which was supposed to solve the BLP (biographies of living persons) problem. "It wasn't a proposed solution as much as it was the only 'solution,'" he says. "And a lot of people had pinned their hopes on it being successful, but I was more interested in it failing fast so we could move on and try other solutions."
After various trials, and years of RfCs, Flagged Revisions (now rebranded on the English Wikipedia as "Pending Changes") is barely used on BLP pages, unmaintained and no longer enabled on new wikis. "The BLP problem is definitely not fixed," McBride says. "And there's an enormous gap between current tech and what could be implemented to alleviate the problem."
In his op-ed, McBride questioned whether upcoming projects like VisualEditor would end up with a similar fate as FlaggedRevs. As it turned out, the rollout of VisualEditor and Media Viewer a few months later were both extremely controversial among Wikipedians (not to mention the separate but related issue of superprotect), something that Möller acknowledges in hindsight.
"In both cases, a more gradual rollout (probably adding at least 1–2 years to the release timeline for VE, and 6–12 months for MV) could have prevented a lot of pain and frustration," Möller says. "I take on my share of responsibility for that."
Both Möller and McBride independently brought up the same saying: "change moves at the speed of trust" (Möller credited Lydia Pintscher for teaching it to him). "For that principle to work in practice, an organization has to be prepared to let go of overly rigid timelines and commitments, because its commitments must always first and foremost be to the people whose trust it seeks to earn and keep," Möller says. "That doesn't mean it's impossible to make radical, transformative changes, but it can certainly feel that way."
McBride put it more bluntly. "Wikimedians don't like shitty software, they quickly embrace good software (think @pings or mass messages or...)," he says. "A lot of software is bad and is imposed on the communities without consultation or input. Of course people will dislike that and reject it."
Harris doesn't disagree. "...I think the primary reason is that editors are rightly concerned about impacts to their workflows, and the Foundation has been historically terrible about thinking about this and accounting for it," he says. "This is why I designed the New Pages Feed to work independently of the existing workflows and scripts that people had developed themselves."
Recognition and Legacy
Early on, Wales recognized key development milestones by giving developers their own holidays: Magnus Manske Day (January 25), Tim Starling Day (October 31) and Brion Vibber Day (June 1). "It isn't really clear who gets the credit now – whenever you step away, very few people remember what you did," Mituzas says. "Being recognized and rewarded by the community was definitely part of the motivation to keep on working."
Mituzas himself is remembered on the blame wheel, where he's responsible for 25% of Wikipedia's problems. "Sometimes it feels that the blame wheel is the only part that is left of any fame I had," he says.
Harris is likely the best known Wikimedia developer, having appeared on fundraising banners in 2011. "We had three 'storytellers' who interviewed a lot of us about why we were working there and they liked what I had to say. They took photos," he says. "Later one of them ended up being used as a test and performed fairly well. This became popular and weird because the internet is weird."
Unsurprisingly, there's a direct parallel to how credit operates on Wikipedia itself. "It's always fascinated me how much wiki editing has mirrored open source software contributions," McBride says. "In both, a lot of people making small suggestions and improvements are the ones who push the project forward."
Some of those names can be found on Special:Version or in the credits. Others might be in mailing list archives, forgotten bugs and long lost IRC logs, but their contributions nonetheless built Wikipedia into what it is today.
This article was written by Legoktm, a site reliability engineer for the Wikimedia Foundation, in his volunteer capacity.
