Introduction
Started as a Perl CGI script running on a single server in 2001, the site has grown into a distributed platform containing multiple technologies, all of them open. The principle of openness forced all operations to use free and open-source software only. With commercial alternatives out of the question, Wikipedia had the challenging task of building an efficient platform from freely available components.
Wikipedia's primary aim is to provide a platform for building a collaborative compendium of knowledge. Because of its funding model (it is mostly donation-driven), performance and efficiency have been prioritized above high availability or security of operation.
At the moment there are six people (some of them recently hired) actively working on the internal platform, though there are a few active developers who contribute to the open-source code base of the application.
The Wikipedia technology is in constant evolution; information in this document may be outdated and may no longer reflect reality.
The big picture
Generally, it is an extended LAMP environment. The core components, front to back, are:
• Linux - operating system (Fedora, Ubuntu)
• PowerDNS - geo-based request distribution
• LVS - used for distributing requests to cache and application servers
• Squid - content acceleration and distribution
• lighttpd - static file serving
• Apache - application HTTP server
• PHP5 - core language
• MediaWiki - main application
• Lucene, Mono - search
• Memcached - various object caching
Many of the components have to be extended to communicate efficiently with each other, which tends to be major engineering work in LAMP environments.
This document describes the most important parts of gluing everything together, as well as the adjustments required to remove performance hotspots and improve scalability.
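The front-to-back ordering of the components can be pictured as the path a single request takes. The following is only a minimal sketch under stated assumptions, not a description of the real routing: the function name `trace_request`, the `/static/` path prefix, and the rule that a cache miss is stored on the way back are all hypothetical and chosen for illustration. Only the component names come from the list.

```python
# Illustrative model of the tiers listed in "The big picture".
# Component names come from the list; every routing rule below is an
# assumption made for the sake of the example, not the site's real logic.

FRONT_TIERS = ["PowerDNS", "LVS", "Squid"]           # DNS, load balancer, cache
APP_TIERS = ["LVS", "Apache", "PHP5", "MediaWiki"]   # application path on a miss


def trace_request(path, squid_cache):
    """Return the ordered list of components one request would touch.

    `squid_cache` is a plain set of already cached paths (hypothetical
    stand-in for the Squid layer's cache contents).
    """
    hops = list(FRONT_TIERS)
    if path in squid_cache:
        # Cache hit: the request never reaches the application servers.
        return hops
    if path.startswith("/static/"):
        # Assumed prefix for static files, served by lighttpd.
        hops.append("lighttpd")
    else:
        # Dynamic page: balanced to an application server, where
        # MediaWiki may consult Memcached for cached objects.
        hops.extend(APP_TIERS)
        hops.append("Memcached")
    squid_cache.add(path)  # assumed: the response is cached on the way back
    return hops


if __name__ == "__main__":
    cache = set()
    print(trace_request("/wiki/Main_Page", cache))  # first request: full path
    print(trace_request("/wiki/Main_Page", cache))  # repeat request: cache tiers only
```

The point of the sketch is the shape of the stack: the further forward a request can be answered, the fewer components it touches, which is why the caching tiers sit in front of the application servers.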
Wikipedia: Site internals, configuration, code examples and management issues
Domas Mituzas, MySQL Users Conference 2007