Started as Perl CGI script running on single server in 2001, site has grown into distributedplatform, containing multiple technologies, all of them open. The principle of opennessforced all operation to use free & open-source software only. Having commercial alterna-tives out of question, Wikipedia had the challenging task to build efﬁcient platform of freelyavailable components.Wikipedia
s primary aim is to provide a platform for building collaborative compendium ofknowledge. Due to different kind of funding (it is mostly donation driven), performance andefﬁciency has been prioritized above high availability or security of operation.At the moment there
re six people (some of them recently hired) actively working on inter-nal platform, though there
re few active developers who do contribute to the open-sourcecode-base of application.The Wikipedia technology is in constant evolution, information in this document may beoutdated and not reﬂecting reality anymore.
The big picture
Generally, it is extended LAMP environment - core components, front to back, are:•Linux - operating system (Fedora, Ubuntu)•PowerDNS - geo-based request distribution•LVS - used for distributing requests to cache and application servers•Squid - content acceleration and distribution•lighttpd - static ﬁle serving•Apache - application HTTP server•PHP5 - Core language•MediaWiki - main application•Lucene, Mono - search•Memcached - various object cachingMany of the components have to be extended to have efﬁcient communication with eachother, what tends to be major engineering work in LAMP environments.This document describes most important parts of gluing everything together - as well asrequired adjustments to remove performance hotspots and improve scalability.
Wikipedia: Site internals, conﬁguration, code examples and management issues
Domas Mituzas, MySQL Users Conference 2007