1. Linux at CERN

1.1. Introduction & Summary

For physics computing, Linux is the operating system of choice at CERN. It provides a variety of services on machines ranging from desktops to servers. CERN has its own Linux distribution and support infrastructure. In the future, Linux use will increase due to the rising number of farm machines, and investment in automated tools will be needed to administer such a large number of machines at a reasonable cost in manpower.

1.2. History

Linux is a UNIX-like, POSIX-compliant operating system that grew out of a study project of the Finnish student Linus Torvalds in 1991. Together with the utilities and compiler from the GNU project and other open source projects, it has since evolved with the help of volunteers from all over the world to become a production-quality multi-architecture operating system (History) and user environment, subsumed here under the name of the kernel ("Linux"). CERN has been using Linux since at least 1995, when individual groups of physicists started using it. Since 1997, CERN has had a centrally supported Linux release, at that time based on Red Hat 4, with all the CERN tools and environments (AFS, ASIS, SUE) available as on the other vendor UNIXes. Regular releases have followed the Linux evolution of the outside world, and CERN has actively contributed to the development (Gigabit drivers, porting GNU libc to IA64) as its needs required.

Illustration 1: different Linux environments (usage)
• Batch & Interactive Farms: ~1200 (in Computer Center)
• 'Special' servers: ~200 Disk servers, ~50 Tape servers
• Desktops / Desk-side: ~1400
• Embedded Systems: (few now, lots later)

1.3. Linux use at CERN

a) Farms

A large part of the Linux systems installed at CERN is used for computing through large batch farms running LSF, like LXBATCH. Today, the typical physicist's computing job fits comfortably within a single dual-processor IA32 machine, so the farm machines process jobs independently without need for Single-System-Image abstractions or MPI. This independence between (a large number of) jobs running over extended periods of time has given rise to the term High-throughput computing (HTC). Linux supports this type of machine very well. Dual-processor machines with commodity hardware and off-the-shelf networking offer a sweet spot in terms of price / CPU performance for this kind of workload. Access to the data from the application is handled via the SHIFT architecture or through CERN's mass storage application CASTOR.

An "interactive" cluster (LXPLUS) allows users without a desktop Linux machine to develop applications and submit them to the batch system, from XTerminals or for remote users. It is also used for reading mail or occasionally for web access. A smaller interactive cluster with SUN Solaris allows code to be validated against a different compiler.

b) Special Servers

Similar to the batch farms, commodity hardware with Linux as the operating system can deliver significant cost benefits over proprietary solutions for storage or special-purpose server applications. At CERN, a combination of dual-processor IA32 with (cheap) hardware RAID cards and IDE disk drives has offered reliable disk storage in the form of the "Disk Server". Similarly, tape drives are directly attached via SCSI or FC to Linux "Tape Servers". Other special-purpose servers are also running Linux, e.g. AFS or NFS file servers or CERN's DNS service. Linux is used here in parallel with SUN Solaris, with Linux typically being preferred as soon as the number of systems grows. Solaris typically runs at CERN on more reliable hardware for services where high availability of individual machines is required, for example ORACLE database servers.

c) Desktops

A number of physicists prefer to use Linux on their desktop or laptop computer for their day-to-day work like reading mail or web browsing. CERN Linux comes with the required graphical user environment (both GNOME and KDE are available) and utilities. Compatibility with CERN-IT central mail and web services is regularly checked. This behavior by users has brought up a number of problems in terms of interoperability in the past; especially documents in proprietary formats (like Microsoft Word/PowerPoint or Adobe FrameMaker) have forced users to keep a second machine with Microsoft Windows (or to run VMware or dual-boot). Lately, open source programs like OpenOffice.org are getting better at understanding such formats, but are no full substitute yet.

Running Linux on the desktop is also a preferred solution for software developers, since it allows them to be in complete control of the runtime environment (unlike on the shared farm machines). To facilitate this approach, one of the paradigms for CERN Linux is to have the same operating system (including libraries and compilers) on desktops and farm machines (and embedded devices, if possible).

d) Embedded appliances

These special-purpose devices used to run proprietary real time operating systems like LynxOS or VxWorks, to handle data under special conditions (radiation, heat, vibration) or constraints (hard or soft real time processing). They are being used mainly by the accelerator controls and experiments' "online" groups. Due to growing familiarity with UNIX/POSIX, we see a trend to run Linux on such devices as long as no hard real time guarantees are required. This allows for more comfortable development and debugging (instead of the proprietary cross-compiling development environments). From a system administration perspective, such machines are very close to ordinary off-the-shelf PCs as used in the batch farms, but they may need special supporting services (e.g. for diskless booting) or drivers (for VMEbus devices). They also put special demands on the Linux kernel internals, for example on context switching time or inter-task fairness.

1.4. CERN Linux distribution and certification

As mentioned earlier, the goal is to use a single Linux release to cover all aspects of Linux use at CERN, both to comply with user expectation and to keep support effort down. No commercial distribution fulfills all of CERN's needs and runs on all the hardware found at CERN, so CERN has been providing a modified version of Red Hat. The modifications include additional kernel patches, new software like OpenAFS, and CERN-specific physics software and management applications.

In order to ensure that no requirements on a new release are overlooked, a formal process has been established for moving to a new release. A "certification coordination" group (LXCERT) with appointed members from the large Linux user groups and service providers inside CERN oversees the process and is responsible for bringing up and arbitrating user requirements and dependencies between different applications. The certification process is tracked; a final decision to adopt or reject lies with this coordination group. In the future, the influence of GRID computing will bring in more requirements from other sites as well.

Illustration 2: Dependencies inside CERN Linux 7.3

Whenever the goal of having a uniform system cannot be met (e.g. due to new bugs being discovered or incompatible hardware, both of which can trigger a kernel update), such deviations are noted and are folded into the next release.

1.5. CERN Linux support

At CERN, the support for Linux is handled at multiple levels and in different groups:
• Users with several machines (like the CERN-IT farms) typically have dedicated local support for the day-to-day running of their applications.
• Large user groups (experiments, divisions) also have local support to handle direct user questions.
• CERN-IT offers centralized support:
  • The CERN Helpdesk will take calls and re-route them appropriately. This level is handled by an external company.
  • A second level will deal with recurring user problems and gives individual assistance to users, like desktop installations. This level is handled by an external company.
  • An in-house third-level support handles everything else, including preparations of new releases, kernel bug fixes, workarounds to common problems, assistance to the farm operations and documentation.
  • Eventually, support calls may be opened with a vendor. Given that CERN has no support contract for the majority of Linux machines, these calls are typically used to inform the Linux user community and may not be resolved for considerable time.

1.6. Outlook

The current assumptions about physicists' jobs still seem to hold, so the current computing model is likely to be useful in the future. Therefore, the number of CPU nodes in batch farms will increase to cope with the massive amount of computational power required for LHC. Similarly, the online event filter farms will grow massively. This growth requires new farm management tools to prevent operational costs from exploding; such tools are currently under development as part of the "fabrics" efforts of EDG, LCG and EGEE.

At the same time, a number of (relatively) low-cost storage solutions have appeared that could offer advantages over the current "Disk-Server" model used by CERN. Typically they provide for direct data access by the clients, and for better scalability by keeping data and metadata on separate services. Such solutions are being evaluated at the moment (e.g. as part of CERN's OpenLAB industry collaboration); they could be integrated with or ultimately replace CERN's own storage solutions.

Similarly, new developments in the interconnect area such as InfiniBand, 10Gb Ethernet, RDMA, PCI-X and PCI-Express could be "enabling" technologies by providing cheap high-bandwidth and low-latency connections between CPU nodes themselves and storage. This could lead to new approaches for experiment data analysis or the storage subsystem.

The various GRID projects bring in new challenges, both technical (new services to be defined and implemented) and political (interoperability between sites perhaps leading to a HEP-wide Linux distribution, access to remote resources).

Lastly, the Linux world itself is changing: vendors like Red Hat or SuSE are now concentrating more on profitable bits of their business, making life harder for the copycat distributions like CERN Linux which in the past have profited from "free" software updates. Third-party hardware and software vendors like ORACLE, IBM and SUN have all embraced Linux and are offering commercial support. The large (noncommercial) user and developer community is meanwhile proceeding with adding features (the 2.6 kernel is expected soon), often enough in uncoordinated fashion, and creating new software as often as abandoning older products.
