College of Engineering & Computer Science Computing Infrastructure Upgrade & Replacement

Mark Stanislav
Engineering Computer Services (ECS)
University of Michigan - Dearborn
mstanisl@engin.umd.umich.edu
August 2007 - January 2008
Abstract

This document covers the computing infrastructure upgrades and replacements performed for the College of Engineering and Computer Science at the University of Michigan, Dearborn campus. The contents are limited to information cleared for general consumption; certain details of server configurations have been withheld. The scope of the project was broad, encompassing a nearly complete replacement of all major infrastructure components for the college and its resources. The College of Engineering and Computer Science (CECS) contains five major departments and, at the time of implementation, about 3,300 students, staff, and faculty. The plan was to upgrade all integral components of the computing infrastructure with minimal downtime and end-user inconvenience. This document outlines not only the reasoning behind the upgrade and replacement decisions, but also technical information about how technologies were implemented and notable aspects of each configuration. The project timeline noted above covers the initial planning, all testing, and, commencing in January 2008, the launch of the new computing resources.


Contents

1 Existing Infrastructure
  1.1 Authentication
  1.2 Network File Storage
    1.2.1 File System Quotas
  1.3 E-Mail Services
    1.3.1 SMTP
    1.3.2 IMAP
    1.3.3 Web Mail
  1.4 Network & Service Monitoring
  1.5 Remote Login Cluster
  1.6 User Account Management
  1.7 PC Client Authentication
  1.8 UNIX Lab
2 Existing Infrastructure - Issue Summary
  2.1 Hardware
  2.2 Services
  2.3 General Security & Management
3 Infrastructure Replacement - Overview
  3.1 Scope Limitations
  3.2 Project Goals
4 Authentication
  4.1 Overview
  4.2 Architecture
  4.3 LDAP
  4.4 Kerberos
  4.5 Major Improvements
5 Network File Storage
  5.1 Overview
  5.2 Architecture
  5.3 OpenAFS
  5.4 Major Improvements
6 E-Mail Services
  6.1 Overview
  6.2 Architecture
  6.3 SMTP
  6.4 IMAP
  6.5 Webmail
  6.6 Mailing Lists
  6.7 Additional Features
  6.8 Major Improvements
7 Network & Service Monitoring
  7.1 Overview
  7.2 Nagios
  7.3 Cacti
  7.4 Snort
8 Remote Login Cluster
9 User Account Management
  9.1 Overview
  9.2 User Account
  9.3 Administrator Account
10 PC Client Authentication
11 UNIX Lab
12 Data Backups
13 Infrastructure Details
  13.1 Operating Systems
  13.2 Hardware
  13.3 Connectivity
14 Implementation
  14.1 User Credentials Migration
  14.2 User Data Migration
  14.3 Service Cut-Over
  14.4 Goals Reached
  14.5 Needed Improvements
  14.6 Miscellaneous
15 Appendix
  15.1 New Infrastructure Servers
  15.2 Screenshots
    15.2.1 my.engin - User Main
    15.2.2 my.engin - Administrator User Diagnostic
    15.2.3 Cacti
    15.2.4 Nagios
16 References
  16.1 General
  16.2 Authentication
  16.3 Network File Storage
  16.4 E-Mail Services
  16.5 Network & Service Monitoring

1 Existing Infrastructure

The majority of the existing infrastructure ran on very outdated and somewhat failing Sun Microsystems hardware (generally Sun Ultra or Enterprise machines from approximately eight years prior). Most of the servers ran Solaris 8, originally released in February 2000. Hardware was rarely configured with redundancy by way of disks, backups, or fail-over; most infrastructure depended on a single machine that operated as both data storage and application server. The following is an overview of the infrastructure in place before upgrades and replacements were made. It is included to convey the scope of the project and to highlight the major improvements over the deprecated software and hardware previously in place.

1.1 Authentication

The original authentication system implemented by ECS was a standard Network Information Service (NIS) configuration, supported by a single master server that passed information to subnetwork-specific slave machines. NIS (not even the more recent NIS+) had been deployed within the department nearly a decade earlier. NIS offers little security (DES-hashed passwords) and has no real place in a modern computing infrastructure. It is long outdated, but commonly appears alongside older deployments of NFSv3.

1.2 Network File Storage

NFSv3 (its RFC dates from 1995) was the network file sharing mechanism for all data that required a network share, including users' personal files and e-mail, departmental web sites, lab software, and faculty and staff information. For modern network storage and sharing, the abilities of NFSv3 are far surpassed by current technologies such as NFSv4 and OpenAFS. As previously mentioned, NFSv3 is commonly paired with NIS and was a standard configuration nearly a decade ago.


Actual file storage for the network was accomplished with a patchwork of fiber-attached storage arrays, badly misallocated despite their adequate capacity, plus some local disk space. This allocation hindered easy expansion, and no hot-spares were configured in case of disk failure; generally speaking, a failing RAID would leave users without resources.

1.2.1 File System Quotas

Users were generally given a very small (20 MB) file quota, which was often not enforced by the system, allowing abuses of disk space and creating the potential for a Denial of Service (DoS) on computing resources if an account was compromised.

1.3 E-Mail Services

Of note, e-mail quotas were improperly managed, much like those of the general file system. E-mail data was stored within accessible reach of users, creating potential security risks and privacy violations. In addition, there was no simple way to request e-mail aliases for an account, check quota information, or set vacation messages.

1.3.1 SMTP

The incoming mail service was based on Sendmail and resided on a single server acting as the mail gateway for both incoming and outgoing e-mail connections. The existing build of Sendmail was approximately three years old and severely undermaintained. SMTP connections were not possible from outside the local network, and no SMTP-SSL/TLS support was available to ensure e-mail was sent securely from the originating sender to the server. There was no redundancy, so users were unable to send or receive e-mail whenever the machine failed.

1.3.2 IMAP

Again, a single server was available to serve IMAP requests from users. All connections to check a user's e-mail had to pass through this one server, which would often lock up at seemingly random times, most likely due to a hardware failure in system cooling or RAM. Within the project time-span, the IMAP server failed nearly 20 times. It was running a severely outdated version of UW-IMAP with improperly configured IMAP-SSL/TLS.

1.3.3 Web Mail

ECS had not previously run its own webmail system; webmail was provided by the university's central technology department, ITS. That interface, while functional, was not maintained by any ECS staff member, and its availability was outside the department's control. ECS staff could not fix webmail issues themselves, causing confusion about whom to contact when an error occurred.

1.4 Network & Service Monitoring

At the start of the project, ECS had no network or service monitoring software in place. Server or application downtime was generally discovered by end-users, and ECS would be alerted through a help desk ticket. Metrics of computing resources (network bandwidth, disk storage, service uptime) were unknown to the staff, who could only guess at how machines and services were performing and at their patterns of usage.

1.5 Remote Login Cluster

To give members of CECS remote access to a UNIX-like OS for doing homework or checking e-mail from the command line, ECS had implemented a three-server login cluster. At any given time, at least one of the three servers would be down due to failing, antiquated hardware. The inability to keep all three servers online caused problems for people who habitually connected to a particular server, only to find it down and inaccessible for extended durations.


1.6 User Account Management

A largely unorganized Perl script was responsible for command-line administration of user accounts. Tasks such as adding and deleting users were performed through an inadequate switch-based script that offered little information to the administrator and made managing account information tedious and confusing. As a result, hundreds of largely unused accounts sat dormant, presenting a security risk as well as wasted storage.

1.7 PC Client Authentication

pGINA provided a way for departmental Windows machines to authenticate against the existing UNIX-based infrastructure. While this worked for simple authentication, the need to patch a major component of Windows and to rely on unreliably maintained plugins makes it a last resort. No network file storage was available, as Windows network shares were not mounted for users upon logging in. This inconvenience made network storage unattractive to end-users and encouraged heavier use of local machine storage. While storing files on a desktop client is certainly reasonable, it provides a single point of failure for that data and leaves it unavailable at any other machine.

1.8 UNIX Lab

The ECS UNIX computing lab was running a Solaris 8 image from approximately five years earlier, providing no friendly GUI for users and lacking software upgrades that had been available for years. Machines were also rarely patched for security updates, making the lab an easy target: a single vulnerability could let a malicious user take over 30 clients at once. Reliability of authentication was sporadic, as only one server could authenticate users, creating a single point of failure for logging in.


2 Existing Infrastructure - Issue Summary

2.1 Hardware

• Primarily based around hardware that was generally 5-8 years old
• Little usage of RAID for basic data redundancy
• Missing hot-spares on RAID arrays
• Mismanagement of available disk storage resources
• Network interfaces all 10/100 Mbit on a 1 Gbps network
• Some hardware supporting only serial connections for administration
• No redundancy of critical hardware (e-mail, authentication, network storage)

2.2 Services

• Insecure or feature-outdated versions of all critical services
• Lack of ECS-administered webmail
• No proper Windows domain with network file access
• Network printing services not adequately set up
• Mail, authentication, and e-mail services not redundantly configured

2.3 General Security & Management

• No services/hardware monitoring
• Lack of usable OS software package management
• Deprecated interface for user management
• No reliable data backup procedure implemented
• No firewalls configured
• No backup procedure for critical system configurations

3 Infrastructure Replacement - Overview

3.1 Scope Limitations

The project as defined was to update the core components of the infrastructure, together with any related pieces that made sense to handle at the same time. For instance, SQL database servers were not upgraded, as they are autonomous; network printing (CUPS), however, was replaced because it was tied to the configuration and deployment of a proper Samba-based Windows domain. This project did not comprehensively replace all ECS resources, only the vital components that other services work around and with. With a current and functional infrastructure in place, adding new servers becomes far simpler and more easily managed.

3.2 Project Goals

• Replace old or failing hardware with new machines
• Switch from Solaris to Linux (Debian GNU/Linux, Gentoo Linux) and FreeBSD
• Replace deprecated services with more current and advanced ones
• Tightly integrate Windows PCs and the UNIX lab for maximum usability
• Provide usable interfaces for administrators and end-users to manage accounts
• Monitor systems and provide a reliable and easy back-up procedure
• Keep security at the forefront of configurations and policies
• Thoroughly focus on making e-mail as stable and featured as possible
• Reallocate disk usage to maximize disk space and logically organize it
• Prevent the upgrades from harming any remaining 'old' infrastructure
• Transition users easily and make them want to use the new infrastructure!


4 Authentication

4.1 Overview

Authentication transitioned from one mechanism to two: NIS was replaced with a combination of LDAP and MIT Kerberos V. Using two authentication mechanisms actually reduces complexity, since each can be leveraged by services that do not support the other. For instance, Kerberos made integrating the OpenAFS cell straightforward, while LDAP provides a way not only to authenticate users but to manage them.

4.2 Architecture

The previous single primary NIS server distributing to subnetwork-specific slaves has been replaced with two servers that provide full network coverage, sharing the query load via round-robin DNS. While there is still a single primary server for both LDAP and Kerberos, either server can quickly be promoted to master if the original master is down for an extended period. DNS-based load-balancing helps ensure that each server bears some of the load from authentication, address book look-ups, and any other queries involving LDAP or Kerberos.
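
The effect of round-robin DNS is visible from any client: a single hostname resolves to the addresses of both servers, and successive lookups distribute connections between them. The following sketch, using a hypothetical hostname, simply lists every address returned for the shared name:

    import socket

    # The hostname below is a placeholder for the shared authentication alias;
    # the resolver returns one A record per authentication server.
    infos = socket.getaddrinfo("auth.engin.example.edu", 389,
                               proto=socket.IPPROTO_TCP)
    for family, socktype, proto, canonname, sockaddr in infos:
        print(sockaddr[0])  # each server's address, in the order the resolver returned it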

4.3 LDAP

LDAP stores not only a username and password, but also information such as the user's department (or program of study), e-mail aliases, the path to their network file share, the path to their Samba share, group memberships (for file access or organization), and even their e-mail vacation reply settings. LDAP provides network access to a myriad of account information in an easily managed database and is a widely supported authentication mechanism. NIS had no such extended functionality; furthermore, LDAP offers secure password storage with ACLs restricting information from those who should not see it.

In addition to the uses mentioned above, LDAP is the authentication mechanism for all UNIX/Linux clients as well as the authentication backend for Samba (the Windows domain). It also provides address book functionality to all e-mail clients, letting users look up other people within CECS to e-mail. Having a central place for a user's information eliminates excessive, scattered storage across multiple servers.
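
As an illustration of the kind of directory query described above, the sketch below uses the Python ldap3 library to look up a user's address-book attributes; the hostname, base DN, and attribute names are assumptions, as the actual schema is not detailed in this document:

    from ldap3 import Server, Connection, SUBTREE

    # Anonymous bind, as an e-mail client's address book lookup would use
    server = Server("ldap://ldap.engin.example.edu")
    conn = Connection(server, auto_bind=True)
    conn.search(
        search_base="ou=People,dc=engin,dc=example,dc=edu",  # placeholder base DN
        search_filter="(uid=jdoe)",
        search_scope=SUBTREE,
        attributes=["cn", "mail", "departmentNumber", "homeDirectory"],
    )
    for entry in conn.entries:
        print(entry.cn, entry.mail)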

4.4 Kerberos

Kerberos primarily allows easy integration with the OpenAFS cell for network file access, but it also helps us integrate with other networks. For example, the Kerberos configuration provides easy access to available University of Michigan, Ann Arbor network file resources in addition to those ECS provides. Kerberos can also provide an authentication interface for many services and eases future expansion of networking resources. Through Kerberos, Windows PC, UNIX lab, and login cluster users can all access network files merely by logging in: the token generated from the Kerberos ticket provides the required access to those files, as well as the ability to log into other ECS servers without re-authenticating.
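
The ticket-then-token flow can be reproduced by hand with the standard MIT Kerberos and OpenAFS client tools. This is a minimal sketch of that sequence, with a hypothetical principal name:

    import subprocess

    def get_afs_token(principal: str) -> None:
        # kinit prompts for the password and obtains a ticket-granting ticket
        subprocess.run(["kinit", principal], check=True)
        # aklog exchanges the Kerberos ticket for an OpenAFS token in the local cell
        subprocess.run(["aklog"], check=True)

    get_afs_token("jdoe@ENGIN.EXAMPLE.EDU")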

4.5 Major Improvements

• Securely replicated data between redundant servers
• Two authentication mechanisms providing broad compatibility with services
• Current and reliable technologies that will be around for 5+ years
• User account information and settings consolidated into a single database
• Cross-network integration for remote file access (Kerberos + OpenAFS)
• Passwords no longer stored in a deprecated, easily cracked format

5 Network File Storage

5.1 Overview

OpenAFS has replaced NFSv3, providing a huge upgrade not only in technology but also in reliability. OpenAFS is widely used in academia and elsewhere, and it integrates smoothly with Kerberos-based networks. OpenAFS has both database servers and fileservers: database servers track which files are stored where, while fileservers hold the actual files.

5.2 Architecture

The OpenAFS cell (the collection of all OpenAFS resources) comprises four database servers and two fileservers. Four database servers provide strong redundancy in case of failure. The two fileservers provide access to over 4 TB of RAID-5 data storage, spanning two RAID storage arrays plus local disk space allocated on an internal RAID array.

5.3 OpenAFS

OpenAFS provides a coherent namespace for all file storage and allows storage resources to be manipulated as needed. Data stores can be logically named (such as user.mail or user.home) and altered as sizing or naming requirements change. OpenAFS also brings other advantages: tight integration with Kerberos, hard filesystem quotas, ease of scalability, redundancy, and manageability.
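
For a sense of how volumes are managed in practice, the sketch below wraps the standard OpenAFS vos and fs tools; the server, partition, volume, and path names are placeholders rather than the actual cell layout:

    import subprocess

    def create_user_volume(server: str, partition: str, volume: str) -> None:
        # vos create allocates a new, logically named volume on a fileserver partition
        subprocess.run(["vos", "create", server, partition, volume], check=True)

    def set_hard_quota(path: str, quota_kb: int) -> None:
        # fs setquota applies a hard quota (in kilobytes) at the volume's mount point
        subprocess.run(["fs", "setquota", "-path", path, "-max", str(quota_kb)],
                       check=True)

    create_user_volume("afs1.engin.example.edu", "/vicepa", "user.jdoe")
    set_hard_quota("/afs/engin.example.edu/user/jdoe", 200000)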

5.4 Major Improvements

• Redundant file storage that scales easily and allows reorganization
• Simple integration with Solaris, Linux, Mac OS, and Windows clients
• Reliable, well-evaluated technology in wide current use
• Added security through highly customizable ACLs on volumes

6 E-Mail Services

6.1 Overview

E-mail services were among the most heavily affected aspects of this infrastructure upgrade, as they lacked much of the basic functionality a modern e-mail system should possess. While e-mail is not as directly critical as authentication or file storage, it is probably the most important end-user service that ECS provides. Great effort went into making the e-mail system robust, giving users a proper e-mail system for 2008 and beyond.

6.2 Architecture

Two servers were used for e-mail services, with essentially full parity between them: the services and configurations on both are the same. The mail servers use OpenAFS file space for mail storage, providing a central location for e-mail to be read from and written to and allowing many servers to access that data without a separate multi-interface RAID storage device. Round-robin DNS load balancing handles incoming IMAP connections, and MX records direct incoming SMTP connections.

6.3 SMTP

Sendmail was replaced with Postfix, which allows more easily customizable configurations and integrations. LDAP provides the authentication backend for SMTP, so users must now authenticate to the mail server, ensuring it does not relay mail for just anyone. An SSL certificate was also added, providing secure communication between the end-user and the mail server; more than ever, users can feel secure when sending e-mail through ECS mail servers. Virus and spam filtering was added so that users face fewer security threats and annoyances in daily use: incoming and outgoing e-mail is scanned, which both protects users from attacks and keeps the ECS network from being blacklisted for sending dangerous e-mail.
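
From a client's perspective, the new submission path looks like the following sketch: connect, upgrade the session with STARTTLS, authenticate, and send. The hostname and credentials are placeholders:

    import smtplib
    from email.message import EmailMessage

    msg = EmailMessage()
    msg["From"] = "jdoe@engin.example.edu"
    msg["To"] = "someone@example.org"
    msg["Subject"] = "Test through the new mail gateway"
    msg.set_content("Sent over an authenticated STARTTLS session.")

    with smtplib.SMTP("smtp.engin.example.edu", 587) as smtp:
        smtp.starttls()                 # encrypt the session before authenticating
        smtp.login("jdoe", "password")  # SASL authentication backed by LDAP
        smtp.send_message(msg)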


6.4 IMAP

The existing IMAP service was replaced with Courier-IMAP, which has a very strong security history and is widely used. E-mail quotas were properly implemented with Courier, helping keep e-mail usage under control, unlike the previous system. IMAP-SSL connections are now correctly configured, so users can again trust that data crossing a public network is secure. LDAP is used to authenticate IMAP users, as it is with SMTP.
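
A client-side view of the new IMAP service, as a sketch with a placeholder hostname and credentials; the connection is encrypted end to end and the login is checked against LDAP on the server:

    import imaplib

    with imaplib.IMAP4_SSL("imap.engin.example.edu") as imap:
        imap.login("jdoe", "password")      # verified against LDAP by the server
        imap.select("INBOX", readonly=True)
        status, data = imap.search(None, "UNSEEN")
        print("unread message ids:", data[0].split())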

6.5 Webmail

Previously missing from the ECS infrastructure was a webmail system. To encourage webmail adoption by CECS users, two webmail systems with different strengths were implemented. RoundCube provides a very interactive, almost desktop-application feel; SquirrelMail, on the other hand, is a very simple client with few visual flourishes but nearly a decade of proven reliability. Each features an address book (pulled from LDAP), information about e-mail quota usage, and easy integration with the mail system. Both webmail systems authenticate against the IMAP servers running locally on the mail machines.

6.6 Mailing Lists

The mailing list software, Mailman, was upgraded to a new version. Mailman is the foundation of all CECS mailing lists, and the newest version provides security and usability enhancements over the previous one. While not a major change, mailing lists were re-created, and a method was built to populate them with reliable user data drawn from LDAP records.

6.7 Additional Features

Vacation reply messaging, e-mail alias requests, and quota information were not previously readily available to users or administrators. A web interface now lets users set vacation message information and easily request quota increases and e-mail aliases. These features are covered later in the discussion of 'my.engin'.

6.8 Major Improvements

• Working e-mail quotas that can be viewed easily and increased on request
• Functional virus quarantining and spam filtering
• Two webmail interfaces to suit different user requirements
• Redundant e-mail servers utilizing DNS round-robin/MX record balancing
• Securely authenticated SMTP/IMAP transactions with off-campus SMTP access
• Updated mailing list software and reliable list management
• Centrally stored mail protected by ACLs within OpenAFS

7 Network & Service Monitoring

7.1 Overview

As previously mentioned, the prior ECS infrastructure had no service or server monitoring in place. To fill that need, a combination of Nagios and Cacti was deployed, monitoring not only whether a server or service is online, but also how its resources are being used, and when. SNMP, installed on all servers, feeds the Cacti monitoring.

7.2 Nagios

Nagios provides per-server monitoring of critical system services and watches for system metric thresholds being crossed. For instance, if a disk reaches a certain point (such as 80% full), Nagios will send an e-mail, a page, and an AIM instant message alerting the system administrators that a problem has occurred. This kind of monitoring helps ensure that the technical staff is the first to know of a problem, not the last. Each server has an individual set of monitored services and will even alert administrators when package upgrades are needed. Full customization of who, when, and why allows Nagios to deliver only the alerts that are critical.
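
Nagios determines a check's severity from its exit code, so a custom check is easy to add. Below is a minimal sketch of a disk check in that style; the 80% threshold mirrors the example above, while the critical threshold is an arbitrary assumption:

    #!/usr/bin/env python3
    import shutil
    import sys

    WARN, CRIT = 80.0, 90.0  # percent-full thresholds; CRIT is illustrative

    usage = shutil.disk_usage("/")
    percent = usage.used / usage.total * 100

    if percent >= CRIT:
        print(f"DISK CRITICAL - {percent:.1f}% used")
        sys.exit(2)  # Nagios reads exit code 2 as CRITICAL
    elif percent >= WARN:
        print(f"DISK WARNING - {percent:.1f}% used")
        sys.exit(1)  # exit code 1 is WARNING
    print(f"DISK OK - {percent:.1f}% used")
    sys.exit(0)      # exit code 0 is OK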

7.3 Cacti

Using SNMP, Cacti tracks trends in server metrics such as bandwidth, memory usage, disk usage, and load averages. Cacti is a great resource for spotting spikes in load or memory usage and tracing them to problem scripts or services. With Cacti, you can not only see whether a server is overtaxed but also catch emerging problems before Nagios has to alert you at a critical juncture.
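
Each Cacti data source boils down to an SNMP poll of a numeric counter. The sketch below shows that poll using the net-snmp snmpget tool; the host and community string are placeholders, and the OID shown is the standard ifInOctets counter for interface 1:

    import subprocess

    def snmp_get(host: str, community: str, oid: str) -> str:
        # -Oqv prints only the value, which is what a grapher wants to store
        result = subprocess.run(
            ["snmpget", "-v2c", "-c", community, "-Oqv", host, oid],
            capture_output=True, text=True, check=True,
        )
        return result.stdout.strip()

    # ifInOctets for interface 1: the raw counter Cacti graphs as bandwidth
    print(snmp_get("server1.engin.example.edu", "public",
                   "1.3.6.1.2.1.2.2.1.10.1"))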

7.4 Snort

As added protection, Snort monitors the network for attacks. With Snort we can detect if a user on an ECS subnet is port scanning other computers or attempting an attack from ECS computing resources. Snort helps ensure that ECS network clients and servers are not being used to attack other campus machines or, worse, machines outside the ECS network.

8 Remote Login Cluster

While not a major component of the critical network infrastructure, the login cluster is used by students, staff, and faculty for tasks such as compiling code, editing files, and checking mail. Previously, old hardware and Solaris 8 caused many problems even for these simple tasks. The new login cluster uses PC machines as servers running Debian GNU/Linux rather than Solaris; they are more reliable and easier to maintain than the previous Sun Ultra 1 machines. The new machines also offer end-users more software than was previously available, and customization of system-wide configuration files has made tasks such as checking e-mail even easier. Because resources are better organized under OpenAFS and the new authentication setup, replacing a login cluster machine is also much easier than before.


9 User Account Management

9.1 Overview

As previously mentioned, user accounts were managed through a command-line Perl script. Its replacement is a completely new PHP-based web application that gives greater control over user accounts. Not only can administrators add, edit, and delete users; users can also log in to see their details and set options related to their account.

9.2 User Account

Any user with an ECS account is able to log into the 'my.engin' website and manage their account details to a certain extent. Features of the 'my.engin' web site for a user include:

• Configuring vacation auto-reply settings
• Setting up e-mail forwarding
• Requesting e-mail aliases and e-mail quota increases
• Changing the account password (updates LDAP, Kerberos, and Samba all at once; see the sketch after this list)
• Viewing account information (quota usage; e-mail aliases and forwards; web address)
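
The password change touches three credential stores at once. The actual 'my.engin' application is PHP, so the Python sketch below is only an illustration of the propagation it performs, using standard command-line tools with placeholder arguments, not the application's real code:

    import subprocess

    def change_password(username: str, new_password: str) -> None:
        # Update the Kerberos principal through the local admin interface
        subprocess.run(
            ["kadmin.local", "-q", f"cpw -pw {new_password} {username}"],
            check=True,
        )
        # Update the Samba hashes; -s reads the new password twice from stdin
        subprocess.run(
            ["smbpasswd", "-s", username],
            input=f"{new_password}\n{new_password}\n",
            text=True, check=True,
        )
        # The LDAP userPassword attribute would be updated the same way
        # (e.g., via ldappasswd or an LDAP modify operation).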

9.3 Administrator Account

The administrative users of 'my.engin' can perform many user management tasks, including some that were simply not available through the old command-line Perl script. Administrators are able to:

• Add and delete users (verified against the student Oracle database)
• View diagnostic information for users to verify their configuration and settings
• Check for user quotas approaching a high threshold
• Reset user passwords
• Edit user quotas, e-mail aliases, vacation status, forwarding, and more

10 PC Client Authentication

Prior to the infrastructure upgrades, Windows PC authentication was done through a pGINA plugin that contacted NIS. Now, through Samba, a valid Windows domain authenticates against LDAP. By integrating Samba with OpenAFS and Kerberos, per-user Windows home directories are mounted automatically; this could also be handled with the Kerberos and OpenAFS clients for Windows. CUPS was one of the peripheral services updated, as it was directly related to creating a Windows domain; it integrates well into the domain and provides load-balanced printing in labs where appropriate. With network printing and network file storage in place, the new infrastructure lets any generic Windows client be fully integrated. Windows domain administrators are organized through LDAP for easy configuration, and all computers that authenticate against Samba are added to LDAP automatically.

11 UNIX Lab

The Solaris UNIX lab received a major software upgrade even though the hardware remained the same. Not only was the UNIX client image upgraded from Solaris 8 to Solaris 10, it was also fully integrated with Kerberos and LDAP. When a user logs into the UNIX lab, the machine authenticates against LDAP and then retrieves a Kerberos token so that OpenAFS is mounted as the user's home directory. The UNIX lab provides the last component of a fully integrated environment, allowing a user to seamlessly access the same files from a Solaris lab machine, a Windows PC client, a Linux remote login cluster machine, or even a Mac OS machine brought from home (using the appropriate software). The imaging server was also upgraded, and the UNIX lab is now a fully Solaris 10 configuration. Beyond integration into the new infrastructure, the image upgrades included basic package management and new GUI choices to give users a better experience.


12 Data Backups

One briefly mentioned goal was to implement data backups, as the existing infrastructure had none. An unused tape-changer became the cornerstone of the new backup system. Using OpenAFS's built-in full and incremental backup functionality, tapes hold weekly full backups followed by daily incrementals until the next full. Configuration and system-critical files for each server are also committed daily to a Subversion repository on the network monitoring server; these backups help ensure that if a server crashes, a replacement can be built and configured in minimal time. Tape backups, plus the redundancy of RAID-5 configurations at all major data storage points, help ensure that data is rarely 'lost'. OpenAFS easily restores volume backups into place, so an accidental mail deletion from the week before need not affect a user dramatically. Backup tapes are kept in a different building from the servers for obvious security and threat reasons.
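
The nightly job amounts to cloning every volume and then dumping it to tape at the appropriate level. This is a rough sketch using OpenAFS's built-in tooling; the volume set and dump level names are hypothetical:

    import subprocess
    from datetime import date

    def nightly_backup() -> None:
        # Clone every volume into its .backup snapshot before dumping
        subprocess.run(["vos", "backupsys"], check=True)
        # Full dump on Sundays, incremental against the last full otherwise
        level = "/full" if date.today().weekday() == 6 else "/full/incremental"
        subprocess.run(["backup", "dump", "userdata", level], check=True)

    nightly_backup()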

13 Infrastructure Details

13.1 Operating Systems

The majority of new machines run Debian GNU/Linux: Apt provides simple binary package management and a minimal installation, an easy choice for servers with very specific service needs. The primary OpenAFS fileserver runs Gentoo Linux because of initial problems getting a working OpenAFS fileserver on Debian GNU/Linux. FreeBSD was used on the network monitoring server, providing a reliable and secure configuration that can be trusted as a hub of information about all the servers. The selection of FreeBSD came down to familiarity rather than overwhelming technical reasons, apart from the security enhancements of the TrustedBSD extensions.


13.2 Hardware

The majority of replacement servers are HP ProLiant machines. Configurations vary with each server's requirements, but a typical server has 80 GB of disk space, 1-2 GB of RAM, and gigabit ethernet. The OpenAFS fileservers are dramatically higher-end machines, featuring fiber channel cards, multiple gigabit-ethernet cards, hundreds of gigabytes of internal storage, and so on. All machines run Pentium 4 or Xeon processors.

13.3 Connectivity

As mentioned, gigabit ethernet was a major upgrade for each new machine over the previous 10/100 Mbit configuration. The main OpenAFS fileserver also has multiple SCSI cards for connectivity to the tape changer, as well as expandability for more storage, and it possesses a fiber channel card connecting to a RAID-5 Sun storage array. Currently only one ethernet interface is used, but the additional ones could support direct server-to-server fileserver connectivity. Of note, the network monitoring server also contains an analog modem so that SMS alerting of administrators could be added.

14 Implementation

14.1 User Credentials Migration

One of the largest parts of this infrastructure upgrade was migrating users from the old authentication system to the new ones (Kerberos/LDAP/Samba). Part of the challenge was ensuring that passwords created under the lax restrictions of DES and NIS would be usable in the new system. To publicize the transition and have users create new passwords where needed, a web site called 'migrate.engin' was created. It verified whether a user existed and, if so, forced them to create a new password if their current one did not meet a realistic set of requirements.
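
The actual password requirements were not published, so the check below is a hypothetical stand-in illustrating the kind of rule set 'migrate.engin' enforced:

    import re

    def meets_requirements(password: str) -> bool:
        # Illustrative policy: minimum length plus mixed character classes
        return (
            len(password) >= 8
            and re.search(r"[a-z]", password) is not None
            and re.search(r"[A-Z]", password) is not None
            and re.search(r"[0-9]", password) is not None
        )

    assert not meets_requirements("letmein")     # too weak: user must pick a new one
    assert meets_requirements("Upgrade2008ecs")  # acceptable under these rules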

21

Forcing users to migrate helped determine who had active accounts and who did not. It also made obtaining current passwords far simpler than waiting for users to arbitrarily log into a service and eavesdropping to essentially 'steal' them. The migration effectively had the vast majority of active users hold their spot, so to speak, in the new infrastructure before the deadline imposed on them.

14.2 User Data Migration

The old and new mail systems were incompatible, so a script had to be written to take a list of users, archive their existing mail, convert it into a format usable by the new mail server, and transfer that data onto the new server. All user home directory data also had to be transferred so that nothing a user expected would be missing after the transition. One aspect of home directory files intentionally skipped was 'dot' files (files beginning with a period, usually configuration files); by not archiving them, we helped ensure old settings would not mangle settings in the new infrastructure. The script to convert e-mail data and move both the converted e-mail and home directory data was deployed and executed successfully during the transition downtime.
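
Assuming the old UW-IMAP store was in mbox format and the Courier-IMAP target is Maildir (the document does not name the formats explicitly), the conversion step looks roughly like this; the paths are placeholders:

    import mailbox

    def convert_mail(mbox_path: str, maildir_path: str) -> int:
        src = mailbox.mbox(mbox_path)
        dst = mailbox.Maildir(maildir_path, create=True)
        count = 0
        for message in src:
            dst.add(message)  # rewritten into Maildir's one-file-per-message layout
            count += 1
        dst.flush()
        src.close()
        return count

    n = convert_mail("/old/mail/jdoe/inbox", "/afs/cell/user/jdoe/Maildir")
    print(f"converted {n} messages")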

14.3 Service Cut-Over

Launching the new infrastructure at approximately 5 a.m. on a Sunday morning kept user interruption to a minimum; any interruptions were likely limited to auto-connecting IMAP clients left open. As new infrastructure was brought online and tested, data was transferred where needed. Once all hardware and configurations were in place, DNS was pointed at the new servers, completing the migration from the old infrastructure components to the new ones.


14.4 Goals Reached

All initial goals were accomplished with the new infrastructure. Users had a very positive view of the new services and upgrades offered by ECS and felt very little discontent with the procedures they had to endure. Great lengths were taken over simple details, such as making sure e-mail forwards carried over from the old infrastructure, which helped provide a smooth transition. While there were tweaks to be made, the overall project went remarkably smoothly when implemented, even exceeding initial expectations.

14.5 Needed Improvements

As mentioned before, some services were not upgraded because they were not tied to this infrastructure upgrade. Web servers, SQL database servers, and other stand-alone appliances still need to be formally integrated and upgraded for compliance with the new infrastructure. The needed improvements generally relate to usability, reliability, and security.

14.6 Miscellaneous

Documentation covering back-up procedures, risk assessment, policies and procedures, and administrative notes was also created in the same time-frame as these infrastructure updates. The usefulness of these documents depends on how well they are maintained in the future. General installation documents were created for each server for emergency recovery situations.


15 Appendix

15.1 New Infrastructure Servers

joust.engin.umd.umich.edu - Debian GNU/Linux
  OpenAFS DB Server, LDAP Primary Server, Kerberos Primary Server

breakout.engin.umd.umich.edu - Debian GNU/Linux
  OpenAFS DB Server, LDAP Secondary Server, Kerberos Secondary Server

pitfall.engin.umd.umich.edu - FreeBSD
  Cacti, Nagios, Snort, Subversion

gravitar.engin.umd.umich.edu - Gentoo Linux
  OpenAFS DB Server, OpenAFS Fileserver, Personal Web Files

adventure.engin.umd.umich.edu - Debian GNU/Linux
  Postfix, Courier-IMAP, Apache (webmail)

klax.engin.umd.umich.edu - Debian GNU/Linux
  Postfix, Courier-IMAP, Apache (webmail)

threshold.engin.umd.umich.edu - Debian GNU/Linux
  OpenAFS DB Server, OpenAFS Fileserver, Samba, CUPS

my.engin.umd.umich.edu - Debian GNU/Linux
  Apache ('my.engin' web site)

cluster2.engin.umd.umich.edu - Debian GNU/Linux
  No services - File/shell access

cluster3.engin.umd.umich.edu - Debian GNU/Linux
  No services - File/shell access


15.2 Screenshots

15.2.1 my.engin - User Main


15.2.2 my.engin - Administrator User Diagnostic


15.2.3 Cacti

15.2.4 Nagios


16 References

16.1 General

Gentoo Linux, http://www.gentoo.org/
Debian GNU/Linux, http://www.debian.org/
Solaris, http://www.sun.com/software/solaris/

16.2 Authentication

OpenLDAP, http://www.openldap.org/
MIT Kerberos, http://web.mit.edu/Kerberos/
Samba, http://www.samba.org/

16.3 Network File Storage

OpenAFS, http://www.openafs.org/

16.4 E-Mail Services

Postfix, http://www.postfix.org/
Courier-IMAP, http://www.courier-mta.org/imap/
SquirrelMail, http://www.squirrelmail.org/
RoundCube, http://www.roundcube.net/
Mailman, http://www.gnu.org/software/mailman/
ClamAV, http://www.clamav.net/
DCC, http://www.rhyolite.com/dcc/
SpamAssassin, http://spamassassin.apache.org/

16.5 Network & Service Monitoring

Nagios, http://www.nagios.org/
Cacti, http://www.cacti.net/
Snort, http://www.snort.org/
