Deploying your way into happiness

Vladimir Vuksan @vvuksan and Jeff Buchbinder

Just kidding!

No really, happiness is possible

What is deployment?
The deployment of a ...., computer program, ... is its assembly or transformation from a packaged form to an operational working state. Deployment implies moving a product from a temporary or development state to a permanent or desired state.

http://en.wikipedia.org/wiki/System_deployment

Sounds exciting ...

http://www.flickr.com/photos/32214265@N00/1977622599

How it feels most of the time ...

http://www.flickr.com/photos/bobex_pics/3105613262

Why is deployment a problem?

– Poor communication and coordination between the dev and ops teams
– Poorly tested/QAed software
– Unaccounted-for differences between QA and production environments
– Overwriting state, logs, configuration and other transient software properties upon deployment, e.g. ad-hoc changes to conf files to "make things work" are not propagated to subsequent deployments

How does deployment improvement fit into DevOps?

– Deployment is one of the biggest interconnection points between ops and development
– Streamline deployment and 90% of the dev vs. ops issues go away #win
– Lets ops spend more time on cooler things :-) such as monitoring, performance etc.
– Lets devs spend more time on cooler things since they don't have to spend so much time dealing with deployment issues
– Quicker iteration between releases
– Enables users by letting them self-service

Environment background

– 6 distinct applications to deploy
– Java-based, using Tomcat/JBoss application servers
– Different release schedules for each of the apps
– Apps communicate with each other via web services => coordination headaches

How deployment used to be

Deployment done two ways

– Single BASH installer for core/base app

    ./installer -ver 5_40 -cvsup -servers snowhite,sleepingbeauty

– Other apps installed by hand
    – For other apps, configurations were "baked into" the binaries, i.e. need to do ant build_qa or ant build_prod to produce deployment binary archives

Frequent deployment issues

– Failure due to accidental commits into a branch
– Installer breaks randomly
– Baked-in configurations wrong (duh!)

How we did it

In the next couple of slides we'll cover

– Application design for deployment
– How to build releases
– Deployment tool
– How to deal with app-specific complexities
– How to automatically patch the DB

Application design

– Application should have sane default configuration options.
– Any config option should be overridable via an external file, e.g.

    include("default.conf");
    if ( is_file("$APP_HOME/conf/app.conf") )
        include("$APP_HOME/conf/app.conf");

– In most cases you only need to override database credentials (host, username, password).
– Goal is to be able to use the same binary across multiple environments and avoid baking configuration into binaries.
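The same default-plus-override idea can be sketched for plain key=value files in shell. This is only an illustration: the function name `get_conf` and the assumption that both files hold one key=value pair per line are not from the original tool.

```shell
# Hypothetical helper (not part of the original deployer):
# print the effective value of a config key.
# The override file is consulted first; the defaults file is the fallback.
# Usage: get_conf KEY DEFAULTS_FILE OVERRIDE_FILE
get_conf() {
    local key="$1" defaults="$2" override="$3" f val
    for f in "$override" "$defaults"; do
        [ -f "$f" ] || continue
        # If a key is repeated inside one file, the last line wins.
        val=$(grep "^${key}=" "$f" | tail -n 1)
        if [ -n "$val" ]; then
            # Strip everything up to and including the first "=".
            printf '%s\n' "${val#*=}"
            return 0
        fi
    done
    return 1   # key not defined anywhere
}
```

With a defaults file containing dbUser=app and an override file containing dbUser=location, `get_conf dbUser default.conf app.conf` prints location, while keys that are not overridden fall through to the defaults.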

Application design cont'd

– Application should expose key internal metrics, e.g. JMSenqueue=OK etc.
    – This is important because there are lots of things that can break inside the application which external monitoring may miss, like a JMS message that can't be enqueued, etc.
– Keep release notes actions to a minimum.
    – Instructions are often not followed or only partially followed.
    – Avoid them whenever possible
    – Automate as much as you can
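A monitoring check could consume such metrics along these lines. The sketch assumes the application prints one key=value status per line, as in JMSenqueue=OK; the function name `failed_checks` and any key other than JMSenqueue are hypothetical.

```shell
# Hypothetical check: read key=value status lines on stdin and print
# the keys whose value is anything other than OK.
# Exit status is 1 if at least one key failed, 0 otherwise.
failed_checks() {
    awk -F= 'NF >= 2 && $2 != "OK" { print $1; bad = 1 }
             END { exit bad + 0 }'
}
```

Fed the two lines JMSenqueue=OK and dbPool=FAIL, the function prints dbPool and returns a non-zero status that a monitoring system can alert on.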

Building releases

– Developers are in charge of building and packaging releases. QA or Ops will not know what to do if a build fails (this is Java, remember)
– Each release has to be clearly labeled with the version and tagged in the repository.
– The archive file name needs to contain the same version, e.g. Location Server 1.1.5 will be packaged as location1.1.5.tar.gz.
– Archives contain only WAR (Tomcat) or EAR (JBoss) files and DB patch files.
– Releases are to be uploaded into an appropriate file share, i.e. /share/releases/location.

Deployment tool data model

– Supports multiple applications
    – can use different app server containers, i.e. Tomcat/JBoss
    – can have configuration files that can be either key/value pairs or templates
    – every application has a start and stop script
– Has notion of unique domains/customers
    – single dashboard that allows deployment to multiple environments, e.g. QA staging (current release), QA development (next scheduled release), Dev playbox, etc.
    – each of these domains has its own set of applications it can deploy, with its own domain-specific configuration options

Application example

Location server

– Needs Tomcat container
    – Tomcat 5.5.25 for versions < 1.1.0
    – Tomcat 6.0.24 for versions > 1.1.5
– Configuration override file found in /conf/location.conf, which contains key/values, e.g.
    dbUser=location
– $APP_HOME/start.sh -> startup script template
– $APP_HOME/stop.sh -> stop script template
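The version-to-container rule above could be encoded roughly as follows. Both function names are hypothetical, the comparison assumes three-part numeric versions, and the sketch deliberately refuses to guess for versions 1.1.0 through 1.1.5, which the slide does not cover.

```shell
# Hypothetical helper: succeed if dotted version $1 is strictly lower than $2.
# Assumes versions of the form major.minor.patch with numeric parts.
version_lt() {
    [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" |
         sort -t. -k1,1n -k2,2n -k3,3n | head -n 1)" = "$1" ]
}

# Hypothetical helper: print the Tomcat version for a Location server release.
tomcat_for() {
    if version_lt "$1" 1.1.0; then
        echo 5.5.25
    elif version_lt 1.1.5 "$1"; then
        echo 6.0.24
    else
        return 1   # range not specified on the slide, so do not guess
    fi
}
```

Comparing the parts numerically rather than as strings matters once a component reaches two digits: 1.10.0 is treated as newer than 1.1.5.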

Deployment tool aka Deployer

– A 500+ line command-line PHP script that is able to deploy, stop, start and undeploy services => intended for batch operations, mass upgrades, etc.
– Separate web GUI that interfaces with the command-line utility + allows easy config changes => used largely by QA/dev
– Deployer is mostly used for prep of the software base, i.e. building up the file structure, overlaying files, creating configuration files. Avoids app-specific complexities
– App-specific customizations are implemented via BASH shell scripts (not a requirement)
    – Keeps core deployer simple
    – Easier to troubleshoot smaller BASH files

Deployer web GUI

Pick a version

Override default values using these

Deployer example

Commonly invoked this way

    $ deployer --version 1.2.5 --server web10 --domain joedev --app base --action deploy

Results in the following actions

1. Unpack the proper app server container, e.g. jboss-4.2.3.tar.gz, to /prep_dir
2. Overlay/untar WAR/EAR files for the named version, e.g. base-1.2.5.tar.gz
3. Build configuration files and scripts
4. Stop the server on the remote box, i.e. ssh web10 /run/ba_base/stop.sh
5. Rsync the contents of the packaged release /prep_dir -> /run/ba_base
6. Make sure Apache AJP proxy is configured to proxy traffic and execute apache reload
7. Start up the server, i.e. ssh web10 sudo -u ba_base /run/ba_base/start.sh
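The remote half of that sequence (stop, sync, start) can be sketched as a dry run that only prints the commands in order. This is not the actual Deployer code: the function name `deploy_plan` and the `rsync -a` flag are assumptions, and the Apache reload step is left out because the slide does not give its command.

```shell
# Hypothetical dry run: print the remote commands for one deployment
# without executing anything.
# Usage: deploy_plan SERVER RUN_DIR RUN_USER PREP_DIR
deploy_plan() {
    local server="$1" run_dir="$2" user="$3" prep_dir="$4"
    # Stop the running instance before touching its files.
    echo "ssh $server $run_dir/stop.sh"
    # Copy the prepared tree over the run directory (-a is an assumed flag).
    echo "rsync -a $prep_dir/ $server:$run_dir/"
    # Start again under the application's own user id.
    echo "ssh $server sudo -u $user $run_dir/start.sh"
}
```

Printing the plan first makes it easy to review what a batch run would do before letting it loose on many servers.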

DB Patching

– "Automatically" patch the DB => no human/ops intervention
– Every application has a table called Patch (single column) with a list of DB patches that have been applied.
– Every app has a dbpatches directory in the app archive which contains patches named with the version and the order in which they should be applied, e.g.
    – 2.54.01-addUserColumn.sql
    – 2.54.02-dropUidColumn.sql
    Apparent which goes first
– During deployment the startup script compares the contents of the Patch table with the list of dbpatches and applies any missing ones.
– If the patch script fails, e-mail is sent to the QA or dev in charge of the particular domain
– All this is done using a 70-line BASH shell script => no magic/complexity
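The comparison step can be sketched as below. This is not the original 70-line script: the function name `missing_patches` is hypothetical, and it assumes the Patch table stores patch file names and has been dumped to a text file with one name per line.

```shell
# Hypothetical helper: list the patches in a dbpatches directory that are
# not yet recorded as applied, in the order they should be run.
# Usage: missing_patches DBPATCHES_DIR APPLIED_LIST_FILE
missing_patches() {
    local dir="$1" applied="$2" f name
    # The shell expands the glob in sorted order, so zero-padded names
    # such as 2.54.01-... and 2.54.02-... come out in apply order.
    for f in "$dir"/*.sql; do
        [ -e "$f" ] || continue      # directory contains no patches
        name=$(basename "$f")
        # -x: match the whole line, -F: treat the name as a fixed string
        grep -qxF "$name" "$applied" || printf '%s\n' "$name"
    done
}
```

Each name printed would then be run against the database and recorded in the Patch table. Because the order comes purely from the file names, the numbering has to stay zero-padded.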

Production deployment prep

– Usually a day or so before the deployment is scheduled, have a conference call between dev, QA and ops to discuss the deployment and anything to look out for.
– Apply any configuration changes that can be done ahead of time, i.e. add new config options
– Upgrade the staging environment by importing the prod DB, then upgrade to the desired release and QA it.

Actual deployment

– Paper trail -> the day before the release, QA opens a ticket listing all the applications and versions that need to be deployed.
– On the morning of the deployment (that was our low-traffic time), someone from ops, someone from development and the whole QA team engage in deploying the app and resolving any observed issues.

Other pointers

Sizing QA environment

– Create a couple
    – Staging QA => same version as production
    – Dev QA => next scheduled release
– Should run on underpowered hardware (virtualized is a good choice). It is almost impossible to simulate production load, so constrained hardware gives you valuable data, e.g.
    – We discovered a number of major flaws when our virtualized machines ran out of disk space
    – Easier to spot higher CPU utilization on a 1-core box than on a 32-core box
– Want to load test? Use a performance environment that is similar to prod but used solely for load testing

Precreate environments

– Create users/environments/network info ahead of time
    – Avoids provisioning problems, resource locks, etc.
– Run different applications under their own user id => avoids accidentally stopping the wrong app
– Use a unique identifier. Easy way => 2/3/4 letter codes, i.e. aa, ab, ac ....
– Create all the necessary user ids for every environment/application combination, i.e. aa_msg, aa_base, aa_loc
– Precreate configurations, DBs/usernames (without schemas)
– Need a new environment => assign the already created environment to the new user/group, i.e. salesdemo_joe => af, dev_jane => aj. Deploy.
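Generating the user id for every environment/application combination is a simple nested loop. The function name `env_users` is hypothetical; the codes and app suffixes follow the examples above.

```shell
# Hypothetical helper: print one user id per environment/application pair.
# Usage: env_users "ENV_CODES" "APP_SUFFIXES" (both space-separated)
env_users() {
    local envs="$1" apps="$2" e a
    # Unquoted on purpose: split the lists on whitespace.
    for e in $envs; do
        for a in $apps; do
            printf '%s_%s\n' "$e" "$a"
        done
    done
}
```

For example, `env_users "aa ab" "msg base loc"` prints aa_msg, aa_base, aa_loc, ab_msg, ab_base and ab_loc, one per line. Each name can then be handed to whatever tool creates accounts on your systems.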

Multiple environments on same hardware w/o virtualization

– Assign customer-specific port ranges, then use application-specific offsets, e.g.
    – Customer aa => start_port 10100, ab => start_port 10200
    – Messaging app offset +10
    – Location app offset +20
    – Result: aa_msg can use 10110-10119 (although usually only 3 ports are necessary)
– Precreate these when precreating environments
– Alternatively, specify customer-specific IP ranges, then use those. Private IPs are cheap.
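The port arithmetic is small enough to show directly. The function name `port_base` is hypothetical, and the only values wired in are the ones from the example above (aa and ab start ports, messaging and location offsets).

```shell
# Hypothetical helper: print the first port of the block assigned to a
# customer/application pair (start port + application offset).
# Usage: port_base CUSTOMER_CODE APP
port_base() {
    local customer="$1" app="$2" start offset
    case "$customer" in
        aa) start=10100 ;;
        ab) start=10200 ;;
        *)  return 1 ;;      # unknown customer code
    esac
    case "$app" in
        msg) offset=10 ;;    # messaging app
        loc) offset=20 ;;    # location app
        *)   return 1 ;;     # unknown application
    esac
    echo $((start + offset))
}
```

`port_base aa msg` prints 10110, the first port of the 10110-10119 block mentioned above. In practice the two lookup tables would live in the deployer's configuration rather than in code.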

Hostname naming

– Avoid theme names, i.e. planet/star names, cartoon characters etc.
– Pick functional names, i.e. db1, qadb1, web1, app1, mail1 etc.
– Why?
    – @vvuksan Agree. You shouldn't have to have this mental conversation at 3am: "Was the web server on Wolverine or Cyclops? Shit, maybe Beast." (http://twitter.com/markimbriaco/status/19498142304)
    – @markimbriaco through acquisition I got 300 machines named after types of drink. In Finnish. Was hell :) (https://twitter.com/ripienaar/status/19498821560)

Questions?