

Digital Spotlight

Introduction
Why devops’ time has come
Elements of devops
How Etsy makes Devops work
Resources

FALL 2014


Devops :: FALL 2014

More! Faster!

Devops helps ops keep up


In the decade before this one, software developers were told to look
for another profession,
because all the coding jobs were headed
offshore to India and China.
My, how times have changed. The unemployment rate for developers in the
US is under 3 percent and competition
for top programming talent has never
been fiercer. No wonder: Web and mobile apps that engage customers and
partners have become table stakes for
businesses everywhere, while fresh software platforms abound, from smart TVs
to Hadoop to whatever the Internet of
things cooks up next. Plus, software must
now be updated continuously to keep
pace with accelerated change.
Devops provides the foundation to
meet this wildly accelerated demand. It
also serves the needs of agile development methodology, which raises software
quality by stipulating shorter dev
cycles and continual adjustment based
on feedback from stakeholders. Both a
philosophy and a set of automation tools,
devops enables operations – and in some
cases developers themselves – to set up
dev and test environments on demand
using a software-defined, cloudlike infrastructure.
In this Digital Spotlight, we explore
devops’ agile development roots and
examine the major types of tools and
techniques to build modern dev, test,
and deployment environments. In addition, we present an in-depth interview
with the vice president of operations for
Etsy, a forward-looking e-commerce
company that has built and
maintained a successful devops environment for years. We hope you find this
original content useful in pursuing your
own devops strategy.
—Eric Knorr, Editor in Chief

Introduction
By Eric Knorr

Why devops’ time has come
With vastly increased demand for new code,
enterprises can no longer afford long, slow
development cycles. Devops provides the
acceleration. By Eric Knorr

Elements of devops
Devops is a little bit of philosophy and a lot of tools.
Here’s how those tools help boost the efficiency of
the entire application development lifecycle.
By Martin Heller

How Etsy makes Devops work
Etsy, which describes itself as an online
“marketplace where people around the world
connect to buy and sell unique goods,” is often
trotted out as a poster child for Devops. The
company latched onto the concepts early and
today is reaping the benefits as it scales to keep
pace with rapid business growth. By John Dix

Resources



Why devops’ time has come

With vastly increased demand for new code, enterprises can no longer afford long, slow development cycles. Devops provides the acceleration.

By Eric Knorr

Devops mashes together development and operations into a single term. It does not, however, mean the unification of the two, which are different disciplines involving different skills and cultures.

Does devops mean dev and ops should at least understand each other’s needs better? Sure, although the idealism of that notion has faded since the devops movement began five years ago. Although the goal is to shorten the time-to-production of applications that meet or exceed expectations, devops has no direct effect on how well developers write code. Instead, devops adds automation and streamlines workflow throughout the entire cycle with tools such as application lifecycle management, configuration management, release automation, automated testing, and application performance monitoring.

More importantly, devops underscores that enterprises now recognize the need for quicker deployment of more and better applications. Why? Because Web and mobile applications have become essential to connect with customers and partners and to capture their preferences and needs. Because new platforms – from cars to TVs to smartwatches – keep emerging, thanks to the Internet of things, with no end in sight. Because organizations now realize there’s no such thing as “one and done” with applications; you need to improve them and add capabilities continually.

These ideas are not new. In fact, they originate with agile development, a methodology concocted more than a dozen years ago. Only recently, however, has a constellation of technologies gathered to support devops, enabling developers to build, test, and deploy modularly. It also allows stakeholders to review applications in progress, provide feedback, and change direction if necessary.

Devops’ underlying technologies, such as PaaS (platform as a service) and configuration management (e.g., Puppet and Chef), are relatively new. But older technologies for testing and deployment have been integrated as well.

Agile is as agile does

To understand the appeal of devops, you need to go back to the original Agile Manifesto, published in 2001. Agile development was conceived as an antidote to waterfall methodologies, which progress in linear fashion through a series of stages, such as feasibility, requirements, program specifications, external design, coding, testing, and (finally) production.

The waterfall method demanded that stakeholders compose highly detailed functional requirements up front. These would essentially be thrown over the transom to developers, who would create their own technical specifications and build away until the project was complete. Often, the result wouldn’t be what stakeholders wanted, simply because descriptions are open to misinterpretations, and no one could anticipate design flaws that might emerge along the way. Many waterfall projects failed or left users dissatisfied.

The crew that wrote the Agile Manifesto had experienced these frustrations first hand. Here are their 12 principles, which together changed application development forever:

Our highest priority is to satisfy the customer through early and continuous delivery of valuable software.
Welcome changing requirements, even late in development. Agile processes harness change for the customer’s competitive advantage.
Deliver working software frequently, from a couple of weeks to a couple of months, with a preference to the shorter time scale.
Businesspeople and developers must work together daily throughout the project.
Build projects around motivated individuals. Give them the environment and support they need and trust them to get the job done.
The most efficient and effective method of conveying information to and within a development team is face-to-face conversation.
Working software is the primary measure of progress.
Agile processes promote sustainable development. The sponsors, developers, and users should be able to maintain a constant pace indefinitely.
Continuous attention to technical excellence and good design enhances agility.
Simplicity – the art of maximizing the amount of work not done – is essential.
The best architectures, requirements, and designs emerge from self-organizing teams.
At regular intervals, the team reflects on how to become more effective, then tunes and adjusts its behavior accordingly.

Agile development is all about change: faster time to market, smaller and more frequent builds, a welcoming attitude toward new requirements. But all that change creates gobs of work for operations, to the point where some argue that ops’ inability or reluctance to keep up has prevented agile methodology from realizing its potential.

That’s where devops comes in. It draws on a broad set of capabilities across tools traversing the entire development cycle in order to automate change as much as feasibly possible. Underlying all of devops, however, is the imperative to configure dev and test environments to order quickly and with minimal fuss.

Software-defined infrastructure

Devops runs parallel to a larger trend in enterprise IT: cloud computing. Although the cloud comes in many shapes and sizes, the basic idea is that compute, storage, and network resources can be configured and scaled on demand – without admins scurrying around to manually provision hardware infrastructure.

Before large Web applications came into play, such automation was seldom required. Aside from a few Perl scripts, configuring a handful of physical hosts for dev and test was not an enormous burden. But today, for Web or mobile apps that may suddenly spike to millions of users, we need scalability – and the virtualization of compute, storage, and network resources that underlies the cloud offers the means to deliver that scalability to dev and test as well as to production.

Add the iterative cycles of agile development, along with business demands for more and better apps, and it’s no wonder enterprise operations are turning to the cloud automation techniques first pioneered by Google and Amazon.

Devops does not need a full-blown IaaS (infrastructure as a service) cloud to function, but it does need some measure of that sort of automation. Such configuration management tools as Puppet, Ansible, Chef, and Salt enable ops to script their own cloudlike functionality from the ground up. So-called “private cloud” offerings from the likes of Citrix, Microsoft, OpenStack, and VMware go further, offering clouds that approach the robustness of, say, Amazon Web Services EC2 or Google Compute Engine.

Further up the stack, PaaS (platform as a service) offerings ride on top of IaaS to provide dev, test, and deployment options similar to those of an application server, only with IaaS scalability and support for multiple programming languages.

Ideally, operations sets up automated environments that enable developer self-service: A developer fills out a Web form and quickly obtains the dev and test environment he or she needs. Behind the scenes, servers spin up and a database containing a snapshot of the data relevant for the application comes to life.

Some wonder whether operations might be automating itself out of existence. At one point, Adrian Cockcroft of Netflix caused a stir by coining the term “no-ops” to describe certain aspects of the company’s development cycle. Many so-called “full-stack” developers pride themselves on handling all aspects of the dev, test, and deployment lifecycle. But most software development managers see configuring infrastructure as a poor use of developers’ time.

Whether in the public cloud or in the data center, configuring infrastructure is not a trivial task, as many EC2 customers can testify. Software-based configuration, which is the essence of the cloud, is making such tasks a whole lot easier. IaaS-style functionality enables operations to be vastly more efficient – and thus remain essential to the organizations that employ them.

A two-way street

Although the emphasis of devops is on increasing operational efficiency to support application development, devops also gives developers greater opportunity to test as they go in an environment that mirrors that of production, even though it requires them to learn tools and procedures intended for operations. Developers have their own sets of tools – as well as responsibilities that extend beyond just slinging code. It’s a lot harder for developers to say, in response to something blowing up in production: “Hey, it worked fine when I tested it.”

One of the healthiest aspects of the devops trend is its emphasis on agility in service of business objectives. The first principle of agile development is a commitment to software quality – not beautiful code for the ages, but applications that meet the needs of stakeholders, and evolve as requirements evolve. You could say that this motivation is the driving force behind devops.

The embrace of constant change runs afoul of legacy processes. Without devops, the whole methodology of smaller and more frequent builds and quick response to stakeholder feedback can be more theoretical than real. Agility, as cliché as that word may have become, is still the greatest single benefit IT can bring to business.
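The developer self-service idea described under “Software-defined infrastructure” (a Web form goes in, a dev or test environment comes out) can be pictured with a small Python sketch. Everything in it is an illustrative assumption rather than any vendor’s API: the `EnvironmentRequest` fields, the `provisioning_plan` function, and the step wording are hypothetical, and the function only lists the steps that automation would carry out instead of provisioning anything.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvironmentRequest:
    """What a developer might enter in a self-service Web form (hypothetical fields)."""
    app_name: str
    purpose: str               # "dev" or "test"
    server_count: int = 1
    needs_database: bool = True


def provisioning_plan(request: EnvironmentRequest) -> list[str]:
    """Turn a request into the ordered steps that automation would run."""
    if request.purpose not in ("dev", "test"):
        raise ValueError(f"unsupported purpose: {request.purpose!r}")
    if request.server_count < 1:
        raise ValueError("at least one server is required")

    prefix = f"{request.app_name}-{request.purpose}"
    # One step per virtual server, numbered from 1.
    steps = [
        f"create virtual server {prefix}-{n}"
        for n in range(1, request.server_count + 1)
    ]
    steps.append(
        f"apply configuration for {request.app_name} to {request.server_count} server(s)"
    )
    if request.needs_database:
        # Mirrors the article's "database containing a snapshot of the data".
        steps.append(f"restore data snapshot into database {prefix}-db")
    steps.append(f"report connection details for {prefix} to the developer")
    return steps


if __name__ == "__main__":
    for step in provisioning_plan(EnvironmentRequest("shop", "test", server_count=2)):
        print(step)
```

In a real shop each step would presumably be handed to a configuration management tool or a cloud API. The point of the sketch is only that the request is data, and the plan is derived from it without a human in the loop.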


Elements of devops

Devops is a little bit of philosophy and a lot of tools. Here’s how those tools help boost the efficiency of the entire application development lifecycle.

By Martin Heller

Once upon a time, there was a developer who needed to write code against a database. So he asked the database administrator for access to the production database.

“Oh, dear me, no,” said the DBA. “You can’t touch our data. You need your own database. Ask operations.”

“Oh, dear me,” said the operations manager. “We don’t have a spare Oracle license, and it would take six months to get you that and the server on which to run it. But I’ll do what I can.”

You can see where this is going. You can even hear “bwahaha” after each answer. The developer eventually got a dressing down in a weekly meeting, and the DBA and ops manager were unusually silent and tried not to look at the developer. The developer left for a startup, became a black-hat hacker, and/or (horrible to tell) became a manager in various alternate universes.

What if the developer could have spun up a virtual machine already configured with trial versions of the correct operating system, the correct database, the correct table and index schemas, and syntactically valid test data? And what if all of this happened under the control of a configuration file and scripts while he brewed and drank a cup of coffee? How “agile” would that be?

Enter devops.

[Figure: Developer workflow. Defect manager work item; refresh tickets; analyze problem; check out code from repository; code solution; debug; test; send document set to repository.]
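The “what if” in this story, a database with the right table and index schemas and syntactically valid test data created by a script, can be tried on a small scale. The sketch below is an assumption-laden stand-in: it uses Python’s built-in `sqlite3` module with an in-memory database instead of the Oracle instance in the story, and the `customers` table, its columns, and the `build_test_database` function are invented for illustration.

```python
import random
import sqlite3

# Hypothetical schema: one table plus one index, standing in for
# "the correct table and index schemas".
SCHEMA = """
CREATE TABLE customers (
    id      INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    email   TEXT NOT NULL UNIQUE,
    balance REAL NOT NULL CHECK (balance >= 0)
);
CREATE INDEX idx_customers_name ON customers (name);
"""


def build_test_database(row_count: int, seed: int = 0) -> sqlite3.Connection:
    """Create a throwaway database and fill it with valid, made-up rows."""
    rng = random.Random(seed)           # seeded so every run produces the same data
    conn = sqlite3.connect(":memory:")  # nothing touches a production system
    conn.executescript(SCHEMA)
    rows = [
        (f"Customer {n}", f"customer{n}@example.com", round(rng.uniform(0, 500), 2))
        for n in range(1, row_count + 1)
    ]
    conn.executemany(
        "INSERT INTO customers (name, email, balance) VALUES (?, ?, ?)", rows
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    db = build_test_database(25)
    print(db.execute("SELECT COUNT(*) FROM customers").fetchone()[0], "test rows ready")
```

Because the schema constraints reject invalid rows, the generated data is valid by construction, and the developer never needs access to production data.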

Basically, devops offers a big box of tools that automate their way around requests that used to result in “no” for an answer. Developers get what they need to do their jobs, and operations can hold up their end of the bargain without too much trouble.

These tools can be divided into sets that support each step in the application development lifecycle, from coding to integration to deployment to monitoring to bug reporting.

Developer tools

For a developer, working life revolves around a development environment. That has several pieces, which might be integrated or might be a selection of independent tools.

Existing code lives in a repository, such as Git or Team Foundation Server (TFS), and the developer’s first task every day (after the daily stand-up meeting that agile organizations hold first thing) is to check out or clone all the code of interest from the shared repository. In an ideal world, nobody else’s check-ins or pushes would have an impact on the developer’s code, because everybody’s code would already be merged, integrated, and tested. In the real world, that won’t always be the case, and merging, integrating, and testing yesterday’s changes might be the second order of business.

[Figure: Integration and deployment workflow. Code check-in; continuous integration server; build server; test runner; pass? If no, defect report; if yes, deploy and promote from development server to QA server to staging server to production server, with defect reports from each stage.]

By the same token, looking at the defect manager (be it Bugzilla, JIRA, Redmine, TFS, or any other tracker) and addressing any “tickets” (bug reports or task assignments) is the next order of business. The developer will either continue yesterday’s project, or shelve that and handle a higher-priority ticket.

An IDE such as Eclipse or Visual Studio often has a window into the defect manager, or possibly even deeper ties, but at the very least the developer has a browser tab open to view his or her tickets. IDEs often integrate tightly with repositories, but at the very least the developer has a command-line console open for check-ins and check-outs. And to complete the triangle, bug trackers often integrate with source code repositories.

From the developer’s point of view, the code editor is usually the core component of an IDE. The very best code editors for devops purposes show you the repository status of the code you’re examining, so you can tell immediately if you’re looking at outdated source code. They’ll also refresh your copy before you introduce merge conflicts. It also helps if the code editor knows about the syntax of the language, so that it can flag errors in the background during coding and highlight the syntax with colors to help developers visually confirm that, for example, what they intended to be the name of an already-defined variable is correct.

Developers’ build tools depend on the programming language(s) they’re writing in, but in the case of compiled languages, developers want to be able to fire off builds from the IDE and capture the errors and warnings for editing purposes.

In an ideal world, all code would be perfect. In the real world, there is no such thing as perfect code – the closest we can come is code that doesn’t have any known bugs. When developers write and test code, they often spend the majority of the day running a debugger. When they are in an organization that has implemented devops, they often have the luxury of debugging in a virtualized environment that faithfully reflects the production environment. Without that, developers may have to use stub code to represent server actions or have local databases stand in for remote databases.

Test runners help developers run their unit tests and regression tests easily and on a regular basis. Ideally, the testing framework integrates with the IDE and any local repository, if there is one, so that any new code can be tested immediately after check-in, while the developer has the design firmly in mind. The developer’s tests should flow into the code integration environment through the shared repository, along with the source code that the developer has debugged and tested.

Code integration tools

Code integration tools take the code in a shared repository, build it, test it, and report on the results. This is often done using a continuous integration server, such as Jenkins, which will tie into automated build tools, automated test runners, automated reporting via email and defect managers, and actions on the repository. For example, if the build succeeds and all tests pass, all the current source code and built libraries and executables can be tagged with the current build number in the repository. If critical tests fail, the relevant check-ins can be backed out of the shared repository and returned to the responsible developer(s) for bug fixes.
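The decision a continuous integration server makes after a build (tag the repository when everything passes, back out check-ins when critical tests fail) can be written down as a small pure function. This is a minimal sketch under stated assumptions: `CheckResult`, `integration_outcome`, and the action strings are hypothetical names chosen for illustration and are not part of Jenkins or any other CI product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckResult:
    """Outcome of one automated test (hypothetical structure)."""
    name: str
    passed: bool
    critical: bool = False


def integration_outcome(build_number: int, build_ok: bool,
                        results: list[CheckResult]) -> dict:
    """Decide what the integration server should do with this build."""
    failed = [r.name for r in results if not r.passed]
    critical_failed = [r.name for r in results if not r.passed and r.critical]

    if not build_ok:
        action = "report build errors"
    elif critical_failed:
        # Critical failures send the check-ins back to their authors.
        action = "back out check-ins"
    elif failed:
        action = "report test failures"
    else:
        # Build succeeded and every test passed: tag sources and binaries.
        action = f"tag build-{build_number}"
    return {"action": action, "failed_tests": failed}


if __name__ == "__main__":
    outcome = integration_outcome(
        42, True,
        [CheckResult("unit", True), CheckResult("smoke", True, critical=True)],
    )
    print(outcome["action"])
```

Keeping the decision separate from the build and test machinery makes the policy easy to review and to test on its own, whatever tool ends up executing it.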

Some projects implement continuous integration for every code push, if the incremental build time is small. In other projects, a delay is introduced after a code push so that multiple pushes can be combined into the next build. Most projects, whether or not they use automatic builds and tests, and whether or not they integrate after code pushes or on demand throughout the day, also run nightly “clean” builds and tests, often on freshly provisioned test environments.

Deployment tools and environments

If the continuous integration server is set up to deploy builds, after they pass all tests, it will often rely on software deployment and configuration management tools. These often vary depending on the run-time platform and the additional infrastructure. On the other hand, some configuration management tools – such as Ansible, Chef, Puppet, Salt, and Vagrant – work across a wide range of platforms by using widely supported languages. Ansible and Salt are Python-based systems; Chef, Puppet, and Vagrant are Ruby-based.

Ansible takes recipes in YAML and manages nodes over SSH.

Chef uses a Ruby domain-specific language for its configuration recipes and uses an Erlang server as well as a Ruby client.

Puppet uses a custom declarative language to describe system configuration. Puppet usually uses an agent/master architecture for configuring systems, but it can also run in a self-contained architecture. There are more than 2,500 predefined modules listed in the Puppet Forge.

Salt, originally a tool for remote server management, has evolved into an award-winning open source, cloud-agnostic configuration management and remote execution application. Salt can manage and deploy Linux, UNIX, Windows, and Mac OS X systems, and it can orchestrate resources in any clouds.

Vagrant is a specialized configuration management tool for development environments, which acts as a wrapper for VirtualBox, VMware, and other virtual machine managers. Vagrant takes the sting out of reproducing configuration-dependent bugs.

There are two kinds of VMs: system VMs, such as VMware, and process VMs, such as the Java Virtual Machine. For the purposes of deployment tools, we are interested in system VMs, in which we can deploy a PaaS, such as Cloud Foundry, or a server application, such as DB2. VMs can be deployed on dedicated server hardware, either on-premise or off-premise, or on an IaaS cloud.

System VMs offer excellent software isolation, at the expense of incurring some fairly heavyweight hypervisor overhead and using a lot of RAM. Various hypervisors and IaaS infrastructures offer differing amounts of load isolation and differing algorithms for allocating excess CPU capacity to VMs that need it. Software containers such as Docker offer good-enough software isolation in most cases, with much less overhead than VMs.

While Docker is the current media darling of the software container space, and most relevant vendors have signed on to support it, Docker is rather new and not yet universally supported. Docker can work independent of PaaS systems and can greatly simplify deployment for devops. Docker can make multiple clouds look like one big machine, and it can be used for build automation, continuous integration, testing, and other devops tasks. While Docker began as a Linux-only solution, it recently gained support for Windows as well.

PaaS (platform as a service) occupies an interesting niche in the cloud ecosystem. It’s basically a dev, test, and deployment platform that sits on top of IaaS (infrastructure as a service). PaaS can be deployed on premises or offered as a service by a public cloud provider. For example, the Pivotal Cloud Foundry PaaS can be deployed on premises on top of VMware’s version of a private cloud, or it can run in a public IaaS cloud such as Amazon EC2.

PaaS includes infrastructure, storage, database, information, and process as a service. Think of PaaS as providing computers, storage, databases, information streams, and business processes or meta-applications, all tied up in one “stack” or “sandbox.” Where a PaaS adds value over IaaS is to automate all of the provisioning of resources and applications, which can be a huge time saver.

All PaaS systems with which I am familiar wrap applications in software containers. For example, Cloud Foundry runs built and packaged applications, called “droplets,” in Droplet Execution Agents, which use Warden Linux containers for isolation. OpenShift runs applications in containers called gears, and uses SELinux for gear isolation.

In the grand scheme of a software lifecycle, each feature moves from design to development to testing to staging to production, while bug reports feed back to the developers for triage and fixes at each stage. For products that are released yearly, moving from one stage to another can be a manual process. For agile products that are released weekly or biweekly, release management is often automated. In turn, teams need to automate their tests, bug tracking, building, packaging, configuration, and promotion processes. Part of what needs to be automated is the release process management.

Runtime monitoring tools

Acceptance testing for products usually includes performance testing, which may go all the way up to full-blown load testing with realistic user profiles. Even so, application performance can change in production for a number of reasons: a spike in usage (Black Friday, anyone?), a memory leak that manifests over time, a bad spot on a disk, an overloaded server, or an ill-considered database index that slows down updates after its underlying table gets big.
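Production drift of this kind tends to show up first in measured response times. As a rough illustration, the sketch below compares recent response-time samples with a baseline captured during acceptance testing and flags a regression. The function names (`percentile_95`, `response_time_regressed`), the nearest-rank percentile method, and the 1.25 tolerance factor are all assumptions made for the example, not a description of any monitoring product.

```python
def percentile_95(samples: list[float]) -> float:
    """Return the 95th percentile using the nearest-rank method."""
    if not samples:
        raise ValueError("at least one sample is required")
    ordered = sorted(samples)
    # Ceiling of 0.95 * n, computed with integers to avoid float rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def response_time_regressed(baseline_ms: list[float], recent_ms: list[float],
                            tolerance: float = 1.25) -> bool:
    """True when the recent p95 exceeds the baseline p95 by the tolerance factor."""
    return percentile_95(recent_ms) > tolerance * percentile_95(baseline_ms)


if __name__ == "__main__":
    baseline = [100.0] * 19 + [200.0]      # samples from acceptance testing
    recent = [100.0] * 10 + [400.0] * 10   # samples from production
    if response_time_regressed(baseline, recent):
        print("response time has regressed; start root cause analysis")
```

A percentile is used instead of an average so that a minority of very slow requests, the ones users actually complain about, is not hidden by many fast ones.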

When your application isn’t performing the way you’d like, determining the root cause may be a frustrating and time-consuming process. There can be a number of reasons: a spike in usage (Black Friday, anyone?), an overloaded server, a memory leak that manifests over time, a bad spot on a disk, or an ill-considered database index that slows down updates after its underlying table gets big.

Application performance monitoring is intended to continually create metrics for the key performance indicators that matter to your application, streamlining the root cause analysis process. These are usually broken down into user metrics, such as time to see a page or complete a transaction, and system metrics, such as CPU and memory utilization. Passive user metrics, often collected using network monitoring appliances, are of most value when the application is heavily used; active user metrics, collected by generating application requests and measuring the response times, are often reserved for non-peak-load periods. System metrics are typically available all of the time.

Until recently, the DDCM (deep dive component monitoring) agents intended to help you with root cause analysis generated too much overhead to be used in production; you would have to turn them on for a short period to try to capture the problem, then turn them off to allow production to resume at full capacity. In the past couple of years, however, new DDCM products on the market claim to be able to monitor a wide selection of languages and frameworks with minimal overhead.

Bug reporting and reproduction tools and environments

We mentioned defect managers earlier, but didn’t really elaborate on their use. In a best case, a reported defect will be accompanied by a detailed description, a script to reproduce the problem, a root cause, and it will be assigned to the developer most familiar with the relevant code. In a worst case, a bug report will come from a frustrated user calling into tech support and include a conversation along these lines:

TS: What’s wrong?
User: It broke.
TS: What were you doing?
User: What I always do. It worked yesterday.
TS: Have you changed anything since yesterday?
User: I didn’t change nothin’.

Needless to say, such reports require some skill on the part of tech support to dig out enough of a description and steps to reproduce the problem in order to allow a developer to work on the problem. It may also require remotely entering and running diagnostics on the user’s machine.

Sometimes such problems will not reproduce on a developer’s machine. One common reason for this is that the development box is too fast and has too much memory to show the problem; another possibility is that the developer has a library installed that the user lacks, and a third is that the user has another application installed that interferes with yours. Once you’ve determined the user’s runtime environment, the developer can use configuration management tools to create a similar runtime environment in a VM. Vagrant, in particular, is intended for such purposes. The test VM may run locally on the developer’s machine, on a server, or on an IaaS cloud.

In some cases, the steps to reproduce the user’s problem would change the production database. In these situations, it’s useful to have a scaled-down copy of the production application running in a PaaS, so that changes never propagate to the production database.

Once a fix for the problem is identified and a change set added to the code repository, the revised application must at least be regression tested, and preferably all acceptance tests will be run. If the change is validated, then the release manager or customer service manager needs to decide whether to propagate the change to production or schedule it for later integration and whether to give the user a patch or an operational work-around.

The never-ending circle

If the modern agile application lifecycle sounds a little like Ezekiel’s vision of a chariot having wheels within wheels, that’s OK: It is. One wheel set represents the sprints – typically one to two weeks – after which an application version is released from development to testing. Another wheel set represents a given build’s climb from development to testing to staging to production. An inner wheel set represents the lifecycle of a story card or application feature. And the tiniest wheels represent bug reports and fixes.

In this complicated environment, development shops can easily bog down at any stage. The purpose of devops is to see that the routine things, such as bringing up a clean test database or promoting a build, are quick and easy, so that the developers can concentrate on building actual features and fixing real bugs.

Martin Heller is a developer, entrepreneur, and veteran technology journalist. A senior contributing editor for InfoWorld, he frequently reviews software related to application development.


Q&A

How Etsy makes Devops work

Etsy, which describes itself as an online “marketplace where people around the world connect to buy and sell unique goods,” is often trotted out as a poster child for Devops. The company latched onto the concepts early and today is reaping the benefits as it scales to keep pace with rapid business growth. Network World Editor in Chief John Dix caught up with Etsy VP of Technical Operations Michael Rembetsy to ask how the company put the ideas to work and what lessons it learned along the way.

Etsy photos courtesy Scott Beale

Let’s start with a brief update on where the company stands today.

The company was founded and launched in 2005 and, by the time I joined in 2008 (the same year as Chad Dickerson, who is now CEO), there were about 35 employees. Now we have well over 600 employees and some 42 million members in over 200 countries around the world, including over 1 million active sellers. We don’t have sales numbers for this year yet, but in 2013 we had about $1.3 billion in Gross Merchandise Sales.

How, where and when did the company become interested in Devops?

When I joined things were growing in a very organic way. People were staying late, working long hours, socializing after hours, all the things people do in a startup to try to be successful. Everybody really bonded well together on a personal level. We had a really awesome office vibe, a very edgy feel, and we had a lot of fun, even though we had some underlying engineering issues that made it hard to get things out the door. Deploys were often very painful. We had a traditional mindset of, developers write the code and ops deploys it. And that doesn’t really scale.

The engineering department, for example, put a lot of effort into building a middle layer – what I called the layer of distrust – to allow developers to talk to our databases in a faster, more scalable way. But it turned out to be just the opposite. It created a lot more barriers between database engineers and developers, and that resulted in a lot of silos and barriers within the company and distrust between different teams.

How often were you deploying in those early days?

Twice a week, and each deploy took well over four hours.

Twice a week was pretty frequent even back then, no?

Compared to the rest of the industry, sure. We always knew we wanted to move faster than everyone else. But in 2008 we compared ourselves to a company like Flickr, which was doing 10 deploys a day, which was unheard of. So we were certainly going a little bit faster than many, but the problem was we weren’t going fast with confidence. We were going fast with lots of pain and it was making the overall experience for everyone not enjoyable. You don’t want to continuously deploy pain to everyone. We knew there had to be a better way of doing it.

Where did the idea to change come from? Was it a universal realization that something had to give?

The idea that things were not working correctly came from Chad. He had seen quite a lot in his time at Yahoo, and knew we could do it better and we could do it faster. But first we needed to stabilize the foundation. We needed to have a solid network, needed to make sure that the site would be up, to build confidence with our members as well as ourselves, to make sure we were stable enough to grow. That took us a year and a half.

But we eventually started to figure out little things like, we shouldn’t have to do a full site deploy every single time we wanted to change the banner on the homepage. We don’t have any more banners on the homepage, but back in 2009 we did. The banner would rotate once a week and we would have to deploy the entire site in order to change it, and that took four hours. It was painful for everyone involved. We realized if we had a tool that would allow someone in member ops or engineering to go in and change that at the flick of a button we could make the process better for everyone.

So that gave birth to a dev tools team that started building some tooling that would let people other than operational folks deploy code to change a banner. That was probably one of the first Devops-like realizations. We were like, “Hey, we can build a better tool to do some of what we’re doing in a full deploy.” That really sparked a lot of thinking within the teams.

Then we realized we had to get rid of this app in the middle because it was slowing us down, and so we started working on that. But we also knew we could find a better way to deploy than making a TAR file and SSH’ing and rsync’ing it out to a bunch of servers, and then running another command that pulls the server out of the load balancer, unpacks the code and then puts the server back in the load balancer. This used to happen while we sat there hoping everything is ok while we’re deploying across something like 15 servers. We knew we could do it faster and we knew we could do it better.

The idea of letting developers deploy code onto the site really came about toward the end of 2009, beginning of 2010. And as we started adding more engineers, we started to understand that if developers felt the responsibility for deploying code to the site they would also, by nature, take responsibility for if the site was up or down, take into consideration performance, and gain an understanding of the stress and fear of a deploy.

It’s a little intimidating when you’re pushing that big red button that says – Put code onto website – because you could impact hundreds of thousands of people’s livelihoods. That’s a big responsibility. But whether the site breaks is not really the issue. The site is going to break now and then. We’re going to fix it. It’s about making sure the developers and others deploying code feel empowered and confident in what they’re doing and understand what they’re doing while they’re doing it.
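The manual procedure Rembetsy describes (pull a server out of the load balancer, unpack the code, put the server back, repeat across the fleet) is a rolling deploy. The Python sketch below is only an illustration of that sequence, not Etsy's tooling: the `LoadBalancer` class is a hypothetical in-memory stand-in, the server names are invented, and `unpack` is a placeholder for whatever actually copies and unpacks the release on a host.

```python
from typing import Callable, List, Tuple


class LoadBalancer:
    """Hypothetical in-memory stand-in for a real load balancer API."""

    def __init__(self, pool: List[str]) -> None:
        self.pool = list(pool)

    def remove(self, server: str) -> None:
        self.pool.remove(server)

    def add(self, server: str) -> None:
        self.pool.append(server)


def rolling_deploy(
    servers: List[str],
    lb: LoadBalancer,
    unpack: Callable[[str], None],
) -> List[Tuple[str, str]]:
    """Deploy to one server at a time so the others keep serving traffic.

    If unpack() raises, the loop stops and the failing server stays out
    of the pool, so a half-deployed host never receives traffic.
    """
    steps: List[Tuple[str, str]] = []
    for server in servers:
        lb.remove(server)          # stop sending traffic to this host
        steps.append(("out", server))
        unpack(server)             # placeholder for copying/unpacking code
        steps.append(("unpacked", server))
        lb.add(server)             # host serves traffic again
        steps.append(("in", server))
    return steps


if __name__ == "__main__":
    fleet = ["web1", "web2", "web3"]   # invented host names
    balancer = LoadBalancer(fleet)
    for step in rolling_deploy(fleet, balancer, lambda host: None):
        print(step)
```

Handling one server at a time keeps most of the pool serving traffic, but it also explains why a deploy across something like 15 servers was slow when each step was run by hand.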

So there wasn’t a Devops epiphany where you suddenly realized the answer to your problems. It emerged organically?

It was certainly organic. If development came up with better ideas of how to deploy faster, operations would be like, “OK, but let’s also add more visibility over here, more graphs.” And there was no animosity between each other. It was just making things faster and better and stronger in a lot of ways. And as we did that the culture in the whole organization began to feel better. There was no distrust between people.

You’re really talking about building trust and building friendships in a lot of ways, relationships between different groups, where it’s like, “Oh, I know this group. They can totally do this. That’s fine. I’ll back them up.” In a lot of organizations I’ve worked for in the past it was like, “These people? Absolutely not. They can’t do that. That’s absurd.” And you have to remember this is in the early days where the site breaks often. So it was one of those things, like, OK, if it breaks, we fix it, no problem, but we want reliability and sustainability and uptime. So in a lot of ways it was a big leap of faith to try to create trust between each other and faith that other groups are not going to impact the rest of the people.

A lot of that came from the leadership of the organization as well as the teams themselves believing we could do this. Again, we weren’t an IBM. We were a small shop. We all sat very close to one another. We all knew when people were coming and leaving so it made it relatively easy to have that kind of faith in one another. I can’t recall a time where someone walked in and said, “Oh my God, that person deployed this and broke the site.” That never happened. People checked their egos at the door.

I was going to ask you about the physical proximity of folks. So the various teams were already sitting cheek by jowl?

In the early days we had people on the left coast and on the right coast, people in Minnesota and New York. But in 2009 we started to realize we needed to bring things back in-house to stabilize things. So if we had a new hire we would hire them in-house. It was more of a short term strategy, to make things a little more cohesive while we were creating those bonds of trust and faith. Today we are more of a remote culture than in 2009.

But you didn’t actually integrate the development and operations teams?

In the early days it was very separate but there was no idea of separation. Depending upon what we were working on, we would inject ourselves into those teams, which led later to this idea of what we call designated operations. So when John Allspaw, SVP of Operations and Infrastructure, came on in 2010, we were talking about better ways to collaborate and communicate with other teams and John says, “We should do this thing called designated operations.”

The idea of designated ops is it’s not dedicated. For example, if we have a search team, we don’t have a dedicated operations person who only works on search. We have a designated person who will show up for their meetings, will be involved in the development of a new feature that’s launching. They will be injecting themselves into everything the engineering team will do as early as possible in order to bring the mindset of, “Hey, what happens if that fails to this third-party provider? Oh, yeah, that’s going to throw an exception. Well, OK. Are we capturing it? Are we displaying a friendly error for an end user to see?” Etc.

And what we started doing with this idea of designated ops is educate a lot of developers on how operations works, how you build Ganglia graphs or Nagios alerts, and by doing that we actually started creating more allies for how we do things. A good example: the search team now handles all the on-call for the search infrastructure, and if they are unavailable it escalates to ops and then we take care of it.

So what we started finding was the development teams actually had an advocate through the designated ops person coming back to the rest of the ops team saying, “I’ve got this.” So that way you remove a lot of the mentality of, “Oh, I’m going to need some servers. Let me throw this over the wall to ops.” Instead, what you have is the designated ops person coming back to the rest of the ops team saying, “We’re working on this really cool thing. It’s going to launch in about three months. With the capacity planning we’ve done it is going to require X, Y and Z, so I’m going to order some more servers and we’ll have to get those installed and get everything up and running. I want to make everybody aware I’m also going to probably need some network help, etc.”

So we started seeing some real benefits by using the idea of this designated ops person to do cross-team collaboration and communication on a more frequent basis. It was extremely useful for collaboration and communication, and that in turn gave us the ability to have more open conversations. And when you have all of your ops folks integrating themselves into these other teams, you start finding some really cool stuff, like people actually aren’t mad at developers. They understand what they’re trying to do and they’re extremely supportive.

So Devops for you is more just a method of work.

Correct. There is no Devops group at Etsy.

How many people involved at this point?

Product engineering is north of 200 people. That includes tech ops, development, product folks, and so on.

How do you measure success? Is it the frequency of deployments or some other metric?

Success is a really broad term. I consider failure success, as well. If we’re testing a new type of server and it bombs, I consider that a success because we learned something.

We really changed over to more of a learning culture. There are many, many success metrics and some of those successes are actually failures. So we don’t have five key graphs we watch at all times. We have millions of graphs we watch.

Do you pay attention to how often you deploy?

We do. I could tell you we’re deploying over 60 times a day now, but we don’t say, “Next year we want to deploy 100 times a second.” We want to be able to scale the number of deploys we’re doing with how quickly the rest of the teams are moving. So if a designated ops or development team starts feeling some pain, we’ll look at how we can improve the process. We want to make sure we’re getting the features out we want to get out and if that means we have to deploy faster, then we’re going to solve that problem. So it’s not around the number of deploys.

I presume you had to standardize on your tool sets as you scaled.

We basically chose a LAMP stack: Linux, Apache, MySQL and PHP. A lot of people were like, “Oh, I want to use CoffeeScript or I want to use Tokyo Cabinet or I want to use this or that,” and it’s not about restricting access to languages, it’s about creating a common denominator so everyone can share experiences and collaborate.

We use Chef for configuration management, which is spread throughout our infrastructure; we use it all over the place. And we have a bunch of homegrown tools that help us with a variety of things. We use a lot of Nagios and Graphite and Ganglia for monitoring. Those are open-source tools that we contribute back to. I’d say that’s the vast majority of the tooling that ops uses at this point. Development obviously uses standard languages and we built a lot of tooling around that.

And we wrote Deployinator, which is our in-house tool that we use to deploy code, and we open-sourced it because one of our principles is we want to share with the community. Rackspace at one point took Deployinator and rewrote a bunch of stuff and they were using it as their own deploying tool. I don’t know if they still are today, but that was back in the early days when it first launched.
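To make the monitoring side a little more concrete, here is a rough Python sketch of how a single data point could be reported to Graphite, one of the tools mentioned above, using its plaintext protocol (one `path value timestamp` line per metric, conventionally sent to TCP port 2003). This is not Etsy's code: the metric name `deploys.web.count` and the default host are assumptions made purely for illustration.

```python
import socket
import time
from typing import Optional


def graphite_line(path: str, value: float, timestamp: int) -> str:
    """Format one data point in Graphite's plaintext protocol."""
    return f"{path} {value} {timestamp}\n"


def send_metric(
    path: str,
    value: float,
    host: str = "localhost",   # assumption: a local Carbon listener
    port: int = 2003,          # conventional plaintext-protocol port
    timestamp: Optional[int] = None,
) -> None:
    """Open a TCP connection and send a single data point."""
    ts = int(time.time()) if timestamp is None else timestamp
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(graphite_line(path, value, ts).encode("ascii"))


if __name__ == "__main__":
    # Hypothetical metric name; only the line is printed, nothing is sent.
    print(graphite_line("deploys.web.count", 1, int(time.time())), end="")
```

Because each data point is a single line of text, adding one more graph is cheap, which is one way a team can end up watching a very large number of them.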

As other people are considering adopting these methods of work, what kind of questions should they ask themselves to see if it’s really for them?

I would suggest they ask themselves why they are doing it. How do they think they’re going to benefit? If they’re doing it to, say, attract talent, that’s a pretty terrible reason. If they’re doing it to improve the overall structure of the engineering culture, enable people to feel more motivated and ownership, or they think they can improve the community in which they’re responsible or the product they’re responsible for, that’s a really good reason to do it.

But they have to keep in mind it’s not going to be an overnight process. It’s going to take lots of time. On paper it looks really, really easy. “We’ll just drop some Devops in there. No problem. Everybody will talk and it will be great.” Well no. I didn’t marry my wife the first day I met her. It took me a long time to get to the point where I felt comfortable in a relationship to go beyond just dating. It takes a lot of time. It takes longer than people think and they need to be aware of that because, if it doesn’t work after a quarter or it doesn’t work after two quarters, people can’t just abandon it.

It takes effort from people at the top and it takes effort from people on the bottom as well. It’s not just the CEO saying, “Next year we’re going to be Devops.” That doesn’t work. It has to be a cultural change in the way people are interacting. That doesn’t mean everybody has to get along every step of the way. People certainly will have discussions and disagreements about how they should do this or that, and that’s OK.

© IDG Communications Inc. 2014

Resources

EMA: Ten Factors Shaping the Future of Application Delivery
In a recent research study on DevOps and Continuous Delivery, EMA discovered there is a strong correlation between the company’s software delivery speed and their revenue growth. This report can help organizations build a case for Continuous Delivery adoption.

Using Continuous Delivery to Improve Software Delivery
Learn more about the challenges impacting organizations and how continuous delivery processes can be a key success factor in accelerating software delivery.

Forrester: How to Accelerate Innovation with Continuous Delivery
In this on demand webcast with Forrester analyst Kurt Bittner, explore how you can increase quality, break down silos and maximize productivity for developers, QA and operations teams. Learn more about the business value of Continuous Delivery with Jenkins.

Mastering Performance and Collaboration Through DevOps
DevOps is a hot topic within the IT community and is quickly becoming the standard for high performing and decidedly collaborative organizations. Join Gene Kim, co-author of The Phoenix Project and DevOps researcher, as he discusses performance metrics, as well as the cultural and technical practices, that enable high performance.

DevOps: Culture or Tools? It’s Both
Alan Shimel, Co-Founder & Editor-in-Chief of DevOps.com, discusses five key steps for developing a culture and assessing tools that can help you deliver software faster, more efficiently.

Global Bank Improves Quality of Application Development
Lack of a centralized management of the process and sporadic access to development build assets was hurting development cycles. Read how this financial institution centralized build assets, cut development time in half and added additional security controls.

Agile Development: How to Release Apps at the Speed of Business
Many companies plan for changes and improvements in software releases several times a year. Some – like McKesson Health Solutions – have a business model that requires their IT team to respond to hundreds of releases every year, efficiently and effectively, every day, every quarter. Learn how your IT team can release apps at the speed of your business.