
Say you're in charge of a fleet of servers.

 
Everything is full steam ahead, 
until one day you discover that there's 
a security vulnerability in one of the applications used. 
Now, you need to upgrade 
all the servers to the latest version. 
If you have 10 servers in the fleet, 
it's probably not too much trouble 
to log into each one of 
them one after the other and install the new version. 
But what if you have 100 servers? 
This would get super boring and you'd 
likely end up making mistakes, 
leaving some servers with the wrong version installed. 
Now, imagine having to do this on 1000 servers. 
There's no way you're going to log into 
each of them to upgrade the software. 
So what can you do instead? 
In this course, we'll look into how we can apply 
automation to manage fleets of computers. 
We'll learn how to automate deploying new computers, 
keep those machines updated, 
manage large-scale changes, and a lot more. 
We'll discuss managing both physical machines running in 
our offices and virtual machines running in the Cloud. 
If this sounds overwhelming, don't worry, 
I'll go step-by-step with you along the way. 
I'm [inaudible], and I'm 
a Site Reliability Engineer at 
Google, working on the team that supports Gmail. 
If you've never heard about 
Site Reliability Engineering before, 
let me tell you a bit about what we do. 
SRE is focused on the reliability and 
maintainability of large systems. 
We apply tons of automation techniques to manage them. 
This lets teams with only a handful 
of engineers have a big impact, 
scaling our support as our service grows. 
We're small, but mighty. 
My job includes a lot of different tasks. 
Sometimes I spend my time collaborating with 
partner teams on the reliability aspects 
of a cool new feature, 
like scheduling emails to send at a later time in Gmail. 
Other days, I write software, 
creating tools that help 
automate how we manage the service. 
When I'm not doing that, 
I might do research or 
architectural design for a new project. 
I'm also part of the on-call rotation for the service. 
If problems come up when I'm on call, 
I'm in charge of fixing them or 
finding the right person to fix them if I can't. 
So what will we cover in this course? 
We'll start by looking into 
an automation technique called configuration management, 
which lets us manage 
the configuration of our computers at scale. 
Specifically, we'll learn how to use Puppet, 
the current industry standard 
for configuration management. 
We'll look at some simple examples, 
and then see how we can apply 
the same concept to more complex cases. 
You'll be a Puppet master in no time. 
Later on, we'll expand 
our automation skills by looking into 
how we can make use of the Cloud to 
help us scale our infrastructure. 
We'll learn about the benefits and 
challenges of moving services to the Cloud. 
We'll check out some of the best practices for 
handling hundreds of virtual 
machines running in the Cloud, 
how to adapt our services to that, 
and how to troubleshoot them 
when things don't go according to plan. 
Heads up, they rarely do. 
Before we move on, I should 
probably tell you a little bit about myself. 
I discovered I was interested in 
IT and technology as a teenager. 
So when I decided to enlist 
in the Navy right after high school, 
I signed up to be an 
Information Systems Technician there. 
I served in the Navy for four years supporting 
IT and networks resources around the world. 
After leaving the Navy, 
I went to college and then joined 
Google in the IT support department. 
The transition from working in 
a very structured environment like 
the military to a place like Google 
was initially a bit hard to wrap my head around. 
I had to become much more comfortable in dealing with 
ambiguity in the problem spaces that I was working in, 
which meant learning to trust 
my own sense of judgment and prioritization. 
All along the way, I kept learning 
new skills and growing as a person and an engineer. 
So I'm excited to be here to help 
you take the next step in your IT career, 
to help you keep growing 
your automation skills by learning 
how to manage fleets of 
computers using configuration management, 
and how to work with the Cloud. 
Modern IT is moving more and more 
towards Cloud-based solutions and having 
a solid background in how to manage them will be 
even more critical for IT professionals in the future. 
In this course, we'll use Qwiklabs, which is 
an environment that allows you to test 
your code on a virtual machine running in the Cloud. 
This lets you experience real-world scenarios, 
where you'll need to interact with 
one or more remote systems to achieve your goal. 
We'll build on top of the many tools 
that you've learned about throughout the program, 
like using Python for automation scripts, 
using Git to store versions of code, 
or figuring out what's going on when 
a program doesn't behave as expected. 
You'll see some complex topics and videos that 
may not 100 percent sink in the first time around. 
That's totally natural. 
Take your time and re-watch the videos a few 
times if you need to, you'll get the hang of it. 
Also, remember that you can use the discussion forums to 
connect with your fellow learners and 
ask questions anytime you need. 
We're about to begin our journey, 
learning how we can apply automation at large scale. 
So let's get started.
Welcome to the Course
Welcome to the course!
In this course, you’ll learn how to automate managing large fleets of computers using
tools like Configuration Management. You'll also learn how you can make use of Cloud
technologies.

Course prerequisites
This course requires some familiarity with basic IT concepts like working with file
systems, handling processes, and understanding log files.

The example scripts and programs in this course are written in Python, so you’ll need an
understanding of this programming language, too. 

We also touch upon other concepts covered in other courses of the specialization, like
using a Version Control System, or using the Linux command line interface.

How to pass the class


The course certificate gives you a way to prove your new programming skills to employers.
To qualify for the certificate, you have to enroll in the program, pay the fee, and pass the
graded assessments. If you don’t want to pay, you can still audit the course for free. This
lets you view all videos and submit practice quizzes as you learn. One thing to remember
though is that this option doesn’t let you submit assessments, earn a grade, or receive the
certificate.

How deadlines work


When you enroll in the course, the system automatically sets a deadline for when you
need to complete each section. Heads up: These deadlines are there to help you organize
your time, but you can take the course at your own pace. If you "miss" a deadline, you can
just reset it to a new date. There’s no time limit in which you have to finish the course, and
you can earn the certificate whenever you finish.

Qwiklabs
For some of our exercises, you'll be using an application called Qwiklabs. Qwiklabs lets
you interact with a computer running an operating system that might not be the same one
running on your machine. The Qwiklabs scenarios will allow you to solve some real-world
problems, putting your knowledge to work through active learning exercises.
Getting and giving help
Here are a few ways you can give and get help: 

1. Discussion forums: You can share information and ideas with your fellow
learners in the discussion forums. These are also great places to find answers
to questions you may have. If you're stuck on a concept, are struggling to
solve a practice exercise, or you just want more information on a subject, the
discussion forums are there to help you move forward.
2. Coursera learner support: Use the Learner Help Center to find information on
specific technical issues. These include error messages, difficulty submitting
assignments, or problems with video playback. If you can’t find an answer in
the documentation, you can also report your problem to the Coursera support
team by clicking on the Contact Us! link available at the bottom of help center
articles.
3. Qwiklabs support: Please use the Qwiklabs support request form to report
any issues with accessing or using Qwiklabs. A member of the Qwiklabs team
will work with you to help resolve the problem.
4. Course content issues: You can also flag problems in course materials by
rating them. When you rate course materials, the instructor will see your
ratings and feedback; other learners won’t. To rate course materials:
- Open the course material you want to rate. You can only rate videos, readings,
and quizzes.
- If the content was interesting or helped you learn, click the thumbs-up icon.
- If the content was unhelpful or confusing, click the thumbs-down icon.

Finding out more information


Throughout this course, we teach you how to solve a wide range of technical issues. While
we’ll provide a lot of information through videos and supplemental readings, sometimes,
you may need to look things up on your own (now and throughout your career). Things
change fast in IT, so it’s critical to do your own research to stay up-to-date on what’s new.
We recommend you use your favorite search engine to find more information about
concepts we cover in this course — it’s great practice for the real world!

INTRODUCTION TO AUTOMATION AT SCALE

AUTOMATING WITH CONFIGURATION MANAGEMENT

No matter the size of your team 
or the number of computers in your fleet, 
knowing how to apply automation techniques will 
enable you to do your work much more effectively. 
As I shared earlier, I'm part of 
the Site Reliability Engineering 
team that supports Gmail. 
My team is relatively small 
but the service is pretty big. 
Without scaling our efforts 
through automation and tooling, 
it would be impossible to help Gmail 
meet its reliability goals. 
While you're probably not supporting 
such a large-scale service right now, 
you'll definitely benefit from using 
the right automation for your needs. 
Being able to automate the installation of new software, 
the provisioning of new workstations 
or the configuration of 
a new server can make a big difference 
even when you're the only person in your IT department. 
In the coming videos, 
we'll kick things off by looking at 
some important automation concepts, 
like what we mean when we talk about scale and how we can 
use configuration management to 
maintain the computers in our fleet, 
and how we can all benefit from 
treating our infrastructure as code. 
These concepts are the building 
blocks for letting us manage 
a growing number of devices 
without having to grow the team in charge of them. 
We'll then get to our first taste of Puppet, 
the configuration management tool 
that we'll be teaching you throughout this course. 
We'll check out a bunch of different examples 
to see what Puppet rules look like. 
We'll also learn about the underlying concepts and 
how you can get it to do the heavy lifting for you. 
The concepts that we'll check out 
throughout this module will help you 
take your first steps in automating at a larger scale. 
Knowing how to automatically 
manage the configuration of the devices in 
your fleet will let your team handle 
a lot more work with the same amount of people. 
It also frees up time to do 
more interesting stuff since 
all the boring tasks can get automated. 
By the end of the module, 
you'll have the skills to fix 
a bug in existing automation, 
which is great news since that's 
exactly what you're going to do with the code we provide. 
Funny how that works, isn't it? 
Almost like we planned it. Let's dive in.

WHAT IS SCALE?

In this course, we'll focus on making our work scale. 
So what do we mean when we talk about scale? 
Being able to scale what we do means that we can keep achieving larger impacts 
with the same amount of effort. When a system scales well, 
an increase in the amount of work it needs to do can be 
accommodated by an increase in capacity. 
For example, if the web application your company provides is scalable, 
that means it can handle an increase in the number of people 
using it by adding more servers to serve requests. 
In short, a scalable system is a flexible one. 
Adding more computers to the pool of servers that are serving the website can 
be a very simple or 
very hard operation depending on how your infrastructure is set up. 
To figure out how scalable your current setup is, you can ask yourself 
questions like will adding more servers increase the capacity of the service? 
How are new servers prepared, installed, and configured? 
How quickly can you set up new computers to get them ready to be used? 
Could you deploy a hundred servers with the same IT team that you have today? 
Or would you need to hire more people to get it done faster? 
Would all the deployed servers be configured exactly the same way? 
Scaling isn't just about a website serving content, of course. 
If your company is rapidly hiring a lot of new employees, 
you'll need to have an onboarding process that can scale as needed. 
And as you keep adding new computers to the network, you'll need to make sure that 
your system administration process can scale to the growing needs of the company. 
This can include tasks like applying the latest security policies and 
patches while making sure users' needs still get addressed, all while more and 
more users join the network without new support staff to back you up. 
If making this happen sounds a bit like magic right now, 
remember that we're here to share the secret ingredient with you, automation. 
Automation is an essential tool for 
keeping up with the infrastructure needs of a growing business. 
By using the right automation tools, 
we can get a lot more done in the same amount of time. 
For example, we could deploy a whole new server by running a single command and 
letting the automation take care of the rest. 
We could also create a batch of user accounts with all the necessary 
permissions based on data already stored in the database, 
eliminating all human interaction. 
Automation is what lets us scale. 
It allows a small IT team to be in charge of hundreds or 
even thousands of computers. 
Okay, so what does that look like in practice? 
There's a bunch of different tools that we can use to achieve this. 
Up next, we'll talk about a type of tool called configuration management that can 
help us automate how we manage the computers in our fleets.

WHAT IS CONFIGURATION MANAGEMENT?

Imagine your team is in charge of setting up a new server. 
This could be a physical computer running close to you or 
a virtual machine running somewhere in the cloud. 
To get things moving, the team installs the operating system, 
configures some applications and services, sets up the networking stack, 
and when everything is ready, puts the server into use. 
By manually deploying the installation and configuring the computer, 
we see that we're using unmanaged configuration. 
When we say configuration here, 
we're talking about everything from the current operating system and 
the applications installed to any necessary configuration files or policies, 
including anything else that's relevant for the server to do its job. 
When you work in IT, you're generally in charge of the configuration of a lot of 
different devices, not just servers. 
Network routers, printers, and 
even smart home devices can have configuration that we can control. 
For example, a network switch might use a config file to set up each of its ports. 
All right, so now we know what we mean when we talk about configuration. 
We said that manually deploying a server means that the configuration is unmanaged. 
So what would it mean for the configuration to be managed? 
It means using a configuration management system to handle all 
of the configuration of the devices in your fleet, also known as nodes. 
There's a bunch of different tools available depending on the devices and 
services involved. 
Typically you'll define a set of rules that have to be applied to the nodes you 
want to manage and then have a process that ensures that those settings are true 
on each of the nodes. 
At a small scale, unmanaged configurations seem inexpensive. 
If you only manage a handful of servers, 
you might be able to get away with doing that without the help of automation. 
You could log into each device and make changes by hand when necessary. 
And when your company needs a new database server, you might just go ahead and 
manually install the OS and the database software into a spare computer. 
But this approach doesn't always scale well. 
The more servers that you need to deploy, 
the more time it will take you to do it manually. 
And when things go wrong, and they often do, 
it can take a lot of time to recover and have the servers back online. 
Configuration management systems aim to solve this scaling problem. 
By managing the configuration of a fleet with a system like this, 
large deployments become easier to work with because the system will deploy 
the configuration automatically no matter how many devices you're managing. 
When you use configuration management and you need to make a change in one or more 
computers, you don't manually connect to each computer to perform operations on it. 
Instead, you edit the configuration management rules and 
then let the automation apply those rules in the affected machines. 
This way the changes you make to a system or 
group of systems are done in a systematic, repeatable way. 
Being repeatable is important because it means that the results will be the same on 
all the devices. 
A configuration management tool can take the rules you define and 
apply them to the systems that it manages, making changes efficient and consistent. 
Configuration management systems often also have some form 
of automatic error correction built in so 
that they can recover from certain types of errors all by themselves. 
For example, say you found that some application that was being 
used widely in your company was configured to be very insecure. 
You can add rules to your configuration management system to improve the settings 
on all computers. 
And this won't just apply the more secure settings once. 
It will continue to monitor the configuration going forward. 
If a user changes the settings on their machine, the configuration management 
tooling will detect this change and reapply the settings you defined in code. 
How cool is that? 
There are lots of configuration management systems available in the IT 
industry today. 
Some popular systems include Puppet, Chef, Ansible, and CFEngine. 
These tools can be used to manage locally hosted infrastructure. 
Think bare metal or virtual machines, like the laptops or 
work stations that employees use at a company. 
Many also have some kind of Cloud integration allowing them to manage 
resources in Cloud environments like Amazon EC2, Microsoft Azure, or 
Google Cloud Platform, and the list doesn't stop there. 
There are some platform specific tools, like SCCM and Group Policy for Windows. 
These tools can be very useful in some specific environments, 
even when they aren't as flexible as the others. 
For this course, we've chosen to focus on Puppet because it's the current industry 
standard for configuration management. 
Keep in mind though that selecting a configuration management system is a lot 
like deciding on a programming language or version control system. 
You should pick the one that best fits your needs and adapt accordingly, 
if necessary. 
Each has its own strengths and weaknesses. 
So a little research beforehand can help you decide which system is best suited for 
your particular infrastructure needs. 
There are a lot of tools out there. 
So be sure to check them out. 
Up next, we'll discuss how we can make the most out of our configuration management 
system using the infrastructure as code paradigm.

WHAT IS INFRASTRUCTURE AS CODE?

We've called out that when we use 
a configuration management system, 
we write rules that describe how 
the computers in our fleet should be configured. 
These rules are then executed by the automation, 
to make the computers match our desired state. 
This means that we can model 
the behavior of our IT infrastructure 
in files that can be processed by automatic tools. 
These files can then be 
tracked in a version control system. 
Remember, version control systems help us 
keep track of all changes done to the files, 
helping answer questions like who, when, and why. 
More importantly, they're super-useful 
when we need to revert changes. 
This can be especially helpful 
if a change turns out to be problematic. 
The paradigm of storing 
all the configuration for the managed devices in 
version controlled files is known as 
Infrastructure as Code or IaC. 
In other words, we see that we're 
using Infrastructure as Code when all of 
the configuration necessary to deploy and manage 
a node in the infrastructure 
is stored in version control. 
This is then combined with automatic tooling 
to actually get the nodes provisioned and managed. 
If you have all the details of 
your Infrastructure properly stored in the system, 
you can very quickly deploy 
a new device if something breaks down. 
Simply get a new machine, 
either virtual or physical, 
use the automation to deploy 
the necessary configuration, and you're done. 
The principles of Infrastructure as Code are 
commonly applied in cloud computing environments, 
where machines are treated like 
interchangeable resources, instead 
of individual computers. This principle 
is also known as 
treating your computers as cattle 
instead of pets because you care for 
them as a group rather than individually. 
Apologies to anyone with a pet cow. 
This concept isn't just for managing computers in 
huge data centers or globe spanning infrastructures, 
it can work for anything; 
from servers to laptops, 
or even workstations in a small IT department. 
Even if your company only has 
a single computer working as the mail server, 
you can still benefit from 
storing all the configuration needed 
to set it up in a configuration management system. 
That way if the server ever stops working, 
you can deploy a replacement very quickly by simply 
applying the rules that configure 
the mail server to the new computer. 
One valuable benefit of 
this process is that the configuration 
applied to the device doesn't depend on 
a human remembering to follow all the necessary steps. 
Rest assured, silly human, 
the result will always be the same, 
making the deployment consistent. 
As mentioned, having Infrastructure as 
Code means that we can also apply 
the benefits of the version control system 
or VCS to your infrastructure. 
Since the configuration of 
our computers is stored in files, 
those files can be added to a VCS. 
This has all the benefits 
that version control systems bring. 
It gives us an audit trail of changes, 
it lets us quickly rollback if a change was wrong, 
it lets others review 
our code to catch errors and distribute knowledge, 
it improves collaboration with the rest of the team, 
and it lets us easily check out the state of 
our infrastructure by looking 
at the rules that are committed. 
Not too shabby. I personally think 
this is one of the coolest things about IaC. 
The ability to easily see what configuration changes were 
made and roll back to 
a known good state is super important. 
It can make a big difference in 
quickly recovering from an outage, 
especially since changing the contents 
of the configuration file 
can be as dangerous as 
updating the version of an application. 
I've had my fair share of outages caused by 
an innocent-looking change with unintended side effects. 
But storing all the infrastructure in 
a version control system lets me quickly roll back 
to a previously known good version 
so that the outage length can be minimized. 
On top of that, 
having the rules stored in files means that we 
can also run automated tests on them. 
It's much better to find out in a test that 
a configuration file has a typo in 
it than to find out from our users. 
In a complex or large environment, 
treating your IT Infrastructure as Code can help you 
deploy a flexible, scalable system. 
A configuration management system 
can help you manage that code by providing 
a platform to maintain and provision 
that infrastructure in an automated way. 
Having your infrastructure stored 
as code means that you can 
automatically deploy your infrastructure 
with very little overhead. 
If you need to move it to a different location, 
it can be deployed, 
de-provisioned, and redeployed at scale 
in a different locale with minimal code level changes. 
To sum all of this up, 
managing your Infrastructure as Code 
means that your fleet of nodes is consistent, 
versioned, reliable, and repeatable. 
Instead of being seen as precious or unique, 
machines are treated as replaceable resources that 
can be deployed on-demand through the automation. 
Any infrastructure that claims to be scalable 
must be able to handle 
the capacity requirements of growth. 
Performing an action like adding more servers to handle 
an increase in requests is just a possible first step. 
There are other things that we 
might need to take into account, 
such as the amount of traffic the network can 
handle, or the load on 
backend servers like databases. 
Viewing your infrastructure in this way helps 
your IT team adapt and stay flexible. 
The technology industry is 
constantly changing and evolving. 
Automation and configuration management can help you 
embrace that change instead of avoiding it. 
Before diving into concrete examples 
of what this looks like, 
the first practice quiz of the course is coming up. 
These quizzes act as 
check-in points to help you make sure 
all the concepts covered in the videos are 
making sense. See you on the other side.

INTRODUCTION TO PUPPET

WHAT IS PUPPET?

As we called out a couple of times 
already, in this course, 
we'll be learning how to apply 
basic configuration management concepts by using Puppet. 
Puppet is the current industry standard for managing 
the configuration of computers in a fleet of machines. 
Part of the reason why Puppet is so popular is that 
it's a cross-platform tool 
that's been around for a while. 
It's an open source project that was created in 2005, 
and it's gone through several different versions. 
As it's evolved, the tool has incorporated feedback 
from its users to make it more and more useful. 
The latest available version at 
the time this Google course went live is Puppet 6, 
which came out in late 2018. 
We typically deploy Puppet 
using a client-server architecture. 
The client is known as the Puppet agent, 
and the server is known as the Puppet master. 
When using this model, 
the agent connects to the master and sends a bunch of 
facts that describe the computer to the master. 
The master then processes this information, 
generates the list of rules that 
need to be applied on the device, 
and sends this list back to the agent. 
The agent is then in charge of making 
any necessary changes on the computer. 
Puppet is a cross-platform application 
available for all Linux distributions, 
Windows, and macOS. 
This means that you can use the same Puppet rules 
for managing a range of different computers. 
What are these rules that we keep talking about? 
Let's check out a very simple example. 
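The snippet shown on screen isn't reproduced in the transcript; a minimal sketch of the rule being described (as a class wrapping a single package resource) would look something like this:

```puppet
class sudo {
  # Ensure the sudo package is present on every node this rule applies to.
  package { 'sudo':
    ensure => present,
  }
}
```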
This block is saying that the package 'sudo' 
should be present on every computer 
where the rule gets applied. 
If this rule is applied on 100 computers, 
it would automatically install 
the package in all of them. 
This is a small and simple block but can already 
give us a basic impression of 
how rules are written in Puppet. 
Don't worry too much about the syntax now, 
we'll look into what each piece means in future videos. 
There are various installation tools 
available depending on the type of operating system. 
Puppet will determine the type of 
operating system being used 
and select the right tool to 
perform the package installation. 
On Linux distributions, there are 
several package management systems 
like APT, Yum, and DNF. 
Puppet will also determine 
which package manager should 
be used to install the package. 
On macOS, there are 
a few different providers available, depending 
on where the package is coming from. 
The Apple Provider is used for 
packages that are part of the OS, 
while the MacPorts provider is used for 
packages that come from the MacPorts Project. 
For Windows, we'll need to add 
an extra attribute to our rule, 
stating where the installer file is located on 
the local disk or a network-mounted resource. 
Puppet will then execute 
the installer and make 
sure that it finishes successfully. 
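As a sketch of what that extra attribute might look like on Windows (the package name and installer path here are hypothetical, chosen only for illustration):

```puppet
# Hypothetical Windows example: the title must match the installed
# application's display name, and source points at the MSI installer.
package { 'Sample App':
  ensure => installed,
  source => 'C:\\installers\\sample-app-1.0.msi',
}
```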
If you use Chocolatey to manage your Windows packages, 
you can add an extra Chocolatey provider 
to Puppet to support that. 
We'll add a link to more information 
about this in our next reading. 
Using rules like this one, 
we can get Puppet to do a lot more 
than just install packages for us. 
We can add, remove, 
or modify configuration files stored in the system, 
or change registry entries on Windows. 
We can also enable, disable, 
start, or stop the services that run on our computer. 
We can configure cron jobs 
(scheduled tasks), add, remove, 
or modify Users and Groups 
or even execute external commands, 
if that's what we need. 
There's a lot to say about Puppet. 
We won't go into absolutely every detail, 
but we'll cover the most 
important concepts in this course. 
The goal is to get you 
started with what you need to know about 
configuration management in general 
and Puppet in particular. 
We'll also give you pointers to find 
out more information on your own. 
Up next, we'll check out 
the different resources we can use to define our rules.

PUPPET RESOURCES

In our last video, we saw an example that installed the sudo package on a computer. 
To do that, our example used the package keyword, declaring a package resource. 
In Puppet, resources are the basic unit for 
modeling the configuration that we want to manage. 
In other words, each resource specifies one configuration that we're trying to 
manage, like a service, a package, or a file. 
Let's look at another example. 
In this case, we're defining a file resource. 
This resource type is used for managing files and directories. 
In this case, it's a very simple rule 
that ensures that /etc/sysctl.d exists and is a directory. 
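The on-screen snippet isn't in the transcript; based on the description, the rule might look like this:

```puppet
class sysctl {
  # Ensure /etc/sysctl.d exists and is a directory.
  file { '/etc/sysctl.d':
    ensure => directory,
  }
}
```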
Let's talk a little bit about syntax. 
In both our last example and 
this one we could see that when declaring a resource in puppet, 
we write them in a block that starts with the resource type ,in this case File. 
The configuration of the resource is then written inside a block of curly braces. 
Right after the opening curly brace, we have the title of the resource, 
followed by a colon. 
After the colon come the attributes that we want to set for the resource. 
In this example, 
we're once again setting the insurer attribute with directory as the value, 
but we could set other attributes too >> Let's check out a different file 
resource. 
In this example, 
we're using a file resource to configure the contents of /etc/timezone, a file 
that's used in some Linux distributions to determine the time zone of the computer. 
This resource has three attributes. 
First, we explicitly say that this will be a file instead of a directory or 
a symlink. Then we set the contents of the file to the UTC time zone. 
Finally, we set the replace attribute to true, which means that the contents of 
the file will be replaced even if the file already exists. 
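Putting the three described attributes together, a sketch of this resource could be:

```puppet
class timezone {
  # Make /etc/timezone a regular file containing "UTC", and
  # overwrite any contents the file might already have.
  file { '/etc/timezone':
    ensure  => file,
    content => "UTC\n",
    replace => true,
  }
}
```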
We've now seen a couple examples of what we can do with the file resource. 
There are a lot more attributes that we could set, 
like file permissions, the file owner, or the file modification time.



We've included a link to the official documentation in the next reading where 
you can find all the possible attributes that can be set for each resource. 
How do these rules turn into changes in our computers? 
When we declare a resource in our Puppet rules, 
we're defining the desired state of that resource in the system. 
The puppet agent then turns the desired state into reality using providers.



The provider used will depend on the resource defined and 
the environment where the agent is running. 
Puppet will normally detect this automatically without us having to do 
anything special. 
When the puppet agent processes a resource, 
it first decides which provider it needs to use, then passes along the attributes 
that we configured in the resource to that provider. 
The code of each provider is in charge of making our computer reflect the state 
requested in the resource. 
In these examples, we've looked at one resource at a time. 
Up next, we'll see how we can combine a bunch of resources into more complex 
puppet classes.

PUPPET CLASSES

 
In the examples of Puppet code that we've seen so far, 
we've declared classes that contain one resource. 
You might have wondered what those classes were for. 
We use these classes to collect the resources that are needed to achieve 
a goal in a single place. 
For example, you could have a class that installs a package, sets the contents of 
a configuration file, and starts the service provided by that package. 
Let's check out an example like that. 
In this case, we have a class with three resources, a package, 
a file, and a service. 
All of them are related to the Network Time Protocol, or NTP, 
the mechanism our computers use to synchronize their clocks. 
Our rules are making sure that the NTP package is always upgraded to 
the latest version. 
We're setting the contents of the configuration file using the source 
attribute, which means that the agent will read the required contents from 
the specified location. 
And we're saying that we want the NTP service to be enabled and running. 
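The class itself isn't shown in the transcript; a sketch of a class like the one described might look like this (the module path in the source attribute is an assumption):

```puppet
class ntp {
  # Keep the NTP package upgraded to the latest version
  package { 'ntp':
    ensure => latest,
  }
  # Read the configuration file's required contents from the module
  file { '/etc/ntp.conf':
    source  => 'puppet:///modules/ntp/ntp.conf',
    replace => true,
  }
  # Make sure the service is enabled at boot and currently running
  service { 'ntp':
    enable => true,
    ensure => running,
  }
}
```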
By grouping all of the resources related to NTP in the same class, 
we only need a quick glance to understand how the service is configured and 
how it's supposed to work. 
This would make it easier to make changes in the future since we have 
all the related resources together. 
It makes sense to use this technique whenever we want 
to group related resources. 
For example, 
you could have a class grouping all resources related to managing log files, 
or configuring the time zone, or handling temporary files and directories. 
You could also have classes that group all the settings related to your web serving 
software, your email infrastructure, or even your company's firewall. 
We're just getting started with Puppet's basic resources and 
seeing how they can be applied. 
In further videos, we'll be learning a lot more about common practices when using 
configuration management tools. 
But before jumping into that, we've put together a reading with more information 
about Puppet syntax, resources and links to the official reference. 
Then we've got a quick quiz to check that everything is making sense.

Bolt Examples
Bolt lets you automate almost any task you can think of. These examples walk you
through beginning and intermediate level Bolt use cases, demonstrating Bolt
concepts along the way. You can find shorter examples of common Bolt patterns in
the Bolt examples repo, which is more of a reference than a learning tool.

If you'd like to share a real-world use case, reach out to us in the #bolt channel
on Slack.

For even more usage examples, check out the Puppet blog.

Note: Do you have a real-world use case for Bolt that you'd like to share? Reach
out to us in the #bolt channel on Slack.

 Automating Windows targets


Examples of how to run Powershell scripts, use Bolt inventory, convert a
script to a task, and write a Bolt plan, all on Windows.

 Deploy a TIG stack with Bolt


An example of deploying a Telegraf, InfluxDB, and Grafana stack with
Bolt, this demonstrates using existing Puppet content to quickly provide
value.

Resources
Sections
Resource declarations

Resource uniqueness

Relationships and ordering

Resource types

Title

Attributes

Namevars and name

Metaparameters

Resource syntax

 Basic syntax
 Complete syntax
 Resource declaration default attributes
 Setting attributes from a hash
 Abstract resource types
 Arrays of titles
 Adding or modifying attributes
 Local resource defaults
Expand
Resources are the fundamental unit for modeling system configurations. Each
resource describes the desired state for some aspect of a system, like a specific
service or package. When Puppet applies a catalog to the target system, it manages
every resource in the catalog, ensuring the actual state matches the desired state.

The following video gives you an overview of resources:

Resources contained in classes and defined types share the relationships of those
classes and defined types. Resources are not subject to scope: a resource in any
area of code can be referenced from any other area of code.

A resource declaration adds a resource to the catalog and tells Puppet to manage
that resource's state.

When Puppet applies the compiled catalog, it:


1. Reads the actual state of the resource on the target system.
2. Compares the actual state to the desired state.
3. If necessary, changes the system to enforce the desired state.
4. Logs any changes made to the resource. These changes appear
in Puppet agent's log and in the run report, which is sent to the primary
server and forwarded to any specified report processors.
If the catalog doesn't contain a particular resource, Puppet does nothing with
whatever that resource described. If you remove a package resource from your
manifests, Puppet doesn't uninstall the package; instead, it just ignores it. To
remove a package, manage it as a resource and set ensure => absent.

You can delay adding resources to the catalog. For example, classes and defined
types can contain groups of resources. These resources are managed only if you
add that class or defined resource to the catalog. Virtual resources are added to the
catalog only after they are realized.

Resource declarations
At minimum, every resource declaration has a resource type, a title, and a set
of attributes: 

<TYPE> { '<TITLE>':
  <ATTRIBUTE> => <VALUE>,
}


The resource title and attributes are called the resource body. A resource
declaration can have one resource body or multiple resource bodies of the same
resource type.

Resource declarations are expressions in the Puppet language — they always have
a side effect of adding a resource to the catalog, but they also resolve to a value.
The value of a resource declaration is an array of resource references, with one
reference for each resource the expression describes.

A resource declaration has extremely low precedence; in fact, it's even lower than
the variable assignment operator (=). This means that if you use a resource
declaration for its value, you must surround it with parentheses to associate it with
the expression that uses the value.
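As a brief illustration of the parentheses rule described above (the path and variable name are hypothetical):

```puppet
# Without the parentheses this would not parse as an assignment,
# because a resource declaration binds less tightly than `=`.
$refs = (file { '/tmp/demo': ensure => file })
# $refs is now an array containing one File resource reference
notice($refs)
```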

If a resource declaration includes more than one resource body, it declares multiple
resources of that resource type. The resource body is a title and a set of attributes;
each body must be separated from the next one with a semicolon. Each resource in
a declaration is almost completely independent of the others, and they can have
completely different values for their attributes. The only connections between
resources that share an expression are:
 They all have the same resource type.
 They can all draw from the same pool of default values, if a resource body
with the title default is present.

Resource uniqueness
Each resource must be unique; Puppet does not allow you to declare the same
resource twice. This is to prevent multiple conflicting values from being declared
for the same attribute. Puppet uses the resource title and the name attribute
or namevar to identify duplicate resources — if either the title or the name is
duplicated within a given resource type, catalog compilation fails. See the page
about resource syntax for details about resource titles and namevars. To provide
the same resource for multiple classes, use a class or a virtual resource to add it to
the catalog in multiple places without duplicating it. See classes and virtual
resources for more information.

Relationships and ordering


By default, Puppet applies unrelated resources in the order in which they're written
in the manifest. If a resource must be applied before or after some other resource,
declare a relationship between them to show that their order isn't coincidental. You
can also make changes in one resource cause a refresh of some other resource. See
the Relationships and ordering page for more information.
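For instance, an explicit ordering relationship between two resources can be declared with the before metaparameter (the package and service names here are illustrative):

```puppet
package { 'openssh-server':
  ensure => installed,
  before => Service['sshd'],  # apply this package before the sshd service
}

service { 'sshd':
  ensure => running,
}
```

The same relationship could instead be declared from the other side, using require on the service.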

Otherwise, you can customize the default order in which Puppet applies resources
with the ordering setting. See the configuration page for details about this setting.

Resource types
Every resource is associated with a resource type, which determines the kind of
configuration it manages. Puppet has built-in resource types such as file, service,
and package. See the resource type reference for a complete list and information
about the built-in resource types.

You can also add new resource types to Puppet:


 Defined types are lightweight resource types written in the Puppet language.
 Custom resource types are written in Ruby and have the same capabilities
as Puppet's built-in types.

Title
A resource's title is a string that uniquely identifies the resource to Puppet. In a
resource declaration, the title is the identifier after the first curly brace and before
the colon. For example, in this file resource declaration, the title is /etc/passwd: 

file { '/etc/passwd':
  owner => 'root',
  group => 'root',
}
Titles must be unique per resource type. You can have both a package and a
service titled "ntp," but you can only have one service titled "ntp." Duplicate titles
cause compilation to fail.
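For example, this pair of declarations is valid, because the two resources share a title but not a resource type:

```puppet
# A package and a service may share the title 'ntp'...
package { 'ntp':
  ensure => installed,
}

# ...but a second service titled 'ntp' would fail compilation.
service { 'ntp':
  ensure => running,
}
```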

The title of a resource differs from the namevar of the resource. Whereas the title
identifies the resource to Puppet itself, the namevar identifies the resource to the
target system and is usually specified by the resource's name attribute. The
resource title doesn't have to match the namevar, but you'll often want it to: the
value of the namevar attribute defaults to the title, so using the name in the title can
save you some typing.

If a resource type has multiple namevars, the type specifies whether and how the
title maps to those namevars. For example, the package type uses
the provider attribute to help determine uniqueness, but that attribute has no
special relationship with the title. See each type's documentation for details about
how it maps title to namevars.

Attributes
Attributes describe the desired state of the resource; each attribute handles some
aspect of the resource. For example, the file type has a mode attribute that
specifies the permissions for the file.

Each resource type has its own set of available attributes; see the resource type
reference for a complete list. Most resource types have a handful of crucial
attributes and a larger number of optional ones. Attributes accept certain data
types, such as strings, numbers, hashes, or arrays. Each attribute that you declare
must have a value. Most attributes are optional, which means they have a default
value, so you do not have to assign a value. If an attribute has no default, it is
considered required, and you must assign it a value.

Most resource types contain an ensure attribute. This attribute generally manages
the most basic state of the resource on the target system, such as whether a file
exists, whether a service is running or stopped, or whether a package is installed or
uninstalled. The values accepted for the ensure attribute vary by resource type.
Most accept present and absent, but there are variations. Check the reference for
each resource type you are working with.

Tip: Resource and type attributes are sometimes referred to as
parameters. Puppet also has properties, which are slightly different from
parameters: properties correspond to something measurable on the target system,
whereas parameters change how Puppet manages a resource. A property always
represents a concrete state on the target system. When talking about resource
declarations in Puppet, parameter is a synonym for attribute.

Namevars and name 
Every resource on a target system must have a unique identity; you cannot have
two services, for example, with the same name. This identifying attribute
in Puppet is known as the namevar.

Each resource type has an attribute that is designated to serve as the namevar. For
most resource types, this is the name attribute, but some types use other attributes,
such as the file type, which uses path, the file's location on disk, for its namevar. If
a type's namevar is an attribute other than name, this is listed in the type reference
documentation.

Most types have only one namevar. With a single namevar, the value must be
unique per resource type. There are a few rare exceptions to this rule, such as
the exec type, where the namevar is a command. However, some resource types,
such as package, have multiple namevar attributes that create a composite
namevar. For example, both the yum provider and the gem provider
have mysql packages, so both the name and the provider attributes are namevars,
and Puppet uses both to identify the resource.

The namevar differs from the resource's title, which identifies a resource
to Puppet's compiler rather than to the target system. In practice, however, a
resource's namevar and the title are often the same, because the namevar usually
defaults to the title. If you don't specify a value for a resource's namevar when you
declare the resource, Puppet uses the resource's title.

You might want to specify a namevar that is different from the title when
you want a consistently titled resource to manage something that has different
names on different platforms. For example, the NTP service might be ntpd on Red
Hat systems, but ntp on Debian and Ubuntu. You might title the service "ntp," but
set its namevar --- the name attribute --- according to the operating system. Other
resources can then form relationships to the resource without the title changing.
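A sketch of that NTP scenario, using a selector on the os family fact to pick the platform-specific service name (the exact fact lookup is an assumption):

```puppet
service { 'ntp':
  # The title stays 'ntp' everywhere; the namevar (name) varies by platform
  name   => $facts['os']['family'] ? {
    'RedHat' => 'ntpd',
    default  => 'ntp',
  },
  ensure => running,
}
```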

Metaparameters
Some attributes in Puppet can be used with every resource type. These are
called metaparameters. These don't map directly to system state. Instead,
metaparameters affect Puppet's behavior, usually specifying the way in which
resources relate to each other.

The most commonly used metaparameters are for specifying order relationships
between resources. See the documentation on relationships and ordering for details
about those metaparameters. See the full list of available metaparameters in
the metaparameter reference.

Resource syntax
You can accomplish a lot with just a few resource declaration features, or you can
create more complex declarations that do more.

Basic syntax
The simplified form of a resource declaration includes:

 The resource type, which is a word with no quotes.


 An opening curly brace {.
 The title, which is a string.
 A colon (:).
 Optionally, any number of attribute and value pairs, each of which consists
of:
o An attribute name, which is a lowercase word with no quotes.
o A => (called an arrow, "fat comma," or "hash rocket").
o A value, which can have any [data type][datatype].
o A trailing comma.
 A closing curly brace (}).
You can use any amount of whitespace in the Puppet language.

This example declares a file resource with the title /etc/passwd. This
declaration's ensure attribute ensures that the specified file is created if it
does not already exist on the node. The rest of the declaration sets values for
the file's owner, group, and mode attributes. 

file { '/etc/passwd':
  ensure => file,
  owner  => 'root',
  group  => 'root',
  mode   => '0600',
}

Complete syntax
By creating more complex resource declarations, you can:

 Describe many resources at once.


 Set a group of attributes from a hash with the * attribute.
 Set default attributes.
 Specify an abstract resource type.
 Amend or override attributes after a resource is already declared.
The complete generalized form of a resource declaration expression is:

 The resource type, which can be one of:


o A lowercase word with no quotes, such as file.
o A resource type data type, such as File, Resource[File],
or Resource['file']. It must have a type but not a title.
 An opening curly brace ({).
 One or more resource bodies, separated with semicolons (;). Each resource
body consists of:
o A title, which can be one of:
 A string.
 An array of strings, which declares multiple resources.
 The special value default, which sets default attribute values
for other resource bodies in the same expression.
o A colon (:).
o Optionally, any number of attribute and value pairs, separated with
commas (,). Each attribute/value pair consists of:
 An attribute name, which can be one of:
 A lowercase word with no quotes.
The special attribute *, called a "splat," which takes a
hash and sets other attributes.
 A =>, called an arrow, a "fat comma," or a "hash
rocket".
 A value, which can have any data type.
 Optionally, a trailing comma after the last attribute/value pair.
 Optionally, a trailing semicolon after the last resource body.
 A closing curly brace (})

<TYPE> {
  default:
    * => <HASH OF ATTRIBUTE/VALUE PAIRS>,
    <ATTRIBUTE> => <VALUE>,
  ;
  '<TITLE>':
    * => <HASH OF ATTRIBUTE/VALUE PAIRS>,
    <ATTRIBUTE> => <VALUE>,
  ;
  '<NEXT TITLE>':
    ...
  ;
  ['<TITLE>', '<TITLE>', '<TITLE>']:
    ...
  ;
}

Resource declaration default attributes


If a resource declaration includes a resource body with a title
of default, Puppet doesn't create a new resource named "default." Instead, every
other resource in that declaration uses attribute values from the default body if it
doesn't have an explicit value for one of those attributes. This is also known as
"per-expression defaults."

Resource declaration defaults are useful because it lets you set many attributes at
once, but you can still override some of them.

This example declares several different files, all using the default values set in the default resource
body. However, the mode value for the files in the last array (['ssh_config',
'ssh_host_dsa_key.pub'....) is set explicitly instead of using the default. 

file {
  default:
    ensure => file,
    owner  => "root",
    group  => "wheel",
    mode   => "0600",
  ;
  ['ssh_host_dsa_key', 'ssh_host_key', 'ssh_host_rsa_key']:
    # use all defaults
  ;
  ['ssh_config', 'ssh_host_dsa_key.pub', 'ssh_host_key.pub',
   'ssh_host_rsa_key.pub', 'sshd_config']:
    # override mode
    mode => "0644",
  ;
}

The position of the default body in a resource declaration doesn't matter; resources
above and below it all use the default attributes if applicable. You can only have
one default resource body per resource declaration.

Setting attributes from a hash


You can set attributes for a resource by using the splat attribute, which uses the
splat or asterisk character *, in the resource body.

The value of the splat (*) attribute must be a hash where:

 Each key is the name of a valid attribute for that resource type, as a string.
 Each value is a valid value for the attribute it's assigned to.
This sets values for that resource's attributes, using every attribute and value listed in the hash.

For example, the splat attribute in this declaration sets the owner, group, and mode settings for
the file resource.

$file_ownership = {
  "owner" => "root",
  "group" => "wheel",
  "mode"  => "0644",
}

file { "/etc/passwd":
  ensure => file,
  *      => $file_ownership,
}

You cannot set any attribute more than once for a given resource; if you try, Puppet raises a
compilation error. This means that:

 If you use a hash to set attributes for a resource, you cannot set a different,
explicit value for any of those attributes. For example, if mode is present in
the hash, you can't also set mode => "0644" in that resource body.
 You can't use the * attribute multiple times in one resource body, since the
splat itself is an attribute.
To use some attributes from a hash and override others, either use a hash to set per-expression
defaults, as described in the section on resource declaration defaults, or use the merging
operator (+) to combine attributes from two hashes, with the right-hand hash overriding the
left-hand one.
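A short sketch of the merging-operator approach (the variable name and file are illustrative):

```puppet
$base_attrs = {
  'owner' => 'root',
  'group' => 'wheel',
  'mode'  => '0644',
}

file { '/etc/sudoers':
  ensure => file,
  # The right-hand hash overrides mode from $base_attrs
  *      => $base_attrs + { 'mode' => '0440' },
}
```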

Abstract resource types


Because a resource declaration can accept a resource type data type as its resource type, you can
use a Resource[<TYPE>] value to specify a non-literal resource type, where
the <TYPE> portion can be read from a variable. That is, the following three examples are
equivalent to each other: 

file { "/tmp/foo": ensure => file, }

File { "/tmp/foo": ensure => file, }

Resource[File] { "/tmp/foo": ensure => file, }

 
$mytype = File
Resource[$mytype] { "/tmp/foo": ensure => file, }

$mytypename = "file"
Resource[$mytypename] { "/tmp/foo": ensure => file, }

This lets you declare resources without knowing in advance what type of resources they'll be, which
can enable transformations of data into resources.

Arrays of titles
If you specify an array of strings as the title of a resource body, Puppet creates
multiple resources with the same set of attributes. This is useful when you have
many resources that are nearly identical.

For example: 

$rc_dirs = [
  '/etc/rc.d',       '/etc/rc.d/init.d', '/etc/rc.d/rc0.d',
  '/etc/rc.d/rc1.d', '/etc/rc.d/rc2.d',  '/etc/rc.d/rc3.d',
  '/etc/rc.d/rc4.d', '/etc/rc.d/rc5.d',  '/etc/rc.d/rc6.d',
]

file { $rc_dirs:
  ensure => directory,
  owner  => 'root',
  group  => 'root',
  mode   => '0755',
}

If you do this, you must let the namevar attributes of these resources default to their titles. You can't
specify an explicit value for the namevar, because it applies to all of the resources.

Adding or modifying attributes


Although you cannot declare the same resource twice, you can add attributes to a
resource that has already been declared. In certain circumstances, you can also
override attributes. You can amend attributes with either a resource reference, a
collector, or from a hash using the splat (*) attribute.

To amend attributes with the splat attribute, see the section about setting attributes
from a hash.
To amend attributes with a resource reference, add a resource reference attribute block to the
resource that's already declared. Normally, you can only use resource reference blocks to add
previously unmanaged attributes to a resource; it cannot override already-specified attributes. The
general form of a resource reference attribute block is:

 A resource reference to the resource in question


 An opening curly brace
 Any number of attribute => value pairs
 A closing curly brace
For example, this resource reference attribute block amends values for the owner, group,
and mode attributes: 

file { '/etc/passwd':
  ensure => file,
}

File['/etc/passwd'] {
  owner => 'root',
  group => 'root',
  mode  => '0640',
}

You can also amend attributes with a collector.


The general form of a collector attribute block is:

 A resource collector that matches any number of resources


 An opening curly brace
 Any number of attribute => value (or attribute +> value) pairs
 A closing curly brace
For resource attributes that accept multiple values in an array, such as the relationship
metaparameters, you can add to the existing values instead of replacing them by using the
"plusignment" (+>) keyword instead of the usual hash rocket ( =>). For details, see appending to
attributes in the classes documentation.

This example amends the owner, group, and mode attributes of any resources that match the
collector:

class base::linux {
  file { '/etc/passwd':
    ensure => file,
    ...
  }
  ...
}

include base::linux

File <| tag == 'base::linux' |> {
  owner => 'root',
  group => 'root',
  mode  => '0640',
}

CAUTION: Be very careful when amending attributes with a collector. Test with
--noop to see what changes your code would make.
 It can override other attributes you've already specified, regardless of class
inheritance.
 It can affect large numbers of resources at one time.
 It implicitly realizes any virtual resources the collector matches.
 Because it ignores class inheritance, it can override the same attribute more
than one time, which results in an evaluation order race where the last
override wins.

Local resource defaults


Because resource default statements are subject to dynamic scope, you can't always
tell what areas of code will be affected. Generally, do not include classic resource
default statements anywhere other than in your site manifest (site.pp). See
the resource defaults documentation for details. Whenever possible, use resource
declaration defaults, also known as per-expression defaults.

However, resource default statements can be powerful, allowing you to set
important defaults, such as file permissions, across resources. Setting local
resource defaults is a way to protect your classes and defined types from
accidentally inheriting defaults from classic resource default statements.

To set local resource defaults, define your defaults in a variable and re-use them in
multiple places, by combining resource declaration defaults and setting attributes
from a hash.

This example defines defaults in a $file_defaults variable, and then includes the variable in a
resource declaration default with a hash. 
class mymodule::params {
  $file_defaults = {
    mode  => "0644",
    owner => "root",
    group => "root",
  }
  # ...
}

class mymodule inherits mymodule::params {
  file { default: * => $mymodule::params::file_defaults;
    "/etc/myconfig":
      ensure => file,
  }
}
THE BUILDING BLOCK OF CONFIGURATION MANAGEMENT

WHAT ARE DOMAIN SPECIFIC LANGUAGES

Up until now, we've seen examples of 
very simple Puppet rules. They just 
define one or more resources. 
These resources are the building blocks of Puppet rules, 
but we can do much more complex operations 
using Puppet's domain specific language or DSL. 
Typical programming languages like Python, Ruby, 
Java or Go are general purpose languages that can be used 
to write lots of different applications 
with different goals and use cases. 
On the flip side, 
a domain specific language is 
a programming language that's more limited in scope. 
Learning a domain-specific language 
is usually much faster and easier than 
learning a general purpose programming language 
because there's a lot less to cover. 
You don't need to learn as much syntax, understand as 
many keywords, or take into 
account a lot of overhead in general. 
In the case of Puppet, 
the DSL is limited to operations related to when and 
how to apply configuration management rules 
to our devices. 
For example, we can use the mechanisms provided by 
the DSL to set different values on 
laptops or desktop computers, 
or to install some specific packages 
only on the company's web servers. 
On top of the basic resource types 
that we already checked out, 
Puppet's DSL includes variables, 
conditional statements, and functions. 
Using them, we can apply different resources or 
set attributes to different values 
depending on some conditions. 
Before we jump into an example of what that looks like, 
let's talk a bit about Puppet facts. 
Facts are variables that represent 
the characteristics of the system. 
When the Puppet agent runs, 
it calls a program called 
facter, which analyzes the current system, 
storing the information it gathers in these facts. 
Once it's done, it sends 
the values for these facts to the server, 
which uses them to calculate 
the rules that should be applied. 
Puppet comes with a bunch of baked-in core facts 
that store useful information about 
the system, like what the current OS is, 
how much memory the computer has, whether it's 
a virtual machine or not, 
or what the current IP address is. 
If the information we need to make 
a decision isn't available through one of these facts, 
we can also write a script that checks for 
the information and turns it into our own custom fact. 
Let's check out an example of a piece of 
Puppet code that makes use of one of the built-in facts. 
This piece of code is using 
the is_virtual fact together with 
a conditional statement to decide whether 
the smartmontools package should be installed or purged. 
This package is used for monitoring 
the state of hard drives using SMART. 
So it's useful to have it installed in physical machines, 
but it doesn't make much sense to 
install it in our virtual machines. 
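The code discussed here isn't shown in the transcript; based on the description, it would look something like this:

```puppet
# Purge smartmontools on virtual machines, keep it current on physical ones
if $facts['is_virtual'] {
  package { 'smartmontools':
    ensure => purged,
  }
} else {
  package { 'smartmontools':
    ensure => latest,
  }
}
```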
We can see several of the characteristics of 
Puppet's domain specific language in this block. 
So let's spend a little time looking 
at all of the elements of syntax here. 
First, facts is a variable. 
All variable names are preceded by 
a dollar sign in Puppet's DSL. 
In particular, the facts variable 
is what's known as a hash in the Puppet DSL, 
which is equivalent to a dictionary in Python. 
This means that we can access 
the different elements in the hash using their keys. 
In this case, we're accessing the value 
associated with the is_virtual key. 
Second, we see how we can write 
a conditional statement using if else, 
enclosing each block of the 
conditional with curly braces. 
Finally, each conditional block 
contains a package resource. 
We've seen resources before, 
but we haven't looked at 
the syntax in detail. So let's do that now. 
Every resource starts with 
the type of resource being defined. 
In this case, package and the contents of 
the resource are then enclosed in curly braces. 
Inside the resource definition, 
the first line contains the title followed by a colon. 
Any lines after that are attributes that are being set. 
We use equals greater than to assign values to 
the attributes and then each attribute ends with a comma. 
We've now covered a large chunk of puppet's DSL syntax. 
If you look back to what it was like to 
learn your first programming language, 
you'll probably notice 
how much less syntax there is to learn here. 
That's typical of the domain specific languages 
used by configuration management tools. 
While each tool uses their own DSL, 
they're usually very simple 
and can be learned very quickly. 
Up next, we'll talk about a few other principles 
behind most configuration management tools. 
Whenever you're ready, let's dive in.
THE DRIVING PRINCIPLES OF CONFIGURATION MANAGEMENT

Up to now, we've seen 


a few examples of what Puppet rules look like, 
including a bunch of 
different resources and even a conditional expression. 
You might have noticed that in all 
the examples we've checked out, 
we were never telling the computer the steps it 
should follow in order to do what we wanted. 
Instead, we were just declaring 
the end goal that we wanted to achieve, 
like going to a drive-through and ordering a burger, 
we didn't make it, but there it is. 
The providers that we mentioned 
earlier, like apt and yum, are 
the ones in charge of turning our goals 
into whatever actions are necessary. 
We say that Puppet uses a declarative language because we 
declare the state that we want to 
achieve rather than the steps to get there. 
Traditional languages like Python or 
C are called procedural because 
we write out the procedure that the computer 
needs to follow to reach our desired goal. 
Coming from a procedural language like Python, 
it might take some time to get used to writing 
declarative code like the ones 
used for Puppet, and that's okay. 
Just remember that when it 
comes to configuration management, 
it makes sense to simply 
state what the configuration should be, 
not what the computer should do to get there. 
Say you're using a resource to 
declare that you want a package installed, 
you don't care what commands a computer 
has to run to install it, 
you only care that after 
the configuration management tool has run, 
the package is installed. 
Another important aspect of 
configuration management is that 
operations should be idempotent. 
In this context, 
an idempotent action can be performed over and over 
again without changing the system after 
the first time the action was performed, 
and with no unintended side effects. 
Let's check this out with an example of a file resource. 
This resource ensures that 
the /etc/issue file has a set 
of permissions and a specific line in it. 
Fulfilling this requirement is an idempotent operation. 
If the file already exists and has the desired content, 
then Puppet will understand that 
no action has to be taken. 
If the file doesn't exist, 
then Puppet will create it. 
If the contents or permissions 
don't match, Puppet will fix them. 
No matter how many times the agent applies the rule, 
the end result is that this file will 
have the requested contents and permissions. 
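As a sketch, a file resource of the kind described might look like this (the exact content, owner, and mode here are assumptions for illustration):

```puppet
# Ensure /etc/issue has fixed permissions and a specific line in it.
# Applying this rule any number of times yields the same end state.
file { '/etc/issue':
  owner   => 'root',
  group   => 'root',
  mode    => '0644',
  content => "Authorized uses only.\n",
}
```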
Idempotency is a valuable property 
of any piece of automation. 
If a script is idempotent, 
it means that it can fail halfway through its task and be 
run again without problematic consequences. 
Say you're running your configuration management system 
to set up a new server. 
Unfortunately, the setup fails 
because you forgot to add a second disk to 
the computer and the configuration required two disks. 
If your automation is idempotent, 
you can add the missing disk and 
then have the system pick up from where it left off. 
Most Puppet resources provide idempotent actions, 
and we can rest assured that two runs of 
the same set of rules will lead to the same end result. 
An exception to this is the exec resource, 
which runs commands for us. 
The actions taken by the exec resource might not be 
idempotent since a command might 
modify the system each time it's executed. 
To understand this, let's check out what happens when 
we execute a command that moves a file on our computer. 
First, we'll check that the example.txt file is here, 
and then we'll move it to the desktop directory.
This works fine now, 
but what happens if we run the exact same command 
again after it's been executed once? 
We receive an error because 
the file is no longer in the same place. 
In other words, this was not an idempotent action, 
as executing the same action twice produced 
a different result and 
the unintended side effect of an error. 
If we were running this inside Puppet, 
this would cause our Puppet run to finish with an error. 
So if we need to use 
the exec resource to run a command for us, 
we need to be careful to 
ensure that the action is idempotent. 
We could do that for example by using 
the onlyif attribute like this. 
Using the onlyif attribute, 
we specified that this command should be executed 
only if the file that we want to move exists. 
This means that the file will be moved if it 
exists and nothing will happen if it doesn't. 
By adding this conditional, 
we've taken an action that's not 
idempotent and turned it into an idempotent one. 
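A sketch of such an exec resource with the onlyif guard (the paths are illustrative assumptions):

```puppet
# The mv command runs only if example.txt still exists, turning a
# non-idempotent action into an idempotent one.
exec { 'move example file':
  command => '/bin/mv /home/user/example.txt /home/user/Desktop',
  onlyif  => '/usr/bin/test -e /home/user/example.txt',
}
```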
Another important aspect of how 
configuration management works is 
the test and repair paradigm. 
This means that actions are taken only 
when they are necessary to achieve a goal. 
Puppet will first test to see if the resource 
being managed like a file or a package, 
actually needs to be modified. 
If the file exists in the place we want it to, 
no action needs to be taken. 
If a package is already installed, 
there's no need to install it again. 
This avoids wasting time 
doing actions that aren't needed. 
Finally, another important characteristic 
is that Puppet is stateless. 
This means that there's no state being 
kept between runs of the agent. 
Each Puppet run is independent of 
the previous one, and the next one. 
Each time the Puppet agent runs, 
it collects the current facts. 
The Puppet master generates 
the rules based just on those facts, 
and then the agent applies them as necessary. 
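For instance, rules can branch on the facts collected at the start of each run, with nothing carried over between runs (a minimal sketch using the standard os.family fact from Facter; the package name is illustrative):

```puppet
# Facts are gathered fresh on every run; the generated rules
# depend only on these facts, not on any previous run.
if $facts['os']['family'] == 'Debian' {
  package { 'ntp':
    ensure => present,
  }
}
```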
We're just getting started with what 
configuration management is and 
what it looks like in Puppet. 
But hopefully, you're starting to see how 
understanding these basic concepts and 
turning them into practical rules can 
help you manage a small army of computers. 
Up next, there's a reading with 
links to more information about 
the concepts we've covered followed by 
a quick quiz. You've got this.

Domain-specific language
From Wikipedia, the free encyclopedia


A domain-specific language (DSL) is a computer language specialized to a
particular application domain. This is in contrast to a general-purpose
language (GPL), which is broadly applicable across domains. There are a wide
variety of DSLs, ranging from widely used languages for common domains, such
as HTML for web pages, down to languages used by only one or a few pieces of
software, such as MUSH soft code. DSLs can be further subdivided by the kind of
language, and include domain-specific markup languages, domain-
specific modeling languages (more generally, specification languages), and domain-
specific programming languages. Special-purpose computer languages have always
existed in the computer age, but the term "domain-specific language" has become
more popular due to the rise of domain-specific modeling. Simpler DSLs, particularly
ones used by a single application, are sometimes informally called mini-languages.
The line between general-purpose languages and domain-specific languages is not
always sharp, as a language may have specialized features for a particular domain
but be applicable more broadly, or conversely may in principle be capable of broad
application but in practice used primarily for a specific domain. For
example, Perl was originally developed as a text-processing and glue language, for
the same domain as AWK and shell scripts, but was mostly used as a general-
purpose programming language later on. By contrast, PostScript is a Turing
complete language, and in principle can be used for any task, but in practice is
narrowly used as a page description language.
Use
The design and use of appropriate DSLs is a key part of domain engineering, by using a
language suitable to the domain at hand – this may consist of using an existing DSL or GPL, or
developing a new DSL. Language-oriented programming considers the creation of special-
purpose languages for expressing problems as standard part of the problem-solving process.
Creating a domain-specific language (with software to support it), rather than reusing an existing
language, can be worthwhile if the language allows a particular type of problem or solution to be
expressed more clearly than an existing language would allow and the type of problem in
question reappears sufficiently often. Pragmatically, a DSL may be specialized to a particular
problem domain, a particular problem representation technique, a particular solution technique,
or other aspects of a domain.

Overview
A domain-specific language is created specifically to solve problems in a particular domain and is
not intended to be able to solve problems outside of it (although that may be technically
possible). In contrast, general-purpose languages are created to solve problems in many
domains. The domain can also be a business area. Some examples of business areas include:

 life insurance policies (developed internally by a large insurance enterprise)
 combat simulation
 salary calculation
 billing
A domain-specific language is somewhere between a tiny programming language and a scripting
language, and is often used in a way analogous to a programming library. The boundaries
between these concepts are quite blurry, much like the boundary between scripting languages
and general-purpose languages.

In design and implementation


Domain-specific languages are languages (or often, declared syntaxes or grammars) with very
specific goals in design and implementation. A domain-specific language can be one of a visual
diagramming language, such as those created by the Generic Eclipse Modeling System,
programmatic abstractions, such as the Eclipse Modeling Framework, or textual languages. For
instance, the command line utility grep has a regular expression syntax which matches patterns
in lines of text. The sed utility defines a syntax for matching and replacing regular expressions.
Often, these tiny languages can be used together inside a shell to perform more complex
programming tasks.
The line between domain-specific languages and scripting languages is somewhat blurred, but
domain-specific languages often lack low-level functions for filesystem access, interprocess
control, and other functions that characterize full-featured programming languages, scripting or
otherwise. Many domain-specific languages do not compile to byte-code or executable code, but
to various kinds of media objects: GraphViz exports to PostScript, GIF, JPEG, etc.,
where Csound compiles to audio files, and a ray-tracing domain-specific language
like POV compiles to graphics files. A computer language like SQL presents an interesting case:
it can be deemed a domain-specific language because it is specific to a specific domain (in
SQL's case, accessing and managing relational databases), and is often called from another
application, but SQL has more keywords and functions than many scripting languages, and is
often thought of as a language in its own right, perhaps because of the prevalence of database
manipulation in programming and the amount of mastery required to be an expert in the
language.
Further blurring this line, many domain-specific languages have exposed APIs, and can be
accessed from other programming languages without breaking the flow of execution or calling a
separate process, and can thus operate as programming libraries.
Programming tools
Some domain-specific languages expand over time to include full-featured programming tools,
which further complicates the question of whether a language is domain-specific or not. A good
example is the functional language XSLT, specifically designed for transforming one XML graph
into another, which has been extended since its inception to allow (particularly in its 2.0 version)
for various forms of filesystem interaction, string and date manipulation, and data typing.
In model-driven engineering, many examples of domain-specific languages may be found
like OCL, a language for decorating models with assertions or QVT, a domain-specific
transformation language. However, languages like UML are typically general-purpose modeling
languages.
To summarize, an analogy might be useful: a Very Little Language is like a knife, which can be
used in thousands of different ways, from cutting food to cutting down trees. A domain-specific
language is like an electric drill: it is a powerful tool with a wide variety of uses, but a specific
context, namely, putting holes in things. A General Purpose Language is a complete workbench,
with a variety of tools intended for performing a variety of tasks. Domain-specific languages
should be used by programmers who, looking at their current workbench, realize they need a
better drill and find that a particular domain-specific language provides exactly that.

Domain-specific language topics


External and Embedded Domain Specific Languages
DSLs implemented via an independent interpreter or compiler are known as External Domain
Specific Languages. Well known examples include LaTeX or AWK. A separate category known
as Embedded (or Internal) Domain Specific Languages are typically implemented within a host
language as a library and tend to be limited to the syntax of the host language, though this
depends on host language capabilities.[1]

Usage patterns
There are several usage patterns for domain-specific languages: [2][3]

 Processing with standalone tools, invoked via direct user operation, often on the
command line or from a Makefile (e.g., grep for regular expression matching, sed,
lex, yacc, the GraphViz toolset, etc.)
 Domain-specific languages which are implemented using programming language
macro systems, and which are converted or expanded into a host general purpose
language at compile-time or realtime
 embedded domain-specific language (eDSL),[4] implemented as libraries which
exploit the syntax of their host general purpose language or a subset thereof while
adding domain-specific language elements (data types, routines, methods, macros
etc.). (e.g. jQuery, React, Embedded SQL, LINQ)
 Domain-specific languages which are called (at runtime) from programs written in
general purpose languages like C or Perl, to perform a specific function, often
returning the results of operation to the "host" programming language for further
processing; generally, an interpreter or virtual machine for the domain-specific
language is embedded into the host application (e.g. format strings, a regular
expression engine)
 Domain-specific languages which are embedded into user applications (e.g., macro
languages within spreadsheets) and which are (1) used to execute code that is
written by users of the application, (2) dynamically generated by the application, or
(3) both.
Many domain-specific languages can be used in more than one way. DSL code
embedded in a host language may have special syntax support, such as regexes in sed, AWK,
Perl or JavaScript, or may be passed as strings.
Design goals
Adopting a domain-specific language approach to software engineering involves both risks and
opportunities. The well-designed domain-specific language manages to find the proper balance
between these.
Domain-specific languages have important design goals that contrast with those of general-
purpose languages:

 Domain-specific languages are less comprehensive.
 Domain-specific languages are much more expressive in their domain.
 Domain-specific languages should exhibit minimal redundancy.
Idioms
In programming, idioms are methods imposed by programmers to handle common development
tasks, e.g.:

 Ensure data is saved before the window is closed.
 Edit code whenever command-line parameters change because they affect program
behavior.
General purpose programming languages rarely support such idioms, but domain-specific
languages can describe them, e.g.:

 A script can automatically save data.
 A domain-specific language can parameterize command line input.

Examples
Examples of domain-specific languages include HTML, Logo for pencil-like
drawing, Verilog and VHDL hardware description languages, MATLAB and GNU Octave for
matrix programming, Mathematica, Maple and Maxima for symbolic mathematics, Specification
and Description Language for reactive and distributed systems, spreadsheet formulas and
macros, SQL for relational database queries, YACC grammars for creating parsers, regular
expressions for specifying lexers, the Generic Eclipse Modeling System for creating diagramming
languages, Csound for sound and music synthesis, and the input languages
of GraphViz and GrGen, software packages used for graph layout and graph rewriting, Hashicorp
Configuration Language used for Terraform and other Hashicorp tools, Puppet also has its
own configuration language.

GameMaker Language
The GML scripting language used by GameMaker Studio is a domain-specific language targeted
at novice programmers to easily be able to learn programming. While the language serves as a
blend of multiple languages including Delphi, C++, and BASIC, there is a lack of structures, data
types, and other features of a full-fledged programming language. Many of the built-in functions
are sandboxed for the purpose of easy portability. The language primarily serves to make it easy
for anyone to pick up the language and develop a game.

ColdFusion Markup Language


ColdFusion's associated scripting language is another example of a domain-specific language for
data-driven websites. This scripting language is used to weave together languages and services
such as Java, .NET, C++, SMS, email, email servers, http, ftp, exchange, directory services, and
file systems for use in websites.
The ColdFusion Markup Language (CFML) includes a set of tags that can be used in ColdFusion
pages to interact with data sources, manipulate data, and display output. CFML tag syntax is
similar to HTML element syntax.
Erlang OTP
The Erlang Open Telecom Platform was originally designed for use inside Ericsson as a domain-
specific language. The language itself offers a platform of libraries to create finite state machines,
generic servers and event managers that quickly allow an engineer to deploy applications, or
support libraries, that have been shown in industry benchmarks to outperform other languages
intended for a mixed set of domains, such as C and C++. The language is now officially open
source and can be downloaded from their website.

FilterMeister
FilterMeister is a programming environment, with a programming language that is based on C,
for the specific purpose of creating Photoshop-compatible image processing filter plug-ins;
FilterMeister runs as a Photoshop plug-in itself and it can load and execute scripts or compile
and export them as independent plug-ins. Although the FilterMeister language reproduces a
significant portion of the C language and function library, it contains only those features which
can be used within the context of Photoshop plug-ins and adds a number of specific features
only useful in this specific domain.

MediaWiki templates
The Template feature of MediaWiki is an embedded domain-specific language whose
fundamental purpose is to support the creation of page templates and the transclusion (inclusion
by reference) of MediaWiki pages into other MediaWiki pages.

Software engineering uses


There has been much interest in domain-specific languages to improve the productivity and
quality of software engineering. Domain-specific language could possibly provide a robust set of
tools for efficient software engineering. Such tools are beginning to make their way into the
development of critical software systems.
The Software Cost Reduction Toolkit [5] is an example of this. The toolkit is a suite of utilities
including a specification editor to create a requirements specification, a dependency graph
browser to display variable dependencies, a consistency checker to catch missing cases in well-
formed formulas in the specification, a model checker and a theorem prover to check program
properties against the specification, and an invariant generator that automatically constructs
invariants based on the requirements.
A newer development is language-oriented programming, an integrated software
engineering methodology based mainly on creating, optimizing, and using domain-specific
languages.

Metacompilers
Further information: Metacompiler

Complementing language-oriented programming, as well as all other forms of domain-specific
languages, are the class of compiler writing tools called metacompilers. A metacompiler is not
only useful for generating parsers and code generators for domain-specific languages, but
a metacompiler itself compiles a domain-specific metalanguage specifically designed for the
domain of metaprogramming.
Besides parsing domain-specific languages, metacompilers are useful for generating a wide
range of software engineering and analysis tools. The meta-compiler methodology is often found
in program transformation systems.
Metacompilers that played a significant role in both computer science and the computer industry
include Meta-II,[6] and its descendant TreeMeta.[7]

Unreal Engine before version 4 and other games


Unreal and Unreal Tournament unveiled a language called UnrealScript. This allowed for rapid
development of modifications compared to the competitor Quake (using the Id Tech 2 engine).
The Id Tech engine used standard C code meaning C had to be learned and properly applied,
while UnrealScript was optimized for ease of use and efficiency. Similarly, the development of
more recent games introduced their own specific languages, one more common example
is Lua for scripting.

Rules Engines for Policy Automation


Various Business Rules Engines have been developed for automating policy and business rules
used in both government and private industry. ILOG, Oracle Policy
Automation, DTRules, Drools and others provide support for DSLs aimed to support various
problem domains. DTRules goes so far as to define an interface for the use of multiple DSLs
within a Rule Set.
The purpose of Business Rules Engines is to define a representation of business logic in as
human-readable fashion as possible. This allows both subject matter experts and developers to
work with and understand the same representation of the business logic. Most Rules Engines
provide both an approach to simplifying the control structures for business logic (for example,
using Declarative Rules or Decision Tables) coupled with alternatives to programming syntax in
favor of DSLs.

Statistical modelling languages


Statistical modelers have developed domain-specific languages such as R (an implementation of
the S language), Bugs, Jags, and Stan. These languages provide a syntax for describing a
Bayesian model and generate a method for solving it using simulation.

Generate model and services to multiple programming languages
Generate object handling and services based on an Interface Description Language for a
domain-specific language such as JavaScript for web applications, HTML for documentation,
C++ for high-performance code, etc. This is done by cross-language frameworks such as Apache
Thrift or Google Protocol Buffers.

Gherkin[edit]
Gherkin is a language designed to define test cases to check the behavior of software, without
specifying how that behavior is implemented. It is meant to be read and used by non-technical
users using a natural language syntax and a line-oriented design. The tests defined with Gherkin
must then be implemented in a general programming language. Then, the steps in a Gherkin
program act as a syntax for method invocation accessible to non-developers. 

Other examples
Other prominent examples of domain-specific languages include:

 Emacs Lisp
 Game Description Language
 OpenGL Shading Language
 Gradle
 ActionScript

Advantages and disadvantages


Some of the advantages:[2][3]

 Domain-specific languages allow solutions to be expressed in the idiom and at the
level of abstraction of the problem domain. The idea is that domain experts
themselves may understand, validate, modify, and often even develop domain-
specific language programs. However, this is seldom the case.[8]
 Domain-specific languages allow validation at the domain level. As long as the
language constructs are safe any sentence written with them can be considered
safe.
 Domain-specific languages can help to shift the development of business information
systems from traditional software developers to the typically larger group of domain-
experts who (despite having less technical expertise) have a deeper knowledge of
the domain.[9]
 Domain-specific languages are easier to learn, given their limited scope.
Some of the disadvantages:

 Cost of learning a new language
 Limited applicability
 Cost of designing, implementing, and maintaining a domain-specific language as well
as the tools required to develop with it (IDE)
 Finding, setting, and maintaining proper scope.
 Difficulty of balancing trade-offs between domain-specificity and general-purpose
programming language constructs.
 Potential loss of processor efficiency compared with hand-coded software.
 Proliferation of similar non-standard domain-specific languages, for example, a DSL
used within one insurance company versus a DSL used within another insurance
company.[10]
 Non-technical domain experts can find it hard to write or modify DSL programs by
themselves.[8]
 Increased difficulty of integrating the DSL with other components of the IT system (as
compared to integrating with a general-purpose language).
 Low supply of experts in a particular DSL tends to raise labor costs.
 Harder to find code examples.

Tools for designing domain-specific languages

 JetBrains MPS is a tool for designing domain-specific languages. It uses projectional
editing which allows overcoming the limits of language parsers and building DSL
editors, such as ones with tables and diagrams. It implements language-oriented
programming. MPS combines an environment for language definition, a language
workbench, and an Integrated Development Environment (IDE) for such languages. [11]
 Xtext is an open-source software framework for developing programming languages
and domain-specific languages (DSLs). Unlike standard parser generators, Xtext
generates not only a parser but also a class model for the abstract syntax tree. In
addition, it provides a fully featured, customizable Eclipse-based IDE. [12]
 Racket is a cross-platform language toolchain including native code, JIT and
Javascript compiler, IDE (in addition to supporting Emacs, Vim, VSCode and others)
and command line tools designed to accommodate creating both domain-specific
and general purpose languages.[13][14]

Puppet can be somewhat alien to technologists who have a background in automation
scripting. Where most of our scripts are procedural, Puppet is declarative. While a
declarative language has many major advantages for configuration management, it does
impose some interesting restrictions on the approaches we use to solve common
problems.
Although Puppet’s design philosophy may not be the most exciting topic to begin this
book, it drives many of the practices in the coming chapters. Understanding that
philosophy will help contextualize many of the recommendations covered.

Declarative code
The Puppet Domain Specific Language (DSL) is a declarative language, as opposed to the
imperative or procedural languages that system administrators tend to be most comfortable
and familiar with.

In an imperative language, we describe how to accomplish a task. In a declarative language, we
describe what we want to accomplish. Imperative languages focus on actions to reach a result, and declarative
languages focus on the result we wish to achieve. We will see examples of the difference below.

Puppet’s language is (mostly) verbless. Understanding and internalizing this paradigm is
critical when working with Puppet; attempting to force Puppet to use a procedural or
imperative process can quickly become an exercise in frustration and will tend to produce
fragile code.

In theory, a declarative language is ideal for configuration baselining tasks. With the Puppet
DSL, we describe the desired state of our systems, and Puppet handles all responsibility for
making sure the system conforms to this desired state. Unfortunately, most of us are used to a
procedural approach to system administration. The vast majority of the bad Puppet code I’ve
seen has been the result of trying to write procedural code in Puppet, rather than adapting
existing procedures to Puppet’s declarative model.
Procedural code with Puppet

In some cases, writing procedural code in Puppet is unavoidable. However, such code is
rarely elegant, often creates unexpected bugs, and can be difficult to maintain. We will see
practical examples and best practices for writing procedural code when we look at the exec
resource type in Chapter 5.

Of course, it’s easy to simply say “be declarative.” In the real world, we are often tasked to
deploy software that isn’t designed for a declarative installation process. A large part of this
book will attempt to address how to handle many uncommon tasks in a declarative way. As a
general rule, if your infrastructure is based around packaged open source software, writing
declarative Puppet code will be relatively straightforward. Puppet’s built-in types and
providers will provide a declarative way to handle most of your operational tasks. If your
infrastructure includes Windows clients and a lot of enterprise software, writing declarative
Puppet code may be significantly more challenging.

Another major challenge system administrators face when working within the constraints of a
declarative model is that we tend to operate using an imperative workflow. How often have
you manipulated files using regular expression substitution? How often do we massage data
using a series of temp files and piped commands? While Puppet offers many ways to
accomplish the same tasks, most of our procedural approaches do not map well into Puppet’s
declarative language. We will explore some examples of this common problem and discuss
alternative approaches to solving it.
What is declarative code anyway?
As mentioned earlier, declarative code tends not to have verbs. We don’t create users and we
don’t remove them; we ensure that the users are present or absent. We don’t install or remove
software; we ensure that software is present or absent. Where create and install are verbs,
present and absent are adjectives. The difference seems trivial at first, but proves to be very
important in practice.
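In Puppet's DSL, that adjective-based model reads like this minimal sketch (the username and group are illustrative, chosen to match the bash examples below):

```puppet
# We declare that the group and user should be present,
# not the commands needed to create them.
group { 'examplegroup':
  ensure => present,
}

user { 'alice':
  ensure => present,
  gid    => 'examplegroup',
}
```

Changing ensure => present to ensure => absent declares removal with the same resource, rather than requiring a completely different procedure.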

A real world example:

Imagine that I’m giving you directions to the Palace of Fine Arts in San Francisco.

Procedural instructions:
 Head North on 19th Avenue

 Get on US-101S

 Take Marina Blvd. to Palace Dr.

 Park at the Palace of Fine Arts Theater

These instructions make a few major assumptions:


 You aren’t already at the Palace of Fine Arts

 You are driving a car

 You are currently in San Francisco

 You are currently on 19th avenue or know how to get there.

 You are heading North on 19th avenue.

 There are no road closures or other traffic disruptions that would force you to a different
route.

Compare this to the declarative instructions:


 Be at 3301 Lyon Street, San Francisco, CA 94123 at 7:00PM

The declarative approach has a few major advantages in this case:


 It makes no assumptions about your mode of transportation. These instructions are still
valid if your plans involve public transit or parachuting into the city.

 The directions are valid even if you’re currently at the Palace of Fine Arts

 These instructions empower you to route around road closures and traffic

The declarative approach allows you to choose the best way to reach the destination based on
your current situation, and it relies on your ability to find your way to the destination given.

Declarative languages aren’t magic. Much like an address relies on your understanding how
to read a map or use a GPS device, Puppet’s declarative model relies on its own procedural
code to turn your declarative request into a set of instructions that can achieve the declared
state. Puppet’s model uses a Resource type to model an object, and a provider implementing
procedural code to produce the state the model describes.
The major limitation imposed by Puppet’s declarative model might be somewhat obvious. If
a native resource type doesn’t exist for the resource you’re trying to model, you can’t manage
that resource in a declarative way. Declaring that I want a red two-story house with 4
bedrooms might empower you to build the house out of straw or wood or brick, but it
probably won’t actually accomplish anything if you don’t happen to be a general contractor.

There is some good news on this front, however. Puppet already includes native types and
providers for most common objects, the Puppet community has supplied additional native
models, and if you absolutely have to accomplish something procedurally you can almost
always fall back to the exec resource type.

A practical example
Let’s examine a practical example of procedural code for user management. We will discuss
how this code can be made robust, look at its declarative equivalent in Puppet, and review the
benefits of using Puppet rather than a shell script for this task.

Imperative / Procedural Code


Here’s an example of an imperative process using Bash. In this case, we are going to create
a user with a home directory and an authorized SSH key on a CentOS 6 host.
Example 1-1. Imperative user creation with BASH

groupadd examplegroup
useradd -g examplegroup alice
mkdir ~alice/.ssh/
chown alice:examplegroup ~alice/.ssh
echo "ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAIEAm3TAgMF/2RY+r7KIeUoNbQb1TP6ApOtgJPNV\
0TY6teCjbxm7fjzBxDrHXBS1vr+fe6xa67G5ef4sRLl0kkTZisnIguXqXOaeQTJ4Idy4LZEVVbngkd\
2R9rA0vQ7Qx/XrZ0hgGpBA99AkxEnMSuFrD/E5TunvRHIczaI9Hy0IMXc= \
alice@localhost" > ~alice/.ssh/authorized_keys

What if we decide this user should also be a member of the wheel group?
Example 1-2. Imperative user modification with Bash

usermod -G wheel alice

And if we want to remove that user and that user’s group?


Example 1-3. Imperative user removal with BASH

userdel alice

groupdel examplegroup

Notice a few things about this example:


 Each process is completely different

 The correct process to use depends on the current state of the user

 Each of these processes will produce errors if invoked more than one time
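The errors-on-reinvocation problem is easy to reproduce without root privileges. The sketch below uses mkdir as a stand-in for groupadd and useradd, since all three fail when their target already exists (the directory path is purely illustrative):

```shell
# Like groupadd and useradd, mkdir fails when its target already exists.
dir="$(mktemp -d)/example"
mkdir "$dir" && echo "first run: created"
mkdir "$dir" 2>/dev/null || echo "second run: already exists"
```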

Imagine for a second that we have several systems. On some systems, alice is absent.
On others, alice is present, but not a member of the wheel group. On some
systems, alice is present and a member of the wheel group. Now imagine that we need to
write a script to ensure that alice exists, is a member of the wheel group, and has the correct
authorized key on every system. What would such a script look like?
Example 1-4. Robust user management with BASH

#!/bin/bash

if ! getent group examplegroup; then
  groupadd examplegroup
fi

if ! getent passwd alice; then
  useradd -g examplegroup -G wheel alice
fi

if ! id -nG alice | grep -q 'examplegroup wheel'; then
  usermod -g examplegroup -G wheel alice
fi

if ! test -d ~alice/.ssh; then
  mkdir -p ~alice/.ssh
fi

chown alice:examplegroup ~alice/.ssh

if ! grep -q alice@localhost ~alice/.ssh/authorized_keys; then
  echo "ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAIEAm3TAgMF/2RY+r7KIeUoNbQb1TP6ApOtg\
JPNV0TY6teCjbxm7fjzBxDrHXBS1vr+fe6xa67G5ef4sRLl0kkTZisnIguXqXOaeQTJ4Idy4LZEVVb\
ngkd2R9rA0vQ7Qx/XrZ0hgGpBA99AkxEnMSuFrD/E5TunvRHIczaI9Hy0IMXc= \
alice@localhost" >> ~alice/.ssh/authorized_keys
fi

chmod 600 ~alice/.ssh/authorized_keys

Of course, this example only covers the use case of creating and managing a few basic
properties of a user. If our policy changed, we would need to write a completely different
script to manage this user. Even fairly simple changes, such as revoking this user’s wheel
access, could require significant changes to the script.

This approach has one other major disadvantage: it will only work on platforms that
implement the same commands and arguments as our reference platform. This example will
fail on FreeBSD (which implements adduser rather than useradd), Mac OS X, and Windows.
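A portable script would first need to branch on the platform before choosing a tool. Here is a minimal sketch of such a dispatch; the command names are the conventional ones for each OS, but their flags differ per platform and are not handled here:

```shell
# Select the conventional user-creation tool for the current platform.
case "$(uname -s)" in
  Linux)   add_cmd='useradd' ;;
  FreeBSD) add_cmd='pw useradd' ;;
  Darwin)  add_cmd='dscl' ;;
  *)       add_cmd='' ;;
esac
echo "user-creation tool: ${add_cmd:-unsupported platform}"
```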
Declarative code

Let’s look at our user management example using Puppet’s declarative DSL.
Creating a user and group:
Example 1-5. Declarative user creation with Puppet

$ssh_key = 'AAAAB3NzaC1yc2EAAAABIwAAAIEAm3TAgMF/2RY+r7KIeUoNbQb1TP6ApOtgJPNV0TY6teCjbxm7fjzBxDrHXBS1vr+fe6xa67G5ef4sRLl0kkTZisnIguXqXOaeQTJ4Idy4LZEVVbngkd2R9rA0vQ7Qx/XrZ0hgGpBA99AkxEnMSuFrD/E5TunvRHIczaI9Hy0IMXc='

group { 'examplegroup':
  ensure => 'present',
}

user { 'alice':
  ensure     => 'present',
  gid        => 'examplegroup',
  managehome => true,
}

ssh_authorized_key { 'alice@localhost':
  ensure => 'present',
  user   => 'alice',
  type   => 'ssh-rsa',
  key    => $ssh_key,
}

Adding alice to the wheel group:


Example 1-6. Declarative group membership with puppet

$ssh_key = 'AAAAB3NzaC1yc2EAAAABIwAAAIEAm3TAgMF/2RY+r7KIeUoNbQb1TP6ApOtgJPNV0TY6teCjbxm7fjzBxDrHXBS1vr+fe6xa67G5ef4sRLl0kkTZisnIguXqXOaeQTJ4Idy4LZEVVbngkd2R9rA0vQ7Qx/XrZ0hgGpBA99AkxEnMSuFrD/E5TunvRHIczaI9Hy0IMXc='

group { 'examplegroup':
  ensure => 'present',
}

user { 'alice':
  ensure     => 'present',
  gid        => 'examplegroup',
  groups     => 'wheel', # (1)
  managehome => true,
}

ssh_authorized_key { 'alice@localhost':
  ensure => 'present',
  user   => 'alice',
  type   => 'ssh-rsa',
  key    => $ssh_key,
}

(1) Note that the only change between this example and the previous example is the addition
of the groups parameter for the alice resource.
Remove alice:
Example 1-7. Ensure that a user is absent using Puppet

$ssh_key = 'AAAAB3NzaC1yc2EAAAABIwAAAIEAm3TAgMF/2RY+r7KIeUoNbQb1TP6ApOtgJPNV0TY6teCjbxm7fjzBxDrHXBS1vr+fe6xa67G5ef4sRLl0kkTZisnIguXqXOaeQTJ4Idy4LZEVVbngkd2R9rA0vQ7Qx/XrZ0hgGpBA99AkxEnMSuFrD/E5TunvRHIczaI9Hy0IMXc='

group { 'examplegroup':
  ensure => 'absent', # (1)
}

user { 'alice':
  ensure     => 'absent', # (2)
  gid        => 'examplegroup',
  groups     => 'wheel',
  managehome => true,
}

ssh_authorized_key { 'alice@localhost':
  ensure => 'absent', # (3)
  user   => 'alice',
  type   => 'ssh-rsa',
  key    => $ssh_key,
}

Ssh_authorized_key['alice@localhost'] -> # (4)
User['alice'] ->                         # (5)
Group['examplegroup']

(1), (2), (3) Ensure values are changed from 'present' to 'absent'.

(4), (5) Resource ordering is added to ensure groups are removed after users. Normally, the
correct order is implied due to the Autorequire feature discussed in Chapter 5.

You may notice the addition of resource ordering in this example when it wasn’t required in previous
examples. This is a byproduct of Puppet’s Autorequire feature. PUP-2451 explains the issue in greater depth.

In practice, rather than managing alice as 3 individual resources, we would abstract this into a defined type
that has its own ensure parameter, and conditional logic to enforce the correct resource dependency ordering.
In this example, we are able to remove the user by changing the ensure state from present to
absent on the user’s resources. Although we could remove other parameters such as gid,
groups, and the user’s key, in most cases it’s better to simply leave the values in place, just in
case we ever decide to restore this user.

It’s usually best to disable accounts rather than remove them. This helps preserve file ownership information
and helps avoid UID reuse.

In our procedural examples, we saw a script that would bring several divergent systems into
conformity. For each step of that example script, we had to analyze the current state of the
system, and perform an action based on state. With a declarative model, all of that work is
abstracted away. If we wanted to have a user who was a member of 2 groups, we would
simply declare that user as such, as in Example 1-6.

Non-Declarative code with Puppet


It is possible to write non-declarative code with Puppet. Please don’t do this:

$app_source = 'http://www.example.com/application.tar.gz'
$app_target = '/tmp/application.tar.gz'

exec { 'download application':
  command => "/usr/bin/wget -q ${app_source} -O ${app_target}",
  creates => '/usr/local/application/',
  notify  => Exec['extract application'],
}

exec { 'extract application':
  command     => "/bin/tar -zxf ${app_target} -C /usr/local",
  refreshonly => true,
  creates     => '/usr/local/application/',
}

This specific example has a few major problems:


1. Exec resources have a set timeout. This example may work well over a relatively fast
corporate Internet connection, and then fail completely from a home DSL line. The
solution would be to set the timeout parameter of the exec resources to a reasonably
high value.

2. This example does not validate the checksum of the downloaded file, which could
produce some odd results upon extraction. An additional exec resource might be used to
test and correct for this case automatically.

3. In some cases, a partial or corrupted download may wedge this process. We attempt to
work around this problem by overwriting the archive each time it’s downloaded.

4. This example makes several assumptions about the contents of application.tar.gz. If any
of those assumptions are wrong, these commands will repeat every time Puppet is
invoked.

5. This example is not particularly portable, and would require a platform-specific
implementation for each supported OS.

6. This example would not be particularly useful for upgrading the application.
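Problem 2 — validating the download — can be addressed with an ordinary checksum comparison before extraction. A minimal sketch, using an empty temp file so the example is self-contained (the expected value shown is the MD5 of an empty file, not of any real archive):

```shell
# Verify an archive against a known checksum before extracting it.
archive="$(mktemp)"                         # stands in for application.tar.gz
expected='d41d8cd98f00b204e9800998ecf8427e' # MD5 of the empty file
actual="$(md5sum "$archive" | cut -d' ' -f1)"
if [ "$actual" = "$expected" ]; then
  echo "checksum ok: safe to extract"
else
  echo "checksum mismatch: re-download" >&2
fi
```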

This is a relatively clean example of non-declarative Puppet code, of the sort often seen when
working with software that is not available in a native packaging format. Had this application
been distributed as an RPM, dpkg, or MSI package, we could have simply used a package
resource for improved portability, flexibility, and reporting. While this example does not
follow best practices, there are situations where it is unavoidable, often for business or
support reasons.

This example could be made declarative using the nanliu/staging module.

Another common pattern is the use of conditional logic and custom facts to test for the
presence of software. Please don’t do this unless it’s absolutely unavoidable:

Facter.add(:example_app_version) do
  confine :kernel => 'Linux'
  setcode do
    Facter::Core::Execution.exec('/usr/local/app/example_app --version')
  end
end

$app_source = 'http://www.example.com/app-1.2.3.tar.gz'
$app_target = '/tmp/app-1.2.3.tar.gz'

if $example_app_version != '1.2.3' {
  exec { 'download application':
    command => "/usr/bin/wget -q ${app_source} -O ${app_target}",
    before  => Exec['extract application'],
  }

  exec { 'extract application':
    command => "/bin/tar -zxf ${app_target} -C /usr/local",
  }
}

This particular example has many of the same problems as the previous example, and
introduces a new one: it breaks Puppet’s reporting and auditing model. The
conditional logic causes the download and extraction resources not to appear
in the catalog sent to the client following initial installation. We won’t be able to audit our
run reports to see whether the download and extraction commands are in a consistent
state. Of course, we could check the example_app_version fact if it happens to be
available, but this approach becomes increasingly useless as more resources are embedded in
conditional logic.
This approach is also sensitive to Facter and plugin-sync related issues, and would definitely
produce some unwanted results with cached catalogs.

Using facts to exclude parts of the catalog does have one benefit: it can be used to obfuscate
parts of the catalog so that sensitive resources do not appear in future Puppet runs. This can be
handy if, for example, your wget command embeds a passphrase and you wish to limit how
often it appears in your catalogs and reports. Obviously, there are better solutions to that
particular problem, but in some cases there is also a benefit to defense in depth.

Idempotency
In computer science, an idempotent function is a function that produces the same result
whether it is applied once or 100 times. For example, X = 1 is an idempotent operation.
X = X + 1 is a non-idempotent operation.
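The distinction is easy to see in any language; here it is in shell, the language of the earlier examples:

```shell
# Idempotent: assigning a constant converges after the first application.
X=0
X=1; X=1; X=1
echo "after three idempotent applications: X=$X"

# Non-idempotent: the result depends on how many times it is applied.
X=$((X + 1)); X=$((X + 1))
echo "after two non-idempotent applications: X=$X"
```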
Puppet as a language is designed to be inherently idempotent, and as a system, Puppet is
designed to be used in an idempotent way. A large part of this idempotency is owed to its
declarative resource management model; however, Puppet also enforces a number of rules on
its variable handling, iterators, and conditional logic to maintain its idempotency.

Idempotence has major benefits for a configuration management language:


 The configuration is inherently self-healing

 State does not need to be maintained between invocations

 Configurations can be safely re-applied

For example, if for some reason Puppet fails part way through a configuration run, re-
invoking Puppet will complete the run and repair any configurations that were left in an
inconsistent state by the previous run.
Convergence vs Idempotence

Configuration management languages are often discussed in terms of their convergence
model. Some tools are designed to be eventually convergent; others are immediately
convergent and/or idempotent.

With an eventually convergent system, the configuration management tool is invoked over
and over; each time the tool is invoked, the system approaches a converged state, where all
changes defined in the configuration language have been applied, and no more changes can
take place. During the process of convergence, the system is said to be in a partially
converged, or inconsistent state.

For Puppet to be idempotent, it cannot by definition also be eventually convergent. It must
reach a convergent state in a single run, and remain in the same state for any subsequent
invocations. Puppet can still be described as an immediately convergent system, since it is
designed to reach a convergent state after a single invocation.

Convergence of course also implies the existence of a diverged state. Divergence is the act of
moving the system away from the desired converged state. This typically happens when
someone attempts to manually alter a resource that is under configuration management
control.
There are many practices that can break Puppet’s idempotence model. In most cases,
breaking Puppet’s idempotence model would be considered a bug, and would be against best
practices. There are however some cases where a level of eventual convergence is
unavoidable. One such example is handling the numerous post-installation software reboots
that are common when managing Windows nodes.
Side effects
In computer science, a side effect is a change of system or program state that is outside the
defined scope of the original operation. Declarative and idempotent languages usually
attempt to manage, reduce, and eliminate side effects. With that said, it is entirely possible for
an idempotent operation to have side effects.

Puppet attempts to limit side effects, but does not eliminate them by any means; doing so
would be nearly impossible given Puppet’s role as a system management tool.

Some side effects are designed into the system. For example, every resource will generate a
notification upon changing a resource state that may be consumed by other resources. The
notification is used to restart services in order to ensure that the running state of the system
reflects the configured state. File bucketing is another obvious intended side effect designed
into Puppet.

Some side effects are unavoidable. Every access to a file on disk will cause that file’s atime
to be updated unless the entire filesystem is mounted with the noatime attribute. This is
of course true whether or not Puppet is being invoked in noop mode.

Resource level idempotence


Many common tasks are not idempotent by nature, and will either throw an error or produce
undesirable results if invoked multiple times. For example, the following code is not
idempotent because it will set a state the first time, and throw an error each time it’s
subsequently invoked.
Example 1-8. A non-idempotent operation that will throw an error

useradd alice

The following code is not idempotent, because it will add undesirable duplicate host entries
each time it’s invoked:
Example 1-9. A non-idempotent operation that will create duplicate records

echo '127.0.0.1 example.localdomain' >> /etc/hosts

The following code is idempotent, but will probably have undesirable results:
Example 1-10. An idempotent operation that will destroy data

echo '127.0.0.1 example.localdomain' > /etc/hosts

To make our example idempotent without clobbering /etc/hosts, we can add a simple check
before modifying the file:
Example 1-11. An imperative idempotent operation

grep -q '^127.0.0.1 example.localdomain$' /etc/hosts \
  || echo '127.0.0.1 example.localdomain' >> /etc/hosts
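To see the difference concretely, here is the same pattern exercised against a scratch file rather than the real /etc/hosts — the unguarded append accumulates duplicates, while the guarded version converges on a single entry:

```shell
hosts="$(mktemp)"            # stands in for /etc/hosts

# Unguarded append: duplicates pile up with every invocation.
echo '127.0.0.1 example.localdomain' >> "$hosts"
echo '127.0.0.1 example.localdomain' >> "$hosts"
echo "unguarded, after two runs: $(wc -l < "$hosts") lines"

# Guarded append: safe to apply any number of times.
guarded="$(mktemp)"
for run in 1 2 3; do
  grep -q '^127.0.0.1 example.localdomain$' "$guarded" \
    || echo '127.0.0.1 example.localdomain' >> "$guarded"
done
echo "guarded, after three runs: $(wc -l < "$guarded") line"
```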


The same example is simple to write in a declarative and idempotent way using the native
Puppet host resource type:
Example 1-12. Declarative Idempotence with Puppet

host { 'example.localdomain':
  ip => '127.0.0.1',
}

Alternatively, we could implement this example using the file_line resource type from the
optional stdlib Puppet module:
Example 1-13. Idempotent host entry using the File_line resource type

file_line { 'example.localdomain host':
  path => '/etc/hosts',
  line => '127.0.0.1 example.localdomain',
}

In both cases, the resource is modeled in a declarative way and is idempotent by its very
nature. Under the hood, Puppet handles the complexity of determining whether the line
already exists, and how it should be inserted into the underlying file. Using the native host
resource type, Puppet also determines what file should be modified and where that file is
located.

The idempotent examples are safe to run as many times as you like. This is a huge benefit
across large environments; when trying to apply a change to thousands of hosts, it’s relatively
common for failures to occur on a small subset of the hosts being managed. Perhaps a host
is down during deployment, or perhaps you experienced some sort of transmission loss or
timeout when deploying a change. If you are using an idempotent language or process to
manage your systems, it’s possible to handle these exceptional cases simply by performing a
second configuration run against the affected hosts (or even against the entire infrastructure).

When working with native resource types, you typically don’t have to worry about
idempotence; most resources handle idempotence natively. A couple of notable exceptions to
this statement are the exec and augeas resource types. We’ll explore those in depth in
Chapter 5.
Puppet does, however, attempt to track whether or not a resource has changed state. This is
used as part of Puppet’s reporting mechanism and to determine whether a signal
should be sent to resources with a notify relationship. Because Puppet tracks whether a
resource has made a change, it’s entirely possible to write code that is functionally
idempotent without meeting the criteria of idempotence in Puppet’s resource model.
For example, the following code is functionally idempotent, but will report as having
changed state with every Puppet run.
Example 1-14. Puppet code that will report as non-idempotent

exec { 'grep -q /bin/bash /etc/shells || echo /bin/bash >> /etc/shells':
  path     => '/bin',
  provider => 'shell',
}

Puppet’s idempotence model relies on a special aspect of its resource model. For every
resource, Puppet first determines that resource’s current state. If the current state does not
match the defined state of that resource, Puppet invokes the appropriate methods on the
resource’s native provider to bring the resource into conformity with the desired state. In most
cases, this is handled transparently; however, there are a few exceptions that we will discuss
in their respective chapters. Understanding these cases will be critical in order to avoid
breaking Puppet’s simulation and reporting models.

This example will report correctly:


Example 1-15. Improved code that will report as Idempotent

exec { 'echo /bin/bash >> /etc/shells':
  path   => '/bin',
  unless => 'grep -q /bin/bash /etc/shells',
}

In this case, unless provides a condition Puppet can use to determine whether a change
actually needs to take place.

Using conditions such as unless and onlyif properly will help produce safe and robust exec resources. We will
explore this in depth in Chapter 5.

A final surprising example is the notify resource, which is often used to produce debugging
information and log entries.
Example 1-16. The Notify resource type

notify { 'example':
  message => 'Danger, Will Robinson!',
}

The notify resource generates an alert every time it’s invoked, and will always report as a
change in system state.

Run level idempotence


Puppet is designed to be idempotent both at the resource level and at the run level. Much like
resource idempotence means that a resource applied twice produces the same result, run-level
idempotence means that invoking Puppet multiple times on a host should be safe, even in a
live production environment.

You don’t have to run Puppet in enforcing mode in production.

Run level idempotence is a place where Puppet’s model of change becomes just as important
as whether or not the resources are functionally idempotent. Remember that before
performing any configuration change, Puppet will first determine whether or not the resource
currently conforms to policy. Puppet will only make a change if resources are in an
inconsistent state. The practical implication is that if Puppet does not report having made any
changes, you can trust this is actually the case.

In practice, determining whether or not your Puppet runs are truly idempotent is fairly
simple: If Puppet reports no changes upon its second invocation on a fresh system, your
Puppet codebase is idempotent.
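The same "second run reports nothing" test can be sketched for any idempotent process. Here a toy apply function reports how many changes it made; the function name and the managed file are invented for illustration:

```shell
state="$(mktemp -d)"

# A toy configuration run: ensure a file exists, and report changes made.
apply() {
  changes=0
  if [ ! -f "$state/motd" ]; then
    echo 'welcome' > "$state/motd"
    changes=$((changes + 1))
  fi
  echo "$changes"
}

first="$(apply)"     # the file is created: one change
second="$(apply)"    # the system already conforms: zero changes
echo "first run: $first change(s); second run: $second change(s)"
```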

Because Puppet’s resources tend to have side effects, it’s entirely possible to break
Puppet’s idempotence model if we don’t carefully handle resource dependencies.
Example 1-17. Ordering is critical for run-level idempotence

package { 'httpd':
  ensure => 'installed',
}

file { '/etc/httpd/conf/httpd.conf':
  ensure  => 'file',
  content => template('apache/httpd.conf.erb'),
}

Package['httpd'] ->
File['/etc/httpd/conf/httpd.conf']

The file resource will not create paths recursively. In Example 1-17, the httpd package must
be installed before the httpd.conf file resource is enforced, because that resource depends on
the existence of the /etc/httpd/conf/ directory, which is only present after the httpd package
has been installed. If this dependency is not managed, the file resource becomes
non-idempotent; upon the first invocation of Puppet it may throw an error, and only enforce
the state of httpd.conf upon subsequent invocations of Puppet.
Such issues render Puppet eventually convergent rather than idempotent. Because Puppet
typically runs on a 30-minute interval, convergent infrastructures can take a very long time to
reach a converged state.

There are a few other issues that can render Puppet non-idempotent.

Non-deterministic code
As a general rule, the Puppet DSL is deterministic, meaning that a given set of inputs
(manifests, facts, exported resources, etc) will always produce the same output with no
variance.

For example, the language does not implement a random() function; instead,
a fqdn_rand() function is provided that returns random values based on a static seed (the
host’s fully qualified domain name). This function is by its very nature not cryptographically
secure, and not actually random at all. It is, however, useful in cases where true
randomness is not needed, such as distributing the start times of load-intensive tasks across
the infrastructure.
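The idea behind fqdn_rand() can be sketched in a few lines of shell: hash a stable per-host string and reduce it modulo the desired range. This is only an analogue of what the function does, not its actual implementation:

```shell
# Derive a stable "random" minute from a hostname, as fqdn_rand() might.
fqdn='web01.example.com'                    # hypothetical hostname
seed="$(printf '%s' "$fqdn" | cksum | cut -d' ' -f1)"
minute=$(( seed % 60 ))
# Recomputing from the same seed always yields the same value.
minute_again=$(( $(printf '%s' "$fqdn" | cksum | cut -d' ' -f1) % 60 ))
echo "start minute: $minute (recomputed: $minute_again)"
```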
Non-deterministic code can pop up in strange places with Puppet. A notorious example is
Ruby 1.8.7’s handling of hash iteration. The following code is non-deterministic with Ruby
1.8.7; the output will not preserve the original order and will change between runs:
Example 1-18. Non-deterministic hash ordering with Ruby 1.8.x

$example = {
  'a' => '1',
  'b' => '2',
  'c' => '3',
}

alert(inline_template("<%= @example.to_a.join %>\n"))

Another common cause of non-deterministic code pops up when our code depends on a
transient state.

file { '/tmp/example.txt':
  ensure  => 'file',
  content => "${::servername}\n",
}

This example will not be idempotent if you have a load-balanced cluster of Puppet masters.
The value of $::servername changes depending on which master compiles the catalog for a
particular run.
With non-deterministic code, Puppet loses run-level idempotence. For each invocation of
Puppet, some resources will change shape. Puppet will converge, but it will always report
your systems as having been brought into conformity with its policy, rather than as being
conformant. As a result, it’s virtually impossible to determine whether changes are
actually pending for a host. It’s also more difficult to track what changes were made to the
configuration, and when they were made.

Non-deterministic code also has the side effect that it can cause services to restart due to
Puppet’s notify behavior. This can cause unintended service disruption.

Stateless
Puppet’s client/server API is stateless, and with a few major (but optional) exceptions,
catalog compilation is a completely stateless process.

A stateless system is a system that does not preserve state between requests; each request is
completely independent from previous requests, and the compiler does not need to consult
data from previous requests in order to produce a new catalog for a node.

Puppet uses a RESTful API over HTTPS for client server communications.

With master/agent Puppet, the Puppet master needs only a copy of the facts supplied by
the agent in order to compile a catalog. Natively, Puppet doesn’t care whether this is
the first time it has generated a catalog for this particular node, whether the last run
was successful, or whether any change occurred on the client node during the last run. The
node’s catalog is compiled in its entirety every time the node issues a catalog request. The
responsibility for modeling the current state of the system then rests entirely on the client, as
implemented by the native resource providers.

If you don’t use a Puppet master, or have a small site with a single master, statelessness may
not be a huge benefit to you. For medium to large sites, however, keeping Puppet stateless is
tremendously useful. In a stateless system, all Puppet masters are equal. There is no need to
synchronize data or resolve conflicts between masters. There is no locking to worry about.
There is no need to design a partition-tolerant system in case you lose a datacenter or data
link, and no need to worry about clustering strategies. Load can easily be distributed across a
pool of masters using a load balancer or DNS SRV record, and fault tolerance is as simple as
ensuring nodes avoid failed masters.

It is entirely possible to submit state to the master using custom facts or other techniques. It’s
also entirely possible to compile a catalog conditionally based on that state. There are cases
where security requirements or particularly idiosyncratic software will necessitate such an
approach. Of course, this approach is most often used when attempting to write
non-declarative code in Puppet’s DSL. Fortunately, even in these situations, the server doesn’t
have to actually store the node’s state between runs; the client simply re-submits its state as
part of its catalog request.

If you keep your code declarative, it’s very easy to work with Puppet’s stateless client/server
configuration model. If a manifest declares that a resource such as a user should exist, the
compiler doesn’t have to be concerned with the current state of that resource when compiling
a catalog. The catalog simply has to declare a desired state, and the Puppet agent simply has
to enforce that state.

Puppet’s stateless model has several major advantages over a stateful model:
 Puppet scales horizontally

 Catalogs can be compared

 Catalogs can be cached locally to reduce server load

It is worth noting that there are a few stateful features of Puppet. It’s important to weigh the
value of these features against the cost of making your Puppet infrastructure stateful, and to
design your infrastructure to provide an acceptable level of availability and fault tolerance.
We will discuss how to approach each of these technologies in upcoming chapters, but a
quick overview is provided here.

Sources of state
In the beginning of this section, I mentioned that there are a few features and design patterns
that can impose state on Puppet catalog compilation. Let’s look at some of these features in a
bit more depth.
Filebucketing

Filebucketing is an interesting and perhaps underappreciated feature of the File resource type.
If a filebucket is configured, the file provider will create a backup copy of any file before
overwriting the original file on disk. The backup may be bucketed locally, or it can be
submitted to the Puppetmaster.
Bucketing your files is useful for keeping backups, auditing, reporting, and disaster recovery.
It’s immensely useful if you happen to blast away a configuration you needed to keep, or if
you discover a bug and would like to see how the file has changed. The Puppet Enterprise
console can use filebucketing to display the contents of managed files.

Filebuckets can also be used for content distribution, but using a filebucket this way
creates state. Files are only present in a bucket when placed there, either as a backup from a
previous run, or by the static_compiler terminus. Placing a file in the bucket only happens
during a Puppet run, and Puppet has no internal facility to synchronize buckets between
masters. Reliance upon filebuckets for content distribution can therefore create problems if
not applied cautiously: when migrating hosts between datacenters, when rebuilding masters,
and during local testing with puppet apply.
Exported resources

Exported resources provide a simple service discovery mechanism for Puppet. When a
puppetmaster or agent compiles a catalog, resources can be marked as exported by the
compiler. Once the resources are marked as exported, they are recorded in a SQL database.
Other nodes may then collect the exported resources, and apply those resources locally.
Exported resources persist until they are overwritten or purged.

As you might imagine, exported resources are, by definition, stateful, and will affect your
catalog if used.

We will take an in depth look at PuppetDB and exported resources in Chapter 2. For the time
being, just be aware that exported resources introduce a source of state into your
infrastructure.

In this example, a pool of webservers export their pool membership information to a haproxy
load balancer, using the puppetlabs/haproxy module and exported resources.
Example 1-19. Declaring state with an exported resource

include haproxy

haproxy::listen { 'web':
  ipaddress => $::ipaddress,
  ports     => '80',
}

Haproxy::Balancermember <<| listening_service == 'web' |>>

Example 1-20. Applying state with an exported resource


@@haproxy::balancermember { $::fqdn:
  listening_service => 'web',
  server_names      => $::hostname,
  ipaddresses       => $::ipaddress,
  ports             => '80',
  options           => 'check',
}

This particular example is a relatively safe use of exported resources; if PuppetDB for some reason became
unavailable, the pool would continue to work, and new nodes would simply not be added to the pool until
PuppetDB was restored.

Exported resources rely on PuppetDB, and are typically stored in a PostgreSQL database.
While the PuppetDB service is fault tolerant and can scale horizontally, PostgreSQL itself
scales vertically and introduces a potential single point of failure into the infrastructure.
Hiera

Hiera is by design a pluggable system. By default it provides JSON and YAML backends,
both of which are completely stateless. However, it is possible to attach Hiera to a database or
inventory service, including PuppetDB. This approach can introduce a source of
state into your Puppet infrastructure. We will explore Hiera in depth in Chapter 6.
Inventory and reporting

The Puppet infrastructure maintains a considerable amount of reporting information
pertaining to the state of each node. This information includes facts about each node, detailed
information about the catalogs sent to the node, and the reports produced at the end of each
Puppet run. While this information is stateful, it is not typically consumed
when compiling catalogs.

There are plugins to Puppet that allow inventory information to be used during catalog
compilation; however, these are not core to Puppet.
Custom facts

Facts themselves do not inherently add state to your Puppet manifests; however, they can be
used to communicate state to the Puppet master, which can then compile
conditional catalogs. Using facts in this way does not create the scaling and availability
problems inherent in server-side state, but it does create problems if you intend to use cached
catalogs, and it reduces the effectiveness of your reporting infrastructure.
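As a sketch of compiling a conditional catalog from a fact, consider the following manifest. The fact name `role` and the profile class names are hypothetical illustrations, not part of core Puppet:

```
# Conditional catalog compilation driven by a custom fact named 'role'.
# The fact and the profile classes are illustrative assumptions.
case $facts['role'] {
  'webserver': {
    include profile::nginx
  }
  'database': {
    include profile::postgresql
  }
  default: {
    include profile::base
  }
}
```

Because the catalog now depends on the value of the fact at compile time, a cached catalog for this node can silently go stale if the fact changes between runs, which is exactly the caching hazard described above.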

Summary
In this chapter, we reviewed the major design features of Puppet’s language, both in terms of
the benefits provided by Puppet’s language, and the restrictions its design places on us.
Future chapters will provide more concrete recommendations for the usage of Puppet’s
language, overall architecture of Puppet, and usage of Puppet’s native types and providers.
Building code that leverages Puppet’s design will be a major driving force behind many of the
considerations in future chapters.

Takeaways from this chapter:


 Puppet is declarative, idempotent, and stateless

 In some cases violation of these design ideals is unavoidable

 Write declarative, idempotent, and stateless code whenever possible

MODULE REVIEW

MODULE 1 WRAP UP AUTOMATING WITH CONFIGURATION MANAGEMENT

We started our journey talking 


about what would happen if you needed to 
upgrade a package in a fleet of 1,000 different servers. 
If you've never heard about 
configuration management before, 
an upgrade like that probably seemed like 
a super long and boring task, right? 
But now you know that there's a bunch 
of tools you can use to make 
your life much easier when 
making large-scale changes like that one. 
We've talked about the automation 
that's necessary for provisioning, 
managing and adapting a fleet 
of computers in a scalable way. 
We called out that an important concept in 
today's IT world is to treat our infrastructure as code. 
This lets us manage our fleet 
of computers in a consistent, 
versionable, reliable and repeatable way. 
To figure out how to get there, 
we've covered a lot of concepts 
related to configuration management, 
like how these tools use a domain 
specific language to help us 
clearly state what we want our system 
to look like after the tools have run. 
We've mentioned that the language 
is declarative because we 
declare our goals rather 
than detail the steps to get there, 
and most importantly the actions taken must be 
idempotent so that several runs of 
the same rules always lead to the same results. 
All along, we've been using Puppet as 
an example of how configuration management tools work. 
We looked into the Puppet DSL syntax and checked out 
the most common resources: packages, files and services. 
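As a quick refresher, a rule combining those three resource types might look like the sketch below. The ntp package, config file, and service are used purely as an illustrative example:

```
# A sketch of the package-file-service pattern: install a package,
# manage its configuration file, and keep its service running.
class ntp {
  package { 'ntp':
    ensure => installed,
  }
  file { '/etc/ntp.conf':
    source  => 'puppet:///modules/ntp/ntp.conf',
    replace => true,
    require => Package['ntp'],
  }
  service { 'ntp':
    ensure    => running,
    enable    => true,
    subscribe => File['/etc/ntp.conf'],
  }
}
```

Note how the rules are declarative and idempotent: running them again on a machine that's already in this state changes nothing.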
We'll learn about other resources and 
other advanced techniques in future videos. 
But by now you should have 
a pretty good idea of what Puppet rules look like and 
how you can put into action 
the configuration management concepts that we discussed. 
With the concepts we've covered, 
you're probably starting to see 
how keeping your fleet of machines, 
whether they're virtual or physical, 
off of a pedestal is good practice. 
If something breaks, goes 
down or catches fire literally or figuratively, 
you can easily spin up 
a replacement because you know exactly what it's 
supposed to look like from 
the configuration and you can 
deploy it easily using the automation. 
In the next module, we'll 
check out how you can deploy Puppet in 
your infrastructure and look into 
some more advanced configuration management 
and change management techniques. 
Before that, you'll have 
the opportunity to try out fixing 
a system where the configuration management 
isn't doing what it's supposed to. 
You'll see what running the Puppet agent looks like in 
practice and find out 
what's wrong with the deployed rules, 
and then get the automation to behave as expected. 
Cool, right? Let's go for it.

HOW TO LOG IN TO QWIKLABS

How to Log in to Qwiklabs


In the following assessments, you’ll be using Qwiklabs for hands-on learning. Qwiklabs
provisions resources backed by Google Cloud that will be used to perform the tasks in the
assessments. By using Qwiklabs, you won't have to purchase or install software yourself,
and you can use the Linux operating system as if it were installed on your local machine.

Important details:

 You will have 90 minutes to complete each lab.


 You may experience a delay as the labs load and as the Linux VM instances
start, so please wait a couple of minutes.
 Make sure to access labs directly through Coursera and not in the Qwiklabs
catalog. If you access labs through the Qwiklabs catalog, you will not receive a
grade. (As you know, a passing grade is required to complete the course.)
 You'll connect to a new VM for each lab with temporary credentials created for
you; these will last only for the duration of the lab.
 The grade is calculated when the lab is complete, so be sure to hit "End Lab"
when you're done. Note: after you end the lab, you won't be able to access
your previous work.
 To get familiar with entering labs, find the links below for the operating
system of the machine you are currently using for a visualization of the key
steps. Note that while video resources linked below do not have a voiceover or
any audio, all important details will still be housed in each lab’s set of
instructions on the Qwiklabs platform.
 If you receive the error "Sorry, your quota has been exceeded for the lab",
please submit a request or reach out to the Qwiklabs support team directly via
chat support on qwiklabs.com.
Demo videos for accessing labs:

 For Windows users


 For Mac users
 For Linux users
 For Chrome OS users
