Contents
Active Directory backup and recovery with Veeam
Backup considerations
Sequential backup of AD
SureBackup
Recovery of a full VM
Restoring an object in the tombstone state using Veeam Explorer for Active Directory
Other restoration possibilities with Veeam Explorer for Microsoft Active Directory
One of the key challenges with Active Directory (AD) is the backup and recovery process. This white
paper discusses the main aspects of a successful recovery. An important takeaway, as with every recovery
plan, is that you should test all scenarios for your specific case.
The paper first covers what you should consider when backing up AD. It then discusses SureBackup, which
lets you verify your backups automatically and, for AD, does some useful work under the hood.
Finally, it explores item-level recovery with Veeam Explorer for AD.
Backup considerations
Windows 2008 R2 and vmxnet3
If you are installing on Windows 2008 R2 and using vmxnet3, be sure to install the Windows patch
mentioned in the KB article below, so that SureBackup and recovery work correctly. Otherwise, you
might end up with a "local adapter #2", and IP address settings might not be retained when you run
SureBackup or Instant Recovery. Learn more here: http://kb.vmware.com/kb/1020078.
Recovery with FRS might be possible but might need additional manual steps. It is highly
recommended that part of your migration is to a modern server OS (operating system). This white
paper will assume you have done this migration and will not cover the FRS part, as it is considered to
be old technology.
If you migrated to 2008 R2 or later and you are not sure if you are running on DFS-R, you can check the
registry with the following PowerShell command:
If the key value is 3 (ELIMINATED), DFS-R is being used.
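The registry check described above can be sketched as follows. This is a minimal sketch, assuming the SYSVOL migration state is stored in the "Local State" value under the DFSR "Migrating SysVols" key, which is where the DFSR service records it on a domain controller. If the key is absent, the check is inconclusive rather than proof that FRS is in use.

```powershell
# Read the SYSVOL migration state recorded by the DFSR service on this domain controller.
$key = 'HKLM:\SYSTEM\CurrentControlSet\Services\DFSR\Parameters\SysVols\Migrating SysVols'

# Meaning of the migration state values.
$stateNames = @{ 0 = 'START'; 1 = 'PREPARED'; 2 = 'REDIRECTED'; 3 = 'ELIMINATED' }

if (Test-Path -Path $key) {
    $state = (Get-ItemProperty -Path $key -Name 'Local State').'Local State'
    "Local State = $state ($($stateNames[[int]$state]))"
} else {
    "Key not found: the migration state could not be determined from the registry."
}
```

Run it in an elevated PowerShell session on the domain controller itself.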
If this is not the case, you can refer to the following documentation:
http://technet.microsoft.com/en-us/library/dd640019%28v=ws.10%29.aspx
Although AD is built to be a multi-master replication application, some roles, which are called the
FSMO roles, are held by only one server in the forest or in the domain. Although discussing the
functions of these roles is out of scope, it is still important for full VM recovery that you understand
which servers are hosting these roles.
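You can check which servers are running the FSMO roles by executing:
Import-Module ActiveDirectory
$domain = "lab.local"
Get-ADDomain -Identity $domain | Format-List PDCEmulator, RIDMaster, InfrastructureMaster   # domain-wide roles
Get-ADForest -Identity (Get-ADDomain -Identity $domain).Forest | Format-List SchemaMaster, DomainNamingMaster   # forest-wide roles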
By default, all roles are held by the first domain controller. To transfer a role, you can use the
Move-ADDirectoryServerOperationMasterRole cmdlet. For example, to move the PDC Emulator role to the
second server, you can execute:
Move-ADDirectoryServerOperationMasterRole -Identity <new server> -OperationMasterRole PDCEmulator
Source: http://blogs.technet.com/b/heyscriptingguy/archive/2012/03/10/easily-use-powershell-to-discover-the-holders-of-active-directory-fsmo-roles.aspx
You can also do this with the GUI. For the RID, PDC and infrastructure roles, you can use the Active Directory
Users and Computers MMC. For the domain naming role, you can use the AD Domains and Trusts MMC. In this
case, however, you need to right-click the root node instead of the domain. For the schema role, you first
need to register the corresponding DLL (regsvr32 schmmgmt.dll), and then you can launch the schema MMC.
When you do a full domain recovery, it is recommended that you recover the AD server with the most
FSMO roles, preferably the one hosting the PDC emulator role.
If a domain controller fails and it is not likely to be coming back, seize its roles and perform metadata cleanup.
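Seizing roles can be done with the same cmdlet that transfers them. The sketch below is illustrative only: 'dc02' is a hypothetical name for the surviving domain controller, and the role list should match what the failed server actually held. The -Force switch makes the cmdlet seize the roles instead of transferring them, which is what you need when the old holder cannot be contacted.

```powershell
Import-Module ActiveDirectory

# 'dc02' is a hypothetical name for the surviving domain controller.
# -Force seizes the roles, because the failed holder cannot take part in a normal transfer.
Move-ADDirectoryServerOperationMasterRole -Identity 'dc02' `
    -OperationMasterRole PDCEmulator, RIDMaster, InfrastructureMaster `
    -Force
```

Only seize roles when the original holder will never return to the network, and follow up with the metadata cleanup mentioned above.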
You can view and transfer the operations master role in the GUI, as shown here.
The Global Catalog is a subset of the information stored at the domain level, which is replicated across the
whole forest. Again, it is important that a global catalog server is running at the time of recovery, or clients
might fail to authenticate against the network. It is also recommended that you have at least one global catalog
server per site. Starting in 2008, all AD controllers are global catalog servers by default.
To find out which servers act as a global catalog, you can run the following script:
Import-Module ActiveDirectory
Get-ADDomainController -Filter * | Select-Object Name, Site, IsGlobalCatalog
You can also check the Sites and Services MMC. Select your domain controller, go to NTDS
Settings and open the properties.
Sequential backup of AD
One of the major issues that might occur when backing up clusters is that they go offline
or break during the backup because VSS is triggered on several nodes at the same time. AD is a type of
cluster, so it is not recommended to back up the domain controllers at the same time. For example, in the
past, users have reported broken SysVol DFS-R replication because domain controllers were backed up
simultaneously. This issue is addressed in Veeam's v7 Patch 3. Be sure you're running the latest version of
Veeam Backup & Replication.
If you want to avoid backing up the domain controllers at the same time, there are
two ways to achieve this goal. You can disable parallel processing and put the VMs in the same job.
(But remember, parallel processing is a global setting.) Alternatively, you can put the domain controllers
in different jobs and schedule them at different times. For example, you could make a small job that
includes one domain controller and runs 10 minutes before all other jobs. This would ensure that you
have the resources required to start the backup. In v8, the scheduler has been enhanced to give higher
priority to jobs that started earlier, so that these jobs are finished earlier. You can also chain one of the
jobs to the first job, although chaining is, in general, not recommended for various reasons.
Finally, for item-level recovery, you only need one copy of the NTDS.dit file. This means there is no real
need to back up all AD controllers every hour if you only want to use the backups for item-level recovery.
One of the most essential parts of backing up Active Directory is using Application-Aware Image
Processing. Not only does this trigger the VSS framework to make everything consistent, but it also
puts everything into place for a successful full VM recovery.
Veeam has two ways of contacting the guest. The first is via direct RPC calls to the machine. In this case,
Veeam will try to contact the machine over the network.
If you have a VMware virtual environment and it is not possible to connect over the network, Veeam will fall
back to the VIX or VMware Tools API. This can be useful if you are backing up a machine in a DMZ. For Hyper-V,
such scenarios are not possible because the Integration Services do not offer such an API.
You can test if your credentials are working and if the required network ports are open by using
the Test Now functionality:
In both cases, Veeam needs administrative credentials. However, if you are using the VIX API and UAC is
enabled on the target machine, you need to specify an administrator with a well-known SID ending (*).
This could be <DOMAIN>\administrator or <MACHINENAME>\administrator.
During backup, Veeam adds runtime components, which trigger the VSS framework. Afterwards,
these components are removed so the machine is kept in a clean state. You can see this action in the
event logs. For example:
Backup has talked successfully with the NTDS writer and made a successful backup of AD.
SureBackup
SureBackup is the framework that allows you to test whether your backups have been made successfully.
Discussing the setup of the SureBackup virtual lab is out of scope for this document. If you need help
setting up this part, please consult the manual at http://helpcenter.veeam.com/backup/
To set up SureBackup, you will likely create an Application Group containing your domain controller.
During the automated backup test, you will probably want AD to be the first server to be booted and
tested. You will also want the domain controller to remain running while other tests are running.
When you test a domain controller, Veeam will automatically configure the virtual machine to start in
authoritative restore mode. This forces the domain controller not to wait for other domain controllers when
it boots. In this test, we create a simple SureBackup job in which we boot only one controller.
To do this, go to Backup Infrastructure > SureBackup > Application Groups. On this node, add an application
group. Add your virtual machine and configure the roles by editing the virtual machine. Select the roles you
want to test. You will likely want to test the DNS Server, Domain Controller and Global Catalog roles.
In the Backup & Replication section, you'll be able to create a new SureBackup job by selecting the
option from the ribbon. If you don't see the option, select the Jobs node in the navigation pane.
When you create a new SureBackup job, you can select the Application Group. Also, notice that there
is a handy option called Keep the application group running once the job completes. When it is enabled,
SureBackup does not power off the application group after the tests are done, so you can execute
additional manual tests. If you want SureBackup to run fully automatically, do not enable it.
Once you've completed this process, you can leave all the settings at their defaults and then start your
SureBackup job.
When you start a SureBackup job, a couple of interesting things happen. First of all, the routing engine
starts. Basically, this starts a small appliance that separates the production network and the isolated
environment. This process also protects the production network from changes.
Next, the VMs (virtual machines) will be published to the vSphere environment you've configured. To
eliminate confusion, the VMs are renamed by adding a random GUID as a suffix.
In this simple one-host setup, you can see the production VM (active01) and the test VM (active01_<randomGUID>).
The Configuring DC step is where Veeam changes the registry keys to force the VM into an
authoritative mode. You can see these steps in the logs. The default log folder is
%programdata%\Veeam\Backup. In this folder, you will find a subfolder named after your SureBackup job,
which contains the job logs with the steps Veeam has executed.
You can open these logs with Notepad. If you do, scroll all the way down. From the edit menu, use the Find
option and search for [PrepareDC] Windows Registry changes. Change the direction to search upward.
This will reveal four big changes in the registry in both control sets:
This effectively forces the Domain Controllers still using the old FRS technology to start the
replication in an authoritative mode.
http://support.microsoft.com/kb/290762
In Services\NTDS\Parameters, add the dword "Repl Perform Initial Synchronizations" with the value
00000000.
This forces the domain controller not to wait for another partner to replicate the directory partitions.
http://technet.microsoft.com/en-us/library/cc757662%28v=ws.10%29.aspx
http://msdn.microsoft.com/en-us/library/bb891959%28VS.85%29.aspx#sysvol
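To see the NTDS change for yourself, you could read the value back from inside the isolated lab VM. The snippet below is a read-only sketch: it only inspects the current control set and assumes nothing about how Veeam writes the value.

```powershell
# Read-only check of the NTDS setting described above; run inside the isolated lab VM.
$path = 'HKLM:\SYSTEM\CurrentControlSet\Services\NTDS\Parameters'
$name = 'Repl Perform Initial Synchronizations'

$props = Get-ItemProperty -Path $path -ErrorAction SilentlyContinue
if ($null -ne $props -and $props.PSObject.Properties.Name -contains $name) {
    # A value of 0 means the domain controller does not wait for a replication partner.
    '{0} = {1}' -f $name, $props.$name
} else {
    "$name is not set on this machine."
}
```

On a production domain controller that has not been through SureBackup, you would normally expect the value to be absent.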
After these changes are applied, the server is powered on. If you monitor the VMware console
of the isolated VM, you will notice that the domain controller starts and then reboots after it has reached
the logon screen. This is because the server first boots in Directory Services Restore Mode. Basically,
during backup Veeam has put everything in place to achieve a successful restore. For this reason, it is
important to enable Application-Aware Image Processing. Once the server has booted in restore mode,
Veeam will restore it and reboot it in normal mode.
Because Veeam is aware that you are restoring a domain controller, it will increase the boot timers
in the Application Group settings. You can check this in the Startup Options. By default, the boot time
is set to 600 seconds, but when you select the AD roles, the boot time is increased to 2,100
seconds. If this is not enough time, you can manually override these settings.
Veeam will not wait the entire 35 minutes before testing AD if it is not necessary. If it can see that the
VMware tools are running and it can ping the virtual machine, it will consider the machine booted and
won't wait for the full 2,100 seconds to pass.
In order to be able to perform tests on a restored DC, Veeam Backup & Replication needs to make
sure that the DC is ready for testing and has reached a "stabilization point". Veeam establishes the
stabilization point via the vSphere APIs or the Hyper-V integration services.
It checks whether these tools report an IP address, and only after it has seen the same address over
multiple intervals does it go on to check whether the VM can be reached over the network. If the VM reboots
somewhere in the middle of the process, the AD controller won't reach this stable state, so
SureBackup waits until it stabilizes or until the 2,100-second timeout period has expired.
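The idea of waiting for the same address over several intervals can be illustrated with a small function. This is a hypothetical sketch, not Veeam's StableIP implementation: the function name, the input format (one reported address per polling interval, with an empty string when the guest tools report nothing) and the threshold of three repeats are all assumptions made for illustration.

```powershell
# Hypothetical illustration of a "stable IP" check; not Veeam's actual algorithm.
function Test-StableAddress {
    param(
        [string[]]$Samples,          # address reported at each polling interval ('' = none reported)
        [int]$RequiredRepeats = 3    # assumed number of identical consecutive readings
    )
    $streak = 0
    $last = $null
    foreach ($ip in $Samples) {
        if ([string]::IsNullOrEmpty($ip)) {
            # Guest tools report nothing (for example during a reboot): start over.
            $streak = 0
            $last = $null
            continue
        }
        if ($ip -eq $last) { $streak++ } else { $streak = 1; $last = $ip }
        if ($streak -ge $RequiredRepeats) { return $true }
    }
    return $false
}

# A reboot in the middle of the sequence resets the count, so this VM is not yet stable.
Test-StableAddress -Samples '10.0.0.5', '', '10.0.0.5', '10.0.0.5'
```

The sketch also shows why a reboot delays the tests: every gap in the reported address forces the count to start again.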
In some rare cases, the machine will boot too slowly and will not reboot fast enough. This means that
the StableIP algorithm will wrongly assume that the VM is stable. If you run into problems, simply
contact support so they can fine-tune the algorithm.
Selecting the AD role also influences the StableIP algorithm. The dynamic waiting interval will be longer
because the engine expects the reboot in the first place. This will help you avoid more false positives
when the VM is booting too slowly.
Once the VM is booted, it will wait an additional 120 seconds for the application to initialize. Again, this
timeout is configurable.
Finally, it will execute the test to determine if the application is running. Veeam's built-in tests are
rather basic, but quite effective. For every application, Veeam knows which port the application should
respond to. If you use the default roles, Veeam will just execute a port scan via its connection tester
utility, which takes the arguments <ip> <port>.
You can see the configuration of the connection tester in the Test Scripts section. The following ports
are tested for the corresponding services:
Service             Port
DNS Server          53
Domain Controller   389
Global Catalog      3268
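The port checks listed above can be approximated by hand from a machine that can reach the lab VM. The following is a minimal sketch using the .NET TcpClient class. It is not the Veeam connection tester; the function name, the 3-second timeout and the example address are assumptions.

```powershell
# Minimal TCP port check; an approximation of a port scan, not the Veeam utility.
function Test-TcpPort {
    param([string]$Address, [int]$Port, [int]$TimeoutMs = 3000)
    $client = New-Object System.Net.Sockets.TcpClient
    try {
        $async = $client.BeginConnect($Address, $Port, $null, $null)
        if (-not $async.AsyncWaitHandle.WaitOne($TimeoutMs)) { return $false }
        $client.EndConnect($async)   # throws if the connection was refused
        return $client.Connected
    } catch {
        return $false
    } finally {
        $client.Close()
    }
}

$ip = '10.168.93.11'   # example address; replace with the address of the VM under test
$checks = @(
    @{ Role = 'DNS Server';        Port = 53 },
    @{ Role = 'Domain Controller'; Port = 389 },
    @{ Role = 'Global Catalog';    Port = 3268 }
)
foreach ($check in $checks) {
    $result = if (Test-TcpPort -Address $ip -Port $check.Port) { 'open' } else { 'no answer' }
    '{0,-18} {1,5}  {2}' -f $check.Role, $check.Port, $result
}
```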
While port scans can tell you if a service is running, this information is limited. You can create more
extensive scripts yourself and add them to the Test Scripts.
For example, the SureBackup framework supports PowerShell (.ps1) scripts directly. In Windows Server 2012
and 2012 R2, PowerShell functionality is greatly enhanced. A good example is the Resolve-DnsName cmdlet,
which has been added. It allows you to send a DNS query to a specific server and test whether that
server responds to the query.
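[CmdletBinding()]
param(
    [string]$dnsserver = "10.168.93.11",
    [string]$testdns = "lab.local"
)
# Assume failure until the query returns at least one record.
$errorcode = 1
try {
    # Query the given DNS server directly; -ErrorAction Stop turns lookup failures into exceptions.
    $test = Resolve-DnsName -Name $testdns -Server $dnsserver -DnsOnly -ErrorAction Stop
    if ($test.Count -gt 0) {
        $errorcode = 0
    }
}
catch [System.ComponentModel.Win32Exception] {
    $errorcode = 1
}
# SureBackup treats a non-zero exit code as a failed test.
exit $errorcode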
This very simple PowerShell script goes beyond a simple port scan. It could be called with the
following arguments: -dnsserver %vm_ip% -testdns %vm_fqdn%. With these arguments, the server
basically attempts to resolve its own hostname.
Recovery of a full VM
SureBackup already reveals a great deal about recovery. For example, SureBackup forces the domain
controller not to do an initial replication and to start DFS-R and NTFRS in an authoritative mode.
However, if you are running a full VM recovery, these changes are not applied.
Some users may ask why Veeam uses this mode in SureBackup, but not when restoring to production.
The answer is in AD's architecture, which is focused by default on high availability. Therefore, in most
scenarios, there will still be an active domain controller left. This could be a local or a remote controller.
For disaster recovery, it is highly recommended to have at least one AD controller per site.
One of the great things about Veeam is that it executes application-aware image processing. With a
traditional image-based backup (or snapshot), you might run into issues such as lingering objects. This is
because in AD, changes are not tracked by a real clock but by a number that is incremented every
time a change occurs. This update sequence number (USN) is only defined locally. That means that USN 8111
on active01 does not, by definition, correspond to the same point in time as USN 8111 on active02. AD has
some pretty good technology in place to do replication based on these local USNs.
NOTE: This USN number should never decrease. In a physical environment, that is not so hard to achieve.
However, in the world of VMs with snapshot technology in place or image-based restores, this could actually
happen. The problem then becomes that all of the replication technology is based on this number. So, these
are a couple of possible outcomes when you decrease the number:
- If VM active01 made some changes after the backup, the other nodes think that active01 already
  has these changes when you do a restore, simply because active01 was the source of the changes.
- Changes that occurred on other nodes are then not synced back because the surviving nodes think
  the changes were already replicated to active01.
- If active01 re-uses numbers after the restore, the other domain controllers will assume they
  already have the changes in place because they already saw changes with this USN number.
- Items that were deleted on the surviving nodes after backup time are revived.
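The third outcome can be made concrete with a few lines of arithmetic. The sketch below is a deliberately simplified model, not real AD replication: it assumes that active02 only remembers the highest USN it has already pulled from active01 and asks for anything above that number. The USN values are made up for illustration.

```powershell
# Simplified model of a USN rollback; not real AD replication logic.
$active01Usn       = 8111    # active01's USN at backup time
$active02Watermark = 8111    # highest active01 USN that active02 has replicated

# After the backup, active01 makes 9 more changes and active02 replicates them.
$active01Usn      += 9
$active02Watermark = $active01Usn    # 8120

# A plain image restore rewinds active01 to its state at backup time.
$active01Usn = 8111

# active01 now makes 5 new changes and re-uses USNs 8112 to 8116.
$active01Usn += 5
$newChanges = 8112..$active01Usn

# active02 only asks for changes above its watermark, so it skips all of them.
$replicated = @($newChanges | Where-Object { $_ -gt $active02Watermark })
"active02 pulls $($replicated.Count) of $($newChanges.Count) new changes"
# prints "active02 pulls 0 of 5 new changes"
```

This is why a restored domain controller must not simply continue from its old USN.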
So when you are restoring a full VM, in most cases it is not a good idea to restore it in authoritative
mode, for either the database or DFS-R. If there is at least one surviving node, the default way that
Veeam recovers an AD controller is correct for the majority of cases.
Whenever you execute an image-based recovery, you will see that the VM reboots twice. First, it boots
into restore mode. Then, the Veeam service reboots the VM so that it comes up non-authoritative. The domain
controller itself also understands that it has been recovered from backup and takes this into
account: it invalidates its database and requests an update for
everything that has changed since the backup.
Starting with Windows 2012, Microsoft has built in a failsafe mechanism called the VM generation
ID. This device generates a new ID every time a change to a virtual machine occurs, such as cloning
or reverting to a snapshot. Every time the AD services start, they read this generation ID and
compare it to the value they have cached. If the value is different, AD concludes that something
has happened to the VM and invalidates its current database. NOTE: The VM generation ID is supported by
modern hypervisors such as Hyper-V 3.0 and vSphere 5.0 U2 (and up).
If you still have active nodes, one of the other options would be to completely remove, or demote, the
faulty AD controller from the domain and create, or promote, a new domain controller to replace it.
Demoting can only be done, however, when the AD server is still working. If it is
not working, you could go through a very tedious process, which may include:
- If you need to seize the PDC role, you will probably have to configure an external time
  server or use the hardware clock. This is because domain controllers that do not hold the PDC
  role sync their time with the domain controller that holds the PDC role.
As discussed before, AD is a multi-master database. That means nodes can read, write and update
at the same time. If an attribute is changed on two domain controllers before replication, a stamp
consisting of the version, a real date and a domain controller is used to resolve the conflict. If one
domain controller claims to have a higher version, it will win.
But what about when you delete an object? How does replication work? If the object is gone, how will
you describe what has been removed? You could try telling your peers immediately, but what if that
fails? Some domain controllers will still have the object, but others will not.
For this issue, AD uses tombstones. When you delete an object, it is not really gone. Here are some
key points about what actually happens:
Thus, the object is not really gone; it is still buried somewhere in AD. The question then becomes,
when will it really be deleted? A process called garbage collection runs every 12 hours on every domain
controller. When an object has been marked as deleted for over 180 days, it will be removed.
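The retention period is stored in the directory itself, so you can check what your forest actually uses. The following sketch reads the tombstoneLifetime attribute from the Directory Service object in the configuration partition. If the attribute is not set, the built-in default of the forest applies.

```powershell
Import-Module ActiveDirectory

# The tombstone lifetime is a forest-wide setting stored in the configuration partition.
$configNC = (Get-ADRootDSE).configurationNamingContext
$dsObject = Get-ADObject -Identity "CN=Directory Service,CN=Windows NT,CN=Services,$configNC" `
    -Properties tombstoneLifetime

if ($null -eq $dsObject.tombstoneLifetime) {
    'tombstoneLifetime is not set; the built-in default applies.'
} else {
    "tombstoneLifetime = $($dsObject.tombstoneLifetime) days"
}
```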
Until then, you can still bring the object back through a process called tombstone reanimation.
However, only a subset of its attribute values will be restored. Starting with Windows 2008 R2, AD
introduced the recycle bin. In this case, deleting an object takes two steps. First, it is marked as deleted,
but not stripped of its attributes. Second, when its deleted object lifetime expires, it goes into a recycled
state similar to the tombstone state, ready to be picked up by the garbage collection.
Enabling this feature requires that all the domain controllers are running 2008 R2 and up.
Similarly, your functional level needs to be updated as well. Finally, enabling the feature will alter
the schema so it can never be disabled.
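Checking and enabling the recycle bin can be done with the ActiveDirectory module. The sketch below first shows whether the feature is already enabled and then previews the enable operation. The -WhatIf switch is added here on purpose so that nothing is changed; remove it only once you are sure, because the step cannot be undone.

```powershell
Import-Module ActiveDirectory

# An empty EnabledScopes list means the recycle bin has not been enabled yet.
Get-ADOptionalFeature -Filter 'Name -eq "Recycle Bin Feature"' |
    Select-Object Name, EnabledScopes

# Preview only: -WhatIf shows what would happen without enabling the feature.
$forest = Get-ADForest
Enable-ADOptionalFeature -Identity 'Recycle Bin Feature' `
    -Scope ForestOrConfigurationSet `
    -Target $forest.Name `
    -WhatIf
```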
The effect on the database is that it will be bigger because deleted objects keep their current
attributes, but this might be a small price to pay. One limitation is that the recycle bin does not keep
different versions. Only the latest version is kept. For older versions, you should fall back to your
backup product.
So let's create and delete an object called An Doe and see what happens when it is deleted. To
see deleted objects, you can use the ldp.exe utility. You will need to load the Return Deleted
Objects control, as this container is hidden by default.
If you analyze the object, you will see that the distinguishedName has changed. More importantly, take
a look at the following three attributes: GUID, uSNChanged and uSNCreated. After the restore, we will
compare these attributes with the original ones.
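If you prefer PowerShell over ldp.exe, the same attributes can be read with Get-ADObject. This is a sketch under the assumption that the deleted user's name still starts with "An Doe"; the -IncludeDeletedObjects switch plays the role of the Return Deleted Objects control.

```powershell
Import-Module ActiveDirectory

# List deleted objects whose name starts with "An Doe", with the attributes to compare later.
Get-ADObject -Filter 'isDeleted -eq $true -and Name -like "An Doe*"' `
    -IncludeDeletedObjects `
    -Properties objectGUID, uSNChanged, uSNCreated, lastKnownParent |
    Format-List DistinguishedName, objectGUID, uSNChanged, uSNCreated, lastKnownParent
```

Note the values down before the restore so that you can compare them afterwards.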
Restoring an object in the tombstone state using Veeam Explorer for Active Directory
Veeam Explorer for AD was introduced with Veeam Backup & Replication version 8. This tool makes the
recovery of a single user (or a whole OU) easy and fast. Instead of restoring the VM in a virtual lab, this
wizard starts a file-level recovery and automatically mounts the NTDS.dit file. This file, together with
the log files, forms the AD JET database.
You can start the wizard from the restore menu by selecting Application Items and then
choosing the option for Microsoft Active Directory restore. This filters the list down to the VMs that are
AD controllers, which can be useful in larger environments.
If the backup process did not correctly identify your AD controller (for example, if AAIP was not enabled), you
can manually start a Windows File-Level Recovery on the AD controller. Once the browser window is open,
you can open Veeam Explorer for Active Directory from the Start menu.
When you select Add Database, you should point to the NTDS.dit file that is located in the backup of
your AD controller. Typically, the file path of this database will be under the VeeamFLR mount point
(C:\VeeamFLR) on the second volume in the ntds folder, which resides by default in the Windows folder.
After you have mounted the database file, you can restore items. However, if this is not necessary, it is better to
use the default method of selecting an application item restore in the Veeam Backup & Replication console.
Once this wizard is open, you will be able to search for the user you want to restore either by browsing
the directory structure or by searching the whole lab. If you are unsure about certain objects, you can
push the Compare with Production button. This will compare this version with the current AD State.
We can now restore An Doe back to production by selecting the object and clicking Restore Objects.
There are two options: 1. Restore to <domain> or 2. Restore to. The first option, Restore to <domain>
allows for very fast recovery. In this case, all default Veeam settings are used and the user will be
recovered to production. If you want to have more options (such as to not restore the password), you
should use the second Restore to option.
After a successful restore operation, the object should no longer be in the Deleted Objects container. Again,
you can verify this with ldp.exe. In the original location, An Doe will be brought back from the land of the
dead. If you now compare the attributes we wrote down earlier, you will see some interesting results.
If you check the result, you can see that the uSNCreated is the same in both instances. This means that
the Explorer has literally revived the original object and has restored all of its attributes.
Other restoration possibilities with Veeam Explorer for Microsoft Active Directory
Veeam Explorer for Microsoft Active Directory allows you to restore an individual user. However,
there are some other possibilities that might not always be so apparent. For example, you can
also restore computer objects or groups.
In addition, containers can also be restored. Imagine a scenario where someone has deleted a whole
OU, including all users. Restoring all these users individually might take a lot of time. Because Veeam
Explorer for Active Directory allows you to restore passwords, you can just choose to restore the whole
container including all the objects. Just select the container in the tree view and click restore.
But what if a user still exists, but one of its attributes has been changed? Recovery is still possible: just use
the compare option directly on the user. This allows you to select the individual attributes that have changed.
If you need to have complete control over the restore, select the Restore to option. This will give
you an option to edit the default settings. You might need this level of control because you want
to restore a user to a different OU or want to reset the password, among other scenarios.
Finally, you can also decide to export a user or container. This will create an ldf file, which should not
be confused with SQL log files. These files are meant to be used with ldifde. You can open them with
Notepad to see their content.
Content of an ldf file. In this case, account0000 has been deleted.
In this example, you will try to restore a user. You can do a restore by using the following minimum
command line:
ldifde -i -f <file.ldf>
Some problems might occur here. Even if you specify the -x parameter, the tombstone
object will not be reused. Essentially, you are creating a new object that looks the same. If you are restoring a
user, you might find that the export contains no password, and creating a user without one is not allowed by
default. You can disable the password policy in the group policy of your domain. However, this might pose a
real security threat. It is highly recommended that you use Veeam Explorer to execute restores.
Founded in 2006, Veeam currently has 39,000 ProPartners and more than 193,000 customers
worldwide. Veeam's global headquarters are located in Baar, Switzerland, and the company has
offices throughout the world. To learn more, visit http://www.veeam.com.