
Performing a Failover and Failback using CloudEndure Disaster Recovery

List of Contents

Preparing CloudEndure DR Project

1. Generating and Using Your Credentials (IAM)
2. Working with Project
3. Defining Your Replication Settings
4. Adding Machines (Agent Installation)
5. Configure Blueprint

Testing and Running Disaster Recovery

6. Testing Mode
7. Performing a Failover
8. Performing a Failback
9. Return to Normal Operation

1. Generating and Using Your Credentials

a. Create IAM User for CloudEndure Credentials

a. Sign in to the AWS Management Console and open the IAM console
at https://console.aws.amazon.com/iam/
b. In the navigation pane, choose Users and then choose Add users.
c. Type the user name for the new user.
d. Select the type of access this user will have. Select Programmatic access.
e. If you also enabled console access, for Console password choose Autogenerated password.
f. Choose Next: Permissions.
g. On the Set permissions page, choose Attach existing policies directly.
h. Choose Create policy to open a new browser tab and create a new policy from
scratch. In the new tab, choose the JSON tab and paste the following policy:

{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "VisualEditor0",
"Effect": "Allow",
"Action": "ec2:CreateTags",
"Resource": "arn:aws:ec2:*:*:*/*",
"Condition": {
"StringEquals": {
"ec2:CreateAction": "RunInstances"
}
}
},
{
"Sid": "VisualEditor1",
"Effect": "Allow",
"Action": "ec2:CreateTags",
"Resource": "arn:aws:ec2:*:*:*/*",
"Condition": {
"StringEquals": {
"ec2:CreateAction": "CreateVolume"
}
}
},
{
"Sid": "VisualEditor2",
"Effect": "Allow",
"Action": [
"ec2:RevokeSecurityGroupIngress",
"ec2:DetachVolume",
"ec2:AttachVolume",
"ec2:DeleteVolume",
"ec2:TerminateInstances",
"ec2:StartInstances",
"ec2:RevokeSecurityGroupEgress",
"ec2:StopInstances"
],
"Resource": [
"arn:aws:ec2:*:*:dhcp-options/*",
"arn:aws:ec2:*:*:instance/*",
"arn:aws:ec2:*:*:volume/*",
"arn:aws:ec2:*:*:security-group/*"
],
"Condition": {
"StringLike": {
"ec2:ResourceTag/Name": "CloudEndure*"
}
}
},
{
"Sid": "VisualEditor3",
"Effect": "Allow",
"Action": [
"ec2:RevokeSecurityGroupIngress",
"ec2:DetachVolume",
"ec2:AttachVolume",
"ec2:DeleteVolume",
"ec2:TerminateInstances",
"ec2:StartInstances",
"ec2:RevokeSecurityGroupEgress",
"ec2:StopInstances"
],
"Resource": [
"arn:aws:ec2:*:*:dhcp-options/*",
"arn:aws:ec2:*:*:instance/*",
"arn:aws:ec2:*:*:volume/*",
"arn:aws:ec2:*:*:security-group/*"
],
"Condition": {
"StringLike": {
"ec2:ResourceTag/CloudEndure creation time": "*"
}
}
},
{
"Sid": "VisualEditor4",
"Effect": "Allow",
"Action": [
"ec2:DisassociateAddress",
"ec2:CreateDhcpOptions",
"ec2:AuthorizeSecurityGroupIngress",
"ec2:DeregisterImage",
"ec2:DeleteSubnet",
"ec2:DeleteSnapshot",
"ec2:ModifySnapshotAttribute",
"ec2:ModifyVolumeAttribute",
"ec2:CreateVpc",
"ec2:AttachInternetGateway",
"ec2:GetConsoleScreenshot",
"ec2:GetConsoleOutput",
"elasticloadbalancing:DescribeLoadBalancer*",
"ec2:CreateRoute",
"ec2:CreateInternetGateway",
"ec2:CreateSecurityGroup",
"ec2:CreateSnapshot",
"ec2:ModifyVpcAttribute",
"ec2:ModifyInstanceAttribute",
"ec2:ReleaseAddress",
"ec2:AuthorizeSecurityGroupEgress",
"ec2:AssociateDhcpOptions",
"ec2:ImportKeyPair",
"ec2:CreateTags",
"ec2:RegisterImage",
"ec2:ModifyNetworkInterfaceAttribute",
"ec2:AssociateRouteTable",
"ec2:CreateRouteTable",
"ec2:DetachInternetGateway",
"iam:ListInstanceProfiles",
"ec2:AllocateAddress",
"ec2:ReplaceNetworkAclAssociation",
"ec2:CreateVolume",
"kms:ListKeys",
"ec2:Describe*",
"ec2:DeleteVpc",
"iam:GetUser",
"ec2:CreateSubnet",
"ec2:AssociateAddress",
"ec2:DeleteKeyPair",
"ec2:CreateNetworkAclEntry",
"outposts:GetOutpostInstanceTypes"
],
"Resource": "*"
},
{
"Sid": "MigrationHubConfig",
"Effect": "Allow",
"Action": [
"mgh:GetHomeRegion"
],
"Resource": "*"
},
{
"Sid": "VisualEditor5",
"Effect": "Allow",
"Action": [
"ec2:RevokeSecurityGroupIngress",
"mgh:CreateProgressUpdateStream",
"kms:Decrypt",
"kms:Encrypt",
"ec2:RevokeSecurityGroupEgress",
"ec2:DeleteDhcpOptions",
"ec2:RunInstances",
"kms:DescribeKey",
"kms:CreateGrant",
"ec2:DeleteNetworkAclEntry",
"kms:ReEncrypt*",
"kms:GenerateDataKey*"
],
"Resource": [
"arn:aws:mgh:*:*:progressUpdateStream/*",
"arn:aws:ec2:*:*:subnet/*",
"arn:aws:ec2:*:*:key-pair/*",
"arn:aws:ec2:*:*:dhcp-options/*",
"arn:aws:ec2:*:*:instance/*",
"arn:aws:ec2:*:*:volume/*",
"arn:aws:ec2:*:*:security-group/*",
"arn:aws:ec2:*:*:network-acl/*",
"arn:aws:ec2:*:*:placement-group/*",
"arn:aws:ec2:*:*:vpc/*",
"arn:aws:ec2:*:*:network-interface/*",
"arn:aws:ec2:*::image/*",
"arn:aws:ec2:*:*:snapshot/*",
"arn:aws:kms:*:*:key/*"
]
},
{
"Sid": "VisualEditor6",
"Effect": "Allow",
"Action": [
"ec2:CreateTags",
"mgh:ImportMigrationTask",
"mgh:AssociateCreatedArtifact",
"mgh:NotifyMigrationTaskState",
"mgh:DisassociateCreatedArtifact",
"mgh:PutResourceAttributes"
],
"Resource": [
"arn:aws:mgh:*:*:progressUpdateStream/*/migrationTask/*",
"arn:aws:ec2:*:*:subnet/*",
"arn:aws:ec2:*::network-interface/*",
"arn:aws:ec2:*:*:dhcp-options/*",
"arn:aws:ec2:*::snapshot/*",
"arn:aws:ec2:*:*:security-group/*",
"arn:aws:ec2:*::image/*"
]
},
{
"Sid": "VisualEditor7",
"Effect": "Allow",
"Action": "ec2:Delete*",
"Resource": [
"arn:aws:ec2:*:*:route-table/*",
"arn:aws:ec2:*:*:dhcp-options/*",
"arn:aws:ec2:*:*:instance/*",
"arn:aws:ec2:*:*:volume/*",
"arn:aws:ec2:*:*:security-group/*",
"arn:aws:ec2:*:*:internet-gateway/*"
],
"Condition": {
"StringLike": {
"ec2:ResourceTag/Name": "CloudEndure*"
}
}
},
{
"Sid": "VisualEditor8",
"Effect": "Allow",
"Action": "ec2:Delete*",
"Resource": [
"arn:aws:ec2:*:*:route-table/*",
"arn:aws:ec2:*:*:dhcp-options/*",
"arn:aws:ec2:*:*:instance/*",
"arn:aws:ec2:*:*:volume/*",
"arn:aws:ec2:*:*:security-group/*",
"arn:aws:ec2:*:*:internet-gateway/*"
],
"Condition": {
"StringLike": {
"ec2:ResourceTag/CloudEndure creation time": "*"
}
}
},
{
"Sid": "VisualEditor9",
"Effect": "Allow",
"Action": "ec2:ModifyVolume",
"Resource": "arn:aws:ec2:*:*:volume/*",
"Condition": {
"StringLike": {
"ec2:ResourceTag/Name": "CloudEndure*"
}
}
},
{
"Sid": "VisualEditor10",
"Effect": "Allow",
"Action": "cloudwatch:GetMetricData",
"Resource": "*"
}
]
}
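
Before pasting a policy of this size into the console, it can help to confirm that the text is well-formed JSON and to see which statements are not limited by a resource ARN or a condition. The following is a minimal sketch in Python, not part of the CloudEndure tooling. The function name summarize_policy and the two-statement SAMPLE excerpt are assumptions made for illustration.

```python
import json


def summarize_policy(policy_text):
    """Parse an IAM policy document and summarize each statement."""
    policy = json.loads(policy_text)  # raises ValueError on malformed JSON
    summary = []
    for stmt in policy["Statement"]:
        actions = stmt["Action"]
        resources = stmt["Resource"]
        # IAM accepts either a single string or a list; normalize to lists.
        if isinstance(actions, str):
            actions = [actions]
        if isinstance(resources, str):
            resources = [resources]
        summary.append({
            "sid": stmt.get("Sid", ""),
            "effect": stmt["Effect"],
            "action_count": len(actions),
            # True when the statement applies to every resource with no condition.
            "unrestricted": "*" in resources and "Condition" not in stmt,
        })
    return summary


# Hypothetical two-statement excerpt, for illustration only.
SAMPLE = """
{
  "Version": "2012-10-17",
  "Statement": [
    {"Sid": "A", "Effect": "Allow", "Action": "ec2:CreateTags",
     "Resource": "arn:aws:ec2:*:*:*/*",
     "Condition": {"StringEquals": {"ec2:CreateAction": "RunInstances"}}},
    {"Sid": "B", "Effect": "Allow", "Action": ["cloudwatch:GetMetricData"],
     "Resource": "*"}
  ]
}
"""

if __name__ == "__main__":
    for row in summarize_policy(SAMPLE):
        print(row)
```

To check the full policy, save it to a file (for example a hypothetical cloudendure-policy.json) and pass the file contents to summarize_policy instead of SAMPLE.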

i. After you create the policy, close that tab and return to your original tab to add
the policy to the new user.
j. Choose Next: Tags.

k. Choose Next: Review to see all of the choices you made up to this point. When
you are ready to proceed, choose Create user.
l. To view the user's access keys (access key ID and secret access key),
choose Show next to each access key that you want to see. To save the access
keys, choose Download .csv and then save the file to a safe location.
Then choose Close.
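
The downloaded .csv file holds the two values that the CloudEndure project asks for in the next step. As a minimal sketch, the Python below reads them out of the file contents. It assumes the file has a header row with columns named "Access key ID" and "Secret access key"; check the header of your own file, since the column names are an assumption here. The function name read_access_key is hypothetical, and the sample keys are the placeholder values used in AWS documentation.

```python
import csv
import io


def read_access_key(csv_text):
    """Return (access_key_id, secret_access_key) from IAM credentials CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    row = next(reader)  # assumption: one data row per user
    return row["Access key ID"], row["Secret access key"]


# Placeholder credentials in the assumed column layout.
SAMPLE_CSV = (
    "User name,Access key ID,Secret access key\n"
    "cloudendure-dr,AKIAIOSFODNN7EXAMPLE,wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY\n"
)

if __name__ == "__main__":
    key_id, secret = read_access_key(SAMPLE_CSV)
    # Print only the key ID; the secret should not be echoed to a terminal.
    print(key_id)
```

When reading a real file, opening it with encoding="utf-8-sig" avoids problems if the file starts with a byte order mark.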

2. Working with Project

a. Create Project.

Click the + icon on the left side of the CloudEndure console.

Enter a Project Name.

For AWS Credentials, enter the AWS Access Key ID and AWS Secret Access Key
of the IAM user created in step 1. Then click Save.

3. Defining Your Replication Settings

Once your credentials have been set up, you will need to define the replication settings for
AWS.

a. Setting Up Replication Settings

Go to Replication Settings.

• For Disaster Recovery Source choose Other Infrastructure.
• For Disaster Recovery Target choose AWS Asia Pacific (Singapore).
• For Replication Server Instance Type choose m5.large.
• For Converter Instance Type choose Default.
• For Subnet choose Private Subnet (vpc-musashi).
• Select Use VPN.
• For Tags, enter a key “Name” with value “App-Server-Portal” and a key
“Project” with value “Musashi-DR”.

Then click Save Replication Settings.


4. Adding Machines (Agent Installation)

Notes for Windows:


1. It is recommended to install all available Windows Updates on the machine.
2. Windows Source machines need to have at least 2 GB of free space in order to
launch a Target machine successfully.
3. CloudEndure does not support Windows disks that have GPT partitions and are
dynamic as boot disks for BIOS machines.
4. CloudEndure does not support OS-based disk encryption features such as
BitLocker. These should be disabled before using any CloudEndure services.

Prerequisites:
For Microsoft Windows Server 2012 R2 64-bit and Microsoft Windows Server 2016 64-bit:

1. Microsoft Windows Server versions 2008 R2 and above require .Net Framework
version 4.5 or above to be installed.
2. Nitro instances (for example, the C5 and M5 family types) will work with RHEL
7.0+ and CentOS 7.0+ in AWS in a Linux environment and with Windows Server
2008 R2, Windows Server 2012 R2, Windows Server 2016, and Windows Server
2019 in a Windows environment. Certain newer AWS regions only support Nitro
instances and therefore only support the previously mentioned operating systems.
3. Each Source machine with an installed Agent continuously communicates with
CloudEndure Replication Servers in the Staging Area over TCP Port 1500. TCP
Port 1500 is needed for the transfer of replicated data from the Source machines to
the Staging Area.
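
Replication stalls if the Source machine cannot reach a Replication Server on TCP Port 1500, so it is worth testing the path once a Replication Server is running. The following Python sketch simply attempts a TCP connection. The function name can_connect is hypothetical, and the address of the Replication Server is something you need to look up yourself.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

For example, can_connect("10.0.1.25", 1500) tests the replication port against a hypothetical Replication Server address. A result of False only shows that the connection failed from this machine; it does not say whether a firewall, a route or the server itself is the cause.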

a. How to Install CloudEndure Agent:

Click on Machine Actions and choose Add Machines.

There you can find your Installation Token and the instructions for installing the
CloudEndure Agent on your operating system.
5. Configure Blueprint

a. Configure Blueprint APP Portal Server


Click on APP Portal Server.

Choose Blueprint Setting.

• For Machine Type choose c5.2xlarge.
• For Launch Type choose On demand.
• For Subnet choose Public Subnet (vpc-musashi).
• For Private IP choose Create New.
• For Elastic IP choose None.
• For Public IP choose Yes.
• For Placement Group, leave as the default (none).
• For IAM Role, leave as the default (none).
• For Initial Target Instance State choose Started.
• For Target Instance Boot Type choose BIOS.
• For Tags, enter a key “Name” with value “App-Server-Portal” and a
key “Project” with value “Musashi-DR”.
• For Disks choose SSD-gp3, IOPS 3000 and Throughput 125.

Then click Save Blueprint.


b. Configure Blueprint DB Portal Server

Click on DB Portal.

Choose Blueprint Setting.

• For Machine Type choose m5.2xlarge.
• For Launch Type choose On demand.
• For Subnet choose Private Subnet (vpc-musashi).
• For Private IP choose Create New.
• For Elastic IP choose None.
• For Public IP choose No.
• For Placement Group, leave as the default (none).
• For IAM Role, leave as the default (none).
• For Initial Target Instance State choose Started.
• For Target Instance Boot Type choose BIOS.
• For Tags, enter a key “Name” with value “DB-Portal” and a
key “Project” with value “Musashi-DR”.
• For Disks choose SSD-gp3, IOPS 3000 and Throughput 125.

Then click Save Blueprint.
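
Because the two Blueprints differ only in a few fields, writing them down as data makes them easy to review and compare before a test launch. The Python below is a sketch under stated assumptions: the Blueprint class, its field names and the check_blueprint function are hypothetical and do not call any CloudEndure API. The gp3 minimums (3000 IOPS and 125 MiB/s) are the baseline values for that volume type, and the public IP rule is an illustrative sanity check rather than a CloudEndure requirement.

```python
from dataclasses import dataclass

GP3_MIN_IOPS = 3000        # gp3 baseline IOPS
GP3_MIN_THROUGHPUT = 125   # gp3 baseline throughput in MiB/s


@dataclass
class Blueprint:
    name: str
    machine_type: str
    subnet: str
    public_ip: bool
    iops: int
    throughput: int


def check_blueprint(bp):
    """Return a list of human-readable problems; the list is empty if none are found."""
    problems = []
    if bp.iops < GP3_MIN_IOPS:
        problems.append(f"{bp.name}: IOPS {bp.iops} is below the gp3 baseline")
    if bp.throughput < GP3_MIN_THROUGHPUT:
        problems.append(f"{bp.name}: throughput {bp.throughput} is below the gp3 baseline")
    # Illustrative rule: a public IP is of little use in a private subnet.
    if bp.public_ip and bp.subnet.startswith("Private"):
        problems.append(f"{bp.name}: public IP requested in a private subnet")
    return problems


# The two Blueprints from this section, transcribed by hand.
APP = Blueprint("App-Server-Portal", "c5.2xlarge", "Public Subnet (vpc-musashi)", True, 3000, 125)
DB = Blueprint("DB-Portal", "m5.2xlarge", "Private Subnet (vpc-musashi)", False, 3000, 125)

if __name__ == "__main__":
    for bp in (APP, DB):
        print(bp.name, check_blueprint(bp) or "ok")
```

Upper limits for gp3 IOPS and throughput are not checked here; consult the current AWS documentation if you raise the values above the baseline.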


Testing and Running Disaster Recovery

6. Testing Mode

Test Mode and Recovery Mode create exactly the same Target machines, but by using the
correct Mode, the User Console will correctly indicate if the Source machine was tested
recently or if it is currently Failed Over (the User Console assumes that a Target machine
launched in Recovery Mode was Failed Over, even though you do the actual Failover
outside of CloudEndure).
Important: Before a test launch, make sure the launched server is not running at the same
time as the on-premises server, so that there are no conflicts on the domain controller.

Important: You should launch machines in Test Mode prior to launching them in Recovery
Mode in order to ensure that the process has been thoroughly tested. After the test launch,
connect to your machine over SSH (Linux) or RDP (Windows) and ensure that everything is
working correctly.

Click Launch Target Machine and choose Test Mode.

A confirmation message will appear. Click NEXT to launch the Testing machine.

Note: Any previous Target machines launched for the Source machines you are testing
will be deleted.
Choose the Recovery Point for the Target machine in Testing Mode.

Note: If the "Latest" Recovery Point is chosen, then CloudEndure will take a new
snapshot of the Source machine and use that snapshot.

After choosing your Recovery Point, click CONTINUE WITH LAUNCH.

Job Progress contains information only while you perform one of the Job actions.

Important: If there is an issue with DNS, unregister and re-register the DNS entry.

7. Performing a Failover

Important: Before Failover, make sure the recovered server is not running at the same time
as the on-premises server, so that there are no conflicts on the domain controller.

Note: To Failover specific machines, you will need to create a new Project (Steps 2
and 3) and configure it identically to your original DR Project (same Target region,
license, credentials and Replication Settings). Once the Project is ready, continue
with the steps below.

Verify that the Source machine you want to Failover has the following status indications
under each column.

o DATA REPLICATION PROGRESS – Continuous Data Protection

o STATUS – Target machine can be launched

o DISASTER RECOVERY LIFECYCLE – Tested Recently
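
When many machines are involved, the three checks above can be applied to a list that you transcribe from the Machine List View. The following Python sketch assumes each machine is a plain dictionary keyed by the column names shown in the console; this representation and the function name not_ready_columns are hypothetical, and nothing here queries CloudEndure.

```python
# Expected value for each column before a machine is failed over.
EXPECTED = {
    "DATA REPLICATION PROGRESS": "Continuous Data Protection",
    "STATUS": "Target machine can be launched",
    "DISASTER RECOVERY LIFECYCLE": "Tested Recently",
}


def not_ready_columns(machine):
    """Return the columns whose value differs from the expected one."""
    return [col for col, want in EXPECTED.items() if machine.get(col) != want]


if __name__ == "__main__":
    # Hand-transcribed example row; the LIFECYCLE value here is a made-up example.
    db_portal = {
        "DATA REPLICATION PROGRESS": "Continuous Data Protection",
        "STATUS": "Target machine can be launched",
        "DISASTER RECOVERY LIFECYCLE": "Not Tested",
    }
    print(not_ready_columns(db_portal))
```

An empty list means the machine matches all three indications; any column named in the result should be resolved before launching in Recovery Mode.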


Check the box to the left of each machine you want to Failover and click the Recovery
Mode option under the LAUNCH X TARGET MACHINES button.

A confirmation message will appear. Click NEXT to launch the Recovery machine.

Note: Any previous Target machines launched for the Source machines you are failing
over will be deleted.

Choose the Recovery Point for the Target machine in Recovery Mode.

Note: If the "Latest" Recovery Point is chosen, then CloudEndure will take a new
snapshot of the Source machine and use that snapshot.

After choosing your Recovery Point, click CONTINUE WITH LAUNCH.


Job Progress.

Job Progress contains information only while you perform one of the Job actions.

Important: If there is an issue with DNS, unregister and re-register the DNS entry.

8. Performing a Failback
Note: To Failback specific machines, you will need to create a new Project (Steps 2
and 3) and configure it identically to your original DR Project (same Target region,
license, credentials and Replication Settings). Once the Project is ready, continue
with the steps below.

The Project view will reset, and what were your Target machines will now appear
as Source machines, ready to replicate their data back to the original Source infrastructure,
which will now (temporarily) appear as your Target infrastructure.

Before preparing for Failback, make sure the following network access is in place:

• Source AWS instance

o Allow inbound TCP port 1500 on the Source AWS instance from which you
are failing back, so that the CloudEndure Failback Client can connect to it.

o Allow the Source AWS instance access to console.cloudendure.com over TCP
port 443.

• Target non-cloud or other-cloud environment (using the Failback Client)

o Allow the Target environment access to S3 so that the replicator software
can be downloaded.

o Allow the Target environment access to console.cloudendure.com over TCP
port 443.

o Allow outbound TCP port 1500 from the Target environment.

Choose the Prepare for Failback option in the PROJECT ACTIONS menu in the User
Console.
Note: You can only initiate the Prepare for Failback action once all of the Source
machines in the Project have launched Target machines in either Test or Recovery
mode.

Click CONTINUE on the Prepare for Failback prompt.

The machines in the Machine List View will display Waiting for boot from Failback Client
under the DATA REPLICATION PROGRESS column.

Download the Failback Client, using the link from the Replication Settings section in the
CloudEndure User Console under Setup & Info. You can access the link by clicking on
Learn more about failing back to "Other Infrastructure" and then clicking download
from here on the Failing Back to an Unidentified Cloud / Other Infrastructure dialog. The
file will be downloaded automatically.
Note: When the Target machine boots from the Failback Client, the client interface
launches and obtains its network configuration through DHCP.
Upon connection, you will be prompted for your CloudEndure Installation Token.
Input your installation token into the Failback Client.

Note: You can locate your installation token by navigating to Setup & Info > Other
Settings > Installation Token in the CloudEndure Console.
Note: You also need the Instance ID of the Source AWS instance. To find it, go to
the AWS Management Console > EC2 > Instances.
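
Instance IDs are easy to mistype when copied by hand into the Failback Client. EC2 instance IDs consist of "i-" followed by 8 (older format) or 17 lowercase hexadecimal characters, so a quick format check can catch truncated or padded values. The Python sketch below performs only that format check; the function name is_instance_id is hypothetical, and a well-formed ID is not proof that the instance exists.

```python
import re

# "i-" followed by 8 (older) or 17 lowercase hexadecimal characters.
INSTANCE_ID = re.compile(r"i-(?:[0-9a-f]{8}|[0-9a-f]{17})")


def is_instance_id(text):
    """Return True if text looks like an EC2 instance ID (format check only)."""
    return INSTANCE_ID.fullmatch(text.strip()) is not None


if __name__ == "__main__":
    # Placeholder ID in the 17-character format.
    print(is_instance_id("i-1234567890abcdef0"))
```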

Note: The Failback will launch new machines. After the new machines have been
launched, the old original Source machines need to be cleaned up manually as
CloudEndure never deletes Source machines. If the Failback Client is launched on
an Other Infrastructure (non-AWS) Source, then the same machine will function as
the Target and should not be deleted.

Important: After the Failback has completed, make sure the server is not running at the
same time as the server on AWS (the server that was launched on AWS is not deleted), so
that there are no conflicts on the domain controller.

Important: If there is an issue with DNS, unregister and re-register the DNS entry.

9. Return to Normal Operation

Note: Before returning to normal operation, you need to have the data that was
written to your Recovery machines copied back to the machines in your original
Source infrastructure.
Your machines will yet again undergo the initiation sequence.

Once your machines enter Continuous Data Protection mode, the Failback process has
been successfully completed.

Additional. Errors when failing back to different source machine

If you fail over to AWS from one Source machine and then fail back from AWS
to a different Source machine, the machine may be stuck at the Initiating Disaster
Recovery stage without data replication progress.

To fix this issue, reinstall the CloudEndure Agent on the new (different) Source machine.
