Ammar Hasayen
AMMARHASAYEN.WORDPRESS.COM
Contents
1. Introduction
1.1 Quorum
1.2 DAG Networks
1.3 Active Manager
2.
3.
4.
5.
6.
7. Autodiscover
7.1 When Autodiscover is triggered on Outlook
7.2 How to find the service
7.3 What Autodiscover needs
7.4 What Autodiscover processes
7.5 What Autodiscover returns
8.
Scenario 3
Scenario 4
1. Introduction
This guide explains, in simple terms, the technologies and procedures you need to know to perform an
Exchange 2010 datacenter switchover, recover a DAG member, or stretch a DAG between sites.
1.1 Quorum
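Quorum is defined as a mechanism that ensures only one subset of the cluster members is functioning at
any given time. It is used to determine which subset holds the majority.
Quorum data is configuration data shared between all nodes.
Exchange 2010 supports only two of the four quorum models: Node Majority (used when the DAG has an
odd number of members) and Node and File Share Majority (used when the DAG has an even number of
members).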
The witness is a file share (containing Witness.log) that represents a vote when a tie needs to be broken.
When the cluster is one vote away from losing the majority, the node that holds the cluster group (the
PAM) locks the witness file share.
The witness file share cluster resource is created when the number of DAG members becomes even, and
the cluster applies IsAlive checks to monitor it. If the resource fails, the cluster group is moved to
another node, which tries to bring it online.
The Exchange Trusted Subsystem group should be a member of the local Administrators group on the
witness server and on the alternate witness server.
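The majority rule described above can be sketched in a few lines of Python. This is only an illustrative model, not how the cluster service is implemented: the function name and parameters are hypothetical, and it assumes that the witness contributes one vote only when the DAG has an even number of members.

```python
def has_quorum(total_members: int, members_up: int, witness_up: bool = False) -> bool:
    """Return True if the surviving voters form a strict majority.

    Assumption: with an odd member count only the nodes vote; with an even
    member count the file share witness adds one tie-breaking vote.
    """
    uses_witness = total_members % 2 == 0
    total_votes = total_members + (1 if uses_witness else 0)
    votes_present = members_up + (1 if uses_witness and witness_up else 0)
    # A strict majority means more than half of all configured votes.
    return votes_present > total_votes // 2


# Four-member DAG split evenly: the side that holds the witness keeps quorum.
print(has_quorum(4, 2, witness_up=True))
print(has_quorum(4, 2, witness_up=False))
```

In the four-member example, two nodes plus the witness hold three of five votes, while two nodes alone hold only two of five.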
1.2 DAG Networks
MAPI Network:
o You can have only one MAPI network.
o It has a default gateway and registers in DNS.
Replication Network (over TCP port 64327):
o You can have zero or as many replication networks as you want.
o It has no default gateway and does not register in DNS.
DAG network enumeration happens only when DAG members are added, or it can be triggered by
running Set-DatabaseAvailabilityGroup -DiscoverNetworks.
If the MAPI network fails on a server, an automatic failover happens.
If the replication network fails on a server, replication happens over the MAPI network.
iSCSI networks should be configured to be ignored by the cluster.
Also make sure that the replication network cannot route to the MAPI network in any case, or a
cross-heartbeat scenario will occur.
Note: When the DAG has exactly two started members in the alternate datacenter, the boot time of the
alternate witness server is also used. If the witness server booted before the DAC bit was set to 1,
the DAC check succeeds. Otherwise, use Restore-DatabaseAvailabilityGroup. This applies only to DAGs
with two started members.
In all cases, if all DAG members have the DAC bit set to 0, use Start-DatabaseAvailabilityGroup to
reset the DAC bit to 1, even if the nodes are already started.
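The note above can be read as a small decision procedure. The Python sketch below is a simplified model of that reading only: the function names are hypothetical, the boot time is assumed to be compared with the time the DAC bit was last set to 1, and the fallback of simply mounting in every other case is an assumption rather than something the note states.

```python
from datetime import datetime
from typing import Optional, Sequence

START_DAG = "Start-DatabaseAvailabilityGroup"
RESTORE_DAG = "Restore-DatabaseAvailabilityGroup"
MOUNT = "mount"


def witness_check_passes(witness_boot: datetime, dac_bit_set: datetime) -> bool:
    """True when the witness server booted before the DAC bit was set to 1."""
    return witness_boot < dac_bit_set


def recommended_action(
    dac_bits: Sequence[int],
    witness_boot: Optional[datetime] = None,
    dac_bit_set: Optional[datetime] = None,
) -> str:
    # All members at 0: the note says to reset the bit with Start-DatabaseAvailabilityGroup.
    if all(bit == 0 for bit in dac_bits):
        return START_DAG
    # Two started members: fall back to the witness boot-time comparison.
    if len(dac_bits) == 2 and witness_boot is not None and dac_bit_set is not None:
        return MOUNT if witness_check_passes(witness_boot, dac_bit_set) else RESTORE_DAG
    # Assumption: in every other case at least one member already allows mounting.
    return MOUNT


print(recommended_action([0, 0]))
```

The sketch returns the name of the cmdlet to run, or "mount" when no extra action is needed; it does not call Exchange itself.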
2.3 Restore-DatabaseAvailabilityGroup
o Evicts DAG members marked as stopped from the cluster, thus re-establishing quorum
o Assigns the alternate witness share in case of an even number of nodes
2.4 Examples
Stop-DatabaseAvailabilityGroup -Identity DAG1 -MailboxServer E14EX2
Stop-DatabaseAvailabilityGroup -Identity DAG1 -ActiveDirectorySite Redmond
Stop-DatabaseAvailabilityGroup -Identity DAG1 -MailboxServer E14EX3 -ConfigurationOnly
4. Database Mobility
If a server fails but the database files on the SAN or disk are still accessible, you can mount the
database on another server. This is called Database Mobility.
1. Attach the database files to a drive on the new mailbox server.
2. Use eseutil to check the health of the database:
eseutil /mh database.edb | findstr /i state:
3. If the database is in a dirty shutdown state and the log files are available, perform a soft
recovery. From the folder that contains the log files, type:
eseutil /r E00 /d G:\Data\databaseFolderPath
Note: Replace E00 with the log prefix.
4. Finally, create a new database on the new server, mark it as overwritable by a restore, dismount
it, and swap in the recovered database files.
5. Point the users to the new database:
Get-Mailbox -Database oldDB | Set-Mailbox -Database newDB
6. Outlook clients will automatically pick up the new information.
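Steps 2 and 3 hinge on reading the State line of the eseutil header dump. The Python sketch below shows one way to automate that check. It is a hedged illustration: the function names are hypothetical, and the sample text is a hand-written fragment in the general shape of eseutil output, not a captured dump.

```python
import re
from typing import Optional


def database_state(eseutil_mh_output: str) -> Optional[str]:
    """Extract the value of the 'State:' line from eseutil /mh output, if present."""
    match = re.search(
        r"^\s*State:\s*(.+?)\s*$", eseutil_mh_output, re.MULTILINE | re.IGNORECASE
    )
    return match.group(1) if match else None


def needs_soft_recovery(eseutil_mh_output: str) -> bool:
    """A dirty shutdown means the logs must be replayed (eseutil /r) before mounting."""
    state = database_state(eseutil_mh_output)
    return state is not None and state.lower() == "dirty shutdown"


# Hand-written sample fragment (an assumption, for illustration only).
sample = "     DB Signature: (illustrative)\n            State: Dirty Shutdown\n"
print(database_state(sample), needs_soft_recovery(sample))
```

The match is case-insensitive for the same reason the findstr filter needs /i: the header line is capitalized as "State:".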
5.2 Scenario 1
Suppose that the primary site went down completely, and you changed the DNS entry for
owa.contoso.com to point to the CAS NLB in the secondary site. Now the primary site is back to normal,
and you changed the DNS entry for owa.contoso.com back to the primary CAS NLB in the main site.
The client needs to wait for the TTL of owa.contoso.com to expire (the TTL is usually set to 5 minutes),
and even after the DNS cache expires, the browser will still cache the DNS entry for another 20 minutes.
So a loop will happen here: the browser goes to owa.contoso.com, which still resolves to the secondary
CAS NLB because of the browser cache, and the secondary CAS array sends an OWA redirection
message: "Hey... You should be using https://owa.contoso.com for best performance." This happens
because the mailbox is now active in the primary site and the OWA ExternalURL of the primary CAS array
is https://owa.contoso.com.
The user may think: "Odd, I just logged in at that site! Silly computer, let me log in again."
The second time the user logs in to owa.contoso.com, he will probably still hit the secondary CAS array
servers because his browser cache still isn't updated. The secondary CAS array servers are intelligent
enough to see this second logon attempt (via a web canary) and conclude: "Oh, this user's DNS cache is
old. He doesn't know we failed back to the other datacenter. Send him the FailbackURL for the primary
CAS servers."
The user is then shown a slightly different page with a CONTINUE button, explaining that the mailbox
is in the process of being brought online in a different datacenter. He clicks Continue, which takes him
to the FailbackURL. He logs in again and this time lands successfully in OWA.
So the secondary CAS array detects whether the primary CAS servers have a FailbackURL configured and,
if so, redirects the client to it to end the loop. If no FailbackURL is configured, the secondary CAS
array sends the client an error page indicating that he should close his browser and try again.
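The behaviour of the secondary CAS array in this scenario boils down to a three-way decision. The Python sketch below models it as described in the text; it is not Exchange code, the function name and return strings are hypothetical, and the failback host name used in the example is invented for illustration.

```python
from typing import Optional


def secondary_cas_response(
    second_attempt: bool, external_url: str, failback_url: Optional[str]
) -> str:
    """Model what the secondary CAS array sends to a client with a stale DNS cache."""
    if not second_attempt:
        # First logon: ordinary redirect to the ExternalURL of the primary CAS array.
        return f"redirect:{external_url}"
    if failback_url:
        # Second logon (detected via the canary): offer the FailbackURL to end the loop.
        return f"continue:{failback_url}"
    # No FailbackURL configured: the user is told to close the browser and retry.
    return "error:close-browser-and-retry"


external = "https://owa.contoso.com"
failback = "https://failback.contoso.com/owa"  # hypothetical FailbackURL
print(secondary_cas_response(False, external, failback))
print(secondary_cas_response(True, external, failback))
print(secondary_cas_response(True, external, None))
```

The last case is why configuring a FailbackURL on the primary CAS servers matters: without it the only way out of the loop is waiting for the browser cache to expire.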
5.3 Scenario 2
If a CAS receives an OWA request for a mailbox whose database legacyExchangeDN matches the CAS's
local AD site, but the database is mounted in a different site, the CAS issues a redirect to the
ExternalURL of the CAS in the site where the database is mounted.
5.4 Scenario 3
New in SP2: Cross-Site Silent Redirection
If you run Set-OWAVirtualDirectory with -CrossSiteRedirectType Silent (the default is Manual), all
redirections become silent. In addition, if FBA or Integrated authentication is configured, a single
sign-on experience occurs.
o If the mailbox servers in the primary site are operational and there is a functioning DC in
the primary site, use:
Stop-DatabaseAvailabilityGroup -Identity DAG1 -ActiveDirectorySite NYC
o If the mailbox servers in the primary site are not operational but there is a domain
controller in the primary site, use this command for each primary mailbox server:
Stop-DatabaseAvailabilityGroup -Identity DAG1 -MailboxServer E14EX3 -ConfigurationOnly
o If neither a DC nor mailbox servers are available in the primary site, make sure that the
mailbox servers stay shut down.
o If the primary mailbox servers are online, make sure the cluster service is set to
Disabled, or disable it yourself.
2. UM Servers
We need to tell the secondary site which servers are available during the switchover.
This can be done by using the Stop-DatabaseAvailabilityGroup command with the
-ConfigurationOnly switch.
If any Unified Messaging servers are in use in the failed datacenter, they must be disabled to
prevent call routing to the failed datacenter. You can disable a Unified Messaging server by
using the Disable-UMServer cmdlet (for example, Disable-UMServer UM01).
Alternatively, if you are using a Voice over IP (VoIP) gateway, you can also remove the Unified
Messaging server entries from the VoIP gateway, or change the DNS records for the failed
servers to point to the IP address of the Unified Messaging servers in the second datacenter if
your VoIP gateway is configured to route calls using DNS.
5. Make sure the Witness server and witness directory are up. Never lose them, and avoid
restarting them. Make sure Exchange Trusted Subsystem is a member of the local Administrators
group on the Witness server, and create a firewall rule on the Witness server if necessary to
allow all traffic from the mailbox servers to the Witness server.
6. At this moment, the secondary mailbox server(s) will try to assume ownership of the cluster
group, try to bring the secondary DAG IP online, and keep trying to bring the alternate
Witness share online.
7. Use the Get-DatabaseAvailabilityGroup cmdlet to make sure the stopped servers are the mailbox
servers in the primary site, while the started servers are only those in the secondary site.
8. If the databases in the secondary site don't mount automatically, remember to remove any
activation blocks at the server level (Set-MailboxServer) or at the database copy level
(suspended activation).
9. If the databases still don't mount correctly, use this command:
Move-ActiveMailboxDatabase -Server FQDNofaServerinPrimarySite
-ActivateOnServer FQDNofaServerinDRSite
This command has many Skip switches that can be handy. This is a very important step, as it is
like taking ownership of those databases. You can also use:
Move-ActiveMailboxDatabase DatabaseName -ActivateOnServer
FQDNofaServerinDRSite
10. We need to choose whether or not to remove the database copies in the primary site to allow
log truncation. If we do, reseeding will be necessary once you fail back to the
primary datacenter.
11. Outlook clients will behave as follows:
a. If the primary CAS servers are online, the CAS servers in the primary site will issue a silent
redirect to Outlook users. Outlook users will see a message that they need to
restart Outlook.
b. If the primary CAS servers are offline, you can change the DNS record for the Outlook
Anywhere name, or just force Autodiscover to run by repairing the Outlook profile.
12. OWA clients will behave as follows:
a. If the primary CAS servers are online, silent redirection will happen with SSO, since both
OWA virtual directories have Integrated authentication enabled.
b. If the primary CAS servers are offline, the DNS name for the primary OWA should point to
the secondary site, and that's it.
13. If you restart the mailbox servers in the secondary site and/or the Witness server, the DAC bit
will be set to 0 and the databases will be shown as Dismounted. If you try to mount them, you will
get an error stating that the replication services on the primary mailbox servers are not online.
You may also have a problem locating the Active Manager, especially if you run:
Get-DatabaseAvailabilityGroup -Identity DAGName -Status.
The solution is to force the DAC bit to 1 by running Start-DatabaseAvailabilityGroup with
-MailboxServer for the secondary mailbox servers, even if they are already started.
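The check in step 7 is a simple set comparison, which the Python sketch below makes explicit. It assumes you have already collected the stopped and started server names from Get-DatabaseAvailabilityGroup by some other means; the function name and the server names E14EX1 and E14EX4 are hypothetical.

```python
from typing import Iterable


def switchover_state_ok(
    stopped: Iterable[str],
    started: Iterable[str],
    primary_servers: Iterable[str],
    secondary_servers: Iterable[str],
) -> bool:
    """True when exactly the primary servers are stopped and exactly the secondary ones are started."""

    def normalize(names: Iterable[str]) -> set:
        # Server names are compared case-insensitively.
        return {name.upper() for name in names}

    return normalize(stopped) == normalize(primary_servers) and normalize(
        started
    ) == normalize(secondary_servers)


primary = ["E14EX1", "E14EX2"]    # hypothetical primary-site members
secondary = ["E14EX3", "E14EX4"]  # hypothetical secondary-site members
print(switchover_state_ok(["e14ex1", "E14EX2"], ["E14EX3", "E14EX4"], primary, secondary))
```

A False result means a primary server is still listed as started, or a secondary server is listed as stopped, so the switchover should not proceed to mounting databases.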
Mail.LON.contoso.com
OWA.LON.contoso.com
EAS.LON.contoso.com
Suppose also that the Autodiscover SCP for the CAS servers in the primary datacenter points to
Mail.NYC.contoso.com, while the SCP for the CAS servers in the secondary datacenter points to
Mail.LON.contoso.com. Finally, suppose that the public autodiscover.contoso.com name points
externally to the primary datacenter publishing rule.
7. Notice that the default cluster group is hosted in the secondary site, which means that the
Primary Active Manager (PAM) is located there, on the node that holds the default cluster group.
To identify the PAM server, run: Get-DatabaseAvailabilityGroup -Identity DAG1 -Status
| FL *Primary*
8. You can move the default cluster group to the primary mailbox server by running:
cluster group "Cluster Group" /MoveTo:EX01
9. Dismount the databases in the secondary datacenter and move the CAS URLs.
10. After DNS has replicated and the cache is refreshed, use Move-ActiveMailboxDatabase to
activate the copies in the primary site.
11. Mount the database copies in the primary site.
12. Outlook clients will show a message indicating that the administrator has made a change
and Outlook needs to be restarted.
Note: When mounting database copies in the primary site, you will sometimes face issues such as a
database that cannot mount because of an index problem. In this case, you can run:
Update-MailboxDatabaseCopy DBName\FailedToMountServer -CatalogOnly
If this doesn't work, use:
Move-ActiveMailboxDatabase "Database Name" -ActivateOnServer DestinationServer
-SkipClientExperienceChecks
Note that this command is powerful. Look at this:
Move-ActiveMailboxDatabase "Database Name" -ActivateOnServer DestinationServer Options
Where Options can be:
-SkipActiveCopyChecks
-SkipClientExperienceChecks
-SkipHealthChecks
-SkipLagChecks
7. Autodiscover
7.1 When Autodiscover is triggered on Outlook
Nevertheless, repairing Outlook profile is the most effective way to force complete reconfiguration of
Outlook when Autodiscover gets new information.
o Database Name
o Home Server (the RPC Client Access Server attribute of the database), a.k.a. the database
legacyExchangeDN
o LegacyDN of the mailbox
The rest of the information is not that important and is returned by Autodiscover.
If a profile is configured, Outlook will try to resolve the Home Server in the Outlook profile and
connect to it using TCP. This represents the Client Access Server array object, which should not
resolve externally in any case (nor internally, but only if you want to force Outlook Anywhere behavior).
Scenario 1
It is important to remember that neither Outlook nor CAS cares about the AD site in which the CAS
server is located.
If the database gets mounted in a different site, and you change just the DNS record of the primary CAS
array to point to the CAS array of the secondary site, everything works fine. This works for RPC clients.
Scenario 2
RULE: The RPCClientAccessServer property of the database (a.k.a. the database legacyExchangeDN) always
points to the RPC CAS array that is in the same site as the copy of the mailbox database with the lowest
activation preference (which equals 1).
In the figure below, when the database gets mounted on MBX-C, the RPCClientAccessServer property
stays CAS-Pri.contoso.com. The Outlook user still points to CAS-Pri.contoso.com, and CAS Direct
Connect over the WAN happens from CAS-Pri to MBX-C. If CAS-Pri is inaccessible, Outlook gets
disconnected!
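The rule can be illustrated with a short Python model. This is a sketch of the rule as stated above, not of how Exchange stores or computes the property: the class, the function, the site labels, the server name MBX-A and the array name CAS-Sec.contoso.com are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class DatabaseCopy:
    server: str
    site: str
    activation_preference: int


def rpc_client_access_server(
    copies: List[DatabaseCopy], cas_array_by_site: Dict[str, str]
) -> str:
    """Return the CAS array in the site of the copy with the lowest activation preference."""
    preferred = min(copies, key=lambda copy: copy.activation_preference)
    return cas_array_by_site[preferred.site]


arrays = {"Primary": "CAS-Pri.contoso.com", "Secondary": "CAS-Sec.contoso.com"}
copies = [
    DatabaseCopy("MBX-A", "Primary", 1),
    DatabaseCopy("MBX-C", "Secondary", 2),
]
# Where the database is currently mounted plays no part in the result.
print(rpc_client_access_server(copies, arrays))
```

Note that the function never looks at which copy is active, which is exactly why Outlook keeps pointing at CAS-Pri after the database is mounted on MBX-C.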
Scenario 3
The only time the system changes RPCClientAccessServer value on the database is when the
administrator changes the ActivationPreference number on the activated database copy such that it
now has the lowest value (meaning it becomes the preferred copy), as seen below.
However, the Outlook clients with an existing Outlook profile would continue to use the old RPC
endpoint rather than the new RPC endpoint (even though Autodiscover detected the change). This is
because the old RPC endpoint does not return an ecWrongServer response to the client.
The RPC endpoint accepts the connection; therefore, Outlook ignores the Autodiscover response
because it has a working connection. In the event that the old RPC endpoint becomes inaccessible,
Outlook 2007/2010 would update its settings. At any time you could force Outlook to use the new RPC
endpoint by forcing a profile repair.
You can also manually change the RPCClientAccessServer property of the database to point to the new
array instead of changing its activation preference.
The same happens when you move a mailbox to a database in a different AD site. Outlook will continue
to use the old, configured RPC CAS array unless that array becomes inaccessible or you trigger an
Outlook profile repair.
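The client-side behaviour in this scenario can be summarised as a single condition. The Python sketch below is a simplified model of what the text describes, not Outlook's actual logic: the function and parameter names are hypothetical, and treating an ecWrongServer response as a trigger is inferred from the statement that its absence is why Outlook stays on the old endpoint.

```python
def outlook_rpc_endpoint(
    current_endpoint: str,
    autodiscover_endpoint: str,
    current_reachable: bool,
    wrong_server_returned: bool = False,
    profile_repaired: bool = False,
) -> str:
    """Pick the RPC endpoint an existing Outlook profile keeps using."""
    if profile_repaired or wrong_server_returned or not current_reachable:
        # Only now does Outlook act on the Autodiscover response.
        return autodiscover_endpoint
    # A working connection wins over whatever Autodiscover reports.
    return current_endpoint


old, new = "CAS-Pri.contoso.com", "CAS-Sec.contoso.com"  # CAS-Sec is a hypothetical name
print(outlook_rpc_endpoint(old, new, current_reachable=True))
print(outlook_rpc_endpoint(old, new, current_reachable=False))
print(outlook_rpc_endpoint(old, new, current_reachable=True, profile_repaired=True))
```

The first call shows the sticky case: the old array still answers, so the profile does not change even though Autodiscover already returns the new array.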
Scenario 4
After Exchange 2010 SP2 RU3, the following changes apply:
By default, once you have installed SP2 RU3, when you move mailboxes between AD sites, all
versions of Outlook are prompted to restart, and the Outlook profile's RPC endpoint is
updated.
In fact, the CAS array in the primary site will ask Outlook to redirect to the CAS array in the
secondary site, even though the legacyExchangeDN of the database still points to the primary CAS array.
[Figure: Best Copy Selection (BCS) flowchart. Recoverable decision labels: "Within AutoDatabaseMountDial?" (Yes/No) and "Exclusion?"]