Recover Crashed Exchange 2013 Mailbox Server in DAG

Recovering a crashed mailbox server in a Database Availability Group (DAG) is a straightforward process using the setup.exe /m:RecoverServer command. However, to ensure a smooth recovery, you need to follow certain steps. Here’s an overview of the recovery process, which I will explain in detail later:

  1. Reset the Crashed Server’s Computer Account in Active Directory (AD): Begin by resetting the computer account associated with the crashed server in AD.
  2. Install a New Server OS: Replace the old crashed server with a new one and install the necessary operating system.
  3. Set Up Windows Features and Updates: Install all required Windows features, prerequisites, and updates on the new server.
  4. Remove Passive Database Copies: Remove the passive database copies that were hosted on the crashed server. If the server is still accessible, manually delete any remaining database and log files.
  5. Remove the Crashed Server from the DAG: Use the Exchange Admin Center (EAC) or Exchange Management Shell (EMS) to remove the crashed server from the DAG.
  6. Evict the Server from the Failover Cluster: Remove the crashed server from the failover cluster using the Failover Cluster Manager.
  7. Start the Recovery Process: Run the setup.exe command with the necessary switches in Command Prompt to initiate the recovery process. Further details about these switches will be provided later in this section.
  8. Add to the DAG and Reseed the Database Copies: Add the recovered server back to the DAG and reseed its passive database copies.

After the recovery process completes, the recovered server reappears in the Exchange Admin Center (EAC). You can then add it back to the Database Availability Group (DAG) and reseed databases as necessary. Here are the detailed steps I followed for recovery:

Steps for Recovery

1. Reset the Crashed Server’s Computer Account in Active Directory (AD)

  • Open Active Directory Users and Computers (ADUC).
  • Find the computer account of the crashed server.
  • Right-click on the account and select “Reset Account” to remove any old associations.
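
If you prefer the command line over ADUC, the same reset can be sketched in PowerShell. This is only a sketch under assumptions: it presumes the ActiveDirectory module and the dsmod tool (both part of the AD DS remote administration tools) are available where you run it, and MBX02 is a hypothetical server name, not one from my environment.

```powershell
# Sketch only: reset the computer account of the crashed server from the command line.
Import-Module ActiveDirectory

$crashedServer = 'MBX02'   # hypothetical name; use your crashed server's name

# Look up the computer object to get its distinguished name.
$computer = Get-ADComputer -Identity $crashedServer

# dsmod's -reset option resets the computer account, like "Reset Account" in ADUC.
dsmod computer $computer.DistinguishedName -reset
```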

2. Install a New Server OS to Replace the Crashed Server

  • Set up a new server with the same specifications as the crashed server.
  • Use the same computer name as the old server and join it to the domain.
  • Install the necessary Windows features, Microsoft Filter Packs, and the Unified Communications Managed API (UCMA) Runtime.

3. Set Up Windows Features and Updates

a) Run PowerShell as an administrator.

b) Install Windows Features:

Use the following PowerShell command to install required features:

Install-WindowsFeature AS-HTTP-Activation, Desktop-Experience, NET-Framework-45-Features, RPC-over-HTTP-proxy, RSAT-Clustering, RSAT-Clustering-CmdInterface, Web-Mgmt-Console, WAS-Process-Model, Web-Asp-Net45, Web-Basic-Auth, Web-Client-Auth, Web-Digest-Auth, Web-Dir-Browsing, Web-Dyn-Compression, Web-Http-Errors, Web-Http-Logging, Web-Http-Redirect, Web-Http-Tracing, Web-ISAPI-Ext, Web-ISAPI-Filter, Web-Lgcy-Mgmt-Console, Web-Metabase, Web-Mgmt-Console, Web-Mgmt-Service, Web-Net-Ext45, Web-Request-Monitor, Web-Server, Web-Stat-Compression, Web-Static-Content, Web-Windows-Auth, Web-WMI, Windows-Identity-Foundation

c) Install Additional Prerequisites:

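Download and install the prerequisites listed in step 2: the Microsoft Filter Packs and the Unified Communications Managed API (UCMA) Runtime.

Note: The new server must use the same computer name as the old one and be joined to the domain.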

4. Remove Database Passive Copies from the Crashed Server

Since the crashed server is inaccessible, you can remove passive database copies using the Exchange Management Shell (EMS) on healthy servers.

Steps:

1. Open EMS on a healthy server.

2. Check the status of the mailbox database copies

Get-MailboxDatabaseCopyStatus *\<name of your crashed server>

Ensure that all databases related to the crashed server are in the ServiceDown state.

3. Remove the failed database copies:

Get-MailboxDatabaseCopyStatus *\<name of your crashed server> | Remove-MailboxDatabaseCopy

A warning may appear. Review it and proceed to remove the copies.
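
The check-then-remove sequence above could be scripted roughly as follows. Treat it as a sketch rather than the exact commands I ran: MBX02 is a hypothetical server name, and the step that records the lag settings and activation preferences is an extra precaution I am assuming you want, so the copies can be re-added with the same settings later.

```powershell
# Sketch only: run in EMS on a healthy DAG member.
$crashedServer = 'MBX02'   # hypothetical name; use your crashed server's name

# Record lag settings and activation preferences before removing anything.
Get-MailboxDatabase | Format-List Name, ReplayLagTimes, TruncationLagTimes, ActivationPreference

# List every copy hosted on the crashed server; they should show ServiceDown.
$copies = Get-MailboxDatabaseCopyStatus -Identity "*\$crashedServer"
$copies | Format-Table Name, Status -AutoSize

# Preview the removal first; -WhatIf makes no changes.
$copies | ForEach-Object { Remove-MailboxDatabaseCopy -Identity $_.Name -WhatIf }

# After reviewing the preview, remove the copies for real:
# $copies | ForEach-Object { Remove-MailboxDatabaseCopy -Identity $_.Name -Confirm:$false }
```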

(Screenshot: checking the database copy status)

5. Remove the Crashed Server from the DAG

Using the Exchange Admin Center (EAC):

  1. Open the EAC.
  2. Navigate to servers > database availability groups.
  3. Select the DAG and click Manage DAG Membership.
  4. Remove the crashed server from the group.

Using Exchange Management Shell (EMS):
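Run the following command to remove the server from the DAG. Because the crashed server is offline, include the -ConfigurationOnly switch, which removes the server from the DAG configuration in Active Directory only:

Remove-DatabaseAvailabilityGroupServer -Identity <your DAG name> -MailboxServer <your crashed server name> -ConfigurationOnly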

6. Evict the Crashed Server from Failover Cluster

Removing the server from the DAG does not automatically remove it from the failover cluster. You must manually evict it.

Steps:

  1. Open an elevated Command Prompt and check the cluster node statuses:

     cluster.exe node

     Identify nodes in a Down state.
  2. Open Failover Cluster Manager:
    • Go to [your cluster name] > Nodes.
    • Locate the failed server.
    • Right-click on the server and choose More Actions > Evict to remove it.
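
As an alternative to Failover Cluster Manager, the eviction can be sketched with the FailoverClusters PowerShell module. This assumes you run it in an elevated session on a healthy DAG member where that module is installed; MBX02 is a hypothetical name for the crashed node.

```powershell
# Sketch only: evict the crashed node from the cluster with PowerShell.
Import-Module FailoverClusters

$crashedServer = 'MBX02'   # hypothetical name; use your crashed server's name

# Show all cluster nodes and their state; the crashed node should be Down.
Get-ClusterNode | Format-Table Name, State -AutoSize

# Evict the crashed node. -Force skips the confirmation prompt.
Remove-ClusterNode -Name $crashedServer -Force

# Confirm that the node is gone.
Get-ClusterNode | Format-Table Name, State -AutoSize
```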

7. Start the Recovery Process

  1. Navigate to the folder containing the Exchange setup files.
  2. Run the following command to start the recovery:

     setup.exe /m:RecoverServer /IAcceptExchangeServerLicenseTerms

Note: If you encounter errors stating that the setup files are outdated, download the latest Cumulative Update (CU) for Exchange from Microsoft’s official site. If needed, check which Exchange version the healthy servers are running:

Get-ExchangeServer | Format-List Name, Edition, AdminDisplayVersion

The recovery process fetches information from AD objects and reinstalls Exchange with all previous configurations. You can monitor the progress in the console.
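
To avoid the outdated-setup error before you start, you could compare the build of your setup media with what the healthy servers report. The following is a rough sketch, assuming it is run in EMS on a healthy server that can reach the setup files; the path E:\ExchangeSetup is hypothetical.

```powershell
# Sketch only: compare the setup media build with the builds in the organization.
$setupPath = 'E:\ExchangeSetup\Setup.exe'   # hypothetical path to the Exchange setup files

# Build of the setup media, read from the file's version resource.
(Get-Item $setupPath).VersionInfo.ProductVersion

# Builds reported by the Exchange servers in the organization.
Get-ExchangeServer | Format-Table Name, AdminDisplayVersion -AutoSize

# If Exchange was originally installed in a non-default folder, setup also
# accepts a /TargetDir switch, for example:
#   setup.exe /m:RecoverServer /TargetDir:"D:\Exchange" /IAcceptExchangeServerLicenseTerms
```

The two version strings are formatted differently, so compare the build numbers by eye rather than expecting an exact string match.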

(Screenshot: the recovery process in progress)

8. Add to the DAG and Reseed the Database Copies

After the recovery completes, reboot the recovered server so that it functions properly. Then:
       a) Add the server back to the DAG.
       b) Reseed the passive database copies.
This is a fairly simple process, so I won’t go into detail here.
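
For reference, these two steps can be sketched in EMS as follows. The names DAG01 and MBX02 are hypothetical, mdb04 stands in for whichever database you are re-adding, and the activation preference of 2 is only an example value.

```powershell
# Sketch only: run in EMS after the recovered server has been rebooted.
$dagName   = 'DAG01'   # hypothetical DAG name
$recovered = 'MBX02'   # hypothetical name of the recovered server
$database  = 'mdb04'   # database to copy to the recovered server

# a) Add the recovered server back to the DAG.
Add-DatabaseAvailabilityGroupServer -Identity $dagName -MailboxServer $recovered

# b) Re-create the passive copy; this also starts seeding by default.
Add-MailboxDatabaseCopy -Identity $database -MailboxServer $recovered -ActivationPreference 2

# Check the health of the new copy.
Get-MailboxDatabaseCopyStatus -Identity "$database\$recovered" |
    Format-List Name, Status, CopyQueueLength, ReplayQueueLength, ContentIndexState

# If seeding fails and has to be repeated, suspend the copy and reseed it:
# Suspend-MailboxDatabaseCopy -Identity "$database\$recovered" -Confirm:$false
# Update-MailboxDatabaseCopy -Identity "$database\$recovered" -DeleteExistingFiles
```

Repeat step b) for each database that had a copy on the crashed server.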

Congratulations!

Your recovery is now complete.


Note: In my case, I ran into some minor issues when reseeding the database, starting with the following error:

The seeding operation failed. Error: An error occurred while performing the seed operation. Error: Unable to delete logs at 'D:\M datalogs'. The database has been seeded successfully. If any obsolete log files exist, manually delete them to prevent database divergence. Error: System.IO.IOException: The file or directory is corrupted and unreadable. at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath) at System.IO.FileInfo.Delete() at Microsoft.Exchange.Cluster.Replay.DatabaseSeederInstance.DeleteLogFiles(DirectoryInfo di, String logfilePrefix, String logfileSuffix, Int32& logNum) at Microsoft.Exchange.Cluster.Replay.DatabaseSeederInstance.

So I assumed there was some disk corruption on my hard disk. Fortunately, that server hosted no active database copies, so I first ran chkdsk on the volume without the /r switch. Since it found some disk errors, I then ran chkdsk with /r /f to repair them:

C:\> chkdsk d: /r /f

Then I reseeded the database again, and a different error came up:

The seeding operation failed. Error: An error occurred while performing the seed operation. Error: Failed to notify source server ‘[my source server name]’ about the local truncation point. Hresult: 0xc8000713. Error: Unable to find the file.

According to this article, the issue is caused by an old log folder that is out of sync. I followed this procedure:

1. Dismount the active copy of the database from the Exchange Admin Center (EAC).

2. Find the database file path and the log folder path in EMS. Let us assume here that the .edb file path is D:\Database\mydatabase.edb and the log folder path is D:\DatabaseLogs.

Get-MailboxDatabase mdb04 | fl *path*

3. Log in to the source server hosting that database (here, ‘[my source server name]’) and run eseutil.exe in an elevated Command Prompt to verify that the database is in a Clean Shutdown state. Use the database path obtained in step 2.

C:\> eseutil /mh D:\Database\mydatabase.edb

(If the state is Clean Shutdown, continue to the next step. If the state is Dirty Shutdown, you first need to recover the database using its log files; this article will help you.)

4. If the database is in a Clean Shutdown state, you can delete all log files in the log folder found in step 2. If you are unsure, rename the log folder instead and delete it later. If there are thousands of log files, you can also delete them from the command line:

D:\DatabaseLogs> rmdir . /s /q

5. Mount the database. This creates new log files.

6. Reseed the database copies.
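
The six steps above can be strung together roughly like this. It is a sketch under several assumptions, not the exact commands I ran: it is meant for EMS on the source server, it assumes eseutil.exe is on the PATH, it reuses the example paths from step 2, MBX02 is a hypothetical name for the server holding the passive copy, and it moves the old logs to a backup folder (D:\DatabaseLogs_old, also hypothetical) instead of deleting them.

```powershell
# Sketch only: clear out-of-sync logs on the source server, then reseed.
$database  = 'mdb04'
$recovered = 'MBX02'                        # hypothetical server holding the passive copy
$edbPath   = 'D:\Database\mydatabase.edb'   # example path from step 2
$logPath   = 'D:\DatabaseLogs'              # example path from step 2
$backup    = 'D:\DatabaseLogs_old'          # hypothetical backup folder

# Step 1: dismount the active copy.
Dismount-Database -Identity $database -Confirm:$false

# Step 3: read the database header and look for a clean shutdown.
$header = & eseutil.exe /mh $edbPath

if ($header -match 'State:\s*Clean Shutdown') {
    # Step 4: move the old logs aside rather than deleting them outright.
    New-Item -ItemType Directory -Path $backup | Out-Null
    Move-Item -Path (Join-Path $logPath '*') -Destination $backup

    # Step 5: mounting the database creates new log files.
    Mount-Database -Identity $database

    # Step 6: reseed the passive copy on the recovered server.
    Suspend-MailboxDatabaseCopy -Identity "$database\$recovered" -Confirm:$false
    Update-MailboxDatabaseCopy -Identity "$database\$recovered" -DeleteExistingFiles
} else {
    Write-Warning 'Database is not in a Clean Shutdown state. Leave the logs in place and recover it first.'
}
```

Once the reseed succeeds and the copy reports Healthy, the backup folder can be deleted.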
