This post has been sitting around for a while, as I expected the issue to be fixed at some point. Unfortunately, it has not been fixed as of today, so be aware of it if you’re going to decommission your Exchange servers, in particular when you remove databases.
There can be multiple reasons for removing databases. In my case it was the next logical step in our journey to Exchange Online.
As we migrated mailboxes to Exchange Online, we reached a point where we could start reducing the number of servers in our datacenters.
So we started decommissioning the first servers out of approx. 130 in total. The first steps were the following:
- moving all mailboxes of any type off the databases in the respective DAG
- disabling circular logging (as we are using Exchange Native Data Protection!)
- removing all database copies
- removing the active database
So far so good, and it looks straightforward. But suddenly we started seeing tickets coming in from owners of line-of-business (LOB) applications and from end users reporting connectivity issues.
By checking the event logs, I found the first indication:
The event logs across all servers were flooded with these events. And not only Autodiscover and EWS were affected: all HTTP-based Exchange protocols were. The servers randomly responded with HTTP 503 errors.
Even mailboxes already migrated to Exchange Online were affected, because Autodiscover still points to on-premises as long as a single mailbox remains there.
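To see how often an endpoint answers with HTTP 503, you can sample it repeatedly and count the status codes. The following is only a minimal Python sketch, not something from our environment: the host name `mail.example.com` is a placeholder, and the function names are made up for illustration. An unauthenticated request to a healthy endpoint will typically return 401, which is fine here, since only the share of 503 responses matters.

```python
import urllib.error
import urllib.request
from collections import Counter


def http_status(url, timeout=10):
    """Return the HTTP status code of a GET request, including error codes."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx responses; the code is what we want to count.
        return err.code


def sample_statuses(url, attempts=20, fetch=http_status):
    """Request the URL `attempts` times and count how often each status occurs."""
    return Counter(fetch(url) for _ in range(attempts))


# Hypothetical usage (placeholder host name):
# counts = sample_statuses("https://mail.example.com/autodiscover/autodiscover.xml")
# print(counts[503], "of", sum(counts.values()), "requests returned 503")
```

Connection-level failures (DNS, TLS, timeouts) are not caught in this sketch and will raise an exception.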
The only solution for this problem is to recycle the application pools on all Exchange servers. This will fix it immediately.
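Recycling can be scripted per server. Below is a rough Python sketch that wraps the IIS command-line tool `appcmd.exe`. It assumes that the Exchange application pools all have names starting with `MSExchange` and that the script runs locally on each Exchange server from an elevated prompt. The function names are hypothetical, so treat this as an illustration rather than the procedure we actually used.

```python
import os
import subprocess

# Default location of the IIS command-line tool on a Windows server.
APPCMD = os.path.join(os.environ.get("windir", r"C:\Windows"),
                      "System32", "inetsrv", "appcmd.exe")


def exchange_pools(listing, prefix="MSExchange"):
    """Pick pool names with the given prefix from 'appcmd list apppool /text:name' output."""
    names = (line.strip() for line in listing.splitlines())
    return [name for name in names if name.startswith(prefix)]


def recycle_commands(pools):
    """Build one 'appcmd recycle apppool' command line per pool."""
    return [[APPCMD, "recycle", "apppool", f"/apppool.name:{pool}"] for pool in pools]


def recycle_exchange_pools():
    """List the local app pools and recycle those matching the assumed prefix."""
    listing = subprocess.run([APPCMD, "list", "apppool", "/text:name"],
                             capture_output=True, text=True, check=True).stdout
    for cmd in recycle_commands(exchange_pools(listing)):
        result = subprocess.run(cmd)
        if result.returncode != 0:
            # For example, a pool that is stopped cannot be recycled.
            print("recycle failed:", cmd[-1])
```

Calling `recycle_exchange_pools()` on a server recycles every matching pool on that machine only, so it has to be repeated on each Exchange server.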
Be careful whenever you remove databases, as the ResourceLocator seems to keep an outdated cache and does not re-evaluate it. This issue might be fixed in one of the next CUs.