I had an Active Directory replication problem at a customer site with a multi-domain environment. A child domain contained two domain controllers, DC1 and DC2. DC1 resided in a remote branch location and DC2 in a datacentre.
- DC2 was able to replicate changes to DC1 without issues.
- DC1 was not able to replicate changes to DC2.
Symptoms
A manual replication attempt failed with the following error:
The following error occurred during the attempt to synchronize naming context "ROOT DOMAIN" for domain controller DC1 to domain controller DC2:
The naming context is in the process of being removed or is not replicated from the specified server.
The operation will not continue.
Running repadmin /showrepl showed all inbound replication partners succeeding on their last replication attempt; however, all outbound replication, such as the attempt from DC1 to DC2, was failing with the following errors:
Source: RemoteSiteName\DC1
******* 216 CONSECUTIVE FAILURES since 2013-04-21 07:32:26
Last error: 8524 (0x214c):
Can't retrieve message string 8524 (0x214c), error 1815.
Naming Context: CN=Configuration,DC=ROOTDOMAIN,DC=LOCAL
Source: RemoteSiteName\DC1\
******* WARNING: KCC could not add this REPLICA LINK due to error.
Naming Context: DC=CHILDDOMAIN,DC=ROOTDOMAIN,DC=LOCAL
Source: RemoteSiteName\DC1
******* WARNING: KCC could not add this REPLICA LINK due to error.
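The numeric codes in this output are worth decoding by hand, because repadmin could not load the message text itself. Below is a minimal Python sketch, assuming the output has been saved as text. The `KNOWN_ERRORS` table and the `last_errors` helper are illustrative names, not part of any tool, and the table lists only the two codes that appear above, with meanings taken from the standard Win32 system error code list.

```python
import re

# Win32 system error codes that appear in the repadmin output above.
KNOWN_ERRORS = {
    8524: "ERROR_DS_DNS_LOOKUP_FAILURE: the DSA operation is unable to "
          "proceed because of a DNS lookup failure",
    1815: "ERROR_RESOURCE_LANG_NOT_FOUND: the specified resource language "
          "ID cannot be found in the image file",
}

# Matches lines such as "Last error: 8524 (0x214c):"
LAST_ERROR = re.compile(r"Last error:\s*(\d+)\s*\(0x([0-9a-fA-F]+)\)")


def last_errors(showrepl_text):
    """Return a (code, description) pair for each 'Last error' line."""
    results = []
    for match in LAST_ERROR.finditer(showrepl_text):
        code = int(match.group(1))
        # The decimal and hexadecimal forms on the line should agree.
        if code != int(match.group(2), 16):
            raise ValueError(f"inconsistent error code: {match.group(0)}")
        results.append((code, KNOWN_ERRORS.get(code, "not in table")))
    return results


sample = """Source: RemoteSiteName\\DC1
******* 216 CONSECUTIVE FAILURES since 2013-04-21 07:32:26
Last error: 8524 (0x214c):
Can't retrieve message string 8524 (0x214c), error 1815.
"""

for code, description in last_errors(sample):
    print(code, description)
```

Error 8524 is a DNS lookup failure, so the repadmin output on its own already points at name resolution rather than at the replication engine. Error 1815 only explains why the message text was missing.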
Running DCDiag on DC1 returned the following errors:
Starting test: Connectivity
The host 9b2163cf-b8e7-4ad4-bd54-2342e6cfc1db._msdcs.rootdomain.local could not be resolved to an IP address. Check the DNS server, DHCP, server name etc.
Although the GUID DNS name 9b2163cf-b8e7-4ad4-bd54-2342e6cfc1db._msdcs.rootdomain.local couldn't be resolved, the server name (DC1.child.rootdomain.local) resolved to the IP address xx.xx.xx.xx and was pingable. Check that the IP address is registered correctly with the DNS server.
Resolution
The following error message is the one that led me to the resolution of this replication issue.
The host 9b2163cf-b8e7-4ad4-bd54-2342e6cfc1db._msdcs.rootdomain.local could not be resolved to an IP address. Check the DNS server, DHCP, server name etc.
Although the GUID DNS name 9b2163cf-b8e7-4ad4-bd54-2342e6cfc1db._msdcs.rootdomain.local couldn't be resolved, the server name (DC1.child.rootdomain.local) resolved to the IP address xx.xx.xx.xx and was pingable. Check that the IP address is registered correctly with the DNS server.
The first domain you create in a new Active Directory forest is the forest root domain (this can never be changed without building a new forest). The forest root domain's DNS namespace contains the _msdcs container, which holds a CNAME record for every domain controller in the forest: those in the root domain as well as those in any child or new tree domains. Each record is named after the domain controller's GUID and points to its host name. These CNAME records are what Active Directory uses to look up domain controllers when performing replication.
This is shown in the following screenshot.
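The same lookup that DCDiag performs can be approximated from any machine that uses the forest's DNS servers. The sketch below is a rough Python illustration under that assumption. The helper names `dsa_guid_name` and `resolves` are made up for this example, and the GUID and forest name are the ones from the DCDiag output.

```python
import socket


def dsa_guid_name(dsa_guid, forest_root):
    """Build the GUID-based DNS name: <guid>._msdcs.<forest root domain>."""
    guid = dsa_guid.strip("{}").lower()
    return f"{guid}._msdcs.{forest_root.rstrip('.').lower()}"


def resolves(hostname):
    """Return True if the name resolves to at least one address.

    getaddrinfo follows the CNAME to the domain controller's host record,
    so a missing CNAME shows up as a resolution failure.
    """
    try:
        return len(socket.getaddrinfo(hostname, None)) > 0
    except socket.gaierror:
        return False


if __name__ == "__main__":
    name = dsa_guid_name("9b2163cf-b8e7-4ad4-bd54-2342e6cfc1db",
                         "rootdomain.local")
    print(name)
    print("resolves" if resolves(name) else "does NOT resolve")
```

If the GUID name fails to resolve while the ordinary host name of the domain controller resolves fine, the host record is intact and only the CNAME under _msdcs is missing.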
DC1 was unable to replicate to any other DC in the domain because someone had deleted DC1's GUID CNAME record from the _msdcs container in DNS. Using the GUID from the DCDiag error message, we manually recreated the 9b2163cf-b8e7-4ad4-bd54-2342e6cfc1db._msdcs.rootdomain.local record as a CNAME pointing to DC1.
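We recreated the record by hand in the DNS console, but the shape of the record can also be written down as a dnscmd /RecordAdd command. The Python sketch below only builds and prints that command; it does not run it. It assumes that _msdcs.rootdomain.local is hosted as its own zone (in some forests it is a subdomain inside the rootdomain.local zone, in which case the zone and node names differ), and `DNS01` is a placeholder server name, not one from this environment.

```python
def cname_recordadd_command(dns_server, zone, dsa_guid, target_fqdn):
    """Build the argument list for: dnscmd <server> /RecordAdd <zone> <node> CNAME <target>."""
    return ["dnscmd", dns_server, "/RecordAdd", zone,
            dsa_guid.lower(), "CNAME", target_fqdn]


command = cname_recordadd_command(
    dns_server="DNS01",                                # placeholder server name
    zone="_msdcs.rootdomain.local",                    # assumed to be a separate zone
    dsa_guid="9b2163cf-b8e7-4ad4-bd54-2342e6cfc1db",   # GUID from the DCDiag error
    target_fqdn="DC1.child.rootdomain.local",          # host name from the DCDiag error
)
print(" ".join(command))
```

Whichever way the record is created, it is worth confirming afterwards that the GUID name resolves to DC1's address before expecting replication to recover.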
After 45 minutes replication began working again for DC1.
Hope this post helps someone with the same problem.
 
"After 45 minutes replication began working again for DC1."
So did the system fix itself automatically after 45 minutes, or did you actually have to do something to fix it? If you did, what exactly? And if the issue would have resolved itself after a longer wait, would you even have noticed this transient issue at all?
Sorry, I just missed that you manually re-created the GUID entry. Isn't there an automatic way of doing this, for example when the Netlogon/ADDS/DNS service is restarted, it checks for this issue and fixes all by itself?
Good blog Clint,
For readers, that record is also referred to as the DSA GUID or DC-GUID depending on the search. Probably the most important record in DNS.
Thanks
Mike
Hi Anonymous,
This customer has over 30 domain controllers, so as you would understand the wait would be dependent on replication and on the ISTG (part of the KCC) performing rediscovery.
Also thanks for your input Mike.
Kind Regards,
Clint
Thanks, I had a similar (but not identical) issue, and the fix was the same.