Monday, November 18, 2013

The server holding the PDC role is down

My customer is running a single-domain Active Directory forest.  The domain has a central site and approximately 15 remote branch sites in a hub-and-spoke topology.

Running a FSMO role check with DCDiag on any domain controller in the domain reported that the server holding the PDC role is down.  The output below was the same on every domain controller throughout the domain.

dcdiag /test:FSMOcheck

Warning: DsGetDcName(TIME_SERVER) call failed, error 1355
A Time Server could not be located.
The server holding the PDC role is down.
Warning: DsGetDcName(GOOD_TIME_SERVER_PREFERRED) call failed, error 1355.
A Good Time Server could not be located.


The time errors just mean the domain time hierarchy is not configured correctly.  The most likely cause is that the PDC emulator in the forest root domain is not configured to sync to an external time source (a common configuration requirement in all AD domains).

What was most concerning, however, was the error "The server holding the PDC role is down".  Testing the PDC emulator, I could connect to it using MMC snap-ins such as Active Directory Users and Computers, and I verified that the PDC emulator domain controller was processing authentication requests.  Why was this error saying it is down?

Next I performed a test to verify that the PDC emulator was working correctly.  The PDC emulator has many functions, including being the default target for domain Group Policy changes, the authoritative source for time synchronisation in the domain, and the most up-to-date source of passwords.  For example, password changes performed on other DCs in the domain are replicated to the PDC emulator immediately rather than waiting for the normal replication schedule.  If a logon fails at a given DC due to a bad password, that DC forwards the authentication request to the PDC emulator to validate it against the most current password.

Using this understanding of how the PDC emulator works, I changed a user's password on a remote domain controller in another Active Directory site which has a replication interval of 3 hours.  I then attempted to log in with the new password in a different AD site, against a domain controller which did not yet have the updated password.  I verified the request went to the PDC emulator to validate that the user's password had changed - hence the PDC emulator is in working order.

Querying for the PDC emulator with netdom (netdom query fsmo lists the role holders) also returned it successfully.


Why were all domain controllers complaining the PDC role is down when running DCDiag?

The Resolution

I first tackled the time issue.  After logging in to the PDC emulator I noticed its time configuration was set up incorrectly: Type was set to NT5DS instead of NTP, AnnounceFlags was set to 10 instead of 5, and no NTP server was specified.  The PDC emulator in the forest root domain is the only computer in an Active Directory forest which should synchronise using NTP to an external time source.  All other domain controllers and member computers need to be set to NT5DS, which tells them to use the Windows time hierarchy.

If you are not aware how this works, have a look at the following article (note that some of the commands are slightly different in Vista/2008 and later, but the concepts still apply).

http://www.windowsnetworking.com/articles-tutorials/windows-2003/Configuring-Windows-Time-Service.html

For changes in some of the commands in the windowsnetworking.com article please see:

http://clintboessen.blogspot.com.au/2012/07/windows-time-sync-changes-in-2008.html

To fix this, I first ran the following command to configure the PDC emulator to use pool.ntp.org, one of the most widely used NTP server pools on the Internet.

w32tm /config /update /manualpeerlist:pool.ntp.org

Next I opened regedit and browsed to the following key:

HKLM\SYSTEM\CurrentControlSet\services\W32Time\Config

There I changed AnnounceFlags from 10 to 5 to make the server a "reliable time source".  The PDC emulator must always be a reliable time source.  All other domain controllers and member servers should be left at 10, the default.
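The values 5 and 10 are easier to reason about once you see that AnnounceFlags is a bit mask.  The following is a minimal Python sketch, not part of the original fix, that decodes a value into its documented bits.  The function and constant names are my own for illustration; only the bit values (0x1, 0x2, 0x4, 0x8) come from the Windows Time service settings.

```python
# Documented bits of the W32Time AnnounceFlags registry value.
# The Python names below are illustrative, not an official API.
ANNOUNCE_FLAG_BITS = [
    (0x1, "Timeserv"),                   # always advertise as a time server
    (0x2, "AutomaticTimeserv"),          # advertise as a time server automatically
    (0x4, "ReliableTimeserv"),           # always advertise as a reliable time server
    (0x8, "AutomaticReliableTimeserv"),  # advertise as reliable automatically
]


def decode_announce_flags(value):
    """Return the names of the bits that are set in an AnnounceFlags value."""
    return [name for bit, name in ANNOUNCE_FLAG_BITS if value & bit]


print(5, decode_announce_flags(5))
print(10, decode_announce_flags(10))
```

So 5 is 0x1 + 0x4 (always a time server, always reliable), which is what the PDC emulator needs, while 10 is 0x2 + 0x8 (both decided automatically).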


Next, under the Parameters key (HKLM\SYSTEM\CurrentControlSet\services\W32Time\Parameters), set Type to NTP instead of NT5DS.  Even though we configured an NTP server, the service won't use it unless Type is set to NTP.  The NtpServer value in the same key shows pool.ntp.org, which we configured with the command above.


Next restart the Windows Time service in services.msc on the PDC Emulator.

Lastly, run w32tm /resync from an elevated command prompt.
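If the resync does not succeed, one thing worth ruling out is whether the PDC emulator can reach the NTP server over UDP port 123 at all.  The following is a rough Python sketch of a bare SNTP client query, assuming Python is available on a machine behind the same firewall.  It is not what the Windows Time service does internally, and the function names are made up for this example.

```python
import socket
import struct

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_EPOCH_OFFSET = 2208988800


def build_sntp_request():
    """Build a 48-byte SNTP client request.

    First byte 0x1B = leap indicator 0, version 3, mode 3 (client).
    """
    return b"\x1b" + 47 * b"\x00"


def parse_transmit_time(packet):
    """Extract the server transmit timestamp as Unix time (seconds)."""
    if len(packet) < 48:
        raise ValueError("NTP packet is too short")
    # The transmit timestamp is the last 8 bytes of the 48-byte header:
    # 32-bit seconds followed by a 32-bit binary fraction.
    seconds, fraction = struct.unpack("!II", packet[40:48])
    return seconds - NTP_EPOCH_OFFSET + fraction / 2**32


def query_ntp(server="pool.ntp.org", timeout=5.0):
    """Send one SNTP request over UDP 123 and return the server's time."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_sntp_request(), (server, 123))
        packet, _ = sock.recvfrom(512)
    return parse_transmit_time(packet)
```

Calling query_ntp() returns the server's time as a Unix timestamp; a socket timeout suggests UDP 123 outbound is being blocked somewhere between the PDC emulator and the Internet.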

 
After fixing up the time issue I ran the following dcdiag test command on a remote domain controller again:
 
dcdiag /test:FSMOcheck


The error did not recur; all tests passed.

Conclusion

The conclusion from these findings is that the error "The server holding the PDC role is down" from DCDiag is misleading.  The PDC emulator was in working order; DCDiag reported it as down only because the time configuration was not set up correctly.
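If you collect DCDiag output from many domain controllers, the pattern from this post can be spotted automatically.  Below is a small Python sketch, based only on the output shown earlier in this post, that flags the case where the "PDC role is down" message appears together with failed time server lookups.  The function names are hypothetical, and the rule is a hint to check the time configuration first, not a definitive diagnosis.

```python
import re

# Matches lines such as:
#   Warning: DsGetDcName(TIME_SERVER) call failed, error 1355
FAILED_LOOKUP = re.compile(r"DsGetDcName\((\w+)\) call failed, error (\d+)")
PDC_DOWN = "The server holding the PDC role is down"
TIME_FLAGS = {"TIME_SERVER", "GOOD_TIME_SERVER_PREFERRED"}


def failed_lookups(output):
    """Return (flag, error_code) pairs for every failed DsGetDcName call."""
    return [(flag, int(code)) for flag, code in FAILED_LOOKUP.findall(output)]


def pdc_down_may_be_time_related(output):
    """True when the PDC-down message appears alongside a failed time lookup."""
    time_failed = any(flag in TIME_FLAGS for flag, _ in failed_lookups(output))
    return PDC_DOWN in output and time_failed


SAMPLE = """\
Warning: DsGetDcName(TIME_SERVER) call failed, error 1355
A Time Server could not be located.
The server holding the PDC role is down.
Warning: DsGetDcName(GOOD_TIME_SERVER_PREFERRED) call failed, error 1355
A Good Time Server could not be located.
"""

print(failed_lookups(SAMPLE))
print(pdc_down_may_be_time_related(SAMPLE))
```

Error 1355 is ERROR_NO_SUCH_DOMAIN, which here only means that no domain controller advertising the requested time server capability could be located.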

5 comments:

  1. Rather than edit the registry after using the w32tm /config command, you can use the following:
    w32tm /config /manualpeerlist:oceania.pool.ntp.org,0x1 /syncfromflags:manual /reliable:yes /update

  2. Thanks Anonymous, yeah back in 2003 server you needed to edit the registry... still in old habits.

    Your method is a cleaner way of making the change.

  3. Hi Clint,

    Even more reliable is to use Group Policy. Configure a Group Policy Object with a WMI filter which targets the DC holding the PDCe role. This means even if the PDCe role is transferred the new PDCe will have the right NTP settings applied (it will still require UDP 123 out to Internet though!)

    Well documented and much more resilient!

    http://blogs.technet.com/b/askds/archive/2008/11/13/configuring-an-authoritative-time-server-with-group-policy-using-wmi-filtering.aspx

  4. Thanks for your input Anonymous, I like your solution.

  5. Great article! You made my day, Clint!

    Guy Van Dyck
    Guynius Software
