Crowd Synchronisation against Active Directory times out with a load balancer in place
After upgrading to Crowd 2.7 or higher, synchronizations against Active Directory may time out and, as a consequence, login attempts may fail.
Crowd Server and Data Center
A load balancer between Crowd and Active Directory
Atlassian Crowd may exhibit this error after an upgrade. The issue typically appears when both of the following conditions apply:
- There is a load balancer between Crowd and Active Directory
- The same setup worked before the upgrade, and nothing else has changed
To verify the issue, search the atlassian-crowd.log file for a timeout error similar to the following:
2014-12-02 15:29:20,185 scheduler_Worker-6 ERROR [atlassian.crowd.directory.DbCachingDirectoryPoller] Error occurred while refreshing the cache for directory [ 13467652 ].
com.atlassian.crowd.exception.OperationFailedException: Error looking up attributes for highestCommittedUSN
    at com.atlassian.crowd.directory.MicrosoftActiveDirectory.fetchHighestCommittedUSN(MicrosoftActiveDirectory.java:807)
    at com.atlassian.crowd.directory.ldap.cache.UsnChangedCacheRefresher.synchroniseAll(UsnChangedCacheRefresher.java:159)
    at com.atlassian.crowd.directory.DbCachingRemoteDirectory.synchroniseCache(DbCachingRemoteDirectory.java:1120)
    at com.atlassian.crowd.manager.directory.DirectorySynchroniserImpl.synchronise(DirectorySynchroniserImpl.java:76)
    ...
    at com.atlassian.crowd.directory.DbCachingDirectoryPoller.pollChanges(DbCachingDirectoryPoller.java:50)
    at com.atlassian.crowd.manager.directory.monitor.poller.DirectoryPollerJobRunner.runJob(DirectoryPollerJobRunner.java:93)
    at com.atlassian.scheduler.core.JobLauncher.runJob(JobLauncher.java:135)
    at com.atlassian.scheduler.core.JobLauncher.launchAndBuildResponse(JobLauncher.java:101)
    at com.atlassian.scheduler.core.JobLauncher.launch(JobLauncher.java:80)
    at com.atlassian.scheduler.quartz1.Quartz1Job.execute(Quartz1Job.java:32)
    at org.quartz.core.JobRunShell.run(JobRunShell.java:223)
    at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:549)
Caused by: org.springframework.ldap.UncategorizedLdapException: Uncategorized exception occured during LDAP processing; nested exception is javax.naming.NamingException: LDAP response read timed out, timeout used:120000ms.; remaining name '/'
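On a busy server the log can be large, so a small script may be quicker than searching by eye. The following is a minimal sketch, not an Atlassian tool: it assumes the log messages match the wording of the example above, and the function name `find_ldap_read_timeouts` is made up for illustration. It reports which directory ID each LDAP read timeout belongs to.

```python
import re
import sys
from typing import List, Optional, Tuple

# Patterns are assumptions based on the wording of the sample log entry above.
DIRECTORY_RE = r"Error occurred while refreshing the cache for directory \[\s*(\d+)\s*\]"
TIMEOUT_RE = r"LDAP response read timed out, timeout used:\s*(-?\d+)\s*ms"
EVENT_RE = re.compile(f"(?:{DIRECTORY_RE})|(?:{TIMEOUT_RE})")


def find_ldap_read_timeouts(log_text: str) -> List[Tuple[Optional[str], int]]:
    """Return (directory_id, timeout_ms) for every LDAP read timeout found.

    The directory ID is taken from the most recent "refreshing the cache"
    error that precedes the timeout, or None if no such error was seen.
    """
    results: List[Tuple[Optional[str], int]] = []
    current_directory: Optional[str] = None
    for match in EVENT_RE.finditer(log_text):
        directory_id, timeout_ms = match.groups()
        if directory_id is not None:
            # Remember which directory the following stack trace belongs to.
            current_directory = directory_id
        else:
            results.append((current_directory, int(timeout_ms)))
    return results


if __name__ == "__main__":
    # Pass one or more log file paths on the command line.
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8", errors="replace") as handle:
            for directory_id, timeout_ms in find_ldap_read_timeouts(handle.read()):
                print(f"{path}: directory {directory_id} timed out after {timeout_ms} ms")
```

Run it with the path to atlassian-crowd.log as the argument. If it prints nothing, the log does not contain this particular timeout and the cause is probably different.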
There are several possible causes, and not every fix will apply to every environment. Work through the checks below to see whether they apply to your case and help resolve the issue.
Has "Follow Referrals" been enabled?
The most common cause of timeouts is "Follow Referrals" being enabled. Generally, these timeouts have one of two root causes:
- The DNS for the domain is not valid, causing timeouts. See User Lookups Fail With PartialResultExceptions for more information.
- A large domain (particularly if the domain is partitioned) can also cause similar timeouts. Disabling this option prevents Crowd from following referrals into other partitions, which should shorten the sync time (but may not give a complete result).
If "Follow Referrals" has been enabled, try disabling it and then run the synchronization again.
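To check the DNS cause before changing the setting, you can test whether the hosts named in referrals resolve from the Crowd server. The sketch below assumes you already know the referral URLs your domain returns; the helper names `referral_host` and `unresolvable_hosts` are hypothetical. It uses only the Python standard library and does not talk to LDAP itself.

```python
import socket
import sys
from typing import Callable, Iterable, List
from urllib.parse import urlsplit


def referral_host(referral_url: str) -> str:
    """Extract the host name from a referral URL such as ldap://host/DC=..."""
    host = urlsplit(referral_url).hostname
    if not host:
        raise ValueError(f"no host in referral URL: {referral_url!r}")
    return host


def unresolvable_hosts(
    hosts: Iterable[str],
    resolve: Callable = socket.getaddrinfo,
) -> List[str]:
    """Return the hosts that fail DNS resolution, in input order."""
    failed: List[str] = []
    for host in hosts:
        try:
            # 389 is the default LDAP port; only the name lookup matters here.
            resolve(host, 389)
        except OSError:
            # socket.gaierror (lookup failure) is a subclass of OSError.
            failed.append(host)
    return failed


if __name__ == "__main__":
    # Pass referral URLs on the command line, run from the Crowd server.
    names = [referral_host(url) for url in sys.argv[1:]]
    for host in unresolvable_hosts(names):
        print(f"cannot resolve: {host}")
```

Any host that is reported as unresolvable is a likely place for a referral lookup to stall. An empty result only rules out DNS and says nothing about the size of the domain.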
Restricting the LDAP Scope
Using a more restrictive filter or a narrower base DN, see if you can limit the LDAP search to a single, smaller OU; the smaller the better.
Does upgrading to Crowd 2.8.2 or higher resolve the problem?
Crowd 2.8.2 introduced some important performance improvements, particularly for large directories. See the Crowd 2.8.2 Release Notes for more information.
Can you bypass the load balancer?
Bypass the load balancer and connect directly to Active Directory. Doing so, even temporarily, will either confirm the load balancer as the cause of the problem or remove it from consideration.
Some customers have reported problems with certain load balancer products or configurations after upgrading to Crowd 2.7 or higher, even though the same configuration worked without any problems in Crowd 2.6. Some customers have reported success using HAProxy as a load balancer in front of Active Directory.
Please note that the setup or configuration of a load balancer is not covered by Atlassian Support Offerings.
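Before pointing Crowd at a domain controller directly, it can help to confirm that both the load balancer address and the domain controller accept connections from the Crowd server. The following is a rough sketch under stated assumptions: `time_tcp_connect` is a hypothetical helper, and it only measures how long a plain TCP connection takes to open. It does not reproduce the LDAP read timeout itself, so treat it as a first reachability check.

```python
import socket
import sys
import time
from typing import Optional


def time_tcp_connect(host: str, port: int, timeout: float = 5.0) -> Optional[float]:
    """Return the seconds taken to open a TCP connection, or None on failure."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        # Covers refused connections, timeouts, and name lookup failures.
        return None


if __name__ == "__main__":
    # Pass endpoints as host:port, for example the load balancer address
    # and one domain controller (the names are whatever your site uses).
    for endpoint in sys.argv[1:]:
        host, _, port = endpoint.rpartition(":")
        elapsed = time_tcp_connect(host, int(port))
        if elapsed is None:
            print(f"{endpoint}: connection failed")
        else:
            print(f"{endpoint}: connected in {elapsed:.3f}s")
```

If both endpoints connect quickly but synchronization still times out only through the load balancer, that points toward how the load balancer handles established LDAP connections rather than basic reachability.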