Unresponsive application / users cannot log in when the cache is flushed in Data Center

Platform notice: Server and Data Center only. This article only applies to Atlassian products on the Server and Data Center platforms.

Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.

*Except Fisheye and Crucible

Problem

When the cache is flushed by a script or other function that uses the cache manager's flush option, JIRA Data Center becomes unresponsive and users cannot log in. JIRA eventually rebuilds the cache and propagates it across the nodes, but until that completes the application is unusable.


If the user caches are flushed, the following message may appear in atlassian-jira.log:

Remote user name (xxxxxx): Not found in any directory.
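
To gauge how many logins are affected, the log can be scanned for the message above. The following is a minimal sketch, not an Atlassian tool: the function name count_missing_user_lines is hypothetical, and the pattern assumes the message appears exactly as quoted, with the user name in parentheses.

```python
import re

# Pattern derived from the log message quoted above (assumed to be stable);
# the affected user name is captured from inside the parentheses.
NOT_FOUND = re.compile(
    r"Remote user name \((?P<user>[^)]*)\): Not found in any directory"
)


def count_missing_user_lines(lines):
    """Return a dict mapping user name to the number of matching log lines."""
    counts = {}
    for line in lines:
        match = NOT_FOUND.search(line)
        if match:
            user = match.group("user")
            counts[user] = counts.get(user, 0) + 1
    return counts
```

Passing the open log file (for example, the lines of atlassian-jira.log from each node) returns a per-user count. A sudden burst of entries across many users would be consistent with a flushed user cache.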

Thread dumps taken while the application is unresponsive will contain stack traces like the following:

"https-jsse-nio-8080-exec-3" #300 daemon prio=5 os_prio=0 tid=0x00007f9b9ba31800 nid=0x11f6b runnable [0x00007f91b8683000]
java.lang.Thread.State: RUNNABLE
at java.io.RandomAccessFile.writeBytes(Native Method)
at java.io.RandomAccessFile.write(RandomAccessFile.java:525)
at com.squareup.tape.QueueFile.ringWrite(QueueFile.java:237)
at com.squareup.tape.QueueFile.add(QueueFile.java:317)
- locked <0x0000000554a8c1c0> (a com.squareup.tape.QueueFile)
at com.squareup.tape.FileObjectQueue.add(FileObjectQueue.java:46)
at com.atlassian.jira.cluster.distribution.localq.tape.TapeLocalQCacheOpQueue.add(TapeLocalQCacheOpQueue.java:151)
at com.atlassian.jira.cluster.distribution.localq.LocalQCacheOpQueueWithStats.add(LocalQCacheOpQueueWithStats.java:115)
at com.atlassian.jira.cluster.distribution.localq.LocalQCacheManager.addToQueue(LocalQCacheManager.java:370)
at com.atlassian.jira.cluster.distribution.localq.LocalQCacheManager.addToAllQueues(LocalQCacheManager.java:354)
at com.atlassian.jira.cluster.distribution.localq.LocalQCacheReplicator.replicateToQueue(LocalQCacheReplicator.java:85)
at com.atlassian.jira.cluster.distribution.localq.LocalQCacheReplicator.replicatePutNotification(LocalQCacheReplicator.java:65)
at com.atlassian.jira.cluster.cache.ehcache.AbstractJiraCacheReplicator.replicateViaCopy(AbstractJiraCacheReplicator.java:153)
at com.atlassian.jira.cluster.cache.ehcache.AbstractJiraCacheReplicator.notifyElementPut(AbstractJiraCacheReplicator.java:88)
at net.sf.ehcache.event.RegisteredEventListeners.internalNotifyElementPut(RegisteredEventListeners.java:192)
at net.sf.ehcache.event.RegisteredEventListeners.notifyElementPut(RegisteredEventListeners.java:170)
at net.sf.ehcache.Cache.notifyPutInternalListeners(Cache.java:1648)
at net.sf.ehcache.Cache.putInternal(Cache.java:1618)
at net.sf.ehcache.Cache.put(Cache.java:1543)
at net.sf.ehcache.Cache.put(Cache.java:1508)
at com.atlassian.cache.ehcache.DelegatingCache.put(DelegatingCache.java:93)
at com.atlassian.jira.cache.DeferredReplicationCache.lambda$put$0(DeferredReplicationCache.java:60)
at com.atlassian.jira.cache.DeferredReplicationCache$$Lambda$148/563903108.get(Unknown Source)
at com.atlassian.jira.cluster.cache.ehcache.BlockingParallelCacheReplicator.runDeferred(BlockingParallelCacheReplicator.java:172)
at com.atlassian.jira.cache.DeferredReplicationCache.put(DeferredReplicationCache.java:59)
at com.atlassian.jira.crowd.embedded.ofbiz.UserOrGroupCache$PutVisitor.visit(UserOrGroupCache.java:247)
at com.atlassian.jira.crowd.embedded.ofbiz.UserOrGroupCache$PutVisitor.visit(UserOrGroupCache.java:238)
at com.atlassian.jira.util.Functions$MappedVisitor.visit(Functions.java:198)
at com.atlassian.jira.entity.SelectQueryImpl$ExecutionContextImpl$$Lambda$1139/664730190.accept(Unknown Source)
at com.atlassian.jira.entity.SelectQueryImpl$ExecutionContextImpl.forEach(SelectQueryImpl.java:231)
at com.atlassian.jira.entity.SelectQueryImpl$ExecutionContextImpl.visitWith(SelectQueryImpl.java:185)
at com.atlassian.jira.crowd.embedded.ofbiz.EagerOfBizGroupCache.visitAllUsingDatabase(EagerOfBizGroupCache.java:81)
at com.atlassian.jira.crowd.embedded.ofbiz.UserOrGroupCache.buildCacheForced(UserOrGroupCache.java:149)
at com.atlassian.jira.crowd.embedded.ofbiz.UserOrGroupCache.buildCacheIfRequiredUnderLock(UserOrGroupCache.java:129)
at com.atlassian.jira.crowd.embedded.ofbiz.UserOrGroupCache.buildCacheIfRequired(UserOrGroupCache.java:118)
at com.atlassian.jira.crowd.embedded.ofbiz.UserOrGroupCache$1.create(UserOrGroupCache.java:42)
at com.atlassian.jira.crowd.embedded.ofbiz.UserOrGroupCache$1.create(UserOrGroupCache.java:38)
at com.atlassian.util.concurrent.ResettableLazyReference$InternalReference.create(ResettableLazyReference.java:182)
at com.atlassian.util.concurrent.LazyReference$Sync.run(LazyReference.java:325)
at com.atlassian.util.concurrent.LazyReference.getInterruptibly(LazyReference.java:143)
at com.atlassian.util.concurrent.LazyReference.get(LazyReference.java:112)
at com.atlassian.util.concurrent.ResettableLazyReference.get(ResettableLazyReference.java:92)
at com.atlassian.jira.crowd.embedded.ofbiz.UserOrGroupCache.getCache(UserOrGroupCache.java:50)
at com.atlassian.jira.crowd.embedded.ofbiz.UserOrGroupCache.refresh(UserOrGroupCache.java:61)
at com.atlassian.jira.crowd.embedded.ofbiz.OfBizGroupDao.flushCache(OfBizGroupDao.java:307)
at com.atlassian.jira.crowd.embedded.ofbiz.OfBizCacheFlushingManager$OfBizCacheFlushingManagerListener.flushAllCaches(OfBizCacheFlushingManager.java:106)
at com.atlassian.jira.crowd.embedded.ofbiz.OfBizCacheFlushingManager$OfBizCacheFlushingManagerListener.onEvent(OfBizCacheFlushingManager.java:90)
at com.onresolve.scriptrunner.canned.jira.admin.ChangeSharedEntityOwnership.doScript(ChangeSharedEntityOwnership.groovy:313)
at com.onresolve.scriptrunner.canned.jira.admin.ChangeSharedEntityOwnership$doScript$1.callCurrent(Unknown Source)
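
When a thread dump contains hundreds of threads, it can help to filter for the ones that match the trace above. The sketch below is an illustration under assumptions: the helper names split_threads and threads_rebuilding_cache are hypothetical, the two marker frames are taken from the trace above, and each thread is assumed to start with a header line beginning with a double quote, as in standard JVM thread dumps.

```python
# Frames from the stack trace above that together suggest a forced
# user/group cache rebuild whose entries are being replicated to other nodes.
MARKER_FRAMES = (
    "UserOrGroupCache.buildCacheForced",
    "LocalQCacheManager.addToAllQueues",
)


def split_threads(dump_text):
    """Yield (thread_name, stack_text) pairs from a thread dump.

    Assumes each thread begins with a header line starting with a double
    quote, e.g. "https-jsse-nio-8080-exec-3" #300 daemon ...
    """
    name, body = None, []
    for line in dump_text.splitlines():
        if line.startswith('"'):
            if name is not None:
                yield name, "\n".join(body)
            # The thread name is the text between the first pair of quotes.
            name, body = line.split('"')[1], []
        elif name is not None:
            body.append(line.strip())
    if name is not None:
        yield name, "\n".join(body)


def threads_rebuilding_cache(dump_text):
    """Return names of threads whose stack contains every marker frame."""
    return [
        name
        for name, stack in split_threads(dump_text)
        if all(marker in stack for marker in MARKER_FRAMES)
    ]
```

If the returned threads also include a frame from a script or plugin near the bottom of the stack (ChangeSharedEntityOwnership in the trace above), that frame points at what triggered the flush.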


Diagnosis

Environment

  • This can affect Data Center instances of any size, but the larger the instance, the longer it takes to recover on its own.

Diagnostic Steps

We found that something was clearing the cache. This cache is where user information is stored, and until it is rebuilt, all users are unable to log in.

The only place we have seen this used (so far) is ScriptRunner's built-in "Change dashboard or filter ownership" script, which any JIRA admin can run.

Workaround

  • If you use ScriptRunner's "Change dashboard or filter ownership" script, run it only when few users need to log in, such as after hours or on the weekend.
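
One way to enforce the workaround is to check the time before any cache-flushing job is started. The sketch below is an assumption-laden example, not part of JIRA or ScriptRunner: the function name is_off_hours is hypothetical, and the 08:00 to 18:00, Monday to Friday business hours are placeholders to adjust for the instance's real usage pattern.

```python
from datetime import datetime, time

# Assumed business hours; replace with the instance's actual busy period.
BUSINESS_START = time(8, 0)
BUSINESS_END = time(18, 0)


def is_off_hours(moment):
    """Return True on weekends, or outside business hours on weekdays."""
    if moment.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        return True
    return not (BUSINESS_START <= moment.time() < BUSINESS_END)
```

A wrapper around the job could call is_off_hours(datetime.now()) and refuse to continue when it returns False.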



Last modified on May 10, 2019
