Inconsistency in group membership and user status on one or multiple nodes in Jira Data Center

Platform Notice: Data Center Only - This article only applies to Atlassian products on the Data Center platform.

Note that this KB was created for the Data Center version of the product. Data Center KBs for non-Data-Center-specific features may also work for Server versions of the product; however, they have not been tested. Support for Server* products ended on February 15, 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.

*Except Fisheye and Crucible

This KB applies if you are running Jira Data Center on a version lower than 8.10. From 8.10 onwards, stale nodes are removed automatically, as described in JRASERVER-42916.


Problem

In Jira Data Center, some users randomly lose group membership or are marked Inactive in User Management, which in turn causes login failures for those users on the impacted node(s). Symptoms include:

  • User group membership in the UI of one or more nodes differs from the data in the DB. 
  • User status (active or inactive) in the UI differs from the DB. 

Diagnosis

Check the user and group membership in question in the database to confirm there is a data inconsistency:

  1. Check the details of an affected user in the user table. Make a note of the “active” column: 1 stands for active, 0 for inactive.

    select * from cwd_user where lower_user_name = '<lower_user_name>'; 
  2. Check the group memberships of the affected user in the membership table. Make a note of the “parent_name” column to see the groups associated with the user. 

    select * from cwd_membership where lower_child_name = '<lower_user_name>';

    If anything appears different in the results from the database table compared to what you see in JIRA's user interface, then you may be affected by this issue.

  3. Check the user details in User Management on all the nodes. If the mismatch appears on all the nodes, apply the fix explained in LDAP users and groups display unexpectedly in Jira server.

  4. If the issue is noticed only on particular nodes, we need to identify what is causing this behaviour. The steps below will help identify whether cache changes are being sent from another environment. 

    Add the logging below on all the nodes to get more details, and restart all nodes to apply the changes. After logging has been increased, wait for the issue to reappear.
    In the file <JIRA_INSTALL>/atlassian-jira/WEB-INF/classes/log4j.properties add the lines below:

    log4j.logger.net.sf.ehcache.distribution.RMICachePeer = DEBUG, filelog 
    log4j.additivity.net.sf.ehcache.distribution.RMICachePeer = false 
    log4j.logger.com.atlassian.cache.event.com.atlassian.jira.issue = DEBUG, filelog 
    log4j.additivity.com.atlassian.cache.event.com.atlassian.jira.issue = false 
    log4j.logger.com.atlassian.cache.event.com.atlassian.jira.config = DEBUG, filelog 
    log4j.additivity.com.atlassian.cache.event.com.atlassian.jira.config = false
  5. When the issue reappears, check the atlassian-jira.log* files for the affected group/user; there you will see RMI events for the cache.
    In the log example below, the node with IP “xx.xx.xx.xx“ sends a request to remove group “xx_xx“, which is the actual root cause of the problem.

    2020-05-07 12:18:24,960 RMI TCP Connection(12898)-xx.xx.xx.xx DEBUG [n.s.ehcache.distribution.RMICachePeer] RMICachePeer for cache com.atlassian.jira.crowd.embedded.ofbiz.OfBizInternalMembershipDao.childrenCache: remote remove received for key: MembershipKey[directoryId=10100,name=xx_xx,type=GROUP_USER] 
  6. Next, identify the IP to see if the request is coming from outside the Data Center cluster nodes. You can use the nslookup or hostname command to resolve the IP.
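As an illustration of steps 5 and 6, the sketch below extracts the source IP from a sample RMI remove event and notes how to resolve it. The log line, IP (10.0.0.5), and group name (jira-users) are hypothetical placeholders; in practice you would grep the real atlassian-jira.log* files and resolve the real IP.

```shell
# Illustrative sketch only: write one sample log line (hypothetical IP and
# group name) so the extraction can be demonstrated end to end. In practice,
# grep the real <JIRA_HOME>/log/atlassian-jira.log* files on each node.
cat > /tmp/atlassian-jira.log.sample <<'EOF'
2020-05-07 12:18:24,960 RMI TCP Connection(12898)-10.0.0.5 DEBUG [n.s.ehcache.distribution.RMICachePeer] RMICachePeer for cache com.atlassian.jira.crowd.embedded.ofbiz.OfBizInternalMembershipDao.childrenCache: remote remove received for key: MembershipKey[directoryId=10100,name=jira-users,type=GROUP_USER]
EOF

# Pull out the IP that sent each remote remove event:
grep 'remote remove received' /tmp/atlassian-jira.log.sample \
  | sed -E 's/.*Connection\([0-9]+\)-([0-9.]+) .*/\1/'
# prints 10.0.0.5

# Then resolve the IP to a hostname to check whether it belongs to the
# cluster, e.g.: nslookup 10.0.0.5
```

If the resolved host is not one of the cluster's nodes, you have found the external sender described in the Cause section below.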

Cause

The impacted node receives the group or user cache update from a node that is not part of the cluster. This typically happens when a backup of the impacted environment is restored to a lower environment: the restored cluster table still contains the source environment's node entries, so the lower environment sends cache updates to those nodes.

Resolution

The resolution must be applied on the environment that sends the cache updates to the affected node.

Remove the stale node entry from the cluster table on the environment identified during diagnosis. Please refer to Remove abandoned or offline nodes in Jira Data Center for the steps to remove the node from the cluster.
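To locate the stale entry, you can inspect the cluster table directly. A minimal sketch, assuming a PostgreSQL backend; the connection details are placeholders for your own database:

```shell
# Placeholder connection details: substitute your own DB host, user, and
# database name. Any node listed here that is not an actual member of this
# environment's cluster (e.g. a production node left over from a restored
# backup) is the stale entry to remove.
psql -h <db_host> -U <jira_db_user> -d <jira_db_name> \
  -c "select * from clusternode;"
```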


Last modified on Jul 18, 2022
