User-login JIRA stats logs

Still need help?

The Atlassian Community is here for you.

Ask the community

Intro

In Jira 8.22 we have provided some bug-fixes and performance improvements to the user login process and closed following public issues:

JRASERVER-70690

BUG-FIX

JRASERVER-70690 - Getting issue details... STATUS

We’re adding “unique” constraint to idx_mem_dir_parent_child index of cwd_membership table. This will improve data consistency and eliminate rare user login issues. Jira will perform this work automatically during upgrade, just keep in mind that it might take some additional time.

JRASERVER-71483

BUG-FIX

JRASERVER-71483 - Getting issue details... STATUS

User login was occasionally disrupted by inconsistent state of caches across the cluster. We’ve made the process independent from the cache which will prevent any issues with logging in to the instances.

JRASERVER-70468 & JSWSERVER-20844

PERF-FIX

JRASERVER-70468 - Getting issue details... STATUS   JSWSERVER-20844 - Getting issue details... STATUS

This release comes with performance improvements mainly targeting bots frequently logging in to Jira via basic authentication.

Jira is now delaying storing user login counters and timestamps. Because of that the performance of instances will improve but the latest information might be not available immediately. The information displayed is eventually consistent with maximum delay of 30 seconds.

Authentications are now cached inside Jira for 15 mins or until first unsuccessful attempt. This will improve the performance and reduce the load on authenticating external directories.

Performance results

Test setup:

  • 8 nodes

  • LDAP with 50k users

  • node generating REST traffic with basic auth via jmeter

  • test running ~30min


Before (Jira 8.20)Fixed (Jira 8.22.0)
number of user login actions in during the test975_394 (1M)

16_045_878 (16M)

user login average time282 milliseconds17 milliseconds
user login 99p time1_511 milliseconds38 milliseconds
user login max time10_521 milliseconds5_082 milliseconds
DB usagehighlow
Remote dir communicationhighlow
Node2node communicationhighlow

Jira Stats

During this work we have added several Jira Stats related to user login which can be used to monitor user-login performance.

[JIRA-STATS] [LOGIN-STATS]

In this stats we can see which can compare local authentication requests to remote authentication request.

  • settings - configuration parameters
  • topUsers - top users requesting authentication with counters, here user big_mac performed 9_648 authentications on this node (since the node started, ie. duration=PT2H53M18.768S)
  • authCacheGetOrLoadInMillis
    • "count": 3205 - number of authentications (local or remote)
    • "avg": 21 - average time to authenticate in milliseconds
  • authSuccessfulInMillis
    • "count": 69 - number of remote successful authentications 
    • "avg": 904 - average time to successfully authenticate (remote) in milliseconds
  • authFailedInMillis
    • "count": 3205 - number of remote failed authentications 
    • "avg": 64 - average time authenticate with failure (remote) in milliseconds
  • authCacheInvalidation: 42 - number of cache invalidations per user (key); cache can be invalidated on following events: user log-out,  user credential updated, user renamed, user profile updated; this should happen rarely comparing to the total number of authentications (authCacheInvalidation << authCacheGetOrLoadInMillis.count)

Login-stats example:

[JIRA-STATS] [LOGIN-STATS] total stats: duration=PT2H53M18.768S, statsOverhead=n/a, data=
{
  "settings": {
    "legacyMode": false,
    "authCacheExpireMin": 15,
    "authCacheMaxSize": 10000,
    "passwordEncoder": "atlassian-sha1"
  },
  "topUsers": {
    "big_mac": 9648,
    "s_wiseman": 3742,
    "": 3137,
    "backbone_sync_bot": 1606,
    "bruce_lee": 26,
    "serv_sef_rep_bot": 14,
    "server_jirasync_bot": 5,
    "dc_jira_bot": 4
  },
  "authSuccessfulInMillis": {
    "count": 69,
    "min": 177,
    "max": 16880,
    "sum": 62425,
    "avg": 904,
    "distributionCounter": {
      "1": 0,
      "10": 0,
      "100": 0,
      "1000": 65,
      "5000": 2,
      "9223372036854775807": 2
    }
  },
  "authFailedInMillis": {
    "count": 3205,
    "min": 15,
    "max": 1536,
    "sum": 205535,
    "avg": 64,
    "distributionCounter": {
      "1": 0,
      "10": 0,
      "100": 2971,
      "1000": 228,
      "5000": 6
    }
  },
  "authCacheGetOrLoadInMillis": {
    "count": 18223,
    "min": 0,
    "max": 16881,
    "sum": 389061,
    "avg": 21,
    "distributionCounter": {
      "1": 14783,
      "10": 2,
      "100": 2957,
      "1000": 456,
      "5000": 19,
      "9223372036854775807": 6
    }
  },
  "authCacheInvalidation": 42,
  "encodePasswordInMillis": {
    "count": 69,
    "min": 0,
    "max": 7,
    "sum": 7,
    "avg": 0,
    "distributionCounter": {
      "0": 68,
      "1": 0,
      "5": 0,
      "9223372036854775807": 1
    }
  },
  "validatePasswordInMillis": {
    "count": 14933,
    "min": 0,
    "max": 10,
    "sum": 21,
    "avg": 0,
    "distributionCounter": {
      "0": 14930,
      "1": 0,
      "5": 1,
      "9223372036854775807": 2
    }
  }
}


[JIRA-STATS] [LOGIN-STORE-STATS]

These are stats which describe the performance of the delayed login store update, i.e. user login counters, timestamps.
Login store data:

"loginInfo":
{
  "failedLoginCount": 14959,
  "loginCount": 643073,
  "lastFailedLoginTime": "2022-01-17T12:37:05.662+0000",
  "previousLoginTime": "2022-01-17T13:31:52.253+0000"
}

Login-store-stats example:

[JIRA-STATS] [LOGIN-STORE-STATS] total stats: duration=PT3H33M53.089S, statsOverhead=n/a, data=
{
  "settings": {
    "legacyMode": false,
    "flushIntervalInSeconds": 30
  },
  "recordLoginAttemptSuccessfulInMillis": {
    "count": 965,
    "min": 0,
    "max": 295,
    "sum": 3066,
    "avg": 3,
    "distributionCounter": {
      "1": 877,
      "10": 0,
      "100": 87,
      "1000": 1,
      "5000": 0
    }
  },
  "recordLoginAttemptFailedInMillis": {
    "count": 2,
    "min": 25,
    "max": 26,
    "sum": 51,
    "avg": 25,
    "distributionCounter": {
      "1": 0,
      "10": 0,
      "100": 2,
      "1000": 0,
      "5000": 0
    }
  },
  "updateLastLoginTimeInMillis": {
    "count": 1349,
    "min": 0,
    "max": 151,
    "sum": 14750,
    "avg": 10,
    "distributionCounter": {
      "1": 827,
      "10": 1,
      "100": 520,
      "1000": 1,
      "5000": 0
    }
  },
  "resetFailedLoginCountInMillis": {
    "count": 0,
    "min": 0,
    "max": 0,
    "sum": 0,
    "avg": 0,
    "distributionCounter": {
      "1": 0,
      "10": 0,
      "100": 0,
      "1000": 0,
      "5000": 0
    }
  },
  "getOrLoadLastKnownLoginInfoInMillis": {
    "count": 1932,
    "min": 0,
    "max": 5,
    "sum": 89,
    "avg": 0,
    "distributionCounter": {
      "1": 1929,
      "10": 3,
      "100": 0,
      "1000": 0,
      "5000": 0
    }
  },
  "delegateRecordLoginAttemptSuccessfulInMillis": {
    "count": 88,
    "min": 29,
    "max": 295,
    "sum": 3062,
    "avg": 34,
    "distributionCounter": {
      "1": 0,
      "10": 0,
      "100": 87,
      "1000": 1,
      "5000": 0
    }
  },
  "delegateRecordLoginAttemptFailedInMillis": {
    "count": 2,
    "min": 25,
    "max": 26,
    "sum": 51,
    "avg": 25,
    "distributionCounter": {
      "1": 0,
      "10": 0,
      "100": 2,
      "1000": 0,
      "5000": 0
    }
  },
  "delegateUpdateLastLoginTimeInMillis": {
    "count": 521,
    "min": 19,
    "max": 149,
    "sum": 14692,
    "avg": 28,
    "distributionCounter": {
      "1": 0,
      "10": 0,
      "100": 520,
      "1000": 1,
      "5000": 0
    }
  },
  "delegateResetFailedLoginCountInMillis": {
    "count": 0,
    "min": 0,
    "max": 0,
    "sum": 0,
    "avg": 0,
    "distributionCounter": {
      "1": 0,
      "10": 0,
      "100": 0,
      "1000": 0,
      "5000": 0
    }
  },
  "invalidateLastKnownLoginInfoCache": 940,
  "flushDataToDBInMillis": {
    "count": 422,
    "min": 0,
    "max": 248,
    "sum": 8939,
    "avg": 21,
    "distributionCounter": {
      "1": 250,
      "10": 0,
      "100": 154,
      "1000": 18,
      "5000": 0
    }
  },
  "flushUserAttemptsInMillis": {
    "count": 2,
    "min": 0,
    "max": 0,
    "sum": 0,
    "avg": 0,
    "distributionCounter": {
      "1": 2,
      "10": 0,
      "100": 0,
      "1000": 0,
      "5000": 0
    }
  },
  "flushRecordedSuccessfulLoginAttemptsMillis": {
    "count": 90,
    "min": 0,
    "max": 37,
    "sum": 933,
    "avg": 10,
    "distributionCounter": {
      "1": 59,
      "10": 0,
      "100": 31,
      "1000": 0,
      "5000": 0
    }
  },
  "flushLastLoginTimesMillis": {
    "count": 519,
    "min": 0,
    "max": 217,
    "sum": 7914,
    "avg": 15,
    "distributionCounter": {
      "1": 222,
      "10": 0,
      "100": 296,
      "1000": 1,
      "5000": 0
    }
  }
}


[JIRA-STATS] [GROUP-DAO-STATS]

Stats related to refreshing the user and group cache during user-login.

The interesting part of this stats in the context of  JRASERVER-71483 - Getting issue details... STATUS  is that when we are unable to find a request group in the group cache we assume the cache can be corrupted and reload this group from cache.

  • findByNameOrNullCacheHit - counter of groups we have found in cache
  • findByNameOrNullCacheMiss - counter of groups we did not find in cache
  • findByNameOrNullDBHitInMilliseconds - timers and counter of group we did not find in cache but found in DB (which fixed the cache)
  • findByNameOrNullDBMissInMilliseconds - timers and counter of group we did not find in cache and this group was not found in DB


[JIRA-STATS] [GROUP-DAO-STATS] total stats: duration=PT3H5.678S, statsOverhead=n/a, data=
{
  "eager": false,
  "findByNameOrNullSkipRefreshOnNull": false,
  "findByNameOrNull": 2680342,
  "findByNameOrNullCacheHit": 2671125,
  "findByNameOrNullCacheMiss": 0,
  "findByNameOrNullDBHitInMilliseconds": {
    "count": 0,
    "min": 0,
    "max": 0,
    "sum": 0,
    "avg": 0,
    "distributionCounter": {
      "1": 0,
      "10": 0,
      "50": 0,
      "100": 0,
      "500": 0,
      "1000": 0,
      "5000": 0,
      "10000": 0
    }
  },
  "findByNameOrNullDBMissInMilliseconds": {
    "count": 0,
    "min": 0,
    "max": 0,
    "sum": 0,
    "avg": 0,
    "distributionCounter": {
      "1": 0,
      "10": 0,
      "50": 0,
      "100": 0,
      "500": 0,
      "1000": 0,
      "5000": 0,
      "10000": 0
    }
  },
  "findGroupsGenericValueInMilliseconds": {
    "count": 0,
    "min": 0,
    "max": 0,
    "sum": 0,
    "avg": 0,
    "distributionCounter": {
      "1": 0,
      "10": 0,
      "50": 0,
      "100": 0,
      "500": 0,
      "1000": 0,
      "5000": 0,
      "10000": 0
    }
  },
  "groupAddFailed": 0,
  "groupAddFailedUpdateSuccessful": 0,
  "groupAddFailedUpdateFailed": 0
}
Last modified on Jan 25, 2022

Was this helpful?

Yes
No
Provide feedback about this article
Powered by Confluence and Scroll Viewport.