User-login JIRA stats logs
Platform notice: Server and Data Center only. This article only applies to Atlassian products on the Server and Data Center platforms.
Support for Server* products will end after February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.
*Except Fisheye and Crucible
Platform notice: Server and Data Center only. This article only applies to Atlassian products on the Server and Data Center platforms.
Support for Server* products will end after February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.
*Except Fisheye and Crucible
Intro
In Jira 8.22 (backported to 8.20.8) we have provided some bug-fixes and performance improvements to the user login process and closed following public issues:
JRASERVER-70690
BUG-FIX
- JRASERVER-70690Getting issue details... STATUS
We’re adding “unique” constraint to idx_mem_dir_parent_child
index of cwd_membership
table. This will improve data consistency and eliminate rare user login issues. Jira will perform this work automatically during upgrade, just keep in mind that it might take some additional time.
JRASERVER-71483
BUG-FIX
- JRASERVER-71483Getting issue details... STATUS
User login was occasionally disrupted by inconsistent state of caches across the cluster. We’ve made the process independent from the cache which will prevent any issues with logging in to the instances.
JRASERVER-70468 & JSWSERVER-20844
PERF-FIX
- JRASERVER-70468Getting issue details... STATUS - JSWSERVER-20844Getting issue details... STATUS
This release comes with performance improvements mainly targeting bots frequently logging in to Jira via basic authentication.
Jira is now delaying storing user login counters and timestamps. Because of that the performance of instances will improve but the latest information might be not available immediately. The information displayed is eventually consistent with maximum delay of 30 seconds.
Authentications are now cached inside Jira for 15 mins or until first unsuccessful attempt. This will improve the performance and reduce the load on authenticating external directories.
Performance results
Test setup:
8 nodes
LDAP with 50k users
node generating REST traffic with basic auth via jmeter
test running ~30min
Before (Jira 8.20) | Fixed (Jira 8.22.0) | |
---|---|---|
number of user login actions in during the test | 975_394 (1M) | 16_045_878 (16M) |
user login average time | 282 milliseconds | 17 milliseconds |
user login 99p time | 1_511 milliseconds | 38 milliseconds |
user login max time | 10_521 milliseconds | 5_082 milliseconds |
DB usage | high | low |
Remote dir communication | high | low |
Node2node communication | high | low |
Jira Stats
During this work we have added several Jira Stats related to user login which can be used to monitor user-login performance.
[JIRA-STATS] [LOGIN-STATS]
In this stats we can see which can compare local authentication requests to remote authentication request.
- settings - configuration parameters
- topUsers - top users requesting authentication with counters, here user big_mac performed 9_648 authentications on this node (since the node started, ie. duration=PT2H53M18.768S)
- authCacheGetOrLoadInMillis
- "count": 3205 - number of authentications (local or remote)
- "avg": 21 - average time to authenticate in milliseconds
- authSuccessfulInMillis
- "count": 69 - number of remote successful authentications
- "avg": 904 - average time to successfully authenticate (remote) in milliseconds
- authFailedInMillis
- "count": 3205 - number of remote failed authentications
- "avg": 64 - average time authenticate with failure (remote) in milliseconds
- authCacheInvalidation: 42 - number of cache invalidations per user (key); cache can be invalidated on following events: user log-out, user credential updated, user renamed, user profile updated; this should happen rarely comparing to the total number of authentications (authCacheInvalidation << authCacheGetOrLoadInMillis.count)
Login-stats example:
[JIRA-STATS] [LOGIN-STATS] total stats: duration=PT2H53M18.768S, statsOverhead=n/a, data=
{
"settings": {
"legacyMode": false,
"authCacheExpireMin": 15,
"authCacheMaxSize": 10000,
"passwordEncoder": "atlassian-sha1"
},
"topUsers": {
"big_mac": 9648,
"s_wiseman": 3742,
"": 3137,
"backbone_sync_bot": 1606,
"bruce_lee": 26,
"serv_sef_rep_bot": 14,
"server_jirasync_bot": 5,
"dc_jira_bot": 4
},
"authSuccessfulInMillis": {
"count": 69,
"min": 177,
"max": 16880,
"sum": 62425,
"avg": 904,
"distributionCounter": {
"1": 0,
"10": 0,
"100": 0,
"1000": 65,
"5000": 2,
"9223372036854775807": 2
}
},
"authFailedInMillis": {
"count": 3205,
"min": 15,
"max": 1536,
"sum": 205535,
"avg": 64,
"distributionCounter": {
"1": 0,
"10": 0,
"100": 2971,
"1000": 228,
"5000": 6
}
},
"authCacheGetOrLoadInMillis": {
"count": 18223,
"min": 0,
"max": 16881,
"sum": 389061,
"avg": 21,
"distributionCounter": {
"1": 14783,
"10": 2,
"100": 2957,
"1000": 456,
"5000": 19,
"9223372036854775807": 6
}
},
"authCacheInvalidation": 42,
"encodePasswordInMillis": {
"count": 69,
"min": 0,
"max": 7,
"sum": 7,
"avg": 0,
"distributionCounter": {
"0": 68,
"1": 0,
"5": 0,
"9223372036854775807": 1
}
},
"validatePasswordInMillis": {
"count": 14933,
"min": 0,
"max": 10,
"sum": 21,
"avg": 0,
"distributionCounter": {
"0": 14930,
"1": 0,
"5": 1,
"9223372036854775807": 2
}
}
}
[JIRA-STATS] [LOGIN-STORE-STATS
]
These are stats which describe the performance of the delayed login store update, i.e. user login counters, timestamps.
Login store data:
"loginInfo":
{
"failedLoginCount": 14959,
"loginCount": 643073,
"lastFailedLoginTime": "2022-01-17T12:37:05.662+0000",
"previousLoginTime": "2022-01-17T13:31:52.253+0000"
}
Login-store-stats example:
[JIRA-STATS] [LOGIN-STORE-STATS] total stats: duration=PT3H33M53.089S, statsOverhead=n/a, data=
{
"settings": {
"legacyMode": false,
"flushIntervalInSeconds": 30
},
"recordLoginAttemptSuccessfulInMillis": {
"count": 965,
"min": 0,
"max": 295,
"sum": 3066,
"avg": 3,
"distributionCounter": {
"1": 877,
"10": 0,
"100": 87,
"1000": 1,
"5000": 0
}
},
"recordLoginAttemptFailedInMillis": {
"count": 2,
"min": 25,
"max": 26,
"sum": 51,
"avg": 25,
"distributionCounter": {
"1": 0,
"10": 0,
"100": 2,
"1000": 0,
"5000": 0
}
},
"updateLastLoginTimeInMillis": {
"count": 1349,
"min": 0,
"max": 151,
"sum": 14750,
"avg": 10,
"distributionCounter": {
"1": 827,
"10": 1,
"100": 520,
"1000": 1,
"5000": 0
}
},
"resetFailedLoginCountInMillis": {
"count": 0,
"min": 0,
"max": 0,
"sum": 0,
"avg": 0,
"distributionCounter": {
"1": 0,
"10": 0,
"100": 0,
"1000": 0,
"5000": 0
}
},
"getOrLoadLastKnownLoginInfoInMillis": {
"count": 1932,
"min": 0,
"max": 5,
"sum": 89,
"avg": 0,
"distributionCounter": {
"1": 1929,
"10": 3,
"100": 0,
"1000": 0,
"5000": 0
}
},
"delegateRecordLoginAttemptSuccessfulInMillis": {
"count": 88,
"min": 29,
"max": 295,
"sum": 3062,
"avg": 34,
"distributionCounter": {
"1": 0,
"10": 0,
"100": 87,
"1000": 1,
"5000": 0
}
},
"delegateRecordLoginAttemptFailedInMillis": {
"count": 2,
"min": 25,
"max": 26,
"sum": 51,
"avg": 25,
"distributionCounter": {
"1": 0,
"10": 0,
"100": 2,
"1000": 0,
"5000": 0
}
},
"delegateUpdateLastLoginTimeInMillis": {
"count": 521,
"min": 19,
"max": 149,
"sum": 14692,
"avg": 28,
"distributionCounter": {
"1": 0,
"10": 0,
"100": 520,
"1000": 1,
"5000": 0
}
},
"delegateResetFailedLoginCountInMillis": {
"count": 0,
"min": 0,
"max": 0,
"sum": 0,
"avg": 0,
"distributionCounter": {
"1": 0,
"10": 0,
"100": 0,
"1000": 0,
"5000": 0
}
},
"invalidateLastKnownLoginInfoCache": 940,
"flushDataToDBInMillis": {
"count": 422,
"min": 0,
"max": 248,
"sum": 8939,
"avg": 21,
"distributionCounter": {
"1": 250,
"10": 0,
"100": 154,
"1000": 18,
"5000": 0
}
},
"flushUserAttemptsInMillis": {
"count": 2,
"min": 0,
"max": 0,
"sum": 0,
"avg": 0,
"distributionCounter": {
"1": 2,
"10": 0,
"100": 0,
"1000": 0,
"5000": 0
}
},
"flushRecordedSuccessfulLoginAttemptsMillis": {
"count": 90,
"min": 0,
"max": 37,
"sum": 933,
"avg": 10,
"distributionCounter": {
"1": 59,
"10": 0,
"100": 31,
"1000": 0,
"5000": 0
}
},
"flushLastLoginTimesMillis": {
"count": 519,
"min": 0,
"max": 217,
"sum": 7914,
"avg": 15,
"distributionCounter": {
"1": 222,
"10": 0,
"100": 296,
"1000": 1,
"5000": 0
}
}
}
[JIRA-STATS] [GROUP-DAO-STATS]
Stats related to refreshing the user and group cache during user-login.
The interesting part of this stats in the context of - JRASERVER-71483Getting issue details... STATUS is that when we are unable to find a request group in the group cache we assume the cache can be corrupted and reload this group from cache.
- findByNameOrNullCacheHit - counter of groups we have found in cache
- findByNameOrNullCacheMiss - counter of groups we did not find in cache
- findByNameOrNullDBHitInMilliseconds - timers and counter of group we did not find in cache but found in DB (which fixed the cache)
- findByNameOrNullDBMissInMilliseconds - timers and counter of group we did not find in cache and this group was not found in DB
[JIRA-STATS] [GROUP-DAO-STATS] total stats: duration=PT3H5.678S, statsOverhead=n/a, data=
{
"eager": false,
"findByNameOrNullSkipRefreshOnNull": false,
"findByNameOrNull": 2680342,
"findByNameOrNullCacheHit": 2671125,
"findByNameOrNullCacheMiss": 0,
"findByNameOrNullDBHitInMilliseconds": {
"count": 0,
"min": 0,
"max": 0,
"sum": 0,
"avg": 0,
"distributionCounter": {
"1": 0,
"10": 0,
"50": 0,
"100": 0,
"500": 0,
"1000": 0,
"5000": 0,
"10000": 0
}
},
"findByNameOrNullDBMissInMilliseconds": {
"count": 0,
"min": 0,
"max": 0,
"sum": 0,
"avg": 0,
"distributionCounter": {
"1": 0,
"10": 0,
"50": 0,
"100": 0,
"500": 0,
"1000": 0,
"5000": 0,
"10000": 0
}
},
"findGroupsGenericValueInMilliseconds": {
"count": 0,
"min": 0,
"max": 0,
"sum": 0,
"avg": 0,
"distributionCounter": {
"1": 0,
"10": 0,
"50": 0,
"100": 0,
"500": 0,
"1000": 0,
"5000": 0,
"10000": 0
}
},
"groupAddFailed": 0,
"groupAddFailedUpdateSuccessful": 0,
"groupAddFailedUpdateFailed": 0
}