User-login JIRA stats logs
Platform Notice: Data Center - This article applies to Atlassian products on the Data Center platform.
Note that this knowledge base article was created for the Data Center version of the product. Data Center knowledge base articles for non-Data Center-specific features may also work for Server versions of the product, however they have not been tested. Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.
*Except Fisheye and Crucible
Summary
In Jira 8.22 (backported to 8.20.8) we have provided some bug fixes and performance improvements to the user login process and closed the following public issues:
Bug-Fix
We’re adding “unique” constraint to idx_mem_dir_parent_child
index of cwd_membership
table. This will improve data consistency and eliminate rare user login issues. Jira will perform this work automatically during the upgrade, just keep in mind that it might take some additional time.
Bug-Fix
User login was occasionally disrupted by the inconsistent state of caches across the cluster. We’ve made the process independent from the cache which will prevent any issues with logging in to the instances.
JRASERVER-70468 - As a Jira Administrator I want to configure user accounts for integration jobs with low login overhead
JSWSERVER-20844 - Severe performance degradation for user mentions and user login action due to contention in IndexedUserDao
Performance-Fix
This release comes with performance improvements mainly targeting bots frequently logging in to Jira via basic authentication.
Jira is now delaying storing user login counters and timestamps. Because of that the performance of instances will improve but the latest information might be not available immediately. The information displayed is eventually consistent with a maximum delay of 30 seconds.
Authentications are now cached inside Jira for 15 minutes or until first unsuccessful attempt. This will improve the performance and reduce the load on authenticating external directories.
Performance results
Test setup:
8 nodes
LDAP with 50k users
node generating REST traffic with basic auth via jmeter
test running ~30min
Before (Jira 8.20) | Fixed (Jira 8.22.0) | |
---|---|---|
number of user login actions in during the test | 975_394 (1M) | 16_045_878 (16M) |
user login average time | 282 milliseconds | 17 milliseconds |
user login 99p time | 1_511 milliseconds | 38 milliseconds |
user login max time | 10_521 milliseconds | 5_082 milliseconds |
DB usage | high | low |
Remote dir communication | high | low |
Node2node communication | high | low |
Jira Stats
During this work, we have added several Jira Stats related to user login which can be used to monitor user-login performance.
[JIRA-STATS] [LOGIN-STATS]
In these stats, we can see which can compare local authentication requests to remote authentication requests.
- settings - configuration parameters
- topUsers - top users requesting authentication with counters, here user big_mac performed 9_648 authentications on this node (since the node started, ie. duration=PT2H53M18.768S)
- authCacheGetOrLoadInMillis
- "count": 3205 - number of authentications (local or remote)
- "avg": 21 - average time to authenticate in milliseconds
- authSuccessfulInMillis
- "count": 69 - number of remote successful authentications
- "avg": 904 - average time to successfully authenticate (remote) in milliseconds
- authFailedInMillis
- "count": 3205 - number of remote failed authentications
- "avg": 64 - average time authenticate with failure (remote) in milliseconds
- authCacheInvalidation: 42 - number of cache invalidations per user (key); cache can be invalidated on following events: user log-out, user credential updated, user renamed, user profile updated; this should happen rarely compared to the total number of authentications (authCacheInvalidation << authCacheGetOrLoadInMillis.count)
Login-stats example:
[JIRA-STATS] [LOGIN-STATS] total stats: duration=PT2H53M18.768S, statsOverhead=n/a, data=
{
"settings": {
"legacyMode": false,
"authCacheExpireMin": 15,
"authCacheMaxSize": 10000,
"passwordEncoder": "atlassian-sha1"
},
"topUsers": {
"big_mac": 9648,
"s_wiseman": 3742,
"": 3137,
"backbone_sync_bot": 1606,
"bruce_lee": 26,
"serv_sef_rep_bot": 14,
"server_jirasync_bot": 5,
"dc_jira_bot": 4
},
"authSuccessfulInMillis": {
"count": 69,
"min": 177,
"max": 16880,
"sum": 62425,
"avg": 904,
"distributionCounter": {
"1": 0,
"10": 0,
"100": 0,
"1000": 65,
"5000": 2,
"9223372036854775807": 2
}
},
"authFailedInMillis": {
"count": 3205,
"min": 15,
"max": 1536,
"sum": 205535,
"avg": 64,
"distributionCounter": {
"1": 0,
"10": 0,
"100": 2971,
"1000": 228,
"5000": 6
}
},
"authCacheGetOrLoadInMillis": {
"count": 18223,
"min": 0,
"max": 16881,
"sum": 389061,
"avg": 21,
"distributionCounter": {
"1": 14783,
"10": 2,
"100": 2957,
"1000": 456,
"5000": 19,
"9223372036854775807": 6
}
},
"authCacheInvalidation": 42,
"encodePasswordInMillis": {
"count": 69,
"min": 0,
"max": 7,
"sum": 7,
"avg": 0,
"distributionCounter": {
"0": 68,
"1": 0,
"5": 0,
"9223372036854775807": 1
}
},
"validatePasswordInMillis": {
"count": 14933,
"min": 0,
"max": 10,
"sum": 21,
"avg": 0,
"distributionCounter": {
"0": 14930,
"1": 0,
"5": 1,
"9223372036854775807": 2
}
}
}
[JIRA-STATS] [LOGIN-STORE-STATS
]
These are stats that describe the performance of the delayed login store update, i.e. user login counters, and timestamps.
Login store data:
"loginInfo":
{
"failedLoginCount": 14959,
"loginCount": 643073,
"lastFailedLoginTime": "2022-01-17T12:37:05.662+0000",
"previousLoginTime": "2022-01-17T13:31:52.253+0000"
}
Login-store-stats example:
[JIRA-STATS] [LOGIN-STORE-STATS] total stats: duration=PT3H33M53.089S, statsOverhead=n/a, data=
{
"settings": {
"legacyMode": false,
"flushIntervalInSeconds": 30
},
"recordLoginAttemptSuccessfulInMillis": {
"count": 965,
"min": 0,
"max": 295,
"sum": 3066,
"avg": 3,
"distributionCounter": {
"1": 877,
"10": 0,
"100": 87,
"1000": 1,
"5000": 0
}
},
"recordLoginAttemptFailedInMillis": {
"count": 2,
"min": 25,
"max": 26,
"sum": 51,
"avg": 25,
"distributionCounter": {
"1": 0,
"10": 0,
"100": 2,
"1000": 0,
"5000": 0
}
},
"updateLastLoginTimeInMillis": {
"count": 1349,
"min": 0,
"max": 151,
"sum": 14750,
"avg": 10,
"distributionCounter": {
"1": 827,
"10": 1,
"100": 520,
"1000": 1,
"5000": 0
}
},
"resetFailedLoginCountInMillis": {
"count": 0,
"min": 0,
"max": 0,
"sum": 0,
"avg": 0,
"distributionCounter": {
"1": 0,
"10": 0,
"100": 0,
"1000": 0,
"5000": 0
}
},
"getOrLoadLastKnownLoginInfoInMillis": {
"count": 1932,
"min": 0,
"max": 5,
"sum": 89,
"avg": 0,
"distributionCounter": {
"1": 1929,
"10": 3,
"100": 0,
"1000": 0,
"5000": 0
}
},
"delegateRecordLoginAttemptSuccessfulInMillis": {
"count": 88,
"min": 29,
"max": 295,
"sum": 3062,
"avg": 34,
"distributionCounter": {
"1": 0,
"10": 0,
"100": 87,
"1000": 1,
"5000": 0
}
},
"delegateRecordLoginAttemptFailedInMillis": {
"count": 2,
"min": 25,
"max": 26,
"sum": 51,
"avg": 25,
"distributionCounter": {
"1": 0,
"10": 0,
"100": 2,
"1000": 0,
"5000": 0
}
},
"delegateUpdateLastLoginTimeInMillis": {
"count": 521,
"min": 19,
"max": 149,
"sum": 14692,
"avg": 28,
"distributionCounter": {
"1": 0,
"10": 0,
"100": 520,
"1000": 1,
"5000": 0
}
},
"delegateResetFailedLoginCountInMillis": {
"count": 0,
"min": 0,
"max": 0,
"sum": 0,
"avg": 0,
"distributionCounter": {
"1": 0,
"10": 0,
"100": 0,
"1000": 0,
"5000": 0
}
},
"invalidateLastKnownLoginInfoCache": 940,
"flushDataToDBInMillis": {
"count": 422,
"min": 0,
"max": 248,
"sum": 8939,
"avg": 21,
"distributionCounter": {
"1": 250,
"10": 0,
"100": 154,
"1000": 18,
"5000": 0
}
},
"flushUserAttemptsInMillis": {
"count": 2,
"min": 0,
"max": 0,
"sum": 0,
"avg": 0,
"distributionCounter": {
"1": 2,
"10": 0,
"100": 0,
"1000": 0,
"5000": 0
}
},
"flushRecordedSuccessfulLoginAttemptsMillis": {
"count": 90,
"min": 0,
"max": 37,
"sum": 933,
"avg": 10,
"distributionCounter": {
"1": 59,
"10": 0,
"100": 31,
"1000": 0,
"5000": 0
}
},
"flushLastLoginTimesMillis": {
"count": 519,
"min": 0,
"max": 217,
"sum": 7914,
"avg": 15,
"distributionCounter": {
"1": 222,
"10": 0,
"100": 296,
"1000": 1,
"5000": 0
}
}
}
[JIRA-STATS] [GROUP-DAO-STATS]
Stats related to refreshing the user and group cache during user-login.
The interesting part of this stats in the context of JRASERVER-71483 - As a Jira admin I would like to recover from user cache corruption without a whole cluster restart is that when we are unable to find a request group in the group cache we assume the cache can be corrupted and reload this group from cache.
- findByNameOrNullCacheHit - counter of groups we have found in cache
- findByNameOrNullCacheMiss - counter of groups we did not find in cache
- findByNameOrNullDBHitInMilliseconds - timers and counter of group we did not find in cache but found in DB (which fixed the cache)
- findByNameOrNullDBMissInMilliseconds - timers and counter of group we did not find in cache and this group was not found in DB
[JIRA-STATS] [GROUP-DAO-STATS] total stats: duration=PT3H5.678S, statsOverhead=n/a, data=
{
"eager": false,
"findByNameOrNullSkipRefreshOnNull": false,
"findByNameOrNull": 2680342,
"findByNameOrNullCacheHit": 2671125,
"findByNameOrNullCacheMiss": 0,
"findByNameOrNullDBHitInMilliseconds": {
"count": 0,
"min": 0,
"max": 0,
"sum": 0,
"avg": 0,
"distributionCounter": {
"1": 0,
"10": 0,
"50": 0,
"100": 0,
"500": 0,
"1000": 0,
"5000": 0,
"10000": 0
}
},
"findByNameOrNullDBMissInMilliseconds": {
"count": 0,
"min": 0,
"max": 0,
"sum": 0,
"avg": 0,
"distributionCounter": {
"1": 0,
"10": 0,
"50": 0,
"100": 0,
"500": 0,
"1000": 0,
"5000": 0,
"10000": 0
}
},
"findGroupsGenericValueInMilliseconds": {
"count": 0,
"min": 0,
"max": 0,
"sum": 0,
"avg": 0,
"distributionCounter": {
"1": 0,
"10": 0,
"50": 0,
"100": 0,
"500": 0,
"1000": 0,
"5000": 0,
"10000": 0
}
},
"groupAddFailed": 0,
"groupAddFailedUpdateSuccessful": 0,
"groupAddFailedUpdateFailed": 0
}