Jira notifications sent with a long delay due group filter subscriptions or high number of private filter subscriptions
Platform notice: Server and Data Center only. This article only applies to Atlassian products on the Server and Data Center platforms.
Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.
*Except Fisheye and Crucible
Symptoms
Any type of notification emails from Jira (batched notifications, non-batched notifications, customer notifications) are emptied from the Mail Queue with a very long delay.
Diagnosis
Enable logging and debugging under Outgoing Mail in the page ⚙ > Logging and profiling. Check if you see a huge number of mail subscription emails being sent and queued in the
atlassian-jira-outgoing-mail.log
file. In the example below, we can see that more than a hundred filter subscription emails are sent every 5 minutes:grep '2021-01-06 12:35' atlassian-jira-outgoing-mail.log | grep -c 'Mail Queue Service Sending: com.atlassian.jira.mail.Subscription' 120 grep '2021-01-06 12:40' atlassian-jira-outgoing-mail.log | grep -c 'Mail Queue Service Sending: com.atlassian.jira.mail.Subscription' 325 grep '2021-01-06 12:45' atlassian-jira-outgoing-mail.log | grep -c 'Mail Queue Service Sending: com.atlassian.jira.mail.Subscription' 209
Enable the package com.atlassian.jira.service in the page ⚙ > System > Logging and Profiling. Check in the file
atlassian-jira.log
if the Mail Queue Service if scheduled every 1 min as it should. If it's not scheduled every 1 min but at random intervals, then it is an indication that each execution of this service might be taking a long time to complete:2021-01-04 11:58:00,025+0100 Caesium-1-1 DEBUG anonymous Mail Queue Service [c.a.j.s.services.mail.MailQueueService] Attempting to run mail queue service 2021-01-04 12:23:00,000+0100 Caesium-1-4 DEBUG anonymous Mail Queue Service [c.a.j.s.services.mail.MailQueueService] Attempting to run mail queue service 2021-01-04 12:24:00,055+0100 Caesium-1-3 DEBUG anonymous Mail Queue Service [c.a.j.s.services.mail.MailQueueService] Attempting to run mail queue service 2021-01-04 12:45:03,432+0100 Caesium-1-4 DEBUG anonymous Mail Queue Service [c.a.j.s.services.mail.MailQueueService] Attempting to run mail queue service 2021-01-04 13:08:00,071+0100 Caesium-1-1 DEBUG anonymous Mail Queue Service [c.a.j.s.services.mail.MailQueueService] Attempting to run mail queue service 2021-01-04 13:29:00,003+0100 Caesium-1-3 DEBUG anonymous Mail Queue Service [c.a.j.s.services.mail.MailQueueService] Attempting to run mail queue service
When running the following SQL query, you might see 2 rows, 1 with the "S" status and one with the "Already running" status, indicating that the Mail Queue Service is taking more than 1 minute to complete, preventing it from being scheduled every 1 minute:
select * from rundetails where job_id = 'com.atlassian.jira.service.JiraService:10000'; id | job_id | start_time | run_duration | run_outcome | info_message ---------+----------------------------------------------+------------------------+--------------+-------------+-------------- 9030105 | com.atlassian.jira.service.JiraService:10000 | 2021-01-04 11:57:00+02 | 1 | S | 9034027 | com.atlassian.jira.service.JiraService:10000 | 2019-01-04 12:08:00+02 | 1 | A | Already running
- Taking thread dumps while the problem is happening will reveal that:
3 of the 4 Caesium threads (responsible to run scheduled jobs like the mail queue service) are waiting and doing nothing, while the 4 thread is not mentioned anywhere:
"Caesium-1-4" daemon prio=5 tid=0x0000000000000120 nid=0 waiting on condition java.lang.Thread.State: TIMED_WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x0000000063956614> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) ... "Caesium-1-2" daemon prio=5 tid=0x000000000000011e nid=0 waiting on condition java.lang.Thread.State: TIMED_WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x0000000063956614> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) ... "Caesium-1-1" daemon prio=5 tid=0x000000000000011d nid=0 waiting on condition java.lang.Thread.State: TIMED_WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x0000000063956614> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
in reality, the 4th Caesium thread is running, but it is busy processing filter subscription emails. You should see a long running thread similar to the one from either example shown below:
example 1
"Sending mailitem com.atlassian.jira.mail.SubscriptionSingleRecepientMailQueueItem owner: 'username(username)'" daemon prio=5 tid=0x000000000000011f nid=0 runnable java.lang.Thread.State: RUNNABLE at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) at java.net.SocketInputStream.read(SocketInputStream.java:171) at java.net.SocketInputStream.read(SocketInputStream.java:141) at com.mysql.jdbc.util.ReadAheadInputStream.fill(ReadAheadInputStream.java:101) at com.mysql.jdbc.util.ReadAheadInputStream.readFromUnderlyingStreamIfNecessary(ReadAheadInputStream.java:144) at com.mysql.jdbc.util.ReadAheadInputStream.read(ReadAheadInputStream.java:174) - locked <0x0000000008366fcd> (a com.mysql.jdbc.util.ReadAheadInputStream) at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:3011) ... at com.atlassian.query.clause.OrClause.accept(OrClause.java:28) at com.atlassian.jira.jql.query.ContextAwareQueryVisitor.visit(ContextAwareQueryVisitor.java:57) at com.atlassian.jira.jql.query.ContextAwareQueryVisitor.visit(ContextAwareQueryVisitor.java:27) at com.atlassian.query.clause.AndClause.accept(AndClause.java:28) at com.atlassian.jira.jql.query.QueryVisitor.createQuery(QueryVisitor.java:51) at com.atlassian.jira.jql.query.DefaultLuceneQueryBuilder.createLuceneQuery(DefaultLuceneQueryBuilder.java:29) at com.atlassian.jira.issue.search.providers.LuceneSearchProvider.createLuceneQuery(LuceneSearchProvider.java:331) at com.atlassian.jira.issue.search.providers.LuceneSearchProvider.getHits(LuceneSearchProvider.java:216) at com.atlassian.jira.issue.search.providers.LuceneSearchProvider.search(LuceneSearchProvider.java:362) at com.atlassian.jira.issue.search.providers.LuceneSearchProvider.search(LuceneSearchProvider.java:135) at com.atlassian.jira.issue.search.providers.LuceneSearchProvider.search(LuceneSearchProvider.java:140) at com.atlassian.jira.bc.issue.search.DefaultSearchService.search(DefaultSearchService.java:118) at com.atlassian.jira.mail.SubscriptionSingleRecepientMailQueueItem.getContextParams(SubscriptionSingleRecepientMailQueueItem.java:242) at com.atlassian.jira.mail.SubscriptionSingleRecepientMailQueueItem.sendSearchRequestEmail(SubscriptionSingleRecepientMailQueueItem.java:118) at com.atlassian.jira.mail.SubscriptionSingleRecepientMailQueueItem.send(SubscriptionSingleRecepientMailQueueItem.java:103) at com.atlassian.mail.queue.MailQueueImpl.sendBufferUnderLock(MailQueueImpl.java:103) at com.atlassian.mail.queue.MailQueueImpl.sendBuffer(MailQueueImpl.java:56) at com.atlassian.jira.mail.JiraMailQueue$1.apply(JiraMailQueue.java:51) at com.atlassian.jira.mail.JiraMailQueue$1.apply(JiraMailQueue.java:48) at com.atlassian.jira.util.velocity.DefaultVelocityRequestContextFactory.runWithStaticBaseUrl(DefaultVelocityRequestContextFactory.java:110) at com.atlassian.jira.util.DefaultBaseUrl.runWithStaticBaseUrl(DefaultBaseUrl.java:56) at com.atlassian.jira.mail.JiraMailQueue.sendBuffer(JiraMailQueue.java:48) at com.atlassian.jira.service.services.mail.MailQueueService.run(MailQueueService.java:21) at com.atlassian.jira.service.JiraServiceContainerImpl.run(JiraServiceContainerImpl.java:68) at com.atlassian.jira.service.ServiceRunner.runService(ServiceRunner.java:62) at com.atlassian.jira.service.ServiceRunner.runServiceId(ServiceRunner.java:44) at com.atlassian.jira.service.ServiceRunner.runJob(ServiceRunner.java:32) at com.atlassian.scheduler.core.JobLauncher.runJob(JobLauncher.java:134) at com.atlassian.scheduler.core.JobLauncher.launchAndBuildResponse(JobLauncher.java:106) at com.atlassian.scheduler.core.JobLauncher.launch(JobLauncher.java:90) at com.atlassian.scheduler.caesium.impl.CaesiumSchedulerService.launchJob(CaesiumSchedulerService.java:435) at com.atlassian.scheduler.caesium.impl.CaesiumSchedulerService.executeLocalJob(CaesiumSchedulerService.java:402) at com.atlassian.scheduler.caesium.impl.CaesiumSchedulerService.executeQueuedJob(CaesiumSchedulerService.java:380) at com.atlassian.scheduler.caesium.impl.CaesiumSchedulerService$$Lambda$1889/1640492218.accept(Unknown Source) at com.atlassian.scheduler.caesium.impl.SchedulerQueueWorker.executeJob(SchedulerQueueWorker.java:66) at com.atlassian.scheduler.caesium.impl.SchedulerQueueWorker.executeNextJob(SchedulerQueueWorker.java:60) at com.atlassian.scheduler.caesium.impl.SchedulerQueueWorker.run(SchedulerQueueWorker.java:35) at java.lang.Thread.run(Thread.java:748)
example 2
"Sending mailitem com.atlassian.jira.mail.SubscriptionMailQueueItem owner: 'username(username)'" #110 prio=4 os_prio=-1 tid=0x0000000085c7e800 nid=0x1640 runnable [0x000000008fb1e000] java.lang.Thread.State: RUNNABLE at org.apache.lucene.index.SegmentTermDocs.read(SegmentTermDocs.java:157) at org.apache.lucene.search.MultiTermQueryWrapperFilter.getDocIdSet(MultiTermQueryWrapperFilter.java:122) at org.apache.lucene.search.ConstantScoreQuery$ConstantWeight.scorer(ConstantScoreQuery.java:139) at org.apache.lucene.search.BooleanQuery$BooleanWeight.scorer(BooleanQuery.java:298) at org.apache.lucene.search.IndexSearcher.searchWithFilter(IndexSearcher.java:542) at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:532) at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:463) at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:433) at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:356) at com.atlassian.jira.index.DelegateSearcher.search(DelegateSearcher.java:165) at com.atlassian.jira.issue.search.providers.LuceneSearchProvider.runSearch(LuceneSearchProvider.java:566) at com.atlassian.jira.issue.search.providers.LuceneSearchProvider.getHits(LuceneSearchProvider.java:294) at com.atlassian.jira.issue.search.providers.LuceneSearchProvider.search(LuceneSearchProvider.java:421) at com.atlassian.jira.issue.search.providers.LuceneSearchProvider.search(LuceneSearchProvider.java:413) at com.atlassian.jira.issue.search.providers.LuceneSearchProvider.search(LuceneSearchProvider.java:104) at com.atlassian.jira.issue.search.providers.LuceneSearchProvider.search(LuceneSearchProvider.java:92) at com.atlassian.jira.mail.SubscriptionMailQueueItem.getContextParams(SubscriptionMailQueueItem.java:336) at com.atlassian.jira.mail.SubscriptionMailQueueItem.sendSearchRequestEmail(SubscriptionMailQueueItem.java:174) at com.atlassian.jira.mail.SubscriptionMailQueueItem.send(SubscriptionMailQueueItem.java:126) at com.atlassian.mail.queue.MailQueueImpl.sendBuffer(MailQueueImpl.java:66)
Check which one of the 2 scenario matches your situation:
Scenario 1: your Jira instance contains filter subscriptions associated to groups of high numbers of users
Run the query checking the presence of group filter subscriptions
select count (*) from filtersubscription where groupname != '';
- If this query returns any result, look for each group (from the group_name column) in Jira > ⚙ > User Management > Groups, and check if any of these group contains a high number of users (for example, at least 1700)
- If you find any group containing a high number of users then the resolution 1 from this KB might help resolve the issue
Scenario 2: your Jira instance contains a high number of private filter subscriptions
Run the following SQL to check the number of private filter subscriptions in your Jira instance::
select count (*) from filtersubscription where groupname = '';
If this number is high (for example, at least few hundreds), then the resolution 2 from this KB might help resolve this issue
Cause
The Mail Queue Service which is responsible to send Jira and Service Desk notifications is stuck trying to send a huge number of filter subscription emails. As a result, the notifications are sent with a long delay (after all the subscription emails are sent).
There are 2 possible scenarios which can cause this issue:
- Scenario 1: your Jira instance contains filter subscriptions associated to groups of high numbers of users
- The root cause of this scenario is described in the bug JRASERVER-61543 - Outgoing Mail Stopped Working due to Large Group Filter Subscription
- Scenario 2: your Jira instance contains a high number of private filter subscriptions configured with a high frequency (every few minutes)
Resolution
Resolution 1 (for Scenario 1)
- Set Group Filter Subscription to use a group that has a smaller user count <200 users.
- Or create a mailing list in Mail Server to be configured for the Filter Subscription. Create a dummy user in JIRA to be assign with the mailing list and the user will be the one subscribing to the Filter Subscription. Users who should be receiving mail subscription needs to be part of the mailing list.
Resolution 2 (for Scenario 2)
- Modify the private filter subscriptions by reducing their frequency (for example, from every 5 min to once per day):
- From the UI:
- If you are using a Jira version lower than 8.11.0, you will need to ask the users who own these private filter subscriptions do edit them
- If you are using Jira 8.11.0 or higher, the Jira system admin user can modify these private filter subscription via the page ⚙ > System > Filters
- From the database:
- Schedule a maintenance window
- Stop Jira
- Backup your Jira Database, so that you can revert any change made to it if necessary
Run the following SQL query:
update clusteredjob set cron_expression = '0 0 1 ? * *' where job_id like 'com.atlassian.jira.issue.subscription.DefaultSubscriptionManager%';
- Start Jira
- From the UI:
- Modify the private filter subscriptions by reducing their frequency (for example, from every 5 min to once per day):