Service Management customer notifications are sent with a long delay due to complex SLA configurations

Still need help?

The Atlassian Community is here for you.

Ask the community

Symptoms

All the customer notifications within a specific Service Management (former "Service Desk" project) are sent with random delays (sometimes several hours).

Diagnosis

  • The Service Management project is configured with at least 1 SLA which verifies either or both of the following conditions:

    • The SLA contains a high number of goals (at least hundreds of goals)
    • The SLA contains goals which are associated to JQL queries using complex functions, for example:
  • The Customer Notifications are not the only functionality that experiences delays in the Service Management project. Other functionalities such as the SLA or the Service Management automations are triggered/calculated with a long delay

  • When collecting thread dumps while the delays are happening, we might see a very long running thread (SdOffThreadEventJobRunner:thread-X) performing JQL operation for the SLA functionality:

    "SdOffThreadEventJobRunner:thread-4" #692 prio=5 os_prio=0 tid=0x00007f291c168800 nid=0xf98 runnable [0x00007f2884f9f000]
       java.lang.Thread.State: RUNNABLE
       at java.lang.Class.getEnclosingMethod0(Native Method)
       at java.lang.Class.getEnclosingMethodInfo(Class.java:1072)
       at java.lang.Class.getEnclosingClass(Class.java:1272)
       at java.lang.Class.getSimpleBinaryName(Class.java:1443)
       at java.lang.Class.getSimpleName(Class.java:1309)
       at com.riadalabs.jira.plugins.insight.services.cache.InsightCacheManagerImpl.getCache(InsightCacheManagerImpl.java:33)
       at com.riadalabs.jira.plugins.insight.services.cache.InsightCacheManagerImpl.getFromCache(InsightCacheManagerImpl.java:75)
       at com.riadalabs.jira.plugins.insight.services.core.dal.impl.ConfigurationDalImpl.loadStatusType(ConfigurationDalImpl.java:327)
       at com.riadalabs.jira.plugins.insight.services.predicate.attribute.AttributeValuePredicateBase.getStatusValue(AttributeValuePredicateBase.java:293)
    ...
    
       at com.riadalabs.jira.plugins.insight.services.predicate.MultiplePredicates.testAllOf(MultiplePredicates.java:72)
       at com.riadalabs.jira.plugins.insight.services.predicate.MultiplePredicates.test(MultiplePredicates.java:57)
       at com.riadalabs.jira.plugins.insight.services.predicate.MultiplePredicates.test(MultiplePredicates.java:12)
       at java.util.function.Predicate.lambda$or$2(Predicate.java:101)
       at java.util.function.Predicate$$Lambda$1320/767395427.test(Unknown Source)
       at java.util.function.Predicate.lambda$and$0(Predicate.java:69)
       at java.util.function.Predicate$$Lambda$1771/1190882033.test(Unknown Source)
       at com.riadalabs.jira.plugins.insight.services.predicate.ObjectIdBucketPredicate.test(ObjectIdBucketPredicate.java:25)
       at com.riadalabs.jira.plugins.insight.services.predicate.ObjectIdBucketPredicate.test(ObjectIdBucketPredicate.java:9)
       at com.riadalabs.jira.plugins.insight.services.core.ObjectFilterManagerImpl.filterObjectBeansInThread(ObjectFilterManagerImpl.kt:125)
    ...
    
       at com.riadalabs.jira.plugins.insight.services.jira.customfield.searcher.function.IQLFunction.getFunctionValues(IQLFunction.java:125)
       at com.riadalabs.jira.plugins.insight.services.jira.customfield.searcher.function.InsightAbstractJqlFunction.getValues(InsightAbstractJqlFunction.java:91)
       at com.atlassian.jira.jql.operand.FunctionOperandHandler$3.call(FunctionOperandHandler.java:61)
       at com.atlassian.jira.jql.operand.FunctionOperandHandler$3.call(FunctionOperandHandler.java:58)
       at com.atlassian.ozymandias.SafePluginPointAccess.call(SafePluginPointAccess.java:390)
       at com.atlassian.jira.jql.operand.FunctionOperandHandler.getValues(FunctionOperandHandler.java:58)
       at com.atlassian.jira.jql.operand.FunctionOperandHandler.getValues(FunctionOperandHandler.java:26)
       at com.atlassian.jira.jql.operand.DefaultJqlOperandResolver.getValues(DefaultJqlOperandResolver.java:88)
       at sun.reflect.GeneratedMethodAccessor784.invoke(Unknown Source)
       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
       at java.lang.reflect.Method.invoke(Method.java:498)
       at com.atlassian.plugin.util.ContextClassLoaderSettingInvocationHandler.invoke(ContextClassLoaderSettingInvocationHandler.java:26)
       at com.sun.proxy.$Proxy313.getValues(Unknown Source)
       at sun.reflect.GeneratedMethodAccessor784.invoke(Unknown Source)
       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
       at java.lang.reflect.Method.invoke(Method.java:498)
       at com.atlassian.plugin.osgi.bridge.external.HostComponentFactoryBean$DynamicServiceInvocationHandler.invoke(HostComponentFactoryBean.java:131)
       at com.sun.proxy.$Proxy313.getValues(Unknown Source)
       at com.atlassian.jira.jql.query.GenericClauseQueryFactory.getRawValues(GenericClauseQueryFactory.java:75)
    
    ...
    
       at com.atlassian.jira.jql.query.QueryVisitor.createQuery(QueryVisitor.java:51)
       at com.atlassian.jira.jql.query.DefaultLuceneQueryBuilder.createLuceneQuery(DefaultLuceneQueryBuilder.java:29)
       at com.atlassian.jira.issue.search.providers.LuceneSearchProvider.createLuceneQuery(LuceneSearchProvider.java:343)
       at com.atlassian.jira.issue.search.providers.LuceneSearchProvider.getHitCount(LuceneSearchProvider.java:183)
       at com.atlassian.jira.issue.search.providers.LuceneSearchProvider.getHitCount(LuceneSearchProvider.java:147)
       at com.atlassian.servicedesk.internal.sla.goal.GoalMatcherImpl.isMatching(GoalMatcherImpl.java:103)
       at com.atlassian.servicedesk.internal.sla.goal.GoalMatcherImpl.lambda$null$0(GoalMatcherImpl.java:58)
       at com.atlassian.servicedesk.internal.sla.goal.GoalMatcherImpl$$Lambda$3298/2052566009.test(Unknown Source)

Cause


The Service Management SLAs, Customer Notifications and JSM Automations all share the same threads (SdOffThreadEventJobRunner:thread or SdSerialisedOffThreadProcessor:thread). Whenever an update is made to a ticket, the Service Management project processes this update as follows:

  • First, the ticket where the update occurred is tested again each single JQL query configured in each SLA within the project
  • It's only after this ticket has been processed by the SLA functionality that this ticket will be processed by the Service Management Automations and Customer Notifications

If the project contains an SLA configured with a lot of complex JQL queries, it will take a lot of time for any ticket to be processed by this SLA before being process by the Automations and Customer Notifications.

Because of that, we might experience:

  • delays and inconsistencies with the SLA displayed in the tickets
  • delays with the triggering of Automations
  • delays with the Customer Notifications

Workaround

Identify the complex JQL queries used in the SLA configurations. Some ways to identify them include:

  • Checking the atlassian-jira-slow-queries.log files, since these log files will report any JQL queries that is slow to run
    Here's a command that shows the 20 slowest queries:

    cat atlassian-jira-slow-queries.log*| grep OffThread | grep -oE "query.*queryTermMetrics" | sed "s/ ), queryTermMetrics//" | sed "s/[{]//g"| sed "s/[}]//g" | sed "s/query=key = [0-9]* AND //g" | sort | uniq -c | sort -nr | head -n 20


  • Reviewing each JQL query in each SLA and check if any of them is using functions coming from add-ons (such as ScriptRunner, or Insight Asset Management...)
  • Copying each JQL query in each SLA, and testing them one by one in the page Issues > search for issues to see if they take a long time to execute

Once these slow JQL queries are identified, you will need to find a way to optimize them, for example by removing any kind of add-on script or functions from these queries.



Last modified on Jun 23, 2023

Was this helpful?

Yes
No
Provide feedback about this article
Powered by Confluence and Scroll Viewport.