RejectedExecutionException
Symptoms
Stash might crash from time to time if this is affecting you.
The following appears in the atlassian-stash.log
:
2013-11-06 16:43:03,430 ERROR [http-bio-7990-exec-20] user 1003x3295x1 abcd XXX.XXX.XXX.XXX,127.0.0.1 "POST /rest/api/latest/projects/ABC/repos/plugins/pull-requests/123/comments H
TTP/1.1" c.a.s.i.e.LockFreeEventPublisher$Publisher Exception while dispatching to invoker 'AsyncInvokerBatch{invokers=[SingleParameterMethodListenerInvoker
{method=public void com.atlas sian.webhooks.plugin.WebHookEventsProcessor.onEvent(java.lang.Object), listener=com.atlassian.webhooks.plugin.WebHookEventsProcessor@47aef93e},
SingleParameterMethodListenerInvoker
{meth od=public void com.atlassian.stash.internal.audit.AuditedAnnotatedEventListener.onEvent(java.lang.Object) throws java.lang.Exception, listener=com.atlassian.stash.internal.audit.Audited AnnotatedEventListener@44eeb090},
SingleParameterMethodListenerInvoker
{method=public void com.atlassian.stash.internal.notification.pullrequest.PullRequestNotificationProducer.onPullReq uestCommentAdded(com.atlassian.stash.event.pull.PullRequestCommentAddedEvent), listener=com.atlassian.stash.internal.notification.pullrequest.PullRequestNotificationProducer@1ce2c1bf}]}
' for event 'com.atlassian.stash.event.pull.PullRequestCommentAddedEvent[source=com.atlassian.stash.internal.pull.comment.DefaultPullRequestCommentService@26d92120]'
java.util.concurrent.RejectedExecutionException: Task com.atlassian.sal.core.executor.ThreadLocalDelegateRunnable@1ab334c1 rejected from java.util.concurrent.ThreadPoolExecutor@1f6328e4
[Running, pool size = 4, active threads = 4, queued tasks = 4096, completed tasks = 7743]
at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2013) ~[na:1.7.0_09-icedtea]
at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:816) [na:1.7.0_09-icedtea]
Cause
The error message RejectedExecutionException indicates events are being raised faster than they can be handled by listeners. This typically means there are one or more plugins with @EventListener
methods that are processing events directly on the event threads.
We know from our past issues that some plugins have been responsible for issues like that:
- Jenkins Plugin: Fixed on version 2.3.1 as described at the version history tab of their website on Marketplace
- Workzone Plugin: Fixed in version 2.1.1 as described on their public issue tracker
- Notifyr Plugin: Fixed in 2.3.2 as described on their public issue tracker
- Pull Request Notifier:
- Custom plugins
Diagnosis
Latest Stash releases
Look it up in the logs the following string:
The event queue is full. Stacks for the processing threads follow:
You should be able to find the stack trace of the root cause of that:
2015-09-03 14:21:07,363 WARN [http-nio-7990-exec-69] dustin.lyons @Z7AMPNx861x106286x4 r6qljd 10.10.43.114,0:0:0:0:0:0:0:1 "GET /rest/api/latest/projects/SAND/repos/amplitude-export/commits/eb46907449da30af0b1f9528e18a8cd6e98cf6f9/diff/src/main/scala/ExportService.scala HTTP/1.1" c.a.s.i.e.EventThreadPoolExecutor Stack trace for AtlassianEvent::thread-5 com.atlassian.stash.internal.util.StackException: Stack trace for AtlassianEvent::thread-5 at java.net.SocketInputStream.socketRead0(Native Method) ~[na:1.7.0_60] at java.net.SocketInputStream.read(SocketInputStream.java:152) ~[na:1.7.0_60] at java.net.SocketInputStream.read(SocketInputStream.java:122) ~[na:1.7.0_60] at java.io.BufferedInputStream.fill(BufferedInputStream.java:235) ~[na:1.7.0_60] at java.io.BufferedInputStream.read1(BufferedInputStream.java:275) ~[na:1.7.0_60] at java.io.BufferedInputStream.read(BufferedInputStream.java:334) ~[na:1.7.0_60] at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:687) ~[na:1.7.0_60] at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:633) ~[na:1.7.0_60] at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1323) ~[na:1.7.0_60] at se.bjurr.prnfs.listener.UrlInvoker.invoke(UrlInvoker.java:103) ~[na:na] at se.bjurr.prnfs.listener.PrnfsPullRequestEventListener$1.invoke(PrnfsPullRequestEventListener.java:49) ~[na:na] at se.bjurr.prnfs.listener.PrnfsPullRequestEventListener.handleEvent(PrnfsPullRequestEventListener.java:139) ~[na:na] at se.bjurr.prnfs.listener.PrnfsPullRequestEventListener.onEvent(PrnfsPullRequestEventListener.java:105) ~[na:na] at com.atlassian.event.internal.SingleParameterMethodListenerInvoker.invoke(SingleParameterMethodListenerInvoker.java:36) ~[atlassian-event-2.3.5.jar:na] at com.atlassian.stash.internal.event.AsyncBatchingInvokersTransformer$AsyncInvokerBatch.invoke(AsyncBatchingInvokersTransformer.java:100) ~[stash-platform-3.11.1.jar:na] at com.atlassian.event.internal.AsynchronousAbleEventDispatcher$1$1.run(AsynchronousAbleEventDispatcher.java:48) ~[atlassian-event-2.3.5.jar:na] at com.atlassian.sal.core.executor.ThreadLocalDelegateRunnable.run(ThreadLocalDelegateRunnable.java:38) ~[sal-core-2.13.4.jar:na] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_60] at java.lang.Thread.run(Thread.java:745) ~[na:1.7.0_60] ... 4 frames trimmed 2015-09-03 14:21:07,376 WARN [http-nio-7990-exec-69] dustin.lyons @Z7AMPNx861x106286x4 r6qljd 10.10.43.114,0:0:0:0:0:0:0:1 "GET /rest/api/latest/projects/SAND/repos/amplitude-export/commits/eb46907449da30af0b1f9528e18a8cd6e98cf6f9/diff/src/main/scala/ExportService.scala HTTP/1.1" c.a.s.i.e.EventThreadPoolExecutor Stack trace for AtlassianEvent::thread-1 com.atlassian.stash.internal.util.StackException: Stack trace for AtlassianEvent::thread-1 at java.net.SocketInputStream.socketRead0(Native Method) ~[na:1.7.0_60] at java.net.SocketInputStream.read(SocketInputStream.java:152) ~[na:1.7.0_60] at java.net.SocketInputStream.read(SocketInputStream.java:122) ~[na:1.7.0_60] at java.io.BufferedInputStream.fill(BufferedInputStream.java:235) ~[na:1.7.0_60] at java.io.BufferedInputStream.read1(BufferedInputStream.java:275) ~[na:1.7.0_60] at java.io.BufferedInputStream.read(BufferedInputStream.java:334) ~[na:1.7.0_60] at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:687) ~[na:1.7.0_60] at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:633) ~[na:1.7.0_60] at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1323) ~[na:1.7.0_60] at se.bjurr.prnfs.listener.UrlInvoker.invoke(UrlInvoker.java:103) ~[na:na] at se.bjurr.prnfs.listener.PrnfsPullRequestEventListener$1.invoke(PrnfsPullRequestEventListener.java:49) ~[na:na] at se.bjurr.prnfs.listener.PrnfsPullRequestEventListener.handleEvent(PrnfsPullRequestEventListener.java:139) ~[na:na] at se.bjurr.prnfs.listener.PrnfsPullRequestEventListener.onEvent(PrnfsPullRequestEventListener.java:85) ~[na:na] at com.atlassian.event.internal.SingleParameterMethodListenerInvoker.invoke(SingleParameterMethodListenerInvoker.java:36) ~[atlassian-event-2.3.5.jar:na] at com.atlassian.stash.internal.event.AsyncBatchingInvokersTransformer$AsyncInvokerBatch.invoke(AsyncBatchingInvokersTransformer.java:100) ~[stash-platform-3.11.1.jar:na] at com.atlassian.event.internal.AsynchronousAbleEventDispatcher$1$1.run(AsynchronousAbleEventDispatcher.java:48) ~[atlassian-event-2.3.5.jar:na] at com.atlassian.sal.core.executor.ThreadLocalDelegateRunnable.run(ThreadLocalDelegateRunnable.java:38) ~[sal-core-2.13.4.jar:na] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_60] at java.lang.Thread.run(Thread.java:745) ~[na:1.7.0_60] ... 4 frames trimmed
- For example, from the stack traces below the plugin
se.bjurr.prnfs
is flooding the Event queue.
If you can't find the string above in your logs ...
- Then you are in an older release of Stash and you have to cross check your logs and thread dumps
- To verify which plugin is causing the issue, you have to look at the Thread Dumps - specifically looking for the
AtlassianEvent
- There will be a Thread Dump in you Support Zip (Administration -> Atlassian Support Tools -> Support Zip -> Create)
- However you might decide to take a few Thread Dumps when the problem is actually happening by using the document below:
- Generate a Thread Dump Externally
- Please send this data to our Support should you hit this issue
- See the Thread Dump below, for example, for a real issue we analysed:
"AtlassianEvent::thread-4" daemon prio=10 tid=0x0000000019a4f000 nid=0x896 in Object.wait() [0x00002b8f36d13000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at com.hazelcast.spi.impl.BasicInvocationFuture.waitForResponse(BasicInvocationFuture.java:205)
- locked <0x0000000681106a60> (a com.hazelcast.spi.impl.BasicInvocationFuture)
at com.hazelcast.spi.impl.BasicInvocationFuture.get(BasicInvocationFuture.java:180)
...
...
at com.izymes.workzone.listener.BuildEventListener.findPullRequests(BuildEventListener.java:102)
at com.izymes.workzone.listener.BuildEventListener.onBuildResultReceivedEvent(BuildEventListener.java:76)
...
...
at com.atlassian.sal.core.executor.ThreadLocalDelegateRunnable.run(ThreadLocalDelegateRunnable.java:38)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
In this case, we could notice that the Workzone plugin was causing the issue. Notice that from the thread above:
at com.izymes.workzone.listener.BuildEventListener
Resolution
- The solution is to disable the offending plugin. Notice that Atlassian can't support third-party plugins. That said we encourage you to raise an issue with the plugin vendor or review your plugin code. Most of the vendors have a public issue tracker link available on our Marketplace. Take for example the similar issues we reported to different plugin vendors:
- Workzone plugin: https://bitbucket.org/izymesdev/workzone/issue/79/buildeventlistener-is-doing-its-processing
- Jenkins plugin: https://github.com/Nerdwin15/stash-jenkins-postreceive-webhook/issues
- Pull Request Notifier for Bitbucket Server: https://github.com/tomasbjerre/pull-request-notifier-for-bitbucket/issues/78
If you can't disable the plugin because it's critical to your workflow, we suggest not only increasing your queue size but also increasing your event processor threads. That can be done by adding the following in stash-config.properties:
event.dispatcher.max.threads=2*${scaling.concurrency} event.dispatcher.queue.size=9000
This will allocate (the number of CPUs reported by your system times two) event processor threads instead of the default (number of CPUs reported by your system) event processor threads, which should help drain your queue more quickly.