Webhook events are skipped because the queue is full
Platform notice: Server and Data Center only. This article only applies to Atlassian products on the Server and Data Center platforms.
Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.
*Except Fisheye and Crucible
Problem
Webhook events may time out and fail to get dispatched.
This KB describes procedures for enabling more verbose logging levels to get detailed information for troubleshooting.
Background: Webhook Event Queue
In order for a webhook event to get processed, it first has to be added to the webhook queue.
If the webhook queue is full, new webhook events may time out waiting to be added.
This timeout is defined by the property below:
# The maximum amount of time allowed to be placed on the dispatch queue. This time
# is only utilized if the dispatch# queue is filled. The application will wait this
# amount of time to attempt to be placed in the queue.
#
# If unsuccessful, the webhook will be skipped.
#
# This value is in **milliseconds**
plugin.webhooks.dispatch.queue.timeout=<timeout>
Without enabling the relevant logging, it is hard to diagnose if this issue is occurring with existing INFO logging alone.
Diagnosis
Enable additional logging for the following classes using the following cURL commands (system admin permissions are required):
ApplicationWebhookEventDispatcher
We will enable TRACE
logging for this class.
This will log when a webhook event is first published.
Example Log
This will log when a webhook event is first published.
2023-03-17 12:50:21,661 TRACE [AtlassianEvent::thread-6] admin @1WYJAIJx770x272x0 a5iggl 0:0:0:0:0:0:0:1 "POST /projects/PROJECT_1/repos/rep_1/settings HTTP/1.1" c.a.b.i.w.ApplicationWebhookEventDispatcher Publishing webhook event repo:modified for repository event [com.atlassian.stash.internal.repository.AnalyticsRepositoryModifiedEvent[source=com.atlassian.stash.internal.repository.DefaultRepositoryService@4e53ca0c]]
We can see which project and repository triggered the webhook event from the request in the log: POST /projects/PROJECT_1/repos/rep_1/settings
- Project:
PROJECT_1
- Repository:
rep_1
Command
curl -X PUT -H "Accept: application/json" -H "Content-Type: application/json" -u username -p http://example.com/bitbucket/rest/api/latest/logs/logger/com.atlassian.bitbucket.internal.webhook.ApplicationWebhookEventDispatcher/TRACE
DefaultWebhookService
We will enable TRACE
logging for this class.
This will log information about interacting with the webhook dispatch queue.
Example Log
2023-03-17 12:50:21,667 TRACE [AtlassianEvent::thread-6] admin @1WYJAIJx770x272x0 a5iggl 0:0:0:0:0:0:0:1 "POST /projects/PROJECT_1/repos/rep_1/settings HTTP/1.1" c.a.w.internal.DefaultWebhookService Adding to webhook service dispatch queue with queue id [6fe8ecb5-8b69-4b3c-9813-89356cffb55f]
2023-03-17 12:50:21,668 TRACE [webhook-dispatcher:thread-1] c.a.w.internal.DefaultWebhookService Webhook has started execution for debug queue id [6fe8ecb5-8b69-4b3c-9813-89356cffb55f]
2023-03-17 12:50:21,684 TRACE [webhook-dispatcher:thread-1] c.a.w.internal.DefaultWebhookService A new webhook invocation has been created for webhook [1], invocation [8ae9db9f-232f-438c-bfc6-ec2762f11a27]
2023-03-17 12:50:21,700 DEBUG [webhook-dispatcher:thread-1] c.a.w.internal.DefaultWebhookService The overall result of the filter was [true] for invocation [8ae9db9f-232f-438c-bfc6-ec2762f11a27]
We can see which project and repository triggered the webhook event from the request in the log: POST /projects/PROJECT_1/repos/rep_1/settings
- Project:
PROJECT_1
- Repository:
rep_1
We can see which webhook was triggered, as well as the new invocation that was created for the given webhook, in the log: A new webhook invocation has been created for webhook [1], invocation [8ae9db9f-232f-438c-bfc6-ec2762f11a27]
- Webhook ID:
1
- Webhook invocation request ID:
8ae9db9f-232f-438c-bfc6-ec2762f11a27
We can see which webhook queue we’re utilizing in the log: Webhook has started execution for debug queue id [6fe8ecb5-8b69-4b3c-9813-89356cffb55f]
- Webhook queue ID: 6fe8ecb5-8b69-4b3c-9813-89356cffb55f
Command
curl -X PUT -H "Accept: application/json" -H "Content-Type: application/json" -u username -p http://example.com/bitbucket/rest/api/latest/logs/logger/com.atlassian.webhooks.internal.DefaultWebhookService/TRACE
DefaultWebhookDispatcher
We will enable DEBUG
logging for this class.
This will log information about dispatching a webhook and whether it is skipped or successfully sent.
Example Log
Successful
2023-03-17 12:50:21,706 DEBUG [webhook-dispatcher:thread-1] c.a.w.i.p.DefaultWebhookDispatcher Starting dispatch work for webhook invocation [8ae9db9f-232f-438c-bfc6-ec2762f11a27]
2023-03-17 12:50:22,945 DEBUG [ForkJoinPool.commonPool-worker-9] c.a.w.i.p.DefaultWebhookDispatcher Request has completed for webhook invocation [8ae9db9f-232f-438c-bfc6-ec2762f11a27]. Status code = 200
We can see which webhook invocation was successfully processed:
- Webhook invocation request ID:
8ae9db9f-232f-438c-bfc6-ec2762f11a27
Skipped
2023-03-17 12:50:21,706 DEBUG [webhook-dispatcher:thread-1] c.a.w.i.p.DefaultWebhookDispatcher Starting dispatch work for webhook invocation [8ae9db9f-232f-438c-bfc6-ec2762f11a27]
2023-03-17 12:50:22,945 DEBUG [ForkJoinPool.commonPool-worker-9] c.a.w.i.p.DefaultWebhookDispatcher Skipping webhook invocation [8ae9db9f-232f-438c-bfc6-ec2762f11a27] to (https://example.com)
We can see which webhook invocation was skipped:
- Webhook invocation request ID:
8ae9db9f-232f-438c-bfc6-ec2762f11a27
Command
curl -X PUT -H "Accept: application/json" -H "Content-Type: application/json" -u username -p http://example.com/bitbucket/rest/api/latest/logs/logger/com.atlassian.webhooks.internal.publish.DefaultWebhookDispatcher/DEBUG
Once you have finished troubleshooting, revert the logging for all classes back to the default INFO logging. You can do so by replacing DEBUG or TRACE in the above cURL commands with INFO.
JMX Monitoring
In addition to enabling more verbose and granular logging, you can also monitor webhook statistics exposed through JMX.
Please see Enabling JMX counters for performance monitoring - Webhook statistics
Cause
Webhook events fail to get dispatched because the webhook event queue is full.
Workaround
If issues with full queues and webhooks being skipped are identified with the logs above, it may help to increase the webhook queue size with the following property:
# The maximum size of the queue used to transform webhook publishing events into HTTP events.
# The default value is 250
plugin.webhooks.dispatch.queue.size=<size>
We suggest increasing the property value iteratively and following change management best practices by testing the change on a lower environment first before implementing to the production environment.