Creating DBR message fails with: KryoException: Buffer overflow. Available: 0, required: 1

JIRA DC 8.13

Context

Since JIRA DC 8.12, JIRA uses Document Based Replication (DBR) to replicate the index across the cluster. When a change to an issue is triggered on one node, JIRA synchronously re-indexes that issue, then asynchronously serialises an object containing all of the affected Lucene documents and distributes it to the other nodes.


JIRA uses Kryo to serialise and deserialise Lucene documents, and it makes assumptions about how large the serialised documents can be. By default, the maximum size of the serialised object containing the Lucene documents is 16MB.
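To illustrate what a fixed maximum size means for a serialiser, here is a minimal sketch that uses only the Java standard library. It is an analogy, not JIRA's or Kryo's actual code: the class name `BoundedBufferSketch`, the method `fits`, and the tiny 16-byte cap (a stand-in for the 16MB default) are all assumptions made for illustration.

```java
import java.nio.BufferOverflowException;
import java.nio.ByteBuffer;

// Stdlib-only analogy of a size-capped serialisation buffer.
// Hypothetical names; this is NOT how JIRA or Kryo is implemented.
final class BoundedBufferSketch {

    // Tries to write payloadBytes bytes into a buffer capped at maxBytes.
    // Returns true if the payload fits, false if the cap is exceeded.
    static boolean fits(int payloadBytes, int maxBytes) {
        ByteBuffer buffer = ByteBuffer.allocate(maxBytes);
        try {
            buffer.put(new byte[payloadBytes]);
            return true;
        } catch (BufferOverflowException e) {
            // The writer ran out of room, comparable to "Available: 0, required: 1".
            return false;
        }
    }

    public static void main(String[] args) {
        int cap = 16; // tiny stand-in for the 16MB default
        System.out.println(fits(16, cap)); // prints "true"
        System.out.println(fits(17, cap)); // prints "false"
    }
}
```

The point of the sketch is that a single byte over the cap is enough to fail the whole write, which is why one unusually large issue can fail to serialise while all others succeed.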

Problem 

A plugin can trigger a full issue reindex (including all related entities) on an issue with a large number of comments, worklogs and history entries, and this can produce a serialised object larger than 16MB. If this happens, a log entry similar to the following appears on the node that tried to create the DBR message:

[c.a.j.cluster.dbr.DefaultDBRSender] [DBR] [SENDER] Error when creating a dbr update with related message for documents: ...
...
com.esotericsoftware.kryo.KryoException: Buffer overflow. Available: 0, required: 1
Serialization trace:
type (org.apache.lucene.document.TextField)
fields (org.apache.lucene.document.Document)
worklogs (com.atlassian.jira.cluster.dbr.DBRMessageUpdateWithRelatedData)

	at com.esotericsoftware.kryo.io.Output.require(Output.java:186)
	at com.esotericsoftware.kryo.io.Output.writeBoolean(Output.java:635)
	at com.esotericsoftware.kryo.serializers.UnsafeField$BooleanUnsafeField.write(UnsafeField.java:151)
	at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:107)
	at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:568)
	at com.esotericsoftware.kryo.serializers.ReflectField.write(ReflectField.java:70)
	at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:107)
	at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:637)
	at com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:158)
	at com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:43)
	at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:568)
	at com.esotericsoftware.kryo.serializers.ReflectField.write(ReflectField.java:70)
	at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:107)
	at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:568)
	at com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:154)
	at com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:43)
	at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:568)
	at com.esotericsoftware.kryo.serializers.ReflectField.write(ReflectField.java:70)
	at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:107)
	at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:553)
	at com.atlassian.jira.cluster.dbr.KryoDBRMessageDataSerializer.serialize(KryoDBRMessageDataSerializer.java:55)
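To get a rough idea of how often this failure occurs on a node, the log can be scanned for the two marker strings shown above. The following is a minimal sketch, assuming the log has already been read into a list of lines; the class name `DbrLogScanSketch` and the method `countOverflowFailures` are hypothetical, and the matching is deliberately simple (it does not parse timestamps or log levels).

```java
import java.util.List;

// Hypothetical helper for counting DBR sender failures caused by Kryo buffer overflow.
final class DbrLogScanSketch {
    // Marker strings copied from the log excerpt in this article.
    static final String SENDER_MARKER = "[DBR] [SENDER] Error when creating a dbr update";
    static final String OVERFLOW_MARKER = "KryoException: Buffer overflow";

    // Counts overflow lines that follow a DBR sender error line.
    static long countOverflowFailures(List<String> logLines) {
        long count = 0;
        boolean inSenderError = false;
        for (String line : logLines) {
            if (line.contains(SENDER_MARKER)) {
                inSenderError = true;
            } else if (inSenderError && line.contains(OVERFLOW_MARKER)) {
                count++;
                inSenderError = false; // wait for the next sender error
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> sample = List.of(
            "[c.a.j.cluster.dbr.DefaultDBRSender] [DBR] [SENDER] Error when creating a dbr update with related message for documents: ...",
            "com.esotericsoftware.kryo.KryoException: Buffer overflow. Available: 0, required: 1",
            "some unrelated log line");
        System.out.println(countOverflowFailures(sample)); // prints "1"
    }
}
```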


Side note: In general, it is fine for DBR messages to fail occasionally (~5% rate), because a separate replay mechanism makes sure the indexes on all nodes stay consistent and re-indexes any missing data.

However, another issue, JRASERVER-71980, may cause index replay to fail for those "big" issues when the number of related entities (comments, worklogs) is greater than 1000. In that case, the two problems amplify each other.

Workaround 

Serialisation exception when creating DBR message

Issue: JRASERVER-71976

The maximum size of the serialised data in a single DBR message is set to 16MB. It can be overridden with the following system property (the example below raises the maximum size to 32MB). Avoid setting this parameter to a very high value.

-Dcom.atlassian.jira.cluster.dbr.serialization.max.size.bytes=33554432

Note: this property must be set on every node, which requires a rolling restart of all nodes.
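The property value is expressed in bytes, so a size in megabytes has to be multiplied by 1024 × 1024. As a minimal sketch, the following computes the value and prints the corresponding JVM argument; the class name `DbrMaxSizeSketch` and its methods are hypothetical helpers, and only the property name comes from this article.

```java
// Hypothetical helper for building the JVM argument from a size in megabytes.
final class DbrMaxSizeSketch {
    // Property name as given in this article.
    static final String PROPERTY =
        "com.atlassian.jira.cluster.dbr.serialization.max.size.bytes";

    // Converts megabytes to bytes (1MB = 1024 * 1024 bytes).
    static long megabytesToBytes(long megabytes) {
        return megabytes * 1024L * 1024L;
    }

    // Builds the -D argument to add to the JVM startup options of each node.
    static String jvmArgument(long megabytes) {
        return "-D" + PROPERTY + "=" + megabytesToBytes(megabytes);
    }

    public static void main(String[] args) {
        // prints "-Dcom.atlassian.jira.cluster.dbr.serialization.max.size.bytes=33554432"
        System.out.println(jvmArgument(32));
    }
}
```

For reference, the 16MB default corresponds to 16777216 bytes.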

Replay failing for a large list of comments or worklogs

Issue: JRASERVER-71980

Currently there is no workaround for this. Note that most of the time this should not be a problem and the index will be consistent across the cluster. Every worklog or comment on this list was replicated individually when it was created or updated, through its own DBR message and, as a backup, its own index replay operation. Those messages and operations are not affected by JRASERVER-71976 or JRASERVER-71980.

The problem only affects re-index operations that trigger a full issue reindex (the issue together with all of its comments and worklogs). This is usually caused by misuse of the JIRA indexing API: a plugin updates only the issue but triggers a full issue reindex instead of reindexing the issue itself. Disabling the plugin that triggers this reindexing usually solves the problem.


Last modified on Jan 13, 2021
