Platform Notice: Data Center Only - This article only applies to Atlassian products on the Data Center platform.

Note that this KB was created for the Data Center version of the product. Data Center KBs for non-Data-Center-specific features may also work for Server versions of the product, however they have not been tested. Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.

*Except Fisheye and Crucible

Summary

The Cluster Cache Replication health check fails because the nodes cannot communicate with each other to replicate the cache. The health check reports:

Name: Cluster Cache Replication
NodeId: null
Is healthy: false
Failure reason: ["The node node3 is not replicating","The node node2 is not replicating"]
Severity: CRITICAL

The following exception appears in atlassian-jira.log:

2021-06-04 18:26:25,830+0000 localq-reader-12 ERROR      [c.a.j.c.distribution.localq.LocalQCacheOpReader] [LOCALQ] [VIA-COPY] Abandoning sending: LocalQCacheOp{cacheName='com.atlassian.jira.plugins.healthcheck.service.HeartBeatService.heartbeat', action=PUT, key=node2, value == null ? false, replicatePutsViaCopy=true, creationTimeInMillis=1622831185825} from cache replication queue: [queueId=queue_node1_2_164546f60261c7e4be0c5f5f9aaeec86_put, queuePath=/var/atlassian/application-data/jira-home/localq/queue_node1_2_164546f60261c7e4be0c5f5f9aaeec86_put], failuresCount: 1/1. Removing from queue. Error: java.rmi.MarshalException: error marshalling arguments; nested exception is: 
    	java.net.SocketException: Broken pipe (Write failed)
com.atlassian.jira.cluster.distribution.localq.LocalQCacheOpSender$UnrecoverableFailure: java.rmi.MarshalException: error marshalling arguments; nested exception is: 
	java.net.SocketException: Broken pipe (Write failed)
	at com.atlassian.jira.cluster.distribution.localq.rmi.LocalQCacheOpRMISender.send(LocalQCacheOpRMISender.java:90)
	at com.atlassian.jira.cluster.distribution.localq.LocalQCacheOpReader.run(LocalQCacheOpReader.java:96)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.rmi.MarshalException: error marshalling arguments; nested exception is: 
	java.net.SocketException: Broken pipe (Write failed)
	at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:158)
	at net.sf.ehcache.distribution.RMICachePeer_Stub.put(Unknown Source)
	at com.atlassian.jira.cluster.distribution.localq.rmi.LocalQCacheOpRMISender.lambda$send$2(LocalQCacheOpRMISender.java:69)
	at com.atlassian.jira.cluster.distribution.localq.rmi.CachingRMICachePeerManager.withCachePeer(CachingRMICachePeerManager.java:94)
	... 6 more
Caused by: java.net.SocketException: Broken pipe (Write failed)
	at java.net.SocketOutputStream.socketWrite0(Native Method)
	at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:111)
	at java.net.SocketOutputStream.write(SocketOutputStream.java:155)
	at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
	at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
	at java.io.ObjectOutputStream$BlockDataOutputStream.writeByte(ObjectOutputStream.java:1915)
	at java.io.ObjectOutputStream.writeFatalException(ObjectOutputStream.java:1576)
	at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:351)
	at sun.rmi.server.UnicastRef.marshalValue(UnicastRef.java:294)
	at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:153)

Cause

The nodes are unable to communicate with each other because a node's hostname resolves to a loopback address (for example 127.0.0.1 or 127.0.1.1) instead of an address that the other nodes can reach.

Diagnosis

Verify that the IP entries in /etc/hosts are correct on each node. Ensure that the node's hostname is not mapped to a loopback address, including duplicated entries such as:

127.0.1.1 ip-node-1
127.0.0.1 ip-node-1
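
The check above can be sketched as a small script. This is a minimal illustration, not an Atlassian tool: the function name `loopback_hostnames` and the sample peer entry (`10.0.0.5 ip-node-2`) are assumptions made for the example. It parses the contents of a hosts file and lists every hostname that is mapped to a loopback address.

```python
import ipaddress


def loopback_hostnames(hosts_text):
    """Return {hostname: [loopback addresses]} for names mapped to a loopback IP."""
    found = {}
    for raw_line in hosts_text.splitlines():
        # Drop trailing comments and skip blank lines.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        address, *names = line.split()
        try:
            ip = ipaddress.ip_address(address)
        except ValueError:
            continue  # first field is not a plain IP address, skip the line
        if ip.is_loopback:
            for name in names:
                found.setdefault(name, []).append(address)
    return found


# Sample content: the two entries from this article plus a hypothetical peer.
sample = """\
127.0.1.1 ip-node-1
127.0.0.1 ip-node-1
10.0.0.5 ip-node-2  # hypothetical peer entry
"""

for name, addresses in loopback_hostnames(sample).items():
    print(f"{name} is mapped to loopback: {', '.join(addresses)}")
```

To inspect a real node, pass the contents of /etc/hosts instead of the sample string. Names such as localhost are expected to appear in the result; only the node's own hostname is a concern here.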

Solution

Remove the node's hostname from the loopback entries (127.0.0.1 and 127.0.1.1) in /etc/hosts, leaving only the entry for its external IP address.
Repeat this on all nodes.
Restart Jira.
Check that the Cluster Cache Replication health check now passes.
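
After editing /etc/hosts, it can help to confirm that the node's hostname no longer resolves to a loopback address before restarting Jira. The following is a minimal sketch under stated assumptions: the helper names (`resolved_addresses`, `loopback_addresses`, `check_node`) are made up for this example, and the check uses the operating system resolver through Python's `socket.getaddrinfo`, which may not match exactly what the JVM resolves.

```python
import ipaddress
import socket


def resolved_addresses(hostname, resolver=socket.getaddrinfo):
    """Return the sorted, de-duplicated IP addresses that hostname resolves to."""
    infos = resolver(hostname, None)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def loopback_addresses(addresses):
    """Return the subset of addresses that are loopback addresses."""
    # Strip any IPv6 zone suffix (for example "%lo0") before parsing.
    return [a for a in addresses
            if ipaddress.ip_address(a.split("%", 1)[0]).is_loopback]


def check_node(hostname, resolver=socket.getaddrinfo):
    """Return (ok, addresses); ok is False if any resolved address is loopback."""
    addresses = resolved_addresses(hostname, resolver)
    return (not loopback_addresses(addresses), addresses)
```

On a node, calling `check_node(socket.gethostname())` returns the verdict together with the resolved addresses. The `resolver` parameter exists only so the logic can be exercised without touching the real resolver.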

You may also need to check whether the hostname or IP address is statically defined in other settings of the system, such as:

JIRA_LOCAL_HOME/cluster.properties, entry:

ehcache.listener.hostName

JIRA_INSTALL/bin/setenv.sh, entry:

-Djava.rmi.server.hostname
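
The two settings above can be located with a short script. This is only a sketch: the function name `pinned_hostnames` and the sample file contents are assumptions for illustration, and real files may be laid out differently. It scans text for the two entries named above and returns the values they are set to, so you can see whether a loopback or outdated address is pinned.

```python
import re

# The two settings named above. Properties lines that start with "#" are ignored.
PATTERNS = {
    "ehcache.listener.hostName": re.compile(
        r"^[ \t]*ehcache\.listener\.hostName[ \t]*[=:][ \t]*(\S+)", re.MULTILINE
    ),
    "java.rmi.server.hostname": re.compile(
        r"-Djava\.rmi\.server\.hostname=([^\s\"']+)"
    ),
}


def pinned_hostnames(text):
    """Return {setting: [values]} for each setting found in text."""
    found = {}
    for setting, pattern in PATTERNS.items():
        values = pattern.findall(text)
        if values:
            found[setting] = values
    return found


# Hypothetical file contents, for illustration only.
cluster_properties = "jira.node.id = node1\nehcache.listener.hostName = 10.0.0.5\n"
setenv_sh = 'JVM_SUPPORT_RECOMMENDED_ARGS="-Djava.rmi.server.hostname=127.0.0.1"\n'

print(pinned_hostnames(cluster_properties))
print(pinned_hostnames(setenv_sh))
```

Note that the `-D` pattern does not skip commented-out lines in setenv.sh, so review any match in context.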

Cluster Cache replication health check fails with error SocketException: Broken pipe exception
