CSP-10571 - Configure a Cluster without Multicast traffic

Product: Confluence

Environment

Operating System  
Affects Version/s 2.5.4, 2.8
JVM  
Database  
Application Server  

Error Message

WARN-level log entries such as:

Tangosol Coherence AE 3.2/365 (RC3) <Warning> (thread=Cluster, member=n/a): This Member(Id=0, Timestamp=2007-08-08 03:25:56.418, Address=10.254.228.126:8088, MachineId=4222) has been attempting to join the cluster at address 227.178.239.70:32365 with TTL 1 for 36 seconds without success; this could indicate a mis-configured TTL value, or it may simply be the result of a busy cluster or active failover.
Tangosol Coherence AE 3.2/365 (RC3) <Warning> (thread=Cluster, member=n/a): Received a discovery message that indicates the presence of an existing cluster that does not respond to join requests; this is usually caused by a network layer failure:

Symptom

A node fails to rejoin the cluster after a failure, yet does not form a new cluster of its own (i.e. no cluster panic occurs).

Diagnostics/Test

Use Wireshark or snoop to capture UDP traffic on both the rejoining node and the senior node in the cluster.
The capture on the rejoining node should show it sending multicast packets, while the capture on the senior node should show none of those packets arriving.

Root Cause

The network 'cloud' is dropping multicast traffic from the rejoining node, yet is allowing it from the senior node.

Solution

Configure Coherence to use a well-known set of cluster nodes (well-known addresses over unicast) instead of multicast discovery.

The downside is that the members of your cluster are statically defined. You cannot swap in new nodes beyond those originally listed (four in the example below) unless the new node is a straight replacement and has the same IP address as the node it replaces. In addition, you cannot add nodes beyond your original configuration without stopping the cluster.

So, to the configuration. This must be done on each node -

Edit <confluence home>/confluence/WEB-INF/classes/tangosol-coherence-override.xml
Comment out the multicast-listener element, e.g.:
<!--
<multicast-listener>
  <time-to-live system-property="tangosol.coherence.ttl">0</time-to-live>
  <address system-property="tangosol.coherence.clusteraddress">234.3.1.2</address>
</multicast-listener>
-->
Immediately following the commented-out element, add a new unicast-listener element like this:
<unicast-listener>
  <well-known-addresses>
    <socket-address id="1">
      <address>IP address of wiki01</address>
      <port>8088</port>
    </socket-address>
    <socket-address id="2">
      <address>IP address of wiki02</address>
      <port>8088</port>
    </socket-address>
    <socket-address id="3">
      <address>IP address of wiki03</address>
      <port>8088</port>
    </socket-address>
    <socket-address id="4">
      <address>IP address of wiki04</address>
      <port>8088</port>
    </socket-address>
  </well-known-addresses>
  <address>IP address of node you are making this change on</address>
  <port>8088</port>
</unicast-listener>
Note: You must use the IP address of each node, not its host name. The address specified after the well-known-addresses element must be the IP address (not the node name) of the host whose configuration you are editing.
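
As an illustration only, here is a sketch of how the element might look once filled in on the node wiki02. The 192.0.2.x addresses are hypothetical placeholders (a range reserved for documentation), not values from a real installation; substitute the actual IP addresses of your nodes. The well-known-addresses list is the same on every node, and only the address after it changes from host to host.

```xml
<!-- Sketch for node wiki02. All 192.0.2.x addresses are placeholders. -->
<unicast-listener>
  <!-- This list is identical on every node in the cluster. -->
  <well-known-addresses>
    <socket-address id="1">
      <address>192.0.2.11</address> <!-- assumed IP of wiki01 -->
      <port>8088</port>
    </socket-address>
    <socket-address id="2">
      <address>192.0.2.12</address> <!-- assumed IP of wiki02 -->
      <port>8088</port>
    </socket-address>
    <socket-address id="3">
      <address>192.0.2.13</address> <!-- assumed IP of wiki03 -->
      <port>8088</port>
    </socket-address>
    <socket-address id="4">
      <address>192.0.2.14</address> <!-- assumed IP of wiki04 -->
      <port>8088</port>
    </socket-address>
  </well-known-addresses>
  <!-- This part differs per node: the IP address of the host being edited (wiki02 here). -->
  <address>192.0.2.12</address>
  <port>8088</port>
</unicast-listener>
```

On wiki03 the same fragment would be used with the final address changed to that node's own IP (192.0.2.13 in this sketch).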

In Confluence versions 2.6 and later, this file has been bundled into the main Confluence .jar file (e.g. CONFLUENCE_INSTALL/confluence/WEB-INF/lib/confluence-2.8.0.jar), so the file needs to be extracted before editing. Instructions for doing so, and for installing the new copy of the file once it's been edited, are here.
