Unique Constraint Violation on Synchrony SNAPSHOTS Table

Summary

The Synchrony logs show repeated "duplicate key value violates unique constraint" errors that reference SNAPSHOTS_pkey.

Environment

Confluence Data Center with Synchrony running in its own cluster (at least 2 nodes).

Diagnosis

The atlassian-synchrony.log files are filled with entries like the following:

{"synchrony":{"message":"synchrony.data [warn] error persisting snapshots","ns":"synchrony.data","level":"warn","throwable":"clojure.lang.ExceptionInfo: duplicate key {:type :duplicate-key, :key \"snapshot|0.confluence$content$123456789.2|/Synchrony-4dc1234-1234-12e3-1234-a1b23d40e00a/confluence-123456789\"}
...
Caused by: com.mysema.query.QueryException: Caught PSQLException for insert into \"SNAPSHOTS\" (\"key\", \"value\") values (?, ?)\r\n\tat com.mysema.query.sql.DefaultSQLExceptionTranslator.translate(DefaultSQLExceptionTranslator.java:38)
...
Caused by: org.postgresql.util.PSQLException: ERROR: duplicate key value violates unique constraint \"SNAPSHOTS_pkey\"\n  Detail: Key (key)=(snapshot|0.confluence$content$123456789.2|/Synchrony-1dc23456-1234-12e3-1234-a1b23d40e00a/confluence-123456789) already exists.


Aside from the exceptions above, the following entries indicating a network issue can also be observed:

{"synchrony":{"member-id":"12cb3a4a-1f23-1234-123f-1d2345de6a7f","ns":"synchrony.aleph-cluster","level":"info","status":"open","message":"synchrony.aleph-cluster [info] connection transition"},"message":"synchrony.aleph-cluster [info] connection transition"}
...
[10.1.123.456]:5701 [Confluence-Synchrony] [3.7.4] Connection[id=3, /10.1.123.456:56436->/10.1.123.457:5701, endpoint=[10.1.123.457]:5701, alive=false, type=MEMBER] closed. Reason: Connection closed by the other side
...
[10.1.123.456]:5701 [Confluence-Synchrony] [3.7.4] Connecting to /10.1.123.457:5701, timeout: 0, bind-any: true


A network issue between the Synchrony nodes can also cause clojure.lang.ExceptionInfo: no such sequence and left-merge revision not found errors in the Synchrony logs, as shown below:

{"timestamp":"2020-11-19T21:43:59,583Z","level":"WARN","thread":"async-dispatch-7","logger":"synchrony.sync.hub","message":{"synchrony":{"message":"synchrony.sync.hub [warn] error in hub process","entity":"/Synchrony-1c9a3ac6-dfa2-3fcd-9576-b201658ad52d/confluence-660842486","ns":"synchrony.sync.hub","throwable":"clojure.lang.ExceptionInfo: no such sequence {:message \"no such sequence\", :type :no-such-sequence, :from #synchrony.history.rev{:origin \"fjSNjMCkMBYvol8ejCnJcpM\", :sequence 1, :partition 2}, :to #synchrony.history.rev{:origin \"q5fSYsRl_dm1c4AxkTBmng\", :sequence 0, :partition 3}}\n\tat clojure.core$ex_info.invokeStatic(core.clj:4725)\n\tat ginga.core$throwable.invokeStatic(core.cljc:326)\n\tat ginga.core$throw_map.invokeStatic(core.cljc:331)\n\tat synchrony.sync.hub$init_in_state_from_rev$fn__42493.invoke(hub.clj:365)\n\tat synchrony.sync.hub.(take?)(hub.clj:396)\n\tat synchrony.sync.hub$init$fn__42595.invoke(hub.clj:386)\n\tat synchrony.sync.hub.(take?)(hub.clj:409)\n\tat synchrony.sync.hub$fn__42640$fn__42770.invoke(hub.clj:400)\n\tat synchrony.sync.hub.(take?)(hub.clj:762)\n\tat synchrony.sync.hub$process_message$fn__44225.invoke(hub.clj:754)\n\tat clojure.lang.AFn.run(AFn.java:22)\n\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)\n\tat java.lang.Thread.run(Thread.java:748)\n","level":"warn"}},"location":{"class":"synchrony.logging$eval69$fn__73","method":"invoke","line":"0"}}
{"timestamp":"2020-11-13T00:43:15,401Z","level":"WARN","thread":"async-dispatch-17","logger":"synchrony.http.entity-api","message":{"synchrony":{"message":"synchrony.http.entity-api [warn] Error in put-entity","entity":"/Synchrony-1c9a3ac6-dfa2-3fcd-9576-b201658ad52d/confluence-665492179","id":"Rr4jC6RLca-tywqIvGGPpQ","ns":"synchrony.http.entity-api","throwable":"clojure.lang.ExceptionInfo: left-merge revision not found {:type :server-error, :source :server}\n\tat clojure.core$ex_info.invokeStatic(core.clj:4725)\n\tat synchrony.sync.messages$ex_info_from_error_message.invokeStatic(messages.cljc:29)\n\tat synchrony.sync.connection$request_BANG_$fn__31266.invoke(connection.cljc:92)\n\tat synchrony.http.entity-api.(take?)(entity_api.clj:493)\n\tat synchrony.http.entity_api$content_reconciliation$fn__48540.invoke(entity_api.clj:472)\n\tat synchrony.http.entity-api.(take?)(entity_api.clj:536)\n\tat synchrony.http.entity_api$put_revision_handler$fn__48739.invoke(entity_api.clj:518)\n\tat clojure.lang.AFn.run(AFn.java:22)\n\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)\n\tat java.lang.Thread.run(Thread.java:748)\n","level":"warn"}},"location":{"class":"synchrony.logging$eval69$fn__73","method":"invoke","line":"0"}}


Restarting a Synchrony node reconfigures the cluster. If there are network issues, an error like the following appears in the Synchrony logs:

{"timestamp":"2020-12-02T00:50:04,681Z","level":"INFO","thread":"hz._hzInstance_1_Confluence-Synchrony.cached.thread-1","logger":"com.hazelcast.nio.tcp.InitConnectionTask","message":"[10.133.40.181]:5701 [Confluence-Synchrony] [3.7.4] Could not connect to: /20.131.13.10:5701. Reason: SocketException[Connection timed out to address /20.131.13.10:5701]","location":{"class":"com.hazelcast.logging.Log4jFactory$Log4jLogger","method":"log","line":"50"}}
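To see how often each of the symptoms above occurs, the log can be scanned for the message fragments quoted in this article. The following is a minimal sketch, not an Atlassian tool: the signature labels and the function names (count_signatures, unreachable_peers, scan_file) are hypothetical, and the matching is plain substring search on the text shown in the excerpts above.

```python
import re
from collections import Counter

# Message fragments taken from the log excerpts in this article.
# The dictionary keys are hypothetical labels chosen for this sketch.
SIGNATURES = {
    "duplicate_snapshot_key": "SNAPSHOTS_pkey",
    "no_such_sequence": "no such sequence",
    "left_merge_not_found": "left-merge revision not found",
    "connection_closed": "Connection closed by the other side",
    "could_not_connect": "Could not connect to",
}

# Matches the Hazelcast message "Could not connect to: /<ip>:<port>".
PEER_RE = re.compile(r"Could not connect to: /([0-9.]+):(\d+)")


def count_signatures(lines):
    """Count how many log lines contain each known fragment."""
    counts = Counter()
    for line in lines:
        for name, fragment in SIGNATURES.items():
            if fragment in line:
                counts[name] += 1
    return counts


def unreachable_peers(lines):
    """Collect the (ip, port) pairs Hazelcast reported it could not reach."""
    peers = set()
    for line in lines:
        for ip, port in PEER_RE.findall(line):
            peers.add((ip, int(port)))
    return peers


def scan_file(path):
    """Read a log file and return (signature counts, unreachable peers)."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        lines = handle.readlines()
    return count_signatures(lines), unreachable_peers(lines)
```

Calling scan_file with the path to an atlassian-synchrony.log file returns both results. A high count of the duplicate-key signature together with a non-empty set of unreachable peers matches the pattern described in this article.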


Cause

This problem occurs when Synchrony nodes cannot communicate with each other over the network.

Workaround

If no network issue is identified, restart all Synchrony nodes so that they attempt to re-establish communication with each other.

Resolution

Ensure that all Synchrony nodes can communicate with one another.

A good validation step is to telnet from one node to another on ports 5701 (the Synchrony Hazelcast port) and 25500 (the Synchrony Aleph port). If the connections cannot be established, the log entries above will keep filling your log files. The port requirements for Data Center are described in the Confluence Data Center documentation.
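Where telnet is not installed, the same check can be scripted. The sketch below is an illustration rather than an official utility: it opens a plain TCP connection to ports 5701 and 25500 on each host it is given. The function names (can_connect, check_nodes, report) are hypothetical, and it assumes the default ports named above are the ones your cluster uses.

```python
import socket

# Ports named in this article; the labels are only used for readable output.
SYNCHRONY_PORTS = {5701: "Hazelcast", 25500: "Aleph"}


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_nodes(hosts, ports=SYNCHRONY_PORTS, timeout=3.0):
    """Map each (host, port) pair to True (reachable) or False."""
    return {
        (host, port): can_connect(host, port, timeout)
        for host in hosts
        for port in ports
    }


def report(results, ports=SYNCHRONY_PORTS):
    """Format the results of check_nodes as one text line per pair."""
    lines = []
    for (host, port), ok in sorted(results.items()):
        label = ports.get(port, "unknown") if isinstance(ports, dict) else "unknown"
        status = "reachable" if ok else "UNREACHABLE"
        lines.append(f"{host}:{port} ({label}) {status}")
    return lines
```

Run it on each Synchrony node, passing the addresses of the other nodes, for example print("\n".join(report(check_nodes(["synchrony-node-2"])))), where synchrony-node-2 is a placeholder hostname. A successful connection only shows that the port is reachable at that moment, so repeat the check from every node in both directions.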



Last modified on Oct 18, 2023
