"Failed to recover from translog" error occurs while starting Elasticsearch bundled with Bitbucket Server
Platform notice: Server and Data Center only. This article only applies to Atlassian products on the Server and Data Center platforms.
Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.
*Except Fisheye and Crucible
Problem
Elasticsearch logs report the following error when starting up:
[2020-01-17T02:23:03,966][WARN ][o.e.i.e.Engine ] [bitbucket_bundled] [bitbucket-search][2]failed engine [failed to recover from translog]
org.elasticsearch.index.engine.EngineException: failed to recover from translog
at org.elasticsearch.index.engine.InternalEngine.recoverFromTranslogInternal(InternalEngine.java:445) ~[elasticsearch-6.5.3.jar:6.5.3]
....
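To see which index and shard the error refers to, the log can be filtered for this message. The following is a minimal sketch, not an official Atlassian or Elasticsearch tool: `list_failed_shards` is a hypothetical helper name, and the log file path is whatever your installation uses. It assumes the log lines follow the format shown above.

```shell
#!/usr/bin/env bash
# Sketch: list the index/shard pairs that failed translog recovery.
# list_failed_shards is a hypothetical helper name; pass the path to your
# Elasticsearch log file as the first argument.
list_failed_shards() {
  local log_file="$1"
  # Matches lines such as:
  #   ... [bitbucket-search][2]failed engine [failed to recover from translog]
  # and prints "<index> <shard>" once per distinct pair.
  grep -F 'failed engine [failed to recover from translog]' "$log_file" \
    | sed -n 's/.*\[\([^][]*\)\]\[\([0-9][0-9]*\)\] *failed engine.*/\1 \2/p' \
    | sort -u
}
```

For example, `list_failed_shards /path/to/elasticsearch.log` (the path here is a placeholder) prints one line per affected index name and shard number.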
Background and Diagnosis
Each Elasticsearch shard copy also writes operations into its transaction log, known as the translog. In the event of a crash, recent operations that have been acknowledged but not yet included in the last Lucene commit are recovered from the translog when the shard recovers. The data in the translog is only persisted to disk when the translog is fsynced and committed. In the event of a hardware failure, an operating system crash, a JVM crash, or a shard failure, any data written since the previous translog commit will be lost. In some cases, a bad drive or user error can also cause the translog to become corrupted. When Elasticsearch detects this corruption through mismatching checksums, it fails the shard and refuses to allocate that copy of the data to the node, recovering from a replica if one is available.
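If no replica is available to recover from, the elasticsearch-translog tool can be used to truncate the corrupted translog. This recovers the data that is already part of the shard, at the cost of losing the data that is currently contained only in the translog.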
Warning
The elasticsearch-translog tool should not be run while Elasticsearch is running, as you may permanently lose the documents that were contained only in the translog!
Resolution
- Stop Bitbucket Server and embedded Elasticsearch (see Warning above)
- Go to the <bitbucket-home>/shared/search/data/nodes/0/indices folder
- Run <bitbucket-install>/elasticsearch/bin/elasticsearch-translog truncate -d <bitbucket-home>/shared/search/data/nodes/0/indices/<index-folder>/translog/ for each of the indices. If the Elasticsearch error log shows the index folder as part of the error, you can truncate the translog of just that one folder.
- Restart Bitbucket Server
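When there are many index folders, it can help to generate the truncate commands for review before running anything. The sketch below only prints the commands; it does not execute them. `print_truncate_commands` is a hypothetical helper name, and the script assumes that every directory named `translog` beneath the indices folder should be truncated, which may not match your layout. Run the printed commands only after Bitbucket Server and the embedded Elasticsearch are stopped.

```shell
#!/usr/bin/env bash
# Sketch: print (do not run) one truncate command per translog directory.
# print_truncate_commands is a hypothetical helper; its two arguments are
# the <bitbucket-install> and <bitbucket-home> directories.
print_truncate_commands() {
  local install_dir="$1"
  local home_dir="$2"
  local indices_dir="$home_dir/shared/search/data/nodes/0/indices"
  local tool="$install_dir/elasticsearch/bin/elasticsearch-translog"

  if [ ! -d "$indices_dir" ]; then
    echo "indices folder not found: $indices_dir" >&2
    return 1
  fi

  # Assumption: every directory named "translog" below the indices folder
  # is a translog to truncate, at whatever depth it sits.
  find "$indices_dir" -type d -name translog | sort | while IFS= read -r dir; do
    # Quote both paths so the printed command survives spaces in them.
    printf '"%s" truncate -d "%s/"\n' "$tool" "$dir"
  done
}
```

Calling `print_truncate_commands <bitbucket-install> <bitbucket-home>` with your real directories prints the commands to standard output, so they can be checked against the index folder named in the error log before being run by hand.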