Configuring OpenSearch for Confluence

This page is for the OpenSearch Early Access Program (EAP).

OpenSearch for Confluence Data Center 8.9 was released as part of an early access program for our partners. It is not suitable for production environments because it’s missing some functionality.

If you’d like to use the full functionality of OpenSearch in a production environment, we recommend you upgrade to Confluence Data Center 9.0 or later, and that you run it on a staging environment first.


As Confluence instances grow in size and scale, the default search engine, Lucene, may be slower to index and return search results. To address this, Confluence Data Center will soon offer an alternative search engine as an opt-in feature — OpenSearch. OpenSearch can take advantage of multi-node instances to manage process-intensive indexing. In the user interface, the search experience will remain the same for users.

Requirements

  • Confluence 8.9 or above (non-production environments)

  • OpenSearch 2.9 (this is the only supported version).

Set up OpenSearch

To set up OpenSearch:

  1. Provision an OpenSearch cluster.

  2. Configure Confluence.

  3. Migrate your Lucene index.

All these actions are covered on this page.

Provision an OpenSearch cluster

You don’t need to worry about indexes when you provision your OpenSearch cluster. Confluence will automatically create the required indexes.

  1. Provision an OpenSearch 2.9 cluster, either on premises or as a managed service (for example, Amazon OpenSearch Service). Refer to the OpenSearch documentation for instructions on setting up a cluster.

  2. Refer to Securing an OpenSearch cluster for the recommended security configuration.
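
Because 2.9 is the only supported version, it can be worth confirming what the new cluster reports before you connect Confluence to it. The following Python sketch reads the version from the JSON that OpenSearch returns at its root endpoint. The function names are illustrative, and fetch_cluster_info assumes the endpoint is reachable without credentials; a secured cluster will also need authentication.

```python
import json
import urllib.request

SUPPORTED_VERSION = "2.9"  # the only OpenSearch version this page lists as supported


def is_supported(cluster_info: dict) -> bool:
    """Return True if the root-endpoint response reports an OpenSearch 2.9.x version."""
    number = cluster_info.get("version", {}).get("number", "")
    # Compare major and minor parts so "2.9.0" passes but "2.19.1" does not.
    return number.split(".")[:2] == SUPPORTED_VERSION.split(".")


def fetch_cluster_info(base_url: str, timeout: float = 10.0) -> dict:
    """GET the cluster root endpoint, which returns name and version details as JSON."""
    with urllib.request.urlopen(base_url.rstrip("/") + "/", timeout=timeout) as resp:
        return json.load(resp)


# Example with a hand-written response fragment (no network call is made here).
print(is_supported({"version": {"distribution": "opensearch", "number": "2.9.0"}}))
```

To check a live cluster, pass the result of fetch_cluster_info (with your own cluster URL) to is_supported.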

Configure Confluence

You can configure Confluence with properties either in the confluence.cfg.xml file or as system properties. If a property is set in both places, the system property takes precedence.
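
The precedence rule can be illustrated with a short Python sketch. It is a model of the rule as stated above, not Confluence's own code. The sample XML assumes that confluence.cfg.xml stores each setting as a property element with a name attribute, and the URL is a placeholder.

```python
import xml.etree.ElementTree as ET

# Assumed layout of confluence.cfg.xml: <property name="..."> elements holding values.
SAMPLE_CFG = """
<confluence-configuration>
  <properties>
    <property name="search.platform">lucene</property>
    <property name="opensearch.http.url">https://opensearch.example.com</property>
  </properties>
</confluence-configuration>
"""


def read_cfg_properties(xml_text: str) -> dict:
    """Collect name/value pairs from every <property> element in the XML text."""
    root = ET.fromstring(xml_text.strip())
    return {el.get("name"): (el.text or "").strip() for el in root.iter("property")}


def effective_properties(cfg_props: dict, system_props: dict) -> dict:
    """Merge both sources; a system property overrides the same key from the file."""
    merged = dict(cfg_props)
    merged.update(system_props)
    return merged


cfg = read_cfg_properties(SAMPLE_CFG)
effective = effective_properties(cfg, {"search.platform": "opensearch"})
print(effective["search.platform"])  # prints: opensearch
```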

The bare minimum to configure is:

  • Tell Confluence that you’re using OpenSearch.

  • Provide Confluence with the OpenSearch cluster location and authentication details.

These minimum configuration options are covered in the tables below. For the full list of system properties, refer to Recognized System Properties.

Tell Confluence to use OpenSearch

| Property | Value | Example |
|---|---|---|
| search.platform | Either opensearch or lucene. The default is lucene. | opensearch |

Provide cluster location and authentication details

There are two ways to authenticate to your OpenSearch cluster — basic auth (username/password), and IAM. These are mutually exclusive.

Option 1: authenticate with basic authentication

| Property | Value | Example |
|---|---|---|
| opensearch.http.url | The HTTP URL of the OpenSearch cluster. | https://vpc-hluk-es3-n4nkxafilj53hlzsq4kvp36k7e.ap-southeast-2.es.amazonaws.com |
| opensearch.username | OpenSearch username. | admin |
| opensearch.password | OpenSearch password. | my-password |
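
If you choose system properties, each setting is passed to the JVM as a -D argument. The Python sketch below renders the basic authentication settings in that form. The property names come from the tables on this page, the URL is a placeholder, and the helper name is made up for illustration. Where you add the resulting arguments depends on how your Confluence instance is started.

```python
import shlex

# Property names are taken from the tables on this page; values are placeholders.
basic_auth_props = {
    "search.platform": "opensearch",
    "opensearch.http.url": "https://opensearch.example.com",
    "opensearch.username": "admin",
    "opensearch.password": "my-password",
}


def as_jvm_flags(props: dict) -> list:
    """Render each property as a -Dname=value argument, shell-quoting the value."""
    return [f"-D{name}={shlex.quote(value)}" for name, value in props.items()]


print(" ".join(as_jvm_flags(basic_auth_props)))
```

Keep in mind that arguments passed on a command line can be visible to other users of the host, so you may prefer to keep the password in confluence.cfg.xml.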

Option 2: authenticate with IAM (AWS only)

| Property | Value | Example |
|---|---|---|
| opensearch.http.url | The host of the OpenSearch domain in AWS, without the protocol (http(s)://). | vpc-hluk-es3-n4nkxafilj53hlzsq4kvp36k7e.ap-southeast-2.es.amazonaws.com |
| opensearch.aws.region | The AWS region of the OpenSearch instance. | ap-southeast-2 |
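
Because the two options are mutually exclusive and expect the cluster location in different forms, a quick check of your settings before you restart Confluence can catch mix-ups. The Python sketch below encodes only the rules stated in the two tables above. The function name and messages are illustrative, and this is not a validation that Confluence itself performs.

```python
BASIC_KEYS = {"opensearch.username", "opensearch.password"}
IAM_KEYS = {"opensearch.aws.region"}


def check_auth_settings(props: dict) -> list:
    """Return a list of problems found in the OpenSearch connection properties."""
    problems = []
    uses_basic = bool(props.keys() & BASIC_KEYS)
    uses_iam = bool(props.keys() & IAM_KEYS)
    url = props.get("opensearch.http.url", "")
    has_protocol = url.startswith(("http://", "https://"))

    if uses_basic and uses_iam:
        problems.append("basic authentication and IAM are mutually exclusive")
    if uses_basic and not props.keys() >= BASIC_KEYS:
        problems.append("basic authentication needs both a username and a password")
    if not url:
        problems.append("opensearch.http.url is not set")
    elif uses_iam and not uses_basic and has_protocol:
        problems.append("with IAM, give the host without http(s)://")
    elif uses_basic and not uses_iam and not has_protocol:
        problems.append("with basic authentication, give the full HTTP URL")
    return problems


iam_props = {
    "opensearch.http.url": "https://search.example.com",  # placeholder host
    "opensearch.aws.region": "ap-southeast-2",
}
print(check_auth_settings(iam_props))
```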


Check your configuration

You can verify that Confluence is configured with OpenSearch on the System Information page. Check that the Search platform displays “OpenSearch”.

Migrate the Lucene index

Once the Confluence instance is reconfigured from Lucene to OpenSearch, you’ll need to repopulate the index with the existing data. You can use the Content Indexing admin UI to rebuild your index.

The simplest approach is to rebuild your index as soon as you’ve reconfigured your instance to use OpenSearch. However, search will be temporarily unavailable, or will return incomplete results, until the index is fully rebuilt. This is the only approach available for instances on a single node.

If your instance runs on a clustered configuration, you can avoid any search downtime by following the steps below.

Avoid downtime during index migration

If your Confluence instance is running in a clustered configuration, you can keep search available during the migration by rebuilding the index from a separate node before you switch the rest of the cluster.

Users can still make changes while you’re rebuilding the index on OpenSearch. Changes will be written to a journal which will be replayed once reindexing is complete, ensuring the latest changes are available.


  1. From your Confluence cluster running Lucene, provision a “dark node” (that is, a Confluence node that does not serve regular user traffic from the load balancer).

  2. Configure the dark node to use OpenSearch, connected to the target OpenSearch cluster.

  3. Log in to this dark node and verify on the System Information page that it’s running on OpenSearch.

  4. Start the index rebuild process in the Content Indexing admin UI.
    This will repopulate the OpenSearch index with the existing data, and can take some time.

  5. Once the OpenSearch index has been completely rebuilt, verify that you can search for your documents as expected on this node.

  6. Configure the remaining cluster nodes from Lucene to OpenSearch.

  7. Deprovision the dark node, if needed.
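
In addition to searching from the dark node in step 5, you might want to glance at the state of the indexes on the OpenSearch side. The Python sketch below takes the rows returned by the OpenSearch `_cat/indices?format=json` endpoint and lists indexes that are red or contain no documents. The index names in the sample are hypothetical, since Confluence creates its own indexes, and an empty index is not necessarily a fault. Treat the output as a prompt to investigate rather than a pass or fail result.

```python
def indexes_needing_attention(cat_rows: list) -> list:
    """Return names of indexes whose health is red or whose document count is zero."""
    flagged = []
    for row in cat_rows:
        # _cat APIs return counts as strings; closed indexes may report None.
        docs = int(row.get("docs.count") or 0)
        if row.get("health") == "red" or docs == 0:
            flagged.append(row["index"])
    return flagged


# Hand-written sample in the shape of a _cat/indices?format=json response.
sample_rows = [
    {"index": "example-content", "health": "green", "docs.count": "182340"},
    {"index": "example-changes", "health": "yellow", "docs.count": "0"},
    {"index": "example-other", "health": "red", "docs.count": "57"},
]
print(indexes_needing_attention(sample_rows))
```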

Notable changes

Switching from Lucene to OpenSearch introduces some key differences that might require changes to your existing processes.

Production backup strategies

Currently, you might take backups of your Lucene index in the home directory, which will avoid the need for a full reindex when restoring. After switching to OpenSearch, you will need to employ a different strategy using snapshots. Refer to Production backup strategy for more details.
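
As a rough illustration of what a snapshot involves, the Python sketch below builds the request that asks OpenSearch to create a snapshot through its snapshot API. It assumes a snapshot repository has already been registered on the cluster. The repository name, snapshot name, and URL are hypothetical. The request is only constructed, not sent, and this is not a substitute for the documented backup strategy.

```python
import urllib.request
from urllib.parse import quote


def build_snapshot_request(base_url: str, repository: str, snapshot: str) -> urllib.request.Request:
    """Build (but do not send) a PUT _snapshot/<repository>/<snapshot> request."""
    url = "{}/_snapshot/{}/{}".format(
        base_url.rstrip("/"), quote(repository, safe=""), quote(snapshot, safe="")
    )
    return urllib.request.Request(url, method="PUT")


# Hypothetical names; replace them with your own repository and snapshot naming scheme.
request = build_snapshot_request(
    "https://opensearch.example.com", "confluence-backups", "snapshot-2024-12-09"
)
print(request.get_method(), request.full_url)
```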

Downtime during reindexing (during EAP phase)

Note that the situations described in this section are specific to the current state of this EAP feature in Confluence 8.9. We are working on improvements in this area which will enable zero-downtime reindexing with OpenSearch when it’s released publicly.

With Lucene, the index is local to each node, so you can rebuild your index without any downtime. However, OpenSearch works differently. Since OpenSearch indexes are shared across all Confluence nodes, rebuilding your index will cause a disruption to search functionality across the cluster.

Additionally, because we disable index refresh during reindexing to optimize for speed, an abrupt interruption during the process might leave your OpenSearch indexes in a non-functioning state. To restore your indexes, either redo the reindexing process, or restore them from a snapshot.

Search window limit

OpenSearch has a default result window limit of 10,000. This is the maximum number of results a search request can reach (the starting offset plus the page size), and the default is set to limit memory usage. If your search request exceeds this limit (for example, if an existing app is programmed to search for more than that limit), you will get the following error:


Result window is too large, from + size must be less than or equal to: [10000] but was [10010]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level setting.
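
The arithmetic behind this error is simple: the starting offset (from) plus the page size (size) must not exceed the limit. The short Python sketch below reproduces the check for the values in the error message. The function name is illustrative.

```python
DEFAULT_MAX_RESULT_WINDOW = 10_000  # OpenSearch default for index.max_result_window


def fits_result_window(offset: int, size: int, limit: int = DEFAULT_MAX_RESULT_WINDOW) -> bool:
    """Return True if a request for `size` results starting at `offset` is within the limit."""
    return offset + size <= limit


print(fits_result_window(9_990, 10))   # True: 9990 + 10 equals the limit
print(fits_result_window(10_000, 10))  # False: 10010 exceeds the limit, as in the error
```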


If this limitation is causing a problem in your environment, contact us or your app developers. There are memory-efficient ways to work with a large number of search results that developers can use to avoid this limitation.


Alternatively, as an immediate workaround, you can increase the index.max_result_window setting on your OpenSearch index. Search requests take heap memory and time proportional to the result window, so make sure your OpenSearch data node has enough memory. Consult the OpenSearch documentation for details about this index setting.
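
For reference, the setting is changed through the OpenSearch index settings API. The Python sketch below builds such a request without sending it. The index name, URL, and new limit are placeholders, and a secured cluster will also need credentials on the request.

```python
import json
import urllib.request


def build_window_update(base_url: str, index: str, new_limit: int) -> urllib.request.Request:
    """Build (but do not send) a PUT <index>/_settings request raising max_result_window."""
    body = json.dumps({"index": {"max_result_window": new_limit}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Placeholder values; use the name of the index that Confluence created on your cluster.
request = build_window_update("https://opensearch.example.com", "example-index", 20_000)
print(request.get_method(), request.full_url)
```

Raise the limit only as far as you need, because larger result windows cost more heap memory on the data nodes.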

Functionality we're still working on

OpenSearch integration is an Early Access Program (EAP) feature in Confluence 8.9. This gives customers and vendors early access to preview the feature in their environments and test their apps before we release it publicly to all customers.

The following functionality is missing from this EAP version and will be included in the public release.

  • Non-English indexing languages: English is the only supported language.

  • Health checks that notify admins about issues on their OpenSearch cluster.

  • Zero-downtime reindexing: when you rebuild your index on OpenSearch, search is disrupted and your documents won’t be available in search results until reindexing completes. In the public release, the reindexing process will use redundancy techniques to avoid this disruption.

Last modified on Dec 9, 2024
