Data pipeline
Requirements
To trigger data exports through the REST API, you’ll need:
- A valid Confluence Data Center license
- System Administrator global permissions
Considerations
There are a number of security and performance impacts you’ll need to consider before getting started.
Security
Export performance
Access the data pipeline
To access the data pipeline, go to Administration > General Configuration > Data pipeline.
Schedule regular exports
Check the status of an export
Cancel an export
Exclude spaces from the export
Automatic data export cancellations
Configuring the data export
You can configure the format of the export data using the following system properties.
Use the data pipeline REST API
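Below is a minimal sketch of driving an export over REST using Python's requests library. It assumes the /rest/datapipeline/latest/export endpoint described in the data pipeline REST API reference; the base URL, credentials, and fromDate value are placeholders you'll need to replace.

```python
# Minimal sketch: trigger, check, and cancel a data pipeline export.
# Assumes the /rest/datapipeline/latest/export endpoint; the base URL
# and credentials below are placeholders.
import requests

BASE_URL = "https://confluence.example.com"  # placeholder
AUTH = ("admin", "admin-password")           # needs System Administrator permission
EXPORT_URL = f"{BASE_URL}/rest/datapipeline/latest/export"

# Trigger an export of data created or updated after fromDate
resp = requests.post(
    EXPORT_URL,
    params={"fromDate": "2023-01-01T00:00:00Z"},
    auth=AUTH,
)
resp.raise_for_status()
print(resp.status_code, resp.text)

# Check the status of the current or most recent export
status = requests.get(EXPORT_URL, auth=AUTH)
print(status.json())

# Cancel a running export
requests.delete(EXPORT_URL, auth=AUTH).raise_for_status()
```

The same endpoint handles all three operations: POST triggers an export, GET reports its status, and DELETE cancels it.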
Output files
Location of exported files
Exported data is saved as separate CSV files. The files are saved to the following directory:
- <shared-home>/data-pipeline/export/<job-id> if you run Confluence in a cluster
- <local-home>/data-pipeline/export/<job-id> if you are using non-clustered Confluence
Within the <job-id> directory you will see the following files:
- users_job<job_id>_<schema_version>_<timestamp>.csv
- spaces_job<job_id>_<schema_version>_<timestamp>.csv
- pages_job<job_id>_<schema_version>_<timestamp>.csv
- comments_job<job_id>_<schema_version>_<timestamp>.csv
- analytics_events_job<job_id>_<schema_version>_<timestamp>.csv
To load and transform the data in these files, you'll need to understand the schema. See Data pipeline export schema.
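As a quick way to inspect what each file contains, here's a minimal sketch that reads the first few rows of each CSV with pandas. The export directory path and job id are placeholders following the layout above; the actual column names depend on the schema version, so treat Data pipeline export schema as the source of truth.

```python
# Minimal sketch: peek at the exported CSVs with pandas.
# The directory below is a placeholder for <shared-home>/data-pipeline/export/<job-id>.
from pathlib import Path
import pandas as pd

export_dir = Path("/var/confluence/shared-home/data-pipeline/export/1001")  # placeholder

for csv_file in sorted(export_dir.glob("*.csv")):
    df = pd.read_csv(csv_file, nrows=5)
    print(csv_file.name, "->", list(df.columns))
```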
Set a custom export path
Sample Spark and Hadoop import configurations
If you have an existing Spark or Hadoop instance, use the following reference configurations to import your data for further transformation.
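For Spark specifically, a minimal PySpark sketch for reading the exported CSVs might look like the following. The file path and job id are placeholders, and the reader options (header row, quoted multi-line fields) are assumptions about the CSV format that you should verify against your own export.

```python
# Minimal PySpark sketch: import exported CSVs for transformation.
# The path and job id are placeholders; adjust to your export directory.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("confluence-data-pipeline-import").getOrCreate()

pages = (
    spark.read
    .option("header", "true")      # assumes the CSVs include a header row
    .option("multiLine", "true")   # tolerate embedded line breaks in quoted fields
    .option("escape", '"')
    .csv("/data/export/1001/pages_job1001_*.csv")  # placeholder path
)

pages.printSchema()
pages.createOrReplaceTempView("pages")
spark.sql("SELECT COUNT(*) FROM pages").show()
```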
Troubleshooting issues with data exports