Data pipeline

For a detailed reference of the exported data's schema, see Data pipeline export schema.

Requirements

To trigger data exports through the REST API, you’ll need:

Considerations

There are a number of security and performance impacts you’ll need to consider before getting started.

Security

Export performance

Access the data pipeline

To access the data pipeline, go to … > Data pipeline.

Schedule regular exports

Check the status of an export

Cancel an export

Exclude projects from the export

Automatic data export cancellations

Configuring the data export

You can configure the format of the export data using the following system properties.

Use the data pipeline REST API

Output files

Location of exported files

Exported data is saved as separate CSV files. The files are saved to one of the following directories, depending on your setup:

  • <shared-home>/data-pipeline/export/<job-id> if you run Confluence in a cluster
  • <local-home>/data-pipeline/export/<job-id> if you run non-clustered Confluence
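
If you script against the export directory, the path layout above can be sketched as follows. This is a minimal illustration, not part of the product: the function name export_dir and the sample home directory and job ID are hypothetical. Pass your shared home for a cluster, or your local home otherwise.

```python
from pathlib import PurePosixPath


def export_dir(home: str, job_id: str) -> PurePosixPath:
    """Build <home>/data-pipeline/export/<job-id>.

    `home` is the shared home in a cluster, or the local home
    in a non-clustered installation.
    """
    return PurePosixPath(home) / "data-pipeline" / "export" / str(job_id)


if __name__ == "__main__":
    # Hypothetical home directory and job ID, for illustration only.
    print(export_dir("/var/atlassian/shared-home", "42"))
```

The sketch only builds the path string; it does not check that the directory exists.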

Within the <job-id> directory you will see the following files:

  • users_job<job_id>_<schema_version>_<timestamp>.csv
  • spaces_job<job_id>_<schema_version>_<timestamp>.csv
  • pages_job<job_id>_<schema_version>_<timestamp>.csv
  • comments_job<job_id>_<schema_version>_<timestamp>.csv
  • analytics_events_job<job_id>_<schema_version>_<timestamp>.csv
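
If you need to route each file to a different import step, you can split the file name into its parts. The sketch below is one possible approach, not an official parser. It assumes that the job ID and schema version contain no underscores, and it treats everything between the schema version and .csv as the timestamp, because the exact formats are not described here. The name parse_export_name is hypothetical.

```python
import re
from typing import Optional

# Assumption: <job_id> and <schema_version> contain no underscores.
# The timestamp is taken as whatever remains before ".csv".
EXPORT_FILE = re.compile(
    r"^(?P<entity>users|spaces|pages|comments|analytics_events)"
    r"_job(?P<job_id>[^_]+)"
    r"_(?P<schema_version>[^_]+)"
    r"_(?P<timestamp>.+)\.csv$"
)


def parse_export_name(name: str) -> Optional[dict]:
    """Return the parts of an export file name, or None if it does not match."""
    match = EXPORT_FILE.match(name)
    return match.groupdict() if match else None
```

Files that do not follow the naming pattern return None, so you can skip anything else that ends up in the directory.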

To load and transform the data in these files, you'll need to understand the schema. See Data pipeline export schema.
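
Before you build a full import, it can help to read a file and compare its columns with the schema reference. The following is a minimal sketch using only the Python standard library. It assumes the files are UTF-8 encoded and start with a header row; check both against the schema reference. The function name read_export is hypothetical.

```python
import csv
from pathlib import Path
from typing import Iterator


def read_export(path: Path) -> Iterator[dict]:
    """Yield each row of an exported CSV file as a dict keyed by column name.

    Assumes UTF-8 encoding and a header row in the first line.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        yield from csv.DictReader(handle)
```

Rows are yielded one at a time, so large export files do not need to fit in memory.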

Set a custom export path

Sample Spark and Hadoop import configurations

If you have an existing Spark or Hadoop instance, use the following references to configure how to import your data for further transformation.


Spark / Databricks

Hadoop

Troubleshooting issues with data exports

Last modified on Dec 7, 2021
