Data pipeline export schema
This page describes the structure and data schema of the Confluence data export files.
To learn more about how to set up and configure your data pipeline, see Data pipeline.
Output file format and structure
- Each file has a header. This includes files from exports that resulted in no data.
- New lines are separated by CRLF characters (`\r\n`).
- Fields containing line breaks (CRLF), double quotes, or commas are enclosed in double quotes.
- A double quote appearing inside a field is escaped by preceding it with another double quote. For example: `"aaa","b""bb","ccc"`.
- Fields with no data (null values) are represented in the CSV export by two consecutive delimiters (as in `,,`).
- Embedded line breaks are escaped by default and printed as `\n`.
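
As an illustration of the rules above, the following is a minimal sketch of reading an export file with Python's standard `csv` module. The sample rows, the `read_export` and `unescape_breaks` helper names, and the choice to treat every empty field as null are assumptions made for this example and are not part of the export itself.

```python
import csv
import io

# Hypothetical sample in the format described above; the values are invented.
SAMPLE = (
    'user_id,user_name,user_fullname\r\n'
    '"aaa","b""bb","ccc"\r\n'
    'u2,,"line one\\nline two"\r\n'
)

def read_export(text):
    """Parse export text into a list of dicts, mapping empty fields to None."""
    # newline="" leaves the CRLF terminators for the csv module to handle.
    reader = csv.DictReader(io.StringIO(text, newline=""))
    rows = []
    for row in reader:
        rows.append({key: (value if value != "" else None)
                     for key, value in row.items()})
    return rows

def unescape_breaks(value):
    """Turn the literal two-character sequence \\n back into a line break.

    Assumption: a backslash followed by 'n' in the original content cannot
    be told apart from an escaped line break, so apply this with care.
    """
    return value.replace("\\n", "\n") if value is not None else None

rows = read_export(SAMPLE)
```

When reading from disk rather than from a string, opening the file with `open(path, newline="")` gives the `csv` module the same untranslated line endings.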
Fields are available in all schema versions, unless specifically noted below.
Users file
Field | Description |
---|---|
instance_url | Type: URL Description: Base URL of the current instance. Example: |
user_id | Type: String Description: ID of the user Example: |
user_name | Type: String Description: User name of the user. Example: |
user_fullname | Type: String Description: Full name of the user. Example: |
user_email | Type: Email Description: Email address of the user Example: |
Spaces file
Field | Description |
---|---|
space_key | Type: String Description: Unique identifier that forms part of the URL for that space Example: |
instance_url | Type: String Description: Base URL of the current instance Example: |
space_url | Type: URL Description: The space URL Example: |
homepage_url | Type: URL Description: The space's home page URL Example: |
space_name | Type: String Description: Title of the space Example: |
space_type | Type: String Description: Whether the space is a global or personal space Example: |
space_status | Type: String Description: Whether the status of the space is current or archived Example: |
creator_id | Type: User Description: ID of the user who created the space Example: |
last_modifier_id | Type: User Description: ID of the user who last modified the space Example: |
created_date | Type: Time Description: Space creation timestamp Example: |
updated_date | Type: Time Description: Last modification timestamp Example: |
Pages file
Field | Description |
---|---|
page_id | Type: Number Description: Unique ID of the page |
instance_url | Type: String Description: Base URL of the current instance Example: |
space_key | Type: String Description: Space key of the space the page exists in |
page_url | Type: String Description: URL of the page Example: |
page_type | Type: String Description: Whether the entity is a page or a blog post Example: |
page_title | Type: String Description: Title of the page |
page_status | Type: String Description: Status of the page. The only value is current; this does not indicate whether the page is in a space that has been archived. |
page_content | Type: String Description: Content of the page in Confluence storage format (limited to 10,000 characters) Example: |
page_parent_id | Type: Number Description: ID of the current page's direct parent |
labels | Type: String Description: Comma separated list of labels of the page Example: |
page_version | Type: String Description: Version number of the latest version of the page Example: 3 |
creator_id | Type: Number Description: ID of the user who created the page Example: |
last_modifier_id | Type: User Description: ID of the user who last updated the page Example: |
created_date | Type: Time Description: Creation timestamp Example: |
updated_date | Type: Time Description: Last modification timestamp Example: |
last_update_description | Type: String Description: Version comment entered when the page was last updated (limited to 2,000 characters) |
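
Two fields in the pages file usually need post-processing: `labels` is a comma-separated list, and `page_content` is capped at 10,000 characters. The sketch below shows one way to handle both for a row that has already been parsed into a dict. The helper names are hypothetical, and treating any content at or above the limit as "possibly truncated" is an assumption, since the export does not flag truncation explicitly.

```python
PAGE_CONTENT_LIMIT = 10_000  # limit stated in the page_content description

def split_labels(page):
    """Return the page's labels as a list, ignoring empty entries."""
    raw = page.get("labels") or ""
    return [label.strip() for label in raw.split(",") if label.strip()]

def possibly_truncated(page):
    """Heuristic: content at or above the limit may have been cut off."""
    content = page.get("page_content") or ""
    return len(content) >= PAGE_CONTENT_LIMIT
```

A page whose content is exactly at the limit may or may not be complete, so this check is best used to shortlist pages for a closer look rather than as a definitive test.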
Comments file
Field | Description |
---|---|
comment_id | Type: Number Description: Unique ID of the comment |
instance_url | Type: String Description: Base URL of the current instance Example: |
comment_url | Type: String Description: Full URL of the comment |
page_id | Type: Number Description: Unique ID of the page which contains the comment |
parent_comment_id | Type: Number Description: If the comment is a reply, this is the ID of the parent comment (empty for top level comments) |
comment_content | Type: String Description: Content of the comment in Confluence storage format (limited to 2,000 characters) Example: |
creator_id | Type: User Description: ID of the user who created the comment Example: |
last_modifier_id | Type: User Description: ID of the user who last modified the comment Example: |
created_date | Type: Time Description: Creation timestamp Example: |
updated_date | Type: Time Description: Last modification timestamp Example: |
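
Because replies point to their parent through `parent_comment_id`, comment threads can be rebuilt from the flat file. The following is a minimal sketch, assuming the rows are already parsed into dicts and that an empty or missing `parent_comment_id` marks a top-level comment. The function names are hypothetical.

```python
from collections import defaultdict

def group_replies(comments):
    """Map each parent comment ID to the IDs of its direct replies.

    Top-level comments are grouped under the key None.
    """
    replies = defaultdict(list)
    for comment in comments:
        parent_id = comment.get("parent_comment_id") or None
        replies[parent_id].append(comment["comment_id"])
    return replies

def thread_ids(replies, root_id):
    """Return root_id followed by all of its descendants, depth first."""
    ordered = [root_id]
    for child_id in replies.get(root_id, []):
        ordered.extend(thread_ids(replies, child_id))
    return ordered
```

Calling `thread_ids(replies, comment_id)` for each ID in `replies[None]` yields one ordered list per top-level comment.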
Analytics events file
Field | Description |
---|---|
instance_url | Type: String Description: Base URL of the current instance Example: |
event_id | Type: Number Description: Unique ID of the analytics event |
event_name | Type: String Description: Name of the analytics event. Events include page_viewed, page_created, page_updated, blog_viewed, blog_created, blog_updated, comment_created, attachment_viewed, attachment_created. Example: |
created_date | Type: Time Description: Creation timestamp Example: |
event_author_id | Type: User Description: ID of the user who performed the action that triggered the event Example: |
event_space_key | Type: String Description: Space key of the space the event was triggered in or affects (affected object) Example: |
event_container_id | Type: Number Description: ID of the containing entity. For pages, this is the page ID; for attachments and comments, it's the page ID of the page the attachment or comment appears on. |
event_content_id | Type: Number Description: ID of the entity. For pages this is the page ID, for attachments, it is the attachment ID, and for comments it's the comment ID. |
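
Since the meaning of `event_content_id` depends on the kind of event, it can help to classify events by name before joining them to other files. The sketch below derives the entity kind from the `event_name` prefix and counts views per page using `event_container_id`. The helper names are hypothetical, and treating `blog_*` events like page events is an assumption based on blog posts appearing in the pages file.

```python
from collections import Counter

# Maps the event_name prefix to the kind of entity event_content_id refers to.
KIND_BY_PREFIX = {
    "page": "page",
    "blog": "page",  # assumption: blog posts are rows in the pages file
    "comment": "comment",
    "attachment": "attachment",
}

def content_kind(event_name):
    """Return the entity kind for an event name such as 'page_viewed'."""
    prefix = event_name.split("_", 1)[0]
    return KIND_BY_PREFIX.get(prefix, "unknown")

def views_per_page(events):
    """Count page and blog view events, keyed by event_container_id."""
    counts = Counter()
    for event in events:
        if event["event_name"] in ("page_viewed", "blog_viewed"):
            counts[event["event_container_id"]] += 1
    return counts
```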