Confluence Space import via scripting tools

Platform notice: Server and Data Center only. This article only applies to Atlassian products on the Server and Data Center platforms.

Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.

*Except Fisheye and Crucible

    

This guide is for informational purposes and is not eligible for support from Atlassian.  If you have any questions about the information on this page, please reach out to our Atlassian Community for help. 

Summary

Currently there are no REST API methods in Confluence for Space export or import.

There is an open feature request to add a REST API endpoint for generating space exports, tracked as CONFSERVER-40457. Vote for, comment on, and watch that feature request if you are interested in it.

If an admin wishes to make these operations less manual and automate them to a certain extent, they can rely on Confluence's XML-RPC and SOAP APIs. However, these APIs have been deprecated since Confluence 5.5.

Scripting

Another way to accomplish this is to script these tasks. To do so, we'll need to rely on some cookies and tokens to emulate the UI activity from the command line.

Essentially, we'll need to capture the atlassian-token value and the JSESSIONID cookie when authenticating with Confluence. Both can be found in a browser's Developer Tools.

You can retrieve these with a script (shell or Python examples below) to automate the space import operation.
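
As a rough illustration of the token capture, the sketch below pulls the atlassian-token value out of a page's HTML using only the Python standard library. The helper name `extract_atl_token` and the sample markup are assumptions made for this example; only the meta tag id matches what the scripts below search for.

```python
from html.parser import HTMLParser
from typing import Optional


class _TokenParser(HTMLParser):
    """Collects the content attribute of <meta id="atlassian-token">."""

    def __init__(self) -> None:
        super().__init__()
        self.token: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and attributes.get("id") == "atlassian-token":
            self.token = attributes.get("content")


def extract_atl_token(html: str) -> Optional[str]:
    """Return the token value, or None when the meta tag is absent."""
    parser = _TokenParser()
    parser.feed(html)
    return parser.token


# Hypothetical page fragment; a real Confluence page is much larger.
SAMPLE_HTML = (
    '<html><head>'
    '<meta id="atlassian-token" name="atlassian-token" content="abc123">'
    '</head></html>'
)

print(extract_atl_token(SAMPLE_HTML))
```

Returning None when the tag is missing lets the caller treat a failed token retrieval as its own error case, separate from an authentication failure.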


Environment

The space import scripts below were tested on:

Confluence DC 7.4.3 (server)

Linux (CentOS 7) and macOS (Catalina) (clients)

Requirements

The Python version runs only on Python 3 and requires four modules: requests, bs4, html5lib and lxml.

The shell version requires GNU Bash 3.2 or later.

The space backup file must already reside in the <confluence-shared-home>/restore directory. The scripts will not upload the file to that location.
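
If the script runs on a host that can read the shared home (an assumption; the scripts themselves only talk to Confluence over HTTP), a pre-flight check along these lines could catch a missing backup file before any request is sent. The function name and the example paths are hypothetical.

```python
from pathlib import Path


def backup_file_in_restore_dir(shared_home: str, file_name: str) -> bool:
    """Return True if <shared_home>/restore/<file_name> is a regular file."""
    # Only a bare file name is expected, so reject anything with path components.
    if Path(file_name).name != file_name:
        return False
    return (Path(shared_home) / "restore" / file_name).is_file()


# Placeholder values for illustration only.
print(backup_file_in_restore_dir("/path/to/confluence-shared-home", "space-export.zip"))
```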


Solution

Space import with Bash/Python

The Bash version uses curl to make the HTTP requests and store the token and cookies required. Then it'll send the payload to Confluence's endpoints.

importSpace.sh
#!/bin/bash

ADMIN_USERNAME="<admin username>"
ADMIN_PASSWORD="<admin password>"
BUILD_INDEX=true # set to true to trigger a reindex immediately after the space import, false otherwise
CONFLUENCE_BASE_URL="<confluence base URL>"
SPACE_FILE_NAME="<space export backup file>" # file name only, without any path (the file must reside in the <confluence-shared-home>/restore directory)
TASK_ID_FILE=/tmp/space_import_task.session
CONFLUENCE_COOKIE_FILE=/tmp/space_import.cookie
RESPONSE_FILE=/tmp/response.txt
STATUS_REFRESH_INTERVAL=0.5 # refresh rate (in seconds) for the space import status request

overwrite() { 
    echo -e "\r\033[1A\033[0K$@"; 
}

get_atl_token_session_cookie() {
    # Removing temporary token/cookie and task ID files from previous execution of the script
    rm -f ${TASK_ID_FILE} ${CONFLUENCE_COOKIE_FILE};

    echo "[ $(date) ] INFO Authenticating with Confluence ..."
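    # Quote the credentials so special characters in the password are not split or expanded by the shell
    AUTH_HTTP_RESPONSE=$(curl -o ${RESPONSE_FILE} -w "%{http_code}" -s -c ${CONFLUENCE_COOKIE_FILE} -u "${ADMIN_USERNAME}:${ADMIN_PASSWORD}" "${CONFLUENCE_BASE_URL}/users/viewmyprofile.action")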

    if [ $? -ne 0 ] || [ ${AUTH_HTTP_RESPONSE} -ne 200 ] ; then
        echo "[ $(date) ] ERROR Authentication failed: [HTTP response: ${AUTH_HTTP_RESPONSE}]. Check [${RESPONSE_FILE}] for details."
        exit 1
    else
        echo "[ $(date) ] INFO Retrieving the ATL_TOKEN ..."
        ATLASSIAN_TOKEN=$(grep "atlassian-token" ${RESPONSE_FILE} | awk -F"\"" '{print $6}')

        if [[ -z "${ATLASSIAN_TOKEN}" ]] ; then
            echo "[ $(date) ] ERROR ATL_TOKEN retrieval failed."
            exit 1
        else
            space_import;
        fi
    fi
}

space_import() {
    echo "[ $(date) ] INFO Starting the space import [file: <shared_home>/restore/${SPACE_FILE_NAME}] [index rebuild post-import: ${BUILD_INDEX}] ..."
    IMPORT_HTTP_RESPONSE=$(curl -o ${TASK_ID_FILE} -w "%{http_code}" -s -L -b ${CONFLUENCE_COOKIE_FILE} ${CONFLUENCE_BASE_URL}'/admin/restore-local-file.action' \
      -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8' \
      -H 'Content-Type: application/x-www-form-urlencoded' \
      -H 'Origin: '${CONFLUENCE_BASE_URL} \
      -H 'Connection: keep-alive' \
      -H 'Referer: '${CONFLUENCE_BASE_URL}'/admin/backup.action' \
      --data 'atl_token='${ATLASSIAN_TOKEN}'&localFileName='${SPACE_FILE_NAME}'&buildIndexLocalFileRestore='${BUILD_INDEX}'&edit=Import')

    if [ $? -ne 0 ] || [ ${IMPORT_HTTP_RESPONSE} -ne 200 ] ; then
        echo "[ $(date) ] ERROR Space import failed: [HTTP response: ${IMPORT_HTTP_RESPONSE}]. Check [${TASK_ID_FILE}] for details."
        exit 1
    else
        TASK_ID=$(grep "ajs-taskId" ${TASK_ID_FILE} | awk -F"\"" '{print $4}')
        ERROR_MSG=$(perl -ne 'print if /<div class="aui-message aui-message-error closeable">/../<\/div>/' ${TASK_ID_FILE} | sed 's/<[^>]*>//g' | awk 'NF' | paste -d " "  - - | tr -s " ")

        if [[ ! -z "${ERROR_MSG}" ]] ; then
            echo "[ $(date) ] ERROR Import failed: [${ERROR_MSG}]"
            exit 1
        else
            if [[ -z "${TASK_ID}" ]] ; then
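                echo "[ $(date) ] INFO TASK_ID was not captured. Check the Confluence log file [atlassian-confluence.log] to follow the space import progress."
                # Without a task ID the progress cannot be polled, so stop here instead of reporting a finish time
                exit 0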
            else
                echo
                import_status;
            fi
        fi
    fi

    echo "[ $(date) ] INFO Space import finished in [${elapsed_time}] seconds!"
    exit 0
}

import_status() {
    percentage_complete=0
    while [ ${percentage_complete} -lt 100 ] ; do
        TASK_STATUS=$(curl -L -b ${CONFLUENCE_COOKIE_FILE} ${CONFLUENCE_BASE_URL}'/rest/api/longtask/'${TASK_ID} -H 'Accept: */*' 2>/dev/null)
        percentage_complete=$(echo "${TASK_STATUS}" | grep -Eo 'percentageComplete.+[0-9]{1,}' | cut -d ":" -f 2 | cut -d "," -f 1)
        status=$(echo "${TASK_STATUS}" | grep -Eo 'translation.+' | cut -d ":" -f 2 | cut -d "," -f 1 | tr -d "\"")
        elapsed_time_ms=$(echo "${TASK_STATUS}" | grep -Eo 'elapsedTime.+[0-9]{1,}' | cut -d ":" -f 2 | cut -d "," -f 1)
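        # Convert milliseconds to seconds with two decimals (the original used an unset $2 as the precision)
        elapsed_time=$(awk -v ms="${elapsed_time_ms:-0}" 'BEGIN { printf "%.2f", ms / 1000 }')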

        overwrite "Space import status: [${status/./}] [${percentage_complete}%]"
        sleep ${STATUS_REFRESH_INTERVAL}
    done
}

get_atl_token_session_cookie;
importSpace.py
#!/usr/bin/env python3

import sys
import json
import time
import requests
from datetime import datetime
from bs4 import BeautifulSoup

# pip3 install requests
# pip3 install bs4
# pip3 install html5lib
# pip3 install lxml

admin_username = '<admin username>'
admin_password = '<admin password>'
index_post_import = 'true' # set to 'true' to trigger a reindex immediately after the space import, 'false' otherwise
confluence_base_url = '<confluence base URL>'
file = '<space export backup file>' # file name only, without any path (the file must reside in the <confluence-shared-home>/restore directory)
status_refresh_interval = 0.5 # refresh rate (in seconds) for the space import status request

def print_statusline(msg: str):
    last_msg_length = len(print_statusline.last_msg) if hasattr(print_statusline, 'last_msg') else 0
    print(' ' * last_msg_length, end='\r')
    print(msg, end='\r')
    print_statusline.last_msg = msg

def get_authenticated_admin_session_and_token():
    headers = {'Content-Type': 'application/x-www-form-urlencoded', 'Accept': '*/*'}
    url = confluence_base_url+'/doauthenticate.action'
    session = requests.Session()
    # retrieve the session cookies
    try:
        print("[ %s ] INFO Authenticating with Confluence ..." % (datetime.now().strftime("%Y/%m/%d %H:%M:%S")))
        session.post(confluence_base_url + '/users/viewmyprofile.action', auth = (admin_username, admin_password), headers = headers)

        # retrieve the ATL_TOKEN
        try:
            print("[ %s ] INFO Retrieving the ATL_TOKEN ..." % (datetime.now().strftime("%Y/%m/%d %H:%M:%S")))
            r = session.post(url, headers = headers)
            soup_token = BeautifulSoup(r.text, "lxml")
            atlassian_token = soup_token.find("meta", {"id": "atlassian-token"})["content"]
            space_import(session, atlassian_token)
        except Exception as e:
            print("[ %s ] ERROR ATL_TOKEN retrieval failed: [%s]" % (datetime.now().strftime("%Y/%m/%d %H:%M:%S"), str(e)))
            sys.exit(1)

    except Exception as e:
        print("[ %s ] ERROR Authentication failed: [%s]" % (datetime.now().strftime("%Y/%m/%d %H:%M:%S"), str(e)))
        sys.exit(1)

def space_import(session, atlassian_token):
    payload = {'atl_token':atlassian_token, 'localFileName':file, 'buildIndexLocalFileRestore':index_post_import, 'edit':'Import'}
    headers = {'Content-Type': 'application/x-www-form-urlencoded'}
    url = confluence_base_url + '/admin/restore-local-file.action'
    try:
        print("[ %s ] INFO Starting the space import [file: <shared_home>/restore/%s] [index rebuild post-import set to %s] ..." % (datetime.now().strftime("%Y/%m/%d %H:%M:%S"), file, index_post_import))
        i = session.post(url, headers = headers, data = payload)

        try:
            soup_import = BeautifulSoup(i.text, "lxml")
            check_error = soup_import.find('div', attrs = {'class':'aui-message aui-message-error closeable'})

            if check_error is not None:
                import_error = check_error.get_text()
                print("[ %s ] ERROR Import failed: [%s]" % (datetime.now().strftime("%Y/%m/%d %H:%M:%S"), import_error.replace('\n', ' ').replace('.', '').lstrip().rstrip()))
                sys.exit(1)
            else:
                headers = {'Content-Type': 'application/json', 'Accept': '*/*'}
                task_id = soup_import.find("meta", {"name": "ajs-taskId"})["content"]
                url = confluence_base_url + '/rest/api/longtask/' + task_id

                percentage_complete = 0
                while percentage_complete < 100:
                    try:
                        s = session.get(url, headers = headers)
                        s_dict = json.loads(s.text)
                        status = s_dict['messages'][0]['translation']
                        percentage_complete = s_dict['percentageComplete']
                        elapsed_time = round(s_dict['elapsedTime'] / 1000, 2)

                        output = 'Space import status: [' + str(status) + '] [' + str(percentage_complete) + '%]'

                        print_statusline(output)
                        time.sleep(status_refresh_interval)

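                    except Exception as e:
                        print("[ %s ] WARN Error when getting the space import status: [%s]" % (datetime.now().strftime("%Y/%m/%d %H:%M:%S"), str(e)))
                        # Wait before retrying so a persistent failure does not turn into a tight request loop
                        time.sleep(status_refresh_interval)
                        continue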

            print('', end = '\n', flush = True)
            print("[ %s ] INFO Space import finished in [%s] seconds!" % (datetime.now().strftime("%Y/%m/%d %H:%M:%S"), elapsed_time))

        except Exception as e:
            print("[ %s ] ERROR Import failed: [%s]" % (datetime.now().strftime("%Y/%m/%d %H:%M:%S"), str(e)))
            sys.exit(1)

    except Exception as e:
        print("[ %s ] ERROR Space import failed (HTTP request): [%s]" % (datetime.now().strftime("%Y/%m/%d %H:%M:%S"), str(e)))
        sys.exit(1)

get_authenticated_admin_session_and_token()

The Python version relies on third-party libraries: the requests module keeps the authenticated session and its cookies, and BeautifulSoup (bs4) parses the token and the task ID from the returned pages.
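
Both scripts poll the long-task resource and read three fields from its JSON response. The sketch below isolates that parsing step so it can be tried without a running Confluence instance. The `ImportStatus` type, the function name, and the sample body are assumptions for illustration; the field names are the ones the scripts above read.

```python
import json
from typing import NamedTuple


class ImportStatus(NamedTuple):
    percentage_complete: int
    message: str
    elapsed_seconds: float


def parse_long_task(body: str) -> ImportStatus:
    """Extract the fields the scripts read from a long-task response body."""
    data = json.loads(body)
    # Fall back to an empty message when the list is missing or empty.
    messages = data.get("messages") or [{}]
    return ImportStatus(
        percentage_complete=int(data.get("percentageComplete", 0)),
        message=messages[0].get("translation", ""),
        elapsed_seconds=round(data.get("elapsedTime", 0) / 1000, 2),
    )


# Hypothetical response body containing only the fields used above.
SAMPLE_BODY = (
    '{"percentageComplete": 40, "elapsedTime": 1530, '
    '"messages": [{"translation": "Importing"}]}'
)

print(parse_long_task(SAMPLE_BODY))
```

Defaulting missing fields instead of raising keeps a polling loop alive across an incomplete response, at the cost of hiding malformed bodies; stricter handling may be preferable in practice.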

Variables

Both scripts require the admin to set a few values before they can run:

  • admin username
  • admin password
  • Confluence base URL
  • space export backup file name
  • whether to rebuild the index after the space import (true or false)
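
Rather than editing the password into the script file, these values could be read from the environment. The following is a minimal sketch of that idea, not part of the original scripts: the environment variable names and the `ImportSettings` type are hypothetical.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class ImportSettings:
    admin_username: str
    admin_password: str
    base_url: str
    file_name: str
    build_index: str  # 'true' or 'false', as sent in the form payload


def load_settings(env: Mapping[str, str] = os.environ) -> ImportSettings:
    """Build the settings from environment variables (names are hypothetical)."""
    build_index = env.get("CONFLUENCE_BUILD_INDEX", "true").lower()
    if build_index not in ("true", "false"):
        raise ValueError("CONFLUENCE_BUILD_INDEX must be 'true' or 'false'")
    return ImportSettings(
        admin_username=env["CONFLUENCE_ADMIN_USERNAME"],
        admin_password=env["CONFLUENCE_ADMIN_PASSWORD"],
        # Drop a trailing slash so paths can be appended directly.
        base_url=env["CONFLUENCE_BASE_URL"].rstrip("/"),
        file_name=env["CONFLUENCE_SPACE_FILE_NAME"],
        build_index=build_index,
    )
```

A missing required variable raises KeyError, which makes a misconfigured run fail before any request is sent.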

Output

You should expect something like this as a successful result:

Error handling

Both scripts will handle errors for:

  • authentication
  • token retrieval
  • space import
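
In each of these cases the scripts above exit with status 1, so a wrapper (for example, one run by a scheduler) can detect a failed import from the exit code alone. The sketch below assumes that convention; the function name is hypothetical.

```python
import subprocess
import sys
from typing import Sequence


def run_import(command: Sequence[str]) -> bool:
    """Run an import script and report whether it exited with status 0."""
    completed = subprocess.run(command, capture_output=True, text=True)
    if completed.returncode != 0:
        # Surface whatever the script printed so the failure can be diagnosed.
        print(completed.stdout, end="", file=sys.stderr)
        print(completed.stderr, end="", file=sys.stderr)
    return completed.returncode == 0
```

A call such as `run_import([sys.executable, "importSpace.py"])` would then return False on an authentication, token retrieval, or import error.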

Some examples of errors reported while executing the scripts:


Last modified on Apr 11, 2022
