How to find large pages in Confluence

Still need help?

The Atlassian Community is here for you.

Ask the community

This article requires fixes

This article has been Flagged for fixing. Use caution when using it and fix it if you have Publisher rights.

Platform Notice: Data Center - This article applies to Atlassian products on the Data Center platform.

Note that this knowledge base article was created for the Data Center version of the product. Data Center knowledge base articles for non-Data Center-specific features may also work for Server versions of the product, however they have not been tested. Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.

*Except Fisheye and Crucible

Purpose

When using Collaborative Editing, poor performance may be experienced for extremely large pages. In some situations, the editor may be unusable as covered in this bug report:

In this guide, we'll cover how to find such large pages and how to recover them.

Finding large pages on the database

Always back up your data before performing any modifications to the database. If possible, test any alter, insert, update, or delete SQL commands on a staging server first.

We will want to find any pages that have more than 5MB of text on them. Different databases require different queries. Please check the queries below depending on the DB in use.

(warning) A note about calculating the size: All of the below are going to be approximations. The 5MB limit is on the size of the request to save a page over REST. This includes all the overhead of the Content object, etc. The limit for the size of the body of a page is thus bound by the 5MB besides the overhead. Additionally, the size in the database could be different from the actual size transmitted over REST where the limit is. Thus, please update the values in the queries below to get a better approximation. Perhaps, instead of looking for pages larger than 5MB, it would be better to start looking for pages that are larger than say 4.5 MB, and narrow the search in.
(error) Do not run the queries on a live system with large amounts of data as they could possibly be slow.

MySql/Postgres/H2
SELECT c.contentid, s.spacename, c.title, OCTET_LENGTH(b.body) FROM content c
        JOIN spaces s on c.spaceid = s.spaceid
        JOIN bodycontent b on b.contentid = c.contentid
        WHERE c.contenttype = 'PAGE'
        AND c.content_status = 'current'
        AND c.prevver IS NULL
        AND OCTET_LENGTH(b.body) >= 5 * 1024 * 1024;
MSSQL
SELECT c.contentid, s.spacename, c.title, DATALENGTH(b.body) FROM content c
        JOIN spaces s on c.spaceid = s.spaceid
        JOIN bodycontent b on b.contentid = c.contentid
        WHERE c.contenttype = 'PAGE'
        AND c.content_status = 'current'
        AND c.prevver IS NULL
        AND DATALENGTH(b.body) >= 5 * 1024 * 1024;

Oracle can have two approaches since there isn't a good way to get the size of CLOB field in bytes.

  1. Estimate the size in bytes by counting the number of characters and multiplying by some integer K. Note that all English characters can be represented by 1 byte, but the database might store it at a fixed width, depending on the encoding. Multiplying by 2 might give a good estimate. It would at least give an idea of which pages might be close to the limit. At that point you may want to obtain a list of those pages and use the other method to get a more precise number. You could also use the second method to verify your estimate on a subset of pages to see if K needs to be adjusted.

    Oracle (number of characters)
    SELECT c.contentid, s.spacename, c.title, dbms_lob.getlength(b.body) FROM content c
            JOIN spaces s on c.spaceid = s.spaceid
            JOIN bodycontent b on b.contentid = c.contentid
            WHERE c.contenttype = 'PAGE'
            AND c.content_status = 'current'
            AND c.prevver IS NULL
            AND dbms_lob.getlength(b.body) >= 2 * 1024 * 1024;
  2. Convert the CLOB to a BLOB and measure the size of that.
    (warning) This query is likely not performant. Do not run it against the entire database. First, try to obtain a list of pages that might be close to the limit by using the query above.

    create or replace function cloblengthb(p_clob in clob ) return number
    as
      v_temp_blob BLOB;
      v_dest_offset NUMBER := 1;
      v_src_offset NUMBER := 1;
      v_amount INTEGER := dbms_lob.lobmaxsize;
      v_blob_csid NUMBER := dbms_lob.default_csid;
      v_lang_ctx INTEGER := dbms_lob.default_lang_ctx;
      v_warning INTEGER;
      v_total_size number := 0; -- Return total clob length in bytes
    BEGIN
      IF p_clob is not null THEN
        DBMS_LOB.CREATETEMPORARY(lob_loc=>v_temp_blob, cache=>TRUE);
        DBMS_LOB.CONVERTTOBLOB(v_temp_blob, p_clob,v_amount,v_dest_offset,v_src_offset,v_blob_csid,v_lang_ctx,v_warning);
        v_total_size := DBMS_LOB.GETLENGTH(v_temp_blob);
        DBMS_LOB.FREETEMPORARY(v_temp_blob);
      ELSE
        v_total_size := NULL;
      END IF;
      return v_total_size;
    end cloblengthb;
    /
    SELECT c.contentid, s.spacename, c.title, cloblengthb(b.body) FROM content c
            JOIN spaces s on c.spaceid = s.spaceid
            JOIN bodycontent b on b.contentid = c.contentid
            WHERE c.contenttype = 'PAGE'
            AND c.content_status = 'current'
            AND c.prevver IS NULL
            AND c.contentid IN(<predefined list of ids>)
            AND cloblengthb(b.body) >= 2 * 1024 * 1024;

Recovering large pages to be editable again

  1. On a test environment with data cloned from production, switch Collaborative Editing OFF temporarily
  2. Attempt to edit the page and split the content into smaller pages
  3. Re-enable Collaborative Editing and confirm the split pages can be edited in the test environment
  4. Copy the split pages to production and delete the original large page

(warning) Switching Collaborative Editing OFF and back ON is a heavy task, so we do not recommend doing that in a production environment. Instead, follow the steps above to recover and split the page in a test instance.

Last modified on Dec 30, 2024

Was this helpful?

Yes
No
Provide feedback about this article
Powered by Confluence and Scroll Viewport.