How to perform a Confluence site search for keywords and links through the database
There are limitations for the default Lucene search algorithm in Confluence.
These are a few examples of the limitation :
- URL links are not searchable : - CONFSERVER-27503Getting issue details... STATUS
- Parts of a string are not searchable :
CONFSERVER-33996Getting issue details...
CONFSERVER-36770Getting issue details...
There are times when you really need to search for the pages that contain the keywords.
For an example, an external URL was inserted to numerous pages but as time goes by, those URL might point to a dead link as there might be some changes in the subdomain/URL path.
To search for these contents, run the SQL query below on your Confluence database.
<INSERT_KEYWORD_HERE> with your keyword.
% symbol represents a wildcard search.
The SQL results will return the
content ID, which is referred in the Confluence UI as the page ID,
content type, page/blog title and the
The query below uses wildcard searching in order to locate page content that matches a specific pattern.
Due to the nature of wildcard searching and the possible size of the target tables, this query may take a long time to execute.
If possible, it is recommended to carry these out in a clone of the production database rather than on the production environment itself to avoid any impact to the production environment.
SELECT c.CONTENTID, c.CONTENTTYPE, c.TITLE, s.SPACEKEY FROM CONTENT c JOIN BODYCONTENT bc ON c.CONTENTID = bc.CONTENTID JOIN SPACES s ON c.SPACEID = s.SPACEID WHERE c.PREVVER IS NULL AND c.CONTENTTYPE IN ('PAGE', 'BLOGPOST') AND bc.BODY LIKE '%<INSERT_KEYWORD_HERE>%';
SELECT c.CONTENTID, c.CONTENTTYPE, c.TITLE, s.SPACEKEY FROM dbo.CONTENT c JOIN dbo.BODYCONTENT bc ON c.CONTENTID = bc.CONTENTID JOIN dbo.SPACES s ON c.SPACEID = s.SPACEID WHERE c.PREVVER IS NULL AND c.CONTENTTYPE IN ('PAGE', 'BLOGPOST') AND bc.BODY LIKE '%<INSERT_KEYWORD_HERE>%';