Performing Data Pipeline export results in a NullPointerException

Platform Notice: Data Center Only - This article only applies to Atlassian products on the data center platform.

Problem

A Data Pipeline export fails, and the following error is logged in atlassian-confluence.log:

2022-05-16 14:36:55,851 ERROR [data-pipeline-export-executor-0] [insights.core.service.DefaultDataExportOrchestrator] lambda$null$4 Failed writing entities to file - processId: 6
-- url: /rest/datapipeline/latest/export | traceId: 4ceba8dfbb2a2b8c | userName: admin
java.lang.NullPointerException
	at com.atlassian.confluence.spaces.SpaceDescription.getSpaceKey(SpaceDescription.java:66)
	at com.atlassian.business.insights.confluence.extract.SpaceToLogRecordConverter.convert(SpaceToLogRecordConverter.java:32)
	at com.atlassian.business.insights.confluence.extract.SpaceLogRecordStreamer.toLogRecord(SpaceLogRecordStreamer.java:53)
...
	at com.atlassian.business.insights.confluence.prefetch.EntityPrefetchProvider.fetchPages(EntityPrefetchProvider.java:107)
	at com.atlassian.business.insights.confluence.prefetch.EntityPrefetchProvider.lambda$null$0(EntityPrefetchProvider.java:71)
	at com.atlassian.business.insights.core.extract.EntityPageIterator.computeNext(EntityPageIterator.java:33)
	at com.atlassian.business.insights.core.extract.EntityPageIterator.computeNext(EntityPageIterator.java:13)
...
	at com.atlassian.business.insights.core.service.DefaultDataExportOrchestrator.lambda$runFullExport$5(DefaultDataExportOrchestrator.java:130)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
...

The Data pipeline page also shows that the export job has failed.

Diagnosis

The stack trace above shows that Confluence hit the NullPointerException while retrieving the space details (com.atlassian.confluence.spaces.SpaceDescription.getSpaceKey()).

To investigate further, temporarily enable SQL logging on the Confluence instance and reproduce the export failure to find the record(s) causing the issue. With SQL logging enabled, the log should contain entries similar to the ones below, which identify the exact table Confluence was querying.

2022-05-16 14:36:55,361 DEBUG [data-pipeline-export-executor-0] [org.hibernate.SQL] logStatement select contentent0_.CONTENTID as CONTENTI1_13_0_, contentent0_.HIBERNATEVERSION as HIBERNAT2_13_0_, contentent0_.TITLE as TITLE4_13_0_, contentent0_.LOWERTITLE as LOWERTIT5_13_0_, contentent0_.VERSION as VERSION6_13_0_, contentent0_.CREATOR as CREATOR7_13_0_, contentent0_.CREATIONDATE as CREATION8_13_0_, contentent0_.LASTMODIFIER as LASTMODI9_13_0_, contentent0_.LASTMODDATE as LASTMOD10_13_0_, contentent0_.VERSIONCOMMENT as VERSION11_13_0_, contentent0_.PREVVER as PREVVER12_13_0_, contentent0_.CONTENT_STATUS as CONTENT13_13_0_, contentent0_.PAGEID as PAGEID14_13_0_, contentent0_.SPACEID as SPACEID15_13_0_, contentent0_.CHILD_POSITION as CHILD_P16_13_0_, contentent0_.PARENTID as PARENTI17_13_0_, contentent0_.PLUGINKEY as PLUGINK18_13_0_, contentent0_.PLUGINVER as PLUGINV19_13_0_, contentent0_.PARENTCCID as PARENTC20_13_0_, contentent0_.DRAFTPAGEID as DRAFTPA21_13_0_, contentent0_.DRAFTSPACEKEY as DRAFTSP22_13_0_, contentent0_.DRAFTTYPE as DRAFTTY23_13_0_, contentent0_.DRAFTPAGEVERSION as DRAFTPA24_13_0_, contentent0_.PARENTCOMMENTID as PARENTC25_13_0_, contentent0_.USERNAME as USERNAM26_13_0_, contentent0_.CONTENTTYPE as CONTENTT3_13_0_ from CONTENT contentent0_ where contentent0_.CONTENTID=?
2022-05-11 20:29:09,213 TRACE [data-pipeline-export-executor-0] [type.descriptor.sql.BasicBinder] bind binding parameter [1] as [BIGINT] - [360452]
2022-05-11 20:29:09,215 TRACE [data-pipeline-export-executor-0] [type.descriptor.sql.BasicExtractor] extract extracted value ([CONTENTT3_13_0_] : [VARCHAR]) - [SPACEDESCRIPTION]
2022-05-11 20:29:09,216 TRACE [data-pipeline-export-executor-0] [type.descriptor.sql.BasicExtractor] extract extracted value ([HIBERNAT2_13_0_] : [INTEGER]) - [12]
2022-05-11 20:29:09,216 TRACE [data-pipeline-export-executor-0] [type.descriptor.sql.BasicExtractor] extract extracted value ([TITLE4_13_0_] : [VARCHAR]) - [null]
2022-05-11 20:29:09,216 TRACE [data-pipeline-export-executor-0] [type.descriptor.sql.BasicExtractor] extract extracted value ([LOWERTIT5_13_0_] : [VARCHAR]) - [null]
2022-05-11 20:29:09,217 TRACE [data-pipeline-export-executor-0] [type.descriptor.sql.BasicExtractor] extract extracted value ([VERSION6_13_0_] : [INTEGER]) - [4]
2022-05-11 20:29:09,217 TRACE [data-pipeline-export-executor-0] [type.descriptor.sql.BasicExtractor] extract extracted value ([CREATION8_13_0_] : [TIMESTAMP]) - [2020-03-22 18:31:13.389]
2022-05-11 20:29:09,218 TRACE [data-pipeline-export-executor-0] [type.descriptor.sql.BasicExtractor] extract extracted value ([LASTMOD10_13_0_] : [TIMESTAMP]) - [2022-03-12 00:29:59.914]
2022-05-11 20:29:09,218 TRACE [data-pipeline-export-executor-0] [type.descriptor.sql.BasicExtractor] extract extracted value ([PREVVER12_13_0_] : [BIGINT]) - [null]
2022-05-11 20:29:09,218 TRACE [data-pipeline-export-executor-0] [type.descriptor.sql.BasicExtractor] extract extracted value ([CONTENT13_13_0_] : [VARCHAR]) - [current]
2022-05-11 20:29:09,219 TRACE [data-pipeline-export-executor-0] [type.descriptor.sql.BasicExtractor] extract extracted value ([PAGEID14_13_0_] : [BIGINT]) - [null]
2022-05-11 20:29:09,219 TRACE [data-pipeline-export-executor-0] [type.descriptor.sql.BasicExtractor] extract extracted value ([SPACEID15_13_0_] : [BIGINT]) - [null]
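
On a busy instance the SQL log can be large, so it may help to scan it with a script rather than by eye. The following is a minimal sketch, not an Atlassian tool. It assumes the log lines look like the sample above, that each row load starts with a `binding parameter [1]` line carrying the CONTENTID, and that the Hibernate column aliases begin with `CONTENTT` (CONTENTTYPE) and `SPACEID`. The function and variable names are hypothetical, and alias names may differ between Confluence versions.

```python
import re
from typing import Dict, Iterable, List, Optional

# Patterns modelled on the TRACE lines shown above; other log layouts
# may need different expressions.
BIND_RE = re.compile(
    r"BasicBinder\] bind binding parameter \[1\] as \[BIGINT\] - \[(\d+)\]"
)
EXTRACT_RE = re.compile(
    r"BasicExtractor\] extract extracted value \(\[(\w+)\] : \[\w+\]\) - \[(.*)\]\s*$"
)


def _is_orphan(row: Dict[str, str]) -> bool:
    """True if the extracted columns show a SPACEDESCRIPTION with a null SPACEID."""
    is_space_description = any(
        alias.startswith("CONTENTT") and value == "SPACEDESCRIPTION"
        for alias, value in row.items()
    )
    has_null_space_id = any(
        alias.startswith("SPACEID") and value == "null"
        for alias, value in row.items()
    )
    return is_space_description and has_null_space_id


def find_orphan_content_ids(
    lines: Iterable[str], thread: str = "data-pipeline-export-executor"
) -> List[int]:
    """Return the bound CONTENTID of every suspicious row seen on the given thread."""
    suspects: List[int] = []
    current_id: Optional[int] = None
    row: Dict[str, str] = {}
    for line in lines:
        if thread not in line:
            continue  # ignore SQL logged by unrelated threads
        bind = BIND_RE.search(line)
        if bind:
            # A new bind starts a new row; evaluate the row just finished.
            if current_id is not None and _is_orphan(row):
                suspects.append(current_id)
            current_id, row = int(bind.group(1)), {}
            continue
        extract = EXTRACT_RE.search(line)
        if extract and current_id is not None:
            row[extract.group(1)] = extract.group(2)
    # Evaluate the last row in the log.
    if current_id is not None and _is_orphan(row):
        suspects.append(current_id)
    return suspects


# Three of the lines from the log excerpt above, used as sample input.
sample = [
    "2022-05-11 20:29:09,213 TRACE [data-pipeline-export-executor-0] [type.descriptor.sql.BasicBinder] bind binding parameter [1] as [BIGINT] - [360452]",
    "2022-05-11 20:29:09,215 TRACE [data-pipeline-export-executor-0] [type.descriptor.sql.BasicExtractor] extract extracted value ([CONTENTT3_13_0_] : [VARCHAR]) - [SPACEDESCRIPTION]",
    "2022-05-11 20:29:09,219 TRACE [data-pipeline-export-executor-0] [type.descriptor.sql.BasicExtractor] extract extracted value ([SPACEID15_13_0_] : [BIGINT]) - [null]",
]
print(find_orphan_content_ids(sample))  # prints [360452]
```

To scan a real log, pass an open file object instead of the sample list. The IDs it reports are only candidates and should be confirmed with a database query before acting on them.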


In this case, Confluence was querying the CONTENT table, so the SQL query below can be used to find the offending record(s).

SELECT * FROM CONTENT
WHERE CONTENTTYPE = 'SPACEDESCRIPTION'
AND SPACEID IS NULL
AND PREVVER IS NULL;
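
To illustrate what each predicate in this query filters out, here is a small self-contained sketch that runs the same query against an in-memory SQLite database. It is a stand-in for illustration only. The four-column CONTENT table and its rows are made up (only CONTENTID 360452 comes from the log above), and the helper names are hypothetical. It also assumes that a non-null PREVVER marks a historical version, which the article does not state. Against a real instance, run the query directly on the Confluence database instead.

```python
import sqlite3
from typing import List

# The query from the article, unchanged.
QUERY = """
SELECT * FROM CONTENT
WHERE CONTENTTYPE = 'SPACEDESCRIPTION'
AND SPACEID IS NULL
AND PREVVER IS NULL;
"""


def build_demo_db() -> sqlite3.Connection:
    """Create a toy CONTENT table holding only the columns the query touches."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row
    conn.execute(
        "CREATE TABLE CONTENT ("
        "CONTENTID INTEGER PRIMARY KEY, CONTENTTYPE TEXT, "
        "SPACEID INTEGER, PREVVER INTEGER)"
    )
    conn.executemany(
        "INSERT INTO CONTENT VALUES (?, ?, ?, ?)",
        [
            (360452, "SPACEDESCRIPTION", None, None),    # offending record
            (360453, "SPACEDESCRIPTION", 98305, None),   # healthy: has a SPACEID
            (360454, "SPACEDESCRIPTION", None, 360453),  # non-null PREVVER: excluded
            (360455, "PAGE", None, None),                # other content type: excluded
        ],
    )
    return conn


def find_offending_ids(conn: sqlite3.Connection) -> List[int]:
    """Run the article's query and return the CONTENTID of each matching row."""
    return [row["CONTENTID"] for row in conn.execute(QUERY)]


print(find_offending_ids(build_demo_db()))  # prints [360452]
```

Only the first row satisfies all three conditions. The other rows each fail exactly one condition, which shows why the query combines them.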

Cause

The CONTENT table contains a SPACEDESCRIPTION record with a NULL SPACEID. This should not happen, because every SPACEDESCRIPTION record in the CONTENT table is assigned a SPACEID when its space is created.

Resolution

Because the SPACEID value is missing from the CONTENT table, the affected record can no longer be linked to a row in the SPACES table. To troubleshoot further, contact the Atlassian Support Team and share the following information:

  • Latest Confluence Support.zip
  • Results of the SQL queries below:

    SELECT * FROM SPACES;
    
    SELECT * FROM CONTENT
    WHERE CONTENTTYPE = 'SPACEDESCRIPTION'
    AND SPACEID IS NULL
    AND PREVVER IS NULL;


Last modified on May 20, 2022
