Build plans queued for extended duration reporting "Updating source code to latest..." inside Build activity dashboard

Platform notice: Server and Data Center only. This article only applies to Atlassian products on the Server and Data Center platforms.

Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.

*Except Fisheye and Crucible

 

Summary

Build plans remain queued for an extended period, reporting "Updating source code to latest..." in the Build activity dashboard. Several agents capable of running the builds are available, yet the builds stay queued for an excessive time.

Diagnosis

The fact that several build plans are queued and seem to be stuck at "Updating source code to latest..." doesn't necessarily mean they are waiting for Bamboo to update the caches of their repositories before the build is dispatched. A separate article outlines the potential causes and fixes for that type of problem.

The issue described in this article is slightly different: it affects builds after change detection has completed and the respective caches have been updated, even though Bamboo still reports "Updating source code to latest..." in the Build activity dashboard. The two scenarios described below share one factor that is very important for diagnosing this issue:

Build plans are in fact getting dispatched and built by agents while Bamboo reports "Updating source code to latest...". So the very first step to diagnosing this issue would be to review your agent logs and see if they are building the plans that Bamboo says are in the queue under the status "Updating source code to latest...".
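The cross-check described above can be sketched in a few lines of Python. This is a minimal sketch, not an Atlassian tool: it assumes the agent logs are plain text files and that the build result key shown on the dashboard appears verbatim in the agent log. The function name `keys_seen_in_logs` and the sample key format are hypothetical.

```python
import re
from pathlib import Path


def keys_seen_in_logs(queued_keys, log_paths):
    """Map each queued build key to the agent log files that mention it.

    Assumption: the key (for example a hypothetical "PROJ-PLAN-JOB1-42")
    appears verbatim in the agent log when the agent builds it.
    """
    # Match the key only as a whole token, so "...-4" does not match "...-42".
    patterns = {
        key: re.compile(rf"(?<![\w-]){re.escape(key)}(?![\w-])")
        for key in queued_keys
    }
    hits = {}
    for path in map(Path, log_paths):
        text = path.read_text(errors="replace")
        for key, pattern in patterns.items():
            if pattern.search(text):
                hits.setdefault(key, []).append(path.name)
    return hits
```

Any key returned by this function is being handled by an agent even though the dashboard still lists it as queued, which points to the scenarios covered in this article rather than to a repository cache problem.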

Understanding the basics of the Bamboo build plan workflow helps pinpoint where the issue lies when these symptoms appear:

  1. A build plan is triggered.
  2. Plan changes status to queued.
  3. Change detection happens on the server-side. This is where it will reach out to the repository to determine if there are any changes it needs for the build.
  4. Plan is then added to the build queue. This is when it shows up on the Build activity dashboard.
  5. Server assigns an agent for it and sends an event to the agent.
  6. Agent receives the event and starts building.
  7. Agent finishes building and sends the results back to server.

The problem described in this article occurs when Bamboo has to process the events and messages sent from the agent. If builds are entering the queue, being picked up by available agents, and being built while Bamboo still reports "Updating source code to latest...", then Bamboo is having a hard time updating the status of your builds in the database.

Diagnosis 1

Thread dumps taken while several build plans are queued and appear stuck at "Updating source code to latest..." contain RUNNABLE threads with the following stack frames:

...
at com.atlassian.bamboo.user.rename.UserRenameHelper.updateUserInTable(UserRenameHelper.java:38)
at com.atlassian.bamboo.user.rename.UserRenameHelper.renameUserInBuildResultSummary(UserRenameHelper.java:82)
at com.atlassian.bamboo.user.rename.UserRenameServiceImpl.doRenameUser(UserRenameServiceImpl.java:179)
...

This suggests that a user renaming process is happening.
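When many thread dumps have been collected, the search for these frames can be scripted. The sketch below assumes the dump is raw `jstack`-style text, where each thread block starts with a double-quoted thread name followed by a `java.lang.Thread.State:` line. The function name is hypothetical, and dumps exported from an analysis tool in another layout would need a different parser.

```python
import re

# Assumption: jstack-style layout, one block per thread, header line starts with '"'.
HEADER = re.compile(r'"(.*)"')
STATE = re.compile(r"java\.lang\.Thread\.State: (\w+)")


def runnable_threads_with_frame(dump_text, frame_substring):
    """Return the names of RUNNABLE threads whose stack mentions frame_substring."""
    matches = []
    # Split in front of every line that begins with a quoted thread name.
    for block in re.split(r'(?m)^(?=")', dump_text):
        header = HEADER.match(block)
        state = STATE.search(block)
        if not header or not state or state.group(1) != "RUNNABLE":
            continue
        frames = (line.strip() for line in block.splitlines())
        if any(f.startswith("at ") and frame_substring in f for f in frames):
            matches.append(header.group(1))
    return matches
```

Calling `runnable_threads_with_frame(dump_text, "UserRenameHelper")` on each dump in a series shows whether the same threads stay inside the rename code over time.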

Diagnosis 2

Important Bamboo threads such as IndexerService and BuildTailMessageProcessingThread can be seen in thread dumps spending extended periods in filesystem operations. Example:

8-BuildTailMessageProcessingThread-expensive:pool-16-thread-102
State
Runnable
Java Stack
at java.io.RandomAccessFile.open0(Native Method) 
at java.io.RandomAccessFile.open(RandomAccessFile.java:316) 
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:243) 
at org.apache.lucene.store.MMapDirectory.openInput(MMapDirectory.java:193) 
at org.apache.lucene.store.Directory.copy(Directory.java:185) 
at org.apache.lucene.store.TrackingDirectoryWrapper.copy(TrackingDirectoryWrapper.java:50) 
at org.apache.lucene.index.IndexWriter.createCompoundFile(IndexWriter.java:4582) 
at org.apache.lucene.index.DocumentsWriterPerThread.sealFlushedSegment(DocumentsWriterPerThread.java:535) 
at org.apache.lucene.index.DocumentsWriterPerThread.flush(DocumentsWriterPerThread.java:502) 
at org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:506) 
at org.apache.lucene.index.DocumentsWriter.flushAllThreads(DocumentsWriter.java:616) 
at org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:2815) 
- locked [0x00000003ce9fd4e8] (a java.lang.Object) 
- locked [0x00000003cf2b1460] (a java.lang.Object) 
at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2970) 
- locked [0x00000003cf2b1460] (a java.lang.Object) 
at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2940) 
at com.atlassian.bonnie.LuceneConnection.commitAndRefreshSearcher(LuceneConnection.java:566) 
at com.atlassian.bonnie.LuceneConnection.withWriter(LuceneConnection.java:506) 
at com.atlassian.bamboo.index.IndexerServiceImpl$8.run(IndexerServiceImpl.java:314) 


A quick look at current process utilization on the server running Bamboo shows that anti-virus software is consuming a lot of resources. Here's an example from running top while McAfee On-Access Scanner is active and the problem is occurring:

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
 1019 bamboo    20   0  170052   3648   1628 R 63.0  0.0   0:00.36 top
 4028 root      20   0 1133188  27388  13512 S 50.0  0.0  79:33.16 oacore
 6404 root      20   0 1756680 420504   8660 S 50.0  0.6 461:45.27 OASManager
 6402 root      20   0 1756680 420504   8660 R 47.8  0.6 461:23.14 OASManager
 6408 root      20   0 1756680 420504   8660 S 47.8  0.6 461:33.43 OASManager
 6406 root      20   0 1756680 420504   8660 S 45.7  0.6 460:56.47 OASManager
 6410 root      20   0 1756680 420504   8660 S 43.5  0.6 461:46.93 OASManager
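Because the scanner runs as several processes, it can help to add up the %CPU column per command. The following is a minimal sketch that assumes the default `top` column order shown above (PID, USER, PR, NI, VIRT, RES, SHR, S, %CPU, %MEM, TIME+, COMMAND) and a dot as the decimal separator. The function name `cpu_by_command` is hypothetical.

```python
from collections import defaultdict


def cpu_by_command(top_output):
    """Sum %CPU per COMMAND from `top` text output, highest total first.

    Assumption: default column order, with %CPU in column 9 and COMMAND last.
    """
    totals = defaultdict(float)
    in_table = False
    for line in top_output.splitlines():
        fields = line.split()
        if not in_table:
            # Skip the summary area until the process table header appears.
            in_table = fields[:2] == ["PID", "USER"]
            continue
        if len(fields) < 12 or not fields[0].isdigit():
            continue
        totals[" ".join(fields[11:])] += float(fields[8])
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

Applied to the sample above, the five OASManager entries together account for roughly 234.8% CPU, far more than any other command in the listing.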


Cause

Cause 1

This is a known bug: BAM-20993. The user renaming process can be extensive and time-consuming, depending on the number of records that need to be updated in the database. This can affect Bamboo's ability to keep up with reading and writing the status and results of all builds.

Cause 2

This is caused by the anti-virus software, which is likely intercepting or blocking read, open, and write operations in Lucene indexing (for build results and status) and/or in ActiveMQ threads. Communication and data transfer between the Bamboo server and agents go through Apache ActiveMQ (AMQ). In Bamboo, AMQ is configured as a persistent queue, meaning that messages are written to disk in the <Bamboo server home directory>/jms-store directory before they reach the database.
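One rough way to see whether small file operations are being slowed down is to time them directly. The sketch below is an illustration only, with a hypothetical function name. It creates, syncs, reads, and deletes small temporary files in a directory you choose and returns the time each round took. It is safer to point it at a scratch directory on the same filesystem as the Bamboo home directory than at jms-store itself.

```python
import os
import time
import uuid


def probe_file_latency(directory, rounds=20, size=4096):
    """Time create + write + fsync + read + delete of a small file, per round.

    Returns a list of durations in seconds, one entry per round.
    """
    payload = os.urandom(size)
    durations = []
    for _ in range(rounds):
        path = os.path.join(directory, f"latency-probe-{uuid.uuid4().hex}.tmp")
        start = time.perf_counter()
        try:
            with open(path, "wb") as handle:
                handle.write(payload)
                handle.flush()
                os.fsync(handle.fileno())  # force the write to reach the disk
            with open(path, "rb") as handle:
                handle.read()
        finally:
            # Always clean up the probe file, even if a step above failed.
            if os.path.exists(path):
                os.remove(path)
        durations.append(time.perf_counter() - start)
    return durations
```

The absolute numbers mean little on their own. Comparing runs taken while the scanner is active against runs in a directory excluded from scanning gives a hint, not proof, of whether the scanner is adding latency.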

If you are using McAfee On-Access Scanner, the cause might be the issue described in the McAfee article "Slow performance with Java-based applications".

Solution

Solution 1

There's no immediate solution to this issue. If the user renaming process is running, you must wait until it finishes. Until BAM-20993 is fixed, avoid renaming a large batch of users at once.

Solution 2

There are a few options to consider when it comes to anti-virus software:

Last modified on Apr 6, 2021
