Jira HTTP Requests Log Analyzer
Legacy
This page describes v1.x of the analyzer tool. We recommend that you use the latest version, described at Jira Access Log Analyzer.
What this tool is used for
The main job of this analyzer tool is to read all the requests in the Jira request logs and put each one into a known "request category". Running this tool will create the following three files:
- A summary of the incoming requests organized into a number of known categories.
- A web page which charts the requests over the time analyzed.
- A summary of the incoming requests grouped by the most popular unrecognized requests.
Prerequisites
You must have a Java Runtime Environment installed to run the analyzer.
Note that Jira and web servers produce a few different types of logs. The tool is only made to analyze the "request logs" (also known as "access logs").
For example, standalone Jira produces log files that look like this:
tools $ ls ~/jira/releases/atlassian-jira-5.0.3-standalone/logs/
access_log.2012-05-15 catalina.2012-05-15.log catalina.out localhost.2012-05-15.log manager.2012-05-15.log
access_log.2012-05-16 catalina.2012-05-16.log host-manager.2012-05-15.log localhost.2012-05-16.log
In this case we would only want to analyze the two files starting with "access_log".
You can filter out the other files using a wildcard in the file name as explained below, or you can copy the access logs to a temporary folder and analyze that whole folder.
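The second option (copying only the access logs to a temporary folder) can be scripted. The following is a minimal sketch, not part of the analyzer itself: the function name `collect_access_logs` and the default pattern `access_log*` are assumptions based on the file names in the listing above, so adjust the pattern to match your own logs.

```python
import glob
import os
import shutil
import tempfile


def collect_access_logs(logs_dir, pattern="access_log*"):
    """Copy files matching `pattern` from logs_dir into a new temporary folder.

    Hypothetical helper: the pattern is an assumption taken from the
    example listing, not something the analyzer requires.
    """
    target = tempfile.mkdtemp(prefix="access-logs-")
    for path in sorted(glob.glob(os.path.join(logs_dir, pattern))):
        # Skip directories that happen to match the pattern.
        if os.path.isfile(path):
            shutil.copy(path, target)
    return target
```

The returned folder then contains only access logs and can be analyzed as a whole.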
Installation
You can either copy the jar file to your server and run the tool there, or copy the log files to a workstation and run it locally.
Version | File | Notes |
---|---|---|
v1.1 | | Added more categories including new paths for Jira 6.0 |
v1.0 | | Designed for Jira v5.x or earlier |
Running interactively
The analyzer can be run in "interactive mode" by running this command:
java -jar access-log-analyser-1.0.jar
The tool will prompt you to enter a directory or filename - this can be a relative or absolute path.
You can enter a directory containing only access logs (eg "/var/jira/access-logs/" or "../logs"), an individual file name (eg "logs/access-log-2012-08-15.txt"), or a path that uses '*' as a wildcard (eg "C:\jira\logs\access-log.*").
Running with parameters
Alternatively you can pass the path into the command, eg:
java -jar access-log-analyser-1.0.jar file=/var/jira/access-logs/
Note that there are no spaces around the '='.
Setting the context path manually
If you run Jira under a sub-context (eg "/jira/") then the URLs in the request logs will all have the context path on the front of them. For example, they may look like:
/jira/secure/Dashboard.jspa
This is not normally an issue as the tool will detect the context automatically. It can be a problem if your access logs come from a proxy server (eg an Apache front-end server) that sits in front of multiple applications or websites.
In this case, if the tool detects the wrong context path (it outputs the detected context path when it runs), you can set the context path manually by adding an extra parameter "context-path=<path>", eg:
java -jar access-log-analyser-1.0.jar file=* context-path=/jira
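To illustrate why the context path matters, here is a small sketch of stripping a known context path from a request path before it is categorized. This is an assumption about the general technique, not the analyzer's actual implementation, and `strip_context_path` is a hypothetical name.

```python
def strip_context_path(url_path, context_path):
    """Remove a leading context path such as "/jira" from a request path.

    Illustrative only: the analyzer's own detection logic is not
    documented here. Paths outside the context are returned unchanged.
    """
    trimmed = context_path.strip("/")
    context = "/" + trimmed if trimmed else ""
    if context and (url_path == context or url_path.startswith(context + "/")):
        # Keep the leading slash; a bare context path maps to "/".
        return url_path[len(context):] or "/"
    return url_path
```

With a proxy log that mixes several applications, requests such as "/confluence/..." would not match the "/jira" context and would be left untouched, which is why a wrongly detected context path skews the categories.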
Output
The tool will create 3 files:
request-log-summary.wiki
This summarizes the incoming requests into a number of known categories.
The output is in Wiki format suitable for inserting into Confluence or Jira.
requests-over-time.html
A web page which charts the requests over the time analyzed.
This uses Google charts and requires internet access to download the Google Charts javascript libraries.
unknown-requests.txt
This is only useful if the request-log-summary.wiki shows a high number of "Unknown" requests.
It attempts to group the most popular unrecognized requests.
Understanding the results
Request Categories
The main job of the analyzer tool is to read all the requests in the request logs and put each one into a known "request category". Here is a quick breakdown on some common categories and what they mean:
- Dashboard
This is a call to show the Dashboard page. There is a single one of these per Dashboard, and then each "Gadget" in the dashboard will call back to Jira to load its configuration - these are shown in the "GADGETS" category. Then, most gadgets will make at least one additional REST call to load the data that they are interested in. These will most often show up in the "REST_API" category.
- Login
A Login request.
- IssueNavigator
The issue search page (simple or advanced).
- ViewIssue
A request to get the "View Issue" page in Jira - no surprises when this is high.
- PLUGIN_RESOURCES
Resources (presumably CSS, javascript and images) that are being served from a plugin.
- REST_API
Generic calls to the REST API. These may be coming from AJAX calls in Jira pages, or could be external tools/plugins querying Jira for data. A single page can make multiple calls to REST to get information. Note that a few areas of REST are common enough to warrant their own specific category, eg GreenHopper REST end-points are shown under REST_GREENHOPPER.
- Unauthorized_401
The request returned a 401 (Unauthorized) status code. The user will have been redirected to the Login page. (v1.1 or higher)
- Forbidden_403
The request returned a 403 (Forbidden) status code. The user was logged in and requested a URL that they are not permitted to see. (v1.1 or higher)
- CLIENT_ERROR_4xx
Any request that ended in another 4xx HTTP response, e.g. invalid URLs will get a 404 response. In v1.0 of the tool, 401 and 403 responses are included here; in v1.1 they have their own categories (see above).
- RPC
Any call to the older remote APIs - XML/RPC and SOAP.
- STATIC_RESOURCE
Includes all images, CSS, javascript, etc. These resources should all be cache-able in the browser. These are ignored in our "percentages" because we are trying to get relative numbers on the dynamic resources.
- ProjectAvatar
The project avatar image.
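As a rough illustration of what "putting a request into a category" can look like, the sketch below maps a request path and HTTP status to some of the category names above. The URL prefixes are assumptions based on typical Jira URLs, and the rules are deliberately simplified (for example, every "/browse/" URL is treated as ViewIssue); the analyzer's real rules are more extensive and are not reproduced here.

```python
def categorize(path, status):
    """Return an illustrative request category for a path and HTTP status.

    Hypothetical rules for demonstration only, not the analyzer's own.
    The path is expected to have any context path already removed.
    """
    # Error statuses take priority, mirroring the v1.1 categories.
    if status == 401:
        return "Unauthorized_401"
    if status == 403:
        return "Forbidden_403"
    if 400 <= status < 500:
        return "CLIENT_ERROR_4xx"
    # More specific prefixes are checked before generic ones.
    rules = [
        ("/rest/greenhopper/", "REST_GREENHOPPER"),
        ("/rest/", "REST_API"),
        ("/rpc/", "RPC"),
        ("/secure/Dashboard.jspa", "Dashboard"),
        ("/secure/IssueNavigator.jspa", "IssueNavigator"),
        ("/browse/", "ViewIssue"),
    ]
    for prefix, category in rules:
        if path.startswith(prefix):
            return category
    if path.endswith((".css", ".js", ".png", ".gif")):
        return "STATIC_RESOURCE"
    return "Unknown"
```

Ordering the rules from most to least specific is what lets GreenHopper calls land in REST_GREENHOPPER rather than the generic REST_API category.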
Note that some of these requests will get called exactly once for a given page load (eg ViewIssue, IssueNavigator), but others can be called multiple times in a single page load (eg REST, GADGETS). The tool lists all categories ordered by the request count, and also includes a percentage and a cumulative percentage for the dynamic resources.
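The ordering and percentage columns described above can be reproduced with a few lines of code. This is a sketch under the stated rules (order by count, exclude static resources from the percentages); the function name `summarize` and the example counts are made up for illustration and are not real log data.

```python
def summarize(counts, ignored=("STATIC_RESOURCE",)):
    """Return rows of (category, count, percent, cumulative percent).

    Rows are ordered by request count. Categories in `ignored` are
    listed but get no percentage, and they do not count towards the total.
    """
    dynamic_total = sum(n for cat, n in counts.items() if cat not in ignored)
    rows = []
    cumulative = 0.0
    for cat, n in sorted(counts.items(), key=lambda item: item[1], reverse=True):
        if cat in ignored:
            rows.append((cat, n, None, None))
            continue
        percent = 100.0 * n / dynamic_total if dynamic_total else 0.0
        cumulative += percent
        rows.append((cat, n, round(percent, 1), round(cumulative, 1)))
    return rows


# Made-up counts, chosen so the arithmetic is easy to follow.
example = {"STATIC_RESOURCE": 500, "REST_API": 50, "ViewIssue": 30, "Login": 20}
for row in summarize(example):
    print(row)
```

Here REST_API is 50 of the 100 dynamic requests (50.0%), even though static resources make up most of the raw traffic.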
An example from jira.atlassian.com
J.A.C Requests by Category
Request Category | Count | Perc. | Cumul. |
---|---|---|---|
STATIC_RESOURCE | 7675051 | - | - |
REST_API | 1560461 | 20.9% | 20.9% |
ViewIssue | 1168760 | 15.7% | 36.6% |
SearchRequest_XML | 1047243 | 14.0% | 50.6% |
IssueNavigator | 447391 | 6.0% | 56.6% |
GADGETS | 313464 | 4.2% | 60.8% |
REST_ACTIVITY_STREAM | 260381 | 3.5% | 64.3% |
ProjectAvatar | 214543 | 2.9% | 67.2% |
Login | 209259 | 2.8% | 70.0% |
RPC | 187814 | 2.5% | 72.5% |
CLIENT_ERROR_4xx | 177301 | 2.4% | 74.9% |
Dashboard | 162825 | 2.2% | 77.1% |
ACTIVITY_STREAMS | 138830 | 1.9% | 79.0% |
REST_GREENHOPPER | 137477 | 1.8% | 80.8% |
You can see that of these requests, only a few are primary requests to load a page. The others are "background requests" (REST, RPC, Gadgets...) plus a few errors that can be ignored (the 4xx responses). STATIC_RESOURCE is already excluded from the percentages, and of the remainder, "background requests" account for approximately half the total. This means that the ViewIssue page accounts for 15.7% of all dynamic requests but, more interestingly, about 30% of all pages visited.
So at the page level, we can see the most common actions on J.A.C are approximately:
Page | Usage |
---|---|
View Issue | 30% |
XML Search Requests | 26% |
Issue Navigator (search) | 12% |
Login | 5% |
Dashboard | 4% |
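The page-level figures follow from a simple rescaling: if roughly half of the dynamic requests are background requests, each page's share of page views is about double its share of requests. The sketch below assumes a page fraction of exactly 0.5, which is only an approximation of "approximately half", so its results will be close to, but not identical to, the rounded figures in the table. The function name is hypothetical.

```python
def page_level_share(percent_of_requests, page_fraction=0.5):
    """Rescale a share of all dynamic requests to a share of page views.

    page_fraction is the assumed fraction of dynamic requests that are
    primary page loads (the rest being background requests and errors).
    """
    if not 0 < page_fraction <= 1:
        raise ValueError("page_fraction must be in (0, 1]")
    return percent_of_requests / page_fraction


# Request-level percentages taken from the J.A.C table above.
request_share = {
    "ViewIssue": 15.7,
    "SearchRequest_XML": 14.0,
    "IssueNavigator": 6.0,
    "Login": 2.8,
    "Dashboard": 2.2,
}
for page, percent in request_share.items():
    print(page, round(page_level_share(percent), 1))
```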
High level assessment of J.A.C
1. We think that the View Issue, Issue Nav and Dashboard numbers are in the normal range for an instance of this type.
2. Although Login seems quite high, it is explained by the large number of casual users on this system - J.A.C is a public instance with about 80,000 user accounts.
3. Search Requests that return issues in XML format seem excessive, and this is probably quite specific to this instance - this is one of the reasons it is useful to get a broader picture of "normal usage".
Internal and external findings
Not content with running this tool on our own Jira server, we also asked a number of our enterprise customers to run the tool on their servers and send us the anonymized results. By knowing how our customers were using Jira, we could tune our performance tests to more accurately reflect how Jira is being used by our largest customers.
Listed below are the combined findings from a number of large Jira servers, including our own and those of several customers. The Jira version and peak load are included along with the top five most common request categories and their percentages (static resources are ignored). The "peak load" is the maximum load seen during peak times. A second load in brackets indicates a one-off spike where no other peaks come close.
Version | Peak Load | req 1 | % | req 2 | % | req 3 | % | req 4 | % | req 5 | % |
---|---|---|---|---|---|---|---|---|---|---|---|
v5.0.7 | 21.6 req/sec | GADGETS | 31.1% | REST_GADGET | 14.5% | ProjectAvatar | 11.9% | REST_OTHER | 9.7% | RPC | 4.7% |
v4.4.4 | 20 (45.5) req/sec | GADGETS | 17.5% | REST_GADGET | 10.8% | ViewIssue | 8.5% | REST_API | 8.3% | ProjectAvatar | 6.8% |
v4.4 | 19.8 (39.5) req/sec | REST_whoslooking | 51.3% | GADGETS | 11.9% | REST_GADGET | 8.3% | PLUGIN_RESOURCES | 4.7% | REST_API | 4.0% |
? | 15.8 req/sec | ProjectAvatar | 24.8% | REST_API | 18.9% | GADGETS | 11.6% | REST_GADGET | 6.9% | RPC | 6.7% |
? | 12.9 req/sec | ProjectAvatar | 36.2% | REST_API | 17.0% | GADGETS | 10.6% | REST_GADGET | 7.2% | ViewIssue | 4.3% |
v4.4.3 | 12.4 req/sec | ProjectAvatar | 18.5% | GADGETS | 13.8% | RPC | 11.4% | USER_AVATAR | 9.2% | REST_GADGET | 7.9% |
v5.1 | 12.1 req/sec | REST_API | 20.9% | ViewIssue | 15.7% | SearchRequest_XML | 14.0% | IssueNavigator | 6.0% | GADGETS | 4.2% |
v4.3.4 | 12.1 (31.6) req/sec | GADGETS | 23.4% | REST_API | 13.0% | REST_GADGET | 8.9% | Tempo | 8.6% | ViewIssue | 7.7% |
v5.0 | 10.3 req/sec | GADGETS | 31.3% | REST_GADGET | 21.6% | USER_AVATAR | 8.9% | ViewIssue | 6.3% | REST_API | 4.0% |
v5.1 | 9.5 req/sec | IssueNavigator | 28.7% | GADGETS | 17.0% | REST_GADGET | 10.3% | REST_OTHER | 9.6% | RPC | 4.0% |
v5.1 | 8.2 req/sec | CLIENT_ERROR_4xx | 27.5% | SearchRequest_XML | 11.1% | REST_API | 9.0% | GADGETS | 7.5% | REST_GADGET | 6.3% |
v3.13 | 4.9 req/sec | LazyPortletLoader | 26.6% | ViewIssue | 15.7% | Charts | 13.9% | Dashboard | 6.5% | WorkflowTransition | 4.8% |
v4.2.2 | 4.6 req/sec | GADGETS | 21.2% | REST_GADGET | 20.1% | REST_API | 11.4% | OpenSearch | 9.7% | ViewIssue | 7.0% |
Confirming our performance tests
Our current performance tests are based on the assumption that View Issue, Issue Nav and Dashboard are the three primary pages used and account for a majority of incoming requests. (The tests include a large number of secondary pages including Login, Edit Issue, Browse Project, etc., but each accounts for only a relatively small share of all requests - 2% or less.) We wanted to see if there were other pages that were very important for customers, and also if the ratio we had was realistic. We found that these are indeed the top three pages used by our customers as well. The most common ratio was between two and three View Issue views per Search and about 1.5 Searches per Dashboard.
The one interesting pattern that we saw was that a number of sites had a very large number of Gadget requests compared to Dashboard requests (between 10:1 and 40:1). Although this could be dashboards with more than 10 gadgets on them, it seems more likely that this indicates heavy usage of "wallboards": unattended dashboards where individual gadgets refresh themselves periodically.
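A quick way to check a site for this pattern is to compute the Gadget-to-Dashboard ratio from the category counts. The sketch below uses the category names from the summary; the function names are hypothetical, and the threshold of 10 is an assumption taken from the lower end of the 10:1 to 40:1 range mentioned above.

```python
def gadget_dashboard_ratio(counts):
    """Return GADGETS requests per Dashboard request, or None if no Dashboard requests."""
    dashboards = counts.get("Dashboard", 0)
    if dashboards == 0:
        return None
    return counts.get("GADGETS", 0) / dashboards


def looks_like_wallboard_usage(counts, threshold=10.0):
    """Flag sites whose gadget traffic suggests unattended, self-refreshing dashboards."""
    ratio = gadget_dashboard_ratio(counts)
    return ratio is not None and ratio >= threshold


# Counts from the J.A.C table earlier on this page.
jac = {"GADGETS": 313464, "Dashboard": 162825}
print(round(gadget_dashboard_ratio(jac), 1))  # about 1.9 gadget requests per Dashboard request
print(looks_like_wallboard_usage(jac))        # False: well below the 10:1 range
```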