
Repository administrators can import issue data into Bitbucket or export it from Bitbucket. Bitbucket exports (and imports) data as a ZIP package. For imports, this package must contain a single db-1.0.json file and, if the issues had attached files, an optional attachments directory. To test or extract the package, use the zip command or a GUI utility such as 7-Zip. The following command tests the archive and lists its contents:

$ zip -Tv bitbucket-issues.zip 
Archive: bitbucket-issues.zip
 testing: attachments/cow_small.png OK
 testing: db-1.0.json                  OK
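
The same check can be scripted before an upload. The following is a minimal sketch using Python's standard zipfile module; the function name check_package and the in-memory sample package are illustrative assumptions, not part of Bitbucket's tooling.

```python
import io
import zipfile

def check_package(zip_source):
    """Return a list of problems found in an issue ZIP package (empty if none)."""
    problems = []
    with zipfile.ZipFile(zip_source) as archive:
        # testzip() returns the name of the first corrupt member, or None.
        bad_member = archive.testzip()
        if bad_member is not None:
            problems.append(f"corrupt member: {bad_member}")
        # The import expects a single db-1.0.json file in the package.
        if archive.namelist().count("db-1.0.json") != 1:
            problems.append("expected exactly one db-1.0.json at the top level")
    return problems

# Build a tiny in-memory package so the sketch runs without files on disk.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("db-1.0.json", "{}")
    archive.writestr("attachments/cow_small.png", b"placeholder bytes")

problems = check_package(buffer)
print(problems)
```

Passing a file name such as "bitbucket-issues.zip" instead of the in-memory buffer works the same way.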

When you use the issue tracker import/export feature to move issues from one Bitbucket repo to another, the issue IDs remain the same. However, the comment IDs, and therefore the permalinks, change. Comment IDs change because, in contrast to issue IDs, they are not local to the repo.

Certain objects, such as comments, rely on foreign keys.  During an import, Bitbucket silently uses NULL to replace any foreign keys that it cannot resolve (for example, a username that no longer exists on Bitbucket).

Finally, when you export data, Bitbucket does not export issues marked with a spam flag.  When importing data from ZIP files created by third-party extensions, the standard Bitbucket spam checking applies.  

Example db-1.0.json File

The db-1.0.json file specifies the following objects: issues, comments, attachments, components, milestones, versions, meta, and logs. Each object is described in the sections that follow.
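
As an illustration only, the following sketch builds a small db-1.0.json document in Python and serializes it. The field names come from the sections on this page; the concrete values (issue 1, its title and content, the comment text) are hypothetical and are not taken from a real export.

```python
import json

# Hypothetical minimal document; values are illustrative, field names follow this page.
db = {
    "meta": {
        "default_assignee": None,
        "default_component": None,
        "default_kind": "bug",
        "default_milestone": None,
        "default_version": None,
    },
    "components": [{"name": "api"}],
    "milestones": [{"name": "M2"}],
    "versions": [{"name": "1.0"}],
    "issues": [
        {
            "id": 1,
            "title": "Example issue",
            "content": "Steps to reproduce...",
            "kind": "bug",
            "priority": "major",
            "status": "new",
            "assignee": None,
            "reporter": "evzijst",
            "component": "api",
            "milestone": "M2",
            "version": "1.0",
            "created_on": "2013-01-22T15:55:13.573339+00:00",
            "content_updated_on": "2013-03-21T03:44:04.660869+00:00",
            "updated_on": "2013-04-21T03:12:04.660869+00:00",
            "edited_on": None,
            "watchers": ["evzijst"],
            "voters": [],
        }
    ],
    "comments": [
        {
            "id": 1,
            "issue": 1,
            "content": "Example comment",
            "created_on": "2013-01-26T01:00:55.994000+00:00",
            "updated_on": None,
            "user": "evzijst",
        }
    ],
    "attachments": [
        {
            "filename": "cow_small.png",
            "issue": 1,
            "path": "attachments/cow_small.png",
            "user": "evzijst",
        }
    ],
    "logs": [],
}

text = json.dumps(db, indent=2)
print(text[:60])
```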

issues

The db-1.0.json file contains an issues array. That array contains one or more issue objects, each of which has the following fields:

| Field | Required/non-null | Description |
| --- | --- | --- |
| assignee | | A string value containing the username (for example, "evzijst"). This value can be null. |
| component | | A string value containing a component name. The value must be declared in a components section (for example, "api"). This value can be null. |
| content | | A string value containing the issue's content. This value can be null. |
| content_updated_on | Yes | The timestamp of the most recent change to the issue's description. A non-null string in ISO 8601 datetime format (for example, "2013-03-21T03:44:04.660869+00:00"). |
| created_on | Yes | A non-null string in ISO 8601 datetime format (for example, "2013-01-22T15:55:13.573339+00:00"). |
| edited_on | | The timestamp of the most recent change to the issue's description (deprecated). A string in ISO 8601 datetime format (for example, "2013-01-22T15:55:13.573339+00:00"). This value can be null. |
| id | Yes | A non-null, unique integer representing the issue identifier. |
| kind | Yes | A non-null string containing one of the following values: "bug", "enhancement", "proposal", "task". |
| milestone | | A string value containing a milestone name. The value must be declared in a milestones section (for example, "M2"). This value can be null. |
| priority | Yes | A non-null string containing one of the following values: "trivial", "minor", "major", "critical", "blocker". |
| reporter | | A string value containing the username (for example, "evzijst"). This value can be null. |
| status | Yes | A non-null string containing one of the following values: "new", "open", "resolved", "on hold", "invalid", "duplicate", "wontfix". |
| title | Yes | A non-null string representing the issue title. This string is limited to 255 characters. |
| updated_on | Yes | The timestamp of the most recent change to the issue's state. This includes everything from assignee to status. A string in ISO 8601 datetime format (for example, "2013-04-21T03:12:04.660869+00:00"). This string cannot be null. |
| version | | A string value containing a version name. The value must be declared in a versions section (for example, "1.0"). This value can be null. |
| watchers | | A list of usernames. This value can be null. |
| voters | | A list of usernames. This value can be null. |
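
When an import file is generated by another system, it can help to check each issue against the rules above before uploading. The following is a rough sketch, assuming a hypothetical helper named validate_issue; it is not Bitbucket's own validation. It relies on datetime.fromisoformat, which accepts the timestamp style shown above but, on Python versions before 3.11, rejects a trailing "Z".

```python
from datetime import datetime

KINDS = {"bug", "enhancement", "proposal", "task"}
PRIORITIES = {"trivial", "minor", "major", "critical", "blocker"}
STATUSES = {"new", "open", "resolved", "on hold", "invalid", "duplicate", "wontfix"}
REQUIRED = ("content_updated_on", "created_on", "id", "kind",
            "priority", "status", "title", "updated_on")
TIMESTAMPS = ("content_updated_on", "created_on", "updated_on", "edited_on")

def validate_issue(issue):
    """Return a list of rule violations for one issue object (empty if none)."""
    errors = []
    for field in REQUIRED:
        if issue.get(field) is None:
            errors.append(f"{field} is required and must not be null")
    for field, allowed in (("kind", KINDS), ("priority", PRIORITIES), ("status", STATUSES)):
        value = issue.get(field)
        if value is not None and value not in allowed:
            errors.append(f"{field} has unsupported value {value!r}")
    title = issue.get("title")
    if isinstance(title, str) and len(title) > 255:
        errors.append("title exceeds 255 characters")
    for field in TIMESTAMPS:
        value = issue.get(field)
        if value is not None:
            try:
                datetime.fromisoformat(value)
            except (TypeError, ValueError):
                errors.append(f"{field} is not an ISO 8601 datetime string")
    return errors

sample_issue = {
    "id": 1,
    "title": "Example issue",
    "kind": "bug",
    "priority": "major",
    "status": "new",
    "created_on": "2013-01-22T15:55:13.573339+00:00",
    "content_updated_on": "2013-03-21T03:44:04.660869+00:00",
    "updated_on": "2013-04-21T03:12:04.660869+00:00",
}
print(validate_issue(sample_issue))
```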

comments

The db-1.0.json file contains a comments array. That array contains one or more comment objects, each of which has the following fields:

| Field | Required/non-null | Description |
| --- | --- | --- |
| content | | A string value containing the comment's content. This value can be null. |
| created_on | Yes | A non-null string in ISO 8601 datetime format (for example, "2013-01-26T01:00:55.994000+00:00"). |
| id | Yes | A non-null integer that is unique among the comments array. |
| issue | Yes | A non-null integer representing a foreign key to an existing issue.id in the issues array. |
| updated_on | | A string in ISO 8601 datetime format (for example, "2013-04-21T03:12:04.660869+00:00"). This string can be null. |
| user | | A string value containing the username (for example, "evzijst"). This value can be null. |
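
The id and issue rules above can be checked mechanically. The sketch below assumes the whole db-1.0.json content has already been loaded into a dictionary; check_comments is a hypothetical helper name.

```python
def check_comments(db):
    """Return problems with comment ids and comment-to-issue references."""
    issue_ids = {issue["id"] for issue in db.get("issues", [])}
    seen_ids = set()
    problems = []
    for comment in db.get("comments", []):
        comment_id = comment.get("id")
        # Comment ids must be non-null and unique within the comments array.
        if comment_id is None or comment_id in seen_ids:
            problems.append(f"comment id {comment_id!r} is missing or duplicated")
        seen_ids.add(comment_id)
        # Each comment must point at an issue that exists in the issues array.
        if comment.get("issue") not in issue_ids:
            problems.append(
                f"comment {comment_id!r} points at unknown issue {comment.get('issue')!r}"
            )
    return problems

sample_db = {
    "issues": [{"id": 1}],
    "comments": [
        {"id": 1, "issue": 1},
        {"id": 1, "issue": 2},  # duplicate id and dangling issue reference
    ],
}
for problem in check_comments(sample_db):
    print(problem)
```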

attachments

The db-1.0.json file contains an attachments array. That array contains one or more attachment objects, each of which has the following fields:

| Field | Required/non-null | Description |
| --- | --- | --- |
| filename | Yes | A non-null string that is the name of an attachment as it appears on the issue tracker (for example, "cow_small.png"). |
| issue | Yes | A non-null integer representing a foreign key to an existing issue.id in the issues array. |
| path | Yes | A non-null string that contains the location of a file in the ZIP. |
| user | | A string value containing the username (for example, "evzijst"). This value can be null. |
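
Because path refers to a file inside the ZIP, a package can be checked for attachment entries whose files are missing. This is a sketch under the assumption that path is stored exactly as the member name appears in the archive; missing_attachment_paths is a hypothetical helper.

```python
import io
import json
import zipfile

def missing_attachment_paths(zip_source):
    """Return attachment paths listed in db-1.0.json that are absent from the ZIP."""
    with zipfile.ZipFile(zip_source) as archive:
        member_names = set(archive.namelist())
        db = json.loads(archive.read("db-1.0.json"))
    return [
        attachment["path"]
        for attachment in db.get("attachments", [])
        if attachment["path"] not in member_names
    ]

# In-memory sample package: one attachment is present, one is not.
sample_db = {
    "attachments": [
        {"filename": "cow_small.png", "issue": 1, "path": "attachments/cow_small.png"},
        {"filename": "lost.png", "issue": 1, "path": "attachments/lost.png"},
    ]
}
package = io.BytesIO()
with zipfile.ZipFile(package, "w") as archive:
    archive.writestr("db-1.0.json", json.dumps(sample_db))
    archive.writestr("attachments/cow_small.png", b"placeholder bytes")

print(missing_attachment_paths(package))
```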

components

The db-1.0.json file contains a components array. That array contains one or more component objects, each of which has a single name field. The name must be a unique, non-null string of at most 128 characters (for example, api).

milestones

The db-1.0.json file contains a milestones array. That array contains one or more milestone objects, each of which has a single name field. The name must be a unique, non-null string of at most 128 characters (for example, M2).

versions

The db-1.0.json file contains a versions array. That array contains one or more version objects, each of which has a single name field. The name must be a unique, non-null string of at most 128 characters (for example, 1.0).
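
The components, milestones, and versions arrays share the same rules, and issues may only refer to names declared in them. A minimal sketch of both checks follows; check_names is a hypothetical helper that operates on the loaded db-1.0.json dictionary.

```python
# Maps the issue field to the array in which its value must be declared.
SECTIONS = {"component": "components", "milestone": "milestones", "version": "versions"}

def check_names(db):
    """Return problems with name declarations and the issues that reference them."""
    problems = []
    for issue_field, section in SECTIONS.items():
        names = [entry.get("name") for entry in db.get(section, [])]
        for name in names:
            # Names must be non-null strings of at most 128 characters.
            if not isinstance(name, str) or len(name) > 128:
                problems.append(f"{section}: invalid name {name!r}")
        if len(set(names)) != len(names):
            problems.append(f"{section}: names must be unique")
        declared = set(names)
        for issue in db.get("issues", []):
            value = issue.get(issue_field)
            if value is not None and value not in declared:
                problems.append(
                    f"issue {issue.get('id')}: {issue_field} {value!r} "
                    f"is not declared in {section}"
                )
    return problems

sample_db = {
    "components": [{"name": "api"}],
    "milestones": [{"name": "M2"}],
    "versions": [{"name": "1.0"}],
    "issues": [{"id": 1, "component": "api", "milestone": "M3", "version": None}],
}
print(check_names(sample_db))
```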

meta

The meta object has the following fields:

| Field | Required/non-null | Description |
| --- | --- | --- |
| default_assignee | | A string value containing the username (for example, "evzijst"). This value can be null. |
| default_component | | A string value containing a component name. The value must be declared in a components section (for example, "api"). This value can be null. |
| default_kind | Yes | A non-null string containing one of the following values: "bug", "enhancement", "proposal", "task". |
| default_milestone | | A string value containing a milestone name. The value must be declared in a milestones section (for example, "M2"). This value can be null. |
| default_version | | A string value containing a version name. The value must be declared in a versions section (for example, "1.0"). This value can be null. |
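
The meta defaults follow the same declaration rules as the corresponding issue fields. A brief sketch, assuming the loaded db-1.0.json dictionary and a hypothetical helper named check_meta:

```python
KINDS = {"bug", "enhancement", "proposal", "task"}
# Maps each meta default to the array in which its value must be declared.
DEFAULTS = {
    "default_component": "components",
    "default_milestone": "milestones",
    "default_version": "versions",
}

def check_meta(db):
    """Return problems found in the meta object (empty list if none)."""
    meta = db.get("meta", {})
    problems = []
    if meta.get("default_kind") not in KINDS:
        problems.append("default_kind must be one of: " + ", ".join(sorted(KINDS)))
    for key, section in DEFAULTS.items():
        value = meta.get(key)
        declared = {entry.get("name") for entry in db.get(section, [])}
        if value is not None and value not in declared:
            problems.append(f"{key} {value!r} is not declared in {section}")
    return problems

sample_db = {
    "meta": {"default_kind": "bug", "default_component": "api", "default_version": "2.0"},
    "components": [{"name": "api"}],
    "versions": [{"name": "1.0"}],
}
print(check_meta(sample_db))
```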

logs

The db-1.0.json file contains a logs array. That array contains one or more log objects, each of which has the following fields:

| Field | Required/non-null | Description |
| --- | --- | --- |
| changed_from | | A string of 255 characters or fewer containing the original value (for example, "/evzijst/bitbucket/issue-attachment/1/evzijst/bitbucket/1359162055.6/1/cow_small.png"). This value can be null. |
| changed_to | Yes | A string of 255 characters or fewer containing the new value (for example, "cow_small.png"). This value cannot be null. Instead, an empty string is used to indicate that the field has no value. |
| comment | Yes | A non-null integer that is a foreign key to a comment.id in the comments section. |
| created_on | Yes | A non-null string in ISO 8601 datetime format (for example, "2013-01-22T15:55:13.573339+00:00"). |
| field | Yes | A non-null string representing the name of the changed field (for example, "attachment"). This value should not exceed 32 characters. |
| issue | Yes | A non-null integer representing a foreign key to an existing issue.id in the issues array. |
| user | | A string value containing the username (for example, "evzijst"). This value can be null. |
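
The length limits and foreign keys for log entries can be checked in the same way as the other arrays. The following is a sketch only; check_logs is a hypothetical helper that works on the loaded db-1.0.json dictionary.

```python
def check_logs(db):
    """Return problems with log entries (empty list if none)."""
    comment_ids = {comment["id"] for comment in db.get("comments", [])}
    issue_ids = {issue["id"] for issue in db.get("issues", [])}
    problems = []
    for index, log in enumerate(db.get("logs", [])):
        where = f"logs[{index}]"
        changed_to = log.get("changed_to")
        if changed_to is None:
            problems.append(f"{where}: changed_to cannot be null; use an empty string")
        elif len(changed_to) > 255:
            problems.append(f"{where}: changed_to exceeds 255 characters")
        changed_from = log.get("changed_from")
        if changed_from is not None and len(changed_from) > 255:
            problems.append(f"{where}: changed_from exceeds 255 characters")
        field = log.get("field")
        if field is None or len(field) > 32:
            problems.append(f"{where}: field is missing or exceeds 32 characters")
        if log.get("created_on") is None:
            problems.append(f"{where}: created_on is required")
        # Both foreign keys must resolve within the same file.
        if log.get("comment") not in comment_ids:
            problems.append(f"{where}: comment does not match any comment.id")
        if log.get("issue") not in issue_ids:
            problems.append(f"{where}: issue does not match any issue.id")
    return problems

sample_db = {
    "issues": [{"id": 1}],
    "comments": [{"id": 10, "issue": 1}],
    "logs": [
        {
            "changed_from": None,
            "changed_to": "cow_small.png",
            "comment": 10,
            "created_on": "2013-01-22T15:55:13.573339+00:00",
            "field": "attachment",
            "issue": 1,
            "user": "evzijst",
        }
    ],
}
print(check_logs(sample_db))
```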

25 Comments

  1. Anonymous

    Thanks for keeping the data format so simple! I wrote a Python script for syncing issues with Redmine that uses this data format, and it was very easy to work with. Here's the code:

    https://bitbucket.org/theasci/redmine-issues-to-bitbucket/src/master/redmine_issues_to_bitbucket.py#cl-266

  2. I'm seeing that in a large issue import where I've ensured all comment/description fields are Markdown, a lot of issues/comments are being interpreted as "creole" instead, though there isn't a pattern to it (I'm wondering if it's based on the date at which BB switched their markup systems? icky...). If I try to edit one of the fields post-import, it then seems to be recognized as Markdown. Is this a bug, and/or is there a way to instruct the import that all text should be interpreted as Markdown and not anything else?

    1. Hey Mike,

      Creole/MD is indeed determined on timestamp. On 4 Oct 2012 we switched everything to Markdown. Every new piece of content created after that is assumed to be markdown. Creole is no longer usable. Of course we couldn't break the existing content and so we made it such that content untouched since that date is interpreted as Creole.

      If an old issue is being edited, it switches to Markdown. We made this decision in an attempt to make things more straightforward so users only needed to know one markup language going forward.

      Now importing and exporting should not have any effect on any of this. Exporting and re-importing old, untouched issues should continue to be rendered in Creole, while new, Markdown issues should continue to be MD. It's controlled by the issue.content_updated_on and comment.updated_on fields.

      If you are seeing different behavior, can you send me the exported zip, a link to the original repo and the URL of an issue/comment that gets rendered incorrectly?

       

      1. Anonymous

        I'm running an import that was generated from an external system, e.g. trac.   So what's the solution for that use case?

        1. Gotcha. To be frank, that scenario wasn't considered, as the MD switch predated the import/export functionality.

          As a quick dirty workaround for now I would just set the issue.content_updated_on and comment.updated_on fields to a date after Oct 4th 2012. Unfortunately those incorrect dates are displayed in the UI.

          Since there's no markup column in the database, there's not really anything else I can do at this point, other than adding that. If you want we can raise an issue for that though.

          1. hmm isn't updated_on the same as if you had edited the comment?  I just edited a comment on my test import (https://bitbucket.org/zzzeek/sqla_exp/issue/2149#comment-8176295) and it still shows the date of the comment itself.   ill test this further and this might be how I go forward for now.

            there should be some kind of flag in the import format that can signal this kind of thing.   either an optional field on each comment/description called "format" which only accepts "markdown" or "creole" for now perhaps, or something on the import file as a whole.   Can you raise an issue for that?

            1. OK that doesn't work. 

              Here's an imported comment:

               {
                 "content": "the example is correct.\r\n\r\nFirst one is:\r\n\r\n\n    >>> a1 = addresses.alias('a1')\r\n    >>> a2 = addresses.alias('a2')\r\n\n\r\nsecond one is:\r\n\r\n\n    >>> a1 = addresses.alias()\r\n    >>> a2 = addresses.alias()\r\n\n\r\nSQL output is also different.\r\n\r\nhowever , that whole section is too verbose and using a name in the `alias()` is only worth a side mention, also the point that \"anonymous aliases have the advantage of deterministic SQL\" is not really accurate, as a named alias has the same behavior.  just mention that anonymous aliases don't produce a \"random\" string.",
                 "created_on": "2011-04-22T19:07:33+00:00",
                 "user": "zzzeek",
                 "updated_on": "2012-10-05T00:00:00+00:00",
                 "issue": 2145,
                 "id": 8082
               },

               but still comes out as "Creole": https://bitbucket.org/zzzeek/sqla_exp/issue/2145#comment-8220296

               

              1. I can't reproduce that and I don't have access to that repo to see the result you're getting.

                Mind switching to email and emailing me your zip file that I can test with locally? I'm erik@atlassian.com

                 

  3. Anonymous

    "The timestamp of the most recent change to the issue's state. This includes everything from assignee to status."

    What does that mean? Clearly it means "assignee" and "status," but what are the fields other than that? (I'm attempting to write an importer from outside Bitbucket, so I need to know more than I would if I were doing an export --> import operation.)

    1. What does that mean? Clearly it means "assignee" and "status," but what are the fields other than that?

       Any issue property. So another would be milestone, but basically everything you can change when editing an issue.

      1. Anonymous

        But, for example, a comment would not matter? Just the fields defined on "issue" (for the purposes of import)?

          1. Anonymous

            Thank you very much.

  4. Anonymous

    What are the legal values for "field" in the "logs" array? Any 128-character string? Are there any that the issue tracker "understands?"

    1. As per the spec: "This value should not exceed 32 characters."

      This field can contain any string, but is meant to contain the name of the field that was changed (e.g. 'milestone').

      You might want to create some activity in the issue tracker and then export it to inspect the generated json file.

      1. Anonymous

        Sorry, yes, 32 characters, I meant. Are any strings interpreted specially, I mean, or is the string a display-only string? Is there a list of field names? (I understand that if there isn't I can try to experiment with the issue tracker and export to try to reverse-engineer them, but I'd love a quicker and/or more complete solution if it exists.)

        1. Are any strings interpreted specially, I mean, or is the string a display-only string?

          Nope. The log entries are for display purposes only.

          Is there a list of field names?

          Sure. It's every field name on the issue object.

          1. Anonymous

            Thank you! Very clear!

  5. Anonymous

    Is anything assumed about the order of comment ids, or can each ID just be a random number? Is the ordering of comments governed entirely by created_on rather than id?

  6. Anonymous

    Are the user IDs expected to be bitbucket IDs with no "at" symbol, as in the examples? If E-mail addresses are used, are they validated in any way? Can I use E-mail addresses that do not correspond to Bitbucket accounts? (If I create my import file from another system, for example?)

  7. When trying to import a list of around 340 bugs, which typically have at least 2-3 machine-generated comments each, we run into an issue where somewhere between 5-50% of the bugs (and a lot of the comments, not sure how many exactly) get automatically marked with is_spam. It seems to vary import by import. The first time we imported the bugs, we had only 146 bugs visible in the Bitbucket issue tracker. A subsequent import resulted in 332 visible bugs, but still short of the full set. It appears that the visibility varies with the user too. The owner of the issue tracker can see more issues than a registered Bitbucket user who is not a collaborator on the repository.

    Is there a mechanism for defeating the SPAM marking on import or (programmatically) updating the is_spam flag? Our import is from a private bug tracker and we know that every single bug and comment is SPAM-free.

    1. Our import is from a private bug tracker and we know that every single bug and comment is SPAM-free.

      That spam check is there to protect Bitbucket from people deliberately uploading a ton of spam. That's why incoming content gets checked.

      Now clearly our spam checker goes off the rails somewhere if it ends up incorrectly flagging your content. We've seen this before, but haven't been able to do anything meaningful about it.

      I've created an internal issue for this, but for now, we can help you out by manually resetting the spam flags on all your imported content after you're done. Can you contact us at support@bitbucket.org and refer to this comment?

      1. Thanks Erik. We have now completed our final import of the issue data and as in our tests, there are both issues and comments marked as spam. I have a support ticket open, where I requested the spam flags to be cleared. The issue has Reference: BBS-14925

        I have also attached the zip file with the import data to the above issue, so that if you decide to work on fine tuning spam detection, you can use this data set as a test case.