Using diff transcoding in Stash

As of Stash 3.1, Stash supports transcoding for diffs. This allows Stash to convert files in encodings like EUC-JP, GB18030 and UTF-16 to UTF-8, so they are processed correctly by git diff, which only supports UTF-8. Stash's source view has applied similar transcoding since it was released, so this change brings the diff view in line with the source view. Diff transcoding is applied to commit and pull request diffs, as well as the diff-to-previous view.

Git for Windows, formerly known as msysgit, has known issues with Unicode paths. Diff transcoding works on all supported versions of Git for Windows, but 1.8.0 or higher is required to support Unicode paths.

Enabling diff transcoding

Diff transcoding must be explicitly enabled for each repository (unlike source view transcoding, which is always performed).

Repository administrators can enable diff transcoding on the repository settings page.

Performance and scaling

Transcoding has a performance cost. It is implemented using Git's textconv support, which adds overhead to displaying diffs. Because Git only supports UTF-8 content, the best approach, where possible, is to store files as UTF-8 so that transcoding is not necessary. In repositories without non-UTF-8 content, diff transcoding should be left disabled. Other encodings are often a necessity, however, and for repositories containing such content, enabling diff transcoding allows the full range of Stash features to be used.

When transcoding is enabled, git diff writes the before and after blobs to temporary files and invokes the textconv script once for each file. The script Stash installs uses Perl to send a request back to Stash with the path to each temporary file. Stash then opens each file, detects the encoding using the same algorithm the source view uses, converts the file to UTF-8 and streams it out for git diff to use. After git diff has invoked the textconv script, the temporary files it created are deleted.
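The general shape of a textconv filter can be sketched in a few lines of Python. This is an illustration only, not the script Stash installs: Stash's script is written in Perl and delegates the conversion back to Stash, and its detection algorithm is not described on this page. The candidate encoding list, the Latin-1 fallback and the function names below are assumptions made for the example.

```python
import codecs
import sys

# Hypothetical candidate list for this sketch; Stash's real detection
# algorithm is not documented here and is certainly more sophisticated.
CANDIDATE_ENCODINGS = ("utf-8", "euc_jp", "gb18030")


def to_utf8(data: bytes) -> bytes:
    """Best-effort conversion of raw file bytes to UTF-8."""
    # A UTF-16 byte order mark is an unambiguous signal, so check it first.
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return data.decode("utf-16").encode("utf-8")
    # Otherwise take the first candidate that decodes without errors.
    for encoding in CANDIDATE_ENCODINGS:
        try:
            return data.decode(encoding).encode("utf-8")
        except UnicodeDecodeError:
            continue
    # Latin-1 maps every byte to a character, so this fallback never fails.
    return data.decode("latin-1").encode("utf-8")


def main(argv):
    # git invokes a textconv program with the path of a temporary file as
    # its only argument and reads the converted text from standard output.
    with open(argv[1], "rb") as handle:
        sys.stdout.buffer.write(to_utf8(handle.read()))


if __name__ == "__main__" and len(sys.argv) == 2:
    main(sys.argv)
```

In plain Git, a filter like this is wired up by assigning a diff driver to paths in `.gitattributes` and setting that driver's `diff.<driver>.textconv` configuration value to the program to run. Trying encodings in a fixed order can misidentify content, which is one reason a real implementation relies on proper detection rather than a short candidate list.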

Writing the blobs to disk, starting Perl and calling back into Stash are all overhead compared to performing a diff without transcoding. How much overhead varies with the size of the diff. When the files being compared are of typical size, two or three thousand lines or fewer, the overhead is minuscule: under 50 milliseconds on an average server. However, when comparing larger files, the overhead can cause a noticeable delay in displaying the diff.
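To get a rough feel for the fixed costs involved, the following Python sketch times two of the steps named above: writing a blob to a temporary file and starting an external interpreter that reads it. It is a stand-in under stated assumptions, not a benchmark of Stash. It launches Python rather than Perl, makes no request back to a server, and the function name `measure_overhead` is hypothetical, so the figures it reports will differ from those quoted here.

```python
import os
import subprocess
import sys
import tempfile
import time


def measure_overhead(blob: bytes) -> float:
    """Return the seconds spent writing blob to disk and spawning a reader."""
    start = time.perf_counter()
    # Step 1: write the blob to a temporary file, as git diff does for textconv.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(blob)
        path = tmp.name
    try:
        # Step 2: start a separate interpreter that reads the file back.
        # This stands in for launching the textconv script.
        subprocess.run(
            [sys.executable, "-c", "import sys; open(sys.argv[1], 'rb').read()", path],
            check=True,
        )
    finally:
        # Clean up the temporary file, mirroring what happens after textconv runs.
        os.remove(path)
    return time.perf_counter() - start


if __name__ == "__main__":
    for size in (10_000, 1_000_000, 50_000_000):
        elapsed = measure_overhead(b"x" * size)
        print(f"{size:>11} bytes: {elapsed * 1000:.1f} ms")
```

Because process startup is a fixed cost, small files are dominated by it, while the disk write and read grow with file size.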
