UWC Illegal Pagenames Framework

Place to discuss the Illegal Pagenames feature of the UWC.

Overview

Different wikis have different metacharacters: characters that have syntactic meaning. These characters often cannot be part of a page name, for an obvious reason: what would happen if you tried to link to a page that had a right bracket in its name? So Confluence, like many other wikis, has a list of characters that cannot be used in page names. If you try to create a page using any of them, the error message lists them.

So what are the illegal chars, already?
: ; < > @ / \ | # [ ] { } ^

And also: these cannot be used to start a pagename:

$ .. ~
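
As a rough illustration of the two lists above, here is a minimal Python sketch of a legality check. The function and constant names are hypothetical and are not part of the UWC; the sketch also assumes the backslash entry in the list means a single backslash character.

```python
# Characters that may not appear anywhere in a pagename (from the list above).
# Assumption: the backslash entry is a single backslash.
ILLEGAL_CHARS = set(':;<>@/\\|#[]{}^')

# Strings that may not start a pagename (from the second list above).
ILLEGAL_PREFIXES = ('$', '..', '~')


def is_legal_pagename(name: str) -> bool:
    """Return True if name avoids the illegal characters and prefixes."""
    if name.startswith(ILLEGAL_PREFIXES):
        return False
    return not any(ch in ILLEGAL_CHARS for ch in name)
```

Note that the prefix rule only applies at the start: under this sketch a name such as "a..b" passes, while "..b" does not.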

However, the UWC does not use Confluence's GUI to create pages, and the Confluence Remote API does not, at this time, appear to test for these characters when creating a page. Furthermore, even if it did, we'd still have the problem of maintaining the links to those pages.

Solution

So, the solution was to create two converters that are run at the end of every wiki conversion:

  1. to translate illegal pagenames into a legal equivalent
  2. to fix links to those pagenames

These converters are called directly from the engine, so that they cannot be turned off.
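
The two steps can be sketched roughly as follows. This is not the UWC's actual converter code: it is a simplified Python illustration that assumes pages are held in a dict of title to body, that links look like [Target] or [alias|Target], and that every illegal character becomes an underscore. All names here are hypothetical.

```python
import re

# Assumption: same illegal character set as listed in the Overview.
ILLEGAL_CHARS = set(':;<>@/\\|#[]{}^')

# Matches [Target] or [alias|Target]; a target containing ']' or '|'
# cannot be matched by this simple pattern.
LINK = re.compile(r'\[(?:([^\]|]*)\|)?([^\]|]+)\]')


def legalize(title, replacement='_'):
    """Step 1: translate an illegal pagename into a legal equivalent."""
    return ''.join(replacement if ch in ILLEGAL_CHARS else ch for ch in title)


def convert(pages):
    """Rename every page, then rewrite links that point at renamed pages."""
    # Note: two titles that legalize to the same name would collide here.
    renamed = {title: legalize(title) for title in pages}

    def fix_link(match):
        # Step 2: point the link at the new name, keeping any alias.
        alias, target = match.group(1), match.group(2)
        new_target = renamed.get(target, target)
        if alias is None:
            return '[' + new_target + ']'
        return '[' + alias + '|' + new_target + ']'

    return {renamed[title]: LINK.sub(fix_link, body)
            for title, body in pages.items()}
```

For example, a page titled "A:B" would be stored as "A_B", and a link [the page|A:B] on another page would become [the page|A_B]. The sketch ignores the start-of-name restrictions and name collisions.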

Concerns

  • Performance - Solutions to this problem are potentially slow, especially when run on a lot of pages (meaning thousands).
  • Doesn't handle right brackets in pagenames gracefully. (Luckily, due to the prevalence of right brackets in wiki syntax, this is an unusual use case)

Customization

Replacement Characters

The replacement characters for this conversion are kept in a properties file which you can customize.
The file is located in [uwc-dir]/conf/settings.illegalcharmap.properties

Each char's replacement is described in this file. For example, the property for semi-colons looks like this:

illegalchar.semicolon.replacement=.

Each illegal character has a similar line.
If an illegal character is not validly mapped in this file (its definition was deleted, its replacement is itself an illegal char, etc.), the default conversion is used.
The default conversion for an illegal character is _ (underscore).
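
To make the fallback rule concrete, here is a small Python sketch of reading such a file and choosing a replacement. It is an illustration only, not the UWC's loader: the parsing is deliberately simpler than real Java properties handling, the function names are hypothetical, and only the semicolon key is known from the example above, so the name table contains just that one entry.

```python
# Assumption: same illegal character set as listed in the Overview.
ILLEGAL_CHARS = set(':;<>@/\\|#[]{}^')
DEFAULT_REPLACEMENT = '_'

# Only the semicolon key is shown in the text; the key names for the
# other characters would have to be taken from the real file.
NAME_TO_CHAR = {'semicolon': ';'}


def load_replacements(text):
    """Parse illegalchar.<name>.replacement=<value> lines into a dict."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#') or '=' not in line:
            continue
        key, value = line.split('=', 1)
        parts = key.strip().split('.')
        if (len(parts) == 3 and parts[0] == 'illegalchar'
                and parts[2] == 'replacement'):
            char = NAME_TO_CHAR.get(parts[1])
            if char is not None:
                mapping[char] = value.strip()
    return mapping


def replacement_for(char, mapping):
    """Return the configured replacement, or the default if it is invalid."""
    value = mapping.get(char)
    # Missing, empty, or itself containing an illegal char: use underscore.
    if not value or any(c in ILLEGAL_CHARS for c in value):
        return DEFAULT_REPLACEMENT
    return value
```

With the example line above, a semicolon maps to a period; any character without a valid entry falls back to the underscore.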

Disabling the Default Illegal Pagenames Handling

See UWC Illegal Pagenames Framework - Disabling
