Type: Improvement
Resolution: Unresolved
Priority: Normal
Summarising http://bugs.musicbrainz.org/ticket/5759
For Wikipedia pages whose names contain non-ASCII characters, there are two valid kinds of URL: those in which every character is represented by itself, and those in which the non-ASCII characters are URL-encoded (percent-encoded). Both are "valid" in the sense that both apparently work as expected when entered as addresses in browsers, when used as an "href", and when given to MusicBrainz in a URL AR.
(Note: I believe a URL can also be partly encoded, i.e. some non-ASCII characters URL-encoded and others not, in the same URL.)
It would be nice if the JS converted both kinds of URLs to the non-encoded form when pasted; it is more legible for users.
Note that some browsers (e.g. Firefox) convert URLs from the address bar to the encoded form when copying them to the clipboard (presumably to protect them from Unicode-deficient applications), which makes users pasting encoded URLs a common use case.
I'm writing this specifically for Wikipedia because that's the use case I run into most often. I think this could be applied to all URLs, but I haven't tested it. Something similar might be applicable to Punycode-encoded URLs, though I haven't needed to use any of them yet.
Example:
http://ro.wikipedia.org/wiki/Ro%C8%99u_%C8%99i_Negru_%28forma%C8%9Bie%29
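As a rough illustration of the requested conversion, here is a minimal sketch built on the standard `decodeURI` function. The helper name `toReadableUrl` is made up for this example and is not taken from the MusicBrainz code. It assumes the escapes are UTF-8, and it returns the input unchanged when they are not.

```javascript
// Hypothetical helper (not MusicBrainz code): turn a fully or partly
// percent-encoded URL into its readable form.
function toReadableUrl(url) {
  try {
    // decodeURI passes raw non-ASCII characters through untouched and keeps
    // escapes of URL delimiters such as %2F, %3F and %23 encoded.
    return decodeURI(url);
  } catch (e) {
    // Escapes that are not valid UTF-8 (e.g. a lone %E9) throw URIError;
    // fall back to the text exactly as it was given.
    if (e instanceof URIError) return url;
    throw e;
  }
}

const encoded =
  "http://ro.wikipedia.org/wiki/Ro%C8%99u_%C8%99i_Negru_%28forma%C8%9Bie%29";
console.log(toReadableUrl(encoded));
// http://ro.wikipedia.org/wiki/Roșu_și_Negru_(formație)
```

Because characters that are already unencoded pass through unchanged, the same call also covers partly encoded URLs. One caveat: `decodeURI` also decodes ASCII escapes such as `%20` and `%25`, so a real implementation might prefer to decode only the non-ASCII sequences.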
Note that the decision in MBS-217 means URLs will always be stored encoded/canonicalised, and MBS-219 means they will be un-encoded for display. This ticket proposes extending the un-encoding to the client-side edit box, overriding whatever encoding a browser's copy+paste might apply.
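One way the edit-box behaviour could be wired up is sketched below. This is an assumption-laden illustration, not the MusicBrainz implementation: `decodeOnPaste` is a hypothetical name, `input` is assumed to be a text `<input>` element, and the decoder defaults to the built-in `decodeURI`.

```javascript
// Hypothetical wiring (not MusicBrainz code): decode percent-escapes in
// pasted text before it lands in the URL edit box.
function decodeOnPaste(input, decode = decodeURI) {
  input.addEventListener("paste", (event) => {
    const pasted = event.clipboardData.getData("text/plain");
    let readable;
    try {
      readable = decode(pasted);
    } catch (e) {
      return; // decoding failed (e.g. malformed escapes): keep the default paste
    }
    if (readable === pasted) return; // nothing to decode: keep the default paste
    event.preventDefault();
    // Replace the current selection with the decoded text and move the
    // caret to the end of the inserted text.
    input.setRangeText(readable, input.selectionStart, input.selectionEnd, "end");
  });
}
```

The handler only takes over when decoding actually changes the text, so ordinary pastes behave exactly as before. Since the value is set programmatically, any code that reacts to `input` events on the field may need to be notified separately.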