Improvement
Resolution: Unresolved
Normal
Using regular expressions to parse URLs is error-prone and misses various corner cases. All modern browsers now support the URL and URLSearchParams objects for parsing and manipulating URLs and query parameters, so it would be sensible to replace the regular expressions in the URL Cleanup code with these APIs.
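As a rough illustration of the proposal (not the actual URL Cleanup code), the sketch below parses a URL with the URL constructor and reads a query parameter through searchParams. The helper name parseUrlOrNull is hypothetical. It relies on the URL constructor throwing a TypeError for input it cannot parse.

```javascript
// Hypothetical helper, for illustration only; not taken from the URL Cleanup code.
// Returns a parsed URL object, or null if the input is not a valid absolute URL.
function parseUrlOrNull(input) {
  try {
    return new URL(input);
  } catch (error) {
    // The URL constructor throws a TypeError on unparseable input.
    return null;
  }
}

const parsed = parseUrlOrNull(
  'https://embed.spotify.com/?uri=spotify:track:7gwRSZ0EmGWa697ZrE58GA',
);
if (parsed !== null) {
  console.log(parsed.hostname);                // embed.spotify.com
  console.log(parsed.pathname);                // /
  console.log(parsed.searchParams.get('uri')); // spotify:track:7gwRSZ0EmGWa697ZrE58GA
}
```

Components are read from named properties instead of regex capture groups, and malformed input is rejected in one place.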
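Adopting these APIs would also require changes to frontend URL rendering, because they apply strict percent-encoding/decoding and validation. For example, URL percent-encodes non-ASCII path characters, and passing a query string through URLSearchParams re-serializes it in form-urlencoded style: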
- https://baike.baidu.com/item/啊呀啦嗦 -> https://baike.baidu.com/item/%E5%95%8A%E5%91%80%E5%95%A6%E5%97%A6
- https://embed.spotify.com/?uri=spotify:track:7gwRSZ0EmGWa697ZrE58GA -> https://embed.spotify.com/?uri=spotify%3Atrack%3A7gwRSZ0EmGWa697ZrE58GA
- https://overture.doremus.org/performance?place=Westminster%20Cathedral -> https://overture.doremus.org/performance?place=Westminster+Cathedral
Consequently, various URLs already stored in the database may need to be reformatted.
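The three transformations listed above can be reproduced with a small sketch. The function name normalizeWithUrlApi is hypothetical, and round-tripping the query through URLSearchParams is only an assumption about how the cleanup code might use these APIs. The URL constructor alone leaves an ASCII query string such as ?uri=spotify:track:... untouched. The extra encoding of ":" and the "%20" to "+" change come from URLSearchParams serialization.

```javascript
// Hypothetical helper, for illustration only.
// Parses the input with URL, then re-serializes the query via URLSearchParams.
function normalizeWithUrlApi(input) {
  const url = new URL(input); // throws TypeError on invalid input
  // URLSearchParams decodes the query, then toString() re-encodes it using
  // application/x-www-form-urlencoded rules (space -> "+", ":" -> "%3A").
  const params = new URLSearchParams(url.search);
  url.search = params.toString();
  return url.href;
}

const examples = [
  'https://baike.baidu.com/item/啊呀啦嗦',
  'https://embed.spotify.com/?uri=spotify:track:7gwRSZ0EmGWa697ZrE58GA',
  'https://overture.doremus.org/performance?place=Westminster%20Cathedral',
];

for (const example of examples) {
  console.log(`${example} -> ${normalizeWithUrlApi(example)}`);
}
```

Comparing normalizeWithUrlApi(storedUrl) with storedUrl is one possible way to find database entries that would change, assuming the cleanup code ends up serializing URLs this way.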
Related tickets: