-
New Feature
-
Resolution: Unresolved
-
Normal
Link rot in external URLs on entities is widely recognized as a problem. The specific problem of dead URLs hosting unrelated, spam, or malicious content has been resolved by removing their hyperlink status (making them unclickable). However, a user who wants to see what a URL held historically still has to copy and paste it manually into e.g. the Internet Archive and go through a possibly very large number of capture timestamps to find the latest one that still holds correct data.
My proposed solution is to create a URL-URL relationship indicating an archived version of a URL. This could be implemented, for example, with a whitelist that for now contains only the Internet Archive Wayback Machine URL pattern. The archived URL could automatically be displayed alongside the original once the original is marked as ended, e.g. Official homepage: https://example.com (ended) (archived version), where the original URL remains unclickable and "archived version" links to the snapshot.
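As a rough illustration of the whitelist idea, the following Python sketch accepts only URLs that look like Wayback Machine snapshots and renders the display line described above. The regular expression assumes the common snapshot shape `https://web.archive.org/web/<14-digit timestamp>/<original URL>`. The function names `parse_archive_url` and `render_ended_url` are hypothetical and not part of the MusicBrainz server.

```python
import re
from typing import Optional

# Assumed snapshot shape (not taken from the ticket):
#   https://web.archive.org/web/<14-digit timestamp>/<original URL>
WAYBACK_SNAPSHOT = re.compile(
    r"^https?://web\.archive\.org/web/(?P<timestamp>\d{14})/(?P<original>\S+)$"
)


def parse_archive_url(url: str) -> Optional[dict]:
    """Return the timestamp and original URL of a whitelisted archive URL,
    or None if the URL does not match the whitelist pattern."""
    match = WAYBACK_SNAPSHOT.match(url)
    if match is None:
        return None
    return {
        "timestamp": match.group("timestamp"),
        "original": match.group("original"),
    }


def render_ended_url(label: str, original: str, archive_url: Optional[str]) -> str:
    """Build the plain-text display line for an ended URL.
    The archive link is only shown if it passes the whitelist check."""
    line = f"{label}: {original} (ended)"
    if archive_url is not None and parse_archive_url(archive_url) is not None:
        line += f" (archived version: {archive_url})"
    return line
```

A stricter variant could additionally require that the `original` part of the archive URL matches the ended URL, so that an archive link cannot point at an unrelated page.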
A further refinement would be to store the capture timestamp alongside the archive URL (or extract it from the URL), so that the server can automatically display the archived URL whose timestamp is closest to, but earlier than, the "ended" date set on the original URL, if applicable. This is secondary, though: even the plain URL-URL relationship would already help a lot.
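The selection rule could look roughly like the Python sketch below. It assumes that the timestamp is the 14-digit path segment after `/web/` in a Wayback Machine URL and that the "ended" date is a complete calendar date. Falling back to the latest capture when no end date is set is also an assumption, since the proposal leaves that case open. The names `capture_time` and `best_snapshot` are hypothetical.

```python
from datetime import date, datetime
from typing import Iterable, Optional


def capture_time(archive_url: str) -> datetime:
    """Extract the capture time from a Wayback Machine snapshot URL.
    Assumes the form .../web/<YYYYMMDDhhmmss>/<original URL>."""
    stamp = archive_url.split("/web/", 1)[1].split("/", 1)[0]
    return datetime.strptime(stamp[:14], "%Y%m%d%H%M%S")


def best_snapshot(archive_urls: Iterable[str], ended: Optional[date]) -> Optional[str]:
    """Pick the snapshot captured closest to, but strictly before, the ended date.
    With no ended date, fall back to the latest capture (an assumption)."""
    candidates = list(archive_urls)
    if ended is not None:
        cutoff = datetime(ended.year, ended.month, ended.day)
        candidates = [u for u in candidates if capture_time(u) < cutoff]
    # max() with default=None returns None when no capture qualifies.
    return max(candidates, key=capture_time, default=None)
```

If partial dates (year only, or year and month) have to be supported, the cutoff would need to be derived differently, for example from the start of the given year or month.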
This would allow users who have gone through the trouble of looking through the captures in the Internet Archive to share the result with other users. It would also open the door to automated archiving of submitted URLs: a bot could crawl entities, submit archive requests to the Internet Archive, and store the resulting URLs, which would remain hidden until an editor marks the original URL as ended.
Has related issue: MBS-10942 Show an Internet Archive link for ENDED URL relationships (Open)