  1. MusicBrainz Server
  2. MBS-2523

Some reports that might be useful for cleanup...

    • Type: New Feature
    • Resolution: Invalid
    • Priority: Normal
    • Component: Reports

      NGS post-migration clean-up:

      • Recordings that share 5+ PUIDs in common.
        This would help identify the best candidates for merges. As the list shrinks, we can lower the threshold.
      • Identically named Works by one artist.
        Relatively few artists use the same title for multiple works (aside from "Untitled", "Mvt. I" and crap like that). It might also be nice to have a search that finds Works by one artist whose names differ only in a parenthesized subtitle containing words like "remix" or "version" (basically the same stuff we check for with Guess Case). Whether that is useful depends on whether we want to merge remixes with their source works.
      • DiscIDs in release groups with no matching CDs
        More precisely: a list of releases with DiscIDs that are in release groups where there are no CD format releases that match the number of tracks in the DiscIDs. This would give us a mostly-complete list of releases where DiscIDs are at risk of being orphaned if they're removed from current issues, and still be more precise than just giving a complete list of DiscIDs attached to non-CD format releases.
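The DiscID report described in the last item can be sketched roughly as below. This is only an illustration of the filtering logic, not the server's report code: the record layout (`id`, `release_group`, `format`, `track_count`, `discid_track_counts`) is hypothetical, and it assumes one disc per release for simplicity.

```python
from collections import defaultdict


def discids_at_risk(releases):
    """Return ids of releases holding a DiscID that no CD release in the
    same release group could take over (no CD with that track count).

    Each release is a dict with hypothetical keys:
    id, release_group, format, track_count, discid_track_counts.
    """
    # Collect the track counts of all CD-format releases per release group.
    cd_track_counts = defaultdict(set)
    for rel in releases:
        if rel["format"] == "CD":
            cd_track_counts[rel["release_group"]].add(rel["track_count"])

    at_risk = []
    for rel in releases:
        available = cd_track_counts.get(rel["release_group"], set())
        # A DiscID is at risk when no CD in the group has its track count.
        if any(n not in available for n in rel["discid_track_counts"]):
            at_risk.append(rel["id"])
    return at_risk


sample = [
    {"id": "vinyl-1", "release_group": "rg1", "format": "Vinyl",
     "track_count": 10, "discid_track_counts": [10]},
    {"id": "cd-1", "release_group": "rg1", "format": "CD",
     "track_count": 12, "discid_track_counts": [12]},
    {"id": "digital-2", "release_group": "rg2", "format": "Digital Media",
     "track_count": 8, "discid_track_counts": [8]},
    {"id": "cd-2", "release_group": "rg2", "format": "CD",
     "track_count": 8, "discid_track_counts": []},
]
print(discids_at_risk(sample))  # ['vinyl-1']
```

In the sample, the DiscID on `digital-2` is not listed because `cd-2` in the same release group has a matching track count, so the DiscID has somewhere to go.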

      Other general clean-up:

      • PUIDs assigned to recordings of different length.
      • PUIDs assigned to songs with different names by different artists.
      • PUIDs assigned to different tracks on the same album.
        This assumes we even want to attack this mess. We could start with candidates that require less research by searching for PUIDs linked to 3+ tracks with matching name and duration plus one track that doesn't match.
      • Releases with conflicting DiscIDs attached.
        A release would be listed if the TOCs of two DiscIDs attached to it differ in total disc time by more than 30s, or if corresponding tracks differ by more than 15s.
      • Releases with DiscIDs that are substantially different from the times on the tracklist.
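The conflicting-DiscID check (30s total, 15s per track) could look something like the sketch below. It assumes each TOC is available as a list of track start offsets plus a lead-out offset, both in CD sectors (75 per second). The function names and the input mapping are made up for illustration. Tracks are compared pairwise over the common track range only, since the ticket does not say how differing track counts should be handled.

```python
from itertools import combinations

SECTORS_PER_SECOND = 75  # CD audio sectors (frames) per second


def track_lengths(offsets, leadout):
    """Track lengths in seconds from sector offsets and the lead-out offset."""
    bounds = list(offsets) + [leadout]
    return [(end - start) / SECTORS_PER_SECOND
            for start, end in zip(bounds, bounds[1:])]


def tocs_conflict(toc_a, toc_b, max_total_diff=30.0, max_track_diff=15.0):
    """Return True if two TOCs, each given as (offsets, leadout), differ by
    more than the thresholds suggested in the ticket."""
    len_a = track_lengths(*toc_a)
    len_b = track_lengths(*toc_b)
    if abs(sum(len_a) - sum(len_b)) > max_total_diff:
        return True
    # zip() stops at the shorter TOC: only common tracks are compared.
    return any(abs(x - y) > max_track_diff for x, y in zip(len_a, len_b))


def releases_with_conflicting_discids(release_tocs):
    """release_tocs maps a release id to the list of TOCs attached to it
    (hypothetical input shape)."""
    return [release_id
            for release_id, tocs in release_tocs.items()
            if any(tocs_conflict(a, b) for a, b in combinations(tocs, 2))]


toc_1 = ([150, 15150, 30150], 45150)  # tracks of 200s, 200s, 200s
toc_2 = ([150, 15225, 30150], 45300)  # 201s, 199s, 202s: close enough
toc_3 = ([150, 15150, 31650], 45150)  # 200s, 220s, 180s: track 2 is off by 20s

print(releases_with_conflicting_discids({
    "release-ok": [toc_1, toc_2],
    "release-bad": [toc_1, toc_3],
}))  # ['release-bad']
```

Keeping the thresholds as parameters makes it easy to tighten them once the worst offenders have been cleaned up.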

            Assignee: Unassigned
            Reporter: Ryan Torchia (torc)