Uploaded image for project: 'Picard'
  1. Picard
  2. PICARD-507

After commit ab32813972382822797657b77668cae269811568 tracks are added to wrong release

      I've ripped two CD's, Paste Magazine Sampler #48 and #41.

      With Picard 1.1 I can load them, cluster them, look-them-up, and
      everything works.

      With the git master I load and cluster them. Looking up #48 works as
      expected, but when I look up #41 its tracks get crammed (poorly) into
      Album #48. Similar misbehavior happens if I load #41 first and #48
      second.

      I used git bisect to walk the commits between 1.1 and the tip of the
      git tree and discovered that commit
      ab32813972382822797657b77668cae269811568 caused the problem.

      The bit that seems to be causing the change in behavior is in
      metadata.py:compare_to_release, at the very end of the method, where
      the score for the release is given a boost if it is in a release group
      that has already been loaded. cluster.py:_compare_to_release calls
      metadata.py:compare_to_release when it's trying to figure out where to
      put its tracks.

      This change is sufficient to cause #48
      release: u'90b66a30-878b-4c32-a943-7f87842d2a2c'
      release group: u'826bc83c-069b-3e09-aaea-a44ebb2dbcda'
      to appear to be a better match for #41 than its real release
      release: u'30979257-92ba-455f-8423-7b7509090165'
      release group: u'14338bba-34af-3a87-9932-c2c55f96b3ee'

      I'll attach three debugging traces ( from./tagger.py -d): good-trace,
      bad.trace, and bad-trace-patched-to-work.trace which are for the good
      commit, bad commit and (yep...) the bad commit with metadata.py lines
      138-140 commented out.

      Please ignore the register_save_file_processor error, that's my
      transcoding plugin that's only supported in my code.

      All three traces have this snippet in cluster.py:_lookup_finished,
      around line 186 (right after the commented out debug statement)

      for match in matches:
      self.log.debug("Track id=%r, score=%r, ", match[1].id, match[0])

      so that the tracks and their scores appear in the trace files.

      I don't have enough background to understand why the fact that one of
      the candidate releases being considered for a cluster has already been
      loaded should matter. It certainly messes up this case.

      So far, running with metadata.py lines 138-140 commented out is
      working for me, but I don't really understand the ramifications.

      Anyone have any insights?

        1. bad.trace
          38 kB
        2. bad-patched-to-work.trace
          42 kB
        3. good.trace
          42 kB

          [PICARD-507] After commit ab32813972382822797657b77668cae269811568 tracks are added to wrong release

          I'm cleaning up old TODO list items and stumbled across this.

          Has it been dealt with?

          George Hartzell added a comment - I'm cleaning up old TODO list items and stumbled across this. Has it been dealt with?

          I'm leaning towards a modified version of the first solution: check if the release group of the top match contains a loaded release, AND check if the recording we've matched to appears on the loaded release.

          Michael Wiencek added a comment - I'm leaning towards a modified version of the first solution: check if the release group of the top match contains a loaded release, AND check if the recording we've matched to appears on the loaded release.

          Thanks for the detailed bug report.

          Giving already-loaded releases more weight was intended to solve one of the most common complaints about the "Lookup" feature, which was that tracks from the same album would often get scattered among 2-4 different versions from the same release group. Commit ab32813972382822797657b77668cae269811568 solved that very effectively (I can easily reproduce the old, bad behavior by commenting out those lines), but you're right that it goes haywire in this case.

          The easiest way to solve this is to remove the lines you suggested, so that the "similarity" weighting isn't skewed. But /afterwards/ check if the top match's release group contains an already-loaded release, and use that instead. This has its own problem: one example would be where the recording is a bonus track that doesn't appear on the already-loaded release.

          Another solution is to let things match and become scattered as they previously would, but try to automatically merge albums from the same RG /after/ they're loaded, if the combined tracks/recordings all appear on a certain version and don't otherwise conflict. This could prove difficult to implement cleanly, and I guess the user could become confused that loaded albums are disappearing (assuming they even notice).

          We can't make things work perfectly in all cases, but a balance needs to be struck here with accuracy vs. usefulness.

          Michael Wiencek added a comment - Thanks for the detailed bug report. Giving already-loaded releases more weight was intended to solve one of the most common complaints about the "Lookup" feature, which was that tracks from the same album would often get scattered among 2-4 different versions from the same release group. Commit ab32813972382822797657b77668cae269811568 solved that very effectively (I can easily reproduce the old, bad behavior by commenting out those lines), but you're right that it goes haywire in this case. The easiest way to solve this is to remove the lines you suggested, so that the "similarity" weighting isn't skewed. But /afterwards/ check if the top match's release group contains an already-loaded release, and use that instead. This has its own problem: one example would be where the recording is a bonus track that doesn't appear on the already-loaded release. Another solution is to let things match and become scattered as they previously would, but try to automatically merge albums from the same RG /after/ they're loaded, if the combined tracks/recordings all appear on a certain version and don't otherwise conflict. This could prove difficult to implement cleanly, and I guess the user could become confused that loaded albums are disappearing (assuming they even notice). We can't make things work perfectly in all cases, but a balance needs to be struck here with accuracy vs. usefulness.

            Unassigned Unassigned
            hartzell George Hartzell
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

              Created:
              Updated:

                Version Package