MusicBrainz Search Server / SEARCH-336

Searching "Universal Music" returns the correct "Universal Music" only as the 34th entry

    • Type: Bug
    • Resolution: Fixed
    • Priority: Normal
    • Version: 2018-05-06 Last Lucene

      http://musicbrainz.org/search?query=%22Universal+Music%22&type=label&limit=100&method=indexed

      "Universal Music" gets a score of only 73 and is the 34th match.

      How can we expect a user to find the correct "Universal Music"?

          phonebox added a comment -

          The issue seems to have been properly fixed by the new Solr search.

          Robert Kaye added a comment -

          Still a valid bug – reopening.

          kaik added a comment -

          For example, "Universal Jazz" has a score of 98, and it has the aliases "Universal Jazz Germany", "Universal Music Classics & Jazz" and "Universal Music Classics and Jazz Germany".

          And if I read this correctly, aliases are currently boosted twice as much (2.0) as names.
          http://bugs.musicbrainz.org/changeset/13662
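
The effect described above can be illustrated with a toy model. This is only a sketch, not Lucene's actual scoring formula: `field_score` and `label_score` are hypothetical helpers, each field contributes the fraction of query terms it contains multiplied by its boost, and the exact-match label is assumed to have no aliases.

```python
def field_score(query_terms, text):
    """Fraction of the query terms that occur as whole words in `text`."""
    tokens = text.lower().split()
    return sum(1 for term in query_terms if term in tokens) / len(query_terms)


def label_score(query, name, aliases, name_boost=1.0, alias_boost=2.0):
    """Toy score: boosted name match plus boosted best alias match."""
    terms = query.lower().split()
    name_part = name_boost * field_score(terms, name)
    alias_part = alias_boost * max(
        (field_score(terms, alias) for alias in aliases), default=0.0
    )
    return name_part + alias_part


# Assumption for illustration: the exact-match label has no aliases.
exact = label_score("Universal Music", "Universal Music", [])
jazz = label_score(
    "Universal Music",
    "Universal Jazz",
    [
        "Universal Jazz Germany",
        "Universal Music Classics & Jazz",
        "Universal Music Classics and Jazz Germany",
    ],
)
print(exact, jazz)  # 1.0 2.5
```

In this model the alias that contains both query words, boosted by 2.0, outweighs the exact name match, which has the same shape as the ranking reported in this issue.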

          Paul Taylor added a comment -

          If you were searching only the name field then yes, that is true, but you are not, so the concept of a perfect match makes no sense. That is, it matches on the name field, but if another label matches on both name and alias, which is the better match?

          yindesu added a comment -

          > "Universal Music" is identical to "Universal Music" though, so it should be a perfect match because of that, not because we added it to a list of exceptions.

          This is a much better way to describe the problem I tried to describe in SEARCH-308. ("Big Bang" is identical to "Big Bang" so it should be a perfect match, not a low match.)

          nikki added a comment -

          "Universal Music" is identical to "Universal Music" though, so it should be a perfect match because of that, not because we added it to a list of exceptions.

          https://musicbrainz.org/search?query=Warner+Music&type=label&limit=25&method=indexed is another example with the same problem - the exact match is 20th.
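
One possible reading of this suggestion is a re-ranking step that puts entries whose name is identical to the query first, whatever their score. The sketch below is an assumption for illustration, not what the search server does; `normalize` and `rerank_exact_first` are hypothetical names, and the two scores are the ones quoted in this issue.

```python
def normalize(text):
    """Case-fold and collapse whitespace so trivially different spellings compare equal."""
    return " ".join(text.casefold().split())


def rerank_exact_first(query, results):
    """Return (name, score) pairs with exact name matches first, then by score descending."""
    wanted = normalize(query)
    return sorted(results, key=lambda r: (normalize(r[0]) != wanted, -r[1]))


results = [("Universal Jazz", 98), ("Universal Music", 73)]
reranked = rerank_exact_first("universal music", results)
print(reranked[0])  # ('Universal Music', 73)
```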

          kaik added a comment -

          SEARCH-282: "Previously we document boosted a hard-coded list of artists and labels to resolve this. Although this approach was not perfect because the list was incomplete it did solve the most common cases, this approach is not working since moving Lucene v3 to Lucene v4 because its creating ridicously large field norms for some reason so currently disabled"

          kaik added a comment -

          I guess the correct fix would be the same as the one you suggested on SEARCH-366: "Could fix by partially adjusting the weightings so that name is more weighted then description and alias"
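
As a rough sketch of what weighting the name above the alias could look like at query time, the helper below builds a query string in Lucene's classic `field:"phrase"^boost` syntax. The field names `name` and `alias` and the boost values 3.0 and 1.0 are assumptions chosen for illustration, not the server's actual schema or settings.

```python
def quote_phrase(text):
    """Wrap text in double quotes, escaping backslashes and embedded quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def boosted_query(text, field_boosts):
    """Build one boosted phrase clause per field, joined with OR."""
    phrase = quote_phrase(text)
    return " OR ".join(
        f"{field}:{phrase}^{boost}" for field, boost in field_boosts.items()
    )


# Hypothetical field names and boost values.
query = boosted_query("Universal Music", {"name": 3.0, "alias": 1.0})
print(query)  # name:"Universal Music"^3.0 OR alias:"Universal Music"^1.0
```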

          Paul Taylor added a comment -

          Hmm, not sure why the boost isn't having an effect.

          The general idea behind hardcoding the boost is to cover the cases where what the user types may match something else, but we know they usually mean one of the lower-matching results. It's more obvious with classical composers, where the usual behaviour we want is for a match on the artist name to do better than a match on only an alias, but some artists are better known by their alias, so they need a boost. I don't see how we can avoid hardcoding it unless there is something in the database that can help us identify these cases.
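
A hardcoded boost of this kind can be pictured as a lookup table applied on top of the base score. This is a minimal sketch under stated assumptions: the identifiers, the multiplier and the `apply_document_boost` helper are hypothetical placeholders, and the real server applied its boosts inside the Lucene index rather than after scoring.

```python
# Hypothetical table of entities known to be what users usually mean.
HARDCODED_BOOSTS = {
    "label-universal-music": 2.0,  # placeholder id, not a real MBID
}


def apply_document_boost(entity_id, base_score, boosts=HARDCODED_BOOSTS):
    """Multiply the base score by the entity's hardcoded boost (1.0 if unlisted)."""
    return base_score * boosts.get(entity_id, 1.0)


boosted = apply_document_boost("label-universal-music", 73)
unlisted = apply_document_boost("label-universal-jazz", 98)
print(boosted, unlisted)  # 146.0 98.0
```

The drawback raised in this thread still applies: such a list is only as good as its coverage, so any entity missing from it keeps its original ranking.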

          kaik added a comment -

          All or almost all of the higher-scored entries have an alias containing the words "Universal" and "Music".

            Assignee: Unassigned
            Reporter: Anonymous
            Votes: 2
            Watchers: 1