MusicBrainz Search Server

[SEARCH-293] Search scores absurdly low when there is a difference of only one letter

    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Normal

      Artists with similar names should score similarly in search.

      For example:
      Masaki Suzuki
      Masaaki Suzuki

      Search for either of them, and you'll see that while the correct result is at the top with a score of 100, the other result is down much further with a score of 31.

      Obviously having a difference of one letter should allow them to score similarly.

      And importantly, this causes the second artist to fall down to page 2 (or beyond) of the quicksearch.
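To make the "difference of one letter" concrete, here is a minimal sketch that computes the Levenshtein (edit) distance between the names mentioned in this ticket. The function names `levenshtein` and `similarity` are hypothetical, and the normalised similarity is only an illustration of how close the names are; it is not how the search server or Lucene computes its score.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character edits turning a into b."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Illustrative similarity in [0, 1]; an assumption, not the server's score."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a.lower(), b.lower()) / longest


if __name__ == "__main__":
    print(levenshtein("Masaki Suzuki", "Masaaki Suzuki"))  # 1
    print(levenshtein("John Moffat", "John Moffatt"))      # 1
    print(similarity("Masaki Suzuki", "Masaaki Suzuki"))
```

Under this measure both name pairs are a single edit apart, which is the intuition behind expecting them to score similarly.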


          atj added a comment -

          Another example:


          David Kellner added a comment -

          Even more absurd is the following example where only a missing last letter makes a huge difference:
          Searching for John Moffat returned* a lot of other artists with the last name Moffat but not John Moffatt (with two Ts). You had to type the full name up to the last T to get the desired result - before I added a search hint alias for John Moffatt.

          *) You can still see the previous results in an old forum post of mine (I know it has been a while since I wrote that but it seems I had forgotten to comment on this ticket). A very similar behaviour can still be observed in the above results for John Moffatt which do not include John Moffat.


          Paul Taylor added a comment -

          My mistake, I took nikki's URL without looking at it properly. I think there is some new stuff in Lucene 4.0 we can try for fuzzier searches without too much load on the server.

          Alex Mauer added a comment - edited

          reosarevok is correct. The search URLs are https://beta.musicbrainz.org/search?query=masaki+suzuki&type=artist&method=indexed and https://beta.musicbrainz.org/search?query=masaaki+suzuki&type=artist&method=indexed

          By quicksearch I meant anything used in the release editor or the relationship editor or 'relate to'. The point is not whether their search results are better or worse, but that limiting the results per page to 10 means I am unlikely to see that both these artists exist, as opposed to being simply a different Romanization of the same thing.

          Nicolás Tamargo added a comment -

          What they're saying is that this should be the result of searching for "masaki suzuki" (no quotes), without needing the ~ and the AND.
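As a rough illustration of the query form discussed in this thread, the sketch below rewrites a plain query such as "masaki suzuki" into the fuzzy form `masaki~ AND suzuki~` and builds a search URL with the same parameters as the one quoted in the thread. The helper names `fuzzy_query` and `search_url` are hypothetical, and the sketch assumes the terms contain no characters that need escaping in Lucene query syntax. It only shows what a client would otherwise have to type by hand; it is not a proposal for how the server should implement fuzzier matching.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def fuzzy_query(plain: str) -> str:
    """Append the fuzzy operator (~) to every term and require all terms (AND)."""
    terms = plain.split()
    return " AND ".join(f"{term}~" for term in terms)


def search_url(plain: str, base: str = "https://musicbrainz.org/search") -> str:
    """Build an indexed artist search URL for the fuzzy version of the query."""
    params = {
        "query": fuzzy_query(plain),
        "type": "artist",
        "limit": 25,
        "method": "indexed",
    }
    return f"{base}?{urlencode(params)}"


if __name__ == "__main__":
    url = search_url("masaki suzuki")
    print(url)
    # Decoding the URL gives back the fuzzy query string.
    print(parse_qs(urlsplit(url).query)["query"][0])
```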

          Paul Taylor added a comment -

          Example URLs from the original submitter would be nice!

          I tried this search, and it only brings back eight results, all very similar, so I can't see the problem:
          https://musicbrainz.org/search?query=masaki~+AND+suzuki~&type=artist&limit=25&method=indexed

          Is this what you mean by quicksearch, or is that something in the release editor? In that case the bug needs moving to MusicBrainz, because there are still bugs in the searches used in the release editor that need fixing by the core team.

          nikki added a comment -

          Ideally the four artists on https://beta.musicbrainz.org/search?query=masaki%7E+AND+suzuki%7E&type=artist&limit=25&method=advanced should be the top results.

            Assignee: Unassigned
            Reporter: Alex Mauer (hawke)
            Votes: 10
            Watchers: 11