- Improvement
- Resolution: Fixed
- Normal
Several search queries seem to return results that are not nearly as good as they could be. It seems to me that some of the techniques described below for improving search results should be applied by default.
In particular:
Searching for "rudy wiedoeft" does not return "rudy wiedoeft's Californians" or "rudy wiedoeft's palace trio" on the first page; they appear on page five instead. I would expect names that differ so little from the query to rank much higher in the search results.
Similarly, a search for "rudy wied" should return all of the above much closer to the top than page five.
Likewise, "rudy green" does not return "rudy greene" anywhere near the top of the results, despite the one-character difference.
These results can be improved by the following techniques:
(advanced search):
(rudy green) OR (rudy* green) OR (rudy green*)
(rudy wied) OR (rudy* wied) OR (rudy wied*)
"rudy* green*" gives great results compared to "rudy green"
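The manual expansion above follows a mechanical pattern, so it could be generated from the plain query. The following is a minimal sketch of that pattern only; `wildcard_variants` is a hypothetical helper name, not part of the search server, and it assumes terms are separated by whitespace.

```python
def wildcard_variants(query: str) -> str:
    """Build '(a b) OR (a* b) OR (a b*)' from 'a b' (hypothetical helper)."""
    terms = query.split()
    clauses = [" ".join(terms)]  # the unmodified query comes first
    for i in range(len(terms)):
        # Append a trailing wildcard to exactly one term per clause.
        variant = terms[:i] + [terms[i] + "*"] + terms[i + 1:]
        clauses.append(" ".join(variant))
    return " OR ".join(f"({c})" for c in clauses)


print(wildcard_variants("rudy green"))
# (rudy green) OR (rudy* green) OR (rudy green*)
```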
A simple spelling mistake can turn a great search result into a terrible one:
Search for "go-cart mozart", expecting to find "go-kart mozart".
The results contain nothing useful.
Change the query to
(go cart mozart) OR (go* cart mozart) OR (go cart* mozart) OR (go cart mozart*)
and the results are great, suggesting that hyphens need to be treated as word breaks.
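To illustrate what treating hyphens as word breaks would mean on the query side, here is a small sketch. `split_terms` is a hypothetical name, and how the index itself tokenizes hyphenated names is not shown here; this only demonstrates that the misspelled and correct names share two of three terms once the hyphen is split.

```python
import re


def split_terms(query: str) -> list[str]:
    """Split on runs of whitespace and hyphens (hypothetical helper)."""
    return [t for t in re.split(r"[\s\-]+", query.strip()) if t]


typed = split_terms("go-cart mozart")    # what the user typed
wanted = split_terms("go-kart mozart")   # the artist they meant
print(typed, wanted)
# Terms common to both, in the order they were typed.
print([t for t in typed if t in wanted])
```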
Even simply switching to advanced search on a total misspelling can improve things:
(simple search): "aaron lebedeef" does not return the desired artist[1] anywhere near the top of the results (not even on the first page).
An advanced search for the same returns it as the ninth result.
Using the previously-described technique improves it even further: (aaron lebedeef) OR (aaron* lebedeef) OR (aaron lebedeef*) returns it as the seventh result.
Applying a fuzzy search to each of the queries above improves many of the results even further:
"rudy~ wiedoeft~" is great.
"rudy~ wied~" is not so great, but no worse.
"rudy~ green~" is great.
"go-cart~ mozart~" is not so great, but no worse.
"aaron~ lebedeef~" is great.
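Fuzzy matching is based on edit distance, which may explain the pattern above: a misspelling is usually one edit away from the intended word, while a prefix such as "wied" is several edits away from "wiedoeft". The sketch below computes plain Levenshtein distance and appends `~` to every term; both function names are hypothetical, and the search server's actual similarity threshold is not modelled.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def fuzzy_query(query: str) -> str:
    """Append '~' to every whitespace-separated term (hypothetical helper)."""
    return " ".join(term + "~" for term in query.split())


print(fuzzy_query("rudy green"))           # rudy~ green~
print(edit_distance("green", "greene"))    # 1
print(edit_distance("wied", "wiedoeft"))   # 4
```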
Combining all techniques works out the best, though:
(rudy~ wiedoeft~) OR (rudy wiedoeft*) OR (rudy* wiedoeft): great
(rudy~ wied~) OR (rudy wied*) OR (rudy* wied): great
(rudy~ green~) OR (rudy green*) OR (rudy* green): great.
(go~ cart~ mozart~) OR (go*~ cart mozart) OR (go cart*~ mozart) OR (go cart mozart*~): great
(aaron~ lebedeef~) OR (aaron lebedeef*) OR (aaron* lebedeef): great
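The combined queries for the two-word examples share one shape: a clause with every term made fuzzy, ORed with one clause per term carrying a trailing wildcard. A minimal sketch of generating that shape follows; `combined_query` is a hypothetical helper, and it does not reproduce the `*~` suffixes used in the "go cart mozart" example.

```python
def combined_query(query: str) -> str:
    """Fuzzy clause ORed with one wildcard clause per term (hypothetical)."""
    terms = query.split()
    # First clause: every term fuzzy, e.g. 'rudy~ green~'.
    clauses = [" ".join(t + "~" for t in terms)]
    # Remaining clauses: wildcard on one term at a time, last term first,
    # to mirror the clause order used in the examples above.
    for i in reversed(range(len(terms))):
        variant = terms[:i] + [terms[i] + "*"] + terms[i + 1:]
        clauses.append(" ".join(variant))
    return " OR ".join(f"({c})" for c in clauses)


print(combined_query("rudy green"))
# (rudy~ green~) OR (rudy green*) OR (rudy* green)
```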
So it seems to me that:
1. Advanced search should be on by default.
2. Fuzzy matching of search terms should be on by default.
3. Hyphens should break words.
4. Combinations of fuzzy and non-fuzzy matching should be performed by default (appending a wildcard and fuzziness to each of the words separately, and ORing those variants with the fuzzy search on all words).
See http://chatlogs.musicbrainz.org/musicbrainz-devel/2011/2011-03/2011-03-09.html#T20-15-07-468551 for the conversation in which the problems were found and discussed.
[1] http://test.musicbrainz.org/artist/911d1b3b-e93a-4896-9fda-42013b2c8a7e
has related issue:
- SEARCH-730 Searching with a Typo (Open)
- SEARCH-84 Advanced search should AND by default, unless it's contradictory (Closed)
is duplicated by:
- SEARCH-185 Searching without diacritics (Closed)
- SEARCH-196 Johnnie-Johnny-John, Jackie-Jacky-Jack... artist not found (Closed)
is related to:
- MBS-2684 default Advanced Search (Closed)