• Bug
    • Resolution: Unresolved
    • Normal
    • None
    • None
    • Search, Web service
    • None

      The MB Search API documentation says you can use offset and limit to paginate the results of a query, but this does not work:

      /ws/2/<entity-type>/?query=<query>&limit=100&offset=<offset>

      Try a query that returns more than 100 results, with offset=0 or offset=101.
      The results come back in a different, seemingly random order on each request.
      If you page through all possible offsets to collect every result, you always end up with duplicates of some items and miss others, at random.
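
The effect described above can be reproduced offline with a minimal sketch. `page_url` only mirrors the URL shape quoted in this report. `collect_pages` and `make_unstable_fetcher` are hypothetical helpers, and the fetcher is a stand-in that reshuffles the full result set on every call; it does not contact the real web service and is only an assumption about how an undefined sort order might behave.

```python
import random
from typing import Callable, List, Set, Tuple
from urllib.parse import urlencode


def page_url(entity_type: str, query: str, limit: int, offset: int) -> str:
    """Build a search URL in the form quoted above (path and query string only)."""
    params = urlencode({"query": query, "limit": limit, "offset": offset})
    return f"/ws/2/{entity_type}/?{params}"


def collect_pages(
    fetch_page: Callable[[int, int], List[str]], total: int, limit: int = 100
) -> Tuple[Set[str], List[str]]:
    """Walk every offset and return (distinct IDs seen, IDs returned more than once)."""
    seen: Set[str] = set()
    duplicates: List[str] = []
    for offset in range(0, total, limit):
        for item_id in fetch_page(offset, limit):
            if item_id in seen:
                duplicates.append(item_id)
            seen.add(item_id)
    return seen, duplicates


def make_unstable_fetcher(ids: List[str], seed: int) -> Callable[[int, int], List[str]]:
    """Hypothetical stand-in for the server: reorders the full result set on every call."""
    rng = random.Random(seed)

    def fetch_page(offset: int, limit: int) -> List[str]:
        shuffled = ids[:]
        rng.shuffle(shuffled)
        return shuffled[offset:offset + limit]

    return fetch_page


all_ids = [f"item-{n}" for n in range(250)]
seen, duplicates = collect_pages(make_unstable_fetcher(all_ids, seed=1), total=len(all_ids))
missing = set(all_ids) - seen
print(len(duplicates), "duplicates,", len(missing), "missing")
```

Because every request still returns a full page, each duplicate takes the place of an item that is never returned, so the number of duplicates always equals the number of missing items. With a fetcher that keeps a fixed order, both counts are zero.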

      So the offset system is broken, probably since the move to Solr?

      Could a sort order be set for all queries? Anything deterministic would do, for instance ordering by row ID.


      Old description, before noticing it was a global issue:

      I am trying to use /ws/2/work?query=…&offset=x&limit=y in JSON.
      The results are never ordered the same way, even between two calls with the same query and the same limit that differ only in the offset used to fetch pages.

      Therefore it's impossible to use this web service when there can be more than 100 results.

          [MBS-12154] MB Search API is broken, no pagination possible

          GitHub Bot added a comment -

          See code changes in pull request #62 submitted by alastair.

          jesus2099 added a comment - - edited

          Today I ran into the same problem in another userscript feature:

          /ws/2/release-group?query=arid%3A5441c29d-3602-4898-b1a1-b77fa23b8e50+AND+releases%3A1

          /ws/2/release-group?query=arid%3A5441c29d-3602-4898-b1a1-b77fa23b8e50+AND+releases%3A1&fmt=json

          Same as the web site advanced search: https://musicbrainz.org/search?query=arid%3A5441c29d-3602-4898-b1a1-b77fa23b8e50+AND+releases%3A1&type=release_group&method=advanced

          When you reload this URL, the first result keeps switching between Day in Day Out: Radio Broadcast, Seven Years in Tibet, Bridge Benefit 96 and so on (all of them have the same score, 100, because this query is a plain yes-or-no match).

          I wish the results were sorted by score, and then by row ID (row IDs are all unique integers).

          Sorting by row ID (as a last resort if you want, after some "smarter" sorts) has no equivalent: it is a deterministic sort that will always give the same order when paginating.

          Anyway, things cannot stay like this, with web services and MB search pages that don't allow the use of pagination.

          Alastair Porter added a comment -

          I believe that search items are in fact sorted by the Solr "score" parameter. It looks like the issue arises when more items share the same score than fit in one page. For example, 120 recordings in a release group with the query rgid:abc and a limit of 100 means that Solr has no defined order for these 120 items.

          I think your suggestion of a secondary sort for the case when the score is equal is a good idea. This could be the sort name if the entity has one, or maybe even the UUID.
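
The tie-break idea can be illustrated with a small simulation: a rough sketch, not the actual search server code. `search_page` is a hypothetical function; the shuffle stands in for whatever arbitrary internal order the engine uses among equal scores (an assumption), and the `id` field plays the role of any unique value such as a row ID or UUID.

```python
import random
from typing import Dict, List

Row = Dict[str, int]


def search_page(rows: List[Row], limit: int, offset: int,
                rng: random.Random, tie_break: bool) -> List[Row]:
    """Return one page of results, sorted by score and optionally by unique ID."""
    candidates = rows[:]
    rng.shuffle(candidates)  # stands in for the engine's arbitrary internal order
    if tie_break:
        # Score descending, then unique ID ascending: a total order.
        key = lambda row: (-row["score"], row["id"])
    else:
        # Score only: rows with equal scores keep their arbitrary order.
        key = lambda row: -row["score"]
    return sorted(candidates, key=key)[offset:offset + limit]


# 120 rows that all score 100, as in the rgid example with a limit of 100.
rows = [{"id": n, "score": 100} for n in range(120)]
rng = random.Random(0)
paged_ids = [row["id"]
             for offset in (0, 100)
             for row in search_page(rows, 100, offset, rng, tie_break=True)]
```

With the tie-break enabled, the two pages together contain each of the 120 IDs exactly once, whatever the internal order was on each request. On the Solr side, the equivalent would presumably be a `sort` parameter listing the score first and a unique field second; which field the schema exposes for that purpose is not stated in this ticket.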

          jesus2099 added a comment -

          I have edited the title and description to emphasise that it's a global Search API bug.
          As this ticket went largely unnoticed, I also wrote about it on IRC: https://chatlogs.metabrainz.org/libera/metabrainz/msg/5039599/

          jesus2099 added a comment - - edited

          Oh no! Same problem with recordings (web and WS): the order of results is random, so you get duplicates and misses when you collect results page by page.

          jesus2099 added a comment - - edited

          But it seems there are no problems with the work web site search?

          When clicking the first page and the last page of this search (the same search on the web site as the bugged web service search in the OP), I always see the same first and last results, consistently.

          Maybe it's by chance?
          I think it was by chance, probably.

          jesus2099 added a comment - - edited

          The same thing was noticed with release group searches:

          • web service search: SEARCH-667
          • web site search

          jesus2099 added a comment -

          For the moment, I will try to make smaller queries to stay under the pagination threshold. 🙂

            alastairp Alastair Porter
            jesus2099 jesus2099