  1. MusicBrainz Server
  2. MBS-10923

Pagination of collections not returning consistent results


      Pagination of releases in collections does not appear to be working correctly.

      I'm seeing multiple calls to https://musicbrainz.org/ws/2/collection with the same limit/offset returning different sets of releases.

      Sometimes the results are just permuted; other times they include one or more different entries (as if the overall list from which the 'slice' is taken is not stably sorted).

      I noticed because it makes it impossible to iterate through all releases in a collection, since you get duplicates or missing items depending on the results at each step.
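      To illustrate why an unstable ordering breaks iteration, here is a minimal simulation. It is a sketch under stated assumptions, not the server's actual behavior: `fetch_page` is a hypothetical stand-in for one web service call, the release names are placeholders, and the reordering between calls is contrived (on odd-numbered calls the first item moves to the end).

```python
from collections import Counter


def fetch_page(items, offset, limit, call_no):
    """Hypothetical stand-in for one paged request.

    On odd-numbered calls the first item is moved to the end, mimicking a
    backing list whose order is not stable between requests.
    """
    order = list(items)
    if call_no % 2 == 1:
        order = order[1:] + order[:1]
    return order[offset:offset + limit]


def iterate_all(items, limit):
    """Walk the whole collection page by page, as a client would."""
    seen = []
    call_no = 0
    for offset in range(0, len(items), limit):
        seen.extend(fetch_page(items, offset, limit, call_no))
        call_no += 1
    return seen


items = [f"release-{i}" for i in range(10)]  # placeholder IDs
seen = iterate_all(items, limit=5)

# Items returned more than once, and items never returned at all.
duplicates = sorted(k for k, n in Counter(seen).items() if n > 1)
missing = sorted(set(items) - set(seen))
print("duplicates:", duplicates)
print("missing:", missing)
```

      Even though the client requests every offset exactly once, one release comes back twice and another is never returned, which is the kind of symptom described above.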

      I first saw this via the musicbrainzngs Python 3 bindings, but I have reproduced it via curl, so I think the issue is in the service responses, not the bindings.

      This is a private collection with 2322 releases in it. Its ID is 72357d35-55c7-4525-9c7a-7d02e4377f2e, in case that's helpful to someone with the right privileges.

      I don't see this with the defaults (limit=25, offset=0), but using a non-default value for either seems to be sufficient to trigger it:

      $ curl -H 'Accept: application/json' --digest -u ijc:'secret' 'https://musicbrainz.org/ws/2/collection/72357d35-55c7-4525-9c7a-7d02e4377f2e/releases?offset=75' > l.1
      $ curl -H 'Accept: application/json' --digest -u ijc:'secret' 'https://musicbrainz.org/ws/2/collection/72357d35-55c7-4525-9c7a-7d02e4377f2e/releases?offset=75' > l.2
      $ curl -H 'Accept: application/json' --digest -u ijc:'secret' 'https://musicbrainz.org/ws/2/collection/72357d35-55c7-4525-9c7a-7d02e4377f2e/releases?offset=75' > l.3
      $ jq -r '.releases[].id' l.1 > j.1
      $ jq -r '.releases[].id' l.2 > j.2
      $ jq -r '.releases[].id' l.3 > j.3 
      $ md5sum j.*
      03fe9ee0dd9ac8d120cbab9b34a1548b  j.1
      03fe9ee0dd9ac8d120cbab9b34a1548b  j.2
      6c381135334fea6bcceaba981d7b80bc  j.3
      $ diff -u j.1 j.3
      --- j.1	2020-06-27 14:55:57.926332408 +0100
      +++ j.3	2020-06-27 14:55:59.110361288 +0100
      @@ -1,4 +1,3 @@
      -ac2f1ef4-32be-4608-8a56-432bdb88f41b
       0797f65a-c694-3b7b-88ad-590949b12be5
       336828a7-663c-4f72-940a-b54ec9a7472f
       ece07fc8-023a-3a8a-ab27-2bd463c01e89
      @@ -23,3 +22,4 @@
       c8ae9b5e-67c1-37c9-bd92-623881a78a7c
       14ea2d25-1221-4ef3-9887-f840a8a9067f
       df1832f8-30df-4ee1-a683-93e294881fb1
      +aecfa9be-64fd-46f2-8e94-f30539ee051c
      
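      The same comparison can be sketched in Python, which also distinguishes "same IDs, different order" from "different IDs". `load_ids` mirrors the `jq -r '.releases[].id'` step above; `classify` and `only_in_each` are hypothetical helper names, and the sample lists are placeholders rather than real responses.

```python
import json


def load_ids(path):
    """Read one saved JSON response and return its release IDs in order.

    Mirrors `jq -r '.releases[].id'` from the commands above.
    """
    with open(path) as fh:
        return [release["id"] for release in json.load(fh)["releases"]]


def classify(ids_a, ids_b):
    """Compare two pages fetched with the same limit/offset."""
    if ids_a == ids_b:
        return "identical"
    if sorted(ids_a) == sorted(ids_b):
        return "permuted"
    return "different entries"


def only_in_each(ids_a, ids_b):
    """Return (IDs only in the first page, IDs only in the second page)."""
    set_a, set_b = set(ids_a), set(ids_b)
    return sorted(set_a - set_b), sorted(set_b - set_a)


# Placeholder pages; with real data use load_ids("l.1") and load_ids("l.3").
page_1 = ["a", "b", "c", "d"]
page_3 = ["b", "c", "d", "e"]
print(classify(page_1, page_3))
print(only_in_each(page_1, page_3))
```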

      I think the above case is a symptom of the set of items ahead of offset 75 differing on the 3rd call, so the returned slice starts one item later. In another case (with offset=50) I saw:

      $ diff -u j.1 j.3
      --- j.1	2020-06-27 14:59:13.291096405 +0100
      +++ j.3	2020-06-27 14:59:14.199118541 +0100
      @@ -21,5 +21,5 @@
       c4c8b834-6a48-4b2f-ad82-5d48ad396951
       b9c52cf2-9f23-4986-8ce2-5e0015cdef27
       9643ead9-b88c-365a-a305-2eaf195c6e2a
      +7b38394d-d5bf-32f5-8aa8-12c16c9dc4ff
       ecc2e9db-637c-4f21-955c-6ab3e1123ffc
      -ac2f1ef4-32be-4608-8a56-432bdb88f41b 

      i.e. an extra entry appears partway through the page and the last entry is pushed off the end.
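      Both diff shapes can be reproduced with a fixed offset and limit over a list whose order changes between calls. The sketch below uses placeholder integer IDs and two contrived reorderings; it only shows that such reorderings are enough to produce these diffs, not that this is what the server does.

```python
def move(order, item, new_index):
    """Return a copy of `order` with `item` relocated to `new_index`."""
    rest = [x for x in order if x != item]
    rest.insert(new_index, item)
    return rest


def page(order, offset, limit):
    """Take the slice a paged request with this offset/limit would return."""
    return order[offset:offset + limit]


baseline = list(range(10))  # placeholder IDs 0..9

# Case 1: an item ahead of the window moves behind it, so the page drops
# its first entry and gains a new one at the end (like the offset=75 diff).
shifted = move(baseline, 1, 9)
print(page(baseline, 3, 3), page(shifted, 3, 3))

# Case 2: an item from beyond the window moves into it, so an extra entry
# appears mid-page and the last entry falls off (like the offset=50 diff).
squeezed = move(baseline, 9, 5)
print(page(baseline, 3, 4), page(squeezed, 3, 4))
```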

       

       

            bitmap Michael Wiencek
            ijc Ian Campbell

                Version Package
                2020-08-10