Bug
Resolution: Fixed
Normal
None
None
Pagination of releases in collections does not appear to be working correctly.
I'm seeing multiple calls to https://musicbrainz.org/ws/2/collection with the same limit/offset returning different sets of releases.
Sometimes the results are just permuted; other times they include one or more different entries (perhaps the overall list from which the 'slice' is taken is not stably sorted?).
I noticed because it makes it impossible to iterate through all releases in a collection, since you get duplicates or missing items depending on the results at each step.
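For anyone wanting to check their own collection, here is a minimal sketch of a paging loop that records duplicates instead of silently accepting them. `fetch_page` is a hypothetical callable (not part of any MusicBrainz client) that takes `(offset, limit)` and returns the decoded JSON body; the only thing assumed about the response is the `releases[].id` shape used with jq in this report. `expected_total` is optional and would be 2322 for the collection described here.

```python
from typing import Callable, List, Optional, Tuple


def collect_release_ids(
    fetch_page: Callable[[int, int], dict],  # hypothetical: (offset, limit) -> JSON body
    limit: int = 25,
    expected_total: Optional[int] = None,
) -> Tuple[List[str], List[str]]:
    """Walk a collection page by page; return (unique ids, duplicated ids)."""
    seen = set()
    ids: List[str] = []
    duplicates: List[str] = []
    offset = 0
    while True:
        releases = fetch_page(offset, limit)["releases"]
        if not releases:
            break  # an empty page is treated as the end of the collection
        for release in releases:
            rid = release["id"]
            if rid in seen:
                duplicates.append(rid)  # same release served on two pages
            else:
                seen.add(rid)
                ids.append(rid)
        offset += len(releases)
    # A short count means at least one release was never served on any page.
    if expected_total is not None and len(ids) != expected_total:
        raise RuntimeError(
            f"expected {expected_total} releases, got {len(ids)} unique ids"
        )
    return ids, duplicates
```

With consistent pagination the duplicates list stays empty and the count matches. With the behaviour described above, I would expect either a non-empty duplicates list or the `RuntimeError`, depending on which way the slice shifted.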
I first saw this via the musicbrainzngs Python 3 bindings, but I have reproduced it with curl, so I think the issue is in the service responses, not the bindings.
This is a private collection with 2322 releases in it. Its ID is 72357d35-55c7-4525-9c7a-7d02e4377f2e, in case that's helpful to someone with the right privileges...
I don't see this with limit=25, offset=0, but using non-defaults for either seems to be sufficient:
$ curl -H 'Accept: application/json' --digest -u ijc:'secret' 'https://musicbrainz.org/ws/2/collection/72357d35-55c7-4525-9c7a-7d02e4377f2e/releases?offset=75' > l.1
$ curl -H 'Accept: application/json' --digest -u ijc:'secret' 'https://musicbrainz.org/ws/2/collection/72357d35-55c7-4525-9c7a-7d02e4377f2e/releases?offset=75' > l.2
$ curl -H 'Accept: application/json' --digest -u ijc:'secret' 'https://musicbrainz.org/ws/2/collection/72357d35-55c7-4525-9c7a-7d02e4377f2e/releases?offset=75' > l.3
$ jq -r '.releases[].id' l.1 > j.1
$ jq -r '.releases[].id' l.2 > j.2
$ jq -r '.releases[].id' l.3 > j.3
$ md5sum j.*
03fe9ee0dd9ac8d120cbab9b34a1548b  j.1
03fe9ee0dd9ac8d120cbab9b34a1548b  j.2
6c381135334fea6bcceaba981d7b80bc  j.3
$ diff -u j.1 j.3
--- j.1 2020-06-27 14:55:57.926332408 +0100
+++ j.3 2020-06-27 14:55:59.110361288 +0100
@@ -1,4 +1,3 @@
-ac2f1ef4-32be-4608-8a56-432bdb88f41b
 0797f65a-c694-3b7b-88ad-590949b12be5
 336828a7-663c-4f72-940a-b54ec9a7472f
 ece07fc8-023a-3a8a-ab27-2bd463c01e89
@@ -23,3 +22,4 @@
 c8ae9b5e-67c1-37c9-bd92-623881a78a7c
 14ea2d25-1221-4ef3-9887-f840a8a9067f
 df1832f8-30df-4ee1-a683-93e294881fb1
+aecfa9be-64fd-46f2-8e94-f30539ee051c
I think the above case is a symptom of an extra entry landing in the 0..74 range on the 3rd request, so the returned slice starts one entry later. In another case (with offset=50) I saw:
$ diff -u j.1 j.3
--- j.1 2020-06-27 14:59:13.291096405 +0100
+++ j.3 2020-06-27 14:59:14.199118541 +0100
@@ -21,5 +21,5 @@
 c4c8b834-6a48-4b2f-ad82-5d48ad396951
 b9c52cf2-9f23-4986-8ce2-5e0015cdef27
 9643ead9-b88c-365a-a305-2eaf195c6e2a
+7b38394d-d5bf-32f5-8aa8-12c16c9dc4ff
 ecc2e9db-637c-4f21-955c-6ab3e1123ffc
-ac2f1ef4-32be-4608-8a56-432bdb88f41b
i.e. an extra entry.
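To illustrate the "not stably sorted" guess, here is a toy model in Python. It is purely an assumption about the mechanism and not the server's actual query: the rows, the `added` sort column and the `page` helper are all made up. The shuffle stands in for a backend that may return rows with equal sort keys in any order.

```python
import random


def page(rows, offset, limit, key, rng):
    """Return one page of ids after sorting by `key`; tied rows come back in arbitrary order."""
    shuffled = rows[:]
    rng.shuffle(shuffled)  # models a backend with no guaranteed order among ties
    ordered = sorted(shuffled, key=key)  # stable sort, so ties keep the shuffled order
    return [row["id"] for row in ordered[offset:offset + limit]]


# Hypothetical data: 200 rows where every 10 share the same sort value.
rows = [{"id": f"release-{i:03d}", "added": i // 10} for i in range(200)]
rng = random.Random(0)

# Sorting on the non-unique column alone: repeated requests for the same
# offset/limit can differ in order and in membership.
unstable = {
    tuple(page(rows, 75, 25, lambda r: r["added"], rng)) for _ in range(20)
}

# Adding a unique tiebreaker (the id) pins every row to a single position.
stable = {
    tuple(page(rows, 75, 25, lambda r: (r["added"], r["id"]), rng))
    for _ in range(20)
}

print(len(unstable) > 1, len(stable) == 1)
```

In this model the offset=75 page cuts through a group of tied rows, so which of them land on the page changes from request to request, much like the diffs above. If the real cause is similar, ordering on a unique column in addition to the existing sort key would make the slices repeatable.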