New API based scraping does not capture all the data available

      The new API-based import feature does not import all of the data returned by the last.fm API. In particular, the API may return MBIDs for artists and releases that are not captured. URLs for last.fm pages should also be stored.

      For some reason, the Spotify links submitted with scrobbles captured by the screen-scraping importer are not provided by the new API-based scraper. Why is that? Do we need to change the getRecentTracks call in order to capture them?

      See the attached image for what data is available.
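
      As a rough illustration of what capturing the extra fields could look like, here is a minimal sketch. It assumes the JSON form of the getRecentTracks response, where each track entry carries "name", "mbid" and "url" keys plus "artist", "album" and "date" objects with "#text", "mbid" and "uts" keys. The function name and the output key names are hypothetical and are not taken from the ListenBrainz code.

```python
def extract_track_fields(track):
    """Flatten one track entry of a recent-tracks response into a plain dict.

    Assumed input layout (not confirmed by this ticket): "artist" and "album"
    are objects with "#text" and "mbid" keys, "date" has a "uts" timestamp.
    Empty strings from the API are normalised to None.
    """
    artist = track.get("artist") or {}
    album = track.get("album") or {}
    date = track.get("date") or {}
    return {
        "track_name": track.get("name", ""),
        "artist_name": artist.get("#text", ""),
        "release_name": album.get("#text", ""),
        "recording_mbid": track.get("mbid") or None,
        "artist_mbid": artist.get("mbid") or None,
        "release_mbid": album.get("mbid") or None,
        "lastfm_track_url": track.get("url") or None,
        "listened_at": int(date["uts"]) if date.get("uts") else None,
    }


# Made-up sample entry, for illustration only.
sample_track = {
    "name": "Example Track",
    "mbid": "",
    "url": "https://www.last.fm/music/Example+Artist/_/Example+Track",
    "artist": {"#text": "Example Artist", "mbid": "11111111-1111-1111-1111-111111111111"},
    "album": {"#text": "Example Album", "mbid": ""},
    "date": {"uts": "1455000000"},
}
fields = extract_track_fields(sample_track)
```

      Empty MBID strings become None here so that a missing identifier is never stored as if it were a real one.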

          [LB-106] New API based scraping does not capture all the data available

          Robert Kaye added a comment -

          Sadly, Spotify URLs are not available via the API, so I believe this is done.


          Calvin Walton added a comment -

          Just to note - the artist and release MBIDs from the last.fm API are known to be low quality.
          The API doesn't return the artist MBID from the scrobble, but rather the MBID associated with the (non-disambiguated) last.fm artist page.
          Release MBIDs aren't included in submissions, so the ones returned are just guessed (maybe useful for a release group lookup, but we'd probably do better to do the matching ourselves).

          The recording MBIDs returned by the API are more useful - if a recording MBID was included in the scrobble, the same MBID will be returned here.

          However, last.fm will sometimes return recording MBIDs for tracks which did not have MBIDs included in the submission. These are more likely to be correct than the artist and release MBIDs, at least.
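
          One way an importer could act on the distinction above is to keep the recording MBID as a first-class value and store the artist and release MBIDs only as unverified hints. The sketch below shows one possible policy, not what ListenBrainz does. The function name and the key names are hypothetical.

```python
def split_mbids_by_trust(fields):
    """Separate MBIDs into a trusted dict and a hints dict.

    Policy assumed for this sketch: recording MBIDs are kept as-is, while
    artist and release MBIDs are treated as guesses to be re-matched later.
    """
    trusted = {}
    hints = {}
    if fields.get("recording_mbid"):
        trusted["recording_mbid"] = fields["recording_mbid"]
    for key in ("artist_mbid", "release_mbid"):
        if fields.get(key):
            hints[key] = fields[key]
    return trusted, hints


# Made-up identifiers, for illustration only.
example_fields = {
    "recording_mbid": "22222222-2222-2222-2222-222222222222",
    "artist_mbid": "11111111-1111-1111-1111-111111111111",
    "release_mbid": None,
}
trusted, hints = split_mbids_by_trust(example_fields)
```

          Keeping the hints separate leaves room to replace them later with the results of our own matching.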


            Assignee: Unassigned
            Reporter: Robert Kaye