  1. Zapped: AcousticBrainz
  2. AB-478

Move AB data storage from database to disk

    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Normal

      When we first wrote AB we decided to use Postgres' new JSON data type because the output of the AB extractor was also JSON. This helped us get off the ground quickly, but we've since encountered many issues with it:

      • We never actually query inside the JSON objects, so having the data readily at hand in Postgres isn't useful
      • When we need to process files (HL extractor, model training, backups), we have to pull the data out into a temporary file, do something with it, then delete the file
      • Our database continues to grow in size, and we have no good way of backing up or replicating it, or of separating the features from other user data (such as accounts and datasets)

      We should switch to storing lowlevel (LL) files on disk instead of in the database. This means that tasks that require a file can load it directly from disk, we can keep direct disk backups, and if our data grows we can switch to a larger storage engine.
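
A minimal sketch of what storing LL files on disk could look like. Everything here is an assumption for illustration, not a decided design: the two-level directory sharding by MBID prefix, the per-MBID submission `offset`, and the function names `lowlevel_path`, `save_lowlevel`, and `load_lowlevel` are all hypothetical.

```python
import json
import os
import tempfile
from pathlib import Path


def lowlevel_path(root: Path, mbid: str, offset: int) -> Path:
    """Return the (hypothetical) file path for one lowlevel submission."""
    mbid = mbid.lower()
    # Two levels of sharding keep any single directory from growing too large.
    return root / mbid[:2] / mbid[2:4] / f"{mbid}-{offset}.json"


def save_lowlevel(root: Path, mbid: str, offset: int, data: dict) -> Path:
    """Write a document atomically: temp file in the target directory, then rename."""
    target = lowlevel_path(root, mbid, offset)
    target.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(data, fh)
            fh.flush()
            os.fsync(fh.fileno())
        # os.replace is atomic when source and target are on the same filesystem,
        # so readers never observe a half-written file.
        os.replace(tmp_name, target)
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return target


def load_lowlevel(root: Path, mbid: str, offset: int) -> dict:
    """Load a document straight from disk, with no temporary copy needed."""
    with lowlevel_path(root, mbid, offset).open(encoding="utf-8") as fh:
        return json.load(fh)
```

With a layout like this, a task such as the HL extractor could be handed the path returned by `lowlevel_path` directly, and backups become a plain copy of the directory tree.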

      The database should still contain a record of submissions. We need to decide how much of the HL data we should store. It is much smaller than the LL data, and we already have it split across multiple tables to make it easy to update and return subsets of the data.
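
One possible shape for the submission record, sketched under assumptions: `sqlite3` stands in for Postgres so the example is self-contained, and the `submission` table, its columns, and the helper names are hypothetical. The idea is that the database keeps only metadata (which file, where, and a checksum) while the JSON itself lives on disk.

```python
import hashlib
import sqlite3

# Hypothetical schema: one row per submitted file, no JSON payload.
SCHEMA = """
CREATE TABLE IF NOT EXISTS submission (
    id                INTEGER PRIMARY KEY,
    mbid              TEXT NOT NULL,
    submission_offset INTEGER NOT NULL,
    path              TEXT NOT NULL,
    sha256            TEXT NOT NULL,
    submitted         TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    UNIQUE (mbid, submission_offset)
)
"""


def record_submission(conn: sqlite3.Connection, mbid: str, offset: int,
                      path: str, payload: bytes) -> str:
    """Store where a submission lives on disk, plus a checksum of its bytes."""
    digest = hashlib.sha256(payload).hexdigest()
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO submission (mbid, submission_offset, path, sha256) "
            "VALUES (?, ?, ?, ?)",
            (mbid, offset, path, digest),
        )
    return digest


def find_submission(conn: sqlite3.Connection, mbid: str, offset: int):
    """Return (path, sha256) for a submission, or None if it was never recorded."""
    return conn.execute(
        "SELECT path, sha256 FROM submission "
        "WHERE mbid = ? AND submission_offset = ?",
        (mbid, offset),
    ).fetchone()
```

Keeping a checksum in the row would let a later job confirm that the file on disk still matches what was submitted.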

      The steps for this task are going to look something like this (not complete):

      • Start saving items from the API to disk
      • Move items from DB to disk
      • Update API to read from disk (including the individual feature endpoint)
      • Update dumps and dataset training to read from disk
      • Decide what to do with HL
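
Two of the steps above, moving items from the DB to disk and serving an individual feature from disk, can be sketched roughly as follows. This is illustrative only: `sqlite3` stands in for Postgres, and the `legacy_lowlevel` table, its columns, the directory layout, and the dotted feature path format are assumptions rather than the real AB schema or API.

```python
import json
import sqlite3
from pathlib import Path


def _path(root: Path, mbid: str, offset: int) -> Path:
    # Hypothetical layout: shard by the first four characters of the MBID.
    return root / mbid[:2] / mbid[2:4] / f"{mbid}-{offset}.json"


def migrate_to_disk(conn: sqlite3.Connection, root: Path,
                    batch_size: int = 1000) -> int:
    """Copy each JSON document from the stand-in table to disk.

    Returns the number of files written. Files that already exist are
    skipped, so the migration can be interrupted and rerun safely.
    """
    written = 0
    cursor = conn.execute(
        "SELECT mbid, submission_offset, data FROM legacy_lowlevel "
        "ORDER BY mbid, submission_offset"
    )
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        for mbid, offset, data in rows:
            target = _path(root, mbid, offset)
            if target.exists():
                continue
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_text(data, encoding="utf-8")
            # Read the file back and compare before trusting the copy.
            if json.loads(target.read_text(encoding="utf-8")) != json.loads(data):
                raise RuntimeError(f"verification failed for {target}")
            written += 1
    return written


def read_feature(root: Path, mbid: str, offset: int, feature: str):
    """Return one feature from a file on disk, given a dotted path
    such as 'lowlevel.average_loudness'. Raises KeyError if it is absent."""
    value = json.loads(_path(root, mbid, offset).read_text(encoding="utf-8"))
    for key in feature.split("."):
        value = value[key]
    return value
```

Only after a migration like this has been verified would it make sense to drop the JSON column. The sketch deliberately leaves the database untouched.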


          There are no comments yet on this issue.

            Assignee: Unassigned
            Reporter: Alastair Porter (alastairp)