Uploaded image for project: 'MusicBrainz Server'
  1. MusicBrainz Server
  2. MBS-12794

Don't use spammer tags/ratings when calculating tag counts / rating averages

    • Icon: Improvement Improvement
    • Resolution: Unresolved
    • Icon: Normal Normal
    • None
    • None
    • Schema Change
    • None

      Rating averages and tag counts are calculated automatically via database triggers. We want to avoid spammer tags and ratings to affect real values. As such, we need:

      1) A one-off script to recalculate all tag counts and ratings, skipping any by users currently marked as spammers.

      2) Triggers that run on update of user and check if spamming "privs" have been added / removed - if they have, they should update tag counts and average ratings

      • For tags, we would need a new trigger update_tag_counts_for_update AFTER UPDATE ON editor that updates every aggregate tag for each one of their tags if they are being set as spammer or unset as spammer, and adjusts the tag's ref counts accordingly.
      • For ratings, we would need a new trigger update_aggregate_rating_for_update AFTER UPDATE ON editor that updates every aggregate rating and rating count for each one of their ratings if they are being set as spammer or unset as spammer.

      3) Changes to the current triggers updating tags and ratings to avoid making any changes if the editor is marked as a spammer (in theory this should not be needed since someone already marked as a spammer cannot log in, but better safe than sorry - we don't want to find out something failed and we have to recalculate everything again)

      • For tags, there are three functions ran by the triggers: update_tag_counts_for_raw_insert / update_tag_counts_for_raw_update / update_tag_counts_for_raw_delete. We have access to the editor ID here, so we could certainly check their privs - the only question would be how much slower the triggers would be with the extra query each time. If they would be significantly slower, we could probably skip this step. These update both tag counts in entity_tag tables and ref_count for the tag itself.
      • For ratings, there are three functions ran by the triggers: update_aggregate_rating_for_raw_insert / update_aggregate_rating_for_raw_update / update_aggregate_rating_for_raw_delete. We have access to the editor ID here, so we could certainly check their privs - the only question would be how much slower the triggers would be with the extra query each time. If they would be significantly slower, we could probably skip this step. These run update_aggregate_rating which updates both the rating value and count.

            reosarevok Nicolás Tamargo
            reosarevok Nicolás Tamargo
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

              Created:
              Updated:

                Version Package