• New Feature
    • Resolution: Fixed
    • Normal

      Requirements

      "Like what we currently have but throttling instead of blocking".

      Lots of details tbc before we think of implementing this.

          [MBH-192] Bad User-Agent, v2

          Dave Evans added a comment -

          ratelimit-server modified.

          TODO (on my part): monitoring (can't quite remember how that works) and mail to ruaok about the nginx block.

          Dave Evans added a comment -

          Note to self: "Don't remove the nginx block just yet; email ruaok explaining how to remove it."

          Dave Evans added a comment - - edited

          Changes needed to implement this:

          musicbrainz-server

          In the 2 (?) places where we currently apply the ws rate limit, the
          existing checks look like this:

          • check "ws ip=x.x.x.x", reject with a message if failed
          • check "ws global", reject with a message if failed

          Add an extra check to precede the existing two:

          • check "ws ua=$user_agent", reject with a message (exact message tbc)
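The resulting check order (ua, then ip, then global) could be sketched roughly as below. This is only an illustration in Python, not the actual musicbrainz-server code: `first_failed_check` and the `over_limit` callable are hypothetical names standing in for whatever asks ratelimit-server about a key.

```python
from typing import Callable, Optional


def first_failed_check(user_agent: str, ip: str,
                       over_limit: Callable[[str], bool]) -> Optional[str]:
    """Return the name of the first check that fails, or None if all pass.

    `over_limit` is a hypothetical stand-in for the ratelimit-server query:
    it takes a key string and returns True when that key is over its limit.
    """
    # New ua check first, followed by the two existing checks.
    checks = [
        ("ua", f"ws ua={user_agent}"),
        ("ip", f"ws ip={ip}"),
        ("global", "ws global"),
    ]
    for name, key in checks:
        if over_limit(key):
            return name  # reject here; later checks are not consulted
    return None
```

Returning only the name of the failed check keeps the choice of reject message (still tbc) separate from the ordering logic.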

          ratelimit-server

          • munge "ws ua=$user_agent" (where $user_agent != python-musicbrainz/0.7.3)
            to "ws ua=generic-bad-ua"
          • apply a leaky 500/10s limit to "ws ua=python-musicbrainz/0.7.3"
          • apply a leaky 500/10s limit to "ws ua=generic-bad-ua"
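A minimal sketch of the key munging and the 500/10s limit, in Python for illustration only (this is not the ratelimit-server implementation). It assumes "leaky" means that a rejected request is not counted against the key, and it uses a simple sliding window rather than whatever smoothing the real server applies. `munge_key` and `LeakyLimit` are made-up names.

```python
from collections import deque

PY73 = "python-musicbrainz/0.7.3"
GENERIC_BAD_UA = "generic-bad-ua"
UA_PREFIX = "ws ua="


def munge_key(key: str) -> str:
    """Collapse every "ws ua=..." key except py73 into one shared key."""
    if key.startswith(UA_PREFIX) and key[len(UA_PREFIX):] != PY73:
        return UA_PREFIX + GENERIC_BAD_UA
    return key  # py73 and non-ua keys pass through unchanged


class LeakyLimit:
    """Allow at most `limit` accepted requests per `period` seconds.

    Assumption: "leaky" means rejected requests are not recorded, so a
    client that keeps hammering does not extend its own penalty.
    """

    def __init__(self, limit: int = 500, period: float = 10.0):
        self.limit = limit
        self.period = period
        self.hits = deque()  # timestamps of accepted requests

    def check(self, now: float) -> bool:
        # Forget accepted requests that have aged out of the window.
        while self.hits and self.hits[0] <= now - self.period:
            self.hits.popleft()
        if len(self.hits) >= self.limit:
            return False  # over limit; leaky, so this request is not recorded
        self.hits.append(now)
        return True
```

One `LeakyLimit` instance would be held per munged key, i.e. one for "ws ua=python-musicbrainz/0.7.3" and one for "ws ua=generic-bad-ua". The caller passes in the current time, which keeps the sketch easy to test.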

          other hosting

          • ensure we have rrd graphing etc of "ws ua=generic-bad-ua"
          • remove the nginx outright block on bad UAs

          Robert Kaye added a comment -

          Q3: As decided in IRC:

          djce: an aggregate block, and report on the aggregate.
          djce: one limit/report for py73, and one limit/report for "other bad uas"

          Robert Kaye added a comment -

          A4: On leaky vs strict: Let's go with leaky.

          Robert Kaye added a comment -

          A2:
          per ip: 503 Your ip is over limit. Stop it.
          per UA: 503 your user agent string has been throttled. See wiki page, blog, sky for details.
          global: 503 we're over capacity. not your fault, sorry.
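For illustration, the three responses above could be kept in one lookup keyed by the check that failed. This is a sketch in Python, not the server code; `THROTTLE_MESSAGES` and `throttle_response` are hypothetical names, and the message texts are the draft wording from this comment.

```python
from typing import Tuple

# Draft messages from this ticket, keyed by the check that failed.
THROTTLE_MESSAGES = {
    "ip": "Your ip is over limit. Stop it.",
    "ua": "your user agent string has been throttled. "
          "See wiki page, blog, sky for details.",
    "global": "we're over capacity. not your fault, sorry.",
}


def throttle_response(failed_check: str) -> Tuple[int, str]:
    """Return (status, message) for a failed check; all three use 503."""
    return 503, THROTTLE_MESSAGES[failed_check]
```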

          Robert Kaye added a comment -

          A1: We want throttling per UA string: python-musicbrainz/0.7.3 should be throttled differently from YourMom/8.3. We should check in this order: ua -> ip -> global

          Dave Evans added a comment -

          Also, a rationale would be good, i.e. why change from what we have right now (the outright block).

          Dave Evans added a comment -

          Q1: What kind of throttling? For example, is it a single throttle shared by all bad UAs, or one throttle for python-musicbrainz/0.7.3 and one for the rest, or something else? Or some sort of per-IP throttling?

          Q2: When over the limit, what response should be served? Do we know how the bad-UA clients behave when they receive that response? For example, we might find that python-musicbrainz/0.7.3 responds well to 403 but not to 503, so maybe that UA should get a 403 rather than a 503 when over the limit, while other bad UAs get a 503.

          Q3: What monitoring is required?

            djce Dave Evans