• Type: Improvement
    • Resolution: Unresolved
    • Priority: Normal

      MusicBrainz Server should move away from using Perl for the server code.

      To do this, we decided at Summit 2013 that these things need to be done:

      1. Pick a simple entity - for example, Series or URLs.
      2. Implement ws/3 for this entity, and implement any required minimal ws/3 functionality for related entities.
      3. Create a data aggregation service to create JSON blobs from different ws/3 responses.
      4. Create a minimal Perl front-end which queries the aggregation service once for each template to be rendered.
      5. Repeat for other entities, one at a time.
      6. Remove all the minimal Perl front-ends and replace them with something else.
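
      Steps 3 and 4 can be sketched roughly as follows. This is only an illustration under stated assumptions: the ws/3 paths, the `aggregate` function, and the shape of the responses are hypothetical, since no ws/3 exists yet, and `fake_fetch` stands in for real HTTP calls.

```python
import json
from typing import Callable, Dict

# Hypothetical: a fetcher takes a ws/3 path and returns the decoded JSON response.
Fetcher = Callable[[str], dict]


def aggregate(paths: Dict[str, str], fetch: Fetcher) -> str:
    """Fetch each ws/3 path once and merge the responses into one JSON blob.

    `paths` maps the key a template expects (e.g. "series") to a ws/3 path.
    """
    blob = {key: fetch(path) for key, path in paths.items()}
    return json.dumps(blob, sort_keys=True)


def fake_fetch(path: str) -> dict:
    """Offline stand-in for ws/3; the paths and payloads are made up."""
    canned = {
        "/ws/3/series/123": {"id": "123", "name": "Example Series"},
        "/ws/3/series/123/urls": {"urls": ["http://example.org/"]},
    }
    return canned[path]


# The front-end would make one call like this per template to be rendered.
page_blob = aggregate(
    {"series": "/ws/3/series/123", "urls": "/ws/3/series/123/urls"},
    fake_fetch,
)
```

      The front-end then only ever sees `page_blob`, so replacing the Perl templates later (step 6) would not need to touch the aggregation service.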

      Discussed at http://chatlogs.musicbrainz.org/musicbrainz-devel/2013/2013-09/2013-09-30.html#T19-15-12-920285

          [MBS-6773] Slowly move away from Perl in the MBS code.

          Paul Taylor added a comment - - edited

          Your use case is much simpler than mine; a tagging application has a lot more to it than just the getRelease() function.
          Again, you link this specification to 'If the rate limit is flexible and probably higher since queries are smaller'. We could have a higher rate limit with the existing system, and we could have a flexible rate limiter; the proposed solution does not in itself make either of these things possible or impossible.

          You've ignored my point about search.

          Also, you say 'The argument for using a modular WS is not that it will be quicker. It's that it will break data up', but one of the arguments for this new proposal was that it would be quicker, which is why the faster/flexible rate limiter is linked in with it like a carrot even though it's not related. The number one complaint from users of the web service is that it is too slow (because of the rate limiting), so if the central aim is not to reduce the time it takes to do queries then ws/3 is pointless. Whilst some of the lookups do return too much information and need trimming (as was done with the recent 'remove artist details from relations' fix), users don't really mind that there is too much data; that in itself is not a problem for them, only for MusicBrainz.

          I don't see the problem with discussing this here; better to start with considered responses here than a free-for-all on IRC where half the points made get ignored and lost.


          Ben Ockmore added a comment -

          I like all of the ideas within the document, and I don't see the problem with making 11 queries. I like it so much I'm planning to use it for my WS on WavePlot. If the rate limit is flexible and probably higher since queries are smaller, you can just get the release, then almost instantly request data for all the recordings if you need it. The argument for using a modular WS is not that it will be quicker. It's that it will break data up from 1 large query which produces unnecessary data, into several smaller queries that can be targeted exactly how they're needed.

          I really don't see the problem with changing the number of requests in code. If you have a getRelease() function, you just have to add in a for loop to fetch additional per-track data.
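
          A minimal sketch of that loop, assuming a hypothetical ws/3 in which a release response lists its recording IDs under a `recording_ids` key. The paths, field names, and the `counting_fetch` stub are illustrative only and are not taken from any real API.

```python
from typing import Callable, List

Fetcher = Callable[[str], dict]  # hypothetical: ws/3 path -> decoded JSON


def get_release(release_id: str, fetch: Fetcher) -> dict:
    """Fetch a release, then loop over its tracks for per-recording data."""
    release = fetch(f"/ws/3/release/{release_id}")
    recordings = []
    for recording_id in release.get("recording_ids", []):
        recordings.append(fetch(f"/ws/3/recording/{recording_id}"))
    release["recordings"] = recordings
    return release


calls: List[str] = []


def counting_fetch(path: str) -> dict:
    """Offline stub that records every request made against it."""
    calls.append(path)
    if path.startswith("/ws/3/release/"):
        return {"id": "r1", "recording_ids": [f"t{i}" for i in range(10)]}
    return {"id": path.rsplit("/", 1)[-1]}


release = get_release("r1", counting_fetch)
```

          For a 10-track release this makes one request for the release plus one per recording, i.e. the 11 queries mentioned above.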

          Finally, I don't think this discussion is productive here, and we should wait until Rob and others decide whether we're going for something like this in the near future before we discuss the details as a group.


          Paul Taylor added a comment - - edited

          I think this document clarifies for me why we can't go straight to ws/3: we have no agreement on how it should work, and I think the ws/3 proposal document in its current form is a bad idea. For example, in ws/2 I can currently get most of the information I require from a single release lookup, but under the proposal you would have to make additional requests. And although the document does return recordings for releases, some people think it should not, so we would then need a separate request for each recording; for a 10-track release that's eleven requests. Given that in most cases users will want the track information, it makes no sense to have 11 requests instead of 1. We have a weak argument that somehow these 11 queries will be quicker than the 1 existing query because they are smaller; I find that very hard to believe. It will only be quicker if the MusicBrainz servers' resources are skewed towards serving ws/3 requests rather than ws/2 requests. The design seems to be very much modelling the database rather than providing a web service that would be useful.
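
          The request-count concern can be made concrete with a little arithmetic. The sketch below assumes, purely for illustration, a fixed per-client rate limit and ignores server processing and network time; the function name and the rates used are hypothetical.

```python
def min_elapsed_seconds(num_requests: int, requests_per_second: float) -> float:
    """Lower bound on wall-clock time imposed by the rate limit alone.

    The first request goes out immediately; each later one waits one interval.
    """
    if num_requests <= 0:
        return 0.0
    return (num_requests - 1) / requests_per_second


# Assumed limit of 1 request per second (an illustrative figure).
single_lookup = min_elapsed_seconds(1, 1.0)   # one lookup returning everything
split_lookup = min_elapsed_seconds(11, 1.0)   # release + 10 recordings

# The same 11 requests under an assumed, more generous limit of 11 per second.
split_lookup_fast = min_elapsed_seconds(11, 11.0)
```

          Under these assumptions the 11 smaller requests only approach the single lookup if the rate limit itself is raised, which is a separate decision from how the responses are split up.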

          The problem with inc parameters can be resolved by just providing multiple endpoints for the inc combinations currently in use, and removing those combinations that provide too much; after all, we did some analysis and found most combinations were never used.

          Also, as a developer who has used the web service extensively, I've spent some time working out strategies to get the information in the fewest queries, making compromises where necessary. So I would rather evolve ws/2 into ws/3 than implement a very different system which will mean extensive rewriting of my code to use it.

          We would be better off clearly defining the problems with ws/2 and then finding solutions to them.

          For example:

          ws/3/rid/releasesimple (i.e. equivalent to ws/2/release/rid/?inc=release-groups+recordings+artists)
          ws/3/rid/releaseall (i.e. equivalent to ws/2/release/rid/?inc=release-groups+recordings+artists+media+alias+tags)
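
          One way to picture these named endpoints is as a fixed table of inc combinations. The sketch below is hypothetical: the `NAMED_ENDPOINTS` table and `ws2_equivalent` function are illustrative names, and only the two combinations listed above are included.

```python
from typing import Dict, List

# Named endpoint -> the ws/2 inc parameters it would stand for
# (taken from the two examples above; any further entries would be guesses).
NAMED_ENDPOINTS: Dict[str, List[str]] = {
    "releasesimple": ["release-groups", "recordings", "artists"],
    "releaseall": [
        "release-groups", "recordings", "artists", "media", "alias", "tags",
    ],
}


def ws2_equivalent(rid: str, name: str) -> str:
    """Build the ws/2 release lookup that a named endpoint is equivalent to."""
    incs = NAMED_ENDPOINTS[name]  # raises KeyError for unsupported combinations
    return f"ws/2/release/{rid}/?inc=" + "+".join(incs)


simple_url = ws2_equivalent("rid", "releasesimple")
all_url = ws2_equivalent("rid", "releaseall")
```

          Combinations that are not in the table are simply unavailable, which is how the unused or overly large inc combinations would be retired.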

          Also, the ws/3 document doesn't discuss search results. It is important to emphasize that a search for releases is not the same as a lookup of a release, and should not return the same information. Usually, if users can search by a particular field, they expect to see how the results matched. For example, I can search for artists by alias, or by an alias expression such as alias:b*; if the search results just mirror the artist endpoint, I cannot actually see the aliases of the artists that matched or what matched, which would not be adequate. If the search results return all the information they currently return (which is not particularly large, but larger than the proposed basic lookup endpoints), we would have the perverse situation of people doing searches by id rather than lookups because they are more useful!


          Paul Taylor added a comment -

          I think they are plural to indicate search endpoints, i.e. searches for multiple entities.


          Ulrich Klauer added a comment -

          s/are intended/are not intended/?


          Oliver Charles added a comment -

          http://ocharles.org.uk/v3.html contains some notes on ws/3. They are intended to be discussed now, I'm just linking it here so if you ever want to know my thoughts in the future - there they are.

          Ben Ockmore added a comment - - edited

          Ollie wrote down WS/3 on the day, and it's how I remember it. WS/3 is the future of a non-Perl MusicBrainz, and it makes sense to have both the site and client libraries querying the same service.

          It's the path to much more functional desktop applications, like custom MB editing programs, editing from within Picard, and smarter music consumption applications.

          There's no easy way this can be done with WS/2, and if it's attempted that way it'll take just as long and be less efficient.

          WS/3 will be a much more modular web service, which should allow us to get exactly the data we need for the data aggregator which would feed the site.

          With WS/2 we would have to use a lot of inc parameters, and possibly call multiple endpoints too, for each page.


          Paul Taylor added a comment -

          That's not how I remember it. I'm sure we said we would use the existing ws/2, adding to it as necessary (probably as undocumented additions); this experience would help us get towards ws/3, and in a further iteration we would then arrive at an official ws/3. Trying to implement ws/3 at the same time as this other work suddenly makes the problem harder. And doing this vertically is problematic, because if you want to implement ws/3 at the same time you'll end up having to implement ws/3 for entities that are not yet being worked on but are accessed by the entity you are working on.

          The edit/view split is an important distinction, and perhaps we should divide this into two tasks: view and edit.


          Ben Ockmore added a comment -

          2 - No, the plan was to implement WS/3. If you look at the end of the summit notes, the stuff I wrote on the day talks about making a new WS. And Ollie's diagram also says WS/3 (I've attached it here). The main problem with WS/2 is that it doesn't support editing. Also, it already aggregates a lot of data and isn't particularly efficient because of that.


          Paul Taylor added a comment -

          1> There was discussion about whether it would be best to do a new entity not currently in MusicBrainz or to modify an existing entity.

          2> is misleading; the way I remember it, we would use the existing ws/2, making modifications as necessary, rather than trying to implement a brand new replacement ws/3 at this stage.


            Assignee: Unassigned
            Reporter: Ben Ockmore (lordsputnik)
            Votes: 2
            Watchers: 7
