Uploaded image for project: 'Picard'
  1. Picard
  2. PICARD-300

Only save files if they were changed

    • Icon: Improvement Improvement
    • Resolution: Unresolved
    • Icon: Normal Normal
    • None
    • None
    • File Move & Rename
    • None

      from http://bugs.musicbrainz.org/ticket/3205

      Currently PicardQt saves files even if they were not changed. This causes a lot of unnecessary IO if one loads and saves his whole collection (e.g. to update the tags).

      The Trac ticket has a patch, but it's buggy according to the comments.

          [PICARD-300] Only save files if they were changed

          Sophist added a comment - - edited

          To clarify what Philipp said, suppose your files are ogg and some of the metadata is not saved.

          When you load the file again, the metadata read will be incomplete, and so there will be metadata downloaded that is new and the file will appear to need saving even though the new data wont be saved and the file will be the same afterwards. 

          We either need to ensure that all data is saved to all formats (a massive undertaking but one which should be undertaken at some point) or we need to have a detailed mapping of what data is saved to what formats to know whether the extra data we have will be discarded or not.

          Picard is already more complicated than you think, and fixing these sorts of issues is also more complicated than you think.

          And of course, Picard does not have a paid team of programmers behind it to support it - it was programmed by volunteers, and it continues to be maintained and enhanced by volunteers.

          Sophist added a comment - - edited To clarify what Philipp said, suppose your files are ogg and some of the metadata is not saved. When you load the file again, the metadata read will be incomplete, and so there will be metadata downloaded that is new and the file will appear to need saving even though the new data wont be saved and the file will be the same afterwards.  We either need to ensure that all data is saved to all formats (a massive undertaking but one which should be undertaken at some point) or we need to have a detailed mapping of what data is saved to what formats to know whether the extra data we have will be discarded or not. Picard is already more complicated than you think, and fixing these sorts of issues is also more complicated than you think. And of course, Picard does not have a paid team of programmers behind it to support it - it was programmed by volunteers, and it continues to be maintained and enhanced by volunteers.

          @Stunkymonkey: Yes, your theory is correct. The problem is that this detection code is not reliable enough, and that needs to be fixed first. It is unfortunately not reliable in both ways: It can indicate a change even if there was none, and it does not indicate a change on some things that would require you to save the files again (e.g. changing from ID3 v2.3 to 2.4 or vise versa).

          So this needs to be solved first. If the change detection is unreliable basing a logic to skip saving on it will cause more trouble than it solved.

          Philipp Wolfer added a comment - @Stunkymonkey: Yes, your theory is correct. The problem is that this detection code is not reliable enough, and that needs to be fixed first. It is unfortunately not reliable in both ways: It can indicate a change even if there was none, and it does not indicate a change on some things that would require you to save the files again (e.g. changing from ID3 v2.3 to 2.4 or vise versa). So this needs to be solved first. If the change detection is unreliable basing a logic to skip saving on it will cause more trouble than it solved.

          Stunkymonkey added a comment -

          what i thought: to detect a change there has to be already code. if the check-mark is shown nothing has to be changed and if the "green/yellow/red"-box is shown, then there exists a change.

          So in my mind it did not sound very complicated.

          Stunkymonkey added a comment - what i thought: to detect a change there has to be already code. if the check-mark is shown nothing has to be changed and if the "green/yellow/red"-box is shown, then there exists a change. So in my mind it did not sound very complicated.

          As Sophist wrote implementing this first requires very accurate knowledge about if a file will be changed or not. Sophist pointed out the cover art issues (see PICARD-1001), that's one area of things that can cause wrong detection. But it it only one possible problem, in detail there are a lot more cases. More general the issue is to exactly know in advance which tags will be written how to the underlying format (based on all the settings) and how they will look like when loaded back. PICARD-1001 is another special case. I think there are also issues with change detection for multi-value tags if ID3v2.3 is used. Even the cover art issue is not only a single cause but multiple distinct cases.

          I think solving this in a way that is more robust requires some general changes how Picard handles metadata. The basic model is that Picard has the current metadata as a combination as data loaded from the file and what it had loaded from MB (possible modified by the user).But only if you press save Picard will actually run the code that does all the conversions necessary to save this data to the underlying format. In the end the format has some limitations and might not support all the tags and all the data written to those tags. This is also influenced by settings. Only after saving and reading back the saved data you can currently say for sure how the data got saved.

          Some cases are already handled for the change detection (e.g. tags not supported at all, or some ID3v2.3 specific cases), but the potential causes for differences between data supposed to be saved and data actually saved are basically endless. So there needs to be a way to actually process the metadata independent of saving it to check for differences.

          Philipp Wolfer added a comment - As Sophist wrote implementing this first requires very accurate knowledge about if a file will be changed or not. Sophist pointed out the cover art issues (see PICARD-1001 ), that's one area of things that can cause wrong detection. But it it only one possible problem, in detail there are a lot more cases. More general the issue is to exactly know in advance which tags will be written how to the underlying format (based on all the settings) and how they will look like when loaded back. PICARD-1001 is another special case. I think there are also issues with change detection for multi-value tags if ID3v2.3 is used. Even the cover art issue is not only a single cause but multiple distinct cases. I think solving this in a way that is more robust requires some general changes how Picard handles metadata. The basic model is that Picard has the current metadata as a combination as data loaded from the file and what it had loaded from MB (possible modified by the user).But only if you press save Picard will actually run the code that does all the conversions necessary to save this data to the underlying format. In the end the format has some limitations and might not support all the tags and all the data written to those tags. This is also influenced by settings. Only after saving and reading back the saved data you can currently say for sure how the data got saved. Some cases are already handled for the change detection (e.g. tags not supported at all, or some ID3v2.3 specific cases), but the potential causes for differences between data supposed to be saved and data actually saved are basically endless. So there needs to be a way to actually process the metadata independent of saving it to check for differences.

          Sophist added a comment -
          1. We need to indicate to the user which Albums / Tracks need to be saved - which could be because a file needs to be moved or renamed, or because the tags contained in it have changed or because the cover art saved as separate files has changed.
          2. When the user saves an album or track, then we need to do the minimum amount of I/O needed to save the changes. If tags have not changed, then they should not be rewritten.

          At present, the code to determine when changes have been made is broken. You can save and reload an album and the vast majority of the time it will still be shown as needing changes. My guess (and it is no more than that) is that it is the cover art (in most cases the exact same image) which gets downloaded again which is falsely marked as changed.

          I am unsure whether the changes for an Album are held granularly as tags changed, file path/name changed, cover art changed, but they need to be. It also gets more complicated when you load files first and then change the options for what gets saved or e.g. change from cover art saved as files to saved as images as you then need to have flags for options having changed.

          Finally, there is an equal case for the user to be able to force a full save to be made (tags and cover art files written) even if they are all identical. Sometimes you just want to make sure that the data is rewritten - but I agree that this should not be the norm (which should be to only do the writes that are necessary).

          Sophist added a comment - We need to indicate to the user which Albums / Tracks need to be saved - which could be because a file needs to be moved or renamed, or because the tags contained in it have changed or because the cover art saved as separate files has changed. When the user saves an album or track, then we need to do the minimum amount of I/O needed to save the changes. If tags have not changed, then they should not be rewritten. At present, the code to determine when changes have been made is broken. You can save and reload an album and the vast majority of the time it will still be shown as needing changes. My guess (and it is no more than that) is that it is the cover art (in most cases the exact same image) which gets downloaded again which is falsely marked as changed. I am unsure whether the changes for an Album are held granularly as tags changed, file path/name changed, cover art changed, but they need to be. It also gets more complicated when you load files first and then change the options for what gets saved or e.g. change from cover art saved as files to saved as images as you then need to have flags for options having changed. Finally, there is an equal case for the user to be able to force a full save to be made (tags and cover art files written) even if they are all identical. Sometimes you just want to make sure that the data is rewritten - but I agree that this should not be the norm (which should be to only do the writes that are necessary).

          Stunkymonkey added a comment -

          i understand, that implementing this is not easy. But when someone would like to update a big music library, this enhancement would increase the time needed for saving everything and produces less IO (good for SSD). So please consider implementing this in the near future. Thanks for all your work so far.

          Stunkymonkey added a comment - i understand, that implementing this is not easy. But when someone would like to update a big music library, this enhancement would increase the time needed for saving everything and produces less IO (good for SSD). So please consider implementing this in the near future. Thanks for all your work so far.

          This is related to PICARD-196 "Don't show a green checkbox if tags identical but file would be moved/renamed".

          If the path changes the file must be considered changed and saving be performed.

          Philipp Wolfer added a comment - This is related to PICARD-196 "Don't show a green checkbox if tags identical but file would be moved/renamed". If the path changes the file must be considered changed and saving be performed.

          Peter Culak added a comment -

          Unfortunately, the issue is still present. I believe it's PICARD-1001 which didn't make it into 1.4.1.

          Peter Culak added a comment - Unfortunately, the issue is still present. I believe it's PICARD-1001 which didn't make it into 1.4.1.

          Sophist added a comment -

          P.S. For the record https://github.com/metabrainz/picard/pull/535 was not implemented.

          Sophist added a comment - P.S. For the record https://github.com/metabrainz/picard/pull/535  was not implemented.

          Sophist added a comment -

          A lot of work has been done recently to avoid cover art that hasn't changed making the albums non-perfect.

          I believe this was included in the new release 1.4.1 made available yesterday.

          Can you try the new release and see if it fixes this problem?

          Sophist added a comment - A lot of work has been done recently to avoid cover art that hasn't changed making the albums non-perfect. I believe this was included in the new release 1.4.1 made available yesterday. Can you try the new release and see if it fixes this problem?

            samj1912 Sambhav Kothari
            nikki nikki
            Votes:
            12 Vote for this issue
            Watchers:
            11 Start watching this issue

              Created:
              Updated:

                Version Package