I think it might be helpful for Philipp to summarize the #2172 discussion here. But to take this a little further, perhaps we should be looking at micro-threads, i.e. splitting a single file load into several parts that run in different threadpools according to whether they are CPU- or I/O-intensive.
(Note: Apologies for this being a bit of a disorganised brain dump.)
In essence, in current Python with a GIL, you can have as many parallel I/O threads as you need to optimise I/O, but only a single CPU-bound thread can execute at any one time. Moreover, because of the poor way Python decides which thread runs next (competitive/random rather than deterministic), attempting to run more than one CPU-bound thread can create major lags on the main thread.
That is why we have separate threadpools in Picard for I/O and CPU workloads.
At present I have a feeling that we queue e.g. a file load onto an I/O worker thread, which first does the actual I/O (via Mutagen, which also includes some processing), but that we then do e.g. the old vs. new metadata comparison on that same I/O thread, whereas we should probably do it on a CPU thread. So we should perhaps look at splitting e.g. a single file load into multiple smaller pieces that are individually scheduled onto the appropriate thread, as sketched below.
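To make that concrete, here is a minimal sketch of chaining the I/O half and the CPU half of a file load, using plain `concurrent.futures` rather than Picard's actual thread pool classes (`read_tags`, `compare_metadata` and `load_file` are hypothetical names):

```python
from concurrent.futures import ThreadPoolExecutor

io_pool = ThreadPoolExecutor(max_workers=8, thread_name_prefix="io")
cpu_pool = ThreadPoolExecutor(max_workers=1, thread_name_prefix="cpu")

def read_tags(filename):
    # I/O-bound half: in Picard this would be the Mutagen read.
    with open(filename, "rb") as f:
        return f.read(128)  # stand-in for real tag parsing

def compare_metadata(raw):
    # CPU-bound half: stand-in for the old vs. new metadata comparison.
    return sum(raw) % 255

def load_file(filename):
    # Chain the micro-tasks: when the I/O part completes, queue the CPU part
    # onto the CPU pool rather than running it on the I/O thread.
    io_future = io_pool.submit(read_tags, filename)
    io_future.add_done_callback(
        lambda fut: cpu_pool.submit(compare_metadata, fut.result())
    )
```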
The #2172 discussion also talks about doing the directory scans on an I/O thread, and in general terms we should probably give these priority (because they load up the file queue). So the priority order for the I/O queue would be: directory scans, then the Mutagen parts of file saves, then the Mutagen parts of file loads.
The priority order for the CPU queue should be e.g.: first the CPU-intensive preparation (if any) for file saves, then the processing of MB WS API responses, then the CPU-intensive processing for file loads, then the CPU-intensive parts of e.g. matching the files in a release to the tracks in that release, etc.
As you can see, we would need to maintain several queues of micro-tasks and service them in priority sequence.
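One conventional way to do this is a `PriorityQueue` per pool with a numeric priority per micro-task type. The priorities below mirror the ordering proposed above; the task-type names and helper functions are purely illustrative:

```python
import itertools
import queue

# Lower number = higher priority; ordering as proposed above.
IO_PRIORITIES = {"dir_scan": 0, "save_io": 1, "load_io": 2}
CPU_PRIORITIES = {"save_prep": 0, "ws_response": 1, "load_cpu": 2, "match": 3}

_seq = itertools.count()  # tie-breaker so equal priorities stay FIFO
io_queue = queue.PriorityQueue()
cpu_queue = queue.PriorityQueue()

def enqueue_io(task_type, func, *args):
    io_queue.put((IO_PRIORITIES[task_type], next(_seq), func, args))

def enqueue_cpu(task_type, func, *args):
    cpu_queue.put((CPU_PRIORITIES[task_type], next(_seq), func, args))

def worker(task_queue):
    # Each worker thread services its queue strictly in priority order.
    while True:
        _prio, _n, func, args = task_queue.get()
        try:
            func(*args)
        finally:
            task_queue.task_done()
```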
The `sleep` option put forward by Gabriel Carver would seem to need to apply only to CPU threads, but would be exactly what is needed to allow the main thread to get some CPU time to update the GUI and keep it lively. We should perhaps also consider calling sleep on a CPU-intensive thread every so many iterations of a loop so as to give the GUI more update slots, or even break such a task up and queue it as several micro-tasks at the outset in order to allow other, higher-priority micro-tasks to get a look-in. (Or you simply start a single CPU task, do a batch of iterations, then queue yourself onto the head of the appropriate CPU queue to do the next chunk, allowing other, higher-priority types of CPU task to get a look-in; see the sketch below.)
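A sketch of that last variant, reusing the hypothetical `enqueue_cpu` helper from the previous snippet (`expensive_step` and the chunk size are made up; note this re-queues at the same priority, so a strict "head of the queue" semantics would need an extra mechanism):

```python
CHUNK_SIZE = 500  # tune so one chunk comfortably fits in a GUI frame budget

def process_in_chunks(items, start=0):
    end = min(start + CHUNK_SIZE, len(items))
    for i in range(start, end):
        expensive_step(items[i])  # hypothetical per-item CPU work
    if end < len(items):
        # Instead of looping to completion, re-queue the continuation so the
        # dispatcher can run something more urgent (or the GUI) in between.
        enqueue_cpu("load_cpu", process_in_chunks, items, end)
```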
We might also need some mechanism to make exceptions to the strict priority sequencing for the I/O threads when the highest-priority items are I/O-only, without a CPU-intensive workload, because dispatching those would leave the CPU-intensive thread idle. So if e.g. we have one or more directory-read micro-tasks AND one or more file-load tasks, then before dispatching the first of the directory-read tasks we might want to check whether the CPU-intensive thread is idle and, if so, dispatch a file-load item instead.
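A hedged sketch of that dispatch rule (`cpu_is_idle` and the two queues are assumptions for illustration, not existing Picard API):

```python
def pick_next_io_task(dir_scan_queue, file_load_queue, cpu_is_idle):
    # Normally directory scans outrank file loads, but if the CPU pool has
    # nothing to chew on, prefer a file load, since it will feed the CPU pool.
    if not dir_scan_queue.empty():
        if cpu_is_idle() and not file_load_queue.empty():
            return file_load_queue.get()
        return dir_scan_queue.get()
    if not file_load_queue.empty():
        return file_load_queue.get()
    return None
```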
Finally, as an entirely separate brainstorm idea, we might want to think about whether any of the CPU-intensive tasks (like metadata summarisation or old/new metadata comparison) could be undertaken using NumPy, because NumPy releases the GIL for many operations and so can actually utilise multi-core processors (and, being compiled code rather than interpreted bytecode, runs faster anyway). My initial gut reaction, without having done the detailed analysis, is that some of them probably can, in which case this might be worth a detailed look. P.S. To avoid the overhead of converting e.g. `metadata` objects into NumPy objects and then converting back again, we might also need to consider holding such data in NumPy-ready form (in addition to, or instead of, the current form).
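As a toy illustration of the comparison case (the fixed-width `U64` dtype is an assumption, and real Picard metadata is far richer than flat string arrays):

```python
import numpy as np

old = np.array(["Artist A", "Title 1", "2001"], dtype="U64")
new = np.array(["Artist A", "Title One", "2001"], dtype="U64")

# The element-wise comparison runs in NumPy's compiled code rather than in
# the Python interpreter, one reason it can be much faster for bulk data.
changed = old != new
print(changed)                   # [False  True False]
print(np.flatnonzero(changed))   # indices of the changed tags: [1]
```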
And apologies again for the braindump.
@outsidecontext Philipp
Actually, it may be that we should stop passing (synchronous) callbacks altogether, and instead switch to (asynchronous) queues (and possibly signals).
The main issue with this is that it would introduce a significant backwards incompatibility unless we continue to allow plugins to use callbacks in the same way they do now.
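To sketch the shape of the queue-based alternative (illustrative only; in Picard the consumer side would more likely be a Qt signal or a timer on the main thread, and all names here are made up):

```python
import queue
import threading

results = queue.Queue()

def worker(task_id):
    outcome = task_id * 2  # stand-in for the real work
    # Publish a message instead of invoking a caller-supplied callback.
    results.put(("file_loaded", task_id, outcome))

def handle_result(kind, task_id, outcome):
    print(kind, task_id, outcome)  # hypothetical consumer-side dispatch

def drain_results():
    # Run periodically on the consumer side (e.g. the main/GUI thread).
    while True:
        try:
            kind, task_id, outcome = results.get_nowait()
        except queue.Empty:
            break
        handle_result(kind, task_id, outcome)

t = threading.Thread(target=worker, args=(1,))
t.start()
t.join()
drain_results()
```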