Bug
Resolution: Fixed
Normal
None
2.0.4
None
Environment: Windows 10 1603, Core i7, 8GB, SSD
When working with over 3,000 files, the UI becomes unresponsive, but the application keeps working on whatever task is at hand. Knowing that performance is poor but the work still completes, I am currently working with 42k files, and clustering is processing about 500-750 files per hour. CPU usage sits at only 17-20%, and the application's memory stays under 1.5GB.
Performance degrades considerably when too many files are loaded at once. Had I worked on the 3k files alone, they would have completed in a few hours; with 42k files loaded, the same 3,000 files have now been processing for 7 hours.
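A rough back-of-envelope check (my own sketch, not anything measured from Picard's code) is consistent with clustering cost growing with the number of *loaded* files rather than the number being processed: if every file is compared against every other, the work is n(n-1)/2 comparisons, so a 42k collection implies roughly 200x the work of a standalone 3k batch.

```python
# Hypothetical scaling check: assume clustering compares every pair
# of loaded files, i.e. n*(n-1)/2 comparisons for n files.
def pairwise_comparisons(n: int) -> int:
    return n * (n - 1) // 2

small = pairwise_comparisons(3_000)   # 3k files loaded alone
large = pairwise_comparisons(42_000)  # full 42k collection loaded

# The full collection needs roughly 196x the comparisons of the
# 3k subset, which would explain "a few hours" stretching to many.
print(large / small)
```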
There are a couple of things I think could be adjusted:
- Provide an option to enable more threads.
- The application seems to cap itself at 20% CPU utilization: looking back at perfmon logs from 15 hours of churning, it never goes above 20%.
- Provide an interrupt/cancel button that stops the pending workstream. For example, if someone selects all clusters and chooses Scan, there should be a way to stop the scan and select a smaller number of files.
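As a hedged illustration of the first two points (a sketch, not Picard's actual code; `process_file` and the file list are hypothetical placeholders), a worker pool sized to the machine's logical core count would let CPU usage rise past the share of a single worker, which is roughly one core's worth (about 12-13% on an 8-thread Core i7):

```python
import os
from concurrent.futures import ProcessPoolExecutor

# Hypothetical stand-in for per-file work (fingerprinting, metadata
# comparison, etc.); not Picard's real implementation.
def process_file(path: str) -> str:
    return path.upper()  # placeholder CPU-bound work

def process_all(paths, workers=None):
    # Default the pool size to the logical core count so total CPU
    # usage is not capped at a single core's share.
    workers = workers or os.cpu_count()
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process_file, paths))

if __name__ == "__main__":
    results = process_all([f"file_{i}.flac" for i in range(100)])
    print(len(results))
```

The cancel-button request could likewise map onto something like `Executor.shutdown(cancel_futures=True)` (Python 3.9+), which drops queued work while letting in-flight tasks finish.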
has related issue:
- PICARD-1843 Improve load and clustering performance (Closed)
- PICARD-1262 Picard 2.x and big collections: Slow as hell, nearly unusable, no reaction for several minutes (Closed)
is related to:
- PICARD-1618 macOS and Windows packages built without C astrcmp (Closed)
is resolved by:
- PICARD-1339 Removing unclustered files can be very slow (Closed)