[PICARD-2339] Improve clustering performance - MetaBrainz Tickets

Type: Improvement
Resolution: Fixed
Priority: Normal
Fix Version/s: 2.7.0b3
Affects Version/s: None
Component/s: None
Labels: None

The existing clustering code uses the Levenshtein distance to calculate similarity, which results in O(n^2) performance. But since only exact matches are used (similarity threshold 1.0), this is not strictly necessary.

If we don't use a lower threshold (and Picard never did), we can simplify the code.
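To illustrate the idea, here is a minimal sketch of clustering by exact match, assuming files are plain dicts with `artist`, `album` and `path` keys. The names `normalize` and `cluster_by_exact_match` are hypothetical and the normalization rule is an assumption; this is not Picard's actual code. Because only exact matches count, each file can be placed in a cluster with one dictionary lookup instead of being compared against every other file.

```python
from collections import defaultdict


def normalize(text):
    # Assumed normalization: compare case-insensitively and collapse
    # runs of whitespace. The real rules may differ.
    return " ".join(text.casefold().split())


def cluster_by_exact_match(files):
    """Group file paths by their normalized (artist, album) pair.

    Each file is handled once, so the work grows linearly with the
    number of files rather than quadratically.
    """
    clusters = defaultdict(list)
    for f in files:
        key = (normalize(f["artist"]), normalize(f["album"]))
        clusters[key].append(f["path"])
    return dict(clusters)


files = [
    {"artist": "Daft Punk", "album": "Discovery", "path": "a.mp3"},
    {"artist": "daft  punk", "album": "discovery", "path": "b.mp3"},
    {"artist": "Air", "album": "Moon Safari", "path": "c.mp3"},
]
result = cluster_by_exact_match(files)
```

In this sketch the first two files end up in the same cluster because their normalized keys are identical, while the third forms its own cluster. A pairwise distance comparison would only be needed again if a threshold below 1.0 were introduced.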

has related issue:
PICARD-2353 Post cluster focus regression (Closed)

is related to:
PICARD-2361 Removing files while clustering (Closed)
PICARD-2340 Use configured name for Various Artists for clusters with unknown artist name (Closed)

Assignee: Philipp Wolfer
Reporter: Philipp Wolfer
Votes: 0
Watchers: 2
Created: 2021-11-23 07:12
Updated: 2021-12-14 07:06

Version	Package
2.7.0b3
