INTERACT FORUM


Author Topic: Feature Request - Duplicate Management  (Read 1225 times)

dfortney

  • Galactic Citizen
  • ****
  • Posts: 310
Feature Request - Duplicate Management
« on: February 02, 2019, 04:44:05 pm »

Maybe this is already a feature in MC24 that I haven't found, but in case it isn't, I would like to request track duplicate management.  It would show a view with all tracks grouped as duplicates based on audio fingerprint similarity (ideally with a selectable similarity threshold), sorted within each group from the best copy (highest quality / bitrate / resolution / format / least compression / ...) first to the worst MP3 with hiss and pops last.  A right-click option on selected duplicates would let the user keep only the best of the duplicates and delete the rest, or remove them from the library.  It would also be cool to have a right-click option to update the track metadata with that from some standardized best-of online tag database.

Alternatively, instead of deleting the tracks, it could just move them into a separate library and folder, so that you could later move back any files it got wrong.
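A minimal sketch of the "best copy first" ordering described above, in Python.  This is not an MC feature or API: the `Track` fields and the ranking order (lossless before lossy, then sample rate, then bitrate) are assumptions made for illustration, and a real ranking would need to weigh more than these three fields.

```python
from dataclasses import dataclass

@dataclass
class Track:
    # Hypothetical fields; not MC's actual library schema.
    path: str
    lossless: bool
    sample_rate_hz: int
    bitrate_kbps: int

def quality_key(track):
    """Tuple that compares higher for a better copy (assumed ordering)."""
    return (track.lossless, track.sample_rate_hz, track.bitrate_kbps)

def best_first(group):
    """Return the tracks of one duplicate group, best copy first."""
    return sorted(group, key=quality_key, reverse=True)

def keep_and_rest(group):
    """Split a duplicate group into the copy to keep and the copies to set aside."""
    ranked = best_first(group)
    return ranked[0], ranked[1:]

group = [
    Track("song_128.mp3", False, 44100, 128),
    Track("song.flac", True, 96000, 2800),
    Track("song_320.mp3", False, 44100, 320),
]
keep, rest = keep_and_rest(group)
print(keep.path, [t.path for t in rest])
```

The `rest` list is what would be moved to a separate library or folder rather than deleted, so a wrong call can be undone later.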

Peter_T

  • Galactic Citizen
  • ****
  • Posts: 352
Re: Feature Request - Duplicate Management
« Reply #1 on: August 10, 2019, 03:45:12 pm »

I would love this. 

wer

  • Citizen of the Universe
  • *****
  • Posts: 2640
Re: Feature Request - Duplicate Management
« Reply #2 on: August 10, 2019, 04:18:36 pm »

There is already a built-in smartlist for "Possible duplicates", although it is much less sophisticated than what you envision, as it goes by metadata.  Have you tried it?

A debatable example: Stevie Ray Vaughan's "Pride and Joy" from the Texas Flood album vs the live version on Live at Ripley's.  The smartlist thinks they might be duplicates, even though one is over a minute longer than the other.  Do you think they are duplicates?  Would you expect audio fingerprinting to find them?

Do you consider the same piece by Mozart, played by two different orchestras, to be possible duplicates?  Audio fingerprinting might just identify them as possible duplicates.  But the metadata should mostly match, as they are the same piece, so they can be identified that way and you don't need audio fingerprinting.

Conversely, "Wild Horses" by the Rolling Stones vs the same song by The Sundays would not have remotely the same audio fingerprint.  Clearly not duplicates, but the metadata should also show that, and again the audio fingerprinting is not needed.

There's a judgement call here.  Not sure what you're really after...

If two tracks sound so similar through audio fingerprinting that they could be considered duplicates despite conflicting metadata, that means one of them is tagged wrong.

It seems like in most cases, audio fingerprinting is needed when metadata is either totally missing (change your ripping practices) or intentionally obfuscated (youtubers trying to avoid copyright violation).  Unless it's to distinguish that two tracks are definitely NOT duplicates despite identical metadata (like a live vs studio version by the same artist).

I'm not against your request, but I'm trying to understand the use case.  Which problem are you trying to solve, that the smartlist does not address?

blgentry

  • Regular Member
  • Citizen of the Universe
  • *****
  • Posts: 8014
Re: Feature Request - Duplicate Management
« Reply #3 on: August 11, 2019, 09:19:10 am »

The word "duplicate" would make you think this was an easy task.  But, as dfortney has described, what people really want is a way of sorting out their collection to find different bit rates, formats, and versions of songs. 

Getting your library cleaned up is a really good first step in all of this.  It can be very helpful to build some Panes views that let you see different things.  For example, showing songs by bit rate.  Panes has a feature that will sort things into "buckets".  So, as an example, you could have bit rate buckets that were 0-128 kbps, 128-256 kbps, etc.  This way you can find the low resolution stuff in your collection and decide if it's worth keeping at all.  My answer, for 128k and below is a resounding "no" in almost all cases.
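Outside of MC, the bucketing idea can be sketched in a few lines of Python.  This is only an illustration of the concept, not how Panes implements it: the function name and the bucket edges are assumptions, and 128 kbps is placed in the lowest bucket to match the "128k and below" cut-off.

```python
from bisect import bisect_left

# Hypothetical bucket edges in kbps; pick whatever suits the collection.
EDGES = [128, 256, 320]

def bitrate_bucket(kbps, edges=EDGES):
    """Return a label such as '0-128 kbps' for the bucket that contains kbps.

    Upper edges are inclusive, so exactly 128 kbps lands in '0-128 kbps'.
    """
    i = bisect_left(edges, kbps)
    if i == len(edges):
        return f"{edges[-1]}+ kbps"
    low = edges[i - 1] if i > 0 else 0
    return f"{low}-{edges[i]} kbps"

for rate in (96, 128, 192, 320, 900):
    print(rate, "->", bitrate_bucket(rate))
```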

As you explore your collection using filters, you are probably going to find directory trees that you remember creating, but had forgotten about.  This can help you again decide to keep or delete whole chunks of your library.

At some point you can go through some duplicate finding.  I've built several Panes views that are designed specifically for this task.  See attached screenshot for an example.  You could easily add more Panes to this for bit rate and others.  I have had several versions of this, some of which had that feature.

The tasks I'm outlining here amount to a library audit, which is what you are asking for.  It's what most people mean when they say "duplicate finder" or "duplicate management".  I've performed this auditing process on at least one large collection.  It was extremely time consuming.  I think I spent 20 to 30 hours on that one.

When a collection is large and not consolidated, there is a LOT to sort out.

Starting with some good Panes views is very helpful.  They can do a lot more than most people think.

Brian.

JimH

  • Administrator
  • Citizen of the Universe
  • *****
  • Posts: 72438
  • Where did I put my teeth?
Re: Feature Request - Duplicate Management
« Reply #4 on: August 11, 2019, 09:40:39 am »

Quote from: blgentry on August 11, 2019, 09:19:10 am
    The word "duplicate" would make you think this was an easy task.  But, as dfortney has described, what people really want is a way of sorting out their collection to find different bit rates, formats, and versions of songs.

Stacks and auto-stack might be useful.

dfortney

  • Galactic Citizen
  • ****
  • Posts: 310
Re: Feature Request - Duplicate Management
« Reply #5 on: September 25, 2019, 06:19:45 pm »

Yeah, I am looking for something that does a lossy match based on the audio fingerprint.  Perfect matches should of course show as duplicates, but so should tracks that sound very nearly the same, regardless of bitrate, file format, etc.  Culling a library without this sort of help takes forever and is prone to error.  It is pretty much impossible to do by hand, and a feature like this might really help.  Even a simple rule would be incredible, like ...

 - group duplicates of X% certainty or better -> move duplicates under Ykbps to Library 'Junk' -> ...
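A rough sketch of such a rule in Python, to make the X and Y parameters concrete.  Everything here is an assumption for illustration: fingerprints are taken to be equal-rate lists of 32-bit integers, similarity is the fraction of matching bits, and grouping is a simple greedy pass.  Real fingerprint matching also has to deal with time offsets and differing lengths, which this ignores.

```python
def similarity(fp_a, fp_b):
    """Fraction of matching bits between two lists of 32-bit ints (0.0 to 1.0)."""
    n = min(len(fp_a), len(fp_b))
    if n == 0:
        return 0.0
    differing = sum(bin(a ^ b).count("1") for a, b in zip(fp_a, fp_b))
    return 1.0 - differing / (32 * n)

def group_duplicates(tracks, threshold):
    """Greedy grouping: a track joins the first group whose first member it matches."""
    groups = []
    for track in tracks:
        for group in groups:
            if similarity(track["fp"], group[0]["fp"]) >= threshold:
                group.append(track)
                break
        else:
            groups.append([track])
    return groups

def junk_candidates(tracks, threshold, min_kbps):
    """Paths of duplicates below min_kbps, never including a group's best copy."""
    junk = []
    for group in group_duplicates(tracks, threshold):
        if len(group) < 2:
            continue  # not a duplicate of anything, so leave it alone
        best = max(group, key=lambda t: t["kbps"])
        junk.extend(t["path"] for t in group
                    if t is not best and t["kbps"] < min_kbps)
    return junk

# Hypothetical library entries with made-up fingerprints.
tracks = [
    {"path": "a.flac", "fp": [0xFFFF0000, 0x12345678], "kbps": 900},
    {"path": "a.mp3",  "fp": [0xFFFF0001, 0x12345678], "kbps": 128},
    {"path": "b.mp3",  "fp": [0x0000FFFF, 0x87654321], "kbps": 128},
]
print(junk_candidates(tracks, threshold=0.95, min_kbps=192))
```

Here `a.mp3` is flagged because it matches `a.flac` and falls under the bitrate floor, while `b.mp3` is left alone despite its low bitrate because nothing else in the library matches it.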

blgentry

  • Regular Member
  • Citizen of the Universe
  • *****
  • Posts: 8014
Re: Feature Request - Duplicate Management
« Reply #6 on: September 25, 2019, 07:07:59 pm »

Never mind.