INTERACT FORUM

More => Old Versions => JRiver Media Center 18 for Windows => Topic started by: buckeyewalt on June 18, 2013, 12:59:27 pm

Title: Duplicate Files
Post by: buckeyewalt on June 18, 2013, 12:59:27 pm
I know that this topic has been kicked around many times; however, I think now would be a good time for something simple to be done about the duplicate files that MC (or we) create. I have a copy of MusicBee on my other computer, and if we could somehow copy this feature, it would be a godsend. It has a tool that removes all of the duplicate files with a single click: it "hides" all of the duplicates that were created. Just one click! It couldn't be any simpler than that, and if you ever want the files back, they are just a click away! By the way, I am not advocating switching; on the contrary, I have the current MC18 and I love it, BUT please, do something (easy) about the duplicate files that have been created.

Come on Matt, see if you can get this done for us!
Title: Re: Duplicate Files
Post by: MrC on June 19, 2013, 08:49:46 pm
This is a tough one, as one must define "Duplicate" from one or more of:

    - Identical content
    - Identical file name
    - Identical meta data
    - Approximately identical content, file name, or meta data
    - Acoustically similar
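
To make the first definition concrete, here is a minimal sketch of "identical content" detection done outside MC by hashing every file. It is only an illustration: the function names are made up for this example, and it has nothing to do with how MC itself finds duplicates.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def content_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def group_identical_content(root):
    """Map digest -> list of paths, keeping only digests seen more than once."""
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            groups[content_digest(path)].append(path)
    return {d: paths for d, paths in groups.items() if len(paths) > 1}
```

Note that this only catches byte-for-byte copies. Two rips of the same track with different tags would hash differently, which is why the other definitions in the list exist.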
Title: Re: Duplicate Files
Post by: buckeyewalt on June 21, 2013, 02:31:02 pm
I guess that I would define it as identical file name. Again, if I am importing a ZZ Top tune, let's say TV Dinners, and it somehow duplicates itself, whether through a computer crash, program crash, or what have you, and another copy of TV Dinners shows up in the same album, there should be an easy way of eliminating the duplicates. Maybe I'm asking too much, but if other music players can do it, why can't MC? Note that I say "easy way"!
Title: Re: Duplicate Files
Post by: MrC on June 21, 2013, 03:03:58 pm
So let's take this further.  A quick inspection shows I have 441 files that a file name-only definition would detect as duplicates.  These are all from different artists or album versions/types (e.g. live, acoustic, studio):

   3    01 Linus and Lucy.flac
   3    01 Opening Title.flac
   3    01 Take Five.flac
   2    02 Blue Moon.flac
   3    02 Dreams.flac
   3    02 Sweet Georgia Brown.flac
   2    21 The Truth.flac
   3    01 Prologue.flac

etc. So, it should be clear that file name alone is not a safe strategy to apply generally.  You can pretty easily test this on your collection.  Create a smartlist that shows only Duplicates of Filename (name), and then add a Filename (name) column and examine the results.

This is just a quick example of the difficulty.  For every solution, there are some problems to be encountered.  Short of acoustic analysis (which is very slow), the best you can do is get close.   Ultimately you, the user, have to make the choices.  Ask if you need more help.
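
For anyone who wants to run the same file name-only check on a folder tree outside MC, a rough sketch follows. It assumes the music lives under one root folder and uses made-up helper names; the report layout just imitates the count-and-name listing above.

```python
from collections import Counter
from pathlib import Path


def filename_duplicates(root, pattern="*.flac"):
    """Count files by bare name (folder ignored); keep names seen more than once."""
    counts = Counter(
        path.name for path in Path(root).rglob(pattern) if path.is_file()
    )
    return {name: n for name, n in counts.items() if n > 1}


def format_report(duplicates):
    """Render the counts as 'count    file name' lines, sorted by name."""
    return "\n".join(
        f"{n:4d}    {name}" for name, n in sorted(duplicates.items())
    )
```

Calling `print(format_report(filename_duplicates(r"D:\Music")))` (the path is just a placeholder) would list every name that occurs in more than one folder, which is exactly the set a name-only rule would treat as duplicates, right or wrong.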
Title: Re: Duplicate Files
Post by: buckeyewalt on June 22, 2013, 04:27:38 pm
I think you have something of an exception there. Let's say you have 20 songs in a playlist and it somehow gets duplicated (through a sync or whatever); now you have 40 songs, but half are duplicates. All I am trying to say is that there has to be an easier way to eliminate (hide?) duplicates than what we have to go through now, without making another playlist or what have you. Doing what you suggested just takes too long.

 3    01 Linus and Lucy.flac
   3    01 Opening Title.flac
   3    01 Take Five.flac
   2    02 Blue Moon.flac
   3    02 Dreams.flac
   3    02 Sweet Georgia Brown.flac
   2    21 The Truth.flac
   3    01 Prologue.flac
 3    01 Linus and Lucy.flac
   3    01 Opening Title.flac
   3    01 Take Five.flac
   2    02 Blue Moon.flac
   3    02 Dreams.flac
   3    02 Sweet Georgia Brown.flac
   2    21 The Truth.flac
   3    01 Prologue.flac
 3    01 Linus and Lucy.flac
   3    01 Opening Title.flac
   3    01 Take Five.flac
   2    02 Blue Moon.flac
   3    02 Dreams.flac
   3    02 Sweet Georgia Brown.flac
   2    21 The Truth.flac
   3    01 Prologue.flac

If you have duplicates of the same name (within the same album/folder), you could easily develop something that would eliminate the issue.
Title: Re: Duplicate Files
Post by: MrC on June 22, 2013, 05:00:38 pm
These discussions are always difficult, as each of us sees our own case as the typical, primary, and often only one.  But each of our cases is just one of many cases, and solutions have to work for the many cases, not just yours or mine.

So, stepping away from the philosophical, duplicate detection in MC is metadata based.  You tell it what fields you want concatenated into sufficient uniqueness or distinction for your purposes, and MC can present from that data:

   a) all duplicates
   b) one from each duplicate

Using these two constructs, you can also present:

   c) all but one (random) duplicate

It takes less time to construct these smartlists than any of our posts thus far.  Once created, they are useful tools forever.  So if that's too much, I suppose you'll have to wait until something else is implemented by JRiver.
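
As a rough illustration of the idea (not MC's smartlist syntax or its internal logic), the metadata approach can be sketched like this. The dictionary keys such as "artist" and "name" are hypothetical stand-ins for library fields, and this sketch keeps the first file it sees in each group rather than a random one.

```python
from collections import defaultdict


def duplicate_views(tracks, fields):
    """Group tracks by the chosen fields and return three views:
    (a) all duplicates, (b) one from each duplicate group, (c) all but one."""
    groups = defaultdict(list)
    for track in tracks:
        # Concatenate the chosen fields into one comparison key.
        key = tuple(str(track.get(f, "")).strip().lower() for f in fields)
        groups[key].append(track)
    # Only keys shared by more than one track count as duplicates here.
    dup_groups = [g for g in groups.values() if len(g) > 1]
    all_dups = [t for g in dup_groups for t in g]
    one_each = [g[0] for g in dup_groups]
    all_but_one = [t for g in dup_groups for t in g[1:]]
    return all_dups, one_each, all_but_one
```

The choice of `fields` is the whole game: `["name"]` alone flags every "Take Five" in the library, while `["artist", "album", "name"]` only flags repeats within the same album.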
Title: Re: Duplicate Files
Post by: MrC on June 22, 2013, 05:08:16 pm
See this post for how to create the three smartlists required:

   http://yabb.jriver.com/interact/index.php?topic=81212.msg554881#msg554881

Ask if you need more help.
Title: Re: Duplicate Files
Post by: AssadMawad on June 23, 2013, 04:12:13 am
Check this post: http://yabb.jriver.com/interact/index.php?topic=62180.0

Title: Re: Duplicate Files
Post by: cobar53 on August 03, 2013, 08:57:42 pm
I looked at all the above and concluded the easiest way out was to clear my entire Library and re-import.

This is a tiresome way of managing what should be a simple task.

I am VERY disappointed with this software  :'(
Title: Re: Duplicate Files
Post by: spiggytopes on August 03, 2013, 11:05:24 pm
OK, firstly, please don't just jump away from JRiver - I sympathise with you, but also believe that the software is very good and the support first class, as I have proved to myself several times.


Truly, it is not so easy to find the duplicates, I think.


In my case, I have several copies of many tracks in FLAC but of different "quality", e.g. Elvis remasters over the years; I have tried sorting by file size and sampling rate, but that does not reliably pick out the ones I want to discard, as I am looking for the date of the remaster.

If your duplicates are exactly the same file, then can you sort them either in MC or in Windows Explorer to delete the extras?

If I have missed the point, apologies.
Title: Re: Duplicate Files--Alternatives
Post by: nickeaston on September 29, 2013, 12:57:45 pm
My primary de-duping app is DoubleKiller, using very specific settings developed by trial over several years.  I figured out years ago that de-duping by metadata (tag values) was not at all satisfactory for me.  I can manipulate my tags any way I want in MC--what a great program.  DoubleKiller is a standalone app that will evaluate several drives and/or folders simultaneously, if desired, for duplicates based on sampling a portion of the audio content fingerprint (CRC) of the files.
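
To give a feel for the general technique (this is not DoubleKiller's actual algorithm or my settings, just a sketch of the idea), here is a CRC-32 taken over a sampled portion of each file. The `offset` and `length` values are placeholders; the thought is that skipping the start of the file can step past a tag block so that only audio bytes are compared.

```python
import zlib
from collections import defaultdict
from pathlib import Path


def sample_crc(path, offset=0, length=1 << 16):
    """CRC-32 of up to `length` bytes starting at byte `offset`."""
    with open(path, "rb") as handle:
        handle.seek(offset)
        return zlib.crc32(handle.read(length))


def group_by_sample_crc(roots, pattern="*.mp3", offset=0, length=1 << 16):
    """Scan several folders and group files whose sampled CRC matches."""
    groups = defaultdict(list)
    for root in roots:
        for path in sorted(Path(root).rglob(pattern)):
            if path.is_file():
                groups[sample_crc(path, offset, length)].append(path)
    return {crc: paths for crc, paths in groups.items() if len(paths) > 1}
```

A matching CRC over a sample is a strong hint, not proof, so the candidates still deserve a look before anything gets deleted.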
 
If anyone is interested in this method check out the DoubleKiller website and I can furnish my settings for best mp3 results...which produce a higher yield than the settings suggested in the help file.

MrC: Is this an accurate way of describing an alternative de-duping technique, not dependent on metadata, outside of MC?
Title: Re: Duplicate Files
Post by: MrC on September 29, 2013, 01:00:28 pm
Yeah, seems like it is.  It is the item I listed as "Acoustically similar" above.