INTERACT FORUM

Author Topic: Duplicate Files  (Read 4125 times)

buckeyewalt

  • World Citizen
  • ***
  • Posts: 110
Duplicate Files
« on: June 18, 2013, 12:59:27 pm »

I know this topic has been kicked around many times, but I think now would be a good time to do something simple about the duplicate files that MC (or we) create. I have a copy of MusicBee on my other computer, and if we could somehow copy this feature it would be a godsend. It has a tool that removes all of the duplicate files with a single click: it "hides" every duplicate that was created. It couldn't be any simpler than that, and if you ever want the files back, they are just a click away! By the way, I am not advocating switching. On the contrary, I have the current MC18 and I love it, BUT please, do something (easy) about the duplicate files that have been created.

Come on Matt, see if you can get this done for us!
Logged

MrC

  • Citizen of the Universe
  • *****
  • Posts: 10462
  • Your life is short. Give me your money.
Re: Duplicate Files
« Reply #1 on: June 19, 2013, 08:49:46 pm »

This is a tough one, as one must define "Duplicate" from one or more of:

    - Identical content
    - Identical file name
    - Identical meta data
    - Approximately identical content, file name, or meta data
    - Acoustically similar
Logged
The opinions I express represent my own folly.

buckeyewalt

  • World Citizen
  • ***
  • Posts: 110
Re: Duplicate Files
« Reply #2 on: June 21, 2013, 02:31:02 pm »

I guess I would define it as an identical file name. Say I am importing a ZZ Top tune, let's say TV Dinners, and it somehow gets duplicated, whether through a computer crash, a program crash, or what have you, so that another copy of TV Dinners shows up in the same album. There should be an easy way of eliminating the duplicates. Maybe I'm asking too much, but if other music players can do it, why can't MC? Note that I say "easy way"!
Logged

MrC

  • Citizen of the Universe
  • *****
  • Posts: 10462
  • Your life is short. Give me your money.
Re: Duplicate Files
« Reply #3 on: June 21, 2013, 03:03:58 pm »

So let's take this further.  A quick inspection shows I have 441 files that a file-name-only definition would detect as duplicates.  These are all from different artists or album versions/types (e.g. live, acoustic, studio):

   3    01 Linus and Lucy.flac
   3    01 Opening Title.flac
   3    01 Take Five.flac
   2    02 Blue Moon.flac
   3    02 Dreams.flac
   3    02 Sweet Georgia Brown.flac
   2    21 The Truth.flac
   3    01 Prologue.flac

etc. So, it should be clear that file name alone is not a safe strategy to apply generally.  You can pretty easily test this on your collection.  Create a smartlist that shows only Duplicates of Filename (name), and then add a Filename (name) column and examine the results.

This is just a quick example of the difficulty.  Every solution runs into some problems.  Short of acoustic analysis (which is very slow), the best you can do is get close.  Ultimately you, the user, have to make the choices.  Ask if you need more help.
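
The same file-name test can also be tried outside MC. What follows is only a minimal sketch of the idea, not anything MC itself does: the function name `duplicate_names` and the sample paths are hypothetical, made up for illustration.

```python
from collections import Counter
from pathlib import PurePath


def duplicate_names(paths):
    """Map each file name that occurs more than once to how often it occurs."""
    counts = Counter(PurePath(p).name for p in paths)
    return {name: n for name, n in counts.items() if n > 1}


# Hypothetical paths: one file name shared by three unrelated albums.
paths = [
    "Artist A/Studio Album/01 Take Five.flac",
    "Artist B/Live Album/01 Take Five.flac",
    "Artist C/Compilation/01 Take Five.flac",
    "Artist C/Compilation/02 Blue Moon.flac",
]

print(duplicate_names(paths))
```

To run it against a real collection, the list could be built with something like `Path(music_root).rglob("*.flac")`. Every name it reports still has to be checked by hand, since a shared file name says nothing about whether the recordings are the same.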
Logged
The opinions I express represent my own folly.

buckeyewalt

  • World Citizen
  • ***
  • Posts: 110
Re: Duplicate Files
« Reply #4 on: June 22, 2013, 04:27:38 pm »

I think you have something of an exception there. Let's say you have 20 songs in a playlist and somehow it gets duplicated (through a sync or whatever); now you have 40 songs, but half are duplicates. All I am trying to say is that there has to be an easier way to eliminate (or hide?) duplicates than what we have to go through now, without making another playlist or what have you. Doing what you suggested just takes too long.

   3    01 Linus and Lucy.flac
   3    01 Opening Title.flac
   3    01 Take Five.flac
   2    02 Blue Moon.flac
   3    02 Dreams.flac
   3    02 Sweet Georgia Brown.flac
   2    21 The Truth.flac
   3    01 Prologue.flac
   3    01 Linus and Lucy.flac
   3    01 Opening Title.flac
   3    01 Take Five.flac
   2    02 Blue Moon.flac
   3    02 Dreams.flac
   3    02 Sweet Georgia Brown.flac
   2    21 The Truth.flac
   3    01 Prologue.flac
   3    01 Linus and Lucy.flac
   3    01 Opening Title.flac
   3    01 Take Five.flac
   2    02 Blue Moon.flac
   3    02 Dreams.flac
   3    02 Sweet Georgia Brown.flac
   2    21 The Truth.flac
   3    01 Prologue.flac

If you have duplicates of the same name (within the same album/folder), you could easily develop something that would eliminate the issue.
Logged

MrC

  • Citizen of the Universe
  • *****
  • Posts: 10462
  • Your life is short. Give me your money.
Re: Duplicate Files
« Reply #5 on: June 22, 2013, 05:00:38 pm »

These discussions are always difficult, as each of us sees our own case as the typical, primary, and often only one.  But each of our cases is just one of many, and solutions have to work for the many cases, not just yours or mine.

So, stepping away from the philosophical, duplicate detection in MC is metadata based.  You tell it which fields to concatenate into a key that is unique or distinctive enough for your purposes, and from that data MC can present:

   a) all duplicates
   b) one from each duplicate

Using these two constructs, you can also present:

   c) all but one (random) duplicate

It takes less time to construct these smartlists than any of our posts thus far.  Once created, they are useful tools forever.  So if that's too much, I suppose you'll have to wait until something else is implemented by JRiver.
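
For anyone who finds the three constructs easier to follow in code, here is a rough sketch of the logic. It is not how MC implements its smartlists: the track dictionaries, the field names (`artist`, `album`, `name`, `path`) and the function names are all assumptions made for the example.

```python
from collections import defaultdict


def duplicate_groups(tracks, fields):
    """Group tracks whose chosen fields are all identical; keep groups of 2+."""
    groups = defaultdict(list)
    for track in tracks:
        key = tuple(track.get(field) for field in fields)
        groups[key].append(track)
    return [group for group in groups.values() if len(group) > 1]


def all_duplicates(groups):
    """a) every track that belongs to a duplicate group."""
    return [track for group in groups for track in group]


def one_from_each(groups):
    """b) one representative per duplicate group (here: the first seen)."""
    return [group[0] for group in groups]


def all_but_one(groups):
    """c) everything in a) that is not in b)."""
    return [track for group in groups for track in group[1:]]


# Hypothetical library entries; the paths are invented.
tracks = [
    {"artist": "ZZ Top", "album": "Eliminator", "name": "TV Dinners", "path": "a.flac"},
    {"artist": "ZZ Top", "album": "Eliminator", "name": "TV Dinners", "path": "b.flac"},
    {"artist": "ZZ Top", "album": "Eliminator", "name": "Legs", "path": "c.flac"},
]

groups = duplicate_groups(tracks, ["artist", "album", "name"])
print([t["path"] for t in all_but_one(groups)])
```

The choice of fields is the whole game: adding `album` to the key is what keeps three different recordings of the same title from being treated as duplicates of one another.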
Logged
The opinions I express represent my own folly.

MrC

  • Citizen of the Universe
  • *****
  • Posts: 10462
  • Your life is short. Give me your money.
Re: Duplicate Files
« Reply #6 on: June 22, 2013, 05:08:16 pm »

See this post for how to create the three smartlists required:

   http://yabb.jriver.com/interact/index.php?topic=81212.msg554881#msg554881

Ask if you need more help.
Logged
The opinions I express represent my own folly.

cobar53

  • Recent member
  • *
  • Posts: 6
Re: Duplicate Files
« Reply #8 on: August 03, 2013, 08:57:42 pm »

I looked at all the above and concluded the easiest way out was to clear my entire Library and re-import.

This is a tiresome way of managing what should be a simple task.

I am VERY disappointed with this software  :'(
Logged

spiggytopes

  • World Citizen
  • ***
  • Posts: 211
Re: Duplicate Files
« Reply #9 on: August 03, 2013, 11:05:24 pm »

OK, firstly, please don't just jump away from J River - I sympathise with you, but also believe that the software is very good and the support first class, as I have proved to myself several times.

Truly, it is not so easy to find the duplicates, I think.

In my case, I have several copies of many tracks in FLAC but of different "quality", e.g. Elvis remasters over the years. I have tried sorting by file size and sampling rate, but that does not reliably single out the ones I want to discard, as what I am really looking for is the date of the remaster.

If your duplicates are exactly the same file, can you sort them either in MC or in Windows Explorer and delete the extras?
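
If the copies really are byte-for-byte identical, a content hash will find them without any sorting. This is just a sketch using Python's standard `hashlib`; the function names and the idea of pointing it at a music folder are my own assumptions, and it only lists candidates, it deletes nothing.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file's full contents, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def identical_files(root):
    """Return lists of paths under root whose contents are byte-identical."""
    by_digest = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            by_digest[file_digest(path)].append(path)
    return [paths for paths in by_digest.values() if len(paths) > 1]
```

Something like `identical_files("D:/Music")` (a made-up path) would return the groups to review. Note that tags are stored inside the file, so two copies of the same audio that were tagged differently will not match with this approach.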

If I have missed the point, apologies.
Logged

nickeaston

  • Regular Member
  • World Citizen
  • ***
  • Posts: 127
  • nothing more to say...
Re: Duplicate Files--Alternatives
« Reply #10 on: September 29, 2013, 12:57:45 pm »

My primary de-duping app is DoubleKiller using very specific settings developed by trial over several years.  I figured out years ago that de-duping by metadata (tag values) was not at all satisfactory for me.  I can manipulate my tags any way I want in MC--what a great program.  DoubleKiller is a standalone app that will check several drives and/or folders simultaneously, if desired, for duplicates based on sampling a portion of the audio content fingerprint (CRC) of the files.
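
To give a feel for what comparing a sampled CRC means, here is a small sketch. It is not DoubleKiller's actual algorithm or settings, which I have not reproduced here; the function name, the `offset` and `length` parameters and their defaults are assumptions for illustration only.

```python
import os
import zlib


def sampled_crc(path, offset=0, length=64 * 1024):
    """Return (file size, CRC-32 of up to `length` bytes starting at `offset`)."""
    size = os.path.getsize(path)
    with open(path, "rb") as handle:
        handle.seek(min(offset, size))
        sample = handle.read(length)
    return size, zlib.crc32(sample)
```

Two files that produce the same pair are candidate duplicates. Because only part of each file is read, this is much faster than hashing everything, but files that differ only outside the sampled region would still match, so the result is a shortlist rather than proof.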
 
If anyone is interested in this method check out the DoubleKiller website and I can furnish my settings for best mp3 results...which produce a higher yield than the settings suggested in the help file.

MrC: Is this an accurate way of describing an alternative de-duping technique, not dependent on metadata, outside of MC?
Logged

MrC

  • Citizen of the Universe
  • *****
  • Posts: 10462
  • Your life is short. Give me your money.
Re: Duplicate Files
« Reply #11 on: September 29, 2013, 01:00:28 pm »

Yeah, seems like it is.  It is the item I listed as "Acoustically similar" above.
Logged
The opinions I express represent my own folly.