INTERACT FORUM

Topic: Managing Duplicates?  (Read 2098 times)

al1947

  • World Citizen
  • ***
  • Posts: 162
Managing Duplicates?
« on: August 03, 2013, 01:11:35 pm »

I am hopelessly lost after searching the wiki and Interact. I am trying to figure out a way to manage the many duplicates on my system.

http://wiki.jriver.com/index.php/Duplicate_Files has been the most help. But that helps identify duplicates, and I am not clear on how to jigger the syntax to exclude dupes while simultaneously including the non-dupes.

Basically I have two kinds of duplicates:

-- from format upgrade/side-grade: I downloaded an MP3 or AAC, then decided to buy the CD and rip it to a lossless format. Or I ripped a CD and then got an HD Audio version. In those cases, I would want to select the version that has the highest quality encoding. (Some of the dupes are side-grades caused by my iTunes Match subscription: if I bought, say, an Amazon MP3, iTunes Match can create an AAC version of it.) Searching by song name, track number, and album usually works here. But not always, since the purchased MP3 or AAC track might be from a different album than the CD I eventually purchased.

-- from library expansion: Typically I bought a compilation or greatest hits album, then bought the full albums. Or, as with the Beatles and Rolling Stones, bought the original CDs, then the remastered versions. The typical problem here is that if one searches by artist, song name, and playing time (so that different length versions of a song are treated as different songs), there needs to be some tolerance in the time. Inevitably there is a 1-3 second difference in running times.

I assume that I eventually will have to create a playlist and do some manual sorting and editing of it. But I would be grateful for guidance on where to look for instructions on automating as much of this as possible.

MrC

  • Citizen of the Universe
  • *****
  • Posts: 10462
  • Your life is short. Give me your money.
Re: Managing Duplicates?
« Reply #1 on: August 03, 2013, 01:40:50 pm »

Unless you use a tool that uses acoustic signatures, no process will be automatic.

The length and depth of your post show the complexities involved in a metadata-based approach that compares strings.  It is an iterative process, where you create smartlists and filtering views to help you see duplicate candidates, and then you take action.

MC provides you with the ability to show items that have duplicates, and to show files minus the duplicated items (i.e. unique ones).  With these two smartlists, you can construct a third smartlist that shows all but one of the files (so this would be your delete candidates).  See:

   http://yabb.jriver.com/interact/index.php?topic=81444.msg555371#msg555371
   http://yabb.jriver.com/interact/index.php?topic=81212.msg554881#msg554881
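For readers who find the three-list idea easier to follow as set logic, here is a rough Python sketch of it. This is not MC smartlist syntax; the track tuples and the `dup_key` helper are made up for illustration, and the duplicate criteria are assumed to be artist plus song name.

```python
from collections import defaultdict

# Hypothetical library entries: (artist, name, file type).
tracks = [
    ("Beatles", "Help!", "mp3"),
    ("Beatles", "Help!", "flac"),
    ("Beatles", "Yesterday", "flac"),
    ("Rolling Stones", "Angie", "aac"),
    ("Rolling Stones", "Angie", "mp3"),
    ("Rolling Stones", "Angie", "flac"),
]

def dup_key(track):
    """Duplicate criteria (an assumption): artist plus song name, case-insensitive."""
    artist, name, _file_type = track
    return (artist.lower(), name.lower())

# Group files that share the same key.
groups = defaultdict(list)
for t in tracks:
    groups[dup_key(t)].append(t)

# List 1: every file that has at least one duplicate.
list1 = [t for g in groups.values() if len(g) > 1 for t in g]

# List 2: one file per key (here simply the first one seen).
list2 = [g[0] for g in groups.values()]

# List 3: in list 1 but not in list 2, i.e. the delete candidates.
list3 = [t for t in list1 if t not in list2]

print(list3)
```

Note that in this sketch the survivor of each group is just whichever file was seen first, so the FLAC copies end up among the delete candidates. Which copy survives depends entirely on the order of the files before the de-duplication step.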
The opinions I express represent my own folly.

al1947

  • World Citizen
  • ***
  • Posts: 162
Re: Managing Duplicates?
« Reply #2 on: August 05, 2013, 08:47:18 pm »

Quote
You need to create three smartlists for this:

  1) shows all duplicates of a given criteria
  2) shows no duplicates of the same criteria
  3) contains two rules: Playlists is any Playlist 1 above and Playlists is not any of Playlist 2 above

Smartlist 3 is the one you'll use to give you the files you'd want to delete.

Where I get lost is on smart list 2. When I construct it, it selects a single instance of the song using the criteria chosen. But I have no idea why that particular instance is chosen. Is that documented somewhere, and is it configurable?

As an alternative I created smart list 1 as per your specs. But my smart list 2 is all songs not in playlist 1 -- so all the unduped songs. Then I add all of list 2 to a new, regular playlist, manually sort through list 1, and choose the ones I want to add to that third playlist. Tedious. But that may be the only way to do what I want.

Also, is there smart list syntax that allows for a slight variance in a song's playing time when time is one of the criteria for selecting dupes? I am trying to distinguish the typical 1 or 2 second variance in the playing time of the same song on different albums from genuinely longer and shorter versions of a song.

MrC

  • Citizen of the Universe
  • *****
  • Posts: 10462
  • Your life is short. Give me your money.
Re: Managing Duplicates?
« Reply #3 on: August 05, 2013, 08:55:29 pm »

Quote
Where I get lost is on smart list 2. When I construct it, it selects a single instance of the song using the criteria chosen. But I have no idea why that particular instance is chosen. Is that documented somewhere, and is it configurable?

Without a ~sort modifier, the instance that gets kept is essentially random.  The sort modifier allows sorting on criteria, so you can sort first, and then apply ~nodups.  So you might want to sort on file type, or path fragment, or ... something else.  You can use the Import/Export button to move items around.
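To illustrate why the sort has to come first, here is a small Python sketch of "sort, then drop duplicates". It is not MC syntax. It assumes the de-duplication step keeps the first file it meets for each key, and the `QUALITY_RANK` table and track tuples are hypothetical.

```python
# Hypothetical preference order: lower rank = preferred encoding.
QUALITY_RANK = {"flac": 0, "aac": 1, "mp3": 2}

# Hypothetical library entries: (artist, name, file type).
tracks = [
    ("Beatles", "Help!", "mp3"),
    ("Beatles", "Help!", "flac"),
    ("Rolling Stones", "Angie", "aac"),
    ("Rolling Stones", "Angie", "mp3"),
]

def sort_then_nodup(tracks):
    """Sort by encoding preference, then keep the first file per (artist, name)."""
    # sorted() is stable, so files with equal rank keep their original order.
    ordered = sorted(tracks, key=lambda t: QUALITY_RANK.get(t[2], 99))
    seen = set()
    kept = []
    for artist, name, file_type in ordered:
        key = (artist.lower(), name.lower())
        if key not in seen:
            seen.add(key)
            kept.append((artist, name, file_type))
    return kept

kept = sort_then_nodup(tracks)
print(kept)
```

With the sort in place, the kept copy of each song is the best-ranked one rather than an arbitrary one; without it, the result depends on whatever order the files happen to be in.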

Quote
Also, is there smart list syntax that allows for a slight variance in a song's playing time when time is one of the criteria for selecting dupes? I am trying to distinguish the typical 1 or 2 second variance in the playing time of the same song on different albums from genuinely longer and shorter versions of a song.

Not directly.  But you could create a custom calculated field and assign it, say, a Duration value truncated to the nearest 10 seconds.  This would give you a pretty close approximation.  Then you can just use your calculated Duration value in your dups detection.

al1947

  • World Citizen
  • ***
  • Posts: 162
Re: Managing Duplicates?
« Reply #4 on: August 05, 2013, 09:06:18 pm »

Hey, I'm just a newbie!

Quote
Without a ~sort modifier, the instance that gets kept is essentially random.  The sort modifier allows sorting on criteria, so you can sort first, and then apply ~nodups.  So you might want to sort on file type, or path fragment, or ... something else.  You can use the Import/Export button to move items around.

Quote
Not directly.  But you could create a custom calculated field and assign it, say, a Duration value truncated to the nearest 10 seconds.  This would give you a pretty close approximation.  Then you can just use your calculated Duration value in your dups detection.

No idea how to do either of those. What do I search for in the wiki or wherever to get an introduction to modifiers and calculated fields?

MrC

  • Citizen of the Universe
  • *****
  • Posts: 10462
  • Your life is short. Give me your money.
Re: Managing Duplicates?
« Reply #5 on: August 05, 2013, 09:15:15 pm »

This might be of some utility:

   http://wiki.jriver.com/index.php/Smartlist_and_Search_-_Rules_and_Modifiers

For the first topic, you'll want to be sure your Sort comes before your No Duplicates in the Wizard.  But since you probably already have search terms and modifiers there, you can use the Import/Export button to move the phrases around.  Add your Sort and then use the button.  It should be pretty clear.

For the calculation, create a new user field, and select Calculated Data, and enter the expression:

   math(10 * formatnumber(math([Duration,0] / 10),0))

New user fields: Tools > Options > Library & Folders > Manage Library Fields and click the Add New Field button.
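As a rough picture of what that expression computes, here is the same arithmetic in Python: divide the duration in seconds by 10, reduce it to a whole number, and multiply by 10 again. The half-up rounding is an assumption about how formatnumber(..., 0) behaves, and the function name is made up for this sketch.

```python
def duration_bucket(seconds: float) -> int:
    """Map a duration in seconds to the nearest multiple of 10.

    Half-up rounding is an assumption about formatnumber(..., 0).
    """
    return 10 * int(seconds / 10 + 0.5)

# Two rips of the same song, 2 seconds apart, land in the same bucket.
same = (duration_bucket(212), duration_bucket(214))

# But a pair that straddles a bucket boundary does not.
split = (duration_bucket(184), duration_bucket(186))

print(same, split)
```

So the calculated field groups most near-identical running times together, but two files only a second or two apart can still fall on opposite sides of a 10-second boundary. That is why it is an approximation and the results still need a manual look.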