INTERACT FORUM

Author Topic: Library Cleanup - Sort & Filter options...  (Read 782 times)

pilotbum

  • Regular Member
  • Junior Woodchuck
  • **
  • Posts: 99
  • nothing more to say...
Library Cleanup - Sort & Filter options...
« on: June 08, 2020, 01:00:22 pm »

I have a HUGE music library (500+ GB, 1800+ artists, 5000+ albums, 48,000+ tracks) that I know contains a lot of duplicates and some music that is missing metadata/tags. Is there any way to filter my Library so that it only shows music that 1) is missing tags, or 2) has duplicates (e.g., multiple copies of the same album or the same tracks)? That way I can delete the lower-quality duplicates, fill in missing tag info, etc.

As far as the Tags are concerned I know I can sort by any of the tag fields and see what is missing that way. Ultimately I can use this to fill in missing tag fields if there is no way to filter. But I've yet to find a way to easily see duplicates.

I've scoured through the menus and if such options exist, I haven't found them.

Thanks!

wer

  • Citizen of the Universe
  • *****
  • Posts: 2640
Re: Library Cleanup - Sort & Filter options...
« Reply #1 on: June 08, 2020, 01:36:14 pm »

You need to learn about Smartlists and the Panes view.  These will let you search and filter by any criterion, or by its absence.

In Smartlists, look at the pre-existing Audio - Missing Cover Art smartlist; it will give you ideas. There is also one for Audio-Task-Possible duplicates.

Use Google to search for "jriver find duplicates".  That has been discussed many times.

You can define a custom panes view that searches for or filters by any tag you want with an empty value.  You have a big undertaking in cleaning up your library, so you will need to spend some time learning these techniques.

pilotbum

  • Regular Member
  • Junior Woodchuck
  • **
  • Posts: 99
  • nothing more to say...
Re: Library Cleanup - Sort & Filter options...
« Reply #2 on: June 09, 2020, 11:04:57 am »

I know I have a big undertaking ahead of me. That's why I'm trying to figure out/learn the best/easiest way of approaching and doing it.

Thanks for the input. As powerful as this software is I knew there had to be something I wasn't seeing.

Doof

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 5907
  • Farm Animal Stupid
Re: Library Cleanup - Sort & Filter options...
« Reply #3 on: June 09, 2020, 03:58:33 pm »

FWIW I find Panes View (and others) to be much more helpful when doing maintenance-type tasks like this. Smartlists are better for consuming the fruits of your labors. To give you an idea, here's the search string I have on one of the Views I use to help me find duplicate photos in my library:

Quote
~dup=[Duplicate Photos] ~sort=[Latitude],[Altitude]-d,[People],[Places],[Events],[Caption],[File Size]-d,[Filename (path)]

Basically, this finds all files in the library that match each other on every field listed after the ~dup= segment. The results are then sorted by a different set of criteria (everything after the ~sort=) that helps me pick which of the duplicates is "best". In this case, it's only looking for matches on a custom field called [Duplicate Photos].

[Duplicate Photos] is just a calculated field I created that equals this: [Camera] | [Date /(filename friendly/)] | [Dimensions] | [File Type] | [Source] | [Track #]. So MC renders out that string as a simple sort of hash value and then looks for all files that have that same hash. The theory being that the only way multiple files could have the same hash is by matching on all of those fields and allegedly the only way that should happen is if they're the same photo. It's not perfect but it's proven to be pretty good. Generally, if a false positive slips through it's because it's either missing metadata or the two photos were taken so close together that they have the same timestamp. And that's when I use the Track # field to manually differentiate them from each other.

I could call each of those fields individually in the ~dup statement, but concatenating them all into a single field also lets me use that field in the Group By setting of my duplicates view, giving a nice visual cue of which files are potentially duplicates of which. And I can just update the value of this one field to simultaneously update how duplicates are determined and how they are grouped together.
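
For anyone who thinks better in code, the "concatenate fields into one key, then group on it" idea can be sketched outside MC in Python. This is only an illustration under assumptions: the dictionaries, file names, and the shortened field list are made up, and it is not a description of how MC evaluates ~dup internally.

```python
from collections import defaultdict

# Hypothetical stand-ins for library entries; MC stores these as tags, not dicts.
photos = [
    {"file": "a.jpg", "Camera": "X100", "Date": "20200101-101500",
     "Dimensions": "6000x4000", "Track #": ""},
    {"file": "b.jpg", "Camera": "X100", "Date": "20200101-101500",
     "Dimensions": "6000x4000", "Track #": ""},
    {"file": "c.jpg", "Camera": "X100", "Date": "20200102-090000",
     "Dimensions": "6000x4000", "Track #": ""},
]

# Shortened, illustrative subset of the fields used in the calculated field.
KEY_FIELDS = ["Camera", "Date", "Dimensions", "Track #"]


def duplicate_key(entry):
    """Join the chosen fields with ' | ', much like the calculated field."""
    return " | ".join(entry.get(name, "") for name in KEY_FIELDS)


def find_duplicates(entries):
    """Group files by key and keep only keys shared by more than one file."""
    groups = defaultdict(list)
    for entry in entries:
        groups[duplicate_key(entry)].append(entry["file"])
    return {key: files for key, files in groups.items() if len(files) > 1}


dupes = find_duplicates(photos)
for key, files in dupes.items():
    print(key, "->", files)
```

Setting a different Track # on one of two look-alike files changes its key, which is the same trick used to separate false positives by hand.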

pilotbum

  • Regular Member
  • Junior Woodchuck
  • **
  • Posts: 99
  • nothing more to say...
Re: Library Cleanup - Sort & Filter options...
« Reply #4 on: June 15, 2020, 09:54:28 pm »

That's pretty cool but I can see it'll take some playing to get it to work right. I still have yet to attempt this monumental task...

Doof

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 5907
  • Farm Animal Stupid
Re: Library Cleanup - Sort & Filter options...
« Reply #5 on: June 16, 2020, 11:39:34 pm »

Yeah, it can certainly be a daunting task. MC does make it easier once you understand how all of its tools work, plus a few tricks for using them in sometimes unconventional ways to get the results you're after. If you haven't already, experiment with the Library Tools -> Fill Properties From Filename... tool. It might help ease the pain of filling in missing metadata, which I'd honestly make the first step in your cleanup/de-dupe process. Clean up the basic metadata (artist, album, date, track #, and name), and then you can easily find duplicates just by setting up a view similar to what I described above. Once you've de-duped everything, you can focus on tagging files more thoroughly than the basics if you want.

I really like using the calculated field to define the criteria that qualify files as "duplicates", because it allows me to use that as the Group By field in the view and group all of the duplicates together. Then you can sort inside the groups by whatever criteria make one copy better than another, so the best one bubbles up to the top. At that point it becomes trivial to quickly scan the entire list, confirm the one you want to keep is the top one, highlight the rest, and delete them. If there are files you want to delete that have metadata you want to keep in the better copy, just Copy the file with the metadata and use Ctrl-Shift-V to get a list of tags to paste into the copy you're keeping. To make it even easier to mass delete the duplicates once you're done examining the list in detail view and copy/pasting any tags you want, switch to thumbnail view with thumbnails small enough that you get multiple columns. Now the file on top (the chosen one) is in the leftmost column, meaning you can easily highlight everything to the right of that column and delete all of the duplicates in one fell swoop. This setup works so well for me that I was able to visually scan and eliminate 80k+ duplicate images from my photo collection in just a couple of evenings.
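
The "group, sort inside the group, keep the top one" workflow can be sketched in Python as well. Again, this is just an illustration with made-up data: the "key" value stands in for the calculated duplicate field, and "largest file wins" is an assumed stand-in for whatever criteria you actually sort by.

```python
from itertools import groupby

# Hypothetical entries: "key" plays the role of the calculated duplicate field.
files = [
    {"file": "small.jpg", "key": "X100 | 20200101", "size": 1_200_000},
    {"file": "large.jpg", "key": "X100 | 20200101", "size": 4_800_000},
    {"file": "only.jpg", "key": "X100 | 20200102", "size": 3_000_000},
]


def split_keep_and_delete(entries):
    """Return (keep, delete): the largest file per key, and everything else."""
    # Sort by key, then by size descending, so the preferred copy comes first.
    ordered = sorted(entries, key=lambda e: (e["key"], -e["size"]))
    keep, delete = [], []
    for _, group in groupby(ordered, key=lambda e: e["key"]):
        first, *rest = group
        keep.append(first["file"])
        delete.extend(e["file"] for e in rest)
    return keep, delete


keep, delete = split_keep_and_delete(files)
print("keep:", keep)
print("delete:", delete)
```

The sketch only builds the two lists; it deletes nothing, which mirrors the manual "scan and confirm before deleting" step.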

It's certainly pretty powerful software. I'm constantly amazed at the things I can coerce it into doing.

But to quickly answer your other question about filtering on missing metadata, you can use the Search Box (or set up a Smartlist or View) with a query like [Artist]=[] and that will show you all files that have an empty Artist field. You can do the same with any field you consider missing (Album, Name, etc.) and wrap them in () to "OR" them. So ([Artist]=[] [Album]=[] [Name]=[]) would do a search for any file that has either an empty Artist, Album, OR Name field. Check out the wiki for more details on how the search stuff works. It's pretty powerful, but the basics are easy enough to learn. You can use the Search Wizard too if you prefer a GUI.
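
As a rough picture of what that "any of these fields is empty" filter is meant to return, here is a small Python sketch. The track dictionaries and file names are invented for the example, and the code says nothing about MC's own search syntax.

```python
# Hypothetical track records; a missing key is treated the same as an empty tag.
tracks = [
    {"file": "01.flac", "Artist": "Some Artist", "Album": "Some Album", "Name": "Intro"},
    {"file": "02.flac", "Artist": "", "Album": "Some Album", "Name": "Track Two"},
    {"file": "03.flac", "Artist": "Some Artist", "Album": "Some Album"},
]

REQUIRED = ["Artist", "Album", "Name"]


def missing_fields(track):
    """List the required fields that are absent or blank for one track."""
    return [name for name in REQUIRED if not track.get(name, "").strip()]


# A track qualifies if ANY required field is empty (the "OR" behavior).
incomplete = {t["file"]: missing_fields(t) for t in tracks if missing_fields(t)}

for file_name, fields in incomplete.items():
    print(file_name, "is missing", ", ".join(fields))
```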