INTERACT FORUM


Author Topic: How to remove a Gazillion Dupe files??  (Read 4345 times)

JIMV

  • World Citizen
  • ***
  • Posts: 146
How to remove a Gazillion Dupe files??
« on: December 08, 2014, 02:22:43 pm »

Backstory....My ancient iMac died the 'white screen of death'. I bought a new ThinkPad and used the backups on my backup hard drive to put the music files on the new device. When done, I found hundreds (if not thousands) of files duplicated, as in 4 copies of track one, of track two, etc...

Is there a way not requiring a degree in programming to scan and identify duplicate files and then batch delete them???

I really do not desire to go file by file and delete that way...
Logged

Matt

  • Administrator
  • Citizen of the Universe
  • *****
  • Posts: 42029
  • Shoes gone again!
Re: How to remove a Gazillion Dupe files??
« Reply #1 on: December 08, 2014, 02:33:51 pm »

I think emptying the library and doing a fresh import would be easiest.
Logged
Matt Ashland, JRiver Media Center

JIMV

  • World Citizen
  • ***
  • Posts: 146
Re: How to remove a Gazillion Dupe files??
« Reply #2 on: December 08, 2014, 02:40:47 pm »

How would re-importing a corrupt library solve the issue?  I want to scan what is on the PC, get a dump list of duplicate files and then hit a button to make all the dupes go away...How do I do that?
Logged

Matt

  • Administrator
  • Citizen of the Universe
  • *****
  • Posts: 42029
  • Shoes gone again!
Re: How to remove a Gazillion Dupe files??
« Reply #3 on: December 08, 2014, 02:42:40 pm »

You mean you have duplicates at the file system level?  If that's the case, figure out what folder is the real folder and erase the other folders.
Logged
Matt Ashland, JRiver Media Center

JIMV

  • World Citizen
  • ***
  • Posts: 146
Re: How to remove a Gazillion Dupe files??
« Reply #4 on: December 08, 2014, 04:08:26 pm »

Alas, I am still not being clear...I have a separate drive that has 65GB of music on it. This music includes a vast number of duplicate files. That music is now on the computer.

I want to discover how to review all my existing music files on the computer, automatically identify the files with the same name, consolidate them in one place and then remove them, leaving a single file for each specific piece of music.
Logged

Matt

  • Administrator
  • Citizen of the Universe
  • *****
  • Posts: 42029
  • Shoes gone again!
Re: How to remove a Gazillion Dupe files??
« Reply #5 on: December 08, 2014, 07:07:40 pm »

I don't think MC is the tool for the job.  I'm not sure what is.
Logged
Matt Ashland, JRiver Media Center

InflatableMouse

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 3978
Re: How to remove a Gazillion Dupe files??
« Reply #6 on: December 09, 2014, 12:38:34 am »

Are these duplicate files the same name but in a different directory or are they named differently? Or maybe they are of a different file type (mp4 vs m4a)?

If you can find some commonality between the dupes, and if it's always like that, you can use an expression to find them and delete them.

Make sure you have a backup.
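
To make "find some commonality" concrete outside MC, here is a minimal Python sketch. It assumes, purely hypothetically (nothing in this thread confirms it), that the dupes carry a copy suffix such as " (1)" or " - Copy" before the extension. The pattern and both function names are made up for illustration.

```python
import re

# Hypothetical pattern: one or more trailing " (1)", " (2)", ... or " - Copy"
# markers sitting just before the file extension.
COPY_SUFFIX = re.compile(
    r"^(?P<stem>.+?)(?: \(\d+\)| - Copy)+(?P<ext>\.[^.]+)$",
    re.IGNORECASE,
)


def original_name(filename):
    """Return the name with the copy suffix stripped, or None if there is none."""
    match = COPY_SUFFIX.match(filename)
    if match is None:
        return None
    return match.group("stem") + match.group("ext")


def likely_copies(filenames):
    """Return names that look like copies AND whose original is also present."""
    names = set(filenames)
    return sorted(name for name in names if original_name(name) in names)
```

Note that original_name would also strip "(2012)" from a name like "Track (2012).mp3", so any list this produces needs a look before anything is deleted.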
Logged

Krazykanuck

  • Regular Member
  • Galactic Citizen
  • ****
  • Posts: 267
Re: How to remove a Gazillion Dupe files??
« Reply #7 on: December 09, 2014, 09:47:38 am »

I use a tool called CloneSpy when trying to perform disk cleanup.
Logged

JIMV

  • World Citizen
  • ***
  • Posts: 146
Re: How to remove a Gazillion Dupe files??
« Reply #8 on: December 09, 2014, 04:38:10 pm »

So there is no way to scan a library in JRiver and delete dupe files? Anyway, thanks for the responses. I recall one of my earlier bits of music software being able to do that, though I cannot recall the software....perhaps MediaMonkey, or Pure Music on my Mac...

Would be a good feature in MC20.
Logged

InflatableMouse

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 3978
Re: How to remove a Gazillion Dupe files??
« Reply #9 on: December 10, 2014, 12:30:21 am »

Maybe you missed my reply?

It would help to know a little bit more about how your library is organized at the file level (e.g., Artist dir\Album dir\01 - tracknames.mp3). If your duplicates for an album are in that same directory, then they must be named differently, because two files cannot exist with exactly the same name in the same directory.

If they are scattered but have the same name, it means your directory structure and the files per album are correct, but somewhere there is an incorrect directory where the dupes reside.

Obviously there can be plenty of other situations, but either way, knowing more about how it's gone wrong is essential to solving it.

It can be done with MC, it just can't be done automatically. I don't see how that would work reliably either; there must be some level of interaction where you choose which files can be safely deleted. This is no different in MC.

If you can't figure this out by yourself, then possibly you could back up your library (the MC Library, as in the database) and share it. I'm willing to have a look at it, but I'm not an expression wizard so I can't promise you anything.
Logged

Omni

  • Regular Member
  • Citizen of the Universe
  • *****
  • Posts: 827
Re: How to remove a Gazillion Dupe files??
« Reply #10 on: December 10, 2014, 08:49:28 pm »

So wouldn't the built-in "Task -- possible duplicates" (under "Auto Smartlists (music)") be a good place to start?  Just double-click on that smartlist and see if your files show up.  If they do, then from there InflatableMouse or someone more savvy than me with script expressions (and maybe stacks) could probably guide you on how to create a nice Find/Replace expression to delete the duplicates.
Logged

Bill Kearney

  • Regular Member
  • Galactic Citizen
  • ****
  • Posts: 373
Re: How to remove a Gazillion Dupe files??
« Reply #11 on: December 11, 2014, 09:42:18 am »

Trouble is it's actually a bit difficult determining what is or isn't a duplicate.  Sure, direct bit-for-bit scanning will tell you, but that's very resource-intensive (both CPU and disk).  There are various PC file duplicate options out there (all suck, in different ways). 

Music files are often "collected" from various sources, not always ones the RIAA would consider legitimate.  Along with that there's a ton of screwy problems.

Comparing based on metadata makes assumptions on the metadata being present and/or accurate and that's not always the case.  So making blanket decisions based on bitrate or other metadata is problematic.  How does it decide?  Should it keep 192kbps VBR MP3 files in favor of WMVs?  Lossless vs lossy (due to disk space arguments).

So if you follow this then you'll understand why there's not a simple "remove duplicates" function.
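
For what it's worth, the cost of a bit-for-bit check can be cut down by hashing only the files that share a size with at least one other file. A rough Python sketch of that idea follows. It is not a feature of MC, the function names are hypothetical, and it assumes every file under the folder is readable and that two files with the same SHA-256 digest can be treated as identical.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(root):
    """Return sorted lists of paths under root that share size and digest."""
    by_size = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            by_size[os.path.getsize(path)].append(path)

    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate, so skip hashing
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups
```

This only reports groups. Deciding which copy in each group to keep is still left to the user, which is the hard part described above.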
Logged

Arindelle

  • Citizen of the Universe
  • *****
  • Posts: 2772
Re: How to remove a Gazillion Dupe files??
« Reply #12 on: December 11, 2014, 01:09:14 pm »

I set up a file type view with file path as one of the columns.

You can import this code using the import/export button when customizing the view, if you want to copy/paste it to try in JRiver:

Quote
[Media Type]=[Audio],[TV],[Video] ~dup=[Name],[Album],[Duration] ~sort=[Filename],[Album Artist],[Album],[Track #],[Name],[Disc #],[Artist],[Bitrate]
This will only work for identical duplicates of files actually imported, but it avoids removing different file types: the same song in MP3 and FLAC won't be called a duplicate. You can remove [Duration] if you are not getting any dupes.
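
Outside MC, the same idea can be sketched in Python. This assumes that ~dup keeps one file per distinct combination of the listed fields, which is an assumption about the expression and not a statement of how MC implements it. The dictionaries stand in for library entries, and split_duplicates is a hypothetical helper.

```python
def split_duplicates(tracks, fields=("Name", "Album", "Duration")):
    """Split tracks into (kept, duplicates), comparing only the given tag fields.

    The first track seen for each combination of field values is kept;
    every later track with the same combination is reported as a duplicate.
    """
    seen = set()
    kept, duplicates = [], []
    for track in tracks:
        key = tuple(track.get(field) for field in fields)
        if key in seen:
            duplicates.append(track)
        else:
            seen.add(key)
            kept.append(track)
    return kept, duplicates


# Made-up sample entries, already sorted by Filename.
tracks = [
    {"Name": "Intro", "Album": "A", "Duration": 61, "Filename": "C:/m/A/01.mp3"},
    {"Name": "Intro", "Album": "A", "Duration": 75, "Filename": "C:/m/A/01 live.mp3"},
    {"Name": "Intro", "Album": "A", "Duration": 61, "Filename": "D:/old/A/01.mp3"},
]
kept, duplicates = split_duplicates(tracks)
```

Passing fields=("Name", "Album") corresponds to dropping the duration from the comparison, which widens what counts as a dupe.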

Otherwise, externally, you can download the freeware version of Glary Utilities (http://www.glarysoft.com/). They have a good duplicate search and it's multi-drive if you need that. Probably better in your case.
Logged

Omni

  • Regular Member
  • Citizen of the Universe
  • *****
  • Posts: 827
Re: How to remove a Gazillion Dupe files??
« Reply #13 on: December 11, 2014, 03:13:32 pm »

Quote from: Bill Kearney on December 11, 2014, 09:42:18 am
Trouble is it's actually a bit difficult determining what is or isn't a duplicate.  Sure, direct bit-for-bit scanning will tell you, but that's very resource-intensive (both CPU and disk).  There are various PC file duplicate options out there (all suck, in different ways). 

Music files are often "collected" from various sources, not always ones the RIAA would consider legitimate.  Along with that there's a ton of screwy problems.

Comparing based on metadata makes assumptions on the metadata being present and/or accurate and that's not always the case.  So making blanket decisions based on bitrate or other metadata is problematic.  How does it decide?  Should it keep 192kbps VBR MP3 files in favor of WMVs?  Lossless vs lossy (due to disk space arguments).

So if you follow this then you'll understand why there's not a simple "remove duplicates" function.

I understand perfectly, probably better than you, actually, so there's no need to talk down to me.  But JIMV's situation is unique where everything you said is a non-issue.  His duplicates literally are true duplicates due to his having to restore from a backup hard drive image which itself was a combination of many backups, hence the duplication.

So the idea is that he should be able to simply stack/group all the files with just the same filename and (maybe) file size, and then just prune the stacks.  It's that last part I'm unsure of because I left the A/V world shortly before JRiver introduced stacks, but I'm sure there is probably a way to manipulate them via an expression.

If not, then, well, there are plenty of file duplicate finders out there on the net, and then he can just go find them, delete them, and then follow Matt's original suggestion to just recreate his library.
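
The stack-and-prune idea can also be sketched outside MC in a few lines of Python. This is only an illustration under stated assumptions: "same file" is taken to mean same file name (ignoring case) and same size in bytes, the copy whose path sorts first is the one kept, and both function names are hypothetical.

```python
import os
from collections import defaultdict


def stack_by_name_and_size(root):
    """Group files under root by (lower-cased file name, size in bytes).

    Only groups with more than one member are returned.
    """
    stacks = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            stacks[(name.lower(), os.path.getsize(path))].append(path)
    return {key: sorted(paths) for key, paths in stacks.items() if len(paths) > 1}


def prune_candidates(stacks):
    """Keep the first path of each stack and return the rest for review."""
    return [path for paths in stacks.values() for path in paths[1:]]
```

Nothing is deleted here. The list from prune_candidates is meant to be checked by hand first, since equal name and size do not strictly guarantee equal content.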
Logged

gvanbrunt

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 1232
  • MC Nerd
Re: How to remove a Gazillion Dupe files??
« Reply #14 on: December 11, 2014, 10:27:04 pm »

JIMV: If you want to tackle this problem, JRiver can probably be used as at least part of the solution. To help you do it, you'll need to provide what some members have asked for: more information on your file system setup. If they know the structure of where the files are and where duplicates can show up etc., they can probably help you locate them with a smartlist, from which you can then delete the files...

If there is no structure, that makes it more difficult. However, if they are all in the library and you can "see" the duplicates, you can probably create a smartlist from the library that may do the job as well. Again, some more information on what you "consider" a duplicate would be helpful.
Logged

Bill Kearney

  • Regular Member
  • Galactic Citizen
  • ****
  • Posts: 373
Re: How to remove a Gazillion Dupe files??
« Reply #15 on: December 31, 2014, 01:24:41 pm »

Quote from: Omni on December 11, 2014, 03:13:32 pm
I understand perfectly, probably better than you, actually, so there's no need to talk down to me.  But JIMV's situation is unique where everything you said is a non-issue.  His duplicates literally are true duplicates due to his having to restore from a backup hard drive image which itself was a combination of many backups, hence the duplication.

You're making assumptions not in evidence.  One being that anyone's "talking down" to anyone else here.  So just jump off your high horse from that nonsense.  Likewise with the "better than you" jab.  Really?

That, and just what structure of junk he's got in front of him isn't explained.  Sure, there's a ton of ways restoring from backups might make a mess, potentially made even worse by how programs like iTunes might have bastardized his libraries.  He mentioned an iMac being involved, and I've had to deal with the mess that all that "pretend" friendliness can inflict on unsuspecting users.  But that's an assumption I wasn't willing to make, hence I didn't wander down that road.

So when you read someone posting a question wanting a simple answer to a complex problem, but without enough details, don't just leap to conclusions when someone else tries to contribute.   

A combination of external windows-based file duplicate checking and MC will go a long way toward getting a grip on the mess he's got on his hands.  But it won't, as I posted, come without more work than just a "simple" duplicate remover function could ever hope to effectively provide.
Logged

astromo

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 2239
Re: How to remove a Gazillion Dupe files??
« Reply #16 on: December 31, 2014, 04:31:00 pm »

It's been a few weeks since JIMV asked for help and he hasn't reported back.

If the person with the problem is not engaging, then it doesn't make sense to ponder about the what ifs or debate who has a better understanding of the issues.


Logged
MC31, Win10 x64, HD-Plex H5 Gen2 Case, HD-Plex 400W Hi-Fi DC-ATX / AC-DC PSU, Gigabyte Z370 ULTRA Gaming 2.0 MoBo, Intel Core i7 8700 CPU, 4x8GB GSkill DDR4 RAM, Schiit Modi Multibit DAC, Freya Pre, Nelson Pass Aleph J DIY Clone, Ascension Timberwolf 8893BSRTL Speakers, BJC 5T00UP cables, DVB-T Tuner HDHR5-4DT

HiFiTubes

  • Citizen of the Universe
  • *****
  • Posts: 1123
Re: How to remove a Gazillion Dupe files??
« Reply #17 on: January 09, 2015, 04:13:49 am »

Man I'm so close!

I'm merging some libraries and wondering if MC could implement a kind of post-filter.

I've located EXACT dups, about 20K of them, and when I display only these dups I thought of two ways to eliminate them.

1. Rename/Move all my files with the DUPS REMOVED filter. This is cumbersome and I'm not sure if my tags would facilitate the move, e.g. using Genre to create a folder structure or even Album Artist (auto).

2. Move out all the DUPS into one location. I like to use a [file type] tag but in this case it won't help since they are exact dups, same bit rate and file type. Also I noted that if I simply move/rename the DUPS and the album is the same, all the files are dumped into one [Album] folder.

Finally, if MC had a kind of post-filter, I could just delete the displayed DUPS. Currently, you get a list of all files that are DUPS. If another filter was available, one that applies only after the previous one and does not affect the parent filter, I could click delete. Most of mine are in different file locations so I am looking at how that might help, but it seems MC is so close to making this easy....
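
Option 2 can be sketched outside MC as well. The Python below is a hedged illustration, not an MC feature: quarantine is a hypothetical helper that moves each listed duplicate under a separate folder while keeping its path relative to the library root, so two copies of the same album coming from different source folders are not dumped on top of each other. It defaults to a dry run that only reports the planned moves.

```python
import os
import shutil


def quarantine(duplicates, library_root, quarantine_root, dry_run=True):
    """Move duplicates under quarantine_root, preserving their relative paths.

    Returns the list of (source, destination) pairs. With dry_run=True
    nothing on disk is touched.
    """
    moves = []
    for src in duplicates:
        rel = os.path.relpath(src, library_root)
        if rel == os.pardir or rel.startswith(os.pardir + os.sep):
            raise ValueError(f"{src} is not inside {library_root}")
        dst = os.path.join(quarantine_root, rel)
        moves.append((src, dst))
        if not dry_run:
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            shutil.move(src, dst)
    return moves
```

Moving instead of deleting keeps the step reversible: if a wanted file ends up in the quarantine folder, it can be moved back to the same relative path.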
Logged

AndrewFG

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 3392
Re: How to remove a Gazillion Dupe files??
« Reply #18 on: January 09, 2015, 04:23:35 am »


I suggest that you buy the PerfectTunes DeDup application ( http://www.dbpoweramp.com/perfecttunes.htm )

Logged
Author of Whitebear Digital Media Renderer Analyser - http://www.whitebear.ch/dmra.htm
Author of Whitebear - http://www.whitebear.ch/mediaserver.htm

HiFiTubes

  • Citizen of the Universe
  • *****
  • Posts: 1123
Re: How to remove a Gazillion Dupe files??
« Reply #19 on: January 09, 2015, 04:59:33 am »

Thanks dude! Will check it out.
Logged

HiFiTubes

  • Citizen of the Universe
  • *****
  • Posts: 1123
Re: How to remove a Gazillion Dupe files??
« Reply #20 on: January 09, 2015, 05:51:56 am »

Not really cooking....on a NAS that does 60mb/s w/r

Logged

AndrewFG

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 3392
Re: How to remove a Gazillion Dupe files??
« Reply #21 on: January 09, 2015, 06:23:34 am »

Yes. Instead of just checking the metadata, it actually "listens" to each track to see if it "sounds" like any other tracks. This listening is faster than real time, but it does indeed take a while, and it certainly does help if you have a fast machine. If your music is on a detachable HD then plug it into the fastest PC in the house for that purpose. I have 12k tracks and I let it run overnight and it was ready by morning.

PS: is your PC connecting to the NAS by a UNC share name (\\Server\Folder\..) or via a mapped drive letter (F:)? I seem to recall that it may be faster with the latter.
Logged
Author of Whitebear Digital Media Renderer Analyser - http://www.whitebear.ch/dmra.htm
Author of Whitebear - http://www.whitebear.ch/mediaserver.htm