INTERACT FORUM


Author Topic: Is there an easy way to get rid of duplicate files?*now with feature suggestion*  (Read 9796 times)

Matt22

  • Recent member
  • *
  • Posts: 29

I did an all-inclusive auto import (adding My Computer and my external hard drive as the folders), and now I have close to 10,000 songs (8,464 to be exact) in the audio section. Is there an easy way to delete duplicates en masse? Doing it one by one would take forever in my case.

If there isn't, then it would be a good idea to add a way to automatically highlight duplicates, like so, in one of the next builds.

This is just me pressing Ctrl while clicking some duplicates, but you get the idea.

Logged

raldo

  • Citizen of the Universe
  • *****
  • Posts: 1102
Re: Is there an easy way to get rid of duplicate files?
« Reply #1 on: July 15, 2008, 05:56:46 am »

Make a smartlist which sorts out your duplicates. You must configure the smartlist to pick the songs which are the same (track#, name, artist,...)

Then group the smartlist by, for example, album, and sort the list by file name.

Your duplicates will be nicely arranged, and you can use right click + mouse drag to select the songs to be deleted!
Logged

Matt22

  • Recent member
  • *
  • Posts: 29
Re: Is there an easy way to get rid of duplicate files?
« Reply #2 on: July 16, 2008, 07:00:53 pm »

I tried doing that, but I don't think I did it right. Could you please explain step by step (if you can be bothered Lol)?

Anyway, that's my feature suggestion. I'd love a smart duplicate finder integrated with Media Center that works as in the picture above (and that also works with mp3 players).
Logged

ADDiCT

  • Regular Member
  • World Citizen
  • ***
  • Posts: 235
  • I'm a bad llama!

A smartlist or view scheme for identifying duplicates can be built with a few mouse clicks. Create a new smartlist, and choose "Only duplicates of" and any criteria you like (Name, Artist, etc.) in the "Modify Results" area.

I think your real problem can't be solved with a new feature in MC. Let me explain this with an example: your hard disk is like an office, and your music files are single pieces of paper that need to be sorted (and are, by the looks of it, currently lying scattered around the floor) and bound together. MC is a collection of ring binders, plastic folders, staplers, and so on. Before you can use the tools to create order out of chaos, you'll have to think about ordering schemes, storage, etc. Just dropping the tools on the heaps of paper won't create order. You'll have to pre-sort your papers, and then use the tools to arrange them neatly, in a fashion that allows for easy retrieval of specific papers later. Only after arranging everything will you be able to identify duplicate pieces of paper, too.
Logged

Matt22

  • Recent member
  • *
  • Posts: 29

Well, I actually did find a piece of software months ago that could do that, so I know for a fact it is possible. I was just hoping it could somehow be implemented in future versions of Media Center. Everything else seems to be.  ;D
Logged

ADDiCT

  • Regular Member
  • World Citizen
  • ***
  • Posts: 235
  • I'm a bad llama!

What software? Links?
Logged

Matt22

  • Recent member
  • *
  • Posts: 29

That's the million dollar question. Lol

It was months ago and I can't find it now. I'm sure it's there somewhere; I will look again later.
Logged

AustinBike

  • Regular Member
  • World Citizen
  • ***
  • Posts: 215
  • nothing more to say...

I have a request for a "similar" feature. 

I have ~50K songs in my collection and ~6-8K artists. Much of my collection is compilations from Asia and Europe. Unfortunately I get things like "Blank & Jones" and "Blank and Jones" or "Funky Lowlives" and "The Funky Lowlives." And the problem is that when you import a new compilation you have artists all over the place, so it is hard to easily find them all (not like importing an album by a single artist and looking next to it).

I would love to have a tool that highlights possible matches.  Currently I have to scroll through the list manually, artist by artist and correct it.  I would love a way to highlight possible matches so I could quickly fix what I need to.
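
A minimal sketch of what such "possible match" highlighting could look like outside MC, assuming the artist names have been exported as plain strings. The `normalize` and `possible_matches` helpers and the 0.9 threshold are illustrative assumptions, not an MC feature; only Python's standard `difflib` is used.

```python
import re
from difflib import SequenceMatcher
from itertools import combinations

def normalize(artist):
    """Lower-case, treat '&' as 'and', drop punctuation and a leading 'the'."""
    name = artist.lower().replace("&", " and ")
    name = re.sub(r"[^\w ]", " ", name)  # \w is Unicode-aware, so non-Latin names survive
    words = name.split()
    if words and words[0] == "the":
        words = words[1:]
    return " ".join(words)

def possible_matches(artists, threshold=0.9):
    """Return pairs of distinct artist strings that probably name the same artist."""
    pairs = []
    for a, b in combinations(sorted(set(artists)), 2):
        ratio = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
        if ratio >= threshold:  # threshold is an assumed, tunable cut-off
            pairs.append((a, b))
    return pairs

artists = ["Blank & Jones", "Blank and Jones", "Funky Lowlives",
           "The Funky Lowlives", "Kraftwerk"]
matches = possible_matches(artists)
```

Comparing every pair is quadratic, so with several thousand artists this would take a while; grouping by the normalized name first would be a cheaper first pass.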
Logged

scarbos

  • Member
  • *
  • Posts: 4

I have used this software a few times. So far it works for me. It is shareware and they ask for a donation if possible. Hope it helps. Thanks, Hollye

http://www.digitalvolcano.co.uk/dupe.html
Logged

Matt22

  • Recent member
  • *
  • Posts: 29

That's actually freeware, and doesn't do quite what I meant. What I meant was that it would scan the library and automatically highlight all the duplicates so you could delete them in one go.
Logged

Matt22

  • Recent member
  • *
  • Posts: 29

I've used that before and it's not great tbh. It certainly doesn't do as I've shown in the picture above (and described one post ago).
Logged

Matt22

  • Recent member
  • *
  • Posts: 29

Well I didn't find it simple to use.

Anyway, we are getting off topic. I've already made my request for the next build or so, so let's leave it at that.
Logged

Frobozz

  • Citizen of the Universe
  • *****
  • Posts: 634
  • There is a small mailbox here.

A feature in MC that finds duplicates by audio fingerprints would be a nice feature.  Using audio fingerprints instead of tag info would catch files even with poor or incorrect tags.  It is a question though how well audio fingerprinting can do in finding duplicates.

I just ran a Google search to see if there was anything out there currently using audio fingerprints to find duplicate music files.  I found one: MP3 Duplicate Finder.  I tried it.  It sort of works.  It found some duplicates but missed others.  It did manage to identify a few duplicates that were encoded with different encoders (one with LAME and the other with the iTunes MP3 encoder).  Unfortunately it missed many more duplicates than it found.  The interface for the software is also not ready for prime time.  It's more of a beta or proof of concept than a finished product.  Interesting though in that it shows that audio fingerprinting can work in finding duplicates.  Just don't expect a 100% success rate.
Logged

Alex B

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 10121
  • The Cosmic Bird

If it wasn't already clear, MC has a fine duplicate finder.

It can be invoked by creating a smartlist that uses the "Only duplicates of" rule.

The older MC versions had a preconfigured "Task -- possible duplicates" smartlist rule, which was
[Media Type]=[audio] ~dup=[artist],[name] ~sort=[Artist],[Name]

MC12 has only a few example smartlists (= stock smartlists) and the users are supposed to use the redesigned smartlist wizard for creating new smartlists.

A "find duplicates" smartlist rule can be based on the file tags and if preferred it can also check physical properties like bitrate and duration.

For example, the following rule would find audio files that have identical Bitrate, File Size, Duration, and Filename (but not path) and sort the search results by the same library fields so that the possible duplicates appear side by side:
[Media Type]=[Audio] ~dup=[Bitrate],[File Size],[Duration],[Filename /(name/)] ~sort=[Bitrate],[File Size],[Duration],[Filename (name)]

It is possible to paste and/or directly edit the rule text in the wizard by clicking the Import/Export button, but a smartlist like this can be created with the wizard without having any knowledge of the rule language.
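
For readers who think in code, the idea behind a multi-field "Only duplicates of" rule can be sketched in Python. This is not how MC implements it; the `find_duplicates` function and the sample `library` entries are hypothetical, with dictionary keys chosen to mirror the field names in the rule above.

```python
from collections import defaultdict

def find_duplicates(tracks, fields):
    """Return groups of tracks that share the same value for every field."""
    groups = defaultdict(list)
    for track in tracks:
        key = tuple(track[f] for f in fields)
        groups[key].append(track)
    # Keep only keys seen more than once, ordered by key so that
    # possible duplicates end up side by side.
    return [groups[k] for k in sorted(groups) if len(groups[k]) > 1]

# Hypothetical library entries for illustration only.
library = [
    {"Bitrate": 192, "File Size": 4_100_000, "Duration": 215,
     "Filename (name)": "song.mp3", "Path": "C:/Music/song.mp3"},
    {"Bitrate": 192, "File Size": 4_100_000, "Duration": 215,
     "Filename (name)": "song.mp3", "Path": "E:/Backup/song.mp3"},
    {"Bitrate": 128, "File Size": 3_000_000, "Duration": 215,
     "Filename (name)": "other.mp3", "Path": "C:/Music/other.mp3"},
]

dup_fields = ["Bitrate", "File Size", "Duration", "Filename (name)"]
duplicate_groups = find_duplicates(library, dup_fields)
```

The path is deliberately left out of the key, which is why the same file stored in two folders lands in one group.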
Logged
The Cosmic Bird - a triple merger of galaxies: http://eso.org/public/news/eso0755

StFeder

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 1493
  • Fight! You may win. If you don't, you already lost

Quote from: Frobozz
A feature in MC that finds duplicates by audio fingerprints would be a nice feature.  Using audio fingerprints instead of tag info would catch files even with poor or incorrect tags.  It is a question though how well audio fingerprinting can do in finding duplicates.

You can do some fake-fingerprinting within MC using its Analyze feature. Take a look at this thread. It's far from doing a perfect job. I did some testing and found that the same files encoded with different encoders don't get the same values, not even for BPM. Still, using the smartlist given in the mentioned thread, I found some duplicates I had never found before :) .
Logged

Alex B

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 10121
  • The Cosmic Bird

Quote from: StFeder
You can do some fake-fingerprinting within MC using its Analyze feature. Take a look at this thread. It's far from doing a perfect job. I did some testing and found that the same files encoded with different encoders don't get the same values, not even for BPM. Still, using the smartlist given in the mentioned thread, I found some duplicates I had never found before :) .

Naturally it is possible to add any library fields to an "Only duplicates of" smartlist. Fields like "Replay Gain" or "BPM" can be used in conjunction with other fields for further limiting the smartlist search results. In addition, a smartlist can include or exclude the results of other smartlists, so it is possible to create a few different basic "find duplicates" smartlists and one or more "combination" smartlists for displaying the combined results in a single list.

It isn't clear if the OP has "audio content duplicates" (i.e. exactly identical audio files in different locations and with possibly different tags) or "audio source duplicates" (i.e. technically different audio files that are encoded from the same audio source) or perhaps both kinds of "duplicates".
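
The narrowest version of the first kind of duplicate, files that are byte-for-byte identical, can be found outside MC with a plain content hash. The sketch below is an illustration under assumptions: `file_digest` and `byte_identical_groups` are hypothetical helper names, and because tags are usually embedded in the file, copies whose tags differ will not match. It says nothing about "audio source duplicates".

```python
import hashlib
import tempfile
from collections import defaultdict
from pathlib import Path

def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file's bytes, read in chunks to keep memory use low."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def byte_identical_groups(paths):
    """Group paths whose contents are exactly identical."""
    by_size = defaultdict(list)
    for p in paths:
        by_size[Path(p).stat().st_size].append(Path(p))
    by_hash = defaultdict(list)
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue  # a unique size cannot have an identical copy
        for p in same_size:
            by_hash[file_digest(p)].append(p)
    return [sorted(g) for g in by_hash.values() if len(g) > 1]

# Small demonstration with throwaway files standing in for audio files.
with tempfile.TemporaryDirectory() as tmp:
    a, b, c = Path(tmp, "a.mp3"), Path(tmp, "b.mp3"), Path(tmp, "c.mp3")
    a.write_bytes(b"same audio bytes")
    b.write_bytes(b"same audio bytes")
    c.write_bytes(b"different bytes!")  # same length, different content
    groups = byte_identical_groups([a, b, c])
    names = [[p.name for p in g] for g in groups]
```

Checking the file size first avoids hashing files that cannot possibly be identical, which matters on a library of tens of thousands of tracks.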
Logged

hit_ny

  • Citizen of the Universe
  • *****
  • Posts: 3310
  • nothing more to say...

Quote from: AustinBike
I have ~50K songs in my collection and ~6-8K artists. Much of my collection is compilations from Asia and Europe. Unfortunately I get things like "Blank & Jones" and "Blank and Jones" or "Funky Lowlives" and "The Funky Lowlives." And the problem is that when you import a new compilation you have artists all over the place, so it is hard to easily find them all (not like importing an album by a single artist and looking next to it).

I would love to have a tool that highlights possible matches.  Currently I have to scroll through the list manually, artist by artist and correct it.  I would love a way to highlight possible matches so I could quickly fix what I need to.

That would indeed be great; I asked for a fuzzy matcher ages ago for this exact reason.
Logged

raldo

  • Citizen of the Universe
  • *****
  • Posts: 1102

I tested Clone Remover and it was extremely slow. MC is on hyperdrive compared to Clone Remover when it comes to looking up duplicates. MC: 2 seconds to look them up. Clone Remover: an all-nighter.

Not just that, you can also configure MC, whereas Clone Remover was very limited in that respect.

As far as I can see, Clone Remover doesn't do more than MC wrt. finding duplicates.


Logged

MrHaugen

  • Regular Member
  • Citizen of the Universe
  • *****
  • Posts: 3774

MC12 can remove duplicates, but there's no way to tell it which of the tracks to remove.
I would love to be able to say: compare Artist/Name tags (normal duplicate compare) and remove the track with the lowest rating/number of plays/custom field etc.
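
The requested behaviour can be sketched in a few lines of Python, assuming the library has been exported as a list of records. `pick_removals`, the field names, and the sample tracks are hypothetical illustrations, not an MC function or real library data.

```python
from collections import defaultdict

def pick_removals(tracks, key_fields=("Artist", "Name"), prefer="Rating"):
    """For each duplicate group, keep the track with the highest `prefer`
    value and return all the others as removal candidates."""
    groups = defaultdict(list)
    for track in tracks:
        key = tuple(track[f].strip().lower() for f in key_fields)
        groups[key].append(track)
    removals = []
    for members in groups.values():
        if len(members) < 2:
            continue  # not a duplicate
        keeper = max(members, key=lambda t: t[prefer])  # first one wins a tie
        removals.extend(t for t in members if t is not keeper)
    return removals

# Hypothetical library entries for illustration only.
tracks = [
    {"Artist": "Artist A", "Name": "Song 1", "Rating": 4, "Path": "C:/Music/song1.mp3"},
    {"Artist": "artist a", "Name": "Song 1", "Rating": 2, "Path": "E:/Backup/song1.mp3"},
    {"Artist": "Artist B", "Name": "Song 2", "Rating": 5, "Path": "C:/Music/song2.mp3"},
]
to_remove = pick_removals(tracks)
```

Passing `prefer="Number Plays"` or a custom field name would switch the criterion, provided every record carries that field.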
Logged
- I may not always believe what I'm saying

JimH

  • Administrator
  • Citizen of the Universe
  • *****
  • Posts: 71469
  • Where did I put my teeth?

I removed a spam post for Clone Remover.  It's the third or fourth time he's posted about it.
Logged