INTERACT FORUM

Author Topic: Fuzzy Search  (Read 2832 times)

Tanoshimi

  • Junior Woodchuck
  • **
  • Posts: 57
Fuzzy Search
« on: February 20, 2007, 05:56:56 pm »

I have written a plugin that scans the Billboard charts and tells me whether my songs have charted and at what position. It also gives me a list of charted songs that I do not have.  The problem is that the searches are exact: any discrepancy in the Artist or Name, and I can't find the song.  What I want is to find the "close enough" matches.  A LIKE search might do the trick.  Is there anything like that in MC12 or VB?  Currently, I use the Filter command to narrow the list down to songs by the target artist, then Filter the titles.

benn600

  • Citizen of the Universe
  • *****
  • Posts: 3849
  • Living: Santa Monica CA Hometown: Cedar Rapids IA
Re: Fuzzy Search
« Reply #1 on: February 23, 2007, 10:32:31 am »

This is also a problem when finding duplicates.  In fact, I wouldn't mind this ability in general.  I always hand-check every song title, album, artist, year, and genre to make sure there aren't any mistakes, but I still find problems.  Checking your songs against Billboard would essentially return any songs that are misspelled, etc.  Then I could correct the mistake!  Won't this take a ton of time, though?  Looking up 12 thousand songs could take a while when a few songs can take a minute (with a web browser).  I suppose if you can submit song info and get back a number it could go pretty fast.  Does this happen in the background?  I would like this feature!

John Gateley

  • Citizen of the Universe
  • *****
  • Posts: 4957
  • Nice haircut
Re: Fuzzy Search
« Reply #2 on: February 25, 2007, 10:52:12 am »

Fuzzy search is hard. If you have any ideas, let me know. I have quite a few, but they are English-centric, and I have no idea whether they work with international character sets, especially non-European ones.

j

Tanoshimi

  • Junior Woodchuck
  • **
  • Posts: 57
Re: Fuzzy Search
« Reply #3 on: February 28, 2007, 10:21:12 am »

Unfortunately, all of mine work the same way, and are also English-centric.  I create a KEY from the uppercased text, keeping only the alphanumeric characters.  I then compare this KEY with the SEARCH string using an InStr or Left$ command.  For example:

SEARCH = (R.E.M.) - Everybody Hurts = REMEVERYBODYHURTS
TITLE = R. E. M. everYbody_Hurts (Remix) = REMEVERYBODYHURTSREMIX

Then I check Left(SEARCH, Len(TITLE)) = TITLE and Left(TITLE, Len(SEARCH)) = SEARCH.  If either condition is true, it's considered a match.
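A minimal Python sketch of the key-and-prefix comparison described above, for anyone who wants to try it outside VB. The function names `make_key` and `is_close_match` are made up for illustration, and restricting the key to ASCII letters and digits is an assumption that matches the English-centric approach in the post.

```python
def make_key(text: str) -> str:
    """Uppercase the text and keep only ASCII letters and digits."""
    return "".join(ch for ch in text.upper() if ch.isascii() and ch.isalnum())


def is_close_match(search: str, title: str) -> bool:
    """True when either key is a prefix of the other (the two Left() checks)."""
    search_key, title_key = make_key(search), make_key(title)
    if not search_key or not title_key:
        return False  # assumption: an empty key should not match everything
    return search_key.startswith(title_key) or title_key.startswith(search_key)


print(make_key("(R.E.M.) - Everybody Hurts"))        # REMEVERYBODYHURTS
print(make_key("R. E. M. everYbody_Hurts (Remix)"))  # REMEVERYBODYHURTSREMIX
print(is_close_match("(R.E.M.) - Everybody Hurts",
                     "R. E. M. everYbody_Hurts (Remix)"))  # True
```

Because the rule is a prefix test, any shorter key matches every longer key that starts with it, which is convenient for finding remixes but loose for anything stricter.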

I have considered creating a library field (Calculated Field) based upon the Artist and Title that would produce this kind of KEY.  Then I could create a search function that converts the Criteria to a KEY and compares it with the KEY field.  Incorporating this directly into MC12, rather than implementing it as a plugin, would be preferable.

Any thoughts?
-Tano

John Gateley

  • Citizen of the Universe
  • *****
  • Posts: 4957
  • Nice haircut
Re: Fuzzy Search
« Reply #4 on: February 28, 2007, 11:24:04 am »

That kind of fuzzy search doesn't handle misspellings well. Soundex (http://en.wikipedia.org/wiki/Soundex) works better for those, though I'm not sure it could be expressed in Media Center without some underlying support. It is also English-centric.
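For reference, a rough Python sketch of American Soundex as it is commonly described (first letter kept, consonants mapped to digits, adjacent equal codes collapsed, H and W not separating equal codes, result padded to four characters). This is an illustration only, not something Media Center provides; the function name `soundex` and the choice to ignore every non A-Z character are assumptions.

```python
# Consonant groups and their Soundex digits; vowels, H, W and Y have no code.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for letter in letters:
        CODES[letter] = digit


def soundex(word: str) -> str:
    """Return the 4-character Soundex code, or "" if the word has no A-Z letters."""
    letters = [c for c in word.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    result = letters[0]
    prev = CODES.get(letters[0], "")
    for c in letters[1:]:
        if c in "HW":
            continue  # H and W do not break a run of equal codes
        code = CODES.get(c, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code is kept
        if len(result) == 4:
            break
    return result.ljust(4, "0")


print(soundex("Robert"), soundex("Rupert"))  # R163 R163
```

Two differently spelled names that sound alike get the same code, which is what makes it more tolerant of misspellings than an exact key comparison.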

j

Tanoshimi

  • Junior Woodchuck
  • **
  • Posts: 57
Re: Fuzzy Search
« Reply #5 on: February 28, 2007, 01:01:43 pm »

I agree.  I used my key-generating idea because of the inconsistencies in spelling/capitalization/spacing/punctuation, and since 99% of my collection is English, this worked for me.  I'm just curious how Soundex would handle things like 3 Doors Down and 313 Mafia.  Maybe, and this is for my own purposes, I'll combine the KEY and Soundex: basically, taking the Soundex of the KEY.

Of course, some problems are immediately evident.  Technically "Jay-Z Featuring Linkin Park" and "Linkin Park featuring Jay-Z" are different, but in essence they are the same.  Conversely, "Nelly" and "Nelly Furtado" look similar but are different artists.  For finding items, that's OK, but for removing duplicates, not so much.  If Nelly and Nelly Furtado both do songs called "I Miss You", I wouldn't want to lose one of them.
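One possible way to handle the ordering case, sketched in Python under some loud assumptions: split the artist string on "featuring" (the variants "feat." and "ft." are my addition), build an uppercase alphanumeric key for each part, and compare the parts as an unordered set. The name `artist_key` is hypothetical, and this is not something from MC12.

```python
import re

# Assumption: these are the only "featuring" spellings worth splitting on.
_FEATURING = re.compile(r"\s+(?:featuring|feat\.?|ft\.?)\s+", re.IGNORECASE)


def artist_key(artist: str) -> frozenset:
    """Return the set of per-artist keys, ignoring the order of the credits."""
    parts = _FEATURING.split(artist)
    keys = ("".join(ch for ch in part.upper() if ch.isalnum()) for part in parts)
    return frozenset(key for key in keys if key)


same = artist_key("Jay-Z Featuring Linkin Park") == artist_key("Linkin Park featuring Jay-Z")
different = artist_key("Nelly") != artist_key("Nelly Furtado")
print(same, different)  # True True
```

Comparing whole sets rather than prefixes keeps "Nelly" and "Nelly Furtado" apart, which matters more for duplicate removal than for searching.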

Quite the pickle.  I shall put some thought into this.  It's good to know that if a reasonable means is available, you guys are open to suggestions.

On a related note, we've always wanted to be able to add buttons and commands to menus.  I personally would like to add my Fuzzy Search to the interface, so I can easily use it against a premade KEY field.  Any chance of that happening?

John Gateley

  • Citizen of the Universe
  • *****
  • Posts: 4957
  • Nice haircut
Re: Fuzzy Search
« Reply #6 on: February 28, 2007, 01:39:01 pm »

Let me know what you come up with. I'm working on a similar problem (for YADB).

I don't know about adding to the menus, sorry...

j