INTERACT FORUM


Author Topic: CARNAC/TVDB issues  (Read 5369 times)

lepa

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 2033
CARNAC/TVDB issues
« on: December 11, 2015, 03:59:06 pm »

Could you do some fine-tuning of how Carnac and the TVDB lookup work together? Auto-import currently has problems with series whose names include parentheses or dots: it removes them.

So, for example, Marvel's Agents of S.H.I.E.L.D becomes Marvel's Agents of S H I E L D, and auto-import doesn't find the series information. (Luckily it is a lousy series :) )

Other cases: Doctor Who (2005) becomes Doctor Who 2005, Castle (2009) --> Castle 2009 and so on.

RoderickGI

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 8186
Re: CARNAC/TVDB issues
« Reply #1 on: December 11, 2015, 04:22:30 pm »

That is strange.

I have been doing work on my EPG source and cleaning up recordings I have. I have been specifically using "Doctor Who" and "Marvel's Agents of S.H.I.E.L.D" for some of the work. For some of the testing I even added the " (2005)" to the end of the "Doctor Who" in my EPG source to improve matching.

Anyway, the bottom line is, MC does not remove the dots or parentheses from either series name.

Of course, I'm talking about recorded programs that MC already knows about through the EPG data. Are you importing files sourced from elsewhere that have structured names, which MC, through CARNAC, must rely on to work out what the program is?

Even if you are importing such files, I have a few examples of those for the Marvel program, and they import correctly, are identified correctly, and the Series name is created as "Marvel's Agents Of S.H.I.E.L.D", while the Episode name, Season, and Episode numbers are correct.

So, there must be some difference between our systems if you are seeing that problem. What version of MC are you using? What is the source of the programs? i.e. Are you just importing files with structured names?
What specific version of MC you are running: MC27.0.27 @ Oct 27, 2020 and updating regularly Jim!
MC Release Notes: https://wiki.jriver.com/index.php/Release_Notes
What OS(s) and Version you are running: Windows 10 Pro 64bit Version 2004 (OS Build 19041.572).
The JRMark score of the PC with an issue: JRMark (version 26.0.52 64 bit): 3419
Important relevant info about your environment:
  Using the HTPC as a MC Server & a Workstation as a MC Client plus some DLNA clients.
  Running JRiver for Android, JRemote2, Gizmo, & MO 4Media on a Sony Xperia XZ Premium Android 9.
  Playing video out to a Sony 65" TV connected via HDMI, playing digital audio out via motherboard sound card, PCIe TV tuner

lepa

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 2033
Re: CARNAC/TVDB issues
« Reply #2 on: December 11, 2015, 05:12:17 pm »

Try to autoimport file named "Marvel.Agents.of.S.H.I.E.L.D.S02E19" which has not been tagged or indexed by MC before.

That parentheses thing is a different scenario, as files are often "wrongly" named like "doctor.who.2005.s01e01". So it was incorrect to say that it removes the parentheses. It just doesn't add them when a match is found on TVDB, as MC doesn't overwrite the Series field (I'm not completely sure I would want it to, though).

Hendrik

  • Administrator
  • Citizen of the Universe
  • *****
  • Posts: 10935
Re: CARNAC/TVDB issues
« Reply #3 on: December 11, 2015, 05:23:38 pm »

I have been wondering if anything could be done smarter to try to fix series like Doctor Who, which are often stored with the year after the name.
Carnac of course knows it's importing a series name, so maybe it should just add parentheses if the last part is a year, but I could totally see some series not having that because the year is actually part of the proper name. So... meh!

Maybe just in the TVDB lookup code: if there is no match with the original Carnac name, try adding parentheses, and if there is a match, fix the series name? Suppose that could work.
Wonder if TVDB should generally fix the series name? Someone might not like that, however.
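
A minimal sketch of that fallback, for illustration only. It is not MC's or Carnac's actual code: `KNOWN_SERIES`, `TRAILING_YEAR` and `resolve_series_name` are hypothetical names, the set stands in for a real TheTVDB search, and "Some Show 1999" is an invented title where the year belongs to the proper name.

```python
import re

# Hypothetical stand-in for TheTVDB's catalogue; a real lookup would be a search request.
KNOWN_SERIES = {"Doctor Who", "Doctor Who (2005)", "Castle (2009)", "Some Show 1999"}

# A name that ends in a bare four-digit year, e.g. "Doctor Who 2005".
TRAILING_YEAR = re.compile(r"^(?P<title>.+?)\s+(?P<year>(?:19|20)\d{2})$")

def resolve_series_name(parsed_name, known=KNOWN_SERIES):
    """Return the catalogue name for parsed_name, or None if nothing matches."""
    # 1. Try the name exactly as parsed, so a title whose year is part of
    #    the proper name still wins.
    if parsed_name in known:
        return parsed_name
    # 2. Only on a miss, retry with the trailing year wrapped in parentheses.
    m = TRAILING_YEAR.match(parsed_name)
    if m:
        candidate = f"{m.group('title')} ({m.group('year')})"
        if candidate in known:
            return candidate  # the caller could then fix the Series field
    return None
```

Because the parenthesised form is only tried after the original name fails, a series that really is called "Some Show 1999" is left alone. Whether to write the corrected name back to the Series field stays a separate decision.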
~ nevcairiel
~ Author of LAV Filters

RoderickGI

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 8186
Re: CARNAC/TVDB issues
« Reply #4 on: December 12, 2015, 04:23:03 pm »

Wonder if TVDB should generally fix the series name? Someone might not like that, however.

The trouble is that there are several series with the same naming convention. I'm not sure if it is in their policies or standards, but they do consistently use the mechanism to differentiate between remakes or later runs of series of the same name. Not to mention that lots of other applications already account for the convention.

I've been working with Steve Bickell from EPG Collector to improve matching with TheTVDB for Doctor Who initially and now more generally. You can read the latest thread here: https://sourceforge.net/p/epgcollector/discussion/1125946/thread/d9cfd011/

He has made a few changes, specifically:
1. He used to do a fuzzy match on the Series name, and a normal match on the Episode name. With "Doctor Who", where in Australia particularly the EPG only includes "Doctor Who" while TheTVDB calls the series "Doctor Who (2005)", TheTVDB returns all series that match the fuzzy search, but the best match is "Doctor Who", which is the 1960s to 1980s series. EPG Collector would then try to find the Episode in the best-matched series, and fail.
So he changed how he matched the Series. First he does a fuzzy match, which finds all Series that may match "Doctor Who", along with their Episode names. Then he searches for the Episode name in the results, and when a match is found, he selects the Series that includes that Episode name. For current programs, that means he gets "Doctor Who (2005)".
2. He has made quite a few improvements to the matching of Episodes with suffixes and prefixes, such as Doctor Who episodes with storyline part numbers at the end, i.e. (1) and (2) for a two-part story. He has now gone further and made Episode name matching use fuzzy logic as well. That means that there is a greater risk of making an incorrect match, but initially it looks like I am getting much better matches, hence much better Season and Episode numbers in my EPG, which then means better data in MC when a program is recorded.
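
To illustrate the two-stage idea in point 1, here is a rough sketch. It is not EPG Collector's code, which I haven't read. `CATALOGUE`, `similarity`, `match_series` and the 0.6 threshold are all assumptions, the catalogue is a small hand-made sample rather than a real TheTVDB response, and Python's `difflib` stands in for whatever fuzzy matching is really used.

```python
from difflib import SequenceMatcher

# Hand-made sample: series name -> a few episode names. Illustrative only.
CATALOGUE = {
    "Doctor Who": ["An Unearthly Child", "The Cave of Skulls"],
    "Doctor Who (2005)": ["Rose", "The End of the World"],
    "Castle (2009)": ["Flowers for Your Grave"],
}

def similarity(a, b):
    """Case-insensitive similarity ratio between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def match_series(epg_series, epg_episode, catalogue=CATALOGUE, threshold=0.6):
    """Pick the series that fuzzily matches the name AND contains the episode."""
    # Stage 1: the fuzzy match keeps every plausible series, not just the best one.
    candidates = [n for n in catalogue if similarity(epg_series, n) >= threshold]
    candidates.sort(key=lambda n: similarity(epg_series, n), reverse=True)
    # Stage 2: the episode name decides between the candidates.
    wanted = epg_episode.lower()
    for name in candidates:
        if any(ep.lower() == wanted for ep in catalogue[name]):
            return name
    return None
```

With this sample, an EPG entry of "Doctor Who" with episode "Rose" resolves to "Doctor Who (2005)", even though the plain "Doctor Who" entry is the closer name match. Stage 2 here compares episode names exactly; making that comparison fuzzy as well, as in point 2, would trade some accuracy for more matches.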

Note that I don't know exactly what Steve has done. Some of it is described in the thread I linked to, and some not, I'm sure. I haven't looked at the code, and couldn't make a lot of sense of it if I did, but it is open source, so it is available. Not the version I am using yet though, since it hasn't been publicly released.

Note also that at the beginning of that thread I was adding the suffix " (2005)" using EPG Collector functionality, before TheTVDB lookup happened, which was finding matches on the full TVDB name. Now I have taken that edit out and I am still getting matches.

Anyway, as you commented, I thought I would share.

PS:
I have been wondering if anything could be done smarter to try to fix series like Doctor Who, which are often stored with the year after the name.

Actually there is a lot you could do to make TheTVDB lookup smarter generally, rather than just for series like "Doctor Who". I only look up metadata using EPG Collector because it can look up TheTVDB and find the correct Season and Episode numbers for me, and MC insists on having that data before it will look up TheTVDB.

There is absolutely no reason why MC couldn't look up TV program data based on only the Series and Episode names, just as EPG Collector does. If that capability was implemented in MC, a very large part of the issues around EPG data would disappear, since people would no longer need to find an EPG source that includes Season and Episode numbers. Outside of the Microsoft/Rovi and PercData/Gracenote solutions, which means mostly outside the USA, that would make a large difference. It should also help ATSC users who get their EPG OTA in the USA as well.

So there is your challenge (or another one), should you accept it.  8)

RoderickGI

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 8186
Re: CARNAC/TVDB issues
« Reply #5 on: December 12, 2015, 05:07:18 pm »

Try to autoimport file named "Marvel.Agents.of.S.H.I.E.L.D.S02E19" which has not been tagged or indexed by MC before.

I imported a file named as above into an almost empty test library and yes, it was imported as series name "Marvel Agents of S H I E L D". When the file name is using a "." separator it seems to remove all "." instances.

When I import "Marvel's.Agents.Of.S.H.I.E.L.D.3x04.Devils.You.Know.HDTV.x264-KILLERS.[tvu.org.ru].mp4" it pulls out all the extra data, but also drops the dots and gives the series the name "Marvel's Agents Of S H I E L D". (Then I fixed the two files so they both had the same series name, "Marvel's Agents Of S H I E L D", with the "'s".)

However, if I do a "Get Movie and TV Info..." on one of the episodes, and search using "Marvel's Agents Of  S3E4", I get a match to the correct series. When I select it, TheTVDB ID is saved for the series in MC. From now on, if I import files using that structure, they are matched to the correct series in MC.

But the series still has the wrong name. If I manually change the series name to "Marvel's Agents Of S.H.I.E.L.D." for the existing files I have for the series, then import a new file using the same naming structure, I get a second Series group with the incorrect series name, "Marvel's Agents Of S H I E L D". I don't recall that happening in my main library, but perhaps I just automatically go and correct the series name for newly imported files, as I do for so many other programs.

So I guess the original issue stands. CARNAC is stripping out punctuation that it shouldn't, and the "Get Movie and TV Info..." doesn't use some fuzzy logic or something to find the correct series match.
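
For what it's worth, here is a sketch of one way a parser could treat "." as a separator without flattening acronyms. It is a guess at an approach, not CARNAC's actual logic: `EPISODE_MARKER` and `parse_series` are hypothetical names, and the rule that a run of single-letter tokens is an acronym is only a heuristic.

```python
import re

# A dot-delimited "S02E19" or "3x04" style token marks the end of the series name.
EPISODE_MARKER = re.compile(
    r"\.(?:S(\d{1,2})E(\d{1,3})|(\d{1,2})x(\d{1,3}))(?:\.|$)", re.IGNORECASE
)

def parse_series(filename):
    """Return (series, season, episode), or None if no episode marker is found."""
    m = EPISODE_MARKER.search(filename)
    if not m:
        return None
    season = int(m.group(1) or m.group(3))
    episode = int(m.group(2) or m.group(4))
    words, run = [], []
    # The empty sentinel token flushes a run that ends the series name.
    for tok in filename[:m.start()].split(".") + [""]:
        if len(tok) == 1 and tok.isalpha():
            run.append(tok)  # possible acronym letter
            continue
        if run:
            words.append(".".join(run))  # S, H, I, E, L, D -> "S.H.I.E.L.D"
            run = []
        if tok:
            words.append(tok)
    return " ".join(words), season, episode
```

With this, "Marvel.Agents.of.S.H.I.E.L.D.S02E19" gives the series "Marvel Agents of S.H.I.E.L.D" with season 2, episode 19. The heuristic would wrongly join two genuine one-letter words that sit next to each other, and it can't restore the "'s" that is missing from the file name, so a fuzzy lookup would still be needed to get to the proper TVDB name.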

Matt

  • Administrator
  • Citizen of the Universe
  • *****
  • Posts: 42373
  • Shoes gone again!
Re: CARNAC/TVDB issues
« Reply #6 on: December 14, 2015, 08:07:27 am »

There is absolutely no reason why MC couldn't look up TV program data based on only the Series and Episode names, just as EPG Collector does. If that capability was implemented in MC, a very large part of the issues around EPG data would disappear, since people would no longer need to find an EPG source that includes Season and Episode numbers. Outside of the Microsoft/Rovi and PercData/Gracenote solutions, which means mostly outside the USA, that would make a large difference. It should also help ATSC users who get their EPG OTA in the USA as well.

Next build:
Changed: TheTVDB lookup can lookup tracks that don't have a season or episode number and will now lookup just by the episode name.

It was a little bit of a bear to implement, but I think it should all be good once the next build ships.
Matt Ashland, JRiver Media Center

Hendrik

  • Administrator
  • Citizen of the Universe
  • *****
  • Posts: 10935
Re: CARNAC/TVDB issues
« Reply #7 on: December 14, 2015, 08:28:00 am »

Note that it will only do its job if no season/episode info is set, otherwise it would conflict with the S/E lookup, which is generally favored, if available.
Otherwise common workflows to fix misnumbered eps would break (ie. change episode#, name lookup uses old [Name], nothing worked), or fall through on series with rather generic episode names (ie. Downton Abbey names all their eps "Episode 1", "Episode 2" etc, so S1E1 and S2E1 have the same name)
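
A small sketch of that precedence rule, under stated assumptions. This is not MC's implementation: `EPISODES` and `lookup` are hypothetical, and the table is made-up data shaped like the generic-name case above.

```python
# Hypothetical episode table keyed by (season, episode). Not real TheTVDB data.
EPISODES = {(1, 1): "Episode 1", (1, 2): "Episode 2", (2, 1): "Episode 1"}

def lookup(season, episode, name, episodes=EPISODES):
    """Return the (season, episode) keys a lookup would accept."""
    if season is not None or episode is not None:
        # Season/episode info is set: use it exclusively, never fall back
        # to the name, so renumbering an episode is not undone by its old name.
        key = (season, episode)
        return [key] if key in episodes else []
    if name:
        # Nothing set: match on the episode name alone.
        wanted = name.strip().lower()
        return [k for k, v in sorted(episodes.items()) if v.lower() == wanted]
    return []
```

Here `lookup(None, None, "Episode 1")` returns both `(1, 1)` and `(2, 1)`, which shows why a name-only lookup can't be trusted for series with generic episode names, while `lookup(2, 1, "Episode 1")` returns just `(2, 1)`.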

RoderickGI

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 8186
Re: CARNAC/TVDB issues
« Reply #8 on: December 14, 2015, 07:20:59 pm »

Next build:
Changed: TheTVDB lookup can lookup tracks that don't have a season or episode number and will now lookup just by the episode name.

It was a little bit of a bear to implement, but I think it should all be good once the next build ships.

Excellent. I know it is a bit of a bear, as TheTVDB is a particular beast to work with, with many strange idiosyncrasies.

Of course TheTVDB lookup only happens after a program is recorded, so the additional information isn't available in the Guide, but it is a great start. I meant to come back to this thread and point that out, but forgot.

So the next challenge then, is to do the lookup for the Guide data. That becomes a bigger challenge, as there is a lot more data to look up, and there will be a lot more exceptions to work around. Plus if you were to do that, I may start asking for some user control, perhaps to set rules that said something like: "If the Description exactly matches the Title in the EPG data, look up the correct Description from TheTVDB, and overwrite the existing Description."
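
As a rough illustration of what such a rule might look like, here is a sketch. Nothing like this exists in MC as far as I know: `apply_description_rule` and the `fetch_description` callback are hypothetical, and the Guide entry is modelled as a plain dict.

```python
def apply_description_rule(entry, fetch_description):
    """Replace a Description that merely repeats the Title.

    entry is a dict with "Title" and "Description" keys; fetch_description
    is a hypothetical callable that returns looked-up text or None.
    """
    title = entry.get("Title")
    if title and entry.get("Description") == title:
        found = fetch_description(entry)
        if found:
            # Return a modified copy so the original EPG entry is untouched.
            return {**entry, "Description": found}
    return entry
```

The entry is returned unchanged when the Description already differs from the Title, or when the lookup finds nothing, so the rule never throws away data it can't replace.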

Regardless, I look forward to seeing how this functionality works out, and if it does a good job, I can stop looking up the additional data using EPG Collector. The more that is in MC, the better I think, as long as flexibility isn't lost.

Note that it will only do its job if no season/episode info is set, otherwise it would conflict with the S/E lookup, which is generally favored, if available.
Otherwise common workflows to fix misnumbered eps would break (ie. change episode#, name lookup uses old [Name], nothing worked), or fall through on series with rather generic episode names (ie. Downton Abbey names all their eps "Episode 1", "Episode 2" etc, so S1E1 and S2E1 have the same name)

That makes complete sense, and is the only way to do it.

Although I assume that the new functionality will be exposed if I manually run "Get Movie and TV Info..." ? So I could edit the search criteria and have it search for the correct Series and Episode name, just as I can currently edit the Series name and Season and Episode numbers in a manual lookup? I guess I should just wait and see how you have done it.  :D

Thank you gentlemen. This is a very good step forward for TV users who rely on poor EPG source data.

RoderickGI

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 8186
Re: CARNAC/TVDB issues
« Reply #9 on: December 16, 2015, 09:39:04 pm »

I get a second Series group with the incorrect series name, "Marvel's Agents Of S H I E L D". I don't recall that happening in my main library, but perhaps I just automatically go and correct the series name for newly imported files, as I do for so many other programs.

Well, since writing the above I have recorded another episode OTA of "Marvel's Agents of S.H.I.E.L.D" on my main HTPC and as expected, it was imported into MC with the correct name, since MC used the Series name from the EPG data.

However I also imported another episode from a file with a structured name, and it also imported correctly and used the correct name, without removing the "."s in the "S.H.I.E.L.D." portion of the name.

So the nearly empty local database on my Workstation did the wrong thing and removed the "."s, while my main HTPC did the right thing and retained them. Both had the correct TVDB ID and Series name recorded against earlier programs for the series. Something must be different between the two installations.

I can't for the life of me think what that difference may be though. Sure, there are differences; Media Server not running on the Workstation, ROHQ on the Workstation, etc., but nothing I can think of that would affect this issue. I tried importing a single folder, and auto-importing. Same result.

PS: I am running the latest VS Beta, with the above TVDB fix, but this issue happened previously anyway.

PPS: How do I know if the new functionality has done something, other than just importing EPG data without Season and Episode numbers? Is there some specific record type I could look for in the log? My EPG currently has Season and Episode numbers against many programs, but not all, so the change should be trying to do some lookups, even if it doesn't find a match.