Wonder if TVDB should generally fix the series name? Someone might not like that, however.
The trouble is that there are several series with the same naming convention. I'm not sure if it is in their policies or standards, but they do consistently use the mechanism to differentiate between remakes or later runs of series of the same name. Not to mention that lots of other applications already account for the convention.
I've been working with Steve Bickell from EPG Collector to improve matching with TheTVDB for Doctor Who initially and now more generally. You can read the latest thread here:
https://sourceforge.net/p/epgcollector/discussion/1125946/thread/d9cfd011/
He has made a few changes, specifically:
1. He used to do a fuzzy match on the Series name and a normal match on the Episode name. That failed for "Doctor Who": in Australia particularly, the EPG includes only "Doctor Who", while TheTVDB calls the series "Doctor Who (2005)". TheTVDB returns all series that match the fuzzy search, but the best match is "Doctor Who", which is the 1960s to 1980s series. EPG Collector would then look for the Episode in that best-matched Series, and fail.
So he changed how he matches the Series. First he does a fuzzy match, which finds all Series that may match "Doctor Who", along with their Episode names. Then he searches for the Episode name in the results, and when a match is found, he selects the Series that includes that Episode name. For current programs, that means he gets "Doctor Who (2005)".
2. He has made quite a few improvements to the matching of Episodes with suffixes and prefixes, such as Doctor Who episodes with storyline part numbers at the end, e.g. (1) and (2) for a two-part story. He has now gone further and made Episode name matching use fuzzy logic as well. That means there is a greater risk of an incorrect match, but initially it looks like I am getting much better matches, and hence much better Season and Episode numbers in my EPG, which in turn means better data in MC when a program is recorded.
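To make the two changes above concrete, here is a rough sketch in Python of how I understand the idea. It is not Steve's code, which I haven't read. The candidate data, the function names, the 0.85 threshold and the use of difflib for the fuzzy comparison are all my own assumptions for illustration; a real lookup would get the candidate Series and their Episode names from TheTVDB.

```python
import re
from difflib import SequenceMatcher

# Hypothetical candidates, standing in for what a fuzzy Series search
# might return: Series name -> list of Episode names.
CANDIDATES = {
    "Doctor Who": ["An Unearthly Child", "The Daleks"],
    "Doctor Who (2005)": ["Rose", "The End of Time (1)", "The End of Time (2)"],
}

# Matches a trailing storyline part number such as " (1)" or " (2)".
PART_SUFFIX = re.compile(r"\s*\(\d+\)\s*$")


def normalise(title):
    """Lower-case a title and drop a trailing part number like ' (1)'."""
    return PART_SUFFIX.sub("", title).strip().lower()


def similarity(a, b):
    """Fuzzy similarity between two titles, from 0.0 to 1.0."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def pick_series(candidates, epg_episode, threshold=0.85):
    """Choose the Series that contains the best-matching Episode name.

    Returns (series, episode, score), or None if no Episode reaches
    the threshold (an assumed cut-off, not a value from EPG Collector).
    """
    best = None
    for series, episodes in candidates.items():
        for episode in episodes:
            score = similarity(epg_episode, episode)
            if score >= threshold and (best is None or score > best[2]):
                best = (series, episode, score)
    return best


# The EPG only says "Doctor Who", so the Episode name decides the Series.
print(pick_series(CANDIDATES, "Rose"))
print(pick_series(CANDIDATES, "An Unearthly Child"))
```

Note that once the part number is stripped, both parts of a two-part story look identical, so a real implementation would have to compare the part number separately to tell them apart. A lower threshold finds more matches but raises the risk of a wrong one, which is the trade-off mentioned in point 2.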
Note that I don't know exactly what Steve has done. Some of it is described in the thread I linked to, and some isn't, I'm sure. I haven't looked at the code, and couldn't make a lot of sense of it if I did, but it is open source, so it is available. The version I am using isn't available yet though, since it hasn't been publicly released.
Note also that at the beginning of that thread I was adding the suffix " (2005)" using EPG Collector functionality, before TheTVDB lookup happened, which was finding matches on the full TVDB name. Now I have taken that edit out and I am still getting matches.
Anyway, as you commented, I thought I would share.
PS:
I have been wondering if anything could be done smarter to try to fix series like Doctor Who, which are often stored with the year after the name.
Actually there is a lot you could do to make TheTVDB lookup smarter generally, rather than just for series like "Doctor Who". I only look up metadata using EPG Collector because it can look up TheTVDB and find the correct Season and Episode numbers for me, and MC insists on having that data before it will look up TheTVDB.
There is absolutely no reason why MC couldn't look up TV program data based on only the Series and Episode names, just as EPG Collector does. If that capability were implemented in MC, a very large part of the issues around EPG data would disappear, since people would no longer need to find an EPG source that includes Season and Episode numbers. Outside of the Microsoft/Rovi and PercData/Gracenote solutions, which means mostly outside the USA, that would make a large difference. It should also help ATSC users who get their EPG OTA in the USA. So there is your challenge (or another one), should you accept it.