INTERACT FORUM


Author Topic: Need Help With Smart Link to JustWatch  (Read 1367 times)

elprice7345

  • Galactic Citizen
  • ****
  • Posts: 254
Need Help With Smart Link to JustWatch
« on: January 16, 2024, 01:21:18 am »

I'm trying to create a Smart Link that will link movies and TV Shows directly to their respective JustWatch web pages. This will allow me to quickly see where a movie or TV show is available to stream.

I've gotten the base link to work as long as the movie title doesn't contain punctuation marks.

For example, the movie "The Golden Child" needs to be converted to https://www.justwatch.com/us/movie/the-golden-child

In this case, I only need to replace the spaces with hyphens. Simple enough with a Smart Link: https:////www.justwatch.com//us//movie//Replace([Name],/ ,-)

But, JustWatch also wants any punctuation to be replaced with hyphens as well:

"Fast & Furious Presents: Hobbs & Shaw" becomes https://www.justwatch.com/us/movie/fast-and-furious-presents-hobbs-and-shaw

Is there a way to do this using MC's expression language?



zybex

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 2619
Re: Need Help With Smart Link to JustWatch
« Reply #1 on: January 16, 2024, 03:15:53 am »

This should work for most titles:
replace(regex(replace(clean([name],9),&,/ and/ ),/#(\w+)#/,-2),;,-)
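For anyone who wants to follow the steps outside MC, here is a rough Python sketch of what the expression above appears to do. It is an emulation only, under these assumptions: `strip_accents` is a hypothetical stand-in for Clean mode 9, and Python's `\w` (which also matches underscores and non-ASCII letters) may not behave exactly like MC's regex engine. Like the expression, it leaves letter case unchanged.

```python
import re
import unicodedata


def strip_accents(text: str) -> str:
    """Hypothetical stand-in for Clean([Name], 9): drop combining accent marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def justwatch_slug(name: str) -> str:
    """Mirror the nested expression from the inside out."""
    cleaned = strip_accents(name)               # clean([name],9)
    with_and = cleaned.replace("&", " and ")    # replace(...,&,/ and/ )
    words = re.findall(r"\w+", with_and)        # regex(...,/#(\w+)#/,-2): all matches
    return "-".join(words)                      # replace(...,;,-)


print(justwatch_slug("Fast & Furious Presents: Hobbs & Shaw"))
```

Reading the MC expression in the same inside-out order is usually the easiest way to make sense of it.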

elprice7345

  • Galactic Citizen
  • ****
  • Posts: 254
Re: Need Help With Smart Link to JustWatch
« Reply #2 on: January 17, 2024, 02:02:23 am »

Thanks so much @zybex for your help!

I'm still testing, but I have a few questions. Let me see if I understand correctly.

The innermost expression "Replace(Clean([Name],9),&,/ and/ )", removes any accents from [Name] and replaces any "&" with " and ", correct? Why do you need 2 spaces both before and after the "and"?

I'm pretty novice with Regex, but have used it before. Your Regex expression outputs any characters in the range 0 - 9, A - Z, a - z as a semi-colon delimited list. The semi-colons are then replaced with hyphens? The Regex will ignore any other characters like semi-colons, exclamation marks and colons, correct?

Last question: I will also use a similar Smart Link for TV Shows. Some of my series titles include a parenthetical year to distinguish them from other shows with identical titles.

For example: I append the beginning year of a series, "The Americans (2013)" to distinguish it from an identically named series "The Americans (1961)".

JW doesn't append the year to their links, so I would like to remove "(2013)" from the link in the example above.

Can you help with this also?

TIA

zybex

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 2619
Re: Need Help With Smart Link to JustWatch
« Reply #3 on: January 17, 2024, 02:43:31 am »

Quote
The innermost expression "Replace(Clean([Name],9),&,/ and/ )", removes any accents from [Name] and replaces any "&" with " and ", correct? Why do you need 2 spaces both before and after the "and"?
Clean mode 9 removes accents (e.g., replacing á with a).
The expression replaces & with " and " so that Law&Order doesn't become "LawandOrder", which would be wrong. Adding extra spaces is not an issue as the Regex will handle multiple consecutive spaces/symbols as if they're one.
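A small Python sketch of the same idea, in case it helps to see it run. The `hyphenate` helper is hypothetical and only approximates the `\w+` capture plus the final hyphen join; it is not MC code.

```python
import re


def hyphenate(text: str) -> str:
    """Keep only runs of word characters and join them with single hyphens."""
    return "-".join(re.findall(r"\w+", text))


title = "Law&Order"

# Without padding, the words run together.
no_spaces = hyphenate(title.replace("&", "and"))

# With padding, "and" becomes its own word.
with_spaces = hyphenate(title.replace("&", " and "))

# Extra spaces are harmless: "Fast  and  Furious" still yields single hyphens.
padded = hyphenate("Fast & Furious".replace("&", " and "))

print(no_spaces, with_spaces, padded)
```

The doubled spaces in the last case never reach the result, because only the word runs are kept.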

Quote
I'm pretty novice with Regex, but have used it before. Your Regex expression outputs any characters in the range 0 - 9, A - Z, a - z as a semi-colon delimited list. The semi-colons are then replaced with hyphens? The Regex will ignore any other characters like semi-colons, exclamation marks and colons, correct?
Correct. Consecutive symbols and spaces are basically ignored, only words/numbers are captured. Then the captured words are joined with a single hyphen.

Quote
For example: I append the beginning year of a series, "The Americans (2013)" to distinguish it from an identically named series "The Americans (1961)".
A second Regex is needed to remove the Year:
replace(regex(replace(clean(regex([name],/#(.+?)( ?\(\d{4}\))?$#/,1),9),&,/ and/ ),/#(\w+)#/,-2),;,-)
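The new inner Regex can be tried on its own. Below is a minimal Python sketch using the same pattern; `strip_year` is a hypothetical helper name, and Python's `re` module is assumed to treat this pattern the same way MC does.

```python
import re

# Same pattern as the inner Regex(): lazy title, optional " (YYYY)" suffix, end of string.
YEAR_SUFFIX = re.compile(r"(.+?)( ?\(\d{4}\))?$")


def strip_year(name: str) -> str:
    """Return capture group 1, i.e. the title without a trailing (YYYY)."""
    match = YEAR_SUFFIX.match(name)
    return match.group(1) if match else name


print(strip_year("The Americans (2013)"))
print(strip_year("Battlestar Galactica"))
```

Titles without a trailing year pass through unchanged, since the second group is optional.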

This raises a problem though: it looks like JustWatch is not consistent in how it handles the URLs of duplicate names; it only adds the year to the older series, not the latest:
https://www.justwatch.com/us/tv-show/battlestar-galactica-1978
https://www.justwatch.com/us/tv-show/battlestar-galactica

So by removing the year from the name on both of them, they'll both open the page for the latest one.

PS: You can use Zelda to play with expressions.

elprice7345

  • Galactic Citizen
  • ****
  • Posts: 254
Re: Need Help With Smart Link to JustWatch
« Reply #4 on: January 18, 2024, 02:25:54 am »

Thanks @zybex!

Let me play with the expressions and do some testing with ZELDA.

elprice7345

  • Galactic Citizen
  • ****
  • Posts: 254
Re: Need Help With Smart Link to JustWatch
« Reply #5 on: March 15, 2024, 03:51:08 am »

@zybex thanks for all of your help!

You are correct that JustWatch is VERY inconsistent in how they create their URLs, so I'm not sure there's a perfect answer. Without some kind of unique JW identifier, the best I can do is use the most logical expression and go from there.

The movie smart link uses the MC [Name] field which doesn't have the year appended and works unless the title is duplicated. For example: "The Whale" links to "The Whale (2011)" instead of "The Whale (2022)".

I updated my TV Show smart link to remove the appended year, because more series don't have the year appended than do. If a series has a duplicate title, often the duplicate is listed on the JW page in the "People who liked" section.

Again, thanks for your help!

Ed

elprice7345

  • Galactic Citizen
  • ****
  • Posts: 254
Re: Need Help With Smart Link to JustWatch
« Reply #6 on: March 17, 2024, 04:49:11 pm »

One more question: in the following expression, what does (.+?) do?

replace(regex(replace(clean(regex([name],/#(.+?)( ?\(\d{4}\))?$#/,1),9),&,/ and/ ),/#(\w+)#/,-2),;,-)

The 2nd capture group gets the year along with the parentheses.

(.+?) is the 1st capture group, and its characters are what Run Mode 1 returns and what the rest of the expression then works on, but what is this expression trying to match?

zybex

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 2619
Re: Need Help With Smart Link to JustWatch
« Reply #7 on: March 17, 2024, 06:00:12 pm »

This site explains each regex token:
https://regex101.com/r/2rANxz/1

(.+) will capture as many chars as possible before the next regex token match (greedy capture)
(.+?) will capture as few chars as possible before the next regex token match (lazy capture)

Without the lazy capture, the first group would always capture the entire string because the second group is optional.
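The difference is easy to see side by side. Here is a short Python sketch, assuming Python's `re` module handles greedy and lazy quantifiers the same way for this pattern:

```python
import re

name = "The Americans (2013)"

# Lazy: group 1 stops as soon as the optional year group and $ can match.
lazy = re.match(r"(.+?)( ?\(\d{4}\))?$", name)

# Greedy: group 1 takes everything, so the optional year group matches nothing.
greedy = re.match(r"(.+)( ?\(\d{4}\))?$", name)

print(repr(lazy.group(1)), repr(lazy.group(2)))
print(repr(greedy.group(1)), repr(greedy.group(2)))
```

In the greedy case the second group is empty, so the year stays in the first group.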

elprice7345

  • Galactic Citizen
  • ****
  • Posts: 254
Re: Need Help With Smart Link to JustWatch
« Reply #8 on: March 19, 2024, 03:33:55 am »

Thanks @zybex!

elprice7345

  • Galactic Citizen
  • ****
  • Posts: 254
Re: Need Help With Smart Link to JustWatch
« Reply #9 on: April 08, 2024, 05:09:27 pm »

@zybex - another question:

Testing the expression: Replace(Regex(Replace(Clean(Regex([Series],/#(.+?)( ?\(\d{4}\))?$#/,1),9),&,and),/#(\w+)#/,-2),;,-) with any TV Show with a ' in the title, e.g., "Space's Deepest Secrets", the expression replaces the ' with a "-", yielding "Space-s Deepest Secrets" and the link doesn't work.

If the series is "America's National Parks (2015)", the same expression returns "America-s National Parks", but resolves to "Americas National Parks" and the link works.

Any idea what is going on here?

If I adjust the expression to: Replace(Regex(Replace(Replace(Clean(Regex([Series],/#(.+?)( ?\(\d{4}\))?$#/,1),9),&,and),',),/#(\w+)#/,-2),;,-), it produces "Spaces Deepest Secrets" and the link works.

I didn't have to make the change for movies, because all of my movies end in a parenthetical year "(2003)"

zybex

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 2619
Re: Need Help With Smart Link to JustWatch
« Reply #10 on: April 09, 2024, 04:24:49 am »

Quote
If the series is "America's National Parks (2015)", the same expression returns "America-s National Parks", but resolves to "Americas National Parks" and the link works.

Not sure what you mean here - I just tested and the expression produces "America-s National Parks", similar to "Space-s Deepest Secrets".
Your fixed expression seems to handle this case well, so go with it.
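To show why dropping the apostrophe first makes the difference, here is a rough Python sketch. `hyphenate` is a hypothetical helper that approximates the `\w+` capture and hyphen join, and only the straight apostrophe (') is removed, as in the fixed expression.

```python
import re


def hyphenate(text: str) -> str:
    """Keep only runs of word characters and join them with single hyphens."""
    return "-".join(re.findall(r"\w+", text))


title = "Space's Deepest Secrets"

# The apostrophe is not a word character, so it splits "Space" and "s".
before = hyphenate(title)

# Removing it first keeps "Spaces" as one word.
after = hyphenate(title.replace("'", ""))

print(before)
print(after)
```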

To handle other exceptions to the rule, which are bound to happen, I suggest you use the 'expression override' feature of MC. Just create a Calculated field called [JustWatch URL] or [JustWatch ID] with the latest expression and check the 'allow custom data to override the expression' option. Then use the field in the Smart Link. This way, when there's a non-working link, you can just paste the correct one in the field for that title and that's it.

You can adjust the expression to have the full URL or just the ID, depending on your preference:
https:////www.justwatch.com//us//tv-show//Replace(Regex(Replace(Replace(Clean(Regex([Series],/#(.+?)( ?\(\d{4}\))?$#/,1),9),&,and),',),/#(\w+)#/,-2),;,-)
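The override idea amounts to "use the hand-entered value if there is one, otherwise the calculated one". A minimal Python sketch of that logic follows. It is not how MC stores the data: the `overrides` dictionary and the `link_for` helper are hypothetical, and the example ID is the Battlestar Galactica URL from earlier in the thread.

```python
from typing import Dict

BASE_URL = "https://www.justwatch.com/us/tv-show/"


def link_for(series: str, calculated_id: str, overrides: Dict[str, str]) -> str:
    """Prefer a manually entered ID; fall back to the calculated one."""
    return BASE_URL + overrides.get(series, calculated_id)


# Hand-entered fixes for titles where the calculated ID does not work.
overrides = {"Battlestar Galactica (1978)": "battlestar-galactica-1978"}

fixed = link_for("Battlestar Galactica (1978)", "Battlestar-Galactica", overrides)
default = link_for("The Americans (2013)", "The-Americans", overrides)

print(fixed)
print(default)
```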

elprice7345

  • Galactic Citizen
  • ****
  • Posts: 254
Re: Need Help With Smart Link to JustWatch
« Reply #11 on: April 16, 2024, 01:34:44 am »

Thanks @zybex!