INTERACT FORUM
More => Old Versions => JRiver Media Center 21 for Windows => Topic started by: hoyt on December 02, 2015, 04:30:57 pm
-
I'm having a hard time getting something to work in MC that I know works in an online perl regex tester. I'm trying to parse the Description field for NFL games given the data coming into MC. I want to then populate Home Team and Visiting Team. Here's an example of the text:
San Francisco (3-6) at Seattle (4-5). Michael Bennett's 3.5 sacks led the Seahawks to a 20-3 rout of the Niners on Oct. 22, as Seattle allowed a season-low 142 yards. 'Hawks RB Marshawn Lynch is averaging 101.9 ypg with 10 TDs in the last nine meetings.
I then set up this:
regex([Description],/(?<=at)(.*)(?=[\(])/,-1,0)[R2]
I expect to get back ' Seattle '. I would then trim the result. However, I'm just getting a partial output of my regex into an expression column where I'm testing. It shows:
0(.*)(?=[\(]),-1,0)
What am I missing? Thanks!
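For reference, here is a minimal Python sketch of what the pattern is expected to do on the sample text above. It assumes Python's `re` module behaves like the online tester for this particular pattern; it says nothing about how MC parses the expression, and the variable names are only illustrative.

```python
import re

# Sample Description text from above (shortened to the first two sentences).
description = (
    "San Francisco (3-6) at Seattle (4-5). Michael Bennett's 3.5 sacks led "
    "the Seahawks to a 20-3 rout of the Niners on Oct. 22, as Seattle allowed "
    "a season-low 142 yards."
)

# Same pattern as in the MC expression, without any MC-specific delimiters.
# (?<=at) is a lookbehind and (?=[\(]) is a lookahead: both must be present
# around the match, but neither is included in it.
pattern = re.compile(r"(?<=at)(.*)(?=[\(])")

match = pattern.search(description)
home_team_raw = match.group(1) if match else ""
home_team = home_team_raw.strip()  # the "trim" step

print(repr(home_team_raw), repr(home_team))
```

So the pattern itself does return ' Seattle ' on this text, which trims to 'Seattle'.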
-
Without actually analyzing the regex itself, I see one thing wrong:
MC Regex patterns start with /# and end with #/ . So I don't think your pattern is being recognized.
Brian.
-
Interesting, didn't realize that. My Regex for the Visiting Team is this:
RemoveCharacters(Regex([Description],(/^[^\(]+/),1), ,2)
And that works. I added the # after and before the /'s and now the return is just blank. That makes me feel like it's at least trying to eval the expression now though...
-
Ok, that led me in the right direction, now I have:
regex([Description],/#(?<=at)(.*)(?=[\(])#/,-1,0)[R1]
For some reason I was thinking each () would be an R#. But now a description with a second set of parentheses later in the text matches far too much:
Oakland (4-5) at Detroit (2-7). Joique Bell and the Lions' 32nd-ranked ground game meet an Oakland run defense that's allowed 458 yards the last two weeks. Ex-Michigan DB Charles Woodson (5 INTs) looks to add to Detroit's league-high 21 turnovers.
Turns into:
Detroit (2-7). Joique Bell and the Lions' 32nd-ranked ground game meet an Oakland run defense that's allowed 458 yards the last two weeks. Ex-Michigan DB Charles Woodson
I need to figure out how to change that set to match until the first (, not the last, but wanted to at least update that I was getting it working with the #s.
-
Every time I have to do Regex, my head spins. Here's what I ended up with that appears to work how I want:
regex([Description],/#(?<=at)(.+?)(?=[\(])#/,-1,0)[R1]
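The difference between the greedy `(.*)` and the lazy `(.+?)` can be sketched in Python. This is only an illustration with Python's `re` module, on the assumption that MC's regex engine treats greedy and lazy quantifiers the same way; the variable names are made up for the example.

```python
import re

# The Oakland at Detroit description from above.
description = (
    "Oakland (4-5) at Detroit (2-7). Joique Bell and the Lions' 32nd-ranked "
    "ground game meet an Oakland run defense that's allowed 458 yards the "
    "last two weeks. Ex-Michigan DB Charles Woodson (5 INTs) looks to add to "
    "Detroit's league-high 21 turnovers."
)

# Greedy: .* grabs as much as possible, so the lookahead ends up on the
# LAST "(" in the text, the one in "(5 INTs)".
greedy = re.search(r"(?<=at)(.*)(?=[\(])", description).group(1)

# Lazy: .+? grabs as little as possible, so the lookahead ends up on the
# FIRST "(" after "at", the one in "(2-7)".
lazy = re.search(r"(?<=at)(.+?)(?=[\(])", description).group(1)

print(repr(greedy))
print(repr(lazy))
```

The lazy version stops at ' Detroit ', while the greedy one runs all the way to 'Charles Woodson '.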
-
I'm pretty decent with regex, but I don't know or use its full syntax. For example, the (?<=stuff) idiom is new to me. I've never used it and I don't know exactly how it works.
It sounds like you've got it under control. If you need more advice, let me know and I'll try to help.
Edit: I couldn't help myself and had to try it myself. Here's my take:
regex([Description],/#.+ at ([^\s]+)#/,-1,0)[R1]
The "trick" here is the stuff I'm matching on inside the () isn't just a dot character (which can be anything). Instead I'm matching on [^\s]. Which means "match any character except space type characters". So it stops the match when it hits the first space, tab, or other whitespace type character.
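As a rough check of the capture-group approach, here is a small Python sketch. It assumes Python's `re` module handles this pattern the same way MC does, and the variable names are just for illustration.

```python
import re

# Start of the Oakland at Detroit description used earlier in the thread.
description = (
    "Oakland (4-5) at Detroit (2-7). Joique Bell and the Lions' 32nd-ranked "
    "ground game meet an Oakland run defense."
)

# ".+ at " consumes everything up to and including " at "; the parentheses
# then capture the run of non-whitespace characters that follows.
team = re.search(r".+ at ([^\s]+)", description).group(1)

# \S is shorthand for [^\s], so this is the same pattern written shorter.
team_short = re.search(r".+ at (\S+)", description).group(1)

print(team, team_short)
```

Both forms capture 'Detroit' here, with no trimming needed because the spaces sit outside the capture group.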
Seems a little more readable and simple to me... but then again, I wrote it, so it should seem easier to understand for ME. :)
Brian.
-
The (?<=...) is a lookbehind: the match has to come right after that text, but the text itself isn't part of the match. At least, that's how I understood it... This site makes it easy to understand by highlighting and adding notes to the side: https://regex101.com/#pcre
I think I actually need the (.*) because I want to match San Francisco, not just San. Like you said, I got it to work and with regex there are all sorts of ways to do the same thing. I'll keep with what I have until I come across a broadcast that breaks it :)
Thanks!
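One case that may be worth testing is a two-word home team with a visitor whose name contains "at". The Python sketch below uses a made-up description (the first example with the teams swapped), and the last pattern is only a suggested variation, not something from the thread. Whether MC's engine behaves identically is an assumption.

```python
import re

# Hypothetical description: teams swapped so the home team has two words.
description = "Seattle (4-5) at San Francisco (3-6). Made-up text for testing."

# Capturing non-whitespace after " at " stops at the first space.
first_word = re.search(r".+ at ([^\s]+)", description).group(1)

# The lookbehind (?<=at) has no spaces around "at", so it already
# succeeds inside "Seattle" and the match starts too early.
lazy = re.search(r"(?<=at)(.+?)(?=[\(])", description).group(1)

# Suggested variation: require the spaces around "at" and the space
# before "(", which also makes the trim step unnecessary.
spaced = re.search(r"(?<= at )(.+?)(?= \()", description).group(1)

print(repr(first_word), repr(lazy), repr(spaced))
```

On this text the three patterns give 'San', 'tle ' and 'San Francisco' respectively, so anchoring on " at " with its spaces looks like the safer choice.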