INTERACT FORUM
More => Old Versions => Media Center 17 => Topic started by: marko on May 06, 2012, 09:19:28 am
-
like this?
if(regex([camera],/#^([A|E|I|O|U|H])#/,0),[R1],No Vowel)
If that's right, and case insensitivity is the default, why does the above appear to work, but if the letters between the pipes are lower case, it fails?
-
I don't know much about this stuff, but I believe I read in another post that the "0" or "1" in your expression is what switches between case-insensitive and case-sensitive matching.
-
Indeed, "0" is case insensitive, and what the regex should default to if no sensitivity is specified. I did not specify any sensitivity, so was expecting matches despite using lower case in the expression, even though all of the camera values start with a capital letter.
I've made some progress here though, and have a new question...
What's the difference between [A|E|I|O|U|H] and (A|E|I|O|U|H)?
With parentheses, it works as expected, case sensitivity switch and all, but with square brackets, the case sensitivity switch is ignored.
-
The default in MC is case-insensitive (this choice isn't too uncommon).
But there seems to be a bug introduced recently (going back to at least 16.0.169). This should succeed:
regex(A, /#^([aeiou])#/,1,0)
and yet fails (A should be output).
Edit: Apparently, in the TR-1 implementation of regular expressions, case-insensitivity applies only to ordinary characters, not to characters inside a character class, so you'll have to include both cases in your character class.
The constructs below are two different things, the first not doing what you think, the second more expensive and doing more than you think:
[A|E|I|O|U|H] (A|E|I|O|U|H)
[A|E|I|O|U|H]: The character class [ ] needs no alternation. Anything inside is just the list or range of characters that you want the RE to match. So [A|E|I|O|U|H] is equivalent to the simpler [AEIOUH|]. Notice that the pipe char is included (the original included it five times), and inside a character class, it is just an ordinary character. So the character class [AEIOUH|] matches exactly one character, which must be one of A, E, I, O, U, H, or |.
(A|E|I|O|U|H): This is a capture group with alternation. It tries to match the character A, and if that fails, the character E, and if that fails, the character I, ... and if that fails, the character H. For single character matches, always use character classes instead of alternation. Alternation is expensive; avoid it whenever possible. The RE engine must often try all possibilities (though this isn't obvious with such a simple RE). Think of it like driving a mile or two down every street, and when you find the street dead-ends, backtracking and trying the next street. Expensive. The other thing this RE is doing is capturing what was matched, via the parentheses - a capture group. What was matched is remembered, for later use.
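To make the difference concrete, here is a minimal sketch in Python. This uses Python's re module rather than MC's regex engine, so treat it as an analogy only; the sample strings are made up for illustration. It shows that the pipes inside [ ] are literal characters, while the pipes inside ( ) are alternation.

```python
import re

# Character class containing pipes: the pipe is a literal member of the set.
class_with_pipes = re.compile(r"^([A|E|I|O|U|H])")
# Equivalent, simpler spelling of the same set.
class_simplified = re.compile(r"^([AEIOUH|])")
# Alternation inside a capture group: no literal pipe is ever matched.
alternation = re.compile(r"^(A|E|I|O|U|H)")

# Hypothetical sample values, not taken from any real library.
samples = ["Olympus", "|pipe", "Canon"]

# For each sample, record whether each of the three patterns matches.
results = {
    s: (
        bool(class_with_pipes.match(s)),
        bool(class_simplified.match(s)),
        bool(alternation.match(s)),
    )
    for s in samples
}

for sample, matched in results.items():
    print(sample, matched)
# Olympus (True, True, True)
# |pipe (True, True, False)
# Canon (False, False, False)

# The parentheses remember what was matched, for later use.
first_letter = alternation.match("Olympus").group(1)
print(first_letter)  # O
```

Only the "|pipe" sample tells the two constructs apart: both character classes accept the leading pipe, the alternation does not.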
So, once the bug is fixed, you'd want:
if(regex([camera],/#^([aAeEiIoOuUhH])#/),[R1],No Vowel)
-
Thanks again for your time MrC.
A little frustrating too, as "regex(A, /#^([aeiou])#/,1,0)" is what I started with, and when it didn't work, I figured I'd got it wrong again... which I guess is exactly true, but I had no idea how wrong, and headed off in completely the wrong direction looking for the correction.
I do pay attention to your explanations however, so over time, this may get less foggy for me...
-
You're welcome.
I've updated the wiki with a little note in the case-sensitivity option to help us all avoid this hazard:
Note: Case insensitivity does not apply to characters inside a character class [ ]. Use both uppercase and lowercase characters when necessary to match either case (e.g. [aAbB] to match either uppercase or lowercase A or B).
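For anyone building these patterns in a script, here is a rough Python sketch of the same advice: spell out both cases in the character class instead of relying on a case-insensitivity switch. The helper names both_cases and first_letter_or_default are hypothetical, made up for this example. Also note that Python's own re.IGNORECASE flag does apply inside character classes, so this only mirrors the workaround and does not reproduce MC's behaviour.

```python
import re


def both_cases(letters: str) -> str:
    """Return each letter in lowercase then uppercase, e.g. 'ab' -> 'aAbB'."""
    return "".join(ch.lower() + ch.upper() for ch in letters)


# Build the explicit two-case character class used in the expression above.
char_class = "[" + both_cases("aeiouh") + "]"
print(char_class)  # [aAeEiIoOuUhH]

# Anchor at the start and capture the first character, with no case flag set.
pattern = re.compile("^(" + char_class + ")")


def first_letter_or_default(value: str, default: str = "No Vowel") -> str:
    """Return the captured first letter, or the default when nothing matches."""
    match = pattern.match(value)
    return match.group(1) if match else default


print(first_letter_or_default("Olympus"))     # O
print(first_letter_or_default("hasselblad"))  # h
print(first_letter_or_default("Canon"))       # No Vowel
```

Because the class lists both cases explicitly, the result does not depend on how the engine treats case-insensitivity inside [ ].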