INTERACT FORUM

Topic: Expression to find and remove string + x number of characters from [Name]

Mike Foran

  • World Citizen
  • ***
  • Posts: 212

I'm getting all OCD about getting my Classical music organized. I have copied most of my composers' composition catalog numbers to their own library field, so I can easily sort and identify them by "Op. 1," "Op. 2," etc. What I am trying to do is unify the name of each piece so the format is [Composition]: [Name]. The first step is to remove the string "Op."+x from the name, where x represents a number of characters to the right of "Op." to include. For those with Classical collections, you know that there is no standard formatting, so I'll have to modify this expression depending on the group of files I am applying it to, but I'd like it to be as adaptable as possible, able to remove the phrase plus the additional characters no matter where they are in the [Name] field.

I'm pretty sure it will involve Regex but I'm not great with the syntax of this and I would appreciate some help on it.
Logged

Moe

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 718
  • Hi

Can you list like 3-5 examples of what you have now and how you'd like them to end up?  That will help to figure it out.
Logged

Mike Foran

  • World Citizen
  • ***
  • Posts: 212

Sure, here are some various filenames that I would like to revise in one fell swoop:

Fantasia for piano, chorus and orchestra in C minor, Op. 80 'Choral Fantasy' - Adagio
Missa solemnis in D major, Op. 123 - II. Gloria: Amen, in gloria Dei Patris
Christus am Ölberge (Oratorium) Op. 85 - Arie 'Preist des Erlösers Güte'

In each case, I would like to remove the string "Op. " + the following number. If it were just "Op. " that's an easy fix. But the following number is different, and generally has between 1 and 4 digits. So after running the expression on the name, the results should be:

Fantasia for piano, chorus and orchestra in C minor,  'Choral Fantasy' - Adagio
Missa solemnis in D major,  - II. Gloria: Amen, in gloria Dei Patris
Christus am Ölberge (Oratorium)  - Arie 'Preist des Erlösers Güte'

I can clean up the double spaces and extra dashes on a separate step.

I've been trying to figure it out but I'm not a programmer or scripter. But I thought this would work:

=regex([Name], /#^Op. (\d+)#/, 1)Replace([Name],Op. [R1],)

But it doesn't.
Logged

blgentry

  • Regular Member
  • Citizen of the Universe
  • *****
  • Posts: 8014

You probably want something like:

Code: [Select]
regex([Name],/#(.+)Op.\s\d+\s(.+)#/,-1,0)[R1] [R2]

Put that into an expression column and see how it works.  I tried it on two of your examples and I think I got the intended result.
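For anyone who wants to try this pattern outside Media Center, here is a minimal Python sketch that applies the same regex to the three example names from this thread. It assumes Python's `re` module treats this simple pattern the same way MC's regex engine does, and that `re.IGNORECASE` corresponds to the final `0` argument. `remove_opus` is a made-up helper name, not an MC function, and returning the name unchanged when nothing matches is a choice made for this sketch.

```python
import re

# Same pattern as the MC expression, without the /# ... #/ wrapper.
# Assumption: re.IGNORECASE stands in for MC's case-insensitive setting.
OPUS = re.compile(r"(.+)Op.\s\d+\s(.+)", re.IGNORECASE)


def remove_opus(name):
    """Return the name with 'Op. <number>' cut out, like [R1] [R2] in MC."""
    m = OPUS.search(name)
    if m is None:
        return name  # sketch-only choice: leave non-matching names untouched
    # Group 1 keeps its trailing space, so the result has a double space.
    return f"{m.group(1)} {m.group(2)}"


names = [
    "Fantasia for piano, chorus and orchestra in C minor, Op. 80 'Choral Fantasy' - Adagio",
    "Missa solemnis in D major, Op. 123 - II. Gloria: Amen, in gloria Dei Patris",
    "Christus am Ölberge (Oratorium) Op. 85 - Arie 'Preist des Erlösers Güte'",
]

for n in names:
    print(remove_opus(n))
```

The leftover double spaces and dashes are expected here; they are the cleanup Mike planned to do in a separate step.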

You might want to read through my regex lesson I posted a while back if you feel like you might be writing more of these yourself.  Regex looks insanely weird until you understand it.  It seems like you already have some understanding.  Maybe my little post will help solidify some of the ideas for you.

https://yabb.jriver.com/interact/index.php?topic=97996.0

Good luck.

Brian.
Logged

Mike Foran

  • World Citizen
  • ***
  • Posts: 212

Thank you very much. I did a quick test and this expression worked great. I don't quite understand all the commands in the phrase, but I will read your post on it before I start asking questions. I very much appreciate the help.

Edit: The wiki needs to be edited to direct to your post for the 'regex' entry. It's a great tutorial.
Logged

Mike Foran

  • World Citizen
  • ***
  • Posts: 212

How does one create an Expression Column? I've just been punching my expression into a library field. I can't seem to find any info on setting up an Expression Column, although lots of people say this is the way to do it.
Logged

JimH

  • Administrator
  • Citizen of the Universe
  • *****
  • Posts: 72438
  • Where did I put my teeth?

You can create a custom tag or field.  More here:
https://wiki.jriver.com/index.php/Library_Fields
Logged

Mike Foran

  • World Citizen
  • ***
  • Posts: 212

Ok. So I create a custom library field, make it calculated data, add the expression I want to test, and then create a 'Panes' view with that library field as the column filter?

Edited for clarity.
Logged

Moe

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 718
  • Hi

Looks like blgentry beat me to it. 

Quote
How does one create an Expression Column?

You can make a permanent library field as you described, or you can just make an expression column.  Right click one of your column headers and there's an option "Add Expression Column..."

Logged

Mike Foran

  • World Citizen
  • ***
  • Posts: 212

Aha! Yes, ok I see now. My terminology was wrong so I was looking in the wrong place. Thanks! Super helpful.
Logged

Mike Foran

  • World Citizen
  • ***
  • Posts: 212

Ok, so I read the post, which was super helpful. I got through tons of entries with only some slight edits to the expression here and there for changing context. So if I could take a bit more of your time, please let me see if I have parsed your expression correctly, and then let me follow up with something that I thought would work, but doesn't.

So this: regex([Name],/#(.+)Op.\s\d+\s(.+)#/,-1,0)[R1] [R2]

regex([Name], <-- This inputs the string to be parsed
/# <-- this says 'Regex expression starts here'
(.+) <-- this says, Make R1 = everything from the beginning until...
Op. <-- ... you hit the string 'Op.'
\s <-- ... and a space?
\d+ <-- ... and some undetermined number of digits
\s <-- ... and another space?
(.+) <-- now make R2 = everything after that last space
#/ <-- end of expression
,-1,0) <-- Run in silent mode, which I don't quite understand, and be case insensitive, which I do understand
[R1] [R2] <-- create a new string with the values of R1 and R2
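The breakdown above can be written out as an annotated pattern. This is a rough Python sketch, assuming Python's `re` module reads these pieces the same way MC does; the sample name "Sonata in C, Op. 27 Adagio" is made up for illustration. It also shows one detail the breakdown glosses over: an unescaped `.` matches any single character, not just a period.

```python
import re

# The same pattern, spread out with re.VERBOSE so each piece can carry a comment.
ANNOTATED = re.compile(
    r"""
    (.+)   # group 1, [R1] in MC: one or more characters, as many as possible
    Op     # the letters Op
    .      # ANY single character, not only a literal period
    \s     # one whitespace character
    \d+    # one or more digits
    \s     # one whitespace character
    (.+)   # group 2, [R2] in MC: one or more characters, the rest of the name
    """,
    re.VERBOSE | re.IGNORECASE,
)

m = ANNOTATED.search("Sonata in C, Op. 27 Adagio")
print(m.groups())

# Because "." is a wildcard, "Opx 27" is accepted too.
print(ANNOTATED.search("Sonata in C, Opx 27 Adagio") is not None)

# Escaping the dot (Op\.) restricts the match to a real period.
STRICT = re.compile(r"(.+)Op\.\s\d+\s(.+)", re.IGNORECASE)
print(STRICT.search("Sonata in C, Opx 27 Adagio") is None)
```

For names like the ones in this thread the wildcard rarely causes trouble, but `Op\.` is the safer spelling if a false match ever shows up.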

Ok, so if I have that correct, I am now trying to create an expression to do the opposite. I would like it to extract the composition number from the Name. I have the name:

Quintet for Horn, Violin, 2 Violas & Cello in E flat major KV 407-1.Allegro

And I want it to extract 407, so I have created the expression:

Code: [Select]
=regex([Name],/#.+\KV\s(\d+)\-(.+)#/,1,0)[R1]
And it almost works but returns:

407407

I cannot figure why the composition number has doubled up.
Logged

Moe

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 718
  • Hi

Quote
And it almost works but returns:

407407

I cannot figure why the composition number has doubled up.

It's outputting 407407 because of the 1 in the "#/,1" part of the expression.  That 1 tells regex() to print R1, and since you're also appending [R1] after the function, it gets printed twice.  Change it to #/,-1 (or remove [R1]) and you'll get a single 407.
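The doubling can be mimicked with a toy model in Python. This is only a sketch of the behaviour described in this thread, not MC's actual implementation: `mc_regex_sketch` is a hypothetical helper that assumes mode 1 to 9 prints that capture, mode -1 prints nothing, and the captures stay available either way. The stray backslash before `KV` in the posted pattern is dropped because Python rejects `\K` as an escape.

```python
import re


def mc_regex_sketch(text, pattern, mode):
    """Toy model of regex(): returns (text printed by regex itself, captures).

    Assumption: mode 1..9 prints that capture, mode -1 prints nothing,
    and the captures remain usable as [R1], [R2], ... in both cases.
    """
    m = re.search(pattern, text, re.IGNORECASE)
    if m is None:
        return "", []
    captures = [g or "" for g in m.groups()]
    printed = captures[mode - 1] if 1 <= mode <= len(captures) else ""
    return printed, captures


name = "Quintet for Horn, Violin, 2 Violas & Cello in E flat major KV 407-1.Allegro"
pattern = r".+KV\s(\d+)-(.+)"

# regex(..., 1, 0)[R1]: regex prints capture 1, then [R1] prints it again.
printed, caps = mc_regex_sketch(name, pattern, 1)
print(printed + caps[0])

# regex(..., -1, 0)[R1]: regex prints nothing, [R1] prints it once.
printed, caps = mc_regex_sketch(name, pattern, -1)
print(printed + caps[0])
```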
Logged

Mike Foran

  • World Citizen
  • ***
  • Posts: 212

It's exporting 407407 because of the "#/,1" 1 in this part of the expression, the one is saying print R1 and since you're also telling it to return R1 it's printing it twice.  change it to #/,-1 (or remove [R1]) and you'll  get a singular 407.

Ah ok, thus the 'Silent mode' thing, which still passes the variables on but doesn't print anything, do I have that right?
Logged

Moe

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 718
  • Hi

Correct.  That way you could use the value later in a much more complex expression.
Logged

Mike Foran

  • World Citizen
  • ***
  • Posts: 212

Brilliant. Man I've been using this program for years but there is so much depth that there's always another layer. I just don't discover it until I need it. I'll probably pepper this thread with a few more questions, but I think I have a grasp of it for now. Thanks lads, this round is on me!  ;D
Logged

Moe

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 718
  • Hi

I've been here over 15 years and I probably learn something new every month, if not week.  As an example, today I used the length() and compare() functions for the first time ever. 
Logged

Mike Foran

  • World Citizen
  • ***
  • Posts: 212

Arg, I thought I had it but I don't. Ok, I have this name:

KV 254 Klaviertrio No 1 B-Dur - 01 - Allegro

and this expression:

Code: [Select]
regex([Name],/#KV\s\(d+)\s\.+#/,-1,0)[R1]
and I thought it would return the composition number 254 but it returns nothing. What am I missing?
Logged

Moe

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 718
  • Hi

I highly suggest this site when trying to figure out regex. It's been invaluable to me: https://regex101.com/

If all you want to capture is the number after KV, then you can use this expression:

Code: [Select]
regex([Name],/#KV\s(\d+).+$#/,-1,0)[R1]
Logged

Mike Foran

  • World Citizen
  • ***
  • Posts: 212

I highly suggest this site when trying to figure out regex, it's been invaluable to me https://regex101.com/

Wow, the rabbit hole just got a lot deeper.

if all you want to capture is the number after KV then you can use this expression

Code: [Select]
regex([Name],/#KV\s(\d+).+$#/,-1,0)[R1]

Ok, so I was close. I don't quite get the meaning of the $. The site you linked me to explains it as "asserts position at the end of a line" but I don't know what that means. I'm sorry if I'm taking up too much of your time.
Logged

Moe

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 718
  • Hi

The $ isn't really necessary for what you are doing; I've just been told it's good practice.  Your expression could be as short as this:

Code: [Select]
regex([Name],/#KV\s(\d+)#/,-1,0)[R1]
The $ just indicates the end of a string.
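A quick way to see what `$` does and does not change is to run both forms side by side. This is a small Python sketch, assuming Python's `re` treats `$` like MC's engine does for single-line names; "Klaviertrio KV 254" is a made-up name used only to show a number sitting at the end.

```python
import re

name = "KV 254 Klaviertrio No 1 B-Dur - 01 - Allegro"

# With and without the trailing ".+$" the captured number is the same.
long_form = re.search(r"KV\s(\d+).+$", name, re.IGNORECASE)
short_form = re.search(r"KV\s(\d+)", name, re.IGNORECASE)
print(long_form.group(1), short_form.group(1))

# "$" only matters when something has to sit at the very end of the string.
at_end = re.compile(r"KV\s(\d+)$", re.IGNORECASE)
print(at_end.search(name))  # no match: the name continues after the number
print(at_end.search("Klaviertrio KV 254").group(1))
```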
Logged

JimH

  • Administrator
  • Citizen of the Universe
  • *****
  • Posts: 72438
  • Where did I put my teeth?

This is a great example of how good a forum can be.  Thanks to both of you.  Fun to watch.
Logged

blgentry

  • Regular Member
  • Citizen of the Universe
  • *****
  • Posts: 8014

KV 254 Klaviertrio No 1 B-Dur - 01 - Allegro

and this expression:

Code: [Select]
regex([Name],/#KV\s\(d+)\s\.+#/,-1,0)[R1]
and I thought it would return the composition number 254 but it returns nothing. What am I missing?

First, I'm happy that you are developing your own Regexes and equally happy that my little introductory tutorial helped you out.

Now on this specific expression, Moe has already shown you two ways, both of which seem correct to me.  Nice job Moe.  But I thought I'd point out what I think are the problems with this one.  The first is the backslash before the ".+": "\." means a literal period, so "\.+" only matches a run of periods.  The second is the misplaced backslash in "\(d+)", which should be "(\d+)".  I think if you fix both it will work as you expected it to when you wrote it.
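The posted expression can be checked piece by piece. Here is a sketch in Python, with the caveat that MC's engine may report an invalid pattern differently (in MC the expression simply returned nothing). `try_pattern` is a hypothetical helper written for this example. It shows two separate issues: `\(d+)` escapes the parenthesis instead of the `d`, and `\.+` asks for literal periods.

```python
import re

name = "KV 254 Klaviertrio No 1 B-Dur - 01 - Allegro"


def try_pattern(pattern):
    """Return capture 1, None if nothing matches, or 'error' for an invalid pattern."""
    try:
        m = re.search(pattern, name, re.IGNORECASE)
    except re.error:
        return "error"
    return m.group(1) if m else None


# As posted: "\(" is a literal parenthesis, so the ")" has no partner.
print(try_pattern(r"KV\s\(d+)\s\.+"))
# Group fixed, but "\.+" still demands one or more literal periods.
print(try_pattern(r"KV\s(\d+)\s\.+"))
# Group fixed and the dot left unescaped.
print(try_pattern(r"KV\s(\d+)\s.+"))
```

A backslash turns a special character into a literal one (`\.`, `\(`) and turns certain letters into character classes (`\d`, `\s`), which is why its position matters so much.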

If you really want to get fancy, you might come up with a more general "KV" expression that accounts for several cases at once with optional pieces and parts.  Remember the * operator which allows you to put optional pieces into your regexes. 

Or that might be a big pain in the butt and not worth doing.  Just throwing out some extra ideas.

Good luck!

Brian.
Logged

Moe

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 718
  • Hi

If you want to try something fancier, check this out and play around with it.

https://regex101.com/r/scX71S/1

Pay attention to the "Match Information" section on the right hand side.  You could correlate Group 1., Group 2., Group 3. etc. with [R1], [R2], [R3] in MC

This expression works on both your Op and KV problem.  This was done pretty quick and dirty and could probably be optimized.
Logged

Mike Foran

  • World Citizen
  • ***
  • Posts: 212

Now on this specific expression, Moe has already shown you two ways, both of which seem correct to me.  Nice job Moe.  But I thought I'd point out what I think is the problem with this one.  It's the backslash before the ".+".  I think if you remove the backslash it will work as you expected it to when you wrote it.

Yes, I'm still trying to figure out the syntax on a lot of these. I don't seem to be grasping when I need to separate with the backslash and when I don't need it. However, the website Moe suggested to test expressions is extremely helpful, so if I am stuck on one I can generally paste it over there and discover my error.

Quote
If you really want to get fancy, you might come up with a more general "KV" expression that accounts for several cases at once with optional pieces and parts. 

Right now I kind of bop around using different techniques for whatever is easiest for that particular group of songs. Composition codes in classical music are generally such a sloppy mess that I don't think it would be worth the time to try to figure out a one-size-fits-all expression. For some, reworking the expressions works; for others, just using Find/Replace works fine. With literally thousands of compositions in my collection, I'm trying to juggle what is most efficient. That being said, I'm also enjoying getting more comfortable with regex expressions, so sometimes I just use it because I want to learn.

Quote
Remember the * operator which allows you to put optional pieces into your regexes.

I still don't quite get what the asterisk does. You said in your tutorial "Repeat the thing that comes right before this ZERO or more times" but I don't quite get why you would need to tell it that. In your example you search for ( -) which is a specific string with no repeat modifier. Why would you need to say "stop" ?
Logged

Mike Foran

  • World Citizen
  • ***
  • Posts: 212

This expression works on both your Op and KV problem.  This was done pretty quick and dirty and could probably be optimized.

That's amazing. Great just to be able to look at it in this tool so I can see how the parts relate. Unfortunately, I think trying to find the one-size-fits-all expression might be a fool's errand. Many composers have several differing catalogs depending on who was correlating the work, as well as some pieces being from multiple catalogs, some pieces have additional characters in the number, and the formatting is all over the place. Which is why I'm trying to wrangle it all together of course. I never would have been able to do it without these tutorials! I literally would have spent months doing what is taking a few days now.
Logged

blgentry

  • Regular Member
  • Citizen of the Universe
  • *****
  • Posts: 8014

I still don't quite get what the asterisk does. You said in your tutorial "Repeat the thing that comes right before this ZERO or more times" but I don't quite get why you would need to tell it that. In your example you search for ( -) which is a specific string with no repeat modifier. Why would you need to say "stop" ?

Think of the * as meaning "this might be here or it might not".  Notice in Moe's linked expression that it starts with

(.*)stuff

...where "stuff" is more regex.  That means that this regex will match

A. Some stuff
B. stuff that I like
C. a whole lot of stuff that I like

In case A the first group (R1) is "Some".  In case B R1 is empty because there's nothing before "stuff".   ...and obviously by now, in case C R1 is "a whole lot of".

If you didn't include the (.*) part, then you could not match on Case A and case B and still have a value for R1.  I'm not sure if this is clear or not.  I hope it is.
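The three cases can be run directly to compare `(.*)` with `(.+)`. This is a minimal Python sketch, assuming the same behaviour as MC's engine for these simple patterns. Note that the capture includes the space before "stuff", so case A actually yields "Some " with a trailing space.

```python
import re

cases = ["Some stuff", "stuff that I like", "a whole lot of stuff that I like"]

star = re.compile(r"(.*)stuff")  # zero or more characters before "stuff"
plus = re.compile(r"(.+)stuff")  # at least one character before "stuff"

for text in cases:
    s = star.search(text)
    p = plus.search(text)
    plus_result = repr(p.group(1)) if p else "no match"
    print(repr(text), "| (.*) ->", repr(s.group(1)), "| (.+) ->", plus_result)
```

Only case B differs: with `(.*)` the first group is simply empty, while with `(.+)` the whole regex fails to match.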

Brian.
Logged

Mike Foran

  • World Citizen
  • ***
  • Posts: 212

Think of the * as meaning "this might be here or it might not".  Notice in Moe's linked expression that it starts with

(.*)stuff

...where "stuff" is more regex.  That means that this regex will match

I think I get it. It's basically "optional" stuff? As in, "don't break if nothing is here, but if something is here assign it to a variable." I'll have to play with it when I have a chance later.
Logged

sram16

  • Recent member
  • *
  • Posts: 24

Is it possible to change the column preset in the playback device using an expression? Like if you're playing classical music the columns will change to a custom made classical preset (that displays composers/conductors), whereas if it's nonclassical it will only show artists, etc
Logged

Moe

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 718
  • Hi

Technically speaking, * means match the preceding element zero or more times.

a* = match the letter "a" zero or more times
\d* = match any digit zero or more times

Here is Brian's stuff example in regex101 https://regex101.com/r/LFjOOv/1
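The same idea in runnable form, as a small Python sketch (again assuming Python's `re` behaves like MC's engine here). The second half shows why ` *` is handy for catalog numbers: it makes the space between the prefix and the digits optional. The dot is escaped as `Op\.` in this sketch so that only a real period matches.

```python
import re

# a* : zero or more "a" characters, so even the empty string qualifies.
print([bool(re.fullmatch(r"a*", s)) for s in ["", "a", "aaa", "ab"]])

# " *" between the prefix and the digits accepts zero, one, or many spaces.
opus = re.compile(r"Op\. *(\d+)", re.IGNORECASE)
for text in ["Op.55", "Op. 55", "Op.   55", "op.68-Un Poco"]:
    print(text, "->", opus.search(text).group(1))
```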
Logged

Moe

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 718
  • Hi

Is it possible to change the column preset in the playback device using an expression? Like if you're playing classical music the columns will change to a custom made classical preset (that displays composers/conductors), whereas if it's nonclassical it will only show artists, etc

No, it is not.
Logged

Mike Foran

  • World Citizen
  • ***
  • Posts: 212

technically speaking * means match the value zero or more times.

Ah, great that does make sense!
Logged

sram16

  • Recent member
  • *
  • Posts: 24

No, it is not.
ah that's too bad. That would be a really cool feature and probably make having multiple custom presets usable
Logged

Mike Foran

  • World Citizen
  • ***
  • Posts: 212

Going through my Brahms now and this expression is hitting about 90% of my songs with some occasional punctuation issues that are easily solved with a quick 'Find/Replace' or 'Clean Properties.' I can't thank Brian and Moe enough for your help on this!!

Code: [Select]
=[Composition]: regex([Name],/#(.*)WoO|Op. *\d+(.*)#/,-1,0)[R1] [R2]
Edit: ugh spoke too soon

Edit 2: ignore me
Logged

Mike Foran

  • World Citizen
  • ***
  • Posts: 212

Ok, I could use some help with this. Here is my current best expression:

Code: [Select]
(.*)Op. *\d+(.*)
It works on these, with some slight punctuation errors:
Missa solemnis in D major, Op. 123 - II. Gloria: Amen, in gloria Dei Patris
Christus am Ölberge (Oratorium) Op. 85 - Arie 'Preist des Erlösers Güte'
Op. 55 Klaviertrio No 1 B-Dur - 01 - Allegro
Op.55 Klaviertrio No 1 B-Dur - 01 - Allegro
Klaviertrio No 1 B-Dur - 01 - Allegro, Op.55
Symphony No.1 In C,op.68-Un Poco Sostenuto-Allegro-Meno Allegro
String Quartet No1 In C Minor Op.51,  Allegro

However, when I try to add a second classifier 'WoO' using the | OR symbol, it breaks everything.

Code: [Select]
(.*)Op.|WoO *\d+(.*)
What am I missing?
Logged

Moe

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 718
  • Hi

If you want to match either Op. or WoO then you want to format it like this (Op.|WoO)
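The reason the ungrouped version breaks is that `|` has the lowest precedence in a regex: it splits the whole pattern into "everything on the left" OR "everything on the right". Below is a Python sketch of the difference, assuming the same precedence rules as MC's engine; "Hungarian Dance WoO 1 No. 5" is a made-up name for illustration.

```python
import re

FLAGS = re.IGNORECASE

# Read by the engine as:  (.*)Op.   OR   WoO *\d+(.*)
ungrouped = re.compile(r"(.*)Op.|WoO *\d+(.*)", FLAGS)
# The parentheses confine the OR to the two prefixes.
grouped = re.compile(r"(.*)(Op.|WoO) *\d+(.*)", FLAGS)

op_name = "Missa solemnis in D major, Op. 123 - II. Gloria"
woo_name = "Hungarian Dance WoO 1 No. 5"  # hypothetical example name

# Ungrouped: only one side of the OR takes part, so one capture is always missing.
print(ungrouped.search(op_name).groups())
print(ungrouped.search(woo_name).groups())

# Grouped: text before, the prefix itself, and text after are all captured.
print(grouped.search(op_name).groups())
print(grouped.search(woo_name).groups())
```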

I showed you an example of this earlier https://regex101.com/r/scX71S/1
Logged

Mike Foran

  • World Citizen
  • ***
  • Posts: 212

If you want to match either Op. or WoO then you want to format it like this (Op.|WoO)

I showed you an example of this earlier https://regex101.com/r/scX71S/1

Right, thanks. Ok, so then I would create my new name from [R1] and [R3] to drop out the composition prefix, which has now been captured as [R2], correct?
Logged

Moe

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 718
  • Hi

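You can make a group non-capturing by adding ?: right after its opening parenthesis.  So, if you do (?:Op.|WoO) it will still match either prefix but will not capture that grouping, so the text after the number remains the second capture.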
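The effect on group numbering can be seen by running both variants on one of the names from this thread. This is a Python sketch, assuming MC numbers its [R1], [R2], ... captures the same way Python numbers groups; the dot is escaped as `Op\.` here so that only a real period matches.

```python
import re

capturing = re.compile(r"(.*)(Op\.|WoO) *\d+(.*)", re.IGNORECASE)
non_capturing = re.compile(r"(.*)(?:Op\.|WoO) *\d+(.*)", re.IGNORECASE)

name = "String Quartet No1 In C Minor Op.51,  Allegro"

# Three groups: text before, the prefix, text after.
print(capturing.search(name).groups())

# Two groups: the prefix is matched but not stored, so the text
# after the number moves from group 3 to group 2.
print(non_capturing.search(name).groups())
```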
Logged