INTERACT FORUM


Author Topic: Renaming scheme for files  (Read 4380 times)

ashbel

  • Guest
Renaming scheme for files
« on: December 16, 2011, 05:50:01 am »

I have a lot of audio files that I'm trying to reorganise, and I don't understand the field extractor language.

The format for the file name is:

(Artist) - (Descriptor) dd-mm-yyyy (("Hour #" if applicable))

Descriptor may be more than one word.

Ex.

Airwave - Global DJ Broadcast 01-11-2004 (Hour 2)
John Askew - Trancemission 20-03-2004

I would like the syntax to be:

(Artist) - yyyymmdd (Descriptor) (("Hour #" if applicable))

Ex.

Airwave - 20041101 Global DJ Broadcast (Hour 2)
John Askew - 20040320 Trancemission.

Suggestions for a quick way to do this?

MrC

  • Citizen of the Universe
  • *****
  • Posts: 10462
  • Your life is short. Give me your money.
Re: Renaming scheme for files
« Reply #1 on: December 16, 2011, 12:21:48 pm »

Because the fields vary, the Fill Properties From Filename tool won't work for this.  So you'll use another method.

Within MC, select the files that you want renamed (start with a few to test), and right click, select Library Tools > Rename, Move & Copy Files.  In the pulldown, select Rename... and deselect the Directories checkbox and the Find & Replace checkbox.  You'll just be working with the Filename rule, so select it.  In the pulldown, select Expression Editor... and copy/paste the expression below:

Quote
if(Regex([Filename (name)],
   /#^([^-]+) - (.+?)\s+(\d+)-(\d+)-(\d+)( \(Hour.+\))?#/),
        [R1] - [R5][R4][R3] [R2][R6],
        Filename(,0))

You should now see the preview that shows what will change.  The expression pulls out the fields you want and rearranges them.  It only rearranges filenames that match the pattern; any other filename is left alone.
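
For anyone who wants to try the pattern outside MC first, here is a minimal Python sketch of the same match-or-leave-alone logic. It assumes Python's re module treats this particular pattern the same way MC's regex engine does, and the function name reorder_name is made up for illustration; it is not part of MC.

```python
import re

# Same pattern as in the MC expression, minus the /# #/ quoting.
PATTERN = re.compile(r"^([^-]+) - (.+?)\s+(\d+)-(\d+)-(\d+)( \(Hour.+\))?")

def reorder_name(name):
    """Return the rearranged name, or the name unchanged if it does not match."""
    m = PATTERN.match(name)
    if m is None:
        return name  # plays the role of Filename(,0): leave the file alone
    # default="" turns the missing optional "(Hour #)" group into an empty string
    artist, descriptor, dd, mm, yyyy, hour = m.groups(default="")
    return f"{artist} - {yyyy}{mm}{dd} {descriptor}{hour}"

for sample in [
    "Airwave - Global DJ Broadcast 01-11-2004 (Hour 2)",
    "John Askew - Trancemission 20-03-2004",
    "Some file that does not fit the scheme",
]:
    print(reorder_name(sample))
```

The first two names come out rearranged as requested; the third is printed unchanged.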

Ask if you want an explanation of how and why it works.

Here's the same expression in an expression column showing the before and after results:



Ignore that there is no filename suffix in the Reordered Filename column - the Rename dialog doesn't need it.
The opinions I express represent my own folly.

rick.ca

  • Citizen of the Universe
  • *****
  • Posts: 3729
Re: Renaming scheme for files
« Reply #2 on: December 16, 2011, 04:55:33 pm »

Quote
Ask if you want an explanation of how and why it works.

It's a good example, so that might be helpful.

I'm wondering why you've included the [^-] (i.e.,  "any character that is not a '-' "). It seems to prevent the proper handling of hyphenated Artist names—for no reason I can see. I'm assuming, of course, those would be hyphenated with '-' and not ' - '. :-\

MrC

  • Citizen of the Universe
  • *****
  • Posts: 10462
  • Your life is short. Give me your money.
Re: Renaming scheme for files
« Reply #3 on: December 16, 2011, 05:54:56 pm »

Ok, I'll write up an explanation tonight.

Re: the dash, the forward conversion of a template such as:

  Artist - Descriptor

is easy, since Artist and Descriptor get expanded, and there is no ambiguity.  But going in reverse from expanded to template, we could have input such as:

  Foo - Bar - Anything - Goes

Breaking this apart is ambiguous.  We could let the RE consume as much as possible (greedy) for Artist:

   Foo - Bar - Anything

or be non-greedy by consuming as little as possible:

  Foo

leaving the remainder for Descriptor.

So I tend to write REs in a restrictive fashion, where the rules can later be relaxed as more input cases are seen/known.
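
The three choices can be compared side by side with a small Python sketch. Python's re module is used here only as a stand-in for MC's engine, and the simplified two-field patterns are illustrative assumptions, not the rename expression itself.

```python
import re

text = "Foo - Bar - Anything - Goes"

# Greedy: the first group takes as much as it can before a final " - ".
greedy = re.match(r"^(.+) - (.+)$", text)

# Non-greedy: the first group takes as little as it can.
lazy = re.match(r"^(.+?) - (.+)$", text)

# Restrictive: the first group may not contain a dash at all.
strict = re.match(r"^([^-]+) - (.+)$", text)

print("greedy:", greedy.groups())
print("lazy:  ", lazy.groups())
print("strict:", strict.groups())
```

For this input the non-greedy and restrictive patterns split at the same place; the restrictive one differs in that it refuses to match at all when the first field contains a dash.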

rick.ca

  • Citizen of the Universe
  • *****
  • Posts: 3729
Re: Renaming scheme for files
« Reply #4 on: December 16, 2011, 06:42:08 pm »

Quote
So I tend to write REs in a restrictive fashion, where the rules can later be relaxed as more input cases are seen/known.

Understood. But I'm still curious about how to handle the common situation where ' - ' has been used as a delimiter, but a value might include a hyphenated word (i.e., one that includes '-' without spaces).

MrC

  • Citizen of the Universe
  • *****
  • Posts: 10462
  • Your life is short. Give me your money.
Re: Renaming scheme for files
« Reply #5 on: December 16, 2011, 07:56:35 pm »

Quote
Understood. But I'm still curious about how to handle the common situation where ' - ' has been used as a delimiter, but a value might include a hyphenated word (i.e., one that includes '-' without spaces).

Sorry, missed responding to this...

We'd just remove the restrictive non-dash and modify to:

Quote
if(Regex([Filename (name)],
    /#^(.+) - (.+?)\s+(\d+)-(\d+)-(\d+)( \(Hour.+\))?#/),
        [R1] - [R5][R4][R3] [R2][R6],
        Filename(,0))

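Thus, anything can now match for the Artist name.  Because there is an explicit " - " sequence in the RE, the engine must still find that separator: technically, the initial .+ first grabs as much as possible, and when the remaining portions of the RE then fail to match, the engine backtracks, giving back characters until the rest of the RE matches.  Since .+ is greedy, if the name contains more than one " - ", Artist ends at the last one that still lets the rest of the pattern match.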
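
Here is a rough Python sketch of the difference between the restrictive and the relaxed pattern. It assumes Python's re module backtracks the same way as MC's engine for these patterns, and both filenames are made up for illustration.

```python
import re

STRICT = re.compile(r"^([^-]+) - (.+?)\s+(\d+)-(\d+)-(\d+)( \(Hour.+\))?")
RELAXED = re.compile(r"^(.+) - (.+?)\s+(\d+)-(\d+)-(\d+)( \(Hour.+\))?")

hyphenated = "Jean-Michel Jarre - Live Set 05-10-2004"  # hypothetical filename
ambiguous = "Foo - Bar - Mix 01-02-2004"                # hypothetical filename

# The restrictive pattern rejects an artist containing a dash.
strict_match = STRICT.match(hyphenated)

# The relaxed pattern backtracks from the end until " - " and the rest fit.
relaxed_match = RELAXED.match(hyphenated)
ambiguous_match = RELAXED.match(ambiguous)

print(strict_match)
print(relaxed_match.group(1), "|", relaxed_match.group(2))
print(ambiguous_match.group(1), "|", ambiguous_match.group(2))
```

Note that with the relaxed pattern the second name puts "Foo - Bar" into Artist, which is exactly the ambiguity discussed earlier.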


MrC

  • Citizen of the Universe
  • *****
  • Posts: 10462
  • Your life is short. Give me your money.
Re: Renaming scheme for files
« Reply #6 on: December 16, 2011, 09:34:29 pm »

OK, here's how it works.  Reiterating the expression here:

if(Regex([Filename (name)],
   /#^([^-]+) - (.+?)\s+(\d+)-(\d+)-(\d+)( \(Hour.+\))?#/),
        [R1] - [R5][R4][R3] [R2][R6],
        Filename(,0))

We'll break it into components.  As described in the Expressions wiki page, Regex takes two mandatory arguments, with a third optional mode argument that defaults to 0.  We're using that default mode here - it is a test-only mode, with no output (only captures).

The goal is to match:

   Artist - Descriptor dd-mm-yyyy (Hour #)

where the trailing portion " (Hour #)" is optional.  For now, we'll assume Artist does not contain a dash character.

The regular expression itself is:

   ^([^-]+) - (.+?)\s+(\d+)-(\d+)-(\d+)( \(Hour.+\))?

These can be interpreted as:

  RE                 Description
  ^([^-]+)           starting at the beginning of the name (the ^), match any non-dash character one or more times, and remember that (parens - [R1])
  " - "              match a literal space dash space (quotes used here to see the spaces)
  (.+?)              match one or more characters (the .+), as few as possible (the ?), and remember that (parens - [R2])
  \s+                match one or more whitespace characters
  (\d+)              match one or more digits, and remember that (parens - [R3])
  -                  match a dash
  (\d+)              match one or more digits, and remember that (parens - [R4])
  -                  match a dash
  (\d+)              match one or more digits, and remember that (parens - [R5])
  ( \(Hour.+\))?     optionally, match the group of a space, followed by an opening parenthesis (quoted), followed by Hour, followed by one or more characters, followed by a closing parenthesis (quoted), and remember that (parens - [R6])

The parentheses around the expressions store what was matched into memories.  In MC, these are named [R1], [R2], ... [R9].  We're using 1 through 6 here.
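
To see what ends up in each memory, here is a small Python sketch that prints the captures for one of the example filenames. Python numbers its groups 1 through 6 in the same order; the R1..R6 labels are added here only to mirror MC's names, and Python's re is assumed to capture the same text as MC's engine.

```python
import re

PATTERN = re.compile(r"^([^-]+) - (.+?)\s+(\d+)-(\d+)-(\d+)( \(Hour.+\))?")

m = PATTERN.match("Airwave - Global DJ Broadcast 01-11-2004 (Hour 2)")

# Label Python's groups 1..6 as R1..R6 to mirror MC's capture names.
captures = {f"R{i}": m.group(i) for i in range(1, 7)}

for label, value in captures.items():
    print(f"[{label}] = {value!r}")
```

This shows the day in [R3], the month in [R4], the year in [R5], and " (Hour 2)" with its leading space in [R6], which is why no extra space is needed before [R6] in the output.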

In mode 0 of Regex(), we test only that the expression matches.  If it does, with the help of the If() statement, we'll output the captured [Rn] values in the order that expresses the desired rearrangement.  In this case, it is:

   [R1] - [R5][R4][R3] [R2][R6]

For the case when the regular expression pattern does not match the input file name, we'll just return the filename itself, minus the extension (so the rename tool will make no changes to this file).  We'll employ the Filename() function with mode 0:

  Filename(,0)

Now, only files that match the pattern will be renamed; others will remain unchanged.



The /# and #/ inside Regex() simply tell MC not to try to interpret the contents, since RE syntax conflicts with MC's own language - it is a form of quoting.  These don't have to be used, but without them, you must take care to MC-quote anything where there is a conflict between MC's language and REs.  Just use them.

rick.ca

  • Citizen of the Universe
  • *****
  • Posts: 3729
Re: Renaming scheme for files
« Reply #7 on: December 16, 2011, 10:27:48 pm »

Great walk-through. Thanks.

ashbel

  • Guest
Re: Renaming scheme for files
« Reply #8 on: December 20, 2011, 05:34:46 pm »

Thanks.  This should be in the wiki.

What characters can I *NOT* use as a delimiter that can be used in a filename?  Are there any?  Will "~" work?  If that's the case I can just simply change all of the " - " to " ~ "  and then if the artist name has a hyphen it'll work, eh?

MrC

  • Citizen of the Universe
  • *****
  • Posts: 10462
  • Your life is short. Give me your money.
Re: Renaming scheme for files
« Reply #9 on: December 20, 2011, 05:52:20 pm »

Thanks.

Since these are file / directory renaming rules we're discussing, ultimately you'll have to avoid the reserved characters for the file system.

Aside from those, you can use any characters in the RE; however, if the character is an RE-reserved character defined by the RE language, it will need to be escaped.  See the expression language entry for Regex() and the mentioned TR1 engine link that describes the RE language used by MC.

In general, the special characters are: . ( ) [ ] * + ? \ { } | ^ $ plus - when inside [ ]

To use one, backslash escape it, such as:  \*

You can use ~ freely - it just matches itself (i.e. it is not a meta-character).
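
As a rough illustration of why escaping matters, here is a Python sketch that splits a name on a literal delimiter. re.escape is Python's helper for backslash-escaping; MC has no such function that I know of, so there you would add the backslashes by hand. The helper name split_on_delimiter and the sample names are made up.

```python
import re

def split_on_delimiter(name, delimiter):
    """Split name on a literal delimiter, escaping any RE-special characters."""
    return re.split(re.escape(delimiter), name)

# "~" is not a meta-character, so it works as a delimiter either way.
tilde_parts = split_on_delimiter("Jean-Michel Jarre ~ Live Set", " ~ ")

# "*" is a meta-character: escaped, it is matched literally ...
star_parts = split_on_delimiter("Artist * Title", " * ")

# ... unescaped, " * " means "zero or more spaces, then a space".
unescaped_parts = re.split(" * ", "Artist * Title")

print(tilde_parts)
print(star_parts)
print(unescaped_parts)
```

The unescaped version splits on the runs of spaces instead of on the delimiter, leaving the * behind as its own piece.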