INTERACT FORUM

Author Topic: Parsing Dates from fields  (Read 6260 times)

MrC

  • Citizen of the Universe
  • *****
  • Posts: 10462
  • Your life is short. Give me your money.
Parsing Dates from fields
« on: March 21, 2014, 12:28:53 pm »

Edit: this topic was split from its original, where stiv32 asks about parsing dates from various fields to be used for de-duping.  The conversation continues...

Hi stiv32,

I just finished looking at your example tracks and fields.  There is much inconsistency in your data, with various forms of dates in the Filename and/or Name field:

   - MM-DD-YY
   - MM-DD-YYYY
   - YYYY
   - YYYY-MM-DD
   - YYYY-YYYY

And there are some empty Date values.  Some files have no date information at all.  From the example data, Name doesn't add any additional information over Filename.

Given this, MC is probably not the right tool to attempt the initial cleanup because coding an MC expression to do this would be very ugly and non-trivial.   The task is non-trivial also because you have ambiguously abbreviated dates: 10-10-28.  Is that Oct 10, 1928 or Oct 28, 2010?

I'd suggest you use a tool that can parse and rename the file (by using the full file path, looking for all the date variations), and unify dates into a single scheme which you can then use in MC.  And you'll have to manually correct the ambiguous dates.
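To make the ambiguity concrete, here is a minimal Python sketch (not part of MC or any renaming tool) that lists every calendar date a two-digit-year string could mean. The two field orders tried (MM-DD-YY and YY-MM-DD) and the `latest_year` cutoff are assumptions chosen for illustration.

```python
from datetime import date

def interpretations(text, latest_year=2014):
    """List every plausible ISO date for a two-digit-year string such as '10-10-28'."""
    a, b, c = (int(part) for part in text.split("-"))
    # Two assumed field orders: MM-DD-YY and YY-MM-DD.
    orders = [(c, a, b), (a, b, c)]  # each tuple is (yy, month, day)
    found = set()
    for yy, month, day in orders:
        for century in (1900, 2000):
            year = century + yy
            if year > latest_year:
                continue  # assumed cutoff: a recording cannot postdate it
            try:
                found.add(date(year, month, day).isoformat())
            except ValueError:
                pass  # not a real calendar date in this reading
    return sorted(found)

candidates = interpretations("10-10-28")
print(candidates)
print("ambiguous" if len(candidates) > 1 else "unambiguous")
```

Any string that yields more than one candidate is one that would need manual correction.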

There are many different renaming tools available for Windows; I'm not sure what your experience level is, nor what tools you already possess.  I'd use Directory Opus for this, since I already own it and have written many renaming scripts for myself and other folks (example: Dynamic Renamer).  I recently wrote one for a user who needed exactly this ability (grabbing dates from filenames and setting the Windows Date Modified value).  In a Unix/Linux environment, I'd write a Perl script to manage this task.
Logged
The opinions I express represent my own folly.

MrC

  • Citizen of the Universe
  • *****
  • Posts: 10462
  • Your life is short. Give me your money.
Parsing Dates from fields
« Reply #1 on: March 21, 2014, 01:45:38 pm »

I thought about this some more, and I think the pscriptor tool with a new date-detecting scriptlet would be perfect for your needs.  I'm thinking the scriptlet would attempt to find any dates in any fields you want, and populate a field of your choosing with a sortable date (such as yyyy-mm-dd).  Would that be useful?
Logged
The opinions I express represent my own folly.

stiv32

  • World Citizen
  • ***
  • Posts: 162
Parsing Dates from fields
« Reply #2 on: March 21, 2014, 08:38:27 pm »

Yes, I guess; having a specific date format would make it easier to identify it during different processes like finding the duplicates.

Unfortunately I can't write any code. Sometimes I can do some light editing in scripts... I think I will check out Directory Opus :)

But then, how will I be able to preserve the tracks that have a date string in, e.g., the Filename (name) field and delete those without a date?
Logged

MrC

  • Citizen of the Universe
  • *****
  • Posts: 10462
  • Your life is short. Give me your money.
Parsing Dates from fields
« Reply #3 on: March 21, 2014, 09:18:59 pm »

If you're open to using pscriptor, I'll write the scriptlet for you.  It will come in handy for others someday, as it is a nice, generic utility.

But stepping back - have you tried using acoustic fingerprinting on your tracks?  I don't have any idea what the hit rate would be, but if it was reasonably high, de-duping is easy.  Look in the pscriptor thread, reply #47 and #48.  You can use the fingerprint data to tell you which tracks are identical.  There are other tools too, such as Jaikoz.
Logged
The opinions I express represent my own folly.

glynor

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 19608
Parsing Dates from fields
« Reply #4 on: March 21, 2014, 10:11:31 pm »

Quote from: MrC
If you're open to using pscriptor, I'll write the scriptlet for you.  It will come in handy for others someday, as it is a nice, generic utility.

I'm interested.

Very interested, actually.  It could come in handy for cleaning up live recordings too.  Sometimes they encode the date into tags in a wide variety of forms.  It would be sweet to be able to Select the files, Right Click > Send To (External) > Collect date to Field X and have it do it in a standardized way.

Yep.  Count me in.
Logged
"Some cultures are defined by their relationship to cheese."

Visit me on the Interweb Thingie: http://glynor.com/

stiv32

  • World Citizen
  • ***
  • Posts: 162
Parsing Dates from fields
« Reply #5 on: March 22, 2014, 12:21:56 am »

Of course I am open to using pscriptor ;) I am going to download it right now, to see how it works. Whenever you have the script, just tell me what I should do :)

I have thought of using fingerprinting, but I would prefer to do it through JRiver. That way I might be able to preserve the hundreds of hours of work I have put into categorizing my tracks in playlists. I also understand that fingerprinting lets us find duplicates with different bitrates and keep the tracks with the highest bitrate.

Logged

stiv32

  • World Citizen
  • ***
  • Posts: 162
Parsing Dates from fields
« Reply #6 on: March 22, 2014, 12:42:00 am »

I read the other posts as you suggested, MrC. I can see that pscriptor is a pretty powerful tool. I believe that it can be used to solve my entire problem correctly and not just a part of it, as I was trying to do until now.

Is there a possibility that you add to the script a command that will copy-merge the Playlist info of the duplicates (and the other Playstat info) into the track we keep? In that way I will be able to preserve my Playlists as well...  :)

PS. Can it be done for the date strings we talked about as well?
Logged

MrC

  • Citizen of the Universe
  • *****
  • Posts: 10462
  • Your life is short. Give me your money.
Parsing Dates from fields
« Reply #7 on: March 22, 2014, 03:37:42 pm »

Quote from: glynor
I'm interested.

Very interested, actually.  It could come in handy for cleaning up live recordings too.  Sometimes they encode the date into tags in a wide variety of forms.  It would be sweet to be able to Select the files, Right Click > Send To (External) > Collect date to Field X and have it do it in a standardized way.

Yep.  Count me in.

Great.  However, you might want to forget about Send To.  MC opens a new command shell for each file.  While it will work in this case (since this scriptlet only processes a single file at a time, unlike others that do post-processing on all files seen), you'll have to close all those command windows after the fact, and it will be much slower.  (Now if only Send To had a batch mode, that simply called an external script once).

Quote from: stiv32
Is there a possibility that you add to the script a command that will copy-merge the Playlist info of the duplicates (and the other Playstat info) into the track we keep? In that way I will be able to preserve my Playlists as well...  :)

PS. Can it be done for the date strings we talked about as well?

These sound like entirely different features, so they don't make sense here in this scriptlet, which is just a date parser.  Let's separate those out for later.

Here's a movie of the scriptlet in action (download and play the movie).  

   https://dl.dropboxusercontent.com/u/87189402/ParseDate.mp4

The scriptlet has some pre-configured date patterns using date formatting templates.  These are tried in order from strictest to most liberal until a match is found, or until complete failure.  The scriptlet is being run against the sample data you sent me the other day.  There is one case that I'm having trouble with (the 6th one, with the single year value of 1970): the module I'm using is failing.  This may be the second bug I've found in the module.  But I'll get that one worked out.  Edit: year-only values are now detected.

The way the scriptlet works is that it takes the named field, and writes the captured date in YYYY-MM-DD format, so that it can be used for sorting, or easy re-parsing in MC.  I'm currently writing to a test field I have set up in MC, as I don't want to overwrite the input cell.  The easiest way to evaluate multiple fields would be to create an expression column which combines your desired fields into a single string.  Example - here's an expression column named Parse String:

   [Filename (name)] :: [Name] :: [Comment]

and you'll tell pscriptor that's the field you want to examine.  (pscriptor can accept multiple fields as input, but I don't yet have a way to short-circuit processing - in other words, every field would be parsed, and the last one would win).
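The "strictest to most liberal, first match wins" idea can be sketched in a few lines of Python. This is only an illustration, not the scriptlet's actual Perl code or its date module: the pattern list, the regular expressions, and the function name `detect` are all assumptions, and two-digit-year forms are left out because they are ambiguous.

```python
import re
from datetime import date

# Ordered from strictest to most liberal; the first valid match wins.
# Each entry: (label, regex, function mapping a match to (year, month, day)).
PATTERNS = [
    ("YYYY-MM-DD", re.compile(r"(?<!\d)(\d{4})-(\d{2})-(\d{2})(?!\d)"),
     lambda m: (int(m[1]), int(m[2]), int(m[3]))),
    ("MM-DD-YYYY", re.compile(r"(?<!\d)(\d{2})-(\d{2})-(\d{4})(?!\d)"),
     lambda m: (int(m[3]), int(m[1]), int(m[2]))),
    ("YYYY", re.compile(r"(?<!\d)((?:19|20)\d{2})(?!\d)"),
     lambda m: (int(m[1]), None, None)),
]

def detect(text):
    """Return a sortable date string found in text, or None if nothing matches."""
    for _label, regex, to_ymd in PATTERNS:
        for match in regex.finditer(text):
            year, month, day = to_ymd(match)
            if month is None:
                return f"{year:04d}"  # year-only value
            try:
                return date(year, month, day).isoformat()
            except ValueError:
                continue  # digits matched but are not a real date
    return None

# A combined string in the style of the Parse String expression column.
print(detect("track01.mp3 :: Untitled :: recorded 2010-10-28"))
```

Because the whole combined string is scanned once per pattern, a full date anywhere in the string is preferred over a bare year that happens to appear earlier.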

I'd like to get some more sample dates in your real data,  perhaps your file names, or some field such as comment, etc.
Logged
The opinions I express represent my own folly.

stiv32

  • World Citizen
  • ***
  • Posts: 162
Parsing Dates from fields
« Reply #8 on: March 23, 2014, 07:18:54 am »

Could you please be more specific regarding the data you want me to send you? Would you like me to upload a playlist with all the duplicate files I can find through JRiver, or just send a bigger sample of files with dates in any column? Also, how big do you need the sample to be?

I played the movie. I think that in this way my problem has been already solved, because now I can sort the tracks that have dates against those that don't have dates. In that way I can delete the duplicates without dates easier.

I think that the fingerprint can be stored as a string. If I am correct, you could create a custom field Expression Column with each file's fingerprint. That way we could sort on it in JRiver, combined with other desired fields, and delete the duplicates according to multiple criteria (e.g. bitrate, fingerprint, those which have dates, etc.). If that could happen, every time we needed to check for duplicates we would only need to take fingerprints from the new songs we have added to our database.
Logged

MrC

  • Citizen of the Universe
  • *****
  • Posts: 10462
  • Your life is short. Give me your money.
Parsing Dates from fields
« Reply #9 on: March 23, 2014, 12:38:24 pm »

I've gone ahead and posted version 1.07.  Here's the announcement post:

    http://yabb.jriver.com/interact/index.php?topic=85990.msg605098#msg605098

We can address misdetected dates over time.  Here's generally how you'll probably want to use it:

   perl pscriptor.pl -v -c pscriptor-config.txt -E DateParse  -f  'Comment::Date Detected'

Note the -f option above.  This uses the new -f syntax, and specifies that input comes from the Comment field in the view and output goes to the Date Detected field (which is not required to be in the view).  You can use multiple -f options to read/write any number of field pairs.  This was implemented as the standard behavior of -f is to read and write the single named field, and you probably don't want to overwrite the field you're reading from when using detection-type scriptlets.
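As a rough illustration of the `Input::Output` convention just described, here is a small Python sketch. It is not pscriptor's own parsing code (pscriptor is written in Perl); the function name `parse_field_spec` is hypothetical and only mirrors the behavior stated above: a pair reads one field and writes another, while a bare name reads and writes the same field.

```python
def parse_field_spec(spec):
    """Split an 'Input::Output' spec into (input_field, output_field)."""
    source, separator, target = spec.partition("::")
    source, target = source.strip(), target.strip()
    if not source or (separator and not target):
        raise ValueError(f"malformed field spec: {spec!r}")
    # A bare field name means the same field is both read and written.
    return (source, target if separator else source)

print(parse_field_spec("Comment::Date Detected"))
print(parse_field_spec("Playlists"))
```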

Quote from: stiv32
I think that the fingerprint can be stored as a string. If I am correct, you could create a custom field Expression Column with each file's fingerprint. That way we could sort on it in JRiver, combined with other desired fields, and delete the duplicates according to multiple criteria (e.g. bitrate, fingerprint, those which have dates, etc.). If that could happen, every time we needed to check for duplicates we would only need to take fingerprints from the new songs we have added to our database.

Yes, the fingerprint data you'll want to use will be output to an MC field, which you can use for confident de-duping. (there's other info available too).  Create the custom fields you need in MC (Tools > Options > Library & Folders > Manage Library Fields) and configure the scriptlet to use your named fields (and add them to the view if you want to see them updated live).

There's no need to use other fields for the de-duping.  The video you saw shows the value to use - this value is reliable for your de-duping.  With that, you can create a Smartlist that shows only duplicates of that field.
Logged
The opinions I express represent my own folly.

stiv32

  • World Citizen
  • ***
  • Posts: 162
Parsing Dates from fields
« Reply #10 on: March 23, 2014, 01:10:09 pm »

Thank you very much!!!

I am going to work with it right now, and let you know how it goes :)

Is there a possibility of adding another option to pscriptor, which would make it possible for us to merge fields of MC? For example merge the playlist fields in the duplicates or the dates fields etc.

In that way we could delete the duplicated tracks with increased safety, knowing that we don't lose precious info.
Logged

MrC

  • Citizen of the Universe
  • *****
  • Posts: 10462
  • Your life is short. Give me your money.
Parsing Dates from fields
« Reply #11 on: March 23, 2014, 01:20:01 pm »

You're welcome.  I'm around much of today, so you'll likely get fast replies. :-)

Re: playlists - how do you want to define the "group" of files?  Via the selected files only?  Or some subset of those files (i.e. sub-groups)?  If the latter, what will tell pscriptor how to define a group?  Playlists is available as a file list column in MC, so its value is available to pscriptor so long as it is in the view's file list.  For example, the following just prints the playlist value to the console (you can run this to try it):

   perl pscriptor.pl  -c pscriptor-config.txt -f 'Playlists' -e 'print "$file->{filename}: $_\n";'

So long as your fields are in the view, pscriptor can collect and process any of the fields.  It can do so file by file, or it can do so on the entire set of selected files.
Logged
The opinions I express represent my own folly.

glynor

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 19608
Parsing Dates from fields
« Reply #12 on: March 23, 2014, 01:31:10 pm »

Quote from: MrC
Great.  However, you might want to forget about Send To.  MC opens a new command shell for each file.  While it will work in this case (since this scriptlet only processes a single file at a time, unlike others that do post-processing on all files seen), you'll have to close all those command windows after the fact, and it will be much slower.  (Now if only Send To had a batch mode, that simply called an external script once).

I think that's easy enough to work around.

Can I make it so pscriptor does this script on whatever files are currently selected in the active view?  I'm pretty sure from our other discussions that's how lots of your pscriptor scripts work, anyway.

Assuming yes, then you don't need the filename at all, so whatever MC does other than call the script the first time is pretty useless.  So, I'd just wrap it in my own script that checks for a semaphore file, creates one if it doesn't exist, runs the pscriptor script, and then when it returns, removes the semaphore.
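A wrapper along those lines might look like the following Python sketch. It is only one way to do it: the lock file location and the function name `run_once` are hypothetical, and the command passed in would be whatever pscriptor invocation you normally use.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical lock file location; any path all callers agree on would do.
LOCK_PATH = os.path.join(tempfile.gettempdir(), "pscriptor.lock")

def run_once(command, lock_path=LOCK_PATH):
    """Run command unless another instance holds the lock.

    Returns the command's exit code, or None if the run was skipped.
    """
    try:
        # O_CREAT | O_EXCL makes creation atomic: it fails if the file exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return None  # another Send To invocation is already running
    try:
        os.close(fd)
        return subprocess.call(command)
    finally:
        os.remove(lock_path)  # always release the semaphore

# Stand-in command; in practice this would be the perl pscriptor.pl command line.
print(run_once([sys.executable, "-c", "print('scriptlet would run here')"]))
```

One caveat with this design: if the wrapper is killed before the `finally` clause runs, the stale lock file has to be deleted by hand.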
Logged
"Some cultures are defined by their relationship to cheese."

Visit me on the Interweb Thingie: http://glynor.com/

MrC

  • Citizen of the Universe
  • *****
  • Posts: 10462
  • Your life is short. Give me your money.
Parsing Dates from fields
« Reply #13 on: March 23, 2014, 01:36:00 pm »

So long as the chain:

    <Send To item 1>
      <pscriptor's MCWS command to copy clipboard>
      <pscriptor's MCWS command to update fields>
    <Send To item 2>
    ...

works, then it should be fine.
Logged
The opinions I express represent my own folly.

glynor

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 19608
Parsing Dates from fields
« Reply #14 on: March 23, 2014, 01:40:14 pm »

PS. Since I imagine some of your scripts (if not this one) could go on pretty long, it might not be a bad idea for pscriptor to prevent re-entrance itself.
Logged
"Some cultures are defined by their relationship to cheese."

Visit me on the Interweb Thingie: http://glynor.com/

stiv32

  • World Citizen
  • ***
  • Posts: 162
Parsing Dates from fields
« Reply #15 on: March 23, 2014, 01:44:23 pm »

I am thinking of it like this:

I run pscriptor and for each group of files with the same fingerprint it will
merge the fields: playlist, date, custom1, custom2 etc.,
for all the duplicated songs.

So before I de-dup, I will have homogenized data for all the duplicates.
Logged

MrC

  • Citizen of the Universe
  • *****
  • Posts: 10462
  • Your life is short. Give me your money.
Parsing Dates from fields
« Reply #16 on: March 23, 2014, 01:49:14 pm »

Quote from: glynor
PS. Since I imagine some of your scripts (if not this one) could go on pretty long, it might not be a bad idea for pscriptor to prevent re-entrance itself.

Command line programs don't generally do this.  It's up to the caller to behave appropriately, whether it be a script or manual usage.

The only issue w/parallel runs is the clipboard copy.  Once that copy is complete, in order to determine the file list and field properties (and that is very fast), everything else should work fine (so long as MC doesn't have problems with parallel MCWS writes).
Logged
The opinions I express represent my own folly.

MrC

  • Citizen of the Universe
  • *****
  • Posts: 10462
  • Your life is short. Give me your money.
Parsing Dates from fields
« Reply #17 on: March 23, 2014, 01:53:56 pm »

Quote from: stiv32
I am thinking of it like this:

I run pscriptor and for each group of files with the same fingerprint it will
merge the fields: playlist, date, custom1, custom2 etc.,
for all the duplicated songs.

So before I de-dup, I will have homogenized data for all the duplicates.


In that case, it will be very easy to write a post-processing scriptlet that uses a given field and updates one or more other fields.  The MakePlaylist scriptlet uses this concept - it collects all the filenames into Album groupings and then creates a playlist file for each album.  For your scriptlet,  you'll just have to define for me what fields you want updated and how you want them updated.

Since you have the data that defines the groupings, there's no reason to run pscriptor once / grouping.  Instead, you just select all the files and the scriptlet will do the collection / processing for all.
Logged
The opinions I express represent my own folly.

glynor

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 19608
Parsing Dates from fields
« Reply #18 on: March 23, 2014, 02:09:09 pm »

Quote from: MrC
Command line programs don't generally do this.  It's up to the caller to behave appropriately, whether it be a script or manual usage.

The only issue w/parallel runs is the clipboard copy.  Once that copy is complete, in order to determine the file list and field properties (and that is very fast), everything else should work fine (so long as MC doesn't have problems with parallel MCWS writes).

Right, I know.  Yours is just somewhere in between, of course, Perl-guy.  ;)

No worries, I'm happy to wrap it in another script.

I've thought about adding a similar mode to MCFileIngester, where you could select two files in MC and run it to operate on the two files in-place.  But, I think remembering which file comes "first" in the selection would be too tough and prone to errors, so I haven't done it.

It might be quite cool for StackSwap mode though...
Logged
"Some cultures are defined by their relationship to cheese."

Visit me on the Interweb Thingie: http://glynor.com/

MrC

  • Citizen of the Universe
  • *****
  • Posts: 10462
  • Your life is short. Give me your money.
Parsing Dates from fields
« Reply #19 on: March 23, 2014, 02:21:58 pm »

I've been toying with the idea of making it a server app, but I don't know of any good way from MC to trigger a Go event (and I don't want the server to poll).

Maybe we can convince JRiver to implement a batch-mode Send To.

Alternatively, I could implement a different means of communicating MC data.  It could be entirely command line driven, taking a list of FileKey() values (avoids filename quoting and Unicode issues), and a list of fields for reading and writing (both via MCWS also) - hence, no clipboard.

You could get into trouble if you're writing data to a field while another instance is read-modifying that same data too.  These things usually just accept the last-write-wins rule.
Logged
The opinions I express represent my own folly.

stiv32

  • World Citizen
  • ***
  • Posts: 162
Parsing Dates from fields
« Reply #20 on: March 23, 2014, 02:48:26 pm »

Personally, I only need "Date Recorded" to be filled in where it is blank.

But I also need the Playlist field to be merged among the tracks, with all of their fields updated with the merged data. For example, please check the Playlist fields in the attached txt (csv) file.
Logged

MrC

  • Citizen of the Universe
  • *****
  • Posts: 10462
  • Your life is short. Give me your money.
Parsing Dates from fields
« Reply #21 on: March 23, 2014, 03:06:04 pm »

Quote from: stiv32
Personally I need only "Date Recorded" to be updated to the blank fields.

I'm not sure what I'm looking at/for in the file?  I see Date Recorded, and it is empty for all files.

Quote from: stiv32
But I also need the Playlist field to be merged among the tracks and update all of their fields with the merged data. For example, please check the attached txt (csv) file.

Unfortunately for playlists it isn't as simple as just merging the values to the "Playlists" column.  This is a special read-only column that MC provides in the file list.  To add files to a playlist externally involves a much more complicated process via MCWS:

  - obtain the list of all playlists
  - match the name in the Playlists column against that list (to avoid re-adding the file that already exists)
  - for each file, send an MCWS command to add the file to each playlist where it doesn't already exist (very slow - many MCWS calls).
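The bookkeeping behind those steps can be sketched in Python without touching MCWS at all. This is an illustration only: the function name and the input shape (a mapping from filename to the set of playlist names already read from the Playlists column) are assumptions, and no real MCWS call is shown.

```python
def missing_memberships(group_files):
    """Return the (filename, playlist) additions needed within one duplicate group.

    group_files maps each filename to the set of playlists it already belongs to.
    """
    # Every playlist that any file in the group belongs to.
    union = set().union(*group_files.values())
    return sorted(
        (filename, playlist)
        for filename, current in group_files.items()
        for playlist in union - current  # skip playlists the file is already in
    )

group = {
    "a.mp3": {"Milonga", "Favorites"},
    "b.mp3": {"Favorites"},
    "c.mp3": set(),
}
for filename, playlist in missing_memberships(group):
    print(f"add {filename} to {playlist}")
```

Each pair in the result would cost one MCWS request, which is why the approach gets slow for large libraries.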

However, it is easy to populate some other field with the names of all the playlists - I don't know if this is useful to you or not.

I'd need an exact list of fields you want "merged" so I can respond to each of them, as necessary.
Logged
The opinions I express represent my own folly.

stiv32

  • World Citizen
  • ***
  • Posts: 162
Parsing Dates from fields
« Reply #22 on: March 23, 2014, 03:44:25 pm »

I am sorry if the way I wrote it wasn't clear. Despite that, MrC, you have answered my questions concretely.

I don't know whether another field with the names of all the playlists (for each group of songs) would be helpful. I think it would be helpful if I could use it to replace the Playlists field. From what you are writing, this seems to be too troublesome, so let's skip it :)


'Date_Recorded' will be the field where the scriptlet that you wrote for me will place the extracted dates.

So, the only field I need to be filled-merged is 'Date_Recorded'.
Logged

MrC

  • Citizen of the Universe
  • *****
  • Posts: 10462
  • Your life is short. Give me your money.
Parsing Dates from fields
« Reply #23 on: March 23, 2014, 03:55:00 pm »

Quote from: stiv32
So, the only field I need to be filled-merged; is 'Dates Recorded'.

The "filled" part I understand.  Just run pscriptor as shown above using your Dates Recorded field as an output field, such as:

   perl pscriptor.pl -v -c pscriptor-config.txt -E DateParse  -f  'Filename (name)::Dates Recorded'

If you want to look for dates in more fields than just Filename (name), create an expression column in the file list combining your fields and use the column name as the input field.  I mentioned this idea above in reply #7.

As for the "merged" part, I'm not sure I understand which specific things are being merged (since Dates Recorded is empty)
Logged
The opinions I express represent my own folly.

stiv32

  • World Citizen
  • ***
  • Posts: 162
Re: Parsing Dates from fields
« Reply #24 on: March 23, 2014, 04:55:14 pm »

I am trying to run the DateParse script, but when I select one song I get this error:

X:\JRiver\Plugins\pscriptor>perl pscriptor.pl -v -c pscriptor-config.txt -E DateParse  -f Filename (name)::Date_Recorded
File: X:\t\music\classic\3\EdDo04\Orquesta Edgardo Donato Vol 04(1930-1934)\123-Que haces Ciriaco-Luis Diaz(1931)-FALTANTE.mp3
        in(filename): 'X:\t\music\classic\3\EdDo04\Orquesta Edgardo Donato Vol 04(1930-1934)\123-Que haces Ciriaco-Luis Diaz(1931)-FALTANTE.mp3'
        old(filename): <unknown>
        new(filename): '1930'
Uncaught exception from user code:
        MCWS write failed:
        http://localhost:52199/MCWS/v1/File/SetInfo?Value=1930&Field=filename&FileType=Filename&File=X%3A%5Ct%5Cmusic%5Cclassic%5C3%5CEdDo04%5COrquesta%20Edgardo%20Donato%20Vol%2004(1930-1934)%5C123-Que%20haces%20Ciriaco-Luis%20Diaz(1931)-FALTANTE.mp3
        500 Internal server error
        <?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
        <Response Status="Failure" Information="Changes are only accepted from authenticated clients.  Enable authentication on the server in Options &gt; Media Network."/>
        MCUtils::MCWS::send_items('MCUtils::MCWS=HASH(0x1368c74)', 'REF(0x361512c)') called at MCUtils/MCWS.pm line 55
        MCUtils::MCWS::send('MCUtils::MCWS=HASH(0x1368c74)', 'ARRAY(0x243e234)') called at pscriptor.pl line 188





When I choose more tracks I get the following error:


X:\JRiver\Plugins\pscriptor>perl pscriptor.pl -v -c pscriptor-config.txt -E DateParse  -f Filename (name)::Date_Recorded
Uncaught exception from user code:
        Clipboard does not have the Filename column, or clipboard data is not from MC.
        Stopped at pscriptor.pl line 176.


Logged

stiv32

  • World Citizen
  • ***
  • Posts: 162
Re: Parsing Dates from fields
« Reply #25 on: March 23, 2014, 05:02:05 pm »

Also, I read in AcousticFingerprint.pm that I should download codegen.exe from a specific address. When I try to run codegen.exe, I get an error that zlib.dll is missing.
Logged

MrC

  • Citizen of the Universe
  • *****
  • Posts: 10462
  • Your life is short. Give me your money.
Re: Parsing Dates from fields
« Reply #26 on: March 23, 2014, 05:07:15 pm »

Quote from: stiv32
I am trying to run the Dateparse script but when I chose one song I get this error:

X:\JRiver\Plugins\pscriptor>perl pscriptor.pl -v -c pscriptor-config.txt -E Date
...
Uncaught exception from user code:
        MCWS write failed:
        http://localhost:52199/MCWS/v1/File/SetInfo?Value=1930&Field=filename&FileType=Filename&File=X%3A%5Ct%5Cmusic%5Cclassic%5C3%5CEdDo04%5COrquesta%20Edgardo%20Donato%20Vol%2004(1930-1934)%5C123-Que%20haces%20Ciriaco-Luis%20Diaz(1931)-FALTANTE.mp3
        500 Internal server error
        <?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
        <Response Status="Failure" Information="Changes are only accepted from authenticated clients.  Enable authentication on the server in Options &gt; Media Network."/>
...

First fix this.  Enable Authentication in MC's Media Network, and edit the config file setting the values you entered in MC.
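For anyone puzzling over the trace, the failing request is an ordinary URL with the file path percent-encoded. The Python sketch below rebuilds a URL of the same shape as the ones shown in this thread's traces. It only constructs the string and sends nothing; the function name `set_info_url` is hypothetical, the parameter names are copied from the trace, and the choice to leave parentheses unencoded is made purely to match that output.

```python
from urllib.parse import quote, urlencode

def set_info_url(path, field, value, host="localhost", port=52199):
    """Build a File/SetInfo URL shaped like the one in the error trace."""
    query = urlencode(
        [("Value", value), ("Field", field), ("FileType", "Filename"), ("File", path)],
        quote_via=quote,  # encode spaces as %20 rather than '+'
        safe="()",        # the trace leaves parentheses unencoded
    )
    return f"http://{host}:{port}/MCWS/v1/File/SetInfo?{query}"

path = (r"X:\t\music\classic\3\EdDo04\Orquesta Edgardo Donato Vol 04(1930-1934)"
        r"\123-Que haces Ciriaco-Luis Diaz(1931)-FALTANTE.mp3")
print(set_info_url(path, "date_recorded", "1931"))
```

Pasting such a URL into a browser is a quick way to check whether MC accepts the write once authentication is configured.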
Logged
The opinions I express represent my own folly.

MrC

  • Citizen of the Universe
  • *****
  • Posts: 10462
  • Your life is short. Give me your money.
Re: Parsing Dates from fields
« Reply #27 on: March 23, 2014, 05:15:25 pm »

Quote from: stiv32
Also, I read in AcousticFingerprint.pm that I should download codegen.exe from a specific address. When I try to run codegen.exe, I get an error that zlib.dll is missing.

I'm sorry.  I didn't provide you with all the required DLLs.  I've uploaded a zip file which should have all that you need.  Download the same file listed, but in the URL replace the .exe with .zip.
Logged
The opinions I express represent my own folly.

stiv32

  • World Citizen
  • ***
  • Posts: 162
Re: Parsing Dates from fields
« Reply #28 on: March 23, 2014, 05:19:56 pm »

Done. We are here now:

X:\JRiver\Plugins\pscriptor>perl pscriptor.pl -v -c pscriptor-config.txt -E DateParse  -f Filename (name)::Date_Recorded
File: X:\t\music\classic\3\EdDo04\Orquesta Edgardo Donato Vol 04(1930-1934)\123-Que haces Ciriaco-Luis Diaz(1931)-FALTANTE.mp3
        in(filename): 'X:\t\music\classic\3\EdDo04\Orquesta Edgardo Donato Vol 04(1930-1934)\123-Que haces Ciriaco-Luis Diaz(1931)-FALTANTE.mp3'
        old(filename): <unknown>
        new(filename): '1930'
Uncaught exception from user code:
        MCWS write failed:
        http://localhost:52199/MCWS/v1/File/SetInfo?Value=1930&Field=filename&FileType=Filename&File=X%3A%5Ct%5Cmusic%5Cclassic%5C3%5CEdDo04%5COrquesta%20Edgardo%20Donato%20Vol%2004(1930-1934)%5C123-Que%20haces%20Ciriaco-Luis%20Diaz(1931)-FALTANTE.mp3
        500 Internal server error
        <?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
        <Response Status="Failure"/>
        MCUtils::MCWS::send_items('MCUtils::MCWS=HASH(0x1298c74)', 'REF(0x35954c4)') called at MCUtils/MCWS.pm line 55
        MCUtils::MCWS::send('MCUtils::MCWS=HASH(0x1298c74)', 'ARRAY(0x236e234)') called at pscriptor.pl line 188

X:\JRiver\Plugins\pscriptor>


Also, how many files can I select each time I run pscriptor?
Logged

MrC

  • Citizen of the Universe
  • *****
  • Posts: 10462
  • Your life is short. Give me your money.
Re: Parsing Dates from fields
« Reply #29 on: March 23, 2014, 05:23:45 pm »

Is your field actually named (with the underscore):

    Date_Recorded

and not:

   Date Recorded

I don't check validity on the outfield names, since they may not be in the view.

Quote from: stiv32
Also, how many files can I select each time I run pscriptor?

As many as you want, within memory limits (and this is harder to define).
Logged
The opinions I express represent my own folly.

stiv32

  • World Citizen
  • ***
  • Posts: 162
Re: Parsing Dates from fields
« Reply #30 on: March 23, 2014, 05:30:42 pm »

Yes, it is named Date_Recorded and I have added it in view.
In detail I have added the following information.

Display, Date_Recorded
Flags, Audio;Image;Video;Data;TV

Data,
User Data,
Data, Type String
Relational, Not relational (store one value for each file)
Edit Type, Standard
Also checked is, Save in file tags
.......................................................

About the fingerprinting: should I create all the 'En Artist', 'En Name', etc. Expression fields and show them as columns in the MC view window?
Logged

MrC

  • Citizen of the Universe
  • *****
  • Posts: 10462
  • Your life is short. Give me your money.
Re: Parsing Dates from fields
« Reply #31 on: March 23, 2014, 05:33:34 pm »

Oh, here's your problem:

X:\JRiver\Plugins\pscriptor>perl pscriptor.pl -v -c pscriptor-config.txt -E DateParse  -f Filename (name)::Date_Recorded

You need double quotes around this argument since it has a space.  Use:

  ... -f "Filename (name)::Date_Recorded"
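Why the missing quotes matter can be shown with Python's `shlex`, used here only as a stand-in: cmd.exe has its own quoting rules, but for this command line the whitespace splitting comes out the same. The shortened command strings and the helper name `field_argument` are illustrative.

```python
import shlex

unquoted = 'perl pscriptor.pl -E DateParse -f Filename (name)::Date_Recorded'
quoted = 'perl pscriptor.pl -E DateParse -f "Filename (name)::Date_Recorded"'

def field_argument(command_line):
    """Return the single argument that follows -f after shell-style splitting."""
    argv = shlex.split(command_line)
    return argv[argv.index("-f") + 1]

print(field_argument(unquoted))  # only the text up to the first space
print(field_argument(quoted))    # the whole Input::Output pair
```

Without the quotes, -f receives just the first word, which is consistent with the earlier traces reading and writing the filename field instead of Date_Recorded.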

Quote from: stiv32
About the fingerprinting: should I create all the 'En Artist', 'En Name', etc. Expression fields and show them as columns in the MC view window?

You need to create each field you want values for. Obviously, fields that don't exist won't get values. You can use the fields listed in the scriptlet, or edit and modify the names on the right side of the table.  See the directions at the top of the scriptlet.
Logged
The opinions I express represent my own folly.

stiv32

  • World Citizen
  • ***
  • Posts: 162
Re: Parsing Dates from fields
« Reply #32 on: March 23, 2014, 05:38:59 pm »

I tried with the double quotes:

X:\JRiver\Plugins\pscriptor>perl pscriptor.pl -v -c pscriptor-config.txt -E DateParse  -f "Filename (name)::Date_Recorded"
File: X:\t\music\classic\3\EdDo04\Orquesta Edgardo Donato Vol 04(1930-1934)\123-Que haces Ciriaco-Luis Diaz(1931)-FALTANTE.mp3
        in(filename (name)): '123-Que haces Ciriaco-Luis Diaz(1931)-FALTANTE.mp3'
        old(date_recorded): ''
        new(date_recorded): '1931'
Uncaught exception from user code:
        MCWS write failed:
        http://localhost:52199/MCWS/v1/File/SetInfo?Value=1931&Field=date_recorded&FileType=Filename&File=X%3A%5Ct%5Cmusic%5Cclassic%5C3%5CEdDo04%5COrquesta%20Edgardo%20Donato%20Vol%2004(1930-1934)%5C123-Que%20haces%20Ciriaco-Luis%20Diaz(1931)-FALTANTE.mp3
        500 Internal server error
        <?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
        <Response Status="Failure"/>
        MCUtils::MCWS::send_items('MCUtils::MCWS=HASH(0x12b8cb4)', 'REF(0x357c1e4)') called at MCUtils/MCWS.pm line 55
        MCUtils::MCWS::send('MCUtils::MCWS=HASH(0x12b8cb4)', 'ARRAY(0x237acc4)') called at pscriptor.pl line 188

X:\JRiver\Plugins\pscriptor>


I have to go for a few hours. I'll check back later. Thank you, Mr C, for all the help!

MrC

Re: Parsing Dates from fields
« Reply #33 on: March 23, 2014, 06:36:41 pm »

I'm at a loss as to why MCWS is failing for you.   I tried essentially the same thing here, with the same filename and path (as much as possible here) and it works fine.

When you return, can you verify the version of MC you are using?  Two MCWS bugs have been fixed recently.

stiv32

Re: Parsing Dates from fields
« Reply #34 on: March 24, 2014, 01:33:17 am »

That was it! I updated to the latest version and everything works great now.

About the merging script: I was talking about this new Date_Recorded field.

If the duplicates that I delete have a value in the Date_Recorded field, the value will be lost.

So, can this value be merged into the blank fields of the other duplicates?

The criterion could be the fingerprint of each track.

That way, tracks with the same fingerprint will have the same Date_Recorded value.
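A rough Python sketch of such a merge, purely as an illustration of the idea: the record layout (the `fingerprint` and `date_recorded` keys) is hypothetical, and this is not how pscriptor works internally. It only fills blank values and never overwrites an existing date.

```python
from collections import defaultdict


def fill_blank_dates(tracks):
    """Copy a known date to blank entries that share the same fingerprint."""
    groups = defaultdict(list)
    for track in tracks:
        groups[track["fingerprint"]].append(track)

    for group in groups.values():
        # Collect the non-empty dates found among the duplicates.
        known = [t["date_recorded"] for t in group if t["date_recorded"]]
        if not known:
            continue  # nothing to copy for this fingerprint
        for t in group:
            if not t["date_recorded"]:
                t["date_recorded"] = known[0]
    return tracks


# Hypothetical sample data: two duplicates and one unrelated track.
tracks = [
    {"name": "a.mp3", "fingerprint": "fp1", "date_recorded": "1931"},
    {"name": "b.mp3", "fingerprint": "fp1", "date_recorded": ""},
    {"name": "c.mp3", "fingerprint": "fp2", "date_recorded": ""},
]
fill_blank_dates(tracks)
```

After the call, `b.mp3` carries the date `1931` from its duplicate, while `c.mp3` stays blank because no track with its fingerprint has a date. If duplicates disagree on the date, this sketch simply takes the first one it sees; a real merge would need a rule for that case.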

MrC

Re: Parsing Dates from fields
« Reply #35 on: March 24, 2014, 02:39:52 am »

That's good news.

I'll address your other question in the morning.

stiv32

Re: Parsing Dates from fields
« Reply #36 on: March 24, 2014, 06:15:00 am »

Thank you very much. I believe I will have completed the date extraction process by then, so I will be able to use it immediately  :D

I should note that, after some tests that I ran, I believe the problem wasn't the MC version but the portable installation.

Also, there is a problem with the results that I get. Some of the dates are returned as 2019-mm-dd. I attach a sample so that you can see what is happening.


stiv32

Re: Parsing Dates from fields
« Reply #37 on: March 24, 2014, 11:39:34 am »

Here is the attachment (test3.txt (csv))

MrC

Re: Parsing Dates from fields
« Reply #38 on: March 24, 2014, 01:31:27 pm »

I need to know which field(s) you are evaluating, and which entry is causing the 2019-mm-dd problem.

Also, in the future, if you could export this test data in MPL format, it would be easier for me to create sample data to test with.

Edit: never mind the request to output to MPL.  I've updated my csv2mpl program to handle this correctly.

stiv32

Re: Parsing Dates from fields
« Reply #39 on: March 24, 2014, 03:58:24 pm »

I am sorry about the csv file. I have attached here a txt (mpl) sample.

I am evaluating the Filename (name) field. All the entries in the sample I sent you cause the 2019-mm-dd problem, except for the last three, which come out as 2020-mm-dd and 4335-mm-dd.

MrC

Re: Parsing Dates from fields
« Reply #40 on: March 24, 2014, 04:21:42 pm »

No problem.  CSV is fine*.  However, you should just keep the extension (.csv or .mpl) and ZIP the file to attach here.  That way, I don't have to change it each time and I get the raw file.

Anyway, you have a date which looks like:

   34335-2-1938.mp3

This is one of those ambiguous cases.  It clearly appears like it contains a date.  But which is the date part?  Is it:

   3[4335-2-19]38.mp3
   343[35-2-19]38.mp3
   3433[5-2-1938].mp3

We don't know the y-m-d ordering, nor the maximum year to accept as valid (4335 is a valid year, but should be rejected here since all of our dates should be in the past).   I suppose I can try to reject years beyond today's year value, so the best match would be the stricter 5-2-1938 (as opposed to the less strict 35-2-19 date).
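A minimal Python sketch of that year filter, under stated assumptions: the three patterns, the day and month ranges, and the POSIX-style expansion of two-digit years (00-68 become 2000-2068, 69-99 become 1969-1999) are all guesses for illustration, not the scriptlet's actual code. Because this sketch limits days to 1-31, the loose two-digit-year reading comes out as 5-2-19 here.

```python
import re
from datetime import date

# Assumed value ranges; two-digit alternatives come first so "19" beats "1".
DAY = r"(?P<d>[12]\d|3[01]|0?[1-9])"
MONTH = r"(?P<m>1[0-2]|0?[1-9])"

# Hypothetical candidate readings, with no guard against neighbouring digits.
CANDIDATES = [
    ("y-m-d", re.compile(rf"(?P<y>\d{{4}})-{MONTH}-{DAY}")),
    ("d-m-yyyy", re.compile(rf"{DAY}-{MONTH}-(?P<y>\d{{4}})")),
    ("d-m-yy", re.compile(rf"{DAY}-{MONTH}-(?P<y>\d{{2}})")),
]


def expand_year(digits):
    """Expand a two-digit year using the POSIX %y convention (an assumption)."""
    year = int(digits)
    if len(digits) == 2:
        year += 1900 if year >= 69 else 2000
    return year


def candidate_dates(text, max_year=None):
    """Return (kept, rejected) readings; a reading is rejected if its year is in the future."""
    if max_year is None:
        max_year = date.today().year
    kept, rejected = [], []
    for label, pattern in CANDIDATES:
        match = pattern.search(text)
        if match is None:
            continue
        found = (label, expand_year(match["y"]), int(match["m"]), int(match["d"]))
        (kept if found[1] <= max_year else rejected).append(found)
    return kept, rejected


# max_year is fixed to 2014 so the example does not depend on today's date.
kept, rejected = candidate_dates("34335-2-1938.mp3", max_year=2014)
print("kept:", kept)
print("rejected:", rejected)
```

With the cutoff at 2014, the 4335 and 2019 readings are rejected and only the 1938 reading survives.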

* Since I had your CSV file already, I decided to greatly improve an existing csv2mpl script I had to make it more general, and to add the features I need to convert and transform a CSV into an importable MPL.  I'll probably post it someday.

stiv32

Re: Parsing Dates from fields
« Reply #41 on: March 24, 2014, 04:42:25 pm »

You are right! In different cases each of those choices would be correct. Unfortunately, most of the results I've got have 2019 instead of the correct year. Anyway, the most important thing here is the year, and then the month.

It might help to know that here (Greece) and in Argentina, from where I have downloaded many of my songs, the usual date format is dd-mm-yy.

ps. I posted my fingerprinting questions in a new topic. I hope that was the correct thing to do.

MrC

Re: Parsing Dates from fields
« Reply #42 on: March 24, 2014, 06:42:10 pm »

I'm working on some changes to better detect the dates.  I've already added the code to restrict dates to no later than this year, and may add more for month and days as well.

I'll show you later how to adjust which dates are examined first.  You'll have to be careful, as you may end up getting too many incorrect dates.

stiv32

Re: Parsing Dates from fields
« Reply #43 on: March 24, 2014, 06:43:39 pm »

Alright, thanks. I should note that there are many files with dates that cannot be mistaken.

For example, the Filename (name) is
Maula - Rosita Quiroga - (16-04-1952).mp3

and gives

Date_Recorded: 2019-04-16


MrC

Re: Parsing Dates from fields
« Reply #44 on: March 24, 2014, 08:30:28 pm »

Here's an updated DateParse scriptlet.  Unzip and install into the scriptlets directory.

I've decided that these dates:

   34335-2-1938

are going to be treated as:

   2-1938

as the leading 34335 is arbitrary junk.  In fact, more specifically, I'm disallowing a digit from butting up next to a date component (the 3433 butts up against the 5, which could be part of the date).  The reason is that this:

   123412-2-1938

makes it impossible to know if the month is 1 or 12, so I feel it is better to ignore that leading junk and go with 2-1938, and default to using day = 1, resulting in 1938-2-1.
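The rule can be sketched in Python roughly as follows. The `(?:^|\D)` and `(?:\D|$)` guards mirror the ones used in the scriptlet's format list; the day and month sub-patterns and the function name are assumptions made for this illustration.

```python
import re

# Guards: a date component may not touch another digit on either side.
GUARDED_DMY = re.compile(
    r"(?:^|\D)(?P<d>[12]\d|3[01]|0?[1-9])-(?P<m>1[0-2]|0?[1-9])-(?P<y>\d{4})(?:\D|$)"
)
GUARDED_MY = re.compile(
    r"(?:^|\D)(?P<m>1[0-2]|0?[1-9])-(?P<y>\d{4})(?:\D|$)"
)


def parse_guarded(text):
    """Try day-month-year first, then month-year with the day defaulting to 1."""
    match = GUARDED_DMY.search(text)
    if match:
        return int(match["y"]), int(match["m"]), int(match["d"])
    match = GUARDED_MY.search(text)
    if match:
        return int(match["y"]), int(match["m"]), 1
    return None


junk_prefix = parse_guarded("34335-2-1938.mp3")
longer_junk = parse_guarded("123412-2-1938")
clean_date = parse_guarded("Maula - Rosita Quiroga - (16-04-1952).mp3")
print(junk_prefix, longer_junk, clean_date)
```

Both junk-prefixed names fall through to the month-year pattern and yield 1938-2-1, while the cleanly delimited 16-04-1952 is read in full.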

It might help to know that here (Greece) and in Argentina, from where I have downloaded many of my songs, the usual date format is dd-mm-yy.

Regarding your preferred date formats: in the scriptlet, you'll find an array:

Code:
my @date_formats = (
    # strict, common formats first
    '(?:^|\D)%Y-%m-%d(?:\D|$)', # YYYY-MM-DD
    '(?:^|\D)%m-%d-%Y(?:\D|$)', # MM-DD-YYYY
    '(?:^|\D)%d-%m-%Y(?:\D|$)', # DD-MM-YYYY
    ...

You are welcome to move items around, but I will warn you: you might find some dates are incorrectly parsed.  This has to do with how dates are matched, from strictest to loosest.  You'll find that dd-mm-yy is near the bottom of the list.  Move at your own peril.
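To illustrate why the order matters, here is a small Python sketch of "first matching format wins", using the three formats shown above. The regex expansions of %Y, %m and %d, the helper names, and the sample filename are assumptions for the illustration; the scriptlet itself is Perl and may expand the placeholders differently.

```python
import re

# Assumed expansions of the placeholders used in the format strings.
TOKENS = {
    "%Y": r"(?P<Y>\d{4})",
    "%m": r"(?P<m>1[0-2]|0?[1-9])",
    "%d": r"(?P<d>[12]\d|3[01]|0?[1-9])",
}


def compile_format(fmt):
    """Turn a format such as '%d-%m-%Y' (plus guards) into a compiled regex."""
    for token, pattern in TOKENS.items():
        fmt = fmt.replace(token, pattern)
    return re.compile(fmt)


def first_match(text, formats):
    """Return (year, month, day) from the first format that matches, else None."""
    for fmt in formats:
        match = compile_format(fmt).search(text)
        if match:
            return int(match["Y"]), int(match["m"]), int(match["d"])
    return None


YMD = r"(?:^|\D)%Y-%m-%d(?:\D|$)"
MDY = r"(?:^|\D)%m-%d-%Y(?:\D|$)"
DMY = r"(?:^|\D)%d-%m-%Y(?:\D|$)"

listed_order = [YMD, MDY, DMY]      # the order shown in the array
day_first_order = [YMD, DMY, MDY]   # DD-MM-YYYY moved above MM-DD-YYYY

ambiguous = "Some Tango - (03-04-1952).mp3"  # hypothetical filename
month_first = first_match(ambiguous, listed_order)
day_first = first_match(ambiguous, day_first_order)
print(month_first, day_first)
```

With the listed order, 03-04-1952 is read as March 4; with DD-MM-YYYY moved up, the same name is read as April 3. A date such as 16-04-1952 parses the same way in both orders, because 16 cannot be a month.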

Alright, thanks. I should note that there are many files with dates that cannot be mistaken.

For example, the Filename (name) is
Maula - Rosita Quiroga - (16-04-1952).mp3

and gives

Date_Recorded: 2019-04-16

This is resolved:

   DateParse: date found: year: 1952, month: 04, day: 16: format: (?:^|\D)%d-%m-%Y(?:\D|$)
   in(comment): 'Maula - Rosita Quiroga - (16-04-1952).mp3'
   old(date detected): ''
   new(date detected): '1952-04-16'

MrC

Re: Parsing Dates from fields
« Reply #45 on: March 24, 2014, 10:52:16 pm »

Is it possible to add to the script a command that will copy-merge the Playlist info of the duplicates (and the other Playstat info) into the track we keep? That way I will be able to preserve my Playlists as well...  :)

ps. Can it be done for the date strings we talked about as well?

I haven't forgotten about this, but fell behind today.  I need to add some more functionality to pscriptor to do this.  Be back tomorrow...

stiv32

Re: Parsing Dates from fields
« Reply #46 on: March 27, 2014, 01:21:30 am »

This request has been resolved.

Using the DateParse script of Pscriptor
http://yabb.jriver.com/interact/index.php?topic=85990.0

Thanks to Mr C, I could arrange the dates in my preferred format and merge them within my desired groups (duplicates, in my case), or even copy them easily from one entry to others.

I was able to do that in the same column or transfer them to a new one.
