INTERACT FORUM

Author Topic: Yet Another Expression Question  (Read 931 times)

timwtheov

  • Galactic Citizen
  • ****
  • Posts: 354
Yet Another Expression Question
« on: December 31, 2020, 08:47:22 pm »

OK, since I've spent nearly two hours on this, I thought I'd swallow my pride and ask here.

I'm trying to either create a smartlist or view (I'm thinking the latter might be simpler, but I could be wrong) that will show me duplicates of the same performances of a given classical composition, so that I can delete the one I don't want. I have a lot of classical album remasters that I've accumulated over the years, and as I get more and more of these pieces fully tagged in my library, it's getting more and more tedious to try and remember to check to see if I have duplicate performances of that Bernstein Copland 3rd symphony on Sony, for example, or that Mravinsky Shostakovich 8th on Philips or Alto or Melodiya.

It would seem that for a given composition, I could just compare all artists (I have a list of artists in [Artist] for each classical composition, from 1 to n semicolon-delimited members, e.g., conductor;orchestra or soloist or a list of vocalists;chorus;conductor;orchestra, and so on), and if any duplicates are found, return the full set (all tracks) of these duplicates so I could delete the version(s) I didn't want to keep. But how do I get the "for a given composition" part? Is that the global variable limitation I've run into before in MC? Or is there some way around this I'm not seeing?
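Outside MC, the "for a given composition" step amounts to grouping. Here is a minimal sketch in Python (not MC's expression language), assuming the tracks have been exported as dictionaries with the hypothetical keys `work`, `artist`, and `album`. It groups by work plus the order-insensitive set of semicolon-delimited performers and keeps only the groups that appear on more than one album. The sample data is made up for illustration.

```python
from collections import defaultdict

def performer_key(artist_field):
    """Turn 'conductor;orchestra' into an order- and case-insensitive set."""
    return frozenset(
        name.strip().casefold() for name in artist_field.split(";") if name.strip()
    )

def duplicate_performances(tracks):
    """Map (work, performers) to its albums, keeping only keys found on 2+ albums."""
    albums = defaultdict(set)
    for t in tracks:
        albums[(t["work"], performer_key(t["artist"]))].add(t["album"])
    return {key: sorted(names) for key, names in albums.items() if len(names) > 1}

# Hypothetical sample data, not taken from any real library.
tracks = [
    {"work": "Symphony No. 3", "artist": "Leonard Bernstein; New York Philharmonic", "album": "Sony 1"},
    {"work": "Symphony No. 3", "artist": "New York Philharmonic;Leonard Bernstein", "album": "Sony 2"},
    {"work": "Symphony No. 3", "artist": "Another Conductor;Another Orchestra", "album": "Other"},
]
dups = duplicate_performances(tracks)
```

One caveat with this approach: the same performers recording the same work twice (say, years apart) would also be flagged, so the result is a list of candidates to check rather than a list of certain duplicates.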

Logged

wer

  • Citizen of the Universe
  • *****
  • Posts: 2640
Re: Yet Another Expression Question
« Reply #1 on: December 31, 2020, 10:10:44 pm »

I'm going to have to start charging you, Tim...  ;)

This can be done, but the ease with which it can be done depends on how well you have tagged your files.

First, of course, you must have collisions (matches) on the name of the composition itself. I use [Composition] (which is a calculated field) for this, naturally. But I seem to recall you unfortunately are not.  So you'll need to describe how you are calculating Composition.

Are you scrupulous in your naming, so that the names of the individual tracks would also match?  In other words, is the [Name] field for the first movement of Beethoven's first symphony the same on all albums on which that track appears?

Do you calculate or fill out a [Movement] field or equivalent?

It's necessary to have multiple vectors to determine a duplicate in a smartlist, since every track that is part of a composition, even if there are no "duplicates", will collide on the Composition name with every other track that is part of that same composition.
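The point about multiple vectors can be illustrated with a small Python sketch. It is not how MC evaluates a smartlist; it only assumes that a duplicate test keeps the tracks whose combination of chosen fields occurs more than once. The library below is hypothetical: one album of a two-movement work with no duplicate, plus a second work that appears on two albums by the same artist.

```python
from collections import Counter

def colliding(tracks, fields):
    """Return the tracks whose combination of `fields` occurs more than once."""
    def key(track):
        return tuple(track[f] for f in fields)

    counts = Counter(key(t) for t in tracks)
    return [t for t in tracks if counts[key(t)] > 1]

# Hypothetical library used only for illustration.
library = [
    {"Artist": "A", "Album": "X", "Composition": "Sonata", "Movement": "1"},
    {"Artist": "A", "Album": "X", "Composition": "Sonata", "Movement": "2"},
    {"Artist": "B", "Album": "Y", "Composition": "Trio", "Movement": "1"},
    {"Artist": "B", "Album": "Y", "Composition": "Trio", "Movement": "2"},
    {"Artist": "B", "Album": "Z", "Composition": "Trio", "Movement": "1"},
    {"Artist": "B", "Album": "Z", "Composition": "Trio", "Movement": "2"},
]

one_vector = colliding(library, ["Composition"])
three_vectors = colliding(library, ["Artist", "Composition", "Movement"])
```

With only the composition as the key, every track collides, including the Sonata that exists on a single album. Adding the artist and the movement narrows the result to the Trio tracks that really do appear twice.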
Logged

timwtheov

  • Galactic Citizen
  • ****
  • Posts: 354
Re: Yet Another Expression Question
« Reply #2 on: January 01, 2021, 12:30:24 am »

Thanks, Wer: couple of partiers here, writing about MC on New Year's Eve!

[AMG Work Name], my [Composition], is standardized because I tag almost exclusively through AMG via MCUtils. The only time I don't is when a composition isn't "on" their site, but a) that doesn't happen all that often, and b) the stuff that isn't on their site I'm pretty anal about making sure is standardized. They themselves are pretty good about standardizing stuff like composition and composer and pretty comprehensive for classical and popular music alike, which is why I went with them over, say, Discogs or Musicbrainz. And even when I can't grab a full album, I can still use the unique "workid" and "artistid" (for composers and artists alike) in MCUtils and run the amg.pl script to get pretty standardized metadata. The only time there's a problem, beyond when something simply doesn't exist on AMG, is when they change something. For example, just this week, I noticed that they'd monkeyed with the names of some Shostakovich symphonies, so I had to run "workid" in Perl on all of his symphonies in MC to keep them standardized. But as with compositions not being present on AMG, this doesn't happen all that often.

This is the same, for the most part, with performers as well. I will confess, however, to being a little lazy sometimes with E. European artists (vocalists and soloists) on semi-obscure labels and sometimes will just copy/paste those from Discogs or Presto Classical so as not to have to check each one on AMG if they're not already in MC, which is how I usually keep performers standardized; so there might be an issue there, though I could probably spend a little time and rectify those.

The biggest issue would be with [Name], which is not entirely consistent, mostly because AMG itself isn't. If the album is fully on AMG, i.e., if the album is present with full tracks, I populate [Name] with those track names and afterward use a calculated field (=[btn]--basically [Movement] in your question) to make [Name] into

Code: [Select]
[AMG Work Name]: [Name]
The problem is that AMG isn't entirely consistent with their classical track names, e.g., you might get "Allegro" or "1. Allegro" or even "I. Allegro." The other issue is that when an album isn't fully tracked on the main album page or any of its release pages, then I populate [Name] in one of two ways: either via the work's movement details ([AMG Work Parts]), which then, via another calculated field (=[btn1]--another iteration of [Movement]) becomes

Quote
[AMG Work Name]: [AMG Work Part (via Index)]

or, if there are no movement details, I typically use what's in the track already, if it doesn't look too bad, or if the work is only one track, I use [AMG Work Name] as [Name]. In retrospect, if I were starting from scratch, I might have standardized [Name] more systematically with the work parts, but since that ended up being a later addition to MCUtils, I didn't.

As I type all this out, I see that some of these vectors won't work reliably for the kind of precision necessary for this kind of operation, as really only [AMG Work Name] and [Composer]/performers are going to be pretty standard. However, I mostly need this for the most common stuff, the basic repertoire where I might have multiple Ormandy or Szell or Rubinstein recordings of a given work, and those names are standardized in my library.

Logged

wer

  • Citizen of the Universe
  • *****
  • Posts: 2640
Re: Yet Another Expression Question
« Reply #3 on: January 01, 2021, 12:48:47 am »

Show me some data.

So is [AMG Work Part] the movement number?  If that is consistently populated, that would be enough.

For example, if your tracks were (in my view) well tagged, then a track:
Sonata No. 22 for Violin and Piano in A, K.305/293d: 1. Allegro di molto

Would yield
[Composition]=Sonata No. 22 for Violin and Piano in A, K.305/293d
[Movement]=1. Allegro di molto
[Movement Number]=1

In my database, all these fields are automatically calculated because I enforce a schema on classical track names.
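To make the schema idea concrete, here is a minimal Python sketch of that kind of parsing. It is not the actual calculated-field expressions used in MC. It assumes that the last ": " in the track name separates the composition from the movement, and that the movement begins with an Arabic number followed by a period. The function name `parse_track_name` is made up for illustration.

```python
import re

def parse_track_name(name):
    """Split 'Composition: N. Movement' into its three parts.

    Assumes the last ': ' separates the composition from the movement and
    that the movement starts with an Arabic number followed by a period.
    """
    composition, sep, movement = name.rpartition(": ")
    if not sep:
        # No separator: treat the whole name as a single-track composition.
        return {"Composition": name, "Movement": "", "Movement Number": None}
    match = re.match(r"(\d+)\.", movement)
    number = int(match.group(1)) if match else None
    return {"Composition": composition, "Movement": movement, "Movement Number": number}

parsed = parse_track_name(
    "Sonata No. 22 for Violin and Piano in A, K.305/293d: 1. Allegro di molto"
)
```

Splitting on the last ": " rather than the first is a deliberate choice here, so that a colon inside the composition title does not cut the name short.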

It seems that in yours, [Composition]/[AMG Work Name] is some arbitrary value pulled from their database.  That is ok, and you seem to be saying they are pretty consistent about [AMG Work Name] even if they are not with [Name].  If they are not consistent with [AMG Work Name] you are screwed.

If you are also saying they consistently assign [AMG Work Part]=1 in the above example, and the next track has [AMG Work Part]=2 etc then you have two vectors and can make it work.  But if that field randomly contains 1, I, or No. 1 then you will have problems.
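If the numbering does vary like that, one option is to normalize it before comparing. The following is a rough Python sketch, not an MC expression, and the names `movement_number` and `roman_to_int` are hypothetical. It assumes a movement label starts with an optional "No.", then either an Arabic number or a Roman numeral built from I, V, X, and L, followed by a period, whitespace, or the end of the string.

```python
import re

ROMAN = {"I": 1, "V": 5, "X": 10, "L": 50}

def roman_to_int(text):
    """Convert a simple Roman numeral (I, V, X, L only) to an integer."""
    total = 0
    for ch, nxt in zip(text, text[1:] + " "):
        value = ROMAN[ch]
        # A smaller numeral before a larger one is subtracted (IV = 4).
        total += -value if nxt in ROMAN and ROMAN[nxt] > value else value
    return total

PREFIX = re.compile(r"^(?:No\.\s*)?(?:(\d+)|([IVXL]+))(?:\.\s*|\s+|$)")

def movement_number(part):
    """Return the leading movement number of `part`, or None if there is none."""
    match = PREFIX.match(part.strip())
    if not match:
        return None
    if match.group(1):
        return int(match.group(1))
    return roman_to_int(match.group(2))
```

With this, "1", "I", "No. 1", "1. Allegro", and "I. Allegro" all map to the same number, while a bare "Allegro" yields None and still needs a manual fix. A title that happens to begin with a standalone capital I, V, X, or L would be misread as a number, so the output is worth spot-checking.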
Logged

timwtheov

  • Galactic Citizen
  • ****
  • Posts: 354
Re: Yet Another Expression Question
« Reply #4 on: January 01, 2021, 01:36:46 am »

Yes, [AMG Work Name] (composition) is relatively consistent.

The rest is not, unfortunately. [AMG Work Part] is parsed from "Parts/Movements" on a composition page like this one (scroll down a tad):

https://www.allmusic.com/composition/symphony-no-9-in-d-minor-choral-op-125-mc0002366840

That in itself is obviously consistent; however, my usage of it is not because I don't always use [AMG Work Part], i.e., I only use it when I get an album like this one

https://www.allmusic.com/album/avet-rubeni-terterian-symphonies-nos-3-and-4-mw0003323466

where no tracks are present and therefore no track info. to scrape into [Name]. What I then do is go to [Work Part Index] and manually fill in the numbers ([Movement Number] in your example), then run AMG's unique "work id" to scrape the work part info into [AMG Work Part], which will in turn populate [AMG Work Part (via Index)]. [Name] then becomes

Code: [Select]
[AMG Work Name]: [AMG Work Part (via Index)]
via my calculated field, [btn1].

Because AMG isn't consistent with track names and because my own workflow requires some juggling based on their lack of metadata on some albums, plus my impatience and consequent need for speed because I have a massive un-tagged-to-my-satisfaction collection of classical music, [Name] isn't going to be consistent enough to use, at least in toto; and though [AMG Work Part] would be present for all works in my library, AMG isn't always consistent with including it when works have multiple movements. For basic repertoire stuff, it's usually there, but the more obscure one gets, the less likely it will be there, as with the Terterian Symphony No. 4 above:

https://www.allmusic.com/composition/symphony-no-4-mc0002463683

which is tracked as 3 movements on the recording in question.

That's another issue too: not all CDs track works the same way, particularly when you get into operas and ballets and other longer works with many, many parts. And those on a given disc almost never correspond with the "Parts/Movements" on AMG. But even with the Beethoven Symphony No. 9 above: most recordings track that with 4 movements, some with 5 as per AMG's "Parts/Movements," and I believe I've seen it with 6-7, depending on how the last movement is divided up.

So I don't know: maybe it's not doable.
Logged

wer

  • Citizen of the Universe
  • *****
  • Posts: 2640
Re: Yet Another Expression Question
« Reply #5 on: January 01, 2021, 02:57:37 am »

It's doable with good data. It's not doable with bad data.

Here's a smartlist that gives good results on my database:

Code: [Select]
[Media Type]=[audio] [Genre]=[Classical] ~dup=[Artist],[Composition],[Movement] ~sort=[Artist],[Album],[Name] ~nodup=[Album] ~expand="Artist - Album - Composition"
You should be able to adapt it to your fields.  Although at this rate, perhaps you'd be better off just standardizing on my field names.  :P

But if your movement data won't give you the needed collisions, then you'll be disappointed.
Logged

wer

  • Citizen of the Universe
  • *****
  • Posts: 2640
Re: Yet Another Expression Question
« Reply #6 on: January 01, 2021, 04:19:08 am »

There is a quick and dirty way that might get you started. (This could be built with GroupSummaryQuery, but it's pokey with large libraries.)

Create a new view. View as Categories. First/only category is an expression: [Composer] - [AMG Work Name]
Set the list style to Details. Make sure Album is a column.
Now sort by the Album column.

Up at the top will be all the Compositions that show an album of [Varies].  These are your compositions that show up on multiple albums. In other words, duplicates.  Double-click on a line in the list, and the file list will open below showing all the tracks for that composition, grouped by album.
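For anyone who wants to see the logic of that view spelled out, here is a small Python sketch of the same idea. It is not MC code: it assumes tracks exported as dictionaries with `Composer`, `AMG Work Name`, and `Album` keys, and it imitates the Album column by showing "[Varies]" whenever a group holds more than one album. The sample tracks are hypothetical.

```python
from collections import defaultdict

def albums_per_composition(tracks):
    """Group tracks by 'Composer - Work' and collect the distinct albums."""
    grouped = defaultdict(set)
    for t in tracks:
        grouped[f'{t["Composer"]} - {t["AMG Work Name"]}'].add(t["Album"])
    return grouped

def album_column(albums):
    """Mimic the Album column: one name, or '[Varies]' when there are several."""
    return "[Varies]" if len(albums) > 1 else next(iter(albums))

# Hypothetical sample data for illustration only.
tracks = [
    {"Composer": "Composer A", "AMG Work Name": "Symphony No. 1", "Album": "Album X"},
    {"Composer": "Composer A", "AMG Work Name": "Symphony No. 1", "Album": "Album Y"},
    {"Composer": "Composer B", "AMG Work Name": "Concerto", "Album": "Album Z"},
]

rows = {
    name: album_column(albums)
    for name, albums in albums_per_composition(tracks).items()
}
candidates = sorted(name for name, album in rows.items() if album == "[Varies]")
```

Note that a composition on several albums is only a candidate: two different performances of the same work also produce "[Varies]", so the performers still have to be compared by eye or with a further key.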
Logged

EnglishTiger

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 967
Re: Yet Another Expression Question
« Reply #7 on: January 01, 2021, 04:51:35 am »

When it comes to sites that provide Metadata for Music Albums/Tracks, it doesn't surprise me that there isn't a single one that has an accuracy rating of over 90%; indeed, some of them struggle to hit an accuracy rating of 50%, and that includes some of the sites you have to pay to access. The problem starts at the original source of what we refer to as metadata: the Record Companies. The only thing that lot can agree on is devising even more ways of ripping off their customers. Whilst ripping my Classical Boxsets I've come across multiple instances where the Booklet, the Sleeve Notes, the information printed on the CD Face and, when present, the CD-Text file have contained either different variants of the same data or entirely different data.
Logged
Win NUC - VENOEN 11Th NUC Mini PC Core i7 1165G7,Dual HDMI 2.0+Mini DP,Windows 11 Mini Desktop Computer,Thunderbolt 4.0,1 Lan, USB-C,Wifi,Bluetooth 5.0,32GB RAM Toshiba MQ04ABF100 ‎500Gb 5400 RPM ‎eSATA HD, Gigabyte GP-GSM2NE3512GNTD 1Tb NVMe SSD, Samsung 870 QVO 8 TB SATA 2.5 Inch SSD (MZ-77Q8T0) in Sabrent Ultra Slim USB 3.0 to 2.5-Inch SATA External Aluminium Hard Drive Enclosure (EC-UK30)

Apple 2020 Mac mini M1 Chip (8GB RAM, 512GB SSD)
Sabrent Thunderbolt 3 to Dual NVMe M.2 SSD Tool-Free Enclosure with Sabrent 2TB Rocket NVMe PCIe M.2 2280 High Performance SSD + Crucial P3 Plus 4TB M.2 PCIe

ET Skins & TrackInfo Plugins - https://englishtiger.uk/index.html

timwtheov

  • Galactic Citizen
  • ****
  • Posts: 354
Re: Yet Another Expression Question
« Reply #8 on: January 01, 2021, 03:00:04 pm »

@Wer, You know, it's been at the edge of consciousness for a few months now that I should have a more standardized [Name] structure, and it keeps leading to the likewise semiconscious warning that this is going to be a problem down the line. Well, I guess we're at the line now. So I think I'm going to try to standardize my [Name] fields (gulp: we're talking 83,287 files here). Looks like I'll be dealing with the ol' regex soon!

@EnglishTiger, Yes, it's a problem. As I said above, AMG is pretty good overall, but individual track names for classical albums are all over the map, probably (as you say) because the labels themselves have no standardization.

Hey side note, EnglishTiger: I see you have a GTX 970 GPU and an overall similar PC to mine (i7 processor, etc., though you have more RAM). Do you have a 4K setup with Red October HQ + custom MadVR settings? I just ordered a 4K TV and have the exact same card hooked up (via the HDMI 2.0 port) to a Marantz AVR and have been a bit worried about how well it will function. I realize I probably won't be able to do maximum up-scaling or maybe even mid-range up-scaling with MC, but I am curious, if you have one and have some custom MadVR settings, how it works overall. Thanks!
Logged

wer

  • Citizen of the Universe
  • *****
  • Posts: 2640
Re: Yet Another Expression Question
« Reply #9 on: January 01, 2021, 03:51:11 pm »

Did you see my 2nd post about the quick and dirty? I would think that would get you a long way...
Logged

timwtheov

  • Galactic Citizen
  • ****
  • Posts: 354
Re: Yet Another Expression Question
« Reply #10 on: January 01, 2021, 04:08:13 pm »

Yes, Wer, tried it and will probably use it if I decide not to try the big task of standardizing [Name] (more an issue of futureproofing for a speedy workflow for future tagging; that's where I get hung up a bit, as it already takes way too long to tag classical albums).

Anyway, I've got to monkey with the view you suggested to make it quicker to assess stuff, to see what I've got, since there are thousands of compositions. Haven't had much time today but will look at it in more detail later tonight or tomorrow.
Logged

EnglishTiger

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 967
Re: Yet Another Expression Question
« Reply #11 on: January 02, 2021, 01:48:46 am »

Tim - the only things connected to my Video Card are the PC Screen and my Samsung HW-Q90R Harman Kardon Cinematic soundbar with Dolby Atmos (via an HDMI cable).

AMG certainly got it wrong when it comes to Beethoven's 9th Symphony - if they had bothered to check the Original Music Score they would have discovered that there are only 4 Movements. However, the 4th movement has caused, and probably will continue to cause, problems; most "Experts" can't agree on its True Format: some think it's in Sonata Format whilst others think it's in Cantata Format. The general consensus is that the 4th Movement, mainly because of its length, is probably a symphony, with 4 un-interrupted movements, within a symphony.

Any recording of the Choral Symphony that claims to have more than 4 movements is the result of somebody listening to a "Self Proclaimed Expert" instead of checking the Original Score.
Logged

timwtheov

  • Galactic Citizen
  • ****
  • Posts: 354
Re: Yet Another Expression Question
« Reply #12 on: January 02, 2021, 11:49:27 am »

Thanks for the info., EnglishTiger! Are you a music scholar? That's an impressive bit of knowledge to have at one's fingertips. I myself can't read a note, but ever since hearing the opening of Beethoven's 5th (on piano) when I was 5, I've been hooked on (western) classical, though there was a long detour/dry spell from about age 5 till 20 or 21, as I didn't know what that Beethoven I heard was (or didn't remember) or where to find more of it (I grew up in a 900 person town in the rural USA midwest with pretty much only country, classic rock, and pop surrounding me).

Now that you mention it, I wonder how AMG does determine "Parts/Movements." Unthinkingly, I just assumed they were aggregating from recordings, but that can't be right (how would they do it anyway?). I wonder what they use.
Logged

EnglishTiger

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 967
Re: Yet Another Expression Question
« Reply #13 on: January 03, 2021, 05:59:26 am »

Tim - like nearly every other online source of Metadata AMG get a lot of their data/info from music fans and/or other online sources.

Only AMG can tell you why they decided to divide Classical Music Works into parts instead of following the long established practice of dividing them into movements. However, you don't have to spend too long reading about Classical Music, or browsing through a few track lists to spot that the "One Label/Rule" approach most web-sites/metadata providers deploy does not always apply to classical music. Don't forget we are talking about Composers who Invented their Own Rules!

A significant amount of Classical Music does lend itself to using a Two Tier tagging approach where the Work/Composition Name is at the top level with a varying number of entries/tags at the lower (movement) level.
Unfortunately, there is a lot of Classical Music that requires a Three Tier Approach, such as Operatic, Ballet and Theatre Music, where you have a Work/Composition that consists of Overtures, Acts and Scenes.

Now for the good news. You don't have to follow somebody else's inflexible rules or tag names; unlike the vast majority of media players out there, which restrict you to using a limited selection of tags and predefined rules, MC is flexible enough to allow you to create your own tags and set your own rules.

Tim - to answer your question about my knowledge of/about music, especially classical music. A lot of my knowledge has been acquired from sites like Wikipedia, BBC/Open University Documentaries, Biographies and the Booklets and Sleeve Notes that come with Albums/Boxsets.

Incidentally, when it comes to Tagging Classical Music there are really only 2 things that can be considered "5 minute tasks": one is ripping a CD to disc, the other is messing it all up. Correcting just a basic set of tags that allow you to Identify each track, i.e. Track Name, Composer, Conductor & Orchestra, takes time, especially if the only source of that information is the case/slip cover the CD came in, because all too often you will find that the only reliable source of MetaData/Information is some web-site that makes claims, explicit or implied, its owners/operators can't justify.
Logged