INTERACT FORUM
More => Old Versions => JRiver Media Center 30 for Windows => Topic started by: CltrAltDel on March 27, 2023, 09:45:53 am
-
For a long time I've been using a pane view with (among others) a category/column "A-Z" for movies.
This uses the following expression for grouping:
if(isrange([name], a-z), formatrange([name],1,0), #)
which results in a column with "#ABC...Z" for filtering, so something like
# -> 2 Guns | 2 Days in the Valley | 12 Monkeys | 2001: A Space Odyssey | ...
A -> A Beautiful Mind | À La Carte! | Almost Famous | Ärger Im Gepäck | Atomic Blonde | ...
B -> Babylon | Bad Santa | Barbarella | Barton Fink | ...
C -> Car Wash | Casablanca | Casino | Chinatown | ...
...
Z -> Zabriskie Point | Zero Effect | Zodiac | Zulu | ...
After the update to MC30 the column looks like this:
# -> 2 Guns | 2 Days in the Valley | 12 Monkeys | 2001: A Space Odyssey | ...
(Others) -> À La Carte! | Ärger Im Gepäck
A -> A Beautiful Mind | Almost Famous | Atomic Blonde | ...
B -> Babylon | Bad Santa | Barbarella | Barton Fink | ...
C -> Car Wash | Casablanca | Casino | Chinatown | ...
...
Z -> Zabriskie Point | Zero Effect | Zodiac | Zulu | ...
So umlauts and letters with accents are not handled like letters any more.
Interestingly, the ligature "Æ" is handled like a number, so the movie "Æon Flux" is sorted to "#" (should also go to "A"...).
Is this a bug or is it by design?
If it's by design, how can I get around it?
I made some attempts with adding lines like
isequal([name], À), A,
isequal([name], Ä), A,
to the expression, but that does not seem to work.
As I'm not that savvy with the expression language, some help would be greatly appreciated :)
-
This is not easy to do without a new function that removes diacritics from text (Unicode normalization). Perhaps Matt can add a Normalize() function for that, or a new mode to the existing FixCase() function.
Then you would use something like this (if you have no names starting with symbols/punctuation):
Letter(normalize([name]),1,2)
-
Look at Clean(...) in mode 9.
-
Perfect :) That mode is not documented, apparently: https://wiki.jriver.com/index.php/String_Manipulation_Functions#Clean
Then it's just:
Letter(Clean([Name],9),1,2)
-
Just updated.
-
Thank you.
Apparently Æ and other ligatures are not handled. NFKD is likely the best for this: https://unicode.org/reports/tr15/#Norm_Forms, but maybe not worth the effort unless you're using some library like Boost that already does it.
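For anyone curious what NFKD covers, here is a minimal Python sketch. It is outside Media Center entirely, and `strip_diacritics` is a made-up helper name, not an MC function. It decomposes the text with NFKD and drops the combining marks. Note that Æ has no decomposition in Unicode, so NFKD on its own leaves it untouched; it would still need an explicit mapping.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Decompose with NFKD, then drop combining marks (accents, umlauts, ...)."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_diacritics("À La Carte!"))      # A La Carte!
print(strip_diacritics("Ärger Im Gepäck"))  # Arger Im Gepack
print(strip_diacritics("Æon Flux"))         # Æon Flux (unchanged)
```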
-
I just added Æ and will translate it to A. If there are others just let me know.
Thanks.
-
Thanks for looking into that - the clean function did the trick :)
I changed the expression to
if(isrange(Clean([Name],9), a-z), formatrange(Clean([Name],9),1,0), #)
and that worked like a charm!
If Æ is added to the clean function, would adding Œ be reasonable?
And maybe the documentation could be expanded to mention that not only accents but also umlauts, circumflexes, cedillas, tildes, rings, slashes (Ø), carons (and more?) and (possibly) ligatures are "cleaned"... or in short just "Removes diacritics." ;D
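In case it helps others follow the logic, here is a rough Python model of what the grouping expression above does: clean the name, take the first character, and fall back to "#" for anything that is not A-Z. This is only a sketch. `clean` is a hypothetical stand-in that strips combining marks; it is an assumption, not how Clean([Name],9) is actually implemented.

```python
import unicodedata
from collections import defaultdict


def clean(text: str) -> str:
    # Hypothetical stand-in for Clean([Name],9): drop combining marks after NFKD.
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def bucket(name: str) -> str:
    # First character of the cleaned name, or "#" if it is not an ASCII letter.
    first = clean(name)[:1]
    return first.upper() if first.isascii() and first.isalpha() else "#"


titles = ["2 Guns", "A Beautiful Mind", "À La Carte!",
          "Ärger Im Gepäck", "Babylon", "Zulu"]

groups = defaultdict(list)
for title in titles:
    groups[bucket(title)].append(title)

for key in sorted(groups):
    print(key, "->", " | ".join(groups[key]))
```

With these sample titles, "À La Carte!" and "Ärger Im Gepäck" land under "A" again instead of in a separate group.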
-
While reviewing expressions for actor names I just stumbled across another relatively common letter, the Icelandic thorn "Þ" for "Th"... would it be reasonable to translate it to "T"?
-
I helped out with the Clean(9) function to translate diacriticals. I used this table as the source of the translation:
https://docs.oracle.com/cd/E29584_01/webhelp/mdex_basicDev/src/rbdv_chars_mapping.html
It has all the characters mentioned in this thread as far as I am aware. It's possible I missed some of them, but I think they are all in what I gave to Matt.
Brian.
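To illustrate the table-driven idea in general terms, here is a small Python sketch that combines an explicit translation table with Unicode decomposition. The table is an illustrative subset only and is not the table given to Matt; the Æ -> A and Þ -> T entries follow this thread, while the Œ and Ø entries are assumptions. `EXTRA` and `fold` are made-up names.

```python
import unicodedata

# Letters that Unicode does not decompose, so they need explicit entries.
# Illustrative subset; the Œ and Ø mappings are assumptions.
EXTRA = str.maketrans({
    "Æ": "A", "æ": "a",
    "Œ": "O", "œ": "o",
    "Ø": "O", "ø": "o",
    "Þ": "T", "þ": "t",
})


def fold(text: str) -> str:
    # Apply the explicit table first, then strip combining marks via NFKD.
    text = text.translate(EXTRA)
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(fold("Æon Flux"))  # Aon Flux
print(fold("Þór"))       # Tor
```

A fixed table like this is easy to extend one character at a time, which matches how additions have been requested in this thread.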