Volume Leveling is designed so that it will never clip, but it must reduce the volume for any file where 23dB is not enough headroom to prevent clipping, resulting in uneven leveling for those files.
As I said, I'd like to have that 'never clipping' behavior mentioned somewhere, at least. I'd prefer to (optionally?) not have it at all, so as to really have a consistent volume.
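To illustrate the trade-off, here is a minimal sketch of how a 'never clip' cap interacts with a loudness target. It is not Media Center's actual algorithm: the function name, the `never_clip` switch, and the use of a plain sample peak in dBFS are all my own assumptions for illustration.

```python
def leveling_gain_db(measured_lufs, peak_dbfs, target_lufs=-23.0, never_clip=True):
    """Return the gain in dB to apply to one file (hypothetical helper)."""
    # Gain needed to bring the file's loudness to the target.
    gain = target_lufs - measured_lufs
    if never_clip:
        # Largest gain that keeps the peak at or below 0 dBFS.
        gain = min(gain, -peak_dbfs)
    return gain


# A quiet file with a high peak: 28 dB between loudness and peak.
capped = leveling_gain_db(-30.0, -2.0)                      # limited by the peak
uncapped = leveling_gain_db(-30.0, -2.0, never_clip=False)  # hits the target, but would clip
```

With the cap, this file ends up 5 dB below the target, which is exactly the uneven leveling described above. Without it, the loudness is consistent but the peak lands at +5 dBFS.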
So what is likely happening is:
[48 → 96] → [96 → 92.16 → 96]
Ideally that would be:
48 → [48 → 92.16 → 96]
I agree, though I would have understood it to resolve into the ideal of
48 → [(48 kHz = 46.08 kHz) → 96]
(i.e. show the adjusted rate relative to the input sample rate, rather than relative to the target sample rate). That's only because I misinterpreted your ideal at first glance and thought it would resample twice. I used '=' to denote that a reinterpretation of the current format occurs, rather than a resampling. And yes, once I started thinking, I understood that your ideal black box simply reinterprets the 92.16 kHz as 96 kHz also... I was momentarily confused, that's all.
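For anyone who wants to check the numbers: the figures in this thread imply a rate factor of 92.16 / 96 = 0.96 (24/25). The sketch below only does that arithmetic. The function name is made up, and it says nothing about what SoundTouch does internally.

```python
from fractions import Fraction

# Rate factor implied by the numbers in this thread (92.16 / 96).
FACTOR = Fraction(24, 25)


def reinterpreted_rate(rate_hz, factor=FACTOR):
    """Rate the same samples are treated as having (no resampling involved)."""
    return rate_hz * factor


relative_to_input = reinterpreted_rate(48000)    # the '48 kHz = 46.08 kHz' step
relative_to_target = reinterpreted_rate(96000)   # the 92.16 kHz shown today

# Single resampling step needed in the ideal chain: 46.08 kHz to 96 kHz.
resample_ratio = Fraction(96000) / relative_to_input
```

Both notations describe the same adjustment. The only difference is whether the adjusted rate is shown relative to the input rate or the target rate.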
But if the SoundTouch library is a "black box" perhaps it's not possible to set the output rate to be something different from the input rate.
For what it's worth, my 'ideal' DSP Studio (for now) would work rather differently, in that the Audio Renderer would simply ask for what it needs (e.g. 2.0 96 kHz), and each filter (with each step getting closer to the source) would have the option to modify the input accordingly. This would mean that the Renderer requests 2.0 96 kHz, and if SoundTouch was the last occurrence of a resampler (i.e. the first one that's asked to resample), it would know to resample whatever input format it gets (be it 48 kHz, 44100 Hz, something rather exotic like, say, 13484 Hz, or whatever) and output 96 kHz from it (thereby also consuming the resampling request, of course).
Only when no VST is capable of fulfilling a specific requirement (e.g. 2.0 channels) would the Audio Renderer perform the necessary actions itself.
Of course, if the Audio Renderer (basically what currently resides in the Output Format tab) is configured to pass through the input data, and there is an error somewhere because e.g. the playback device doesn't support 96 kHz, that would be passed as an error to the user, as it is right now.
And yes, I'm aware that at best, this would require a major rewriting effort for the DSP tab, which is why I just mention it as an 'ideal'. I don't actually expect it to be realized, much less soon. I don't even know if it's possible with VSTs at all!
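To make the idea a bit less hand-wavy, here is a rough sketch of that 'ask backwards from the renderer' negotiation. Everything in it is hypothetical: the `Format` and `Stage` types, the capability flags, and the `negotiate` function are invented for illustration and have nothing to do with how MC or VST hosts actually work. It also simplifies by assuming stages don't change the format unless asked.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Format:
    rate_hz: int
    channels: int


@dataclass(frozen=True)
class Stage:
    """Hypothetical DSP stage with flags for what it is able to do."""
    name: str
    can_resample: bool = False
    can_remix: bool = False


def negotiate(source, stages, wanted, renderer="Audio Renderer"):
    """Assign each outstanding request to the stage closest to the renderer
    that can fulfil it; whatever is left falls back to the renderer."""
    need_rate = wanted.rate_hz != source.rate_hz
    need_channels = wanted.channels != source.channels
    plan = []
    # 'stages' is ordered source -> renderer, so ask in reverse order.
    for stage in reversed(stages):
        if need_rate and stage.can_resample:
            plan.append((stage.name, f"resample to {wanted.rate_hz} Hz"))
            need_rate = False  # request consumed
        if need_channels and stage.can_remix:
            plan.append((stage.name, f"mix to {wanted.channels} channels"))
            need_channels = False
    if need_rate:
        plan.append((renderer, f"resample to {wanted.rate_hz} Hz"))
    if need_channels:
        plan.append((renderer, f"mix to {wanted.channels} channels"))
    return plan


chain = [Stage("Equalizer"), Stage("SoundTouch", can_resample=True)]
plan = negotiate(Format(48000, 6), chain, Format(96000, 2))
```

In this example SoundTouch takes the resampling request, and since nothing in the chain can change the channel layout, the mixdown to 2.0 is left to the Audio Renderer.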
-23 LUFS would be appropriate for a broadcast stereo downmix, but broadcast recommendations for stereo downmixes discard LFE, while Media Center does not.
And it includes the LFE channel for a good reason: because many MC customers are sending the audio to high-end stereo or headphone setups, rather than using TV speakers.
I agree.
So -31 LUFS would be a better target for leveling across all video playback in MC, especially since it would be decoding the HD formats internally anyway, instead of passing them through to the AVR.
I can also agree to adjusting the reference volume in order to adapt to circumstances.
However, much more than about the Volume Leveling target itself, I care about it being consistent regardless of the input. Before I discovered MC (and after that, for the short while until I discovered Volume Leveling), I had to constantly adjust my volume, depending on which file I played. I'm rather fond of knowing that I can set e.g. my TV's volume to 25 (on a scale from 1 to 100) and be done with it. I prefer having to readjust that value once (e.g. to 37) over having to adjust it every time I switch between content with video and content without. (Of course, I suppose that if the values are known, I could also create two Zones, one for each content type, and adjust those accordingly. I prefer the consistent volume option, though.)
Wow, I get the impression that I've been repeating myself over the course of this post. I'm rather certain that I would still do that if I now rewrote this entire post, though, so here you go...