Topic: Integer -> Floating Point conversion: different sound (@Matt, nevcairiel) (Read 13711 times)

TheLion · « **on:** February 26, 2012, 10:10:52 am »

Please excuse me for starting just another "questionable" thread discussing differences in perceived sound of bit-perfect playback ;-)

Before we had LAV Audio I used ffdshow together with JRiver MC for audio decoding. Back then I perceived a difference in "sound signature" when selecting 32bit fp output instead of 16bit/24bit int in ffdshow. The situation is just the same since switching to LAV Audio. I discussed this matter with nevcairiel over at Doom9 some time ago and he suggested keeping the source bit depth untouched for output (therefore letting LAV Audio auto-select the output bit depth, because LAV Audio's int->float converter is rudimentary and just intended as a fallback for hardware not supporting certain formats). With lossless audio this is 24bit int (in very few cases 16bit int). So far so good. Now that convolution has been integrated into MC I decided to do another comparison.

My playback chain: source is bdmv in most cases (lossless audio tracks) -> decoded by LAV audio (with the ARCSOFT dts decoder used for DTS HD tracks) -> in Media Center I use convolution (using 64bit fp Acourate filter) and "room correction" DSP for setting the channel levels. No other DSPs are used. Output is to a Prism Sound Orpheus Firewire interface via ASIO. 7.1 speaker setup using Danley SH-50 and Genelecs 1037Cs.

I am comparing LAV Audio decoding with 24bit int output (native source format of my lossless sample tracks) and 32bit fp output (by forcing this output in LAV Audio).

I want to suggest the following:

- the final audio output of these two options is quite different.

- Subjectively I would describe the difference as follows: in famous audiophile terms the 24bit decoding (with MC doing the int->float conversion) sounds "rather smooth and laid back" in comparison. Forcing LAV Audio to output 32bit fp into the MC audio engine results in a more "aggressive, coarse" sound signature. My personal preference (sadly) is for the latter. Logic suggests that doing this int->fp conversion once with the JRiver audio engine is the "better" way to handle this necessary step. But no matter how often I switch between these two options with a large variety of content I continue to prefer the 32bit fp output from LAV Audio (subjectively).

So, I should be happy and just use the option I like better, right? ;-)

Well, I want to understand why there is a sound difference. I am willing to provide samples and record the output of both options to make my point. But anybody trying these two options should be able to hear the difference (given you have a very accurate, revealing speaker setup).

If we can agree that int->fp conversion really makes a difference we might learn something in order to improve playback with LAV Audio and JRiver Media Center. But wait, it is all bit-perfect playback ;-)

For the time being I enjoy my placebo effect and keep forcing LAV audio to output 32bit fp ;-)

@ nevcairiel: Would it be much effort to implement 64bit fp as an output option in LAV Audio? On the one hand this would likely make my placebo effect even stronger (larger numbers are always better...) and on the other hand this is the native format in which JRiver, including convolution (and my FIR filters), does all its internal processing. I would very much appreciate it although I am aware that you don't see much sense in such an option.

Matt · « **Reply #1 on:** February 26, 2012, 10:18:29 am »

As soon as data is delivered to us, we convert it to 64bit regardless of the format.

I think we have LAV configured to deliver data in the native format, and we convert to 64bit.

TheLion · « **Reply #2 on:** February 26, 2012, 10:30:32 am »

Quote from: Matt on February 26, 2012, 10:18:29 am

As soon as data is delivered to us, we convert it to 64bit regardless of the format.

I think we have LAV configured to deliver data in the native format, and we convert to 64bit.

I completely understand that, Matt. But the difference I am hearing suggests that when this int->float conversion is done within LAV Audio the results are different. For whatever reason. This is consistent with using ffdshow as decoder.

Is this int->float conversion a "mathematically lossless" process or can the results be different (eg. due to less precision, converting 24bit int to 32bit fp versus 64 bit fp)?

Somehow I always feel rather stupid when trying to discuss things like that. I am 33 years old, have an academic degree in engineering and I still feel certain that I perceive differences between different options of said "bit-perfect" playback (like using ASIO instead of WASAPI/KS, or letting my audio interface do the clocking instead of setting it to synchronous mode). But rest assured - my JPlay double blind test didn't result in any perceived advantage. So not all hope is lost for me... ;-)

I just want to learn. And with this subject I am certain there is something different. Perhaps you take a minute of your time and switch between these two options in LAV audio in real time. Thank you very much!

Matt · « **Reply #3 on:** February 26, 2012, 10:39:49 am »

Quote from: TheLion on February 26, 2012, 10:30:32 am

Is this int->float conversion a "mathematically lossless" process or can the results be different (eg. due to less precision, converting 24bit int to 32bit fp versus 64 bit fp)?

32-bit float can hold a 24-bit integer without loss (but can't hold a 32-bit integer).

Maybe nevcairiel could comment on the 24bit to 32bit math used? I believe normally you would use something like float(value) / float(2 ^ (bits - 1)). There shouldn't be any limiting, dithering, or anything else.

I can post exactly what JRiver uses for 24bit to 64bit conversion on Monday if it would be helpful.
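As a rough illustration of the two claims above (a 32-bit float holds every 24-bit integer exactly, but not every 32-bit integer), here is a minimal Python sketch. It uses the `value / 2^(bits - 1)` scaling mentioned in the post; the helper names are made up for this example and do not come from JRiver or LAV Audio.

```python
import struct

def to_float32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def int_to_float(value, bits):
    """Scale a signed integer sample to [-1.0, 1.0) and store it as float32."""
    return to_float32(value / float(2 ** (bits - 1)))

def float_to_int(sample, bits):
    """Inverse of int_to_float, using the same scale factor."""
    return int(round(sample * float(2 ** (bits - 1))))

# Edge cases and a few ordinary values of a signed 24-bit sample.
samples_24 = [-(2 ** 23), -(2 ** 23) + 1, -1, 0, 1, 12345, 2 ** 23 - 1]
round_trip_24 = [float_to_int(int_to_float(v, 24), 24) for v in samples_24]
lossless_24 = round_trip_24 == samples_24

# A 32-bit value that needs more than 24 significant bits.
value_32 = 2 ** 24 + 1
round_trip_32 = float_to_int(int_to_float(value_32, 32), 32)
lossless_32 = round_trip_32 == value_32

print(lossless_24, lossless_32)
```

The 24-bit samples survive the round trip because a float32 significand carries 24 bits; the 32-bit value comes back changed because it needs 25.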

Matt · « **Reply #4 on:** February 26, 2012, 10:49:58 am »

Quote from: TheLion on February 26, 2012, 10:30:32 am

I just want to learn. And with this subject I am certain there is something different. Perhaps you take a minute of your time and switch between these two options in LAV audio in real time. Thank you very much!

Would you be willing to play a movie both ways using Options > Audio > Disk Writer?

The resulting WAV files should be identical. If they're not, something is different.

TheLion · « **Reply #5 on:** February 26, 2012, 10:54:23 am »

Quote from: Matt on February 26, 2012, 10:49:58 am

Would you be willing to play a movie both ways using Options > Audio > Disk Writer?

The resulting WAV files should be identical. If they're not, something is different.

I would be very happy to. I was thinking about using a pink noise calibration sequence from AIX Blu-ray (dts-hd ma 7.1 stream) for that. How can I evaluate the statistical difference between the resulting files? Which tool would you recommend for that?

Did you try to listen to both options?

TheLion · « **Reply #6 on:** February 26, 2012, 10:55:15 am »

Quote from: Matt on February 26, 2012, 10:39:49 am

32-bit float can hold a 24-bit integer without loss (but can't hold a 32-bit integer).

Maybe nevcairiel could comment on the 24bit to 32bit math used? I believe normally you would use something like float(value) / float(2 ^ (bits - 1)). There shouldn't be any limiting, dithering, or anything else.

I can post exactly what JRiver uses for 24bit to 64bit conversion on Monday if it would be helpful.

Thank you for that!

Hendrik · « **Reply #7 on:** February 26, 2012, 11:01:39 am »

int -> fp -> int is only lossless if both conversions are done using the same algorithm.

Anyway, here is what i do (24-bit is shifted to 32-bit and the same logic applied)
float = value * (1.0f / (1U<<31))

This is "method 1" according to this post, which seems to be at least one standard in audio implementations:
http://blog.bjornroche.com/2009/12/int-float-int-its-jungle-out-there.html

I have no big interest in adding any more complicated conversions, because i strongly believe that the decoder should output untouched audio. Those options really are only there to ensure compatibility with other audio hard/software or for debugging.
Therefore, i will also not add 64-bit float, because that is truly pointless.
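To make the "same algorithm in both directions" point concrete, here is a small Python sketch of the conversion described above (shift 24-bit to 32-bit, multiply by 1 / 2^31) followed by two ways of converting back. The mismatched scale factor `2**23 - 1` is only an assumption chosen to illustrate the kind of convention difference the linked blog post discusses; it is not known to be what Media Center or any other player uses.

```python
import struct

def to_float32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def int24_to_float(value):
    """Shift a signed 24-bit sample to 32-bit, then scale by 1 / 2**31."""
    shifted = value << 8  # 24-bit sample placed in the top bits of 32
    return to_float32(shifted * (1.0 / (1 << 31)))

def float_to_int24_same_scale(sample):
    """Convert back with the matching scale factor 2**23."""
    return int(round(sample * (1 << 23)))

def float_to_int24_other_scale(sample):
    """Convert back with 2**23 - 1 (hypothetical mismatched convention)."""
    return int(round(sample * ((1 << 23) - 1)))

samples = [-(1 << 23), -1, 0, 1, (1 << 23) - 1]
same = [float_to_int24_same_scale(int24_to_float(v)) for v in samples]
other = [float_to_int24_other_scale(int24_to_float(v)) for v in samples]
max_diff = max(abs(a - b) for a, b in zip(same, other))

print(same == samples, max_diff)
```

With matching scale factors the samples come back unchanged; with the mismatched factor the largest deviation in this sample set is a single 24-bit step.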

Matt · « **Reply #8 on:** February 26, 2012, 11:18:22 am »

Quote from: TheLion on February 26, 2012, 10:54:23 am

I would be very happy to. I was thinking about using a pink noise calibration sequence from AIX Blu-ray (dts-hd ma 7.1 stream) for that. How can I evaluate the statistical difference between the resulting files? Which tool would you recommend for that?

I would do a binary compare. I think you can do it with the command line.

BeyondCompare is a great difference tool, albeit overkill for this task.
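For anyone without a comparison tool at hand, a binary compare can also be sketched in a few lines of Python. This is only an illustration of the idea; the function name and the temporary demo files are invented for the example.

```python
import os
import tempfile

def compare_files(path_a, path_b, chunk_size=1 << 20):
    """Return (first_diff_offset, differing_bytes, same_size) for two files.

    Only the common prefix is compared byte by byte; first_diff_offset is
    None when that prefix is identical.
    """
    first_diff = None
    differing = 0
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if not a or not b:
                break
            n = min(len(a), len(b))
            if a[:n] != b[:n]:
                for i in range(n):
                    if a[i] != b[i]:
                        differing += 1
                        if first_diff is None:
                            first_diff = offset + i
            offset += n
    same_size = os.path.getsize(path_a) == os.path.getsize(path_b)
    return first_diff, differing, same_size

# Small self-contained demo with temporary files standing in for two captures.
with tempfile.TemporaryDirectory() as tmp:
    original = bytes(range(256)) * 4
    altered = bytearray(original)
    altered[10] ^= 0x01
    altered[500] ^= 0xFF
    paths = {}
    for name, data in (("a", original), ("b", original), ("c", bytes(altered))):
        paths[name] = os.path.join(tmp, name + ".bin")
        with open(paths[name], "wb") as fh:
            fh.write(data)
    result_same = compare_files(paths["a"], paths["b"])
    result_diff = compare_files(paths["a"], paths["c"])

print(result_same, result_diff)
```

In real use the two Disk Writer captures would be passed to `compare_files` instead of the demo files.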

Matt · « **Reply #9 on:** February 26, 2012, 11:25:17 am »

Quote from: nevcairiel on February 26, 2012, 11:01:39 am

int -> fp -> int is only lossless if both conversions are done using the same algorithm.

Anyway, here is what i do (24-bit is shifted to 32-bit and the same logic applied)
float = value * (1.0f / (1U<<31))

This is "method 1" according to this post, which seems to be at least one standard in audio implementations:
http://blog.bjornroche.com/2009/12/int-float-int-its-jungle-out-there.html

That jungle of differences would only explain a difference of one in 24-bit, which would be impossible to hear (the best hardware can use about 20 bits).

But if something overflowed and looped around, a person could hear it.

The binary comparison of Disk Writer output should be definitive.

Quote

I have no big interest in adding any more complicated conversions, because i strongly believe that the decoder should output untouched audio. Those options really are only there to ensure compatibility with other audio hard/software or for debugging.

I don't see a reason to put bitdepth conversion in the decoder. That should be our problem, not your problem.
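The two cases contrasted in this post, a one-step difference at 24 bits versus an overflow that wraps around, differ enormously in magnitude. The Python sketch below puts rough numbers on both; the function names are invented for illustration, and the wraparound shown is plain two's-complement arithmetic, not a claim about what any particular decoder does.

```python
import math

def lsb_level_dbfs(bits):
    """Level of a one-LSB step relative to full scale for signed PCM, in dB."""
    return 20.0 * math.log10(1.0 / (2 ** (bits - 1)))

def wrap_to_signed(value, bits):
    """Two's-complement wraparound, as happens when an overflow is not clamped."""
    span = 1 << bits
    half = span >> 1
    return (value + half) % span - half

one_lsb_24 = lsb_level_dbfs(24)             # roughly -138.5 dB below full scale
in_range = wrap_to_signed(100, 24)          # values inside the range are untouched
overflowed = wrap_to_signed(1 << 23, 24)    # one step past the positive maximum

print(round(one_lsb_24, 1), in_range, overflowed)
```

A sample one step above the positive maximum lands on the most negative value, which is a full-scale jump rather than a one-step error.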

TheLion · « **Reply #10 on:** February 26, 2012, 11:35:49 am »

Quote from: nevcairiel on February 26, 2012, 11:01:39 am

int -> fp -> int is only lossless if both conversions are done using the same algorithm.

Anyway, here is what i do (24-bit is shifted to 32-bit and the same logic applied)
float = value * (1.0f / (1U<<31))

This is "method 1" according to this post, which seems to be at least one standard in audio implementations:
http://blog.bjornroche.com/2009/12/int-float-int-its-jungle-out-there.html

I have no big interest in adding any more complicated conversions, because i strongly believe that the decoder should output untouched audio. Those options really are only there to ensure compatibility with other audio hard/software or for debugging.
Therefore, i will also not add 64-bit float, because that is truly pointless.

nevcairiel,

Many thanks for the response. I agree that any decoder should output untouched audio and that all necessary conversions should be done later in the chain. You mention that int -> fp -> int is only lossless if both conversions are done using the same algorithm -> that leaves this task to Media Center, as it does the final fp->int conversion for output as well. I know the procedure and the logic behind it, and I would be happy if I didn't hear a difference or if I preferred Media Center doing this conversion. But as it is I will keep forcing fp output in LAV Audio and I hope that I/we get to the bottom of what is causing the perceived difference (other than mental issues ;-).

Thanks again!

TheLion · « **Reply #11 on:** February 26, 2012, 12:05:28 pm »

Matt,

I have done as you suggested - using Disk Writer instead of ASIO as output option. I recorded title 8 of the AIX calibration Blu-ray. This is a 7.1 track in 96 kHz DTS-HD MA. It identifies each channel (voice) followed by pink noise. I use my standard DSP Studio setup as described (including convolution with 96 kHz filters). I disabled Auto A/V sync in LAV Audio because the Disk Writer option plays the video file much faster than normal playback. I recorded two files: one with 24bit int output from LAV Audio and one with 32bit fp output forced.

The file sizes are identical - therefore the recorded streams have exactly the same running time, so the comparison is valid. I will try to make a binary comparison now.

TheLion · « **Reply #12 on:** February 26, 2012, 12:14:12 pm »

I use the fc command (Windows command line, as described here: http://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/fc.mspx?mfr=true) with the following parameters: fc /b AIX_24bit_int.wav AIX_32bit_fp.wav

It is running now...it seems to be quite a workout for my i7 comparing these 732MB files on a binary level.

TheLion · « **Reply #13 on:** February 26, 2012, 12:58:43 pm »

It has finished. As output I get a lot of numbers which I take to be differences in the bitstream. It seems the files are certainly not identical. See attachment.

TheLion · « **Reply #14 on:** February 26, 2012, 01:08:32 pm »

Matt,

another matter is the final fp -> int conversion necessary for output. Some hardware players offer multiple options for different dithering algorithms used for this task. As this is something that objectively has an impact on the sound signature of the output it may be useful to make such options user-selectable in Media Center. Because it is a matter of preference, giving the user the choice relieves you of the responsibility of selecting the "best" dithering algorithm ;-)
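As a sketch of what a dithering option changes in the fp -> int step, the Python below compares plain rounding with TPDF (triangular) dither. TPDF is just one common choice picked for illustration; nothing here describes what Media Center or any hardware player actually implements, and the function name is made up.

```python
import random

def quantize(sample, bits, rng=None):
    """Convert a float sample in [-1.0, 1.0) to a signed integer.

    With rng given, TPDF dither (difference of two uniform values, peaking
    at +/- 1 LSB) is added before rounding; without it the sample is simply
    rounded to the nearest step.
    """
    scale = 1 << (bits - 1)
    x = sample * scale
    if rng is not None:
        x += rng.random() - rng.random()
    q = int(round(x))
    return max(-scale, min(scale - 1, q))  # clamp instead of wrapping

# A constant signal a quarter of one 16-bit step above zero.
tiny = 0.25 / (1 << 15)
plain = quantize(tiny, 16)

rng = random.Random(1)
count = 20000
dithered = [quantize(tiny, 16, rng) for _ in range(count)]
dithered_mean = sum(dithered) / count

print(plain, dithered_mean)
```

Plain rounding discards the sub-step signal entirely, while the dithered output preserves it on average at the cost of added noise, which is why different dither or noise shaping choices can produce different output bits.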

TheLion · « **Reply #15 on:** February 28, 2012, 11:12:31 am »

I guess I am asking about selectable dithering / noise shaping algorithms with my last post. With all the discussion about "bit-perfect" playback lately it is my opinion that there are topics like the one I showed here that have an impact on perceived sound "quality" and don't qualify as "audiophile snakeoil".

The issue above shows differences both objectively (bit comparison of the outputs) and subjectively (quite different sound signature). I trust that both - JRiver MC and LAV Audio - do the int->float conversion in a "mathematically correct way". Still the final output is different. In the context of my setup I tend to prefer the more "edgy, forward" sound signature I get with forcing LAV audio to do this conversion. Logic tells me it is much better to let MC handle it.

But when we agree and can prove (-> bit compare) that different algorithms (for tasks like int->float->int conversion/dithering/noise shaping) result in relevant differences that users are also able to perceive as such, it is IMHO best to let users choose between them. Because everybody has different preferences and each and every system/setup sounds vastly different with the same input signal.

But the easy way out is to suggest that digital audio with sufficient precision and therefore "bit-perfect" playback sounds the same by definition. But what is "bit-perfect playback" when different algorithms for the necessary tasks mentioned above are used? The moment a single bit from the output is different from the input stream it is not bit-perfect playback ;-) Therefore JRiver MC doesn't provide "bit-perfect" playback - and that is a very good thing! I extensively use convolution which is the antithesis of "bit-perfect" playback (if you take the true meaning of the word and not just "mathematically correct processed audio"). Things like internal volume and even converting to 64bit fp break "bit-perfect" playback. So whenever you process audio streams (like JRiver MC always does by using a 64bit fp audio engine) there is no bit-perfect playback. All you can do is use max. precision and the "best" algorithms suitable for the given task. But which algorithm is "best" depends on many factors. So making those user selectable is a very sensible thing to do for a developer who doesn't believe in "absolute truth" ;-)

Matt, you started an (infamous) thread on another forum asking for what is necessary to build the "best audio playback engine". Giving the user relevant choices is my answer. That's something JRiver historically does best - giving options (even if their value is "questionable" -> like memory playback). Different algorithms with different bit-output and perceived sound signature are not questionable in the world of digital signal processing.

Matt · « **Reply #16 on:** February 28, 2012, 11:57:31 am »

I tested LAV native output vs 32bit output, and it's bit-identical for me at 24bit output.

Testing was done with a Blu-ray with a DTS-MA track (using dtsdecoderdll.dll).

First I played with Disk Writer using Red October Standard. LAV delivers native 24bit data in this case.

Then I used Custom video mode and configured LAV to always deliver 32bit data. Media Center converts back to 24bit for disk writing.

The two clips are identical.

That leads me to believe there's no problem or room for improvement here.

TheLion · « **Reply #17 on:** February 28, 2012, 12:53:23 pm »

Matt,

thank you for checking. I don't quite understand the results you are getting. I just redid the "Disk Writer" test. This time I captured the DTS-HD MA encoded clip 2 times with each setting. After that I ran the fc command with the 2 files of the same setting. The message I get is: "FC: no differences encountered". Then I run fc comparing files with different settings. I get the same list I posted before. So my captures are different. AND they sound different as well. And nevcairiel also mentioned that "int -> fp -> int is only lossless if both conversions are done using the same algorithm.". As you are likely not using the same algorithm as the one he posted the results will vary (after you do the final fp -> int conversion).

Which application did you use for binary compare? Does it matter that I have Output Format, Convolution and Room Correction enabled in DSP studio (it shouldn't alter the results)?

Anyway - for the time being I will simply output 32bit fp from LAV Audio and enjoy my placebo ;-) Thank you.

Hendrik · « **Reply #18 on:** February 28, 2012, 01:28:23 pm »

If you want to compare on a bit level, you should try without any DSP effects.

Regarding the int -> fp -> int conversion - even if different algorithms were used, the difference in the audio would be only one bit; at 24bit output that's below any human's perception, no matter how good you think your ears are.

Not to mention that even high quality DACs very rarely can process the full 24-bits.

mojave · « **Reply #19 on:** February 28, 2012, 02:35:37 pm »

I just tested a 3 minute 11 second DTS-HD clip of Battle: Los Angeles which is 16 bit and 48 kHz. Using Beyond Compare it is bit perfect with RO Standard, RO Custom with LAV at 32-bit floating point, 24-bit integer, and 16-bit integer. It was not bit perfect when using 8-bit integer which is expected. Interestingly, when using 32-bit float in LAV, MC doesn't light up the Audio Path Direct icon because it is changing to 24-bit output for disk writer and thinks the source is 32-bit.

Hendrik · « **Reply #20 on:** February 28, 2012, 03:07:40 pm »

Quote from: mojave on February 28, 2012, 02:35:37 pm

Interestingly, when using 32-bit float in LAV, MC doesn't light up the Audio Path Direct icon because it is changing to 24-bit output for disk writer and thinks the source is 32-bit.

That's the behaviour i would expect, because it simply is not bit accurate.

32fp can store more precision than 24bit integer, and MC doesn't know that the decoder just converted it for no good reason.
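The point that a 32-bit float stream may carry values a 24-bit integer cannot represent can be shown with a short Python sketch. The grid check below is purely hypothetical; it is not something Media Center is known to do, and the helper names are invented for the example.

```python
import struct

def to_float32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def fits_int_grid(sample, bits):
    """True if the float sample is an exact multiple of one LSB at this depth."""
    scaled = sample * (1 << (bits - 1))
    return scaled == int(scaled)

converted = to_float32(1234567 / (1 << 23))  # a 24-bit sample stored as float32
native = to_float32(0.1)                     # a value a float pipeline might produce

print(fits_int_grid(converted, 24), fits_int_grid(native, 24))
```

A float that started life as a 24-bit sample sits exactly on the 24-bit grid, whereas an arbitrary float32 value generally does not, so a player that only sees "32-bit float" has no cheap way of knowing the data could be written back to 24 bits without loss.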

TheLion · « **Reply #21 on:** February 29, 2012, 01:47:55 am »

Thank you everybody for checking. It seems to be a problem on my part - some bad interaction of settings perhaps. Sorry for the confusion. I will try the 30 days trial of Beyond Compare to see what's going on.

Thanks.

mojave · « **Reply #22 on:** February 29, 2012, 09:14:47 am »

I did the same tests with the same clip, but using the playback range tag to shorten the clip to a little over a minute. With this method the file sizes were different each time and the files didn't match. I think the playback range isn't precise enough to start the file at exactly the same place each time.

TheLion · « **Reply #23 on:** February 29, 2012, 10:35:45 am »

Mojave,

I was thinking the same thing - when I start my test clip it is doing the standard hardware sync for 2 seconds while switching to Fullscreen Exclusive mode (madVR). Disk Writer should start recording with the first audio sample (video frame?) - so the startup time shouldn't matter. What happens if audio and video are out of sync by a few samples? (I disabled auto A/V sync for my test runs)

But that doesn't explain the difference I hear. This perceived difference caused me to start this thread. Is placebo that strong? Or does it matter that I switch between these two options in realtime during playback - perhaps when switching between them something in the playback chain is off, while it works ok when playing the clip exclusively with one setting. It is very hard to tell/remember the difference when playing with one setting and after that re-playing the same clip with the other setting. I would compare the difference I (believe I) hear to a CD versus an SACD of the same recording.

Strange.

Matt · « **Reply #24 on:** February 29, 2012, 10:53:31 am »

Quote from: TheLion on February 29, 2012, 10:35:45 am

Mojave,

I was thinking the same thing - when I start my test clip it is doing the standard hardware sync for 2 seconds while switching to Fullscreen Exclusive mode (madVR). Disk Writer should start recording with the first audio sample (video frame?) - so the startup time shouldn't matter. What happens if audio and video are out of sync by a few samples? (I disabled auto A/V sync for my test runs)

But that doesn't explain the difference I hear. This perceived difference caused me to start this thread. Is placebo that strong? Or does it matter that I switch between these two options in realtime during playback - perhaps when switching between them something in the playback chain is off, while it works ok when playing the clip exclusively with one setting. It is very hard to tell/remember the difference when playing with one setting and after that re-playing the same clip with the other setting. I would compare the difference I (believe I) hear to a CD versus an SACD of the same recording.

Strange.

There may be something real happening, but I think you need to do more testing to get to the bottom of it.

Start with the test I did above. See if you get the same results. Disable extra processing.

Then add complexity slowly until you find what makes the difference.

To speed up testing, you can play just a few minutes of a video and hit stop (hit stop twice to clear the bookmark for the next play). Even if the files are different lengths, the audio data at the front should match. Any good comparison tool will make this easy to see.
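For captures of different lengths, comparing only the audio payload at the front can be sketched in Python as follows. This is a minimal illustration that assumes ordinary RIFF/WAVE files small enough to read into memory; the helper names and the tiny in-memory test files are invented for the example.

```python
import struct

def wav_data_bytes(blob):
    """Return the payload of the first 'data' chunk in a RIFF/WAVE byte string."""
    if blob[:4] != b"RIFF" or blob[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12
    while pos + 8 <= len(blob):
        chunk_id = blob[pos:pos + 4]
        (size,) = struct.unpack("<I", blob[pos + 4:pos + 8])
        body = pos + 8
        if chunk_id == b"data":
            return blob[body:body + size]
        pos = body + size + (size & 1)  # chunks are padded to even length
    raise ValueError("no data chunk found")

def first_mismatch(a, b):
    """Index of the first differing byte in the common prefix, or None."""
    n = min(len(a), len(b))
    if a[:n] == b[:n]:
        return None
    for i in range(n):
        if a[i] != b[i]:
            return i
    return None

def make_wav(payload):
    """Build a tiny mono 24-bit, 48 kHz WAV in memory (demo helper only)."""
    fmt = struct.pack("<HHIIHH", 1, 1, 48000, 48000 * 3, 3, 24)
    chunks = b"fmt " + struct.pack("<I", len(fmt)) + fmt
    chunks += b"data" + struct.pack("<I", len(payload)) + payload
    return b"RIFF" + struct.pack("<I", 4 + len(chunks)) + b"WAVE" + chunks

payload = bytes(range(48))
changed = bytearray(payload)
changed[7] ^= 0x01

long_take = wav_data_bytes(make_wav(payload))
short_take = wav_data_bytes(make_wav(payload[:30]))   # playback stopped earlier
changed_take = wav_data_bytes(make_wav(bytes(changed)))

prefix_result = first_mismatch(long_take, short_take)
changed_result = first_mismatch(long_take, changed_take)
print(prefix_result, changed_result)
```

For real captures the byte strings would come from reading the two Disk Writer files, for example with `open(path, "rb").read()`.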
