Do you like your music audio-processed?

rrsounds · Mar 17, 2009

littlejohn said:
Gentlemen -

The last several posts remind me of the country church during services (I grew up in the very rural Southeastern US) when Brother Eddie was testifying. He testified about his fight with temptation when he and Sister Jane were climbing in the hills. And testified about the second fight with temptation as they continued their climb. As he began his testimony of his third fight with temptation, an LOLwBH (Little Ol Lady with Blue Hair) stood up and said "Reverend, make Brother Eddie be quiet. He ain't testifyin' any more, he's braggin'!".

Gentlemen, more testimony and less brag, whattaya say?

I don't know if you noticed it or not, but there's a whole lot of SINNIN' goin on in the audio processor world!!
We all gotta TESTIFY!! LOL!

;D

David

konbaasiang · Mar 17, 2009

A-MEN to that


///Leif

The F Mister · Mar 17, 2009

littlejohn said:
Gentlemen -

The last several posts remind me of the country church during services (I grew up in the very rural Southeastern US) when Brother Eddie was testifying. He testified about his fight with temptation when he and Sister Jane were climbing in the hills. And testified about the second fight with temptation as they continued their climb. As he began his testimony of his third fight with temptation, an LOLwBH (Little Ol Lady with Blue Hair) stood up and said "Reverend, make Brother Eddie be quiet. He ain't testifyin' any more, he's braggin'!".

Gentlemen, more testimony and less brag, whattaya say?

As long as they respect each other’s point of view, there's nothing wrong, right? I must say that as I read along (and have to google/wiki a lot) I'm learning from what is being said.

littlejohn · Mar 17, 2009

Oh, there's good information being put forth. It requires only that one pick the pearls out of the roadapples. (More than one parishioner put Sister Jane on their visitation list, too!)

jesseg · Mar 18, 2009

David Reaves said:
You're actually reinforcing some of my points.
To begin...
My antenna ALWAYS goes up when someone uses the qualifier: "done properly."

The reason I said this is because ReplayGain is a proposed standard. It has not gone through the ISO approval process yet, but I expect it eventually will, now that XIPH has money (thanks to the Mozilla Foundation). That being said, there's no "wrong" way of doing it, yet. Only a "proper" way, which is why I used that word. It should also be noted that I have never seen ReplayGain done in a way that isn't proper.

Prove me wrong (without coding one that truncates yourself, which nobody would ever use).

David Reaves said:
Next, one cannot assume ANYthing about activity in the lowest number of bits by only examining the overall loudness level. The automatic assumption that, just because the top bits are trashed, it's therefore OK to trash the bottom bits, is specious. You may get lucky, heck, you may always be lucky! But, personally, I try to keep luck and chance at arm's length.

I assume not, and I asked you to check out lossyWAV because it's by far the easiest way to see how many bits you do not need, and I asked you to check for yourself to see if you can actually hear a difference -- using a blind ABX test of statistical significance which is widely considered to be P<0.05 in the audio world. You have not done this, and now you come here asking us not to assume things?
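For anyone following along who hasn't scored an ABX run before, here is a minimal sketch of what "P<0.05" means in practice. It assumes the usual one-sided binomial scoring, where every trial is a 50/50 guess if the listener can't hear a difference. The function name `abx_p_value` and the 16-trial run are illustrative choices, not something taken from a particular ABX tool.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """One-sided binomial p-value for an ABX run.

    Returns the probability of getting at least `correct` answers right
    out of `trials` by pure guessing (0.5 chance per trial).
    """
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    # Count every outcome with `correct` or more hits, out of 2**trials outcomes.
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


if __name__ == "__main__":
    # Hypothetical 16-trial session with different scores.
    for correct in (11, 12, 14):
        p = abx_p_value(correct, 16)
        verdict = "passes" if p < 0.05 else "does not pass"
        print(f"{correct}/16 correct: p = {p:.4f} ({verdict} P<0.05)")
```

In a 16-trial run under these assumptions, 12 correct is the lowest score that gets under 0.05; 11 correct does not.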

David Reaves said:
Further, I'm sure you respect the idea that dithering is not something to be applied casually or at numerous stages. Let's say the CD mastering engineer struggled to dither material down to 20 dB below the CD noise floor. Maybe he spent weeks on it, getting it just right...Is ReplayGain going to respect that when it blindly reduces gain a further 12 dB? Always?

It's not a matter of "respect". Dithering in the first place is entirely based on errors. It deliberately introduces errors into the data stream. There is no respect at all for the data, from a mathematical perspective. Luckily dithering is not done to keep mathematicians happy (because they would never dither in the first place). It's made to best represent the format of data within the allotted sample resolution, as specifically tuned as possible to its own format. In this case audio.

David Reaves said:
I'm confident you can find thousands of examples of songs that have limited bit-depth, or that if bit-reduced, don't present obvious side effects. At the risk of sounding sarcastic I would respectfully answer "Duh!"
That's the world we live in! But that description does not fit every song, which is the crux of my argument.

Actually... it's the crux of your opinion, having not even tried what you are talking about. ...at least, if you have experience, you're sure not saying what your experience with bit reduction actually is, and it's left your argument standing in the doorway of opinion.

David Reaves said:
IOW, just because a destructive technique works transparently with certain files, maybe even a statistically large number of them, doesn't make its use categorically applicable to ALL files. Particularly since we have zero knowledge of how the future will treat those files. Future processing, future personnel, etc.

Actually, with standard settings in lossyWAV (which btw only uses every-day TPDF dithering, nothing advanced at all) there are ZERO known problem samples (as in nobody can blind ABX the difference between anything) right now, and that's with only a very very small fraction not at 14bit (and less) bit resolution from a 16bit source. And they have been going through a mountain of material with similar properties that had been known to cause problems with its detection algorithm in the past (mainly the bits getting reduced too much, which is statistically more like winning the lottery now). And even among those problem tracks, I only remember two of them EVER having a problem because it was reducing the bit depth to even 14bits. The problem was always because it was reducing more than 14bits. And I should ask you to keep in mind that the "problem" is that these very experienced listeners (some of the same people that made the Lame codec what it is today) could even blind ABX the difference at all, on a very specific sound and aspect of that sound...

David Reaves said:
Over and over, in this post and others, I read: "listen for yourself!" But almost without fail, these examples are files that have not been post-processed in any way, and post processing is a HUGE unknown! My gut feeling: to assume that in our industry there will be little or no processing after ReplayGain (or lossyWAV, or whatever) is, at best, standing upon shaky ground.
Cuz, hey, even though some 128kbps MP3s sound acceptable until they're processed. ;D ... still they DO get processed.

I'm not assuming anything. First off, I process for taste much of what I listen to. (on-topic, imagine that). Secondly I mainly use mp3 files. Thirdly I use mp3gain which changes the volumes internally in the mp3 file, losslessly, and when the mp3 is re-created back into RIFF PCM it *still* goes through the same dithering in the decoder that it would have gone through anyway. mp3 does not have a bit depth. But even so, the FLAC that I do have of my CD & Vinyl collection does go through ReplayGain, and it is dithered with such quality that I would not be able to blind ABX it against ANY of the methods of dithering that I have available - which is everything I have ever heard of, including powR, various TPDF, various Gaussian, psycho-acoustics like psychodither and MBit+, Apogee (in plugin form too), etc, etc, etc... (most of the rest, crap)

I've trained myself to be very sensitive to dithering & dithering noise actually... And there's a number of unique recordings (of worthy quality) that I can blind ABX the difference between 24bit original and 16bit very carefully dithered bit-reduced versions... to P<0.05.

And what'll ya know, lossyWAV works on anything up to 32bit INT + 8-byte float. When used on these VERY FEW known 24bit recordings that very very few people are able to blind ABX to P<0.05... it doesn't even reduce the depth to 16bits.

More like 18-19 bits. This goes to show how well tuned the lossyWAV detection algorithm is, and you would be foolish to not try it out to hear (or most likely not hear) what I'm talking about.

David Reaves said:
Education only comes when someone has the desire (and the time!) to gain knowledge. The 'tweekers' amongst us are in that category. But once again, my experience is that tweekers are a very small minority of the universe of people who are called upon to use all these tools we are giving them. We HOPE they use them with intelligence, but we must never ASSUME that will be the case! Have you listened to the radio lately?

I have. And might I suggest that some people who are "tweekers" are not just limited by time and desire. Your opinion of what is "good" is only as good as the best thing you have ever heard, and for how long you have had time with it. There are many things that limit that... first and foremost: listening gear, and the environment it is in.

If, for instance, Peavey's in your bathroom is the only listening experience you have ever had in your whole life... time and desire is NOT a defining factor of what is wrong with that picture. I'm not saying you can't learn a lot and develop a pretty accurate opinion on an actually reputable (for good statistical reason, not just people's opinions) pro-sumer listening setup... But a $500 "theater system" plugged into a Sound Blaster is sitting right next to the bathroom, if you get my drift.

David Reaves said:
As a designer, I have what I call my "Guardrail at the Grand Canyon" rule. There's a guardrail at the Grand Canyon, so tourists won't fall and kill themselves. Now, rock climbers don't need a guardrail, they need a rope! But we don't hand out ropes to tourists. They get the guardrail.

You may consider yourself to be a "rock climber," rather than a "tourist," and, God bless ya, YOU may have a great skill set. Grab yerself a rope!
But I try not to confuse the two categories, because MOST people we will meet belong in the 'tourist' category, for good reason.

I have to very carefully consider the outcome before I start handing out ropes to them.

Name me one person that doesn't know how to code a dithering algorithm, that has ever coded one? ONE. Then you will see that - in and of itself - IS the guard rail. I'm not saying that there has never ever been coded a DSP that ruins music, but ReplayGain is far FAR from that. In fact it's held up entirely to the likes of HydrogenAudio, who literally BAN people for having a subjective opinion. They are only objective, and a blind ABX test can only prove whether you can or can't hear the difference between two things. ReplayGain has held up to that brutal environment, and come out with shining colors. And so have Lame, FLAC, Vorbis, Speex, lossyWAV, MPC, Monkey's Audio, WavPack, lossy WavPack, Nero AAC, etc, etc, etc x1,000.... all of these have been improved to the utmost highest of standards for quality in their classes (which is also tested with statistical significance; subjective though it is, it's still been tested regularly)

David Reaves said:
Because my point was that a reduced level 16-bit signal will fit 24 bits just fine; of course you won't need to truncate it. Or do anything else. Human hearing is an amazing thing, but it doesn't have 144 dB dynamic range! But, unfortunately OTOH, neither do most radio stations have 24-bit storage systems.

Then you clearly don't understand the math at hand. Riddle me this. If ReplayGain is used, and is truncating to 24bits from a 16bit source.... and it decides that the signal level has to be reduced 4dB... how do you fit the FRACTIONAL math into only 24bits, without losing ANY information (from a mathematical perspective)?

This is WHY we have dithering as part of any "proper" re-quantization method.

"Signal requantization to reduce the word-length of an audio stream introduces distortions. Noise shaping can be applied in combination with a psychoacoustic model in order to make requantization distortions minimally audible."

So in the case of ReplayGain reducing a signal by 4dB... it is turning the PCM samples into floating-point numbers (which are still not large enough to completely accurately represent the signal, but are much more accurate than 24bit INT)... making its change to those numbers, and then requantizing those numbers back into PCM samples. If those samples are 24bit INT, then there is still no reason not to dither the LSB, instead of just truncating it. It's highly likely you won't hear the difference even after processing on most gear.
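The gain-then-requantize step described above can be sketched in a few lines. This is only an illustration under stated assumptions, not ReplayGain's or any player's actual code: it writes back to 16bit INT rather than 24bit to keep the numbers small, it uses plain TPDF dither with no noise shaping, and the helper names (`db_to_gain`, `apply_gain_truncate`, `apply_gain_tpdf`) are made up for the example.

```python
import math
import random

PCM_MIN, PCM_MAX = -32768, 32767  # 16-bit signed PCM range (assumed target format)


def db_to_gain(db):
    """Convert a gain in dB to a linear multiplier."""
    return 10.0 ** (db / 20.0)


def clamp(n):
    """Keep a requantized sample inside the 16-bit range."""
    return max(PCM_MIN, min(PCM_MAX, n))


def apply_gain_truncate(samples, gain_db):
    """Scale in floating point, then simply drop the fractional part."""
    g = db_to_gain(gain_db)
    return [clamp(math.trunc(s * g)) for s in samples]


def apply_gain_tpdf(samples, gain_db, rng):
    """Scale in floating point, add TPDF dither, then round to an integer."""
    g = db_to_gain(gain_db)
    out = []
    for s in samples:
        # Difference of two uniform values: triangular PDF, peak +/- 1 LSB.
        dither = rng.random() - rng.random()
        out.append(clamp(math.floor(s * g + 0.5 + dither)))
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    quiet = [1] * 20000  # a signal sitting at exactly 1 LSB
    target = db_to_gain(-4.0)  # roughly 0.63 LSB after a 4 dB cut

    truncated = apply_gain_truncate(quiet, -4.0)
    dithered = apply_gain_tpdf(quiet, -4.0, rng)

    print("ideal value     :", round(target, 3))
    print("truncated mean  :", sum(truncated) / len(truncated))
    print("TPDF dither mean:", round(sum(dithered) / len(dithered), 3))
```

With truncation the 1-LSB signal vanishes entirely after the 4dB cut. With dither the individual samples are noisy, but their average stays close to the fractional value, which is the information truncation throws away.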

Co-incidentally it's the analog systems that largely barely even approach 16bits of resolution, much less 24bits. Not the digital systems.

My own "best" reference listening device is an Apogee Mini-DAC. Having an analog volume pot AFTER the amp... it is capable of getting DARN close to 24bit accuracy in the analog world, and that doesn't include the +24dBu its THD+N was rated at!!! I don't actually know of another DAC that can claim such statistics. And yes, I've measured its minimal voltage at my "reference" output levels, on the headphone output. My Lipinski amps don't even approach 24bits of dynamic range so I couldn't care less about the XLR outputs, but those are even cleaner than the headphone outputs according to the specs.

And I don't have to remind you that the noise floor of even a mono FM transmission itself would LOVE to sound as good as 14 bits can.

Yes, I said it. 14 bits sounds better than the FM transmission itself, under ANY circumstances. The noise floor of the transmission has your connection of your opinion with an FM plant... shot in the foot before it even leaves the opinion door.

David Reaves said:
And finally, that last line of yours sums up my whole reason for doubting the wholesale incorporation of ReplayGain etc:

"Just because you personally can't hear something on your own gear, doesn't mean it never will be heard"

BINGO! Words to live by!

Kind Regards,
David

First off, you're missing the whole point -- the use of ReplayGain that we're talking about. The context of this entire thread has been for personal listening, and more recently broadcasting. It has never once been about archival.

The reason I mentioned that you should use dithering instead of truncation was to minimize the need to redo your various libraries of copies of your media.

And on the discussion of archival... you really don't want to get into that with me. I'm sitting next to a SADiE 5 (+ CEDAR) right now, and have several other 1-bit recording systems in the room. I'm not trying to have this be a pissing contest. Just simply pointing out that if you're trying to "school" me on dithering, bit-depth, what happens to digital audio in the broadcast chain (good and bad ones), you're preaching to the pastor. 8)

Lastly.... I need to point out that having educated myself, I'm still not suggesting what decision people should make, I'm asking that everyone make an educated decision for themselves. You however are suggesting that people (like yourself) should decide not to even try to educate themselves, and just stay away from something entirely.... despite ReplayGain's proposed standard having more thought and testing put into it than the CD Audio & DAT standards combined! I mean... come ON!!! If you can't even trust your own ears or gear, at least do some research on people's findings rather than live in the dark. SURELY you did not research CDs to find out if they were good enough for your purposes BEFORE you started using them... you heard them, and that combined with the research and testing Sony & Philips put into the standard, you decided it was adequate. EVEN THOUGH mathematically it's true... when being processed through DSP it can in extreme cases (lots of gain from low level source with DSP code of very high resolution) reveal the source's digital nature in the analog world (after DAC conversion)... yet you STILL are using CDs.

Why is that? Why have you not moved on to SACD yet? Is it because 16bits is already MORE than good enough? And does this have a connection with the FACT that almost all CDs ever released can be inaudibly reduced to at least 14bits?

_________________________________________________

Like I said before... if you ever want to put your money where your mouth is, I'll PERSONALLY help you setup a double blind ABX test, with ANY 16bit sources you want.

That was your original argument: whether ReplayGain, to the point of reducing the original 16bit source to 14bits, was audible. And I would love to see you put your money where your mouth is. Anything less and everything you are saying about this topic is pure unfounded opinion... and is in opposition NOT to me, nor my subjective opinion... but in opposition to the countless man-hours and experienced listeners who have helped bring ReplayGain into existence. And in opposition to the countless men (and women) who have not found ANY audible problems with ANY of its implementations.

So the ball is in your court. I won't reply in public to you again on this topic, but you may reply to me in public if you want. I will still read it.

jesseg · Mar 18, 2009

Here's a great example of what exactly you are removing, objectively (not including the subjective improvement of dithering)... when you remove bits from a 16bit source

The first row is when ALL bits are set to 1. So counting down from there are the audio levels, and how many changes between them, are available with the remaining bits. You'll count 16 remaining rows, one for each bit.

So if you remove two of the bottom rows of that, you get 14bit audio. You will notice you have removed a total of TWO possible volume changes (out of 65,536) and 12dB of dynamic range assuming truncation... at the bottom of the volume scale.

If you want to move that into the broadcast world... even in a quiet room, even with 12dB of gain, even with source truncated to 14bits... it is very unlikely you will be able to hear any difference on the vast majority of content.

(of course, add dithering into the picture and it's not likely on any content)

Part of this stems from there being more dithering in the audio processors themselves, which helps mask earlier stages of dithering AND truncating. And if you're using a digital mixer, that also has dithering in it. And if you're using digital audio networked transmission codecs, it's likely that they also have dithering as part of their adaptive sample rate conversion DSP (so that the audio never under-buffers). And there are other more rare devices and methods of moving digital audio that also can add dithering.

And of course the original 16bit CD source itself likely has dithering on it... even if it's just analog to ADC transfer from an analog mastering chain, almost all ADCs today dither from 24bit to the destination depth.

The digital recording systems will do the same thing within the DAC (if it's less than the tracking depth) and the mixing cores of those apps are also at least 32bits these days... 64bit floating-point cores are starting to crop up now, and they ALL use dithering to reduce bits to the destination format (I prefer at least 24bit INT for my mastering sources)...

There are so so many places dithering can be introduced, I really would not believe someone who was trying to tell me that their broadcast only has one level of dithering, if they don't create ALL of the program material in-house. It is completely unavoidable today, and I'm hoping you will realize that it's much more of a blessing, than a curse.

jesseg · Mar 18, 2009

Erm, the modify thing is too short (yes I know, I should have typed it all in one) but I thought of a great addendum to that.

Most basic TPDF styled dithering algorithms with basic noise-shaping can increase the subjective dynamic range by roughly ~15dB.

That's more dynamic range than what was being removed by those two bits.

So what exactly is audible? Well... TPDF is basically injecting errors guided by the shaped noise. So it's somewhat random, and the noise itself IS being added to the new LSB.

Let's say that a 16bit signal is processed in some inaudible way, and its output represents around 13.5 bits, and we're creating 14 bit samples during requantization. Assuming we would not have been able to hear the 13.5 bits change and it could be translated to analog audio perfectly (which is not possible yet) and we could not tell the difference between it and the original source... You would have only a 25% chance of hearing the dithering ONLY IF you could tell the difference between the 16bit source, and a source capable of representing -99dB of dynamic range... give or take the dithering noise which should be no louder than half (on average) of the new least significant bit, our 14th bit...

In other words... You would have to be able to hear the difference between the 16 bit original, and what sounds like a small fraction of 1 bit less than that 16 bit original.

But this is not the only thing ReplayGain does.

I have run a small number of things through ReplayGain in which it had to turn up the gain to reach the reference level I set, which is 85dB instead of ReplayGain's proposed standard of 89dB, simply because I have 85dB calibrated monitoring. Even when turning up the levels, it's a total freak accident if it needs to turn up (or down) the levels by an exact bit of difference. Even when turning up the levels it benefits the transparency of the audio greatly by having this dithering.

High internal resolution and dithering its final output benefits everything else that adjusts volume digitally as well, which is why this technique is used so much.

rrsounds · Mar 18, 2009

Jesse Graffam said:
<huge snippage>

So the ball is in your court. I won't reply in public to you again on this topic, but you may reply to me in public if you want. I will still read it.

Jesse,

My argument is NOT that this and related processes are audible, but that they cannot be proven to be inaudible, under all circumstances. Simply because we do not know what all the circumstances are.

Kind Regards,
David

The F Mister · Mar 18, 2009

Jesse Graffam said:
Here's a great example of what exactly you are removing, objectively (not including the subjective improvement of dithering)... when you remove bits from a 16bit source

The first row is when ALL bits are set to 1. So counting down from there are the audio levels, and how many changes between them, are available with the remaining bits. You'll count 16 remaining rows, one for each bit.

So if you remove two of the bottom rows of that, you get 14bit audio. You will notice you have removed a total of TWO possible volume changes (out of 65,536) and 12dB of dynamic range assuming truncation... at the bottom of the volume scale.

Silly question: when removing 2 bits, don't you actually always remove the upper two? And thereby actually remove the highest two possible level changes?

jesseg · Mar 18, 2009

No. All bits enabled still gets you 0dBfs at any bit depth of PCM sampling.

What does happen also is you reduce the number of total volume changes that way - because you're removing bits which multiply the total possibilities. So you would end up with 16,384 actual volume levels at 0dBfs. Thanks to dithering though, you end up with the perception of the signal being close to the original 65,536 levels even approaching the lowest signal levels.

PCM audio can't be taken at its face value of number of volume changes. For instance, most people (including me) have a very tough and/or impossible time blind ABXing between 16bit and 24bit... 65,536 levels, and 16,777,216 levels, respectively. If one were to take those numbers at face value, they would declare 24bit an EASY winner. In practice it is not always the case at all. 16bits is 1/256th of 24bit's audio quality by those numbers, but that also is not true in practice at all.

And even with truncation... 14bit is not 1/4 of the audio quality of 16bit.
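The level counts and the "12dB for two bits" figure quoted in this exchange are easy to check. The sketch below uses the standard relationship for ideal undithered linear PCM (levels = 2^bits, range = 20·log10(2^bits), about 6.02dB per bit); the function names are just for the example, and it says nothing about audibility.

```python
import math


def pcm_levels(bits):
    """Number of distinct sample values available at a given bit depth."""
    return 2 ** bits


def dynamic_range_db(bits):
    """Ratio of full scale to one LSB, in dB, for ideal undithered PCM."""
    return 20.0 * math.log10(2 ** bits)


if __name__ == "__main__":
    for bits in (8, 14, 16, 24):
        print(f"{bits:2d} bit: {pcm_levels(bits):>10,} levels, "
              f"{dynamic_range_db(bits):6.2f} dB")

    lost = dynamic_range_db(16) - dynamic_range_db(14)
    print(f"16 bit -> 14 bit gives up {lost:.2f} dB at the bottom of the scale")
```

Going from 16 to 14 bits cuts the level count to a quarter (65,536 to 16,384) but only gives up about 12dB of range out of roughly 96dB. That is one way to see why level counts taken at face value overstate the difference.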

GeneSavage · Mar 18, 2009

My argument is NOT that this and related processes are audible, but that they cannot be proven to be inaudible, under all circumstances.

I read this & thought, "he's an agnostic!" ;D

(This has been a FASCINATING thread, and while some of it has gone over my head, I'm learning a lot as well... fight on, fight on! I shall read what is posted...)

rrsounds · Mar 18, 2009

Jesse Graffam said:
<snip>
PCM audio can't be taken at its face value of number of volume changes. For instance, most people (including me) have a very tough and/or impossible time blind ABXing between 16bit and 24bit... 65,536 levels, and 16,777,216 levels, respectively. If one were to take those numbers at face value, they would declare 24bit an EASY winner. In practice it is not always the case at all. 16bits is 1/256th of 24bit's audio quality by those numbers, but that also is not true in practice at all. And even with truncation... 14bit is not 1/4 of the audio quality of 16bit.

When I got my first Macintosh in the late 1980's, it had eight-bit mono audio I/O, sampled at 22.05kHz.

Now, you may say, "what a load of crap!" and from a hi-fi standpoint, you'd be right...But one weekend when we used it to put jingles and promos --even a few songs-- on a pirate AM station....no one really was the wiser.

The number of bits required for any particular purpose really is contextual. While at least in theory I think it's best to keep audio at the highest resolution possible, in the S/N environment of AM radio, even as few as 8 bits is not a dealbreaker. The only place you really notice is on fades.

Kind Regards,
David