Quick question about file specs.

Workaphobe · Feb 21, 2009

I have a somewhat embarrassing question because I feel like I should know the answer to this already. But, I don't.

A new client has given me this instruction: "The final level of the audio track should be -3 dB down."

Now, I'm using Cool Edit Pro 2.0. Should I Normalize and tell CEP to normalize to -3 db? Or is there some other way I should do this? Normalizing to -3 db seems to reduce the overall levels too much since it's making the peaks top out at -3 db.

Emmett1057 · Feb 22, 2009

That's exactly what you should do...-3dB isn't much reduction at all, so I wouldn't worry about it reducing anything "too much".

Emmett

SirRoxalot · Feb 22, 2009

dB, or not dB

You do realize that -3dB is half the amplitude of 0 dB, right?

I'm not saying that it's not a good level that allows plenty of overhead to avoid clipping, but to say that it "isn't much reduction at all" isn't quite accurate.

dB reductions and gains are not linear in nature, they're exponential. dB are not a unit - they're a power ratio based on incoming power and outgoing power. There's a complicated logarithmic formula, but it's easier to work from a "rule of thumb":

+3dB doubles the gain of 0dB
+10dB is 10 times the gain of 0dB
-3dB is half the gain of 0dB
-10dB is 1/10th the gain of 0dB

These dB ratios are cumulative. -3dB is 1/2 the gain of 0dB. -6dB is 1/4 (1/2 of 1/2) of the gain. -10dB is 1/10 of the gain. -13dB is a 1/20 (1/2 of 1/10) of the gain of 0dB.

The same thing works on the plus side. +3dB is 2x the gain of 0dB. +6dB is 4x the gain. +10dB is 10x the gain. +13dB is 20x the gain of 0dB.

You can see where overmodulation can really get you into clipping in a hurry. Since a gain of "only" +3dB doubles the voltage of an audio signal, clippers and AGCs jump in real fast to prevent damage to the system.

Requesting that "the final level of the audio track should be -3dB down" essentially protects the sender from creating files that might have transient spikes that might easily distort, and/or would be crushed by a clipper or compressor.
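For anyone who wants to check the rule of thumb numerically, here is a minimal Python sketch. It assumes the standard definitions (10·log10 for power ratios, 20·log10 for amplitude or voltage ratios); the function names are made up for illustration. Under those definitions the "half" and "double" figures for 3 dB are power ratios, while the matching amplitude ratios are roughly 0.71 and 1.41.

```python
def db_to_power_ratio(db):
    """Power ratio for a level change in dB (10*log10 convention)."""
    return 10 ** (db / 10)


def db_to_amplitude_ratio(db):
    """Amplitude (voltage) ratio for a level change in dB (20*log10 convention)."""
    return 10 ** (db / 20)


# Print both ratios for the values used in the rule of thumb.
for db in (-13, -10, -6, -3, 0, 3, 6, 10, 13):
    power = db_to_power_ratio(db)
    amplitude = db_to_amplitude_ratio(db)
    print(f"{db:+4d} dB   power x{power:8.3f}   amplitude x{amplitude:6.3f}")
```

Because dB values add where ratios multiply, -13 dB comes out as the product of the -10 dB and -3 dB power ratios, which is the "cumulative" behaviour described above.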

Emmett1057 · Feb 22, 2009

Yes, half the power expressed as a voltage ratio of -1.414:1. Yes, believe me, I realize plenty. We can get into how intensity works within the wattage and voltage scales, if you like, but I don't see any reason to go down that road. What you're failing to take into account is the 96.33dB of available amplitude with 16-bit audio. This is, in fact, dBfs we're talking about here, so the only byproduct of this can be the 3dB loss of S/N ratio...Which will be long gone from the mp3 conversion anyway and well covered by 1 bit of dither used during mixdown of the final product.
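The 96.33dB figure can be reproduced with a short sketch. This assumes the usual simplification that the range of linear PCM is the ratio of full scale to one quantization step, i.e. 20·log10(2^bits); the function name is hypothetical.

```python
import math


def pcm_dynamic_range_db(bits):
    """Ratio of full scale to one quantization step, expressed in dB."""
    return 20 * math.log10(2 ** bits)


print(round(pcm_dynamic_range_db(16), 2))  # 96.33
print(round(pcm_dynamic_range_db(24), 2))  # 144.49

# Peaking 3 dB below full scale leaves the rest of that range in use.
print(round(pcm_dynamic_range_db(16) - 3, 2))  # 93.33
```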

This is digital folks...Your voice recordings should not be peaking above -3dBfs anyway.

Workaphobe · Feb 24, 2009

Thanks for the response, guys. Most of the level reduction seemed to be on the negative amplitude side. But, the client seemed to be pleased with the test file I recorded. So, I guess I'm good to go. Thanks again!

Goat Rodeo Cowboy · Feb 24, 2009

I have a guess about so many people having a tradition of keeping digital audio peaking (normalized) 3 db below full scale. Some of the early digital equipment had poorly designed analog output systems as part of the D/A conversion. A fully "modulated" digital signal would overload the analog output. I haven't found any devices that I work with... including cheap discman style CD players that suffer this problem.

Original recordings should be made maybe 6db to 9 db down so that those spikes and transients do not clip. Once those are edited out and at least some compression is applied, the finished content can be raised to -3db or whatever the end user wants.

I edit and reproduce church recordings and personal projects for my friends. (I just transferred someone's family Christmas recordings from 1969 and 1971 on cassette to CD. I took out the worst thumps where the tape was stopped and restarted but left other mic handling noise and barking dogs in the recording.) I have a favorite level for CDs that I produce: -0.667db.

Emmett: When you say voice recordings should not peak above -3db, are you talking about the original recording, or the edited and finished recording?

Emmett1057 · Feb 24, 2009

Goat Rodeo Cowboy said:
Emmett: When you say voice recordings should not peak above -3db, are you talking about the original recording, or the edited and finished recording?

I was talking about the original recording...If you're peaking above -3dB, you're getting DANGEROUSLY close to clipping and it becomes a real challenge to inflect properly. As far as the final product goes, if it's wav or AIF, all the way up to -.001dB is fine...For mp3, I usually recommend -1dB because mp3 does more than just data reduction...And if you're right at 0dB, it will clip during the encoding process. Mp3 is really a rancid format and I can't wait until we can all stop using it! (In fairness, it's fine for most things, I just hate to hear the subtleties of fine preamps getting lost in an mp3...)

As per the original topic though, by normalizing peaks to -3dB, you're not losing much and probably nothing audible, so if it makes the client happy, let them have it! I've had other projects where I've produced something for a client who has requested that the delivery be -6dB because that's where everything in their automation system was normalized to and it saved them a step if I delivered it that way.

Emmett

Goat Rodeo Cowboy · Feb 24, 2009

Emmett said:
I've had other projects where I've produced something for a client who has requested that the delivery be -6dB because that's where everything in their automation system was normalized to and it saved them a step if I delivered it that way.

That is the most logical reason for a client asking for a particular level setting: so the new audio being produced comes out of the station's audio system at the same level as everything else they currently are using.

I have listened to some stations who obviously were not policing that area. Very noticeable changes in volume level as one program element transitioned to the next. This is noticeable on stations that do not have an audio chain designed to cope with reasonable changes in level.

SirRoxalot · Feb 24, 2009

-3dB

Having files normalized to a pre-set standard should be SOP. There's a very valid reason for specifying -3dB if - as Emmett pointed out - you're talking about -3dBfs. That happens to be the highest value that you can use without causing clipping in digital files.

For a relatively short explanation, see http://en.wikipedia.org/wiki/DBFS.

Goat Rodeo Cowboy · Feb 25, 2009

SirRoxalot: I am totally and utterly confused by your post. I read the Wiki reference and that was even more confusing. Whoever wrote that may be a genius, but cannot communicate with we who are mere mortals.

In Audition, in Edit View, I take some audio that I have just recorded. I go to the 'Normalize' icon and set it for -0 or 100% and I run it. The highest single peak anywhere in that recording reaches up or down to KISS the full-scale/zero level line. At this point there is no clipping. Why would you say that this recording now has distortion in it?

There is a feature I do not use. There is a "group normalize" process and the verbiage in the help file starts talking about loudness and RMS peak values. Here I see some opportunities for "-3dBfs RMS" to be the maximum data that can avoid distortion.

Help me understand how you are using these terms. I think this whole discussion is bogged down in differing semantics and definitions.

SirRoxalot · Feb 25, 2009

Encoding and dB

Here's the simple explanation:

If you're working with a digital encoding format that's based on peak amplitude, you can "normalize" to 0dB without a problem, and without fear of clipping.

If you're working with a digital encoding format (like PCM .wav) that's based on RMS amplitude, you should "normalize" to -3dB to prevent clipping.

How do different programs handle "normalize" on their own scales? I don't know. It's very possible that they adjust for different formats, and normalizing to 0dB is just fine. To avoid any possibility of clipping, though, -3dB would be a better option. As long as your noise floor is well below your signal level, it won't make much difference. Once you go digital, the signal-to-noise ratio becomes more of an issue than a "low" level.

Emmett1057 · Feb 25, 2009

To clarify a little more...

The -3dB RMS value is based on a sine wave. Any voice or production you would produce would be much more dynamic than a sine wave, thus, if you normalized to an RMS level of -3dB, you would SEVERELY clip.

A typical voice recording that has been normalized to peak at 0dB has an RMS of about -20dB. Even very heavily compressed voice won't go much louder than -10dB RMS without sounding heavily distorted.

When working with a DAW, there isn't really a need to worry about RMS unless you're trying to achieve something specific...For example, matching the perceived volume of two recordings...For this RMS is usually better, but even then, there's no substitute for using your ears.

Emmett

Goat Rodeo Cowboy · Feb 25, 2009

This whole thing gets more complex as you look deeper into it.

I use Adobe Audition 2.0 and other versions may offer other features. When you normalize an audio file and specify a value, it looks for PEAK values, not RMS values. Thus, when you specify a PEAK value of 0 dBfs or -0.3 dBfs, clipping does not occur, and distortion does not occur. And I can normalize to the peak value of my choice: 0, -1.0, -3.0, -10.0, -25 if you like. (It does not have to be a round number. I typically take work that I am finished editing and normalize to -0.667; fractional amounts are accepted.) Or at the click of a check-box I can specify values in percentages: 100, 93, 80% etc.
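As a rough illustration of what a peak normalize does, here is a minimal Python sketch. It is not how Audition or Cool Edit Pro implement the feature; it assumes samples are floats where ±1.0 is full scale (0 dBfs), and the function names are hypothetical.

```python
import math


def peak_dbfs(samples):
    """Level of the largest absolute sample, in dB relative to full scale."""
    peak = max(abs(s) for s in samples)
    return 20 * math.log10(peak) if peak > 0 else float("-inf")


def normalize_peak(samples, target_dbfs):
    """Scale every sample by one gain so the largest peak lands on target_dbfs."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples)  # pure silence: nothing to scale
    gain = 10 ** (target_dbfs / 20) / peak
    return [s * gain for s in samples]


# A tiny made-up "recording" whose largest peak is on the negative side.
recording = [0.10, -0.50, 0.25, -0.05]
at_minus_3 = normalize_peak(recording, -3.0)
print(round(peak_dbfs(at_minus_3), 3))
```

Since a single gain is applied to the whole file, the peak-to-average relationship is unchanged; only the overall level moves, and nothing is clipped as long as the target is at or below 0 dBfs.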

Then under WINDOW > AMPLITUDE STATISTICS I can see the values of the area I have selected:
Minimum RMS peak values,
Maximum RMS peak values,
Average RMS values
Total RMS values

And Emmett nailed it: you can normalize your audio to 0 or -3 PEAK values and go to statistics and find that your AVERAGE RMS value is in the -20 dBfs range. Head-banging rock music normalized to 0 will have a considerably higher AVERAGE RMS than a lecture by a seminary professor using medium paced to slow conversational style speech. (-23 to -26 Average RMS value)

Thank you for bringing up this topic. Here is what I got busy and discovered about Audition today. They have a feature that I had never opened up and explored. It is designed for people getting ready to move some audio to a CD. Would work well also for people putting recordings onto a hard drive for a station automation system. It's called "Group Waveform Normalization". Open up multiple files. Select all the files in Group Waveform Normalization and ANALYZE. The little statistics table will tell you what the Average RMS value of each file is. BUT WAIT, THERE'S MORE! The feature will analyze the critical mid-range frequencies that affect listen-ability and also calculate: LOUDNESS Average RMS. When you push the button and execute a Group Normalize you end up with all the tracks appearing to be of EQUAL LOUDNESS. When you look at the resulting wave forms on screen some fill the screen vertically with a lot of white space in the horizontal space. Others will only partially fill the vertical space but the wave form will be almost solid black. (Head banging rock and roll?) ;D

And Emmett: A wave form does not have to be a pure Sine Wave to have an RMS value. Complex voice waves also have an RMS value, as do SQUARE WAVES.

I think this topic may be more complex than many of us are ready to think. I'm going to play with this GROUP Waveform Normalization some. I have some uses for that!

Emmett1057 · Feb 26, 2009

Well of course those things HAVE an RMS value...RMS is simply root mean square...The equal loudness contour is standard A-weighting. But the -3dB law applies to a sine wave only...That is a sine wave normalized to peak at 0dB will have an RMS value of -3dB. Therefore, a sine wave can never exceed -3dB RMS without clipping.
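The sine wave figure is easy to verify with a short sketch. This assumes float samples with ±1.0 as full scale and RMS measured relative to that full-scale value (some tools instead reference RMS to a full-scale sine, which shifts readings by about 3 dB); the 1 kHz tone and 48 kHz sample rate are arbitrary choices for illustration.

```python
import math


def rms_dbfs(samples):
    """RMS level in dB relative to a full-scale value of 1.0."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_square)


sample_rate = 48000
# One second of a 1 kHz sine peaking exactly at full scale.
sine = [math.sin(2 * math.pi * 1000 * i / sample_rate) for i in range(sample_rate)]
# A square wave with the same peak, for comparison.
square = [1.0 if s >= 0 else -1.0 for s in sine]

print(round(rms_dbfs(sine), 2))  # -3.01
print(round(rms_dbfs(square), 2))
```

Both waveforms peak at the same level, yet the square wave's RMS sits at full scale while the sine's is about 3 dB lower. Speech has a much larger gap between peak and RMS than either, which fits the roughly -20 dB figure quoted earlier in the thread.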

Workaphobe · Mar 24, 2009

Hope you guys don't mind me reviving this topic...

The client I mentioned in the OP is asking me to amplify the audio that he had previously asked to be at -3dB. The problem he seems to be having is what he's calling an asymmetrical wave. The negative amplitude is higher than the positive amplitude. Does that difference really make a...well...difference? The level meters are maxing out at -3dB. Isn't that the bottom line here? From what I can tell, there isn't any clipping.

What causes such a difference in amplitude? And is it cause for concern?

Anyway, I made two versions for him. The first I normalized to 100% instead of -3dB to see what he thinks of that. The second I used Hard Limiting with "Limit Max Amplitude" set to -0.6dB and "Boost Input" set to 4dB.

Any thoughts?

Goat Rodeo Cowboy · Mar 24, 2009

Workaphobe said:
The client I mentioned in the OP is asking me to amplify the audio that he had previously asked to be at -3dB. The problem he seems to be having is what he's calling an asymmetrical wave. The negative amplitude is higher than the positive amplitude.

When you see all the variations expressed in this thread about defining level, you probably did a good thing in making a choice of samples for the client to review.

Record a dozen people, singing or speaking, and you are going to observe that the relationship of "peaks vs. average levels" is going to be all over the place. I walked into a station one day to meet the person in charge and he had some voice on display in Cool Edit. It was.... "pretty". He observed: "Did you ever notice that people who work in radio learn to use their voice in such a way that those pesky spikes just aren't there?"

Years ago, in the heyday of A.M. broadcasting there was a piece of equipment that a number of stations acquired called the Kahn Symmetra-Peak or something like that. It was designed to even out that asymmetrical condition. Real human voice tends to be asymmetrical. Is that a problem? It bothers those of us who see it on a screen... but, is that a problem?

At one time I think I observed that running a track through Adobe Audition's Multiband Processor (even if set with thresholds that there wasn't really any processing going on!) seemed to reverse the asymmetry and tone it down a bit. Running the track twice did a lot to minimize that unbalanced look. I haven't gone back and experimented with that since then.

Emmett1057 · Mar 24, 2009

This could be DC offset, which is not good...or it could just be asymmetry, which is simply pressure differences between the front and back of the mic capsule...Nothing to worry about. It's just polarity differences and it's normal, as GRC suggests.
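One rough way to tell the two cases apart is to look at the average of the samples. The sketch below assumes float samples with ±1.0 as full scale, and the function names and example values are made up for illustration. A DC offset shows up as a mean that sits away from zero, while ordinary asymmetry has a mean near zero but unequal positive and negative peaks.

```python
def describe_waveform(samples):
    """Return the mean (DC offset) and the largest positive and negative excursions."""
    return {
        "dc_offset": sum(samples) / len(samples),
        "positive_peak": max(samples),
        "negative_peak": min(samples),
    }


def remove_dc_offset(samples):
    """Subtract the mean so the waveform is centered on zero again."""
    dc = sum(samples) / len(samples)
    return [s - dc for s in samples]


# Asymmetric but centered: the negative peak is larger, the mean is about zero.
asymmetric = [0.2, 0.2, 0.2, -0.6]
# The same shape pushed upward by a constant 0.1 offset.
shifted = [s + 0.1 for s in asymmetric]

print(describe_waveform(asymmetric))
print(describe_waveform(shifted))
print(describe_waveform(remove_dc_offset(shifted)))
```

Subtracting the mean is only the simplest possible correction; an offset that drifts over time would need a high-pass filter instead, which is outside this sketch.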