When we record digital audio, the computer helpfully shows us a picture of the waveform. But most musicians are more interested in music than waveforms, so if it vaguely looks “alright” we tend to concentrate on getting the song in tune and in time with the right lyrics.
We tend to find out most of the technical stuff by trial and error. For instance, our ears tell us pretty quickly that if we record music too loud it distorts and sounds horrible. The waveform, meanwhile, starts to look like this:
If you look closely, instead of the usual spiky peaks, some waves are squared-off at the top and bottom. Engineers call it ‘clipping’. Any fool with a computer soon learns that you have to pull down the fader a bit so that the loud bits don’t sound distorted.
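If you’re curious how software spots this, here’s a rough sketch (my own illustration, not any particular editor’s code) that counts samples pinned at full scale, assuming the audio has been loaded as floating-point samples between -1.0 and 1.0:

```python
import math

# Rough clipping check: count samples sitting at (or beyond) full scale.
# Assumes 'samples' is a sequence of floats normalised to the -1.0..1.0 range.
def count_clipped(samples, threshold=0.999):
    return sum(1 for s in samples if abs(s) >= threshold)

# Example: a 440 Hz sine wave recorded 'too hot' gets flattened at the rails.
hot = [max(-1.0, min(1.0, 1.5 * math.sin(2 * math.pi * 440 * n / 44100)))
       for n in range(44100)]
print(count_clipped(hot))   # thousands of samples stuck at full scale
```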
Back in the days of analogue there was an equally obvious reason not to record too quietly either: your recordings got drowned out by tape hiss. So the skill lay in getting your music onto the tape as loud as possible without it actually distorting, so as to drown out the background noise.
But I fondly imagined that with digital – now that tape hiss was a thing of the past – keeping the levels high didn’t matter any more. I started recording my demos at nice low levels so as to avoid any possibility of clipping. As a result my waveforms looked like this:
“What does it matter how loud you record when it’s all nice clean 16-bit digital audio?” I used to say smugly. Until one day an audio engineer finally explained in mind-numbing detail what 16-bit actually means.
Almost all of it went over my head apart from this one vital fact: a digital audio file “describes” soundwaves in a series of, erm, bits. So a 16-bit audio file describes the dynamic range from silence to full volume with sixteen bits. And you can draw those bits as a series of lines like this:
It then became clear that my recordings were actually only using half of the available bits: the top eight were describing nothing at all. So in fact I was recording in low-quality 8-bit audio which is, as we know, A Bad Thing.
Even if you’ve recorded your audio at too low a level, it’s still worth pushing it up afterwards as high as it’ll safely go, using the “Normalize” function in your editing software. Here’s that same audio file after being normalised:
Normalising doesn’t improve the quality of your original recording, but it will improve the results of any further processing, such as converting to MP3 or adding EQ/compression. It’ll also increase the output level of your sound file.
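For anyone wondering what the “Normalize” button is actually doing, here’s a minimal sketch of peak normalisation (just an illustration of the idea, assuming floating-point samples where full scale is 1.0, not any particular editor’s implementation):

```python
# Minimal peak normalisation: scale the whole file so its loudest sample
# just reaches a chosen target level. Assumes float samples in -1.0..1.0.
def normalize(samples, target_peak=1.0):
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples)          # pure silence: nothing to scale
    gain = target_peak / peak
    return [s * gain for s in samples]

quiet = [0.1, -0.25, 0.2, -0.05]      # a very quiet recording
print(normalize(quiet))               # loudest sample now sits at 1.0 (0 dBFS)
```

It’s a single fixed gain applied to every sample, which is why it can’t put back any detail that was lost by recording too quietly in the first place.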
Which is all a long way of saying:
1) The higher the level you record at, the higher your sound quality
2) Normalising the recording afterwards will give you better-quality, louder MP3s
I’m not sure this is correct. If you go onto the soundonsound.com mastering forum, there is a different point of view.
Track your levels at an average level (avoiding peaks).
When playing back tracks for the mix, it is also important not to let the individual track levels go anywhere near the red; keep each instrument below -6 dB. Even on playback you can introduce subtle digital distortion.
When you mix down to stereo L-R, keep the levels quite high, but not fully up to 0 dB, maybe -3 dB.
It is in the final mastering stage that you push the level as loud as you can, up to 0 dB.
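To put numbers on those targets, here’s a rough sketch of checking a track’s peak level in dBFS (decibels relative to full scale) before deciding whether it’s anywhere near the red. It’s just an illustration, assuming floating-point samples with full scale at 1.0:

```python
import math

# Peak level in dBFS (decibels relative to full scale, where full scale = 1.0).
def peak_dbfs(samples):
    peak = max(abs(s) for s in samples)
    return 20 * math.log10(peak) if peak > 0 else float("-inf")

track = [0.4, -0.45, 0.3]            # an individual instrument track
print(round(peak_dbfs(track), 1))    # about -6.9 dBFS: safely under -6 dB
mix = [0.7, -0.71, 0.6]              # the stereo mixdown
print(round(peak_dbfs(mix), 1))      # about -3.0 dBFS: headroom left for mastering
```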
There is a really good guide called Tweak’s Guide:
http://messageboard.tapeop.com/viewtopic.php?t=38430
Skopje
Thanks for that link, Skopje. I made this posting because a lot of the mp3s that people send us for the radio show are unmastered and mixed at very low levels. So they end up sounding much quieter and flatter than other tracks in the playlist.
But I’m absolutely not an expert, and it’s great to read stuff from people who are, even if I don’t really understand it. The link above is to a message board where a vehement and knowledgeable contributor rails bitterly against the concept of Using All The Bits: “It’s a horrible, horrible idea that anyone who has the very most basic handle of the concepts of recording levels and gain-staging would never do.”
I don’t even know what gain-staging is, and certainly can’t really follow the technical language in his argument.
All I know is, there’s a wide variation in the sound of mp3s that people send us, even when they’re at the same bitrate. The loud ones with the waveform going right up to the top tend to sound better than ones that have just a little squiggle of black in the middle and lots of white space on either side.
Going on what Skopje says, it seems like the key issue we’re really talking about here is mixing and mastering levels. If anyone out there knows of a Simplified Guide To Mastering or How To Make Great Sounding Mp3s it would be great to have some more links…
Yes, I see your point.
What you’re talking about actually happened to me. I was lucky enough to get played on Introducing over a year ago, and I was new to PC recording.
I was used to working in the analogue world, and in the TV industry, and not used to digital levels.
The MP3 I sent was far too quiet. It actually made the compressors on the radio broadcast turn the levels up and down on different sections of the song.
Very proud to have been played, but it was quite an expensive way to learn about digital levels.
SKOPJE
I really think that the 8/16/24/whatever-bit figure means the resolution of the wave file, and it does not actually have anything to do with volume but with how close it gets to the original real sound.
It’s like a scanned drawing: the higher the ppi resolution you use, the more detail you’ll get in your picture.
While the sample rate sets the EQ aperture/range, the bits set the resolution for the “drawing” of your wave file.
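To illustrate Jethro’s scanning analogy with a toy example (a sketch of plain uniform quantisation only, not how any particular converter actually works): quantising the same sample value at different bit depths shows that extra bits buy finer amplitude steps, not extra loudness.

```python
import math

# Quantise a float sample (-1.0..1.0) to a given bit depth and back again.
def quantise(x, bits):
    levels = 2 ** (bits - 1)          # e.g. 32768 steps each side for 16-bit
    return round(x * levels) / levels

x = 0.3 * math.sin(1.0)              # an arbitrary sample value, about 0.2524
print(quantise(x, 8))                 # coarse steps: 0.25
print(quantise(x, 16))                # fine steps: 0.2524..., much closer to the original
```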
Cheers Jethro, that makes perfect sense and I stand corrected. So my diagrams above are incorrect and the 16 bits must be measured LONGWAYS (i.e. side-to-side) rather than vertically. Which means that regardless of what LEVEL you record at, higher bit depths will still give you greater fidelity. So is this whole post completely wrong – or was my engineer friend still correct to advise against recording at low gain levels?
^ Not quite: digital-to-analogue conversion happens at a fixed time interval, e.g. every 1/10,000 of a second or whatever.
The vertical values are not a simple linear scale but are logarithmic, so your engineer was more or less right; it’s just that probably 13 of those 16 bits are covering the bottom half of the graph.
Actually, I’m pretty sure the algorithm is more complex than a pure log scale, but I’m trying to keep things simple to understand.
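To put some very rough numbers on this (a back-of-an-envelope sketch that assumes plain linear 16-bit samples and measures level in dBFS): each bit covers roughly 6 dB of range, so a quiet recording leaves its top bits unused, but halving the level only costs about one bit rather than eight.

```python
import math

DB_PER_BIT = 20 * math.log10(2)          # each bit is worth roughly 6.02 dB

# Rough count of the bits a recording actually exercises, given its peak
# level in dBFS (0 dBFS = full scale) and the file's bit depth.
def effective_bits(peak_dbfs, depth=16):
    return max(0.0, depth + peak_dbfs / DB_PER_BIT)

print(round(effective_bits(-6)))          # peak at -6 dBFS: still ~15 of 16 bits
print(round(effective_bits(-48)))         # a very quiet recording: only ~8 bits
```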
for more detail:
http://en.wikipedia.org/wiki/Digital_to_analog_converter
http://www.turnmeup.org