This post focuses on the basics of digital audio: sample rate, bit depth, and how analog signals are represented digitally.
We use digital audio all the time, but I am regularly surprised by how many people are unclear about how it works. Digital audio is described by two primary qualities. These two qualities correspond to the qualities of real-world sounds, though the correspondence is more metaphor than anything else. Real sounds have frequencies and volumes. To measure real-world sounds and represent them digitally, we use sample rate and bit depth. Sample rate determines how analog frequencies are described digitally, whereas bit depth determines how analog volume is described digitally. The two qualities need each other in order to describe a sound. You can’t have volume without frequency or frequency without volume.
In order to understand why sample rate and bit depth came about, you need to understand a little bit about how all things digital work. Digital works like a ticking second hand. Whereas time and the world as we know it seem continuous and seamless, digital breaks things like time up into little measurements. When we’re talking about measurements of time, we are talking about ticks, just like the ticking of a second hand. If anything is to happen digitally, it has to happen on a tick. The rate of these ticks is measured in hertz. A 2 GHz computer has 2,000,000,000 ticks a second. That’s a lot of ticks.
What is a Sample Rate?
Sample rate is a rate, just like the ticks we just talked about. Analog signal is smoooooth, just like the image you see to the right here. Like the real world, it just keeps going. In order to get this signal represented in the digital world, we need to measure it into little chunks by defining a rate. On the bottom line of the graph to the right you’ll see time is represented by t.
If we start to split up the smooth analog signal into digital chunks, you will start to see something like the second image. With each tick of the clock we measure what’s happening at that tick ‘t’. That measurement is documented, represented by the balls on the signal graph. How often we take these measurements is called the sample rate. The higher the rate, the closer you’ll get to the smoothness of the first image. Measuring the signal at each tick like this is called sampling, and the measurements are called samples (hence sample rate). Rounding each measurement to a fixed set of values, which we’ll get to in a moment, is called quantizing.
The sample rate can be thought of as how often or how much the sound is described.
CD quality audio has 44,100 of these measurements a second. That’s called 44.1 kilohertz (kHz).
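To make the ticking idea concrete, here is a minimal Python sketch that measures a pure sine tone once per tick. The function name `sample_sine` and the 1 kHz test tone are made up for illustration; a real recording would come from a microphone and converter, not a formula.

```python
import math

def sample_sine(frequency_hz, sample_rate_hz, duration_s):
    """Measure a sine tone once per tick and return (time, value) pairs."""
    count = int(round(sample_rate_hz * duration_s))
    pairs = []
    for n in range(count):
        t = n / sample_rate_hz  # the moment this tick happens, in seconds
        pairs.append((t, math.sin(2 * math.pi * frequency_hz * t)))
    return pairs

# One millisecond of a 1 kHz tone at the CD sample rate.
samples = sample_sine(1000, 44100, 0.001)
print(len(samples), "measurements in one millisecond")
```

Raising the second argument gives more measurements for the same stretch of time, which is all a higher sample rate means.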
What is a Bit Depth?
With what we just learned in mind, consider that in order for these ticks to make any sense at all they need to actually be measuring something. What is it that we’re measuring? Volume. Volume is represented by the height of the balls in the image. With each tick a new measurement of the volume is made. How do we describe the volume? Is it a range from 0 to 100? 0 to 2000? 0 to 1? The range of volumes that can be described is the bit depth. In each of these examples, 0 means totally silent and 100, 2000, or 1, respectively, means as loud as it can get. So the difference between these ranges is not how loud the sound can be but how many different volumes can be described. With ‘0 to 1’ we only have two choices, i.e. is there a sound or not? But from 0 to 2000 we can have half volume (1000), quarter volume (500), or somewhere in between (829). The higher the bit depth, the more accurately we can communicate exactly how loud the ‘real’ sound we want to describe is.
The bit depth can be thought of as how well the sound is described.
CD quality audio has 65,536 volumes to choose from for every sample that’s measured. That’s called 16-bit audio (because 2 to the 16th power is 65,536).
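Here is a small sketch of that idea in Python, using the same simplified picture as above where 0 is silent and the top of the range is as loud as it gets. The helper names `levels` and `quantize` are invented for this example, and real 16-bit audio stores signed values centered on zero, which this sketch ignores.

```python
def levels(bit_depth):
    """How many different volumes a given bit depth can describe."""
    return 2 ** bit_depth

def quantize(volume, bit_depth):
    """Round a volume between 0.0 (silent) and 1.0 (loudest) to the nearest step."""
    top_step = levels(bit_depth) - 1
    clamped = min(max(volume, 0.0), 1.0)  # anything out of range is clipped
    return round(clamped * top_step)

print(levels(16))          # volumes available at CD quality
print(quantize(0.25, 8))   # quarter volume on an 8-bit scale
print(quantize(0.25, 16))  # the same volume on a 16-bit scale
```

The same quarter-volume sound lands on a much finer step at 16 bits than at 8, which is the extra accuracy the bit depth buys.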
Putting Them Together
As mentioned earlier, these things are only useful when used together. With each sample rate and bit depth there’s a limit to how accurately the analog sound can be described. The Nyquist–Shannon sampling theorem states that a sample rate of more than twice the maximum frequency of the signal being sampled is needed to capture that frequency. Most humans can hear from 20 Hz to 20 kHz, so the sampling rate of 44.1 kHz was chosen to be able to capture frequencies up to 22.05 kHz.
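A quick way to see the limit is to ask what happens to a tone above half the sample rate. The sketch below assumes a single pure tone and uses two made-up helper names, `nyquist_frequency` and `apparent_frequency`. A tone that is too high does not disappear; it folds back and shows up as a lower frequency (this is called aliasing).

```python
def nyquist_frequency(sample_rate_hz):
    """Frequencies below this limit can be captured at this sample rate."""
    return sample_rate_hz / 2

def apparent_frequency(frequency_hz, sample_rate_hz):
    """Frequency a pure tone appears to have once it has been sampled."""
    folded = frequency_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)

print(nyquist_frequency(44100))          # the limit for CD audio
print(apparent_frequency(20000, 44100))  # within the limit, so unchanged
print(apparent_frequency(30000, 44100))  # too high, so it folds down
```

This is why recording gear filters out everything above the limit before sampling.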
Speaking in General Terms
In general, the higher the bit depth, the ‘smoother’ the sound will be. 8-bit audio sounds rather grainy and harsh, whereas 16-bit audio sounds quite a bit better. Most audio professionals use 24-bit audio these days, not because it sounds so much better than 16-bit but because the extra accuracy is useful when so much is done to the audio during recording, mixing, and mastering. A higher bit depth means that each change made to the sound produces a more accurate result. Imagine only being able to describe the sounds you’re recording with two volumes: on or off. It would be impossible to produce any music at all with such a low bit depth.
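To put rough numbers on that accuracy, this sketch rounds a spread of volumes to the nearest available step and reports the worst miss for a few bit depths. It reuses the simplified 0-to-1 volume scale from earlier; `round_trip` and `worst_error` are names invented for this example, and the 1,000 probe volumes are an arbitrary choice.

```python
def round_trip(volume, bit_depth):
    """Store a 0.0-1.0 volume at the given bit depth, then read it back."""
    top_step = 2 ** bit_depth - 1
    return round(volume * top_step) / top_step

def worst_error(bit_depth, probes=1000):
    """Largest gap between a volume and its stored version over evenly spread probes."""
    return max(
        abs(round_trip(i / probes, bit_depth) - i / probes)
        for i in range(probes + 1)
    )

for bits in (1, 8, 16, 24):
    print(bits, "bit worst-case error:", worst_error(bits))
```

Every edit made to the audio gets rounded like this again, so a smaller error per step leaves more room before the rounding becomes audible.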
A couple of years ago there was a lot of buzz about high sample rates in pro audio equipment. Higher sample rates are theoretically able to capture higher frequencies that humans may or may not be able to perceive (it’s still being debated whether it makes a difference or not). There are many things to consider when making these claims, including the quality of the microphones used, the sounds being recorded, the delivery medium, and the quality of the speakers used to listen to the material. Since then, most people have come to agree that high sample rates are not as important as higher bit depths with respect to pro audio. Again, you can’t replace one with the other, so a balance is required, but 44.1 kHz/24-bit audio is still the standard when producing 44.1 kHz/16-bit CD quality audio. Why the higher bit depth if your ultimate destination is lower? It just sounds better, especially if the final product is dithered (a subject for another post altogether).
When digital audio is played back, the audio processor looks at the samples and recreates the waveform using the sample rate and bit depth. It’s actually creating, as best it can, real continuous sound from the quantized digital data. Remember: you can’t hear digital, so it’s up to the audio processor to figure out how to create sounds from the information.
As with any choices of this type, one needs to measure what one’s needs are. Why carry around a dozen cups from shot glasses to five gallon jugs when one 12 ounce travel mug suits most of your drinking needs throughout the day? It’s the same way with sample rates and bit depths. If you don’t need super high-res audio, it may not be worthwhile to record it. The higher the bit depth and sample rate, the more data will be recorded, the larger your sessions will be, and the harder your DAW will have to work. See these tables for a few examples (taken from tweakheadz.com)…
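For the curious, the textbook idealization of that recreation step is sinc interpolation: every sample contributes a little to every moment in between. The sketch below is that idealized formula only, not what any particular audio processor does (real converters use filters to approximate it), and the names `sinc` and `reconstruct` are made up for this example.

```python
import math

def sinc(x):
    """Normalized sinc: 1 at zero, and zero at every other whole number."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def reconstruct(samples, sample_rate_hz, t):
    """Estimate the continuous signal at time t (in seconds) from its samples."""
    return sum(
        value * sinc(t * sample_rate_hz - n)
        for n, value in enumerate(samples)
    )

# Ten milliseconds of a 1 kHz tone, then a guess at a moment between two ticks.
rate = 44100
tone = [math.sin(2 * math.pi * 1000 * n / rate) for n in range(441)]
between = 200.5 / rate
print(reconstruct(tone, rate, between), math.sin(2 * math.pi * 1000 * between))
```

The two printed numbers should be close but not identical, because the sketch only has a short, finite run of samples to work with.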
| Bit depth | Sample rate | Bit rate | File size of one stereo minute | File size of a three-minute song |
|---|---|---|---|---|
| 16 | 44,100 Hz | 1.35 Mbit/s | 10.1 MB | 30.3 MB |
| 16 | 48,000 Hz | 1.46 Mbit/s | 11.0 MB | 33 MB |
| 24 | 96,000 Hz | 4.39 Mbit/s | 33.0 MB | 99 MB |
| MP3 file | 128 kbit/s bit rate | 0.13 Mbit/s | 0.94 MB | 2.82 MB |

| Bit depth / sample rate (kHz) | Number of mono tracks | Size per mono track | Size per song | Songs per 20 GB hard disk | Songs per 200 GB hard disk |
|---|---|---|---|---|---|
| 16/44.1 | 8 | 15.1 MB | 121 MB | 164 | 1640 |
| 24/96 | 8 | 49.5 MB | 396 MB | 50 | 500 |
| 16/48 | 16 | 16.5 MB | 264 MB | 74 | 740 |
| 24/96 | 16 | 49.5 MB | 792 MB | 24 | 240 |
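The uncompressed figures in these tables follow from simple multiplication: sample rate times bit depth times number of channels gives bits per second. The sketch below is my own check rather than anything from the source of the tables, and it assumes a megabyte of 1,048,576 bytes (and a megabit of 1,048,576 bits), since that is what seems to reproduce the listed numbers. The function names are invented for this example.

```python
def bits_per_second(sample_rate_hz, bit_depth, channels=2):
    """Raw data rate of uncompressed audio."""
    return sample_rate_hz * bit_depth * channels

def size_in_megabytes(sample_rate_hz, bit_depth, seconds, channels=2):
    """File size of uncompressed audio, assuming 1 MB = 1,048,576 bytes."""
    total_bytes = bits_per_second(sample_rate_hz, bit_depth, channels) / 8 * seconds
    return total_bytes / 2 ** 20

# One stereo minute at CD quality, and one three-minute mono track.
print(round(bits_per_second(44100, 16) / 2 ** 20, 2), "Mbit/s")
print(round(size_in_megabytes(44100, 16, 60), 1), "MB per stereo minute")
print(round(size_in_megabytes(44100, 16, 180, channels=1), 1), "MB per mono track")
```

Plug in your own sample rate, bit depth, and track count to estimate how big a session will get before you hit record.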
Do some listening, talk with your clients, and get a feel for the capabilities of your gear to decide what suits your needs!
I hope this helps demystify the guts of digital audio for folks!