Whirlwind Tour of Digital Audio

In this section we will give a very quick overview of some concepts relevant to digital audio and sound cards.

Sound is produced when waves of varying pressure travel through a medium, usually air. It is inherently an analog phenomenon, meaning that the changes in air pressure can vary continuously over a range of values.

Modern computers are digital, meaning they operate on discrete values, essentially the binary ones and zeroes that are manipulated by the computer's CPU. In order for a computer to manipulate sound, it converts the analog sound information into digital format.

A hardware device called an analog-to-digital converter converts analog signals, such as the continuously varying electrical signals from a microphone, to digital format for manipulation by the computer. Similarly, a digital-to-analog converter converts digital values into analog form so that they can be sent to an analog output device such as a speaker. Sound cards typically contain several analog-to-digital and digital-to-analog converters.

The process of converting analog signals to digital form consists of taking measurements, or samples, of the signal's value at regular intervals of time and storing these samples as numbers. The process of analog-to-digital conversion is not perfect, however, and introduces some loss or distortion. Two important factors that affect how accurately the analog signal is represented in digital form are the sample size and the sampling rate.
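To make this concrete, here is a minimal sketch in C of the idea (not of any particular sound card interface): a continuously varying signal, modeled as a 440 Hz sine wave, is measured at regular intervals and each measurement is stored as a 16-bit number. The frequency, sample rate, and sample count are illustrative values chosen for the example.

    #include <math.h>
    #include <stdio.h>

    #define PI          3.14159265358979323846
    #define SAMPLE_RATE 8000      /* samples per second (illustrative) */
    #define NUM_SAMPLES 8         /* how many samples to print */

    int main(void)
    {
        const double freq = 440.0;    /* the "analog" test signal: a 440 Hz sine */

        for (int n = 0; n < NUM_SAMPLES; n++) {
            double t = (double)n / SAMPLE_RATE;            /* time of sample n */
            double analog = sin(2.0 * PI * freq * t);      /* value in [-1, 1] */
            short sample = (short)(analog * 32767.0);      /* store as 16 bits */
            printf("t = %.6f s   sample = %6d\n", t, sample);
        }
        return 0;
    }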

The sample size is the number of bits used to represent each digital sample, which determines the range of values a sample can take. For example, an 8-bit sample size would convert the analog sound values into one of 2^8, or 256, discrete values. A 16-bit sample size would represent the sound using 2^16, or 65,536, different values. A larger sample size allows the sound to be represented more accurately, reducing the quantization error that occurs when the analog signal is represented as discrete values. The tradeoff with using a larger sample size is that the samples require more storage (and the hardware is typically more complex and therefore expensive).
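As a rough sketch of this tradeoff, the following fragment computes the number of discrete levels available for 8-bit and 16-bit samples, along with the spacing between adjacent levels for a signal assumed to span the range -1 to 1 (the signal range is an assumption made for illustration).

    #include <stdio.h>

    int main(void)
    {
        int sizes[] = { 8, 16 };

        for (int i = 0; i < 2; i++) {
            int bits = sizes[i];
            long levels = 1L << bits;            /* 2^bits discrete values */
            double step = 2.0 / (levels - 1);    /* level spacing for a [-1, 1] signal */
            printf("%2d-bit samples: %6ld levels, step between levels %.8f\n",
                   bits, levels, step);
        }
        return 0;
    }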

The sample rate is the speed at which the analog signal is periodically measured over time. It is properly expressed in samples per second, although it is often informally (and less accurately) given in Hertz. A lower sample rate loses more information about the original analog signal, while a higher sample rate represents it more accurately. The sampling theorem states that to accurately represent an analog signal, it must be sampled at a rate of at least twice the highest frequency present in the original signal.
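In code, the theorem amounts to a one-line calculation. The helper function below is purely illustrative; the two frequency limits it is applied to are the ones discussed in the next paragraph.

    #include <stdio.h>

    /* Minimum sample rate required by the sampling theorem:
     * at least twice the highest frequency present in the signal. */
    static double min_sample_rate(double highest_freq_hz)
    {
        return 2.0 * highest_freq_hz;
    }

    int main(void)
    {
        printf("Full audio band (20,000 Hz): %.0f samples/s minimum\n",
               min_sample_rate(20000.0));
        printf("Telephone speech (4,000 Hz): %.0f samples/s minimum\n",
               min_sample_rate(4000.0));
        return 0;
    }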

The range of human hearing is from approximately 20 to 20,000 Hertz under ideal conditions. To accurately represent sound for human listening, then, a sample rate of twice 20,000 Hertz should be adequate. CD player technology uses 44,100 samples per second, slightly above the 40,000 samples per second suggested by this simple calculation. Human speech has little frequency content above 4,000 Hertz. Digital telephone systems typically use a sample rate of 8,000 samples per second, which is perfectly adequate for conveying speech. The tradeoff involved in using higher sample rates is the additional storage requirement and the more complex hardware needed.
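One way to see the storage tradeoff is to compute the raw, uncompressed data rate for the two cases just mentioned. The sample sizes and channel counts used below (8-bit mono for telephony, 16-bit stereo for CD audio) are typical figures assumed for illustration.

    #include <stdio.h>

    /* Raw (uncompressed) data rate in bytes per second. */
    static long bytes_per_second(long rate, int bits_per_sample, int channels)
    {
        return rate * (bits_per_sample / 8) * channels;
    }

    int main(void)
    {
        printf("Telephone (8,000 samples/s, 8-bit, mono):     %ld bytes/s\n",
               bytes_per_second(8000, 8, 1));
        printf("CD audio  (44,100 samples/s, 16-bit, stereo): %ld bytes/s\n",
               bytes_per_second(44100, 16, 2));
        return 0;
    }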

Other issues that arise when storing sound in digital format are the number of channels and the sample encoding format. To support stereo sound, two channels are required. Some audio systems use four or more channels.

The samples themselves can be encoded in different formats. We've already mentioned sample size, with 8-bit and 16-bit samples being the most common. For a given sample size, the samples might be encoded using signed or unsigned representation, and when a sample occupies more than one byte, the byte ordering convention (little-endian or big-endian) must be specified. These issues are important when transferring digital audio between programs or computers to ensure that they agree on a common format. File formats, such as WAV, standardize how to represent sound information in a way that can be transferred between different computers and operating systems.
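As an illustration of one such convention, the sketch below splits a single signed 16-bit sample into two bytes in little-endian (least significant byte first) and big-endian (most significant byte first) order; the sample value itself is arbitrary.

    #include <stdio.h>

    int main(void)
    {
        short sample = -12345;                   /* a signed 16-bit sample */
        unsigned short u = (unsigned short)sample;

        unsigned char little[2] = { u & 0xFF, (u >> 8) & 0xFF };  /* LSB first */
        unsigned char big[2]    = { (u >> 8) & 0xFF, u & 0xFF };  /* MSB first */

        printf("sample %d  little-endian: %02X %02X  big-endian: %02X %02X\n",
               sample, little[0], little[1], big[0], big[1]);
        return 0;
    }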

Often, sounds need to be combined or changed in volume. This is the process of mixing, and can be done in analog form (e.g., a volume control) or in digital form by the computer. Conceptually, you can mix two digital samples together simply by adding them, and you can change volume by multiplying the digital samples by a constant value.
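Here is a minimal sketch of both operations on signed 16-bit samples: mixing by adding two samples (with the result clamped to the valid range so it does not overflow) and changing volume by multiplying by a gain factor. The clamping step and the particular gain value are illustrative details, not something prescribed above.

    #include <stdio.h>

    /* Clamp a value to the signed 16-bit sample range. */
    static short clamp16(int v)
    {
        if (v > 32767)  return 32767;
        if (v < -32768) return -32768;
        return (short)v;
    }

    /* Mix two samples by adding them. */
    static short mix(short a, short b)
    {
        return clamp16((int)a + (int)b);
    }

    /* Change volume by multiplying by a constant gain. */
    static short scale(short s, double gain)
    {
        return clamp16((int)(s * gain));
    }

    int main(void)
    {
        printf("mix(20000, 15000) = %d\n", mix(20000, 15000));   /* clamped */
        printf("scale(20000, 0.5) = %d\n", scale(20000, 0.5));   /* quieter */
        return 0;
    }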

Up to now we've discussed storing audio as digital samples. Other techniques are also commonly used. FM synthesis is an older technique that produces sound using hardware that manipulates different waveforms, such as sine and triangle waves. The hardware to do this is quite simple and was popular with the first generation of computer sound cards for generating music. Many sound cards still support FM synthesis for backward compatibility. Some newer cards use a technique called wavetable synthesis that improves on FM synthesis by generating the sounds using digital samples stored in the sound card itself.
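The sketch below shows the basic idea behind FM synthesis in software terms, not how an FM chip on a sound card is actually programmed: the phase of a carrier sine wave is varied by a second, modulating sine wave. The carrier, modulator, and modulation-index values are arbitrary choices for illustration.

    #include <math.h>
    #include <stdio.h>

    #define PI          3.14159265358979323846
    #define SAMPLE_RATE 8000

    int main(void)
    {
        const double carrier   = 440.0;   /* carrier frequency in Hz */
        const double modulator = 110.0;   /* modulating frequency in Hz */
        const double index     = 2.0;     /* modulation index (depth) */

        for (int n = 0; n < 8; n++) {
            double t = (double)n / SAMPLE_RATE;
            /* FM: the carrier's phase is varied by the modulating wave. */
            double value = sin(2.0 * PI * carrier * t +
                               index * sin(2.0 * PI * modulator * t));
            printf("t = %.6f s   value = % .5f\n", t, value);
        }
        return 0;
    }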

MIDI stands for Musical Instrument Digital Interface. It is a standard protocol for allowing electronic musical instruments to communicate. Typical MIDI devices are music keyboards, synthesizers, and drum machines. MIDI works with events representing such things as a key on a music keyboard being pressed, rather than storing actual sound samples. MIDI events can be stored in a MIDI file, providing a way to represent a song in a very compact format. MIDI is most popular with professional musicians, although many consumer sound cards support the MIDI bus interface.
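For example, pressing a key is conveyed as a three-byte "note on" message: a status byte (0x90 means note on, MIDI channel 1), the note number, and the velocity with which the key was struck. The sketch below simply builds such a message for middle C (note number 60); sending it to an actual MIDI device is beyond this quick illustration.

    #include <stdio.h>

    int main(void)
    {
        unsigned char note_on[3];

        note_on[0] = 0x90;   /* status: note on, MIDI channel 1 */
        note_on[1] = 60;     /* note number: middle C */
        note_on[2] = 100;    /* velocity: how hard the key was struck (0-127) */

        printf("MIDI note-on message: %02X %02X %02X\n",
               note_on[0], note_on[1], note_on[2]);
        return 0;
    }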

Earlier we mentioned CD audio, which uses a 16-bit sample size and a rate of 44,100 samples per second, with two channels (stereo). One hour of CD audio represents more than 600 MB of data. In order to make the storage of sound more manageable, various schemes for compressing audio have been devised. One approach is to simply compress the data using the same compression algorithms used for computer data. However, by taking into account the characteristics of human hearing, it is possible to compress audio more efficiently by removing components of the sound that are not audible. This is called lossy compression because information is lost during the compression process, but when properly implemented it greatly reduces the data size with little noticeable loss in audio quality. This is the approach used by MPEG-1 Layer 3 audio (MP3), which can achieve compression levels of 10:1 over the original digital audio. Another lossy compression algorithm that achieves similar results is Ogg Vorbis, which is popular with many Linux users because it avoids the patent issues associated with MP3 encoding. Other compression algorithms are optimized for human speech, such as the GSM encoding used by some digital telephone systems. The algorithms used for encoding and decoding audio are sometimes referred to as codecs.
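The 600 MB figure follows from a straightforward calculation, sketched below; the MP3 line simply applies the 10:1 ratio mentioned above.

    #include <stdio.h>

    int main(void)
    {
        long rate     = 44100;   /* samples per second */
        int  bytes    = 2;       /* 16-bit samples = 2 bytes each */
        int  channels = 2;       /* stereo */
        long seconds  = 3600;    /* one hour */

        double raw_mb = (double)rate * bytes * channels * seconds / (1024 * 1024);
        printf("One hour of CD audio:    about %.0f MB\n", raw_mb);
        printf("Compressed ~10:1 as MP3: about %.0f MB\n", raw_mb / 10.0);
        return 0;
    }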

For applications in which sound is to be sent live via the Internet, sometimes broadcast to multiple users, sound files are not suitable. Streaming media is the term used to refer to systems that send audio, or other media, and play it back in real time.

Now that we've discussed digital audio concepts, let's look at the hardware used for audio. Sound cards follow a history similar to other peripheral cards for PCs. The first-generation cards used the ISA bus, and most aimed to be compatible with the SoundBlaster series from Creative Labs. With the introduction of the ISA Plug and Play (PnP) standard, many sound cards adopted this format, which simplified configuration by eliminating the need for hardware jumpers. Modern sound cards now typically use the PCI bus, either as separate peripheral cards or as on-board sound hardware that resides on the motherboard but is accessed through the PCI bus. Some USB sound devices are now available, the most popular being loudspeakers that can be controlled through the USB bus.

Some sound cards now support higher-end features such as surround sound using as many as six sound channels, and digital inputs and outputs that can connect to home theater systems. This is beyond the scope of this book, so we will not discuss such sound cards here. Much useful information on 3D sound can be found at http://www.3dsoundsurge.com. Information on the OpenAL 3D audio library can be found at http://www.openal.org/home.
