Waveform Audio

Waveform audio is the most utilized multimedia feature of Windows. The waveform audio facilities can capture sounds coming through a microphone, turn them into numbers, and store them in memory or on disk in waveform files with the extension .WAV. The sounds can then be played back.

Sound and Waveforms

Before plunging into the waveform audio API, it's important to have an understanding of the physics and perception of sound and the process by which sounds can get in and out of our computers.

Sound is vibration. The human body perceives sound as it changes the air pressure on our eardrums. A microphone can pick up these vibrations and translate them into electrical currents. Similarly, electrical currents can be sent to amplifiers and speakers for rendering back into sound. In traditional analog forms of sound storage (such as audio tape and the phonograph record) these vibrations are stored as magnetic pulses or contoured grooves. When a sound is translated into an electrical current, it can be represented by a waveform that shows vibrations over time. The most natural form of vibration is represented by the sine wave, one cycle of which was shown earlier in this book in Figure 5-7.

The sine wave has two parameters—amplitude (that is, the maximum amplitude over the course of one cycle) and frequency. We perceive amplitude as loudness and frequency as pitch. Human ears are generally said to be sensitive to sine waves ranging from low-pitched sounds at 20 Hz (cycles per second) to high-pitched sounds at 20,000 Hz, although sensitivity to these higher sounds degrades with age.

Human perception of frequency is logarithmic rather than linear. That is, we perceive the frequency change from 20 Hz to 40 Hz to be the same as the frequency change from 40 Hz to 80 Hz. In music, this doubling of frequency defines the octave. Thus, the human ear is sensitive to about 10 octaves of sound. The range of a piano is a little over 7 octaves, from 27.5 Hz to 4186 Hz.

Although sine waves represent the most natural form of vibration, sine waves rarely occur in nature in pure forms. Moreover, pure sine waves are not very interesting sounds. Most sounds are much more complex.

Any periodic waveform (that is, a waveform that repeats itself) can be decomposed into multiple sine waves whose frequency relationships are in integer multiples. This is called a Fourier series, named after the French mathematician and physicist Jean Baptiste Joseph Fourier (1768–1830). The frequency of periodicity is known as the fundamental. The other sine waves in the series have frequencies that are 2, 3, 4 (and so forth) times the frequency of the fundamental. These are called overtones. The fundamental is also called the first harmonic. The first overtone is the second harmonic, and so forth.

The relative intensities of the sine wave harmonics give each periodic waveform a unique sound. This is known as "timbre," and it's what makes a trumpet sound like a trumpet and a piano sound like a piano.

At one time it was believed that electronically synthesizing musical instruments required merely that sounds be broken down into harmonics and reconstructed with multiple sine waves. However, it turned out that real-world sounds are not quite so simple. Waveforms representing real-world sounds are never strictly periodic. Relative intensities of harmonics are different over the range of a musical instrument and the harmonics change with time as each note is played. In particular, the beginning of a note played on a musical instrument—called the attack—can be quite complex and is vital to our perception of timbre.

Due to the increase in digital storage capabilities in recent years, it has become possible to store sounds directly in a digital form without any complex deconstruction.

Pulse Code Modulation

Computers work with numbers, so to get sounds into our computers, it is necessary to devise a mechanism to convert sound to numbers and back again from numbers to sound.

The most common method of doing this without compressing data is called "pulse code modulation" (PCM). PCM is used on compact discs, digital audio tapes, and in Windows. Pulse code modulation is a fancy term for a conceptually simple process.

With pulse code modulation, a waveform is sampled at a constant periodic rate, usually some tens of thousands of times per second. For each sample, the amplitude of the waveform is measured. The hardware that does the job of converting an amplitude into a number is an analog-to-digital converter (ADC). Similarly, numbers can be converted back into electrical waveforms using a digital-to-analog converter (DAC). What comes out is not exactly what goes in. The resultant waveform has sharp edges that are high-frequency components. For this reason, playback hardware generally includes a low-pass filter following the digital-to-analog converter. This filter removes the high frequencies and smooths out the resultant waveform. On the input side, a low-pass filter comes before the ADC.

Pulse code modulation has two parameters: the sample rate, or how many times per second you measure the waveform amplitude, and the sample size, or the number of bits you use to store the amplitude level. As you might expect, the faster the sampling rate and the larger the sample size, the better the reproduction of the original sound. However, there is a point where any improvements to the sampling rate and sample size are overkill because they go beyond the resolution of human perception. On the other hand, making the sampling rate and sample size too low can cause problems in accurately reproducing music and other sounds.

The Sampling Rate

The sampling rate determines the maximum frequency of sound that can be digitized and stored. In particular, the sampling rate must be at least twice the highest frequency of the sampled sound. Half the sampling rate is known as the "Nyquist Frequency," named after Harry Nyquist, an engineer who did research in the 1930s into sampling processes.

When a sine wave is sampled with too low a sampling rate, the resultant waveform has a lower frequency than the original. This is known as an alias. To avoid the problem of aliases, a low-pass filter is used on the input side to block all frequencies greater than half the sampling rate. On the output side, the rough edges of the waveform produced by the digital-to-analog converter are actually overtones composed of frequencies greater than half the sampling rate. Thus, a low-pass filter on the output side also blocks all frequencies greater than half the sampling rate.

The sampling rate used on audio CDs is 44,100 samples per second, or 44.1 kHz. The origin of this peculiar number is as follows:

The human ear can hear up to 20 kHz, so to capture the entire audio range that can be heard by humans, a sampling rate of 40 kHz is required. However, because low-pass filters have a roll-off effect, the sampling rate should be about 10 percent higher than that. Now we're up to 44 kHz. Just in case we want to record digital audio along with video, the sampling rate should be an integral multiple of the American and European television frame rates, which are 30 Hz and 25 Hz respectively. That pushes the sampling rate up to 44.1 kHz.

The compact disc sampling rate of 44.1 kHz produces a lot of data and might be overkill for some applications, such as recording voice rather than music. Halving the sampling rate to 22.05 kHz reduces the upper range of reproducible sound by one octave to 10 kHz. Halving it again to 11.025 kHz gives us a frequency range to 5 kHz. Sampling rates of 44.1 kHz, 22.05 kHz, and 11.025 kHz, as well as 8 kHz, are the standards commonly supported by waveform audio devices.

You might think that a sampling rate of 11.025 kHz is adequate for recording a piano because the highest frequency of a piano is 4186 Hz. However, 4186 Hz is the highest fundamental of a piano. Cutting off all sine waves above 5000 Hz reduces the overtones that can be reproduced and will not accurately capture and reproduce the piano sound.
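The piano example can be put in numbers with a short sketch. The function harmonics_below_nyquist is a hypothetical helper that counts how many harmonics of a note fit under half the sampling rate. It assumes an ideal filter that cuts off exactly at that point, so it ignores the roll-off mentioned earlier.

```c
/* Number of harmonics (counting the fundamental as the first) of a
   note that lie at or below half the sampling rate. Assumes an ideal
   cutoff at exactly half the sampling rate. (Hypothetical helper.) */

int harmonics_below_nyquist (double fFundamental, double fSampleRate)
{
     return (int) ((fSampleRate / 2.0) / fFundamental) ;
}
```

For the highest piano note at 4186 Hz, a sampling rate of 11,025 Hz leaves room for only the fundamental, while 44,100 Hz leaves room for five harmonics.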

The Sample Size

The second parameter in pulse code modulation is the sample size measured in bits. The sample size determines the difference between the softest sound and loudest sound that can be recorded and played back. This is known as the dynamic range.

Sound intensity is the square of the waveform amplitude (that is, the composite of the maximum amplitudes that each sine wave reaches over the course of one cycle). As is the case with frequency, human perception of sound intensity is logarithmic.

The difference in intensity between two sounds is measured in bels (named after Alexander Graham Bell, the inventor of the telephone) and decibels (dB). A bel is a tenfold increase in sound intensity. One dB is one tenth of a bel in equal multiplicative steps. Hence, one dB is an increase in sound intensity of 1.26 (that is, the 10th root of 10), or an increase in waveform amplitude of 1.12 (the 20th root of 10). A decibel is about the lowest increase in sound intensity that the ear can perceive. The difference in intensity between sounds at the threshold of hearing and sounds at the threshold of pain is about 100 dB.

You can calculate the dynamic range in decibels between two sounds with the following formula:

$$\mathrm{dB} = 20 \log_{10} \left( \frac{A_1}{A_2} \right)$$

where A1 and A2 are the amplitudes of the two sounds. With a sample size of 1 bit, the dynamic range is zero, because only one amplitude is possible.

With a sample size of 8 bits, the ratio of the largest amplitude to the smallest amplitude is 256. Thus, the dynamic range is

$$\mathrm{dB} = 20 \log_{10} (256)$$

or 48 decibels. A 48-dB dynamic range is about the difference between a quiet room and a power lawn mower. Doubling the sample size to 16 bits yields a dynamic range of

$$\mathrm{dB} = 20 \log_{10} (65536)$$

or 96 decibels. This is very nearly the difference between the threshold of hearing and the threshold of pain and is considered just about ideal for the reproduction of music.

Both 8-bit and 16-bit sample sizes are supported under Windows. When storing 8-bit samples, the samples are treated as unsigned bytes. Silence would be stored as a string of 0x80 values. The 16-bit samples are treated as signed integers, so silence would be stored as a string of zeros.
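Because the two sample sizes use different conventions for silence, converting between them involves moving the zero point as well as rescaling. The following is a minimal sketch of one way to do that. The function names are hypothetical, and the fixed-width types come from the C99 header stdint.h rather than from the Windows headers.

```c
#include <stdint.h>

/* Convert an unsigned 8-bit sample (silence = 0x80) to a signed
   16-bit sample (silence = 0). (Hypothetical helper.) */

int16_t pcm8_to_pcm16 (uint8_t bSample)
{
     return (int16_t) (((int) bSample - 0x80) * 256) ;
}

/* Convert a signed 16-bit sample back to an unsigned 8-bit sample,
   discarding the low 8 bits of precision. (Hypothetical helper.) */

uint8_t pcm16_to_pcm8 (int16_t iSample)
{
     return (uint8_t) (((int) iSample + 32768) / 256) ;
}
```

An 8-bit value of 0x80 becomes 0, and a 16-bit value of 0 becomes 0x80, so silence stays silent in both directions.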

To calculate the storage space required for uncompressed audio, multiply the duration of the sound in seconds by the sampling rate. Double that if you're using 16-bit samples rather than 8-bit samples. Double that again if you're recording in stereo. For example, an hour of CD-quality sound (or 3600 seconds at 44,100 samples per second with 2 bytes per sample in stereo) requires 635 megabytes, not coincidentally very close to the storage capacity of a CD-ROM.
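The storage calculation is simple enough to express directly in code. In this sketch the function name pcm_storage_bytes is hypothetical, and the unsigned long long return type (a C99 feature) is used only so that long recordings do not overflow.

```c
/* Bytes needed to store uncompressed PCM audio: duration times
   sampling rate times bytes per sample times number of channels.
   (Hypothetical helper, for illustration only.) */

unsigned long long pcm_storage_bytes (unsigned long nSeconds,
                                      unsigned long nSamplesPerSec,
                                      unsigned int  nBitsPerSample,
                                      unsigned int  nChannels)
{
     return (unsigned long long) nSeconds * nSamplesPerSec
                 * (nBitsPerSample / 8) * nChannels ;
}
```

For one hour of 16-bit stereo sound at 44,100 samples per second, the result is 635,040,000 bytes.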

Generating Sine Waves in Software

For our first exercise in waveform audio, we're not going to save sounds to files or play back recorded sounds. We're going to use the low-level waveform audio APIs (that is, the functions beginning with the prefix waveOut) to create an audio sine wave generator called SINEWAVE. This program generates sine waves from 20 Hz (the bottom of human perception) to 5,000 Hz (two octaves short of the top of human perception) in 1 Hz increments.

As you know, the standard C run-time library includes a function called sin that returns the sine of an angle given in radians. (2π radians equals 360 degrees.) The sin function returns a value ranging from –1 to 1. (We used this function in another program called SINEWAVE way back in Chapter 5.) Thus, it should be easy to use the sin function to generate sine wave data to output to the waveform audio hardware. Basically, you fill a buffer up with data representing the waveform (in this case, a sine wave) and pass it to the API. (It's a little more complicated than that, but I'll get to the details shortly.) When the waveform audio hardware finishes playing the buffer, you pass it a second buffer, and so forth.

When first considering this problem (and not knowing anything about PCM), you might think it reasonable to divide one cycle of the sine wave into a fixed number of samples—for example, 360. For a 20-Hz sine wave, you output 7200 samples every second. For a 200-Hz sine wave, you output 72,000 samples per second. That might work, but it's not the way to do it. For a 5000-Hz sine wave, you'd need to output 1,800,000 samples per second, which would surely tax the DAC! Moreover, for the higher frequencies, this is much more precision than is needed.

With pulse code modulation, the sample rate is a constant. Let's assume the sample rate is 11,025 Hz because that's what I use in the SINEWAVE program. If you wish to generate a sine wave of 2,756.25 Hz (exactly one-quarter the sample rate), each cycle of the sine wave is just 4 samples. For a sine wave of 25 Hz, each cycle requires 441 samples. In general, the number of samples per cycle is the sample rate divided by the desired sine wave frequency. Once you know the number of samples per cycle, you can divide 2π radians by that number and use the sin function to get the samples for one cycle. Then just repeat the samples for one cycle over and over again to create a continuous waveform.

The problem is the number of samples per cycle may well be fractional, so this approach won't work well either. You'd get a discontinuity at the end of each cycle.

The key to making this work correctly is to maintain a static "phase angle" variable. This angle is initialized at 0. The first sample is the sine of 0 degrees. The phase angle is then incremented by 2π times the frequency, divided by the sample rate. Use this phase angle for the second sample, and continue in this way. Whenever the phase angle gets above 2π radians, subtract 2π radians from it. But don't ever reinitialize it to 0.

For example, suppose you want to generate a sine wave of 1000 Hz with a sample rate of 11,025 Hz. That's about 11 samples per cycle. The phase angles—and here I'll give them in degrees to make this a little more comprehensible—for approximately the first cycle and a half are 0, 32.65, 65.31, 97.96, 130.61, 163.27, 195.92, 228.57, 261.22, 293.88, 326.53, 359.18, 31.84, 64.49, 97.14, 129.80, 162.45, 195.10, and so forth. The waveform data you put in the buffer are the sines of these angles, scaled to the number of bits per sample. When creating the data for a subsequent buffer, you keep incrementing the last phase angle value without reinitializing it to zero.
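The list of phase angles above can be reproduced with a small sketch that works in degrees, as the example does. The function name phase_angles_deg is hypothetical. The real program works in radians and stores the sine of each angle rather than the angle itself.

```c
/* Fill pAngles with the first nCount phase angles, in degrees, for a
   sine wave of fFreq Hz sampled at fSampleRate samples per second.
   The angle advances by a fixed step and wraps at 360 degrees without
   ever being reset to zero. (Hypothetical helper.) */

void phase_angles_deg (double fFreq, double fSampleRate,
                       double * pAngles, int nCount)
{
     double fAngle = 0.0 ;
     double fStep  = 360.0 * fFreq / fSampleRate ;
     int    i ;

     for (i = 0 ; i < nCount ; i++)
     {
          pAngles [i] = fAngle ;

          fAngle += fStep ;

          if (fAngle >= 360.0)
               fAngle -= 360.0 ;
     }
}
```

For 1000 Hz at 11,025 Hz the step is about 32.65 degrees, and the thirteenth angle wraps around to about 31.84 degrees instead of starting over at 0.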

A function called FillBuffer that does this—along with the rest of the SINEWAVE program—is shown in Figure 22-2.

Figure 22-2. The SINEWAVE program.

SINEWAVE.C

/*------------------------------------------------------
   SINEWAVE.C -- Multimedia Windows Sine Wave Generator
                 (c) Charles Petzold, 1998
  ------------------------------------------------------*/

#include <windows.h>
#include <math.h>
#include <stdlib.h>     // for malloc and free
#include "resource.h"

#define SAMPLE_RATE     11025
#define FREQ_MIN           20
#define FREQ_MAX         5000
#define FREQ_INIT         440
#define OUT_BUFFER_SIZE  4096
#define PI                  3.14159

BOOL CALLBACK DlgProc (HWND, UINT, WPARAM, LPARAM) ;

TCHAR szAppName [] = TEXT ("SineWave") ;

int WINAPI WinMain (HINSTANCE hInstance, HINSTANCE hPrevInstance,
                    PSTR szCmdLine, int iCmdShow)
{
     if (-1 == DialogBox (hInstance, szAppName, NULL, DlgProc))
     {
          MessageBox (NULL, TEXT ("This program requires Windows NT!"),
                      szAppName, MB_ICONERROR) ;
     }
     return 0 ;
}

VOID FillBuffer (PBYTE pBuffer, int iFreq)
{
     static double fAngle ;
     int           i ;

     for (i = 0 ; i < OUT_BUFFER_SIZE ; i++)
     {
          pBuffer [i] = (BYTE) (127 + 127 * sin (fAngle)) ;

          fAngle += 2 * PI * iFreq / SAMPLE_RATE ;

          if (fAngle > 2 * PI)
               fAngle -= 2 * PI ;
     }
}

BOOL CALLBACK DlgProc (HWND hwnd, UINT message, WPARAM wParam, LPARAM lParam)
{
     static BOOL         bShutOff, bClosing ;
     static HWAVEOUT     hWaveOut ;
     static HWND         hwndScroll ;
     static int          iFreq = FREQ_INIT ;
     static PBYTE        pBuffer1, pBuffer2 ;
     static PWAVEHDR     pWaveHdr1, pWaveHdr2 ;
     static WAVEFORMATEX waveformat ;
     int                 iDummy ;
     
     switch (message)
     {
     case WM_INITDIALOG:
          hwndScroll = GetDlgItem (hwnd, IDC_SCROLL) ;
          SetScrollRange (hwndScroll, SB_CTL, FREQ_MIN, FREQ_MAX, FALSE) ;
          SetScrollPos   (hwndScroll, SB_CTL, FREQ_INIT, TRUE) ;
          SetDlgItemInt  (hwnd, IDC_TEXT, FREQ_INIT, FALSE) ;
          
          return TRUE ;
          
     case WM_HSCROLL:
          switch (LOWORD (wParam))
          {
          case SB_LINELEFT:  iFreq -=  1 ;  break ;
          case SB_LINERIGHT: iFreq +=  1 ;  break ;
          case SB_PAGELEFT:  iFreq /=  2 ;  break ;
          case SB_PAGERIGHT: iFreq *=  2 ;  break ;
               
          case SB_THUMBTRACK:
               iFreq = HIWORD (wParam) ;
               break ;
               
          case SB_TOP:
               GetScrollRange (hwndScroll, SB_CTL, &iFreq, &iDummy) ;
               break ;
               
          case SB_BOTTOM:
               GetScrollRange (hwndScroll, SB_CTL, &iDummy, &iFreq) ;
               break ;
          }
          
          iFreq = max (FREQ_MIN, min (FREQ_MAX, iFreq)) ;

          SetScrollPos (hwndScroll, SB_CTL, iFreq, TRUE) ;
          SetDlgItemInt (hwnd, IDC_TEXT, iFreq, FALSE) ;
          return TRUE ;
          
     case WM_COMMAND:
          switch (LOWORD (wParam))
          {
          case IDC_ONOFF:
                    // If turning on waveform, hWaveOut is NULL
               
               if (hWaveOut == NULL)
               {
                         // Allocate memory for 2 headers and 2 buffers

                    pWaveHdr1 = malloc (sizeof (WAVEHDR)) ;
                    pWaveHdr2 = malloc (sizeof (WAVEHDR)) ;
                    pBuffer1  = malloc (OUT_BUFFER_SIZE) ;
                    pBuffer2  = malloc (OUT_BUFFER_SIZE) ;

                    if (!pWaveHdr1 || !pWaveHdr2 || !pBuffer1 || !pBuffer2)
                    {
                              // Free whichever blocks were allocated

                         if (pWaveHdr1) free (pWaveHdr1) ;
                         if (pWaveHdr2) free (pWaveHdr2) ;
                         if (pBuffer1)  free (pBuffer1) ;
                         if (pBuffer2)  free (pBuffer2) ;

                         MessageBeep (MB_ICONEXCLAMATION) ;
                         MessageBox (hwnd, TEXT ("Error allocating memory!"),
                                     szAppName, MB_ICONEXCLAMATION | MB_OK) ;
                         return TRUE ;
                    }

                         // Variable to indicate Off button pressed

                    bShutOff = FALSE ;
                         
                         // Open waveform audio for output
                         
                    waveformat.wFormatTag      = WAVE_FORMAT_PCM ;
                    waveformat.nChannels       = 1 ;
                    waveformat.nSamplesPerSec  = SAMPLE_RATE ;
                    waveformat.nAvgBytesPerSec = SAMPLE_RATE ;
                    waveformat.nBlockAlign     = 1 ;
                    waveformat.wBitsPerSample  = 8 ;
                    waveformat.cbSize          = 0 ;
                         
                    if (waveOutOpen (&hWaveOut, WAVE_MAPPER, &waveformat,
                                     (DWORD_PTR) hwnd, 0, CALLBACK_WINDOW)
                              != MMSYSERR_NOERROR)
                    {
                         free (pWaveHdr1) ;
                         free (pWaveHdr2) ;
                         free (pBuffer1) ;
                         free (pBuffer2) ;

                         hWaveOut = NULL ;
                         MessageBeep (MB_ICONEXCLAMATION) ;
                         MessageBox (hwnd, 
                              TEXT ("Error opening waveform audio device!"),
                              szAppName, MB_ICONEXCLAMATION | MB_OK) ;
                         return TRUE ;
                    }

                         // Set up headers and prepare them

                    pWaveHdr1->lpData          = pBuffer1 ;
                    pWaveHdr1->dwBufferLength  = OUT_BUFFER_SIZE ;
                    pWaveHdr1->dwBytesRecorded = 0 ;
                    pWaveHdr1->dwUser          = 0 ;
                    pWaveHdr1->dwFlags         = 0 ;
                    pWaveHdr1->dwLoops         = 1 ;
                    pWaveHdr1->lpNext          = NULL ;
                    pWaveHdr1->reserved        = 0 ;
                    
                    waveOutPrepareHeader (hWaveOut, pWaveHdr1, 
                                          sizeof (WAVEHDR)) ;

                    pWaveHdr2->lpData          = pBuffer2 ;
                    pWaveHdr2->dwBufferLength  = OUT_BUFFER_SIZE ;
                    pWaveHdr2->dwBytesRecorded = 0 ;
                    pWaveHdr2->dwUser          = 0 ;
                    pWaveHdr2->dwFlags         = 0 ;
                    pWaveHdr2->dwLoops         = 1 ;
                    pWaveHdr2->lpNext          = NULL ;
                    pWaveHdr2->reserved        = 0 ;
                    
                    waveOutPrepareHeader (hWaveOut, pWaveHdr2,
                                          sizeof (WAVEHDR)) ;
               }
                    // If turning off waveform, reset waveform audio
               else
               {
                    bShutOff = TRUE ;

                    waveOutReset (hWaveOut) ;
               }
               return TRUE ;
          }
          break ;

               // Message generated from waveOutOpen call
               
     case MM_WOM_OPEN:
          SetDlgItemText (hwnd, IDC_ONOFF, TEXT ("Turn Off")) ;

               // Send two buffers to waveform output device
                    
          FillBuffer (pBuffer1, iFreq) ;
          waveOutWrite (hWaveOut, pWaveHdr1, sizeof (WAVEHDR)) ;
                    
          FillBuffer (pBuffer2, iFreq) ;
          waveOutWrite (hWaveOut, pWaveHdr2, sizeof (WAVEHDR)) ;
          return TRUE ;

               // Message generated when a buffer is finished
                    
     case MM_WOM_DONE:
          if (bShutOff)
          {
               waveOutClose (hWaveOut) ;
               return TRUE ;
          }

               // Fill and send out a new buffer

          FillBuffer (((PWAVEHDR) lParam)->lpData, iFreq) ;
          waveOutWrite (hWaveOut, (PWAVEHDR) lParam, sizeof (WAVEHDR)) ;
          return TRUE ;
          
     case MM_WOM_CLOSE:
          waveOutUnprepareHeader (hWaveOut, pWaveHdr1, sizeof (WAVEHDR)) ;
          waveOutUnprepareHeader (hWaveOut, pWaveHdr2, sizeof (WAVEHDR)) ;

          free (pWaveHdr1) ;
          free (pWaveHdr2) ;
          free (pBuffer1) ;
          free (pBuffer2) ;

          hWaveOut = NULL ;
          SetDlgItemText (hwnd, IDC_ONOFF, TEXT ("Turn On")) ;
          
          if (bClosing)
               EndDialog (hwnd, 0) ;
          
          return TRUE ;
          
     case WM_SYSCOMMAND:
          switch (wParam)
          {
          case SC_CLOSE:
               if (hWaveOut != NULL)
               {
                    bShutOff = TRUE ;
                    bClosing = TRUE ;
                    
                    waveOutReset (hWaveOut) ;
               }
               else
                    EndDialog (hwnd, 0) ;
               
               return TRUE ;
          }
          break ;
     }
     return FALSE ;
}

SINEWAVE.RC (excerpts)

//Microsoft Developer Studio generated resource script.

#include "resource.h"
#include "afxres.h"

/////////////////////////////////////////////////////////////////////////////
// Dialog

SINEWAVE DIALOG DISCARDABLE  100, 100, 200, 50
STYLE WS_MINIMIZEBOX | WS_VISIBLE | WS_CAPTION | WS_SYSMENU
CAPTION "Sine Wave Generator"
FONT 8, "MS Sans Serif"
BEGIN
    SCROLLBAR       IDC_SCROLL,8,8,150,12
    RTEXT           "440",IDC_TEXT,160,10,20,8
    LTEXT           "Hz",IDC_STATIC,182,10,12,8
    PUSHBUTTON      "Turn On",IDC_ONOFF,80,28,40,14
END

RESOURCE.H (excerpts)

// Microsoft Developer Studio generated include file.
// Used by SineWave.rc

#define IDC_STATIC                      -1
#define IDC_SCROLL                      1000
#define IDC_TEXT                        1001
#define IDC_ONOFF                       1002

Note that the OUT_BUFFER_SIZE, SAMPLE_RATE, and PI identifiers used in the FillBuffer routine are defined at the top of the program. The iFreq argument to FillBuffer is the desired frequency in Hz. Notice that the result of the sin function is scaled to range between 0 and 254. For each sample, the fAngle argument to the sin function is increased by 2π radians times the desired frequency divided by the sample rate.

SINEWAVE's window contains three controls: a horizontal scroll bar used for selecting the frequency, a static text field that indicates the currently selected frequency, and a push button labeled "Turn On." When you press the button, you should hear a sine wave from the speakers connected to your sound board and the button text will change to "Turn Off." You can change the frequency by moving the scroll bar with the keyboard or mouse. To turn off the sound, push the button again.

The SINEWAVE code initializes the scroll bar so that the minimum frequency is 20 Hz and the maximum frequency is 5000 Hz during the WM_INITDIALOG message. Initially, the scroll bar is set to 440 Hz. In musical terms, this is the A above middle C, the note used for tuning an orchestra. DlgProc alters the static variable iFreq on receipt of WM_HSCROLL messages. Notice that Page Left and Page Right cause DlgProc to decrease or increase the frequency by one octave.

When DlgProc receives a WM_COMMAND message from the button, it first allocates four blocks of memory: two for WAVEHDR structures, discussed shortly, and two for buffers, called pBuffer1 and pBuffer2, to hold the waveform data.

SINEWAVE opens the waveform audio device for output by calling the waveOutOpen function, which uses the following arguments:

waveOutOpen (&hWaveOut, wDeviceID, &waveformat, dwCallBack,
             dwCallBackData, dwFlags) ;

You set the first argument to point to a variable of type HWAVEOUT ("handle to waveform audio output"). On return from the function, this variable will be set to a handle used in subsequent waveform output calls.

The second argument to waveOutOpen is a device ID. This allows the function to be used on machines that have multiple sound boards installed. The argument can range from 0 to one less than the number of waveform output devices installed in the system. You can get the number of waveform output devices by calling waveOutGetNumDevs and find out about each of them by calling waveOutGetDevCaps. If you wish to avoid this device interrogation, you can use the constant WAVE_MAPPER (defined as equalling –1) to select the device the user has indicated as the Preferred Device in the Audio tab of the Multimedia applet of the Control Panel. The system might also select another device if the preferred device can't handle what you need to do and another device can.

The third argument is a pointer to a WAVEFORMATEX structure. (More about this shortly.) The fourth argument is either a window handle or a pointer to a callback function in a dynamic-link library. This argument indicates the window or callback function that receives the waveform output messages. If you use a callback function, you can specify program-defined data in the fifth argument. The dwFlags argument can be set to either CALLBACK_WINDOW or CALLBACK_FUNCTION to indicate what the fourth argument is. You can also use the flag WAVE_FORMAT_QUERY to check whether the device can be opened without actually opening it. A few other flags are available.

The third argument to waveOutOpen is defined as a pointer to a structure of type WAVEFORMATEX, defined in MMSYSTEM.H as shown below:

typedef struct tWAVEFORMATEX
{
     WORD  wFormatTag ;        // waveform format = WAVE_FORMAT_PCM
     WORD  nChannels ;         // number of channels = 1 or 2
     DWORD nSamplesPerSec ;    // sample rate
     DWORD nAvgBytesPerSec ;   // bytes per second
     WORD  nBlockAlign ;       // block alignment
     WORD  wBitsPerSample ;    // bits per sample = 8 or 16
     WORD  cbSize ;            // 0 for PCM
}
WAVEFORMATEX, * PWAVEFORMATEX ;

This is the structure you use to specify the sample rate (nSamplesPerSec), the sample size (wBitsPerSample), and whether you want monophonic or stereophonic sound (nChannels). Some of the information in this structure may seem redundant, but the structure is designed for sampling methods other than PCM, in which case the last field is set to a nonzero value and other information follows.

For PCM, set the nBlockAlign field to the product of nChannels and wBitsPerSample, divided by 8. This is the total number of bytes per sample, counting all channels. Set the nAvgBytesPerSec field to the product of nSamplesPerSec and nBlockAlign.
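The two derived fields can be computed rather than typed in by hand. The sketch below uses plain integer arguments instead of the WAVEFORMATEX structure so that it stands on its own. The function names pcm_block_align and pcm_avg_bytes_per_sec are hypothetical.

```c
/* Block alignment for PCM: bytes per sample across all channels.
   (Hypothetical helper, for illustration only.) */

unsigned int pcm_block_align (unsigned int nChannels,
                              unsigned int nBitsPerSample)
{
     return nChannels * nBitsPerSample / 8 ;
}

/* Average bytes per second for PCM: sample rate times block
   alignment. (Hypothetical helper, for illustration only.) */

unsigned long pcm_avg_bytes_per_sec (unsigned long nSamplesPerSec,
                                     unsigned int  nChannels,
                                     unsigned int  nBitsPerSample)
{
     return nSamplesPerSec * pcm_block_align (nChannels, nBitsPerSample) ;
}
```

For the 8-bit monophonic format that SINEWAVE uses at 11,025 Hz, these give 1 and 11,025, which are the values the program assigns to nBlockAlign and nAvgBytesPerSec.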

SINEWAVE initializes the fields of the WAVEFORMATEX structure and calls waveOutOpen like this:

waveOutOpen (&hWaveOut, WAVE_MAPPER, &waveformat, 
             (DWORD_PTR) hwnd, 0, CALLBACK_WINDOW)

The waveOutOpen function returns MMSYSERR_NOERROR (defined as 0) if the function is successful and a nonzero error code otherwise. If waveOutOpen returns nonzero, SINEWAVE cleans up and displays a message box indicating an error.

Now that the device is open, SINEWAVE continues by initializing the fields of the two WAVEHDR structures, which are used to pass buffers through the API. WAVEHDR is defined like so:

typedef struct wavehdr_tag
{
    LPSTR lpData;                    // pointer to data buffer
    DWORD dwBufferLength;            // length of data buffer
    DWORD dwBytesRecorded;           // used for input only
    DWORD dwUser;                    // for program use
    DWORD dwFlags;                   // flags
    DWORD dwLoops;                   // number of repetitions
    struct wavehdr_tag FAR *lpNext;  // reserved
    DWORD reserved;                  // reserved
} 
WAVEHDR, *PWAVEHDR ;

SINEWAVE sets the lpData field to the address of the buffer that will contain the data, dwBufferLength to the size of this buffer, and dwLoops to 1. All other fields can be set to 0 or NULL. If you want to play a repeated loop of sound, you can specify that with the dwFlags and dwLoops fields.

Next SINEWAVE calls waveOutPrepareHeader for the two headers. Calling this function prevents the structure and buffer from being swapped to disk.

So far, all of this preparation has been in response to the button click to turn on the sound. But a message is waiting in the program's message queue. Because we specified in waveOutOpen that we wish to use a window procedure for receiving waveform output messages, the waveOutOpen function posted an MM_WOM_OPEN message to the program's message queue. The wParam message parameter is set to the waveform output handle. To process the MM_WOM_OPEN message, SINEWAVE calls FillBuffer twice to fill the pBuffer1 and pBuffer2 buffers with sine wave data. SINEWAVE then passes the two WAVEHDR structures to waveOutWrite. This is the function that actually starts the sound playing by passing the data to the waveform output hardware.

When the waveform hardware is finished playing the data passed to it in the waveOutWrite function, the window is posted an MM_WOM_DONE message. The wParam parameter is the waveform output handle, and lParam is a pointer to the WAVEHDR structure. SINEWAVE processes this message by calculating new values for the buffer and resubmitting the buffer by calling waveOutWrite.

SINEWAVE could have been written using just one WAVEHDR structure and one buffer. However, there would be a slight delay between the time the waveform hardware finished playing the data and the program processed the MM_WOM_DONE message to submit a new buffer. The "double-buffering" technique that SINEWAVE uses prevents gaps in the sound.

When the user clicks the "Turn Off" button to turn off the sound, DlgProc receives another WM_COMMAND message. For this message, DlgProc sets the bShutOff variable to TRUE and calls waveOutReset. The waveOutReset function stops sound processing and generates an MM_WOM_DONE message. When bShutOff is TRUE, SINEWAVE processes MM_WOM_DONE by calling waveOutClose. This in turn generates an MM_WOM_CLOSE message. Processing of MM_WOM_CLOSE mostly involves cleaning up. SINEWAVE calls waveOutUnprepareHeader for the two WAVEHDR structures, frees all the memory blocks, and sets the text of the button back to "Turn On."

If the waveform hardware is still playing a buffer, calling waveOutClose by itself will not close the device: the call fails and returns WAVERR_STILLPLAYING. You must call waveOutReset first to halt the playing and to generate an MM_WOM_DONE message. DlgProc also processes the WM_SYSCOMMAND message when wParam is SC_CLOSE. This results from the user selecting "Close" from the system menu. If waveform audio is still playing, DlgProc calls waveOutReset. Regardless, EndDialog is eventually called to close the dialog box and end the program.

A Digital Sound Recorder

Windows includes a program called Sound Recorder that lets you digitally record and play back sounds. The program shown in Figure 22-3 (RECORD1) is not quite as sophisticated as Sound Recorder because it doesn't do any file I/O or allow sound editing. However, it does show the basics of using the low-level waveform audio API for both recording and playing back sounds.

Figure 22-3. The RECORD1 program.

RECORD1.C

/*----------------------------------------
   RECORD1.C -- Waveform Audio Recorder
                (c) Charles Petzold, 1998
  ----------------------------------------*/

#include <windows.h>
#include "resource.h"

#define INP_BUFFER_SIZE 16384

BOOL CALLBACK DlgProc (HWND, UINT, WPARAM, LPARAM) ;

TCHAR szAppName [] = TEXT ("Record1") ;

int WINAPI WinMain (HINSTANCE hInstance, HINSTANCE hPrevInstance,
                    PSTR szCmdLine, int iCmdShow)
{
     if (-1 == DialogBox (hInstance, TEXT ("Record"), NULL, DlgProc))
     {
          MessageBox (NULL, TEXT ("This program requires Windows NT!"),
                      szAppName, MB_ICONERROR) ;
     }

     return 0 ;
}

void ReverseMemory (BYTE * pBuffer, int iLength)
{
     BYTE b ;
     int  i ;
     
     for (i = 0 ; i < iLength / 2 ; i++)
     {
          b = pBuffer [i] ;
          pBuffer [i] = pBuffer [iLength - i - 1] ;
          pBuffer [iLength - i - 1] = b ;
     }
}

BOOL CALLBACK DlgProc (HWND hwnd, UINT message, WPARAM wParam, LPARAM lParam)
{
     static BOOL         bRecording, bPlaying, bReverse, bPaused,
                         bEnding, bTerminating ;
     static DWORD        dwDataLength, dwRepetitions = 1 ;
     static HWAVEIN      hWaveIn ;
     static HWAVEOUT     hWaveOut ;
     static PBYTE        pBuffer1, pBuffer2, pSaveBuffer, pNewBuffer ;
     static PWAVEHDR     pWaveHdr1, pWaveHdr2 ;
     static TCHAR        szOpenError[] = TEXT ("Error opening waveform audio!");
     static TCHAR        szMemError [] = TEXT ("Error allocating memory!") ;
     static WAVEFORMATEX waveform ;
     
     switch (message)
     {
     case WM_INITDIALOG:
               // Allocate memory for wave header
          
          pWaveHdr1 = malloc (sizeof (WAVEHDR)) ;
          pWaveHdr2 = malloc (sizeof (WAVEHDR)) ;
          
               // Allocate memory for save buffer
          
          pSaveBuffer = malloc (1) ;
          return TRUE ;
          
     case WM_COMMAND:
          switch (LOWORD (wParam))
          {
          case IDC_RECORD_BEG:
                    // Allocate buffer memory

               pBuffer1 = malloc (INP_BUFFER_SIZE) ;
               pBuffer2 = malloc (INP_BUFFER_SIZE) ;
               
               if (!pBuffer1 || !pBuffer2)
               {
                    if (pBuffer1) free (pBuffer1) ;
                    if (pBuffer2) free (pBuffer2) ;

                    MessageBeep (MB_ICONEXCLAMATION) ;
                    MessageBox (hwnd, szMemError, szAppName,
                                      MB_ICONEXCLAMATION | MB_OK) ;
                    return TRUE ;
               }
               
                    // Open waveform audio for input
               
               waveform.wFormatTag      = WAVE_FORMAT_PCM ;
               waveform.nChannels       = 1 ;
               waveform.nSamplesPerSec  = 11025 ;
               waveform.nAvgBytesPerSec = 11025 ;
               waveform.nBlockAlign     = 1 ;
               waveform.wBitsPerSample  = 8 ;
               waveform.cbSize          = 0 ;
               
               if (waveInOpen (&hWaveIn, WAVE_MAPPER, &waveform, 
                               (DWORD) hwnd, 0, CALLBACK_WINDOW))
               {
                    free (pBuffer1) ;
                    free (pBuffer2) ;
                    MessageBeep (MB_ICONEXCLAMATION) ;
                    MessageBox (hwnd, szOpenError, szAppName,
                                      MB_ICONEXCLAMATION | MB_OK) ;

                         // The buffers are gone and no device is open,
                         // so don't fall through to the header setup

                    return TRUE ;
               }
                    // Set up headers and prepare them
               
               pWaveHdr1->lpData          = pBuffer1 ;
               pWaveHdr1->dwBufferLength  = INP_BUFFER_SIZE ;
               pWaveHdr1->dwBytesRecorded = 0 ;
               pWaveHdr1->dwUser          = 0 ;
               pWaveHdr1->dwFlags         = 0 ;
               pWaveHdr1->dwLoops         = 1 ;
               pWaveHdr1->lpNext          = NULL ;
               pWaveHdr1->reserved        = 0 ;
               waveInPrepareHeader (hWaveIn, pWaveHdr1, sizeof (WAVEHDR)) ;
          
               pWaveHdr2->lpData          = pBuffer2 ;
               pWaveHdr2->dwBufferLength  = INP_BUFFER_SIZE ;
               pWaveHdr2->dwBytesRecorded = 0 ;
               pWaveHdr2->dwUser          = 0 ;
               pWaveHdr2->dwFlags         = 0 ;
               pWaveHdr2->dwLoops         = 1 ;
               pWaveHdr2->lpNext          = NULL ;
               pWaveHdr2->reserved        = 0 ;
          
               waveInPrepareHeader (hWaveIn, pWaveHdr2, sizeof (WAVEHDR)) ;
               return TRUE ;
               
          case IDC_RECORD_END:
                    // Reset input to return last buffer
               
               bEnding = TRUE ;
               waveInReset (hWaveIn) ;
               return TRUE ;
               
          case IDC_PLAY_BEG:
                    // Open waveform audio for output
               
               waveform.wFormatTag      = WAVE_FORMAT_PCM ;
               waveform.nChannels       = 1 ;
               waveform.nSamplesPerSec  = 11025 ;
               waveform.nAvgBytesPerSec = 11025 ;
               waveform.nBlockAlign     = 1 ;
               waveform.wBitsPerSample  = 8 ;
               waveform.cbSize          = 0 ;
               
               if (waveOutOpen (&hWaveOut, WAVE_MAPPER, &waveform, 
                                (DWORD) hwnd, 0, CALLBACK_WINDOW))
               {
                    MessageBeep (MB_ICONEXCLAMATION) ;
                    MessageBox (hwnd, szOpenError, szAppName,
                         MB_ICONEXCLAMATION | MB_OK) ;
               }
               return TRUE ;
               
          case IDC_PLAY_PAUSE:
                    // Pause or restart output
               
               if (!bPaused)
               {
                    waveOutPause (hWaveOut) ;
                    SetDlgItemText (hwnd, IDC_PLAY_PAUSE, TEXT ("Resume")) ;
                    bPaused = TRUE ;
               }
               else
               {
                    waveOutRestart (hWaveOut) ;
                    SetDlgItemText (hwnd, IDC_PLAY_PAUSE, TEXT ("Pause")) ;
                    bPaused = FALSE ;
               }
               return TRUE ;
               
          case IDC_PLAY_END:
                    // Reset output for close preparation
               
               bEnding = TRUE ;
               waveOutReset (hWaveOut) ;
               return TRUE ;
               
          case IDC_PLAY_REV:
                    // Reverse save buffer and play
               
               bReverse = TRUE ;
               ReverseMemory (pSaveBuffer, dwDataLength) ;
               
               SendMessage (hwnd, WM_COMMAND, IDC_PLAY_BEG, 0) ;
               return TRUE ;
               
          case IDC_PLAY_REP:
                    // Set infinite repetitions and play
               
               dwRepetitions = -1 ;
               SendMessage (hwnd, WM_COMMAND, IDC_PLAY_BEG, 0) ;
               return TRUE ;
               
          case IDC_PLAY_SPEED:
                    // Open waveform audio for fast output
               
               waveform.wFormatTag      = WAVE_FORMAT_PCM ;
               waveform.nChannels       = 1 ;
               waveform.nSamplesPerSec  = 22050 ;
               waveform.nAvgBytesPerSec = 22050 ;
               waveform.nBlockAlign     = 1 ;
               waveform.wBitsPerSample  = 8 ;
               waveform.cbSize          = 0 ;
               if (waveOutOpen (&hWaveOut, 0, &waveform, (DWORD) hwnd, 0,
                                           CALLBACK_WINDOW))
               {
                    MessageBeep (MB_ICONEXCLAMATION) ;
                    MessageBox (hwnd, szOpenError, szAppName,
                                      MB_ICONEXCLAMATION | MB_OK) ;
               }
               return TRUE ;
          }
          break ;
               
     case MM_WIM_OPEN:
               // Shrink down the save buffer
          
          pSaveBuffer = realloc (pSaveBuffer, 1) ;
          
               // Enable and disable buttons
          
          EnableWindow (GetDlgItem (hwnd, IDC_RECORD_BEG), FALSE) ;
          EnableWindow (GetDlgItem (hwnd, IDC_RECORD_END), TRUE)  ;
          EnableWindow (GetDlgItem (hwnd, IDC_PLAY_BEG),   FALSE) ;
          EnableWindow (GetDlgItem (hwnd, IDC_PLAY_PAUSE), FALSE) ;
          EnableWindow (GetDlgItem (hwnd, IDC_PLAY_END),   FALSE) ;
          EnableWindow (GetDlgItem (hwnd, IDC_PLAY_REV),   FALSE) ;
          EnableWindow (GetDlgItem (hwnd, IDC_PLAY_REP),   FALSE) ;
          EnableWindow (GetDlgItem (hwnd, IDC_PLAY_SPEED), FALSE) ;
          SetFocus (GetDlgItem (hwnd, IDC_RECORD_END)) ;

               // Add the buffers
          
          waveInAddBuffer (hWaveIn, pWaveHdr1, sizeof (WAVEHDR)) ;
          waveInAddBuffer (hWaveIn, pWaveHdr2, sizeof (WAVEHDR)) ;
          
               // Begin sampling
          
          bRecording = TRUE ;
          bEnding = FALSE ;
          dwDataLength = 0 ;
          waveInStart (hWaveIn) ;
          return TRUE ;
          
     case MM_WIM_DATA:
         
               // Reallocate save buffer memory
          
          pNewBuffer = realloc (pSaveBuffer, dwDataLength +
                                   ((PWAVEHDR) lParam)->dwBytesRecorded) ;
          
          if (pNewBuffer == NULL)
          {
               waveInClose (hWaveIn) ;
               MessageBeep (MB_ICONEXCLAMATION) ;
               MessageBox (hwnd, szMemError, szAppName,
                                 MB_ICONEXCLAMATION | MB_OK) ;
               return TRUE ;
          }
          
          pSaveBuffer = pNewBuffer ;
          CopyMemory (pSaveBuffer + dwDataLength, ((PWAVEHDR) lParam)->lpData,
                         ((PWAVEHDR) lParam)->dwBytesRecorded) ;
          
          dwDataLength += ((PWAVEHDR) lParam)->dwBytesRecorded ;
          
          if (bEnding)
          {
               waveInClose (hWaveIn) ;
               return TRUE ;
          }
          
               // Send out a new buffer
          
          waveInAddBuffer (hWaveIn, (PWAVEHDR) lParam, sizeof (WAVEHDR)) ;
          return TRUE ;
          
     case MM_WIM_CLOSE:
               // Free the buffer memory

          waveInUnprepareHeader (hWaveIn, pWaveHdr1, sizeof (WAVEHDR)) ;
          waveInUnprepareHeader (hWaveIn, pWaveHdr2, sizeof (WAVEHDR)) ;

          free (pBuffer1) ;
          free (pBuffer2) ;
          
               // Enable and disable buttons
          
          EnableWindow (GetDlgItem (hwnd, IDC_RECORD_BEG), TRUE) ;
          EnableWindow (GetDlgItem (hwnd, IDC_RECORD_END), FALSE) ;
          SetFocus (GetDlgItem (hwnd, IDC_RECORD_BEG)) ;
          
          if (dwDataLength > 0)
          {
               EnableWindow (GetDlgItem (hwnd, IDC_PLAY_BEG),   TRUE)  ;
               EnableWindow (GetDlgItem (hwnd, IDC_PLAY_PAUSE), FALSE) ;
               EnableWindow (GetDlgItem (hwnd, IDC_PLAY_END),   FALSE) ;
               EnableWindow (GetDlgItem (hwnd, IDC_PLAY_REP),   TRUE)  ;
               EnableWindow (GetDlgItem (hwnd, IDC_PLAY_REV),   TRUE)  ;
               EnableWindow (GetDlgItem (hwnd, IDC_PLAY_SPEED), TRUE)  ;
               SetFocus (GetDlgItem (hwnd, IDC_PLAY_BEG)) ;
          }
          bRecording = FALSE ;
          
          if (bTerminating)
               SendMessage (hwnd, WM_SYSCOMMAND, SC_CLOSE, 0L) ;
          
          return TRUE ;
          
     case MM_WOM_OPEN:
               // Enable and disable buttons
          
          EnableWindow (GetDlgItem (hwnd, IDC_RECORD_BEG), FALSE) ;
          EnableWindow (GetDlgItem (hwnd, IDC_RECORD_END), FALSE) ;
          EnableWindow (GetDlgItem (hwnd, IDC_PLAY_BEG),   FALSE) ;
          EnableWindow (GetDlgItem (hwnd, IDC_PLAY_PAUSE), TRUE)  ;
          EnableWindow (GetDlgItem (hwnd, IDC_PLAY_END),   TRUE)  ;
          EnableWindow (GetDlgItem (hwnd, IDC_PLAY_REP),   FALSE) ;
          EnableWindow (GetDlgItem (hwnd, IDC_PLAY_REV),   FALSE) ;
          EnableWindow (GetDlgItem (hwnd, IDC_PLAY_SPEED), FALSE) ;
          SetFocus (GetDlgItem (hwnd, IDC_PLAY_END)) ;
          
               // Set up header
          
          pWaveHdr1->lpData          = pSaveBuffer ;
          pWaveHdr1->dwBufferLength  = dwDataLength ;
          pWaveHdr1->dwBytesRecorded = 0 ;
          pWaveHdr1->dwUser          = 0 ;
          pWaveHdr1->dwFlags         = WHDR_BEGINLOOP | WHDR_ENDLOOP ;
          pWaveHdr1->dwLoops         = dwRepetitions ;
          pWaveHdr1->lpNext          = NULL ;
          pWaveHdr1->reserved        = 0 ;
          
               // Prepare and write
          
          waveOutPrepareHeader (hWaveOut, pWaveHdr1, sizeof (WAVEHDR)) ;
          waveOutWrite (hWaveOut, pWaveHdr1, sizeof (WAVEHDR)) ;
          
          bEnding = FALSE ;
          bPlaying = TRUE ;
          return TRUE ;
          
     case MM_WOM_DONE:
          waveOutUnprepareHeader (hWaveOut, pWaveHdr1, sizeof (WAVEHDR)) ;
          waveOutClose (hWaveOut) ;
          return TRUE ;
          
     case MM_WOM_CLOSE:
               // Enable and disable buttons
          
          EnableWindow (GetDlgItem (hwnd, IDC_RECORD_BEG), TRUE)  ;
          EnableWindow (GetDlgItem (hwnd, IDC_RECORD_END), FALSE) ;
          EnableWindow (GetDlgItem (hwnd, IDC_PLAY_BEG),   TRUE)  ;
          EnableWindow (GetDlgItem (hwnd, IDC_PLAY_PAUSE), FALSE) ;
          EnableWindow (GetDlgItem (hwnd, IDC_PLAY_END),   FALSE) ;
          EnableWindow (GetDlgItem (hwnd, IDC_PLAY_REV),   TRUE)  ;
          EnableWindow (GetDlgItem (hwnd, IDC_PLAY_REP),   TRUE)  ;
          EnableWindow (GetDlgItem (hwnd, IDC_PLAY_SPEED), TRUE)  ;
          SetFocus (GetDlgItem (hwnd, IDC_PLAY_BEG)) ;
          
          SetDlgItemText (hwnd, IDC_PLAY_PAUSE, TEXT ("Pause")) ;
          bPaused = FALSE ;
          dwRepetitions = 1 ;
          bPlaying = FALSE ;
          
          if (bReverse)
          {
               ReverseMemory (pSaveBuffer, dwDataLength) ;
               bReverse = FALSE ;
          }
          
          if (bTerminating)
               SendMessage (hwnd, WM_SYSCOMMAND, SC_CLOSE, 0L) ;
          
          return TRUE ;
          
     case WM_SYSCOMMAND:
          switch (LOWORD (wParam))
          {
          case SC_CLOSE:
               if (bRecording)
               {
                    bTerminating = TRUE ;
                    bEnding = TRUE ;
                    waveInReset (hWaveIn) ;
                    return TRUE ;
               }
               if (bPlaying)
               {
                    bTerminating = TRUE ;
                    bEnding = TRUE ;
                    waveOutReset (hWaveOut) ;
                    return TRUE ;
               }
               
               free (pWaveHdr1) ;
               free (pWaveHdr2) ;
               free (pSaveBuffer) ;
               EndDialog (hwnd, 0) ;
               return TRUE ;
          }
          break ;
     }
     return FALSE ;
}

RECORD.RC (excerpts)

//Microsoft Developer Studio generated resource script.

#include "resource.h"
#include "afxres.h"

/////////////////////////////////////////////////////////////////////////////
// Dialog

RECORD DIALOG DISCARDABLE  100, 100, 152, 74
STYLE WS_MINIMIZEBOX | WS_VISIBLE | WS_CAPTION | WS_SYSMENU
CAPTION "Waveform Audio Recorder"
FONT 8, "MS Sans Serif"
BEGIN
    PUSHBUTTON      "Record",IDC_RECORD_BEG,28,8,40,14
    PUSHBUTTON      "End",IDC_RECORD_END,76,8,40,14,WS_DISABLED
    PUSHBUTTON      "Play",IDC_PLAY_BEG,8,30,40,14,WS_DISABLED
    PUSHBUTTON      "Pause",IDC_PLAY_PAUSE,56,30,40,14,WS_DISABLED
    PUSHBUTTON      "End",IDC_PLAY_END,104,30,40,14,WS_DISABLED
    PUSHBUTTON      "Reverse",IDC_PLAY_REV,8,52,40,14,WS_DISABLED
    PUSHBUTTON      "Repeat",IDC_PLAY_REP,56,52,40,14,WS_DISABLED
    PUSHBUTTON      "Speedup",IDC_PLAY_SPEED,104,52,40,14,WS_DISABLED
END

RESOURCE.H (excerpts)

// Microsoft Developer Studio generated include file.
// Used by Record.rc

#define IDC_RECORD_BEG                  1000
#define IDC_RECORD_END                  1001
#define IDC_PLAY_BEG                    1002
#define IDC_PLAY_PAUSE                  1003
#define IDC_PLAY_END                    1004
#define IDC_PLAY_REV                    1005
#define IDC_PLAY_REP                    1006
#define IDC_PLAY_SPEED                  1007

The RECORD.RC and RESOURCE.H files will also be used in the RECORD2 and RECORD3 programs.

The RECORD1 window has eight push buttons. When you first run RECORD1, only the Record button is enabled. When you press Record, you can begin recording. The Record button becomes disabled, and the End button is enabled. Press End to stop recording. At this point, the Play, Reverse, Repeat, and Speedup buttons also become enabled. Pressing any of these buttons plays back the sound: Play plays it normally, Reverse plays it in reverse, Repeat causes the sound to be repeated indefinitely (like with a tape loop), and Speedup plays the sound back twice as fast. You can end playback by pressing the second End button, or you can pause the playback by pressing Pause. When pressed, the Pause button changes into a Resume button to resume playing back the sound. If you record another sound, it replaces the existing sound in memory.

At any time, the only buttons that are enabled are those that perform valid operations. This requires a lot of calls to EnableWindow in the RECORD1 source code, but the program doesn't have to check if a particular push-button operation is valid. Of course, it also makes the operation of the program more intuitive.
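One way to keep all those EnableWindow calls consistent is to describe the enabled buttons as a function of the program's state. The following is a sketch only, not code from RECORD1: the BTN_ flags, the RecorderState names, and enabled_buttons are hypothetical, and the table simply restates the behavior described above.

```c
/* Hypothetical bit flags, one per push button in the dialog box */
enum
{
     BTN_RECORD_BEG = 1 << 0,
     BTN_RECORD_END = 1 << 1,
     BTN_PLAY_BEG   = 1 << 2,
     BTN_PLAY_PAUSE = 1 << 3,
     BTN_PLAY_END   = 1 << 4,
     BTN_PLAY_REV   = 1 << 5,
     BTN_PLAY_REP   = 1 << 6,
     BTN_PLAY_SPEED = 1 << 7
} ;

/* Hypothetical program states */
typedef enum
{
     STATE_EMPTY,       /* nothing recorded yet      */
     STATE_RECORDING,   /* waveform input is open    */
     STATE_HAS_SOUND,   /* idle, save buffer has data */
     STATE_PLAYING      /* waveform output is open   */
}
RecorderState ;

/* Return the set of buttons that perform a valid operation in a state */
unsigned enabled_buttons (RecorderState state)
{
     switch (state)
     {
     case STATE_EMPTY:
          return BTN_RECORD_BEG ;

     case STATE_RECORDING:
          return BTN_RECORD_END ;

     case STATE_HAS_SOUND:
          return BTN_RECORD_BEG | BTN_PLAY_BEG | BTN_PLAY_REV |
                 BTN_PLAY_REP   | BTN_PLAY_SPEED ;

     case STATE_PLAYING:
          return BTN_PLAY_PAUSE | BTN_PLAY_END ;
     }
     return 0 ;
}
```

A dialog procedure could loop over the eight control IDs and call EnableWindow with the matching bit whenever the state changes. RECORD1 itself spells the calls out message by message.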

RECORD1 takes a number of shortcuts to simplify the code. First, if multiple waveform audio hardware devices are installed, RECORD1 uses the default one. Second, the program records and plays back at the standard 11.025 kHz sampling rate with an 8-bit sample size regardless of whether a higher sampling rate or sample size is available. The only exception is for the speed-up function, where RECORD1 plays back the sound at the 22.050 kHz sampling rate, thus playing it twice as fast and an octave higher in frequency.

Recording a sound involves opening the waveform audio hardware for input and passing buffers to the API to receive the sound data.

RECORD1 maintains several memory blocks. Three of these blocks are very small, at least initially, and are allocated during the WM_INITDIALOG message in DlgProc. The program allocates two WAVEHDR structures pointed to by pWaveHdr1 and pWaveHdr2. These structures are used to pass buffers to the waveform APIs. The pSaveBuffer pointer points to a buffer for storing the complete recorded sound; this is initially allocated as a 1-byte block. Later on, during recording, the buffer is increased in size to accommodate all the sound data. (If you record for a long period of time, RECORD1 recovers gracefully when it runs out of memory during recording, and lets you play back that portion of the sound successfully stored.) I'll refer to this buffer as the "save buffer" because it is used to save the accumulated sound data. Two more memory blocks, 16K in size and pointed to by pBuffer1 and pBuffer2, are allocated during recording to receive sound data. These buffers are freed when recording is complete.

Each of the eight buttons generates a WM_COMMAND message to DlgProc, the dialog procedure for RECORD1's window. Initially, only the Record button is enabled. Pressing this generates a WM_COMMAND message with wParam equal to IDC_RECORD_BEG. To process this message, RECORD1 allocates the two 16K buffers for receiving sound data, initializes the fields of a WAVEFORMATEX structure and passes it to the waveInOpen function, and sets up the two WAVEHDR structures.

The waveInOpen function generates an MM_WIM_OPEN message. During this message, RECORD1 shrinks the save buffer down to 1 byte in preparation for receiving data. (Of course, the first time you record something, the save buffer is already 1 byte in length, but during subsequent recordings, it could be much larger.) During the MM_WIM_OPEN message, RECORD1 also enables and disables the appropriate push buttons. Next, the program passes the two WAVEHDR structures and buffers to the API using waveInAddBuffer. Some flags are set, and recording begins with a call to waveInStart.

At a sampling rate of 11.025 kHz with an 8-bit sample size, the 16K buffer will be filled in approximately 1.5 seconds. At that time, RECORD1 receives an MM_WIM_DATA message. In response to this message, the program reallocates the save buffer based on the dwDataLength variable and the dwBytesRecorded field of the WAVEHDR structure. If the reallocation fails, RECORD1 calls waveInClose to stop recording.
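The 1.5-second figure is just the buffer size divided by the data rate. A tiny helper (hypothetical, not part of RECORD1) makes the arithmetic explicit:

```c
/* Seconds needed to fill (or play) a buffer of buffer_bytes bytes
   at a data rate of bytes_per_sec. Returns 0.0 for a zero rate. */
double buffer_seconds (unsigned long buffer_bytes,
                       unsigned long bytes_per_sec)
{
     if (bytes_per_sec == 0)
          return 0.0 ;

     return (double) buffer_bytes / (double) bytes_per_sec ;
}
```

For RECORD1's values, buffer_seconds (16384, 11025) comes to a little under 1.49 seconds, so the program sees an MM_WIM_DATA message roughly every second and a half while recording.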

If the reallocation is successful, RECORD1 copies the data from the 16K buffer into the save buffer. It then calls waveInAddBuffer again. This process continues until RECORD1 runs out of memory for the save buffer or the user presses the End button.
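The grow-and-copy step can be isolated from the waveform API. The sketch below uses the standard C library only; the SaveBuffer structure and save_buffer_append are hypothetical names standing in for RECORD1's pSaveBuffer and dwDataLength variables. As in RECORD1, the old pointer is overwritten only after realloc succeeds, so a failed reallocation leaves the data recorded so far intact.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for pSaveBuffer and dwDataLength */
typedef struct
{
     unsigned char * data ;     /* accumulated sound data (may be NULL) */
     size_t          length ;   /* number of valid bytes in data        */
}
SaveBuffer ;

/* Append chunk_len bytes to the save buffer.
   Returns 1 on success. Returns 0 if memory runs out, in which
   case the existing contents are left untouched. */
int save_buffer_append (SaveBuffer * sb, const void * chunk, size_t chunk_len)
{
     unsigned char * grown ;

     if (chunk_len == 0)
          return 1 ;

     grown = realloc (sb->data, sb->length + chunk_len) ;

     if (grown == NULL)
          return 0 ;

     memcpy (grown + sb->length, chunk, chunk_len) ;

     sb->data    = grown ;
     sb->length += chunk_len ;
     return 1 ;
}
```

A caller would start with a SaveBuffer whose data is NULL and length is 0, call save_buffer_append once per filled input buffer, and free the data pointer when the recording is no longer needed.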

The End button generates a WM_COMMAND message with wParam equal to IDC_RECORD_END. Processing this message is simple. RECORD1 sets the bEnding flag to TRUE and calls waveInReset. The waveInReset function causes recording to stop and generates an MM_WIM_DATA message containing a partially filled buffer. RECORD1 responds to this final MM_WIM_DATA message normally, except that it closes the waveform input device by calling waveInClose.

The waveInClose function generates an MM_WIM_CLOSE message. RECORD1 responds to this message by freeing the 16K input buffers and enabling and disabling the appropriate push buttons. In particular, if the save buffer contains data, which it almost always will unless the first reallocation fails, then the play buttons are enabled.

After recording a sound, the save buffer contains the total accumulated sound data. When the user selects the Play button, DlgProc receives a WM_COMMAND message with wParam equal to IDC_PLAY_BEG. The program responds by initializing the fields of a WAVEFORMATEX structure and calling waveOutOpen.

The waveOutOpen call again generates an MM_WOM_OPEN message. During this message, RECORD1 enables and disables the appropriate push buttons (allowing only Pause and End), initializes the fields of the WAVEHDR structure with the save buffer, prepares it by calling waveOutPrepareHeader, and begins playing it with a call to waveOutWrite.

Normally, the sound will continue until all the data in the buffer has been played. At that time, an MM_WOM_DONE message is generated. If there are additional buffers to be played, a program can pass them out to the API at that time. RECORD1 plays only one big buffer, so the program simply unprepares the header and calls waveOutClose. The waveOutClose function generates an MM_WOM_CLOSE message. During this message, RECORD1 enables and disables the appropriate buttons, allowing the sound to be played again or a new sound to be recorded.

I've also included a second End button so that the user can stop playing the sound at any time before the save buffer has completed. This End button generates a WM_COMMAND message with wParam equal to IDC_PLAY_END, and the program responds by calling waveOutReset. This function generates an MM_WOM_DONE message that is processed normally.

RECORD1's window also includes a Pause button. Processing this button is easy. The first time it's pushed, RECORD1 calls waveOutPause to halt the sound and sets the text in the Pause button to Resume. Pressing the Resume button starts the playback going again by a call to waveOutRestart.

To make the program just a little more interesting, I've also included buttons labeled "Reverse," "Repeat," and "Speedup." These buttons generate WM_COMMAND messages with wParam values equal to IDC_PLAY_REV, IDC_PLAY_REP, and IDC_PLAY_SPEED.

Playing the sound in reverse involves reversing the order of the bytes in the save buffer and playing the sound normally. RECORD1 includes a small function named ReverseMemory to reverse the bytes. It calls this function during the WM_COMMAND message before playing the block and again at the end of the MM_WOM_CLOSE message to restore it to normal.
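Reversing individual bytes works here because RECORD1 records 8-bit mono sound, so every sample is exactly one byte. With wider samples or more channels, a plain byte reversal would also scramble the bytes inside each sample. The following sketch shows one way to generalize the idea by reversing whole sample frames; reverse_frames and its frame_size parameter (which plays the role of nBlockAlign) are hypothetical and do not appear in RECORD1.

```c
#include <stddef.h>

/* Reverse the order of fixed-size frames in place.
   Bytes within each frame keep their order. Any trailing bytes
   that do not make up a whole frame are left where they are. */
void reverse_frames (unsigned char * buf, size_t length, size_t frame_size)
{
     size_t frames, i, j ;

     if (frame_size == 0)
          return ;

     frames = length / frame_size ;

     for (i = 0 ; i < frames / 2 ; i++)
     {
          unsigned char * a = buf + i * frame_size ;
          unsigned char * b = buf + (frames - 1 - i) * frame_size ;

               // Swap frame i with its mirror image from the end

          for (j = 0 ; j < frame_size ; j++)
          {
               unsigned char tmp = a [j] ;
               a [j] = b [j] ;
               b [j] = tmp ;
          }
     }
}
```

With a frame size of 1 this does the same job as ReverseMemory. With a frame size of 2, the six bytes 1 2 3 4 5 6 become 5 6 3 4 1 2.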

The Repeat button plays the sound over and over again. This is not complicated because the API includes a provision for repeating a sound. It involves setting the dwLoops field in the WAVEHDR structure to the number of repetitions and setting the dwFlags field to WHDR_BEGINLOOP for the beginning buffer in the loop and to WHDR_ENDLOOP for the end buffer. Because RECORD1 uses only one buffer for playing the sound, these two flags are combined in the dwFlags field.

Playing the sound twice as fast is also quite easy. When initializing the fields of the WAVEFORMATEX structure in preparation for opening waveform audio for output, the nSamplesPerSec and nAvgBytesPerSec fields are set to 22050 rather than 11025.
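For PCM data, the nBlockAlign and nAvgBytesPerSec fields follow directly from the channel count, sampling rate, and sample size, which is why both rate fields change together. The sketch below computes them in portable C; the PcmFormat structure and make_pcm_format are hypothetical and only mirror the relevant WAVEFORMATEX fields.

```c
/* Hypothetical mirror of the PCM-related WAVEFORMATEX fields */
typedef struct
{
     unsigned      channels ;            /* nChannels       */
     unsigned long samples_per_sec ;     /* nSamplesPerSec  */
     unsigned      bits_per_sample ;     /* wBitsPerSample  */
     unsigned      block_align ;         /* nBlockAlign     */
     unsigned long avg_bytes_per_sec ;   /* nAvgBytesPerSec */
}
PcmFormat ;

/* Fill in the derived fields for an uncompressed PCM format */
PcmFormat make_pcm_format (unsigned channels, unsigned long samples_per_sec,
                           unsigned bits_per_sample)
{
     PcmFormat f ;

     f.channels          = channels ;
     f.samples_per_sec   = samples_per_sec ;
     f.bits_per_sample   = bits_per_sample ;

          // Bytes per sample frame, then bytes per second

     f.block_align       = channels * bits_per_sample / 8 ;
     f.avg_bytes_per_sec = samples_per_sec * f.block_align ;

     return f ;
}
```

For 8-bit mono, make_pcm_format (1, 11025, 8) gives a block alignment of 1 and 11025 bytes per second, and make_pcm_format (1, 22050, 8) gives 22050 bytes per second. The same save buffer therefore drains in half the time at the faster setting.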

The MCI Alternative

You may find, as I do, that RECORD1 seems inordinately complex. It is particularly tricky to deal with the interaction between the waveform audio function calls and the messages they generate, and then in the midst of all this, to deal with possible memory shortages as well. But maybe that's why it's called the "low-level" interface. As I noted earlier in this chapter, Windows also includes the high-level Media Control Interface.

For waveform audio, the primary difference between the low-level interface and MCI is that MCI records sound data to a waveform file and plays back the sound by reading the file. This makes it difficult to perform the "special effects" that RECORD1 implements because you'd have to read in the file, manipulate it, and write it back out before playing the sound. This is a typical versatility vs. ease-of-use trade-off. The low-level interface gives you flexibility, but MCI (for the most part) is easier.

MCI is implemented in two different but related forms. The first form uses messages and data structures to send commands to multimedia devices and receive information from them. The second form uses ASCII text strings. The text-based interface was originally created to allow multimedia devices to be controlled from simple scripting languages. But it also provides very easy interactive control, as was demonstrated in the TESTMCI program shown earlier in this chapter.

The RECORD2 program shown in Figure 22-4 uses the message and data structure form of MCI to implement another digital audio recorder and player. Although it uses the same dialog box template as RECORD1, it does not implement the three special effects buttons.

Figure 22-4. The RECORD2 program.

RECORD2.C

/*----------------------------------------
   RECORD2.C -- Waveform Audio Recorder
                (c) Charles Petzold, 1998
------------------------------------------*/

#include <windows.h>
#include "..\\record1\\resource.h"

BOOL CALLBACK DlgProc (HWND, UINT, WPARAM, LPARAM) ;

TCHAR szAppName [] = TEXT ("Record2") ;

int WINAPI WinMain (HINSTANCE hInstance, HINSTANCE hPrevInstance,
                    PSTR szCmdLine, int iCmdShow)
{
     if (-1 == DialogBox (hInstance, TEXT ("Record"), NULL, DlgProc))
     {
          MessageBox (NULL, TEXT ("This program requires Windows NT!"),
                      szAppName, MB_ICONERROR) ;
     }
     return 0 ;
}

void ShowError (HWND hwnd, DWORD dwError)
{
     TCHAR szErrorStr [1024] ;
     
     mciGetErrorString (dwError, szErrorStr, 
                        sizeof (szErrorStr) / sizeof (TCHAR)) ;
     MessageBeep (MB_ICONEXCLAMATION) ;
     MessageBox (hwnd, szErrorStr, szAppName, MB_OK | MB_ICONEXCLAMATION) ;
}

BOOL CALLBACK DlgProc (HWND hwnd, UINT message, WPARAM wParam, LPARAM lParam)
{
     static BOOL       bRecording, bPlaying, bPaused ;
     static TCHAR      szFileName[] = TEXT ("record2.wav") ;
     static WORD       wDeviceID ;
     DWORD             dwError ;
     MCI_GENERIC_PARMS mciGeneric ;
     MCI_OPEN_PARMS    mciOpen ;
     MCI_PLAY_PARMS    mciPlay ;
     MCI_RECORD_PARMS  mciRecord ;
     MCI_SAVE_PARMS    mciSave ;
     
     switch (message)
     {
     case WM_COMMAND:
          switch (wParam)
          {
          case IDC_RECORD_BEG:
                    // Delete existing waveform file
               
               DeleteFile (szFileName) ;
               
                    // Open waveform audio
               
               mciOpen.dwCallback       = 0 ;
               mciOpen.wDeviceID        = 0 ;
               mciOpen.lpstrDeviceType  = TEXT ("waveaudio") ;
               mciOpen.lpstrElementName = TEXT ("") ; 
               mciOpen.lpstrAlias       = NULL ;
               dwError = mciSendCommand (0, MCI_OPEN, 
                                   MCI_WAIT | MCI_OPEN_TYPE | MCI_OPEN_ELEMENT,
                                   (DWORD) (LPMCI_OPEN_PARMS) &mciOpen) ;
               if (dwError != 0)
               {
                    ShowError (hwnd, dwError) ;
                    return TRUE ;
               }
                    // Save the Device ID
               
               wDeviceID = mciOpen.wDeviceID ;
               
                    // Begin recording
               
               mciRecord.dwCallback = (DWORD) hwnd ;
               mciRecord.dwFrom     = 0 ;
               mciRecord.dwTo       = 0 ;
               
               mciSendCommand (wDeviceID, MCI_RECORD, MCI_NOTIFY,
                               (DWORD) (LPMCI_RECORD_PARMS) &mciRecord) ;
               
                    // Enable and disable buttons
               
               EnableWindow (GetDlgItem (hwnd, IDC_RECORD_BEG), FALSE);
               EnableWindow (GetDlgItem (hwnd, IDC_RECORD_END), TRUE) ;
               EnableWindow (GetDlgItem (hwnd, IDC_PLAY_BEG),   FALSE);
               EnableWindow (GetDlgItem (hwnd, IDC_PLAY_PAUSE), FALSE);
               EnableWindow (GetDlgItem (hwnd, IDC_PLAY_END),   FALSE);
               SetFocus (GetDlgItem (hwnd, IDC_RECORD_END)) ;
               
               bRecording = TRUE ;
               return TRUE ;
               
          case IDC_RECORD_END:
                    // Stop recording
               
               mciGeneric.dwCallback = 0 ;
               
               mciSendCommand (wDeviceID, MCI_STOP, MCI_WAIT,
                               (DWORD) (LPMCI_GENERIC_PARMS) &mciGeneric) ;
               
                    // Save the file

               mciSave.dwCallback = 0 ;
               mciSave.lpfilename = szFileName ;
               

               mciSendCommand (wDeviceID, MCI_SAVE, MCI_WAIT | MCI_SAVE_FILE,
                               (DWORD) (LPMCI_SAVE_PARMS) &mciSave) ;
               
                    // Close the waveform device
               
               mciSendCommand (wDeviceID, MCI_CLOSE, MCI_WAIT,
                               (DWORD) (LPMCI_GENERIC_PARMS) &mciGeneric) ;
               
                    // Enable and disable buttons
               
               EnableWindow (GetDlgItem (hwnd, IDC_RECORD_BEG), TRUE) ;
               EnableWindow (GetDlgItem (hwnd, IDC_RECORD_END), FALSE);
               EnableWindow (GetDlgItem (hwnd, IDC_PLAY_BEG),   TRUE) ;
               EnableWindow (GetDlgItem (hwnd, IDC_PLAY_PAUSE), FALSE);
               EnableWindow (GetDlgItem (hwnd, IDC_PLAY_END),   FALSE);
               SetFocus (GetDlgItem (hwnd, IDC_PLAY_BEG)) ;
               
               bRecording = FALSE ;
               return TRUE ;
               
          case IDC_PLAY_BEG:
                    // Open waveform audio
               
               mciOpen.dwCallback       = 0 ;
               mciOpen.wDeviceID        = 0 ;
               mciOpen.lpstrDeviceType  = NULL ;
               mciOpen.lpstrElementName = szFileName ;
               mciOpen.lpstrAlias       = NULL ;
               
               dwError = mciSendCommand (0, MCI_OPEN,
                                         MCI_WAIT | MCI_OPEN_ELEMENT,
                                         (DWORD) (LPMCI_OPEN_PARMS) &mciOpen) ;
               
               if (dwError != 0)
               {
                    ShowError (hwnd, dwError) ;
                    return TRUE ;
               }
                    // Save the Device ID
               
               wDeviceID = mciOpen.wDeviceID ;
               
                    // Begin playing
               
               mciPlay.dwCallback = (DWORD) hwnd ;
               mciPlay.dwFrom     = 0 ;
               mciPlay.dwTo       = 0 ;
               
               mciSendCommand (wDeviceID, MCI_PLAY, MCI_NOTIFY,
                               (DWORD) (LPMCI_PLAY_PARMS) &mciPlay) ;
               
                    // Enable and disable buttons
               
               EnableWindow (GetDlgItem (hwnd, IDC_RECORD_BEG), FALSE);
               EnableWindow (GetDlgItem (hwnd, IDC_RECORD_END), FALSE);
               EnableWindow (GetDlgItem (hwnd, IDC_PLAY_BEG),   FALSE);
               EnableWindow (GetDlgItem (hwnd, IDC_PLAY_PAUSE), TRUE) ;
               EnableWindow (GetDlgItem (hwnd, IDC_PLAY_END),   TRUE) ;
               SetFocus (GetDlgItem (hwnd, IDC_PLAY_END)) ;
               
               bPlaying = TRUE ;
               return TRUE ;
               
          case IDC_PLAY_PAUSE:
               if (!bPaused)
                         // Pause the play
               {
                    mciGeneric.dwCallback = 0 ;
                    
                    mciSendCommand (wDeviceID, MCI_PAUSE, MCI_WAIT,
                                    (DWORD) (LPMCI_GENERIC_PARMS) & mciGeneric);
                    
                    SetDlgItemText (hwnd, IDC_PLAY_PAUSE, TEXT ("Resume")) ;
                    bPaused = TRUE ;
               }
               else
                         // Begin playing again
               {
                    mciPlay.dwCallback = (DWORD) hwnd ;
                    mciPlay.dwFrom     = 0 ;
                    mciPlay.dwTo       = 0 ;
                    
                    mciSendCommand (wDeviceID, MCI_PLAY, MCI_NOTIFY,
                                    (DWORD) (LPMCI_PLAY_PARMS) &mciPlay) ;
                    
                    SetDlgItemText (hwnd, IDC_PLAY_PAUSE, TEXT ("Pause")) ;
                    bPaused = FALSE ;
               }
               
               return TRUE ;
               
          case IDC_PLAY_END:
                    // Stop and close
               
               mciGeneric.dwCallback = 0 ;
               
               mciSendCommand (wDeviceID, MCI_STOP, MCI_WAIT,
                               (DWORD) (LPMCI_GENERIC_PARMS) &mciGeneric) ;
               
               mciSendCommand (wDeviceID, MCI_CLOSE, MCI_WAIT,
                               (DWORD) (LPMCI_GENERIC_PARMS) &mciGeneric) ;
               
                    // Enable and disable buttons
               
               EnableWindow (GetDlgItem (hwnd, IDC_RECORD_BEG), TRUE) ;
               EnableWindow (GetDlgItem (hwnd, IDC_RECORD_END), FALSE);
               EnableWindow (GetDlgItem (hwnd, IDC_PLAY_BEG),   TRUE) ;
               EnableWindow (GetDlgItem (hwnd, IDC_PLAY_PAUSE), FALSE);
               EnableWindow (GetDlgItem (hwnd, IDC_PLAY_END),   FALSE);
               SetFocus (GetDlgItem (hwnd, IDC_PLAY_BEG)) ;
               
               bPlaying = FALSE ;
               bPaused  = FALSE ;
               return TRUE ;
          }
          break ;
               
     case MM_MCINOTIFY:
          switch (wParam)
          {
          case MCI_NOTIFY_SUCCESSFUL:
               if (bPlaying)
                    SendMessage (hwnd, WM_COMMAND, IDC_PLAY_END, 0) ;
               
               if (bRecording)
                    SendMessage (hwnd, WM_COMMAND, IDC_RECORD_END, 0);
               
               return TRUE ;
          }
          break ;
     
     case WM_SYSCOMMAND:
          switch (wParam)
          {
          case SC_CLOSE:
               if (bRecording)
                    SendMessage (hwnd, WM_COMMAND, IDC_RECORD_END, 0L) ;
               if (bPlaying)
                    SendMessage (hwnd, WM_COMMAND, IDC_PLAY_END, 0L) ;
               
               EndDialog (hwnd, 0) ;
               return TRUE ;
          }
          break ;
     }
     return FALSE ;
}

RECORD2 uses only two MCI function calls, the most important being this one:

error = mciSendCommand (wDeviceID, message, dwFlags, dwParam)

The first argument is a numeric identification number for the device. You use this ID number much like a handle. You obtain the ID when you open the device, and then you use it in subsequent mciSendCommand calls. The second argument is a constant beginning with the prefix MCI. These are called MCI command messages, and RECORD2 demonstrates seven of them: MCI_OPEN, MCI_RECORD, MCI_STOP, MCI_SAVE, MCI_PLAY, MCI_PAUSE, and MCI_CLOSE.

The dwFlags argument is generally composed of zero or more bit flag constants combined with the C bit-wise OR operator. These generally indicate various options. Some options are specific to particular command messages, and some are common to all messages. The dwParam argument is generally a long pointer to a data structure that indicates options and obtains information from the device. Many of the MCI messages are associated with data structures unique to the message.

The mciSendCommand function returns zero if the function is successful and an error code otherwise. To report this error to the user, you can obtain a text string that describes the error:

mciGetErrorString (error, szBuffer, dwLength)

This is the same function used in the TESTMCI program.

When the user presses the Record button, RECORD2's window procedure receives a WM_COMMAND message with wParam equal to IDC_RECORD_BEG. RECORD2 begins by opening the device. This involves setting the fields of an MCI_OPEN_PARMS structure and calling mciSendCommand with the MCI_OPEN command message. For recording, the lpstrDeviceType field is set to the string "waveaudio" to indicate the device type. The lpstrElementName field is set to a zero-length string. The MCI driver uses a default sampling rate and sample size, but you can change that using the MCI_SET command. During recording, the sound data is stored on the hard disk in a temporary file and is ultimately transferred to a standard waveform file. I'll discuss the format of waveform files later in this chapter. For playing back the sound, MCI uses the sampling rate and sample size defined in the waveform file.

If RECORD2 cannot open a device, it uses mciGetErrorString and MessageBox to tell the user what the problem is. Otherwise, on return from the mciSendCommand call, the wDeviceID field of the MCI_OPEN_PARMS structure contains the device ID used in subsequent calls.

To begin recording, RECORD2 calls mciSendCommand with the MCI_RECORD command message and the MCI_RECORD_PARMS data structure. Optionally, you can set the dwFrom and dwTo fields of this structure (and use bit flags that indicate these fields are set) to insert a sound into an existing waveform file, the name of which would be specified in the lpstrElementName field of the MCI_OPEN_PARMS structure. By default, any new sound is inserted at the beginning of an existing file.

RECORD2 sets the dwCallback field of the MCI_RECORD_PARMS structure to the program's window handle and includes the MCI_NOTIFY flag in the mciSendCommand call. This causes a notification message to be sent to the window procedure when recording has been completed. I'll discuss this notification message shortly.

When done recording, you press the first End button to stop. This generates a WM_COMMAND message with wParam equal to IDC_RECORD_END. The window procedure responds by calling mciSendCommand three times: The MCI_STOP command message stops recording, the MCI_SAVE command message transfers the sound data from the temporary file to the file specified in an MCI_SAVE_PARMS structure ("record2.wav"), and the MCI_CLOSE command message deletes any temporary files or memory blocks that might have been created and closes the device.

For playback, the lpstrElementName field of the MCI_OPEN_PARMS structure is set to the filename "record2.wav". The MCI_OPEN_ELEMENT flag included in the third argument to mciSendCommand indicates that the lpstrElementName field is a valid filename. MCI knows from the filename extension .WAV that you wish to open a waveform audio device. If more than one waveform device is present, it opens the first one. (It's also possible to use something other than the first waveform device by setting the lpstrDeviceType field of the MCI_OPEN_PARMS structure.)

Playing involves an mciSendCommand call with the MCI_PLAY command message and an MCI_PLAY_PARMS structure. Any part of the file can be played, but RECORD2 chooses to play it all.

RECORD2 also includes a Pause button for pausing the playback of a sound file. This button generates a WM_COMMAND message with wParam equal to IDC_PLAY_PAUSE. The program responds by calling mciSendCommand with the MCI_PAUSE command message and an MCI_GENERIC_PARMS structure. The MCI_GENERIC_PARMS structure is used for any message that requires no information except an optional window handle for notification. If the playback is already paused, the button resumes play by calling mciSendCommand again with the MCI_PLAY command message.

Playback can also be terminated by pressing the second End button. This generates a WM_COMMAND message with wParam equal to IDC_PLAY_END. The window procedure responds by calling mciSendCommand twice, first with the MCI_STOP command message and then with the MCI_CLOSE command message.

Now here's a problem: Although you can manually terminate playback by pressing the End button, you may want to play the entire sound file. How does the program know when the file has completed? That is the job of the MCI notification message.

When calling mciSendCommand with the MCI_RECORD and MCI_PLAY messages, RECORD2 includes the MCI_NOTIFY flag and sets the dwCallback field of the data structure to the program's window handle. This causes a notification message, called MM_MCINOTIFY, to be posted to the window procedure under certain circumstances. The wParam message parameter is a status code, and lParam is the device ID.

You'll receive an MM_MCINOTIFY message with wParam equal to MCI_NOTIFY_ABORTED when mciSendCommand is called with the MCI_STOP or MCI_PAUSE command messages. This happens when you press the Pause button or either of the two End buttons. RECORD2 can ignore this case because it already properly handles these buttons. During playback, you'll receive an MM_MCINOTIFY message with wParam equal to MCI_NOTIFY_SUCCESSFUL when the sound file has completed. To handle this case, the window procedure sends itself a WM_COMMAND message with wParam equal to IDC_PLAY_END to simulate the user pressing the End button. The window procedure then responds normally by stopping the play and closing the device.

During recording, you'll receive an MM_MCINOTIFY message with wParam equal to MCI_NOTIFY_SUCCESSFUL when you run out of hard disk space for storing the temporary sound file. (I wouldn't exactly call this a "successful" completion, but that's what happens.) The window procedure responds by sending itself a WM_COMMAND message with wParam equal to IDC_RECORD_END. The window procedure stops recording, saves the file, and closes the device, as is normal.
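The decision RECORD2 makes when MM_MCINOTIFY arrives can be pulled out into a small function. This is only a sketch of the logic described above: the function name NotifyAction and the ACT_ codes are hypothetical, and the MCI_NOTIFY_ values are defined locally (matching mmsystem.h) just so the fragment compiles without the Windows headers.

```c
/* Status codes delivered in wParam of MM_MCINOTIFY. The values match
   mmsystem.h; they are defined here only so that this sketch compiles
   without the Windows headers. */
#ifndef MCI_NOTIFY_SUCCESSFUL
#define MCI_NOTIFY_SUCCESSFUL  0x0001
#define MCI_NOTIFY_SUPERSEDED  0x0002
#define MCI_NOTIFY_ABORTED     0x0004
#define MCI_NOTIFY_FAILURE     0x0008
#endif

/* Hypothetical action codes. RECORD2 itself sends WM_COMMAND messages
   with IDC_PLAY_END or IDC_RECORD_END instead of returning a code. */
enum { ACT_NONE = 0, ACT_END_PLAY = 1, ACT_END_RECORD = 2 } ;

static int NotifyAction (unsigned uStatus, int bPlaying, int bRecording)
{
     if (uStatus != MCI_NOTIFY_SUCCESSFUL)
          return ACT_NONE ;          /* aborted by Pause or End: already handled */

     if (bPlaying)
          return ACT_END_PLAY ;      /* the sound file played to the end */

     if (bRecording)
          return ACT_END_RECORD ;    /* recording ended by itself (disk full) */

     return ACT_NONE ;
}
```

A window procedure could call NotifyAction with the wParam of the notification message and its own bPlaying and bRecording flags, and then simulate the matching End button.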

The MCI Command String Approach

At one time, the Windows multimedia interface included a function called mciExecute, with the following syntax:

bSuccess = mciExecute (szCommand) ;

The only argument was the MCI command string. The function returned a Boolean value: nonzero if the function was successful and zero if not. The mciExecute function was functionally equivalent to calling mciSendString (the string-based MCI function used in TESTMCI) with NULL or zero for the last three arguments and then calling mciGetErrorString and MessageBox if an error occurred.

Although mciExecute is no longer part of the API, I've included such a function in the RECORD3 version of the digital tape recorder and player. This is shown in Figure 22-5. Like RECORD2, the program uses the RECORD.RC resource script and RESOURCE.H from RECORD1.

Figure 22-5. The RECORD3 program.

RECORD3.C

/*----------------------------------------
   RECORD3.C -- Waveform Audio Recorder
                (c) Charles Petzold, 1998
  ----------------------------------------*/

#include <windows.h>
#include "..\\record1\\resource.h"

BOOL CALLBACK DlgProc (HWND, UINT, WPARAM, LPARAM) ;

TCHAR szAppName [] = TEXT ("Record3") ;

int WINAPI WinMain (HINSTANCE hInstance, HINSTANCE hPrevInstance,
                    PSTR szCmdLine, int iCmdShow)
{
     if (-1 == DialogBox (hInstance, TEXT ("Record"), NULL, DlgProc))
     {
          MessageBox (NULL, TEXT ("This program requires Windows NT!"),
                      szAppName, MB_ICONERROR) ;
     }
     return 0 ;
}

BOOL mciExecute (LPCTSTR szCommand)
{
     MCIERROR error ;
     TCHAR    szErrorStr [1024] ;

     if (error = mciSendString (szCommand, NULL, 0, NULL))
     {
          mciGetErrorString (error, szErrorStr, 
                             sizeof (szErrorStr) / sizeof (TCHAR)) ;
          MessageBeep (MB_ICONEXCLAMATION) ;
          MessageBox (NULL, szErrorStr, TEXT ("MCI Error"), 
                      MB_OK | MB_ICONEXCLAMATION) ;
     }
     return error == 0 ;
}

BOOL CALLBACK DlgProc (HWND hwnd, UINT message, WPARAM wParam, LPARAM lParam)
{
     static BOOL bRecording, bPlaying, bPaused ;
     
     switch (message)
     {
     case WM_COMMAND:
          switch (wParam)
          {
          case IDC_RECORD_BEG:
                    // Delete existing waveform file
               
               DeleteFile (TEXT ("record3.wav")) ;
               
                    // Open waveform audio and record
               
               if (!mciExecute (TEXT ("open new type waveaudio alias mysound")))
                    return TRUE ;
               
               mciExecute (TEXT ("record mysound")) ;
               
                    // Enable and disable buttons
               
               EnableWindow (GetDlgItem (hwnd, IDC_RECORD_BEG), FALSE);
               EnableWindow (GetDlgItem (hwnd, IDC_RECORD_END), TRUE) ;
               EnableWindow (GetDlgItem (hwnd, IDC_PLAY_BEG),   FALSE);
               EnableWindow (GetDlgItem (hwnd, IDC_PLAY_PAUSE), FALSE);
               EnableWindow (GetDlgItem (hwnd, IDC_PLAY_END),   FALSE);
               SetFocus (GetDlgItem (hwnd, IDC_RECORD_END)) ;
               
               bRecording = TRUE ;
               return TRUE ;
               
          case IDC_RECORD_END:
                    // Stop, save, and close recording
               
               mciExecute (TEXT ("stop mysound")) ;
               mciExecute (TEXT ("save mysound record3.wav")) ;
               mciExecute (TEXT ("close mysound")) ;
               
                    // Enable and disable buttons
               
               EnableWindow (GetDlgItem (hwnd, IDC_RECORD_BEG), TRUE) ;
               EnableWindow (GetDlgItem (hwnd, IDC_RECORD_END), FALSE);
               EnableWindow (GetDlgItem (hwnd, IDC_PLAY_BEG),   TRUE) ;
               EnableWindow (GetDlgItem (hwnd, IDC_PLAY_PAUSE), FALSE);
               EnableWindow (GetDlgItem (hwnd, IDC_PLAY_END),   FALSE);
               SetFocus (GetDlgItem (hwnd, IDC_PLAY_BEG)) ;
               
               bRecording = FALSE ;
               return TRUE ;
               
          case IDC_PLAY_BEG:
                    // Open waveform audio and play
               
               if (!mciExecute (TEXT ("open record3.wav alias mysound")))
                    return TRUE ;
               
               mciExecute (TEXT ("play mysound")) ;
               
                    // Enable and disable buttons
               
               EnableWindow (GetDlgItem (hwnd, IDC_RECORD_BEG), FALSE);
               EnableWindow (GetDlgItem (hwnd, IDC_RECORD_END), FALSE);
               EnableWindow (GetDlgItem (hwnd, IDC_PLAY_BEG),   FALSE);
               EnableWindow (GetDlgItem (hwnd, IDC_PLAY_PAUSE), TRUE) ;
               EnableWindow (GetDlgItem (hwnd, IDC_PLAY_END),   TRUE) ;
               SetFocus (GetDlgItem (hwnd, IDC_PLAY_END)) ;
               
               bPlaying = TRUE ;
               return TRUE ;
               
          case IDC_PLAY_PAUSE:
               if (!bPaused)
                         // Pause the play
               {
                    mciExecute (TEXT ("pause mysound")) ;
                    SetDlgItemText (hwnd, IDC_PLAY_PAUSE, TEXT ("Resume")) ;
                    bPaused = TRUE ;
               }
               else
                         // Begin playing again
               {
                    mciExecute (TEXT ("play mysound")) ;
                    SetDlgItemText (hwnd, IDC_PLAY_PAUSE, TEXT ("Pause")) ;
                    bPaused = FALSE ;
               }
               
               return TRUE ;
               
          case IDC_PLAY_END:
                    // Stop and close
               
               mciExecute (TEXT ("stop mysound")) ;
               mciExecute (TEXT ("close mysound")) ;
               
                    // Enable and disable buttons
               EnableWindow (GetDlgItem (hwnd, IDC_RECORD_BEG), TRUE) ;
               EnableWindow (GetDlgItem (hwnd, IDC_RECORD_END), FALSE);
               EnableWindow (GetDlgItem (hwnd, IDC_PLAY_BEG),   TRUE) ;
               EnableWindow (GetDlgItem (hwnd, IDC_PLAY_PAUSE), FALSE);
               EnableWindow (GetDlgItem (hwnd, IDC_PLAY_END),   FALSE);
               SetFocus (GetDlgItem (hwnd, IDC_PLAY_BEG)) ;
               
               bPlaying = FALSE ;
               bPaused  = FALSE ;
               return TRUE ;
          }
          break ;
     
     case WM_SYSCOMMAND:
          switch (wParam)
          {
          case SC_CLOSE:
               if (bRecording)
                    SendMessage (hwnd, WM_COMMAND, IDC_RECORD_END, 0L);
               
               if (bPlaying)
                    SendMessage (hwnd, WM_COMMAND, IDC_PLAY_END, 0L) ;
               
               EndDialog (hwnd, 0) ;
               return TRUE ;
          }
          break ;
     }
     return FALSE ;
}

When you begin exploring the message-based and the text-based interfaces to MCI, you'll find that they correspond closely. It's easy to guess that MCI translates the command strings into the corresponding command messages and data structures. RECORD3 could use the MM_MCINOTIFY messages like RECORD2, but it chooses not to. This follows from the mciExecute function, which passes NULL as the last argument (the callback window handle) to mciSendString. The drawback is that the program doesn't know when it has finished playing the waveform file, so the buttons do not automatically change state. You must manually press the End button so that the program will know that it's ready to record or play again.

Notice the use of the alias keyword in the MCI open command. This allows all the subsequent MCI commands to refer to the device using the alias name.

The Waveform Audio File Format

If you take a look at uncompressed (that is, PCM) .WAV files under a hexadecimal dump program, you'll find they have a format as shown in Figure 22-6.

Offset  Bytes  Data
0000    4      "RIFF"
0004    4      size of waveform chunk (file size minus 8)
0008    4      "WAVE"
000C    4      "fmt "
0010    4      size of format chunk (16 bytes)
0014    2      wf.wFormatTag = WAVE_FORMAT_PCM = 1
0016    2      wf.nChannels
0018    4      wf.nSamplesPerSec
001C    4      wf.nAvgBytesPerSec
0020    2      wf.nBlockAlign
0022    2      wf.wBitsPerSample
0024    4      "data"
0028    4      size of waveform data
002C           waveform data

Figure 22-6. The .WAV file format.

This format is an example of a more extensive format known as RIFF (Resource Interchange File Format). RIFF was intended to be the all-encompassing format for multimedia data files. It is a tagged file format, where the file consists of "chunks" of data that are identified by a preceding 4-character ASCII name and a 4-byte (32-bit) chunk size. The value of the chunk size does not include the 8 bytes required for the chunk name and size.

A waveform audio file begins with the text string "RIFF", which identifies it as a RIFF file. This is followed by a 32-bit chunk size, which is the size of the remainder of the file, or the file size less 8 bytes.
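Reading one of these 8-byte chunk headers from a block of memory might look like the following sketch. The CHUNKHEADER type and the two function names are hypothetical; the only outside fact relied on is that RIFF stores the 32-bit size with the low byte first.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper type: one RIFF chunk header. */
typedef struct
{
     char     szId [5] ;     /* 4-character chunk name, zero-terminated here */
     uint32_t dwSize ;       /* size of the chunk data, not counting these 8 bytes */
}
CHUNKHEADER ;

/* Assemble a 32-bit value from 4 bytes stored low byte first. */
static uint32_t ReadLE32 (const unsigned char * p)
{
     return  (uint32_t) p[0]        | ((uint32_t) p[1] <<  8) |
            ((uint32_t) p[2] << 16) | ((uint32_t) p[3] << 24) ;
}

/* Read the 8-byte header at pData. Returns 1 on success,
   or 0 if fewer than 8 bytes are available. */
static int ReadChunkHeader (const unsigned char * pData, size_t cbAvail,
                            CHUNKHEADER * pch)
{
     if (cbAvail < 8)
          return 0 ;

     memcpy (pch->szId, pData, 4) ;
     pch->szId[4] = '\0' ;
     pch->dwSize  = ReadLE32 (pData + 4) ;
     return 1 ;
}
```

Applied to the first 8 bytes of a waveform file, this should yield the name "RIFF" and a size equal to the file size less 8.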

The chunk data begins with the text string "WAVE", which identifies it as a waveform audio chunk. This is followed by the text string "fmt " (notice the trailing blank that makes this a 4-character string), which identifies a sub-chunk containing the format of the waveform audio data. The "fmt " string is followed by the size of the format information, in this case 16 bytes. The format information is the first 16 bytes of the WAVEFORMATEX structure, or, as it was defined originally, a PCMWAVEFORMAT structure that includes a WAVEFORMAT structure.

The nChannels field is either 1 or 2, for monaural or stereo sound. The nSamplesPerSec field is the number of samples per second; the standard values are 11025, 22050, and 44100 samples per second. The standard sample sizes are 8 and 16 bits. The nAvgBytesPerSec field is the sample rate in samples per second times the number of channels times the size of each sample in bits, divided by 8 and rounded up. The nBlockAlign field is the number of channels times the sample size in bits, divided by 8 and rounded up. Finally, the format concludes with a wBitsPerSample field, which is the sample size in bits for a single channel (it is not multiplied by the number of channels).
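The arithmetic for the two derived fields is easy to get wrong, so here is a rough sketch of it. PCMFORMAT is a hypothetical stand-in for the first 16 bytes of WAVEFORMATEX, and MakePcmFormat is likewise my own name. The sketch takes wBitsPerSample to be the size of one sample for one channel (8 or 16).

```c
#include <stdint.h>

/* Hypothetical stand-in for the 16 bytes of PCM format information. */
typedef struct
{
     uint16_t wFormatTag ;
     uint16_t nChannels ;
     uint32_t nSamplesPerSec ;
     uint32_t nAvgBytesPerSec ;
     uint16_t nBlockAlign ;
     uint16_t wBitsPerSample ;
}
PCMFORMAT ;

static PCMFORMAT MakePcmFormat (uint16_t nChannels, uint32_t nSamplesPerSec,
                                uint16_t wBitsPerSample)
{
     PCMFORMAT pf ;

     pf.wFormatTag     = 1 ;               /* WAVE_FORMAT_PCM */
     pf.nChannels      = nChannels ;
     pf.nSamplesPerSec = nSamplesPerSec ;
     pf.wBitsPerSample = wBitsPerSample ;  /* bits in one sample of one channel */

          /* Channels times bits, divided by 8 and rounded up */

     pf.nBlockAlign = (uint16_t) ((nChannels * wBitsPerSample + 7) / 8) ;

          /* Rate times channels times bits, divided by 8 and rounded up */

     pf.nAvgBytesPerSec =
          (nSamplesPerSec * nChannels * wBitsPerSample + 7) / 8 ;

     return pf ;
}
```

For 16-bit stereo at 44100 samples per second, this gives an nBlockAlign of 4 and an nAvgBytesPerSec of 176400.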

The format information is followed by the text string "data", followed by a 32-bit data size, followed by the waveform data itself. The data are simply the consecutive samples in the same format as that used in the low-level waveform audio facilities. If the sample size is 8 bits or less, each sample consists of 1 byte for monaural or 2 bytes for stereo. If the sample size is between 9 and 16 bits, each sample is 2 bytes for monaural or 4 bytes for stereo. For stereo waveform data, each sample consists of the left value followed by the right value.

For sample sizes of 8 bits or less, the sample byte is interpreted as an unsigned value. For example, for an 8-bit sample size, silence is equivalent to a string of 0x80 bytes. For sample sizes of 9 bits or more, the sample is interpreted as a signed value, and silence is equivalent to a string of 0 values.
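To make the two conventions concrete, here is a minimal sketch that converts stored samples to floating-point values between -1 and 1. The function names are hypothetical, and the 16-bit routine assumes the sample is stored low byte first, as it is in .WAV files.

```c
#include <stdint.h>

/* 8-bit samples are unsigned; 0x80 represents silence. */
static double Sample8ToDouble (uint8_t bySample)
{
     return ((int) bySample - 128) / 128.0 ;
}

/* 16-bit samples are signed and stored low byte first; 0 is silence. */
static double Sample16ToDouble (const unsigned char * p)
{
     int iValue = p[0] | (p[1] << 8) ;

     if (iValue >= 0x8000)          /* high bit set: the value is negative */
          iValue -= 0x10000 ;

     return iValue / 32768.0 ;
}
```

With these definitions, the byte 0x80 in an 8-bit file and the byte pair 0x00, 0x00 in a 16-bit file both convert to 0.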

One of the important rules for reading tagged files is to ignore chunks you're not prepared to deal with. Although a waveform audio file requires "fmt " and "data" sub-chunks (in that order), it can also contain other sub-chunks. In particular, a waveform audio file might contain a "LIST" sub-chunk of type "INFO", with sub-chunks inside it that provide information about the waveform audio file.
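The rule about ignoring unfamiliar chunks can be sketched as a search through a file image already loaded into memory. FindWaveChunk is a hypothetical name. The sketch relies on one RIFF rule not mentioned above: chunk data is padded to an even number of bytes, and the pad byte isn't counted in the chunk size.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Search the sub-chunks of a RIFF "WAVE" file image for the chunk
   named szId. Returns a pointer to the chunk data and stores its size
   in *pdwSize, or returns NULL if the chunk is absent or the file
   is malformed. */
static const unsigned char * FindWaveChunk (const unsigned char * pFile,
                                            size_t cbFile, const char * szId,
                                            uint32_t * pdwSize)
{
     size_t   pos = 12 ;     /* skip "RIFF", the RIFF size, and "WAVE" */
     uint32_t dwSize ;

     if (cbFile < 12 || memcmp (pFile, "RIFF", 4) != 0 ||
                        memcmp (pFile + 8, "WAVE", 4) != 0)
          return NULL ;

     while (pos + 8 <= cbFile)
     {
          dwSize =  (uint32_t) pFile[pos + 4]        |
                   ((uint32_t) pFile[pos + 5] <<  8) |
                   ((uint32_t) pFile[pos + 6] << 16) |
                   ((uint32_t) pFile[pos + 7] << 24) ;

          if (dwSize > cbFile - pos - 8)
               return NULL ;      /* chunk claims more data than the file holds */

          if (memcmp (pFile + pos, szId, 4) == 0)
          {
               * pdwSize = dwSize ;
               return pFile + pos + 8 ;
          }

               // Not the chunk we want, so skip over it (and any pad byte)

          pos += 8 + (size_t) dwSize + (dwSize & 1) ;
     }
     return NULL ;
}
```

A program would typically call this once with "fmt " and once with "data", and never need to know what other sub-chunks the file contains.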

Experimenting with Additive Synthesis

For many years—going back to Pythagoras at least—people have attempted to analyze musical tones. At first it seems very simple, but then it gets complex. Bear with me if I repeat a little of what I've already said about sound.

Musical tones, except for some percussive sounds, have a particular pitch or frequency. This frequency can range across the spectrum of human perception, from 20 Hz to 20,000 Hz. The notes of a piano, for example, have a frequency range from 27.5 Hz to 4186 Hz. Another characteristic of musical tones is volume or loudness. This corresponds to the overall amplitude of the waveform producing the tone. A change in loudness is measured in decibels. So far, so good.

And then there is an unwieldy thing called "timbre." Very simply, timbre is that quality of sound that lets us distinguish between a piano and a violin and a trumpet all playing the same pitch at the same volume.

The French mathematician Fourier discovered that any periodic waveform, no matter how complex, can be represented by a sum of sine waves whose frequencies are integral multiples of a fundamental frequency. The fundamental, also called the first harmonic, is the frequency of periodicity of the waveform. The first overtone, also called the second harmonic, has a frequency twice the fundamental; the second overtone, or third harmonic, has a frequency three times the fundamental, and so forth. The relative amplitudes of the harmonics govern the shape of the waveform.

For example, a square wave can be represented as a sum of sine waves where the amplitudes of the even harmonics (that is, 2, 4, 6, etc.) are zero and the amplitudes of the odd harmonics (1, 3, 5, etc.) are in the proportions 1, 1/3, 1/5, and so forth. In a sawtooth wave, all harmonics are present and the amplitudes are in the proportions 1, 1/2, 1/3, 1/4, and so forth.

To the German scientist Hermann Helmholtz (1821-1894), this was the key to understanding timbre. In his classic book On the Sensations of Tone (1885, republished by Dover Press in 1954), Helmholtz posited that the ear and brain break down complex tones into their component sine waves and that the relative intensities of these sine waves are what we perceive as timbre. Unfortunately, it proved to be not quite that simple.

Electronic music synthesizers came to widespread public attention in 1968 with the release of Wendy Carlos's album Switched-On Bach. The synthesizers available at that time (such as the Moog) were analog synthesizers. Such synthesizers use analog circuitry to generate various audio waveforms such as square waves, triangle waves, and sawtooth waves. To make these waveforms sound more like real musical instruments, they are subjected to some changes over the course of a single note. The overall amplitude of the waveform is shaped by an "envelope." When a note begins, the amplitude begins at zero and rises, usually very quickly. This is known as the attack. The amplitude then remains constant as the note is held. This is known as the sustain. The amplitude then falls to zero when the note ends; this is known as the release.
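An envelope of this kind is conveniently described by a few breakpoints joined with straight lines. Here is a minimal sketch under that assumption; the ENVPOINT type and the EnvelopeAt function are hypothetical names, and the breakpoint times are assumed to be in increasing order.

```c
/* One breakpoint of an envelope: a time in milliseconds and an amplitude. */
typedef struct
{
     int    iTime ;
     double dValue ;
}
ENVPOINT ;

/* Return the envelope amplitude at iMsec by interpolating linearly
   between the two surrounding breakpoints. Before the first breakpoint
   or after the last one, the nearest breakpoint's value is used. */
static double EnvelopeAt (const ENVPOINT * penv, int iNumPoints, int iMsec)
{
     double dFrac ;
     int    i ;

     if (iNumPoints <= 0)
          return 0 ;

     if (iMsec <= penv[0].iTime)
          return penv[0].dValue ;

     for (i = 0 ; i < iNumPoints - 1 ; i++)
     {
          if (iMsec <= penv[i+1].iTime)
          {
               dFrac = (double) (iMsec - penv[i].iTime) /
                                (penv[i+1].iTime - penv[i].iTime) ;

               return penv[i].dValue +
                      dFrac * (penv[i+1].dValue - penv[i].dValue) ;
          }
     }
     return penv[iNumPoints - 1].dValue ;
}
```

An attack, sustain, and release shape needs just four breakpoints, for example (0, 0), (10, 1), (200, 1), and (300, 0): the amplitude rises in 10 milliseconds, holds until 200, and falls to zero by 300.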

The waveforms are also put through filters that attenuate some of the harmonics and turn the simple waveforms into something more complex and musically interesting. The cut-off frequencies of these filters can be controlled by an envelope so that the harmonic content of the sound changes over the course of the note.

Because these synthesizers begin with harmonically rich waveforms, and some of the harmonics are attenuated using filters, this form of synthesis is known as "subtractive synthesis."

Even while working with subtractive synthesis, many people involved in electronic music saw additive synthesis as the next big thing.

In additive synthesis you begin with a number of sine wave generators tuned in integral multiples so that each sine wave corresponds to a harmonic. The amplitude of each harmonic can be controlled independently by an envelope. Additive synthesis is not practical using analog circuitry because you'd need somewhere between 8 and 24 sine wave generators for a single note and the relative frequencies of these sine wave generators would have to track each other precisely. Analog waveform generators are notoriously unstable and prone to frequency drift.

However, for digital synthesizers (which can generate waveforms digitally using lookup tables) and computer-generated waveforms, frequency drift is not a problem and additive synthesis becomes feasible. So here's the general idea: You record a real musical tone and break it down into harmonics using Fourier analysis. You can then determine the relative strength of each harmonic and regenerate the sound digitally using multiple sine waves.

When people began experimenting with applying Fourier analysis on real musical tones and generating these tones from multiple sine waves, they discovered that timbre is not quite as simple as Helmholtz believed.

The big problem is that the harmonics of real musical tones are not in strict integral relationships. Indeed, the term "harmonic" is not even appropriate for real musical tones. The various sine wave components are inharmonic and more correctly called "partials."

It was discovered that the inharmonicity among the partials of real musical tones is vital in making the tone sound "real." Strict harmonicity yields an "electronic" sound. Each partial changes in both amplitude and frequency over the course of a single note. The relative frequency and amplitude relationships among the partials are different for different pitches and intensities from the same instrument. The most complex part of a real musical tone occurs during the attack portion of the note, when there is much inharmonicity. It was discovered that this complex attack portion of the note was vital in the human perception of timbre.

In short, the sound of real musical instruments is more complex than anyone imagined. The idea of analyzing musical tones and coming up with relatively few simple envelopes for controlling the amplitudes and frequencies of the partials was clearly not practical.

Some analyses of real musical sounds were published in early issues (1977 and 1978) of the Computer Music Journal (at the time published by People's Computer Company and now published by the MIT Press). The three-part series "Lexicon of Analyzed Tones" was written by James A. Moorer, John Grey, and John Strawn, and it showed the amplitude and frequency graphs of partials of a single note (less than half a second long) played on a violin, oboe, clarinet, and trumpet. The note used was the E flat above middle C. Twenty partials are used for the violin, 21 for the oboe and clarinet, and 12 for the trumpet. In particular, Volume II, Number 2 (September 1978) of the Computer Music Journal contains numerical line-segment approximations for the various frequency and amplitude envelopes for the oboe, clarinet, and trumpet.

So, with the waveform support in Windows, it is fairly simple to type these numbers into a program, generate multiple sine wave samples for each partial, add them up, and send the samples out to the waveform audio sound board, thereby reproducing the sounds originally recorded over 20 years ago. The ADDSYNTH ("additive synthesis") program is shown in Figure 22-7.

Figure 22-7. The ADDSYNTH Program.

ADDSYNTH.C

/*---------------------------------------------------
   ADDSYNTH.C -- Additive Synthesis Sound Generation
                 (c) Charles Petzold, 1998
  ---------------------------------------------------*/

#include <windows.h>
#include <math.h>
#include "addsynth.h"
#include "resource.h"

#define ID_TIMER             1
#define SAMPLE_RATE      22050
#define MAX_PARTIALS        21
#define PI             3.14159

BOOL CALLBACK DlgProc (HWND, UINT, WPARAM, LPARAM) ;

TCHAR szAppName [] = TEXT ("AddSynth") ;

// Sine wave generator
// -------------------

double SineGenerator (double dFreq, double * pdAngle)
{
     double dAmp ;
     
     dAmp = sin (* pdAngle) ;
     * pdAngle += 2 * PI * dFreq / SAMPLE_RATE ;
     
     if (* pdAngle >= 2 * PI)
          * pdAngle -= 2 * PI ;
     
     return dAmp ;
}

// Fill a buffer with composite waveform
// -------------------------------------

VOID FillBuffer (INS ins, PBYTE pBuffer, int iNumSamples)
{
     static double dAngle [MAX_PARTIALS] ;
     double        dAmp, dFrq, dComp, dFrac ;
     int           i, iPrt, iMsecTime, iCompMaxAmp, iMaxAmp, iSmp ;

          // Calculate the composite maximum amplitude
     
     iCompMaxAmp = 0 ;
     
     for (iPrt = 0 ; iPrt < ins.iNumPartials ; iPrt++)
     {
          iMaxAmp = 0 ;
          
          for (i = 0 ; i < ins.pprt[iPrt].iNumAmp ; i++)
               iMaxAmp = max (iMaxAmp, ins.pprt[iPrt].pEnvAmp[i].iValue) ;
          
          iCompMaxAmp += iMaxAmp ;
     }
     
          // Loop through each sample
     
     for (iSmp = 0 ; iSmp < iNumSamples ; iSmp++)
     {
          dComp = 0 ;
          iMsecTime = (int) (1000 * iSmp / SAMPLE_RATE) ;
          
               // Loop through each partial
          
          for (iPrt = 0 ; iPrt < ins.iNumPartials ; iPrt++)
          {
               dAmp = 0 ;
               dFrq = 0 ;
               
               for (i = 0 ; i < ins.pprt[iPrt].iNumAmp - 1 ; i++)
               {
                    if (iMsecTime >= ins.pprt[iPrt].pEnvAmp[i  ].iTime &&
                         iMsecTime <= ins.pprt[iPrt].pEnvAmp[i+1].iTime)
                    {
                         dFrac = (double) (iMsecTime -
                              ins.pprt[iPrt].pEnvAmp[i  ].iTime) /
                              (ins.pprt[iPrt].pEnvAmp[i+1].iTime -
                              ins.pprt[iPrt].pEnvAmp[i  ].iTime) ;
                         
                         dAmp = dFrac  * ins.pprt[iPrt].pEnvAmp[i+1].iValue +
                              (1-dFrac) * ins.pprt[iPrt].pEnvAmp[i  ].iValue ;
                         
                         break ;
                    }
               }
               
               for (i = 0 ; i < ins.pprt[iPrt].iNumFrq - 1 ; i++)
               {
                    if (iMsecTime >= ins.pprt[iPrt].pEnvFrq[i  ].iTime &&
                         iMsecTime <= ins.pprt[iPrt].pEnvFrq[i+1].iTime)
                    {
                         dFrac = (double) (iMsecTime -
                              ins.pprt[iPrt].pEnvFrq[i  ].iTime) /
                              (ins.pprt[iPrt].pEnvFrq[i+1].iTime -
                              ins.pprt[iPrt].pEnvFrq[i  ].iTime) ;
                         
                         dFrq = dFrac  * ins.pprt[iPrt].pEnvFrq[i+1].iValue +
                              (1-dFrac) * ins.pprt[iPrt].pEnvFrq[i  ].iValue ;
                         
                         break ;
                    }
               }
               dComp += dAmp * SineGenerator (dFrq, dAngle + iPrt) ;
          }
          pBuffer[iSmp] = (BYTE) (127 + 127 * dComp / iCompMaxAmp) ;
     }
}

// Make a waveform file
// --------------------

BOOL MakeWaveFile (INS ins, TCHAR * szFileName)
{
     DWORD        dwWritten ;
     HANDLE       hFile ;
     int          iChunkSize, iPcmSize, iNumSamples ;
     PBYTE        pBuffer ;
     WAVEFORMATEX waveform ;

     hFile = CreateFile (szFileName, GENERIC_WRITE, 0, NULL,
                         CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL) ;
 
     if (hFile == INVALID_HANDLE_VALUE)
          return FALSE ;
     
     iNumSamples = ((long) ins.iMsecTime * SAMPLE_RATE / 1000 + 1) / 2 * 2 ;
     iPcmSize    = sizeof (PCMWAVEFORMAT) ;
     iChunkSize  = 12 + iPcmSize + 8 + iNumSamples ;
     
     if (NULL == (pBuffer = malloc (iNumSamples)))
     {
          CloseHandle (hFile) ;
          return FALSE ;
     }
     
     FillBuffer (ins, pBuffer, iNumSamples) ;
     
     waveform.wFormatTag      = WAVE_FORMAT_PCM ;
     waveform.nChannels       = 1 ;
     waveform.nSamplesPerSec  = SAMPLE_RATE ;
     waveform.nAvgBytesPerSec = SAMPLE_RATE ;
     waveform.nBlockAlign     = 1 ;
     waveform.wBitsPerSample  = 8 ;
     waveform.cbSize          = 0 ;
     
     WriteFile (hFile, "RIFF",       4, &dwWritten, NULL) ;
     WriteFile (hFile, &iChunkSize,  4, &dwWritten, NULL) ;
     WriteFile (hFile, "WAVEfmt ",   8, &dwWritten, NULL) ;
     WriteFile (hFile, &iPcmSize,    4, &dwWritten, NULL) ;
     WriteFile (hFile, &waveform, sizeof (WAVEFORMATEX) - 2, &dwWritten, NULL) ;
     WriteFile (hFile, "data",       4, &dwWritten, NULL) ;
     WriteFile (hFile, &iNumSamples, 4, &dwWritten, NULL) ;
     WriteFile (hFile, pBuffer,      iNumSamples,  &dwWritten, NULL) ;
     
     CloseHandle (hFile) ;
     free (pBuffer) ;
     
     if ((int) dwWritten != iNumSamples)
     {
          DeleteFile (szFileName) ;
          return FALSE ;
     }
     return TRUE ;
}

void TestAndCreateFile (HWND hwnd, INS ins, TCHAR * szFileName, int idButton)
{
     TCHAR szMessage [64] ;
     
     if (-1 != GetFileAttributes (szFileName))
          EnableWindow (GetDlgItem (hwnd, idButton), TRUE) ;
     else
     {
          if (MakeWaveFile (ins, szFileName))
               EnableWindow (GetDlgItem (hwnd, idButton), TRUE) ;
          else
          {
               wsprintf (szMessage, TEXT ("Could not create %s."), szFileName) ;
               MessageBeep (MB_ICONEXCLAMATION) ;
               MessageBox (hwnd, szMessage, szAppName,
                                 MB_OK | MB_ICONEXCLAMATION) ;
          }
     }
}

int WINAPI WinMain (HINSTANCE hInstance, HINSTANCE hPrevInstance,
                    PSTR szCmdLine, int iCmdShow)
{
     if (-1 == DialogBox (hInstance, szAppName, NULL, DlgProc))
     {
          MessageBox (NULL, TEXT ("This program requires Windows NT!"),
                      szAppName, MB_ICONERROR) ;
     }
     return 0 ;
}

BOOL CALLBACK DlgProc (HWND hwnd, UINT message, WPARAM wParam, LPARAM lParam)
{
     static TCHAR * szTrum = TEXT ("Trumpet.wav") ;
     static TCHAR * szOboe = TEXT ("Oboe.wav") ;
     static TCHAR * szClar = TEXT ("Clarinet.wav") ;
     
     switch (message)
     {
     case WM_INITDIALOG:
          SetTimer (hwnd, ID_TIMER, 1, NULL) ;
          return TRUE ;
          
     case WM_TIMER:
          KillTimer (hwnd, ID_TIMER) ;
          SetCursor (LoadCursor (NULL, IDC_WAIT)) ;
          ShowCursor (TRUE) ;
          
          TestAndCreateFile (hwnd, insTrum, szTrum, IDC_TRUMPET) ;
          TestAndCreateFile (hwnd, insOboe, szOboe, IDC_OBOE) ;
          TestAndCreateFile (hwnd, insClar, szClar, IDC_CLARINET) ;
          
          SetDlgItemText (hwnd, IDC_TEXT, TEXT (" ")) ;
          SetFocus (GetDlgItem (hwnd, IDC_TRUMPET)) ;
          
          ShowCursor (FALSE) ;
          SetCursor (LoadCursor (NULL, IDC_ARROW)) ;
          return TRUE ;
     case WM_COMMAND:
          switch (LOWORD (wParam))
          {
          case IDC_TRUMPET:
               PlaySound (szTrum, NULL, SND_FILENAME | SND_SYNC) ;
               return TRUE ;
               
          case IDC_OBOE:
               PlaySound (szOboe, NULL, SND_FILENAME | SND_SYNC) ;
               return TRUE ;
               
          case IDC_CLARINET:
               PlaySound (szClar, NULL, SND_FILENAME | SND_SYNC) ;
               return TRUE ;
          }
          break ;
          
     case WM_SYSCOMMAND:
          switch (LOWORD (wParam))
          {
          case SC_CLOSE:
               EndDialog (hwnd, 0) ;
               return TRUE ;
          }
          break ;
     }
     return FALSE ;
}

ADDSYNTH.RC (excerpts)

//Microsoft Developer Studio generated resource script.

#include "resource.h"
#include "afxres.h"

/////////////////////////////////////////////////////////////////////////////
// Dialog

ADDSYNTH DIALOG DISCARDABLE  100, 100, 176, 49
STYLE WS_MINIMIZEBOX | WS_CAPTION | WS_SYSMENU
CAPTION "Additive Synthesis"
FONT 8, "MS Sans Serif"
BEGIN
    PUSHBUTTON      "Trumpet",IDC_TRUMPET,8,8,48,16
    PUSHBUTTON      "Oboe",IDC_OBOE,64,8,48,16
    PUSHBUTTON      "Clarinet",IDC_CLARINET,120,8,48,16
    LTEXT           "Preparing Data...",IDC_TEXT,8,32,100,8
END

RESOURCE.H (excerpts)

// Microsoft Developer Studio generated include file.
// Used by AddSynth.rc

#define IDC_TRUMPET                     1000
#define IDC_OBOE                        1001
#define IDC_CLARINET                    1002
#define IDC_TEXT                        1003

An additional file called ADDSYNTH.H is not shown here because it contains several hundred lines of boring stuff. You'll find it on the companion disc for this book. At the beginning of ADDSYNTH.H, I define three structures used for storing the envelope data. Each amplitude and frequency envelope is stored as an array of structures of type ENV. These are number pairs that consist of a time in milliseconds followed by an amplitude value (in an arbitrary scale) or a frequency (in cycles per second). These arrays are of variable length, ranging from 6 to 14 values. Straight lines are assumed to connect the amplitude and frequency values.

Each instrument consists of a collection of partials (12 for the trumpet and 21 each for the oboe and clarinet) stored as an array of structures of type PRT. The PRT structure stores the number of points in the amplitude and frequency envelopes and pointers to the two ENV arrays. The INS structure contains the total time of the tone in milliseconds, the number of partials, and a pointer to the PRT array that stores the partials.
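ADDSYNTH.H itself is not reproduced here, but the field names that ADDSYNTH.C dereferences (iTime, iValue, iNumAmp, pEnvAmp, iNumFrq, pEnvFrq, iMsecTime, iNumPartials, and pprt) suggest roughly the following layout. This is only a sketch inferred from that usage, not the actual header: the field order and the int types are assumptions, and the one-partial "instrument" and the CountEnvPoints helper are invented purely for illustration.

```c
/* One envelope point: a time in milliseconds paired with an amplitude
   (arbitrary scale) or a frequency (cycles per second). */
typedef struct
{
     int iTime ;
     int iValue ;
}
ENV ;

/* One partial: point counts plus pointers to its two envelopes. */
typedef struct
{
     int   iNumAmp ;
     ENV * pEnvAmp ;
     int   iNumFrq ;
     ENV * pEnvFrq ;
}
PRT ;

/* One instrument: tone length, number of partials, and the partials. */
typedef struct
{
     int   iMsecTime ;
     int   iNumPartials ;
     PRT * pprt ;
}
INS ;

/* Made-up data for a one-partial instrument (not taken from the journal). */
static ENV envDemoAmp [] = { { 0, 0 }, { 50, 100 }, { 300, 0 } } ;
static ENV envDemoFrq [] = { { 0, 310 }, { 300, 311 } } ;

static PRT prtDemo [] =
{
     { 3, envDemoAmp, 2, envDemoFrq }
} ;

static INS insDemo = { 300, 1, prtDemo } ;

/* Total number of envelope points stored for an instrument. */
int CountEnvPoints (const INS * pins)
{
     int i, iTotal = 0 ;

     for (i = 0 ; i < pins->iNumPartials ; i++)
          iTotal += pins->pprt[i].iNumAmp + pins->pprt[i].iNumFrq ;

     return iTotal ;
}
```

Calling CountEnvPoints (&insDemo) walks the same INS, PRT, and ENV chain that FillBuffer follows when it looks up ins.pprt[iPrt].pEnvAmp[i].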

ADDSYNTH has three push buttons labeled "Trumpet," "Oboe," and "Clarinet." PCs are not yet quite fast enough to do all the additive synthesis calculations in real time, so the first time you run ADDSYNTH, these buttons will be disabled until the program calculates the samples and creates the TRUMPET.WAV, OBOE.WAV, and CLARINET.WAV sound files. The push buttons are then enabled and you can play the three sounds by using the PlaySound function. The next time you run the program, it will check for the existence of the waveform files and won't need to recreate them.
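Each file that MakeWaveFile creates is a 44-byte RIFF header followed by the samples. As a rough, portable sketch of that layout, the header for 8-bit mono PCM could be assembled as shown below. The names BuildWavHeader, PutLE32, and PutLE16 are hypothetical, and the sketch stores each field byte by byte in little-endian order rather than writing a WAVEFORMATEX structure directly as ADDSYNTH does.

```c
#include <stdint.h>
#include <string.h>

#define WAV_HEADER_SIZE 44

/* Store a 32-bit value in little-endian byte order. */
static void PutLE32 (uint8_t * p, uint32_t v)
{
     p[0] = (uint8_t) (v) ;
     p[1] = (uint8_t) (v >> 8) ;
     p[2] = (uint8_t) (v >> 16) ;
     p[3] = (uint8_t) (v >> 24) ;
}

/* Store a 16-bit value in little-endian byte order. */
static void PutLE16 (uint8_t * p, uint16_t v)
{
     p[0] = (uint8_t) (v) ;
     p[1] = (uint8_t) (v >> 8) ;
}

/* Fill in the header for an 8-bit mono PCM file holding uNumSamples samples. */
void BuildWavHeader (uint8_t hdr [WAV_HEADER_SIZE],
                     uint32_t uSampleRate, uint32_t uNumSamples)
{
     memcpy (hdr +  0, "RIFF", 4) ;
     PutLE32 (hdr +  4, 36 + uNumSamples) ;  // bytes following this field
     memcpy (hdr +  8, "WAVE", 4) ;
     memcpy (hdr + 12, "fmt ", 4) ;
     PutLE32 (hdr + 16, 16) ;                // size of the PCM format block
     PutLE16 (hdr + 20, 1) ;                 // format tag 1 = PCM
     PutLE16 (hdr + 22, 1) ;                 // channels
     PutLE32 (hdr + 24, uSampleRate) ;       // samples per second
     PutLE32 (hdr + 28, uSampleRate) ;       // bytes per second (1 byte each)
     PutLE16 (hdr + 32, 1) ;                 // block alignment
     PutLE16 (hdr + 34, 8) ;                 // bits per sample
     memcpy (hdr + 36, "data", 4) ;
     PutLE32 (hdr + 40, uNumSamples) ;       // size of the sample data
}
```

The value 36 + uNumSamples corresponds to the expression 12 + iPcmSize + 8 + iNumSamples in MakeWaveFile, where iPcmSize is the 16-byte size of PCMWAVEFORMAT.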

Most of the work is done in ADDSYNTH's FillBuffer function. FillBuffer begins by calculating the total composite maximum amplitude. It does this by looping through the partials for the instrument to find the maximum amplitude for each partial and then adding the maximum amplitudes all together. This value is later used to scale the samples to an 8-bit sample size.
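Summing the per-partial peaks gives a safe scaling factor: a sine never exceeds 1 in magnitude and each partial never exceeds its own peak amplitude, so the composite can never exceed the sum of the peaks. The following is a minimal standalone sketch of that calculation. The PARTIALAMPS structure and the CompositeMaxAmp name are assumptions made for the example and hold only the amplitude values, not the full envelope.

```c
/* Amplitude values of one partial's envelope (hypothetical helper type). */
typedef struct
{
     int         iNumAmp ;   // number of envelope points
     const int * piAmp ;     // amplitude at each point
}
PARTIALAMPS ;

/* Sum of the per-partial peak amplitudes. */
int CompositeMaxAmp (const PARTIALAMPS * ppa, int iNumPartials)
{
     int i, iPrt, iMaxAmp, iCompMaxAmp = 0 ;

     for (iPrt = 0 ; iPrt < iNumPartials ; iPrt++)
     {
          iMaxAmp = 0 ;

          for (i = 0 ; i < ppa[iPrt].iNumAmp ; i++)
               if (ppa[iPrt].piAmp[i] > iMaxAmp)
                    iMaxAmp = ppa[iPrt].piAmp[i] ;

          iCompMaxAmp += iMaxAmp ;
     }
     return iCompMaxAmp ;
}
```

Because the partials rarely peak at the same instant, this bound is conservative: the generated samples will not clip, but they may not use the full 8-bit range either.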

FillBuffer then proceeds to calculate a value for each sample. Each sample corresponds to a millisecond time value that depends on the sample rate. (Actually, at a 22.05 kHz sample rate, every 22 samples correspond to the same millisecond time value.) FillBuffer then loops through the partials. For both the frequency and amplitude, it finds the envelope line segment corresponding to the millisecond time value and performs a linear interpolation.
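The two steps just described, mapping a sample index to a millisecond time and interpolating along an envelope, can be pulled out into small functions. This is a sketch rather than the program's own code: ENVPOINT, SampleToMsec, and EnvelopeValue are hypothetical names, and the envelope times are assumed to be strictly increasing.

```c
/* One envelope point (hypothetical stand-in for the ENV structure). */
typedef struct
{
     int iTime ;    // milliseconds from the start of the tone
     int iValue ;   // amplitude or frequency at that time
}
ENVPOINT ;

/* Millisecond time of a sample index, truncated by integer division. */
int SampleToMsec (int iSmp, int iSampleRate)
{
     return (int) ((long long) 1000 * iSmp / iSampleRate) ;
}

/* Linearly interpolate an envelope at iMsec.
   Returns 0 if iMsec falls outside every line segment. */
double EnvelopeValue (const ENVPOINT * penv, int iNum, int iMsec)
{
     double dFrac ;
     int    i ;

     for (i = 0 ; i < iNum - 1 ; i++)
     {
          if (iMsec >= penv[i].iTime && iMsec <= penv[i+1].iTime)
          {
               // Fraction of the way from point i to point i+1
               dFrac = (double) (iMsec - penv[i].iTime) /
                                (penv[i+1].iTime - penv[i].iTime) ;

               return dFrac * penv[i+1].iValue + (1 - dFrac) * penv[i].iValue ;
          }
     }
     return 0 ;
}
```

For an envelope with points (0, 0), (10, 100), and (30, 50), a time of 20 milliseconds lies halfway along the second segment, so EnvelopeValue returns 75.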

The frequency value is passed to the SineGenerator function, together with a phase angle value. As I discussed earlier in this chapter, digitally generating sine waves requires a phase angle value to be maintained and incremented based on the frequency value. On return from the SineGenerator function, the sine value is multiplied by the amplitude for the partial and accumulated. After all the partials for a sample are added together, the sample is scaled to the size of a byte.
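The final accumulate-and-scale step can be sketched in isolation. In the fragment below the sine values are passed in rather than generated, so the hypothetical helper MixToByte shows only the weighting, the summing, and the conversion to an unsigned 8-bit sample. Its name and parameter list are assumptions made for illustration.

```c
/* Weight each partial's sine value by its amplitude, add the results, and
   map the sum from [-iCompMaxAmp, +iCompMaxAmp] onto a byte centered on 127. */
unsigned char MixToByte (const double * pdAmp, const double * pdSine,
                         int iNumPartials, int iCompMaxAmp)
{
     double dComp = 0 ;
     int    iPrt ;

     for (iPrt = 0 ; iPrt < iNumPartials ; iPrt++)
          dComp += pdAmp[iPrt] * pdSine[iPrt] ;

     return (unsigned char) (127 + 127 * dComp / iCompMaxAmp) ;
}
```

With every sine value at 0 the result is 127, the midpoint that 8-bit waveform data treats as near silence. With every partial at its positive peak the result is 254, and at its negative peak it is 0.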

Waking Up to Waveform Audio

WAKEUP, which you'll find in Figure 22-8, is one of those programs where the source code files don't look quite complete. The program's window looks like a dialog box, but there's no resource script (we already know how to do that), and the program uses what seems to be a waveform file, but there's no such file on the disk. However, the program packs quite a wallop: The sound it plays is loud and quite annoying. WAKEUP is my alarm clock, and it definitely works in waking me up.

Figure 22-8. The WAKEUP program.

WAKEUP.C

/*---------------------------------------
   WAKEUP.C -- Alarm Clock Program
               (c) Charles Petzold, 1998
  ---------------------------------------*/

#include <windows.h>
#include <commctrl.h>
#include <stdlib.h>

     // ID values for 3 child windows

#define ID_TIMEPICK 0
#define ID_CHECKBOX 1
#define ID_PUSHBTN  2

     // Timer ID

#define ID_TIMER    1

     // Number of 100-nanosecond increments (ie FILETIME ticks) in an hour

#define FTTICKSPERHOUR (60 * 60 * (LONGLONG) 10000000)

     // Defines and structure for waveform "file"

#define SAMPRATE  11025
#define NUMSAMPS  (3 * SAMPRATE)
#define HALFSAMPS (NUMSAMPS / 2) 

typedef struct
{
     char  chRiff[4] ;
     DWORD dwRiffSize ;
     char  chWave[4] ;
     char  chFmt [4] ;
     DWORD dwFmtSize ;
     PCMWAVEFORMAT pwf ;
     char  chData[4] ;
     DWORD dwDataSize ;
     BYTE  byData[0] ;
}
WAVEFORM ;

     // The window proc and the subclass proc

LRESULT CALLBACK WndProc (HWND, UINT, WPARAM, LPARAM) ;
LRESULT CALLBACK SubProc (HWND, UINT, WPARAM, LPARAM) ;

     // Original window procedure addresses for the subclassed windows

WNDPROC SubbedProc [3] ;

     // The current child window with the input focus

HWND hwndFocus ;

int WINAPI WinMain (HINSTANCE hInstance, HINSTANCE hPrevInst,
                    PSTR szCmdLine, int iCmdShow)
{
     static TCHAR szAppName [] = TEXT ("WakeUp") ;
     HWND         hwnd ;
     MSG          msg ;
     WNDCLASS     wndclass ;

     wndclass.style         = 0 ;
     wndclass.lpfnWndProc   = WndProc ;
     wndclass.cbClsExtra    = 0 ;
     wndclass.cbWndExtra    = 0 ;
     wndclass.hInstance     = hInstance ;
     wndclass.hIcon         = LoadIcon (NULL, IDI_APPLICATION) ;
     wndclass.hCursor       = LoadCursor (NULL, IDC_ARROW) ;
     wndclass.hbrBackground = (HBRUSH) (1 + COLOR_BTNFACE) ;
     wndclass.lpszMenuName  = NULL ;
     wndclass.lpszClassName = szAppName ;

     if (!RegisterClass (&wndclass))
     {
          MessageBox (NULL, TEXT ("This program requires Windows NT!"),
                      szAppName, MB_ICONERROR) ;
          return 0 ;
     }

     hwnd = CreateWindow (szAppName, szAppName,
                          WS_OVERLAPPED | WS_CAPTION | 
                                          WS_SYSMENU | WS_MINIMIZEBOX,
                          CW_USEDEFAULT, CW_USEDEFAULT,
                          CW_USEDEFAULT, CW_USEDEFAULT,
                          NULL, NULL, hInstance, NULL) ;

     ShowWindow (hwnd, iCmdShow) ;
     UpdateWindow (hwnd) ;

     while (GetMessage (&msg, NULL, 0, 0))
     {
          TranslateMessage (&msg) ;
          DispatchMessage (&msg) ;
     }
     return msg.wParam ;
}

LRESULT CALLBACK WndProc (HWND hwnd, UINT message, WPARAM wParam, LPARAM lParam)
{
     static HWND          hwndDTP, hwndCheck, hwndPush ;
     static WAVEFORM      waveform = { "RIFF", NUMSAMPS + 0x24, "WAVE", "fmt ", 
                                       sizeof (PCMWAVEFORMAT), 1, 1, SAMPRATE, 
                                       SAMPRATE, 1, 8, "data", NUMSAMPS } ;
     static WAVEFORM    * pwaveform ;
     FILETIME             ft ;
     HINSTANCE            hInstance ;
     INITCOMMONCONTROLSEX icex ;
     int                  i, cxChar, cyChar ;
     LARGE_INTEGER        li ;
     SYSTEMTIME           st ;
     switch (message)
     {
     case WM_CREATE:
               // Some initialization stuff

          hInstance = (HINSTANCE) GetWindowLong (hwnd, GWL_HINSTANCE) ;

          icex.dwSize = sizeof (icex) ;
          icex.dwICC  = ICC_DATE_CLASSES ;
          InitCommonControlsEx (&icex) ;

               // Create the waveform file with alternating square waves

          pwaveform = malloc (sizeof (WAVEFORM) + NUMSAMPS) ;
          * pwaveform = waveform ;

          for (i = 0 ; i < HALFSAMPS ; i++)
               if (i % 600 < 300)
                    if (i % 16 < 8)
                         pwaveform->byData[i] = 25 ;
                    else
                         pwaveform->byData[i] = 230 ;
               else
                    if (i % 8 < 4)
                         pwaveform->byData[i] = 25 ;
                    else
                         pwaveform->byData[i] = 230 ;

               // Get character size and set a fixed window size.

          cxChar = LOWORD (GetDialogBaseUnits ()) ;
          cyChar = HIWORD (GetDialogBaseUnits ()) ;

          SetWindowPos (hwnd, NULL, 0, 0, 
                        42 * cxChar, 
                        10 * cyChar / 3 + 2 * GetSystemMetrics (SM_CYBORDER) +
                                              GetSystemMetrics (SM_CYCAPTION),
                        SWP_NOMOVE | SWP_NOZORDER | SWP_NOACTIVATE) ; 

               // Create the three child windows

          hwndDTP = CreateWindow (DATETIMEPICK_CLASS, TEXT (""), 
                         WS_BORDER | WS_CHILD | WS_VISIBLE | DTS_TIMEFORMAT,
                         2 * cxChar, cyChar, 12 * cxChar, 4 * cyChar / 3, 
                         hwnd, (HMENU) ID_TIMEPICK, hInstance, NULL) ;
          hwndCheck = CreateWindow (TEXT ("Button"), TEXT ("Set Alarm"),
                         WS_CHILD | WS_VISIBLE | BS_AUTOCHECKBOX,
                         16 * cxChar, cyChar, 12 * cxChar, 4 * cyChar / 3,
                         hwnd, (HMENU) ID_CHECKBOX, hInstance, NULL) ;

          hwndPush = CreateWindow (TEXT ("Button"), TEXT ("Turn Off"),
                         WS_CHILD | WS_VISIBLE | BS_PUSHBUTTON | WS_DISABLED,
                         28 * cxChar, cyChar, 12 * cxChar, 4 * cyChar / 3,
                         hwnd, (HMENU) ID_PUSHBTN, hInstance, NULL) ;

          hwndFocus = hwndDTP ;

               // Subclass the three child windows

          SubbedProc [ID_TIMEPICK] = (WNDPROC) 
                         SetWindowLong (hwndDTP, GWL_WNDPROC, (LONG) SubProc) ;
          SubbedProc [ID_CHECKBOX] = (WNDPROC) 
                         SetWindowLong (hwndCheck, GWL_WNDPROC, (LONG) SubProc);
          SubbedProc [ID_PUSHBTN] = (WNDPROC) 
                         SetWindowLong (hwndPush, GWL_WNDPROC, (LONG) SubProc) ;
          
               // Set the date and time picker control to the current time
               // plus 9 hours, rounded down to next lowest hour
          
          GetLocalTime (&st) ;
          SystemTimeToFileTime (&st, &ft) ;
          li = * (LARGE_INTEGER *) &ft ;
          li.QuadPart += 9 * FTTICKSPERHOUR ; 
          ft = * (FILETIME *) &li ;
          FileTimeToSystemTime (&ft, &st) ;
          st.wMinute = st.wSecond = st.wMilliseconds = 0 ;
          SendMessage (hwndDTP, DTM_SETSYSTEMTIME, 0, (LPARAM) &st) ;
          return 0 ;

     case WM_SETFOCUS:
          SetFocus (hwndFocus) ;
          return 0 ;

     case WM_COMMAND:
          switch (LOWORD (wParam))      // control ID
          {
          case ID_CHECKBOX:
               
                    // When the user checks the "Set Alarm" button, get the 
                    // time in the date and time control and subtract from 
                    // it the current PC time.
               if (SendMessage (hwndCheck, BM_GETCHECK, 0, 0))
               {
                    SendMessage (hwndDTP, DTM_GETSYSTEMTIME, 0, (LPARAM) &st) ;
                    SystemTimeToFileTime (&st, &ft) ;
                    li = * (LARGE_INTEGER *) &ft ;

                    GetLocalTime (&st) ;
                    SystemTimeToFileTime (&st, &ft) ;
                    li.QuadPart -= ((LARGE_INTEGER *) &ft)->QuadPart ;

                         // Make sure the time is between 0 and 24 hours!
                         // These little adjustments let us completely ignore
                         // the date part of the SYSTEMTIME structures.

                    while (li.QuadPart < 0)
                         li.QuadPart += 24 * FTTICKSPERHOUR ;

                    li.QuadPart %= 24 * FTTICKSPERHOUR ;

                         // Set a one-shot timer! (See you in the morning.)

                    SetTimer (hwnd, ID_TIMER, (int) (li.QuadPart / 10000), 0) ;
               }
                    // If button is being unchecked, kill the timer.

               else
                    KillTimer (hwnd, ID_TIMER) ;

               return 0 ;

               // The "Turn Off" button turns off the ringing alarm, and also
               // unchecks the "Set Alarm" button and disables itself.

          case ID_PUSHBTN:
               PlaySound (NULL, NULL, 0) ;
               SendMessage (hwndCheck, BM_SETCHECK, 0, 0) ;
               EnableWindow (hwndDTP, TRUE) ;
               EnableWindow (hwndCheck, TRUE) ;
               EnableWindow (hwndPush, FALSE) ;
               SetFocus (hwndDTP) ;
               return 0 ;
          }
          return 0 ;

               // The WM_NOTIFY message comes from the date and time picker.
               // If the user has checked "Set Alarm" and then gone back to 
               // change the alarm time, there might be a discrepancy between
               // the displayed time and the one-shot timer. So, the program
               // unchecks "Set Alarm" and kills any outstanding timer.

     case WM_NOTIFY:
          switch (wParam)          // control ID
          {
          case ID_TIMEPICK:
               switch (((NMHDR *) lParam)->code)       // notification code
               {
               case DTN_DATETIMECHANGE:
                    if (SendMessage (hwndCheck, BM_GETCHECK, 0, 0))
                    {
                         KillTimer (hwnd, ID_TIMER) ;
                         SendMessage (hwndCheck, BM_SETCHECK, 0, 0) ;
                    }
                    return 0 ;
               }
          }
          return 0 ;

               // The WM_TIMER message comes from the one-shot alarm timer.

     case WM_TIMER:

               // When the timer message comes, kill the timer (because we only
               // want a one-shot) and start the annoying alarm noise going.

          KillTimer (hwnd, ID_TIMER) ;
          PlaySound ((PTSTR) pwaveform,  NULL, 
                     SND_MEMORY | SND_LOOP | SND_ASYNC);

               // Let the sleepy user turn off the timer by slapping the 
               // space bar. If the window is minimized, it's restored; then it's
               // brought to the forefront; then the pushbutton is enabled and
               // given the input focus.

          EnableWindow (hwndDTP, FALSE) ;
          EnableWindow (hwndCheck, FALSE) ;
          EnableWindow (hwndPush, TRUE) ;

          hwndFocus = hwndPush ;
          ShowWindow (hwnd, SW_RESTORE) ;
          SetForegroundWindow (hwnd) ;
          return 0 ;

          // Clean up if the alarm is ringing or the timer is still set.
     case WM_DESTROY:
          free (pwaveform) ;

          if (IsWindowEnabled (hwndPush))
               PlaySound (NULL, NULL, 0) ;

          if (SendMessage (hwndCheck, BM_GETCHECK, 0, 0))
               KillTimer (hwnd, ID_TIMER) ;

          PostQuitMessage (0) ;
          return 0 ;
     }
     return DefWindowProc (hwnd, message, wParam, lParam) ;
}

LRESULT CALLBACK SubProc (HWND hwnd, UINT message, WPARAM wParam, LPARAM lParam)
{
     int idNext, id = GetWindowLong (hwnd, GWL_ID) ;
         
     switch (message)
     {
     case WM_CHAR:
          if (wParam == '\t')
          {
               idNext = id ;

               do
                    idNext = (idNext + 
                         (GetKeyState (VK_SHIFT) < 0 ? 2 : 1)) % 3 ;
               while (!IsWindowEnabled (GetDlgItem (GetParent (hwnd), idNext)));

               SetFocus (GetDlgItem (GetParent (hwnd), idNext)) ;
               return 0 ;
          }
          break ;

     case WM_SETFOCUS:
          hwndFocus = hwnd ;
          break ;
     }
     return CallWindowProc (SubbedProc [id], hwnd, message, wParam, lParam) ;
}

The waveform that WAKEUP uses is just two square waves, but they are alternated very quickly. The actual waveform is calculated during WndProc's WM_CREATE message. The entire waveform file is stored in memory; a pointer to this memory block is passed to the PlaySound function, which uses the SND_MEMORY, SND_LOOP, and SND_ASYNC arguments.
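The pattern built during WM_CREATE can be restated compactly. The sketch below is a portable rewrite, not the program's code: FillAlarm and SquareWaveFreq are hypothetical names, and the nested if statements of the listing are folded into a period selection.

```c
#define SAMPRATE 11025

/* Fill pbData with the alarm pattern: 300 samples of a square wave with a
   16-sample period, then 300 samples with an 8-sample period, repeated. */
void FillAlarm (unsigned char * pbData, int iNumSamples)
{
     int i, iPeriod ;

     for (i = 0 ; i < iNumSamples ; i++)
     {
          iPeriod = (i % 600 < 300) ? 16 : 8 ;

          // Low level for the first half of each period, high for the second
          pbData[i] = (unsigned char) ((i % iPeriod < iPeriod / 2) ? 25 : 230) ;
     }
}

/* Frequency in Hz of a square wave whose period is iPeriod samples. */
double SquareWaveFreq (int iPeriod)
{
     return (double) SAMPRATE / iPeriod ;
}
```

At 11,025 samples per second the two periods work out to about 689 Hz and 1378 Hz, an octave apart, and each burst of 300 samples lasts roughly 27 milliseconds.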

WAKEUP uses a common control called the Date-Time Picker. This control takes care of logic to allow the user to select a particular date and time. (WAKEUP uses only the time feature.) A program can get and set this time using the SYSTEMTIME structure used in obtaining and setting the PC's own clock. To see how versatile the Date-Time Picker really is, try creating the window without any DTS style flags.

Notice the logic at the end of the WM_CREATE message: the program assumes that you run it soon before going to bed and that you want to wake up in 8 hours from the next stroke of the hour.

Now obviously you could obtain the current time in a SYSTEMTIME structure from the GetLocalTime function and increment the time "manually." But in the general case this calculation involves checking for a resultant hour greater than 24, which means you'll have to increment the day field, and then that might involve incrementing the month (so you have to have logic for the number of days in each month and a leap year check), and finally you might have to increment the year.

Instead, the recommended method (from /Platform SDK/Windows Base Services/General Library/Time/Time Reference/Time Structures/SYSTEMTIME) is to convert the SYSTEMTIME to a FILETIME structure (using SystemTimeToFileTime), cast the FILETIME structure to a LARGE_INTEGER structure, perform the calculations on the large integer, cast back to a FILETIME structure, and then convert back to a SYSTEMTIME structure (using FileTimeToSystemTime).
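Once the time is a single 64-bit count of 100-nanosecond ticks, adding hours is one multiplication and one addition. The sketch below shows only that arithmetic in portable C, leaving out the Windows conversion calls. AddHoursRoundDown and TICKS_PER_HOUR are hypothetical names, and the rounding step assumes a nonnegative tick count that starts at midnight, so that whole multiples of an hour land on the hour. WAKEUP itself rounds by zeroing the minute, second, and millisecond fields after converting back to SYSTEMTIME.

```c
#include <stdint.h>

/* 100-nanosecond ticks in one hour, as in WAKEUP's FTTICKSPERHOUR. */
#define TICKS_PER_HOUR (60 * 60 * (int64_t) 10000000)

/* Add iHours to a tick count, then round down to the start of that hour. */
int64_t AddHoursRoundDown (int64_t llTicks, int iHours)
{
     llTicks += iHours * TICKS_PER_HOUR ;

     return llTicks - llTicks % TICKS_PER_HOUR ;
}
```

No day, month, or year logic is needed here; the carry into the date fields happens when the tick count is converted back into a calendar structure.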

The FILETIME structure, as its name implies, is used to get and set the time that a file was last modified. The structure looks like this:

typedef struct _FILETIME    // ft
{
     DWORD dwLowDateTime ;
     DWORD dwHighDateTime ;
}
FILETIME ;

These two fields together express a 64-bit value that indicates the number of 100-nanosecond intervals from January 1, 1601.
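The two halves can also be combined and split with shifts, which does not depend on how the structure is laid out in memory. This is a sketch using fixed-width types from stdint.h. FILETIME_SKETCH, TicksFromFileTime, and FileTimeFromTicks are hypothetical names standing in for the Windows definitions, and this is an alternative to the pointer casts used in WAKEUP, not a description of them.

```c
#include <stdint.h>

/* Stand-in for FILETIME built from fixed-width types. */
typedef struct
{
     uint32_t dwLowDateTime ;
     uint32_t dwHighDateTime ;
}
FILETIME_SKETCH ;

/* Combine the two 32-bit fields into one 64-bit tick count. */
uint64_t TicksFromFileTime (FILETIME_SKETCH ft)
{
     return ((uint64_t) ft.dwHighDateTime << 32) | ft.dwLowDateTime ;
}

/* Split a 64-bit tick count back into the two 32-bit fields. */
FILETIME_SKETCH FileTimeFromTicks (uint64_t ullTicks)
{
     FILETIME_SKETCH ft ;

     ft.dwLowDateTime  = (uint32_t) (ullTicks & 0xFFFFFFFFu) ;
     ft.dwHighDateTime = (uint32_t) (ullTicks >> 32) ;
     return ft ;
}
```

A round trip through the two functions returns the original fields unchanged.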

The Microsoft C/C++ compiler supports 64-bit integers as a nonstandard extension to ANSI C. The data type is __int64. You can do all the normal arithmetic operations on __int64 types, and some run-time library functions support them. The Windows WINNT.H header file defines the following:

typedef __int64 LONGLONG ;
typedef unsigned __int64 DWORDLONG ;

In Windows, this is sometimes called a "quad word" or, more commonly, a "large integer." There's also a union defined:

typedef union _LARGE_INTEGER
{
     struct
     {
          DWORD LowPart ;
          LONG  HighPart ;
     } ;
     LONGLONG QuadPart ;
}
LARGE_INTEGER ; 

This is all documented in /Platform SDK/Windows Base Services/General Library/Large Integer Operations. The union lets you work with the large integer either as two 32-bit quantities or as a 64-bit quantity.
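Treated as a single 64-bit quantity, the subtraction that WAKEUP performs when "Set Alarm" is checked reduces to ordinary integer arithmetic. The following is a portable sketch of that calculation. MsecUntilAlarm and the two tick constants are hypothetical names, and both arguments are assumed to be tick counts of the kind stored in QuadPart.

```c
#include <stdint.h>

#define TICKS_PER_HOUR (60 * 60 * (int64_t) 10000000)
#define TICKS_PER_MSEC 10000

/* Milliseconds from llNow until llAlarm, folded into the range
   0 to 24 hours so that the date portion can be ignored. */
uint32_t MsecUntilAlarm (int64_t llAlarm, int64_t llNow)
{
     int64_t llDiff = llAlarm - llNow ;

     while (llDiff < 0)
          llDiff += 24 * TICKS_PER_HOUR ;

     llDiff %= 24 * TICKS_PER_HOUR ;

     // 10,000 ticks of 100 nanoseconds make one millisecond
     return (uint32_t) (llDiff / TICKS_PER_MSEC) ;
}
```

An alarm set for 7:00 when the clock reads 23:00 gives a difference of minus 16 hours, which the loop folds to 8 hours, or 28,800,000 milliseconds. That value is the kind of interval WAKEUP passes to SetTimer.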