How to efficiently emulate polyphonic waveform generation (specifically Namco's 3-voice sound chip)

Emulating a polyphonic waveform generator is relatively straightforward, but doing it efficiently is another matter.  Namco's 3-voice waveform chip is a good one to examine since it was used in some of their most popular games (such as Galaga) and it can create a wide variety of sounds.  There are two aspects to emulating this chip: one is translating the memory mapped registers into a set of waveform, volume and frequency values; the other is merging the waveform and frequency values to create a buffer of waveform data suitable for passing to your sound card.  The chip controls 3 simultaneous voices, each with 16 possible volume levels and 8 possible 4-bit, 32-entry waveform tables.  These waveforms are contained in a 256x4-bit PROM addressed as 8 groups of 32 consecutive nibbles (Address lines a5-a7 select the waveform).
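The PROM addressing described above can be sketched as a small lookup helper.  This is only an illustration: the name NamcoWaveSample and its parameters are hypothetical and do not appear in the emulator code later in this article.

```c
#include <stddef.h>

/* Hypothetical helper: fetch one 4-bit sample from the 256-entry waveform PROM.
   wave selects one of the 8 waveforms (PROM address lines a5-a7);
   step selects one of its 32 nibbles (the remaining low address lines). */
unsigned char NamcoWaveSample(const unsigned char *prom, int wave, int step)
{
    size_t addr = (size_t)(((wave & 0x7) << 5) | (step & 0x1f));

    return (unsigned char)(prom[addr] & 0x0f); /* only the low nibble holds data */
}
```

For example, waveform 3, step 5 reads PROM address 3*32 + 5 = 101.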

Part 1 - Translating the memory mapped registers into useful info

For this example, let's assume that the sound chip is mapped to address 0000.  The registers are laid out as follows (all addresses are in hexadecimal):

0005: Waveform #0, 0-7
000A: Waveform #1, 0-7
000F: Waveform #2, 0-7
0010-0014: Frequency #0 (5 nibbles) - each byte uses only its lower 4 bits, and the addresses run from LSN to MSN (least significant nibble to most significant nibble)
0015: Volume #0, 0-F
0016-0019: Frequency #1 (4 nibbles; the lowest nibble is taken as zero)
001A: Volume #1, 0-F
001B-001E: Frequency #2 (4 nibbles; the lowest nibble is taken as zero)
001F: Volume #2, 0-F

During the execution of the game, these registers are updated one byte at a time.  One might be inclined to update the sound output each time a byte is written to the sound chip, but that produces very inefficient code and can introduce distortion into the sound output.  I chose to update the sound output once every frame (60 times per second).  This produces clean, efficient sound, and since a sixtieth of a second is shorter than the typical musical note, it is fast enough to keep up with most music and sound effects.  The only exception I have found is in Xevious: the gunfire sound is made by changing the frequency rapidly, and my emulation of this sound is not very accurate.
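One way to arrange the once-per-frame update is to have the memory write handler do nothing but store the byte, and let the frame loop pick up the registers afterwards.  The sketch below assumes a 32-byte register block; the names NamcoSoundWrite, NamcoSoundLatch, soundRegs and soundRegsDirty are hypothetical and are not part of the code shown in this article.

```c
#include <stddef.h>

#define NAMCO_NUM_REGS 0x20  /* assumed size: registers 0000-001F */

static unsigned char soundRegs[NAMCO_NUM_REGS]; /* shadow copy of the chip registers */
static int soundRegsDirty;                      /* set when any register changes */

/* Called from the CPU memory write handler: store the byte and return. */
void NamcoSoundWrite(unsigned int offset, unsigned char value)
{
    offset &= NAMCO_NUM_REGS - 1;
    value &= 0x0f; /* every register described above fits in 4 bits */
    if (soundRegs[offset] != value)
    {
        soundRegs[offset] = value;
        soundRegsDirty = 1;
    }
}

/* Called once per video frame: returns the registers if anything changed
   since the last call, or NULL if the previous parameters are still valid. */
const unsigned char *NamcoSoundLatch(void)
{
    if (!soundRegsDirty)
        return NULL;
    soundRegsDirty = 0;
    return soundRegs;
}
```

The frame loop would then decode the registers only when NamcoSoundLatch returns a non-NULL pointer, so a burst of register writes within one frame costs one decode rather than many.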

Here's the code for extracting useful information from the sound registers:

pWave - Waveform offsets
cVolume - Volume level, 0-F
iFreq - Frequency

/*******************************************************************
*                                                                  *
* FUNCTION : NamcoSoundParams(unsigned char *)                     *
*                                                                  *
* PURPOSE : NAMCO 3-voice sound hardware emulation.                *
*                                                                  *
********************************************************************/

void NamcoSoundParams(unsigned char *pSoundRegs)
{
pWave[0] = 32 * (pSoundRegs[0x5] & 0x7);
pWave[1] = 32 * (pSoundRegs[0xa] & 0x7);
pWave[2] = 32 * (pSoundRegs[0xf] & 0x7);

cVolume[0] = pSoundRegs[0x15] & 0xf;
cVolume[1] = pSoundRegs[0x1a] & 0xf;
cVolume[2] = pSoundRegs[0x1f] & 0xf;

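/* Each frequency register holds one nibble, so mask off the unused upper bits. */
iFreq[0] = ((pSoundRegs[0x14] & 0xf) << 16) | ((pSoundRegs[0x13] & 0xf) << 12) | ((pSoundRegs[0x12] & 0xf) << 8) | ((pSoundRegs[0x11] & 0xf) << 4) | (pSoundRegs[0x10] & 0xf);
/* Voices 1 and 2 have only 4 nibbles; their lowest nibble is zero. */
iFreq[1] = ((pSoundRegs[0x19] & 0xf) << 16) | ((pSoundRegs[0x18] & 0xf) << 12) | ((pSoundRegs[0x17] & 0xf) << 8) | ((pSoundRegs[0x16] & 0xf) << 4);
iFreq[2] = ((pSoundRegs[0x1e] & 0xf) << 16) | ((pSoundRegs[0x1d] & 0xf) << 12) | ((pSoundRegs[0x1c] & 0xf) << 8) | ((pSoundRegs[0x1b] & 0xf) << 4);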
} /* NamcoSoundParams() */
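As a worked example of the nibble ordering, the helper below combines a run of nibbles stored least significant first.  The name NamcoAssembleFreq and its parameters are hypothetical; it is a sketch of the arithmetic, not a replacement for the function above.

```c
/* Hypothetical helper: combine `count` nibbles, least significant first,
   into one value.  `firstShift` is the bit position of the first nibble
   (0 for voice 0, 4 for voices 1 and 2, which lack the lowest nibble). */
int NamcoAssembleFreq(const unsigned char *regs, int count, int firstShift)
{
    int i, value = 0;

    for (i = 0; i < count; i++)
        value |= (regs[i] & 0x0f) << (firstShift + 4 * i);
    return value;
}
```

If registers 0010-0014 hold 1, 2, 3, 4, 5, the result for voice 0 is hex 54321.  If registers 0016-0019 hold 2, 3, 4, 5, the result for voice 1 is hex 54320.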

Part 2 - Translating the volume, frequency and waveform data into sound samples

At the end of each video frame (60 frames per second), I create a 60th of a second of sound data and send it to the sound card.  The PacSoundUpdate function, shown later in this part, uses the frequency, volume and waveform data to create a monophonic waveform which is sent to the sound card.  In order to speed up the generation of the output wave data, I precalculate every waveform at every volume level; this saves several integer operations (including a multiply, which is slow on x86) in the inner loop.  Shown below is the function which prepares the precalculated wave tables.

/***********************************************************************
*                                                                      *
* FUNCTION : PacPrepSounds(int)                                        *
*                                                                      *
* PURPOSE : Precalculate the sound waveforms to generate sound faster. *
*                                                                      *
************************************************************************/

void PacPrepSounds(int iShift)
{
signed char *p;
int v, j;

   p = pWaveforms; /* Destination of pre-calculated data */
   for (v=1; v<16; v++) /* Volume level */
      {
      for (j=0; j<256; j++)
         {
/* Do all necessary calculations beforehand */
         *p++ = ((((pSoundPROM[j] & 0x0f) - 8) * v) >> iShift);
         }
      }
} /* PacPrepSounds() */

Since the 3 voices must be summed to form the final output, the waveform data is also right shifted by 2 (iShift) so that the sum of the 3 voices does not overflow the 8-bit output.  The iAudioShift variable used in the mixing function below selects the output sample rate: it takes the values 12, 13 and 14 for 11025, 22050 and 44100Hz output respectively.
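The headroom argument can be checked with a few lines of arithmetic.  The sketch below uses a hypothetical function, NamcoSampleRange, which evaluates the same expression as PacPrepSounds for every nibble and volume level and reports the extremes.  It assumes, as the original code does, that the compiler performs an arithmetic right shift on negative values.

```c
/* Hypothetical check: find the smallest and largest value that the
   expression ((nibble - 8) * v) >> iShift can produce. */
void NamcoSampleRange(int iShift, int *pMin, int *pMax)
{
    int nibble, v, s;

    *pMin = 0;
    *pMax = 0;
    for (v = 1; v < 16; v++)             /* volume levels 1-15 */
    {
        for (nibble = 0; nibble < 16; nibble++)
        {
            s = ((nibble - 8) * v) >> iShift; /* assumes arithmetic shift */
            if (s < *pMin) *pMin = s;
            if (s > *pMax) *pMax = s;
        }
    }
}
```

With a shift of 2 the range is -30 to 26, so three voices summed around the silence level of 128 stay between 38 and 206, inside the 8-bit range.  With a shift of 1 the range is -60 to 52, and three voices at the minimum would reach 128 - 180, which overflows.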

/********************************************************************
*                                                                   *
* FUNCTION : PacSoundUpdate(void)                                   *
*                                                                   *
* PURPOSE : Create waveform data for the time slice last executed.  *
*                                                                   *
*********************************************************************/

void PacSoundUpdate(void)
{
int i, iLen, iPos, iInc;
int iVoice, f, v;
signed char *p; /* The precalculated table holds signed samples */
signed char *d;

iLen = iSoundBlock; /* Number of 8-bit samples to create */

memset(&pSoundBuf[iSoundLen], 128, iSoundBlock); /* Start with silence */
for (iVoice=0; iVoice<3; iVoice++)
   {
   if (iFreq[iVoice] && cVolume[iVoice]) /* If this voice is active */
      {
      f = iFreq[iVoice];
      v = cVolume[iVoice];
      iPos = iPosition[iVoice]; /* Need to track waveform position */
/* Pointer to precalculated (waveform*volume) table */
      p = pWave[iVoice] + pWaveforms + (v-1)*256;
      d = &pSoundBuf[iSoundLen]; /* Pointer to buffer receiving output */
      for (i=0; i<iLen; i++)
         {
/* Sum the current voice data with existing data and update the position */
         *d++ += p[(iPos >> iAudioShift) & 0x1f];
         iPos += f;
         } /* for each sample */
/* Save the position for the next time through to produce smooth output */
      iPosition[iVoice] = iPos;
      } /* if voice active */
   } /* for each voice */
} /* PacSoundUpdate() */
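Two quantities follow directly from the loop above.  Each waveform has 32 entries and the index is (iPos >> iAudioShift) & 0x1f, so one full cycle takes 2^(iAudioShift+5) position units, and the pitch produced is iFreq * sampleRate / 2^(iAudioShift+5).  The sketch below uses hypothetical helper names, and assumes that iSoundBlock is simply the sample rate divided by 60; the article does not show how iSoundBlock is actually computed.

```c
/* Hypothetical helper: samples needed for one video frame (a 60th of a
   second).  Integer division truncates, e.g. 11025 / 60 gives 183. */
int NamcoSamplesPerFrame(int sampleRate)
{
    return sampleRate / 60;
}

/* Hypothetical helper: pitch in Hz that the mixing loop produces for a
   given frequency value, sample rate and iAudioShift. */
double NamcoOutputHz(int iFreq, int sampleRate, int iAudioShift)
{
    return (double)iFreq * sampleRate / (double)(1L << (iAudioShift + 5));
}
```

For a frequency value of 4096, the three rate and shift pairs (11025 with 12, 22050 with 13, 44100 with 14) all give 344.53125Hz, which is why the same register values sound the same at every output rate.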
