High-quality audio applications using the PIC32

25 July 2012 DSP, Micros & Memory

When designing docking stations and accessories for portable digital audio devices and other digital audio sources, designers must deliver the highest-quality audio playback within tight cost constraints.

In a typical docking station and device accessory, a digital audio source that plugs into the unit sends a serial stereo audio data stream over the dock’s data transfer interface, such as USB. The dock captures the data stream while performing other crucial tasks, and routes the stream to an audio codec or digital-to-analog converter (DAC) at a specific sample rate, as shown in Figure 1.

Figure 1. Typical audio application.

The captured stereo audio stream then flows through a serial interface to the codec or DAC. Since there are many possible sources of digitised audio, and not all of the sources use the same sampling rate, this serial interface typically adapts the sampling frequency to the source, or converts the sampled data stream into a common data rate.

Therefore, one of the challenges in the design of the docking system or device accessory is to perform the sample-rate conversion without degrading the audio quality, and at the lowest cost possible.

To deal with these challenges, designers have typically used a dedicated sample-rate conversion circuit and/or a high-end audio DAC that incorporates complex phase-locked loops (PLLs) to ensure flexible sample rates for stable communication of the sampled audio data.

The USB interface is a convenient interface for the transfer of audio data. However, to meet the requirements of professional audio, the subtle loss of quality due to USB clock and codec clock mismatch must be addressed.

In this article, the available audio-specific features of the PIC32 MCU are explored to address these needs. The SPI module supports different standard audio communication modes and offers high bit resolution for high-quality audio applications.

The flexible reference clock output feature of the PIC32 can be used to provide the master clock to the analog front-end to generate the different sample rates. The reference clock output also eliminates the need for an external crystal/oscillator for the codec.

It also eliminates the need for a PLL on the codec. The reference clock output can be tuned to prevent buffer under-run and over-run that arise because of clock mismatches. The MCU also offers the USB Host and Device module with flexible PLL clocking schemes at low power.

Digital audio data basics

When analog audio is converted into a discrete digital format, the analog signal is sampled at a frequency of at least twice the highest frequency component in the analog signal, known as the Nyquist rate. An audio signal that spans 0 to 20 kHz has a Nyquist rate of 40 kHz, so it can be sampled at 44,1 kHz, which exceeds that rate, and the signal can be reconstructed without aliasing when converted back to the analog domain.

In addition to the sampling rate, data bit resolution can be 16-bit or 24-bit stereo audio data. For compact disc (CD) quality audio, the standard is 16-bit resolution with a 44,1 kHz sample rate. However, there are higher-performance CD music options. One such standard encodes the data with a 24-bit resolution and increases the sampling rate to 96 kHz.

For professional audio, the audio files are encoded with a resolution of 24 bits per sample, which provides headroom when the audio is mixed and manipulated. Also, the resolution choice allows for the trade-off of sound quality versus file size, even with compression.

The USB interface can readily handle the streaming of high-quality audio over isochronous transfers. Its ability to deliver high-quality audio is quite evident, as it is popular among many audio users. With its universal ease-of-use, USB audio can transfer high-resolution and high-sample-rate audio with negligible jitter, when packaged with a flexible audio interface.

Isochronous data transfer, amongst its various other uses, is utilised to stream audio data to and from a source at a constant rate in real-time. Stereo audio data packets, with size governed by the sample rate of the audio stream, are transferred as part of USB frames every 1 ms on the USB full-speed link. USB audio also provides controls for common features such as volume, tone, gain control and equalisers, among many control and processing units.
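
The link between sample rate and packet size can be illustrated with a little arithmetic. The sketch below assumes 16-bit stereo samples and a simple remainder-carrying scheme for rates that do not divide evenly into 1 ms frames; the function names are hypothetical, and an actual host may distribute the samples differently.

```python
def samples_per_frame(sample_rate_hz, frames):
    """Return the number of stereo sample pairs sent in each 1 ms USB frame.

    A running remainder carries the fractional part forward so that the
    long-run average matches the sample rate exactly.
    """
    counts = []
    remainder = 0
    for _ in range(frames):
        remainder += sample_rate_hz
        n, remainder = divmod(remainder, 1000)  # 1000 full-speed frames per second
        counts.append(n)
    return counts


def packet_bytes(samples, channels=2, bits_per_sample=16):
    """Payload size in bytes of one isochronous packet."""
    return samples * channels * (bits_per_sample // 8)


if __name__ == "__main__":
    for rate in (32_000, 44_100, 48_000):
        counts = samples_per_frame(rate, 10)
        print(rate, counts, [packet_bytes(n) for n in counts])
```

At 48 kHz every frame carries 48 sample pairs (192 bytes), whereas at 44,1 kHz this scheme sends nine frames of 44 sample pairs followed by one frame of 45.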

The differences in bit rates and sample rates require the hardware in the playback system, or dock, to handle data streams of differing rates. There are three ways to do that: use a more complex, expensive DAC that can phase-lock to each sample rate and adjust itself to each playback option; pair a low-cost DAC with an external sample-rate converter IC; or convert all the streams into a standard sample rate and bit rate, using an algorithm running on a microcontroller, so that a simple low-cost DAC can handle them.

The PIC32 MCU offers a flexible reference clock output and Audio mode to address these requirements and achieve high-quality audio while maintaining a low design cost. The serial interface with Audio mode and the flexible reference clock output module are explored in the following sections.

PIC32 serial peripheral interface module with Audio mode

Most codecs offer serial communication over a 4-line serial interface. The transmission and reception of the stereo audio samples between the MCU and codec occur over this serial interface. Typical serial interfaces have the following signals:

* Serial Data Output (SDO) to transmit stereo audio data to the codec.

* Serial Data Input (SDI) to receive stereo audio data from the codec.

* Serial Bit Clock (SCK/BCLK) is the required bit clock provided by the Master.

* Left/Right Clock (LRCK) is the phase clock provided by the Master for stereo data.

PIC32 devices have a serial peripheral interface (SPI) module with Audio mode.

Audio mode offers various interface formats, bit resolutions and Master/Slave configurations. The communication modes supported include the following:

I²S format

I²S mode is where the most significant bit (MSB) is available on the second rising edge of the BCLK following an LRCK transition (Figure 2).

Figure 2. I²S format.

Left-justified format

Left-justified mode is where the MSB is available on the first rising edge of BCLK following an LRCK transition (Figure 3).

Figure 3. Left-justified format.

Right-justified format

Right-justified mode is where the least significant bit (LSB) is available on the rising edge of the BCLK preceding an LRCK transition and the MSB is still transmitted first (Figure 4).

Figure 4. Right-justified format.

DSP/PCM format

In DSP/PCM mode, the left-channel MSB is available on the first rising edge of the BCLK following a rising edge of the LRCK (Figure 5). Right-channel data immediately follows left-channel data. Depending on the word length, the BCLK frequency and the sample rate, there may be unused BCLK cycles between the LSB of the right-channel data and the next sample.

Figure 5. DSP/PCM format.

PIC32 Audio mode supports 16-, 24- and 32-bit stereo audio data. It also provides advanced error handling, with receive-overflow and transmit-under-run flags and control bits to disable them as required. In addition, the BCLK and LRCK generated when the PIC32 MCU is the serial Master are free-running, which is essential for uninterrupted audio data transfer in streaming applications.
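
The difference between the I²S, left-justified and right-justified formats described above comes down to where one channel word sits within its LRCK half-period. The following sketch is purely illustrative (the function names and the text diagram are hypothetical, and DSP/PCM mode is left out because both channels follow a single LRCK edge); it marks the BCLK cycle that carries the MSB and the LSB for a given word length and slot length.

```python
def msb_bclk_index(fmt, word_bits, slot_bits):
    """0-based index of the BCLK rising edge, counted from the LRCK
    transition, on which the MSB of one channel word appears."""
    if word_bits > slot_bits:
        raise ValueError("word does not fit in the channel slot")
    if fmt == "i2s":
        return 1                      # second rising edge (one-bit delay)
    if fmt == "left":
        return 0                      # first rising edge
    if fmt == "right":
        return slot_bits - word_bits  # LSB lands on the last edge of the slot
    raise ValueError(f"unknown format: {fmt}")


def slot_layout(fmt, word_bits, slot_bits):
    """One character per BCLK cycle of a channel slot:
    'M' = MSB, 'L' = LSB, 'd' = other data bits, '.' = unused cycle."""
    start = msb_bclk_index(fmt, word_bits, slot_bits)
    cells = ["."] * slot_bits
    for i in range(word_bits):
        pos = start + i
        if pos >= slot_bits:
            break  # I2S word as wide as the slot: LSB falls after the next LRCK edge
        if i == 0:
            cells[pos] = "M"
        elif i == word_bits - 1:
            cells[pos] = "L"
        else:
            cells[pos] = "d"
    return "".join(cells)


if __name__ == "__main__":
    for fmt in ("i2s", "left", "right"):
        # 16-bit words in a 32-cycle slot, i.e. a 64 fs bit clock
        print(f"{fmt:>5}: {slot_layout(fmt, 16, 32)}")
```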

The audio codecs and DACs allow either or both of these configurations: Master or Slave. In the PIC32 SPI Audio mode Master configuration, the MCU provides the BCLK and the LRCK to the codec. This configuration is shown in Figure 6. In the PIC32 SPI Audio mode Slave configuration, the codec provides the bit clock (BCLK) and stereo phase clock (LRCK) to the MCU. This configuration is shown in Figure 7.

Figure 6. PIC32 SPI-codec configuration with PIC32 SPI as master and the codec as slave.

Figure 7. PIC32 SPI-codec configuration with PIC32 SPI as slave and the codec as master.

PIC32 devices also provide a control interface channel over the Inter-Integrated Circuit (I²C) peripheral module. The control interface is used to configure the codec for a specific mode of operation by configuring the control registers on the codecs.

Figure 8. SPI and codec configuration showing the reference clock output signal.

Flexible reference clock output

The PIC32 devices have a flexible reference clock output. The reference clock output module (REFCLKO) can be used to generate the fractional clock that can be used by audio codec/DACs to accommodate various sample rates. Typical examples of these sample rates and associated master clocks are shown in Table 1.

Table 1. Typical master clocks and bit clocks required by the codec/DAC/ADC for different audio sample rates.

The REFCLKO module can be used to generate these audio-specific master clocks, but is not limited to them. The reference clock can be mapped to any port pin that the Peripheral Pin Select (PPS) functionality of the specific device permits. REFCLKO can also be configured as the source clock for the SPI module on the PIC32 MCU, instead of the peripheral bus clock.

This ensures synchronisation of the USB clock with the SPI channel clock to minimise effects of clock jitter when the USB PLL clock is selected as the source clock for the REFCLKO. The stereo word select signal and the bit clock will be synchronised with the reference clock output since all of them are sourced off the reference clock output.

The clock source for the reference clock output module has various options. It can be selected from USB-PLL, Primary PLL, POSC, FRC, SOSC, LPRC, PBCLK, SYSCLK clocks. Based on the selected clock source, a fractional divisor needs to be configured for the REFCLKO to generate the needed MCLK based on the oversampling rate as noted in Table 2.

Table 2. Baud rate values when the PIC32 SPI is configured as a master.

The required fractional divisor is configured as an integer part and a fractional part, set by the RODIV bits in the REFOCON register and the ROTRIM bits in the REFOTRIM register, respectively. Figure 9 shows the PIC32 oscillator module with the reference clock output, indicating the different clock sources for the reference clock output and the divider with the trimming scheme.

Figure 9. Block diagram of the PIC32 oscillator showing the reference clock output.

When the PIC32 SPI is the Master, the BCLK and the LRCK are generated by the device. With REFCLKO as the clock source for the SPI module, the BCLK can be configured by loading the appropriate values in the SPI baud rate generator register, SPIxBRG. If needed, the REFCLKO can provide the master clock input to the codec.

Table 2 lists the SPIxBRG register values and the required REFCLKO frequencies for audio sample rates of 32, 44,1 and 48 kHz, for stereo audio data with both 16-bit and 24-bit resolution. The bit clock, BCLK, can be 32 fs or 64 fs. When the PIC32 SPI is the Slave, the BCLK and the LRCK are generated by the codec. The reference clock output can provide the master clock for the codec/DAC.
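
As a rough illustration of how a baud rate value follows from the REFCLKO frequency, the sketch below assumes the usual PIC32 SPI relation BCLK = F_source / (2 × (SPIxBRG + 1)) and a REFCLKO equal to a 256 fs master clock. Both are assumptions that should be checked against the device data sheet and Table 2; the function name is hypothetical.

```python
def spi_brg(refclko_hz, fs_hz, bclk_per_fs):
    """SPIxBRG value that derives the bit clock from REFCLKO.

    Assumes BCLK = refclko_hz / (2 * (SPIxBRG + 1)); raises ValueError
    if the requested bit clock cannot be produced exactly.
    """
    bclk_hz = fs_hz * bclk_per_fs
    ratio, remainder = divmod(refclko_hz, 2 * bclk_hz)
    if remainder or ratio < 1:
        raise ValueError("bit clock is not an exact division of REFCLKO")
    return ratio - 1


if __name__ == "__main__":
    # (sample rate, assumed 256 fs master clock) pairs
    for fs, mclk in ((32_000, 8_192_000), (44_100, 11_289_600), (48_000, 12_288_000)):
        for bclk_per_fs in (32, 64):
            print(fs, bclk_per_fs, spi_brg(mclk, fs, bclk_per_fs))
```

Under these assumptions a 64 fs bit clock needs SPIxBRG = 1 and a 32 fs bit clock needs SPIxBRG = 3, independent of the sample rate, because the master clock scales with fs.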

Example implementation

Let us explore an example of generating a REFCLKO signal. If the required MCLK is 12,288 MHz and the USB-PLL clock of 96 MHz is used as the source for the reference clock output, the required divisor is 96/12,288 = 7,8125. The divisor needs to be programmed for a half period of 3,90625, which is 7,8125/2. The RODIV bits are therefore configured with a value of 3, and the ROTRIM bits with a value of 464, since 464/512 = 0,90625.
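
The same arithmetic can be written as a short calculation. This is a sketch of the divisor relation used in the example, MCLK = F_source / (2 × (RODIV + ROTRIM/512)); the function names are hypothetical and the code does not touch any registers. Where the target is not exactly reachable, it rounds ROTRIM to the nearest step.

```python
from fractions import Fraction


def refclko_divisor(source_hz, target_hz):
    """Return (RODIV, ROTRIM) for the closest achievable output clock."""
    half_period = Fraction(source_hz, 2 * target_hz)  # RODIV + ROTRIM/512
    rodiv = int(half_period)                          # integer part
    rotrim = round((half_period - rodiv) * 512)       # fractional part in 1/512 steps
    if rotrim == 512:                                 # rounding carried into the integer part
        rodiv, rotrim = rodiv + 1, 0
    return rodiv, rotrim


def refclko_hz(source_hz, rodiv, rotrim):
    """Output frequency produced by a given RODIV/ROTRIM pair."""
    return source_hz / (2 * (rodiv + rotrim / 512))


if __name__ == "__main__":
    usb_pll_hz = 96_000_000
    for mclk in (8_192_000, 11_289_600, 12_288_000):
        rodiv, rotrim = refclko_divisor(usb_pll_hz, mclk)
        actual = refclko_hz(usb_pll_hz, rodiv, rotrim)
        print(mclk, rodiv, rotrim, round(actual, 1))
```

For 12,288 MHz this returns RODIV = 3 and ROTRIM = 464, matching the worked example.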

Table 4 provides the required RODIV and ROTRIM values needed for the standard audio MCLKs previously listed in Table 1. The clock source selected here is the USB-PLL clock. However, any other clock source can be selected and the RODIV and ROTRIM values can be calculated in a similar manner.

Table 4. Divisor and trimming values for different master clocks.

The reference clock output can be changed with a fine precision. The unit resolution is between approximately 0,02% and 0,05% based on the generated clock frequency. This ensures that the sample rate deviation is well below the limits required by the codec when the reference clock output is only being used as the SPI clock source and not as the master clock to the codec. If the reference clock output is used as the master clock for the codec, the deviation is only limited by the specification of the codec.
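
The size of one tuning step follows from the divisor relation: moving ROTRIM by one count changes the output frequency by roughly 1/(512 × RODIV + ROTRIM) of its value. The sketch below is a hypothetical helper that evaluates this for a few RODIV/ROTRIM pairs; apart from the 12,288 MHz pair from the example, the pairs were computed for a 96 MHz source and are not taken from Table 4.

```python
def trim_step_percent(rodiv, rotrim):
    """Approximate relative frequency change, in percent, caused by a
    one-count change of ROTRIM around the given setting."""
    return 100.0 / (512 * rodiv + rotrim)


if __name__ == "__main__":
    # (RODIV, ROTRIM) pairs; values other than (3, 464) are computed assumptions
    for rodiv, rotrim in ((5, 440), (4, 129), (3, 464)):
        print(rodiv, rotrim, f"{trim_step_percent(rodiv, rotrim):.4f} %")
```

For RODIV = 3 and ROTRIM = 464 the step is 0,05%, the upper end of the range quoted above; larger divisors give finer steps.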

Table 3 shows an example of an audio master clock generated by an expensive codec with flexible internal PLL using a standard external crystal of 12 MHz and the master clock using reference clock output. The data indicate that the master clock generated by the PIC32 reference clock output is very close to the required master clock.

Also, the reference clock output performs as well as or better than the master clock generated using the internal PLL on an expensive codec. This would eliminate the need for a PLL on the codec or DAC used as an analog front end and eliminate the need for an external crystal used as time base for the codec. This will result in cost savings in the audio design while providing the same or better quality than an expensive codec.

Table 5, Table 6 and Table 7 show examples of the signal quality metrics, SNR and THD, for a set of (pure) audio tones generated using external crystal with internal PLL on the codec and using the reference clock output as master clock on the same codec.

Table 5. Observed THD and SNR using codec PLL and PIC32 reference clock output – sample rate of 32 kHz, expected master clock of 8,192 MHz.

Table 6. Observed THD and SNR using codec PLL and PIC32 reference clock output – sample rate of 44,1 kHz, expected master clock of 11,2896 MHz.

Table 7. Observed THD and SNR using codec PLL and PIC32 reference clock output – sample rate of 48 kHz, expected master clock of 12,288 MHz.

The signal-to-noise ratio (SNR) compares the power of the signal with the power of the induced noise. The higher the SNR, the better the quality of the signal produced.

Total harmonic distortion (THD) measures the strength of the distortion caused by harmonics of the fundamental frequency. The lower the THD, the better the quality of the signal produced.
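
For reference, the two metrics can be expressed with their common textbook definitions. The article does not state how the measurements in the tables were taken, so the sketch below is only an illustration of the definitions, with made-up input values.

```python
import math


def snr_db(signal_power, noise_power):
    """Signal-to-noise ratio in decibels from two power values."""
    return 10.0 * math.log10(signal_power / noise_power)


def thd_percent(fundamental_rms, harmonic_rms):
    """Total harmonic distortion as a percentage of the fundamental.

    harmonic_rms is an iterable of RMS amplitudes of the 2nd, 3rd, ...
    harmonics, in the same unit as fundamental_rms.
    """
    harmonic_sum = math.sqrt(sum(v * v for v in harmonic_rms))
    return 100.0 * harmonic_sum / fundamental_rms


if __name__ == "__main__":
    # Illustrative numbers only, not measured data
    print(snr_db(1.0, 1e-9))
    print(thd_percent(1.0, [0.003, 0.004]))
```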

The SNR and THD data indicate that an AFE using the reference clock output as its time base delivers audio signal quality that compares very well with that of a very expensive codec with an internal PLL.

Tuning reference clock output

The reference clock output can be tuned on the fly, in steps, within a specified range. The range should limit the swing of the sample rate to about ±0,2%, which is well below the range that might introduce audible artifacts. For example, a swing of ±0,2% on a data stream with a sample rate of 48 kHz requires tuning the reference clock output between 12 263 424 Hz and 12 312 576 Hz.

Table 8 lists the RODIV and ROTRIM values required to tune the sample rate within a window of about 200 Hz centred on the 48 kHz sample rate. To tune the sample rate between 47,9 kHz and 48,1 kHz, RODIV is held at 3 and ROTRIM is tuned between 459 and 468. Similar tuning ranges for RODIV and ROTRIM can be determined for other standard audio sample rates.
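
To see what each trim setting does to the sample rate, the sketch below evaluates the divisor relation across the quoted ROTRIM range. It assumes the 96 MHz USB-PLL source and a master clock of 256 fs (12,288 MHz for 48 kHz); the function name is hypothetical.

```python
def sample_rate_hz(rotrim, rodiv=3, source_hz=96_000_000, mclk_per_fs=256):
    """Sample rate that results from a given ROTRIM setting.

    Assumes MCLK = source_hz / (2 * (rodiv + rotrim / 512)) and
    fs = MCLK / mclk_per_fs.
    """
    mclk_hz = source_hz / (2 * (rodiv + rotrim / 512))
    return mclk_hz / mclk_per_fs


if __name__ == "__main__":
    for rotrim in range(459, 469):
        print(rotrim, round(sample_rate_hz(rotrim), 1))
```

A larger ROTRIM value lengthens the divisor and lowers the sample rate, so ROTRIM = 464 gives exactly 48 kHz, while the ends of the range land close to 48,1 kHz and 47,9 kHz.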

Table 8. REFCLKO tuning example with RODIV and ROTRIM values for 48 kHz audio stream.

The tuning capability of REFCLKO prevents buffer under-run and over-run and alleviates the audible clicks, as discussed in the next section.

Example 1. RODIV and ROTRIM values for REFCLKO of 12,288 MHz.

USB clock mismatch

USB specifications require a tolerance budget and a limit on the USB clock frequency as a way to achieve immunity to radio interference. The USB clock with the allowed tolerance budget results in reduced audio quality if there is a USB clock mismatch.

The real-time streaming audio samples must arrive at precise, regular time intervals so that the DAC can convert the digital samples to an analog signal with the expected constant rate at which it is configured.

The DAC, which is clocked to expect audio samples at a particular sample rate, cannot miss even a single sample. A missing sample manifests as a subtle click for the listener, since the DAC fails to generate an accurate representation of the streamed audio signal.

On a microcontroller or microprocessor with an embedded USB module, the USB clock is sourced from an independent clock such as an on-chip PLL with an external crystal oscillator of specific value. Since the clock is not sourced from the USB interface, the mismatch in clocks introduces buffer over-run or under-run, causing audible clicks as shown in Figure 10.
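
A quick estimate shows how fast a small mismatch becomes audible. The sketch below uses hypothetical numbers (a 500 ppm mismatch and a buffer margin of 96 samples) to estimate the time until an uncorrected buffer over-runs or under-runs; neither figure comes from the article.

```python
import math


def seconds_until_xrun(sample_rate_hz, mismatch_ppm, margin_samples):
    """Time until a buffer with the given free margin over- or under-runs.

    mismatch_ppm is the relative difference between the rate at which
    samples arrive and the rate at which the DAC consumes them.
    """
    drift_per_second = sample_rate_hz * mismatch_ppm / 1e6  # samples gained or lost
    if drift_per_second == 0:
        return math.inf
    return margin_samples / abs(drift_per_second)


if __name__ == "__main__":
    # Hypothetical case: 48 kHz stream, 500 ppm mismatch, 96 samples of margin
    print(seconds_until_xrun(48_000, 500, 96))
```

With these assumed values the buffer drifts by 24 samples per second, so the margin is used up after 4 seconds unless the clock is corrected.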

Figure 10. Clock jitter DAC underrun or overrun manifesting as subtle audible clicks.

An easy way to avoid the audio-quality degradation caused by data under-run or over-run is to use a good asynchronous sample rate converter (ASRC), in which the input sample rate is estimated with jitter attenuation and the internal filters are dynamically tuned for the new sample rate. However, a good ASRC is very expensive and the system still requires a DAC for analog conversion.

As an effective low-cost solution, the USB audio packets are buffered and the clocks of the codec or DAC can be tuned to prevent under-run or over-run using a feedback mechanism. The feedback mechanism monitors the buffer level and ensures it stays within an acceptable range, while achieving at least the same quality achieved by an expensive ASRC.

The reference clock output, with the USB-PLL clock as its source, can be used to generate and tune the required master clock, as discussed in the ‘Tuning reference clock output’ section above. This capability prevents buffer under-run and over-run while keeping the DAC sample rate within an acceptable swing of about ±0,2%. This is the lowest-cost solution compared to the other two, while still achieving high-quality audio.
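
One way to picture the feedback mechanism is as a small control step that runs each time the buffer level is checked. The sketch below is an assumption about how such a loop could look, not the implementation used in the demonstration project: the watermarks at 25% and 75% and the function name are hypothetical, and the ROTRIM limits are the ones quoted for the 48 kHz example.

```python
ROTRIM_MIN, ROTRIM_MAX = 459, 468  # tuning range quoted for the 48 kHz example


def next_rotrim(rotrim, fill, capacity, low=0.25, high=0.75):
    """Return the ROTRIM value to use after inspecting the buffer level.

    fill and capacity are in samples. A smaller ROTRIM shortens the
    divisor and speeds up the DAC clock; a larger one slows it down.
    """
    level = fill / capacity
    if level > high:      # buffer filling up: DAC is consuming too slowly
        rotrim -= 1
    elif level < low:     # buffer draining: DAC is consuming too quickly
        rotrim += 1
    return max(ROTRIM_MIN, min(ROTRIM_MAX, rotrim))


if __name__ == "__main__":
    rotrim = 464
    for fill in (50, 80, 85, 60, 20):   # hypothetical fill levels out of 100
        rotrim = next_rotrim(rotrim, fill, 100)
        print(fill, rotrim)
```

Stepping by one count at a time keeps each correction to a single trim step, and clamping to the quoted range keeps the sample rate within the swing the codec is expected to tolerate.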

Audio accessory board and application demonstration

As an example, all of the features discussed in this article are implemented in the PIC32 USB Headset application project available for the PIC32 USB Digital Audio Accessory Board.


