DSP, Micros & Memory


The three Ps of value in selecting DSPs

19 April 2006 DSP, Micros & Memory

Digital signal processors (DSPs) have become the foundation of many new markets for technology in recent years, and the innovations have only begun. New forms of motor and motion control, automotive systems, entertainment systems, and a vast range of communications are all areas that have been built on the high computational performance of DSPs. Developers in these and other areas increasingly turn to DSPs for realtime signal processing, along with support rivalling that for traditional RISC microcontrollers. As a result, in a short time DSPs have gone from being regarded as esoteric toys for lab scientists to standard tools in developers' workshops.

Availability and familiarity are not everything, though. As developers increasingly turn to DSPs as a solution to their realtime system needs, they are understandably concerned about the value of what they select. The calculus of tradeoffs involved in selecting a processor is complex, and the information provided about the device does not always tell how it will perform in a specific application. From the developer's perspective, selecting a DSP usually comes down to evaluating the three Ps - raw performance, operational power consumption, and the price of the system overall. Without full information, how does the developer decide when to use a DSP and which one to use?

The continuum of metrics

Plenty of information is available, of course. DSP vendors all supply lots of metrics: megahertz, MIPS, megaMACs, milliwatts, MIPS per mW, MIPS per dollar, etc. The problem is that these numbers only provide the sketchiest indication about performance, power, and price in the end application. Even benchmarking standards such as the one from BDTI are only guidelines that exercise the kernel operations of the device in common ways. They cannot predict exactly how the device will measure up in the target system, where the DSP performs a set of operations unique to the application. For developers who are trying to evaluate a DSP versus a RISC or FPGA, the picture is complicated by the different mix of benchmarks among these types of devices. RISC metrics largely ignore the multiply-accumulate (MAC) operations essential to realtime signal processing, while with FPGAs it is gate counts that are truly significant in overall cost.

Figure 1 shows the spectrum of metrics that tell the story of DSP performance. Starting on the left are common specifications that tell how fast the device is operating (megahertz) and how many instructions it handles (MIPS). These measurements, which apply to any processor, are followed by generic DSP specifications of millions of multiply/accumulate operations (MMACs) and billions of floating-point operations per second (gigaFLOPS) that the device handles. Following the continuum to the right are the general operational benchmarks such as BDTI's BDTImark2000, then measurements of how the device handles specific algorithms that will be used for the equipment. (Since the latter are often necessary for evaluating competing algorithms as much as DSPs, the method of measurement can vary along with the device under test.) Finally, the developer creates the benchmarks and uses them to measure the application and end equipment. A similar though less involved spectrum could be shown for power, ranging from the generic mW/MHz to an application-specific measure such as channels per watt.

Figure 1: Spectrum of DSP metrics
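
As a rough illustration of how the generic figures at the left of this spectrum are derived, the short Python sketch below computes MIPS per milliwatt, MIPS per dollar and channels per watt for a purely hypothetical DSP. The clock rate, issue width, power, price and per-channel load are invented for illustration only and do not describe any real device.

    # Hypothetical DSP figures - for illustration only, not real device data
    clock_mhz = 300.0          # assumed core clock
    instr_per_cycle = 2.0      # assumed dual-issue core, so MIPS = 2 x MHz
    power_mw = 250.0           # assumed core power at this clock
    price_usd = 12.0           # assumed unit price
    mips_per_channel = 55.0    # assumed load of one hypothetical codec channel

    mips = clock_mhz * instr_per_cycle
    print("MIPS          :", mips)                         # 600.0
    print("MIPS per mW   :", round(mips / power_mw, 2))    # 2.4
    print("MIPS per $    :", round(mips / price_usd, 1))   # 50.0

    # Channels per watt depends on the per-channel load, which only the
    # end application can define
    channels = mips / mips_per_channel
    print("Channels/watt :", round(channels / (power_mw / 1000.0), 1))  # ~43.6

The first three figures are the kind a vendor can publish for any device; the last cannot be stated in a general way, because the per-channel load is set by the application rather than the processor.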

The more technology-general specifications toward the left end of the spectrum are easier for vendors to define and measure than the more application-specific benchmarks toward the right. The generic metrics are also easier to communicate and, because they are applied across a broad range of products, they tend to be used as the initial basis of product comparison. For the developer, though, generic metrics are the least meaningful; and because they are so readily available and so heavily touted, they frequently become a source of frustration that gets in the way of the search for application-specific information.

The gap between the developer's need for application-specific information and the vendor's ability to supply it is not just a matter of misunderstanding. In many cases, especially with new types of applications, only the developers themselves can create the benchmarks they need to determine the ultimate value of a given DSP in the system. The more that is known about the application base, the easier it becomes for DSP vendors to embrace these metrics and publicise them. Recognising the disparity between the information they can readily provide and what developers really need to know, DSP vendors are continually seeking more meaningful metrics to publish.

Application requirements

What about the applications themselves? How do their requirements differ in terms of the three Ps? As might be expected, value takes on a variety of meanings, depending on the needs of the system. For example, the designer of a handheld communications device is extremely concerned about its power efficiency and will look for a DSP that is designed to be power efficient, provided that it offers sufficient performance for the end application. On the other hand, the designer of the communications infrastructure equipment that complements the handheld device is less concerned about the power efficiency of a DSP than its raw execution performance - that is, given that the power dissipation is acceptable. Price is always a factor as well, though normally it is more critical with the smaller, mass market systems.

Table 1 lists a number of DSP applications, along with the relative importance (rated 1 to 3) of performance, price, and power dissipation. When performance is the main criterion of DSP selection, the applications tend to be larger, use grid power, and are likely to support multiple channels, or at least multiple tasks, simultaneously. When price is the top priority, the equipment may or may not use grid power, but it is invariably a high-volume consumer item. Finally, when power dissipation is the most important factor, the end equipment is generally personal and portable.

Table 1. Requirements for the three Ps in different DSP market segments

Difficulties of benchmarking applications

Two well-known standards used in audio (Dolby Digital) and video (H.264/MPEG-4 AVC) show how difficult it is to create a simple benchmark on application performance. Since the algorithms are well understood, it would seem that they should both provide a straightforward basis of comparison for the performance of various signal processing devices. Unfortunately, that is not necessarily the case.

Audio: First consider the options available in Dolby Digital. At present, the standard supports speaker configurations of either 2.0 (two front speaker channels for traditional stereo) or 5.1 (three front, two back, and one subwoofer for the original home theatre surround sound), and there is only one sample rate choice of 48 kHz. So comparison among DSPs should be simple enough.

But how do the options of Dolby Digital today translate into future needs? Audio is quickly becoming more complex. More channels are being added for side and back speakers, so that there are 2.1, 3.0, 3.1, 4.0, 6.0, 7.1, 9.1, and 10.2 configurations, in addition to the 2.0 and 5.1 options. In addition, the sample rate choices have multiplied to include 32, 44.1, 64, 88.2, 96, 128, 176.4, and 192 kHz, as well as 48 kHz. With all these choices, which configurations and which sample rates will be representative of the market in the future? Which ones should DSP vendors benchmark?
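
To get a feel for how quickly the benchmark matrix grows, the Python sketch below simply enumerates the speaker configurations and sample rates listed above. The assumption that every configuration is meaningful at every sample rate is made purely for illustration.

    # Speaker configurations and sample rates mentioned in the text
    configs = ["2.0", "2.1", "3.0", "3.1", "4.0", "5.1",
               "6.0", "7.1", "9.1", "10.2"]
    rates_khz = [32, 44.1, 48, 64, 88.2, 96, 128, 176.4, 192]

    # Assume (for illustration) that every pairing is a valid benchmark case
    cases = [(c, r) for c in configs for r in rates_khz]
    print(len(configs), "configurations x", len(rates_khz), "sample rates =",
          len(cases), "benchmark cases")    # 10 x 9 = 90

Ninety cases for a single audio coder, before DTS or AAC are even considered, is clearly more than any vendor can benchmark exhaustively.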

Even if all the possible combinations of these variables could be taken into account, the question remains whether Dolby Digital is representative of other audio coders like DTS and AAC, which have their own array of options. The simple conclusion is that even a well-established industry standard like Dolby Digital does not always give a good indication of how a DSP will perform in an end application.

Video: Audio is relatively simple when compared to video and imaging. Video has many standards, including MPEG-1, MPEG-2, MPEG-4 and the newest, H.264, which was jointly endorsed in 2003 by the ITU and ISO. With its breadth of flexibility and capability, H.264 will make matters both easier for the industry and, at the same time, even more confusing.

But rather than focus on this latest standard, let us consider the previous one, MPEG-4, to reduce the complexity of the discussion. Let us start with a simple overview of four different video applications.

The applications are: DVD player; DVD recorder; video phone; security camera.

The DVD player is simply a playback machine for video and, in some cases, music. In this application the decoder must be able to handle multiple scene changes in the media, with excellent picture quality, and with multiple audio formats, all at a low data rate.

The DVD recorder must be able to handle the same requirements as the DVD player, with the addition of being able to encode a video and audio stream as well as decode it. It might also need, at some point in the future, to transcode, a capability that would require the product to encode and decode simultaneously.

The video-phone differs from the DVD recorder/player in several respects. One significant difference is its need to minimise latency. Another is its lesser demand on video and audio quality: rather than D1 or high-definition (HD) video, it can use a format such as CIF or QCIF. Finally, a video-phone does not have to handle the rapid scene changes that a movie requires.

Finally, the security system has two aspects. The first is the camera, which is an encode-only device. The second is the infrastructure, which may have multiple cameras connected to it. In both cases the video quality requirements are less stringent than for movies in several ways: smaller image sizes, few or no scene changes, and lower frame rates. The system might even use JPEG rather than MPEG-4/H.264. Table 2 summarises the differences between these four applications.

Table 2. A summary of four applications of video compression technology

For the four application examples summarised in Table 2, it is important to note the significant difference between movies and video-phones: movies contain rapid scene changes that will not be found in the video-phone application, so the performance requirements for the video-phone are much lower than for movies. High compression for low bit-rate transmission may produce an acceptable image for video-conferencing, but a lower compression ratio with higher transmission bandwidth is usually necessary for entertainment video. Clearly, a standard that covers both video-conferencing and entertainment video needs to offer this flexibility.

The H.264/MPEG-4 AVC standard provides this flexibility with support for three profiles: baseline, main and extended. The baseline profile requires the least computation and system memory and is optimised for low latency. It does not include B (bi-directionally predicted) frames, because of the latency they introduce, or CABAC (context-adaptive binary arithmetic coding), because of its computational complexity. The baseline profile is a good match for video telephony as well as other applications that require cost-effective realtime encoding. The main profile provides the highest compression but requires significantly more processing than the baseline profile, making it less suitable for low-cost realtime encoding and for low-latency applications. Broadcast and content storage applications are primarily interested in the main profile to leverage the highest possible video quality at the lowest bit rate. The extended profile, with support for additional features such as graphic elements, has so far generated less interest in the industry.
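
The profile guidance above can be restated as a few lines of decision logic, as in the Python sketch below. The function name and flags are invented for illustration; this is only a summary of the reasoning in the text, not part of the standard.

    def suggest_h264_profile(low_latency, limited_encode_budget,
                             want_best_compression):
        # Rough restatement of the profile guidance above - illustrative only
        if low_latency or limited_encode_budget:
            # Baseline: no B frames, no CABAC; lowest computation and memory
            return "baseline"
        if want_best_compression:
            # Main: highest compression, but significantly more processing
            return "main"
        # Extended adds features such as graphic elements; little uptake so far
        return "extended"

    print(suggest_h264_profile(True, True, False))    # video telephony -> baseline
    print(suggest_h264_profile(False, False, True))   # broadcast/storage -> main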

As the examples show, there are other factors to consider about the application, too. Will the image be full-screen D1, quarter-screen CIF, one-sixteenth screen QCIF, or something else? And will these resolutions correspond to NTSC, PAL, or another standard? Is the image high-definition? Are there 25 or 30 frames per second? Will the scanning be interlaced or progressive? Is a pixel defined by eight bits per colour or more? Will the system encode only, decode only, or encode and decode? Will it include audio processing? If so, which audio standards and options apply?
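
These questions translate directly into raw data rates. The Python sketch below works through the arithmetic for a few of the resolutions mentioned, assuming 4:2:0 chroma sampling with 8 bits per sample (12 bits per pixel) and 30 frames per second; these assumptions are illustrative, and other choices simply scale the numbers.

    # Resolutions mentioned in the text (pixels)
    formats = {
        "D1 (NTSC)": (720, 480),
        "CIF":       (352, 288),
        "QCIF":      (176, 144),
    }
    bits_per_pixel = 12    # 8-bit luma plus subsampled 4:2:0 chroma (assumed)
    fps = 30               # assumed frame rate

    for name, (w, h) in formats.items():
        raw_mbps = w * h * bits_per_pixel * fps / 1e6
        print(name, "uncompressed is roughly", round(raw_mbps, 1), "Mbit/s")
    # D1 (NTSC) ~ 124.4 Mbit/s, CIF ~ 36.5 Mbit/s, QCIF ~ 9.1 Mbit/s

Even a QCIF stream runs to roughly 9 Mbit/s before compression, which is why the compression ratio matters as much as raw processor throughput.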

Table 3 puts these issues in a larger context by showing the full-frame throughput for different compression standards on different networks. These maximum theoretical frame rates for transmitting generic VHS-quality digital video data (352 x 240 pixels) are based on MPEG-4 and H.264. Successive JPEG frames, often called Motion JPEG and frequently used for security networks, are also shown for comparison. Ultimately, the degree of compression required for an application will depend not only on the transmission bandwidth and computational performance available, but also on the quality of image desired. Since any given video platform may be used in more than one application, DSP performance metrics for several of these potential end uses of a system may be valuable to the developer.

Table 3. Frame rates for network bit rates and compression ratio combinations
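
The frame rates in Table 3 follow from a simple relationship: the maximum theoretical rate is the network bit rate divided by the compressed frame size. The Python sketch below works an example through for the 352 x 240 frame used in the table, again assuming 4:2:0 sampling at 8 bits per sample; the link rate and compression ratios chosen are arbitrary illustrations rather than figures from the table.

    def max_fps(link_bps, width, height, compression_ratio, bits_per_pixel=12):
        # Theoretical maximum frame rate: link rate / compressed frame size
        raw_bits_per_frame = width * height * bits_per_pixel
        compressed_bits = raw_bits_per_frame / compression_ratio
        return link_bps / compressed_bits

    # 352 x 240 frame on a 384 kbit/s link (illustrative values)
    print(round(max_fps(384000, 352, 240, 50), 1))   # 50:1 compression -> ~18.9 fps
    print(round(max_fps(384000, 352, 240, 20), 1))   # 20:1 compression -> ~7.6 fps

The relationship is linear: less compression means proportionally fewer frames per second on a given link, which is exactly the trade-off between bandwidth, compression and image quality described above.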

Obviously, system developers would like to have as much information as possible about how DSPs and other processing elements can perform H.264/MPEG-4 AVC compression and decompression, but the very flexibility that gives the new standard its improved performance is also what makes it difficult to benchmark.

Table 4 provides some of these metrics for a widely used DSP, the Texas Instruments TMS320DM64x, operating at 600 MHz. The percentage of the processor cycles used for performing H.264 and MPEG-4 is shown, along with JPEG, MPEG-2, and Windows Media Video 9 for comparison. Note that these benchmarks are based on typical test data for existing implementations or detailed performance estimates. Encoder implementations can also vary significantly depending on the feature set invoked. In other words, this type of data, while well researched, is only a guideline for the developer and not definitive for the end application.

Table 4. Percentage of DM64x DSP cycles required at 600 MHz
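
Figures like those in Table 4 are most useful once they are converted into a cycle budget. The Python sketch below shows that arithmetic for a 600 MHz device using invented loading percentages; they are placeholders for illustration, not the values from Table 4.

    clock_mhz = 600.0

    # Invented example loads (NOT the Table 4 figures): fraction of cycles per codec
    loads = {"codec A decode": 0.25, "codec B encode": 0.55}

    for task, fraction in loads.items():
        used_mhz = clock_mhz * fraction
        headroom = clock_mhz - used_mhz
        print(task, "uses about", int(used_mhz), "MHz, leaving", int(headroom),
              "MHz for audio, networking and control code")
    # codec A decode: 150 MHz used, 450 MHz left
    # codec B encode: 330 MHz used, 270 MHz left

Whether the remaining headroom is sufficient depends, once again, on the application-specific tasks that must share the device.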

Targeted DSPs

In the last few years, DSP design has come a long way toward meeting the needs of different application areas. Since the mid-'90s, DSP vendors have developed specialised architectures designed for different mixes of the three Ps. Many of the metrics for these architectures, though still not application-specific, have become inherently closer to what developers want to see projected about the DSPs in end use.

Some architectures, designed for handheld applications such as cellphones and PDAs, are focused on keeping power consumption extremely low while performance and price stay reasonable. Others are based on a very-long-instruction-word (VLIW) data path to achieve massive parallelism, enabling extremely high performance while power consumption per channel and price per MMAC are reasonable. These VLIW DSPs benefit multichannel systems such as wireless base stations, video servers, routers, DSL and other telecom concentration units, and so forth. Still others focus on system price while offering reasonable performance and power by integrating the specific memory configurations and sets of peripherals needed for motors, uninterruptible power supplies and other embedded control applications. Commonly known DSPs that serve as examples of these three respective architectures are the TMS320C55x, C64x, and C28x DSP families from Texas Instruments.

Having some background about the intent of the architecture can be useful, since performance metrics are seldom as specifically targeted as developers would like. The choice of signal-processing engine must also include factors such as whether the DSP vendor has expertise in the application area, what kind of support is offered, availability, and so forth. DSP vendors work hard to make the technical information available and as relevant to developers' needs as possible. But in an industry that changes so quickly, as soon as performance metrics can be nailed down, they are often out of date. In the end, no matter how much technical data is available, choosing a solution depends to some degree on the developer's subjective judgment concerning how the device balances the three Ps of value.




