The embedded design world has changed markedly over the last decade, and the progress shows no sign of slowing. Multicore processing – in the form of both symmetric multiprocessing (SMP) and asymmetric multiprocessing (AMP) – is becoming commonplace, with embedded multicore CPU revenue expected to grow sixfold from 2007 to 2011, according to VDC.
In addition, field programmable gate arrays (FPGAs) have grown in capability and come down in cost, providing high-speed functionality that could once only be achieved with application specific integrated circuits (ASICs). Finally, virtualisation is blurring the line between hardware and software by enabling multiple operating systems to run on a single processor. With these technologies evolving so rapidly, how can embedded developers possibly keep up? This article explains what they mean for embedded designs, and how developers can take advantage of them now while keeping development time to a minimum.
Multicore processing represents an enormous shift in embedded design. With only one processor core per chip, embedded designers have traditionally been able to use sequential programming languages such as C even for the most complex of applications. The presence of multiple processing cores on one physical chip, however, complicates the design process considerably.
Because most commercial compilers cannot automatically identify which sections of code can run in parallel, embedded designers looking to take advantage of multicore processors must use parallel programming APIs that add overhead to code and are difficult to debug. In addition, sequential programs make parallel routines very difficult to visualise, creating a big problem for designers inheriting legacy code (or struggling with their own complex applications). If today's parallel programming is difficult for designers, how will they fare when challenged with the next generation of processors with 16 or more cores?
The most obvious solution to this challenge is using better programming tools and methods to abstract away the complexity of multicore hardware. While APIs such as OpenMP and POSIX threads have become commonplace in parallel applications, newer APIs such as the Multicore Communications API (MCAPI) promise to be more scalable and to support a wide variety of parallel hardware architectures (both SMP and AMP). In addition, new tool suites such as Intel Parallel Studio aim to provide better debugging tools than previously available.
Finally, graphical dataflow languages such as NI LabVIEW provide an inherently parallel programming model for SMP that can greatly reduce time to market. The question is, why program serially when the application is supposed to run in parallel? By automatically analysing parallel sections of code and mapping those sections onto multiple threads, dataflow languages allow designers to focus on their main task: developing code quickly and concisely.
In a typical embedded software design process, a large application starts as a flow chart, and individual pieces of that chart are then translated into code and implemented. Dataflow programming skips a step: code can be implemented in parallel as laid out on the flow chart, without translation into a sequential language. In this way, investing in parallel programming tools (including new APIs and IDEs that support dataflow languages) will help users make the most of advances in multicore technology for their embedded designs.
Next, FPGAs have changed the way that high-speed and massively parallel embedded designs are implemented, and will no doubt continue to evolve in the future. In the past, implementing custom signal processing routines such as digital filtering in hardware meant designing an ASIC at significant initial expense. While this may have been cost-effective for high-volume applications, low-volume embedded designs were forced to use a combination of existing ASICs, or to run signal processing code in software on a considerably slower processor. FPGAs have been a game changer. Now, embedded designers can simply download custom signal processing applications to an FPGA and run them in hardware, at a cost of only tens or hundreds of Rands. In addition, because FPGAs implement embedded applications in hardware, they are by nature massively parallel.
One major challenge embedded developers face is the difference in design tools used to program FPGAs and microprocessors. While many developers are comfortable writing high-level C code (at least for sequential microprocessor applications), FPGA programming is typically done in a hardware description language (HDL) such as VHDL. This fundamental gap in tools and skills can add a major hurdle to the development cycle, especially when FPGAs and processors are both used in a single design.
To solve this problem, a number of tools have been developed to translate C applications into HDL code (such as Impulse CoDeveloper), which enable you to specify applications at a high level and then target those applications to FPGAs. In addition, graphical dataflow languages such as LabVIEW allow users to develop for FPGAs without specific HDL knowledge. Because dataflow provides an inherently parallel approach to programming, it also allows users to take advantage of the massively parallel nature of FPGAs automatically. The message here is simple: using high-level FPGA design strategies (such as dataflow languages and C to HDL translators) can maximise the efficiency of a design team and reduce time to market.
Finally, one of the most recent technologies to enter the embedded scene is virtualisation. The main idea behind this technology is to make better use of processing hardware by abstracting away the details of the specific hardware platform from operating systems and applications. Specifically, one way to use virtualisation in embedded designs is to install a piece of software called a hypervisor, which allows multiple operating systems to run simultaneously on the same hardware. This has positive implications for both the overall capability of an embedded system and its use of multicore hardware. In a system with multiple homogeneous processor cores, a hypervisor makes it easy to construct an AMP software architecture where individual operating systems are assigned one or more cores. At a high level, virtualisation technology can be thought of as making multicore hardware multitalented.
Though designers often still program entire embedded systems from the ground up, pressure to reduce development time (and therefore cost) has led to higher usage of operating systems in the embedded domain. This, however, presents a problem: how do engineers balance the need for the services and UI provided by a commercial OS with the realtime performance needed for an embedded application? Imagine, for example, designing a medical imaging machine. How can one take advantage of the built-in UI capabilities of an OS such as Linux while processing imaging data in realtime? A hypervisor can meet these challenges. By running a feature-rich commercial OS and a realtime OS in parallel, development time for the embedded application can be reduced while maintaining determinism.
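Conceptually, the medical imaging example above amounts to a static partitioning of the chip's cores and memory between the two guest operating systems. The fragment below is a purely hypothetical partition file, with invented names and syntax rather than any real product's format, showing the kind of split a hypervisor makes possible on a quad-core part:

```
# Hypothetical hypervisor partition table (illustrative only --
# section names, keys and values are invented for this example)

[partition.ui]
guest    = linux      # feature-rich OS for UI and services
cores    = 0,1        # two of the four cores
memory   = 512M

[partition.control]
guest    = rtos       # realtime OS for deterministic processing
cores    = 2,3        # dedicated cores keep latency predictable
memory   = 256M
priority = high       # the realtime partition wins contended resources
```

Because each guest owns its cores outright, the realtime partition's determinism is preserved no matter how heavily the UI partition is loaded.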
In conclusion, though trends in embedded technology such as multicore processing, FPGAs and virtualisation represent a big departure from traditional development techniques, there are clear steps designers can take to harness them and stay competitive. The first is to adopt programming tools that abstract away hardware features such as multiple processing cores or FPGA gates. By concentrating on implementing a design, rather than spending time making adjustments for the underlying hardware architecture, designers can bring embedded products to market faster.
For more information contact National Instruments, 0800 203 199, [email protected], www.ni.com/southafrica
© Technews Publishing (Pty) Ltd | All Rights Reserved