Programming Network Processors Made Simple

The telecommunications industry’s continuous drive for higher performance has spurred innovations in processor architectures. The general trend has been to go parallel: adding more cores to a single processor device and then dividing tasks between them. This has resulted in a more complex environment for software engineers to master. But does this mean that programming next-generation network processors (NPUs) has to be difficult? Not necessarily.

Figure 1. Programmers of complex NPUs tend to spend most of their time in the process of test and code re-write to achieve sufficient performance.
Some of the complexities that emerge from the parallel concepts in modern processors were recently addressed by Sebastien Maury and Dr. Peter Robertson in Embedded Technology on March 1, 2010 [1]. Among their conclusions are that processors should be kept simple and that shared memory should generally be avoided because of the complexity of memory consistency and cache coherence. We agree with this view when it comes to special-purpose processors. In this article, we will provide an example of an architecture that has taken a different approach to parallelism, one which keeps the programming model of the uni-processor intact and utilizes resources very effectively.

When the Task is Hard

The processing demands on modern NPUs are very high. In an application designed for 100 Gbps processing, the NPU must be able to handle roughly 150 million packets per second in the worst case of minimum-size packets. In such an application, thousands of packets are typically being processed concurrently by the device. The amount of parallelism is extreme compared to any other application in the IT industry. Another unique attribute of NPUs is the demand for extremely high table memory lookup rates.
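
The 150 million figure can be checked with a short calculation. The sketch below is illustrative only and rests on assumptions the article does not state: Ethernet framing with 64-byte minimum frames plus 8 bytes of preamble and 12 bytes of inter-frame gap, and a purely hypothetical 20-microsecond residence time per packet, used to show how "thousands of packets" in flight can arise.

```python
# Worst-case packet rate on a link filled with minimum-size Ethernet frames.
# Assumptions (not from the article): 64-byte minimum frame, 8 bytes of
# preamble/start-of-frame delimiter and 12 bytes of inter-frame gap.

MIN_FRAME_BYTES = 64
PREAMBLE_BYTES = 8
INTERFRAME_GAP_BYTES = 12


def max_packets_per_second(line_rate_bps: float,
                           frame_bytes: int = MIN_FRAME_BYTES) -> float:
    """Packets per second when the link carries only frames of frame_bytes."""
    bits_on_wire = (frame_bytes + PREAMBLE_BYTES + INTERFRAME_GAP_BYTES) * 8
    return line_rate_bps / bits_on_wire


def time_budget_ns(line_rate_bps: float,
                   frame_bytes: int = MIN_FRAME_BYTES) -> float:
    """Average time between packet arrivals, in nanoseconds."""
    return 1e9 / max_packets_per_second(line_rate_bps, frame_bytes)


def packets_in_flight(packets_per_second: float, residence_time_s: float) -> float:
    """Little's law: average packets in the device = arrival rate * time in device."""
    return packets_per_second * residence_time_s


if __name__ == "__main__":
    pps = max_packets_per_second(100e9)
    print(f"{pps / 1e6:.1f} Mpps")                       # 148.8 Mpps
    print(f"{time_budget_ns(100e9):.2f} ns per packet")  # 6.72 ns per packet
    # Hypothetical residence time of 20 microseconds per packet:
    print(f"{packets_in_flight(pps, 20e-6):.0f} packets in flight")
```

With a new packet arriving every few nanoseconds and each packet needing far longer than that to process, the only way to keep up is to work on many packets at once.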

In packet processing, every network service (user and control/OAM traffic) requires a unique set of operations per packet (classification, filtering, counting, metering, policing/shaping and forwarding). A network service may require hundreds or even thousands of operations per packet before the packet is eventually forwarded to an outgoing interface or to the control CPU.
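
To make the list of per-packet operations concrete, here is a minimal sketch of one service written as a single sequential routine, in the uni-processor style the article argues for. Everything in it is hypothetical and chosen for illustration: the `Packet`, `TokenBucket` and `Service` names, the flow key, the token-bucket meter, and the rule that unknown destinations go to the control CPU. It is not the programming model of any particular NPU.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set, Tuple


@dataclass
class Packet:
    src: str
    dst: str
    length: int  # bytes


class TokenBucket:
    """Single-rate meter; tokens are bytes, refilled continuously over time."""

    def __init__(self, rate_bytes_per_s: float, burst_bytes: float):
        self.rate = rate_bytes_per_s
        self.burst = burst_bytes
        self.tokens = burst_bytes
        self.last_time = 0.0

    def conforms(self, length: int, now: float) -> bool:
        # Refill for the elapsed time, capped at the burst size.
        elapsed = now - self.last_time
        self.last_time = now
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        if self.tokens >= length:
            self.tokens -= length
            return True
        return False


class Service:
    """One network service expressed as a straight-line per-packet program."""

    def __init__(self, blocked_sources: Set[str],
                 forwarding_table: Dict[str, str], meter: TokenBucket):
        self.blocked_sources = blocked_sources
        self.forwarding_table = forwarding_table
        self.meter = meter
        self.byte_counters: Dict[Tuple[str, str], int] = {}

    def process(self, pkt: Packet, now: float) -> Optional[str]:
        """Return the output for the packet, or None if it is dropped."""
        flow = (pkt.src, pkt.dst)                      # classification
        if pkt.src in self.blocked_sources:            # filtering
            return None
        self.byte_counters[flow] = (                   # counting
            self.byte_counters.get(flow, 0) + pkt.length)
        if not self.meter.conforms(pkt.length, now):   # metering and policing
            return None
        # Forwarding; unknown destinations go to the control CPU (assumption).
        return self.forwarding_table.get(pkt.dst, "control-cpu")


if __name__ == "__main__":
    svc = Service({"10.0.0.9"}, {"10.0.1.1": "port-1"}, TokenBucket(1000, 1500))
    print(svc.process(Packet("10.0.0.1", "10.0.1.1", 1000), now=0.0))
    print(svc.process(Packet("10.0.0.9", "10.0.1.1", 100), now=0.0))
```

A real service would chain many more table lookups and updates than this, which is where the hundreds or thousands of operations per packet come from.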

Given the networking industry’s unique set of performance requirements, next-generation NPUs are designed to solve very specialized problems. They don’t compete with general-purpose CPUs, but offer a programmable alternative to fixed-function ASICs developed in-house.

The Shift Towards NPUs

There is currently a shift toward merchant silicon in high-end networking, mainly driven by the ability to shorten time-to-market and to focus research and development (R&D) expenses on differentiation through software rather than through riskier ASIC designs.

As ASIC designs are shifted out in favor of NPUs, some R&D managers raise concerns regarding the complexity of these devices. Are they difficult to program? Can performance and an intuitive uni-processor programming model be combined?

In 2004, Larry Huston of Intel (at the time, Intel was a main player in the NPU space) ended a paper with a statement that carries as much meaning today as it did six years ago [2]:

“The ideal scenario would have a programmer write an application as a single piece of software and the tools would automatically partition and map the application to the set of parallel resources. This may be a difficult goal, but any steps in that direction will improve the life of a developer.”


