Multicores Affect Algorithm Choices

Dec 01 2006
Porting and Accuracy

If your computing environment is a cluster of single- or dual-core systems interconnected with a high-bandwidth network, the principal option is to look for relevant component libraries that use message passing to achieve performance that scales with the number of systems in the cluster. Alternatively, you can use a message-passing tool such as MPICH and redesign the most computationally intensive parts of the code to scale with the number of processors. This approach may provide the largest performance improvement, but it is also the most complex and will not be effective in some applications.

Eventually, we may see tools that permit hybrid use of shared-memory parallelism (SMP) and distributed-memory parallelism (DMP), but today such complexity is practical only for a few. (The acronym SMP, which previously stood for symmetric multiprocessing, is now often used to mean shared-memory parallel computing.)

Whatever your approach to harnessing the horsepower of new SMP and/or DMP systems, there is a high probability that the applications you use today will be ported one or more times to various combinations of new chip architectures, operating systems, compilers, and vendor performance libraries. As an organization, the Numerical Algorithms Group (NAG) has been developing and porting component libraries to new platforms for over 35 years and can say with confidence that porting code is rarely simple. Even when the application logic is left untouched, a change of chip, operating system, compiler, or library can introduce errors into the ported application.

Whether your code is commercial, open-source, or internally developed, you should realistically assess the challenge of designing it for long life. It’s not enough to have fast software.
Speed without sufficient accuracy is a “non-starter,” or as NAG developers often put it, “How fast do you want the wrong answer?”

Designing for SMP

While there are a number of things you can do to extend the life of an application, the following are a few that are particularly appropriate to computationally intensive HPC applications.

For the computational “core” of the application, pay extra attention to the underlying algorithms used in code you write or acquire from others. Often more than one algorithm can solve a particular problem, and some will be faster than others. However, a method that is computationally faster might not be very robust when it meets extreme data or ill-posed problems. A method that breaks easily is likely to frustrate users and may cause more trouble when the application is ported to a new platform. The best advice is to err on the side of robustness, since new machines will continue to make code faster.
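
As a small illustration of the speed-versus-robustness trade-off described above, the following Python sketch compares two one-pass ways of computing a sample variance. This example is not taken from the article or from any NAG library. The function names and the test data are hypothetical, and Welford's update is used here only as one well-known robust alternative to the shorter "textbook" formula.

```python
def naive_variance(xs):
    """One-pass 'textbook' formula: (sum(x^2) - (sum x)^2 / n) / (n - 1).

    Cheap per element, but it subtracts two huge, nearly equal numbers,
    so it can lose every significant digit when the mean is large.
    """
    n = 0
    total = 0.0
    total_sq = 0.0
    for x in xs:
        n += 1
        total += x
        total_sq += x * x
    return (total_sq - total * total / n) / (n - 1)


def welford_variance(xs):
    """Welford's one-pass update.

    Costs an extra division per element, but it accumulates squared
    deviations from a running mean and so avoids the cancellation above.
    """
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the mean
    for x in xs:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    return m2 / (n - 1)


# Hypothetical "extreme" data: small spread around a very large mean.
# The exact sample variance of these four values is 30.
data = [1e9 + 4.0, 1e9 + 7.0, 1e9 + 13.0, 1e9 + 16.0]

print("naive:  ", naive_variance(data))
print("welford:", welford_variance(data))  # 30.0 in IEEE 754 double precision
```

On this input the naive formula cannot return 30 in double precision, because the squared terms are already rounded to multiples of 128 before they are subtracted. The cheaper method is perfectly adequate for well-scaled data, which is why this kind of fragility tends to surface only later, on unusual inputs or after a port.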






