A typical engineering design office is equipped with several PCs of varying processor power, available memory and disk capacity, usually interconnected by fast Ethernet. Their combined computing performance is relatively high with respect to the demands of the design process (CAD, structural analysis, etc.). This makes such offices well equipped for high-performance computing, provided that suitable parallel software is available.
Parallel computing is nowadays considered a very efficient tool for overcoming the bottlenecks of traditional serial computing. These bottlenecks relate both to a lack of resources (memory, disk space, etc.) and to long computational times. A typical parallel application decreases the demands on memory and other resources by spreading the task over several mutually interconnected computers, and speeds up the response of the application by distributing the computation among individual processors. Note, however, that parallel computing is also worthwhile for applications that require almost no resources but consume an excessive amount of time, and for applications that cannot be run on a single (even well-equipped) computer regardless of the computational time. It is important to realize that, from an engineering point of view, the scalability of the algorithm is not the only criterion by which to judge the efficiency of a parallel application. In many cases, the ability to analyze extremely large problems not solvable on an individual machine is of primary interest.
There are two basic types of parallel applications - single instruction multiple data (SIMD) and multiple instruction multiple data (MIMD). In the former, the same code works on different (and distributed) data (typically parallelization using High Performance Fortran or computation on a shared-memory multiprocessor, but also message passing environments), while the latter is based on several different codes working on different data (typically message passing applications). Although the MIMD model is more general, the SIMD model is sufficient for a wide class of applications.
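The SIMD style described above can be sketched in a few lines: the same code acts on different pieces of the distributed data, and a reduction combines the partial results. The sketch below simulates the "processes" with a plain loop so that it stays self-contained; in a real run each chunk would live on a separate processor and the reduction would be a message passing operation.

```python
# Sketch of the SIMD idea: one program, many data chunks.
# The per-process loop is simulated serially here; on a cluster each
# chunk would be owned by a separate process (e.g. one MPI rank).

def local_work(chunk):
    """The single piece of code every 'process' executes on its own data."""
    return sum(x * x for x in chunk)

def simd_sum_of_squares(data, n_procs):
    # Partition the global data among the processes (cyclic distribution).
    chunks = [data[i::n_procs] for i in range(n_procs)]
    # Each process runs the identical code on its local chunk ...
    partials = [local_work(c) for c in chunks]
    # ... and a reduction combines the partial results
    # (the role played by MPI_Reduce in a message passing run).
    return sum(partials)

print(simd_sum_of_squares(list(range(10)), 4))  # same answer as serial: 285
```

The result is independent of the number of "processes", which is exactly what makes the model attractive: the distribution of the data changes, the code does not.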
The solution of complex, sophisticated problems that model various phenomena with sufficiently high accuracy and in reasonable time makes parallel processing attractive for a large family of applications, including structural analysis. However, it is important to realize that most traditional algorithms are inherently unsuitable for parallelization, having been developed for sequential processing. The most natural route to parallelization is the decomposition of the problem being solved in time or space. The individual domains are then mapped onto individual processors and solved separately, with appropriate communication between the domains ensuring the proper response of the whole system. An efficient parallel algorithm requires that the work performed on the individual domains be balanced among the processors while the interprocessor communication (the typical bottleneck of parallel computation) is kept to a minimum.
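The spatial decomposition idea can be illustrated on a toy problem. The hypothetical sketch below splits a one-dimensional Laplace problem (u'' = 0 with u(0) = 0, u(1) = 1) into two subdomains; each subdomain updates only its own unknowns by Jacobi iteration, and the only data the subdomains need from each other are the values adjacent to the interface. On a cluster that exchange would be the interprocessor communication; here it is simulated serially.

```python
# Serial simulation of spatial domain decomposition for a 1-D Laplace
# problem. Two subdomains each relax their own unknowns; only the
# values next to the interface couple them -- on a real cluster this
# coupling would be a small message between two processors.

def decomposed_jacobi(n=21, iters=2000):
    u = [0.0] * n
    u[-1] = 1.0                      # boundary conditions: u(0)=0, u(1)=1
    half = n // 2
    # Interior unknowns owned by each of the two subdomains.
    domains = [range(1, half), range(half, n - 1)]
    for _ in range(iters):
        new = u[:]
        for owned in domains:        # each subdomain updates independently;
            for i in owned:          # neighbour values at the interface play
                new[i] = 0.5 * (u[i - 1] + u[i + 1])  # the role of ghost data
        u = new                      # "communication": interface values shared
    return u

u = decomposed_jacobi()
# The converged solution is the straight line u(x) = x,
# so the midpoint value approaches 0.5.
```

The two inner loops are completely independent within each iteration, which is what makes the mapping of one subdomain per processor natural; only the interface values must be exchanged between iterations, so the communication volume stays small relative to the computation.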
Over the last decade, parallel computation has become quite feasible due to the following three developments. Firstly, many new algorithms suitable for parallel processing have been developed, including efficient algorithms for domain decomposition. Secondly, parallel computation has ceased to be limited to parallel supercomputers (equipped with high technology at an even higher price) and can be performed on ordinary computers interconnected by a network into a computer cluster. Such a parallel cluster can even outperform supercomputers (such as the IBM SP2 or SGI Origin) while keeping the investment and maintenance costs substantially lower. And thirdly, several message passing libraries (typically MPI and PVM), portable to various hardware and operating system platforms, have been developed, which allow parallel applications to be ported to almost any platform (including multiplatform parallel computing clusters).
The development of parallel computers can be divided into three phases:
1980s - special and very expensive parallel computers (e.g. CRAY)
1990s - parallel computers with standard processor technology (e.g. the IBM SP2) but with special, expensive interprocessor communication
New millennium - parallel clusters based on standard processors with fast standard communication technology (e.g. the INTEL IA64 architecture)