The current implementation uses a master-slave parallel computing scheme (Fig. 1). In the initial phase, the input data are read, parsed and checked by the master processor. The resulting model is then broadcast to all slave processors. For each relevant model entity, its approximate load level is calculated on the first available slave processor. The total load level, gathered from the individual relevant model entities, is then broadcast to the slaves. After that, the relevant model entities are split appropriately into domains, again using the first available slave processor. The completed domain decomposition, collected by the master together with the work load of the individual domains, is then broadcast to the slave processors. In the following phase, the dynamic load balancing mechanism is applied. The not yet assigned domain with the largest estimated work load is assigned to the first available slave processor, and the parametric tree of that domain is built on this processor. After all domains have been processed, the complete domain-to-processor assignment is broadcast to all slaves. In the next step, the boundary tree exchange is performed in a loop on all slaves. For each domain assigned to a given slave processor, the boundary tree structures are extracted and sent to the appropriate domains (slaves), which update their basic tree structure accordingly and, if necessary, invoke further boundary tree exchange. The role of the master processor in this process is to ``listen'' to the exchange messages in order to detect the completion of the process. Once the parametric tree structures have been updated with respect to all boundary tree structures, the mesh generation starts on the slave processors, followed immediately by mesh smoothing. The master processor is only notified of the numbers of generated elements and nodes in each domain and on its boundary. These numbers are used to set up the final numbering ranges, which are broadcast to all slaves.
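The set-up of the numbering ranges can be illustrated by a minimal sketch. It assumes that the master holds a single count per domain (for either nodes or elements) and that every domain receives one contiguous block of global numbers; the treatment of nodes shared on domain boundaries is deliberately left out. The names `numbering_ranges`, `renumber` and `first_id` are hypothetical and are not taken from the described implementation.

```python
from itertools import accumulate


def numbering_ranges(counts, first_id=1):
    """Return one half-open range (start, stop) of global numbers per domain.

    counts[i] is the number of entities (nodes or elements) reported for
    domain i; first_id is the assumed first global number.
    """
    # Exclusive prefix sum: the offset of a domain is the total count
    # of all domains that precede it.
    offsets = [0] + list(accumulate(counts))
    return [
        (first_id + offsets[i], first_id + offsets[i + 1])
        for i in range(len(counts))
    ]


def renumber(local_ids, start):
    """Map 0-based local ids of one domain into its global range."""
    return [start + i for i in local_ids]


if __name__ == "__main__":
    ranges = numbering_ranges([3, 0, 2])
    print(ranges)
    # Renumber the two entities of the last domain using its start value.
    print(renumber([0, 1], ranges[2][0]))
```

Under these assumptions only the start of each range needs to be known by the slave that owns the domain, so the renumbering itself requires no further communication.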
Each slave processor then performs the renumbering of the domains assigned to it. In the final phase, the output data are written to the local device of each processor. The domain decomposition is output on the master processor, while the mesh data are stored on the slave processors.
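The dynamic load balancing phase described above (largest remaining domain goes to the first available slave) can be sketched as a sequential simulation. This is only an illustration under stated assumptions: the time a slave needs to build a parametric tree is taken to be proportional to the estimated work load, whereas in the actual parallel run availability is determined at run time. The function name `assign_domains` and the example work loads are hypothetical.

```python
import heapq


def assign_domains(work_loads, n_slaves):
    """Simulate assigning domains to slaves, largest work load first.

    work_loads maps a domain name to its estimated work load.
    Returns a dict mapping each domain to a slave index.
    """
    # Min-heap of (time at which the slave becomes free, slave index);
    # its top entry is the "first available" slave.
    free_at = [(0.0, s) for s in range(n_slaves)]
    heapq.heapify(free_at)

    assignment = {}
    # Process the not yet assigned domains in order of decreasing load.
    for domain in sorted(work_loads, key=work_loads.get, reverse=True):
        t, slave = heapq.heappop(free_at)
        assignment[domain] = slave
        # Assumption: the slave stays busy for a time equal to the load.
        heapq.heappush(free_at, (t + work_loads[domain], slave))
    return assignment


if __name__ == "__main__":
    loads = {"A": 5, "B": 3, "C": 8, "D": 2}
    print(assign_domains(loads, n_slaves=2))
```

Handing out the largest domains first tends to leave the small ones for the end, where they can fill the remaining idle time of the slaves.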