Performance simulation of multi-processor systems based on load reallocation
| Attachment | Size |
|---|---|
| TE_WP2_VTT_01_S2_Performance_simulation_solution_presentation.ppt | 874 KB |
| TE_WP2_VTT_01_S2_Performance_simulation_MThesis.pdf | 1.84 MB |
The solution uses SW load allocation data collected from existing systems as a basis for estimating the effects of system architecture design choices. A workload model is generated from the existing measurement data and is then used to simulate system performance on candidate system architectures. This gives insight into which HW/SW architecture suits the system under analysis best, for example when moving from a single-core to a multi-core system.
Current systems are mainly based on single HW cores, where all tasks and processes are executed in sequence. The trend in HW is, however, moving towards multi-core solutions, also in the embedded systems domain. In this case it is not easy to predict what gains will be achieved by adding more CPU cores to the system, because the result depends on how the parallel tasks are divided between the cores. Dependencies between tasks and threads of execution, and the nature of the SW applications, may prevent the processing load from being allocated efficiently for simultaneous processing on the different cores.
It is therefore important to have an idea of what can be gained by using multi-core HW architectures, and of what kinds of bottlenecks in the SW architecture prevent the parallel execution of tasks. This information provides a basis for making informed decisions on how the HW/SW architecture will be designed and what gains can be expected, without the need to build many expensive prototypes early on. This solution addresses the problem by collecting load allocation measurements from existing HW/SW platforms and using simulation, analysis and visualization to provide insight into the various possibilities.
Objective of the method:
The objective is to provide a new, rather highly abstracted method for system designers in the multi-processor domain. The method's purpose is to aid architecture decisions before prototyping. It provides a performance simulation for estimating the feasibility of different design alternatives (Figure 1), and is intended to help select the most favourable alternatives for validation with prototypes.

Figure 1.
Step-by-step description:
The current implementation is based on the embedded Linux platform, but the concepts can be applied more widely on other platforms. The solution comprises five distinct steps (Figure 2):
- Workload modelling, which includes the instrumentation of the desired system, measuring performance data from it and finally creating a workload model from the gathered data
- System modelling, which includes exploring the performance-related parts of the system, such as the degree of co-processing and the scheduling, and expressing these in a programming language
- Validation, which, if possible at all, can be done by executing the simulation model of an existing system and then comparing the simulation results with the real-world system
- Simulation, which means the actual execution of the simulation models with the workload model, and gathering the desired results
- Analysis, which can include three-dimensional visualization of the simulation results and possible comparison with the given performance requirements

Figure 2.
In the instrumentation phase, an execution log is gathered from an existing system by utilizing the Probe Framework or a SystemTap probe. The information from the operating system's kernel includes task activations, deactivations and context switches. This data is the input for the next step, workload modelling, which automatically creates the workload model describing the dependencies, execution order and similar properties of the tasks. System modelling adds parameters such as HW resources, scheduling algorithms and similar properties. The model can be validated against a real system if needed (and possible), to verify that it is accurate. Simulation then consumes the workload model to give a performance estimate for the chosen HW/SW architecture. The analysis phase provides tools that help the human analyst understand the results. The process can be repeated with different parameters to check the results against the performance requirements and other requirements.
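The flow from execution log to workload model to simulation can be illustrated with a minimal Python sketch. It is not the project's implementation: the log format, the task names, the `DEPENDS_ON` table and the functions `build_workload_model` and `simulate` are all hypothetical, and the simple greedy list scheduler stands in for whatever scheduling model the real tool uses.

```python
from collections import defaultdict

# Hypothetical execution log: (timestamp_ms, task, event).
# The real Probe Framework / SystemTap output format is not shown in the source.
LOG = [
    (0, "sensor", "activate"),
    (4, "sensor", "deactivate"),
    (4, "filter", "activate"),
    (10, "filter", "deactivate"),
    (10, "ui", "activate"),
    (13, "ui", "deactivate"),
    (13, "logger", "activate"),
    (15, "logger", "deactivate"),
]

# Hypothetical dependencies; here they are given by hand, whereas the
# described method derives them automatically from the measurements.
DEPENDS_ON = {"filter": ["sensor"], "logger": ["sensor"]}


def build_workload_model(log):
    """Sum the time each task spent between activation and deactivation."""
    started = {}
    cost = defaultdict(int)
    for timestamp, task, event in log:
        if event == "activate":
            started[task] = timestamp
        elif event == "deactivate":
            cost[task] += timestamp - started.pop(task)
    return dict(cost)


def simulate(cost, depends_on, cores):
    """Greedy list scheduling of the workload model on `cores` cores.

    Tasks are taken in the order they appear in the model; each is placed
    on the core where it can start earliest once its dependencies are done.
    Returns (makespan, finish time per task).
    """
    core_free = [0] * cores
    finish = {}
    pending = list(cost)
    while pending:
        # Next task whose dependencies have all been scheduled already.
        task = next(
            t for t in pending
            if all(d in finish for d in depends_on.get(t, []))
        )
        pending.remove(task)
        ready = max((finish[d] for d in depends_on.get(task, [])), default=0)
        core = min(range(cores), key=lambda c: max(core_free[c], ready))
        start = max(core_free[core], ready)
        finish[task] = start + cost[task]
        core_free[core] = finish[task]
    return max(finish.values()), finish


model = build_workload_model(LOG)
for cores in (1, 2):
    makespan, _ = simulate(model, DEPENDS_ON, cores)
    print(f"{cores} core(s): {makespan} ms")
```

With this invented log, the single-core run reproduces the measured sequential length, which is the kind of check the validation step performs before other core counts are explored.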
Examples of analysis output are shown in the table and Figure 3 below. The table lists the performance gains when additional cores are introduced into the system, and Figure 3 shows the distribution of load over these cores as a 3D graph (illustrating the quad-core version). In this example the system benefits almost linearly from additional cores up to three cores; beyond that there is no longer sufficient parallelism available and the gains diminish.


Figure 3.
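The saturation effect described above, where gains stop once the available parallelism is exhausted, can be reproduced with a small Python sketch. The workload here is invented for illustration (three independent chains of four dependent unit-time tasks each) and is not the measured data behind the table or Figure 3; `simulate` and `speedup_table` are hypothetical names.

```python
def simulate(chains, cores):
    """Time-stepped simulation of chains of dependent unit-time tasks.

    `chains` holds the number of tasks in each chain. Tasks within a chain
    depend on each other, so at most one task per chain runs in a tick.
    Each tick, up to `cores` chains advance, longest remaining first.
    Returns (ticks needed, busy ticks per core).
    """
    remaining = list(chains)
    busy = [0] * cores
    ticks = 0
    while any(remaining):
        ready = sorted(
            (i for i, left in enumerate(remaining) if left > 0),
            key=lambda i: -remaining[i],
        )
        for core, chain in enumerate(ready[:cores]):
            remaining[chain] -= 1
            busy[core] += 1
        ticks += 1
    return ticks, busy


def speedup_table(chains, max_cores):
    """Return (cores, ticks, speedup vs. one core, per-core load) rows."""
    baseline, _ = simulate(chains, 1)
    rows = []
    for cores in range(1, max_cores + 1):
        ticks, busy = simulate(chains, cores)
        rows.append((cores, ticks, baseline / ticks, busy))
    return rows


# Hypothetical workload: three independent chains, four unit tasks each.
for cores, ticks, speedup, busy in speedup_table([4, 4, 4], 4):
    print(f"{cores} cores: {ticks} ticks, speedup {speedup:.2f}, load {busy}")
```

Because this workload has only three independent chains, the fourth core stays idle and the speedup stays at the three-core level. The per-core busy counts correspond to the kind of load distribution that a graph like Figure 3 visualizes.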