Complex Software System Hybrid Simulation Model

Bell Communications Research - BELLCORE; Piscataway, NJ

Predicting the performance of a large-scale software system under development is an important task. Performance estimates at the design and development stages are necessary to initiate appropriate hardware purchases, to ensure user satisfaction with the system's performance, and to validate both the system design and the architecture.

The system studied is an inventory and assignment system being designed for execution within the IMS/MVS environment. It is presently projected that the system will use an IBM 3090/200S, or equivalent, to support a customer base of approximately 7 million accounts. The DASD required by this system will be approximately 100 gigabytes, or on the order of 170 single-density IBM 3380 disks. Daily transaction volume is estimated at approximately 190,000 complex transactions.

The model uses a technique called hybrid simulation, in which subsections of a detailed simulation model are replaced with analytical solutions. In this two-layer model, one layer is an analytical model that represents the behavior of the immediate and the deferred processor modules. The second layer is a detailed simulation model that mimics the overall system behavior. The use of hybrid modeling has been a major factor in substantially reducing the simulation run time.

Because of its large size and associated cost, the system is designed to operate at high levels of hardware utilization. The simulation model had to take several considerations into account:

1. Large Number of Processing Events - for the system, a typical transaction requires several hundred I/Os. This results in a large number of CPU service and I/O service events for each transaction. Because of the large number of service events, direct simulation of the 190,000 daily transactions results in an extremely slow and expensive model.

2. Complex Service Time Computations - simulating the system requires computation of service times at the processor modules. This service time is a function of the contention between the processor modules for shared CPU and I/O resources. As a result, the remaining service time for a transaction actively being processed within the model must be continuously modified to reflect changes in the number of busy processor modules.

3. Simulation of Priority Queues and Variable Polling Algorithms - because of the priority polling of the deferred transactions, both the priority class and the order of arrival must be considered when dequeueing transactions to the processor modules. Also, the simulation program has to be modular in construct. This allows different polling algorithms to be substituted without requiring any modification to other parts of the program.

4. Extensive Memory Storage Requirement - for the system studied, there are extended periods where a significant transaction backlog is present. This backlog occurs when the arrival rate of transactions exceeds the service rate of the processor modules. Since modeling of these backlog conditions requires considerable memory, it is possible that a premature termination of the simulation run will occur due to the lack of sufficient memory.

5. Changing System Processing Conditions - because of the varying arrival rates for each transaction type and the effect of priority polling, the transaction processing mixture at the processor modules varies throughout the processing day. Thus, the system processing conditions vary over a wide range during the simulation. This implies that the simulation must be extremely flexible in its ability to model various processing conditions.

6. Rapid Simulation Execution Requirement - it is desirable to be able to simulate a 16-hour processing day within a reasonable simulation time period so that a large number of sensitivity runs can be conducted. A goal of one simulation run per 5 minutes of CPU time was considered reasonable given the simulation facilities available for this study (an IBM 3090/400S computer system supporting a non-dedicated, general-purpose computing environment).

7. Large Amounts of Statistical Information - the objective of the simulation model is to provide estimates for a wide range of system performance metrics. These metrics include statistics on: average and maximum response times for different classes of transactions arriving at different hours of the day, average and maximum queue lengths, transaction backlogs at the end of each hour, CPU utilization, and percentage of region occupancy. Unless special techniques are used, processing and maintaining this wide range of statistical data can result in both a large computational load and the loss of valuable simulation memory space.

8. Large Number of Queues - the system has approximately 450 geographical areas. Since each area also has its own priority queue structure, it is necessary to maintain performance statistics for more than 1000 priority queues.
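Considerations 3 and 8 above (priority queues per area, with interchangeable polling algorithms) can be illustrated with a minimal Python sketch. It is not the SIMSCRIPT II.5 code used in the study. The names `AreaQueue`, `strict_priority`, `longest_queue_first`, and `dispatch` are hypothetical, and the convention that a lower number means a higher priority is an assumption. Each queue orders transactions by priority class and then by order of arrival, and the polling policy is passed in as a function so that it can be replaced without touching the queue code.

```python
import heapq
import itertools
from typing import Callable, Dict, List, Optional, Tuple

# (priority class, arrival sequence number, transaction id).
# Assumption: a lower priority number is served first.
Entry = Tuple[int, int, str]


class AreaQueue:
    """Priority queue for one geographical area; ties are broken by arrival order."""

    def __init__(self) -> None:
        self._heap: List[Entry] = []
        self._seq = itertools.count()  # arrival order within this area

    def enqueue(self, priority: int, txn_id: str) -> None:
        heapq.heappush(self._heap, (priority, next(self._seq), txn_id))

    def peek(self) -> Optional[Entry]:
        return self._heap[0] if self._heap else None

    def dequeue(self) -> str:
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)


# A polling policy inspects all area queues and names the area to serve next.
PollingPolicy = Callable[[Dict[str, AreaQueue]], Optional[str]]


def strict_priority(queues: Dict[str, AreaQueue]) -> Optional[str]:
    """Serve the area whose head transaction has the best priority class.

    Ties between areas go to the alphabetically first area name.
    """
    candidates = [(q.peek()[0], name) for name, q in queues.items() if len(q)]
    return min(candidates)[1] if candidates else None


def longest_queue_first(queues: Dict[str, AreaQueue]) -> Optional[str]:
    """Serve the area with the largest backlog; ties go to the first area name."""
    candidates = [(-len(q), name) for name, q in queues.items() if len(q)]
    return min(candidates)[1] if candidates else None


def dispatch(queues: Dict[str, AreaQueue], policy: PollingPolicy) -> Optional[str]:
    """Dequeue the next transaction for a free processor module, or None if idle."""
    area = policy(queues)
    return None if area is None else queues[area].dequeue()
```

Swapping `strict_priority` for `longest_queue_first` in the call to `dispatch` changes the polling behavior while leaving `AreaQueue` untouched, which is the kind of modularity the list item describes.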

Special simulation techniques were employed to solve several simulation requirements:

1. Bit Packing - bit packing addresses the extensive memory storage requirement. It was crucial in meeting the host system's partitioned-memory constraint, which limited the simulation to no more than 5.5 megabytes of memory space during a run.

2. Mirroring - in a complex simulation model, there may be symmetrical parts. If these parts have identical inputs (i.e., the same mean arrival rate and distribution), then the simulation results obtained from modeling one part are representative of what would be obtained from modeling all the parts. That is, the one part can be mirrored to the other parts. The use of mirroring succeeded in reducing modeling complexity while still producing accurate results. Together, mirroring and hybrid modeling contributed to a rapid simulation execution of 2.5 CPU minutes for the average simulation run.

3. Post-Simulation Statistical Analysis - the use of post-simulation statistical analysis lessens the burden of collecting and processing the required large amounts of statistical information as a part of the simulation run. Since this technique reduces the computational load that the simulation has to perform during a run, it contributes to the overall effort of providing a rapid simulation execution.
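The bit packing technique in item 1 can be illustrated with a short Python sketch. The source does not say which fields were packed or how wide they were, so the layout below is purely an assumption for illustration: 9 bits for the area (enough for roughly 450 areas), 3 bits for a priority class, 4 bits for a transaction type, and 16 bits for the arrival second within a 16-hour day (57,600 seconds). The function names `pack` and `unpack` are hypothetical.

```python
from array import array

# Hypothetical field widths (not taken from the source), 32 bits in total.
AREA_BITS = 9       # 2**9 = 512, enough for about 450 areas
PRIORITY_BITS = 3   # assumed: up to 8 priority classes
TYPE_BITS = 4       # assumed: up to 16 transaction types
TIME_BITS = 16      # arrival second in a 16-hour day: 57,600 < 65,536


def pack(area: int, priority: int, txn_type: int, arrival_s: int) -> int:
    """Pack four small fields into one 32-bit word."""
    fields = ((area, AREA_BITS), (priority, PRIORITY_BITS),
              (txn_type, TYPE_BITS), (arrival_s, TIME_BITS))
    for value, bits in fields:
        if not 0 <= value < (1 << bits):
            raise ValueError(f"value {value} does not fit in {bits} bits")
    word = area
    word = (word << PRIORITY_BITS) | priority
    word = (word << TYPE_BITS) | txn_type
    word = (word << TIME_BITS) | arrival_s
    return word


def unpack(word: int):
    """Recover (area, priority, txn_type, arrival_s) from a packed word."""
    arrival_s = word & ((1 << TIME_BITS) - 1)
    word >>= TIME_BITS
    txn_type = word & ((1 << TYPE_BITS) - 1)
    word >>= TYPE_BITS
    priority = word & ((1 << PRIORITY_BITS) - 1)
    word >>= PRIORITY_BITS
    return word, priority, txn_type, arrival_s


# A backlog of queued transactions stored as packed words.
# Type code "L" holds an unsigned integer of at least 32 bits per element.
backlog = array("L")
backlog.append(pack(area=42, priority=1, txn_type=3, arrival_s=3600))
```

Under these assumed widths, one queued transaction occupies a single machine word rather than four separate fields, which is the kind of saving that matters when a long backlog has to fit within a fixed memory partition.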

Benefits of SIMSCRIPT II.5: "The complex service time computations called for by the analytical models necessitate the use of a simulation language, like SIMSCRIPT II.5, that is efficient in performing scientific calculations. Another useful feature, available in SIMSCRIPT II.5, is the ability to interrupt a process and easily modify its remaining service time (i.e. by adjusting the TIME.A attribute of the service process)."
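The idea of interrupting a process and modifying its remaining service time can be sketched in Python as follows. This is not SIMSCRIPT II.5 code and does not reproduce the study's analytical models. The `speed` function is a made-up contention model (each additional busy module is assumed to add 25% to everyone's service time), and `ActiveJob` and `reschedule` are hypothetical names. The sketch only shows the bookkeeping: when the number of busy processor modules changes, the work completed so far is credited at the old rate and the completion time is recomputed at the new rate.

```python
from dataclasses import dataclass


def speed(n_busy: int) -> float:
    """Hypothetical contention model: relative service rate with n_busy modules.

    In the study this role is played by the analytical layer; the 25% figure
    here is an arbitrary placeholder.
    """
    if n_busy < 1:
        raise ValueError("at least one module must be busy")
    return 1.0 / (1.0 + 0.25 * (n_busy - 1))


@dataclass
class ActiveJob:
    remaining_work: float  # seconds of service left at full speed
    last_update: float     # simulation time of the last rate change
    rate: float            # current relative service rate

    def completion_time(self) -> float:
        """Scheduled completion if the rate stays unchanged."""
        return self.last_update + self.remaining_work / self.rate

    def reschedule(self, now: float, n_busy: int) -> float:
        """Interrupt the job at `now` and recompute its completion time."""
        # Credit the work done since the last update at the old rate.
        self.remaining_work -= (now - self.last_update) * self.rate
        self.last_update = now
        # Apply the new rate implied by the current contention level.
        self.rate = speed(n_busy)
        return self.completion_time()


job = ActiveJob(remaining_work=10.0, last_update=0.0, rate=speed(1))
first = job.reschedule(now=4.0, n_busy=5)    # contention rises at t = 4
second = job.reschedule(now=10.0, n_busy=1)  # contention clears at t = 10
```

In a process-oriented language, the returned completion time corresponds to the rescheduled end of the service process, which is what adjusting the process's time attribute achieves in the quoted description.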

Customer Quote: "Using SIMSCRIPT II.5 as the modeling language, the construct supports the use of structured programming and modularity in developing simulation models. With this construct, simulation modules were created one at a time in an orderly and manageable fashion."