
Rigorous Analysis of (Distributed) Simulation Results

O. Kremien1 , J. Kramer2 , M. Kapelevich1

Abstract

Formal static analysis of the correctness and complexity of scalable and adaptive algorithms for distributed systems is difficult and often not appropriate. Rather, tool support is required to facilitate the 'trial and error' approach which is often adopted. Simulation supports this experimental approach well. In this paper we discuss the need for a rigorous approach to simulation results analysis and model validation. These aspects are often neglected in simulation studies, particularly in distributed simulation. Our aim is to provide the practitioner with a set of guidelines which can be used as a 'recipe' in different simulation environments, making sound techniques (simulation and statistics) accessible to users. We demonstrate use of the suggested analysis method with two different distributed simulators, CNCSIM [8] and NEST [3], thus illustrating its generality. The same guidelines may be used with other simulation tools to ensure meaningful results while obviating the need to acquire more detailed knowledge in the area.

Keywords: Confidence intervals, distributed algorithms, independence, initialization bias, performance, scalability, simulation, statistical analysis, steady state.

1. Introduction

Distributed systems consist of multiple computer nodes interconnected by a communication network and supporting multiple distributed applications. A distinguishing feature of distributed systems is the inability to maintain consistent global state information at distributed points of control. A distributed system can thus be viewed as a collection of distributed decision makers taking decisions

1Department of Mathematics and Computer Science, Bar-Ilan University, 52900 Ramat-Gan, ISRAEL.

email: {orly,michael}@macs.biu.ac.il

2 Department of Computing, Imperial College of Science, Technology & Medicine, 180 Queen’s Gate,

London SW7 2BZ, U.K.

to achieve common goals under uncertain local and partial views of the system state. Formal static analysis of the correctness and complexity of such systems and associated algorithms is usually difficult, and a "trial and error" approach is often adopted. This approach may be supported by either a simulation based tool or a prototype implementation. Tool support is required which will enable rapid prototyping and assessment of different ideas, and allow one to determine whether functional and performance goals are likely to be met. Simulation supports this approach. A simulation based tool may be based on a centralized simulator or on a distributed one making use of multiple processors which may be available.

A simulation is a computer-based statistical sampling experiment. Thus, if the results of a simulation study are to have any meaning, appropriate statistical techniques must be used to design and analyze the simulation experiments. This paper studies methods for simulation results analysis and model validation which can be automated and put into the tool selected for algorithm prototyping and assessment. Our interest in automated methods is stimulated by the existence of simulation packages (e.g. NEST [3] and many others) that are used in different application areas by a wide population of practitioners who have little knowledge or interest in the intricacies of simulation results analysis. The practitioner is essentially interested in his application and in obtaining meaningful results. Our aim is to provide a set of guidelines which may be used in different environments (centralized or distributed) in order to aid the process of simulation planning and output analysis. We also illustrate by example the steps which should be taken in order to validate the tool selected. The reader need not study all the technical details involved and can avoid gaining deep knowledge of different statistical techniques. However, by using the analysis method advocated, we expect to provide guidance as to how to produce and interpret the simulation output. This is intended to help bridge the gap that exists between the background in statistics which practitioners may actually have and that required. By using the suggested guidelines as a 'recipe', the lengthy process of selecting the right statistical techniques from the large collection of available ones is skipped. The building blocks forming our approach are taken from different sources, generalized and then put under the same 'umbrella' so that the reader has at hand all the statistical tests required for rigorous analysis. We believe the main contribution of this paper lies in making sound techniques accessible to users, using experience to illustrate and demonstrate the utility of the approach. In addition, we also supply the reader with a sample implementation of the already tested methods on our Web site, the address of which is given below1. Guidance on possible use, as well as on error situations more often encountered in a distributed environment, is also included.

1 http://www.cs.biu.ac.il/~dsg/statistical_tests.html

We describe the use of tools for implementing the advocated analysis method on a load sharing algorithm [10] designed to be scalable and adaptive. The same tool and analysis method can serve to analyze the results of other distributed algorithms: this particular example is chosen for illustration only. The statistical analysis method advocated was already used in several cases where users could directly contact the authors to get the required information. Positive results of this experience together with the generality of the approach motivated publishing this paper.

In Section 2 the reasons for choosing to develop a simulator are explained. Section 3 discusses some concerns regarding statistical inference in simulation studies in general, giving guidelines for centralized or distributed systems. Two examples of very different simulation tools implementing the suggested techniques are given in Section 4. The model is first validated, and then results from the literature that were obtained with the first tool are reproduced by the second. Conclusions and directions for future study are presented in Section 5.

2. Selecting a Tool for Rapid Assessment

2.1 Required Properties

Simulation provides good support for the “trial and error” approach. It permits flexible modeling and control over model constraints [2]. Thus, simulation can facilitate the study of a particular aspect of the system (e.g. network delay) and its effect. Tool support is required to determine if functional and performance goals are likely to be met by the implementation of a suggested solution. The following tool properties are generally desirable:

• Minimize effort required for algorithm assessment.

• Minimize effort required for transition to the target environment.

• Independence of the simulation environment.

The following, in contrast, results from specific algorithm requirements:

• Scale-up.

Unlike in the distributed systems being modeled, failure of any component of the simulation can generally be allowed to terminate the overall computation, since once a component has failed aspects such as performance can no longer be assessed. Thus, fault tolerance of the simulation itself is generally not required.

2.2 Simulation vs. Implementation

One possible approach is to prototype an implementation. This is the most realistic approach. Unfortunately, implementation can be very costly and time consuming. Also, with an implementation it may not be feasible to thoroughly study various parameters of the system.

On the other hand, simulation can aid in the modeling and evaluation of distributed algorithms. One has control over the parameters (e.g. system size) and events (e.g. application arrival) of the system under scrutiny. Since simulation is performed in simulation-time, aspects such as network delay can be controlled. Simulations are also repeatable and a simulator can be used to compare different approaches to the problem under scrutiny. One expects to identify sensitivities of an algorithm to aspects such as the size of the system, amount and frequency of state information shared and other parameters.

In short, although simulation is less realistic than implementation, it better supports experimentation. This approach should be adopted for assessment and analysis. Nevertheless, a prototype implementation can be useful as a complement to simulation to discover certain error situations which may not have been taken into account in a simulation. A smooth transition from simulation to prototype implementation can shorten this process.

However, in performing a simulation, care must be taken in obtaining and interpreting the results. Virtually all simulation output variables are nonstationary (the distributions of the successive observations change over time) and autocorrelated (successive observations are correlated with each other). Thus, classical statistical techniques based on independent identically distributed observations are not directly applicable. Statistical techniques applicable for simulation are discussed next.

3. Statistical Techniques

Statistical techniques are used to analyze the simulation experiments: to decide when steady state has been reached, how long a simulation run should be, and how to analyze the results themselves, including determination of the accuracy of the simulation run response (e.g. the average response time). These are discussed below.

3.1 Run Length Control

One objective of simulation output analysis is to estimate some unknown characteristic or parameter of the system being studied. The experimenter often wants not only an estimate of this parameter value, but also some measure of the estimate’s precision. Confidence intervals are widely used for this purpose [11,15].

The initial state of the simulated system must be specified each time the program is run. It is often inconvenient or impossible to ensure that these initial conditions are typical. As a result, estimation of the steady-state simulation response is complicated by the possible presence of initialization bias. For example, in steady state the queues in the system are usually of a certain non-zero length, whereas initially the queues are empty (of length zero).

When simulating a single variant (measure of performance that is of interest) in a steady-state simulation, the desired measure of performance is defined as a limit as the length of the simulation tends to infinity. The following questions arise:

• How long should we run the simulation program?

• In which state should we start the simulated system, and how can we determine that the simulation has reached steady state? In other words, how long should we run the simulation for it to "warm up"?

• Once we have stopped the simulation run, how can we determine the accuracy of the simulation run response (e.g. the accuracy of the mean response-time estimate)?

• How do we account for there being a multiplicity of resources (e.g. processors) rather than a single one?

Solutions to these problems are based on mathematical statistics. As already mentioned, when analyzing single or multiple server systems, responses are autocorrelated and we cannot assume they are normally and identically distributed. Several approaches can be applied to steady-state performance analysis of such systems: batching, replication and regenerative (renewal). The aim of these approaches is to create (almost) independent, normally and identically distributed responses to which classical techniques apply.

With the batching approach, we make a single extremely long run. We discard the initial (possibly biased) part of that single run, and divide the remaining (much larger) part of the run into a number of subruns or batches of fixed length. Our estimator of mean response time is equal to the average of the batch averages. When using the replication approach, we start with a very long run, and obtain a single observation on the steady-state response (the average response-time for this run). To obtain the next observation, we start all over again using a different random stream, so that the responses are independent. With the regenerative (renewal) approach, we define a system state (e.g. all queues are empty) and run the simulation until this state is reached. This process is repeated n times, and produces n independent cycles.

The advantages of the regenerative approach are that it creates truly independent responses, and that we need not detect and discard initialization bias. The main disadvantage is that it is not at all clear how long it will take before the renewal state is reached, especially in a distributed environment. The replication approach may also require very long runs. For the batching approach we have to detect initialization bias only once. The independence test must be applied [6], but this is not expected to place high demands on the system. We therefore tend to recommend the batching approach, which we expect to be less demanding than the other approaches.
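To make the batching approach concrete, the following minimal sketch (Python; the names batch_means and n_discard are hypothetical and do not come from either simulator) computes the point estimate as the average of the batch averages of a single long run:

```python
from statistics import mean

def batch_means(observations, n_discard, batch_size):
    """Batch-means point estimate from a single long run (illustrative sketch).

    observations : individual responses (e.g. response times) from one long run
    n_discard    : number of initial, possibly biased observations to drop
    batch_size   : number of observations per batch
    """
    steady = observations[n_discard:]              # keep only the steady-state part
    n_batches = len(steady) // batch_size          # complete batches only
    batch_avgs = [mean(steady[i * batch_size:(i + 1) * batch_size])
                  for i in range(n_batches)]
    return mean(batch_avgs), batch_avgs            # estimate = average of batch averages
```

The batch averages returned here are the "batch responses" to which the tests of Section 3.2 are applied.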

The run length control procedure proposed by Heidelberger and Welch [5] is extended to become applicable also in a distributed (decentralized) simulation environment. This is described next.

3.2. Batching Approach - Guidelines

In the following analysis it is assumed that the pseudo-random number generator works satisfactorily (i.e. pseudo-random numbers are truly independent). As an illustrative example, we use load sharing and assume that the measure of interest is the mean response time.

The batching approach to steady-state performance analysis of simulation output is used. The initial (biased) part of a very long run is discarded, applying the Schruben et al. test [16], and the remainder is divided into a number of batches. If the batches are independent (i.e. they pass the Von Neumann test for independence, Appendix B), the confidence interval for the chosen estimator can be calculated. It is recommended that at least 100 batches from the second half of a run are used [6,15,16].

In order to control the run length, this methodology is combined with a run length control procedure proposed by Heidelberger and Welch [5]. There are six basic parameters of interest:

1. An initial batch size (we initially set batch-size = 8 observations).

2. An initial end-of-run checkpoint j1. This provides an initial run length for our search for a sequence of batches (we use j1 = 200 batches).

3. A maximum run length of jmax batches. This gives an upper bound on the run for our search for a sequence of batches (we use jmax = 500 batches).

4. A multiplicative checkpoint increment parameter I. This is used to extend the current end-of-run checkpoint jk if necessary, such that jk+1 = minimum{I×jk, jmax}, where k is the checkpoint index (we use I = 1.5).

5. The required confidence level α (we use α = 0.10).

6. A relative half-width requirement ε (we use ε = α/2).

A batch response is the result obtained from a batch. In order to minimize our run, we are looking for the smallest value of n0 such that the sequence of batch responses {X(n) | n = n0+1, ..., jk} is a sample from a stationary sequence (i.e. behaving as if batch responses or means were truly independent), where X(i) is the i-th batch response in the sequence, n0 is the start and jk is the checkpoint marking the end of the run. If the number of batches in the sequence is large enough (at least 100), then valid confidence intervals (with confidence level α) result. In small samples, however, the intervals may miss the true mean with a probability higher than α. The approach is thus to find a potentially stationary sequence and check it for independence. If both tests are passed, it is expected that the confidence interval generated will meet the required accuracy.

Step 1: Initialization-Bias Detection (Transient Test)

1. First set k=1 (the checkpoint index) and n0=0. We thus test the sequence {X(n) | n=1, ..., j1} by applying the Schruben et al. test. The hypothesis is that the mean of the sequence does not change throughout the run. This is tested by checking the behavior of the cumulative sum process of the means, Sm = Ȳn − Ȳm. S is highly sensitive to the presence of initialization bias. A two-sided test is applied in our case, since imposing load on the ready queues of nodes may cause them to grow, whereas applying a load sharing algorithm may have the opposite effect. Implementation details are included in Appendix A and full details may be found in [16]. If the sequence passes the test (there is no initialization bias), then we let n0=0, i.e. we take the entire sequence as the stationary portion. We estimate the standard deviation for the Schruben et al. test from the second half of the data in order to reduce possible bias [6,16].

2. If it fails the test, then some observations from the initial (biased) portion of the sequence must be deleted. In our implementation, we remove the initial 10% (i.e. n0 = j1/10), and apply the test again to the truncated sequence {X(n) | n = [j1/10]+1, ..., j1}. If this truncated sequence fails the test, an additional 10% is removed. We repeat this process until we have a satisfactory value for n0, or until it is concluded that there is not a sufficiently long stationary portion with which to generate the required confidence interval.

3. If we could not select a satisfactory value for n0, we set k=k+1 (using the checkpoint increment parameter) and the simulation proceeds to the next checkpoint. For instance, in our case j2 = minimum{I×j1, jmax} = minimum{1.5×200, 500} = 300, so we run the tests again on the new sequence of length 300. The process described above is repeated at the new checkpoint in exactly the same fashion, independently of the results from the previous checkpoint, i.e. a new decision is made from the entire new sequence without regard to the results of the initialization-bias test at the previous checkpoint. The simulation continues from checkpoint to checkpoint until either a stationary sequence is found (and n0 selected) or jmax is reached.

4. If jmax was reached and we could not select n0, we double the batch size and start the whole process all over again. For example, if the batch size was first set to 8, we double it to 16 and restart from the beginning.
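A minimal sketch of the truncation loop in Step 1 is given below (Python; the names are hypothetical). The transient test itself is passed in as a function, for instance an implementation of the Schruben et al. test of Appendix A; the 10% removal policy and the 100-batch minimum follow the description above.

```python
def find_stationary_start(batch_responses, transient_test, min_batches=100):
    """Step 1 sketch: drop 10% of the run at a time until the remaining
    batch responses show no initialization bias.

    batch_responses : the sequence X(1), ..., X(jk) at the current checkpoint
    transient_test  : function returning True if a sequence passes the
                      initialization-bias test (e.g. Schruben et al., Appendix A)
    Returns n0, the number of leading batches to discard, or None if no
    sufficiently long stationary portion exists at this checkpoint.
    """
    j_k = len(batch_responses)
    step = max(1, j_k // 10)                 # 10% of the run length at this checkpoint
    n0 = 0
    while j_k - n0 >= min_batches:
        if transient_test(batch_responses[n0:]):
            return n0                        # stationary portion found
        n0 += step                           # remove another 10% and retest
    return None                              # give up; move on to the next checkpoint
```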

Step 2: Independence Test

If a stationary sequence was found (i.e. we have a large enough sample from steady state), we now check it for independence by applying the Von Neumann test. Implementation details are included in Appendix B. If the hypothesis of independence is rejected:

1. If jmax has not been reached, we set k=k+1 (next checkpoint) and go back to Step 1.

2. If jmax has been reached, we double the batch size, set k=1, and start the process all over again from Step 1.

Step 3: Confidence-Interval Accuracy

A confidence interval can now be generated from the stationary, independent sequence. We compute the estimated relative half-width (ERHW) of the confidence interval relative to the mean response time estimate (the average over the whole sequence):

ERHW = confidence_interval_width / (2·m̂), where m̂ is the mean estimate.

There are three cases:

1. If ERHW ≤ ε, the simulation stops.

2. If ERHW > ε and jmax has not yet been reached, we set k=k+1 and go back to Step 1.

3. If ERHW > ε and jmax has been reached, we double the batch size, set k=1, and start the process all over again from Step 1.
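Step 3 can be sketched as follows (Python; the name erhw_check is hypothetical). With 100 or more batches the normal quantile is used here in place of the t quantile, a simplifying assumption for the sketch:

```python
from statistics import mean, stdev, NormalDist

def erhw_check(batch_avgs, alpha=0.10, eps=0.05):
    """Step 3 sketch: build the confidence interval and test its relative half-width.

    batch_avgs : stationary, independent batch averages (at least ~100 of them)
    alpha      : 0.10 as in Section 3.2 (a 90% confidence interval)
    eps        : required relative half-width (alpha/2 in our setting)
    Returns (accurate_enough, mean_estimate, (ci_low, ci_high)).
    """
    n = len(batch_avgs)
    m = mean(batch_avgs)                         # point estimate of mean response time
    s = stdev(batch_avgs)                        # std. deviation of the batch averages
    z = NormalDist().inv_cdf(1 - alpha / 2)      # two-sided critical value
    half_width = z * s / n ** 0.5
    erhw = half_width / m                        # = confidence_interval_width / (2 m)
    return erhw <= eps, m, (m - half_width, m + half_width)
```

If the check fails, the calling loop either moves to the next checkpoint or doubles the batch size, exactly as in cases 2 and 3 above.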

3.3 Special Concerns for the Distributed Case

For distributed (or decentralized) simulation, run length control is further complicated by the fact that not all nodes detect initialization bias, or finish gathering the required number of batches, at the same time.

We refine the guidelines as follows:

a. we set the number of batches and the batch size to the same value for all nodes.

b. when a node has collected the required number of batch responses, it forwards this information to the monitor or any centralized entity gathering statistics from all nodes, but stops running only after all nodes have collected the required number of observations. A system-wide batch is the average of the corresponding batches of all nodes.
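A sketch of the combination step in (b) is given below (Python; the data layout is an assumption made for illustration, not CNCSIM's or NEST's actual interface):

```python
def system_wide_batches(per_node_batches):
    """Combine per-node batch averages into system-wide batch averages.

    per_node_batches : {node_id: [avg of batch 1, avg of batch 2, ...]}, where every
                       node uses the same batch size and number of batches (rule (a))
    Returns one system-wide batch average per batch index.
    """
    nodes = list(per_node_batches.values())
    n_batches = min(len(b) for b in nodes)       # all nodes should report the same count
    return [sum(node[i] for node in nodes) / len(nodes) for i in range(n_batches)]
```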

The run length control procedure can enhance a simulator. It is implemented in both CNCSIM and NEST which are discussed next.

4. Illustrations of the Suggested Statistical Techniques

We illustrate use of the suggested techniques with two simulation tools differing in their characteristics but with a common aim: prototyping and evaluation of distributed load sharing algorithms.

4.1 Simulation Initialization

Before a simulation run can be started (with any tool), the user has to set some system parameters characterizing that run. Following the setting of these parameters, the user has to define the load intensity on each node of the system. Finally, all parameters relevant to the simulated load sharing algorithm have to be set and also those relevant to some assumptions made (e.g. network delay). A simulation run can then be started. Several distribution functions implemented in the simulator facilitate the setting of some of these parameters.
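As an illustration only, the kind of information a user supplies before a run might be grouped as in the sketch below (Python; all names and values are hypothetical, loosely based on the FLS example of Section 4.6, and do not correspond to the actual input format of CNCSIM or NEST):

```python
# Hypothetical run description: system parameters, per-node-type load intensity,
# algorithm parameters and modeling assumptions, all set before the run starts.
run_config = {
    "system": {"n_nodes": 5, "random_seed": 42},
    "load": [  # one entry per node type: service demand and arrival process
        {"nodes": 1, "cpu_demand": ("exponential", 0.5), "arrivals": ("poisson", 1.53)},
        {"nodes": 2, "cpu_demand": ("exponential", 0.7), "arrivals": ("poisson", 1.25)},
    ],
    "algorithm": {"name": "FLS", "state_obs_interval_ms": 200},
    "assumptions": {"packet_delay_ms": 10, "packet_size_bits": 1024},
}
```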

4.2 Some Performance and Efficiency Measures

Each of the tools implements the statistical technique described earlier and also the measures detailed in [9] for performance and efficiency evaluation. In our case, the most important measure is:

• average response-time. This is specified in Fig. 1:

Fig. 1: system performance (response time)

This measure, together with those detailed in [9], permits objective evaluation and comparison of load sharing algorithms. CNCSIM was designed especially for this purpose, whereas the second tool is a general NEtwork Simulation Testbed (NEST). The tools also implement the statistical inference models described earlier in this paper. The tools used are briefly described next.

4.3. CNCSIM

CNCSIM [8] is a decentralized discrete event-driven simulator implemented in CONIC. CNCSIM comprises several modules, some of which encapsulate the load sharing algorithm. There are a few predefined entry points for routines and data structures which have to be programmed by the algorithm designer. These entries provide for algorithm binding, and describe the algorithm-dependent procedures (initialization, control decision taking, and state update message reception/transmission), data structures (algorithm parameters, messages exchanged by interacting entities), as well as assumptions made by the algorithm designer (e.g. the network-delay model). Thus, the distributed algorithm can be changed simply by module replacement and reconfiguration. CNCSIM provides predictions of response time and other performance measures as a function of offered load, and allows tracing of the progress of each message/event through the system to facilitate model validation and analysis.

4.4. NEST

The Network Simulation Testbed (NEST) is a graphical environment for simulation and rapid prototyping of distributed algorithms, developed at Columbia University and available to the public. NEST is embedded within a standard UNIX environment. Multiple threads of execution are supported within a single UNIX process, so the overhead associated with context switching is significantly reduced. Therefore NEST can support large simulations (from scores to hundreds or thousands of nodes), making it especially attractive in our case where scalability has to be shown.

In NEST, there are no built-in functions for statistics gathering, but since the user can easily program the functions that run on nodes (using NEST’s node-functions) and pass data over links (using NEST’s channel-functions), relevant statistics can easily be gathered.

4.5. Model Validation

The simulation model and associated tool were validated against some analytic models (queuing theory models). In addition, the results of published load sharing algorithms were reproduced. These are described next.

Queuing Theory Models

As a first step towards validating our model, first with CNCSIM and later with NEST, we tried to reproduce the results of very simple queuing models, making some simplifying assumptions.

No cooperation is tried first: this is the simplest case with which to show that the simulator is valid. A few cases were run (under CNCSIM and NEST) and compared successfully to M/M/1 results.

The other extreme of full (ideal) cooperation is represented by the M/M/n model. This model implicitly (and unrealistically) assumes accurate system state information at no cost. A few cases were run and compared successfully to M/M/n results.

To further illustrate our method, we select an example from the literature [10]. Results have been reproduced by both CNCSIM and NEST. This is discussed next.

4.6. Production of Initial Results

Reproduction of earlier results was important for gaining more confidence in the package selected to run the algorithm under study. The results published for FLS in [10] were produced using CNCSIM. Later on, CONIC [12] was replaced by REX [7], which was recently replaced by Regis [13]. It had to be decided whether to convert CNCSIM to each of these distributed programming environments or to move to a more independent and general environment. The second option was chosen, and is discussed in the next section.

In order to be able to reproduce results we need to have a detailed description of all inputs and assumptions made. This enables the reproduction of output. For the reader to have a complete example, we include a detailed description of the required information, so that the process can be easily repeated in other environments.

Assumptions

The following assumptions are used throughout our analysis. Applications are processed in first-come-first-served order and are run to completion. A system is first analyzed subject to an even load on its nodes (Table 1). A simulated system of 5 evenly loaded nodes consists of 4 node types. Each node type is characterized by its CPU demand, arrival rate and application size distribution functions. For a larger system all we need to do is multiply the #nodes as appropriate. We also study the case of uneven or extreme load imposed on the system (Table 2). Both cases have the same overall load.

Each node in the system has a communication co-processor, and is also equipped with a hard disk. Delay in the network is modeled as a function of information size - the size of the information to be sent divided by packet size (1K bits) multiplied by the average delay per packet. Application size is taken from the distribution function described in Table 3.

Packet delay, i.e. preparation (packaging), transmission and reception (unpackaging), is assumed to be 10ms, and the basic cost per invocation for these very simple algorithms is assumed to be 10ms. These algorithms involve internal load state observation. The cost associated with such observation or preprocessing of state information is assumed to be 10ms. An internal state observation is taken over the last second every 200ms. Note that when conducting such a study, the ratio between the requested service time and the various delays is more important than the absolute numbers. Also, load assumptions are drawn from exponential distributions, which are not always realistic. Nevertheless, from this study we are able to draw some useful conclusions.
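The delay model just described can be written down directly. The sketch below (Python; the function name is hypothetical) uses the 1K-bit packet size and the 10ms per-packet delay stated above; rounding up to whole packets is an assumption of the sketch:

```python
import math

def network_delay_ms(info_size_bits, packet_size_bits=1024, per_packet_delay_ms=10):
    """Network delay as a function of information size: the message is split into
    1K-bit packets and each packet costs the average per-packet delay (10 ms)."""
    n_packets = math.ceil(info_size_bits / packet_size_bits)  # rounding up is an assumption
    return n_packets * per_packet_delay_ms

# Example: a 4000-bit state update needs 4 packets, i.e. about 40 ms in this model.
```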

node type      #nodes of this type   CPU requested (sec)   arrival rate (per sec)   resulting intensity per node
node-type 1    1                     exponential(0.5)      Poisson(1.53)            76.5%
node-type 2    2                     exponential(0.7)      Poisson(1.25)            87.5%
node-type 3    1                     exponential(0.6)      Poisson(1.43)            85.8%
node-type 4    1                     exponential(0.5)      Poisson(1.43)            71.5%
overall load                                                                        81.76%

Table 1: Even load (71.5-87.5%)

node type      #nodes of this type   CPU requested (sec)    arrival rate (per sec)   resulting intensity per node
node-type 1    4                     exponential(0.695)     Poisson(1.43)            99.3%
node-type 2    1                     exponential(0.093)     Poisson(1.25)            11.6%
overall load                                                                         81.76%

Table 2: Uneven load (11.6-99.3%)

percentile   size     percentile   size     percentile   size
10           1000     70           18000    95           34000
20           12000    80           20000    98           38000
40           14000    85           22000    99           44000
60           16000    90           30000    99.5         50000

Table 3: Application size distribution

4.7. Reproduction of Published Results with NEST

In NEST the monitor process is responsible for gathering global system statistics (for initial bias detection, the batch independence test and confidence interval calculation) and for simulation termination. A predefined number of observations is set to constitute a single batch. As in CNCSIM, the statistical tests may be performed only when all the nodes of the system have accumulated the required number of batches. The statistical tests could be easily embedded in the NEST simulation environment, resulting in a simple yet powerful method of automatic control over the simulation of a decentralized system. This gives us confidence in the correctness of the obtained results, which is especially hard to achieve in decentralized simulation. The same methods are being used for simulating different extensions to the original FLS algorithm, one of which is proximity handling.

Very similar results to those produced with CNCSIM were recently reproduced with NEST [3], as shown in Fig. 2: response times of 1.95 sec for the even load case and 2.39 sec for the uneven load case were found by CNCSIM, with 1.96 sec and 2.39 sec respectively produced by NEST. This is very encouraging.

Fig. 2: Performance Results Reproduction (FLS response time, even load and uneven load)

5. Conclusions and Future Work

A simulation is a computer-based statistical sampling experiment. Thus, if the results of a simulation study are to have any meaning, appropriate statistical techniques must be used to design and analyze the simulation experiments. This side of simulation is often neglected in simulation studies. We believe it is important for obtaining statistically valid results, and we wish to avoid the erroneous inferences to which simulation, if not carefully conducted, can easily lead. In this paper emphasis was put on rigorous analysis of simulation results.

Any simulation tool selected for algorithm assessment should be viewed as a framework into which one can easily plug the distinct distributed algorithms to be studied; in our case we used load sharing algorithms. The tool should also include the advocated analysis method, enable one to test the algorithm under study in different configurations, and facilitate subsequent implementation. To have confidence in the results produced, they should be statistically correct. Also, the simulation model itself needs to be validated against analytic models (queuing theory models) where possible. To add to our confidence in the selected tool, results of a published load sharing (or any other) algorithm produced by one tool should be reproduced by another. The practitioner is usually interested in his application and much less in simulation-specific or statistical details. However, he is interested in obtaining statistically valid results. Our aim is to supply the practitioner with a set of guidelines which may be easily followed in different environments. We also supply all the details required for running the specific example selected (FLS). This is done to enable results reproduction in different environments.

An analysis method was presented and its generality illustrated by using it in two different simulators, CNCSIM and NEST. The aim of the specific example was to illustrate the flexibility of the statistical method, which is easy to add to any simulation package or prototyping tool. Our experience from using these analysis methods in two different environments is very positive. Results first achieved for a certain algorithm (FLS) with the first tool were almost identical to those attained by the second, which is encouraging.

With NEST, the simulation-based tool is not implemented in the target distributed system environment, thereby slightly limiting the similarity of the code used for simulation to that of the implementation. On the other hand, the generality of a tool like NEST makes it easily portable to many environments. We already have FLS implemented in NEST and plan to make use of this code in three other distributed programming environments (Regis [13], Mosix2 and PVM3), all of which are UNIX based. This will enable reuse of code originally developed for NEST. Also, the limitation of a tool like CNCSIM is its dependence on the target distributed programming environment, which may evolve and change, especially in an academic environment!

After experimenting with NEST, which is a tool running under UNIX and requiring minimal changes for simulation, our conclusion is that the advantages of running a simulation in a general-purpose environment, with only minor changes required for simulation, outweigh its limitations. Also, getting the same results from an unbiased third-party tool, as is the case with NEST, gives the results more credence. Consequently, we strongly advocate the use of such an environment for similar purposes. Our plan is to continue to use NEST to prototype extensions to the adaptive partitioning algorithm (such as proximity handling, dynamically setting parameter values and others). Any of these extensions will first be tried in NEST, and only when it is proven to add to performance will it be ported to the target environment, which can be described as a Network Of Workstations [1].

2 http://www.cs.huji.ac.il/mosix/

3 http://www.epm.ornl.gov/pvm/pvm_home.html

Acknowledgements

We gratefully acknowledge the Ministry of Science grant no. 8500-1-95 for its financial support. We also want to thank Peter Heidelberger for his helpful suggestions in the area of statistical inference. Thanks to Kostas Karagianidis for converting the statistical analysis methods from CNCSIM to NEST. This was later modified to arrive at its current form.

References

[1] T.E. Anderson, D.E. Culler, D.A. Patterson, "A Case for NOW (Networks of Workstations)", IEEE Micro, February 1995, pp. 54-64.

[2] A. Chai, S. Ghosh, "Modelling and Distributed Simulation of a Broadband-ISDN Network", Computer, 26(9), September 1993, pp. 37-52.

[3] J. Dupuy, Y. Schwartz, Y. Yemini, D. Bacon, "NEST: A Network Simulation and Prototyping Testbed", Communications of the ACM, 33(10), Oct. 1990, pp. 63-74.

[4] R.M. Fujimoto, "Parallel Discrete Event Simulation", Communications of the ACM, 33(10), Oct. 1990, pp. 30-53.

[5] P. Heidelberger, P.D. Welch, "Simulation Run Length Control in the Presence of an Initial Transient", Operations Research, 31(6), 1983, pp. 1109-1144.

[6] J.P.C. Kleijnen, "Statistical Tools for Simulation Practitioners", Marcel Dekker, 1987.

[7] J. Kramer, "Configuration Programming - A Framework for the Development of Distributable Systems", Proceedings of the 7th Int. Conf. on Computer Systems and Software Engineering (CompEuro 90), Israel, May 1990, pp. 374-383.

[8] O. Kremien, J. Kramer, J. Magee, "Rapid Assessment of Decentralized Algorithms", Proceedings of the 7th Int. Conf. on Computer Systems and Software Engineering (CompEuro 90), Israel, May 1990, pp. 329-335.

[9] O. Kremien, J. Kramer, "Methodical Analysis of Adaptive Load Sharing Algorithms", IEEE Trans. on Parallel and Distributed Systems, November 1992, pp. 747-760.

[10] O. Kremien, J. Kramer, J. Magee, "Scalable and Adaptive Load Sharing for Distributed Systems", IEEE Parallel and Distributed Technology, 1(3), Aug. 1993, pp. 62-70.

[11] A.M. Law, "Statistical Analysis of Simulation Output Data", Operations Research, 31(6), 1983.

[12] J. Magee, J. Kramer, M. Sloman, "Constructing Distributed Systems in CONIC", IEEE Transactions on Software Engineering, 15(6), June 1989, pp. 663-675.

[13] J. Magee, N. Dulay, J. Kramer, "Regis: A Constructive Development Environment for Distributed Programs", Distributed Systems Engineering Journal, 1(5), Special Issue on Configurable Distributed Systems, 1994, pp. 304-312.

[14] B. Schmeiser, "Batch Size Effects in the Analysis of Simulation Output", Operations Research, 30(3), 1982, pp. 556-568.

[15] L. Schruben, "Confidence Interval Estimation Using Standardized Time Series", Operations Research, 31(6), 1983, pp. 1090-1108.

[16] L. Schruben, H. Singh, L. Tierney, "Optimal Tests for Initialization Bias in Simulation Output", Operations Research, 31(6), 1983, pp. 1167-1178.

Appendix A: Transient Test (Initialization-Bias Detection and Removal)

We calculate the sequence of absolute values of differences between the average of the entire sequence and the average of the first m observations, i.e. the moving averages up to the number of batches, over the second half of the sequence. We then estimate the variance from the last half of the batches. We can then calculate the statistic to be used in our test. There are four possibilities: the Cramér-von Mises, Anderson-Darling or Kolmogorov-Smirnov statistic, or Schruben's statistic [16]. To illustrate a possible implementation we select the latter:

T̂ = (1 / (σ̂ · m^(3/2))) · Σ(n=1 to m) n · (1 − n/m) · (ȳn − ȳm)

is a statistic with a χ2 distribution with 3 degrees of freedom. We perform a two-sided test, rejecting the hypothesis of no initialization bias if T̂ is significant, i.e. if T̂ exceeds the upper α/2 critical value or falls below the lower α/2 critical value of that distribution. Suppose, for example, that the current checkpoint of the simulation run is j1 = 200 batches. We calculate the sequence S as Sm = ȳn − ȳm, where m = 101, ..., j1 and n = j1 = 200. The standard deviation of the sequence Y101, ..., Y200 is calculated as

σ̂ = sqrt( Σ(m=101 to 200) (ym − ȳ200)² / 100 ).

For more details the reader is referred to [16].
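The quantities used in the example above can be computed as in the sketch below (Python; the function name is hypothetical). It returns the cusum sequence Sm and the standard deviation estimated from the second half of the run; the standardized statistic T̂ and its critical values are then formed from these quantities as described in [16], and are not reproduced here:

```python
from math import sqrt
from statistics import mean

def transient_test_ingredients(y):
    """Compute Sm = ybar_n - ybar_m over the second half of the run and the
    standard deviation estimated from the second half (illustrative sketch).

    y : batch responses Y(1), ..., Y(n) at the current checkpoint, e.g. n = 200.
    """
    n = len(y)
    half = n // 2
    ybar_n = mean(y)                                    # mean of the whole sequence
    cum_means = [mean(y[:m]) for m in range(1, n + 1)]  # ybar_1, ..., ybar_n
    s = [ybar_n - cum_means[m - 1] for m in range(half + 1, n + 1)]  # Sm, second half
    sigma = sqrt(sum((y[m] - ybar_n) ** 2 for m in range(half, n)) / (n - half))
    return s, sigma
```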

Appendix B: Independence Test (Von Neumann Test)

We accumulate a number of batches and calculate a mean estimate; we also compute an estimate of the variance. The Von Neumann statistic is the ratio of the mean square successive difference of the batch responses to their variance:

q̂ = Σ(i=1 to n−1) (Xi+1 − Xi)² / Σ(i=1 to n) (Xi − X̄)²

Under the hypothesis of independence, q̂ has expectation (approximately) 2 and variance 4(n−2)/((n−1)(n+1)), where n is the number of accumulated batches. The statistic is compared with a critical value taken from a table, which is a function of n and 1−α.

If q̂ < 2 − z(1−α) · σ̂q, the batches are not independent; otherwise they are treated as independent.
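A sketch of the test follows (Python; the function name and the use of a fixed normal critical value are assumptions made for illustration):

```python
from statistics import mean

def von_neumann_independent(batch_avgs, z_crit=1.645):
    """Von Neumann independence check on a sequence of batch averages (sketch).

    batch_avgs : stationary batch averages X(1), ..., X(n)
    z_crit     : one-sided normal critical value (1.645 for 1 - alpha = 0.95)
    Returns True if the batches can be treated as independent.
    """
    n = len(batch_avgs)
    xbar = mean(batch_avgs)
    mssd = sum((batch_avgs[i + 1] - batch_avgs[i]) ** 2 for i in range(n - 1))
    ss = sum((x - xbar) ** 2 for x in batch_avgs)
    q = mssd / ss                                  # von Neumann ratio; about 2 if independent
    sigma_q = (4.0 * (n - 2) / ((n - 1) * (n + 1))) ** 0.5
    # Positive autocorrelation pushes q below 2; reject independence if q is too small.
    return q >= 2.0 - z_crit * sigma_q
```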
