Introduction
  You can observe a lot just by watching.
  Data gathering results in a conceptual model of how the system operates.
  Data gathering should avoid ending up with lots of data but very little useful information.
Data Collection and Analysis 2
  What is the best procedure to follow?
  What types of data should be gathered?
  What sources should be used?
  What types of analyses should be performed on the data?
  How do you select the right probability distribution to represent the data?
  How should data be documented?
Focus on key impact factors
  Avoid information with little impact (e.g., off-hour performance, extremely rare downtimes, negligible move times).
Separate input variables from response variables
Focus on essence rather than substance
  Capture cause-and-effect relationships and ignore meaningless details.
  Focus on activities that use resources or delay entity flow (system abstraction).
  Determine data requirements
  Identify data sources
  Collect the data
  Make assumptions
  Analyze the data
  Document and approve the data
Structural Data
  All the objects in the system to be modeled
  Describe the layout of the system
  Identify the items to be processed (e.g., entities, resources, locations)
Operational Data
  Explain how the system operates
  When, where, and how events and activities take place
  Consist of the logic information about the system, e.g., routing, schedules, downtime behavior, and resource allocation
Numerical Data
  Provide quantitative information about the system
  Some values are easy to get, but some are not
  E.g., capacities, arrival rates, activity times
Use of a Questionnaire (for a sample, see p. 103)
  A questionnaire helps gather the right information.
  If sample data are not available, it is useful to get at least an estimate of the minimum, most likely, and maximum values until more precise data are obtained.
Good sources of data
  Historical records
  System documentation
  Personal observation
  Personal interviews
  Comparison with similar systems
  Vendor claims
  Design estimates
  Research literature
Description of Operation
  Explains how entities are processed and provides the details of the entity flow diagram (EFD)
Requirements
  Time and resource requirements of the activity or operation
  Where, when, and in what quantities entities get routed next
Incidental data (downtimes, setups, and work priorities) are not essential to a basic model but are necessary for a complete and accurate one.
Once a basic model has been constructed, any numerical values (e.g., activity times, arrival rates) should be firmed up.
Making Assumptions
  A simulation can't run with incomplete data, so assumptions are required for any unknown or future conditions.
  Assumptions must make sense in the overall operation of the model. Seeing absurd behavior may tell us that certain assumptions don't make sense.
Data should be analyzed to ascertain their suitability for use.
Data characteristics:
  Independence (randomness)
  Homogeneity (the data come from the same distribution)
  Stationarity (the distribution of the data does not change over time)
Parameters
  Mean: the average of the data
  Median: the value of the middle observation
  Mode: the value with the greatest frequency
Descriptive Statistics
Parameters
  Standard deviation: a measure of the average deviation from the mean
  Variance: the square of the standard deviation
  Coefficient of variation: the standard deviation divided by the mean
  Skewness: a measure of asymmetry
  Kurtosis: a measure of flatness or peakedness
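As a rough illustration of the parameters listed above, the sketch below computes them with Python's standard library. The function name describe and the sample values are made up for the example; skewness and kurtosis use the simple moment-based definitions (kurtosis reported as excess over the normal value of 3), which may differ slightly from the bias-corrected forms some packages report.

```python
import statistics

def describe(data):
    """Return common descriptive statistics for a sample (illustrative helper)."""
    n = len(data)
    mean = statistics.mean(data)
    std = statistics.stdev(data)  # sample standard deviation (n - 1 divisor)
    # Central moments with an n divisor, used for skewness and kurtosis.
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    return {
        "mean": mean,
        "median": statistics.median(data),
        "mode": statistics.mode(data),
        "std_dev": std,
        "variance": statistics.variance(data),
        "coeff_of_variation": std / mean,
        "skewness": m3 / m2 ** 1.5,      # 0 for a symmetric sample
        "kurtosis": m4 / m2 ** 2 - 3.0,  # excess kurtosis, 0 for a normal shape
    }

# Made-up service times in minutes, for illustration only.
service_times = [4.2, 5.1, 3.8, 5.1, 6.0, 4.7, 5.1, 4.4]
for name, value in describe(service_times).items():
    print(f"{name}: {value:.3f}")
```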
Scatter Plot
  A plot of adjacent points in the sequence of observed values, plotted against each other
  Each point is a pair of consecutive observations (X_i, X_{i+1}), i = 1, ..., n-1
  X_i's positively correlated: positively sloped trend line
  X_i's negatively correlated: negatively sloped trend line
Autocorrelation Plot
  If observations in a sample are independent, they are uncorrelated.
  Assume that the data are taken from a stationary process.
  The measure of autocorrelation is called rho (ρ) (see p. 104).
  Autocorrelation lies in [-1, 1], i.e., -1 <= ρ <= 1.
  If ρ is near either extreme, 1 or -1, the data are autocorrelated.
  If ρ is near 0, there is little or no correlation in the data.
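The scatter-plot pairs and the autocorrelation measure can be sketched as follows. This uses the common sample estimator of lag-j autocorrelation, which is an assumption here and may not be written exactly like the formula on p. 104; the function names and the two example series are hypothetical.

```python
def lag_pairs(data, lag=1):
    """Pairs (X_i, X_{i+lag}) that would be plotted in the scatter plot."""
    return list(zip(data[:-lag], data[lag:]))

def autocorrelation(data, lag=1):
    """Sample autocorrelation at the given lag; the result lies in [-1, 1]."""
    n = len(data)
    mean = sum(data) / n
    denom = sum((x - mean) ** 2 for x in data)
    num = sum((data[i] - mean) * (data[i + lag] - mean) for i in range(n - lag))
    return num / denom

# Made-up series: one trending (positively correlated), one alternating.
trending = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
alternating = [1, -1] * 5

print(lag_pairs(trending)[:3])
print(autocorrelation(trending))     # positive: neighbors move together
print(autocorrelation(alternating))  # strongly negative: neighbors oppose
```

A value near 0 at every lag is consistent with independent observations; values near 1 or -1 suggest the data should not be treated as a random sample.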
Runs Test
  A run in a series of observations is an uninterrupted sequence of numbers showing the same trend, e.g., a run up or a run down.
Types of runs tests: if there are too many or too few runs, the randomness of the series is rejected.
  Median test: measures the number of runs (sequences of numbers) above and below the median
  Turning point test: measures the number of times the series changes direction
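Both runs tests can be sketched with the standard large-sample normal approximations: the observed count is compared with its expected value under randomness, and a z-score far from 0 (say, beyond about 1.96 in absolute value at the 5% level) indicates too many or too few runs. The function names are hypothetical, and how a particular package handles ties may differ from the choices made here.

```python
import math
import statistics

def median_runs_z(data):
    """Runs above/below the median and the z-score under randomness."""
    med = statistics.median(data)
    signs = [x > med for x in data if x != med]  # values equal to the median are dropped
    n1 = sum(signs)            # observations above the median
    n2 = len(signs) - n1       # observations below the median
    runs = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    expected = 2 * n1 * n2 / (n1 + n2) + 1
    var = (2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)
           / ((n1 + n2) ** 2 * (n1 + n2 - 1)))
    return runs, (runs - expected) / math.sqrt(var)

def turning_points_z(data):
    """Number of direction changes and the z-score under randomness."""
    n = len(data)
    turns = sum(1 for a, b, c in zip(data, data[1:], data[2:])
                if (b - a) * (c - b) < 0)
    expected = 2 * (n - 2) / 3
    var = (16 * n - 29) / 90
    return turns, (turns - expected) / math.sqrt(var)

# Made-up series with an obvious upward trend: far too few runs.
trend = list(range(1, 21))
print(median_runs_z(trend))
print(turning_points_z(trend))
```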
Examples of nonhomogeneous data:
  Activity times that are longer or shorter depending on the type of entity being processed
  Interarrival times that vary in length depending on the time of day or day of the week
Visually inspect the distribution to see whether it has more than one mode (p. 118, Fig. 5.9).
  Analysis of variance (ANOVA) for normally distributed data
  Two-sample test, chi-square multi-sample test, Kruskal-Wallis nonparametric test
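As one example of the tests named above, the Kruskal-Wallis H statistic can be computed by hand from pooled ranks. This is a minimal sketch without the tie correction; the function name and the two sets of activity times are made up. For two groups, H is compared with the chi-square critical value for one degree of freedom (3.841 at the 5% level), an approximation that is rough for samples this small.

```python
def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H statistic (no tie correction) for two or more samples."""
    pooled = sorted(x for g in groups for x in g)
    # Assign each distinct value the average of the ranks it occupies.
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    n = len(pooled)
    rank_term = sum(sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups)
    return 12 / (n * (n + 1)) * rank_term - 3 * (n + 1)

# Made-up activity times (minutes) for two entity types.
type_a = [4.1, 3.9, 4.4, 4.0, 4.3, 3.8]
type_b = [6.2, 5.9, 6.5, 6.1, 5.8, 6.4]

h = kruskal_wallis_h(type_a, type_b)
CRITICAL_5PCT_DF1 = 3.841  # chi-square critical value, 1 degree of freedom
print(h, "nonhomogeneous" if h > CRITICAL_5PCT_DF1 else "no evidence of a difference")
```

If the statistic exceeds the critical value, the two samples are unlikely to share a distribution and should be modeled separately.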
One type of nonhomogeneous data occurs when the distribution changes over time (nonstationary, or time-variant, data).
Examples of time-changing distributions:
  Learning curve
  Arrival rate of customers to a service facility
Nonstationary data can be detected by plotting subgroups of data that occur within successive time intervals (Fig. 5.10).
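The subgrouping idea can be sketched by bucketing observations into successive time intervals and comparing a summary of each bucket. The helper name, the interval width, and the data are hypothetical; in practice the per-interval values would be plotted rather than printed.

```python
import statistics
from collections import defaultdict

def mean_by_interval(timestamps, values, width):
    """Average the values observed within each successive interval of the given width."""
    buckets = defaultdict(list)
    for t, v in zip(timestamps, values):
        buckets[int(t // width)].append(v)
    return {k: statistics.mean(vs) for k, vs in sorted(buckets.items())}

# Made-up data: clock time in hours and the interarrival time (minutes) seen then.
clock = [0.2, 0.7, 1.1, 1.8, 2.3, 2.9, 3.4, 3.8]
interarrival = [9.0, 8.5, 6.0, 5.5, 3.0, 2.5, 6.5, 7.0]

for hour, avg in mean_by_interval(clock, interarrival, width=1.0).items():
    print(f"hour {hour}: mean interarrival {avg:.2f} min")
```

Means that drift or cycle from one interval to the next suggest that a single distribution should not be fitted to the whole data set.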
Run Stat::Fit and see which distribution best fits each data set. If the same distribution fits both, the data sets are assumed to come from the same population.
Distribution Fitting
Three ways of representing data:
  Original data record
    The data set is usually not large enough.
  Empirical distribution (characterizes the data)
    Continuous frequency distribution: the percentage of values that fall within given intervals
Distribution Fitting
  Empirical distribution (characterizes the data)
    Discrete frequency distribution: the percentage of times a particular value occurs
    Drawbacks
      An insufficient sample size may create artificial bias.
      It may fail to capture rare extreme values that exist in the population from which the data were sampled.
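A discrete frequency distribution and sampling from it can be sketched as below. The function names and the batch-size data are made up. The example also shows the drawback noted above: sampled values can only ever be values that appeared in the original data.

```python
import random
from collections import Counter

def discrete_frequency(data):
    """Fraction of observations taking each distinct value."""
    counts = Counter(data)
    n = len(data)
    return {value: counts[value] / n for value in sorted(counts)}

def sample_empirical(freq, rng, k):
    """Draw k values with probabilities given by the empirical frequencies."""
    values = list(freq)
    weights = list(freq.values())
    return rng.choices(values, weights=weights, k=k)

# Made-up observed batch sizes.
observed = [1, 1, 2, 3, 1, 2, 2, 1, 4, 1]
freq = discrete_frequency(observed)
print(freq)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
draws = sample_empirical(freq, rng, 1000)
print(sorted(set(draws)))  # never contains a value outside the observed ones
```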
Distribution Fitting
  Theoretical distribution
    Fit a theoretical distribution to the data.
    Random variates generated from the probability distribution provide the simulated random values.
Distribution Fitting
  Theoretical distribution
    Most simulation software provides utilities for fitting distributions to numerical data.
Theoretical Distribution
Uniform Distribution (see p. 124)
  X ~ U(a, b) with E[X] = (a + b)/2, Var[X] = (b - a)^2/12
  Used as a first model for a quantity that is felt to vary randomly between a and b but about which little else is known
Theoretical Distribution
Triangular Distribution (see p. 124)
  X ~ Triang(a, m, b) with E[X] = (a + m + b)/3, Var[X] = (a^2 + m^2 + b^2 - am - ab - bm)/18
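The mean and variance formulas for the uniform and triangular distributions can be checked numerically. The sketch below is an illustration with made-up parameters; note that Python's random.triangular takes its arguments in the order (low, high, mode), not (a, m, b).

```python
import random
import statistics

def uniform_moments(a, b):
    """Mean and variance of U(a, b)."""
    return (a + b) / 2, (b - a) ** 2 / 12

def triangular_moments(a, m, b):
    """Mean and variance of Triang(a, m, b) with minimum a, mode m, maximum b."""
    mean = (a + m + b) / 3
    var = (a * a + m * m + b * b - a * m - a * b - b * m) / 18
    return mean, var

rng = random.Random(0)   # fixed seed so the sketch is repeatable
a, m, b = 2.0, 4.0, 9.0  # hypothetical minimum, most likely, maximum
sample = [rng.triangular(a, b, m) for _ in range(20_000)]

print("theory:", triangular_moments(a, m, b))
print("sample:", statistics.fmean(sample), statistics.variance(sample))
```

The sample values should land close to the theoretical ones, with the gap shrinking as the sample grows.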
Theoretical Distribution
Normal Distribution (see p. 125)
  X ~ N(μ, σ^2) with E[X] = μ, Var[X] = σ^2
  Symmetric (bell-shaped curve)
  Physical measurements: height, length
  Certain activity times
Gamma Distribution
  X ~ Gamma(α, β) with E[X] = αβ, Var[X] = αβ^2
  Used as the time to complete some task, e.g., customer service or machine repair
Theoretical Distribution
Beta Distribution
  X ~ Beta(α1, α2)
  Used as a rough model in the absence of data
  Distribution of a random proportion, e.g., the proportion of defective items in a shipment; time to complete a task in a PERT network
Theoretical Distribution
Weibull Distribution
  X ~ Weibull(α, β); Exp(β) = Weibull(1, β)
  Used as the time to complete some task or the time to failure of a piece of equipment
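The relationship between the exponential and Weibull distributions can be checked directly from their cumulative distribution functions. This sketch assumes the shape-scale parameterization, in which a Weibull with shape 1 and scale β coincides with an exponential with mean β; the function names and the value 2.5 are made up. If sampling instead, note that Python's random.weibullvariate(alpha, beta) takes the scale first and the shape second.

```python
import math

def weibull_cdf(x, shape, scale):
    """P(X <= x) for a Weibull with the given shape and scale."""
    return 1.0 - math.exp(-((x / scale) ** shape)) if x > 0 else 0.0

def exponential_cdf(x, mean):
    """P(X <= x) for an exponential with the given mean."""
    return 1.0 - math.exp(-x / mean) if x > 0 else 0.0

beta = 2.5  # hypothetical scale / mean
for x in (0.5, 1.0, 2.5, 5.0, 10.0):
    print(x, weibull_cdf(x, 1.0, beta), exponential_cdf(x, beta))
```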
Stat::Fit does a reasonable job of fitting data and ranks the candidate distributions (p. 127).
Distribution fitting is a trial-and-error process.
A goodness-of-fit test evaluates each fitted distribution to ascertain the relative goodness of fit.
Goodness-of-fit tests: chi-square (χ^2) and Kolmogorov-Smirnov
If little data are available, a goodness-of-fit test is unlikely to reject any candidate distribution.
It is a good idea to look at a graphical display, such as a histogram, before making decisions.
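To illustrate how candidate distributions might be ranked, the sketch below computes the Kolmogorov-Smirnov statistic D, the largest gap between the empirical and the fitted cumulative distribution functions, for two candidates. It is not how Stat::Fit works internally; the data are generated for the example, the parameter fits are simple moment or range estimates, and the function name is hypothetical. A smaller D indicates a closer fit.

```python
import math
import random

def ks_statistic(data, cdf):
    """Largest distance between the empirical CDF of data and a candidate CDF."""
    xs = sorted(data)
    n = len(xs)
    return max(
        max((i + 1) / n - cdf(x), cdf(x) - i / n)
        for i, x in enumerate(xs)
    )

rng = random.Random(1)  # fixed seed so the sketch is repeatable
times = [rng.expovariate(1 / 5.0) for _ in range(200)]  # made-up repair times

mean = sum(times) / len(times)  # fitted exponential mean
upper = max(times)              # fitted uniform upper limit (lower limit 0)
candidates = {
    "exponential": lambda x: 1.0 - math.exp(-x / mean),
    "uniform": lambda x: min(max(x / upper, 0.0), 1.0),
}

ranking = sorted((ks_statistic(times, cdf), name) for name, cdf in candidates.items())
for d, name in ranking:
    print(f"{name}: D = {d:.3f}")
```

With only a handful of observations, D would be small relative to its critical value for almost any candidate, which is why a histogram is worth checking alongside the ranking.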
Data Absence
  About 10 customer arrivals per hour
  Approximately 20 minutes to assemble parts
  Around five machine failures per day
  1.5 to 3 minutes to inspect items
  5 to 10 customer arrivals per hour
  4 to 6 minutes to set up a machine
Minimum, most likely, and maximum values can easily be set up as a triangular distribution.
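One way such estimates might be turned into random variates is sketched below. Only the triangular mapping comes from the text; treating a plain range as uniform and an arrival rate as exponential interarrival times are assumptions made for the example, as are the function names and the most likely setup time of 5 minutes.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

def inspection_time():
    """'1.5 to 3 minutes to inspect items', assumed uniform over the range."""
    return rng.uniform(1.5, 3.0)

def setup_time():
    """'4 to 6 minutes to set up a machine' with a hypothetical most likely value of 5."""
    return rng.triangular(4.0, 6.0, 5.0)  # arguments are (low, high, mode)

def interarrival_time():
    """'About 10 customer arrivals per hour', assumed exponential, in minutes."""
    return rng.expovariate(10 / 60)  # rate per minute, so the mean is 6 minutes

for _ in range(3):
    print(round(inspection_time(), 2), round(setup_time(), 2), round(interarrival_time(), 2))
```

These stand-ins should be replaced by fitted distributions once sample data become available.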
Summary
  Data should be collected systematically.
  There are three types of data: structural, operational, and numerical.
  A questionnaire is a good way to request information.
Summary
  Numerical data for random variables should be analyzed to test for independence and homogeneity.