Q1. Define "Statistics". What are the functions of Statistics? Distinguish between Primary data and Secondary data.

Ans. Statistics as a discipline is considered indispensable in almost all spheres of human knowledge. There is hardly any branch of study which does not use statistics. Scientific, social and economic studies use statistics in one form or another. These disciplines make use of observations, facts and figures, enquiries and experiments, using statistics and statistical methods. Statistics studies almost all aspects of an enquiry. It mainly aims at simplifying the complexity of the information collected in an enquiry. It presents data in a simplified form so as to make them intelligible. It analyses data and facilitates the drawing of conclusions. Now let us briefly discuss some of the important functions of statistics.

Presents facts in simple form: Statistics presents facts and figures in a definite form. That makes a statement more logical and convincing than mere description. It condenses the whole mass of figures into a single figure. This makes the problem intelligible.

Reduces the complexity of data: Statistics simplifies the complexity of data. Raw data are unintelligible. We make them simple and intelligible by using different statistical measures. Some commonly used measures are graphs, averages, dispersions, skewness, kurtosis, correlation and regression. These measures help in interpretation and in drawing inferences. Therefore, statistics enables one to enlarge the horizon of one's knowledge.

Facilitates comparison: Comparison between different sets of observations is an important function of statistics. Comparison is necessary to draw conclusions. As Professor Boddington rightly points out, "the object of statistics is to enable comparison between past and present results, to ascertain the reasons for changes which have taken place and the effect of such changes in future." So to determine the efficiency of any measure, comparison is necessary. Statistical devices like averages, ratios, coefficients etc. are used for the purpose of comparison.

Testing hypothesis: Formulating and testing hypotheses is an important function of statistics. This helps in developing new theories. So statistics examines the truth and helps in innovating new ideas.

Formulation of policies: Statistics helps in formulating plans and policies in different fields. Statistical analysis of data forms the beginning of policy formulation. Hence, statistics is essential for planners, economists, scientists and administrators to prepare different plans and programmes.

Forecasting: The future is uncertain. Statistics helps in forecasting trends and tendencies. Statistical techniques are used for predicting the future values of a variable. For example, a producer forecasts his future production on the basis of the present demand conditions and his past experience. Similarly, planners can forecast the future population by considering the present population trends.

Derives valid inferences: Statistical methods mainly aim at deriving inferences from an enquiry. Statistical techniques are often used by scholars, planners and scientists to evaluate different projects. These techniques are also used to draw inferences regarding population parameters on the basis of sample information.

Statistics is very helpful in the fields of business, research, education etc. Some of the uses of Statistics are:

Statistics helps in providing a better understanding and exact description of a phenomenon of nature.
Statistics helps in proper and efficient planning of a statistical inquiry in any field of study.
Statistics helps in collecting appropriate quantitative data.
Statistics helps in presenting complex data in a suitable tabular, diagrammatic and graphic form for easy comprehension of the data.
Statistics helps in understanding the nature and pattern of variability of a phenomenon through quantitative observations.
Statistics helps in drawing valid inferences, along with a measure of their reliability, about the population parameters from the sample data.

Any statistical data can be classified under two categories depending upon the sources utilized. These categories are:
1. Primary data
2. Secondary data

Primary Data: Primary data is data collected by the investigator himself for the purpose of a specific inquiry or study. Such data is original in character and is generated by a survey conducted by individuals, a research institution or any organisation.

1. The collection of data by the method of personal survey is possible only if the area covered by the investigator is small. Collection of data by sending enumerators is bound to be expensive. Extra care should be taken that the enumerators record correctly the information provided by the informants.
2. Collection of primary data by framing schedules or by distributing and collecting questionnaires by post is less expensive and can be completed in a shorter time.
3. If the questions are embarrassing, of a complicated nature, or probe into the personal affairs of individuals, then the schedules may not be filled in with accurate and correct information, and hence this method is unsuitable.
4. The information collected as primary data is more reliable than that collected from secondary data.

The importance of primary data cannot be neglected. Research can be conducted without secondary data, but research based only on secondary data is least reliable and may be biased, because secondary data has already been manipulated by human beings. In statistical surveys it is necessary to get information from primary sources and work on primary data: for example, the statistical records of the female population in a country cannot be based on newspapers, magazines and other printed sources. Firstly, such sources are old, and secondly, they contain limited information and can be misleading and biased.

Secondary Data: Secondary data are data which have already been collected and analysed by some earlier agency for its own use, and which are later used by a different agency. According to W.A. Neiswanger, 'A primary source is a publication in which the data are published by the same authority which gathered and analysed them. A secondary source is a publication reporting the data which have been gathered by other authorities and for which others are responsible'.
1. Secondary data is cheap to obtain. Many government publications are relatively cheap, and libraries stock quantities of secondary data produced by the government, by companies and by other organizations.
2. Large quantities of secondary data can be obtained through the internet.
3. Much of the secondary data available has been collected for many years and therefore it can be used to plot trends.
4. Secondary data is of value to:
   The government: it helps in making decisions and planning future policy.
   Business and industry: in areas such as marketing and sales, in order to appreciate the general economic and social conditions and to provide information on competitors.
   Research organizations: by providing social, economic and industrial information.

Secondary data can be less valid, but it is still important. Sometimes it is difficult to obtain primary data; in these cases getting information from secondary sources is easier and possible. Sometimes primary data does not exist; in such a situation one has to confine the research to secondary data. Sometimes primary data is present but the respondents are not willing to reveal it; in such a case too, secondary data can suffice. For example, if the research is on the psychology of transsexuals, it is first difficult to find respondents, and second they may not be willing to give the information you want for your research, so you can collect data from books or other published sources.

Q2. Draw a histogram for the following distribution:

Age             0-10   10-20   20-30   30-40   40-50
No. of People   2      5       10      8       4
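Ans. [Figure: histogram of No. of People against the Age classes 0-10 to 40-50]

Q3. Find the (i) arithmetic mean and (ii) the median value of the following set of values: 40, 32, 24, 36, 42, 18, 10.

Ans. (i) Arithmetic mean = (40 + 32 + 24 + 36 + 42 + 18 + 10) / 7 = 202 / 7 = 28.86

(ii) Arranging in ascending order: 10, 18, 24, 32, 36, 40, 42. There are seven values, so the median is the fourth value. Therefore 32 is the median value.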
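The arithmetic in the Q3 answer can be checked with a few lines of Python. This is only a sketch using the standard-library `statistics` module; the variable names are illustrative and not part of the original answer.

```python
from statistics import mean, median

# The seven values given in Q3.
values = [40, 32, 24, 36, 42, 18, 10]

# Arithmetic mean: sum of the values divided by how many there are.
arithmetic_mean = mean(values)

# Median: the middle value once the list is sorted (odd count, so one middle value).
median_value = median(values)

print("Sorted values:", sorted(values))
print("Arithmetic mean:", round(arithmetic_mean, 2))
print("Median:", median_value)
```

The mean is 202/7, which rounds to 28.86 at two decimal places, and the median is the fourth of the seven sorted values.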
Q4. Calculate the standard deviation of the following data:

Marks             78-80   80-82   82-84   84-86   86-88   88-90
No. of students   3       15      26      23      9       4

Ans.
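The worked answer to Q4 did not survive in this copy, so the following is a hedged sketch of one standard way to compute it, not the original solution. It assumes each class is represented by its midpoint and uses the population formula (dividing by N, the total frequency); both choices are assumptions, and the variable names are illustrative.

```python
from math import sqrt

# Class limits and frequencies as read from the Q4 table.
classes = [(78, 80), (80, 82), (82, 84), (84, 86), (86, 88), (88, 90)]
frequencies = [3, 15, 26, 23, 9, 4]

# Assumption: every observation in a class sits at the class midpoint.
midpoints = [(low + high) / 2 for low, high in classes]

n = sum(frequencies)  # total number of students

# Mean of grouped data: sum(f * x) / N
grouped_mean = sum(f * x for f, x in zip(frequencies, midpoints)) / n

# Population variance of grouped data: sum(f * (x - mean)^2) / N
variance = sum(f * (x - grouped_mean) ** 2 for f, x in zip(frequencies, midpoints)) / n

std_dev = sqrt(variance)

print("N =", n)
print("Mean =", round(grouped_mean, 2))
print("Variance =", round(variance, 2))
print("Standard deviation =", round(std_dev, 2))
```

Under these assumptions N = 80, the mean is 83.8 and the variance is 5.46, giving a standard deviation of about 2.34 marks. Dividing by N - 1 instead would give a slightly larger value.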
Q5. Explain the following terms with respect to Statistics: (i) Sample, (ii) Variable, (iii) Population.

Ans. (i) Sample

In statistics, a sample is a subset of a population. Typically, the population is very large, making a census or a complete enumeration of all the values in the population impractical or impossible. The sample represents a subset of manageable size. Samples are collected and statistics are calculated from the samples so that one can make inferences or extrapolations from the sample to the population. This process of collecting information from a sample is referred to as sampling.

A complete sample is a set of objects from a parent population that includes ALL such objects that satisfy a set of well-defined selection criteria. For example, a complete sample of Australian men taller than 2m would consist of a list of every Australian male taller than 2m. But it wouldn't include German males, or tall Australian females, or people shorter than 2m. So to compile such a complete sample requires a complete list of the parent population, including data on height, gender, and nationality for each member of that parent population. In the case of human populations, such a complete list is unlikely to exist, but such complete samples are often available in other disciplines, such as complete magnitude-limited samples of astronomical objects.

An unbiased sample is a set of objects chosen from a complete sample using a selection process that does not depend on the properties of the objects. For example, an unbiased sample of Australian men taller than 2m might consist of a randomly sampled subset of 1% of Australian males taller than 2m. But one chosen from the electoral register might not be unbiased since, for example, males aged under 18 will not be on the electoral register. In an astronomical context, an unbiased sample might consist of that fraction of a complete sample for which data are available, provided the data availability is not biased by individual source properties.

The best way to avoid a biased or unrepresentative sample is to select a random sample, also known as a probability sample. A random sample is defined as a sample where each individual member of the population has a known, non-zero chance of being selected as part of the sample. Several types of random samples are simple random samples, systematic samples, stratified random samples, and cluster random samples.

(ii) Variable

A variable is a characteristic that may assume more than one set of values to which a numerical measure can be assigned. Height, age, amount of income, province or country of birth, grades obtained at school and type of housing are all examples of variables. Variables may be classified into various categories, some of which are outlined in this section.

Categorical variables: A categorical variable (also called a qualitative variable) is one for which each response can be put into a specific category. These categories must be mutually exclusive and exhaustive. Mutually exclusive means that each possible survey response should belong to only one category, whereas exhaustive requires that the categories should cover the entire set of possibilities. Categorical variables can be either nominal or ordinal.

Nominal variables: A nominal variable is one that describes a name or category. Contrary to ordinal variables, there is no 'natural ordering' of the set of possible names or categories.

Ordinal variables: An ordinal variable is a categorical variable for which the possible categories can be placed in a specific order or in some 'natural' way.

Numeric variables: A numeric variable, also known as a quantitative variable, is one that can assume a number of real values, such as age or number of people in a household. However, not all variables described by numbers are considered numeric. For example, when you are asked to assign a value from 1 to 5 to express your level of satisfaction, you use numbers, but the variable (satisfaction) is really an ordinal variable. Numeric variables may be either continuous or discrete.

Continuous variables: A variable is said to be continuous if it can assume an infinite number of real values. Examples of a continuous variable are distance, age and temperature. The measurement of a continuous variable is restricted by the methods used, or by the accuracy of the measuring instruments. For example, the height of a student is a continuous variable because a student may be 1.6321748755... metres tall.

Discrete variables: As opposed to a continuous variable, a discrete variable can only take a finite number of real values. An example of a discrete variable would be the score given by a judge to a gymnast in competition: the range is 0 to 10 and the score is always given to one decimal (e.g., a score of 8.5).

(iii) Population

A statistical population is a set of entities concerning which statistical inferences are to be drawn, often based on a random sample taken from the population. For example, if we were interested in generalizations about crows, then we would describe the set of crows that is of interest. Notice that if we choose a population like all crows, we will be limited to observing crows that exist now or will exist in the future. Probably, geography will also constitute a limitation, in that our resources for studying crows are also limited.

Population is also used to refer to a set of potential measurements or values, including not only cases actually observed but those that are potentially observable. Suppose, for example, we are interested in the set of all adult crows now alive in the county of Cambridgeshire, and we want to know the mean weight of these birds. For each bird in the population of crows there is a weight, and the set of these weights is called the population of weights.

A subset of a population is called a subpopulation. If different subpopulations have different properties, the properties and response of the overall population can often be better understood if it is first separated into distinct subpopulations. For instance, a particular medicine may have different effects on different subpopulations, and these effects may be obscured or dismissed if such special subpopulations are not identified and examined in isolation. Similarly, one can often estimate parameters more accurately if one separates out subpopulations: the distribution of heights among people is better modeled by considering men and women as separate subpopulations. Populations consisting of subpopulations can be modeled by mixture models, which combine the distributions within subpopulations into an overall population distribution.

Q6. An unbiased coin is tossed six times. What is the probability that the tosses will result in: (i) at least four heads, and (ii) exactly two heads.

Ans. Let 'A' be the event of getting a head. Given that: p = P(A) = 1/2, q = 1 - p = 1/2 and n = 6.
(ii) The probability that the tosses will result in exactly two heads is given by:

P(X = 2) = 6C2 x (1/2)^2 x (1/2)^4 = 15 x (1/64) = 15/64

Therefore, the probability that the tosses will result in exactly two heads is 15/64.
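Both parts of the coin-tossing question follow from the binomial distribution with n = 6 and p = 1/2. The sketch below evaluates them with exact fractions; the helper name `binomial_pmf` is an illustrative choice, not something from the original answer.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n = 6                 # number of tosses
p = Fraction(1, 2)    # probability of a head with an unbiased coin

# (i) At least four heads: add up P(X = 4), P(X = 5) and P(X = 6).
p_at_least_four = sum(binomial_pmf(k, n, p) for k in range(4, n + 1))

# (ii) Exactly two heads.
p_exactly_two = binomial_pmf(2, n, p)

print("P(at least four heads) =", p_at_least_four)
print("P(exactly two heads)   =", p_exactly_two)
```

Part (i) sums 15/64 + 6/64 + 1/64 = 22/64 = 11/32, and part (ii) reproduces the 15/64 stated in the answer.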
Master of Business Administration - MBA Semester 1
MB0040 – Statistics for Management - 4 Credits
(Book ID: B1129)
Assignment Set - 2 (60 Marks)

Q1. Find the (i) arithmetic mean (ii) range and (iii) median of the following data: 15, 17, 22, 21, 19, 26, 20.

Ans. (i) Arithmetic mean = (15 + 17 + 22 + 21 + 19 + 26 + 20) / 7 = 140 / 7 = 20

(ii) Range = highest number - lowest number = 26 - 15 = 11

(iii) Arranging in ascending order: 15, 17, 19, 20, 21, 22, 26. The middle (fourth) value is 20, so the median is 20.

Q2. Find Karl Pearson's correlation co-efficient for the data given in the below table:

X   18   16   12   8    4
Y   22   14   12   10   8

Ans.

X          Y          X²           Y²           XY
18         22         324          484          396
16         14         256          196          224
12         12         144          144          144
8          10         64           100          80
4          8          16           64           32
ΣX = 58    ΣY = 66    ΣX² = 804    ΣY² = 988    ΣXY = 876
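The final value of r is not shown in this copy of the answer, so here is a hedged sketch that rebuilds the column totals from the X and Y values and applies the usual product-moment formula r = (nΣXY - ΣXΣY) / sqrt((nΣX² - (ΣX)²)(nΣY² - (ΣY)²)). Computing the products directly gives 18 x 22 = 396 and hence ΣXY = 876. Variable names are illustrative.

```python
from math import sqrt

# Paired observations from the Q2 table.
x = [18, 16, 12, 8, 4]
y = [22, 14, 12, 10, 8]

n = len(x)
sum_x = sum(x)
sum_y = sum(y)
sum_x2 = sum(v * v for v in x)              # sum of X squared
sum_y2 = sum(v * v for v in y)              # sum of Y squared
sum_xy = sum(a * b for a, b in zip(x, y))   # sum of the products XY

# Karl Pearson's coefficient of correlation.
numerator = n * sum_xy - sum_x * sum_y
denominator = sqrt((n * sum_x2 - sum_x**2) * (n * sum_y2 - sum_y**2))
r = numerator / denominator

print("Sums:", sum_x, sum_y, sum_x2, sum_y2, sum_xy)
print("r =", round(r, 4))
```

With these totals the numerator is 552 and the denominator is about 618.95, so r is roughly 0.89, a strong positive correlation between X and Y.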
Q3. What is the importance of classification of data? What are the types of classification of data?

Ans. Data classification and identification is all about tagging your data so it can be found quickly and efficiently. But organisations can also gain from de-duplicating their information, which helps to cut storage and backup costs, whilst speeding up data searches. Thirdly, classification can help an organisation to meet legal and regulatory requirements for retrieving specific information within a set timeframe, and this is often the motivation behind implementing data classification technology.

However, data strategies differ greatly from one organisation to the next, as each generates different types and volumes of data. The balance may vary greatly from one user to the next between office documents, e-mail correspondence, images, video files, customer and product information, financial data, and so on.

It may seem a good idea to classify and tag everything in the databases, but experts warn against it. Andy Whitton, partner in Deloitte's data practice, says, "Full data classification can be a very expensive activity that very few organisations do well. Certified database technologies can tag every data item; however, in our experience only governments do this because of the cost implications."

Instead, Whitton said, companies need to choose certain types of data to classify, such as account data, personal data, or commercially valuable data. He added that the start point for most companies is to classify data in line with their confidentiality requirements, adding more security for increasingly confidential data.

"If it goes wrong, this could be the most externally damaging and internally sensitive," says Whitton. "For example, everyone is very protective over salary data."

As well as the type and confidentiality of the data, organisations should also consider its integrity, as low-quality data cannot be trusted. Users should also consider its availability, because high data availability requires a resilient storage and networking environment.
Tagging the data in the right way, by using an effective metadata strategy, is essential, said Greg Keller, chief evangelist at software firm Embarcadero. The enterprise is overwhelmed with data, much of which is "redundant, stale and of radically varying quality", he explains.

"A plan must be put in place, by an enterprise or data architecture team, to first source the desired data, documenting the data's structure and general content along with any known business rules and then ultimately communicating this initial set of information to relevant constituencies," said Keller. "In other words, the egg must truly precede the chicken."

Once this platform of initial "metadata" has been established and replicated successfully to other information stores, including relational (structured) and non-relational (semi-structured or non-structured), the organisation can implement a "classification taxonomy" to tag the assets of varying types, in terms of their business relevance, standardising the path to it. "This set of tags can range from its quality, to its encryption/security level, to its volatility, to its classification," says Keller.

Q4. The data given in the below table shows the production in three shifts and the number of defective goods that turned out in three weeks. Test at 5% level of significance whether the weeks and shifts are independent.

Shift    1st Week   2nd Week   3rd Week   Total
I        15         5          20         40
II       20         10         20         50
III      25         15         20         60
Total    60         30         60         150

Ans.

Observed Value (O)   Expected Value (E)     (O – E)²   (O – E)²/E
15                   40 x 60/150 = 16       1          0.0625
20                   50 x 60/150 = 20       0          0.0000
25                   60 x 60/150 = 24       1          0.0417
5                    40 x 30/150 = 8        9          1.1250
10                   50 x 30/150 = 10       0          0.0000
15                   60 x 30/150 = 12       9          0.7500
20                   40 x 60/150 = 16       16         1.0000
20                   50 x 60/150 = 20       0          0.0000
20                   60 x 60/150 = 24       16         0.6667
                                            χ² =       3.6459

The steps followed to calculate χ² are described below.

1. Null hypothesis 'Ho': The week and shifts are independent.
   Alternate hypothesis 'HA': The week and shifts are dependent.
2. Level of significance is 5% and D.O.F = (3 – 1) (3 – 1) = 4.
3. Test statistic: χ² = Σ (O – E)²/E
4. Test: χ²cal = 3.6459
5. Conclusion: Since χ²cal (3.6459) < χ²tab (9.49), 'Ho' is accepted. Hence, the attributes 'week' and 'shifts' are independent.
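The chi-square steps above can be reproduced programmatically. The following is a minimal sketch in plain Python; the critical value 9.488 is the tabulated χ² value for 4 degrees of freedom at the 5% level (the answer rounds it to 9.49), and it is hard-coded here rather than computed. Variable names are illustrative.

```python
# Observed defectives: rows are shifts I, II, III; columns are weeks 1, 2, 3.
observed = [
    [15, 5, 20],
    [20, 10, 20],
    [25, 15, 20],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Chi-square statistic: sum over all cells of (O - E)^2 / E,
# where E = (row total x column total) / grand total.
chi_square = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_square += (o - expected) ** 2 / expected

# Degrees of freedom for an r x c contingency table.
dof = (len(observed) - 1) * (len(observed[0]) - 1)

# Tabulated critical value for 4 d.f. at the 5% level (taken from tables).
critical_value = 9.488

accept_null = chi_square < critical_value

print("chi-square =", round(chi_square, 4))
print("degrees of freedom =", dof)
print("H0 (independence) accepted:", accept_null)
```

Without intermediate rounding the statistic comes out at about 3.6458; the 3.6459 in the table arises from rounding each term to four decimals before adding. Either way it is well below the critical value, so the conclusion is unchanged.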
Q5. What is sampling? Explain briefly the types of sampling.

Ans. Sampling refers to the statistical process of selecting and studying the characteristics of a relatively small number of items from a relatively large population of such items, to draw statistically valid inferences about the characteristics of the entire population.

There are two broad methods of sampling used by researchers: nonrandom (or judgment) sampling and random (or probability) sampling. In random sampling, individual judgement plays no part in the selection of the sample. Each item in the population stands an equal chance of being included in the sample. In judgement sampling, the researcher selects items to be drawn from the population based on his or her judgement about how well these items represent the whole population. The sample is thus based on someone's knowledge about the population and the characteristics of individual items within it. The chance of an item being included in the sample is influenced by the characteristics of the item as judged by the expert selecting it.

A judgement sampling system is simple and less expensive to use. Also, when very little is known about the population under study, a pilot study based on a judgement sample is carried out to permit the design of a more rigorous sampling system for a detailed study. In the case of random sampling, the researcher is required to use specific statistical processes to ensure this equal probability for every item in the population. A random sampling system enables more reliable results of statistical analysis, with measurable margins of error and degrees of confidence.

The sampling techniques may be broadly classified into
1. Probability sampling
2. Non-probability sampling

Probability Sampling: Probability sampling provides a scientific technique of drawing samples from the population. The technique of drawing samples is according to a law under which each unit has a known probability of being included in the sample.

Simple random sampling
Under this technique, sample units are drawn in such a way that each and every unit in the population has an equal and independent chance of being included in the sample. If a sample unit is replaced before drawing the next unit, then it is known as Simple Random Sampling with Replacement. If the sample unit is not replaced before drawing the next unit, then it is known as Simple Random Sampling without Replacement. In the first case, the probability of drawing a unit is 1/N, where N is the population size. In the second case, the probability of drawing a unit is 1/Nn.

Stratified random sampling
This sampling design is most appropriate if the population is heterogeneous with respect to the characteristic under study or the population distribution is highly skewed.

Table: Merits and demerits of stratified random sampling
Merits
1. Sample is more representative
2. Provides more efficient estimate
3. Administratively more convenient
4. Can be applied in situations where different degrees of accuracy are desired for different segments of the population
Demerits
1. Many times the stratification is not effective
2. Appropriate sample sizes are not drawn from each of the strata

Systematic sampling
This design is recommended if we have a complete list of sampling units arranged in some systematic order, such as geographical, chronological or alphabetical order.

Table: Merits and demerits of systematic sampling
Merits
1. Very easy to operate and easy to check.
2. It saves time and labour.
3. More efficient than simple random sampling if we have an up-to-date frame.
Demerits
1. In many cases we do not get an up-to-date list.
2. It gives biased results if a periodic feature exists in the data.

Cluster sampling
The total population is divided into recognizable sub-divisions, known as clusters, such that within each cluster the units are homogeneous. The units are selected from each cluster by suitable sampling techniques.

Multi-stage sampling
The total population is divided into several stages. The sampling process is carried out through several stages.

Figure: Multistage sampling

Non-probability sampling: Depending upon the object of inquiry and other considerations, a predetermined number of sampling units is selected purposely so that they represent the true characteristics of the population.

Judgment sampling
The choice of sampling items depends exclusively on the judgment of the investigator. The investigator's experience and knowledge about the population will help to select the sample units. It is the most suitable method if the population size is small.
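To make the probability sampling designs described in this answer concrete, here is a small illustrative sketch of simple random sampling (with and without replacement) and systematic sampling. The population of 100 numbered units, the sample size of 10 and the seed are hypothetical values chosen only for the example.

```python
import random

# Hypothetical sampling frame: N = 100 units numbered 1 to 100.
population = list(range(1, 101))
sample_size = 10

# Fixed seed so the illustration is repeatable; any seed would do.
rng = random.Random(42)

# Simple random sampling without replacement: no unit can appear twice.
srs_without_replacement = rng.sample(population, sample_size)

# Simple random sampling with replacement: each draw is made from the
# full population, so the same unit may be drawn more than once.
srs_with_replacement = rng.choices(population, k=sample_size)

# Systematic sampling: pick a random start within the first interval,
# then take every k-th unit from the ordered list.
interval = len(population) // sample_size
start = rng.randrange(interval)
systematic_sample = population[start::interval]

print("Without replacement:", srs_without_replacement)
print("With replacement:   ", srs_with_replacement)
print("Systematic:         ", systematic_sample)
```

The systematic sample depends entirely on the order of the list, which is why a periodic pattern in the frame can bias it, as noted in the demerits above.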
Q6. Suppose two houses in a thousand catch fire in a year and there are 2000 houses in a village. What is the probability that: (i) none of the houses catch fire and (ii) at least one house catches fire?

Ans. Given the probability of a house catching fire is: p = 2/1000 = 0.002 and n = 2000. Therefore, m = np = 2000 x 0.002 = 4.
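The remainder of this answer is cut off in this copy. Since it sets m = np = 4, it appears to be heading toward the Poisson approximation to the binomial, where P(X = 0) = e^(-m); that reading is an assumption. A hedged sketch of the calculation under that assumption:

```python
from math import exp

p = 2 / 1000      # probability that a given house catches fire in a year
n = 2000          # number of houses in the village
m = n * p         # Poisson mean, m = np

# (i) Probability that none of the houses catch fire: P(X = 0) = e^(-m).
p_none = exp(-m)

# (ii) Probability that at least one house catches fire: 1 - P(X = 0).
p_at_least_one = 1 - p_none

print("m =", m)
print("P(none) =", round(p_none, 4))
print("P(at least one) =", round(p_at_least_one, 4))
```

Under the Poisson assumption, P(none) is about 0.0183 and P(at least one) is about 0.9817. The exact binomial value (0.998)^2000 is very close, which is why the approximation is commonly used when n is large and p is small.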