Types of Data:
Definition of Data: Data are the facts presented to the researcher from the study's environment. They are characterized by their abstractness, verifiability, elusiveness, and closeness to the phenomenon.
1) Qualitative data: a categorical measurement expressed not in terms of numbers, but rather by means of a natural language description. In statistics, the term is often used interchangeably with "categorical" data. For example: favorite color = blue, height = tall. Although we may have categories, the categories may have a structure to them. When there is no natural ordering of the categories, we call them nominal categories. Examples might be gender, race, religion, or sport. When the categories can be ordered, they are called ordinal variables. Categorical variables that judge size (small, medium, large, etc.) are ordinal variables. Attitudes (strongly disagree, disagree, neutral, agree, strongly agree) are also ordinal variables; however, we may not know which value is the best or worst on a given issue. Note that the distance between these categories is not something we can measure.
2) Quantitative data: a numerical measurement expressed not by means of a natural language description, but rather in terms of numbers. However, not all numbers are continuous and measurable. For example, an Aadhaar number is a number, but not something that one can meaningfully add or subtract. Quantitative data are always associated with a scale of measurement, including the ratio scale.

Primary vs Secondary data: Primary data are sought for their proximity to the truth and control over error. These cautions remind us to use care in designing data collection procedures and generalizing from results. Secondary data have had at least one level of interpretation inserted between the event and its recording.
Information sources are categorized into three levels: (1) primary sources, (2) secondary sources, and (3) tertiary sources. Primary sources are original works of research or raw data without interpretation or pronouncements that represent an official opinion or position.
Included among the primary sources are memos; letters; complete interviews or speeches (in audio, video, or written transcript formats); laws; regulations; court decisions or standards; and most government data, including census, economic, and labor data. Primary sources are always the most authoritative because the information has not been altered or interpreted by a second party. Other internal sources of primary data are inventory records, personnel records, purchasing requisition forms, statistical process control charts, and similar data.
Secondary sources are interpretations of primary data. Encyclopedias, textbooks, handbooks, magazine and newspaper articles, and most newscasts are considered secondary information sources. Indeed, nearly all reference materials fall into this category. Internally, sales analysis summaries and investor annual reports would be examples of secondary sources, because they are compiled from a variety of primary sources. To an outsider, however, the annual report is viewed as a primary source, because it represents the official position of the corporation.

Methods of primary data collection:
1) Monitoring (conditions, behaviors, events, processes): includes studies in which the researcher inspects the activities of a subject or the nature of some material without attempting to elicit responses from anyone, e.g., traffic counts at an intersection.
2) Communication (attitudes, motivations, intentions, expectations): the researcher questions the subjects and collects their responses by personal or impersonal means. The collected data may result from (i) interview or telephone conversations, (ii) self-administered or self-reported instruments sent through the mail, left in convenient locations, or transmitted electronically or by other means, or (iii) instruments presented before and/or after a treatment or stimulus condition in an experiment.

Data Collection Design: Steps: 1) Select relevant variables; 2) Specify levels of treatment; 3) Control the experimental environment; 4) Choose the experimental design: Screening design, Response surface design, Choice design, Life test design, Nonlinear design, Space filling design, Full factorial design, Taguchi design, Mixture design, Evaluate design, and Augment design.

Instrument Design: Steps: 1) Identify screening inquiry; 2) Prepare participation appeal; 3) Identify source of error; 4) Prepare error reduction plan; 5) Prepare instrument.

Survey vs Observation:
Survey: Very versatile in the types of data it can collect. This method gives respondents the opportunity to seek clarifications. Responses to the questions can be sought through personal interviews, ordinary mail, or electronic communication. Time and cooperation are required from the respondent.
Observation: Data collection is constrained to what can be observed or heard, so attitudes and feelings cannot be surveyed. The observation can be done mechanically (video tapes) or by a human observer. This method is best for studying infants/children who cannot speak. In this technique no extra effort is needed from the respondent, and the data are not affected by the presence of an interviewer.
Types of Observations: 1) Natural vs Contrived observation; 2) Disguised vs Non-disguised; 3) Human vs Mechanical; 4) Web-based observation.

Experiments: Read Unit II cheatsheet.

Construction of questionnaire and instrument: Question construction involves three critical decision areas: (a) question content, (b) question wording, and (c) response strategy. Question content should pass the following tests: Should the question be asked? Is it of proper scope? Can and will the participant answer adequately? Question wording difficulties exceed most other sources of distortion in surveys. Each response strategy generates a specific level of data, with available statistical procedures for each scale type influencing the desired response strategy. Participant factors include level of information about the topic, degree to which the topic has been thought through, ease of communication, and motivation to share information.
Instruments obtain three general classes of information. Target questions address the investigative questions and are the most important. Classification questions concern participant characteristics and allow participants' answers to be grouped for analysis. Administrative questions identify the participant, interviewer, and interview location and conditions.

Validation of questionnaire: Retention of a question should be confirmed by answering these questions: Is the question stated in terms of a shared vocabulary? Does the vocabulary have a single meaning? Does the question contain misleading assumptions? Is the wording biased? Is it correctly personalized? Are adequate alternatives presented?

Definitions:
Idea of Sampling: by selecting some of the elements in a population, we may draw conclusions about the entire population. A population element is the individual participant or object on which the measurement is taken. It is the unit of study. A population is the total collection of elements about which we wish to make some inferences.
A census is a count of all the elements in a population. The listing of all population elements from which the sample will be drawn is called the sampling frame.

Sample Types:
1) Nonprobability sampling is arbitrary and subjective; when we choose subjectively, we usually do so with a pattern or scheme in mind (e.g., only talking with young people or only talking with women). Each member of the population does not have a known chance of being included.
2) Probability sampling is based on the concept of random selection, a controlled procedure that assures that each population element is given a known nonzero chance of selection. This procedure is never haphazard. Only probability samples provide estimates of precision.

Sample plan: Sampling Design Steps: 1. What is the target population? 2. What are the parameters of interest? 3. What is the sampling frame? 4. What is the appropriate sampling method? 5. What size sample is needed?

Sample size: The sample size is an important feature of any empirical study in which the goal is to make inferences about a population from a sample. Determinants of optimal sample size: 1) Type of analysis to be employed; 2) The level of precision needed; 3) Population homogeneity/heterogeneity; 4) Available resources; 5) Sampling technique used.

Sampling techniques: Types:
Unrestricted: 1) Simple Random (Probability); 2) Convenience (Nonprobability).
Restricted: 1) Complex Random (Probability): Systematic, Cluster, Stratified, Double; 2) Purposive (Nonprobability): Judgement, Quota, Snowball. Exhibit 14-8.

Probability vs nonprobability sampling methods:
Probability Sampling: You have a complete sampling frame. You have contact information for the entire population. You can select a random sample from your population. Since all persons (or units) have an equal chance of being selected for your survey, you can randomly select participants without missing entire portions of your audience. You can generalize your results from a random sample.
With this data collection method and a decent response rate, you can extrapolate your results to the entire population. Can be more expensive and time-consuming than convenience or purposive sampling.
Nonprobability Sampling: Used when there isn't an exhaustive population list available. Some units cannot be selected; therefore, you have no way of knowing the size and effect of sampling error (missed persons, unequal representation, etc.). Not random. Can be effective when trying to generate ideas and get feedback, but you cannot generalize your results to an entire population with a high level of confidence. Quota samples (males and females, etc.) are an example. More convenient and less costly, but doesn't hold up to the expectations of probability theory.

Stratified Sampling:
1. We divide the population into a few subgroups: each subgroup has many elements in it; subgroups are selected according to some criterion that is related to the variables under study.
2. We try to secure homogeneity within subgroups.
3. We try to secure heterogeneity between subgroups.
4. We randomly choose elements from within each subgroup.
Cluster Sampling:
1. We divide the population into many subgroups: each subgroup has few elements in it; subgroups are selected according to some criterion of ease or availability in data collection.
2. We try to secure heterogeneity within subgroups.
3. We try to secure homogeneity between subgroups.
4. We randomly choose several subgroups that we then typically study in depth.
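
Sketch (not from the source notes): the contrast between the two procedures can be illustrated in a few lines of Python. Stratified sampling draws random elements from within every subgroup, while cluster sampling draws whole subgroups at random and keeps all of their elements. The function names, the student records, and the "year" and "room" fields below are hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict

def stratified_sample(population, stratum_of, per_stratum, seed=0):
    """Randomly pick `per_stratum` elements from within EVERY stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for element in population:
        strata[stratum_of(element)].append(element)
    sample = []
    for key in sorted(strata):
        members = strata[key]
        sample.extend(rng.sample(members, min(per_stratum, len(members))))
    return sample

def cluster_sample(population, cluster_of, n_clusters, seed=0):
    """Randomly pick `n_clusters` WHOLE clusters and keep all their elements."""
    rng = random.Random(seed)
    clusters = defaultdict(list)
    for element in population:
        clusters[cluster_of(element)].append(element)
    chosen = rng.sample(sorted(clusters), min(n_clusters, len(clusters)))
    return [element for key in chosen for element in clusters[key]]

# Hypothetical population: 12 students, each tagged with a year
# (used as the stratum) and a classroom (used as the cluster).
students = [
    {"id": i, "year": "first" if i < 6 else "second", "room": f"R{i % 4}"}
    for i in range(12)
]

strat = stratified_sample(students, lambda s: s["year"], per_stratum=2)
clus = cluster_sample(students, lambda s: s["room"], n_clusters=2)

print([s["id"] for s in strat])   # two ids from each year
print([s["id"] for s in clus])    # every id from two classrooms
```

In the stratified draw every year is represented by design; in the cluster draw only two classrooms are visited, which is cheaper to collect but relies on each classroom resembling the population as a whole.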