BIOSTAT WEEK 1

Statistics
- Science dealing with the collection, organization, analysis, and interpretation of numerical data
- Art of summarizing data so that non-statisticians can understand it
- Tool in decision making: formulation of good judgment
- Method or data: any information concerning a population or sample
Examples of data sources: interview, questionnaire (survey), observation, documents or records, focus group, oral history, case study

Uses of Statistics
- Data reduction technique: transformation of numerical or alphabetical digital information that is derived empirically or experimentally into a corrected, ordered, and simplified form
- Tool for analyzing research projects and clinical trials
- Objective appraisal and evaluation of programs
- Aid in the decision-making process and in policy making

Biostatistics
- Bio - life; Statistics - data
- A special branch of statistics that deals with quantitative and qualitative aspects of vital phenomena
- Term used when the focus is on the biological and health sciences rather than on statistics alone

Health Statistics - data required in the planning, administration, and evaluation of health programs

Uses of Biostatistics
Epidemiology - distribution and determinants of health-related states and events
- A method used to find the causes of health outcomes and diseases in populations
Demography - study of human populations
- Ex: age, gender, education, nationality, ethnicity, and religion
Health Economics - functioning of the health care system and health-affecting behaviors
- Branch of economics concerned with issues related to efficiency, effectiveness, values, and behavior in the production and consumption of health and health care
Genetics and Genomics - heredity; genes and their function
- Play a role in the health sciences
- Genetics - study of genes and of the way certain traits and conditions are passed down from one generation to the next
- Genomics - study of all of a person's genes (the genome)

Branches of Statistics
Descriptive Statistics
- Methods of summarizing and presenting data
- Computation of measures of central tendency and variability
- Tabulation and graphical presentation
- Facilitates understanding, analysis, and interpretation of data
- Devoted to the summarization and description of data; consists of methods for organizing and summarizing information
- Calculation of various descriptive measures: means, measures of variation, and percentiles
- Describes the most important characteristics of a given set of data
- Contains exact numbers
Inferential Statistics
- Methods of arriving at conclusions and generalizations about a target population based on information from a sample
- Estimation of parameters and hypothesis testing
- Consists of methods for drawing conclusions about a population from information obtained from a sample, and for measuring the variability and reliability of those conclusions
- Can make predictions
- Branch of statistics concerned with using sample data to make an inference about a population

Terms in Biostatistics
Population - all members of a specified group
Sample - subset of a population
Parameter - measure of a characteristic of a population
Constant - value of a characteristic that remains the same from person to person, from time to time, or from place to place
Variable - characteristic that takes on different values

Types of Data
According to Source
Primary Data - collected directly by the researcher
- via interview, experiment, questionnaires
Secondary Data - comes from another source
- book, journal, newspaper, thesis, and dissertation
According to Functional Relationship
Independent - refers to any controlling data
- anything that can be manipulated (treatment)
Dependent - any data that is affected by the controlling data

Categories of Data
Types of Variables
Qualitative - descriptions or labels that distinguish one group from another
- Uses categories or attributes that distinguish non-numeric characteristics
- Ex: gender, marital status, eye color, address, ethnicity, religion, etc.
Quantitative - can be measured and ordered according to quantity or amount and expressed numerically
- Consists of numbers that represent counts or measurements
- Ex: height, weight, temperature, no. of years, etc.
Types:
Discrete - can assume a finite or countable number of values
- Ex: no. of children, no. of students
Continuous - can assume infinitely many possible values, each corresponding to a point on a line interval
- Ex: weight, temperature, and other decimal-valued quantities that can in principle be measured to arbitrary precision

Scale of Measurement of Variables
Nominal - simply used as names or identifiers of a category
- Always qualitative
- Does not represent any amount or quantity
- Ex: names, labels, categories
Ordinal - represents an ordered series of relationships; may be qualitative or quantitative
- Ex: sequence, level, order, ranks, scaling
Interval - does not have a true-zero starting point
- Always quantitative
- Ex: Temp (F), standardized score
Ratio - interval level that includes a true zero as a starting point
- Always quantitative

Each level adds one property to the levels before it:
Nominal:    Named variables
Ordinal:    + Ordered variables
Interval:   + Proportionate interval between variables
Ratio:      + Can accommodate absolute zero

Scale      True Zero   Equal Interval   Order   Category   Example
Nominal    x           x                x       /          Marital status, sex, gender, ethnicity
Ordinal    x           x                /       /          Student grade, NFL rankings
Interval   x           /                /       /          Temp (F), SAT scores, IQ, year
Ratio      /           /                /       /          Age, height, weight
(/ = the scale has the property; x = it does not)

Data Processing
Systematic procedure to ensure that the information/data gathered are complete, consistent, and suitable for analysis.

Data Analysis Steps:
1. Identify the problem
2. Collect data accurately
3. Presentation of data
4. Analysis of data
5. Interpretation of data

[Figure: Data Processing Flowchart]

Data Coding
- Conversion of verbal/written information into numbers, which can be more easily encoded, counted, and tabulated
- Assigning numerals or other symbols to answers so that responses can be put into a limited number of categories
Codes - the rules for interpreting, classifying, and recording data in the coding process

Types of Codes
Field Code - actual value or information given by the respondent
Bracket Code - recorded as a range of values rather than the actual value
Factual Code - codes are assigned to a list of categories of a given variable
Pattern Code - applicable to questions with multiple responses

Rules in Code Construction
- Number of rules must be kept to a minimum (<8)
- Codes should be exhaustive and mutually exclusive
- Adopt a coding convention for questions with similar answers
- Category codes should be few; around 10% or less of the responses should fall into the "other" category
- Codes should be assigned for critical issues even if no one has mentioned them
- Data should be coded to retain as much detail as possible

Coding Problems
- No response
- Not applicable questions

Coding Manual
A document that contains a record of all codes assigned to the responses to all questions in the data collection forms.
Minimum information that must be included in a coding manual:
Variable name - must be a single string consisting of letters and, when useful, numbers and underscores
- Spaces are not allowed
- Enter it at the top of each column
- Long enough to be meaningful, short enough to be easy to read
Variable description/label - a textual description of the variable, or a reference to the number of the questionnaire item it arises from
- Include a descriptive variable label for each variable in the file
- Important for statisticians, to understand the contents of each data item, and for researchers, as it facilitates understanding the output of the statistical analysis
Coding instructions
Code book - contains coding instructions and the necessary information about the variables in a data set. It generally contains column no., record no., variable no., variable name, question no., and instructions for coding.
Note: In the coding manual, additional information can be added in subsequent columns.

Data Encoding - entering the data/responses in a spreadsheet
- MS Excel
- MS Access
- Epi Info

Data Editing
- Inspection and correction of any errors or inconsistencies in the information collected
- Done during data collection, during encoding, and before data analysis
- Process of examining the collected raw data to detect errors and omissions and to correct them as soon as possible

Types of Editing
I. Field Editing
- Reviewing the accomplished data collection forms
- Decoding of abbreviations or special symbols
- Making callbacks/messages for verification or clarification of incomplete answers
- Raw files
II. Central Editing
- Checking of inconsistencies and incorrect entries after receiving the questionnaire from the field
- Checking of encoded data
- Computerized, consolidated, summarized

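To make the Data Coding step covered earlier in these notes more concrete, here is a minimal Python sketch of a factual code and a bracket code. Everything in it is an assumption made for illustration, not part of the notes: the variable names, the numeric code values, the age brackets, and the special codes 88 (not applicable) and 99 (no response).

```python
# Hypothetical coding manual: variable name -> label and factual codes.
# All names and code values are illustrative assumptions.
CODING_MANUAL = {
    "sex": {
        "label": "Sex of respondent (Q1)",
        "codes": {"male": 1, "female": 2},
    },
    "marital_status": {
        "label": "Marital status (Q2)",
        "codes": {"single": 1, "married": 2, "widowed": 3, "other": 4},
    },
}

NOT_APPLICABLE = 88  # assumed code for a question that does not apply
NO_RESPONSE = 99     # assumed code for a blank answer

# Assumed age brackets as (lowest age, highest age, code).
# They are exhaustive and mutually exclusive for whole-number ages 0 to 120.
AGE_BRACKETS = [(0, 17, 1), (18, 39, 2), (40, 59, 3), (60, 120, 4)]


def factual_code(variable, answer):
    """Convert a written answer into its numeric code (factual code)."""
    if answer is None or answer.strip() == "":
        return NO_RESPONSE
    if answer.strip().upper() == "N/A":
        return NOT_APPLICABLE
    codes = CODING_MANUAL[variable]["codes"]
    # A KeyError here flags an answer that the manual does not cover.
    return codes[answer.strip().lower()]


def bracket_code(age):
    """Record an age as the code of the range it falls in (bracket code)."""
    if age is None:
        return NO_RESPONSE
    for low, high, code in AGE_BRACKETS:
        if low <= age <= high:
            return code
    raise ValueError(f"age {age} is outside every bracket")


coded = [
    factual_code("sex", "Female"),
    factual_code("marital_status", ""),
    bracket_code(45),
]
print(coded)  # [2, 99, 3]
```

Giving "no response" and "not applicable" their own codes keeps the two coding problems distinguishable later, during editing and analysis.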
Importance of Data Editing
- Make corrections as early as possible
- Reduce non-response or incomplete answers
- Eliminate inconsistencies and incorrect information
- Make the entries clear, legible, and comprehensive
- Prepare data for analysis

What to check when editing data?
- Check for duplicate entries
- Check that the total count for each variable equals the sample size
- For qualitative data, check that the categories are consistent with those specified in the coding manual
- For quantitative data, check that the minimum and maximum are logical given the possible values of the variable
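
The four checks above can be sketched in Python as follows. This is only an illustration under stated assumptions: the records, the variable names (`id`, `sex`, `age`), the valid codes, the logical age range, and the sample size are all hypothetical and would come from the actual coding manual and study design.

```python
from collections import Counter

# Hypothetical encoded records; every name, code, and limit is an assumption.
records = [
    {"id": 1, "sex": 1, "age": 34},
    {"id": 2, "sex": 2, "age": 29},
    {"id": 2, "sex": 2, "age": 29},   # duplicate entry
    {"id": 3, "sex": 3, "age": 41},   # code 3 is not in the coding manual
    {"id": 4, "sex": 1, "age": 340},  # age outside the logical range
]
SAMPLE_SIZE = 4                     # assumed number of respondents
VALID_CODES = {"sex": {1, 2}}       # assumed categories from the coding manual
LOGICAL_RANGE = {"age": (0, 120)}   # assumed minimum and maximum


def edit_checks(rows, sample_size):
    """Return a list of problems found by the four editing checks."""
    issues = []

    # Check 1: duplicate entries (same respondent id entered more than once).
    counts = Counter(row["id"] for row in rows)
    for rid, n in counts.items():
        if n > 1:
            issues.append(f"duplicate id {rid}")

    # Check 2: the total count for each variable should equal the sample size.
    for var in list(VALID_CODES) + list(LOGICAL_RANGE):
        total = sum(1 for row in rows if row.get(var) is not None)
        if total != sample_size:
            issues.append(f"{var}: {total} values, sample size is {sample_size}")

    # Check 3: qualitative categories must match the coding manual.
    for var, valid in VALID_CODES.items():
        unexpected = sorted({row[var] for row in rows} - valid)
        if unexpected:
            issues.append(f"{var}: codes not in manual {unexpected}")

    # Check 4: quantitative minimum and maximum must be logical.
    for var, (low, high) in LOGICAL_RANGE.items():
        values = [row[var] for row in rows]
        if min(values) < low or max(values) > high:
            issues.append(
                f"{var}: min {min(values)}, max {max(values)} outside {low}-{high}"
            )

    return issues


issues = edit_checks(records, SAMPLE_SIZE)
for issue in issues:
    print(issue)
```

Each flagged item still has to be traced back to the original data collection form before it is corrected; the script only points to where to look.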