
Data collection and data processing
Prof. Dr. med. Frank P. Schelp
Professor Emeritus
Charité-University Medicine Berlin Germany
Consultant Faculty of Public Health
Khon Kaen University, Thailand
Preliminary remarks:

• Collecting variables from field studies
• Data file management
• For studies in the field of sociology refer to
lectures about assessing behaviour,
knowledge etc.
Often used variables and attributes assessed by
questionnaires I
Age
Vulnerable groups
Newborn
Weaning
Toddler
Pre-school children
School children
Adolescent females
Young adults
Elderly
Gender
Marital status
Common-law wives versus married wives
Polygamy
Relationships in societies with high migration (e.g. Philippines)
Family structure
Nuclear
Extended
Often used variables and attributes assessed by
questionnaires II
Occupation
Socio-economic status
Place of residence
Rural
Urban
Semi urban
Land holding
Quality of land
Type of housing
Materials used for construction of house
Cash income
Possession score
Education
Often used variables derived from measurements

Anthropometric measurements
Weight, height, length
Skin fold measurements
Biological material
Haemoglobin, PCV
Vitamins
Other variables of clinical laboratory
investigations such as blood glucose levels,
creatinine etc.

Faeces
Intestinal parasites
Conducting field work

Supervision of staff
Labelling
Taking measurements
Taking biological material
Filling in questionnaires
Ensuring continued co-operation of population
Minimise blood taking
Explain objective of study
Inform about outcome
Treat population
Provide service during field work
Common statistical software
(for semi-professionals):
• EPI-INFO: Good for data entry, data transfer to other
software, statistics for proportions
• SPSS: Especially good for multivariate statistics, data
entry and data transfer, uncritical if faulty data
spread sheet included, good for spread sheet
management
• MINITAB: Good for descriptive and analytical
statistics, especially good for non-parametric
statistics
• STATA: Very much liked by statisticians
Dos and don'ts in data file management:

• Try to enter all variables of a particular project into
only one spread sheet - merging of spread sheets
is troublesome.
• Make sure that statistics can be done vertically and
horizontally – enter all variables for one individual in
one horizontal line
• Store information as numeric variables and
avoid alphanumeric variables
• Be extremely careful in composing the
worksheet (the information about your spread
sheet), and keep it up to date during data processing
• Make a clear-cut distinction between "no
answer", "missing" and a "don't know"
answer
• Don't mix up a "missing value" with a true
value
• Check your spread sheet for plausibility
(often the fact that a questionnaire was
grossly faulty is only recognised by
checking the plausibility of the spread
sheet)
• Beware ("cave") of translation from a local language
into English
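
The points above about numeric coding, separate codes for "no answer", "don't know" and "missing", and plausibility checking can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the codes -7, -8 and -9, the variable names, the plausibility limits and the three example rows are all hypothetical and do not come from any real study.

```python
# Hypothetical special codes (assumption): pick values that can never
# occur as a true measurement in your own study.
NO_ANSWER = -7
DONT_KNOW = -8
MISSING = -9
SPECIAL_CODES = {NO_ANSWER, DONT_KNOW, MISSING}

# Hypothetical plausibility limits per variable: (lowest, highest) accepted value.
PLAUSIBLE_RANGE = {
    "age_years": (0, 110),
    "sex": (1, 2),  # numeric coding instead of text: 1 = male, 2 = female
    "weight_kg": (1, 200),
    "haemoglobin_g_dl": (3, 22),
}


def implausible_values(rows):
    """Return (id, variable, value) for every true value outside its range."""
    problems = []
    for row in rows:
        for variable, (low, high) in PLAUSIBLE_RANGE.items():
            value = row[variable]
            if value in SPECIAL_CODES:
                continue  # special codes are not measurements, so skip them
            if not low <= value <= high:
                problems.append((row["id"], variable, value))
    return problems


def mean_of_true_values(rows, variable):
    """Mean of a variable, leaving out the special codes."""
    values = [row[variable] for row in rows if row[variable] not in SPECIAL_CODES]
    return sum(values) / len(values) if values else None


# One horizontal line (here: one dict) per individual, all variables together.
rows = [
    {"id": 1, "age_years": 34, "sex": 2, "weight_kg": 52.5, "haemoglobin_g_dl": 11.8},
    {"id": 2, "age_years": 340, "sex": 1, "weight_kg": 61.0, "haemoglobin_g_dl": MISSING},
    {"id": 3, "age_years": DONT_KNOW, "sex": 2, "weight_kg": 48.0, "haemoglobin_g_dl": 12.6},
]

print(implausible_values(rows))
print(mean_of_true_values(rows, "haemoglobin_g_dl"))
```

In this made-up data the age of 340 for individual 2 is reported as implausible, most likely a typing error for 34. The haemoglobin mean is calculated from individuals 1 and 3 only, because the code for "missing" is excluded instead of being treated as a true value.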
