ORIGINAL PAPER

Data quality management: an example in caregiver research

Journal of Research in Nursing
©2007 SAGE PUBLICATIONS
London, Thousand Oaks, New Delhi
VOL 12(4) 333–345
DOI: 10.1177/1744987107072978

Marie-Luise Friedemann
Professor
College of Nursing and Health Sciences,
Florida International University, Florida

Carlos Mayorga
Clinical Director
Frederick L. Newman
Professor
Robert Stempel College of Public Health,
Florida International University, Florida

Abstract The purpose of this paper is to assist nurse researchers around the world
in addressing data quality issues by guiding them through an example of a complex
multicultural study of caregivers who take care of elders in their home. Using data
flow diagrams for visualization, methods of data quality management are outlined in
the stages of subject recruitment, data gathering, and data monitoring. Various strate-
gies are used to improve the quality of such data as the sample, information on
consent forms, raw interview data, and the electronic data file. These strategies are
aimed at avoiding various sources of human error, error in sampling, non-response
and coverage error, as well as error of measurement and processing. Ultimately, this
exemplar demonstrates an approach to realistic decision making and may offer nurse
researchers incentives for discussion of their own data quality plans. Since immigrant
population groups are expanding in many countries across the world, nurse
researchers who seek to include such ethnic groups in their studies may find the
multiethnic considerations included in the discussion of additional value.

Data quality management: an example in caregiver research


Industrial societies have witnessed what has been called a quality revolution since the
mid twentieth century (Biemer et al, 2003). Quality control in health care, standard-
ized documentation and information management procedures have been discussed
widely, as they apply to various providers (e.g. Munday, 2003; Tucker, 2003; Monsen
et al, 2004). The attention to research data quality has a history of multiple origins that
range from legislative mandates to the responsibility to observe professional standards
(Divorski et al, 2001). In the United States, computerized interactive interviewing
systems, such as the Computer Aided Technology, Inc. (CATI) system, developed in
1992, ensure accuracy in recording survey responses by allowing the interviewer to
progress to the next item only if the response falls within the allowable choices. Such
systems, however, address only one source of errors, and are expensive and beyond the
reach of most researchers in nursing. For the average nurse researcher the development
of a comprehensive data quality plan is a formidable challenge, especially in complex
studies that include diverse participants.
This paper tells the story of such a plan. Rather than outlining detailed instructions
about data quality procedures, the paper exposes the readers to thought processes that
went into plan development for a quantitative survey project and the challenges expe-
rienced in executing the plan. The project employs structured in-home interviews of
family members caring for elderly loved ones at home, and explores ways in which
family members of various ethnic groups who care for elders in their home in
Southern Florida organize the care and utilize programs offered by the community.
The readers learn about methods of data quality control put into effect in this study,
beginning with the generation of data and ending at the point when the data can be
analyzed. The outlined strategies are by no means perfect solutions and the example
shows that many had to be modified in response to emerging problems. The ultimate
aim of this exemplar is to demonstrate an approach to realistic decision making; it
may also offer nurse researchers suggestions for their own discussions of data quality plans.
Since immigrant population groups are expanding in many countries across the world,
nurse researchers who seek to include such ethnic groups in their studies may find the
multiethnic considerations of additional value.

Data quality and its management


Much has been debated about quality and the investigation process (Seale, 1999), but
the fact that the quality of data/information ensures the legitimacy of any research
project remains unquestioned. As Schutz (1997) explains, the conclusions and sugges-
tions presented as outcome of an investigation are legitimate only if the process of
analysis was carried out with true information, meaning that the information authen-
tically represents the characteristics of the phenomena of interest. The acquisition of
true information, however, is particularly difficult.
Social scientists argue that there is intrinsically more scope for disagreement in
social science than there is in natural science, because compelling evidence for any
particular proposition or theory is much harder to build (Pratt, 2003). When dealing with
social phenomena, variables and data in general go beyond their statistical value. Data
represent not only people and resources through numeric aggregates, but also reflect
the reality that these people face, their needs and their expectations. Thus, optimal data
quality, meaning data that most closely represent the reality of the people involved, is
not only academically desirable but morally and scientifically necessary.
Researchers long placed their emphasis on type and quality of analysis and debated
the logical sequence in the process of ensuring quality in research studies (Ligon,
1996). Recently, the focus has shifted to data quality management. Assuming leadership
in this area, the American Institutes for Research and The National Reporting System for
Adult Education (Condelli et al., 2002) defined the following as the main components
of data quality:

• Reliability: Data are collected in the same way, by different people at different times.
• Validity: Data represent what they are intended to represent.
• Objectivity: The information is accurate, unbiased, clear and well documented.
• Integrity: The information is neither altered nor false.
• Transparency: Methods and data sources, assumptions, outcomes, and related information
are clearly described so that users understand the data.
• Reproducibility: Data can be reproduced by others by using the documented methods,
assumptions, and data sources to achieve comparable findings.
• Utility: The information is useful and available to its beneficiaries.

Other components include timeliness, comparability, coherence, relevance and completeness
(Arondel et al., 1998).
Data quality must be managed throughout the research process. The stages of the
data quality continuum form a complex and integrated course of action, where the
risks of error are as diverse as the roles of people involved in the process. From
gathering, storing, integrating, retrieving, mining, and analyzing to publishing
data/information, each activity entails different tasks aimed at preserving data
excellence in the face of challenges (Dasu et al., nd).
The following discussion addresses the data quality continuum from the planning
stage to data analysis. The topics are: (1) subject recruitment, (2) data gathering, and
(3) data monitoring.

Background of the study


The study entitled “Culture, Family Patterns, and Caregiver Resource Use” is a govern-
ment-funded four-year project. It has been reviewed and approved by the university
ethics committee as well as the boards of the participating agencies. The investigators
are in the process of comparing patterns of caring for disabled elders in families of var-
ious ethnic groups living in the Miami-Dade area. Specifically, the researchers exam-
ine how the families use informal (family help) and formal resources (programs for
caregivers and community services), and to what extent cultural and family factors
predict the use of resources. To enroll subjects who can answer these questions, the
research team relies on referrals from cooperating home care agencies, community
organizations, and minority graduate nursing students who recruit caregivers in their
communities. Then, a multiethnic team interviews family caregivers in their homes.
The validity of data about these patterns of caring is dependent on data quality. To
enhance data quality, the researchers strive to minimize five sources of error described
by the Subcommittee on Measuring and Reporting the Quality of Survey Data of the
Federal Committee on Statistical Methodology (2001):

• Sampling error: Error that occurs by chance due to sample characteristics if the
sample is not representative of the population to be observed.
• Non-response error: “Error of non-observation reflecting an unsuccessful attempt
to obtain the desired information from an eligible unit.” (p. 1–6)
• Coverage error: Error due to omitting certain population units from the sample or
having other units overrepresented in the sample.
• Measurement error: “Difference between the observed value of a variable and the
true, but unobserved, value of that variable.” (p. 1–6) This error is largely due to
instruments that lack validity or reliability.
• Processing error: Error in data processing that occurs after data collection during
any of the stages of the research process.

To illustrate, the data flow is graphically represented by a data flow diagram (DFD), a
tool that visualizes the data quality continuum and shows what the data are, where
they come from and how they pass from one point to the next (Webster’s Dictionary,
nd). The DFD technique was introduced in the 1970s (Infoarch Group, nd) and is now
being used widely for data quality control. The DFDs below summarize the stages to
be discussed: subject recruitment, data gathering, and data monitoring.

Stages of data quality management


Subject recruitment
The data flow diagram in Figure 1 [adapted from Senior Secondary Assessment Board
of South Australia (SSABSA), 1999] outlines the preliminary steps before research data
can be generated. In this stage, the word data means names of potential participants.
Selecting the cases. Sampling the appropriate group of caregivers to allow generalization
and comparisons is perhaps the most difficult task in this multiethnic study. Even
the definition of a population from which to sample is challenging. A major aim of
this study is ethnic comparisons of the use of services. The Miami-Dade area in South
Florida has become a refuge, first for Cubans, now for people from other countries in
Latin America and the Caribbean, as well as countries in Africa and Eastern Europe
devastated through social unrest or natural disasters. A random sample of the 850 people
needed for analysis of the study variables, drawn from the whole Miami-Dade area, would
yield such diversity that the formation of sizable comparison groups would be difficult.
Consequently, stratified sampling (Polit et al., 2003) is the chosen method. We
aim for similar numbers of Cubans, other Hispanics, Caribbean/American-born
Blacks, and Non-Hispanic Whites and a relatively normal distribution of community
services used by the caregivers, the key outcome variable in this study. Two cooperating
home-care agencies are crucial for recruitment. Once a month, the agencies generate
a list of active cases (see Figure 1) that meet the inclusion criteria (patients must
have a family caregiver, be 65 years of age or older, and speak English or Spanish). This
list serves as the sampling frame from which participants can be selected at random.
This procedure diminishes bias by forcing staff to ask predetermined caregivers for
participation rather than selecting the ones they deem most cooperative.
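The random draw from a monthly sampling frame might be sketched as follows. This is an illustration only, not the agencies' actual procedure: the field names (`case_id`, `group`), the function `draw_from_frame`, and the assumption that the frame records each case's ethnic group are all hypothetical.

```python
import random
from collections import defaultdict


def draw_from_frame(frame, per_group, seed=None):
    """Randomly pick up to `per_group` cases from each ethnic group in the frame."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    by_group = defaultdict(list)
    for case in frame:
        by_group[case["group"]].append(case)

    selected = []
    for group in sorted(by_group):
        cases = by_group[group]
        # Never ask for more cases than the stratum actually contains.
        selected.extend(rng.sample(cases, min(per_group, len(cases))))
    return selected


# Hypothetical monthly list of active cases (case IDs only, no names).
GROUPS = ["Cuban", "Other Hispanic", "Black", "Non-Hispanic White"]
frame = [{"case_id": i, "group": GROUPS[i % 4]} for i in range(40)]

sample = draw_from_frame(frame, per_group=3, seed=2007)
```

Because the cases are fixed by the draw before staff see them, staff cannot substitute caregivers they expect to be more cooperative.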
Although a direct approach of researchers contacting patients and caregivers would
be more efficient, we are limited by stringent government regulations that prohibit the
disclosure of patient names from institutions to researchers before the subjects have
agreed to be contacted. Consequently, a key person or liaison in each agency who is
committed to keeping track of assigned cases, and who guides and follows up with staff, is
essential.

Figure 1 Subject Recruitment Process

Based on the project’s timeline, reaching the required sample size of 850 caregivers calls for
25 to 30 interviews per month. In these agencies, about 50% of the potential participants
actually deliver completed interviews. Frequently, a home health nurse visits a
selected patient only once or twice and forgets to introduce the study. Other reasons for
missed referrals are a heavy workload and non-committed temporary and part-time
employees. The remaining missed referrals are due to caregiver refusal.
This situation is problematic, but we are bound by the confidentiality regulations
mentioned above. Therefore, we supplement agency referrals with individuals targeted
in the community.
Recruiting caregivers. Quality data depend on a relatively low refusal rate, since elimi-
nating entire categories of participants with fears about research or disinterest intro-
duces bias (unfair distribution of participant characteristics) and increases non-
response error (Subcommittee, 2001). The importance of the interaction that takes
place in the recruitment process in controlling non-response error cannot be overes-
timated (Couper, 1997). Consequently, the researchers with the assistance of the
agency liaison have conducted intensive workshops for agency staff with the goals of
raising enthusiasm for the study and understanding the relationship between recruit-
ment methods and data quality. Recruitment skills have been taught through role-play
with a script composed for the study. Staff also learned to inform caregivers of the
study’s potential usefulness in helping future caregivers, i.e., the compiled information
will be used to inform policy makers of service needs of caregivers.
Finally, they give an informative letter from the researchers to the prospective
participants and inform the caregivers about a small compensation for completing a
1.5-hour interview. Such rewards are recommended by research funding agencies in
the United States and accepted by the ethics committee. After having communicated
these study details, the visiting staff ask the prospective participants for the permission
to be contacted by a researcher (see Figure 1 – Agreement to Participate).
Additional training sessions are held periodically for new staff individually or in
small groups. The research team has learned that the training of small groups of highly
motivated health care workers is most effective in assuring cooperation and that the
motivation needs to be maintained. We encourage, reward, and support the liaison person
and the cooperating staff on an ongoing basis through telephone calls, joint lunch
meetings, holiday cards, special recognition of their efforts in research reports, and
a project newsletter that outlines study progress, reports meaningful caregiver stories,
and includes notes of appreciation.
In spite of these efforts, the actual refusal rate (caregivers refusing to participate after
being introduced to the study) has been approximately 23%, suggesting a potential
for non-response error. The major reasons for caregiver refusal are feeling overwhelmed
or mistrustful. Unfortunately, we do not have access to demographic
and health data of the refusing individuals; thus, the estimation of a systematic
sampling error is difficult.
The issue of lowering coverage error (Subcommittee, 2001) is even more com-
plex. In this study, as described above, we attempt to recruit comparable numbers of
participants in the three major ethnicities and an acceptable variance in service use.
We conducted a preliminary analysis with the first 250 cases as a basis for discussion
of sample coverage and need for change in participant selection methods. Two major
deficiencies became apparent: insufficient Caribbean and American-born Black caregivers
and a small number of caregivers who use formal services extensively.
Perhaps the most challenging segment of black participants is immigrants from
Haiti. A home care agency initially chosen because it is located in the heart of Little
Haiti, the home of the majority of Haitian immigrants in the Miami-Dade area, did
not cooperate. Even though the agency employed Haitian nurses to work with these
patients, these nurses were not able to recruit a single caregiver and reported that
Haitians were suspicious and unwilling to cooperate with researchers. It has become
clear that the recruitment can become successful only through the use of an essential
key person. We involve a charismatic male community leader who has been success-
ful in referring cases. He accompanies one of the other interviewers to the interviews
to alleviate fears and helps in case of language problems. Additionally, we now recruit
caregivers in targeted areas in the community with the assistance of minority gradu-
ate nursing students who, as part of their research experience, ask people of their
own ethnicity for participation. This strategy has been successful in boosting the
number in the deficient ethnic groups by an average of five participants per month.
Evaluating success to date, we are satisfied with the geographic spread of our care-
givers throughout the area, improved ethnic variance and a wide range of income and
education levels.
To access high service users, we engage social workers and other key personnel of
community organizations that offer case management and link caregivers with pro-
grams in the community. In spite of these improvements, the team must continue to
tabulate the representation of all major ethnic groups and the use of services as time
progresses.
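The ongoing tabulation of group representation could be sketched along these lines. The counts, the per-group target, and the function name `recruitment_gaps` are invented for illustration and do not reflect the study's figures.

```python
from collections import Counter


def recruitment_gaps(cases, groups, target_per_group):
    """Return how many more participants each group needs to reach the target."""
    counts = Counter(case["group"] for case in cases)
    return {
        group: target_per_group - counts[group]
        for group in groups
        if counts[group] < target_per_group
    }


# Hypothetical interim tally; the numbers are made up for this example.
groups = ["Cuban", "Other Hispanic", "Black", "Non-Hispanic White"]
cases = (
    [{"group": "Cuban"}] * 5
    + [{"group": "Other Hispanic"}] * 4
    + [{"group": "Black"}] * 1
    + [{"group": "Non-Hispanic White"}] * 6
)

gaps = recruitment_gaps(cases, groups, target_per_group=4)
```

A tally of this kind, repeated at intervals, shows which groups need targeted recruitment before the shortfall becomes hard to correct. The same approach could be applied to levels of service use.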
If the staff’s skills are used effectively, caregivers agree to participate and indicate
their name and phone number, preferred language for the interview, and a best time
to be contacted by a researcher on a form provided. The health care worker then sends
the completed participant forms to the research team by fax (see Figure 1).
Scheduling appointments. Having received the referrals, the research team takes over the
responsibility for the rest of the research process. In addition to the principal investi-
gator and two co-investigators, the team consists of a project director, five data collec-
tors with excellent interpersonal skills, a bilingual Spanish language specialist and
quality control supervisor, and a research assistant.
The importance of having a multiethnic data collection team for this type of study
cannot be overestimated. A designated Latino data collector receives all referrals of
Spanish-speaking caregivers, contacts these caregivers, and schedules appointments at
their convenience. The project director schedules the appointments for English-speaking
participants. She prints out electronic road maps and assigns two data collectors to
each case. In addition to safety, going out in pairs promotes data quality through
mutual supervision of interviewing procedures and data form completion.

Data gathering
In this stage, the meaning of data shifts from the names of potential participants to
actual participants and to information gathered from these participants. Figure 2
depicts the data-gathering process starting with attaining consent from the caregiver
and ending with a completed interview schedule.
Attaining consent. Informed consent forms instruct research participants about their
involvement in the study, their benefits and rights. Even in studies involving minimal
risk to participants, like this project, the text on the consent form must follow strict
government regulations and must be written at an educational level participants will
understand. Before the interview, the data collectors read the text to the participants.
The participants sign the forms if they agree with the conditions and receive a copy.
For the protection of the participants, the researchers keep the forms safely locked in
a compartment and separate from the interview data. Team members involved with
data analysis or report writing thus will be unaware of personal identities.

Figure 2 Data Gathering Process

Setting the stage for the interview. The interview (see Figure 2 – Structured Interview) is a
standard and common technique in social science disciplines for generating social
research data (Hester et al., 1994). Personal interviews have the advantage of increasing
the accuracy and completeness of data since the interviewer can clarify questions
and make sure every item is addressed. They do not allow anonymity, however, and caregivers
interacting with an unfamiliar person may not be willing to share sensitive
information (Polit et al., 2003).
The success of the personal interview, that is, honestly delivered quality data, is contingent
upon the caregiver’s trust in the interviewer’s intentions and the quality of their interactions
(Billiet et al., 1988). To prevent further non-response error at this stage, the
data collectors must use optimal communication skills. Hispanic and other minority
caregivers need to develop trust to overcome fear and suspicion that are often based
on past negative experiences in their countries of origin. This underscores the neces-
sity for setting the stage before starting the procedure.
Interviewers in this study are motivated and trained in a three-day workshop fol-
lowing techniques similar to those described by Stycos (1955). On the first day, our
training focused heavily on establishing a trusting relationship with the respondents.
The interviewers learned to first let the caregivers know who they are by describing
their role on the research team. They then let the caregivers decide on a location for
the interview and try to follow cultural rules of politeness. After sitting down togeth-
er, interviewers socialize with the caregivers until they feel comfortable. Comments
about the caregivers’ home, furnishings, family photographs, artifacts, or decorations
are useful to find common ground and to establish trust.
Interviewing. An unresolved debate exists around the standardization of the interviewing
process versus the conversational approach (Suchman et al., 1990; Sanchez &
Morchio, 1992; Pawson, 1996; Conrad et al, 1999). To be both personal and
strictly consistent, we followed the hybrid approach of Biemer and Lyberg (2003) and trained
the interviewers to follow the uniform questionnaire in the proper sequence but to be
flexible in the general conversation.
Data validity, reliability, integrity, and completeness depend on the interview form
and interview process. With regard to process, the ability of the interviewer to make
contact with the respondent and to secure cooperation is undoubtedly important in
meeting the goal of quality data (Bradburn et al., 2004). After the classroom training,
interviewers learned through supervised practice interviews to conduct the procedure
professionally, with a caring attitude, by facing the caregiver when talking, speaking
clearly and slowly, waiting for responses, explaining details carefully, and asking
repeatedly if there are any questions. They were warned against showing surprise
at responses, making inappropriate comments about their private life, and
using a business style resembling an interrogation. Instead, they keep the
momentum going with a smile, some humorous words, and concern for the caregiver’s
well-being.
In the field, the interviewers first remove certain obstacles to communication, for
example, by turning off the television or politely asking visitors not to disturb the
interaction. To minimize patient disruptions, one of the interviewers spends time and
socializes with the patient, seeking the help of the caregiver only if an urgent need
arises. After having gained the caregiver’s full participation, the primary interviewer
follows a simple set of rules:

• Read instructions and each item exactly as written in the interview schedule.
• Enter each answer immediately in the space provided.
• Read the question a second time if the caregiver has difficulty understanding it.
(The meaning of a single word can be clarified).
• Follow the instructions to the interviewer on the interview schedule.
• Make notes at the end of the interview about anything unusual in the interaction
with the caregiver (e.g., emotional reaction, cognitive problems).
• Check every page for omitted items at the end of the interview.

These rules help to assure consistency in asking the questions and therefore reliability
of the data (Polit et al., 2003). To enhance consistency and reliability further,
the data collectors use laminated cards on which the possible response options to the
structured questions are listed in large print. These are particularly helpful for elderly
participants.
The team strives to leave the participants satisfied and feeling that they have made
a worthwhile contribution. At the conclusion of their structured interview, data
collectors give the respondents the opportunity to express themselves freely and
communicate anything they perceive as important caregiver issues. The respondents answer four
open questions about how they feel about being a caregiver, what would make their
life easier, what problems they have encountered with existing programs and what
other services would be helpful. Taking notes of what is said conveys the importance
of what is reported and reading the notes back to the participant verifies accuracy
and correct interpretation of the issues. In addition to citing parts of these stories in
the project newsletter and collecting citations as evidence for future work with
policy makers, qualitative information can add to data quality if it lends depth to
quantitative data interpretation or clarifies certain findings derived from the analysis.
Therefore, each interviewer transcribes the caregivers’ qualitative responses on an
electronic file. The files of the four interviewers will then be merged and the text will
be ready for a content analysis.

Interview supervision. Interviews are periodically supervised to ensure ongoing data
collection quality and to evaluate whether the process is evolving as envisioned. This
is crucial since interviewers tend to settle down into routine patterns after some time,
may skip certain precautions, or lose their concentration. To prevent errors, each data
collector’s tenth interview is supervised by the quality control supervisor or the
principal investigator (PI), who
will take on the role of the secondary interviewer, listen in, and score concurrently
every response of the participant. The two interviewers subsequently compare the two
interview schedules for discrepancies and discuss the quality of communication and
the visit in general. Problems are brought to the team to be discussed and solved dur-
ing team meetings held every other week.
Also to control interview quality, the quality control supervisor, who is bilingual,
calls every tenth participant to inquire whether the interview was satisfactory, whether the
interviewer was polite, and whether the participant’s concerns were addressed. The investigator
presents the quality reports to the interviewer involved either as a motivational
incentive or corrective measure, depending on the nature of the report.

Data monitoring
Data monitoring consists of checking the interview schedule for completeness and
logic after the items are marked, preparing the data for entry, conducting data entry by
two independent entry persons, checking the data entries, and cleaning the data file,
as pictured in Figure 3.
Data monitoring addresses the integrity of the data and consequently their reliability
(Condelli et al., 2002) and it helps control for processing errors (Arondel et al.,
1998). To minimize serious measurement error (Arondel et al., 1998), the researchers
in this caregiver study included standardized instruments in the interview schedule to
measure family dynamics and cultural patterns as well as patient and caregiver characteristics.
They selected tools that were appropriate for multiethnic samples.
Nevertheless, certain tools had to be constructed for the study. To reduce the likelihood
of measurement error, all instruments were piloted with two small samples of
white, black, and Hispanic caregivers and were found reliable and easy to administer
to caregivers.

Figure 3 Data Monitoring Process

Preparing for data entry. The quality of interview data is contingent upon the prevention
of errors derived from marking the wrong response choices or omitting responses.
After the interview, the second data collector checks all pages of the interview schedule
for completeness. After the interview schedules are submitted for data entry, another
check is performed. Error-free data require that the research director make sure that
each entry pertains clearly to a particular response and is logical. For example, if a
caregiver is marked as being single, a later question about who lives in the household
cannot include a mark for a spouse. The caregiver may have to be called for clarification
if entries are inconsistent or omitted by error. Clean interview schedules are then
ready for data entry (see Figure 3 – Clean Raw Data).
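The completeness and logic checks described here were performed by hand on paper schedules. For readers who hold the responses electronically, a minimal sketch of the same two checks follows. The item names (`marital_status`, `household_members`, `caregiver_age`) and the function `check_schedule` are hypothetical and are not taken from the study's interview schedule.

```python
def check_schedule(record, required_items):
    """Return a list of problems found in one completed interview schedule."""
    problems = []

    # Completeness: every required item must carry a response.
    for item in required_items:
        if record.get(item) is None:
            problems.append(f"omitted: {item}")

    # Logic: a caregiver marked as single cannot list a spouse in the household.
    household = record.get("household_members") or []
    if record.get("marital_status") == "single" and "spouse" in household:
        problems.append("inconsistent: single caregiver with spouse in household")

    return problems


# Hypothetical record with one omission and one logical inconsistency.
record = {
    "marital_status": "single",
    "household_members": ["spouse", "daughter"],
    "caregiver_age": None,
}
required = ["marital_status", "household_members", "caregiver_age"]
problems = check_schedule(record, required)
```

Any schedule that returns a non-empty list would be set aside for clarification with the caregiver before data entry.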
Every response accompanied by a consecutive number on the interview schedule
represents a variable, and code numbers are designated and marked clearly next to
every option on the interview schedule. Therefore, the interview schedule itself serves
as a codebook that documents the codes and the procedures for applying them
(Weston, 2001). Data can be entered into an electronic data file directly from the
interview schedule. An easily readable format of the questionnaire prevents data entry
mistakes and, thus, processing errors.
Entering data and cleaning the data file. A crucial feature within the data management
process is the regular review of data entered into the data system. To reduce inconsis-
tencies, two research team members enter the codes simultaneously into separate
temporary electronic data files, making sure with each entry that the item number in
the interview schedule matches the variable number in the electronic data file.
After every twenty cases, the research assistant performs a consistency check. First,
he examines the two temporary files for stray codes (numbers inconsistent with the
codebook). Next, he calculates frequency distributions on all 430 variables for both
files. If the two frequency distributions of a particular variable are not equal, he locates
the discrepancies by comparing all entries in the two data sets, finds the correct entries
on the interview schedules, and rectifies the errors. We find an average of fifteen dis-
crepancies in every twenty cases, in spite of taking great care when entering the codes.
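The two steps of this consistency check, screening for stray codes and comparing frequency distributions, can be sketched as below. The sketch assumes each temporary file is a list of cases in the same order, with one code per variable; the variable names, codebook contents, and function names are hypothetical.

```python
from collections import Counter


def stray_codes(data_file, codebook):
    """List (case_index, variable, code) for every code the codebook does not allow."""
    return [
        (i, var, code)
        for i, case in enumerate(data_file)
        for var, code in case.items()
        if code not in codebook[var]
    ]


def unequal_frequencies(file_a, file_b, variables):
    """Return the variables whose frequency distributions differ between two files."""
    return [
        var
        for var in variables
        if Counter(case[var] for case in file_a) != Counter(case[var] for case in file_b)
    ]


# Hypothetical codebook and two independently entered temporary files.
codebook = {"v1": {1, 2, 3}, "v2": {0, 1}}
file_a = [{"v1": 1, "v2": 0}, {"v1": 2, "v2": 1}, {"v1": 3, "v2": 1}]
file_b = [{"v1": 1, "v2": 0}, {"v1": 2, "v2": 1}, {"v1": 9, "v2": 1}]

strays = stray_codes(file_b, codebook)
flagged = unequal_frequencies(file_a, file_b, codebook)
```

Here the mistyped code 9 in the second file is reported both as a stray code and as a variable with unequal frequencies, which points the reviewer back to the paper schedule for that case.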
The comparison of frequency distributions controls for most wrongly entered codes
as well as omitted entries. Nevertheless, the procedure may be flawed, since errors
in the separate files can cancel each other out in the frequency counts while individual
cases still differ between the two files. To reduce such errors, we initially
compared five randomly selected cases per twenty newly entered cases, item by
item, on both temporary files. Since we discovered no discrepancies for three months,
we are now comparing only two cases to ascertain correctness.
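A small constructed example shows how offsetting errors escape the frequency comparison but not an item-by-item comparison. The data and the function `case_discrepancies` are hypothetical, and both files are assumed to list the cases in the same order.

```python
from collections import Counter


def case_discrepancies(file_a, file_b):
    """Compare two files case by case and item by item; return every mismatch."""
    return [
        (i, var, a[var], b[var])
        for i, (a, b) in enumerate(zip(file_a, file_b))
        for var in a
        if a[var] != b[var]
    ]


# In the second file the codes of the two cases were swapped by mistake.
file_a = [{"v1": 1}, {"v1": 2}]
file_b = [{"v1": 2}, {"v1": 1}]

# The frequency distributions are identical, so that check raises no flag.
same_counts = Counter(c["v1"] for c in file_a) == Counter(c["v1"] for c in file_b)

# The item-by-item comparison still finds both mismatched entries.
diffs = case_discrepancies(file_a, file_b)
```

With electronic files, a comparison of this kind could in principle cover every case rather than a random subset. Whether that is practical depends on the software available to the team.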
Attributing values to missing items presents a challenge the statistician can meet
best by deciding on the most suitable method of replacing the missing figure: mean,
median, or other point estimates; or the distribution of related non-missing values
(Dasu et al., nd). After meeting this challenge, the checked temporary file is merged
with the permanent data file. This procedure is repeated until a complete and clean data
set is ready for data analysis.
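
As a minimal sketch of the simplest point-estimate options named above, the following Python function replaces missing entries with the mean or median of the observed values. It assumes missing items are stored as `None`; the function name and the sample scores are hypothetical, and the choice of method remains the statistician's decision.

```python
from statistics import mean, median


def impute(values, method="median"):
    """Replace None entries with a point estimate from the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to estimate from")
    estimators = {"mean": mean, "median": median}
    fill = estimators[method](observed)
    return [fill if v is None else v for v in values]


# Hypothetical item scores for five cases; the third response is missing.
scores = [2, 4, None, 3, 10]

print(impute(scores, "median"))  # [2, 4, 3.5, 3, 10]
print(impute(scores, "mean"))    # [2, 4, 4.75, 3, 10]
```

The example also shows why the choice matters: a single extreme value (10) pulls the mean well above the median.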

Conclusions
The example illustrates a process leading to an optimally reliable data set ready to be
analyzed. In spite of countless sources of error, many such errors can be avoided by
a rigorous data quality maintenance plan. Nevertheless, the example showed that
certain errors, e.g. errors due to sampling and recruitment challenges, while being
diminished through strategic changes in the plan, could not be eliminated. The
outcome of a well-thought-out plan has the potential to bring forth results that rep-
resent the phenomenon studied (Schutz, 1997).
The example shows the use of data flow diagrams to make the process visible to the
whole research team. Even though team members may be concerned only with a small
number of the procedures pictured in the diagram, each member should understand
fully how his/her particular role contributes to the whole of the study. A strong team
is perhaps the most effective means to prevent errors. Errors can best be avoided if the
team members are committed to and assume responsibility for the project as a whole
and its outcomes. The researchers of the caregiver study encouraged such cohesion as
soon as the team was formed, starting with the initial training sessions. Using the data
flow diagrams as a visual aid, they included intensive discussions of data quality main-
tenance. Through these discussions, the team offered many of the ideas described
above, assisted in the final formatting of the interview schedule and helped to define
the recruitment and data collection procedures. Ongoing discussions of data quality
are used for joint problem solving.
No study is conducted free of problems and errors. Nevertheless, beyond limita-
tions imposed by confidentiality law, cultural differences, and funding and time lim-
its, this example presented to nurse researchers a carefully planned effort to reduce the
likelihood of common mistakes potentially affecting the validity of the data collected,
enriched by discussions of challenges encountered by cross-cultural researchers. Data
quality is the product of careful strategic planning and painstaking execution of the
plan and needs to be the number one objective of every serious researcher.

Acknowledgement
This study is supported by Florida International University MBRS grant, SCORE
Project, National Institutes of Health, National Institute of General Medical Sciences.

References
Arondel, P., Depoutot, R. (1998) Overview of quality issues when dealing with socio-economic statistical products in an international environment. Eurostat Publications. Retrieved August 2, 2006, from http://europa.eu.int/en/comm/eurostat/research/conferences/ntts-98/papers/cp/004c.pdf
Biemer, P., Lyberg, L. (2003) Introduction to survey quality. Hoboken, NJ: John Wiley & Sons, Inc.
Billiet, J., Loosveldt, G. (1988) Improvement of the quality of responses to factual survey questions by interviewer training. The Public Opinion Quarterly, 52:2, 190–211.
Bradburn, N., Sudman, S., Wansink, B. (2004) Asking questions: The definitive guide to questionnaire design. For market research, political polls, and social and health questionnaires. San Francisco, CA: John Wiley & Sons, Inc.
Condelli, L., Castillo, L., Seburn, M., Deveaux, J. (2002) Guide for improving NRS data quality: Procedures for data collection and training. Washington DC: American Institutes for Research and National Reporting System for Adult Education. Retrieved August 2, 2006, from http://www.nrsweb.org/traininginfo/dataquality.pdf
Conrad, F., Schober, M. (1999) Conversational interviewing and data quality. Proceedings of the Federal Committee on Statistical Methodology Research Conference. Washington, D.C.: Department of Labor, Bureau of Labor Statistics. Retrieved August 2, 2006, from http://stats.bls.gov/ore/pdf/st990250.pdf
Couper, M. (1997) Survey introductions and data quality. The Public Opinion Quarterly, 6, 317–338.
Dasu, T., Johnson, T. (n.d.) DQ overview. Research in Data Quality (DQ). Retrieved August 2, 2006, from http://www.dataquality-research.com
Divorski, S., Scheirer, M. (2001) Improving data quality for performance measures: Results from a GAO study of verification and validation. Evaluation and Program Planning, 24:1, 83–94.
Hester, S., Francis, D. (1994) Doing data: The local organization of a sociological interview. The British Journal of Sociology, 45, 675–695.
Infoarch Group (n.d.) The IT analyst’s desktop: Data flow diagram. Retrieved August 2, 2006, from http://www.infoarchgroup.com/qrdfd.htm
Ligon, G. (1996, April) Data quality: Earning the confidence of decision makers. Paper presented at the Annual Meeting of the American Educational Research Association, New York.
Monson, K. A., Kerr, M. J. (2004) Mining quality documentation: Data for golden outcomes. Home Health Care Management & Practice, 16, 192–199.

Munday, E. (2003) Data quality issues plague primary care. The British Journal of Healthcare Computing & Information Management, 20:3, 22–24.
Pawson, R. (1996) Theorizing the interview. The British Journal of Sociology, 47, 295–314.
Polit, D. F., Beck, C. T. (2003) Nursing Research: Principles and Methods (7th ed.). Philadelphia, PA: Lippincott Williams & Wilkins.
Pratt, V. (2003) The philosophy of the social sciences. New York: Taylor & Francis.
Sanchez, M., Morchio, G. (1992) Probing “don’t know” answers: Effects on survey estimates and variable relationships. The Public Opinion Quarterly, 56, 454–474.
Schutz, A. (1997) The phenomenology of the social world. Evanston, IL: Northwestern University Press.
Seale, C. (1999) Quality in Qualitative Research. Qualitative Inquiry, 5, 465–478.
Stycos, J. M. (1955) Further observations on the recruitment and training of interviewers in other cultures. The Public Opinion Quarterly, 19, 68–78.
Subcommittee on Measuring and Reporting the Quality of Survey Data, Federal Committee on Statistical Methodology (2001) Measuring and reporting sources of error in surveys. (Statistical Policy Working Paper 31), Washington D.C.: Statistical Policy Office, Office of Information and Regulatory Affairs, & Office of Management and Budget. Retrieved August 2, 2006, from http://www.fcsm.gov/01papers/SPWP31_final.pdf
Suchman, L., Jordan, B. (1990) Interactional troubles in face-to-face survey interviews. Journal of the American Statistical Association, 85:409, 232–241.
The Senior Secondary Assessment Board of South Australia (SSABSA) (1999) Data Flow Diagrams. Curriculum Area Manual for Technology. Retrieved August 3, 2006, from http://www.schools.ash.org.au/olshc/infotech/dataflow.htm
Tucker, D. (2003) Turning point for data accreditation. The British Journal of Healthcare Computing & Information Management, 20:6, 15–17.
Webster’s Dictionary (n.d.) Definition of data flow diagram. Dictionary of Computing. Retrieved August 2, 2006, from http://www.websterdictionary.org/definition/data%20flow%20diagram
Weston, C. (2001) Analyzing Interview Data: The development and evolution of a coding system. Qualitative Sociology, 24, 381–400.

Marie-Luise Friedemann is Professor at Florida International University, School of
Nursing. She has a doctorate and a Masters in Psychiatric Nursing from the University
of Michigan. She has held faculty and administrative positions at Wayne State
University and Florida International University. At present, she is a researcher at
FIU, teaches graduate courses in theory, cultural awareness, and research, and is
involved in foreign student exchange programs. Family nursing is her specialty and
her research projects involve family caregiving. Dr. Friedemann has developed a theoreti-
cal framework, the Framework of Systemic Organization, taught and applied in
practice in many countries. She also developed a theory-based family assessment
instrument, the Assessment of Strategies in Families, that has been translated into five
other languages and is used world-wide. She is the author of many journal articles and
several books and serves as consultant for research and curriculum development in
Europe and Latin America. Email: friedemm@fiu.edu

Carlos Mayorga was born in Colombia and graduated from medical school at the
Universidad Industrial de Santander (Colombia). Besides his medical degree, he has a
Masters in Health Management with emphasis in health policy. His professional expe-
rience ranges from the provision of primary healthcare to healthcare management
within the Colombian Social Security Health System. His primary interests are the
development of social and healthcare systems which demands attention to bench-
marks, detailed analysis, and projections for resource allocation. His special interest is
connecting Spanish and English-speaking healthcare organizations that synergistically
utilize multiple funding opportunities for mutual gain. Carlos moved to the United
States in 2003 to develop a health and wellness education company. He also serves as
a research consultant and a medical translator for Miami International Cardiology
Consultants, Miami Heart Institute and researchers in South Florida.

Frederick Newman, Professor in Health Policy & Management at Florida International
University since 1990, has a doctorate in Psychology from the University of
Massachusetts. Dr. Newman has been involved in mental health and substance abuse
services and treatment research that led to improved methods as well as changes in service
delivery and funding policies. He was a Fellow at the Center for Advanced Studies
in the Behavioral Sciences (1971–2) and elected a Fellow in the American Association
for the Advancement of Science, three divisions of the American Psychological
Association (Statistics, Measurement & Evaluation; Clinical; and Psychotherapy), and
the American Psychological Society. Dr. Newman served as Associate Editor for the
Journal of Consulting & Clinical Psychology from 1991 through 1996 and in 2003,
where his focus was on research methods. He has coauthored three books, over 300
refereed journal articles and 17 book chapters. Professor Newman has received
several teaching awards.
