
Methods

Machine Learning Methods for Identifying Critical Data Elements in Nursing Documentation

Eliezer Bose ▼ Sasank Maganti ▼ Kathryn H. Bowles ▼ Bonnie L. Brueshoff ▼ Karen A. Monsen

Background: Public health nurses (PHNs) engage in home visiting services and documentation of care services for at-risk clients.
To increase efficiency and decrease documentation burden, it would be useful for PHNs to identify critical data elements most
associated with patient care priorities and outcomes. Machine learning techniques can aid in retrospective identification of
critical data elements.
Objective: We used two different machine learning feature selection techniques: minimum redundancy–maximum relevance (mRMR) and the LASSO (least absolute shrinkage and selection operator) and elastic net regularized generalized linear model (glmnet in R).
Methods: We demonstrated application of these techniques on the Omaha System database of 205 data elements (features)
with a cohort of 756 family home visiting clients who received at least one visit from PHNs in a local Midwest public health
agency. A dichotomous maternal risk index served as the outcome for feature selection.
Application: Using mRMR as a feature selection technique, 50 of the 206 features were selected with scores greater than zero, and a generalized linear model applied to these 50 features achieved the highest accuracy of 86.2% on a held-out test set. Using glmnet as a feature selection technique and obtaining feature importance, 63 features had importance scores greater than zero, and a generalized linear model applied to them achieved the highest accuracy of 95.5% on a held-out test set.
Discussion: Feature selection techniques show promise toward reducing public health nursing documentation burden by
identifying the most critical data elements needed to predict risk status. Further studies to refine the process of feature selection
can aid in informing PHNs’ focus on client-specific and targeted interventions in the delivery of care.
Key Words: machine learning; nursing informatics; Omaha System; public health nursing
Nursing Research, January/February 2019, Vol 68, No 1, 65–72

In public health nursing and maternal–child home visiting, a lengthy therapeutic relationship between a public health nurse (PHN) and a high-risk parent has demonstrated itself as an effective approach for improving life course trajectories for this population (Barnard, 1998; Eckenrode et al., 2010; Monsen et al., 2006). Length of public health nursing home care varies and may be as short as a few visits to provide information and connections to resources in relatively simple situations or as long as several years to provide relationship-based therapeutic and educational interventions addressing highly complex needs (Monsen, Farri, McNaughton, & Savik, 2011). This costly intervention strategy has been shown to save public dollars over time (Eckenrode et al., 2010). However, with decreasing resources and time spent per client, it would be useful for PHNs to identify critical data elements during their home visits and decrease documentation burden (Keenan, Yakel, Tschannen, & Mandeville, 2008). Machine learning techniques can aid in retrospective investigation of critical data elements to determine what is important to the case and what is not.

Data reduction is the process of obtaining a reduced representation of the data set that is much smaller in volume but produces almost the same analytical results (Han, Kamber, & Pei, 2011). The main objective of this article is to introduce readers to two different machine learning techniques of feature reduction and illustrate their application using a large healthcare data set with the Omaha System.

Eliezer Bose, PhD, BEng, APRN, ACNP-BC, is Assistant Professor, University of Texas at Austin School of Nursing.
Sasank Maganti, MS, B-Tech, is Research Assistant, University of Minnesota Carlson School of Computer Science and Engineering, Minneapolis.
Kathryn H. Bowles, PhD, RN, FAAN, FACMI, is Professor, University of Pennsylvania School of Nursing, Philadelphia.
Bonnie L. Brueshoff, DNP, RN, PHN, is Public Health Director at Dakota County, Minneapolis, Minnesota.
Karen A. Monsen, PhD, RN, FAAN, is Associate Professor, University of Minnesota School of Nursing, Minneapolis.
Copyright © 2019 Wolters Kluwer Health, Inc. All rights reserved.
DOI: 10.1097/NNR.0000000000000315

METHODS

Sample
This study used deidentified existing data and was approved by the university institutional review board as nonhuman subjects


research. We randomly selected 756 clients from a large pool of maternal–child health clients who received public health nursing services between 2000 and 2009. Of the 756 clients, the majority were women (88%), White (62%), and not married (82%), with an average age of 19.2 years (SD = 10.4 years).

Data Source
The Omaha System is an interface terminology available in the electronic health record for documenting patient care that is recognized by the American Nurses Association and is a member of the Alliance for Nursing Informatics (Martin, Monsen, & Bowles, 2011). The Omaha System has three components: the Problem Classification Scheme, the Intervention Scheme, and the Problem Rating Scale for Outcomes. The Problem Classification Scheme is a taxonomy of 42 health concepts called problems that are organized under four domains: Environmental (4 problems), Psychosocial (12 problems), Physiological (18 problems), and Health-related behaviors (8 problems). Each problem has a set of unique signs/symptoms used in clinical assessments to further specify the problem. Signs/symptoms variables are binary (present/not present). The Intervention Scheme consists of four levels: problem, category, target, and care description. The first level (problem) consists of all of the problems in the Problem Classification Scheme. The second level (category) consists of four action terms: teaching, guidance, and counseling; treatments and procedures; case management; and surveillance. The third level (target) consists of 75 defined terms that provide additional information about the focus of the intervention. The fourth level (care description) is not taxonomic and was not used in this analysis. The Problem Rating Scale for Outcomes consists of three Likert-type ordinal scales for rating problem-specific client knowledge, behavior, and status (KBS) (1 = most negative to 5 = most positive). Omaha System data sets are particularly suited to intervention effectiveness research because use of the Omaha System in routine documentation generates relational problem-specific assessment, intervention, and outcomes data.

Independent/Predictor and Dependent Variables
We considered the Omaha System client assessment and service delivery variables independent variables. Client assessment variables were problem-specific signs/symptoms and problem-specific ratings for knowledge (no knowledge, superior knowledge), behavior (not appropriate, consistently appropriate), and status (extreme signs/symptoms, no signs/symptoms). These ratings were averaged across problems to yield one KBS score for each client before and after receiving public health nursing services. Service delivery variables included problem-specific intervention categories and targets. All of these variables were considered independent variables or features per patient.

For the dependent variable, a maternal risk index (MRI) variable developed previously as a metric for risk classification using the Omaha System was computed for each client (Monsen et al., 2011; Monsen, Peterson, et al., 2017). The MRI summarizes risk using weighted problem totals adjusted by baseline knowledge scores. The algorithm for transforming existing variables into the MRI score is as follows: MRI = (number of problems for which the client received interventions + 1 if Income problem + 1 if Substance use problem + 1 if Mental health problem) / baseline knowledge score (Monsen et al., 2011). We calculated the MRI for the very first encounter with the client to establish a baseline MRI. Risk index scores were partitioned at the median of the distribution to form two groups (low- and high-risk clients; n = 378 in both groups).

Data Analysis
We used R Version 3.3.2 software for our analysis purposes.

Feature Selection Techniques
In machine learning, feature selection, also known as variable subset selection, is the process of selecting a subset of relevant features (variables). If the goal is to reduce a data set's high dimensionality but still preserve variables, then a common solution to this conundrum is feature selection. The central premise when using a feature selection technique is that the data set contains features that are either redundant or irrelevant and as such can be removed without incurring much loss of information (Guyon & Elisseeff, 2003; Igbe, Darwish, & Saadawi, 2016). Some examples of machine learning feature selection techniques are LASSO (least absolute shrinkage and selection operator) and elastic net regularized generalized linear models (glmnet in R) and minimum redundancy–maximum relevance (mRMR).

Ridge and LASSO Regression
A general regression model with p predictors x1, x2, …, xp with response variable y is predicted by

ŷ = β̂0 + x1β̂1 + ⋯ + xpβ̂p (1)

A model fitting procedure, such as regression, produces the vector of coefficients β̂ = (β̂0, ⋯, β̂p). For instance, the ordinary least squares (OLS) estimates are obtained by minimizing the distance between the actual and estimated values of the dependent variable or target variable.

Each distribution has a measure of spread of the data elements such as variance. Whenever a model is constructed, the total variance can be split into variance explained by the estimators or independent variables (commonly referred to as R2) and the unexplained variance or the variance of the residuals. Root mean square error is the square root of the variance of the residuals. The higher the R2, the better the model is able to explain the variance of the distribution. Because OLS has been known to perform poorly in both prediction and interpretation, penalization techniques have been proposed


(Zou & Hastie, 2005). Penalized estimation is a procedure that reduces the variance of estimators (independent variables) by introducing substantial bias, which becomes a major component of the mean squared error, with variance contributing only a small part. A brief explanation about penalization follows. However, the reader would benefit from knowing about two bias-introducing procedures called the L1-norm and the L2-norm. As an error function, the L1-norm, often referred to as the least absolute deviations procedure, minimizes the sum of absolute differences between the target value (Yi) and the estimated values. As an error function, the L2-norm, also called least squares, minimizes the sum of the squares of the differences between the target value (Yi) and the estimated values. Most regression techniques use either the L1-norm or the L2-norm for penalization. For feature selection, the L1 penalty adds the sum of the absolute values of the coefficients to the error, and the L2 penalty adds the sum of the squared values of the coefficients to the error. Ridge regression creates a regression model penalized with the L2-norm, which has the effect of shrinking the coefficient values, allowing coefficients with minor contribution to the target variable to get close to zero. LASSO, on the other hand, creates a regression model penalized with the L1-norm, which has the effect of shrinking coefficient values, allowing some with a minor effect on the target variable to become zero (Kuhn & Johnson, 2013). Ridge regression (Hoerl & Kennard, 1988) and LASSO (Tibshirani, 1996) have been proposed to improve OLS. However, with inherent problems found in ridge regression, such as not producing a parsimonious model, and limitations of LASSO (Zou & Hastie, 2005), regularization techniques such as elastic net have been suggested.

Elastic Net
Elastic net creates a regression model that is penalized with both the L1-norm and the L2-norm. This has the effect of effectively shrinking coefficients (as in ridge regression) and setting some coefficients to zero (as in LASSO; Zou & Hastie, 2005). The final active set of features are those with nonzero coefficients (βs). This has been used within glmnet in R.

LASSO and Elastic Net Regularized Generalized Linear Models (glmnet in R)
glmnet fits a generalized linear model via penalization using the elastic net. Two important parameters needed in the glmnet model are the elastic net penalty (represented as alpha [α]) and the tuning parameter (represented as lambda [λ]). The elastic net penalty (α) bridges the gap between LASSO (α = 1, the default) and ridge (α = 0; Hastie & Qian, 2014). The tuning parameter λ controls the overall strength of the penalty, with successive repeats with different values for λ providing a regularization path. glmnet will thus fit a whole string of λ values. When λ is very small, the LASSO solution should be very close to the OLS solution, and all of the coefficients are in the model. As λ grows, fewer variables are kept in the model (because more and more coefficients will be zero valued). Lastly, using a cross-validation procedure enables glmnet to pick an optimal value for λ. Discrete estimates of the coefficients (βs) are made along the way. Readers interested in an in-depth mathematical treatment of glmnet should refer to the Friedman, Hastie, and Tibshirani (2010) article. Readers are further directed to the glmnet vignette by the original authors at https://web.stanford.edu/~hastie/glmnet/glmnet_alpha.html. The final active set of features are those with nonzero coefficients (βs). Once a model is developed, automatic variable selection procedures allow users to obtain a list of the features selected by the model. The concept of elastic net in automatic variable selection is similar to that of retaining "all the big fish" by stretching the fishing net (Knights, Costello, & Knight, 2011). We used the caret package in R to execute glmnet.

Minimum Redundancy–Maximum Relevance (mRMR)
In feature selection, it is important to choose features that are relevant for prediction but, more so, to have a set of features that are not redundant in order to increase robustness (Auffarth, López, & Cerquides, 2010; Hira & Gillies, 2015). Feature selection approaches have been categorized into filter-based methods (Yu & Liu, 2003), wrapper-based methods (Kohavi & John, 1997), and embedded methods (Guyon, Gunn, Nikravesh, & Zadeh, 2006). Filter-based feature selection methods apply a statistical measure to assign a score to each feature. Features are ranked by their score and selected either to be kept or removed from the data set. Wrapper-based methods treat the selection of features as a search problem, where several combinations of features are evaluated and compared. A predictive model is used to evaluate a combination of features, and a final score is assigned based on model accuracy (AlNuaimi, Masud, & Mohammed, 2015). Embedded methods learn which features best contribute to the accuracy of the model while the model is being created (AlNuaimi et al., 2015). The most common type of embedded feature selection methods are penalized methods such as LASSO, ridge regression, and elastic net. A special group of filter-based feature selection approaches tends to simultaneously select highly predictive but uncorrelated features. These approaches tend to select a subset of features having the most correlation with a class (relevance) and the least correlation between themselves (redundancy). In these algorithms, features are ranked according to the mRMR criteria (Radovic, Ghalwash, Filipovic, & Obradovic, 2017). It is likely that features selected according to maximum relevance could have rich redundancy; that is, the dependency among these features could be large (Ding & Peng, 2005). When two features highly depend on each other, the representative class discriminative power would not change much if one of them were removed. Therefore, the minimal redundancy condition can be added to select mutually exclusive features. The criterion combining the above two constraints is called mRMR (Ding & Peng, 2005). For


continuous features, the F-statistic can be used to calculate correlation with the class (relevance), and the Pearson correlation coefficient can be used to calculate correlation between features (redundancy; Radovic et al., 2017). Thereafter, features are selected one by one by applying a greedy search to maximize the objective function, which integrates the relevance and redundancy information of each variable into a single scoring mechanism (Radovic et al., 2017). Readers interested in learning more about mRMR are directed to Auffarth et al. (2010), De Jay et al. (2013), Radovic et al. (2017), and Ding and Peng (2005). mRMR is a very popular tool in biostatistics and genetic research to sort through data sets containing thousands to millions of features (Ding & Peng, 2005). Similar to glmnet in R, features are ranked according to their importance score.

APPLICATION
Findings of the two methods as applied to the Omaha System documentation data are presented and compared below. The application of each method is provided for clarity.

Application of mRMR for Feature Selection
We split the entire data set into a training and a testing data set using a 50–50 split. The training and testing data sets each had 378 patients and 206 features, with each patient present either in the training or in the testing data set. Using mRMR and making the class of MRI (high or low) the dependent variable and all the other variables predictors, we constructed a classic mRMR model. We obtained features retained by mRMR based on decreasing order of importance of their scores. Fifty features had scores greater than zero and thus were considered important features, with the top 20 features listed in Figure 1. A training and testing subset containing all the patients with only the above 50 features was retained for further accuracy testing. We constructed a generalized linear model and tested its accuracy on the held-out test data set, with the performance of the model indicated by the accuracy (expressed as a percentage), that is, instances of correct classification of MRI using the training model. We achieved the highest accuracy value of 86.2% with 50 features with α = .55 and λ = .04. Subsequent testing with 40 features and 30 features achieved lower accuracy values of 85.9% and 84.6%, respectively.

The top 20 features (Figure 1) using mRMR had 10 knowledge ratings (Residence, Mental health, Postpartum, Abuse, Oral health, Substance use, Family planning, Caretaking/parenting, Health care supervision, and Neglect), seven signs/symptoms (purposeless activities, inaccurate/inconsistent use of family planning methods, homelessness, inadequate/crowded living space, dissatisfied with family planning methods, steep/unsafe stairs, and apprehension/undefined fear), and three targets (growth and development, anatomy/physiology, and sickness/injury care). Overall, the most important feature was residence; when classified individually within the groupings, residence was the most important knowledge rating, purposeless activities was the most important sign/symptom, and growth and development was the most important target.

FIGURE 1. Importance of each variable using minimum redundancy–maximum relevance (mRMR). The name of each individual feature is preceded by
either knowledge (K), signs and symptoms (S), or target (T).
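The mRMR scoring just described (F-statistic against the class for relevance, Pearson correlation among features for redundancy, then a greedy search over a combined score) can be sketched as follows. This is only an illustrative Python analogue on synthetic data, not the mRMRe R package or the study's actual pipeline; the matrix sizes and the rescaling of the relevance scores are assumptions made for the demonstration.

```python
# Illustrative mRMR-style filter selection on synthetic data. Relevance is
# the F-statistic of each feature against the class; redundancy is the mean
# absolute Pearson correlation with already-selected features.
import numpy as np
from sklearn.feature_selection import f_classif

rng = np.random.default_rng(0)
X = rng.normal(size=(378, 20))                        # synthetic feature matrix
X[:, 1] = X[:, 0] + rng.normal(scale=0.1, size=378)   # feature 1 is redundant with 0
y = (X[:, 0] + X[:, 5] > 0).astype(int)               # synthetic binary risk class

relevance, _ = f_classif(X, y)
relevance = relevance / relevance.max()               # rescale so relevance and
                                                      # redundancy are comparable
                                                      # (an assumption of this sketch)

def mrmr_select(X, relevance, k):
    """Greedily pick k features maximizing relevance minus redundancy."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    selected = [int(np.argmax(relevance))]
    while len(selected) < k:
        candidates = [j for j in range(X.shape[1]) if j not in selected]
        scores = [relevance[j] - corr[j, selected].mean() for j in candidates]
        selected.append(candidates[int(np.argmax(scores))])
    return selected

top = mrmr_select(X, relevance, k=5)
print(top)
```

Note how the near-duplicate of feature 0 is scored down once feature 0 has been selected; that redundancy penalty is what distinguishes mRMR from a pure relevance ranking.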


Application of glmnet for Feature Selection
We imported the entire data set of 756 patients and 206 features per patient, with the outcome variable being the class of MRI. We used the same previously split training and testing data sets. Using glmnet and making MRI the dependent variable and all the other variables predictors, we constructed a model that used three repeats of cross-validation on the training data set (378 patients and 205 predictors, one outcome variable [MRI]). The advantage of using glmnet in R is that, for the full list of coefficients (βs) developed, more important coefficients will be larger than the less important ones and can easily be ranked by their magnitude. Variable importance evaluation functions, such as varImp in the caret package in R, use the glmnet model information with all measures of importance scaled to have a maximum value of 100 (Kuhn, 2012). This technique allows users to rank variables in order of importance. Figure 2 shows the top 20 variables listed in order of their importance. Sixty-three features had an importance value greater than zero, and further testing was performed using only those features, removing the other features. We iteratively reduced features, based on varImp, from 205 to 63, 50, 40, and 30. Using 63 features and constructing a generalized linear model achieved the highest accuracy of 95.5% with α = .55 and λ = .004. Subsequent lowering of features revealed lower accuracy values of 94.4%, 92.6%, and 92.3%, respectively.

The top 20 features (Figure 2) using glmnet had 10 knowledge ratings (Growth and development, Oral health, Caretaking/parenting, Postpartum, Mental health, Substance use, Residence, Abuse, Role change, and Income), six signs/symptoms (fearful/hypervigilant behavior, unsafe appliances/equipment, difficulty in obtaining family planning methods, inadequate/delayed medical care, other5, and other4), and four targets (food, behavior modification, mental/emotional signs and symptoms, and medication administration). Overall, the most important feature was food. When classified individually within the groupings, growth and development was the most important knowledge rating, fearful/hypervigilant behavior was the most important sign/symptom, and food was the most important target (Figure 3).

FIGURE 2. Importance of each variable using elastic net regularized generalized linear model (glmnet in R). The name of each individual feature is
preceded by either target (T), signs and symptoms (S) or knowledge (K).
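The glmnet workflow above (fit an elastic net, take the features with nonzero coefficients as the active set, then check accuracy on the held-out half) can be sketched in Python with scikit-learn. The study itself used glmnet and caret in R; here, l1_ratio plays the role of glmnet's α, the regularization strength C is roughly an inverse λ, and the data, sizes, and C value are assumptions made for the demonstration.

```python
# Elastic-net feature selection sketch (a scikit-learn analogue of the
# R glmnet workflow described in the article; the data are synthetic).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
X = rng.normal(size=(756, 30))                 # 756 "clients", 30 features
signal = X[:, 0] - 2 * X[:, 3]                 # only features 0 and 3 matter
y = (signal + rng.normal(scale=0.5, size=756) > 0).astype(int)

# 50-50 train/test split, mirroring the article's design
X_tr, X_te, y_tr, y_te = X[:378], X[378:], y[:378], y[378:]

# l1_ratio=0.55 mirrors the article's alpha = .55; C is an inverse penalty
# strength chosen arbitrarily here (glmnet would tune lambda by cross-validation)
model = LogisticRegression(penalty="elasticnet", solver="saga",
                           l1_ratio=0.55, C=1.0, max_iter=5000)
model.fit(X_tr, y_tr)

active = np.flatnonzero(model.coef_[0])        # nonzero betas form the active set
accuracy = model.score(X_te, y_te)
print(len(active), round(accuracy, 3))
```

The informative features survive the L1 component of the penalty with large coefficients, while many noise features are shrunk to exactly zero, which is the "retain all the big fish" behavior described above.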


FIGURE 3. Top 20 features selected by glmnet classified by knowledge ratings, signs/symptoms, and targets. The lower the rank, the more important the feature.
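For reference, the MRI transformation from the Methods section (the problem count, plus one point each for the Income, Substance use, and Mental health problems, divided by the baseline knowledge score) can be written out as a short function. The argument names and the client values below are hypothetical, chosen only to illustrate the published formula.

```python
# Sketch of the maternal risk index (MRI) algorithm described in Methods:
# (problems + 1 if Income + 1 if Substance use + 1 if Mental health) / knowledge.
# Argument names are hypothetical, not taken from the study's data dictionary.
def maternal_risk_index(n_problems, has_income, has_substance_use,
                        has_mental_health, baseline_knowledge):
    weighted = (n_problems + int(has_income) + int(has_substance_use)
                + int(has_mental_health))
    return weighted / baseline_knowledge

# A client with 6 problems receiving interventions, an Income problem, and a
# baseline knowledge rating of 2 on the 1-5 Omaha System scale
score = maternal_risk_index(6, True, False, False, 2)
print(score)  # 3.5
```

In the study, scores like this were computed at the very first encounter and then split at the median to form the low- and high-risk groups used as the classification outcome.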

Residence, Mental health, Family planning, and Health care supervision problems were important in these models, showing that a comprehensive holistic assessment yields critical information regarding baseline risk. This finding aligns with the literature regarding the importance of social and behavioral determinants of health in influencing health risk (Monsen, Brandt, et al., 2017). This finding is further reinforced by the intervention target food, most likely used to describe interventions focused on obtaining food resources, among the most basic of human needs. However, the fact that these specific signs/symptoms and intervention targets were found to be important suggests that it may be possible to reduce the assessment to fewer signs/symptoms, and interventions documentation may also be reduced in the interest of streamlining documentation. However, any data-driven documentation reduction recommendation must be informed by clinicians who rely on documentation as part of quality care, as well as compliance with administrative requirements. These findings should be evaluated with additional data sets for the maternal–child population to better understand the importance of the particular signs/symptoms and interventions in the model.

For the glmnet model, the finding that two signs/symptoms of "other" were important in the model has critical implications for documentation because it is impossible to interpret the meaning of these two variables. Use of terms such as "other" or similar nondefined or customized terminologies limits our ability to discover meaningful new knowledge using these and other data mining methods (Bowles et al., 2013).

Feature selection techniques such as mRMR and glmnet are designed to enable users to reduce high dimensionality within a data set. Feature selection techniques enable selection of important features based on scoring metrics. We demonstrated application of these techniques on the Omaha System in order to identify important features. Feature selection techniques can improve prediction performance and can inform more efficient documentation by identifying critical data elements. These techniques could be applied to many big data applications, and nurse researchers handling large


data sets can benefit from implementation of such techniques with their data sets.

Limitations
The usual limitations of observational data sets apply to this study. Furthermore, we include clinical implications in order to illustrate the relevance of feature selection for clinical practice. Further research is needed to determine whether these findings apply across maternal–child health populations; this is the focus of our future research endeavors.

Conclusion
We described and tested two different machine learning techniques of feature selection and applied them on a data set generated by PHNs using the Omaha System during routine documentation. Further studies to refine the process of feature selection may aid in informing PHNs' and administrators' documentation decisions in ensuring care efficiency and effective documentation. Feature selection techniques show promise toward reducing public health nursing documentation burden by reducing the number of critical data elements needed during home visits. These machine learning methods have far-reaching applications both with the Omaha System and, in general, with any application that requires the reduction of features in big data.

Accepted for publication June 3, 2018.
This research work has adhered to strict ethical conduct of research and was approved by the University of Minnesota Institutional Review Board.
The authors have no conflicts of interest to report.
Corresponding author: Eliezer Bose, PhD, BEng, APRN, ACNP-BC, University of Texas at Austin School of Nursing, 1710 Red River St., Austin, TX 78712 (e-mail: ebose@nursing.utexas.edu).

REFERENCES
AlNuaimi, N., Masud, M. M., & Mohammed, F. (2015). ICU patient deterioration prediction: A data-mining approach. arXiv preprint arXiv:1511.06910. doi:10.5121/csit.2015.51517
Auffarth, B., López, M., & Cerquides, J. (2010). Comparison of redundancy and relevance measures for feature selection in tissue classification of CT images. In Industrial Conference on Data Mining (pp. 248–262). Berlin, Heidelberg: Springer.
Barnard, K. (1998). Developing, implementing, and documenting interventions with parents and young children. Zero to Three, 18, 23–29.
Bowles, K. H., Potashnik, S., Ratcliffe, S. J., Rosenberg, M., Shih, N. W., Topaz, M., & Naylor, M. D. (2013). Conducting research using the electronic health record across multi-hospital systems: Semantic harmonization implications for administrators. Journal of Nursing Administration, 43, 355–360. doi:10.1097/NNA.0b013e3182942c3c
De Jay, N., Papillon-Cavanagh, S., Olsen, C., El-Hachem, N., Bontempi, G., & Haibe-Kains, B. (2013). mRMRe: An R package for parallelized mRMR ensemble feature selection. Bioinformatics, 29, 2365–2368. doi:10.1093/bioinformatics/btt383
Ding, C., & Peng, H. (2005). Minimum redundancy feature selection from microarray gene expression data. Journal of Bioinformatics and Computational Biology, 3, 185–205. doi:10.1142/S0219720005001004
Eckenrode, J., Campa, M., Luckey, D. W., Henderson, C. R. Jr., Cole, R., Kitzman, H., … Olds, D. (2010). Long-term effects of prenatal and infancy nurse home visiting on the life course of youths: 19-year follow-up of a randomized trial. Archives of Pediatrics & Adolescent Medicine, 164, 9–15. doi:10.1001/archpediatrics.2009.240
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software, 33, 1–22.
Guyon, I., & Elisseeff, A. (2003). An introduction to variable and feature selection. Journal of Machine Learning Research, 3, 1157–1182.
Guyon, I., Gunn, S., Nikravesh, M., & Zadeh, L. A. (Eds.). (2006). Embedded methods. In Feature extraction: Foundations and applications (pp. 137–162). Berlin, Heidelberg: Springer.
Han, J., Kamber, M., & Pei, J. (2011). Data reduction. In Data mining: Concepts and techniques (3rd ed., pp. 99–110). Waltham, MA: Morgan Kaufmann.
Hastie, T., & Qian, J. (2014). Glmnet vignette. Retrieved from https://web.stanford.edu/~hastie/glmnet/glmnet_alpha.html
Hira, Z. M., & Gillies, D. F. (2015). A review of feature selection and feature extraction methods applied on microarray data. Advances in Bioinformatics, 2015, 198363. doi:10.1155/2015/198363
Hoerl, A., & Kennard, R. (1988). Ridge regression. In Encyclopedia of statistical sciences (Vol. 8, pp. 129–136). New York, NY: Wiley.
Igbe, O., Darwish, I., & Saadawi, T. (2016). Distributed network intrusion detection systems: An artificial immune system approach. In Connected Health: Applications, Systems and Engineering Technologies (CHASE), 2016 IEEE First International Conference on (pp. 101–106). City University of New York, NY: IEEE. doi:10.1109/CHASE.2016.36
Keenan, G. M., Yakel, E., Tschannen, D., & Mandeville, M. (2008). Documentation and the nurse care planning process. In R. G. Hughes (Ed.), Patient safety and quality: An evidence-based handbook for nurses (Chap. 49). Rockville, MD: Agency for Healthcare Research and Quality.
Knights, D., Costello, E. K., & Knight, R. (2011). Supervised classification of human microbiota. FEMS Microbiology Reviews, 35, 343–359. doi:10.1111/j.1574-6976.2010.00251.x
Kohavi, R., & John, G. H. (1997). Wrappers for feature subset selection. Artificial Intelligence, 97, 273–324. doi:10.1016/S0004-3702(97)00043-X
Kuhn, M. (2012). Variable importance using the caret package. Retrieved from http://idg.pl/mirrors/CRAN/web/packages/caret/vignettes/caretVarImp.pdf
Kuhn, M., & Johnson, K. (2013). An introduction to feature selection. In Applied predictive modeling (pp. 487–519). New York, NY: Springer.
Martin, K. S., Monsen, K. A., & Bowles, K. H. (2011). The Omaha System and meaningful use: Applications for practice, education, and research. CIN: Computers, Informatics, Nursing, 29, 52–58. doi:10.1097/NCN.0b013e3181f9ddc6
Monsen, K. A., Brandt, J. K., Brueshoff, B. L., Chi, C. L., Mathiason, M. A., Swenson, S. M., & Thorson, D. R. (2017). Social determinants and health disparities associated with outcomes of women of childbearing age who receive public health nurse home visiting services. Journal of Obstetric, Gynecologic & Neonatal Nursing, 46, 292–303. doi:10.1016/j.jogn.2016.10.004
Monsen, K. A., Farri, O., McNaughton, D., & Savik, K. (2011). Problem stabilization: A metric for problem improvement in home visiting clients. Applied Clinical Informatics, 2, 437–446. doi:10.4338/ACI-2011-06-RA-0038
Monsen, K. A., Fitzsimmons, L. L., Lescenski, B. A., Lytton, A. B., Schwichtenberg, L. D., & Martin, K. S. (2006). A public health nursing informatics data-and-practice quality project. CIN: Computers, Informatics, Nursing, 24, 152–158.
Monsen, K. A., Peterson, J. J., Mathiason, M. A., Kim, E., Votava, B., & Pieczkiewicz, D. S. (2017). Discovering public health nurse-specific
infancy nurse home visitation on the life course of youths: 19-year family home visiting intervention patterns using visualization

Copyright © 2018 Wolters Kluwer Health, Inc. All rights reserved.


72 Machine Learning for Critical Data Elements www.nursingresearchonline.com

techniques. Western Journal of Nursing Research, 39, 127–146. Yu, L., & Liu, H. (2003). Feature selection for high-dimensional
doi:10.1177/0193945916679663 data: A fast correlation-based filter solution. Proceedings of the
Radovic, M., Ghalwash, M., Filipovic, N., & Obradovic, Z. (2017). 20th International Conference on Machine Learning (ICML-03)
Minimum redundancy maximum relevance feature selection (pp. 856–863). Retrieved from http://www.aaai.org/Papers/ICML/
approach for temporal gene expression data. BMC Bioinformatics, 2003/ICML03-111.pdf
18, 9. doi:10.1186/s12859-016-1423-9 Zou, H., & Hastie, T. (2005). Regularization and variable selection
Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. via the elastic net. Journal of the Royal Statistical Society:
Journal of the Royal Statistical Society: Series B (Statistical Meth- Series B (Statistical Methodology), 67, 301–320. doi:10.1111/
odology), 58, 267–288. j.1467-9868.2005.00503.x

