ARTICLE INFO

Keywords:
Poverty
Vector autoregression
Social network analysis
Ensemble modeling and design science

ABSTRACT

This study builds on the existing poverty literature and leverages data from disparate sources including both big data sources such as satellite images and traditional data sources such as the federal, state, and local agencies to develop a context-specific poverty prediction model using design science. We examine whether and to what extent infrastructure development as measured from the satellite images as well as spatial spillovers helps predict the poverty rate of a given census tract. We also develop and implement a Vector Autoregression (VAR) based ensemble model that combines predictions from daytime and nighttime imaging with adjacent tracts’ poverty rates and other economic and demographic factors identified in prior literature. Our results show that daytime imaging and spatial network features have significant predictive value and that a combination of these features gives the best predictive power. In addition, we find that the skewness of poverty rates among adjacent census tracts, not the average, is a significant predictor showing the importance of distribution of poverty around a region. Our work has major implications for researchers using deep learning and network analysis for policy development and decision making.
1. Introduction

Poverty is a worldwide issue intertwined with other societal issues. In the United States, the widening wealth and income gap between the rich and poor in the past four decades has created acute economic and social challenges for federal, state, and local governments [1]. To effectively combat poverty, it is important not only to understand the causes of poverty but also to predict poverty based on contextual and regional factors (i.e., local demographic, geographic and economic factors) at a granular level. Such knowledge can become the basis for resource allocation decisions for state and local governments and policy makers.

Although extant research has identified behavioral, structural, and political causes of poverty, studies in this area tend to be disjointed and have used data from surveys, experiments, and other traditional sources. Furthermore, given that approaches towards poverty reduction are often unique to their locales and demographics, there is a need for location-specific models that can pinpoint factors influencing poverty in each area. Finally, traditional data sources have issues like delay in data acquisition and inability to capture dynamic data like changing landscape and demographics. The advent of big data and data analytics methods has made it possible for researchers to overcome these issues by leveraging large-scale datasets such as daytime and nighttime satellite images in poverty research [2–5]. However, prior studies tend to rely on a single data source. There has been a lack of research integrating diverse data from disparate sources to predict poverty rates. In summary, there is a need for innovative methods that can combine traditional data and big data to leverage the strengths of each.

In this research, we combine data from a variety of traditional and big data sources to comprehensively identify context-specific features that may be associated with poverty. We also develop a method to extract more granular census-tract level features such as landscape features and infrastructure development using daytime satellite images, which have the potential to help us more accurately predict poverty. Furthermore, based on prior poverty research, our study attempts to quantify the spatial spillover effects among adjacent census tracts. Our research objective is to investigate whether and to what extent landscape features and infrastructure development estimated from satellite image data combined with spatial spillover effects helps predict the poverty level in an area.
* Corresponding author.
E-mail addresses: bhogstra@memphis.edu (B. Hoogstra), svlchety@memphis.edu (S. Velichety), czhang1@iastate.edu (C. Zhang).
https://doi.org/10.1016/j.dss.2023.114080
Received 1 February 2023; Received in revised form 21 August 2023; Accepted 4 September 2023
Available online 9 September 2023
0167-9236/© 2023 Elsevier B.V. All rights reserved.
B. Hoogstra et al. Decision Support Systems 177 (2024) 114080
We use Shelby County in the state of Tennessee, in which Memphis is the major city, as our research context. Given that Memphis is consistently ranked among the five cities with the highest poverty rates in the United States, we choose this county as the appropriate context for our research. Our unit of analysis is the census tract, which is a more granular level of analysis than most prior poverty studies. We quantify landscape features, infrastructure, and urban development of a census tract using a CNN-LSTM network on daytime census tract images, which not only segregates different landscape features such as residential and industrial neighborhoods, barren land, and forest areas, but also identifies temporal changes to these features. In addition, we use the intensity of nighttime images from a census tract as a proxy for economic activity. A social network of adjacent census tracts is used to quantify the spatial spillover effect on the poverty rate of a focal tract. We also propose and implement a Vector Autoregression (VAR) based ensemble model that combines poverty predictions from daytime images with features derived from disparate data sources, which include Google Earth Engine (GEE), property values and transaction prices from City Assessors Office and Tennessee Department of Transportation (TDOT), education, crime, and demographic data from United States Census Bureau, and school catchment area information from Shelby County School districts. Overall, our study leverages both satellite image data and traditional data used in prior poverty research to develop a context-specific predictive model of poverty. The prediction models thus obtained outperform the existing models in predicting poverty rates.

The rest of the manuscript is organized as follows. In Section 2, we provide an overview of the existing poverty literature and summarize the recent studies that use big data and data science approaches to predict poverty. In Section 3, we provide an overview of the design science methodology used in this work and the steps in each phase of research. In Section 4, we provide a comprehensive overview of various data sources used in this research and the summary statistics. In Section 5, we provide details of how we formulated and extracted features from various data sources, what aspects of poverty may be associated with each feature, the details of the techniques used, and the computational details of the infrastructure used to extract these features. In Section 6, we describe the prediction models used and the results. In Section 7, we identify the theoretical, methodological, and practical contributions of this research as well as potential areas of future work that can build on this study.

2. Literature review

2.1. Determinants and spatial spillover of poverty

By analyzing individual, regional, and country-level data collected from surveys, case studies, and secondary sources such as the Census Bureau, researchers have identified several behavioral, structural, and political causes of poverty. Behavioral theories posit that poverty is high in areas where people engage in counter-productive behaviors such as conceiving children out of wedlock [6,7], not saving enough [8], low education [9], borrowing high-cost debt from payday lenders [10], and not having insurance [8]. Structural factors such as labor market opportunities, economic change, and residential segregation are associated with macro and meso-level demographic and economic contexts [10–12]. Structural theories focus on the interaction between contexts like demographics, economy, etc. and individual behavior when explaining the causes of poverty [13–15]. Political theories argue that power and institutions impact government policy, which causes poverty [11] and moderates the behavior-poverty link [16]. Nonetheless, each of these three types of theories has its own challenges; they need to be integrated coherently, with consideration to the investigation context, to formulate more explicit theories about poverty [8].

An increasing body of research has identified the importance of context-specific and place-based poverty reduction policies (e.g., [17]). Gweshengwe and Hassan [18] especially highlight the context-specific nature of poverty and the need to consider the nature and severity of poverty within local social, economic, spatial, and other contexts. Some studies have focused on counties with persistently high levels of poverty and assessed the role of place-based economic development policy in these areas [17].

One of the contextual factors that has been examined in poverty research is neighboring areas’ poverty level. Weinberg [19] is one of the first studies to find that geographical contiguity to a county with a high poverty rate impacts the focal county’s poverty rate even after demographic, labor-market, institutional, and financial factors are controlled for. In other words, location does matter in poverty; there exist “pockets of poverty” (Weinberg [19], p. 399). Recognizing the presence of spatial dynamics in poverty reduction, economists have developed spatial econometric models to predict changes in counties’ poverty rates. For instance, Goetz and Rupasingha [20] find that changes in a focal county’s poverty are impacted by the poverty of its neighboring counties. Due to the potential spatial aggregation bias inherent in county-level data, Crandall and Weber [21] focus on poverty spatial dynamics among census tracts, which are small statistical subdivisions of a county with populations ranging from 1200 to 8000.1 Their findings confirm the presence of geographic spillovers in poverty reduction.

1 https://www2.census.gov/geo/pdfs/education/CensusTracts.pdf

From a policy standpoint, fighting poverty requires the ability to predict where poverty will occur and how it may change over time. Such knowledge can become the basis for resource allocation decisions for state and local governments and policy makers. In the past decade, with the advent of big data and data analytics methods, an increasing body of research has focused on developing more accurate context-specific poverty prediction models. There has been a growing emphasis on adopting a data-driven approach when making policy decisions to effectively fight poverty.

2.2. Satellite image data and poverty

With the increasing availability of satellite imagery data, researchers in multiple disciplines such as computer science, economics, urban development, and geography have leveraged such data in poverty mapping and prediction.

The earliest known image data-driven approach in poverty research leverages both nighttime and daytime images to identify image features that can explain the variation in economic outcomes of local areas [2]. Recognizing the drawbacks of relying on nighttime light intensity alone, such as the inability to identify landscape features like vegetation, residential and commercial areas, roads, and waterbodies, Jean et al. [3] combined nighttime maps with high-resolution daytime satellite images of five African countries and showed that a convolutional neural network can explain up to 75% of variation in local-level economic outcomes. Xie et al. [5] used a transfer learning approach to predict nighttime light intensity with daytime imagery while simultaneously learning landscape features. Blumenstock [2] used a similar approach of combining daytime features with nighttime light intensity to predict poverty rates. Another line of work in this area focuses on generating poverty maps of a region using features identified from daytime imaging [20,21].

Most of these studies adopt Convolutional Neural Networks (CNNs), a deep learning technique that has the capability to algorithmically identify and classify objects in input images, i.e., to find those objects that are more associated with poverty levels [3,5]. However, to the best of our knowledge, little research has used daytime imaging to identify changes in landscape features across different time periods with respect to infrastructure and urban development to predict poverty rates at the census tract level. More importantly, we posit in our study that combining CNNs with Recurrent Neural Networks (RNNs), a deep
learning technique designed to identify and predict patterns in sequential data, has the potential to help us identify temporal changes to infrastructure-related landscape features, which can serve as proxies of local infrastructure development progress over time.

Furthermore, despite the rich body of poverty research, most studies tend to rely on a single data source [22]. In this study, we propose to combine the traditional sources of data on poverty, such as the Census Bureau statistics on demographic and household characteristics, with the more recent big-data related sources, such as the Google Satellite Images, to comprehensively measure and select the features associated with poverty. Combining such data has the potential to allow us to build on existing poverty literature and develop a more comprehensive context-specific and data-driven machine learning model to better predict changes in poverty in each area.

2.3. Research context

We use Shelby County in the state of Tennessee, in which Memphis is the major city, as our research context. According to the 2020 American Community Survey (ACS), the overall poverty rate for the City of Memphis is 21.7% (ranked second among metropolitan areas with more than one million population) compared with 16.8% in Shelby County, 13.9% in Tennessee, and 12.3% in the United States [23]. Fig. 1 shows the incidence of poverty rates across various census tracts in Shelby County in 2019. Nonetheless, in the context of Shelby County and other metropolitan areas with high incidence of poverty, there is little known context-specific research on poverty prediction using disparate data sources including satellite images and traditional secondary data sources.

3. Methodology

In this research, we propose to develop a prediction model for poverty rates using context-specific data from a variety of sources. We, therefore, use the design science approach by Peffers et al. [24] for this research. The various stages are shown in Fig. 2.

The problem is to devise a comprehensive context-specific model for poverty prediction. The objective is to use a combination of satellite images and traditional archival data to formulate measures for poverty determinants using big-data analytics. In the design phase, we identify the relationship between landscape features, which are related to infrastructure and urban development, and poverty rates using a combination of deep learning methods such as Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN). We also identify spatial dynamics using adjacency data on census tracts. Finally, we combine these features with features identified in prior literature that are related to education attainment, crime, housing, family income, real estate transactions, and minority population. In the demonstration phase, we present a VAR-based ensemble model consisting of all the features identified in the design phase to predict poverty. This model combines predictions from daytime imaging with features from adjacent census tracts, education attainment, crime count, family income and economic factors to provide a partially interpretable ensemble model. In the evaluation phase, we compare our model with the previous models that use a subset of the features identified. We also identify the importance of each of these factors and the robustness of the model to changes in the data and techniques employed. Finally, in the communication phase, we present the theoretical, methodological, and practical contributions of this research. We also identify three potential areas of future work that can build on this study.

4. Data description and feature engineering

The unit of analysis in our study is the census tract. Shelby County had 231 census tracts during the period 2010–2018 (Census data are gathered once a decade with 2010 being the latest in our dataset). We gathered data about these census tracts from a variety of sources including Google Earth Engine, United States Census Bureau, Tennessee Department of Transportation, and Shelby County Assessor’s Office for the period 2010–2018. The target variable is the percentage of families below poverty in a census tract, which is reported yearly by the United States Census Bureau. The data sources for extracting the required features are shown in Fig. 3. We define big data as sources where the files have more than two million rows and contain non-numeric data such as image and spatial data.

We extract the required features in three phases, which are described in detail in the following subsections.

4.1. Phase 1 – extracting features from daytime and nighttime imaging

The first step in our data collection was to identify census tracts and their shapes. The Census Bureau publishes the shape files for these tracts every year. We identified 231 tracts in Shelby County and loaded their information into the Google Earth Engine (GEE).2

2 https://earthengine.google.com/

GEE presents a unique opportunity to download satellite imagery from multiple raster aggregate images. The challenge with these images is that they are not inherently cloud free and must be pre-processed so that clouds and cloud shadows can be removed before analysis. To do so, GEE provides cloud cover bands, which allowed us to identify clouds and cloud shadows with medium to high confidence. Pixels so identified were excluded from the aggregated data set. To ensure a full data set, images were aggregated for each month and from three different raster images, which were collected at a rate of two images per month. Nighttime images were similarly gathered from GEE.

Fig. 4 shows the process of extracting features from daytime images downloaded using Google Earth Engine. We used a ResNet50 model [25] with the output layer removed to produce output arrays after a series of convolutions.

Applying Convolutional Neural Networks (CNN) to the daytime satellite images of each census tract showed promising results. It helped us identify distinctive landscape features (Appendix A shows the landscape features identified from a sample census tract image). We then clustered all census tracts based on these landscape features such as residential and industrial neighborhoods, forest area, and barren land. We identified three clusters as the optimal number of clusters based on a metric that combines separation and compactness of the clusters (Appendix A). Tracts with similar landscape features belong to the same cluster consistently throughout our observation window. Furthermore, given that the average poverty rate of the tracts belonging to each cluster varies significantly among the three clusters (Figs. 5 and 6), we coded these clusters as 0, 1, and 2 with 0 being the lowest poverty rate cluster and 2 being the cluster with the highest poverty level. Census tracts in the low poverty cluster tend to have well-developed residential and commercial areas and little barren land, although some of them tend to have substantial forest area. Census tracts in the medium poverty cluster tend to have a mix of well-developed residential neighborhoods with poorly constructed residential and commercial areas. Finally, census tracts in the high poverty cluster tend to have a lot of barren land or forest cover with poorly constructed residential and commercial areas. We included the poverty cluster as a feature in our poverty prediction model.

In addition, we adopted the Long Short-Term Memory model (LSTM) [26–28], a type of Recurrent Neural Network (RNN), to detect and measure the temporal changes at the pixel level over a three-year span to the landscape features of a tract that were identified from the CNN. Widely used in applications such as predictive search and text generation, RNNs are often used to predict the next value in a sequence based on dynamic evolving changes to the input data. In our processing of the daytime images, the CNN segments different landscape features of the
census tract image while the LSTM detects temporal changes to these features over a consecutive three-year period, which can be considered as indicators of infrastructure and urban development. We then used these temporal changes to predict the poverty rate of each tract. More specifically, we divided the image of each census tract into chunks, trained the LSTM on the sequential data for these chunks, and aggregated the poverty predictions from these chunks by calculating their average and skewness. Fig. 7 shows the results of training. The neural network reaches a minimal validation error rate in less than three hundred epochs. The poverty predictions from the CNN-LSTM neural
Fig. 6. Poverty Clusters Across Years.

3 https://www.assessormelvinburgess.com/propertySearch
4 https://www.tn.gov/tdot.html
5 https://osf.io/zyaqn/
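The chunk-level aggregation described earlier (dividing each tract image into chunks, predicting poverty per chunk, and summarizing the predictions by their average and skewness) can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's implementation: the function names are hypothetical, `mean_pixel_predictor` is only a stand-in for the trained CNN-LSTM, non-overlapping square chunks are assumed, and skewness is taken to be the moment-based (Fisher-Pearson) estimator because the text does not say which estimator was used.

```python
from statistics import fmean


def split_into_chunks(image, chunk_size):
    """Split a 2-D grid (list of rows) into non-overlapping square chunks.

    Assumption: ragged edges that do not fill a whole chunk are dropped.
    """
    rows, cols = len(image), len(image[0])
    chunks = []
    for r in range(0, rows - chunk_size + 1, chunk_size):
        for c in range(0, cols - chunk_size + 1, chunk_size):
            chunks.append([row[c:c + chunk_size] for row in image[r:r + chunk_size]])
    return chunks


def skewness(values):
    """Moment-based (Fisher-Pearson) skewness; 0.0 when all values are equal."""
    mean = fmean(values)
    m2 = fmean((v - mean) ** 2 for v in values)
    m3 = fmean((v - mean) ** 3 for v in values)
    return 0.0 if m2 == 0 else m3 / m2 ** 1.5


def mean_pixel_predictor(chunk):
    """Hypothetical stand-in for the trained model: the mean pixel value."""
    return fmean(pixel for row in chunk for pixel in row)


def aggregate_tract(chunk_predictions):
    """Summarize per-chunk predictions into the two tract-level features."""
    return {
        "average": fmean(chunk_predictions),
        "skewness": skewness(chunk_predictions),
    }


# Toy 4x4 "image": only the bottom-right chunk has non-zero values.
toy_image = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 8, 12],
    [0, 0, 8, 12],
]
toy_predictions = [mean_pixel_predictor(c) for c in split_into_chunks(toy_image, 2)]
toy_summary = aggregate_tract(toy_predictions)
```

In the toy example one chunk differs sharply from the other three, so the skewness is positive. The average alone would not reveal that the signal is concentrated in one part of the tract.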
Every one unit increase in the number of homes sold is associated with a 0.96% reduction in poverty rate. Crime Count has a significant positive coefficient, with every crime associated with a 0.225% increase in poverty. In addition, the number of high school graduates (i.e., education attainment) does not seem to have significant predictive power (not just in this model but across all four models). We attribute this to the fact that graduates may move to different regions instead of staying in the same census tract. This model has the lowest level of performance among all four models, underscoring the importance of identifying additional features to improve the predictive performance.

After adding features extracted from nighttime image data to the baseline model (i.e., model 1), we find that both the average and the skewness of nighttime intensity are significant. Every unit increase in nighttime intensity is associated with a 0.03% decrease in poverty rate while a similar increase in skewness is associated with a 0.66% increase. In addition, we see a 1–2% reduction in all the error metrics compared with the baseline model, suggesting the importance of leveraging nighttime data as a proxy for economic activity in poverty prediction.

Results associated with model 2, which incorporates features extracted from the CNN-LSTM neural network model using daytime image data, show that all three features (i.e., Poverty Cluster, Average Development Poverty Prediction, Skewness Development Poverty Prediction) are significant predictors of poverty rate. The poverty cluster is coded in increasing order of poverty level (i.e., clusters with the lowest poverty rates have a smaller value). The positive signs and the significance levels for the coefficients not only indicate the importance of identifying landscape features in poverty prediction but also suggest that census tracts with higher poverty rates tend to get even poorer over time. In addition, both the average and skewness of poverty prediction from daytime imaging are significant.

Overall, the model that uses a combination of all four groups of features (i.e., model 3) gives the best predictive power as measured by the four error metrics. The mean squared error (MSE) drops by more than 10% over the baseline model. In addition, we find that the skewness of poverty rates across adjacent tracts, not the average, is significant, showing that the distribution of poverty rates of adjacent tracts plays a vital role in determining the poverty level of a focal tract. Considering the direction of coefficients, if poverty rates of adjacent tracts are more unevenly distributed, the poverty rate of the focal tract is also likely to be higher. These findings add new insights to our understanding of poverty’s spatial spillover effects. Across all the models, previous period poverty prediction and average development poverty prediction from daytime imaging are significant, showing the value of using predictions from multiple models (i.e., ensembles) in the final model.
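To make the two adjacent-tract features concrete, one possible formalization is given below. It is offered as a sketch under assumptions rather than the exact specification used in the models: $N(i)$ (the set of tracts adjacent to focal tract $i$), $p_{j,t}$ (the poverty rate of tract $j$ in year $t$), $\bar{p}_{i,t}$, and $s_{i,t}$ are notation introduced here for illustration, and the moment-based skewness estimator is assumed because the text does not name one.

```latex
\bar{p}_{i,t} = \frac{1}{|N(i)|} \sum_{j \in N(i)} p_{j,t},
\qquad
s_{i,t} =
\frac{\dfrac{1}{|N(i)|} \sum_{j \in N(i)} \left(p_{j,t} - \bar{p}_{i,t}\right)^{3}}
     {\left[\dfrac{1}{|N(i)|} \sum_{j \in N(i)} \left(p_{j,t} - \bar{p}_{i,t}\right)^{2}\right]^{3/2}}
```

Under this definition, two focal tracts can share the same neighborhood average $\bar{p}_{i,t}$ while differing in $s_{i,t}$: a positive value means most neighbors have relatively low poverty and a few have much higher poverty. This is the kind of distributional information that the average cannot carry.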
Table 1
Descriptive Statistics.*

Variable                                                       Average     Median     Minimum  Maximum     VIF
Dependent Variable
Families Below Poverty (Percentage)                            16.6        12         0        66.3        2.51
Independent Variables
Poverty Cluster (Coded 0,1,2 with 0 being the lowest poverty)  1.84        2.0        0.44     2.0         2.03
Daytime Imaging Poverty Prediction                             11.77       6.9        0        59.02       4.04
Nighttime Intensity                                            22.0        26.08      0.52     75.81       2.52
Adjacent Tract Poverty (Percentage)                            12.06       7.14       0        61.4        2.04
High School Graduates                                          1884        285        0        38,130      3.09
Real Estate Transaction Count (Houses Sold)                    34.21       19         1        380         1.6
Price Per Square Foot (USD)                                    64.9        55.22      2.19     176.2       2.11
Transaction Price (USD)                                        138,450.60  90,633.33  2400     594,334.72  1.98
Crime Count                                                    1176        1275.29    0        6345        2.6
Minority Percentage                                            62.7        68.27      4.53     100         4.58
Median Family Income                                           62,694.75   55,402     0        189,444     2.58
Owner Occupied Units                                           1139.14     1049       29       3688        2.14

* The data shown are averages per census tract.
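Table 1 reports a variance inflation factor (VIF) for each variable. Assuming the standard definition, VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from an ordinary least squares regression of variable j on the remaining variables plus an intercept, the calculation can be sketched in plain Python as below. The function names and the toy columns are illustrative only and are not taken from the study's code or data.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def r_squared(y, predictors):
    """R^2 of an OLS regression of y on the predictor columns plus an intercept."""
    n = len(y)
    design = [[1.0] + [col[i] for col in predictors] for i in range(n)]
    k = len(design[0])
    xtx = [[sum(row[p] * row[q] for row in design) for q in range(k)] for p in range(k)]
    xty = [sum(design[i][p] * y[i] for i in range(n)) for p in range(k)]
    beta = solve_linear_system(xtx, xty)  # normal equations
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in design]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def variance_inflation_factors(columns):
    """columns: dict mapping variable name to a list of values; returns name -> VIF."""
    vifs = {}
    for name, values in columns.items():
        others = [v for other, v in columns.items() if other != name]
        vifs[name] = 1.0 / (1.0 - r_squared(values, others))
    return vifs


# Toy data (hypothetical): two predictors with correlation 0.8.
toy_columns = {"x1": [1, 2, 3, 4, 5], "x2": [2, 1, 4, 3, 5]}
toy_vifs = variance_inflation_factors(toy_columns)
```

With only two predictors the VIF reduces to 1 / (1 - r^2), where r is their correlation, which makes the toy case easy to check by hand. A common rule of thumb treats values well above the range shown in Table 1 (all below 5) as a sign of problematic collinearity, although thresholds vary by field.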
Table 2
Results.

Baseline Model: Education Attainment + Real Estate
Model 1: Baseline + Nighttime Intensity
Model 2: Baseline + Nighttime + Daytime
Model 3: Baseline + Nighttime + Daytime + Adjacency
6. Robustness checks

We performed three robustness checks on our predictive model. First, using Forecast Error Variance Decomposition (FEVD) [38], we checked whether any single variable (other than the lagged dependent variable) has an outsized influence on the predictive power of the model. Results showed that no variable accounted for more than 5% of the error variance (Table B.2 has the details of the contribution of each variable towards the forecast error variance).

Second, we checked the robustness of significant predictors in each model. We used permutation feature importance [39], which measures the contribution of a feature by randomly interchanging all the values of a variable across rows, to assess the robustness of each significant variable in each of the models. We found that permuting any of the significant variables across models increases the error by at least 3%.

Third, we used the heteroskedasticity-and-autocorrelation (HAC) consistent estimators of the variance-covariance matrix to overcome the issue of serially correlated error terms across different periods. As a result, the direction and significance of the coefficients were consistent across all the three lagged periods.

7. Contributions and future research

In this research, we leveraged both satellite image data and traditional data used in prior poverty research to develop a context-specific predictive model of poverty in a county in the United States. Data from these disparate sources allowed us to estimate features related to local landscape features and infrastructure development as well as spatial spillover factors, which may help us predict the changes in poverty in this area. In addition, we proposed and demonstrated the application of a VAR-based ensemble model that combines poverty prediction from daytime imaging with other types of features. The prediction models thus obtained outperform the existing models in predicting poverty rates. In doing so, we make major theoretical, methodological, and practical contributions.

7.1. Theoretical contributions

Theoretical work in this area has so far focused on behavioral, economic, structural, and political determinants of poverty rates. The past decade has seen more emphasis on using big data sources to fight poverty. However, most studies tend to rely on a single source of data. Our work combines the more traditional sources with publicly available big data sources to identify key features in poverty prediction. We propose a context-specific model consisting of structural and economic factors for poverty prediction using design science. We also quantify the impact of spatial spillover effects on poverty in a focal area. Overall, our study highlights the potential of developing overarching theoretical
frameworks on the dynamics of poverty, which can be tested by leveraging big data and innovative analytical methods.

7.2. Methodological contributions

We make four major methodological contributions in this study.

First, we propose a method for combining a Convolutional Neural Network (CNN) for identification and classification of landscape features with a Recurrent Neural Network (RNN) for sequential development to quantify infrastructure and urban development at the census tract level in relation to poverty rates. Using the CNN, we were able to find different landscape features such as residential and commercial zones, forests, and barren land. Using the RNN, we were able to track the development in a census tract, in relation to its poverty rate, and use it to predict the poverty rates. This method can be replicated and used in other contexts to quantify infrastructure and urban development and devise proper measures.

Second, we propose and implement a VAR-based ensemble model that combines predictions from daytime imaging with features related to adjacent tract poverty rates and demographic features. The literature on time series based ensemble models is sparse [40–42]. However, the advent of big data has created the need for such models. Recently, Velichety and Shrivastava [43] proposed and implemented a two-stage Vector Autoregression model to quantify the impacts of fake news on social media platforms. More recently, there has been a surge in the usage of neural network-based models for time series prediction. However, these models can be complex to interpret because they involve multiple hidden layers and millions (or even billions) of parameters. On the other hand, econometric time series methods can only handle numeric data (a comprehensive comparison of the pros and cons of these methods is provided in Appendix C). In this research we identify a middle ground between the two by compromising only slightly on the interpretability. Our work in this research is a step forward in building partially interpretable models for time series data. In doing so, we also contribute to the literature on proposing novel methods using big data and design science [44].

Third, we use spatial measures to quantify the spatial spillover impact in poverty. We found that the skewness of neighborhood tracts’ poverty levels is significant in predicting the poverty rate of the focal tract. The method of constructing a spatial network based on neighborhood data and deriving metrics can be replicated by future researchers working in this area.

Fourth, we propose a customized algorithm using shapely polygons

and Census Bureau. Imaging data can be collected from Google Earth Engine. The only change in data sources for another city is the high school graduation data, which is usually publicly available for most metropolitan areas. The data extraction and model building processes can be replicated using Fig. 2. The computational requirements for extracting the features include a high-performance computing cluster with eight cores and four CPUs per task that have a memory of 8 GB per CPU.

Recent years have seen an explosion of data available to researchers and policy makers [44]. Research in the information systems community has focused on providing big data analytics and data science frameworks in numerous areas including online communities [30,31,45], cyber security [46,47], text mining [48], sales prediction [49], recommender systems [50,51], mobile design [52], targeted advertising [53] and crowdfunding [54]. Our study is a step forward in this direction that focuses on the public policy and social works domains.

7.4. Future work

We identify the following potential areas of future work.

First, our research uses a dataset of eight years for a single county. Given that the metrics we used and the neural networks we trained can be applied to other regions beyond our research context, a study consisting of similar data from multiple geographical regions will not only allow us to develop a deeper theoretical understanding of the contextual factors influencing poverty but also help customize our processes and metrics to improve their generalizability.

Second, when using the spatial network measures to predict the poverty rate of a focal census tract, we treat all neighboring census tracts equally. However, geographic proximity may not be the only factor impacting the connections between tracts, which may also be influenced by neighborhood characteristics. Prior research has found that communities’ network is linked by their socioeconomic needs, despite the geographical distance between them [55]. To take such complexity into consideration, future research may build dynamic graph neural networks to capture temporal relationships between poverty levels of census tracts and identify poverty co-movement among tracts, which may or may not be adjacent to each other. The embeddings learnt from such networks can provide insights about the connections between census tracts beyond their geographical distance, which can further help improve predictive accuracy. Such a study can also contribute to the wealth of theoretical and empirical literature on using network analyses to solve societal problems [56–58] and on applications of graph neural
to determine the common area between a census tract and a school
networks [45,59–63].
catchment area and assign a school district to a census tract. This al
gorithm can be replicated efficiently to find common intersecting areas
CRediT author contribution statement
of any two geographies.
B. Hoogstra et al. Decision Support Systems 177 (2024) 114080
Appendix A. Appendix
Fig. A1 shows the landscape of a census tract extracted from GEE while Figs. A2 and A3 show the results of clustering on this image after applying
the ResNet50 model [25]. We can see that the neural network is clearly able to identify and segregate different sections of the landscape including
residential neighborhoods, industrial warehouses, barren land, and dense forests.
To determine the optimal value for number of clusters, for each value of k (ranging from 2 to 10), we first evaluated the quality of the clustering
results using evaluation functions proposed in [64] that measure the cluster compactness (Cmp), cluster separation (Sep), and combined measure of
overall cluster quality (Ocq). The definitions of these functions are given below.
$$
\mathrm{Cmp} = \frac{1}{c} \sum_{i=1}^{c} \frac{v(C_i)}{v(X)} \tag{A.1}
$$
where $c$ is the number of clusters generated from dataset $X$, $v(C_i)$ is the standard deviation of the cluster $C_i$, and $v(X)$ is the standard deviation of the dataset $X$, given by
$$
v(X) = \sqrt{\frac{1}{N} \sum_{i=1}^{N} d^2(x_i, \mu)}
$$
where $d(\cdot,\cdot)$ is the Euclidean distance between two vectors, $N$ is the number of members in $X$, and $\mu$ is the mean of $X$.
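$$
\mathrm{Sep} = \frac{1}{c(c-1)} \sum_{i=1}^{c} \sum_{\substack{j=1 \\ j \neq i}}^{c} \exp\left( -\frac{d^2(x_{c_i}, x_{c_j})}{2\sigma^2} \right) \tag{A.2}
$$
where $c$ is the number of clusters, $\sigma$ is the standard deviation of the dataset $X$, $x_{c_i}$ is the centroid of cluster $c_i$, and $d(x_{c_i}, x_{c_j})$ is the distance between the centroids of clusters $c_i$ and $c_j$.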
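The two measures in (A.1) and (A.2) can be computed directly from a clustering. The following is a minimal sketch, not the authors' implementation: the function names (`sq_dist`, `centroid`, `deviation`, `compactness`, `separation`) are hypothetical, the points are made-up toy data, and σ is assumed to equal v(X) as defined above.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance d^2 between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    """Coordinate-wise mean of a list of vectors."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def deviation(points):
    """v(.) from the text: root of the mean squared distance to the mean."""
    mu = centroid(points)
    return math.sqrt(sum(sq_dist(p, mu) for p in points) / len(points))


def compactness(clusters):
    """Cmp (A.1): average of v(C_i) / v(X) over the c clusters."""
    everything = [p for cluster in clusters for p in cluster]
    v_x = deviation(everything)
    return sum(deviation(cluster) / v_x for cluster in clusters) / len(clusters)


def separation(clusters):
    """Sep (A.2): mean Gaussian similarity between distinct cluster centroids."""
    everything = [p for cluster in clusters for p in cluster]
    sigma_sq = deviation(everything) ** 2  # assumption: sigma equals v(X)
    cents = [centroid(cluster) for cluster in clusters]
    c = len(cents)
    total = sum(
        math.exp(-sq_dist(cents[i], cents[j]) / (2 * sigma_sq))
        for i in range(c)
        for j in range(c)
        if i != j
    )
    return total / (c * (c - 1))


# Toy data: two tight, well-separated clusters versus the same points mixed up.
tight = [[(0, 0), (0, 1), (1, 0), (1, 1)],
         [(10, 10), (10, 11), (11, 10), (11, 11)]]
mixed = [[(0, 0), (10, 10), (1, 1), (11, 11)],
         [(0, 1), (10, 11), (1, 0), (11, 10)]]
```

Under these definitions both measures are smaller for `tight` than for `mixed`: tight clusters have a small v(C_i) relative to v(X), and distant centroids drive the exponential term toward zero.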
Appendix B. Appendix
Table B.1
Model Fit Statistics.
Table B.2
Forecast Error Variance Decomposition.
We propose a Vector Autoregression (VAR) based Ensemble model in this research. It combines poverty prediction from daytime imaging with
features related to poverty of adjacent census tracts and education attainment.
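$$
\begin{pmatrix}
\text{families below poverty}_{it} \\
\text{Average Development Prediction}_{it} \\
\text{Adjacent Tract Poverty Rates}_{it} \\
\text{Education Attainment}_{it} \\
\text{Real Estate and Economic Features}_{it} \\
\text{Crime Count}_{it}
\end{pmatrix}
=
\begin{bmatrix}
\alpha_1 + \beta_1 t \\
\alpha_2 + \beta_2 t \\
\alpha_3 + \beta_3 t \\
\alpha_4 + \beta_4 t \\
\alpha_5 + \beta_5 t \\
\alpha_6 + \beta_6 t
\end{bmatrix}
+
\sum_{k=t-1}^{t-3}
\begin{bmatrix}
q^k_{1,1} & q^k_{1,2} & q^k_{1,3} & q^k_{1,4} & q^k_{1,5} & q^k_{1,6} \\
q^k_{2,1} & q^k_{2,2} & q^k_{2,3} & q^k_{2,4} & q^k_{2,5} & q^k_{2,6} \\
q^k_{3,1} & q^k_{3,2} & q^k_{3,3} & q^k_{3,4} & q^k_{3,5} & q^k_{3,6} \\
q^k_{4,1} & q^k_{4,2} & q^k_{4,3} & q^k_{4,4} & q^k_{4,5} & q^k_{4,6} \\
q^k_{5,1} & q^k_{5,2} & q^k_{5,3} & q^k_{5,4} & q^k_{5,5} & q^k_{5,6} \\
q^k_{6,1} & q^k_{6,2} & q^k_{6,3} & q^k_{6,4} & q^k_{6,5} & q^k_{6,6}
\end{bmatrix}
\begin{pmatrix}
\text{families below poverty}_{ik} \\
\text{Average Development Prediction}_{ik} \\
\text{Adjacent Tract Poverty Rates}_{ik} \\
\text{Education Attainment}_{ik} \\
\text{Real Estate and Economic Features}_{ik} \\
\text{Crime Count}_{ik}
\end{pmatrix}
+
\begin{bmatrix}
e_{1t} \\ e_{2t} \\ e_{3t} \\ e_{4t} \\ e_{5t} \\ e_{6t}
\end{bmatrix}
\tag{B.1}
$$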
where $\alpha_i$ ($i = 1, \dots, 6$) are constants, $\beta_i$ and $q^k_{i,j}$ ($i, j = 1, \dots, 6$) are coefficients, $K$ is the lag length, and $e_i$ ($i = 1, \dots, 6$) is the white-noise residual. The ones in bold are predictions, i.e., ensemble features, while the ones in regular text are raw features.
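A system of the form (B.1) can be estimated equation by equation with ordinary least squares. The following is a minimal sketch under stated assumptions, not the authors' implementation: it regresses each variable on a constant, a linear trend, and three lags of all variables. The function names (`solve`, `fit_var`, `forecast_next`) are hypothetical, and the two-variable series is simulated purely for illustration. In practice a statistics library would replace the hand-written solver.

```python
import random


def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting.

    `a` is a square matrix and `b` a matrix of right-hand sides (lists of rows).
    """
    n = len(a)
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0.0:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def regressors(series, t, lags):
    """Row [1, t, y_{t-1}, ..., y_{t-lags}] with each lag vector flattened."""
    z = [1.0, float(t)]
    for k in range(1, lags + 1):
        z.extend(series[t - k])
    return z


def fit_var(series, lags=3):
    """OLS fit of y_t = alpha + beta * t + sum_k Q_k y_{t-k} + e_t.

    Returns one row per regressor; column j holds the coefficients of equation j.
    """
    rows = [regressors(series, t, lags) for t in range(lags, len(series))]
    targets = series[lags:]
    m, dim = len(rows[0]), len(series[0])
    ztz = [[sum(r[i] * r[j] for r in rows) for j in range(m)] for i in range(m)]
    zty = [[sum(r[i] * y[j] for r, y in zip(rows, targets)) for j in range(dim)]
           for i in range(m)]
    return solve(ztz, zty)


def forecast_next(series, coef, lags=3):
    """One-step-ahead forecast for the period after the last observation."""
    z = regressors(series, len(series), lags)
    dim = len(series[0])
    return [sum(z[i] * coef[i][j] for i in range(len(z))) for j in range(dim)]


# Simulated data for illustration only: a stable two-variable VAR(1).
rng = random.Random(0)
series = [[0.0, 0.0]]
for _ in range(1999):
    a, b = series[-1]
    series.append([1.0 + 0.5 * a + 0.1 * b + rng.gauss(0, 0.1),
                   0.5 + 0.2 * a + 0.3 * b + rng.gauss(0, 0.1)])

coef = fit_var(series, lags=3)
next_values = forecast_next(series, coef, lags=3)
```

In this layout `coef[2][0]` is the estimated effect of the first variable's first lag on itself, which should land near the simulated value of 0.5. An ensemble in the spirit of the text would place a model-generated prediction series alongside the raw series as one of the variables.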
To determine the lag order (K), we use Final Prediction Error (FPE). We found that the optimal lag was three. We test for various assumptions of
VAR including multivariate normality (MVN), co-integration of time series, autocorrelation, autoregressive conditional heteroscedasticity (ARCH)
Table C.1 shows a theoretical comparison of Neural Networks and VAR based models for time series data on four unique dimensions identified
using the literature in both these areas [37,65–68].
Table C.1
Comparison of Neural Network and VAR Based Models.
To compensate for the moderate predictive performance of VAR models, we propose using ensembles. Doing so can moderately decrease the
interpretability while improving the predictive performance to a large extent. Table C.2 compares the computational cost and predictive performance
of a Neural Network model with a combined Neural Network and VAR model. In our case, it is not possible to construct a pure VAR model because
some daytime features were derived using Neural Networks. We use an extreme learning machine (ELM) neural network for our time series prediction.
Computational Cost is calculated based on the time it takes to train the model. Table C.2 shows that combining neural networks and VAR results in a
substantial performance improvement across all the models while decreasing the computational cost.
In addition, Model 5 (using both poverty cluster based on spatial features from CNN and poverty prediction based on temporal features from LSTM)
outperforms Model 3 (using only poverty cluster based on spatial features from CNN) and Model 4 (using only poverty prediction based on temporal
features from LSTM) in model accuracy, demonstrating the strength of combining Convolutional Neural Network (CNN) for identification and
classification of landscape features with Recurrent Neural Network (RNN) for quantifying temporal changes to infrastructure and urban development
over time.
Table C.2
Computational Cost and Predictive Performance of Neural Network and VAR Models.
Neural Network + VAR    0.130    22.13    4.71    2.640    0.168
References
[1] K. Schaeffer, 6 facts about economic inequality in the U.S, Pew Res. Cent, 2020. https://www.pewresearch.org/fact-tank/2020/02/07/6-facts-about-economic-inequality-in-the-u-s/ (accessed August 18, 2022).
[2] J.E. Blumenstock, Fighting poverty with data, Science. 353 (2016) 753–754.
[3] N. Jean, M. Burke, M. Xie, W.M. Davis, D.B. Lobell, S. Ermon, Combining satellite imagery and machine learning to predict poverty, Science. 353 (2016) 790–794.
[4] P.O. Okwi, G. Ndeng’e, P. Kristjanson, M. Arunga, A. Notenbaert, A. Omolo, N. Henninger, T. Benson, P. Kariuki, J. Owuor, Spatial determinants of poverty in rural Kenya, Proc. Natl. Acad. Sci. 104 (2007) 16769–16774.
[5] M. Xie, N. Jean, M. Burke, D. Lobell, S. Ermon, Transfer learning from deep features for remote sensing and poverty mapping, in: Thirtieth AAAI Conf. Artif. Intell, 2016.
[6] M. Bertrand, S. Mullainathan, E. Shafir, A behavioral-economics view of poverty, Am. Econ. Rev. 94 (2004) 419–423.
[7] S.N. Durlauf, Groups, social influences, and inequality, Poverty Traps (2006) 141–175.
[8] A.V. Banerjee, A. Banerjee, E. Duflo, Poor economics: a radical rethinking of the way to fight global poverty, Public Affairs (2011).
[9] L. Rainwater, T.M. Smeeding, Poor Kids in a Rich Country: America’s Children in Comparative Perspective, Russell Sage Foundation, 2003.
[10] M. Bertrand, A. Morse, Information disclosure, cognitive biases, and payday borrowing, J. Financ. 66 (2011) 1865–1893.
[11] D. Brady, Theories of the causes of poverty, Annu. Rev. Sociol. 45 (2019) 155–175.
[12] M.R. Rank, H.-S. Yoon, T.A. Hirschl, American poverty as a structural failing: evidence and arguments, J. Sociol. Soc. Welf. 30 (2003) 3.
[13] D. Brady, L.M. Burton, The Oxford Handbook of the Social Science of Poverty, Oxford University Press, 2016.
[14] D. Brady, R.M. Finnigan, S. Hübgen, Rethinking the risks of poverty: a framework for analyzing prevalences and penalties, Am. J. Sociol. 123 (2017) 740–786.
[15] L. Tach, A.D. Emory, Public housing redevelopment, neighborhood change, and the restructuring of urban inequality, Am. J. Sociol. 123 (2017) 686–739.
[16] D. Brady, A. Blome, H. Kleider, How Politics and Institutions Shape Poverty and Inequality, Oxford University Press, Oxford, UK, 2016.
[17] M.D. Partridge, D.S. Rickman, Persistent pockets of extreme American poverty and job growth: is there a place-based policy role? J. Agric. Resour. Econ. 201–224 (2007).
[18] B. Gweshengwe, N.H. Hassan, Defining the characteristics of poverty and their implications for poverty analysis, Cogent Soc. Sci. 6 (2020) 1768669.
[19] D.H. Weinberg, Poverty spending and the poverty gap, J. Policy Anal. Manag. 6 (1987) 230–241.
[20] S.J. Goetz, A. Rupasingha, The returns on higher education: estimates for the 48 contiguous states, Econ. Dev. Q. 17 (2003) 337–351.
[21] M.S. Crandall, B.A. Weber, Local social and economic conditions, spatial concentrations of poverty, and poverty dynamics, Am. J. Agric. Econ. 86 (2004) 1276–1281.
[22] N. Pokhriyal, D.C. Jacques, Combining disparate data sources for improved poverty prediction and mapping, Proc. Natl. Acad. Sci. 114 (2017) E9783–E9792.
[23] M.E. Delavega, Memphis poverty fact sheet, 2021, p. 20.
[24] K. Peffers, T. Tuunanen, M.A. Rothenberger, S. Chatterjee, A design science research methodology for information systems research, J. Manag. Inf. Syst. 24 (2007) 45–77.
[25] K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image recognition, in: Proc. IEEE Conf. Comput. Vis. Pattern Recognit, 2016, pp. 770–778.
[26] T.-Y. Kim, S.-B. Cho, Predicting residential energy consumption using CNN-LSTM neural networks, Energy. 182 (2019) 72–81.
[27] W. Boulila, H. Ghandorh, M.A. Khan, F. Ahmed, J. Ahmad, A novel CNN-LSTM-based approach to predict urban expansion, Ecol. Inform. 64 (2021), 101325.
[28] A. Vidal, W. Kristjanpoller, Gold volatility prediction using a CNN-LSTM approach, Expert Syst. Appl. 157 (2020), 113481.
[29] J. Bockstedt, C. Druehl, A. Mishra, Heterogeneous submission behavior and its implications for success in innovation contests with public submissions, Prod. Oper. Manag. 25 (2016) 1157–1176.
[30] S. Velichety, Quality assessment of peer-produced content in knowledge repositories using big data and social networks: The case of implicit collaboration in Wikipedia, in: ACM SIGMIS Database DATABASE Adv. Inf. Syst 50, 2019, pp. 28–51.
[31] S. Velichety, S. Ram, J. Bockstedt, Quality assessment of peer-produced content in knowledge repositories using development and coordination activities, J. Manag. Inf. Syst. 36 (2019) 478–512.
[32] J. Blumenstock, G. Cadamuro, R. On, Predicting poverty and wealth from mobile phone metadata, Science. 350 (2015) 1073–1076.
[33] C. Njuguna, P. McSharry, Constructing spatiotemporal poverty indices from big data, J. Bus. Res. 70 (2017) 318–327.
[34] N. Eagle, M. Macy, R. Claxton, Network diversity and economic development, Science. 328 (2010) 1029–1031.
[35] A. Abu, R. Hamdan, N.S. Sani, Ensemble learning for multidimensional poverty classification, Sains Malays. 49 (2020) 447–459.
[36] S. Gillies, The Shapely User Manual, URL https://pypi.org/project/Shapely, 2013.
[37] G. Adomavicius, J. Bockstedt, A. Gupta, Modeling supply-side dynamics of IT components, products, and infrastructure: an empirical analysis using vector autoregression, Inf. Syst. Res. 23 (2012) 397–417.
[38] H. Lütkepohl, New Introduction to Multiple Time Series Analysis, Springer Science & Business Media, 2005.
[39] A. Fisher, C. Rudin, F. Dominici, Model Class Reliance: Variable Importance Measures for any Machine Learning Model Class, from the “Rashomon” Perspective, ArXiv E-Prints, 2018.
[40] J.D. Wichard, M. Ogorzalek, Time series prediction with ensemble models, in: 2004 IEEE Int. Jt. Conf. Neural Netw. IEEE Cat No 04CH37541, IEEE, 2004, pp. 1625–1630.
[41] A. Galicia, R. Talavera-Llames, A. Troncoso, I. Koprinska, F. Martínez-Álvarez, Multi-step forecasting for big data time series based on ensemble learning, Knowl.-Based Syst. 163 (2019) 830–841.
[42] M. van Heeswijk, Y. Miche, T. Lindh-Knuutila, P.A. Hilbers, T. Honkela, E. Oja, A. Lendasse, Adaptive ensemble models of extreme learning machines for time series prediction, in: Int. Conf. Artif. Neural Netw, Springer, 2009, pp. 305–314.
[43] S. Velichety, U. Shrivastava, Quantifying the impacts of online fake news on the equity value of social media platforms–evidence from Twitter, Int. J. Inf. Manag. 64 (2022), 102474.
[44] A. Abbasi, S. Sarker, R.H. Chiang, Big data research in information systems: toward an inclusive research agenda, J. Assoc. Inf. Syst. 17 (2016).
[45] S. Velichety, S. Ram, Finding a needle in the haystack: recommending online communities on social media platforms using network and design science, J. Assoc. Inf. Syst. 22 (2021) 1285–1310.
[46] A. Abbasi, Z. Zhang, D. Zimbra, H. Chen, J.F. Nunamaker Jr., Detecting fake websites: the contribution of statistical learning theory, MIS Q. (2010) 435–461.
[47] B. Biswas, A. Mukhopadhyay, S. Bhattacharjee, A. Kumar, D. Delen, A text-mining based cyber-risk assessment and mitigation framework for critical analysis of online hacker forums, Decis. Support. Syst. 152 (2022), 113651.
[48] H. Dutta, A. Gupta, PNRank: unsupervised ranking of person name entities from noisy OCR text, Decis. Support. Syst. 152 (2022), 113662.
[49] T. Geva, G. Oestreicher-Singer, N. Efron, Y. Shimshoni, Using Forum and Search Data for Sales Prediction of High-Involvement Products. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2294609, 2015 (accessed September 20, 2017).
[50] S. Oruç, P.E. Eren, A. Koçyiğit, A constraint programming model for making recommendations in personal process management: a design science research approach, Decis. Support. Syst. 152 (2022), 113665.
[51] R. Duan, C. Jiang, H.K. Jain, Combining review-based collaborative filtering and matrix factorization: a solution to rating’s sparsity problem, Decis. Support. Syst. 156 (2022), 113748.
[52] F. Provost, D. Martens, A. Murray, Finding similar mobile consumers with a privacy-friendly geosocial design, Inf. Syst. Res. 26 (2015) 243–265.
[53] B. Kitchens, D. Dobolyi, J. Li, A. Abbasi, Advanced customer analytics: strategic value through integration of relationship-oriented big data, J. Manag. Inf. Syst. 35 (2018) 540–574.
[54] Q. Du, J. Li, Y. Du, G.A. Wang, W. Fan, Predicting crowdfunding project success based on backers’ language preferences, J. Assoc. Inf. Sci. Technol. 72 (2021) 1558–1574.
[55] C. Bevilacqua, P. Sohrabi, N. Hamdy, F. Mangiulli, Mapping connections between neighborhoods in response to community-based social needs, Sustainability. 15 (2023) 4898.
[56] K. Srinivasan, F. Currim, S. Ram, Predicting high-cost patients at point of admission using network science, IEEE J. Biomed. Health Inform. 22 (2017) 1970–1977.
[57] W. Zhang, S. Ram, A comprehensive analysis of triggers and risk factors for asthma based on machine learning and large heterogeneous data sources, MIS Q. 44 (2020).
[58] S. Ram, W. Zhang, M. Williams, Y. Pengetnze, Predicting asthma-related emergency department visits using big data, IEEE J. Biomed. Health Inform. 19 (2015) 1216–1223.
[59] W. Liao, B. Bak-Jensen, J.R. Pillai, Y. Wang, Y. Wang, A review of graph neural networks and their applications in power systems, J. Mod. Power Syst. Clean Energy 10 (2021) 345–360.
[60] A. Derrow-Pinion, J. She, D. Wong, O. Lange, T. Hester, L. Perez, M. Nunkesser, S. Lee, X. Guo, B. Wiltshire, Eta prediction with graph neural networks in google maps, in: Proc. 30th ACM Int. Conf. Inf. Knowl. Manag, 2021, pp. 3767–3776.
[61] X.-M. Zhang, L. Liang, L. Liu, M.-J. Tang, Graph neural networks and their current applications in bioinformatics, Front. Genet. 12 (2021), 690049.
[62] J. Shlomi, P. Battaglia, J.-R. Vlimant, Graph neural networks in particle physics, Mach. Learn. Sci. Technol. 2 (2020), 021001.
[63] H. Tian, X. Zheng, K. Zhao, M.W. Liu, D.D. Zeng, Inductive representation learning on dynamic stock co-movement graphs for stock predictions, INFORMS J. Comput. 34 (2022) 1940–1957.
[64] J. He, M. Lan, C.-L. Tan, S.-Y. Sung, H.-B. Low, Initialization of cluster refinement algorithms: A review and comparative study, in: 2004 IEEE Int. Jt. Conf. On Neural Netw, IEEE, 2004.
[65] M. Bańbura, D. Giannone, L. Reichlin, Large Bayesian vector auto regressions, J. Appl. Econ. 25 (2010) 71–92.
[66] N. Frosst, G. Hinton, Distilling a neural network into a soft decision tree, ArXiv Prepr. ArXiv:1711.09784, 2017.
[67] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, A. Rabinovich, Going deeper with convolutions, in: Proc. IEEE Conf. Comput. Vis. Pattern Recognit, 2015, pp. 1–9.
[68] A. Graves, A. Mohamed, G. Hinton, Speech recognition with deep recurrent neural networks, in: 2013 IEEE Int. Conf. Acoust. Speech Signal Process, IEEE, 2013, pp. 6645–6649.
Brian Hoogstra is a Data Scientist Principal at FedEx Services, Memphis. He has wide experience in database administration and enterprise content recovery. Brian has a bachelor’s degree from Davenport University and a master’s degree from the University of Memphis.
Srikar Velichety is an Assistant Professor of Business Information and Technology at the Fogelman College of Business and Economics at the University of Memphis. Velichety got his PhD in Management Information Systems from the Eller College of Management, University of Arizona in 2016. His research interests are in Social Media and Social Networks, User Generated Content, Recommender Systems and Predictive Analytics. His research has been published in journals including JMIS, JAIS, IJIM and ACM SIGMIS DATABASE.
Chen Zhang is an Associate Professor of Business Administration at the Ivy College of Business at Iowa State University. Prior to this, she was the associate dean of faculty and administration at the Fogelman College of Business and Economics at the University of Memphis. Zhang got her PhD in Management Information Systems from the Krannert School of Management at Purdue University in 2007. Her research interests are in software platforms, User Generated Content and Sharing Economy. Her research has been published in top journals including MIS Quarterly, Information Systems Research, IEEE Software and CAIS.