
Decision Support Systems 177 (2024) 114080
Developing a contextual model of poverty prediction using data science and analytics – The case of Shelby County
Brian Hoogstra a, Srikar Velichety b, *, Chen Zhang c
a FedEX Services, USA
b Fogelman College of Business and Economics, University of Memphis, Rm 309, 3675 Central Ave, Memphis, TN 38152, USA
c Ivy College of Business, Iowa State University, 3438 Gerdin, Ames, IA 52761, USA
ARTICLE INFO

Keywords: Poverty; Vector autoregression; Social network analysis; Ensemble modeling and design science

ABSTRACT

This study builds on the existing poverty literature and leverages data from disparate sources including both big data sources such as satellite images and traditional data sources such as the federal, state, and local agencies to develop a context-specific poverty prediction model using design science. We examine whether and to what extent infrastructure development as measured from the satellite images as well as spatial spillovers helps predict the poverty rate of a given census tract. We also develop and implement a Vector Autoregression (VAR) based ensemble model that combines predictions from daytime and nighttime imaging with adjacent tracts’ poverty rates and other economic and demographic factors identified in prior literature. Our results show that daytime imaging and spatial network features have significant predictive value and that a combination of these features gives the best predictive power. In addition, we find that the skewness of poverty rates among adjacent census tracts, not the average, is a significant predictor showing the importance of distribution of poverty around a region. Our work has major implications for researchers using deep learning and network analysis for policy development and decision making.
1. Introduction

Poverty is a worldwide issue intertwined with other societal issues. In the United States, the widening wealth and income gap between the rich and poor in the past four decades has created acute economic and social challenges for federal, state, and local governments [1]. To effectively combat poverty, it is important not only to understand the causes of poverty but also to predict poverty based on contextual and regional factors (i.e., local demographic, geographic and economic factors) at a granular level. Such knowledge can become the basis for resource allocation decisions for state and local governments and policy makers.

Although extant research has identified behavioral, structural, and political causes of poverty, studies in this area tend to be disjointed and have used data from surveys, experiments, and other traditional sources. Furthermore, given that approaches towards poverty reduction are often unique to their locales and demographics, there is a need for location-specific models that can pinpoint factors influencing poverty in each area. Finally, traditional data sources have issues like delay in data acquisition and inability to capture dynamic data like changing landscape and demographics. The advent of big data and data analytics methods has made it possible for researchers to overcome these issues by leveraging large-scale datasets such as daytime and nighttime satellite images in poverty research [2–5]. However, prior studies tend to rely on a single data source. There has been a lack of research integrating diverse data from disparate sources to predict poverty rates. In summary, there is a need for innovative methods that can combine traditional data and big data to leverage the strengths of each.

In this research, we combine data from a variety of traditional and big data sources to comprehensively identify context-specific features that may be associated with poverty. We also develop a method to extract more granular census-tract level features such as landscape features and infrastructure development using daytime satellite images, which have the potential to help us more accurately predict poverty. Furthermore, based on prior poverty research, our study attempts to quantify the spatial spillover effects among adjacent census tracts. Our research objective is to investigate whether and to what extent landscape features and infrastructure development estimated from satellite image data combined with spatial spillover effects helps predict the poverty level in an area.
* Corresponding author.
E-mail addresses: bhogstra@memphis.edu (B. Hoogstra), svlchety@memphis.edu (S. Velichety), czhang1@iastate.edu (C. Zhang).
https://doi.org/10.1016/j.dss.2023.114080
Received 1 February 2023; Received in revised form 21 August 2023; Accepted 4 September 2023
Available online 9 September 2023
0167-9236/© 2023 Elsevier B.V. All rights reserved.
We use Shelby County in the state of Tennessee, in which Memphis is the major city, as our research context. Given that Memphis is consistently ranked among the five cities with the highest poverty rates in the United States, we choose this county as the appropriate context for our research. Our unit of analysis is the census tract, which is a more granular level of analysis than most prior poverty studies. We quantify landscape features, infrastructure, and urban development of a census tract using a CNN-LSTM network on daytime census tract images, which not only segregates different landscape features such as residential and industrial neighborhoods, barren land, and forest areas, but also identifies temporal changes to these features. In addition, we use the intensity of nighttime images from a census tract as a proxy for economic activity. A social network of adjacent census tracts is used to quantify the spatial spillover effect on the poverty rate of a focal tract. We also propose and implement a Vector Autoregression (VAR) based ensemble model that combines poverty predictions from daytime images with features derived from disparate data sources, which include Google Earth Engine (GEE), property values and transaction prices from the City Assessor's Office and the Tennessee Department of Transportation (TDOT), education, crime, and demographic data from the United States Census Bureau, and school catchment area information from Shelby County school districts.

Overall, our study leverages both satellite image data and traditional data used in prior poverty research to develop a context-specific predictive model of poverty. The prediction models thus obtained outperform the existing models in predicting poverty rates.

The rest of the manuscript is organized as follows. In Section 2, we provide an overview of the existing poverty literature and summarize the recent studies that use big data and data science approaches to predict poverty. In Section 3, we provide an overview of the design science methodology used in this work and the steps in each phase of research. In Section 4, we provide a comprehensive overview of various data sources used in this research and the summary statistics. In Section 5, we provide details of how we formulated and extracted features from various data sources, what aspects of poverty may be associated with each feature, the details of the techniques used, and the computational details of the infrastructure used to extract these features. In Section 6, we describe the prediction models used and the results. In Section 7, we identify the theoretical, methodological, and practical contributions of this research as well as potential areas of future work that can build on this study.

2. Literature review

2.1. Determinants and spatial spillover of poverty

By analyzing individual, regional, and country-level data collected from surveys, case studies, and secondary sources such as the Census Bureau, researchers have identified several behavioral, structural, and political causes of poverty. Behavioral theories posit that poverty is high in areas where people engage in counter-productive behaviors such as conceiving children out of wedlock [6,7], not saving enough [8], low education [9], borrowing high-cost debt from payday lenders [10], and not having insurance [8]. Structural factors such as labor market opportunities, economic change, and residential segregation are associated with macro and meso-level demographic and economic contexts [10–12]. Structural theories focus on the interaction between contexts like demographics, economy, etc. and individual behavior when explaining the causes of poverty [13–15]. Political theories argue that power and institutions impact government policy, which causes poverty [11] and moderates the behavior-poverty link [16]. Nonetheless, each of these three types of theories has its own challenges; they need to be integrated coherently, with consideration to the investigation context, to formulate more explicit theories about poverty [8].

An increasing body of research has identified the importance of context-specific and place-based poverty reduction policies (e.g., [17]). Gweshengwe and Hassan [18] especially highlight the context-specific nature of poverty and the need to consider the nature and severity of poverty within local social, economic, spatial, and other contexts. Some studies have focused on counties with persistently high levels of poverty and assess the role of place-based economic development policy in these areas [17].

One of the contextual factors that has been examined in poverty research is neighboring areas’ poverty level. Weinberg [19] is one of the first studies to find that geographical contiguity to a county with a high poverty rate impacts the focal county’s poverty rate even after demographic, labor-market, institutional, and financial factors are controlled for. In other words, location does matter in poverty; there exist “pockets of poverty” (Weinberg [19], p. 399). Recognizing the presence of spatial dynamics in poverty reduction, economists have developed spatial econometric models to predict changes in counties’ poverty rates. For instance, Goetz and Rupasingha [20] find that changes in a focal county’s poverty are impacted by the poverty of its neighboring counties. Due to the potential spatial aggregation bias inherent in county-level data, Crandall and Weber [21] focus on poverty spatial dynamics among census tracts, which are small statistical subdivisions of a county with populations ranging from 1200 to 8000.¹ Their findings confirm the presence of geographic spillovers in poverty reduction.

¹ https://www2.census.gov/geo/pdfs/education/CensusTracts.pdf

From a policy standpoint, fighting poverty requires the ability to predict where poverty will occur and how it may change over time. Such knowledge can become the basis for resource allocation decisions for state and local governments and policy makers. In the past decade, with the advent of big data and data analytics methods, an increasing body of research has focused on developing more accurate context-specific poverty prediction models. There has been a growing emphasis on adopting a data-driven approach when making policy decisions to effectively fight poverty.

2.2. Satellite image data and poverty

With the increasing availability of satellite imagery data, researchers in multiple disciplines such as computer science, economics, urban development, and geography have leveraged such data in poverty mapping and prediction.

The earliest known image data-driven approach in poverty research leverages both nighttime and daytime images to identify image features that can explain the variation in economic outcomes of local areas [2]. Recognizing the drawbacks of relying on nighttime light intensity alone, such as the inability to identify landscape features such as vegetation, residential and commercial areas, roads, water bodies, etc., Jean et al. [3] combined nighttime maps with high-resolution daytime satellite images of five African countries and showed that a convolutional neural network can explain up to 75% of variation in local-level economic outcomes. Xie et al. [5] used a transfer learning approach to predict nighttime light intensity with daytime imagery while simultaneously learning landscape features. Blumenstock [2] used a similar approach of combining daytime features with nighttime light intensity to predict poverty rates. Another line of work in this area focuses on generating poverty maps of a region using features identified from daytime imaging [20,21].

Most of these studies adopt Convolutional Neural Networks (CNNs), a deep learning technique that has the capability to algorithmically identify and classify objects in input images, i.e., to find those objects that are more associated with poverty levels [3,5]. However, to the best of our knowledge, little research has used daytime imaging to identify changes in landscape features across different time periods with respect to infrastructure and urban development to predict poverty rates at the census tract level. More importantly, we posit in our study that combining CNNs with Recurrent Neural Networks (RNNs), a deep
learning technique designed to identify and predict patterns in sequential data, has the potential to help us identify temporal changes to infrastructure-related landscape features, which can serve as proxies of local infrastructure development progress over time.

Furthermore, despite the rich body of poverty research, most studies tend to rely on a single data source [22]. In this study, we propose to combine the traditional sources of data on poverty, such as the Census Bureau statistics on demographic and household characteristics, with the more recent big-data related sources, such as the Google Satellite Images, to comprehensively measure and select the features associated with poverty. Combining such data has the potential to allow us to build on existing poverty literature and develop a more comprehensive context-specific and data-driven machine learning model to better predict changes in poverty in each area.

2.3. Research context

We use Shelby County in the state of Tennessee, in which Memphis is the major city, as our research context. According to the 2020 American Community Survey (ACS), the overall poverty rate for the City of Memphis is 21.7% (ranked second among metropolitan areas with more than one million population) compared with 16.8% in Shelby County, 13.9% in Tennessee, and 12.3% in the United States [23]. Fig. 1 shows the incidence of poverty rates across various census tracts in Shelby County in 2019. Nonetheless, in the context of Shelby County and other metropolitan areas with high incidence of poverty, there is little known context-specific research on poverty prediction using disparate data sources including satellite images and traditional secondary data sources.

Fig. 1. Poverty Rates in Shelby County.

3. Methodology

In this research, we propose to develop a prediction model for poverty rates using context-specific data from a variety of sources. We, therefore, use the design science approach by Peffers et al. [24] for this research. The various stages are shown in Fig. 2.

Fig. 2. Research Framework.

The problem is to devise a comprehensive context-specific model for poverty prediction. The objective is to use a combination of satellite images and traditional archival data to formulate measures for poverty determinants using big-data analytics. In the design phase, we identify the relationship between landscape features, which are related to infrastructure and urban development, and poverty rates using a combination of deep learning methods such as Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN). We also identify spatial dynamics using adjacency data on census tracts. Finally, we combine these features with features identified in prior literature that are related to education attainment, crime, housing, family income, real estate transactions, and minority population. In the demonstration phase, we present a VAR-based ensemble model consisting of all the features identified in the design phase to predict poverty. This model combines predictions from daytime imaging with features from adjacent census tracts, education attainment, crime count, family income and economic factors to provide a partially interpretable ensemble model. In the evaluation phase, we compare our model with the previous models that use a subset of the features identified. We also identify the importance of each of these factors and the robustness of the model to changes in the data and techniques employed. Finally, in the communication phase, we present the theoretical, methodological, and practical contributions of this research. We also identify three potential areas of future work that can build on this study.

4. Data description and feature engineering

The unit of analysis in our study is the census tract. Shelby County had 231 census tracts during the period 2010–2018 (Census data are gathered once a decade with 2010 being the latest in our dataset). We gathered data about these census tracts from a variety of sources including Google Earth Engine, United States Census Bureau, Tennessee Department of Transportation, and Shelby County Assessor’s Office for the period 2010–2018. The target variable is the percentage of families below poverty in a census tract, which is reported yearly by the United States Census Bureau. The data sources for extracting the required features are shown in Fig. 3. We define big data as sources where the files have more than two million rows and contain non-numeric data such as image and spatial data.

Fig. 3. Data Sources for Feature Extraction.

We extract the required features in three phases, which are described in detail in the following subsections.

4.1. Phase 1 – extracting features from daytime and nighttime imaging

The first step in our data collection was to identify census tracts and their shapes. The Census Bureau publishes the shape files for these tracts every year. We identified 231 tracts in Shelby County and loaded their information into the Google Earth Engine (GEE).²

² https://earthengine.google.com/

GEE presents a unique opportunity to download satellite imagery from multiple raster aggregate images. The challenge with these images is that they are not inherently cloud-free and must be pre-processed so that clouds and cloud shadows can be removed before analysis. To do so, GEE provides cloud cover bands, which allowed us to identify clouds and cloud shadows with medium to high confidence. Pixels identified were excluded from the aggregated data set. To ensure a full data set, images were aggregated for each month and from three different raster images, which were collected at a rate of two images per month. Nighttime images were similarly gathered from GEE.

Fig. 4 shows the process of extracting features from daytime images downloaded using Google Earth Engine. We used a ResNet50 Model [25] with the output layer removed to produce output arrays after a series of convolutions.

Fig. 4. Process for Day Time Image Feature Extraction.

Applying Convolutional Neural Networks (CNN) to the daytime satellite images of each census tract showed promising results. It helped us identify distinctive landscape features (Appendix A shows the landscape features identified from a sample census tract image). We then clustered all census tracts based on these landscape features such as residential and industrial neighborhoods, forest area, and barren land. We identified three clusters as the optimal number of clusters based on a metric that combines separation and compactness of the clusters (Appendix A). Tracts with similar landscape features belong to the same cluster consistently throughout our observation window. Furthermore, given that the average poverty rate of the tracts belonging to each cluster varies significantly among the three clusters (Figs. 5 and 6), we coded these clusters as 0, 1, and 2 with 0 being the lowest poverty rate cluster and 2 being the cluster with the highest poverty level. Census tracts in the low poverty cluster tend to have well-developed residential and commercial areas and little barren land although some of them tend to have substantial forest area. Census tracts in the medium poverty cluster tend to have a mix of well-developed residential neighborhoods with poorly constructed residential and commercial areas. Finally, census tracts in the high poverty cluster tend to have a lot of barren land or forest cover with poorly constructed residential and commercial areas. We included the poverty cluster as a feature in our poverty prediction model.

In addition, we adopted the Long Short-Term Memory model (LSTM) [26–28], a type of Recurrent Neural Network (RNN), to detect and measure the temporal changes at the pixel level over a three-year span to the landscape features of a tract that were identified from the CNN. Widely used in applications such as predictive search and text generation, RNNs are often used to predict the next value in a sequence based on dynamic evolving changes to the input data. In our processing of the daytime images, the CNN segments different landscape features of the census tract image while the LSTM detects temporal changes to these features over a consecutive three-year period, which can be considered as indicators of infrastructure and urban development. We then used these temporal changes to predict the poverty rate of each tract. More specifically, we divided the image of each census tract into chunks, trained the LSTM on the sequential data for these chunks, and aggregated the poverty predictions from these chunks by calculating their average and skewness. Fig. 7 shows the results of training. The neural network reaches a minimal validation error rate in less than three hundred epochs. The poverty predictions from the CNN-LSTM neural
network model (i.e., from daytime image processing) were then included as ensemble features in our final prediction model.

In summary, the CNN allowed us to extract landscape features of each census tract and identify the poverty cluster the tract belongs to. The LSTM, a type of RNN, enabled us to track the temporal progression of the landscape features in a tract over a three-year span. The integration of these two methods yielded both the poverty cluster and the poverty predictions from the CNN-LSTM neural network model, which were included as ensemble features in our final prediction model.

Fig. 5. Box Plot of Poverty Clusters.
Fig. 6. Poverty Clusters Across Years.
Fig. 7. Training Result for RNN.

For nighttime data, we again used images collected from Google Earth Engine. We calculated the average and skewness [29–31] of nighttime image pixel intensity, which can be used as a proxy for economic activity, for each census tract. To execute these tasks, we used a high-performance computing cluster with eight cores and four CPUs per task, with 8 GB of memory per CPU. In addition, we used two GPUs per node to train the neural networks.

4.2. Phase 2 – extracting features from adjacency network data

Using the census tract shape files from the Census Bureau, we created an adjacency network of census tracts where each tract is a node and has a link to another tract if they share a border. For each tract, we then calculated the average and the skewness of poverty rates of its adjacent tracts from the previous year. These two features were used as poverty predictors that represent spatial spillover effects for the focal tract in a given year [29–31].

4.3. Phase 3 – extracting baseline features (demographic and crime features)

We also included additional factors that have been found to impact poverty in prior studies [2–5,32–35]. For example, we calculated the number of high school graduates as a proxy for the educational attainment in each tract. Most of the school districts intersect with only one census tract. If a school district does intersect with more than one tract, we use the census tract that intersects with it the most, which is identified by the intersecting method of shapely polygons [36]. Appendix A provides more details about the algorithm we adopted to match the school catchment area to a census tract.

In addition, we gathered housing data (both housing attributes and transaction history) from the County Assessor’s office.³ The Department of Transportation (TDOT)⁴ address database was used to find every address in the County, which was then mapped to the corresponding census tract based on its latitude and longitude. We collected the average price per square foot and the number of residential properties sold in each tract in the previous year. We also gathered crime count data for all types of crimes for each census tract from the FBI Crime Database,⁵ which provides details of each reported crime in a census block for each year. We aggregated the count of these crimes by year. Minority Percentage, Median Family Income and Owner Occupied Units data were collected from the US Census Bureau. Table 1 shows a summary of the data collected.

³ https://www.assessormelvinburgess.com/propertySearch
⁴ https://www.tn.gov/tdot.html
⁵ https://osf.io/zyaqn/

5. Results

To predict the poverty rate of a census tract, we propose the use of a Vector Autoregression (VAR) based ensemble model that combines predictions from daytime and nighttime imaging with features from adjacent tracts. Combining predictions from different models with these features gives us a partially interpretable model. Vector Autoregressions are used to model relationships between multiple time series in a dataset where the theoretical nature of the relationship among the variables is either unspecified or not completely specified [37]. Given the temporal nature of the data and the unspecified nature of theoretical relationships among the predictors, we argue that this technique best suits our needs to quantify and interpret the predictive power of each predictor (more details about the development and implementation of the model are presented in Appendix B). In addition, neural network methods such as CNN and RNN can capture the spatial and temporal variation in image data, so they would best suit our needs for dealing with census tract images. In other words, we combine the best of neural network and econometric methods (a comprehensive theoretical and empirical comparison of both techniques is presented in Appendix C).

We used the first five years of data for training and the last three years for testing. Table 2 shows the summarized results of our prediction models on the testing dataset using the features extracted as described earlier. Overall, we found that a lag of three periods (years) yielded the best results in terms of the final prediction error (FPE); model fit statistics are provided in Appendix B, Table B.1. We built four models. The first one is a baseline model that uses features identified from prior poverty research and extracted in phase 3 as described earlier. It also includes the poverty level of the focal tract in the previous year. The second model adds nighttime intensity features to the baseline model while the third one adds features extracted from daytime image processing (i.e., poverty clusters and poverty predictions based on temporal changes to landscape features) to the second model. The final one combines all four groups of features by adding spatial network features to the third model.

The model with only the baseline features (i.e., baseline model) shows that all the economic features except the mean transaction price and price per square foot have significant predictive power. The number of real estate transactions (i.e., the number of houses sold) has a significant negative coefficient across all models, indicating that the more transactions, the lower the poverty rate in that census tract.
Every one-unit increase in the number of homes sold is associated with a 0.96% reduction in poverty rate. Crime Count has a significant positive coefficient with every crime associated with a 0.225% increase in poverty. In addition, the number of high school graduates (i.e., education attainment) does not seem to have significant predictive power (not just in this model but across all four models). We attribute it to the fact that graduates may move to different regions instead of staying in the same census tract. This model has the lowest level of performance among all four models, underscoring the importance of identifying additional features to improve the predictive performance.

After adding features extracted from nighttime image data to the baseline model (i.e., model 1), we find that both the average and the skewness of nighttime intensity are significant. Every unit increase in nighttime intensity is associated with a 0.03% decrease in poverty rate while a similar increase in skewness is associated with a 0.66% increase. In addition, we see a 1–2% reduction in all the error metrics compared with the baseline model, suggesting the importance of leveraging nighttime data as a proxy for economic activity in poverty prediction.

Results associated with model 2, which incorporates features extracted from the CNN-LSTM neural network model using daytime image data, show that all three features (i.e., Poverty Cluster, Average Development Poverty Prediction, Skewness Development Poverty Prediction) are significant predictors of poverty rate. The poverty cluster is coded in reverse order of poverty level (i.e., clusters with the lowest poverty rates have a smaller value). The positive signs and the significance levels for the coefficients not only indicate the importance of identifying landscape features in poverty prediction but also suggest that census tracts with higher poverty rates tend to get even poorer over time. In addition, both the average and skewness of poverty prediction from daytime imaging are significant.

Overall, the model that uses a combination of all four groups of features (i.e., model 3) gives the best predictive power as measured by the four error metrics. The mean squared error (MSE) drops by more than 10% over the baseline model. In addition, we find that the skewness of poverty rates across adjacent tracts, not the average, is significant, showing that the distribution of poverty rates of adjacent tracts plays a vital role in determining the poverty level of a focal tract. Considering the direction of coefficients, if poverty rates of adjacent tracts are more unevenly distributed, the poverty rate of the focal tract is also likely to be higher. These findings add new insights to our understanding of poverty’s spatial spillover effects. Across all the models, previous period poverty prediction and average development poverty prediction from daytime imaging are significant, showing the value of using predictions from multiple models (i.e., ensembles) in the final model.
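
The two spatial spillover features discussed above (the average and the skewness of the previous-year poverty rates of a tract's neighbors) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the toy tract identifiers, and the toy poverty rates are hypothetical, and the use of the population skewness (third standardized moment) is an assumption, since the paper does not state which skewness estimator was used.

```python
from statistics import mean


def skewness(values):
    """Population skewness (third standardized moment); 0.0 when undefined."""
    n = len(values)
    if n < 3:
        return 0.0  # assumption: too few neighbors to measure asymmetry
    mu = mean(values)
    m2 = sum((v - mu) ** 2 for v in values) / n
    if m2 == 0:
        return 0.0  # all neighbors share the same rate
    m3 = sum((v - mu) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def adjacency_features(edges, prev_poverty):
    """Average and skewness of neighbors' previous-year poverty rates.

    edges: iterable of (tract_a, tract_b) pairs for tracts sharing a border.
    prev_poverty: dict mapping tract id -> previous-year poverty rate (%).
    """
    neighbors = {}
    for a, b in edges:
        # The adjacency network is undirected: record the link both ways.
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    features = {}
    for tract, adjacent in neighbors.items():
        rates = [prev_poverty[t] for t in sorted(adjacent) if t in prev_poverty]
        if not rates:
            continue
        features[tract] = {
            "avg_adjacent_poverty": mean(rates),
            "skew_adjacent_poverty": skewness(rates),
        }
    return features


# Hypothetical toy network: tract A borders B, C, and D; B also borders C.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
prev_poverty = {"A": 10.0, "B": 5.0, "C": 5.0, "D": 35.0}
features = adjacency_features(edges, prev_poverty)
```

In this toy network, tract A has two low-poverty neighbors and one high-poverty neighbor, so its neighbor distribution is right-skewed (positive skewness) even though its neighbor average is moderate. That is the kind of unevenness the skewness feature is meant to capture.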
Table 1
Descriptive Statistics.*

Variable                                                       Average     Median      Minimum     Maximum     VIF
Dependent Variable
Families Below Poverty (Percentage)                            16.6        12          0           66.3        2.51
Independent Variables
Poverty Cluster (Coded 0,1,2 with 0 being the lowest poverty)  1.84        2.0         0.44        2.0         2.03
Daytime Imaging Poverty Prediction                             11.77       6.9         0           59.02       4.04
Nighttime Intensity                                            22.0        26.08       0.52        75.81       2.52
Adjacent Tract Poverty (Percentage)                            12.06       7.14        0           61.4        2.04
High School Graduates                                          1884        285         0           38,130      3.09
Real Estate Transaction Count (Houses Sold)                    34.21       19          1           380         1.6
Price Per Square Foot (USD)                                    64.9        55.22       2.19        176.2       2.11
Transaction Price (USD)                                        138,450.60  90,633.33   2400        594,334.72  1.98
Crime Count                                                    1176        1275.29     0           6345        2.6
Minority Percentage                                            62.7        68.27       4.53        100         4.58
Median Family Income                                           62,694.75   55,402      0           189,444     2.58
Owner Occupied Units                                           1139.14     1049        29          3688        2.14

* The data shown are average per census tract.
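
For reference, the results in Table 2 come from a VAR with three lags. The generic textbook form of such a model, and of the final prediction error (FPE) criterion used for lag selection, is sketched below. The notation ($y_t$, $c$, $A_i$, $\varepsilon_t$, $K$, $T$, $\tilde{\Sigma}_u$) is assumed here for illustration and is not taken from the paper; the exact specification, including how the ensemble features enter the model, is given in the paper's Appendix B.

```latex
% Generic VAR(p) with p = 3 lags (assumed notation):
%   y_t            : K x 1 vector of the modeled variables for a tract in year t
%   c              : K x 1 intercept vector
%   A_1, A_2, A_3  : K x K coefficient matrices
%   \varepsilon_t  : K x 1 error vector
\[
  y_t = c + A_1\, y_{t-1} + A_2\, y_{t-2} + A_3\, y_{t-3} + \varepsilon_t
\]

% Standard FPE criterion for a VAR(p) estimated on T observations, where
% \tilde{\Sigma}_u(p) is the maximum-likelihood estimate of the residual
% covariance matrix; the lag order with the smallest FPE is selected.
\[
  \mathrm{FPE}(p) = \left( \frac{T + Kp + 1}{T - Kp - 1} \right)^{K}
                    \det \tilde{\Sigma}_u(p)
\]
```

Under this form, each coefficient in Table 2 would correspond to one entry of the row of the $A_i$ matrices that predicts the poverty rate.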
Table 2
Results.

Baseline Model: Education Attainment + Real Estate
Model 1: Baseline + Nighttime Intensity
Model 2: Baseline + Nighttime + Daytime
Model 3: Baseline + Nighttime + Daytime + Adjacency

Variable                                 Baseline Model     Model 1            Model 2            Model 3
Previous Period Poverty                  0.101** (0.034)    0.092* (0.045)     0.092* (0.045)     0.019** (0.053)
Average Nighttime Intensity                                 −0.03*** (0.000)   −0.03*** (0.000)   −0.01*** (0.000)
Skewness Nighttime Intensity                                0.66*** (0.04)     0.63*** (0.04)     0.035** (0.013)
Poverty Cluster                                                                1.68** (0.01)      1.64** (0.01)
Average Development Poverty Prediction                                         0.036** (0.058)    0.021* (0.059)
Skewness Development Poverty Prediction                                        0.034** (0.033)    0.038** (0.032)
Average Adjacent Tract Poverty                                                                    0.000 (0.000)
Skewness Adjacent Tract Poverty                                                                   0.09* (0.014)
High School Graduates                    0.00 (0.00)        0.00 (0.00)        −0.000 (0.000)     −0.000 (0.000)
Real Estate Transaction Count            −0.96* (0.07)      −0.01* (0.00)      −0.01* (0.00)      −0.01* (0.000)
Mean Transaction Price (Homes)           0.06 (0.00)        0.00 (0.00)        0.00 (0.00)        0.000 (0.000)
Skewness Transaction Price (Homes)       0.66* (0.06)       −0.65** (0.03)     −0.67** (0.03)     −0.003 (0.008)
Average Price Per Sqft                   −0.03 (0.000)      0.04 (0.00)        0.04 (0.00)        −0.000 (0.000)
Skewness Price Per Sqft                  −0.06*** (0.12)    −0.67* (0.03)      −0.69* (0.03)      −0.002 (0.022)
Crime Count                              0.255*** (0.002)   0.264** (0.01)     −0.255 (0.09)      0.104*** (0.001)
Minority Percentage                      −0.467*** (0.03)   −0.315** (0.002)   −0.467** (0.02)    −0.48** (0.02)
Median Family Income                     −0.339* (0.04)     −0.282** (0.004)   −0.339** (0.001)   −0.326*** (0.003)
Owner Occupied Units                     −0.626* (0.001)    −0.136** (0.006)   −0.626** (0.004)   −0.795* (0.004)

MSE                                      24.12              23                 22.54              22.13
RMSE                                     4.91               4.79               4.74               4.705
MAPE                                     2.64               2.67               2.64               2.64
MAE                                      0.174              0.171              0.169              0.168

Significance level: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05.
6. Robustness checks

We performed three robustness checks on our predictive model.
First, using Forecast Error Variance Decomposition (FEVD) [38], we checked whether any single variable (other than the lagged dependent variable) has an outsized influence on the predictive power of the model. Results showed that no variable accounted for more than 5% of the error variance (Table B.2 details the contribution of each variable to the forecast error variance).
Second, we checked the robustness of significant predictors in each model. We used permutation feature importance [39], which measures the contribution of a feature by randomly interchanging all the values of a variable across rows, to assess the robustness of each significant variable in each of the models. We found that permuting any of the significant variables, in any of the models, increases the error by at least 3%.
Third, we used the heteroskedasticity-and-autocorrelation (HAC) consistent estimators of the variance-covariance matrix to overcome the issue of serially correlated error terms across different periods. As a result, the direction and significance of the coefficients were consistent across all three lagged periods.

7. Contributions and future research

In this research, we leveraged both satellite image data and traditional data used in prior poverty research to develop a context-specific predictive model of poverty in a county in the United States. Data from these disparate sources allowed us to estimate features related to local landscape features and infrastructure development as well as spatial spillover factors, which may help us predict the changes in poverty in this area. In addition, we proposed and demonstrated the application of a VAR-based ensemble model that combines poverty prediction from daytime imaging with other types of features. The prediction models thus obtained outperform the existing models in predicting poverty rates. In doing so, we make major theoretical, methodological, and practical contributions.

7.1. Theoretical contributions

Theoretical work in this area has so far focused on behavioral, economic, structural, and political determinants of poverty rates. The past decade has seen more emphasis on using big data sources to fight poverty. However, most studies tend to rely on a single source of data. Our work combines the more traditional sources with publicly available big data sources to identify key features in poverty prediction. We propose a context-specific model consisting of structural and economic factors for poverty prediction using design science. We also quantify the impact of spatial spillover effects on poverty in a focal area. Overall, our study highlights the potential of developing overarching theoretical frameworks on the dynamics of poverty, which can be tested by leveraging big data and innovative analytical methods.

7.2. Methodological contributions

We make four major methodological contributions in this study.
First, we propose a method for combining Convolutional Neural Network (CNN) for identification and classification of landscape features with Recurrent Neural Network (RNN) for sequential development to quantify infrastructure and urban development at the census tract level in relation to poverty rates. Using CNN, we were able to find different landscape features such as residential and commercial zones, forests, and barren land. Using RNN, we were able to track the development in a census tract, in relation to its poverty rate, and use it to predict the poverty rates. This method can be replicated and used in other contexts to quantify infrastructure and urban development and devise proper measures.
Second, we propose and implement a VAR-based Ensemble model that combines predictions from daytime imaging with features related to adjacent tract poverty rates and demographic features. The literature on time series based ensemble models is sparse [40–42]. However, the advent of big data has created a need for such models. Recently, Velichety and Shrivastava [43] proposed and implemented a two stage Vector Autoregression model to quantify the impacts of fake news on social media platforms. More recently, there has been a surge in the usage of neural network-based models for time series prediction. However, these models can be complex to interpret because they involve multiple hidden layers and millions (or even billions) of parameters. On the other hand, econometric based time series methods can only handle numeric data (a comprehensive comparison of the pros and cons of these methods is provided in Appendix C). In this research we identify a middle ground between the two by compromising only slightly on the interpretability. Our work in this research is a step forward in building partially interpretable models for time series data. In doing so, we also contribute to the literature on proposing novel methods using big data and design science [44].
Third, we use spatial measures to quantify the spatial spillover impact of poverty. We found that the skewness of neighboring tracts’ poverty levels is significant in predicting the poverty rate of the focal tract. The method of constructing a spatial network based on neighborhood data and deriving metrics can be replicated by future researchers working in this area.
Fourth, we propose a customized algorithm using Shapely polygons to determine the common area between a census tract and a school catchment area and assign a school district to a census tract. This algorithm can be replicated efficiently to find common intersecting areas of any two geographies.

7.3. Practical contributions

Our work provides an overarching framework for policy makers to use multiple data sources effectively to combat poverty. We use a variety of traditional and big data sources in our research to build efficient and accurate predictive models. All our data sources are publicly accessible and easily available for decision makers. Accurate and timely predictions of poverty rates at the census tract level can be extremely useful for city officials in a variety of tasks including budget and resource allocations and timely interventions to prevent crime. In addition, we also provide ways of combining traditional and big data resources in an efficient way. The computational requirements for running our models are minimal and are easily available to most city officials. Finally, the models and frameworks proposed here can be easily customized and replicated to predict poverty in other cities in the US. The baseline features for all the models remain the same and can be collected from the same sources, i.e., Department of Transportation, FBI Crime Database and Census Bureau. Imaging data can be collected from Google Earth Engine. The only change in data sources for another city is the high school graduation data, which is usually publicly available for most metropolitan areas. The data extraction and model building processes can be replicated using Fig. 2. The computational requirements for extracting the features include a high-performance computing cluster with eight cores and four CPUs per task that have a memory of 8 GB per CPU.
Recent years have seen an explosion of data available to researchers and policy makers [44]. Research in the information systems community has focused on providing big data analytics and data science frameworks in numerous areas including online communities [30,31,45], cyber security [46,47], text mining [48], sales prediction [49], recommender systems [50,51], mobile design [52], targeted advertising [53] and crowdfunding [54]. Our study is a step forward in this direction that focuses on the public policy and social works domains.

7.4. Future work

We identify the following potential areas of future work.
First, our research uses a dataset of eight years for a single county. Given that the metrics we used and the neural networks we trained can be applied to other regions beyond our research context, a study consisting of similar data from multiple geographical regions will not only allow us to develop a deeper theoretical understanding of the contextual factors influencing poverty but also help customize our processes and metrics to improve their generalizability.
Second, when using the spatial network measures to predict the poverty rate of a focal census tract, we treat all neighboring census tracts equally. However, geographic proximity may not be the only factor impacting the connections between tracts, which may also be influenced by neighborhood characteristics. Prior research has found that communities’ network is linked by their socioeconomic needs, despite the geographical distance between them [55]. To take such complexity into consideration, future research may build dynamic graph neural networks to capture temporal relationships between poverty levels of census tracts and identify poverty co-movement among tracts, which may or may not be adjacent to each other. The embeddings learnt from such networks can provide insights about the connections between census tracts beyond their geographical distance, which can further help improve predictive accuracy. Such a study can also contribute to the wealth of theoretical and empirical literature on using network analyses to solve societal problems [56–58] and on applications of graph neural networks [45,59–63].

CRediT author contribution statement

Brian Hoogstra: Conceptualization; Data Curation; Methodology. Srikar Velichety: Funding acquisition, Investigation, Project administration, Validation. Chen Zhang: Writing – review & editing, Validation, Funding acquisition, Visualization.

Declaration of Competing Interest

The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: Srikar Velichety reports financial support was provided by the FedEx Institute of Technology and the Fogelman College of Business and Economics. Srikar Velichety reports a relationship with The University of Memphis that includes: employment.

Data availability

Data will be made available on request.
Appendix A. Appendix
Fig. A1 shows the landscape of a census tract extracted from GEE while Figs. A2 and A3 show the results of clustering on this image after applying
the ResNet50 model [25]. We can see that the neural network is clearly able to identify and segregate different sections of the landscape including
residential neighborhoods, industrial warehouses, barren land, and dense forests.
Fig. A1. Original.
Fig. A2. 3 Clusters.
Fig. A3. 4 Clusters.
To determine the optimal value for number of clusters, for each value of k (ranging from 2 to 10), we first evaluated the quality of the clustering
results using evaluation functions proposed in [64] that measure the cluster compactness (Cmp), cluster separation (Sep), and combined measure of
overall cluster quality (Ocq). The definitions of these functions are given below.
$$\mathrm{Cmp} = \frac{1}{c} \sum_{i=1}^{c} \frac{v(C_i)}{v(X)} \tag{A.1}$$
where $c$ is the number of clusters generated from dataset $X$, $v(C_i)$ is the standard deviation of the cluster $C_i$, and $v(X)$ is the standard deviation of the dataset $X$:
$$v(X) = \sqrt{\frac{1}{N} \sum_{i=1}^{N} d^2(x_i, \mu)}$$
where $d(\cdot,\cdot)$ is the Euclidean distance between two vectors, $N$ is the number of members in $X$, and $\mu$ is the mean of $X$.
$$\mathrm{Sep} = \frac{1}{c(c-1)} \sum_{i=1}^{c} \sum_{\substack{j=1 \\ j \neq i}}^{c} \exp\left(-\frac{d^2(x_{c_i}, x_{c_j})}{2\sigma^2}\right) \tag{A.2}$$
where $c$ is the number of clusters, $\sigma$ is the standard deviation of the dataset $X$, $x_{c_i}$ is the centroid of cluster $C_i$, and $d(x_{c_i}, x_{c_j})$ is the distance between the centroids of $C_i$ and $C_j$.

To calculate the overall quality of clusters obtained, we give equal importance to both Compactness (Cmp) and Separation (Sep) and calculate a
measure of overall quality as follows:
Ocq = 0.5*Cmp + 0.5*Sep (A.3)
The lower the value of Ocq, the better the quality of clusters.
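As a worked illustration of Eqs. (A.1)–(A.3), the sketch below evaluates Cmp, Sep, and Ocq for two hypothetical partitions of the same four points. The function names and the toy data are illustrative only, and setting σ equal to v(X) is an assumption based on the description of σ as the standard deviation of the dataset.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(points):
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(len(points[0])))

def deviation(points):
    """v(.) in Eq. (A.1): root mean squared distance of the points to their mean."""
    mu = centroid(points)
    return math.sqrt(sum(sq_dist(p, mu) for p in points) / len(points))

def cluster_quality(clusters):
    """Return (Cmp, Sep, Ocq) for a list of clusters, each a list of points."""
    data = [p for cluster in clusters for p in cluster]
    c = len(clusters)
    v_x = deviation(data)
    cmp_ = sum(deviation(cluster) for cluster in clusters) / (c * v_x)
    sigma = v_x  # assumption: sigma in Eq. (A.2) is taken to be v(X)
    cents = [centroid(cluster) for cluster in clusters]
    sep = sum(
        math.exp(-sq_dist(cents[i], cents[j]) / (2 * sigma ** 2))
        for i in range(c) for j in range(c) if i != j
    ) / (c * (c - 1))
    return cmp_, sep, 0.5 * cmp_ + 0.5 * sep

# Same four points, partitioned well (tight) and badly (loose).
tight = [[(0.0, 0.0), (0.0, 1.0)], [(10.0, 0.0), (10.0, 1.0)]]
loose = [[(0.0, 0.0), (10.0, 1.0)], [(10.0, 0.0), (0.0, 1.0)]]
ocq_tight = cluster_quality(tight)[2]
ocq_loose = cluster_quality(loose)[2]
print(ocq_tight, ocq_loose)
```

The well-separated partition yields the smaller Ocq, in line with the rule that lower values indicate better clusters; selecting the number of clusters then amounts to computing Ocq for each k from 2 to 10 and keeping the minimum.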
Algorithm for finding and assigning a school catchment area to a census tract:
Our algorithm takes the coordinates (latitude and longitude) of the nearest intersection edges of the school catchment areas. Each catchment area is approximated by the nearest-shaped polygon.
After finding the area of intersection, which is in turn approximated by the nearest-shaped polygon, the common intersection area is calculated. This area is divided by the area of the largest overlapping census tract to determine the extent of overlap. In this way, we assign a school catchment area to a single census tract. The pseudocode for the algorithm is shown below:
Initialize: Common_intersection ⟵ 0
Catchment_area ⟵ Nearest_polygon(Catchment_area)
CensusTract_area ⟵ Nearest_polygon(CensusTract_area)
Common_intersection_area ⟵ intersection(Catchment_area, CensusTract_area)
Common_intersection ⟵ Nearest_polygon(Common_intersection_area)
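The pseudocode above can be sketched in runnable form as follows. This is not the Shapely-based implementation described in the paper: to stay dependency-free it swaps in Sutherland-Hodgman clipping, which is only valid when the clipping polygon (here the catchment area) is convex and listed counter-clockwise. All function names, the planar coordinates, and the rule of assigning the catchment with the largest overlap share are illustrative assumptions.

```python
def polygon_area(pts):
    """Shoelace formula; pts is a list of (x, y) vertices in order."""
    total = 0.0
    for i in range(len(pts)):
        x1, y1 = pts[i]
        x2, y2 = pts[(i + 1) % len(pts)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def clip_convex(subject, clip):
    """Sutherland-Hodgman: part of `subject` inside convex, counter-clockwise `clip`."""
    def inside(p, a, b):
        # True when p lies on or to the left of the directed edge a -> b.
        return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0]) >= 0

    def intersect(p, q, a, b):
        # Intersection of the infinite lines through p-q and a-b.
        x1, y1, x2, y2 = p[0], p[1], q[0], q[1]
        x3, y3, x4, y4 = a[0], a[1], b[0], b[1]
        den = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
        d1 = x1 * y2 - y1 * x2
        d2 = x3 * y4 - y3 * x4
        return ((d1 * (x3 - x4) - (x1 - x2) * d2) / den,
                (d1 * (y3 - y4) - (y1 - y2) * d2) / den)

    output = list(subject)
    for i in range(len(clip)):
        a, b = clip[i], clip[(i + 1) % len(clip)]
        points, output = output, []
        if not points:
            break  # nothing left to clip: the polygons do not overlap
        s = points[-1]
        for e in points:
            if inside(e, a, b):
                if not inside(s, a, b):
                    output.append(intersect(s, e, a, b))
                output.append(e)
            elif inside(s, a, b):
                output.append(intersect(s, e, a, b))
            s = e
    return output

def overlap_share(tract, catchment):
    """Common intersection area divided by the census tract area."""
    common = clip_convex(tract, catchment)
    return polygon_area(common) / polygon_area(tract) if len(common) >= 3 else 0.0

def assign_catchment(tract, catchments):
    """Name of the catchment with the largest overlap share (assumed rule)."""
    return max(catchments, key=lambda name: overlap_share(tract, catchments[name]))

# Hypothetical tract and two hypothetical catchment areas (planar coordinates).
tract = [(0, 0), (2, 0), (2, 2), (0, 2)]
catchments = {
    "north": [(0, 1.5), (2, 1.5), (2, 4), (0, 4)],
    "east": [(1, 0), (3, 0), (3, 2), (1, 2)],
}
print(assign_catchment(tract, catchments))
```

For non-convex geographies, a general polygon library such as Shapely (the tool named in the paper) would be the natural replacement for `clip_convex`.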
Appendix B. Appendix
Table B.1
Model Fit Statistics.
Lag   AIC     HQ      SC      FPE
1     77.99   78.39   79.04   7,435,765 × 10^27
2     77.13   77.91   79.17   3,145,216 × 10^27
3     77.07   78.21   80.08   2,951,472 × 10^27
Table B.2
Forecast Error Variance Decomposition.
Variable FEVD (%)
Poverty Cluster 0.65

Average Development Poverty Prediction 1.15
Skewness Development Poverty Prediction 1.78
Average Adjacent Tract Poverty 0.43
Skewness Adjacent Tract Poverty 1.80
High School Graduates 0.03
Crime Count 1.2
Median Family Income 0.15
Minority Percentage 1.65
Owner Occupied Units 0.74
Real Estate Transaction Count 0.41
Mean Transaction Price (Homes) 1.22
Skewness Transaction Price (Homes) 0.01
Average Price Per Sqft 0.43
Skewness Price Per Sqft 0.36
Average Nighttime Intensity 0.31
Skewness Nighttime Intensity 0.87
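For readers unfamiliar with the decomposition reported in Table B.2, the sketch below shows how forecast error variance shares can be computed for a small VAR. It is a simplified illustration, not the authors' code: it uses a VAR(1) instead of the three-lag model, Cholesky-orthogonalized shocks (so the shares depend on variable ordering), and a hypothetical coefficient matrix `A` and residual covariance `sigma`.

```python
import numpy as np

def fevd_var1(A, sigma, horizon):
    """Forecast error variance decomposition for a VAR(1).

    Entry [j, k] of the result is the share of variable j's `horizon`-step
    forecast error variance attributed to the orthogonalized shock k.
    """
    n = A.shape[0]
    P = np.linalg.cholesky(sigma)   # lower-triangular factor, sigma = P P'
    phi = np.eye(n)                 # moving-average coefficient Phi_0 = I
    contrib = np.zeros((n, n))
    for _ in range(horizon):
        theta = phi @ P             # orthogonalized impulse responses
        contrib += theta ** 2       # accumulate squared responses per shock
        phi = A @ phi               # Phi_{i+1} = A Phi_i holds for a VAR(1)
    return contrib / contrib.sum(axis=1, keepdims=True)

# Hypothetical two-variable system.
A = np.array([[0.5, 0.1], [0.0, 0.3]])
sigma = np.array([[1.0, 0.2], [0.2, 1.0]])
shares = fevd_var1(A, sigma, horizon=8)
print(np.round(100 * shares, 2))    # percentages; each row sums to 100
```

Reading one row of `shares` as percentages gives the kind of per-variable contribution listed in Table B.2 for the poverty equation.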
We propose a Vector Autoregression (VAR) based Ensemble model in this research. It combines poverty prediction from daytime imaging with
features related to poverty of adjacent census tracts and education attainment.
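$$
\begin{pmatrix}
\text{Families Below Poverty}_{it} \\
\text{Average Development Prediction}_{it} \\
\text{Adjacent Tract Poverty Rates}_{it} \\
\text{Education Attainment}_{it} \\
\text{Real Estate and Economic Features}_{it} \\
\text{Crime Count}_{it}
\end{pmatrix}
=
\begin{bmatrix}
\alpha_1 + \beta_1 t \\
\alpha_2 + \beta_2 t \\
\alpha_3 + \beta_3 t \\
\alpha_4 + \beta_4 t \\
\alpha_5 + \beta_5 t \\
\alpha_6 + \beta_6 t
\end{bmatrix}
+
\sum_{k=t-3}^{t-1}
\begin{bmatrix}
q^k_{1,1} & q^k_{1,2} & q^k_{1,3} & q^k_{1,4} & q^k_{1,5} & q^k_{1,6} \\
q^k_{2,1} & q^k_{2,2} & q^k_{2,3} & q^k_{2,4} & q^k_{2,5} & q^k_{2,6} \\
q^k_{3,1} & q^k_{3,2} & q^k_{3,3} & q^k_{3,4} & q^k_{3,5} & q^k_{3,6} \\
q^k_{4,1} & q^k_{4,2} & q^k_{4,3} & q^k_{4,4} & q^k_{4,5} & q^k_{4,6} \\
q^k_{5,1} & q^k_{5,2} & q^k_{5,3} & q^k_{5,4} & q^k_{5,5} & q^k_{5,6} \\
q^k_{6,1} & q^k_{6,2} & q^k_{6,3} & q^k_{6,4} & q^k_{6,5} & q^k_{6,6}
\end{bmatrix}
\begin{pmatrix}
\text{Families Below Poverty}_{ik} \\
\text{Average Development Prediction}_{ik} \\
\text{Adjacent Tract Poverty Rates}_{ik} \\
\text{Education Attainment}_{ik} \\
\text{Real Estate and Economic Features}_{ik} \\
\text{Crime Count}_{ik}
\end{pmatrix}
+
\begin{bmatrix}
e_{1t} \\ e_{2t} \\ e_{3t} \\ e_{4t} \\ e_{5t} \\ e_{6t}
\end{bmatrix}
\tag{B.1}
$$
where $\alpha_i$ ($i = 1, 2, 3, 4, 5, 6$) are constants, $\beta_i$ and $q^k_{i,j}$ ($i, j = 1, 2, 3, 4, 5, 6$) are coefficients, $K$ is the lag length, and $e_i$ ($i = 1, 2, 3, 4, 5, 6$) is the white-noise residual. The ones in bold are predictions i.e., ensemble features while the ones in regular text are raw features.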
To determine the lag order (K), we use Final Prediction Error (FPE). We found that the optimal lag was three (see Table B.1). We test for various assumptions of VAR including multivariate normality (MVN), co-integration of time series, autocorrelation, autoregressive conditional heteroscedasticity (ARCH) and white noise residuals. Fig. B1 gives a pictorial representation.
Fig. B1. Pictorial Representation of VAR-based Ensemble across two periods (T and T+1).
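The lag selection described above can be illustrated with a short sketch that fits a VAR by ordinary least squares for each candidate lag and compares Final Prediction Error values. It is a simplified stand-in rather than the authors' implementation: it uses an intercept but omits the linear trend term of Eq. (B.1), the data are simulated, and the function and variable names are hypothetical. The FPE expression follows the standard textbook form, with the residual covariance estimated by maximum likelihood.

```python
import numpy as np

def fpe(y, p, max_p):
    """Final Prediction Error of a VAR(p) with intercept, fitted by OLS.

    y has shape (n_obs, n_vars). The first max_p rows serve as presample
    values so that every candidate lag order uses the same estimation sample.
    """
    n_obs, n_vars = y.shape
    T = n_obs - max_p
    target = y[max_p:]
    # Column block k holds the observations lagged by k periods.
    lags = [y[max_p - k : n_obs - k] for k in range(1, p + 1)]
    Z = np.hstack([np.ones((T, 1))] + lags)
    B, *_ = np.linalg.lstsq(Z, target, rcond=None)
    resid = target - Z @ B
    sigma_ml = resid.T @ resid / T            # ML estimate of residual covariance
    scale = (T + n_vars * p + 1) / (T - n_vars * p - 1)
    return scale ** n_vars * np.linalg.det(sigma_ml)

# Simulated two-variable VAR(1) data (hypothetical coefficients).
rng = np.random.default_rng(0)
A = np.array([[0.5, 0.1], [0.0, 0.3]])
y = np.zeros((200, 2))
for t in range(1, 200):
    y[t] = A @ y[t - 1] + rng.normal(size=2)

scores = {p: fpe(y, p, max_p=3) for p in (1, 2, 3)}
best = min(scores, key=scores.get)            # lag order with the lowest FPE
print(scores, best)
```

In Table B.1 the same comparison favors three lags, since the FPE is lowest there.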
Appendix C. Theoretical and empirical comparison of VAR and NN
Table C.1 shows a theoretical comparison of Neural Networks and VAR based models for time series data on four unique dimensions identified
using the literature in both these areas [37,65–68].
Table C.1
Comparison of Neural Network and VAR Based Models.
Dimension                Neural Networks   VAR
Predictive Performance   Excellent         Moderate
Interpretability         Poor              Excellent
Explainability           Poor              Excellent
Computational Cost       High              Minimal
To compensate for the moderate predictive performance of VAR models, we propose using ensembles. Doing so can moderately decrease the
interpretability while improving the predictive performance to a large extent. Table C.2 compares the computational cost and predictive performance
of a Neural Network model with a combined Neural Network and VAR model. In our case, it is not possible to construct a pure VAR model because
some daytime features were derived using Neural Networks. We use an extreme learning machine (ELM) neural network for our time series prediction.
Computational Cost is calculated based on the time it takes to train the model. Table C.2 shows that combining neural networks and VAR results in a
substantial performance improvement across all the models while decreasing the computational cost.
In addition, Model 5 (using both poverty cluster based on spatial features from CNN and poverty prediction based on temporal features from LSTM) outperforms Model 3 (using only poverty cluster based on spatial features from CNN) and Model 4 (using only poverty prediction based on temporal features from LSTM) in model accuracy, demonstrating the strength of combining Convolutional Neural Network (CNN) for identification and classification of landscape features with Recurrent Neural Network (RNN) for quantifying temporal changes in infrastructure and urban development.
Table C.2
Computational Cost and Predictive Performance of Neural Network and VAR Models.
Model / Estimator                                                               Computational Cost (s)   MSE     RMSE   MAPE    MAE
Model 1 (Baseline Features)
  Neural Network                                                                1.600                    64.40   8.02   0.050   0.240
  Neural Network + VAR                                                          0.017                    24.92   4.91   2.640   0.174
Model 2 (Baseline + Nighttime)
  Neural Network                                                                1.660                    49.50   7.04   0.047   0.210
  Neural Network + VAR                                                          0.106                    23.00   4.79   2.670   0.171
Model 3 (Baseline + Nighttime + Poverty Cluster)
  Neural Network                                                                2.110                    53.13   7.30   0.051   0.234
  Neural Network + VAR                                                          0.080                    22.47   4.74   2.660   0.169
Model 4 (Baseline + Nighttime + Daytime landscape changes)
  Neural Network                                                                1.996                    50.73   7.12   0.053   0.230
  Neural Network + VAR                                                          0.110                    24.23   4.92   2.630   0.174
Model 5 (Baseline + Nighttime + Poverty Cluster + Daytime landscape changes)
  Neural Network                                                                2.040                    41.89   6.47   0.049   0.210
  Neural Network + VAR                                                          0.110                    22.45   4.74   2.640   0.169
Model 6 (All features)
  Neural Network                                                                1.857                    39.30   6.27   0.039   0.210
  Neural Network + VAR                                                          0.130                    22.13   4.71   2.640   0.168
References

[1] K. Schaeffer, 6 facts about economic inequality in the U.S, Pew Res. Cent, 2020. https://www.pewresearch.org/fact-tank/2020/02/07/6-facts-about-economic-inequality-in-the-u-s/ (accessed August 18, 2022).
[2] J.E. Blumenstock, Fighting poverty with data, Science. 353 (2016) 753–754.
[3] N. Jean, M. Burke, M. Xie, W.M. Davis, D.B. Lobell, S. Ermon, Combining satellite imagery and machine learning to predict poverty, Science. 353 (2016) 790–794.
[4] P.O. Okwi, G. Ndeng’e, P. Kristjanson, M. Arunga, A. Notenbaert, A. Omolo, N. Henninger, T. Benson, P. Kariuki, J. Owuor, Spatial determinants of poverty in rural Kenya, Proc. Natl. Acad. Sci. 104 (2007) 16769–16774.
[5] M. Xie, N. Jean, M. Burke, D. Lobell, S. Ermon, Transfer learning from deep features for remote sensing and poverty mapping, in: Thirtieth AAAI Conf. Artif. Intell, 2016.
[6] M. Bertrand, S. Mullainathan, E. Shafir, A behavioral-economics view of poverty, Am. Econ. Rev. 94 (2004) 419–423.
[7] S.N. Durlauf, Groups, social influences, and inequality, Poverty Traps (2006) 141–175.
[8] A.V. Banerjee, A. Banerjee, E. Duflo, Poor economics: a radical rethinking of the way to fight global poverty, Public Affairs (2011).
[9] L. Rainwater, T.M. Smeeding, Poor Kids in a Rich Country: America’s Children in Comparative Perspective, Russell Sage Foundation, 2003.
[10] M. Bertrand, A. Morse, Information disclosure, cognitive biases, and payday borrowing, J. Financ. 66 (2011) 1865–1893.
[11] D. Brady, Theories of the causes of poverty, Annu. Rev. Sociol. 45 (2019) 155–175.
[12] M.R. Rank, H.-S. Yoon, T.A. Hirschl, American poverty as a structural failing: evidence and arguments, J. Sociol. Soc. Welf. 30 (2003) 3.
[13] D. Brady, L.M. Burton, The Oxford Handbook of the Social Science of Poverty, Oxford University Press, 2016.
[14] D. Brady, R.M. Finnigan, S. Hübgen, Rethinking the risks of poverty: a framework for analyzing prevalences and penalties, Am. J. Sociol. 123 (2017) 740–786.
[15] L. Tach, A.D. Emory, Public housing redevelopment, neighborhood change, and the restructuring of urban inequality, Am. J. Sociol. 123 (2017) 686–739.
[16] D. Brady, A. Blome, H. Kleider, How Politics and Institutions Shape Poverty and Inequality, Oxford University Press, Oxford, UK, 2016.
[17] M.D. Partridge, D.S. Rickman, Persistent pockets of extreme American poverty and job growth: is there a place-based policy role? J. Agric. Resour. Econ. (2007) 201–224.
[18] B. Gweshengwe, N.H. Hassan, Defining the characteristics of poverty and their implications for poverty analysis, Cogent Soc. Sci. 6 (2020) 1768669.
[19] D.H. Weinberg, Poverty spending and the poverty gap, J. Policy Anal. Manag. 6 (1987) 230–241.
[20] S.J. Goetz, A. Rupasingha, The returns on higher education: estimates for the 48 contiguous states, Econ. Dev. Q. 17 (2003) 337–351.
[21] M.S. Crandall, B.A. Weber, Local social and economic conditions, spatial concentrations of poverty, and poverty dynamics, Am. J. Agric. Econ. 86 (2004) 1276–1281.
[22] N. Pokhriyal, D.C. Jacques, Combining disparate data sources for improved poverty prediction and mapping, Proc. Natl. Acad. Sci. 114 (2017) E9783–E9792.
[23] M.E. Delavega, Memphis poverty fact sheet, 2021, p. 20.
[24] K. Peffers, T. Tuunanen, M.A. Rothenberger, S. Chatterjee, A design science research methodology for information systems research, J. Manag. Inf. Syst. 24 (2007) 45–77.
[25] K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image recognition, in: Proc. IEEE Conf. Comput. Vis. Pattern Recognit, 2016, pp. 770–778.
[26] T.-Y. Kim, S.-B. Cho, Predicting residential energy consumption using CNN-LSTM neural networks, Energy. 182 (2019) 72–81.
[27] W. Boulila, H. Ghandorh, M.A. Khan, F. Ahmed, J. Ahmad, A novel CNN-LSTM-based approach to predict urban expansion, Ecol. Inform. 64 (2021), 101325.
[28] A. Vidal, W. Kristjanpoller, Gold volatility prediction using a CNN-LSTM approach, Expert Syst. Appl. 157 (2020), 113481.
[29] J. Bockstedt, C. Druehl, A. Mishra, Heterogeneous submission behavior and its implications for success in innovation contests with public submissions, Prod. Oper. Manag. 25 (2016) 1157–1176.
[30] S. Velichety, Quality assessment of peer-produced content in knowledge repositories using big data and social networks: The case of implicit collaboration in Wikipedia, in: ACM SIGMIS Database DATABASE Adv. Inf. Syst 50, 2019, pp. 28–51.
[31] S. Velichety, S. Ram, J. Bockstedt, Quality assessment of peer-produced content in knowledge repositories using development and coordination activities, J. Manag. Inf. Syst. 36 (2019) 478–512.
[32] J. Blumenstock, G. Cadamuro, R. On, Predicting poverty and wealth from mobile phone metadata, Science. 350 (2015) 1073–1076.
[33] C. Njuguna, P. McSharry, Constructing spatiotemporal poverty indices from big data, J. Bus. Res. 70 (2017) 318–327.
[34] N. Eagle, M. Macy, R. Claxton, Network diversity and economic development, Science. 328 (2010) 1029–1031.
[35] A. Abu, R. Hamdan, N.S. Sani, Ensemble learning for multidimensional poverty classification, Sains Malays. 49 (2020) 447–459.
[36] S. Gillies, The Shapely User Manual, https://pypi.org/project/Shapely, 2013.
[37] G. Adomavicius, J. Bockstedt, A. Gupta, Modeling supply-side dynamics of IT components, products, and infrastructure: an empirical analysis using vector autoregression, Inf. Syst. Res. 23 (2012) 397–417.
[38] H. Lütkepohl, New Introduction to Multiple Time Series Analysis, Springer Science & Business Media, 2005.
[39] A. Fisher, C. Rudin, F. Dominici, Model class reliance: variable importance measures for any machine learning model class, from the “Rashomon” perspective, ArXiv E-Prints, 2018.
[40] J.D. Wichard, M. Ogorzalek, Time series prediction with ensemble models, in: 2004 IEEE Int. Jt. Conf. Neural Netw. IEEE Cat No 04CH37541, IEEE, 2004, pp. 1625–1630.
[41] A. Galicia, R. Talavera-Llames, A. Troncoso, I. Koprinska, F. Martínez-Álvarez, Multi-step forecasting for big data time series based on ensemble learning, Knowl.-Based Syst. 163 (2019) 830–841.
[42] M. van Heeswijk, Y. Miche, T. Lindh-Knuutila, P.A. Hilbers, T. Honkela, E. Oja, A. Lendasse, Adaptive ensemble models of extreme learning machines for time series prediction, in: Int. Conf. Artif. Neural Netw, Springer, 2009, pp. 305–314.
[43] S. Velichety, U. Shrivastava, Quantifying the impacts of online fake news on the equity value of social media platforms–evidence from Twitter, Int. J. Inf. Manag. 64 (2022), 102474.
[44] A. Abbasi, S. Sarker, R.H. Chiang, Big data research in information systems: toward an inclusive research agenda, J. Assoc. Inf. Syst. 17 (2016).
[45] S. Velichety, S. Ram, Finding a needle in the haystack: recommending online communities on social media platforms using network and design science, J. Assoc. Inf. Syst. 22 (2021) 1285–1310.
[46] A. Abbasi, Z. Zhang, D. Zimbra, H. Chen, J.F. Nunamaker Jr., Detecting fake websites: the contribution of statistical learning theory, MIS Q. (2010) 435–461.
[47] B. Biswas, A. Mukhopadhyay, S. Bhattacharjee, A. Kumar, D. Delen, A text-mining based cyber-risk assessment and mitigation framework for critical analysis of online hacker forums, Decis. Support. Syst. 152 (2022), 113651.
[48] H. Dutta, A. Gupta, PNRank: unsupervised ranking of person name entities from noisy OCR text, Decis. Support. Syst. 152 (2022), 113662.
[49] T. Geva, G. Oestreicher-Singer, N. Efron, Y. Shimshoni, Using Forum and Search Data for Sales Prediction of High-Involvement Products. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2294609, 2015 (accessed September 20, 2017).
[50] S. Oruç, P.E. Eren, A. Koçyiğit, A constraint programming model for making recommendations in personal process management: a design science research approach, Decis. Support. Syst. 152 (2022), 113665.
[51] R. Duan, C. Jiang, H.K. Jain, Combining review-based collaborative filtering and matrix factorization: a solution to rating’s sparsity problem, Decis. Support. Syst. 156 (2022), 113748.
[52] F. Provost, D. Martens, A. Murray, Finding similar mobile consumers with a privacy-friendly geosocial design, Inf. Syst. Res. 26 (2015) 243–265.
[53] B. Kitchens, D. Dobolyi, J. Li, A. Abbasi, Advanced customer analytics: strategic value through integration of relationship-oriented big data, J. Manag. Inf. Syst. 35 (2018) 540–574.
[54] Q. Du, J. Li, Y. Du, G.A. Wang, W. Fan, Predicting crowdfunding project success based on backers’ language preferences, J. Assoc. Inf. Sci. Technol. 72 (2021) 1558–1574.
[55] C. Bevilacqua, P. Sohrabi, N. Hamdy, F. Mangiulli, Mapping connections between neighborhoods in response to community-based social needs, Sustainability. 15 (2023) 4898.
[56] K. Srinivasan, F. Currim, S. Ram, Predicting high-cost patients at point of admission using network science, IEEE J. Biomed. Health Inform. 22 (2017) 1970–1977.
[57] W. Zhang, S. Ram, A comprehensive analysis of triggers and risk factors for asthma based on machine learning and large heterogeneous data sources, MIS Q. 44 (2020).
[58] S. Ram, W. Zhang, M. Williams, Y. Pengetnze, Predicting asthma-related emergency department visits using big data, IEEE J. Biomed. Health Inform. 19 (2015) 1216–1223.
[59] W. Liao, B. Bak-Jensen, J.R. Pillai, Y. Wang, Y. Wang, A review of graph neural networks and their applications in power systems, J. Mod. Power Syst. Clean Energy 10 (2021) 345–360.
[60] A. Derrow-Pinion, J. She, D. Wong, O. Lange, T. Hester, L. Perez, M. Nunkesser, S. Lee, X. Guo, B. Wiltshire, Eta prediction with graph neural networks in google maps, in: Proc. 30th ACM Int. Conf. Inf. Knowl. Manag, 2021, pp. 3767–3776.
[61] X.-M. Zhang, L. Liang, L. Liu, M.-J. Tang, Graph neural networks and their current applications in bioinformatics, Front. Genet. 12 (2021), 690049.
[62] J. Shlomi, P. Battaglia, J.-R. Vlimant, Graph neural networks in particle physics, Mach. Learn. Sci. Technol. 2 (2020), 021001.
[63] H. Tian, X. Zheng, K. Zhao, M.W. Liu, D.D. Zeng, Inductive representation learning on dynamic stock co-movement graphs for stock predictions, INFORMS J. Comput. 34 (2022) 1940–1957.
[64] J. He, M. Lan, C.-L. Tan, S.-Y. Sung, H.-B. Low, Initialization of cluster refinement algorithms: A review and comparative study, in: 2004 IEEE Int. Jt. Conf. On Neural Netw, IEEE, 2004.
[65] M. Bańbura, D. Giannone, L. Reichlin, Large Bayesian vector auto regressions, J. Appl. Econ. 25 (2010) 71–92.
[66] N. Frosst, G. Hinton, Distilling a neural network into a soft decision tree, arXiv preprint arXiv:1711.09784, 2017.
[67] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, A. Rabinovich, Going deeper with convolutions, in: Proc. IEEE Conf. Comput. Vis. Pattern Recognit, 2015, pp. 1–9.
[68] A. Graves, A. Mohamed, G. Hinton, Speech recognition with deep recurrent neural networks, in: 2013 IEEE Int. Conf. Acoust. Speech Signal Process, IEEE, 2013, pp. 6645–6649.

Brian Hoogstra is a Data Scientist Principal at FedEx Services, Memphis. He has wide experience in database administration and enterprise content recovery. Brian has a bachelor’s degree from Davenport University and a master’s degree from University of Memphis.

Srikar Velichety is an Assistant Professor of Business Information and Technology at the Fogelman College of Business and Economics at the University of Memphis. Velichety got his PhD in Management Information Systems from the Eller College of Management, University of Arizona in 2016. His research interests are in Social Media and Social Networks, User Generated Content, Recommender Systems and Predictive Analytics. His research has been published in journals including JMIS, JAIS, IJIM and ACM SIGMIS DATABASE.

Chen Zhang is an Associate Professor of Business Administration at Ivy College of Business at Iowa State University. Prior to this, she was the associate dean of faculty and administration at the Fogelman College of Business and Economics at the University of Memphis. Zhang got her PhD in Management Information Systems from the Krannert School of Management at Purdue University in 2007. Her research interests are in software platforms, User Generated Content and Sharing Economy. Her research has been published in top journals including MIS Quarterly, Information Systems Research, IEEE Software and CAIS.
