Professional Documents
Culture Documents
Table A.6 presents key statistics on the Tanzanian industrial between 1991 and 2017.
Table A.7 presents descriptive statistics of key variables from the TNPS survey.
Figure A.8 illustrates the frequency and relative frequency of households that switch treatment
status during the study period when using different threshold distances. These “switchers” make
up the treatment group, and are thus central to identifying the effect of small-scale mining.
in Table B.8, I use a standard two-way fixed effects model to reproduce the main findings.
Columns 1-8 present the estimated effect of exposure to small-scale mining on household outcomes
41
Table A.7: Descriptive statistics
Household size 5.39 2.66 5.80 2.85 5.90 2.99 5.70 2.84
Farming last Masika 0.83 0.38 0.77 0.42 0.80 0.40 0.80 0.40
Acres owned 3.97 4.00 4.95 6.71 5.70 10.9 4.87 7.75
Soil quality (0-1) 0.31 0.39 0.36 0.42 0.33 0.43 0.33 0.41
Acres cultivted 3.44 3.77 3.51 4.32 3.99 4.53 3.65 4.22
Output:
Harvest value (imputed) 201.9 280.6 229.3 335.5 394.5 552.7 275.2 415.6
H. value (self-reported) 236.5 511.3 298.9 1113.6 446.6 831.0 327.3 859.0
Sales value 122.1 373.6 161.1 532.9 246.4 702.6 176.5 555.2
Harvest value (Vuli) 47.9 183.8 113.7 411.9 135.4 1020.9 99.0 645.2
Sales value (Vuli) 18.0 118.3 57.2 308.2 127.0 2535.2 67.4 1476.2
Labor:
All 76.0 85.3 74.1 91.2 83.6 94.1 77.9 90.3
Planting 38.2 49.6 38.3 54.6 42.2 58.7 39.6 54.5
Weeding 35.7 46.0 34.3 47.8 42.1 51.8 37.4 48.7
Harvesting 35.7 46.0 34.6 62.8 41.0 69.7 37.1 60.4
Consumption (TSh):
All 182.6 140.0 220.9 163.9 292.5 240.7 232.0 191.9
Food and beverages 131.6 92.5 158.1 109.4 213.4 169.8 167.7 132.7
Education 6.73 21.5 10.5 28.4 13.4 35.2 10.2 29.0
Health 9.14 26.3 8.75 17.7 12.7 31.3 10.2 25.8
Transport & comm. 14.4 39.4 18.2 37.4 24.3 52.8 19.0 43.9
Other 20.8 24.1 25.3 30.6 28.6 36.6 24.9 31.0
Input exp. (TSh):
All 48.5 149.6 49.5 142.4 90.0 265.8 62.7 195.3
Seeds 5.62 13.6 6.24 19.9 13.2 35.4 8.37 24.9
Fertilizers 14.4 90.5 16.1 85.8 27.3 157.2 19.3 116.0
Pesticides 1.80 9.34 1.67 12.0 2.62 17.3 2.03 13.3
Wages 21.9 69.1 21.7 76.0 39.8 145.5 27.8 103.1
Land rents 4.81 32.0 3.83 21.7 7.00 40.6 5.21 32.4
Any tech 0.50 0.50 0.46 0.50 0.56 0.50 0.51 0.50
Technology:
Improved seeds 0.13 0.34 0.11 0.31 0.31 0.46 0.18 0.39
Fertilizers 0.22 0.41 0.23 0.42 0.28 0.45 0.24 0.43
Pesticides 0.095 0.29 0.079 0.27 0.085 0.28 0.086 0.28
Draft animals 0.31 0.46 0.28 0.45 0.32 0.46 0.30 0.46
Machinery 0.15 0.36 0.15 0.36 0.15 0.36 0.15 0.36
M (10 km) 0.098 0.30 0.11 0.31 0.16 0.36 0.12 0.32
P 0.10 0.29 0.11 0.30 0.15 0.33 0.12 0.31
R 0.99 0.045 0.99 0.055 0.97 0.099 0.98 0.071
Observations 965 965 965 2895
42
Figure A.8: Number and share of treated households (switchers)
15 km
20 km
25 km
30 km
35 km
40 km
0 50 100 150 200
Households treated (switchers)
when using a standard two-way fixed effects strategy, as opposed to the stacked-by-event approach
in Tables B.9 to B.11 I use various alternative techniques to address the potential threat of
heterogeneous dynamic treatment effects. in Table B.9, I use the program provided by Borusyak
et al. (2021) to reproduce the main findings. in Table B.10, I use the program provided by
Sun and Abraham (2021) for the same purpose. in Table B.11, I use the program provided by
De Chaisemartin and d’Haultfoeuille (2020). In all cases, the results are similar to those obtained
43
Table B.9: Effect of Mining on Output (Borusyak et al. (2021))
R-squared
Observations 2610 2610 2610 2610 2609 2610 2610 2610
∗∗∗ ∗∗ ∗
Note: Standard errors in parentheses are adjusted for clustering at the village level: p < 0.01, p < 0.05, p
< 0.1.
Table B.11: Effect of Mining on Output (De Chaisemartin and d’Haultfoeuille (2020))
in Table B.12, I reproduce the main findings of this paper when adjusting for arbitrary forms of
spatial correction, as proposed by Colella et al. (2019). I use thresholds of 100 and 200 kilometers.
44
Table B.12: Accounting for spatial correlation
Table C.13 decomposes the effect on agricultural labor by age and gender. The results suggest
that the negative effect on agricultural labor is mainly driven by individuals below 30 years of age,
Table C.14 presents the results of a heterogeneity analysis where the full sample is split by the
median of some variables at baseline, creating two groups of roughly equal size for comparison.
Overall, the results suggest that the effects are particularly strong among households with small
farms and poor access to communications (as measured by distance to major roads).
Appendix D. Consumption
Tables D.16 and D.17 presents the effect of small-scale mining on consumption using different
consumption measures. The estimates presented in columns A1-A6 of Table D.16 are the same as
those presented in the main text, and are included here for reference. Columns B1-B6 use nominal
consumption figures that have not been adjusted for local price differences. These estimates reveal
that adjusting for local prices slightly increases the size of the estimates, but are similar enough to
indicate that price level changes are not driving the effects (also, the difference is not statistically
significant).
The consumption measures used in panels C and D also adjust for differences in the demo-
graphic composition of households. This measure may take better account of changes in individual
45
Table C.13: Labor days
46
Table C.14: Heterogeneity
Outcome: Output Land Labor Input exp. Cons. Output Land Labor Input exp. Cons.
Model: (1) (2) (3) (4) (5) (6) (7) (8) (9) (10)
A. Soil quality
Poor soil (“good” quality plots = 0) Better soil (“good” quality plots > 0 )
M (10 km) -124.5 -1.64∗∗∗ -17.5 -56.5∗∗∗ 97.9∗∗ -190.3∗∗∗ -1.05∗∗∗ -25.2∗∗ -13.5 88.9
(94.6) (0.56) (14.6) (17.2) (47.8) (64.8) (0.28) (10.7) (26.5) (70.8)
R-squared 0.35 0.51 0.39 0.38 0.48 0.40 0.56 0.37 0.49 0.57
Observations2502 2502 2502 2502 2502 1938 1938 1938 1938 1936
B. Farm size
Small farm (acres ≤ 3) Larger farm (acres > 3)
R-squared 0.40 0.45 0.44 0.49 0.56 0.35 0.44 0.35 0.43 0.51
47
Observations2634 2634 2634 2634 2632 2343 2343 2343 2343 2343
C. Household size
Small household (members ≤ 5) Larger household (members > 5)
M (10 km) -145.6∗∗∗ -0.99∗∗∗ -19.6∗∗∗ -49.4∗∗∗ 33.0∗∗ -154.8∗ -1.30∗∗∗ -13.1 -17.1 142.9∗
(42.3) (0.32) (7.27) (15.4) (14.9) (81.9) (0.43) (14.5) (26.0) (73.8)
R-squared 0.43 0.50 0.48 0.54 0.53 0.39 0.58 0.37 0.41 0.49
Observations2706 2706 2706 2706 2704 2271 2271 2271 2271 2271
D. Communications
Dist. to major road > 11.7 km)) Dist. to major road ≤ 11.7 km)
M (10 km) -305.9∗∗∗ -1.73∗∗∗ -27.6∗∗∗ -38.9∗∗∗ 136.8∗∗∗ -80.3∗∗ -0.68∗∗∗ -14.6∗ -42.8∗∗ 51.5
(51.4) (0.25) (9.04) (13.8) (39.2) (33.4) (0.24) (8.19) (20.2) (41.1)
R-squared 0.38 0.52 0.42 0.50 0.43 0.45 0.62 0.48 0.44 0.60
Observations2325 2325 2325 2325 2323 2331 2331 2331 2331 2331
Note: All regressions include year-event and household-event fixed effects. Agricultural outcomes refer to Masika season. R2 accounts
∗∗∗ ∗∗ ∗
for the fixed effects. Standard errors in parentheses are adjusted for clustering at the village level: p < 0.01, p < 0.05, p < 0.1.
welfare that follows from the fact that households have different material and nutritional needs.
To produce a measure that considers such differences, the consumption of each household is di-
vided by the weighted sum of household members, using the weights described in Table D.15.
These “adult equivalent units” are intended to approximately capture the different material needs
of household members based on age and gender, and are used by the Tanzania National Bureau of
Statistics. Finally, Table D.17 reproduces the same estimates using a logarithmic transformation
48
Table D.16: Consumption: alternative specifications
49
Table D.17: Consumption: logarithmic transformations
50
Table D.18 reports the estimated effect of small-scale mining on the consumption of different
foods, and their nutritional content. The data comes with several limitations. First, it only in-
cludes food consumed within the household. Second, items are sometimes reported in the census as
a number of units, rather than weight or volume, and measurements are often imprecise. Further-
more, the quality of food products cannot be accounted for. However, despite the limitations, this
data may nonetheless provide a rough approximation of the nutritional content of the reported
food consumption. First, all reported quantities are converted to their approximate weight in
grams. Next, the energy and protein content for each listed food item is calculated using the food
The estimates presented in Table D.18 are imprecise due to the limitations in the data. How-
ever, the coefficient estimates are generally positive, and there is some indication that the overall
consumption of nutrients increased. Columns 1 and 2 refer to nutrients consumed (per capita and
household), while columns 3 – 6 refer to the quantity of four major food categories.
Table D.20 reports the effect of small-scale mining on the expenditure share of each consumption
category reported in Table 3. The results suggest that, while the consumption share of transporta-
tion decreased, the share of food in total consumption expenditure slightly increased. As noted by
Almås (2012), the consumption share of food should decrease with income when relative prices are
held constant. Therefore, these findings suggest that the relative price of food may have increased
in ways not captured by the price index. Table D.19 decomposes the consumption category "other"
reported in Table 3.
51
Table D.19: Consumption shares
Table E.21 presents the effect on in-, out- and net-migration. This data was collected by the
Table E.22 presents the effect of exposure to small-scale mining on land use and the form of
control over agricultural land. The estimates presented in columns A1 and B1 (which are identical)
suggest a negative effect on the total amount of land used or controlled by households of roughly
one acre. This land disappears from the survey, which makes it impossible to determine whether it
was abandoned, sold or put to collective use. The negative effect on cultivated land is larger than
the negative effect on total land controlled by the household, which is explained by small (and
52
statistically insignificant) increases in the amount of land rented out or used for other purposes
(for instance grazing). Regarding ownership structure, the negative point estimate for the total
amount of land used or controlled by households is mainly explained by a reduction in land listed
Table E.23 presents the effect on total land value and the land value per acre. The land
value per acre is computed by dividing the total self-reported land value of agricultural plots
provided by farmers by the self-reported total land area. It is therefore subject to two separate
sources of potential miss-measurement. However, this measure can serve as a rough approximation
for changes in average land values over time under the assumption that the measurement error is
consistent within households. The results suggest that the perceived average value of land decreases
somewhat, both when considering all land and when restricting to land that is being cultivated,
as indicated in columns 3 and 6, but the estimates are small and noisy, and should be interpreted
cautiously.
Table E.24 presents the effect of small-scale mining on variables related to crime and safety.
Column 1 reports the effect on crop theft, and uses a dependent variable indicating whether the
household lost any crops to theft. Column 2 reports the effect on respondents’ stated satisfaction
with crime and safety. The survey asked respondents “How satisfied or dissatisfied would you say
you are with your protection against crime/your safety?” and records responses on a scale between
1 (very satisfied) and 7 (very dissatisfied). The dependent variable used in column 2 is a dummy
53
Table E.23: Self-reported land value
variable indicating whether the household responded with a score between 1 and 3, i.e. “very
satisfied”, “satisfied” or “somewhat satisfied”. Since this survey question was introduced with the
second wave of the TNPS, it omits observations from the TNPS 2008 are omitted.
Table F.25 presents the effects of small-scale gold mining on the weight, height and BMI of
individuals of different ages. If exposure to small-scale gold mining leads to improved nutrition,
then one might expect to find positive effects among young children (aged 0-4 and 5-9), while older
children and adults should remain largely unaffected. Table F.26 instead presents the effects on
weight-for-age, height-for-age and weight-for-height z-scores. Panels A-C use the raw z-score, while
Table F.27 presents the effect of small-scale mining on schooling. The dependent variable is a
54
Table F.25: Effect on weight, height and BMI
55
Table F.26: Effect on z-scores
56
Table F.27: Effect on schooling (respondent currently in school)
57
Appendix G. Additional Robustness Tests
Table G.28 presents key results from this paper when also controlling for rainfall (current and
past year), and access to communications. The variable “Dist. road” captures the distance from
Dist. to road 0.27 0.0056∗∗ -0.015 -0.0079 -0.24 0.000048 -0.00019 -0.00029
(0.42) (0.0028) (0.083) (0.15) (0.15) (0.00054) (0.00037) (0.00030)
Rain lagged -45.1 0.0071 -1.90 6.62 -2.39 -0.075 -0.0014 -0.0085
(47.4) (0.31) (6.82) (27.4) (20.6) (0.050) (0.063) (0.029)
Table G.29 presents results obtained from regressing treatment status Mit on different measures
P recipitation
of rainfall. The variable Rain = M eanprecipitation is calculated as the total 12-month (July-June)
precipitation in the current year divided by average precipitation over all such 12-month periods.
The variable Rain lagged is constructed in the same way, but uses the preceding 12-month period
in the numerator. The specifications presented in column 3 and 6 of Table G.29 include timing
dummies indicating the timing of the onset of the wettest quarter in weeks (1-36) relative to
the mean onset of the wettest quarter. These dummies are intended to capture variation in the
timing of the major rainy season which, in addition to the amount of precipitation, can have
important consequences for agricultural outcomes. The precipitation data is provided along with
the TNPS survey data and comes from the Rainfall Estimates (RFE) database produced by the
National Oceanic and Atmospheric Administration’s Climate Prediction Center (NOAA CPC).
The Tanzanian National Bureau of Statistics (NBS) has matched the NOAA CPC data with GPS-
based household locations before applying the random offsets to the household coordinates, which
are present in the published data. In addition to the remotely sensed precipitation data, Table
G.29 also includes subjective valuations of farmer satisfaction with the timing and amount of
precipitation during the last major wet season (Masika), which have been collected by the TNPS.
The evidence provided by these exercises indicates that the results of this paper are not driven
by shocks related to rainfall or road constructions, and that adverse weather outcomes that could
58
affect agriculture do a relatively poor job in explaining the emergence of small-scale mining sites.
Stated amount:
Stated timing:
Table G.30 presents the results from estimating 1 using leads of M. The purpose of this exercise
is to verify that the estimated effects are not the result of some random noise in the data, and to
rule out anticipatory effects occurring in the period before treatment. The estimates presented in
G.30 show that the leads, or “placebo” treatments, do not produce statistically significant results.
These findings are consistent with the identifying assumptions that underlie the empirical strategy
of this paper and therefore does not reject the internal validity of the main results.
Table G.31 breaks down the total number of eligible households in the study area present in
59
Table G.30: Placebo: leads of M
the first TNPS wave (2008) by treatment statuses, and by whether they were retained or dropped
out in subsequent survey waves. “Switchers” are households that (based on their location in 2008)
were not initially exposed to small-scale mining, but became exposed at some point during the
study period. “Always treated” are households that always were exposed, and “Never treated” are
households that never became exposed. Importantly, the attrition rate is similar for switchers and
non-switchers.
Table G.32 reports the result of regressing dummy variables indicating treatment statuses on a
dummy indicating whether the household is dropped due to attrition. This analysis again indicates
that households that become exposed to small-scale mining (i.e. switchers) do not experience higher
60
Table G.32: Attrition: regressions
Switched 0.019
(0.042)
R-squared
Observations 1100 1100 1100
Figures H.10 and H.9 plot the location of registered PML applications against confirmed small-
scale mining and processing sites. The data was collected by IPIS survey teams visiting a number
of locations between November 2017 and March 2018, and can be used to verify the correspondence
between PML applications and actual mining operations. Since only a limited and non-randomized
sample of locations were visited, the majority of PML locations in the data can be neither verified
nor rejected as real mining operations using this approach. However, for each confirmed mining
operation in the data, it is possible to verify whether it corresponds to a registered PML. I divide
the geographical space into a 0.2 x 0.2 degree grid (roughly 5 x 5 kilometers) and find that,
conditional on the existence of a confirmed site within a section of the grid, the probability that
the same section also contains a PLM is 79.25%. Furthermore, Figure H.11 presents a histogram
over the distance between each confirmed site and the closest neighboring PML. It indicates that
the vast majority of confirmed locations are located at a PML concession, or within close vicinity
of one. In particular, 94 % of confirmed sites are within 5 kilometers of a registered license, and
99.5 % are within 10 kilometers. These findings imply that, conditional on mining having been
confirmed in some particular location, the probability that exists a PML in the immediate vicinity
is very high.
61
Figure H.9: Registered PML applications and confirmed sites: Geita, Mwanza and Shinyanga
Note: Registered PML applications in red and confirmed mining and processing sites in yellow from in the districts
of Geita, Mwanza and Shinyanga
Note: Registered PML applications in red and confirmed mining and processing sites in yellow from in the district
of Mara
62
Figure H.11: Distance to closest PML application
Note: The figure presents a histogram over the distance in kilometers from each confirmed mining or processing site
(from Merket, 2018) to the closest registered application in the PML data.
63
Appendix I. Accounting for GPS Offsetting
Due to the random noise introduced by GPS offsetting in the TNPS data, the true treatment
variable (M) may differ from the observed treatment variable (M̂) which is derived from the scram-
bled GPS coordinates. However, since the error is introduced by a random variable drawn from a
known uniform distribution, it is possible to calculate the probability with which each observation
truly belongs to the treatment group, i.e. the likelihood that Mit = 1, and the reliability with
which it is classified correctly, i.e. the likelihood that M̂it = Mit . This information is obtained by
considering the proportion of all possible true locations of each household that fall within 10 km of
a PML. This proportion can be approximated by the following expression (derived from the area
√
d2it −75 d2it +75 (15−dit )(dit −5)(dit +5)(dit +15))
25 cos−1 10dit + 100 cos−1 20dit − 2
Pit =
25π
where dit is the observed distance between household i in time t and its closest PML using the
scrambled coordinates. Thus, it is possible to closely approximate both the true probability of
treatment Pit ≈ Pr(Mit = 1) and the reliability Rit = |M̂it + Pit − 1| ≈ Pr(M̂it = Mit ) using
the available geographic information, even if exact coordinates are not provided (however, minor
discrepancies could remain due to the additional 5 km offset applied to every 100th cluster and
the existence of other PMLs nearby). Figure I.12 plots M̂it , Pit and Rit against dit .
Figure I.12: Distribution of Pit and Rit around threshold of treatment
1
.8
.6
.4
.2
0
M−hat P R
Table I.33 reports the estimated effect of M on the outcome variables when accounting for the
64
degree of confidence in various ways. Panel A presents the results from the main specification, and
is included the propose of comparison. Panel B substitutes Pit for M̂it as the treatment variable,
but makes no additional adjustments to the sample. Panel C restricts the sample to households
that are correctly classified with at least 50% probability in every time period. Panel D includes
households that are correctly classified with at least 75% probability in every time period. Finally,
panel E restricts the sample further to include only households that can be assigned with certainty
65