March 2014
March 21, 2014
GISC9308-D4b

Mr. Ian D. Smith, M.Sc., OLS, OLIP, EP
Niagara College
135 Taylor Road
Niagara-on-the-Lake, ON L0S 1J0

Dear Mr. Smith,

Re: GISC9308 - Spatial Analysis Statistics - Short Hills Soil pH Analysis Final Report

Geo Datum GIS Consulting and Solutions Inc. (GeoDatum) is pleased to submit the enclosed final report, which provides a soil pH analysis and predictions for the Short Hills Provincial Park. We concluded that the Inverse Distance Weighted technique was best suited to predicting our data. This decision was based on the cross validation numerical errors as well as visual interpretation of the produced surfaces.

Should you have any questions or comments, please feel free to reach me on my cell phone at (289) 929-5312. We look forward to your comments and suggestions.

Respectfully submitted for Niagara College,

Nathan Page, BA
Project Manager
GeoDatum GIS Consulting and Solutions Inc.
nathanpage90@gmail.com

NP/mha
Enclosures:
cc:
Geo Datum GIS Consulting and Solutions Inc. 135 Taylor Road Niagara-on-the-Lake ON L0S 1J0 (289) 929-5312 (905) 327-7988 geodatum.inc@gmail.com
21 March 2014
Contents
1. Introduction
2. Data Collection
2.1. Typified Soil Data
3. Methodology of Analysis
3.1. IDW Methodology
3.2. Kriging Methodology
4. Analysis
5. Conclusions & Recommendations
6. Bibliography
7. Appendices
Figure 1: Major Tributaries to Short Hills Park (Smith I. D., 2012)
Figure 2: Average Typified Soil pH Values
Figure 3: pH sample distribution
Figure 4: Easting coordinate distribution
Figure 5: Northing coordinate distribution
Figure 6: Soil pH Sampling Distribution
Figure 7: IDW Search Neighbourhood
Figure 8: Original pH Cross Validation
Figure 9: Inverse Log pH Cross Validation
Figure 10: Predicted Soil pH (Log Values) Using IDW Technique
Figure 11: Predicted Soil pH (Inverse Values) Using IDW Technique
Figure 12: Semivariogram Log pH Values
Figure 13: Semivariogram Inverse pH Values
Figure 14: Covariance Log pH Values
Figure 15: Covariance Inverse pH Values
Figure 16: Kriging Neighbourhood Parameters
Figure 17: Original pH Values Cross Validation
Figure 18: Inverse pH Values Cross Validation
Figure 19: Predicted Soil pH (Log Values) Using Kriging Technique
Figure 20: Predicted Soil pH (Inverse Values) Using Kriging Technique
Figure 21: IDW Cross Validation Comparison
Figure 22: Kriging Cross Validation Comparison
Figure 23: Predicted Soil pH (Log Values) Using IDW Technique
Figure 24: Predicted Soil pH (Log Values) Using Kriging Technique
Figure 25: Cross Validation of Actual pH Values vs. Predicted Values
1. Executive Summary
This report documents and explains GeoDatum's findings on the Short Hills Provincial Park soil pH predictive analysis. As per the Terms of Reference, two predictive surface techniques in ESRI's ArcMap, Inverse Distance Weighted (IDW) and Kriging, were used to produce raster data layers that interpolate and predict pH values between the sampled data points. To explore the possible predicted surfaces fully, the pH values were also converted to their inverse log values, and those values were run through both predictive techniques as well. The resulting surfaces were very similar because the data points vary so little. The report explains how the collected data points may not accurately represent the typified soil plots, and recommends real-world soil pH sampling within the typified soil plots to determine whether the recorded pH value accurately represents pH throughout the entire plot. The final conclusion is that the Inverse Distance Weighted technique applied to the original pH values produced the most accurate and detailed predictions of soil pH within the Short Hills Park area of interest. Full details and explanations of these decisions follow in the report.
2. Introduction
Short Hills Provincial Park is located about 5 kilometers southwest of St. Catharines, east of the town of Pelham. Trout Unlimited Canada (TUC) intends to restore brook trout habitat in the park and needs to determine the suitability of the four major tributary branches of subwatersheds that exist in the area. Figure 1 below illustrates the location and extent of Short Hills Park, the area of interest (AOI), as well as a depiction of the subwatershed areas.
Habitat restoration will include the analysis of current and recent past habitat conditions, including the soil pH levels that affect brook trout habitat. GeoDatum has been retained by Ian D. Smith to prepare an assessment of soil pH levels within the park area and its vicinity, following the procedures and parameters set out in the Terms of Reference (Smith I., 2014). This report provides that assessment. It is also our understanding that the results will be used in the near future by GeoDatum to aid in the completion of a Brook Trout Habitat Suitability Assessment of the Camp Wetaskiwin subwatershed.
3. Data Collection
3.1. Typified Soil Data
The Soils of the Regional Municipality of Niagara Report (Volume 2) data were used to create typified soil feature classes, based on the existing soil types identified in the Soils of Niagara Region thesis project sourced from the Niagara College Geographic Information Systems Geospatial Management (GIS-GM) program digital archives. The pH levels for each soil type at the different horizons were averaged for each typification; these averages were then cross-referenced to the soil types in the Soils of Niagara Region shapefiles in the archive to produce a pH level distribution based on feature centroids. Figure 2 shows the horizon values and the calculated averages.
Figure 2: Average Typified Soil pH Values

Soil code    Average pH
(no code)    6.6
BFO          6.6
BRT          6.0
BVY          6.6
FRM          6.4
PEL          7.0
TLD          6.7
TUC          6.5
VIT          6.6
Some polygons were deleted from the shapefile attribute table; these polygons either did not have any soil data associated with them or represented non-soil features. The average pH value was attached to each polygon representing the different soil types within the AOI. The resulting matrix of coordinates, soil types and soil pH levels represents the pH levels throughout the park area and its surroundings (see the Locations vs 'z' Values tables in the appendices).

The range of pH (power of hydrogen) levels within the sampled data is 1 pH unit (from a minimum of 6 to a maximum of 7). The distribution of pH levels, as illustrated in the histogram below, is considered normal given that it follows the Gaussian curve. In addition, the mean and the median overlap, which further corroborates a symmetric, normal distribution. The high frequency of values at the mean results in a leptokurtic distribution. No outliers are observed. Refer to Figure 3 below.
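The summary statistics described above can be checked with a short script. The following is a minimal sketch, not the procedure used in ArcMap: the pH tallies are counted from the appendix rows reproduced in this report (which may not cover every sample), and `excess_kurtosis` is a hypothetical helper name.

```python
import statistics

# pH tallies counted from the appendix tables reproduced in this report.
# Assumption: these rows may be an incomplete subset of the full sample set.
ph_counts = {6.0: 2, 6.4: 2, 6.6: 68, 6.7: 9}
ph_values = [ph for ph, count in ph_counts.items() for _ in range(count)]


def excess_kurtosis(values):
    """Population excess kurtosis: m4 / m2**2 - 3 (positive means leptokurtic)."""
    mean = statistics.fmean(values)
    n = len(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / m2 ** 2 - 3.0


summary = {
    "count": len(ph_values),
    "mean": round(statistics.fmean(ph_values), 3),
    "median": statistics.median(ph_values),
    "range": max(ph_values) - min(ph_values),
    "excess_kurtosis": round(excess_kurtosis(ph_values), 2),
}
print(summary)
```

A mean and median that nearly coincide, together with a positive excess kurtosis, would be consistent with the symmetric, leptokurtic description given for the pH histogram.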
Easting coordinate distribution was based on soil type changes, which created clusters of coordinates in areas of high soil diversity. Therefore, the distribution of Easting coordinates, as illustrated in the histogram below, is not considered normal; at best, it is a multimodal distribution. The range of Easting coordinate values within the sampled data is 5740 meters (from a minimum of 638000 to a maximum of 643740). The mean and the median do not coincide. A mesokurtic distribution is very evident. Figure 4 below illustrates the histogram for these data.
Northing coordinate distribution was also based on soil type changes; as with the Easting coordinate distribution, this created clusters of coordinates in areas of high soil diversity. The distribution of Northing coordinates, as illustrated in the histogram below, is not considered normal; at best, it is a multimodal distribution. The range of Northing coordinate values within the sampled data is 5400 meters (from a minimum of 4770900 to a maximum of 4776300). The mean and the median differ slightly. Figure 5 below illustrates the histogram for these data.
The map below (Figure 6: Soil pH Sampling Distribution) portrays the distribution of sample data points throughout the AOI, in relation to the Camp Wetaskiwin subwatershed tributary to Twelve Mile Creek.
4. Methodology of Analysis
Once the sample data points were collected and rendered in ESRI's ArcMap, they were ready to be used in interpolation operations that predict pH values (Z-values) throughout the entire Area of Interest. These operations use the collected point values and their locations relative to one another to predict values for the rest of the surface, where no samples were collected. Even in today's Information Age, collecting data can be highly costly, so techniques have been developed that take the collected points and interpolate values for areas with no data. These techniques use different parameters and interpret the sample points in different ways. For this report the predictive techniques employed were Inverse Distance Weighted and Kriging.

pH values are expressed on a logarithmic scale, so each value represents a power of ten rather than a linear quantity. To explore all aspects of pH, this report runs the predictive techniques on both the original pH values (6.0-7.0) and the inverse log (base 10) values (1,000,000-10,000,000) and determines which set of values is most suitable for predictive operations. The following sections detail the steps and parameters followed when carrying out the Inverse Distance Weighted (IDW) and Kriging techniques.
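The relationship between the two value sets can be illustrated with a small conversion. This is a sketch only, assuming the inverse log value is simply 10 raised to the pH, which matches the stated ranges (6.0-7.0 and 1,000,000-10,000,000); the function names are hypothetical.

```python
import math


def inverse_log(ph):
    """Convert a pH value to its inverse log (base 10) value, i.e. 10**pH."""
    return 10.0 ** ph


def to_ph(inverse_value):
    """Convert an inverse log value back to the original pH scale."""
    return math.log10(inverse_value)


# The end points and the most common value of the sampled pH range.
for ph in (6.0, 6.6, 7.0):
    print(f"pH {ph:.1f} -> inverse log value {inverse_log(ph):,.0f}")
```

A difference of 1 pH unit becomes a factor of ten on the inverse log scale, which is why errors measured on that scale are numerically so much larger.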
4.1. IDW Methodology
The parameters and settings chosen for the IDW predictive technique, and the reasoning behind those choices, are detailed in this section. The predictive tools are located in ArcMap's Geostatistical Wizard, which takes the user through several steps to choose the parameters and settings applied when predicting the unknown values. The first step in Inverse Distance Weighting is to select whether a Weight Field will accompany the data field; this analysis examines only pH, without any possible influencing attributes, and so does not require a Weight Field. The second step in the IDW technique is to choose the search neighbourhood type and the number of neighbours to include in the neighbourhood searches. (See Figure 7 below)
Because of the small size of the Area of Interest (roughly 25 square km), the number of data points collected and the lack of variability in the data values, the neighbourhood parameters were reduced to 5 neighbours with a sector type of 1. Reducing the size of the neighbourhood and the number of neighbours required tightens the allowed influence of neighbouring values, so that each prediction draws on its immediate neighbours rather than on more distant points that realistically have little influence. The same neighbourhood parameters were used for both the original pH values and the inverse log values.
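The neighbourhood choice can be pictured with a small stand-alone IDW routine. This is a minimal sketch and not the Geostatistical Wizard's implementation: it assumes a single search sector (the 5 nearest samples), and the distance power of 2 is an assumption, since the report does not state the power used. The sample points are a handful of rows taken from the appendix tables.

```python
import math

# A few (easting, northing, pH) rows taken from the appendix tables.
samples = [
    (640442.6272, 4771360.022, 6.6),
    (641062.2304, 4770892.441, 6.0),
    (640236.2172, 4774666.13, 6.4),
    (643470.2773, 4771693.271, 6.7),
    (640152.8248, 4771051.962, 6.6),
    (639730.6199, 4770983.822, 6.6),
]


def idw_predict(x, y, points, k=5, power=2.0):
    """Predict a value at (x, y) from the k nearest points, weighted by 1/d**power."""
    # Rank every sample by its distance to the prediction location.
    ranked = sorted((math.hypot(x - px, y - py), value) for px, py, value in points)[:k]
    # A prediction exactly on a sample location returns that sample's value.
    if ranked[0][0] == 0.0:
        return ranked[0][1]
    weights = [1.0 / distance ** power for distance, _ in ranked]
    weighted_sum = sum(w * value for w, (_, value) in zip(weights, ranked))
    return weighted_sum / sum(weights)


print(round(idw_predict(640500.0, 4771200.0, samples), 3))
```

Because every prediction is a weighted average of at most 5 nearby values, it can never fall outside the range of those values, which is one reason a nearly flat data set produces a nearly flat surface.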
The third step in carrying out the IDW predictive technique is cross-validating the actual measured values against the model's predicted values. (See Figures 8 & 9 below)
Because our data values are not normally distributed or related, the cross validation shows our model line (blue line) to be far from the 1:1 line for both the inverse values and the original values. This suggests our model is not ideal, but given how little our data values vary, these were the best results after much exploration of the various technique parameters. For the original values, however, the root-mean-square (RMS) value is much more reasonable; it is close to 0, as opposed to the inverse log RMS of 934376.6, suggesting unbiased predictions (Performing cross validation and validation, 2014). For this reason it can already be seen that the original pH values may be most suitable for predictive operations. Once the parameters and decisions discussed above were made for each set of values, the final predictive surfaces were produced. The results are evaluated in the Analysis section and can be seen on the following pages (Figures 10 & 11).
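Cross validation of this kind can be sketched as a leave-one-out loop: each sample is removed in turn, predicted from the remaining samples, and compared with its measured value. The code below is an illustration under stated assumptions rather than ArcMap's routine; the `idw` helper, the power of 2 and the six appendix rows used as input are all choices made for the example.

```python
import math

# A few (easting, northing, pH) rows taken from the appendix tables.
samples = [
    (640442.6272, 4771360.022, 6.6),
    (641062.2304, 4770892.441, 6.0),
    (640236.2172, 4774666.13, 6.4),
    (643470.2773, 4771693.271, 6.7),
    (640152.8248, 4771051.962, 6.6),
    (639730.6199, 4770983.822, 6.6),
]


def idw(x, y, points, k=5, power=2.0):
    """IDW prediction from the k nearest points (single sector, assumed power)."""
    ranked = sorted((math.hypot(x - px, y - py), v) for px, py, v in points)[:k]
    if ranked[0][0] == 0.0:
        return ranked[0][1]
    weights = [d ** -power for d, _ in ranked]
    return sum(w * v for w, (_, v) in zip(weights, ranked)) / sum(weights)


def cross_validate(points, k=5, power=2.0):
    """Leave-one-out cross validation; returns (mean error, root-mean-square error)."""
    errors = []
    for i, (x, y, measured) in enumerate(points):
        others = points[:i] + points[i + 1:]  # hold out sample i
        errors.append(idw(x, y, others, k, power) - measured)
    mean_error = sum(errors) / len(errors)
    rms_error = math.sqrt(sum(e * e for e in errors) / len(errors))
    return mean_error, rms_error


# Same locations, with values converted to the inverse log scale (10**pH).
inverse_samples = [(x, y, 10.0 ** ph) for x, y, ph in samples]

print("original pH :", cross_validate(samples))
print("inverse log :", cross_validate(inverse_samples))
```

The RMS error is expressed in the units of the input values, so the inverse log set yields a far larger number than the original pH set for the same locations.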
4.2. Kriging Methodology
The settings and parameters used for the Kriging predictive technique are as follows. As with IDW, a Kriging prediction involves multiple steps, each requiring a choice and the reasoning behind it; these are detailed in this section.
Universal Kriging was employed for both the original and inverse log values, on the reasoning that the sample data has irregular spatial variation and so should not be modelled by a simple, smooth mathematical function (Universal Kriging Functionality, 2013). Universal Kriging is most suited to our sample data because it assumes a local trend which varies slowly across the surface (Universal Kriging Functionality, 2013). This assumption suits our relatively unvaried data, which appears to have a continuous and slowly varying trend. When examining the Semivariogram and Covariance of both the original pH values and the inverse values, it was seen that the model and the averaged values were closely correlated, suggesting satisfactory models. (See Figures 12, 13, 14 & 15 below)
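The averaged values plotted in a semivariogram can be reproduced in outline with an empirical semivariogram, where each lag bin holds half the mean squared difference between all pairs of samples separated by roughly that distance. The sketch below is illustrative only: the lag size and number of lags are hypothetical, since the report used the wizard defaults without stating them, and the function name is an assumption.

```python
import math
from itertools import combinations


def empirical_semivariogram(points, lag_size, n_lags):
    """Return (lag centre, semivariance, pair count) for each lag bin.

    points is a list of (x, y, z); semivariance is sum((zi - zj)**2) / (2 * pairs).
    """
    sums = [0.0] * n_lags
    counts = [0] * n_lags
    for (x1, y1, z1), (x2, y2, z2) in combinations(points, 2):
        distance = math.hypot(x1 - x2, y1 - y2)
        lag_bin = int(distance // lag_size)
        if lag_bin < n_lags:  # pairs beyond the last lag are ignored
            sums[lag_bin] += (z1 - z2) ** 2
            counts[lag_bin] += 1
    result = []
    for b in range(n_lags):
        gamma = sums[b] / (2 * counts[b]) if counts[b] else None
        result.append(((b + 0.5) * lag_size, gamma, counts[b]))
    return result


# A few (easting, northing, pH) rows taken from the appendix tables.
samples = [
    (640442.6272, 4771360.022, 6.6),
    (641062.2304, 4770892.441, 6.0),
    (640236.2172, 4774666.13, 6.4),
    (643470.2773, 4771693.271, 6.7),
    (640152.8248, 4771051.962, 6.6),
    (639730.6199, 4770983.822, 6.6),
]

# Hypothetical lag settings: 4 lags of 1000 m each.
for centre, gamma, pairs in empirical_semivariogram(samples, 1000.0, 4):
    print(centre, gamma, pairs)
```

With data that vary as little as these, most pairs differ by zero, so the empirical semivariances stay small at every lag.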
For the Semivariogram and Covariance parameters, the default Nugget settings were used, as well as the default number and size of Lags. The next step in creating these predictive surfaces was to choose and define a neighbourhood. For our purposes we left the default sector type of 4 sectors with an offset of 45 degrees but, as with the IDW neighbourhoods, reduced the number of neighbours to a maximum of 5 in order to tighten the search neighbourhood and more accurately reflect reality. (See Figure 16 below)
The final step in creating Kriging surface predictions requires a Cross-Validation just like in the IDW technique. Again, because our data values are not normally distributed or related, the cross validation shows our Model line (blue line) to be far from the 1:1 line for both the inverse values and the original values. (See Figures 17 & 18 below)
Similar to IDW, this suggests that our model is not ideal, but given the abnormality of our data values these were the best results after much exploration of the various technique parameters. Conversely, however, the RMS standardized value of each is near 1 (1.2), suggesting a good model. In addition, as with IDW, the original values show a much more reasonable root-mean-square value; it is 0.11, as opposed to the inverse log RMS of 934376.6, again suggesting unbiased predictions (Performing Cross Validation and Validation, 2014) and further supporting the notion that the original pH values are more suitable for predictions than the larger inverse values. Once the parameters and decisions discussed above were made for each set of values, the final predictive surfaces were produced. The results are evaluated in the Analysis section and can be seen on the following pages (See Figures 19 & 20).
5. Analysis
When deciding which predictive technique most accurately and realistically predicts unknown soil pH values, certain comparisons can be made between the two techniques, as well as between the two value sets within each technique. This analysis compares the cross validations of each technique's sets of values, visually evaluates and interprets the final surface produced by each technique, and comes to a supported decision on which technique is best suited to the data examined in this report.
Below (Figure 21) is a screenshot of the IDW cross validations for the original pH values and the inverse pH values; the RMS for the original values (table on the left) is very near zero at 0.1. That, coupled with the reasoning discussed in the methodology, confirms that the original values are best for the IDW technique.
When considering the Kriging cross validation results for the original and inverse values, it can be noted that both models underestimate the variability in the predictions, as their RMS standardized values are over 1, at 1.2 (Universal Kriging Functionality, 2013). This is consistent with the observations and assumptions about the data: the data's low variability was taken into consideration when creating the model and, as a result, was overcompensated for. Further investigation is needed to distinguish which of the two value sets is more suitable. When examining the table on the left in Figure 22 (below), it can be seen that the Root Mean Square prediction error is almost the same as the Average Standard Error, off by only 0.03, which would imply that the variability of the data is being assessed correctly.
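The three diagnostics compared here can be computed from the cross validation table as follows. This is a sketch under stated assumptions: the measured values, predictions and standard errors are hypothetical placeholders rather than figures from the report, and `kriging_cv_summary` is an illustrative name.

```python
import math


def kriging_cv_summary(measured, predicted, std_errors):
    """Summarise kriging cross validation with three standard diagnostics.

    rmse: root-mean-square prediction error
    average_standard_error: square root of the mean squared kriging standard error
    rms_standardized: RMS of (error / standard error); above 1 means the model
        underestimates variability, below 1 means it overestimates it
    """
    n = len(measured)
    errors = [p - m for p, m in zip(predicted, measured)]
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    ase = math.sqrt(sum(s * s for s in std_errors) / n)
    rmss = math.sqrt(sum((e / s) ** 2 for e, s in zip(errors, std_errors)) / n)
    return {"rmse": rmse, "average_standard_error": ase, "rms_standardized": rmss}


# Hypothetical cross validation rows (not taken from the report).
measured = [6.6, 6.0, 6.4, 6.7, 6.6]
predicted = [6.5, 6.2, 6.5, 6.6, 6.6]
std_errors = [0.10, 0.12, 0.10, 0.09, 0.10]

print(kriging_cv_summary(measured, predicted, std_errors))
```

When the RMS prediction error and the Average Standard Error are close, the RMS standardized value sits near 1, which is the pattern the comparison above relies on.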
In contrast, for the Inverse model (Figure 22, table on the right) the RMS prediction error differs proportionately more (6%) from its Average Standard Error than the two errors do in the original pH values cross validation. At this point a preliminary conclusion is that the original pH values are the most suitable for predicting surfaces. Visually, the predictive surfaces produced by the two techniques are very similar; both have the same high and low values in the same locations, and their predicted values in between are very similar. The only difference appears to be the smoothness of the interpolated results: as can be seen in Figure 23 below, the IDW surface has some fuzziness and incongruity around the edges of the
When taking a look at the Kriging predicted surface it can be seen that there is much less interpolation and fuzziness in the transition areas from one value to another; much less
For this reason, as well as the cross validations examined previously, it has been decided that the original values predicted using the IDW technique are most appropriate and most accurately predict soil pH values. Furthermore, because our data are flat and unvaried, the predicted results were expected to be unvaried and each technique's results were expected to be very similar; hence the ultimate decision came down to which technique provided a better visual interpretation.
6. Conclusions & Recommendations
Regarding the quality of data collection, certain aspects of the collection method should be considered. The data used were single pH values for entire typified soil plots, which had highly irregular spatial distributions. This assumes that the value recorded for a soil type and polygon is uniform throughout the entire polygon. If multiple real-world pH readings were collected within each typified soil plot, it could be determined whether pH levels vary within the plot, giving more accurate and detailed data points. Such sampling could in turn produce a more exploratory and detailed prediction surface.
7. Bibliography
Kohler, M. f. (2006?). Soils of the Niagara Region. NOTL: Niagara College.
M.S. Kingston and E.W. Presant. (1989). The Soils of the Regional Municipality of Niagara. Ontario Institute of Pedology, Soil and Water Management Branch. Guelph, Ontario: Ministry of Agriculture and Food (Canada).
Performing cross validation and validation. (2014). Retrieved March 18, 2014, from ArcGIS Resource Center: http://help.arcgis.com/en/arcgisdesktop/10.0/help/index.html#//003100000059000000
Smith, I. (2014). Terms of Reference. GISC9308-Deliverable D4a and D4b Major Geostatistics Report.
Smith, I. D. (2012, August). Major Tributaries to Short Hills Park. ON, Canada.
Universal Kriging Functionality. (2013). Retrieved March 2014, from Spatial Analyst: http://spatialanalyst.net/ILWIS/htm/ilwisapp/universal_kriging_functionality.htm
8. Appendices
Locations vs 'z' Values

Soil Code   Easting (m)    Northing (m)   Soil pH (Log H+ Ions)
BFO         640442.6272    4771360.022    6.6
BFO         638144.0852    4771784.341    6.6
BFO         642864.1298    4771632.11     6.6
BFO         641775.4549    4772251.304    6.6
BFO         640458.3153    4772413.52     6.6
BFO         638115.5614    4772435.518    6.6
BFO         638851.8623    4772484.852    6.6
BFO         638454.1135    4772527.562    6.6
BFO         640113.1756    4772589.872    6.6
BFO         639767.5346    4772719.868    6.6
BFO         639711.4618    4772978.564    6.6
BFO         639484.3358    4773123.792    6.6
BFO         639851.5036    4773102.234    6.6
BFO         638008.8245    4773289.302    6.6
BFO         638318.977     4773110.406    6.6
BFO         641589.7813    4773971.451    6.6
BFO         638487.9911    4773837.669    6.6
BFO         638122.2859    4773922.647    6.6
Locations vs 'z' Values

Soil Code   Easting (m)    Northing (m)   Soil pH (Log H+ Ions)
BFO         642153.101     4774050.971    6.6
BFO         640900.691     4774732.547    6.6
BFO         638907.1229    4774785.378    6.6
BFO         640409.7198    4773862.218    6.6
BFO         638663.5794    4775480.1      6.6
BFO         638280.2001    4775372.573    6.6
BFO         638762.4721    4776003.892    6.6
BFO         640104.7013    4776264.455    6.6
BRT         641062.2304    4770892.441    6
BRT         639650.6339    4772350.526    6
BVY         643120.2377    4770932.761    6.6
BVY         639730.6199    4770983.822    6.6
BVY         639188.877     4770981.443    6.6
BVY         640152.8248    4771051.962    6.6
BVY         638946.1122    4771381.221    6.6
BVY         639360.3637    4771167.396    6.6
BVY         643296.0035    4771172.632    6.6
BVY         640057.6381    4771385.514    6.6
Locations vs 'z' Values

Soil Code   Easting (m)    Northing (m)   Soil pH (Log H+ Ions)
BVY         638463.8605    4771393.849    6.6
BVY         638718.1125    4771264.211    6.6
BVY         638551.3114    4771840.245    6.6
BVY         643099.4894    4771751.836    6.6
BVY         638068.4958    4772043.334    6.6
BVY         638128.879     4772242.468    6.6
BVY         638509.753     4772269.139    6.6
BVY         638762.6045    4772213.79     6.6
BVY         640044.9866    4772268.86     6.6
BVY         641781.1645    4771549.26     6.6
BVY         640363.8747    4772746.091    6.6
BVY         640908.7428    4772853.175    6.6
BVY         639691.876     4771963.324    6.6
BVY         638088.4164    4772839.854    6.6
BVY         640441.1276    4773376.516    6.6
BVY         639040.1221    4773200.323    6.6
BVY         640062.7656    4773194.802    6.6
BVY         643652.126     4772762.129    6.6
Locations vs 'z' Values

Soil Code   Easting (m)    Northing (m)   Soil pH (Log H+ Ions)
BVY         638004.675     4774102.428    6.6
BVY         642067.8305    4773238.667    6.6
BVY         639749.8965    4774202.73     6.6
BVY         643647.3591    4774349.089    6.6
BVY         642958.7072    4774335.685    6.6
BVY         642135.6003    4774439.984    6.6
BVY         641122.908     4774883.066    6.6
BVY         643647.4889    4774848.327    6.6
BVY         638676.0764    4775418.328    6.6
BVY         642877.6109    4774951.955    6.6
BVY         643742.0918    4775481.219    6.6
BVY         638092.7432    4775911.652    6.6
BVY         638480.2403    4775863.741    6.6
BVY         639420.8947    4776183.805    6.6
BVY         639320.197     4775051.168    6.6
BVY         642295.9013    4775984.141    6.6
FRM         640236.2172    4774666.13     6.4
FRM         641805.5081    4775471.994    6.4
Locations vs 'z' Values

Soil Code   Easting (m)    Northing (m)   Soil pH (Log H+ Ions)
TLD         643470.2773    4771693.271    6.7
TLD         642323.0676    4773197.85     6.7
TLD         643232.4647    4773285.737    6.7
TLD         643645.9437    4774450.168    6.7
TLD         643089.6039    4774479.332    6.7
TLD         643715.2747    4775330.255    6.7
TLD         640914.9757    4775648.771    6.7
TLD         639502.4827    4775736.445    6.7
TLD         640314.9263    4776179.107    6.7
[series name missing]
Parent Material: Mainly lacustrine silty clay
Usual Classification: Orthic Gray Brown Luvisol, fine clayey, alkaline, strongly calcareous, mild humid to subhumid
Drainage: Moderately well

BRANT (BRT)
Parent Material: Mainly lacustrine silt loam and loam
Usual Classification: Brunisolic Gray Brown Luvisol
Drainage: Well

BEVERLY (BVY)
Parent Material: Mainly lacustrine silty clay
Usual Classification: Gleyed Brunisolic Gray Brown Luvisol
Drainage: Imperfect

FARMINGTON (FRM)
Parent Material: Level to nearly level bedrock plain, overlain by thin veneer of variable sediments (10 to 20 cm variable textures over mainly limestone and dolostone bedrock)
Usual Classification: Orthic Humic Regosol, extremely shallow lithic, mild humid to subhumid
Drainage: Rapid

PEEL (PEL)
Parent Material: 40 to 100 cm lacustrine silty clay over clay loam till
Usual Classification: Gleyed Brunisolic Gray Brown Luvisol
Drainage: Imperfect

TOLEDO (TLD)
Parent Material: Mainly lacustrine silty clay
Usual Classification: Orthic Humic Gleysol, fine clayey, alkaline, strongly calcareous, mild humid to subhumid
Drainage: Poor

[series name missing]
Parent Material: Mainly lacustrine silt loam
Usual Classification: Gleyed Brunisolic Gray Brown Luvisol
Drainage: Imperfect

VITTORIA (VIT)
Parent Material: 40 to 100 cm sandy textures over lacustrine silt loam
Usual Classification: Gleyed Brunisolic Gray Brown Luvisol
Drainage: Imperfect