
Principal Component Analysis

GISC9216 Digital Image Processing


R. Konrad Hunter For: Janet Finlay

GISC9216 Digital Image Processing Principal Component Analysis


February 12, 2014 D2 Principal Component Analysis

Janet Finlay Instructor / Coordinator Niagara College Post Grad GIS - Geospatial Management Niagara College, NOTL Campus 135 Taylor Rd, S.S.#4 Niagara-on-the-Lake, ON, L0S 1J0

Dear Mrs. Finlay,

RE: GISC9216 Deliverable 2 Principal Component Analysis

Please accept this letter as a formal submission of Deliverable 2, Principal Component Analysis, for GISC9216 Digital Image Processing. This submission comprises a formal report which investigates the application of a Principal Component Analysis (PCA).

The findings of this report outline the effectiveness of a PCA for significantly minimizing redundancies within the pixel values of a digital image. Utilizing a PCA increases the computational efficiency of digital image transformation while maintaining the unique values associated with the image. The results of the PCA were used to compare land cover classification from an unsupervised classification of the PCA image with a standard unsupervised classification of the original image of the same location. The PCA image distinguished urban areas more successfully than the original unsupervised image; however, it misclassified agricultural land at various locations. Overall, a PCA is an effective tool for reducing redundancies within the data while providing a new set of pixel values to represent the essential information in a digital image.

If you have any questions regarding these documents or the assignment in general, please feel free to email me at your convenience. Thank you.

Sincerely,

R. Konrad Hunter, B.A.
GIS-GM Candidate
Project Manager: Hunter Geosystems

R.K.H.
Enclosures: i) Report: Principal Component Analysis

konrad-hunter@hotmail.com Hunter Geosystems 349 Queenston St. St. Catharines, ON L2P 2Y1


Table of Contents
1.0 Introduction
2.0 Principal Component Analysis
3.0 Feature Space Image Comparison
4.0 Principal Component Analysis Discussion
5.0 PCA Unsupervised vs. Original Subset Unsupervised
5.1 Classification Results: Urban vs. Agriculture
6.0 Conclusion
References
Appendix A: Unsupervised Classification of Principal Component Analysis (Cooks Bay Area, Ontario)
Appendix B: Unsupervised Classification of Original Image (Cooks Bay Area, Ontario)

List of Tables
Table 1 Feature Space Image Correlation
Table 2 PCA Result

List of Figures
Figure 1 Subset image of Cook's Bay used for analysis
Figure 2 Histograms showing strong correlation
Figure 3 Histograms showing weak correlation
Figure 4 Feature Space Image of Bands 1 - 2 for (a) original image and (b) PCA image
Figure 5 Feature Space Image of Bands 1 - 3 for (c) original image and (d) PCA image
Figure 6 Feature Space Image of Bands 2 - 3 for (e) original image and (f) PCA image
Figure 7 Histograms generated for PCA image result
Figure 8 Urban areas in PCA unsupervised
Figure 9 Urban areas in original unsupervised
Figure 10 Misclassified Urban Areas (PCA image)
Figure 11 Original Subset image (False colour)
Figure 12 Misclassified Agricultural land (PCA Image)
Figure 13 Accurate Classification of Agricultural Fields (Unsupervised)


1.0 Introduction

Principal Component Analysis (PCA) is an image transformation technique that effectively reduces redundancy and compresses remotely sensed, multispectral data (Lillesand and Kiefer, 2008). Essentially, a PCA transforms the original data within a digital image into a new set of data values that can be classified to represent the essential information. Minimizing redundancies among highly correlated image bands improves the computational efficiency of subsequent processing and reduces data storage requirements (Munyati, 2004). The following report examines the application of a PCA using a subset image of Cook's Bay, Ontario (Figure 1).

Figure 1 Subset image of Cook's Bay used for analysis


2.0 Principal Component Analysis

The purpose of executing a Principal Component Analysis (PCA) is to reduce the amount of redundancy within multispectral images (Lillesand and Kiefer, 2008). The original image (Figure 1) has multiple image bands that largely convey the same information. This limits the computational efficiency of later image classification and enhancement, because the same information is analyzed repeatedly. A PCA addresses this by transforming the correlated bands into a new set of uncorrelated components ordered by the amount of variance they contain, so that the information shared by strongly correlated bands is concentrated in the first few components and the remaining low-variance components can be discarded. Figure 2 displays histograms of image bands 4 and 5 from the original image prior to running a PCA. These correlated image bands essentially convey the same information, unlike image bands 1 and 6 (Figure 3). For this assignment, the six image bands of the original subset of Cook's Bay, Ontario (Figure 1) were transformed into a three-band PCA image. Key information is therefore maintained while the image data is compressed, eliminating redundant data and improving the computational efficiency of future image enhancements.
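The six-to-three band transformation described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the ERDAS IMAGINE routine used for the report: the function name `pca_transform`, the synthetic six-band image, and the array layout (rows, columns, bands) are all hypothetical.

```python
import numpy as np

def pca_transform(image, n_components=3):
    """Project a (rows, cols, bands) image onto its leading principal components."""
    rows, cols, bands = image.shape
    pixels = image.reshape(-1, bands).astype(float)
    centered = pixels - pixels.mean(axis=0)          # remove each band's mean
    covariance = np.cov(centered, rowvar=False)      # bands x bands covariance
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)
    order = np.argsort(eigenvalues)[::-1]            # largest variance first
    eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]
    components = centered @ eigenvectors[:, :n_components]
    return components.reshape(rows, cols, n_components), eigenvalues

# Synthetic stand-in for a six-band subset: three underlying signals mixed
# into six correlated bands, plus a small amount of noise (assumed data).
rng = np.random.default_rng(0)
latent = rng.normal(size=(40, 40, 3))
mixing = rng.normal(size=(3, 6))
image = latent @ mixing + 0.01 * rng.normal(size=(40, 40, 6))

pca_image, eigenvalues = pca_transform(image, n_components=3)
retained = eigenvalues[:3].sum() / eigenvalues.sum()  # share of variance kept
```

Because the synthetic bands are built from only three underlying signals, almost all of the variance ends up in the first three components, which mirrors the behaviour the report relies on.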

Figure 2 Histograms showing strong correlation


Figure 3 Histograms showing weak correlation

3.0 Feature Space Image Comparison

Scatter plots, in the form of Feature Space images, provide another effective method for displaying correlations within the image data. Feature Space images show the distribution and concentration of pixel values between two image bands, allowing the analyst to visualize the interband correlations and identify the strong, moderate and weak correlations that exist within a multispectral image. Bright tones within a Feature Space image represent a high density of pixel values and dark tones represent a low pixel density (ERDAS Help, 2013). Table 1 below examines each combination of image bands found in the original image (Figure 1), describing the correlation as well as the level of redundancy shown in the associated Feature Space image.
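A Feature Space image is essentially a two-dimensional histogram of one band against another. The sketch below approximates one with `numpy.histogram2d`; the function name `feature_space_image`, the bin count, and the synthetic bands are assumptions made for illustration and do not reproduce the ERDAS IMAGINE output.

```python
import numpy as np

def feature_space_image(band_x, band_y, bins=64):
    """Return a 2-D grid of pixel counts; high counts correspond to bright tones."""
    counts, _, _ = np.histogram2d(band_x.ravel(), band_y.ravel(), bins=bins)
    return counts

# Synthetic bands (assumed data): band_b closely follows band_a,
# while band_c is unrelated to band_a.
rng = np.random.default_rng(1)
band_a = rng.normal(size=(100, 100))
band_b = band_a + 0.1 * rng.normal(size=(100, 100))
band_c = rng.normal(size=(100, 100))

correlated = feature_space_image(band_a, band_b)     # narrow diagonal streak
uncorrelated = feature_space_image(band_a, band_c)   # broad, round cloud
```

For the correlated pair the counts collapse onto a thin diagonal, so far fewer cells are occupied than for the uncorrelated pair, which is the visual cue used to judge redundancy.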


Table 1 Feature Space Image Correlation

Bands   Correlation            Redundancy
1-2     Strong correlation     High level of redundancy between image bands
1-3     Moderate correlation   High level of redundancy between image bands
1-4     Weak correlation       Low level of redundancy between image bands
1-5     Weak correlation       Low level of redundancy between image bands
1-6     Weak correlation       Low level of redundancy between image bands
2-3     Strong correlation     High level of redundancy between image bands
2-4     No correlation         No redundancy between image bands
2-5     Weak correlation       Low level of redundancy between image bands
2-6     Weak correlation       Low level of redundancy between image bands
3-4     No correlation         No redundancy between image bands
3-5     Weak correlation       Low level of redundancy between image bands
3-6     Weak correlation       Low level of redundancy between image bands
4-5     No correlation         No redundancy between image bands
4-6     No correlation         No redundancy between image bands
5-6     Strong correlation     High level of redundancy between image bands
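The band-pair assessment in Table 1 was made visually from the Feature Space images. A numerical counterpart could be sketched with Pearson correlation coefficients, as below. The thresholds separating strong, moderate, weak and no correlation are assumptions chosen for illustration, as are the function name `describe_band_pairs` and the synthetic image.

```python
import numpy as np
from itertools import combinations

def describe_band_pairs(image, strong=0.8, moderate=0.5, weak=0.2):
    """Label every band pair by the absolute value of its Pearson correlation."""
    bands = image.shape[-1]
    pixels = image.reshape(-1, bands).astype(float)
    corr = np.corrcoef(pixels, rowvar=False)
    summary = {}
    for i, j in combinations(range(bands), 2):
        r = abs(corr[i, j])
        if r >= strong:
            label = "strong"
        elif r >= moderate:
            label = "moderate"
        elif r >= weak:
            label = "weak"
        else:
            label = "none"
        summary[(i + 1, j + 1)] = (label, corr[i, j])  # 1-based band numbers
    return summary

# Synthetic six-band image (assumed data): band 2 is a noisy copy of band 1,
# the remaining bands are independent.
rng = np.random.default_rng(3)
image = rng.normal(size=(50, 50, 6))
image[..., 1] = image[..., 0] + 0.1 * rng.normal(size=(50, 50))

summary = describe_band_pairs(image)
```

Six bands give fifteen unique pairs, the same number of rows as Table 1.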

4.0 Principal Component Analysis Discussion

Following an analysis of the interband correlations, the PCA transformation condensed the original image from six bands to three (for the purpose of this assignment). It is important to examine the percentage of variance lost during the transformation to determine the success of the PCA. For the purpose of reducing redundancy, it is practical to include only the components containing the highest variance among pixel values. The eigenvalues produced by the PCA are used to assess the amount of data retained in the transformation, with the highest variance found in the first three components. The variance explained by each component is displayed in Table 2. The first channel of the PCA has the highest percentage of variance (80.81%) compared to the second (17.09%) and third (1.68%) PCA channels. In total, 99.51% of the variance was maintained in the PCA transformation, suggesting it was successful in compressing the image, redistributing the pixel values from six channels into three and eliminating redundancy without a significant loss of data.
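The cumulative variance can be checked by adding the per-channel percentages quoted above. The short sketch below does this; the function name `cumulative_percent` is hypothetical. Summing the three reported percentages gives 99.58%, marginally different from the 99.51% total quoted above; since Table 2 is not reproduced here, it is assumed that one of the figures was rounded or transcribed differently.

```python
def cumulative_percent(percentages):
    """Running total of the variance explained by successive components."""
    totals = []
    running = 0.0
    for value in percentages:
        running += value
        totals.append(round(running, 2))
    return totals

# Per-channel variance percentages as reported in the text for PCA channels 1-3.
reported = [80.81, 17.09, 1.68]
cumulative = cumulative_percent(reported)   # [80.81, 97.9, 99.58]
discarded = round(100.0 - cumulative[-1], 2)  # variance left in channels 4-6
```

The first two channels alone already account for close to 98% of the variance, so the third channel adds comparatively little.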


Table 2 PCA Result

With the PCA transformation complete, the resulting image can be analyzed using the associated Feature Space images. No correlation exists among the three resulting PCA channels; redundancy has therefore been significantly minimized by the PCA transformation. Each image band now contains unique information, which increases the significance of each band when computing other image classifications. As shown in Figure 4, the strongly correlated bands 1-2 of the original image have been transformed, resulting in no correlation between bands 1-2 of the PCA image.

Figure 4 Feature Space Image of Bands 1 - 2 for (a) original image and (b) PCA image

Figure 5 Feature Space Image of Bands 1 - 3 for (c) original image and (d) PCA image

Figure 6 Feature Space Image of Bands 2 - 3 for (e) original image and (f) PCA image

Figure 5 and Figure 6 also compare the original image bands to the resulting PCA image. The correlations that existed between original image bands 1-3 and 2-3 have been transformed to eliminate redundant data values, and these band pairs show no correlation in the PCA Feature Space images. The PCA transformation was therefore successful in eliminating redundancies within the image band pixel values while retaining the unique data from the original image. Figure 7 displays the histograms for the newly created PCA image. These histograms are consistent with the removal of redundancy among the image bands, as the three PCA channels show no correlation with one another.
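The lack of correlation between PCA channels is a property of the transformation itself and can be verified numerically. The sketch below, using assumed synthetic data and a hypothetical helper `max_offdiag_corr`, compares the largest interband correlation before and after the transformation.

```python
import numpy as np

def max_offdiag_corr(pixels):
    """Largest absolute correlation between any two different columns."""
    corr = np.corrcoef(pixels, rowvar=False)
    off_diagonal = corr - np.diag(np.diag(corr))
    return np.abs(off_diagonal).max()

# Synthetic six-band pixel table (assumed data): three signals mixed into
# six correlated bands with a little noise.
rng = np.random.default_rng(2)
latent = rng.normal(size=(5000, 3))
bands = latent @ rng.normal(size=(3, 6)) + 0.05 * rng.normal(size=(5000, 6))

# Principal components from the singular value decomposition of the
# mean-centered data; the rows of vt are the component directions.
centered = bands - bands.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
components = centered @ vt[:3].T

before = max_offdiag_corr(bands)        # correlated original bands
after = max_offdiag_corr(components)    # effectively zero
```

The value of `after` differs from zero only by floating-point error, which corresponds to the shapeless, uncorrelated Feature Space images described for the PCA channels.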


Figure 7 Histograms generated for PCA image result

5.0 PCA Unsupervised vs. Original Subset Unsupervised

Once the PCA was complete, the PCA image was then classified using an unsupervised classification. An unsupervised classification examines unknown pixels and organizes them into classes based on natural groupings or clusters within the image (Lillesand and Kiefer, 2008). The parameters for the unsupervised classification of the PCA image were set to match the previous unsupervised classification of the original subset image in order to provide a fair comparison of land classification. Both classifications used 12 classes for the image clusters, with the maximum number of iterations set to 10. The same colour scheme was applied to the land type classes in both images to aid the comparison of the unsupervised PCA image and the unsupervised subset image. Due to the high number of similar classes, both images were recoded using the ERDAS IMAGINE Recode function. The final layouts for both images can be found in the Appendices of this report.
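The clustering idea behind an unsupervised classification can be illustrated with a plain k-means loop using the same parameters (12 classes, at most 10 iterations). This is a simplified stand-in written for illustration and is not the ERDAS IMAGINE classifier used in the report; the function name `kmeans_classify`, the random seeding of cluster centres, and the synthetic image are assumptions.

```python
import numpy as np

def kmeans_classify(image, n_classes=12, max_iterations=10, seed=0):
    """Assign each pixel of a (rows, cols, bands) image to one of n_classes clusters."""
    rows, cols, bands = image.shape
    pixels = image.reshape(-1, bands).astype(float)
    rng = np.random.default_rng(seed)
    # Start from randomly chosen pixels as the initial cluster centres.
    centers = pixels[rng.choice(len(pixels), size=n_classes, replace=False)]
    for _ in range(max_iterations):
        # Distance from every pixel to every centre, then nearest centre wins.
        distances = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        new_centers = centers.copy()
        for k in range(n_classes):
            members = pixels[labels == k]
            if len(members):                 # leave empty clusters where they are
                new_centers[k] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break                            # converged before the iteration limit
        centers = new_centers
    return labels.reshape(rows, cols)

# Synthetic three-band image standing in for the PCA result (assumed data).
rng = np.random.default_rng(4)
image = rng.normal(size=(30, 30, 3))
class_map = kmeans_classify(image, n_classes=12, max_iterations=10)
```

Running the same function with identical parameters on the original six-band subset and on the three-band PCA image would mirror the like-for-like comparison set up in this section.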

5.1 Classification Results: Urban vs. Agriculture

In the original image (Figure 1), it is clear that the area is dominated by agricultural fields rather than urban areas. The unsupervised classification of the original image failed to account for urban areas specifically, classifying urban (residential/commercial) land as agricultural land. The distinction between urban and agricultural land was more apparent in the PCA unsupervised image. Figure 8 and Figure 9 compare the results of the classification of the PCA and original unsupervised images with regard to urban areas.


Figure 8 Urban areas in PCA unsupervised

Figure 9 Urban areas in original unsupervised

As mentioned earlier in the report, the PCA effectively reduced redundancies among the pixel values. This helped distinguish residential and commercial land from agricultural land, so that the areas of development were classified as unique features. Because the pixel values for residential and commercial land differ from those of agricultural land, the PCA accounts for the weak correlation of pixel values in the regions of development. However, there are some discrepancies in the results of the PCA, as some agricultural land was classified as residential or commercial (Figure 10). This shows that shortcomings exist within both classification types. Increasing the number of classes for the unsupervised classification could result in a more accurate representation of the variance among land classes within the image. Overall, the PCA displayed urban areas more effectively than the original unsupervised classification.


Figure 10 Misclassified Urban Areas (PCA image)

Figure 11 Original Subset image (False colour)

Figure 10 shows an example of how the PCA unsupervised image misclassified a portion of agricultural land as residential development. This becomes evident when comparing this area of the unsupervised PCA image to the same area in the original subset image (Figure 11), which shows that there is no residential development at this location (at least not to the extent shown by the PCA). The area these images represent is dominated by agricultural fields, so it is important to distinguish between these fields when classifying land types. Both classifications distinguished between these land class types; however, the original unsupervised classification provided a more accurate representation of the variance among the pixel values representing agriculture specifically. Figure 12 shows how various agricultural land types were misclassified as commercial land in the unsupervised PCA image, which becomes apparent when comparing this image to the original unsupervised image (Figure 13). Therefore, the original unsupervised image provided a better representation of agriculture.
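The comparison in this report was made visually. One way such a comparison could be quantified is a cross-tabulation of the two recoded class maps, sketched below. The function name `cross_tabulate`, the two tiny class maps, and the class codes (0 for urban, 1 for agriculture) are hypothetical and are not taken from the report's data.

```python
import numpy as np

def cross_tabulate(map_a, map_b, n_classes):
    """Count pixels for every (class in map_a, class in map_b) combination."""
    table = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(table, (map_a.ravel(), map_b.ravel()), 1)
    return table

# Hypothetical recoded class maps: 0 = urban, 1 = agriculture.
pca_map = np.array([[0, 0, 1],
                    [1, 1, 1]])
original_map = np.array([[0, 1, 1],
                         [1, 1, 0]])

table = cross_tabulate(pca_map, original_map, n_classes=2)
agreement = np.trace(table) / table.sum()   # share of pixels given the same class
```

Rows of `table` correspond to classes in the PCA map and columns to classes in the original map, so off-diagonal cells count pixels on which the two classifications disagree.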


Figure 12 Misclassified Agricultural land (PCA Image)

Figure 13 Accurate Classification of Agricultural Fields (Unsupervised)

6.0 Conclusion

A Principal Component Analysis in remote sensing provides an effective tool for eliminating redundancies among pixel values while maintaining the unique data within a digital image. This function provides increased computational efficiency for digital image transformations and allows the GIS analyst to gain a full understanding of the inherent characteristics of the pixel values within the dataset. The report outlined how the application of a PCA can be useful for land cover classification, while also providing examples of the shortcomings associated with such analysis. In conclusion, the PCA classification provided an improved classification of urban areas in comparison to the original unsupervised image, but misclassified many of the agricultural land classes. A PCA can be an effective tool for digital image classification of urban areas specifically, whereas other classification methods may provide more accurate results for other land classes. Therefore, the requirements of an investigation involving land cover classification must be fully understood before determining whether or not a PCA will yield accurate results.


References
Lillesand, T. and Kiefer, R. (2008). Remote Sensing and Image Interpretation. New Delhi: John Wiley & Sons Inc.

Munyati, C. (2004). Use of Principal Component Analysis (PCA) of Remote Sensing Images in Wetland Change Detection on the Kafue Flats, Zambia. Geocarto International. Retrieved February 3, 2014 from http://www.geocarto.com.hk/cgi-bin/pages1/sep04/p11.pdf.

ERDAS Help. (2013). Feature Space Image. Retrieved February 10, 2014 from ERDAS IMAGINE 2013.


Appendix A: Unsupervised Classification of Principal Component Analysis (Cooks Bay Area, Ontario)


Appendix B: Unsupervised Classification of Original Image (Cooks Bay Area, Ontario)
