MANAGING LAYER-UPON-LAYER – AN OPEN-SOURCE APPROACH TO REMOTE SENSING DATA MANAGEMENT

Andrew Haywood, Kristen Thrum, Andrew Mellor
Victorian Department of Sustainability and Environment
PO Box 500, East Melbourne, Victoria 3002
Phone 03 9637 6980, Fax 03 9637 8117
Andrew.Haywood@dse.vic.gov.au

Abstract

Victoria’s new Forest and Parks Monitoring and Reporting Information System (FPMRIS) is designed to assess and monitor the extent, state and sustainable development of Victoria’s forests and parks. As part of this system, a suite of open-source applications is being developed to store, archive, prepare, process and analyse remote sensing and field data. The ultimate goal is a semi-automated approach that turns raw data into timely and usable scientific information about the status and trends of Victoria’s forests and parks, available to managers for years to come. Remote sensing imagery, derived products (including land cover classifications and error matrices) and ancillary data will comprise the majority of data to be managed by the FPMRIS. It is anticipated that hundreds of thousands of raster layers will be generated, and it is therefore imperative that the data management system incorporates tools for version control and archiving. The FPMRIS open-source tools and applications include GRASS, CVS, Python, UMN Mapserver, Proj4, R, GDAL, OGR, Geotools and PostGIS. The current status of the system will be presented, as well as future developments and a more general discussion of the lessons learnt using open source to manage and process remote sensing data.

Introduction

Approximately a third (8.3 million ha) of Victoria’s land mass is covered by forest, of which 3.2 million hectares is classified as State Forest and 3.5 million hectares as national parks and other reserves. Privately owned forest accounts for 1.2 million hectares of largely native forest and 440 000 hectares of plantations (predominantly Pinus radiata and Eucalyptus globulus) (Department of Sustainability and Environment 2009). The forests, parks and reserves are managed for wood production and the provision of non-wood production values including recreation, biological and landscape diversity. Since these forests, parks and reserves provide many multi-value functions there is a necessity to monitor their sustainable management and to understand the causes of change.

At the time of publication, the Victorian Department of Sustainability and Environment (DSE) is responsible for the sustainable management of public land in Victoria, including the public forest, parks and reserves estate. As a consequence, DSE engages in a number of processes to monitor the sustainability of Victoria’s public land. These include Victoria’s State of the Forests Report (Department of Sustainability and Environment, 2008), State of the Parks Report (Parks Victoria, 2007) and State of the Environment Report (Commissioner for Environmental Sustainability, 2008), the Sustainability Charter for Victoria’s State Forests (Department of Sustainability and Environment 2006) and associated Criteria and Indicators for Sustainable Forest Management in Victorian Forests (Department of Sustainability and Environment 2007). These reporting mechanisms are designed to enable Victoria to critically assess and evaluate progress towards achieving its sustainable management objectives and targets. To support these reporting processes a Forest and Parks Monitoring and Reporting Information System (FPMRIS) has been developed.
The FPMRIS utilises a systematic permanent plot-based sampling strategy located on a statewide grid (FPM&R Team 2009a). A combination of ground plots, high resolution imagery plots and remote sensing data will be used to capture a set of basic attributes that are used to monitor and report the extent, state and condition of Victoria’s public forests and parks in a timely fashion. Within the FPMRIS, thousands of spatial layers are generated, and it is therefore imperative that the system contains version-controlled data management processes to enable timely and efficient reporting of the extent, state and condition of Victoria’s public forests and parks. This article introduces the FPMRIS and presents the main characteristics of the project. It first describes the FPMRIS and its data sources, then presents the current status of development and gives an overview of how reporting processes involving thousands of layers are handled.

Forest and Parks Monitoring and Reporting Information System (FPMRIS)

The main objectives of the FPMRIS are:
• to improve knowledge of environmental state and ecosystem services;
• to understand interactions between forests and their natural and socioeconomic environment;
• to identify, monitor and report on potentially harmful impacts and threats;
• to evaluate progress and co-ordination of DSE forest-relevant policies; and
• to report on sustainable forests and parks management and specific policy objectives according to state and national environmental agreements.

Following the release of the Strategic Plan for the FPMRIS (FPM&R Team 2009d), a series of key decisions were made about the approaches to guide both the administrative development and technical evolution. These decisions ranged from the organisational (such as who will carry out certain functions) to the technical (including the data sources and how these will be incorporated into the system). The design includes the development of a priority list of key attributes to be monitored and reported (Table 1). The three primary data sources (or ‘tiers’) in the FPMRIS are listed, together with data primitives and their associated monitoring themes. Having reached this point, it was possible to commence design of an information (management) system (FPM&R Team 2009c). The decisions that affected the design of the information system included the following:
• The FPMRIS will be reliant on data maintained by other agencies and departments;
• The need for a network of links to external data and service providers;
• Data from implementation projects should be analysed within integrated “off-the-shelf” open-source software; and
• Systems should be able to carry out reporting in a LINUX/UNIX environment.

The system operates as an independent, integrative mechanism to synthesise a range of externally generated inputs, those inputs representing the outputs of sub-models. While such a process provides for the independent development of sub-models, the eventual goal of the FPMRIS will be the integration of the sub-models themselves, not just the sub-model outputs.

Table 1: Key Attributes Reported within the FPMRIS

Data Sources | Data Primitives
Ground: Large Tree Plot (Tier 1) | Tree DBH; Tree Height; Tree Species; Diameter Distribution; Tree Mortality; Crown Class; Crown Health Class; Coarse Woody Debris; Slash Piles
Ground: Small Tree Plot (Tier 1) | Stumps and small trees
Ground: Vegetation Quadrants (Tier 1) | Dead plants frequency/height; Living plants frequency/height
Ground: Bird Survey (Tier 1) | Bird abundance; Bird diversity
Ground: Soil Pits and Sampling (Tier 1) | Soil type; Soil Nutrient Status; Duff; Fine woody debris
High Resolution Remote Sensing (Tier 2)¹ | Forest Type; Forest Structure; Land cover; Disturbance
Moderate Resolution Remote Sensing (Tier 3) | Forest extent and cover

Monitoring Themes: Tree Growth; Tree Mortality; Site Productivity; Above-ground Biomass; Below-ground Biomass; Flora Diversity; Canopy Health; DWD; Below-ground Carbon; Nutrient Status; Fauna Diversity; Growth Stage; Old Growth; Canopy Disturbance; Fragmentation; Forest Area by Forest type

System Software

The FPMRIS reporting procedures are largely performed within a UNIX operating system and are based on the integration of a range of tools. These tools include version control software (CVS); GIS/remote sensing software (GRASS, UMN Mapserver, R, GDAL, PostGIS); a relational database management system (PostgreSQL); and a scripting language (Python) (Figure 1).

¹ Unless otherwise specified, high resolution and moderate resolution refer specifically to spatial resolution (the measure of the smallest area identifiable on an image as a discrete separate unit).

[Figure 1 shows the system components: CVS (version control); aspatial data (documents, tables); a database engine (tables, attributes); external data and models; raster and vector data; GRASS; geostatistics and predictive models (internal); Mapserver and Apache server for publishing interactive map applications; scripting; and spatial data.]

Figure 1: Schematic Figure of FPMRIS System Software

System Environment

Wherever possible, material that contributes to the FPMRIS (data collections, studies, models etc.) is held in a relational database management system. This contributes to the openness and verifiability of the FPMRIS. The FPMRIS is fundamentally a purpose-built GIS/Remote Sensing system with the capacity to interface with a range of models, both internal (integrated within the system) and external (applied independently). The principal form of interface is the look-up table (an array of data), which provides the capacity to use models that often reside outside the FPMRIS. Tracking of data sources enables verification of inputs to the FPMRIS. All relevant spatial and tabular data are either held within, or available on transfer to, the FPMRIS data libraries (input spatial data are generally held in their original raw form, e.g. shapefile, GeoTIFF). Most of the tabular data is in the form of either look-up tables, used to define time course of change and yield curves, or raw data, such as areas of forest clear cut annually. The look-up tables, whether generated externally or from within the FPMRIS, are stored and operated in a similar fashion. In all cases, the methods used in deriving the tables are recorded in a Decision Support model within the database.

Data Types and Flows

The data used within the FPMRIS are in various formats and resolutions. Key data types that are identified are:

Programmatic data:
• Ground plots – point data
• Photo plots – raster and polygon data
• Satellite image analysis – raster and polygon data
• Modelled estimates – raster and polygon data

Non-programmatic data:
• Proportional estimates
• Topographic, Environmental ancillary data – raster and polygon data
• Statistics

Programmatic and non-programmatic data are brought together to provide a regional estimate (Figure 2), which is essential for reporting procedures. The suite of tools described in the System Software section is applied to cope with this mixture of formats and resolutions in the system. Issues of resolution are dealt with in a conservative way by taking a bioregion as the base reporting unit. This approach acknowledges the limitations imposed by the coarseness of some core data: fine resolution data can be aggregated safely to the coarsest level, but initially coarse resolution data cannot be subdivided further without a process that adds meaning. The statistical averaging “census” (rather than surface interpolation) approach generally applied means that the coarsest data is at the sub-regional scale resolution. This is therefore the finest resolution of data used in reporting.

Access, Storage and Retrieval

As previously mentioned, the FPMRIS integrates a range of data types to provide data for reporting procedures. For example, the basic units for reporting purposes are bioregions, which are spatial aggregations, or polygons, of variable size. Within each reporting unit or bioregion, there are point, polygon and raster data of varying resolution and derivation. The lineage of all data entering the system is fully documented, as are the components of any derived or secondary data (FPM&R Team, 2009b). The Australian Standard for documenting spatial data, the ANZLIC metadata standard (ANZLIC, 2010), is applied to all spatial data. Some aspatial elements, such as models or decision processes, are documented according to appropriate bibliographic standards. The documentation of models is further addressed through detailed publication of methods, preferably involving a peer-reviewed process.

[Figure 2 shows the data flow: within each accounting unit (IBRA bioregion), spatial data (forest accounting cells, ground plot point data and photoplot data) are combined with models, equations and tabular data to produce accounting cell estimates; cells are combined to generate bioregion estimates, and bioregional estimates are combined and reported.]

Figure 2: FPMRIS Data Flows – The system combines spatial data, such as georeferenced sample points, with aspatial data, such as a proportional estimate for a bioregion
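The two combining steps named in Figure 2 can be illustrated with a minimal Python sketch. It is not the FPMRIS implementation: the record layout, the bioregion and forest type names and the area figures are invented for illustration, and plain summation stands in for whatever estimator the system actually applies.

```python
from collections import defaultdict

# Hypothetical accounting-cell estimates: (bioregion, forest_type, area_ha).
# All names and numbers are made up for illustration.
cell_estimates = [
    ("Bioregion A", "Forest type 1", 12.5),
    ("Bioregion A", "Forest type 1", 7.5),
    ("Bioregion A", "Forest type 2", 4.0),
    ("Bioregion B", "Forest type 1", 10.0),
]


def combine_cells(cells):
    """Sum cell-level areas into one estimate per (bioregion, forest type)."""
    totals = defaultdict(float)
    for bioregion, forest_type, area_ha in cells:
        totals[(bioregion, forest_type)] += area_ha
    return dict(totals)


def combine_bioregions(bioregion_totals):
    """Sum bioregion estimates into one overall figure per forest type."""
    overall = defaultdict(float)
    for (_bioregion, forest_type), area_ha in bioregion_totals.items():
        overall[forest_type] += area_ha
    return dict(overall)


bioregion_totals = combine_cells(cell_estimates)
state_totals = combine_bioregions(bioregion_totals)
```

Aggregation only ever moves from finer to coarser units here, which mirrors the conservative treatment of resolution described above.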

External access to the FPMRIS is provided through an on-line mapping system which accesses the estimates for each sampling unit (bioregion by soil type and bioregion by forest type) and the documentation stating the lineage of the data which makes up the reporting units. General access to this system is provided through a UNIX based UMN-Mapserver (or equivalent) application. Various levels of internal access are provided by the system, which meets the requirements for transparency and verification of estimates (Figure 3). Models are run off-line and the results are verified and certified by appropriate experts. All intermediate and derived data are stored in a relational database management system, allowing rules regarding data relationships and data security to be fully enforced. The system stores and manages spatial as well as tabular data (including metadata, methods and key decision processes). Specific versions of data are archived and available for future review.

[Figure 3 shows plot data, photo plot data, image data, tabular data, models, equations and analyses, together with cell estimates, bioregion estimates and metadata, held under secure access, with an Internet GIS providing general access to maps, documents and tables.]

Figure 3: The FPMRIS has various levels of access. All data is subject to strict version control, with updates occurring through authorised procedures. All data is made available for verification through standard audit procedures. Data is stored centrally.

Spatial Analysis

Spatial analysis is a key requirement of the FPMRIS and is essential for legislative reporting processes such as the State of The Forest Reporting. To facilitate this process, two essential types of reports are produced: data summaries and interpretive reports. Each has an important role in making data collected for the FPMRIS available to decision makers.

Data summaries are brief, comprehensive reports of essential data collected for the monitoring program by internal or external data-collection projects. The primary intent of a summary is to present data in an organised and useful manner. Some evaluations of the significance of the results may also be presented, if readily apparent. Data summaries should be prepared for each monitoring theme each year, or as appropriate to the resource being monitored and therefore can be pre-planned (FPM&R Team, 2010).

Interpretive reports provide an evaluation of the significance of status and trends emerging in the monitoring data, as identified in data summaries and provide a basis for legislative reporting. Interpretive reports present a synthesis of monitoring results and statements of their implications critical to management processes and will be used to change plans, direction, or policies and contribute to budgetary and other decisions (FPM&R Team, 2010).

The spatial analysis and subsequent reporting needs to support queries both at a state level and at a much finer level. The analysis will generally deal with spatial operations that include “by” or “within” (union or intersection) spatial operands operating on area. For the purposes of this article, the focus will be on the area reporting associated with the FPMRIS Land Cover Classification System. The FPMRIS can answer many reporting questions. Examples of these reporting questions include:
• What is the total forested area within a specified region?
• What is the area of forested types within a specified region?
• What is the area of forest type by growth stage within a specified region?
• What is the area of forest types by protection status within a specified region?
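As a worked illustration of how such area questions reduce to counting, the Python sketch below converts per-cell class labels into hectares. It assumes the 5 m by 5 m grid cells used for the high resolution tiles in this article, so each cell covers 25 m² and a tile of 160 000 cells covers 400 ha. The class labels and cell counts are invented, and the function name is hypothetical.

```python
from collections import Counter

CELL_SIDE_M = 5                    # cell side assumed from the 5 m grid
CELL_AREA_M2 = CELL_SIDE_M ** 2    # 25 square metres per cell
M2_PER_HA = 10_000


def area_by_class(cell_classes):
    """Return {class label: area in hectares} from per-cell class labels."""
    counts = Counter(cell_classes)
    return {cls: n * CELL_AREA_M2 / M2_PER_HA for cls, n in counts.items()}


# Hypothetical tile of 160 000 cells with invented labels.
cells = ["forest"] * 100_000 + ["non-forest"] * 60_000
areas = area_by_class(cells)
tile_area_ha = sum(areas.values())
```

The same counting applies to combined classes: forest type by growth stage or by protection status only changes the label attached to each cell.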

When reporting spatial analyses, there are also a number of issues to overcome. Spatial analysis requires thousands upon thousands of spatial overlays, and mixed raster and vector data make spatial overlays difficult. There are also large amounts of attribute data stored in external databases, and data collection is conducted over a long period of time, requiring spatial analysis to be performed at regular intervals. The general approach for reporting 2 km high resolution tiles is as follows:

1. Version control raw data and scripts only. Derived data will not be archived; rather, the input data and processes will be version controlled and archived. Using a date provides logical version control and is usually formatted as YYYYMMDD. Versions can also be designated by adding a number to the file name, for example v1.0 for the first version. Version control software such as CVS can eliminate the manual work of differentiating multiple versions of a document by appending modifying characters to the file name. Such software applications track changes made to a document, add comments related to the different document iterations, and retrieve the document at any recorded stage of development.

2. Load vector FPMRIS spatial data into a spatial database (PostGIS). The FPMRIS vector plot data is loaded into PostGIS and UTM zones are created to partition the data. GRASS is then used to publish the data as raster images.

3. Re-sample the high resolution data. UMN Mapserver is used as a “vector to raster” engine to create raster images of a FPMRIS photo plot layer and a FPMRIS plot grid. Each photo plot has a corresponding “grid file”. A 5 m by 5 m uniquely coloured grid (160 000 cells) is created and the grid files are aggregated into UTM zones using ogrtindex. The index files are referenced in a Mapfile. The WMS GetMap request contains a bounding box of 2100 m by 2100 m with a width of 2100 pixels and a height of 2100 pixels. The request returns a one metre resolution image in which each 5 m by 5 m cell in the grid has 25 pixels. This approach allows spatial data to be stored in a text file.

[Figure 4 shows grid data (5 m cell array) and photoplot spatial data (PostGIS + SLD) processed by, e.g., UMN Mapserver or GRASS to give a plot image resampled to a 5 m resolution grid.]

Figure 4. Re-sampling the high resolution data

The output CSV file is re-sampled to find, for each grid cell, the plot colour covering the highest number of pixels. The resulting document contains 160 000 rows (one for each 5 m by 5 m cell), each consisting of a re-sampled colour pair. The colour representing the plot layer in each record is replaced by its unique polygon identifier. The result is a flat file representing the re-sampled grid image (Figure 4).

4. Load re-sampled data into a RDBMS (PostgreSQL) and run simple SQL queries to generate FPMRIS reports. Finally, the re-sampled data is loaded into a RDBMS and simple SQL queries are run to generate data summary reports to fulfil the Victorian DSE’s legislative reporting obligations such as the State of the Forest Report (Department of Sustainability and Environment, 2009). An example of a simple SQL query to generate a data summary (Table 2) is shown below.

    -- Count grid cells for each combination of the four themes on one tile.
    SELECT DISTINCT
        'plot1_pelo'   AS tile_theme_id,
        c1.theme_value AS landcover_value,
        c2.theme_value AS land_tenure_value,
        c3.theme_value AS fmz_value,
        c4.theme_value AS protected_status_value,
        count(*)       AS value_count
    FROM cell_values c1, cell_values c2, cell_values c3, cell_values c4
    WHERE c1.tile_theme_id = 'plot1_landcover_value'
      AND c2.tile_theme_id = 'plot1_land_tenure_value'
      AND c3.tile_theme_id = 'plot1_fmz_value'
      AND c4.tile_theme_id = 'plot1_protectedstatus_value'
      -- Join the four theme rows that describe the same grid cell.
      AND c1.cell_num = c2.cell_num
      AND c2.cell_num = c3.cell_num
      AND c3.cell_num = c4.cell_num
    GROUP BY c1.theme_value, c2.theme_value, c3.theme_value, c4.theme_value;

Table 2: Data Summary generated from simple SQL query

tile_theme_id | landcover_value  | land_tenure_value | fmz_value        | protected_status_value | value_count
plot1_pelo    | 1162986_00040556 | 1162986_00000038  | 1162986_00000462 | 1162986_00000469       | 445
plot1_pelo    | 1162986_00040559 | 1162986_00000038  | 1162986_00000462 | 1162986_00000469       | 326
plot1_pelo    | 1162986_00040560 | 1162986_00000038  | 1162986_00000462 | 1162986_00000469       | 619
plot1_pelo    | 1162986_00040561 | 1162986_00000038  | 1162986_00000462 | 1162986_00000469       | 683
plot1_pelo    | 1162986_00040577 | 1162986_00000038  | 1162986_00000462 | 1162986_00000469       | 396
plot1_pelo    | 1162986_00040578 | 1162986_00000038  | 1162986_00000462 | 1162986_00000469       | 300
plot1_pelo    | 1162986_00040579 | 1162986_00000038  | 1162986_00000462 | 1162986_00000469       | 780
plot1_pelo    | 1162986_00040580 | 1162986_00000038  | 1162986_00000462 | 1162986_00000469       | 1033
plot1_pelo    | 1162986_00040581 | 1162986_00000038  | 1162986_00000462 | 1162986_00000469       | 1314
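Steps 3 and 4 of the approach can be sketched end to end in Python. This is an illustration under stated assumptions, not the FPMRIS code: the server URL, layer names and EPSG:28355 reference system are placeholders, the pixel data are invented, SQLite stands in for PostgreSQL so that the example is self-contained, and the query joins only two themes rather than four. The function names `getmap_url`, `majority_per_cell` and `summarise` are hypothetical.

```python
import sqlite3
from collections import Counter
from urllib.parse import urlencode

TILE_M = 2100                        # bounding-box side in metres (from the text)
CELL_M = 5                           # grid-cell side in metres (from the text)
PIXELS_PER_CELL = CELL_M * CELL_M    # 25 pixels per cell at 1 m per pixel


def getmap_url(base_url, layers, minx, miny, srs="EPSG:28355"):
    """Build a WMS 1.1.1 GetMap URL for one tile at one metre per pixel.

    base_url, layers and srs are placeholders, not the FPMRIS configuration.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": layers,
        "STYLES": "",
        "SRS": srs,
        "BBOX": f"{minx},{miny},{minx + TILE_M},{miny + TILE_M}",
        "WIDTH": TILE_M,
        "HEIGHT": TILE_M,
        "FORMAT": "image/png",
    }
    return base_url + "?" + urlencode(params)


def majority_per_cell(pixels):
    """Map each grid cell to the plot value covering most of its pixels.

    pixels is an iterable of (cell_num, plot_value) pairs, one per pixel.
    """
    votes = {}
    for cell_num, plot_value in pixels:
        votes.setdefault(cell_num, Counter())[plot_value] += 1
    return {cell: counts.most_common(1)[0][0] for cell, counts in votes.items()}


def summarise(rows):
    """Load (tile_theme_id, cell_num, theme_value) rows and count cells per
    land cover and land tenure combination with a self-join on cell_num."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE cell_values "
        "(tile_theme_id TEXT, cell_num INTEGER, theme_value TEXT)"
    )
    con.executemany("INSERT INTO cell_values VALUES (?, ?, ?)", rows)
    result = con.execute(
        """
        SELECT c1.theme_value, c2.theme_value, count(*)
        FROM cell_values c1, cell_values c2
        WHERE c1.tile_theme_id = 'plot1_landcover_value'
          AND c2.tile_theme_id = 'plot1_land_tenure_value'
          AND c1.cell_num = c2.cell_num
        GROUP BY c1.theme_value, c2.theme_value
        ORDER BY c1.theme_value, c2.theme_value
        """
    ).fetchall()
    con.close()
    return result


# Toy input: cell 1 is mostly "forest", cell 2 is entirely "water" (invented).
pixels = [(1, "forest")] * 20 + [(1, "water")] * 5 + [(2, "water")] * 25
landcover = majority_per_cell(pixels)
rows = [("plot1_landcover_value", c, v) for c, v in landcover.items()]
rows += [("plot1_land_tenure_value", c, "state_forest") for c in landcover]
summary = summarise(rows)
url = getmap_url("http://example.org/cgi-bin/mapserv", "photoplot,grid",
                 500000, 5800000)
```

Once every theme is reduced to one value per cell number, the overlay becomes an ordinary relational join, which is what lets plain SQL replace repeated raster and vector overlays.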

Advantages of this approach

This approach allows the combination of vector and raster data to be spatially analysed, allows the user to do spatial queries using SQL, utilises the power of a RDBMS, allows spatial data to be easily joined (via SQL) to attribute data and is easily scalable because open source software is used. This approach is also supported by version control systems and is very efficient in terms of speed.

Lessons Learnt from using an Open Source Approach

A number of lessons have been learnt through the open source approach to remote sensing data management. Governance is critical and should be established. Determining governance involves identifying and defining responsibilities, accountabilities and the method of implementation. Human capital is also important, with capacity building vital. In terms of the licensing model (open source vs. proprietary), many established arguments against open source are no longer relevant. For example, the quality and documentation of open source software has increased dramatically over the past few years, and open source software is capable of providing enterprise solutions and leads the way in web development. Added to this is access to a global community of information, users and forums. The development of the two applications described in this presentation took a development team of three members several months.

An advantage of proprietary software is that it comes ready made. However, some customisation by a skilled developer is still required. For example, ArcServer provides “out-of-the-box” web mapping solutions and also Application Programming Interfaces (APIs) to enable web mapping and server development. This increase in interoperability comes at a cost: specialist skills and knowledge are still required for customisation, and full control is maintained by the vendor. In each case human resources are the most critical factor. At DSE, skilled human resources in open source web development already existed. Thus open source is a much more viable option for the Victorian DSE, providing more flexibility at largely reduced costs. The choice between implementing open source and proprietary software becomes a policy decision based on the current infrastructure, capabilities and preference of the organisation. Strict adherence to either open source technologies or proprietary solutions is not necessarily a rational policy, as interoperability between the two is possible.

References

ANZLIC, 2010. Available online at: http://www.auslig.gov.au/ (accessed 17 July 2010)
Commissioner for Environmental Sustainability, 2008. State of the Environment Victoria Report 2008, Commissioner for Environmental Sustainability, Melbourne.
Department of Sustainability and Environment, 2006. Sustainability Charter for Victoria’s State forests. Department of Sustainability and Environment, Melbourne.
Department of Sustainability and Environment, 2007. Criteria and Indicators for Sustainable Forest Management in Victoria. Department of Sustainability and Environment, Melbourne.
Department of Sustainability and Environment, 2009. Victoria’s State of the Forests Report 2008. Department of Sustainability and Environment, Melbourne.
FPM&R Team, 2009a. A Grid-based sample design for monitoring Victoria’s forests (v1), Technical Report 5, Department of Sustainability and Environment, Melbourne.
FPM&R Team, 2009b. Data Management Plan for the FM&RIS (v1), Technical Report 6, Department of Sustainability and Environment, Melbourne.
FPM&R Team, 2009c. FM&RIS System Design (v1), Technical Report 4, Department of Sustainability and Environment, Melbourne.
FPM&R Team, 2009d. Strategic Plan for Implementing a Forest Monitoring and Reporting Information System in Victoria 2009 – 2010 (v1), Technical Report 1, Department of Sustainability and Environment, Melbourne.
FPM&R Team, 2010. Components of the Forests and Parks Monitoring Program, Department of Sustainability and Environment, Melbourne.
Parks Victoria, 2007. State of the Parks Report May 2007. Parks Victoria, Melbourne.
