THE USE OF OPEN SOURCE GEOSPATIAL SOFTWARE WITHIN THE REMOTE SENSING CENTRE, QLD

Rebecca Trevithick and Sam Gillingham
Queensland Department of Environment and Resource Management
80 Meiers Rd, Indooroopilly, QLD, 4068
rebecca.trevithick@derm.qld.gov.au

Abstract

The Remote Sensing Centre (RSC), within the Queensland Department of Environment and Resource Management (DERM), is one of the largest remote sensing groups in Australia. The scope of the RSC operation is considerable, both in terms of data managed and projects undertaken. The centre undertakes extensive processing of large archives of remotely sensed imagery, along with manual image interpretation and significant field work, to produce a variety of spatial products. These operations rely heavily on the centre’s core processing and data management systems, which have been developed primarily around open source tools such as Linux, PostGIS, QGIS, GDAL/OGR, Python and R. While commercial software packages such as ERDAS Imagine, ESRI ArcGIS and ENVI continue to play an important role, the centre has largely moved away from proprietary software for its automated processing. This paper discusses the operational application of the open source systems in place at RSC and our experience with their implementation, use and resource considerations.

Introduction

The Remote Sensing Centre is engaged in numerous remote sensing projects across Queensland, including: monitoring of vegetation clearing and vegetation structure; ground cover monitoring; land use mapping; gully mapping; fire scar mapping; and monitoring of selected weeds. To facilitate these projects the centre obtains, manages and processes large quantities of data from various sources.

A current priority area for both the Federal and State governments is the Great Barrier Reef and the catchments draining into it. RSC is undertaking a range of remote sensing projects and programs in these catchments. An overview of these monitoring activities is provided by Tindall and Witte (2010).

Landsat imagery has traditionally been the primary source of imagery for our remote sensing projects. Satellite imagery requires significant pre-processing prior to use in automated image analysis. As a result, in addition to operational projects, the centre undertakes research into various corrections, such as radiometric, geometric and topographic correction, as well as automated cloud, cloud shadow and water masking.


Apart from Landsat, RSC also obtains various other satellite and remotely sensed imagery, including SPOT4 and SPOT5, MODIS and ALOS PALSAR, and higher resolution data such as QuickBird, IKONOS and LiDAR for priority areas in the state. In addition to remotely sensed data, there is an extensive field program dating back 15 years. The volume of these data is steadily increasing, and this has necessitated an automated approach to data storage, management and user access. To achieve this, an operational software system has been built around high performance computing and mass storage hardware.

While commercial software continues to play an important role in various desktop operations at RSC (for example, see Grounds and Tindall 2010), the core automated system at the centre is built entirely on open source tools. Open source products were initially selected for their portability across platforms and low cost, but have numerous other benefits.

Open Source Geospatial Software

The term 'open source' software is defined by the Open Source Initiative (http://opensource.org/docs/osd). Primarily, open source software must include the source code with distribution, allow for modification of the source code and be free to be redistributed. Modifications to the code can be submitted to the original vendor and accepted as fixes if appropriate. As a result, software bugs are identified and resolved sooner and software features can evolve more quickly (Ramsey 2009, Shorter 2010).

One of the major benefits of open source software is that maintenance and development are not limited to a handful of programmers, but can be undertaken by any user of the software (Ramsey 2009, Shorter 2010). Additionally, the transparent and customisable nature of the software allows users to adapt it for their own purposes, provided they have the technical skill.

Another major benefit of open source is the lack of licensing restrictions. Open source software is freely distributable, allowing it to be deployed on multiple machines within an organisation without internal competition for limited licenses or high overheads.

The geospatial open source community is small, with a few key players who are consciously developing in parallel. This effectively creates a robust informal software suite with consistent standards and high levels of interoperability (Shorter 2010).

While there are obvious benefits in adopting open source software, there are also drawbacks. As Ramsey (2009) identifies, the business model for open source software depends not on selling software but on charging for support. Technically competent early adopters are expected not only to manage with minimal support but also to contribute to the development of the software. As such, there is often limited documentation or freely available support for entry level users. Later mainstream adopters are expected to require varying degrees of support, which is where profits from open source are expected to be made (Ramsey 2009).


Remote Sensing Centre Open Source Systems

The culture within the department has primarily supported the adoption of proprietary software, due to the guaranteed support provided. As such, implementing open source solutions has historically been challenging. Despite this, the practicalities of managing an operation of the scale of the Remote Sensing Centre, and the inherent licensing limitations of proprietary software, have led to open source solutions gradually dominating the operational processes at the centre. For ease of discussion within this paper, specific details of our operation have been divided into three key areas: data storage and automated processing core; spatial database interface; and projects and research.

Data Storage and Automated Processing Core

The primary data requiring management is the centre’s extensive Landsat archive. The centre holds over 19,000 Landsat images dating back to 1972. The automated processing of these images includes radiometric, geometric and topographic corrections, along with the application of various masks (water, cloud and shadow), prior to use in operational projects or for research.

Initially, processing of Landsat imagery was performed on stand-alone Unix (SGI IRIX) workstations using ERDAS Imagine software. Processing models at this stage were developed in ERDAS Imagine Modeler, but debugging and error handling were problematic due to the ‘black box’ nature of the software. In addition, later versions of Imagine (8.5 and onwards) did not support the IRIX operating system and it was necessary to convert to Windows. There were major limitations to performing automated processing in this environment and a move back to a UNIX/Linux based platform was desired. As Imagine and ArcGIS are heavily tied to the Windows operating system, it was necessary to consider other options, such as open source tools, which are generally well supported on UNIX/Linux systems.

Departmental standards, however, required that Windows remain the standard desktop operating system. As a result, the final processing system is a core open source Linux system accessed via workstations running Windows. This initially posed some difficulties, but with the incorporation of Cygwin, an open source Unix-like environment for Windows, it became possible to run UNIX/Linux tools on Windows machines.

A new in-house application, "PyModeller", was developed. PyModeller is a graphical application that allows the flow between the various processing steps to be visualised and simplifies the development of raster processing models. It allows the user to concentrate on higher level concepts by hiding the details of raster access, resampling and dealing with input datasets of different spatial extents. Because PyModeller is open source and designed to be extensible, additional functionality can easily be added.

PyModeller was built using the Python programming language and the GDAL raster translation library. The combination of the two is extremely powerful and is now used for the majority of raster processing within the centre, either through the PyModeller interface or via Python scripts that access the GDAL library directly. GDAL and Python are both open source and well supported in the geospatial community; as such, there is much existing code available for reuse. As PyModeller was built using Python, any of this code can be accessed from within it, including the scientific modules SciPy and NumPy. PyModeller also has an easy to use scripting interface, so it can be readily incorporated into Python programs to perform image processing. Since it is built on top of open source software, PyModeller will run on almost any platform, including Linux and Cygwin. All automated raster processing scripts now operate with PyModeller.
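
As an illustration of one detail that such a tool has to hide, the sketch below reconciles inputs of different spatial extents by computing the area they share. It is a minimal example under stated assumptions and is not PyModeller's code: the `Extent` type and the `common_extent` and `extent_of` helpers are hypothetical names introduced here, and `extent_of` assumes north-up, unrotated images.

```python
from typing import NamedTuple, Optional, Sequence


class Extent(NamedTuple):
    """Bounding box in map coordinates (hypothetical helper type)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def common_extent(extents: Sequence[Extent]) -> Optional[Extent]:
    """Return the area shared by all inputs, or None if they do not overlap."""
    if not extents:
        return None
    xmin = max(e.xmin for e in extents)
    ymin = max(e.ymin for e in extents)
    xmax = min(e.xmax for e in extents)
    ymax = min(e.ymax for e in extents)
    if xmin >= xmax or ymin >= ymax:
        return None
    return Extent(xmin, ymin, xmax, ymax)


def extent_of(path: str) -> Extent:
    """Read a raster's extent with GDAL (assumes a north-up, unrotated image)."""
    # Imported lazily so the pure-Python part above runs without GDAL installed.
    from osgeo import gdal

    ds = gdal.Open(path)
    if ds is None:
        raise IOError(f"GDAL could not open {path}")
    # Geotransform: (top-left x, pixel width, rotation, top-left y, rotation, pixel height)
    x0, dx, _, y0, _, dy = ds.GetGeoTransform()
    xmax = x0 + dx * ds.RasterXSize
    ymin = y0 + dy * ds.RasterYSize  # dy is negative for north-up images
    return Extent(x0, ymin, xmax, y0)
```

A processing model could call `common_extent([extent_of(p) for p in inputs])` once and then read only that window from each input, so the per-pixel calculation never needs to know that the source images differed in coverage.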

Figure 1. PyModeller screen shot. Model building screen for PyModeller illustrating a theoretical model incorporating a number of rasters and a colour table.

GDAL and Python have also made it easy to incorporate metadata into the imagery. Metadata is now inserted into the image header at each processing stage, documenting how the given image was processed and which parent images were used. The Python script ‘HistoryView’ allows this information to be easily viewed.

For a long period, data searching and retrieval on the filestore was managed solely via command line tools and specialised scripts. Well designed file naming conventions made it possible to write scripts to search for and retrieve the desired imagery, but the limitations of these methods and the growing size of the RSC operation eventually necessitated the introduction of a spatial database to facilitate data management and querying.
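
The idea of recording processing history in the image itself can be sketched with GDAL's metadata calls. This is a minimal sketch under stated assumptions, not the centre's implementation or the HistoryView script: the metadata key `PROCESSING_HISTORY`, the JSON layout and the function names are all hypothetical, and whether the metadata is persisted depends on the raster format's driver.

```python
import json
from datetime import datetime, timezone
from typing import Dict, List, Optional

HISTORY_KEY = "PROCESSING_HISTORY"  # hypothetical metadata item name


def build_history(script: str,
                  parents: List[str],
                  params: Optional[Dict[str, object]] = None,
                  timestamp: Optional[str] = None) -> str:
    """Serialise how an image was produced and which parent images were used."""
    record = {
        "script": script,
        "parents": list(parents),
        "parameters": params or {},
        "timestamp": timestamp or datetime.now(timezone.utc).isoformat(),
    }
    return json.dumps(record, sort_keys=True)


def write_history(image_path: str, history: str) -> None:
    """Store the history string in the image's metadata via GDAL."""
    from osgeo import gdal  # imported lazily; requires the GDAL Python bindings

    ds = gdal.Open(image_path, gdal.GA_Update)
    if ds is None:
        raise IOError(f"GDAL could not open {image_path} for update")
    ds.SetMetadataItem(HISTORY_KEY, history)
    ds = None  # dropping the reference closes the dataset and flushes changes


def read_history(image_path: str) -> Optional[dict]:
    """Return the stored history as a dict, or None if the image has none."""
    from osgeo import gdal

    ds = gdal.Open(image_path)
    if ds is None:
        raise IOError(f"GDAL could not open {image_path}")
    raw = ds.GetMetadataItem(HISTORY_KEY)
    return json.loads(raw) if raw else None
```

Because each output names its parents, a viewer can follow the chain back from a final product to the original scene by calling `read_history` on each parent in turn.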


Spatial Database Interface

A spatial database is a database which has been spatially enabled to manage geographic data. Spatial databases are not necessarily relational, but most are, allowing data to be queried via SQL. The centre introduced the PostGIS spatial database in September 2006. PostGIS was selected because it is a powerful, spatially enabled relational database that complies with SQL standards and is compatible with both SGI IRIX and Linux.

The database was initially introduced to manage records of Landsat imagery and ground control points, but has since expanded to incorporate other imagery, field data and various other relevant data sources. Currently the database contains references to over 19,000 Landsat images; over 11,000 MODIS scenes; over 1,200 SPOT images; approximately 25,000 georeferencing points; approximately 1,400,000 field observations; and thumbnails of over 50,000 field images; along with numerous other miscellaneous data tables relevant to the centre’s operation (Figure 2).

Figure 2. Examples of various (spatially enabled and non-spatial) data stored on PostGIS and readily accessible via open source tools. Panels: Geometries; Tables; Image Thumbnails.


The introduction of the database has greatly improved the management of, and access to, the extensive record of field data captured over the last 15 years. Until recently, these data were stored in text files of varying formats and, lacking any querying capability, were difficult to access effectively. PostGIS now acts as the gateway for staff to the data available at the centre.

PostGIS has Application Programming Interfaces (APIs) to various other open source tools in use at the centre, including Python, Quantum GIS (QGIS) and R. This makes it possible to integrate database queries seamlessly into scripts and, in the case of QGIS, to view the available data.

QGIS is an open source GIS with viewing and editing capabilities. It has developed in parallel with PostGIS, resulting in strong integration between the two packages. QGIS is easily customisable using Python, allowing for the development of specialised ‘plugins’. Within the centre a number of plugins have been created to assist with the querying, viewing and downloading of various data (Figure 3). In addition, many plugins are freely distributed within the open source community.
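
Integrating a PostGIS query into a Python script, as described above, might look like the following. This is a minimal sketch under stated assumptions rather than the centre's code: the table and column names (`landsat_scene`, `scene_id`, `acquired`, `footprint`) are hypothetical, and the use of the third-party `psycopg2` driver is an assumption, since the paper does not say which Python interface is used. `ST_Intersects` and `ST_MakeEnvelope` are standard PostGIS functions.

```python
from typing import Any, List, Sequence, Tuple

# Hypothetical table and column names, for illustration only.
SCENE_QUERY = """
    SELECT scene_id, acquired
    FROM landsat_scene
    WHERE ST_Intersects(footprint, ST_MakeEnvelope(%s, %s, %s, %s, %s))
      AND acquired BETWEEN %s AND %s
    ORDER BY acquired
"""


def scene_query_params(bbox: Sequence[float], srid: int,
                       start: str, end: str) -> Tuple[Any, ...]:
    """Validate the bounding box and return parameters in placeholder order."""
    xmin, ymin, xmax, ymax = bbox
    if xmin >= xmax or ymin >= ymax:
        raise ValueError("bbox must be (xmin, ymin, xmax, ymax)")
    return (xmin, ymin, xmax, ymax, srid, start, end)


def find_scenes(dsn: str, bbox: Sequence[float], srid: int,
                start: str, end: str) -> List[tuple]:
    """Return (scene_id, acquired) rows for scenes intersecting the bbox."""
    import psycopg2  # assumed driver; imported lazily so the module loads without it

    params = scene_query_params(bbox, srid, start, end)
    conn = psycopg2.connect(dsn)
    try:
        with conn.cursor() as cur:
            # Values are passed separately so the driver handles quoting.
            cur.execute(SCENE_QUERY, params)
            return cur.fetchall()
    finally:
        conn.close()
```

A script would call something like `find_scenes(dsn, (138.0, -29.0, 154.0, -10.0), 4326, "2009-01-01", "2009-12-31")` and then pass the returned scene identifiers to its raster processing step, replacing the earlier approach of searching the filestore by file name.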

Figure 3. Quantum GIS screen. Displayed are a number of plugins, some developed on-site, used at the centre.

Projects and Research

RSC has many projects that depend on accessing and processing spatial data, either to create operational products or for research purposes. Open source tools are used extensively for these purposes.

Python is an object oriented scripting language which has replaced an assortment of computer languages previously used within RSC. Operational Python scripts now exist for most projects within the centre, and Python is now the primary language used for scripting of processes. This uniformity makes it easy to share code between projects in RSC. Python has the advantage of being a high level language, which makes it relatively easy to learn. As such, Python is suitable for use by operational staff with limited programming experience.

As discussed previously, GDAL is a raster translation library and is used as the interface for all raster formats within the centre’s scripts. Although GDAL is primarily a translation library, it also provides a number of command line tools for common raster processing functions such as mosaicing, resampling, masking and reprojecting. GDAL also makes time series analysis possible through virtual rasters. A virtual raster effectively stacks a series of rasters upon each other, allowing individual pixel values to be investigated across time.

OGR is the vector library component of GDAL. The group is gradually moving into automating various vector processes using OGR, and the library has recently been used to automate accuracy assessment processes for land use mapping.

R is the major package used for statistical analysis and modelling within the centre. R is also open source, and is well supported within the geospatial open source framework. For example, GDAL supports the conversion of data directly into an R-compatible format and PostGIS has an API for R, allowing for direct querying within R scripts.

QGIS is being adopted by operational staff for some GIS operations; however, it has not replaced the major proprietary software for most desktop GIS processes. For example, the centre has a major requirement for interactive raster editing, for which no adequate open source solution exists.

Discussion and Conclusions

The most obvious benefit of implementing open source is that it is free. Other major benefits of implementing open source software are summarised below.

Open source geospatial software has established itself as a high quality and robust solution for large-scale image processing operations.
There are specific open source standards to which the community adheres, and there is a high degree of support and integration between the various software elements, resulting in a virtual software suite.

The open source license allows software to be run on multiple machines without associated overheads. The software is also portable across platforms. These factors make the software extremely flexible, and the implementation of processing systems is not hindered by specific platforms or licensing restrictions.

Another major advantage is the extensible nature of the software, which allows users to customise applications to suit their own purposes. This also speeds up the development of the software. So while some current open source applications may lag behind in features, they are evolving quickly.

The major limitation of open source is that freely available technical support is limited. So, while RSC has successfully established a large, robust operational geo-processing system around open source software, this has been reliant on high levels of skill and experience in programming being available in-house. Interestingly, the transparent and extensible nature of open source does appear to have increased the overall level of scripting skills at the centre. As a result, increased levels of automation are being implemented across all aspects of the centre’s activities. Despite this, anyone wishing to implement a system around open source tools will need to consider their support requirements and how they will meet them, as a certain level of expertise will be required.

Acknowledgements

The authors would like to thank Neil Flood for his efforts in implementing open source systems at the centre, his ongoing support of these systems and his commitment to training staff in their operation.

References

Grounds, S. and Tindall, D., 2010, Mapping in a dynamic state: Operational mapping for a rather large area. In 15th Australasian Remote Sensing and Photogrammetry Conference. Alice Springs.

Open Source Initiative, 2010, The open source definition, viewed 6 August 2010, http://opensource.org/docs/osd

Ramsey, P., 2009, Beyond nerds bearing gifts: the future of the open source economy. Keynote address (video) at: FOSS4G: Free and Open Source Software for Geospatial, 20-23 October 2009, Sydney, Australia, http://www.youtube.com/watch?v=zB_a28vBtBk

Schmidt, M. and Gillingham, S., 2008, Raster data analysis made simple with a scriptable open source framework: PyModeller. Poster session at: 14th Australasian Remote Sensing and Photogrammetry Conference. 30 September - 2 October 2008, Darwin, Australia.

Shorter, C., 2010, Overview of geospatial open source software which is robust, feature rich and standards compliant. In FIG Congress: Facing the Challenges – Building the Capacity. 11-16 April 2010, Sydney, Australia.

Tindall, D. and Witte, C., 2010, Legislation, policies and research: Queensland Remote Sensing Centre supporting Great Barrier Reef conservation and management initiatives. In 15th Australasian Remote Sensing and Photogrammetry Conference. Alice Springs.

