
Ref. Ares(2019)3532206 - 30/05/2019

Copernicus Access Platform Intermediate Layers Small Scale Demonstrator

D3.5 System Integration and Validation Test Plan v1

Document Identification
Status: Final                  Due Date: 30/11/2018
Version: 1.3                   Submission Date: 30/05/2019
Related WP: WP3                Document Reference: D3.5
Related Deliverable(s): N/A    Dissemination Level (*): PU
Lead Participant: Atos France
Lead Author: Fabien CASTEL, Anne-Sophie TONNEAU (Atos France)
Reviewers: Michelle Aubrun (TAS FR), Fabien CASTEL (Atos France), JF Rolland (Atos France)
Keywords:
Cloud, Platform as a Service, System Integration, Validation

This document is issued within the frame and for the purpose of the CANDELA project. This project has received funding from the European
Union’s Horizon 2020 research and innovation programme under Grant Agreement No. 776193. The opinions expressed and arguments
employed herein do not necessarily reflect the official views of the European Commission.
The dissemination of this document reflects only the author’s view and the European Commission is not responsible for any use that may be
made of the information it contains. This document and its content are the property of the CANDELA Consortium. The content of all or parts
of this document can be used and distributed provided that the CANDELA project and the document are properly referenced.
Each CANDELA Partner may use this document in conformity with the CANDELA Consortium Grant Agreement provisions.
(*) Dissemination level: PU: Public, fully open, e.g. web; CO: Confidential, restricted under conditions set out in Model Grant Agreement; CI:
Classified, Int = Internal Working Document, information as referred to in Commission Decision 2001/844/EC.
Document Information

List of Contributors
Name Partner
Fabien CASTEL Atos France
Anne-Sophie TONNEAU Atos France
JF Rolland Atos France

Document History
Version  Date        Change editors                    Changes
0.1      31/08/2018  Anne-Sophie TONNEAU (ATOS FR)     Initial table of contents
0.2      29/09/2018  Fabien CASTEL (ATOS FR)           Reworked table of contents; recap of the topics to be addressed in each chapter
0.3      28/11/2018  Anne-Sophie TONNEAU (ATOS FR)     Version for partner review
0.4      30/11/2018  Fabien CASTEL (ATOS FR)           Version for quality review
0.5      30/11/2018  Juan Alonso (ATOS ES)             Quality Assessment
1.0      30/11/2018  Jose Lorenzo (ATOS ES)            Coordinator approval for submission
1.1      29/04/2019  Jean-François Rolland (ATOS FR)   Update of the document to address remarks issued at the first-year review
1.2      30/05/2019  Juan Alonso (ATOS ES)             Quality Assessment
1.3      30/05/2019  Jose Lorenzo (ATOS ES)            Final revision before re-submission

Quality Control
Role Who (Partner short name) Approval Date
Deliverable leader Jean-François Rolland (ATOS FR) 29/05/2019
Quality manager Juan Alonso (ATOS ES) 30/05/2019
Project Coordinator Jose Lorenzo (ATOS ES) 30/05/2019

Document name: D3.5 System Integration and Validation Test Plan v1 Page: 2 of 57
Reference: D3.5 Dissemination: PU Version: 1.3 Status: Final
Table of Contents
Document Information ............................................................................................................................ 2
Table of Contents .................................................................................................................................... 3
List of Tables ............................................................................................................................................ 5
List of Figures........................................................................................................................................... 6
List of Acronyms ...................................................................................................................................... 7
Executive Summary ................................................................................................................................. 8
1 System overview .............................................................................................................................. 9
1.1 System components ................................................................................................................ 9
1.2 Standard components ........................................................................................................... 10
1.2.1 GeoServer ............................................................................................................................. 10
1.2.2 Keycloak ................................................................................................................................ 10
1.2.3 JupyterHub ........................................................................................................................... 11
1.2.4 CreoDIAS data connector ..................................................................................................... 12
1.2.5 MonetDB .............................................................................................................................. 12
1.2.6 PostGIS.................................................................................................................................. 13
1.3 Integration of applications and algorithms ........................................................................... 13
1.3.1 Semantic search application ................................................................................................. 13
1.3.2 Integration of data analytics algorithms .............................................................................. 13
1.3.3 The semantic classification tool ........................................................................................... 14
1.4 User access component – Notebook environment ............................................................... 15
1.4.1 Wpslib ................................................................................................................................... 17
1.4.2 Creodiaslib ............................................................................................................................ 17
1.5 Component interactions........................................................................................................ 17
2 Integration and validation strategy................................................................................................ 19
2.1 Integration process................................................................................................................ 19
2.1.1 Roles and responsibilities ..................................................................................................... 19
2.1.2 Integration and validation workflow .................................................................................... 20
2.2 Integration strategy ............................................................................................................... 22
2.3 Integration and validation test infrastructure....................................................................... 23
2.3.1 CANDELA infrastructure ....................................................................................................... 23
2.3.2 Infrastructure instances ....................................................................................................... 23
3 Integration tests ............................................................................................................................. 25
3.1 Components deployment ...................................................................................................... 25
3.1.1 GeoServer ............................................................................................................................. 25
3.1.2 Keycloak ................................................................................................................................ 26
3.1.3 JupyterHub ........................................................................................................................... 27
3.1.4 Semsearch ............................................................................................................................ 28
3.1.5 PostGIS.................................................................................................................................. 28
3.1.6 MonetDB .............................................................................................................................. 30

3.2 Notebook server .................................................................................................................... 32
3.2.1 Notebook server availability................................................................................................. 32
3.2.2 Python kernel running .......................................................................................................... 33
3.3 Data access ............................................................................................................................ 34
3.3.1 CreoDIAS data availability .................................................................................................... 34
4 Processing services......................................................................................................................... 35
4.1 Optical change detection processing chain ........................................................................... 35
4.1.1 Change detection processes run correctly ........................................................................... 35
4.1.2 Change detection processing pipeline runs correctly .......................................................... 36
4.2 Data split, merge, transformation ......................................................................................... 37
4.2.1 Processes Split Images and Merge Images run correctly ..................................................... 37
4.2.2 Jpeg2Tiff process run correctly ............................................................................................ 38
4.3 SAR change detection............................................................................................................ 39
4.3.1 SAR change detection process runs correctly ...................................................................... 39
4.4 DMG process ......................................................................................................................... 40
4.4.1 DMG process run correctly ................................................................................................... 40
5 Integration deployment environment ........................................................................................... 42
6 Conclusion ...................................................................................................................................... 43
References ............................................................................................................................................. 44
Annexes ................................................................................................................................................. 45
Annex I: Notebook server test script ................................................................................................. 45
Annex II: Change detection processing chain test script ................................................................... 45
Annex III: Data split, merge, transformation test script .................................................................... 48
Annex IV: Sar detection change test script ........................................................................................ 49
Annex V: EOminer DMG test script.................................................................................................... 50
Annex VI: CANDELA Service Providers Guide for service integration ................................................ 51

List of Tables
Table 1: Jupyter main kernels ................................................................................................................... 16
Table 2: Component interfaces overview ................................................................................................. 18
Table 3: Kubernetes configuration ........................................................................................................... 42
Table 4: Integration environment configuration ...................................................................................... 42

List of Figures
Figure 1: CANDELA system components overview __________________________________________ 9
Figure 2: JupyterHub global architecture ________________________________________________ 11
Figure 3: Collections available through s3fs mount point ___________________________________ 12
Figure 4: CANDELA platform processing pipeline __________________________________________ 14
Figure 5: Architecture of DLR application ________________________________________________ 15
Figure 6: Notebook environment architecture ____________________________________________ 16
Figure 7: Components interactions overview _____________________________________________ 17
Figure 8: Overview of the integration and validation workflow _______________________________ 20
Figure 9: Integration steps ___________________________________________________________ 22
Figure 10: Architecture of the CANDELA infrastructure _____________________________________ 23
Figure 11: Overview of the integration and validation environments __________________________ 24

List of Acronyms
Abbreviation / Description
acronym
API Application Programming Interface
COTS Commercial Off-The-Shelf
CSW Catalogue Service for the Web, OGC standard for querying data catalogues
DBMS Data Base Management System
EO Earth Observation
GIS Geographic Information System
HTML HyperText Markup Language
HTTP HyperText Transfer Protocol
JSON JavaScript Object Notation
LDAP Lightweight Directory Access Protocol
OGC Open Geospatial Consortium
PaaS Platform as a Service
REST Representational State Transfer
SAR Synthetic Aperture Radar
SMTP Simple Mail Transfer Protocol
SQL Structured Query Language
SSH Secure Shell
URL Uniform Resource Locator
VM Virtual Machine
WEBDAV Web-based Distributed Authoring and Versioning
WP Work Package
WPS Web Processing Service, OGC standard for geospatial processing services
WS Web Service

Executive Summary
The objective of this system integration and validation test plan is to describe the system components
and their interactions, the integration strategy, the associated testing environments and the integration tests.
This document has been updated taking into account the requirements from the first project review
meeting (the first release of D3.5 took place in November 2018). Overall, the document keeps the same
structure, with the following sections:
• The system is described in section 1, with a description of the component structure and the
interactions between the different components.
• The integration strategy is presented in section 2: the integration process, with the roles and
responsibilities and the integration and validation workflow, is described in section 2.1; the
steps of the integration procedure are described in section 2.2; and a description of the
integration, validation and production environments is provided in section 2.3.
• The integration tests are detailed in section 3 of the document.
• The data analytics algorithms deployed in the platform are described in section 4.
• An example describing the algorithm integration steps from an executable to a WPS processing
service is provided as an Annex at the end of the deliverable.
D3.5 has been submitted together with a new version of D3.3 [8], which describes the most significant
changes in the platform, some of which are also summarised below. Since the previous version of the
platform, several tools have been integrated:
• The semantic search tool from IRIT has been deployed on the platform. Technically it consists of
two components, a website hosted on a Tomcat server and a geospatial PostGIS database. These
two components are deployed as two different Docker containers.
• The DLR algorithm performing semantic classification on Earth observation products has been
integrated on the platform. It needs access to a MonetDB database; an instance of this
database has been deployed in a separate container.
• The JupyterLab notebook has been modified to include new client libraries for both MonetDB
and PostGIS. A command line client for MonetDB has also been included in this JupyterLab
environment.
• In addition to the existing change detection algorithm on optical images provided by TAS France,
a new change detection algorithm for SAR images from TAS Italy has been integrated.
This updated version of D3.5 describes the new standard components integrated on the platform in
section 1.2. Section 1.3 provides details on how the algorithms and applications from the partners are
integrated in the platform. The component interactions (section 1.5) have been updated to include the
new components. Chapter 3 describes new test cases used to validate the correct deployment of the
new functionalities of the platform. Finally, the Annex section contains two more examples of Python
test scripts, for the DMG process provided by DLR and the SAR change detection algorithm provided
by TAS Italy.

1 System overview
This section presents the CANDELA system components and their interactions.

1.1 System components

The following schema presents the overview of the CANDELA system components.

Figure 1: CANDELA system components overview

CANDELA system components are the following:

• A set of standard components deployed as the base of the platform:
o GeoServer, in charge of launching WPS processes
o Keycloak, which manages authentication
o JupyterHub, a server that instantiates on-demand user development
environments (Jupyter Notebook) to interact with the platform
o Jupyter Notebook, the user environment to run and interact with the
processing services
o The CreoDIAS data access connector
o MonetDB database
o PostGIS database

o Tomcat webserver
• Applications and data analytics algorithms provided by partners:
- Change detection algorithms (on radar and optical products), made available as
WPS processing services
- Semantic classification algorithm and application
- Semantic search application
• Integration tools developed on the platform:
- Wpslib: a library implementing the communication protocols with GeoServer
- Creodiaslib: a library implementing methods to facilitate searching for images in
the CreoDIAS catalogue
- REST services: provide access to MonetDB from outside the platform
- Integration scripts for each data analytics algorithm.

1.2 Standard components

1.2.1 GeoServer

GeoServer [1] is an open-source server written in Java that allows users to share, process and
edit geospatial data. Designed for interoperability, it publishes data from any major spatial data
source using open standards. GeoServer is the reference implementation of the Open
Geospatial Consortium Web Feature Service standard, and also implements the Web Map Service, Web
Coverage Service and Web Processing Service specifications.
GeoServer can be considered both a processing tool and a service catalogue. Its main use is to
distribute georeferenced data by implementing the OGC standards (WFS, WMS, WCS…). It also
implements the OGC WPS standard and, as such, can be used as a processing service catalogue.
In the context of the CANDELA application, the following functionalities of GeoServer are used:
• Definition of new processing services
• Execution of processing services through the WPS standard, via a WPS extension of GeoServer
• REST API
Version 2.14-RC is deployed on the platform.
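As a hedged illustration of these functionalities, the following Python sketch lists the processing services advertised by a GeoServer WPS endpoint via a "GetCapabilities" request. The endpoint URL is a placeholder, not the actual CANDELA instance:

```python
# Sketch: discovering WPS processing services exposed by GeoServer.
# The endpoint URL passed to fetch_processes is a placeholder.
import urllib.request
import xml.etree.ElementTree as ET

OWS = "{http://www.opengis.net/ows/1.1}"
WPS = "{http://www.opengis.net/wps/1.0.0}"

def capabilities_url(base_url):
    """GetCapabilities request URL for a WPS 1.0.0 endpoint."""
    return base_url + "?service=WPS&version=1.0.0&request=GetCapabilities"

def list_processes(capabilities_xml):
    """Extract the process identifiers from a GetCapabilities response."""
    root = ET.fromstring(capabilities_xml)
    return [p.findtext(OWS + "Identifier") for p in root.iter(WPS + "Process")]

def fetch_processes(base_url):
    """Query a live GeoServer instance and list its WPS processes."""
    with urllib.request.urlopen(capabilities_url(base_url)) as resp:
        return list_processes(resp.read())
```

Against a running instance one would call, for example, `fetch_processes("http://geoserver.example.org/geoserver/ows")`.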

1.2.2 Keycloak

User management in the CANDELA platform is ensured by reusing an off-the-shelf component deployed
on the cluster, called Keycloak [2]. Keycloak is an open-source identity and access management system
providing a simple Single Sign-On solution. Users authenticate with Keycloak rather than doing so on
each individual CANDELA component. This means that the components do not have to deal with login
forms, authenticating users, and storing users. Once logged in to Keycloak, users do not have to log in
again to access a different application.
Keycloak can connect to various sources of authorized users (LDAP, Active Directory, RDBMS, "social
login"...). In the CANDELA configuration, it stores users in a dedicated PostgreSQL
database pod saving its content on the cluster filesystem. New users are added by a Keycloak
administrator through the management console or through a dedicated API. User accounts are
initialized with a random password that the user changes at their first connection. The administrator
has no access to user passwords by design. The Keycloak server can be connected to an SMTP server to
automate the registration and password reset processes.
CANDELA components are connected to Keycloak through the standard OAuth 2 protocol. OAuth 2
clients such as the JupyterHub component are registered by an administrator via the Keycloak
management interface or API, and are given a client ID and secret, associated with an origin URL and
a call-back URL.
When trying to authenticate a user, the client authenticates itself to Keycloak with its ID and secret, and
the origin URL and call-back address given by the client are verified against the registered
information. Once the client identity is ensured, the user is redirected to a login page. Once the user
identity is ensured, Keycloak calls back the client with an authorization token and a refresh token. The
authorization token can be used by the client to get additional information about the user. The refresh
token is used to generate a new token for the user without the user having to go through the login
process again.
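The refresh flow described above can be sketched against Keycloak's standard OpenID Connect token endpoint. Realm, host, client ID/secret and token values are placeholders, and the `/auth` URL prefix assumes a Keycloak distribution of that period:

```python
# Sketch: exchanging a refresh token for a new access token with Keycloak.
# All values are placeholders; nothing is sent until the request is opened.
import urllib.parse
import urllib.request

def token_endpoint(base_url, realm):
    """Keycloak OpenID Connect token endpoint for a realm (/auth prefix assumed)."""
    return f"{base_url}/auth/realms/{realm}/protocol/openid-connect/token"

def refresh_request(base_url, realm, client_id, client_secret, refresh_token):
    """Build the POST request for the OAuth 2 refresh_token grant."""
    data = urllib.parse.urlencode({
        "grant_type": "refresh_token",
        "client_id": client_id,
        "client_secret": client_secret,
        "refresh_token": refresh_token,
    }).encode()
    return urllib.request.Request(token_endpoint(base_url, realm), data=data)

# Build (but do not send) a request with placeholder values.
req = refresh_request("https://keycloak.example.org", "candela",
                      "jupyterhub", "s3cret", "eyJhbGciOi...")
```

Sending `req` with `urllib.request.urlopen` would return a JSON body containing the new access and refresh tokens.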

1.2.3 JupyterHub

The JupyterHub [4] project is a Jupyter [3] sub-project adding a multi-user layer to the Jupyter core
server, with an authentication system that allows using it efficiently in a business environment.
JupyterHub is composed of:
• A Hub component, managing the multi-user connections, with their specific access rights and
passwords.
• A frontend proxy, routing requests to the Hub and creating Jupyter server instances specific to
each user. The proxy is then in charge of forwarding incoming requests to the corresponding
Jupyter server.
• Several standard Jupyter servers.

Figure 2: JupyterHub global architecture

The action of creating a Jupyter server specific to a user is called spawning. In the Hub, a specific
module, the Spawner, handles this action. Several implementations exist for this module according to
the strategy chosen to spawn the Notebook servers: local or remote spawning based on cluster
management software (Torque, PBS...), on Docker, or on Kubernetes. The spawner implementation
installed on the CANDELA platform is based on Kubernetes.

A specific module handles the user management on the Hub. As for spawning, several
implementations of this module exist: based on UNIX users, on the OAuth protocol or on LDAP. LDAP
is the solution adopted for the platform, as it is the technology used by the Identity and Authorization
Manager.
A specific Jupyter instance is created every time a user launches the Notebook feature on the platform.
Thus, by default, the workspace of this instance is empty. Different solutions exist to enable a persistent
workspace. On the platform, instance spawning is based on Kubernetes, and thus by extension on
Docker containerization. Each newly created instance is a new Docker container running on the
Kubernetes platform. Kubernetes/Docker allows configuring volumes, i.e. folders shared between
containers, or between a container and the host system. The strategy to make the user workspace
persistent on JupyterHub is to create a volume mapping the JupyterHub workspace folder to the user
workspace folder currently used on the platform.
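The volume-based persistence strategy above can be sketched as a `jupyterhub_config.py` fragment for the Kubernetes spawner (KubeSpawner). The claim name, volume name and mount path are illustrative assumptions, not the actual CANDELA settings:

```python
# jupyterhub_config.py fragment (illustrative; names are placeholders).
c.JupyterHub.spawner_class = "kubespawner.KubeSpawner"

# Mount a per-user persistent volume over the notebook workspace folder so that
# files survive across spawned Jupyter server instances.
c.KubeSpawner.volumes = [{
    "name": "workspace",
    "persistentVolumeClaim": {"claimName": "claim-{username}"},
}]
c.KubeSpawner.volume_mounts = [{
    "name": "workspace",
    "mountPath": "/home/jovyan/work",
}]
```

KubeSpawner expands `{username}` per user, so each spawned container gets its own claim mounted at the same workspace path.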

1.2.4 CreoDIAS data connector

For discovering data, CreoDIAS provides access to metadata through a Data Finder API [5]. This is an
HTTP web service API that is freely and anonymously accessible.
The collections available are:
• Sentinel1
• Sentinel2
• Landsat8
• Landsat7
• Landsat5
• Envisat
For accessing data, an s3fs volume is mounted on the platform, making the EO data provided by
CreoDIAS directly accessible.
The mount point is /s3fs, and from this folder the following collections are available:

Figure 3: Collections available through s3fs mount point
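A metadata search against the Data Finder API can be sketched as follows. The endpoint pattern follows the public CreoDIAS finder service, and the parameter names (startDate, completionDate, maxRecords) are the usual resto-style ones; both should be verified against the CreoDIAS documentation:

```python
# Sketch: building a metadata search URL for the CreoDIAS Data Finder API.
# Endpoint pattern and parameter names are assumptions to be checked against
# the official API documentation.
import urllib.parse

def finder_url(collection, **params):
    """Search URL for one of the collections listed above (e.g. 'Sentinel2')."""
    base = ("https://finder.creodias.eu/resto/api/collections/"
            f"{collection}/search.json")
    return base + "?" + urllib.parse.urlencode(params)

url = finder_url("Sentinel2", startDate="2018-06-01",
                 completionDate="2018-06-30", maxRecords=10)
```

The resulting URL can be fetched anonymously over HTTP and returns the matching product metadata as JSON.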

1.2.5 MonetDB

MonetDB is an open-source column-oriented database management system. It was designed to provide
high performance on complex queries against large databases, such as combining tables with hundreds
of columns and millions of rows. MonetDB has been applied in high-performance applications for online
analytical processing, data mining, geographic information systems (GIS), Resource Description
Framework (RDF), text retrieval and sequence alignment processing.
MonetDB is used to store information generated by DLR’s analytic algorithm (DMG).

1.2.6 PostGIS

PostGIS is a spatial database extender for the PostgreSQL object-relational database. It adds support for
geographic objects, allowing location queries to be run in SQL (Structured Query Language). In addition
to basic location awareness, PostGIS offers many features. PostGIS adds extra types (geometry,
geography, raster and others) to the PostgreSQL database. It also adds functions, operators, and index
enhancements that apply to these spatial types. These additional functions, operators, index bindings
and types augment the power of the core PostgreSQL DBMS (Data Base Management System), making
it a fast, feature-rich, and robust spatial database management system.
PostGIS is used by the semantic search engine web application provided by IRIT.
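As an illustration of the kind of location query PostGIS enables, the sketch below builds a bounding-box footprint search. The table and column names are hypothetical placeholders, not the actual IRIT schema:

```python
# Sketch: a spatial query using PostGIS functions (ST_Intersects, ST_GeomFromText).
# The 'products' table and 'footprint' column are illustrative placeholders.

def bbox_wkt(min_lon, min_lat, max_lon, max_lat):
    """WKT polygon for a lon/lat bounding box (closed ring)."""
    return ("POLYGON(({0} {1}, {2} {1}, {2} {3}, {0} {3}, {0} {1}))"
            .format(min_lon, min_lat, max_lon, max_lat))

# Parameterized query, e.g. for psycopg2 (%s placeholder style).
QUERY = ("SELECT id, name FROM products "
         "WHERE ST_Intersects(footprint, ST_GeomFromText(%s, 4326));")

def search_products(conn, bbox):
    """Run the footprint query on an open DB-API connection."""
    with conn.cursor() as cur:
        cur.execute(QUERY, (bbox,))
        return cur.fetchall()
```

With a reachable instance one would pass a psycopg2 connection, e.g. `search_products(psycopg2.connect(host="postgis", dbname="semsearch"), bbox_wkt(1.3, 43.5, 1.6, 43.7))` (connection parameters are placeholders).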

1.3 Integration of applications and algorithms

The different partners in CANDELA provide different types of components to the platform:
• IRIT provides a semantic search application
• TAS France and TAS Italy provide change detection algorithms
• DLR provides a tool for semantic classification of satellite images
Each type of component calls for a different type of integration on the platform.

1.3.1 Semantic search application

The semantic search web application provided by IRIT is deployed on a Tomcat webserver. The
application has access to the PostGIS database. Both the PostGIS database and the webserver are
deployed as Docker containers on Kubernetes.
The administrator can access the filesystem of the container using the WEBDAV protocol.
For administration purposes, the PostGIS database is accessible from outside the CANDELA platform
using any client compatible with this database.

1.3.2 Integration of data analytics algorithms

This part describes how the change detection algorithms from TAS France and TAS Italy are integrated.
Data analytics algorithms are delivered by service providers as Docker containers, together with a
JSON description of the inputs/outputs of the service. Once integrated on the CANDELA platform, they
are referred to as processing services. The GeoServer and Kubernetes APIs are used to parameterize,
launch and monitor executed services. Each service Docker image is built manually and is ready to be
launched through the Kubernetes API.
GeoServer provides the ability to discover, execute and manage processing services through an OGC
Web Processing Service (WPS) interface. Users can then interact with processing services by sending WPS
requests to the GeoServer instance. The WPS standard also allows chaining processing services, creating
a processing pipeline of services. GeoServer handles the chaining of the outputs of one service as the inputs
of other services and runs each service of the pipeline step by step.

Figure 4: CANDELA platform processing pipeline

For each processing service, based on the JSON description, a corresponding processing script describing
the inputs and outputs of the processing service is manually deployed in GeoServer. GeoServer
exposes the service thanks to its WPS plugin. The service is then retrieved when performing a
“GetCapabilities” request to GeoServer. This kind of request is part of the WPS standard and enables
users to list all the WPS services provided by a WPS endpoint and to get the metadata required to invoke
them.
When GeoServer receives an "ExecuteProcess" WPS request, the script is parameterized with the WPS
request inputs and GeoServer launches the execution script corresponding to the processing service.
The script sets up the environment of the processing service, sends a scheduling order to the Kubernetes
API with the corresponding Docker image, and monitors the execution of the resulting container.
During the execution of the service, a directory named after the id of the execution is created; it contains
the result files produced by the service and the log files. Users can interact with the file system to gather
their results.
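The "ExecuteProcess" request described above is an XML document identifying the process and its inputs. The sketch below builds such a request body; the process identifier, input names and paths are placeholders, not actual CANDELA service definitions:

```python
# Sketch: building a WPS 1.0.0 Execute request body for a processing service.
# Process identifier, input names and paths are illustrative placeholders.
import xml.etree.ElementTree as ET

EXECUTE_TEMPLATE = """<wps:Execute service="WPS" version="1.0.0"
    xmlns:wps="http://www.opengis.net/wps/1.0.0"
    xmlns:ows="http://www.opengis.net/ows/1.1">
  <ows:Identifier>{process}</ows:Identifier>
  <wps:DataInputs>{inputs}</wps:DataInputs>
</wps:Execute>"""

INPUT_TEMPLATE = ("<wps:Input><ows:Identifier>{name}</ows:Identifier>"
                  "<wps:Data><wps:LiteralData>{value}</wps:LiteralData>"
                  "</wps:Data></wps:Input>")

def execute_body(process, inputs):
    """Render the Execute request and check that it is well-formed XML."""
    xml = EXECUTE_TEMPLATE.format(
        process=process,
        inputs="".join(INPUT_TEMPLATE.format(name=k, value=v)
                       for k, v in inputs.items()))
    ET.fromstring(xml)  # raises if the document is malformed
    return xml

body = execute_body("candela:ChangeDetection",  # hypothetical identifier
                    {"image1": "/s3fs/collection/product1",
                     "image2": "/s3fs/collection/product2"})
```

The rendered body would then be POSTed to the GeoServer WPS endpoint, which launches the corresponding execution script.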

1.3.3 The semantic classification tool

The semantic classification tool provided by DLR is an application composed of different parts:
• The Data model generation algorithm
• MonetDB
• User interface

The data model generation is an algorithm that performs automatic semantic classification on Sentinel-1
and Sentinel-2 images. It produces an SQL file used to enrich the database and a set of JPEG files used in
the user interface. This algorithm is integrated on the platform in the same way as the algorithms from
TAS France and TAS Italy. The main difference is that the algorithm needs access to MonetDB.
MonetDB is deployed in a container and is accessible from the user interface through REST services.
Examples of utilization of these services are presented in section 3.1.6. The database can also be
accessed directly from a notebook environment; two possibilities exist to interact with it from
JupyterLab. The first one is to use the pymonetdb library, which allows accessing the database from a
Python script, as shown in section 3.1.6. The second is the command line client for MonetDB, mclient,
which is available from a terminal within JupyterLab.
The result of the DMG process can be downloaded from the user interface as an archive containing all
the files produced.

Figure 5: Architecture of DLR application

At the end of the execution of a DMG process, two log files are produced:
• dlr.log gives information about the process execution
• eominer.log contains the console output of the application
The output of the application is an SQL file and a set of JPEG files, grouped in a tar.gz archive. This
archive can be downloaded using the Jupyter-Lab environment and used locally in the user interface
part of the EOminer application.
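Unpacking such an archive from the Jupyter-Lab environment can be done with the Python standard library. The archive and member names below are illustrative; a real DMG archive would contain the generated SQL file and JPEG images.

```python
import io
import pathlib
import tarfile
import tempfile

def extract_dmg_archive(archive_path, dest):
    """Extract a DMG result archive (tar.gz) and return its member names."""
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(path=dest)
        return sorted(m.name for m in tar.getmembers())

# Demonstration on a synthetic archive; file names are illustrative.
workdir = pathlib.Path(tempfile.mkdtemp())
sample = workdir / "dmg_results.tar.gz"
with tarfile.open(sample, "w:gz") as tar:
    for name in ("ingestion.sql", "patch_0_0.jpg"):
        data = b"-- sample content --"
        info = tarfile.TarInfo(name)
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))

members = extract_dmg_archive(sample, workdir / "out")
print(members)  # ['ingestion.sql', 'patch_0_0.jpg']
```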

1.4 User access component – Notebook environment

One feature of CANDELA is to enable users to execute processing tools on a remote platform. In general,
processing tools are black boxes: pieces of code packaged and integrated on the platform that cannot
be explored. The CANDELA development environment goal is to give the users the capacity to work at a
lower level. It enables users to develop their own code and execute it on the platform with full access
to the platform computation resources and data repository.
Jupyter Notebook [2] is the technology used to allow users to interact with CANDELA's server
components.
Jupyter enables users to manipulate notebook documents. A document is a JSON file containing code,
textual information and additional metadata (execution language, version…). From the user's point of

view, a document is a sequence of cells. A cell can contain code or rich text encoded with Markdown
syntax. Each cell can be executed independently. An execution context is managed by the server to keep
in memory all variables defined in previously executed cells.
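On disk, a notebook document is serialized as JSON following the nbformat schema. A minimal sketch with illustrative cell contents, showing one Markdown cell and one code cell:

```python
import json

# Minimal notebook document in the nbformat 4 JSON structure; the
# kernelspec and cell contents are illustrative.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {
        "kernelspec": {"name": "python3", "language": "python",
                       "display_name": "Python 3"}
    },
    "cells": [
        {"cell_type": "markdown", "metadata": {},
         "source": ["# Change detection demo\n"]},
        {"cell_type": "code", "metadata": {}, "execution_count": None,
         "outputs": [],
         "source": ["result = 6 * 7\n", "result\n"]},
    ],
}

# This is what a .ipynb file stored on the server contains
text = json.dumps(notebook, indent=1)
cell_types = [c["cell_type"] for c in notebook["cells"]]
print(cell_types)  # ['markdown', 'code']
```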
Jupyter is a remote development environment, available to users from their browser, that allows writing
code and executing it on a server machine.

Figure 6: Notebook environment architecture

The notebook documents are stored on the server machine. They can be downloaded by the user to be
kept locally or shared with other users.
There are several obvious benefits to such an approach. First, users do not need to install anything on
their local machine; they can run Python scripts without any local Python installation, for instance.
Moreover, programmes executed in the Notebook environment can access data located on the
Notebook server, so there is no need to download all the data locally. When dealing with large amounts
of data, it is possible to execute code that filters, selects or reduces them, and transfers only small
amounts to the user machine for display.
The core Notebook server is responsible for routing client requests and managing the Notebook
documents. The actual execution of the code is performed by components installed in addition to the
core server. The basic Jupyter installation provides a Python 2.7 kernel. To execute code in any other
language, additional kernels must be installed and configured. When installing a language-specific
kernel, it is possible to install additional libraries so that users can use them natively in their Notebooks.
Targeted languages and additional libraries are listed in the following table:
Table 1: Jupyter main kernels

Kernel Name | Language | Version | Additional libraries
Python 3 | Python | 3.6.5 | wpslib
Python 3 | Python | 3.6.5 | creodiaslib

1.4.1 Wpslib

A Python library is made available in the user's Notebook environment for interacting with the WPS
processing services. It offers facilities for launching processes and interacting with them. It is built on
top of OWSLib [6], a Python library for programming with OGC web services.
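For illustration, the kind of WPS Execute request that wpslib and OWSLib send on the user's behalf can be sketched with the standard library. The process identifier and input names below are hypothetical, and the real wpslib API wraps this exchange rather than exposing raw XML.

```python
import xml.etree.ElementTree as ET

WPS = "http://www.opengis.net/wps/1.0.0"
OWS = "http://www.opengis.net/ows/1.1"

def build_execute_request(identifier, literal_inputs):
    """Build a WPS 1.0.0 Execute request body with literal inputs only."""
    ET.register_namespace("wps", WPS)
    ET.register_namespace("ows", OWS)
    root = ET.Element(f"{{{WPS}}}Execute",
                      attrib={"service": "WPS", "version": "1.0.0"})
    ET.SubElement(root, f"{{{OWS}}}Identifier").text = identifier
    data_inputs = ET.SubElement(root, f"{{{WPS}}}DataInputs")
    for name, value in literal_inputs.items():
        wps_input = ET.SubElement(data_inputs, f"{{{WPS}}}Input")
        ET.SubElement(wps_input, f"{{{OWS}}}Identifier").text = name
        data = ET.SubElement(wps_input, f"{{{WPS}}}Data")
        ET.SubElement(data, f"{{{WPS}}}LiteralData").text = value
    return ET.tostring(root, encoding="unicode")

# Hypothetical process identifier and parameter names:
xml_body = build_execute_request("candela:ChangeDetection",
                                 {"image1": "a.tif", "image2": "b.tif"})
```

Such a body would be POSTed to the GeoServer WPS endpoint; wpslib hides this plumbing behind Python calls.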

1.4.2 Creodiaslib

This library provides users of the notebook environment with facilities for searching and retrieving
products in the CreoDIAS catalog.
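A product search of the kind creodiaslib performs can be sketched as an OpenSearch-style query. The endpoint and parameter names below follow the public CreoDIAS finder (resto) API and are assumptions; the actual creodiaslib interface may expose them differently.

```python
from urllib.parse import urlencode

# Assumed endpoint of the CreoDIAS finder catalog (resto API)
FINDER = "https://finder.creodias.eu/resto/api/collections/{collection}/search.json"

def build_search_url(collection, start, end, bbox, max_records=10):
    """Build a catalog search URL for a collection, time range and bbox."""
    params = {
        "startDate": start,
        "completionDate": end,
        # bbox given as lonmin, latmin, lonmax, latmax
        "box": ",".join(str(c) for c in bbox),
        "maxRecords": max_records,
    }
    return FINDER.format(collection=collection) + "?" + urlencode(params)

url = build_search_url("Sentinel2", "2018-06-01", "2018-06-30",
                       (1.3, 43.5, 1.5, 43.7))
print(url)
```

The JSON response of such a query lists matching products with their identifiers and paths, which can then be resolved against the /eodata repository.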

1.5 Component interactions

The following schema presents the interactions between the system components.

Figure 7: Components interactions overview

CANDELA is deployed on the CreoDIAS [5] cloud environment provided by CloudFerro.


When user-X connects to the JupyterLab GUI (1), JupyterHub checks authentication against Keycloak
(2) and instantiates a Notebook Server dedicated to user-X (3). User-X can then interact with the
Notebook Server through the JupyterLab interface, which allows creating Notebook files and manipulating
WPS processes deployed in GeoServer (4) via the Python wpslib library. User-X can run a WPS process

service, which is handled by GeoServer and can access CreoDIAS EO data through an S3fs
volume (5) mounted on the system.
The DMG process from DLR, used to perform semantic classification on images, produces a set of images
and structured information that is stored in the MonetDB database (6).
The semantic search tool provided by IRIT is composed of a web application hosted on a Tomcat server
which uses a PostGIS database as a backend (7). Users can access the application using a web
browser (8). The administrator has private access to write files on the Tomcat server using the WebDAV
protocol, and direct access to the PostGIS database (9).
Jupyter notebook users can also access MonetDB and PostGIS from their notebooks using the pymonetdb
and psycopg2 libraries (10, 11).
Two REST services have been developed to access MonetDB: one performing select requests, the other
update requests.
The interfaces presented in Figure 7 are described in the following table:
Table 2: Component interfaces overview

# | Components involved | Protocol / Standard | Description
1 | Front-end – GeoServer | HTTP/WPS | Access to WPS operations: GetCapabilities, DescribeProcess
2 | Front-end – Keycloak | HTTP | Administration
3 | Front-end – JupyterHub | HTTP | Administration
4 | Front-end – Notebook | HTTP | JupyterLab – interaction with WPS processing services
5 | Webservices – GeoServer | HTTP/WPS | GeoServer web services, based on the OGC WPS standard

2 Integration and validation strategy
The integration and validation process, strategy and infrastructure are described in the following
sections.

2.1 Integration process

The integration process defines roles and a workflow for integration.

2.1.1 Roles and responsibilities

The following roles have been identified in the integration and validation process:
• Development teams
Several development teams are working in parallel, with different purposes. Two kinds of
development teams have been identified:
o Algorithm providers, who are in charge of providing the processing algorithms
described in section 1.2.5
o Platform provider, who is in charge of providing the CANDELA dedicated platform
• Integration team, who is in charge of the following tasks:
o Defining integration infrastructure as described in section 2.3
o Defining integration test cases as described in section 3
o Integrating all components into a platform as described in section 2.1.2.2
o Running integration tests into integration infrastructure and reporting test results
and associated issues.
• Validation team, who is in charge of the following tasks:
o Defining validation tests
o Running validation tests and reporting test results and associated issues.
• User, who is a final user of the deployed application.

2.1.2 Integration and validation workflow
The following schema summarizes the integration and validation workflow.

Figure 8: Overview of the integration and validation workflow

2.1.2.1 Development phase


Each development team works in a different development environment according to its own needs.
Algorithm providers and the platform provider do not need the same tooling. These development
environments are local to each partner. Each development team is in charge of the unit tests of its own
components inside this environment.
2.1.2.1.1 Service development
Algorithm providers develop the scripts and deliver them to the integration team.
The delivery of a processing service to the integration team consists of the following content:
• The source code of the algorithm application (Python, Java, R, etc.)
• A detailed description of
o The execution environment
o The inputs and outputs of the application
o The command lines to execute the application
2.1.2.1.2 CANDELA platform development
The CANDELA platform is built on top of the CloudFerro infrastructure. VMs and storage resources are
instantiated in the CreoDIAS cloud environment and configured to be part of a single Kubernetes cluster.
All the components of the CANDELA platform are then deployed as pods in this Kubernetes layer.

2.1.2.2 Integration phase

The integration phase aims at testing the technical integration and deployment of components into a
testing infrastructure similar to the production environment. Integration tests are performed by the
integration team in order to ensure a correct deployment of the components and valid interactions
between the components.
The inputs of the integration phase are the following:
• Common data identified from the processing services' expected inputs
• Sources of the processing service applications
• The library for interacting with the processing services
As soon as these inputs are available, the integration phase is launched: it consists in applying the
integration strategy described in section 2.2 and, for the integration team, in running the tests described
in section 3.
The integration environment is described in section 2.3.1.

2.1.2.3 Functional validation phase

The functional validation phase is performed once the integration phase is successful, which means that
all integration tests have been run without any remaining blocking or major issue.
The objective of the functional validation phase is for the validation team from WP1 and WP2 to test the
functional behaviour of the platform, through Jupyter Notebooks.
The functional validation environment is described in section 2.3.2.

2.2 Integration strategy

The figure below describes the steps for the integration of the CANDELA platform.

Figure 9: Integration steps

The first step is the deployment of the Kubernetes environment on the CloudFerro infrastructure.
On top of this environment,
• Data analytics algorithms are integrated into the GeoServer component as WPS processing
services, following these steps:
o First, the script provided as source code by the service provider is encapsulated into a
Docker container, with inputs, outputs and launching commands clearly identified,
o Then this Docker image is registered into the platform Docker registry in order to be
accessible for launching,
o Finally, the processing service is integrated into the GeoServer component as a
GeoServerProcess calling the Docker image with a correct mapping of input and output
directories on the distributed file system.
• Data connectors are integrated into the platform
• The Keycloak tool, necessary for account creation, is deployed
• JupyterHub is integrated, providing Jupyter Notebooks for users to interact with the platform
After the integration of these components, it is possible to create an account for a new user, allowing
them to connect to their Jupyter instance and interact with the platform through Notebooks.
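The encapsulation step above can be sketched as a Dockerfile. The base image, file names and entry point below are illustrative assumptions, not the actual CANDELA build files.

```dockerfile
# Illustrative encapsulation of a delivered processing script; base
# image, file names and entry point are assumptions.
FROM python:3.6-slim

# Install the libraries the algorithm declares in its delivery notes
COPY requirements.txt /app/
RUN pip install --no-cache-dir -r /app/requirements.txt

# The algorithm sources delivered by the provider
COPY change_detection.py /app/

# Inputs are read from /data/input and results written to /data/output,
# both mounted from the shared file system at launch time
ENTRYPOINT ["python", "/app/change_detection.py", \
            "--input", "/data/input", "--output", "/data/output"]
```

The resulting image is then tagged and pushed to the platform Docker registry so the GeoServerProcess can reference it at launch time.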

2.3 Integration and validation test infrastructure

2.3.1 CANDELA infrastructure

The following schema represents the architecture of a CANDELA environment, as already introduced in
the System Architecture document [7]. The environment is composed of:
• One "bounce" machine, i.e. a machine whose role is to route all user access from outside
and to handle security issues,
• Several computation machines, where the Kubernetes cluster is deployed. The cluster is always
composed of one Kubernetes master machine and several Kubernetes worker machines. The
number of worker machines is arbitrary and can even be adapted dynamically according to the
computation needs. The CANDELA cluster is configured to always have at least 2 worker
machines, to be able to permanently host the base of the platform.
• A storage unit, shared between all the Kubernetes machines, to host the input and output
datasets handled by the CANDELA processing algorithms.

[Figure: candela-bounce bounce machine (1 vCPU / 1 GB), accessed from outside over SSH;
candela-001 K8s master (4 vCPU / 8 GB), candela-002 K8s worker (4 vCPU / 16 GB) and
candela-003 K8s worker (8 vCPU / 32 GB), accessed over SSH and over TCP ports 80, 443 and
30000-30030; a /data volume of 500 GB shared over NFS]
Figure 10: Architecture of the CANDELA infrastructure

2.3.2 Infrastructure instances

The following schema presents the different required infrastructures. The integration, validation and
production environments are all instances of the general case presented above. At the current point of
the project, only the integration instance is deployed, and it is strictly similar to what is described above.
The validation and production environments might require more computation and storage capabilities,
which is entirely possible with the cloud facility CANDELA is using. The resources of the currently
deployed integration environment might also need to be increased in the future, which is also easily
feasible in the cloud facility used.

Figure 11: Overview of the integration and validation environments

According to the integration and validation workflow described in section 2.1.2, several environments
are necessary:
• One dedicated development environment for each development team (handled by the team
itself). These environments are not hosted on a cloud infrastructure.
• An integration environment, available to the integration team during the integration phase.
• A functional validation environment, available to the validation team during the validation phase.
• In projects aiming at reaching real end users, a production environment is also required in the
last stages of the project, to start testing the platform in real use conditions. The utility of such
a production environment for the CANDELA project has not yet been assessed, as the project
mainly aims at being a proof of concept and a technological sandbox to validate cloud scaling
capabilities. The possibility of having this third environment is kept as an option, depending on
the needs observed in the next steps of the project.

3 Integration tests
3.1 Components deployment

3.1.1 GeoServer

3.1.1.1 GeoServer availability

• Test objective: Ensures the availability of the deployed GeoServer
• Test Id: CANDELA_INT_TEST_3111_DEPLOYMENT_GEOSERVER_COMPONENT
• Prerequisites:
o GeoServer components are deployed
• Test steps:
# | Description | Expected result
1 | Access the GeoServer deployment through the direct URL <<GeoServer URL>> | GeoServer is accessible
• Interfaces under tests: N/A

3.1.1.2 GeoServer WPS processes availability

• Test objective: Ensures that the expected WPS processes are deployed into GeoServer
• Test Id: CANDELA_INT_TEST_3112_DEPLOYMENT_GEOSERVER_WPS
• Prerequisites:
o CANDELA_INT_TEST_3111_DEPLOYMENT_GEOSERVER_COMPONENT successful
• Test steps:

# | Description | Expected result
1 | Access the GeoServer deployment through the direct URL <<GeoServer URL>> | GeoServer is accessible
2 | Connect as an admin user with the following credentials: Login: <<GeoServer admin user>> / Password: <<GeoServer admin password>> | User is connected to GeoServer
3 | Select the WPS process list page <<GeoServer URL>>/wicket/bookmarkable/org.geoserver.wps.web.WPSRequestBuilder | The following processes are displayed in the list:
• Interfaces under tests: WPS Process integration

3.1.2 Keycloak

3.1.2.1 Keycloak availability

• Test objective: Ensures the availability of the deployed Keycloak
• Test Id: CANDELA_INT_TEST_3121_DEPLOYMENT_KEYCLOAK_COMPONENT
• Prerequisites:
o The Keycloak component is deployed
• Test steps:
# | Description | Expected result
1 | Access the Keycloak deployment through the direct URL <<Keycloak URL>> | Keycloak is accessible
• Interfaces under tests: N/A

3.1.2.2 Keycloak administration interface availability

• Test objective: Ensures the connection to the Keycloak administration interface
• Test Id: CANDELA_INT_TEST_3122_DEPLOYMENT_KEYCLOAK_ADMIN

• Prerequisites:
o CANDELA_INT_TEST_3121_DEPLOYMENT_KEYCLOAK_COMPONENT successful
• Test steps:
# | Description | Expected result
1 | Access the Keycloak deployment through the direct URL <<Keycloak URL>> | Keycloak is accessible
2 | Connect as an admin user with the following credentials: Login: <<GeoServer admin user>> / Password: <<GeoServer admin password>> | User is connected to Keycloak
3 | Select the Clients list page in the left menu bar | JupyterHub and GeoServer are in the clients list
• Interfaces under tests: N/A

3.1.3 JupyterHub

3.1.3.1 JupyterHub availability

• Test objective: Ensures the availability of the deployed JupyterHub
• Test Id: CANDELA_INT_TEST_3131_DEPLOYMENT_JUPYTERHUB_COMPONENT
• Prerequisites:
o The JupyterHub component is deployed
• Test steps:
# | Description | Expected result
1 | Access the JupyterHub deployment through the direct URL <<JupyterHub URL>> | JupyterHub is accessible
• Interfaces under tests: N/A

3.1.3.2 JupyterHub administration interface availability

• Test objective: Ensures the connection to the JupyterHub administration interface
• Test Id: CANDELA_INT_TEST_3132_DEPLOYMENT_JUPYTERHUB_ADMIN
• Prerequisites:
o CANDELA_INT_TEST_3131_DEPLOYMENT_JUPYTERHUB_COMPONENT successful
• Test steps:

# | Description | Expected result
1 | Access the JupyterHub deployment through the direct URL <<JupyterHub URL>> | JupyterHub is accessible
2 | Connect as an admin user with the following credentials: Login: <<GeoServer admin user>> / Password: <<GeoServer admin password>> | User is connected to JupyterHub
3 | |
• Interfaces under tests: N/A

3.1.4 Semsearch

3.1.4.1 Semsearch availability

• Test objective: Ensures the availability of Semsearch
• Test Id: CANDELA_INT_TEST_3141_DEPLOYMENT_SEMSEARCH_COMPONENT
• Prerequisites:
o The Semsearch component is deployed
• Test steps:
# | Description | Expected result
1 | Access the Semsearch deployment through the direct URL https://185.178.85.62/semsearch | Semsearch is accessible
• Interfaces under tests: N/A

3.1.5 PostGIS

3.1.5.1 PostGIS availability

• Test objective: Ensures the availability of PostGIS
• Test Id: CANDELA_INT_TEST_3151_DEPLOYMENT_POSTGIS_COMPONENT
• Prerequisites:
o The PostGIS component is deployed
o Python is installed
o psycopg2 is installed
• Test steps:

# | Description | Expected result
1 | Start a Python interpreter |
2 | Execute the following lines: | No errors returned

    import psycopg2
    hostname = '185.178.85.62'
    port = '30023'
    username = 'candela'
    password = 'melodiH2020'
    database = 'ep'
    myConnection = psycopg2.connect(
        host=hostname, user=username,
        password=password, dbname=database,
        port=port)

3 | Execute: myConnection.info.dbname | Returns 'ep'
• Interfaces under tests: N/A

3.1.5.2 PostGIS availability from a notebook

• Test objective: Ensures the availability of PostGIS from a notebook environment
• Test Id: CANDELA_INT_TEST_3152_DEPLOYMENT_POSTGIS_COMPONENT
• Prerequisites:
o The PostGIS component is deployed
• Test steps:

# | Description | Expected result
1 | Start a notebook on the CANDELA platform |
2 | Start a new Python 3 notebook | New notebook starts
3 | Execute the following lines: | No errors returned

    import psycopg2
    hostname = 'candela-postgis-service'
    port = '65432'
    username = 'candela'
    password = 'melodiH2020'
    database = 'ep'
    myConnection = psycopg2.connect(
        host=hostname, user=username,
        password=password, dbname=database,
        port=port)

4 | Execute: myConnection.info.dbname | Returns 'ep'
• Interfaces under tests: N/A

3.1.6 MonetDB

3.1.6.1 MonetDB Rest service availability

• Test objective: Ensures the availability of MonetDB through the REST API
• Test Id: CANDELA_INT_TEST_3161_DEPLOYMENT_MONETDB_COMPONENT
• Prerequisites:
o The MonetDB component is deployed
o curl is installed on the local computer
• Test steps:

# | Description | Expected result
1 | Launch a terminal |
2 | Execute the command: | Returns the tables definition

    curl -u candela-rest:Candela1234 -X GET \
    'https://185.178.85.62/rest/monetdb_select?request=select%20%2A%20from%20%20tables%20where%20system%20=%20false;'

3 | Execute the command: | Returns the tables definition

    curl -u candela-rest:Candela1234 -X POST \
    https://185.178.85.62/rest/monetdb_select \
    -F 'file=@/Path/to/file/request.sql'

• Interfaces under tests: N/A
• request.sql contains:

    Select * from tables where system = false;

3.1.6.2 MonetDB availability from a notebook

• Test objective: Ensures the availability of MonetDB from a notebook environment
• Test Id: CANDELA_INT_TEST_3162_DEPLOYMENT_MONETDB_COMPONENT
• Prerequisites:
o The MonetDB component is deployed
• Test steps:

# | Description | Expected result
1 | Start a notebook on the CANDELA platform |
2 | Start a new Python 3 notebook | New notebook starts
3 | Execute the following lines: | Returns 17

    import os
    import pymonetdb
    hostnm = "candela-monetdb-service"
    prt = os.getenv('CANDELA_MONETDB_SERVICE_SERVICE_PORT', None)
    connection = pymonetdb.connect(username="monetdb",
        password="monetdb", hostname=hostnm, port=prt,
        database="candela")
    cursor = connection.cursor()
    cursor.execute("select * from tables where system = false;")

4 | Execute: cursor.fetchall() | Returns the tables definition
• Interfaces under tests: N/A

3.2 Notebook server

3.2.1 Notebook server availability

• Test objective: Ensures that the user's Notebook Server instance starts
• Test Id: CANDELA_INT_TEST_321_DEPLOYMENT_NOTEBOOKSERVER_INSTANCE
• Prerequisites:
o The JupyterHub component is deployed
• Test steps:

# | Description | Expected result
1 | Access the Notebook Server through the direct URL <<Notebook Server URL>> | The Notebook Server starts and is accessible
2 | Connect with your user credentials | Your Notebook Server instance starts and the web interface opens; in the left panel, you have access to /public, /work and a README file
• Interfaces under tests: N/A

3.2.2 Python kernel running

• Test objective: Ensures that the Python kernel runs correctly


• Test Id: CANDELA_INT_TEST_322_DEPLOYMENT_NOTEBOOKSERVER_KERNEL
• Prerequisites:
o CANDELA_INT_TEST_321_DEPLOYMENT_NOTEBOOKSERVER_INSTANCE successful
• Test steps:

# | Description | Expected result
1 | Access the Notebook Server through the direct URL <<Notebook Server URL>> | The Notebook Server starts and is accessible
2 | Connect with your user credentials | User is connected to their Notebook Server
3 | Open the test_32.ipynb Notebook file [Annex I: Notebook server test script] |
4 | Click in the cell containing the Python code, then click on the play button at the top | The code is executed and displays "Python kernel is running"
• Interfaces under tests: N/A

3.3 Data access

3.3.1 CreoDIAS data availability

• Test objective: Ensures that CreoDIAS data is accessible from the platform
• Test Id: CANDELA_INT_TEST_331_CREODIAS_DATA_CONNECTOR
• Test steps:
# | Description | Expected result
1 | Access the platform through MobaXterm, connect to the worker instances (candela-002, candela-003) and list the content of the /eodata folder | The /eodata folder contains data from all the collections provided by CreoDIAS
• Interfaces under tests: N/A

4 Processing services
This section presents the first basic tests of the algorithms proposed by the partners. These tests also
cover the custom Python library wpslib, the Notebook Server with the Python kernel, and the GeoServer
processing services.

4.1 Optical change detection processing chain

4.1.1 Change detection processes run correctly

• Test objective: Ensures that the Change Detection, Change Index and Change Clustering
processes run correctly
• Test Id: CANDELA_INT_TEST_411_CHANGE_DETECTION_PROCESSES_SEPARATELY
• Test steps:

# | Description | Expected result
1 | Access the Notebook Server through the direct URL <<Notebook Server URL>> | The Notebook Server starts and is accessible
2 | Connect with your user credentials | User is connected to their Notebook Server
3 | Go to the /work directory and open the Notebook file test_41.ipynb [Annex II: Change detection processing chain test script] |
4 | Run the cell containing the Python code below "#4.1.1.4 test the Change Detection Process" | The code is executed without error. The folder /work/tests/test_change_detection_chain/ contains a recent file named test_change_detection.tif
5 | Run the cell containing the Python code below "#4.1.1.5 test the Change Index Process" | The code is executed without error. The folder /work/tests/test_change_detection_chain/ contains a recent file named test_change_index.tif
6 | Run the cell containing the Python code below "#4.1.1.6 test the Change Clustering Process" | The code is executed without error. The folder /work/tests/test_change_detection_chain/ contains a recent file named test_change_clustering.tif
• Interfaces under tests: N/A

4.1.2 Change detection processing pipeline runs correctly

• Test objective: Ensures that the Change Detection processing pipeline runs correctly
• Test Id: CANDELA_INT_TEST_412_CHANGE_DETECTION_PROCESSING_PIPELINE
• Test steps:

# | Description | Expected result
1 | Access the Notebook Server through the direct URL <<Notebook Server URL>> | The Notebook Server starts and is accessible
2 | Connect with your user credentials | User is connected to their Notebook Server
3 | Go to the /work directory and open the Notebook file test_41.ipynb [Annex II: Change detection processing chain test script] |
4 | Run the cell containing the Python code below "#4.1.2.4 test the Change Detection Processing Pipeline" | The code is executed without error. The folder /work/tests/test_change_detection_pipeline/ contains a recent file named test_change_clustering.tif
• Interfaces under tests: N/A

4.2 Data split, merge, transformation

4.2.1 Split Images and Merge Images processes run correctly

• Test objective: Ensures that the Split Images and Merge Images processes run correctly
• Test Id: CANDELA_INT_TEST_421_SPLIT_MERGE_IMAGE_PROCESSES
• Test steps:

# | Description | Expected result
1 | Access the Notebook Server through the direct URL <<Notebook Server URL>> | The Notebook Server starts and is accessible
2 | Connect with your user credentials | User is connected to their Notebook Server
3 | Go to the /work directory and open the Notebook file test_42.ipynb [Annex III: Data split, merge, transformation test script] |
4 | Run the cell containing the Python code below "#4.2.1.4 test the Split Images Process" | The code is executed without error. The folder /work/tests/test_SplitImages/ contains a /subtiles folder containing sub-images
5 | Run the cell containing the Python code below "#4.2.1.5 test the Merge Images Process" | The code is executed without error. The folder /work/tests/test_MergeImages/ contains the resulting merged .tif file
• Interfaces under tests: N/A

4.2.2 Jpeg2Tiff process runs correctly

• Test objective: Ensures that the Jpeg2Tiff process runs correctly


• Test Id: CANDELA_INT_TEST_421_JP2TIFF_PROCESS
• Test steps:

# | Description | Expected result
1 | Access the Notebook Server through the direct URL <<Notebook Server URL>> | The Notebook Server starts and is accessible
2 | Connect with your user credentials | User is connected to their Notebook Server
3 | Go to the /work directory and open the Notebook file test_42.ipynb [Annex III: Data split, merge, transformation test script] |
4 | Run the cell containing the Python code below "#4.2.2.4 test the Jpeg2Tiff Process" | The code is executed without error. The folder /work/tests/test_jpeg2tiff/ contains a recent result.tif file
• Interfaces under tests: N/A

4.3 SAR change detection

4.3.1 SAR change detection process runs correctly

• Test objective: Ensures that the SAR change detection runs correctly
• Test Id: CANDELA_INT_TEST_431_SAR_CHANGE_DETECTION
• Test steps:

# | Description | Expected result
1 | Access the Notebook Server through the direct URL <<Notebook Server URL>> | The Notebook Server starts and is accessible
2 | Connect with your user credentials | User is connected to their Notebook Server
3 | Copy the /public/SAR_CD directory to /work |
4 | Go to the /work/SAR_CD directory and open the Notebook file Test_SAR_CD.ipynb |
5 | Run the cell containing the Python code | The code is executed without error. The folder config contains two log files; the folder output/CD_MAPS contains the image result; the folder output/CROP_IMG contains two images; the folder output/LOGS/LOG contains the CANDELA_LOG.log file; the folder output/QLK contains two GeoTIFF and two KML files
• Interfaces under tests: N/A

4.4 DMG process

4.4.1 DMG process runs correctly

• Test objective: Ensures that the DMG process runs correctly


• Test Id: CANDELA_INT_TEST_441_DMG
• Test steps:

# | Description | Expected result
1 | Access the Notebook Server through the direct URL <<Notebook Server URL>> | The Notebook Server starts and is accessible
2 | Connect with your user credentials | User is connected to their Notebook Server
3 | Copy the /public/EOminer directory to /work |
4 | Go to the /work/EOminer directory and open the Notebook file Launch_eominer.ipynb |
5 | Run the cell containing the Python code | The code is executed without error. The folder eominer_logs contains two log files; the folder eominer_output/S1_product contains a file ingestion.sql, a file EOLib_S1_product.xml and many JPEG images
• Interfaces under tests: N/A

5 Integration deployment environment
The following table details the Kubernetes configuration defined for the components of the platform. For
each component, a Kubernetes service is defined so that the component can be reached from inside and
outside the environment. Volumes are also defined to make some data persistent (component
configuration, user data...).
The detailed configuration can be found in the Git repository of the project.
Table 3: Kubernetes configuration

Component | Service name | Port | Volumes
GeoServer | candela-geoserver-svc | 8080 | GeoServer_data, GeoServer_exchange_folder
Jupyter database | candela-jupyterhub-db-svc | 5432 | Candela-db-manager
Jupyter | candela-jupyterhub-svc | 8080/30002 | Candela-pod-manager
Keycloak | candela-keycloak-svc | 8080/30003 | Candela-keycloakdata

The following deployment values are to be used in the integration environment.


Table 4: Integration environment configuration

Name | Value
<<GeoServer URL>> | https://185.x.y.z/geoserver/web/
<<GeoServer admin user>> | To be created by the integration test team
<<GeoServer admin password>> | To be created by the integration test team
<<Keycloak URL>> | http://185.x.y.z/auth/admin/master/console/#/realms/master
<<JupyterHub URL>> | http://185.x.y.z/hub/
<<Notebook Server URL>> | http://185.x.y.z/hub/
<<test_user>> | To be created by the integration test team
<<test_user_pwd>> | To be created by the integration test team
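The <<...>> placeholders from this table are substituted into the test steps and scripts before execution; a minimal sketch of how such a substitution could be automated (the fill_placeholders helper and the config dict are illustrative assumptions, not part of the platform):

```python
import re

def fill_placeholders(text, values):
    """Replace each <<name>> placeholder with its value from the configuration table."""
    def repl(match):
        name = match.group(1)
        if name not in values:
            raise KeyError("No value configured for placeholder <<%s>>" % name)
        return values[name]
    return re.sub(r"<<([^>]+)>>", repl, text)

# Hypothetical configuration, mirroring Table 4
config = {
    "GeoServer URL": "https://185.x.y.z/geoserver/web/",
    "Notebook Server URL": "http://185.x.y.z/hub/",
}
step = "Access the Notebook Server through its direct URL <<Notebook Server URL>>"
print(fill_placeholders(step, config))
```

Raising on an unknown placeholder makes a missing Table 4 entry fail loudly instead of leaving a literal <<name>> in the executed test.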

6 Conclusion
This document presented a first version of the system integration and validation test plan. It describes
the system components and the interactions between them, the integration strategy, the environment
in which the components are tested, and the integration tests that can be performed to validate the
system.
This first version mainly focuses on the technical basis of the CANDELA platform: the IaaS and PaaS
layers, and the components providing generic services (authentication, computation capabilities, standard
interfaces…).
D3.5 v1 (released in November 2018) has been updated following the requirements of the first project
review meeting. This updated version includes new standard components integrated on the platform,
details on how the algorithms and partners’ applications are integrated into the platform, an update of
the component interaction section, and the description of new test cases. The annexes have also been
complemented with two new examples of Python test scripts.
The next version should include a validation test plan related to the test cases running on the platform:
first on a technical basis, validating that the algorithms run on the platform work properly, and then on
a functional basis, validating that the algorithms fulfil the needs of the described use cases.

References
[1] GeoServer, http://geoserver.org, retrieved 2018/11/26
[2] Keycloak, https://www.keycloak.org/, retrieved 2018/11/26
[3] Jupyter Notebook, http://jupyter.org, retrieved 2018/11/26
[4] JupyterHub sub-project, http://jupyter.org/hub, retrieved 2018/11/26
[5] CreoDIAS Data Finder API, https://creodias.eu/eo-data-finder-api-manual, retrieved 2018/11/26
[6] OWSLib API, https://geopython.github.io/OWSLib/, retrieved 2018/11/26
[7] CANDELA Deliverable D3.1 System architecture design and Operational scenarios document v1,
dated 2018/10/29
[8] F. Castel, “D3.3 CANDELA Platform v1”, Deliverable of the CANDELA project, 2018, retrieved
2019/05/24

Annexes
Annex I: Notebook server test script

# coding: utf-8
# # A test to check that the Python kernel is running

print("Python kernel is running")

Annex II: Change detection processing chain test script

# coding: utf-8
#4.1.1.4 test the Change Detection Process

import wpslib

wpslogger = wpslib.getLogger(True)

identifier = 'candela:ChangeDetectionProcessing'
PROCESS_ID = "changedetection-timeseries"
inputs = [("IMAGES","/public/test-files/Images/Harbour/TimeSeries/"),
("PROCESS_ID",PROCESS_ID),
("OUTPUT_FOLDER","/work/tests/test_change_detection_chain/"),
("CONFIG_FOLDER","/work/tests/test_change_detection_chain/config/"),
("OUTPUT_FILENAME","test_change_detection.tif")
]
outputs = [('outpath',True),('logfiles',True)]

wpslogger.info('[ChangeDetectionTest] ------- Starting ChangeDetection process test -------')

wpslib.runSingleProcess(identifier, inputs, outputs, wpslogger)

wpslogger.info('[ChangeDetectionTest] ------- ChangeDetection process test ended -------')

#4.1.1.5 test the Change Index process

identifier = 'candela:ChangeIndexProcessing'
PROCESS_ID = "changeindex-timeseries"
inputs =
[("IMAGE","/work/tests/test_change_detection_chain/test_change_detection.tif"),
("PROCESS_ID",PROCESS_ID),
("OUTPUT_FOLDER","/work/tests/test_change_detection_chain/"),
("CONFIG_FOLDER","/work/tests/test_change_detection_chain/config/"),
("OUTPUT_FILENAME","test_change_index.tif")
]

wpslogger.info('[ChangeIndexTest]------- Starting ChangeIndex process test -------')

wpslib.runSingleProcess(identifier, inputs, outputs, wpslogger)

wpslogger.info('[ChangeIndexTest] ------- ChangeIndex process test ended -------')

#4.1.1.6 test the Change Clustering process

identifier = 'candela:ChangeClusteringProcessing'
PROCESS_ID = "changeclustering-timeseries"
inputs = [("IMAGE","/work/tests/test_change_detection_chain/test_change_index.tif"),
("PROCESS_ID",PROCESS_ID),
("OUTPUT_FOLDER","/work/tests/test_change_detection_chain/"),
("CONFIG_FOLDER","/work/tests/test_change_detection_chain/config/"),
("OUTPUT_FILENAME","test_change_clustering.tif")
]

wpslogger.info('[ChangeClusteringTest] ------- Starting ChangeClustering process test -------')

wpslib.runSingleProcess(identifier, inputs, outputs, wpslogger)

wpslogger.info('[ChangeClusteringTest] ------- ChangeClustering process test ended -------')

#4.1.2.4 test the Change Detection Processing Pipeline

pipeline = []

identifier = 'candela:ChangeDetectionProcessing'
PROCESS_ID = "changedetection-timeseries"
inputs = [("IMAGES","/public/test-files/Images/Harbour/TimeSeries/"),
("PROCESS_ID",PROCESS_ID),
("OUTPUT_FOLDER","/work/tests/test_change_detection_pipeline/"),
("CONFIG_FOLDER","/work/tests/test_change_detection_pipeline/config/"),
("OUTPUT_FILENAME","test_change_detect.tif")
]
outputs = [('outpath',True),('logfiles',True)]
process1 = {
"PROCESS_ID": PROCESS_ID,
"IDENTIFIER": identifier,
"INPUTS": inputs,
"OUTPUTS": outputs,
"PREVIOUS_PROCESS": "",
"INPUT":"",
"PREVIOUS_VALUE": ""
}
pipeline.append(process1)

identifier = 'candela:ChangeIndexProcessing'
PROCESS_ID = "changeindex-timeseries"
inputs = [("PROCESS_ID",PROCESS_ID),
("OUTPUT_FOLDER","/work/tests/test_change_detection_pipeline/"),
("CONFIG_FOLDER","/work/tests/test_change_detection_pipeline/config/"),
("OUTPUT_FILENAME","test_change_index.tif")
]
process2 = {
"PROCESS_ID": PROCESS_ID,
"IDENTIFIER": identifier,
"INPUTS": inputs,
"OUTPUTS": outputs,
"PREVIOUS_PROCESS": "changedetection-timeseries",
"INPUT":"IMAGE",
"PREVIOUS_VALUE": "outpath"
}
pipeline.append(process2)

identifier = 'candela:ChangeClusteringProcessing'
PROCESS_ID = "changeclustering-timeseries"
inputs = [("PROCESS_ID",PROCESS_ID),

("OUTPUT_FOLDER","/work/tests/test_change_detection_pipeline/"),
("CONFIG_FOLDER","/work/tests/test_change_detection_pipeline/config/"),
("OUTPUT_FILENAME","test_change_clustering.tif")
]
process3 = {
"PROCESS_ID": PROCESS_ID,
"IDENTIFIER": identifier,
"INPUTS": inputs,
"OUTPUTS": outputs,
"PREVIOUS_PROCESS": "changeindex-timeseries",
"INPUT":"IMAGE",
"PREVIOUS_VALUE": "outpath"
}
pipeline.append(process3)

wpslogger.info('[ChangeDetectionPipeline] ------- Starting ChangeDetection Pipeline Test -------')

wpslib.runSinglePipeline(pipeline, wpslogger)

wpslogger.info('[ChangeDetectionPipeline] ------- ChangeDetection Pipeline Test ended -------')

Annex III: Data split, merge, transformation test script

# coding: utf-8
#4.2.1.4 test the Split Images Process

import wpslib

wpslogger = wpslib.getLogger(True)

identifier = 'candela:SplitImagesProcessing'

inputs = [("IMAGES","/public/test-files/Images/IMAGES_TerraNIS/im_2016.tif"),
("PROCESS_ID","test-splitimages-im_2016"),
("OUTPUT_FOLDER","/work/tests/test_SplitImages/"),
("N_SPLITS","6")
]
outputs = [('outpath',True),('logfiles',True)]

wpslogger.info('[SplitImagesTest]-- Starting SplitImages process test --')

execution = wpslib.runSingleProcess(identifier, inputs, outputs, wpslogger)

wpslogger.info('[SplitImagesTest] -- SplitImages process test ended ----')

#4.2.1.5 test the Merge Images Process

identifier = 'candela:MergeImagesProcessing'

inputs = [("IMAGES","/work/tests/test_SplitImages/subtiles"),
("IMAGE_NAME","im_2016.tif"),
("PROCESS_ID","test-mergeimages-im_2016"),
("OUTPUT_FOLDER","/work/tests/test_MergeImages/")
]

wpslogger.info('[MergeImagesTest]--- Starting MergeImages process test --')

execution = wpslib.runSingleProcess(identifier, inputs, outputs, wpslogger)

wpslogger.info('[MergeImagesTest] -- MergeImages process test ended ----')

#4.2.2.4 test the Jpeg2Tiff Process

identifier = 'candela:Jpeg2TiffProcessing'

inputs = [("IMAGES","/eodata/Sentinel-2/MSI/L1C/2018/09/11/S2A_MSIL1C_20180911T105621_N0206_R094_T30TXQ_20180911T131820.SAFE/GRANULE/L1C_T30TXQ_A016822_20180911T110521/IMG_DATA/"),
("PROCESS_ID","test-jpeg2tiff"),
("OUTPUT_FOLDER","/work/tests/test_jpeg2tiff/"),
("OUTPUT_FILENAME","result.tif")
]

wpslogger.info('[Jpeg2TiffTest]----- Starting Jpeg2Tiff process test -----')

execution = wpslib.runSingleProcess(identifier, inputs, outputs, wpslogger)

wpslogger.info('[Jpeg2TiffTest] ------- Jpeg2Tiff process test ended -----')

Annex IV: SAR change detection test script

import os
import wpslib
wpslogger = wpslib.getLogger(True)

#images must be Sentinel-1 SAR images with sensor mode SM
#images must have the same footprint
#images must have the same polarisation
IMAGES = "/work/SAR_CD/input"
OUTPUT_FOLDER = "/work/SAR_CD/output"
PROCESS_ID = "test-SarCD-jupyter"
#below this point are optional parameters
#config folder contains technical logs
CONFIG_FOLDER = "/work/SAR_CD/config"
#pixel dimension of the tiles into which the images will be divided and CD performed;
#default value is 30 (recommended).
TILE_SIZE = "30"
#threshold value in the [0.1, 1] interval for the CD analysis;
#if the difference between the 2 NN outputs on a given tile is above (or equal to)
#this value, the scene in said tile will be marked as “changed” (red);
#“not changed” (blue) otherwise. The higher the threshold, the higher the probability
#that a change occurred has to be for the tile to be marked as “changed”.
THRESHOLD = "0.6"
#defines the information levels to be included in the LOG file
#(ALL, DEBUG, INFO, WARN, ERROR, FATAL, OFF, TRACE)
LOG_FILE_LEVEL = "ALL"
#subfolder in the output folder in which quicklooks are created
QLK_DIR = "QLK"
#subfolder in the output folder in which crop images are stored
CROP_IMAGES = "CROP_IMG"
#subfolder in the output folder containing the change detection result
CHANGE_DETECTION_MAPS = "CD_MAPS"
#subfolder in the output folder containing log file
LOG_DIR = "LOGS"

identifier = 'candela:SarChangeDetectionProcessing'
MEMORY_REQUEST = "10000Mi"

outputs = [('outpath',True),('logfiles',True)]

inputs = [("IMAGES",IMAGES),
("PROCESS_ID",PROCESS_ID),
("OUTPUT_FOLDER",OUTPUT_FOLDER),
("CONFIG_FOLDER",CONFIG_FOLDER),
("TILE_SIZE",TILE_SIZE),
("THRESHOLD",THRESHOLD),
("LOG_FILE_LEVEL",LOG_FILE_LEVEL),
("QLK_DIR",QLK_DIR),
("CROP_IMAGES",CROP_IMAGES),
("CHANGE_DETECTION_MAPS",CHANGE_DETECTION_MAPS),
("LOG_DIR",LOG_DIR),
("MEMORY_REQUEST",MEMORY_REQUEST)
]

execution = wpslib.runSingleProcess(identifier, inputs, outputs, wpslogger)

Annex V: EOminer DMG test script

import os
import wpslib
wpslogger = wpslib.getLogger(True)

INPUT = "/eodata/Sentinel-1/SAR/GRD/2019/03/08/S1B_IW_GRDH_1SDV_20190308T171625_20190308T171650_015266_01C90B_2269.SAFE"
OUTPUT = "/work/EOminer/eominer_output"
LOG_FILE_PATH = "/work/EOminer/eominer_logs"
identifier = 'candela:EOminerProcessing'
PROCESS_ID = "test-eominer-jupyter"
MEMORY_REQUEST = "10000Mi"
#Size of the tiles
TILE_SIZE = "120"
#gridLevels choose from (1,2,3)
GRID_LEVELS = "1"
#productType choose from (S1,S2)
PRODUCT_TYPE = "S1"
#featureMethods combine from (GLM,WLD,CHIS,GLC)
FEATURE_METHODS = "GLM"

outputs = [('outpath',True),('logfiles',True)]

inputs = [("INPUT",INPUT),
("PROCESS_ID",PROCESS_ID),
("OUTPUT",OUTPUT),
("LOG_FILE_PATH",LOG_FILE_PATH),
("MEMORY_REQUEST",MEMORY_REQUEST),
("TILE_SIZE",TILE_SIZE),
("GRID_LEVELS",GRID_LEVELS),
("PRODUCT_TYPE",PRODUCT_TYPE),
("FEATURE_METHODS",FEATURE_METHODS)
]

execution = wpslib.runSingleProcess(identifier, inputs, outputs, wpslogger)

Annex VI: CANDELA Service Providers Guide for service integration

How to package your application to create a service that will be integrated into the Candela Platform

This tutorial is written to help the partners package (dockerise) their applications into usable services,
in order to allow their integration into the Candela platform by Atos FR.

How the service will be exposed on the Candela Platform

To understand how to package a service, it is helpful to see how it will be exposed on the Candela
platform and how users will be able to run it.
Please read the “3.3 Processing Pipeline” section of the “CANDELA D3.1 System Architecture design and
operational scenarios” document to better understand what GeoServer is.
Services are integrated into GeoServer and can be requested through the standard WPS API.
For instance, a GetCapabilities request returns all the services exposed by GeoServer:

<wps:ProcessOfferings>
<wps:Process wps:processVersion="1.0.0">

<ows:Identifier>candela:ChangeClusteringProcessing</ows:Identifier>
<ows:Title>ChangeClustering</ows:Title>
<ows:Abstract>&lt;p&gt;Computes a clustering of change types based
on a change index image.&lt;/p&gt;</ows:Abstract>
</wps:Process>
<wps:Process wps:processVersion="1.0.0">
<ows:Identifier>candela:ChangeDetectionProcessing</ows:Identifier>
<ows:Title>ChangeDetection</ows:Title>
<ows:Abstract>&lt;p&gt;Computes unsupervised change detection on a
time-series of sentinel 2 images.&lt;/p&gt;</ows:Abstract>
</wps:Process>
<wps:Process wps:processVersion="1.0.0">
<ows:Identifier>candela:ChangeIndexProcessing</ows:Identifier>
<ows:Title>ChangeIndex</ows:Title>
<ows:Abstract>&lt;p&gt;Computes unsupervised change index on a
time-series change detection maps.&lt;/p&gt;</ows:Abstract>
</wps:Process>
<wps:Process wps:processVersion="1.0.0">

<ows:Identifier>candela:GeotiffCollectorProcessing</ows:Identifier>
<ows:Title>GeotiffCollector</ows:Title>
<ows:Abstract>&lt;p&gt;Collect GeoTiff files into a
folder.&lt;/p&gt;</ows:Abstract>
</wps:Process>
<wps:Process wps:processVersion="1.0.0">
<ows:Identifier>candela:Jpeg2TiffProcessing</ows:Identifier>
<ows:Title>Jpeg2Tiff</ows:Title>
<ows:Abstract>&lt;p&gt;Convert JPEG 2000 files to a GeoTiff
format.&lt;/p&gt;</ows:Abstract>
</wps:Process>
<wps:Process wps:processVersion="1.0.0">
<ows:Identifier>candela:TestProcessing</ows:Identifier>
<ows:Title>Test</ows:Title>
<ows:Abstract>&lt;p&gt;Computes unsupervised change detection on a
time-series of sentinel 2 images.&lt;/p&gt;</ows:Abstract>
</wps:Process>
</wps:ProcessOfferings>
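The list of exposed processes can also be extracted programmatically from the GetCapabilities response; a minimal sketch parsing a reduced excerpt of the XML above with the Python standard library:

```python
import xml.etree.ElementTree as ET

# WPS 1.0.0 / OWS 1.1 namespaces, as used in the GetCapabilities response
NS = {"wps": "http://www.opengis.net/wps/1.0.0",
      "ows": "http://www.opengis.net/ows/1.1"}

capabilities_excerpt = """
<wps:ProcessOfferings xmlns:wps="http://www.opengis.net/wps/1.0.0"
                      xmlns:ows="http://www.opengis.net/ows/1.1">
  <wps:Process wps:processVersion="1.0.0">
    <ows:Identifier>candela:ChangeDetectionProcessing</ows:Identifier>
    <ows:Title>ChangeDetection</ows:Title>
  </wps:Process>
  <wps:Process wps:processVersion="1.0.0">
    <ows:Identifier>candela:Jpeg2TiffProcessing</ows:Identifier>
    <ows:Title>Jpeg2Tiff</ows:Title>
  </wps:Process>
</wps:ProcessOfferings>
"""

root = ET.fromstring(capabilities_excerpt)
# Collect the ows:Identifier of every offered process
identifiers = [p.findtext("ows:Identifier", namespaces=NS)
               for p in root.findall("wps:Process", NS)]
print(identifiers)
```

In practice the same listing can be obtained against the live endpoint with the OWSLib client referenced in [6].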

From this, we can perform a DescribeProcess request that returns the description of a process: which
inputs it expects, which outputs it returns, and so on.

<?xml version="1.0" encoding="UTF-8"?>


<wps:ProcessDescriptions xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:ows="http://www.opengis.net/ows/1.1" xmlns:wps="http://www.opengis.net/wps/1.0.0"
xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xml:lang="en" service="WPS" version="1.0.0" xsi:schemaLocation="http://www.opengis.net/wps/1.0.0
http://schemas.opengis.net/wps/1.0.0/wpsAll.xsd">
<ProcessDescription wps:processVersion="1.0.0" statusSupported="true" storeSupported="true">
<ows:Identifier>candela:ChangeDetectionProcessing</ows:Identifier>
<ows:Title>ChangeDetection</ows:Title>
<ows:Abstract>&lt;p&gt;Computes unsupervised change detection on a time-series of
sentinel 2 images.&lt;/p&gt;</ows:Abstract>
<DataInputs>
<Input maxOccurs="1" minOccurs="1">
<ows:Identifier>IMAGES</ows:Identifier>
<ows:Title>IMAGES</ows:Title>
<ows:Abstract>Path to the sentinel 2 time-series folder, each image should be in
GeoTiff format</ows:Abstract>
<LiteralData>
<ows:AnyValue/>
</LiteralData>
</Input>
<Input maxOccurs="1" minOccurs="1">
<ows:Identifier>USERNAME</ows:Identifier>
<ows:Title>USERNAME</ows:Title>
<ows:Abstract>User name</ows:Abstract>
<LiteralData>
<ows:AnyValue/>
</LiteralData>
</Input>
<Input maxOccurs="1" minOccurs="0">
<ows:Identifier>OUTPUT_FILENAME</ows:Identifier>
<ows:Title>OUTPUT_FILENAME</ows:Title>
<ows:Abstract>User defined OUTPUT_FILENAME filename. If empty, default file
basename is detected_changes.tif</ows:Abstract>
<LiteralData>
<ows:AnyValue/>
<DefaultValue/>
</LiteralData>
</Input>
<Input maxOccurs="1" minOccurs="0">
<ows:Identifier>LOG_FILE</ows:Identifier>
<ows:Title>LOG_FILE</ows:Title>
<ows:Abstract>name of the output log file. Must have .log extension. If empty,
default file basename is debug.log</ows:Abstract>
<LiteralData>
<ows:AnyValue/>
<DefaultValue/>
</LiteralData>
</Input>
</DataInputs>
<ProcessOutputs>
<Output>
<ows:Identifier>filePath</ows:Identifier>
<ows:Title>filePath</ows:Title>
<LiteralOutput/>
</Output>
<Output>
<ows:Identifier>logFiles</ows:Identifier>
<ows:Title>logFiles</ows:Title>
<LiteralOutput/>
</Output>
</ProcessOutputs>
</ProcessDescription>
</wps:ProcessDescriptions>
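Whether an input is required or optional can be read from its minOccurs attribute; a minimal sketch parsing a reduced excerpt of the DescribeProcess response above:

```python
import xml.etree.ElementTree as ET

OWS = "{http://www.opengis.net/ows/1.1}"

describe_excerpt = """
<DataInputs xmlns:ows="http://www.opengis.net/ows/1.1">
  <Input maxOccurs="1" minOccurs="1">
    <ows:Identifier>IMAGES</ows:Identifier>
  </Input>
  <Input maxOccurs="1" minOccurs="0">
    <ows:Identifier>OUTPUT_FILENAME</ows:Identifier>
  </Input>
</DataInputs>
"""

root = ET.fromstring(describe_excerpt)
# minOccurs="1" marks a required input, minOccurs="0" an optional one
required = {inp.findtext(OWS + "Identifier"): inp.get("minOccurs") == "1"
            for inp in root.findall("Input")}
print(required)  # {'IMAGES': True, 'OUTPUT_FILENAME': False}
```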

Here we can see that the expected inputs are IMAGES, USERNAME, OUTPUT_FILENAME (optional)
and LOG_FILE (optional).
These input parameters will be set as environment variables in the Docker container that runs the
service.
Let’s see an example of an Execute request that runs a service:
<?xml version="1.0" encoding="UTF-8"?>
<wps:Execute xmlns:wps="http://www.opengis.net/wps/1.0.0"
xmlns="http://www.opengis.net/wps/1.0.0" xmlns:gml="http://www.opengis.net/gml"
xmlns:ogc="http://www.opengis.net/ogc" xmlns:ows="http://www.opengis.net/ows/1.1"
xmlns:wcs="http://www.opengis.net/wcs/1.1.1" xmlns:wfs="http://www.opengis.net/wfs"
xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
version="1.0.0" service="WPS" xsi:schemaLocation="http://www.opengis.net/wps/1.0.0
http://schemas.opengis.net/wps/1.0.0/wpsAll.xsd">
<ows:Identifier>candela:ChangeDetectionProcessing</ows:Identifier>
<wps:DataInputs>
<wps:Input>
<ows:Identifier>IMAGES</ows:Identifier>
<wps:Data>
<wps:LiteralData>/data/exchange/candela/common_data/test-
files/Images/Harbour/TimeSeries</wps:LiteralData>
</wps:Data>
</wps:Input>
<wps:Input>
<ows:Identifier>USERNAME</ows:Identifier>
<wps:Data>
<wps:LiteralData>a.tonneau</wps:LiteralData>
</wps:Data>
</wps:Input>
<wps:Input>
<ows:Identifier>OUTPUT_FILENAME</ows:Identifier>
<wps:Data>
<wps:LiteralData>changedetection_result_anso.tif</wps:LiteralData>
</wps:Data>
</wps:Input>
<wps:Input>
<ows:Identifier>LOG_FILE</ows:Identifier>
<wps:Data>
<wps:LiteralData>logs_anso.log</wps:LiteralData>
</wps:Data>
</wps:Input>
</wps:DataInputs>
<wps:ResponseForm>
<wps:ResponseDocument />
</wps:ResponseForm>
</wps:Execute>

On receiving this request, GeoServer sets up the proper user environment (creating folders, setting
permissions and environment variables…) and runs the corresponding Docker image in that context.
The appropriate volumes are mounted in the Docker container, and the parameters are read from
environment variables.

How to properly package an app to a service
You need to create a Docker configuration in order to build a Docker image, which will then be used to
run a Docker container on demand.
An example of packaging of a simple service, ExampleService, is given here:

This is a Python app: the main code of the app is located in main.py and the libraries are located in
/common_lib.
The Dockerfile defines the configuration for creating the Docker image:

For more information on how to write a Dockerfile, see https://docs.docker.com/develop/develop-images/dockerfile_best-practices/#general-guidelines-and-recommendations
To find an appropriate Docker base image, browse https://hub.docker.com/
A README file is also expected, explaining how to run the service and the meaning of each parameter.
Some scripts are needed too, but they can be written by Atos FR when integrating the service:
- build.sh: performs the docker commands to build the Docker image and push it to the local
registry. See the build.sh example.
- run.sh: a test script that runs a Docker container with the given parameters. It is needed to
check that the service has been deployed correctly before integration into GeoServer. See the
run.sh example.

Best practices
• Input parameters

Configuration or internal parameters can be written into a parameters file that is copied into the Docker
container and read from there.
The input parameters that will be given by the user who runs the service must be carefully documented
and, in the application, read from environment variables. You can see in the run.sh example how the
parameters are set.
See this example from main.py, the main script of the ExampleService application, which shows how
the input parameters are retrieved from the environment:
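The original figure is not reproduced here; a minimal sketch of how such a main.py could read its parameters (the read_params helper is illustrative; the default file names match the DescribeProcess description above):

```python
import os

def read_params(env=os.environ):
    """Read service parameters from the environment variables GeoServer injects."""
    return {
        # required inputs: fail fast with a KeyError if missing
        "IMAGES": env["IMAGES"],
        "USERNAME": env["USERNAME"],
        # optional inputs: fall back to the documented defaults
        "OUTPUT_FILENAME": env.get("OUTPUT_FILENAME", "detected_changes.tif"),
        "LOG_FILE": env.get("LOG_FILE", "debug.log"),
    }

# Hypothetical invocation with an explicit environment, for illustration
params = read_params({"IMAGES": "/data/in", "USERNAME": "a.tonneau"})
print(params["OUTPUT_FILENAME"])  # detected_changes.tif
```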

• Logs

Another important point is to set up a proper logging system: it is mandatory for the app to write its
logs to a log file that is stored on the file system and is accessible outside the Docker container, so that
the user can get the logs during and after the service run.
A convention is to set up LOG_FILE, CONFIG_FOLDER and OUTPUT environment variables, and to create
the log file with the name LOG_FILE in the folder CONFIG_FOLDER.
See this example from params.py of the ExampleService application, which shows how the logging is
configured in the app:

And then how we log something in the app:
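The original figures are not reproduced here; the logging setup and a logging call can be sketched with Python's standard logging module (the logger name, fallback paths and format are illustrative assumptions):

```python
import logging
import os

# Follow the convention: log file named LOG_FILE, created in CONFIG_FOLDER
CONFIG_FOLDER = os.environ.get("CONFIG_FOLDER", "/tmp")
LOG_FILE = os.environ.get("LOG_FILE", "debug.log")

logger = logging.getLogger("ExampleService")
logger.setLevel(logging.INFO)
handler = logging.FileHandler(os.path.join(CONFIG_FOLDER, LOG_FILE))
handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
logger.addHandler(handler)

# And then, anywhere in the app:
logger.info("ExampleService started")
```

Because the log file lives on a mounted volume, it stays readable from outside the container during and after the run.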

The OUTPUT variable is used as the output folder of the algorithm.

Examples
• build.sh example

• run.sh example

