
***DRAFT*** ***DRAFT***

River Invertebrates Classification Tool (RICT)

User Documentation

1. Guide to Predict and Classify (Interactive Mode)


1. Introduction

This document provides a guide to the interactive Predict and Classify process within RICT.

It contains:

a) a summary of the data that needs to be provided to RICT
b) options for how this data can be provided
c) an overview of how to carry out a Predict and Classify (P&C) run
d) references to examples of actual runs carried out for common scenarios, including
screenshots

Note that it is focused on how the tool operates rather than the science/rationale behind it.

2. Summary of Data Required by RICT for a Predict & Classify Run

In order to carry out a Predict & Classify run, the following data needs to be
provided/specified:

- Environmental Variable (EV) data for each relevant site
- Observed Index values for each relevant site/index
- Settings for the Run
- Bias data for each Index/Season
- Limits for each Index to be classified

a) Environmental Variable (EV) data for each relevant site

EV data needs to be provided in order to feed into the Prediction process which
calculates ‘expected’ index values.

Currently, the following data is required for each site (although this may change in
future if new sets of Predictive Environmental Variables (PEVs) are created):

- Grid Reference (NGR Letters/Easting/Northing)
- Altitude
- Slope
- Discharge Category
- Velocity Category (if no Discharge Category)
- Distance from Source
- Mean Width
- Mean Depth
- Alkalinity
- Total Hardness (if no Alkalinity)
- Calcium (if no Alkalinity or Hardness)
- Conductivity (if no Alkalinity or Hardness or Calcium)
- % cover of boulders & cobbles
- % cover of pebbles & gravel
- % cover of sand
- % cover of silt & clay

Note that, since RICT caters for multi-year runs, a Year needs to be provided for each
Site entry.
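
The fallback rules in the list above (Velocity Category only if there is no Discharge Category, and so on) can be sketched as a simple completeness check. This is an illustration only, not how RICT validates data. The IDs for the variables shown in Appendix 1 are taken from that sample file; VELOCITY, HARDNESS, CALCIUM and CONDUCTIVITY are assumed names, since those variables do not appear in the sample.

```python
# Each inner list is a group of acceptable alternatives, in order of preference.
# IDs marked "assumed" do not appear in the sample file in Appendix 1.
REQUIRED_EV_GROUPS = [
    ["NGR_LETTERS"],
    ["NGR_EAST"],
    ["NGR_NORTH"],
    ["ALTITUDE"],
    ["SLOPE"],
    ["DISCHARGE", "VELOCITY"],  # VELOCITY is an assumed ID
    ["DIST_FROM_SOURCE"],
    ["MEAN_WIDTH"],
    ["MEAN_DEPTH"],
    ["ALKALINITY", "HARDNESS", "CALCIUM", "CONDUCTIVITY"],  # last three assumed
    ["BOULDER_COBBLES"],
    ["PEBBLES_GRAVEL"],
    ["SAND"],
    ["SILT_CLAY"],
]


def missing_ev_groups(site_evs):
    """Return every group for which the site supplies none of the alternatives."""
    missing = []
    for group in REQUIRED_EV_GROUPS:
        supplied = any(site_evs.get(ev_id) not in (None, "") for ev_id in group)
        if not supplied:
            missing.append(group)
    return missing


# Values for site 9875, copied from Appendix 1.
site_9875 = {
    "NGR_LETTERS": "NT", "NGR_EAST": "08192", "NGR_NORTH": "36934",
    "ALTITUDE": "190", "SLOPE": "1.1", "DISCHARGE": "3",
    "DIST_FROM_SOURCE": "11.4", "MEAN_WIDTH": "2.625", "MEAN_DEPTH": "40",
    "ALKALINITY": "80.9581", "BOULDER_COBBLES": "12.6667",
    "PEBBLES_GRAVEL": "46.6667", "SAND": "30.8333", "SILT_CLAY": "9.8333",
}
print(missing_ev_groups(site_9875))  # prints []
```

A site that lacks Alkalinity but supplies any one of the later alternatives in the same group would also pass this check.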

b) Observed Index values for each relevant site/index

Sites are classified for each relevant index by dividing Observed Value by Expected
Value to obtain an Environmental Quality Index (EQI). This is then compared against
limits to obtain a classification status (e.g. High).

Therefore, Observed Values need to be provided for each Site for each Index that is
to be classified.

As for EVs, a Year needs to be provided for each Site/Index.
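
The EQI calculation described above can be illustrated with a short sketch. It covers only the basic arithmetic (Observed divided by Expected, then comparison against lower limits) and leaves out the simulation and bias handling that RICT applies. The NTAXA limits are copied from the sample file in Appendix 5 and the Observed Value from Appendix 2; the Expected Value is a made-up figure, and the function names are illustrative.

```python
# Lower limits for NTAXA from the sample Limits file (Appendix 5),
# ordered from the best class to the worst.
NTAXA_LOWER_LIMITS = [
    ("H", 0.8879),
    ("G", 0.7417),
    ("M", 0.5954),
    ("P", 0.491),
    ("B", 0.0),
]


def eqi(observed, expected):
    """Environmental Quality Index: Observed Value divided by Expected Value."""
    if expected <= 0:
        raise ValueError("expected value must be positive")
    return observed / expected


def classify(eqi_value, lower_limits):
    """Return the first class whose lower limit the EQI meets or exceeds."""
    for status, lower in lower_limits:
        if eqi_value >= lower:
            return status
    raise ValueError("EQI is below every lower limit")


observed_ntaxa = 29    # site 9875, Appendix 2
expected_ntaxa = 25.0  # hypothetical Expected Value, for illustration only
site_eqi = eqi(observed_ntaxa, expected_ntaxa)
status = classify(site_eqi, NTAXA_LOWER_LIMITS)
print(round(site_eqi, 2), status)  # prints 1.16 H
```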

c) Settings for the Run

In order for the Prediction and Classification run to be carried out in the way the user
wishes, a number of settings need to be provided. The key ones are:

- End Group Set Id (e.g. 3 = GB New Model – 43 End Group Set)
- Season Id (e.g. 5 = Spring and Autumn)
- Indices Set (e.g. 2 = Original BMWP + MINTA)
- PEV Set (e.g. 1 = GB)
- Multi Year Flag (Y/N)
- Reference Adjustment Flag (Y/N)
- Taxonomic Prediction Flag (Y/N)
- Taxonomic Prediction Level (e.g. TL1 = BMWP Family Level)
- Output File Prefix
- Run Name

Note that Number of Iterations can also be provided as a setting, but if not provided,
it will be set to the default value held within the administration section.
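
As an illustration of the defaulting rule for Number of Iterations, the sketch below reads a cut-down settings file (element and setting names follow Appendix 3, with the schema attributes omitted) and fills in the iteration count when it is absent. The value 500 is only a stand-in for whatever default is held in the administration section, and `read_settings` is an illustrative name, not part of RICT.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the default held in the administration section.
DEFAULT_ITERATIONS = 500

# Cut-down settings file in the style of Appendix 3, without an iterations entry.
SETTINGS_XML = """<Datasets>
  <Dataset ID="0" Name="Rict">
    <Setting Name="End_Group_Set"><Value>3</Value></Setting>
    <Setting Name="Season"><Value>5</Value></Setting>
    <Setting Name="Multi-Year"><Value>N</Value></Setting>
  </Dataset>
</Datasets>"""


def read_settings(xml_text):
    """Return {setting name: value}, adding a default iteration count if missing."""
    root = ET.fromstring(xml_text)
    settings = {}
    for setting in root.iter("Setting"):
        settings[setting.get("Name")] = setting.findtext("Value", default="").strip()
    settings.setdefault("Simulation Iterations", str(DEFAULT_ITERATIONS))
    return settings


settings = read_settings(SETTINGS_XML)
print(settings["Season"], settings["Simulation Iterations"])  # prints 5 500
```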

d) Bias data for each Index/Season

The classification process takes account of Bias when varying the Observed Values
and so Bias Values need to be provided for each Index for the Season Code relevant
to the Run.

If no Bias value is provided for an Index then zero is used. [Open question: what
happens if there is no Bias file at all, e.g. if the defaults file does not exist?]
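
The zero-default rule for a missing Index can be sketched as a lookup keyed on Index name and Season. The element and attribute names follow the sample file in Appendix 4; the function names are illustrative. The sketch does not address the case where no Bias file exists at all.

```python
import xml.etree.ElementTree as ET

# Cut-down bias file in the style of Appendix 4 (values copied from it).
BIAS_XML = """<Datasets>
  <Dataset Index_Name="NTAXA" Season_ID="5"><Value>1.6524</Value></Dataset>
  <Dataset Index_Name="ASPT" Season_ID="5"><Value>0</Value></Dataset>
</Datasets>"""


def read_bias(xml_text):
    """Return {(index name, season id): bias value}."""
    table = {}
    for dataset in ET.fromstring(xml_text).iter("Dataset"):
        key = (dataset.get("Index_Name"), int(dataset.get("Season_ID")))
        table[key] = float(dataset.findtext("Value"))
    return table


def bias_for(table, index_name, season_id):
    """Look up a bias value, falling back to zero when none was provided."""
    return table.get((index_name, season_id), 0.0)


bias_table = read_bias(BIAS_XML)
print(bias_for(bias_table, "NTAXA", 5))  # prints 1.6524
print(bias_for(bias_table, "BMWP", 5))   # prints 0.0 (no entry in the file)
```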

e) Limits for each Index to be classified

The classification process compares EQIs against limits and so the limits to be used
for the run need to be provided for each index.
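
To show how per-index limits might be read from a Limits file, the sketch below parses a cut-down file in the style of Appendix 5 into an ordered list of classes and lower limits. It reads only the Lower_Bound elements and ignores Upper_Bound and the Operator attributes, which is a simplification made for this illustration rather than a description of RICT's behaviour.

```python
import xml.etree.ElementTree as ET

# ASPT limits copied from Appendix 5; most Upper_Bound elements are omitted.
LIMITS_XML = """<Datasets>
  <Dataset Type="Default" ID="Default" Description="Default Limit Set">
    <Index NAME="ASPT">
      <Bucket Classification="G" ID="G" RANK="2">
        <Upper_Bound Operator="lt">1.0059</Upper_Bound>
        <Lower_Bound Operator="gte">.8918</Lower_Bound>
      </Bucket>
      <Bucket Classification="H" ID="H" RANK="1">
        <Lower_Bound Operator="gte">1.0059</Lower_Bound>
      </Bucket>
      <Bucket Classification="M" ID="M" RANK="3">
        <Lower_Bound Operator="gte">.7778</Lower_Bound>
      </Bucket>
    </Index>
  </Dataset>
</Datasets>"""


def read_lower_limits(xml_text):
    """Return {index name: [(class, lower limit), ...]} sorted by RANK."""
    limits = {}
    for index in ET.fromstring(xml_text).iter("Index"):
        buckets = sorted(index.iter("Bucket"), key=lambda b: int(b.get("RANK")))
        limits[index.get("NAME")] = [
            (b.get("Classification"), float(b.findtext("Lower_Bound")))
            for b in buckets
        ]
    return limits


limits = read_lower_limits(LIMITS_XML)
print(limits["ASPT"])  # prints [('H', 1.0059), ('G', 0.8918), ('M', 0.7778)]
```

Sorting by RANK means the result does not depend on the order in which the buckets appear in the file.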
3. Options for Providing the Data

The input/output formats for RICT are XML. However, there are a number of options for
providing/specifying the required data.

3.1 Environmental Variable Data

a) Create an XML File(s) and Upload

The XML file(s) must conform to the specified XML schema and an
example of a valid file for one site is provided in Appendix 1.

The user can either create a new XML file(s) or amend an existing file.
Information about how this can be done is being provided separately but, for
example, ‘Notepad++’ can easily be used to open, amend and save an
existing XML file.

Once created the file(s) can be uploaded to RICT during the run – see
Section 4.
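
As an alternative to editing a file by hand, an EV file with the same element layout as Appendix 1 could be generated with a short script. The following is a minimal sketch using Python's standard library; `build_ev_file` and the shape of the `sites` argument are illustrative choices, and whether the output validates against ev.xsd has not been checked here.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"


def build_ev_file(creator, creation_date, name, sites):
    """Build EV XML text mirroring the element layout shown in Appendix 1."""
    ET.register_namespace("xsi", XSI)
    root = ET.Element("Datasets", {f"{{{XSI}}}noNamespaceSchemaLocation": "ev.xsd"})
    ET.SubElement(root, "Creator").text = creator
    ET.SubElement(root, "Creation_Date").text = creation_date
    ET.SubElement(root, "Name").text = name
    for site in sites:
        dataset = ET.SubElement(root, "Dataset", {"ID": site["id"], "Year": site["year"]})
        ET.SubElement(dataset, "Name").text = site["id"]
        for ev_id, value in site["evs"].items():
            ev = ET.SubElement(dataset, "EV", {"ID": ev_id})
            ET.SubElement(ev, "Description").text = ev_id
            ET.SubElement(ev, "Value").text = str(value)
    return ET.tostring(root, encoding="unicode")


# A few values for site 9875 from Appendix 1; a real file needs the full EV set.
sample_xml = build_ev_file(
    "WQMASTER",
    "2008-02-14",
    "Single Site",
    [{"id": "9875", "year": "2007",
      "evs": {"NGR_LETTERS": "NT", "NGR_EAST": "08192", "ALTITUDE": 190}}],
)
print(sample_xml)
```

Values are written as text, so leading zeros in grid references (e.g. 08192) are preserved as long as they are passed in as strings.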

b) Load in an EV file in Existing RIVPACS Format

An EV file in the existing RIVPACS format (with extension of .asc) can be
uploaded to RICT during the run – see Section 4. RICT then automatically
converts the RIVPACS format file to the required RICT XML format.

Note that, as this facility is specific to the existing RIVPACS EV format, it
will not be usable for any new EVs that may be introduced in future.

c) Create XML file(s) from Excel

It is expected that many users will maintain their EV data in Excel, and so a
special RICT Data Entry spreadsheet has been created that enables XML
files to be generated from Excel that can be processed by RICT.

More details are being provided on this separately but, briefly, the EV data
has to be entered/copied into the appropriate columns in the ‘Environmental
Variables’ worksheet and then the ‘Start Here’ worksheet is used to generate
the XML format file.

The generated file(s) can then be uploaded to RICT during the run – see
Section 4.

Note that the RICT Data Entry Spreadsheet has been set up so that data from
an existing RIVPACS EV file can be cut and pasted into the spreadsheet if
required.

Note also that the RICT Data Entry Spreadsheet is only applicable for
existing EVs. If new EVs are introduced in future, then a new Data Entry
Spreadsheet will be required and a change made to RICT to recognise the
new data.

d) Manually Enter Data

Data can be manually entered during the run – see Section 4.

Note that manually entered data is subsequently saved as an XML format
file. Therefore, this functionality could be used to create an initial XML file
which could then be amended as required for future runs.

3.2 Observed Index Values

a) Create an XML File(s) and Upload

As for 3.1 a). An example file is provided in Appendix 2.

b) Load in an Observed Index file in Existing RIVPACS Format

An Observed Index file in the existing RIVPACS format (with extension of
.oe1) can be uploaded to RICT during the run – see Section 4.

Note that, as this facility is specific to the existing RIVPACS Observed
Index format, it will not be usable for any new Indices that may be
introduced in future.

c) Create XML file(s) from Excel

As for 3.1 c) except for entering/copying the data into the ‘Observed and
Expected’ worksheet.

d) Manually Enter Data

As for 3.1 d)

3.3 Settings for the Run

a) Use the Settings in the Default Settings File

The system has a default settings file defined which contains the settings
that will be used for a run if no other settings file is provided – see Appendix
3 for an example.

Before scheduling the run it is then possible to amend individual settings as
required. Therefore, if there are only a couple of settings that are different
from the defaults then it is easiest to use the default settings and then amend
them prior to scheduling a run. Note that any changes made are applicable
for that run only and do not change the underlying default settings file.

b) Change the Default Settings File

It is possible to change the file that is defined as the Default Settings file via
the Administration function (see separate guide). This might be useful if a
number of future runs are to have the same settings.

Note that it will still be possible to amend individual settings prior to
scheduling a run.

c) Upload a Settings File

Rather than change the Default Settings file, it is possible to upload a
Settings File for use during the particular run. This would be useful if
multiple settings for the run are different from the defaults.

The user can either create a new Settings file or amend an existing file.
Information about how this can be done is being provided separately but, for
example, ‘Notepad++’ can easily be used to open, amend and save an
existing XML file.

Note that it will still be possible to amend individual settings prior to
scheduling a run.

Note also that the Settings for a particular run are subsequently saved as an
XML format file. Therefore, a new Settings file could be created by using
the Default Settings file and then amending the required settings prior to
scheduling the run.
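
The idea of starting from the defaults, amending a few settings for one run, and ending up with a new Settings file can be sketched as follows. This only models the behaviour described above (run-level changes leave the defaults untouched); the function names are illustrative, the setting names come from Appendix 3, and the schema attributes of the real file are omitted.

```python
import xml.etree.ElementTree as ET


def settings_for_run(default_settings, overrides):
    """Return a copy of the defaults with overrides applied; defaults are untouched."""
    unknown = set(overrides) - set(default_settings)
    if unknown:
        raise KeyError(f"unknown setting(s): {sorted(unknown)}")
    run_settings = dict(default_settings)
    run_settings.update(overrides)
    return run_settings


def settings_to_xml(settings):
    """Serialise settings using the element layout shown in Appendix 3."""
    root = ET.Element("Datasets")
    dataset = ET.SubElement(root, "Dataset", {"ID": "0", "Name": "Rict"})
    for name, value in settings.items():
        setting = ET.SubElement(dataset, "Setting", {"Name": name})
        ET.SubElement(setting, "Value").text = str(value)
    return ET.tostring(root, encoding="unicode")


# A subset of the sample defaults from Appendix 3.
defaults = {"End_Group_Set": "3", "Season": "5", "Multi-Year": "N", "Ref Adjust": "Y"}
run_settings = settings_for_run(defaults, {"Multi-Year": "Y"})
run_xml = settings_to_xml(run_settings)
print(defaults["Multi-Year"], run_settings["Multi-Year"])  # prints N Y
```

Rejecting unknown setting names is a design choice made here to catch typing mistakes; it is not something the source states RICT does.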

3.4 Bias data for each Index/Season

a) Use the Default Bias File

The system has a default bias file defined which contains the bias values
that will be used for a run if no other bias file is provided – see Appendix 4
for an example.

Before scheduling the run it is then possible to amend bias values as
required. Note that any changes made are applicable for that run only and do
not change the underlying default bias file.

b) Change the Default Bias File

It is possible to change the file that is defined as the Default Bias file via the
Administration function (see separate guide). This might be useful if a
number of future runs are to have the same bias values.

Note that it will still be possible to amend bias values prior to scheduling a
run.

c) Upload a Bias File

Rather than change the Default Bias file, it is possible to upload a Bias File
for use during the particular run. This would be useful if the required bias
values are significantly different from the defaults.

The user can either create a new Bias file or amend an existing file.
Information about how this can be done is being provided separately but, for
example, ‘Notepad++’ can easily be used to open, amend and save an
existing XML file.

Note that it will still be possible to amend bias values prior to scheduling a
run.

Note also that the bias values for a particular run are subsequently saved as
an XML format file. Therefore, a new Bias file could be created by using the
Default Bias file and then amending the required values prior to scheduling
the run.

3.5 Limits for each Index to be classified

a) Use the Default Limits File

The system has a default limits file defined which contains the limits that
will be used for a run if no other limits file is provided – see Appendix 5 for
an example.

Before scheduling the run it is then possible to amend limits as required.
Note that any changes made are applicable for that run only and do not
change the underlying default limits file.

b) Change the Default Limits File

It is possible to change the file that is defined as the Default Limits file via
the Administration function (see separate guide). This might be useful if a
number of future runs are to have the same limits.

Note that it will still be possible to amend limits prior to scheduling a run.

c) Upload a Limits File

Rather than change the Default Limits file, it is possible to upload a Limits
File for use during the particular run. This would be useful if the required
limits are significantly different from the defaults.

The user can either create a new Limits file or amend an existing file.
Information about how this can be done is being provided separately but, for
example, ‘Notepad++’ can easily be used to open, amend and save an
existing XML file.

Note that it will still be possible to amend limits prior to scheduling a run.

Note also that the limits for a particular run are subsequently saved as an
XML format file. Therefore, a new Limits file could be created by using the
Default Limits file and then amending the required values prior to
scheduling the run.

4. Overview of Carrying out a Predict and Classify Run

The stages for carrying out a Predict and Classify Run are as follows:

a) Access and Log In to RICT (if not already done)

Full details of how to access and log in to RICT are provided in a separate document.
In brief, it involves typing the relevant URL into your browser and then
entering your Username and Password.

This will display the Home screen:


b) Navigate to the Run Menu

Click on Run Menu which will result in a page similar to the following being
displayed:

c) Create a New Run

Click on ‘Create a New Run’ which will result in the following page being displayed:

d) Select Run Type

Click on ‘Predict and Classify’ which will result in the following page being
displayed:

e) Upload Files (if required)

If any files are to be uploaded then click on Browse, which will result in a page
similar to the following being displayed:

Then navigate to the required file using normal Windows functionality and either
double-click on the filename or select the filename and click on Open.

The file will then be uploaded to the RICT input area and a page similar to the
following will be displayed. As part of the upload RICT will check to see if the
format is recognised and, if so, the type of file will be displayed.

The above process should be repeated for all files that are to be uploaded.

Once all files have been uploaded then click on Continue. This will result in the files
being processed and validated. A page similar to the following is then displayed:

f) Amend any Data

At this point there are options to:

- Amend the Data provided to the run
- Amend the Settings applicable to the run
- Amend the Limits applicable to the run
- Amend the Bias data applicable to the run

i) Amend the Data provided to the run

This option will normally be used to manually enter data but can also be
used to amend any data that has been loaded or add more data files to the
run.

ii) Amend the Settings applicable to the run

The settings for the run will either have been taken from an uploaded
settings file or the default settings file if no file has been uploaded.

This option can be used to amend the settings if required. Note that these
will only be applicable for the current run. Also note that the settings used
will be saved in a settings file that can then be used for future runs if
required.

iii) Amend the Limits applicable to the run

As for ii) above

iv) Amend the Bias data applicable to the run

As for ii) above

g) Schedule the Run

Once any required data has been amended, then click on Schedule Run.

This will result in the run being scheduled and the Run Menu being displayed with
the new run at the top – see below.

Note that there is an option to delay the run for a specified period if necessary (e.g. if
it is a large run that is best scheduled outwith normal hours).

The run will initially be displayed with an ‘in progress’ icon and the page will refresh
automatically until the run is complete, when a ‘complete’ icon will be displayed.

Run in progress:

Run complete:

h) View/Extract Results

Once the job is complete then the results can be viewed/extracted as follows:

- Reports

Access to the reports is via the shortcuts menu:

Detail to be added…

- Visualise

If there is an internet connection, then the results can be viewed (using a
Google Maps interface) via the shortcuts menu:

This will result in a page similar to the following being displayed:

Note that a more detailed guide to the Run Menu is provided in a separate document.

5. Examples of Predict & Classify Runs

A number of Predict & Classify examples are being prepared:

Example 1 - Upload XML EVs and OE files (use defaults for rest)

Example 2 - Manual Input of EVs and OEs (use defaults for rest)
Appendix 1 – Sample XML Environmental Variable File
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<Datasets NS1:noNamespaceSchemaLocation="ev.xsd" xmlns:NS1="http://www.w3.org/2001/XMLSchema-instance">
<Creator>WQMASTER</Creator>
<Creation_Date>2008-02-14</Creation_Date>
<Name>Single Site</Name>
<Dataset ID="9875" Year="2007">
<Name>9875</Name>
<EV ID="NGR_LETTERS">
<Description>NGR_LETTERS</Description>
<Value>NT</Value>
</EV>
<EV ID="NGR_EAST">
<Description>NGR_EAST</Description>
<Value>08192</Value>
</EV>
<EV ID="NGR_NORTH">
<Description>NGR_NORTH</Description>
<Value>36934</Value>
</EV>
<EV ID="ALTITUDE">
<Description>ALTITUDE</Description>
<Value>190</Value>
</EV>
<EV ID="SLOPE">
<Description>SLOPE</Description>
<Value>1.1</Value>
</EV>
<EV ID="DISCHARGE">
<Description>DISCHARGE</Description>
<Value>3</Value>
</EV>
<EV ID="DIST_FROM_SOURCE">
<Description>DIST_FROM_SOURCE</Description>
<Value>11.4</Value>
</EV>
<EV ID="MEAN_WIDTH">
<Description>MEAN_WIDTH</Description>
<Value>2.625</Value>
</EV>
<EV ID="MEAN_DEPTH">
<Description>MEAN_DEPTH</Description>
<Value>40</Value>
</EV>
<EV ID="ALKALINITY">
<Description>ALKALINITY</Description>
<Value>80.9581</Value>
</EV>
<EV ID="BOULDER_COBBLES">
<Description>BOULDER_COBBLES</Description>
<Value>12.6667</Value>
</EV>
<EV ID="PEBBLES_GRAVEL">
<Description>PEBBLES_GRAVEL</Description>
<Value>46.6667</Value>
</EV>
<EV ID="SAND">
<Description>SAND</Description>
<Value>30.8333</Value>
</EV>
<EV ID="SILT_CLAY">
<Description>SILT_CLAY</Description>
<Value>9.8333</Value>
</EV>
</Dataset>
</Datasets>
Appendix 2 – Sample XML Observed/Expected Index File
<Datasets xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="oei.xsd">
<Creator>WQMASTER</Creator>
<Creation_Date>2008-03-06</Creation_Date>
<Name>SEPA_3</Name>
<Dataset Site_ID="9875" Year="2007">
<Index ID="ASPT" Name="ASPT">
<Observed_Value>5.655172413793103448</Observed_Value>
</Index>
<Index ID="BMWP" Name="BMWP">
<Observed_Value>164</Observed_Value>
</Index>
<Index ID="NTAXA" Name="NTAXA">
<Observed_Value>29</Observed_Value>
</Index>
</Dataset>
<Dataset Site_ID="10480" Year="2007">
<Index ID="ASPT" Name="ASPT">
<Observed_Value>4.25</Observed_Value>
</Index>
<Index ID="BMWP" Name="BMWP">
<Observed_Value>51</Observed_Value>
</Index>
<Index ID="NTAXA" Name="NTAXA">
<Observed_Value>12</Observed_Value>
</Index>
</Dataset>
<Dataset Site_ID="11030" Year="2007">
<Index ID="ASPT" Name="ASPT">
<Observed_Value>6.739130434782608696</Observed_Value>
</Index>
<Index ID="BMWP" Name="BMWP">
<Observed_Value>155</Observed_Value>
</Index>
<Index ID="NTAXA" Name="NTAXA">
<Observed_Value>23</Observed_Value>
</Index>
</Dataset>
</Datasets>
Appendix 3 – Sample Settings File
<Datasets xmlns:xdb="http://xmlns.oracle.com/xdb" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="settings.xsd">
<Dataset ID="0" Name="Rict">
<Setting Name="End_Group_Set">
<Value>3</Value>
</Setting>
<Setting Name="Season">
<Value>5</Value>
</Setting>
<Setting Name="Indices_Set">
<Value>2</Value>
</Setting>
<Setting Name="PEV_Set">
<Value>1</Value>
</Setting>
<Setting Name="Output_File_Prefix">
<Value>(Date)_rict_(Run_ID)_</Value>
</Setting>
<Setting Name="Run_Name">
<Value>Sepa_(Run_ID)</Value>
</Setting>
<Setting Name="Multi-Year">
<Value>N</Value>
</Setting>
<Setting Name="Ref Adjust">
<Value>Y</Value>
</Setting>
<Setting Name="Predict_Taxa">
<Value>N</Value>
</Setting>
<Setting Name="Predict_Taxonomic_Level">
<Value>TL1</Value>
</Setting>
<Setting Name="Simulation Iterations">
<Value>500</Value>
</Setting>
</Dataset>
</Datasets>
Appendix 4 – Sample Bias File
<?xml version="1.0" encoding="WINDOWS-1252" standalone='no'?>
<Datasets NS0:noNamespaceSchemaLocation="bias.xsd" xmlns:NS0="http://www.w3.org/2001/XMLSchema-instance">
<Dataset Index_Name="NTAXA" Season_ID="1">
<Value>1.62</Value>
</Dataset>
<Dataset Index_Name="NTAXA" Season_ID="2">
<Value>1.62</Value>
</Dataset>
<Dataset Index_Name="NTAXA" Season_ID="3">
<Value>1.62</Value>
</Dataset>
<Dataset Index_Name="NTAXA" Season_ID="4">
<Value>1.6524</Value>
</Dataset>
<Dataset Index_Name="NTAXA" Season_ID="5">
<Value>1.6524</Value>
</Dataset>
<Dataset Index_Name="NTAXA" Season_ID="6">
<Value>1.6524</Value>
</Dataset>
<Dataset Index_Name="NTAXA" Season_ID="7">
<Value>1.7982</Value>
</Dataset>
<Dataset Index_Name="ASPT" Season_ID="1">
<Value>0</Value>
</Dataset>
<Dataset Index_Name="ASPT" Season_ID="2">
<Value>0</Value>
</Dataset>
<Dataset Index_Name="ASPT" Season_ID="3">
<Value>0</Value>
</Dataset>
<Dataset Index_Name="ASPT" Season_ID="4">
<Value>0</Value>
</Dataset>
<Dataset Index_Name="ASPT" Season_ID="5">
<Value>0</Value>
</Dataset>
<Dataset Index_Name="ASPT" Season_ID="6">
<Value>0</Value>
</Dataset>
<Dataset Index_Name="ASPT" Season_ID="7">
<Value>0</Value>
</Dataset>
<Dataset Index_Name="BMWP" Season_ID="1">
<Value>0</Value>
</Dataset>
<Dataset Index_Name="BMWP" Season_ID="2">
<Value>0</Value>
</Dataset>
<Dataset Index_Name="BMWP" Season_ID="3">
<Value>0</Value>
</Dataset>
<Dataset Index_Name="BMWP" Season_ID="4">
<Value>0</Value>
</Dataset>
<Dataset Index_Name="BMWP" Season_ID="5">
<Value>0</Value>
</Dataset>
<Dataset Index_Name="BMWP" Season_ID="6">
<Value>0</Value>
</Dataset>
<Dataset Index_Name="BMWP" Season_ID="7">
<Value>0</Value>
</Dataset>
</Datasets>
Appendix 5 – Sample Limits File
<Datasets xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="limits.xsd">
<Dataset Type="Default" ID="Default" Description="Default Limit Set">
<Index NAME="NTAXA">
<Bucket Classification="H" ID="H" RANK="1">
<Lower_Bound Operator="gte">.8879</Lower_Bound>
<Upper_Bound Operator="gte">10</Upper_Bound>
</Bucket>
<Bucket Classification="G" ID="G" RANK="2">
<Upper_Bound Operator="lt">.8879</Upper_Bound>
<Lower_Bound Operator="gte">.7417</Lower_Bound>
</Bucket>
<Bucket Classification="M" ID="M" RANK="3">
<Upper_Bound Operator="lt">.7417</Upper_Bound>
<Lower_Bound Operator="gte">.5954</Lower_Bound>
</Bucket>
<Bucket Classification="P" ID="P" RANK="4">
<Upper_Bound Operator="lt">.5954</Upper_Bound>
<Lower_Bound Operator="gte">.491</Lower_Bound>
</Bucket>
<Bucket Classification="B" ID="B" RANK="5">
<Upper_Bound Operator="lt">.491</Upper_Bound>
<Lower_Bound Operator="gte">0</Lower_Bound>
</Bucket>
</Index>
<Index NAME="ASPT">
<Bucket Classification="H" ID="H" RANK="1">
<Lower_Bound Operator="gte">1.0059</Lower_Bound>
<Upper_Bound Operator="gte">5</Upper_Bound>
</Bucket>
<Bucket Classification="G" ID="G" RANK="2">
<Upper_Bound Operator="lt">1.0059</Upper_Bound>
<Lower_Bound Operator="gte">.8918</Lower_Bound>
</Bucket>
<Bucket Classification="M" ID="M" RANK="3">
<Upper_Bound Operator="lt">.8918</Upper_Bound>
<Lower_Bound Operator="gte">.7778</Lower_Bound>
</Bucket>
<Bucket Classification="P" ID="P" RANK="4">
<Upper_Bound Operator="lt">.7778</Upper_Bound>
<Lower_Bound Operator="gte">.6533</Lower_Bound>
</Bucket>
<Bucket Classification="B" ID="B" RANK="5">
<Upper_Bound Operator="lt">.6533</Upper_Bound>
<Lower_Bound Operator="gte">0</Lower_Bound>
</Bucket>
</Index>
</Dataset>
</Datasets>
