TOOLKIT Supporting Elements

Summary

New work to improve the protocol for calculation of supporting boundaries for good ecological status

Setting appropriate parameter boundaries that support ecological status is a vital step in the protection and restoration of aquatic ecosystems. It enables programmes of measures to be tailored to maintain or restore waterbodies on a type-specific basis in cases where parameters such as nutrients are driving status. Previous work significantly improved the harmonisation of nutrient boundary setting in the EU, introducing several methods and producing values for Member States (MS) (Phillips et al. 2018). However, the methods and data used had some limitations, and some estimated boundary values did not correspond to MS experts' estimates. In addition, only nutrients were considered; other parameters were not, leaving a knowledge gap that ECOSTAT subsequently judged to need additional work. In response to the ECOSTAT requests, further work is ongoing to develop boundary-setting approaches for other physico-chemical elements and to improve the methods for setting evidence-based thresholds, with particular reference to nutrients. This report presents a new protocol and, in separate documents for lakes, rivers and TRAC, the results of applying it to calculate nutrient boundaries. It is intended to extend the method, where appropriate, to other parameters.

About

Best Practice Guide on establishing nutrient concentrations to support good ecological status

Geoff Phillips, Martyn Kelly, Wera Leujak, Fuensanta Salas, Heliana Teixeira, Anne Lyche Solheim, Gary Free, Gabor Varbiro

Final version following testing at Bucharest workshop November 2018

Summary

1. High concentrations of inorganic nutrients are a major factor contributing to the failure of many water bodies to achieve Good Ecological Status and Member States need to determine levels appropriate to their own territories.
2. This report describes statistical methods for determining appropriate concentrations for supporting ecological status. These statistical methods need to be set in a broader framework that also encompasses chemical, ecological and regulatory aspects relevant to the type of water body under consideration.
3. Three approaches to setting these threshold concentrations are included. These are:
   • Regression analysis, using a continuous relationship between EQR and nutrient concentration
   • Categorical analysis, using the distribution of nutrient concentration within biological classes
   • Minimisation of mis-match of classifications for biology and nutrients
4. The choice of method depends upon a number of factors, including the length of the gradient that available datasets cover and the statistical strength of the relationship between the explanatory and response variables. In some cases, Member States may be better able to achieve the statistical prerequisites for methods by joining forces with neighbours who share similar water body types.
5. Excel and R-based statistical “toolkits” are provided in order to make calculation of threshold concentrations more straightforward.
6. Options for situations where none of these methods are appropriate are also described.
7. Finally, some practical issues associated with the use of these threshold concentrations for regulation are discussed.

Stressor Boundary Test

To test a boundary only, you can use the Tester page. Here, it is possible to check the stressor along with EQR and EQC.

The individual boundary test opens in a separate page: https://shiny.freshwater-ecology.com/Tkit_Test/

Disclaimer

This application has been developed through a collaborative framework (the Common Implementation Strategy (CIS)) involving the Member States, EFTA countries, and other stakeholders including the European Commission. The document is a working draft and does not necessarily represent the official, formal position of any of the partners.

To the extent that the European Commission's services provided input to this technical document, such input does not necessarily reflect the views of the European Commission.

Neither the European Commission nor any other CIS partners are responsible for the use that any third party might make of the information contained in this document.

The technical document is intended to facilitate the implementation of Directive 2000/60/EC and is not legally binding. Any authoritative reading of the law should only be derived from Directive 2000/60/EC itself and other applicable legal texts or principles. Only the Court of Justice of the European Union is competent to authoritatively interpret Union legislation.

For support contact: varbirog@gmail.com

Reference

Please reference the use of this application in any publications as

Phillips, G., Teixeira, H., Kelly, M., Lyche Solheim, A., Free, G., Salas Herrero, M.F., Kolada, A., Varbiro, G., Poikane, S.: Establishing supporting element standards. A revised approach and applications, European Commission: Joint Research Centre, Publications Office of the European Union, 2024, https://data.europa.eu/doi/10.2760/55461

Best Practice Guide How-to

First, import the file or use the sample dataset. Make sure to select the correct decimal and separator symbols.

The content of the data, boxplots etc. can be found in the Check data tab.

You can adjust the EQR boundaries; the application recalculates all statistics with the new values.

Before proceeding with the Toolkit, you can select outliers to exclude from the analyses.

You can adjust the Nutrient boundaries for the linear and mismatch methods. Check the R² to achieve the best correlation; the application recalculates all statistics with the new values.

Best Practice Guide Analyses guides

Measure (abbreviation)

Measures of classification accuracy used to compare two binary classifications

Import

Select the proper separator and decimal settings before loading the data

Separator

Comma

Semicolon

Tab

Decimal

Comma

Point
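
The effect of the separator and decimal settings can be illustrated with a minimal sketch. This is not the toolkit's own import code; the function name `read_table` and the column names `site`, `EQR` and `TP` are hypothetical, chosen only for illustration.

```python
import csv
import io

def read_table(text, separator=";", decimal=","):
    """Parse delimited text into a list of dict rows, converting numeric cells to float."""
    rows = []
    for record in csv.DictReader(io.StringIO(text), delimiter=separator):
        parsed = {}
        for name, cell in record.items():
            candidate = cell.strip()
            if decimal != ".":
                # Normalise the decimal symbol before attempting the conversion
                candidate = candidate.replace(decimal, ".")
            try:
                parsed[name] = float(candidate)
            except ValueError:
                parsed[name] = cell  # keep non-numeric cells (e.g. site names) as text
        rows.append(parsed)
    return rows

# Hypothetical file content: semicolon separator, decimal comma
sample = "site;EQR;TP\nA;0,82;12,5\nB;0,41;95\n"
rows = read_table(sample, separator=";", decimal=",")
print(rows)
```

With the wrong decimal setting the numeric columns would stay as text, which is why the variables should be checked as numerical before running the analyses.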

Tools

Choose CSV File

Browse...

*In order to test the application ....

Content of the datafile

1. Summary of the datafile

Summary

2. Variable names of the datafile

Names

Variable selection

In this section you can select the variables to be used by the toolkit. Please check that each selected variable is numerical.

Select a variable

Select the variable to be used in the analyses

Plot_EQR Outliers

Before proceeding with the Toolkit's calculation steps, you can select outliers to exclude from the analyses. Blue dots: sample points; red dots: outliers.

* By clicking on the sampling points you can select or deselect them...

Sample points excluded

Linear method regression tables

The toolkit allows three regression models to be fitted to the linear portion of the data:

a type II regression; assumes equal uncertainty in the measurement of both EQR and nutrient (slope lies between those of the two OLS regressions);
an Ordinary Least Squares (OLS) regression of EQR v nutrient concentration; assumes all uncertainty is in the measurement of the EQR (underestimates the slope);
an OLS regression of nutrient v EQR; assumes all uncertainty lies in the measurement of nutrient concentrations (overestimates the slope).

The boundary values predicted by the regression models depend on the slope of the relationship. The difference between the slopes produced by the three models depends on the R²: the lower the R², the greater the difference.
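
The relationship between the three slopes can be sketched as follows. This is an illustration only, not the toolkit's code: it uses the geometric mean (standard major axis) form of type II regression as a stand-in for the ranged major axis fit named elsewhere in the toolkit, and the data, the EQR boundary of 0.6 and all function names are hypothetical.

```python
import math
import statistics

def fit_three_models(nutrient, eqr):
    """Return R² and (slope, intercept) of EQR as a function of nutrient for three models."""
    mx, my = statistics.fmean(nutrient), statistics.fmean(eqr)
    sxx = sum((x - mx) ** 2 for x in nutrient)
    syy = sum((y - my) ** 2 for y in eqr)
    sxy = sum((x - mx) * (y - my) for x, y in zip(nutrient, eqr))
    r2 = sxy ** 2 / (sxx * syy)
    slopes = {
        "OLS EQR on nutrient": sxy / sxx,
        # OLS of nutrient on EQR, inverted so it is expressed on the same axes
        "OLS nutrient on EQR (inverted)": syy / sxy,
        # Geometric mean regression: an assumed stand-in for the type II fit
        "Type II (geometric mean)": math.copysign(math.sqrt(syy / sxx), sxy),
    }
    # All three lines pass through the centroid (mean nutrient, mean EQR)
    return r2, {name: (b, my - b * mx) for name, b in slopes.items()}

def boundary(slope, intercept, eqr_boundary):
    """Nutrient value at which the fitted line crosses the given EQR boundary."""
    return (eqr_boundary - intercept) / slope

# Hypothetical data: log10 nutrient concentration and EQR
log_nutrient = [1.0, 1.2, 1.4, 1.6, 1.8, 2.0]
eqr = [0.95, 0.80, 0.78, 0.60, 0.55, 0.38]

r2, models = fit_three_models(log_nutrient, eqr)
print("R2:", round(r2, 3))
for name, (slope, intercept) in models.items():
    print(name, round(slope, 3), round(boundary(slope, intercept, 0.6), 3))
```

In this formulation the product of the two OLS slopes equals the square of the geometric mean slope, so the three slopes converge as R² approaches 1 and diverge as it falls.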

It is important to stress that selecting the segment of data to be used has a significant influence on the slope of the linear models, and thus on the predicted boundary values. The choice is to an extent subjective, albeit guided by the GAM and segmented regression models.

Linear regression main table

Linear regression summary

Ranged major axis regression EQR vs. Nutrient

Figure A Relationship between EQR and Nutrient concentration in test data set showing RMA type II regression. Predicted high/good boundary values shown.

Figure B Relationship between EQR and Nutrient concentration in test data set showing RMA type II regression. Predicted good/moderate boundary values shown.

Linear method Confusion matrix metrics

Confusion matrix metrics

Linear methods results

The data frame contains the following columns: Model; r2; the number of observations used (N); slope; intercept; the predicted good/moderate boundary (GM) with its lower and upper estimates (GML, GMU); and the predicted high/good boundary (HG) with its lower and upper estimates (HGL, HGU). Rows are the results for Model 1, Model 3 and Model 2.

OLS Regression of EQR on Nutrient

Figure C Relationship between EQR and Nutrient concentration in test data set showing OLS regression of EQR vs Nutrient. Predicted good/moderate boundary values shown.

OLS Regression of Nutrient on EQR

Figure D Relationship between EQR and Nutrient concentration in test data set showing OLS regression of Nutrient on EQR (inverted to plot on same scales). Predicted good/moderate boundary values shown.

Visualise Nutrient range

The Stressor range for linear and logistic methods. Compare the R² values to see the effect of different ranges of data.

Boxplot methods results

Boxplot summary table

Boxplot summary

Select if the stressor trend is reversed:

If the differences between the nutrient concentrations in adjacent classes are not significant, treat quantiles with extreme caution. On the boxplots, significance is indicated by the Wilcoxon signed rank test.
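
As a rough illustration of the categorical approach, the sketch below summarises the nutrient distribution within each biological class by its quartiles. It does not reproduce the toolkit's tables or its significance test; the class labels, data and function name are hypothetical, and which quantile to read as a boundary is left to the guidance.

```python
import statistics

def class_quartiles(nutrient, bio_class):
    """Group nutrient values by biological class and return quartiles for each class."""
    groups = {}
    for value, label in zip(nutrient, bio_class):
        groups.setdefault(label, []).append(value)
    summary = {}
    for label, values in groups.items():
        # 'inclusive' treats the sample minimum and maximum as the 0th and 100th percentiles
        q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
        summary[label] = {"n": len(values), "q1": q1, "median": median, "q3": q3}
    return summary

# Hypothetical nutrient concentrations with their biological class
nutrient = [10, 12, 15, 18, 20, 25, 30, 40, 45, 60]
bio_class = ["Good"] * 5 + ["Moderate"] * 5

summary = class_quartiles(nutrient, bio_class)
for label, stats in summary.items():
    print(label, stats)
```

If the interquartile ranges of adjacent classes overlap heavily, quantile-based boundaries carry little information, which is the situation the caution above refers to.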

Wedge shape detection

Test on variance (F test)

This test compares the variance of the EQR between the 1st and 5th quantile categories of the stressor, using a variance test (var.test) on that subset of the data. The resulting p-value indicates whether the variance of the EQR differs significantly between the two quantile categories. If the p-value is below the 0.05 significance level, there is evidence of heteroscedasticity, which could indicate a wedge-shaped distribution.
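
The quantity behind this test can be sketched as a simple variance ratio. The sketch assumes five equal-count groups ordered by stressor, which may differ from how the toolkit forms its quantile categories, and it stops at the F ratio and its degrees of freedom: the p-value needs an F distribution, which R's var.test supplies but the Python standard library does not. Data and names are hypothetical.

```python
import statistics

def eqr_variance_ratio(stressor, eqr, groups=5):
    """Ratio of EQR variance in the lowest versus the highest stressor group."""
    order = sorted(range(len(stressor)), key=lambda i: stressor[i])
    size = len(order) // groups  # equal-count groups (an assumption)
    lowest = [eqr[i] for i in order[:size]]
    highest = [eqr[i] for i in order[-size:]]
    ratio = statistics.variance(lowest) / statistics.variance(highest)
    return ratio, size - 1, size - 1  # F ratio with its two degrees of freedom

# Hypothetical wedge-shaped data: EQR scatters widely at low stressor values
stressor = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
eqr = [0.90, 0.30, 0.80, 0.50, 0.60, 0.40, 0.35, 0.30, 0.22, 0.20]

ratio, df1, df2 = eqr_variance_ratio(stressor, eqr)
print("variance ratio:", ratio, "df:", df1, df2)
```

A ratio far from 1 is what the variance test would flag, subject to the p-value at the stated degrees of freedom.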

Breusch-Pagan test

The Breusch-Pagan test is used to test for heteroscedasticity in a linear regression model. If the test statistic is significant (i.e., the p-value is below the 0.05 significance level), we can reject the null hypothesis of constant variance and conclude that there is evidence of heteroscedasticity, which could indicate a wedge-shaped distribution.
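
A minimal sketch of the idea follows, assuming a single explanatory variable and Koenker's studentised form of the statistic (n times the R² of an auxiliary regression of the squared residuals on the stressor). Whether the toolkit uses this variant is not stated, so treat it as an illustration; the data and function names are hypothetical.

```python
import math
import statistics

def ols(x, y):
    """Slope and intercept of an ordinary least squares fit of y on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return slope, my - slope * mx

def breusch_pagan(x, y):
    """Studentised Breusch-Pagan statistic and p-value for one explanatory variable."""
    slope, intercept = ols(x, y)
    squared = [(b - (intercept + slope * a)) ** 2 for a, b in zip(x, y)]
    # Auxiliary regression: squared residuals on the explanatory variable
    aux_slope, aux_intercept = ols(x, squared)
    mean_sq = statistics.fmean(squared)
    ss_tot = sum((s - mean_sq) ** 2 for s in squared)
    ss_res = sum((s - (aux_intercept + aux_slope * a)) ** 2 for a, s in zip(x, squared))
    r2 = max(0.0, 1.0 - ss_res / ss_tot)
    lm = len(x) * r2
    # Chi-square survival function with 1 degree of freedom
    return lm, math.erfc(math.sqrt(lm / 2.0))

# Hypothetical data: scatter grows with x (wedge) versus constant scatter (flat)
x = list(range(1, 13))
y_wedge = [0.05 * v * (1 if i % 2 == 0 else -1) for i, v in enumerate(x)]
y_flat = [0.3 if i % 2 == 0 else -0.3 for i in range(12)]

lm_wedge, p_wedge = breusch_pagan(x, y_wedge)
lm_flat, p_flat = breusch_pagan(x, y_flat)
print("wedge:", lm_wedge, p_wedge)
print("flat:", lm_flat, p_flat)
```

The closed-form p-value only holds for one explanatory variable; with more predictors the chi-square degrees of freedom change and a statistics library would be needed.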

Wedge shape plot

Standard Residual plot

Linear Quantile regression results

Linear quantiles

Scatterplot of Nutrient vs. EQR with quantile regression lines for the 10th (grey), 25th (red), 50th (black), 75th (blue) and 90th (grey) percentiles.

* The linear quantiles are restricted to the data range chosen with the range sliders in the Linear methods menu.

Quantile table

Quantile method Confusion matrix metrics

Additive Quantile regression results

* The additive quantiles plot

Scatterplot of Nutrient vs. EQR with quantile regression lines for the 25th (red), 50th (green) and 75th (red) percentiles.

Additive Quantile Confusion matrix metrics

* The additive quantiles Confusion plot

* The additive quantiles Confusion table

Logistic method result

To fit a segmented regression, it is necessary to estimate the break points. In the test data set the curvature is not very marked, so it is best to start with a single break point. Other data sets may clearly require two break points; to accommodate both cases, two different functions have been provided.

Figure A-13 Scatter plot showing relationship between EQR and TP with fitted GAM model and segmented regression lines: a) with two estimated break points; b) with a single estimated break point.

Good-Moderate on Nutrient
High-Good-Moderate on Nutrient
Classification decision tree methods

Classification decision tree plot

Figure A30 Classification decision tree of the selected Nutrient on biological classes (Good or better, Moderate or worse). Threshold value of selected Nutrient for Good and Moderate status. Each node shows the predicted class, the predicted probability of each class and the percentage of observations in the node (High, Good, Moderate).
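
How a tree arrives at a nutrient threshold can be illustrated with a single split chosen by Gini impurity. This is a simplification under stated assumptions: the toolkit's tree may use a different splitting criterion, pruning or several splits, and the data, labels and function names here are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(nutrient, labels):
    """Find the nutrient threshold that minimises the weighted Gini impurity."""
    pairs = list(zip(nutrient, labels))
    values = sorted(set(nutrient))
    best = None
    # Candidate thresholds are midpoints between consecutive observed values
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [c for v, c in pairs if v <= threshold]
        right = [c for v, c in pairs if v > threshold]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if best is None or impurity < best[1]:
            best = (threshold, impurity)
    return best

# Hypothetical data: nutrient concentration and biological class
nutrient = [10, 15, 20, 25, 30, 35, 40, 50, 60, 80]
labels = ["Good or better"] * 3 + ["Moderate or worse"] + ["Good or better"] * 2 \
    + ["Moderate or worse"] * 4

threshold, impurity = best_split(nutrient, labels)
print("threshold:", threshold, "weighted Gini:", round(impurity, 4))
```

The threshold reported at the root node of a binary tree plays the role of the good/moderate nutrient boundary in this reading.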

Classification decision tree plot

Figure A30 Classification decision tree of the selected Nutrient on biological classes (High, Good, Moderate). Threshold value of selected Nutrient for High, Good and Moderate status. Each node shows the predicted class, the predicted probability of each class and the percentage of observations in the node (High, Good, Moderate).

Binary logistic models

Parameters for Binary logistic methods

Select stressor trend

Select stressor trend:

Select stressor Logarithm

Select stressor if logarithmic :

Log 10

Raw

Select to use EQR or EQC

The selection is available only if the dataset contains EQC (Biological Class) values.

* The analysis takes some time: click this button and wait until the results appear.
* The analysis runs only after this button is clicked, so if you change the EQR boundaries or the nutrient range, click it again.

Table 1

The following table shows predicted boundary values from binary logistic models, together with key measures from their confusion matrix. The most appropriate boundary, based on the optimum criteria, is highlighted in orange.

Table 2

The most appropriate boundary is shown here:

Table 3

The descriptive metrics of the model

Binary Logistic methods results

Fig1a

a) scatter plot with model fit and predicted boundary concentrations for the stressor threshold determined by optimal decision.

Fig1c

c) Boxplots showing range of stressor for samples classified by biota. (*dotted lines show boundary values.)

Fig1b

b) Confusion matrix showing number of true and false records and measure.

Fig1d

d) Boxplots showing range of EQR for samples classified using the predicted stressor boundary. (*dotted lines show boundary values.)

Explanatory graphs

Binary Logistic methods results

Figure a: Results of fitting a GLM: a) scatter plot with model fit and predicted boundary concentrations (SE); b) confusion matrix showing number of true and false records and measures; c) boxplots showing range of stressor for waterbodies classified by BQE; d) boxplots showing range of EQR for waterbodies classified using the predicted stressor boundary. Dotted lines show boundary values.

Figure b: Density distribution and box plots showing the range of stressor concentration in sites classified biologically into good or better and moderate or poor status. Data are synthesised to illustrate a good separation of the stressor concentration.

Lineplot

Binary Logistic methods Lineplots

Change in measures used for assessing confusion matrix, with five possible cut-points marked:

a) max CCR,

b) max kappa,

c) FNR=FPR (the Tool Kit MisMatch Method),

d) commission = 0.2,

e) cross-over commission/omission.
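
Three of these cut-points can be sketched by scanning candidate thresholds and computing the measures at each one. The sketch makes several assumptions: a sample counts as "positive" when biology is moderate or worse, a stressor value above the threshold predicts a failure, and only CCR, kappa, FNR and FPR are computed, because the toolkit's exact definitions of commission and omission are not given here. Data and function names are hypothetical.

```python
def confusion(stressor, bio_fail, threshold):
    """Counts of true/false positives and negatives for one stressor threshold."""
    tp = fp = tn = fn = 0
    for value, fails in zip(stressor, bio_fail):
        predicted_fail = value > threshold
        if predicted_fail and fails:
            tp += 1
        elif predicted_fail:
            fp += 1
        elif fails:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn

def measures(tp, fp, tn, fn):
    """Correct classification rate, Cohen's kappa, false negative and false positive rates."""
    n = tp + fp + tn + fn
    ccr = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (ccr - expected) / (1 - expected) if expected < 1 else 0.0
    fnr = fn / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return {"CCR": ccr, "kappa": kappa, "FNR": fnr, "FPR": fpr}

def scan(stressor, bio_fail):
    """Evaluate every observed stressor value as a candidate cut-point."""
    table = [(t, measures(*confusion(stressor, bio_fail, t))) for t in sorted(set(stressor))]
    return {
        "max CCR": max(table, key=lambda row: row[1]["CCR"])[0],
        "max kappa": max(table, key=lambda row: row[1]["kappa"])[0],
        "FNR=FPR": min(table, key=lambda row: abs(row[1]["FNR"] - row[1]["FPR"]))[0],
    }

# Hypothetical data: stressor values and whether biology is moderate or worse
stressor = [10, 15, 20, 25, 30, 35, 40, 50, 60, 80]
bio_fail = [False, False, False, True, False, False, True, True, True, True]

cut_points = scan(stressor, bio_fail)
print(cut_points)
```

The criteria need not agree: in this example the threshold that maximises CCR differs from the one that balances the two error rates, which is why the line plots show several candidate cut-points.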

Stressor Boundary Tester

To test a boundary, adjust the slider to the desired value; the confusion matrix is then regenerated from the new value. The same is true when adjusting the EQR boundaries in the sidebar.
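
What the tester recomputes can be sketched as a cross-classification of each sample by the chosen stressor boundary and the EQR boundary. This is an illustration rather than the toolkit's implementation; the `reversed_trend` flag mimics the "stressor trend is reversed" option, and the data, boundaries and names are hypothetical.

```python
def boundary_test(stressor, eqr, stressor_boundary, eqr_boundary, reversed_trend=False):
    """Cross-classify samples by biology (EQR) and by the stressor boundary."""
    counts = {"both_pass": 0, "both_fail": 0, "biology_only_fails": 0, "stressor_only_fails": 0}
    for s, e in zip(stressor, eqr):
        biology_ok = e >= eqr_boundary
        # With a reversed trend, higher stressor values indicate better status
        stressor_ok = s >= stressor_boundary if reversed_trend else s <= stressor_boundary
        if biology_ok and stressor_ok:
            counts["both_pass"] += 1
        elif not biology_ok and not stressor_ok:
            counts["both_fail"] += 1
        elif stressor_ok:
            counts["biology_only_fails"] += 1
        else:
            counts["stressor_only_fails"] += 1
    return counts

# Hypothetical data with a good/moderate EQR boundary of 0.6
stressor = [10, 20, 30, 40, 50, 60]
eqr = [0.9, 0.8, 0.5, 0.7, 0.4, 0.3]

at_35 = boundary_test(stressor, eqr, stressor_boundary=35, eqr_boundary=0.6)
at_45 = boundary_test(stressor, eqr, stressor_boundary=45, eqr_boundary=0.6)
print(at_35)
print(at_45)
```

Moving the stressor boundary shifts samples between the agreeing and disagreeing cells, which is the change the slider makes visible.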

Select to use EQR or EQC

Select if the stressor trend is reversed:

Confusion matrix metrics

Summary of the methods

Download the report of the analyses

The toolkit can generate a report which includes the figures and tables produced by the analyses. To generate all figures and tables, you need to go through (click) all analytical steps.

Linear methods results

Boxplot table

Boxplot summary

Linear quantiles summary

Binary Logistic summary