Seahorse species predicted habitat distributions and associated environmental data layers covering the shelf seas surrounding the UK

Summary

This dataset contains predicted seahorse habitat distributions for two species (*Hippocampus hippocampus* and *H. guttulatus*) and for the genus combined (Hippocampus hippocampus MAXENT.asc, Hippocampus guttulatus MAXENT.asc and Hippocampus sp. MAXENT.asc, respectively). The rasters are the raw predicted habitat suitability outputs from the maximum entropy (MAXENT, Phillips et al. 2006) species distribution model algorithm. The environmental data layers used for modelling the species distributions are also provided: distance to seagrass habitat (degrees) (Distance From Seagrass.tif), distance to the coastline (DistCo.tif), bathymetry (m) (Bathy.tif), minimum winter and maximum summer SST (°C) (Kriging SST Winter Min.tif and Kriging SST Summer Max.tif, respectively) and winter and summer chlorophyll a concentration (mg m⁻³) (Kriging Chla Winter Smooth.tif and Kriging Chla Summer Smooth.tif, respectively). All raster layers use the WGS84 (EPSG:4326) spatial reference system at a gridded resolution of 0.0042 x 0.0042 degrees. All layers cover the shelf seas surrounding the UK, including the English Channel, Celtic Seas and North Sea, and extend into shallow coastal and estuarine habitats. Raster layers of predicted habitat suitability are provided in ASCII format, whilst environmental predictor layers are provided in TIFF format. This work was prioritised by Natural England’s Seahorse Working Group, funded by Natural England, and conducted in collaboration with The Seahorse Trust, who provided the seahorse occurrence data. The information provided will support marine spatial planning to reduce the broader impact of anthropogenic activities and enable better decision-making to protect these sensitive species and their habitats.
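As an illustration of how the habitat suitability rasters might be read, the sketch below parses a small in-memory grid. It assumes the .asc files follow the Esri ASCII grid layout (six header lines, then cell values), which is the format MAXENT normally writes. The function name `read_ascii_grid` and the sample values are hypothetical and are not taken from the dataset.

```python
import io


def read_ascii_grid(stream):
    """Parse an Esri ASCII grid (assumed layout): six header lines, then cell values."""
    header = {}
    for _ in range(6):
        key, value = stream.readline().split()
        header[key.lower()] = float(value)
    ncols, nrows = int(header["ncols"]), int(header["nrows"])
    nodata = header["nodata_value"]
    # Cell values are whitespace-separated and listed row by row from the top.
    cells = [float(tok) for tok in stream.read().split()]
    if len(cells) != ncols * nrows:
        raise ValueError("cell count does not match header")
    grid = [
        [None if v == nodata else v for v in cells[r * ncols:(r + 1) * ncols]]
        for r in range(nrows)
    ]
    return header, grid


# Made-up 3 x 2 example; real files are far larger.
SAMPLE = """ncols 3
nrows 2
xllcorner -15.0
yllcorner 49.0
cellsize 0.0042
NODATA_value -9999
0.10 0.85 -9999
-9999 0.40 0.02
"""

header, grid = read_ascii_grid(io.StringIO(SAMPLE))
print(header["cellsize"], grid)
```

For the full-size rasters, a dedicated raster library would be a more practical choice than this pure-Python reader.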

Categories

Keywords

N/A

Use limitation statement

There are no public access constraints to this data. Use of this data is subject to the licence identified.

Attribution statement

See Cefas Data Portal for details – link to dataset below.

Technical information

Update frequency

Not planned

Lineage

The outputs from this project advance the development of habitat suitability models for two sensitive species of seahorse (*H. hippocampus* and *H. guttulatus*) and the combined genus by predicting habitat suitability in shallow, nearshore waters (coastal and estuarine sites) that are considered important for both species.
Satellite-derived gridded environmental data (sea surface temperature (SST, °C) and chlorophyll a concentration (mg m⁻³); MODIS-Aqua Level-3, https://oceancolor.gsfc.nasa.gov/l3/ [Downloaded May 2018]) were combined with Environment Agency field-collected water quality data from the Water Quality Archive [Accessed January 2023] using ordinary kriging within the "Geostatistical Analyst" tool in ArcGIS (v10.5) to produce environmental predictor layers that extend to the coastline. SST and chlorophyll summer maximums and winter minimums were calculated. Bathymetry (m) (GEBCO Compilation Group (2022) GEBCO 2022 Grid (doi:10.5285/e0f0bb80-ab44-2739-e053-6c86abc0289c) [Downloaded January 2023]) was interpolated using the natural neighbour method in ArcGIS 10.5.
Seagrass (*Zostera marina*, *Z. noltii* and *Ruppia* spp.) polygon and point data were collated from multiple sources, including Natural England’s open-access polygon data on seagrass coverage in England. Point data (from the year 2000 onwards) were converted to polygons using a buffer and merged with existing polygons to ensure all possible seagrass locations were included in the calculation. A distance from seagrass meadows layer was calculated in ArcGIS 10.5 using the “Euclidean Distance” tool.
A final list of six variables, all with a Pearson's correlation of less than 0.7, was used for modelling, including distance to seagrass habitat, bathymetry, minimum winter and maximum summer SST and chlorophyll concentrations, all of which are provided.
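The correlation screening of predictor layers can be sketched as follows. This is a minimal illustration, assuming the criterion is applied pairwise to the absolute Pearson coefficient. The layer names and values are made up and are not the dataset's, and `correlated_pairs` is a hypothetical helper rather than part of the original workflow.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def correlated_pairs(layers, threshold=0.7):
    """Return layer pairs whose absolute correlation reaches the threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(layers.items(), 2):
        r = pearson(a, b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, r))
    return flagged


# Made-up cell values sampled at the same five locations in each layer.
layers = {
    "layer_a": [6.0, 7.0, 8.0, 9.0, 10.0],
    "layer_b": [14.0, 16.0, 18.0, 20.0, 22.0],    # linear in layer_a
    "layer_c": [-10.0, -40.0, -20.0, -40.0, -10.0],
}

flagged = correlated_pairs(layers)
for name_a, name_b, r in flagged:
    print(f"{name_a} vs {name_b}: r = {r:.2f}")
```

In a screening of this kind, one layer from each flagged pair would typically be dropped before modelling.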
Seahorse records provided by the Seahorse Trust's National Seahorse Database and Natural England, together with records from open-source databases described in Bluemel et al. (2020), were filtered to remove records that were duplicated, fell outside the study area, or had dubious species identification. Prior to modelling, records were reduced to one point per grid cell to reduce sampling biases. All occurrence records from the year 2000 to the present were used for model training and testing (*Hippocampus* genus n = 165, *Hippocampus hippocampus* n = 144, *Hippocampus guttulatus* n = 45); species occurrences are not provided.
Seahorse habitat distributions were modelled using maximum entropy (MAXENT, Phillips et al. 2006). The default convergence threshold of 106 and a maximum of 5,000 iterations were used. A maximum of 10,000 background samples were randomly selected from the study area. Within the model settings, random seed was selected, which means that a different random subset was used for each model run. The number of replicates was set to 10, the replicate run type was set to subsample, and the random test percentage was set to 25%. All other MAXENT settings were default.
Model performance was examined by computing the Area Under the Curve (AUC) score of the receiver operating characteristic (ROC) curve for each species. The AUC is calculated by the MAXENT software as part of the modelling process and is considered to be an effective measure of model performance (Reiss et al. 2011). It represents the relationship between sensitivity and specificity and varies between 0 and 1 (Reiss et al. 2011). It is generally considered that values >0.9 indicate an excellent model fit, values between 0.7 and 0.9 represent a good fit, values <0.7 represent a poor model fit and values <0.5 are no better than random (Reiss et al. 2011). The model for *H. guttulatus* had the highest AUC, with a mean value of 0.99 ± 0.0005 across the 10 runs, followed by *H. hippocampus* (0.97 ± 0.0008) and the genus combined (0.97 ± 0.0006).
Modelling limitations and sampling biases: Known sampling biases exist in the seahorse occurrence data used in this study, and the predictions should therefore be treated with caution. Biases include higher sampling effort in the southwest and English Channel, due to high recreational SCUBA diving intensity in this area, resulting in uneven sampling distributions across the study area. Recorder biases exist because known seahorse locations are regularly visited and targeted by dive companies, and multiple dive trips often report the same seahorse after it has moved to a nearby locality (Garrick-Maidment pers. comm.). MAXENT is sensitive to biased data distributions, and there is a strong assumption of random sampling distribution for presences when random background data are used. The sampling bias therefore violates the underlying model assumptions and could be restricting predictions to southerly localities where sampling biases (increased effort and recorder bias) are known to exist. To improve confidence in the habitat predictions for seahorses, further work should include testing different modelling algorithms. Variations in model predictions were observed by Bluemel et al. (2020), highlighting the need to compare outputs and identify consistent, robust predictions, especially in data-limited, presence-only settings. Further work should also test different habitat suitability thresholds tailored to the intended use of the predicted distributions (Freeman and Moisen 2008) and validation approaches more suited to presence-only situations, because using AUC in isolation can be uninformative when species prevalence is low. Furthermore, sampling bias and spatial autocorrelation in the occurrence data require further work.
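The reduction of occurrence records to one point per grid cell, described above, can be sketched as follows. This is an illustrative sketch only. The grid origin (-15, 49) is an assumption taken from the corner of the metadata bounding box and may not match the raster origin, the coordinates are invented, and `thin_to_grid` is a hypothetical helper that keeps the first record seen in each cell.

```python
import math


def thin_to_grid(records, x_origin, y_origin, cell_size):
    """Keep the first (lon, lat) record that falls in each grid cell."""
    seen = set()
    kept = []
    for lon, lat in records:
        # Integer column/row index of the cell containing this record.
        cell = (
            math.floor((lon - x_origin) / cell_size),
            math.floor((lat - y_origin) / cell_size),
        )
        if cell not in seen:
            seen.add(cell)
            kept.append((lon, lat))
    return kept


# Invented coordinates: the first two fall in the same 0.0042-degree cell.
records = [(-4.1300, 50.3300), (-4.1301, 50.3301), (-1.9790, 50.6900)]

thinned = thin_to_grid(records, x_origin=-15.0, y_origin=49.0, cell_size=0.0042)
print(thinned)
```

Keeping the first record per cell is one possible rule; the source does not state how the retained record was chosen.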
Methods to deal with spatial autocorrelation and sampling biases could include defining a bias grid to include in the modelling process (i.e., known dive sites with higher sampling activity, thus a higher likelihood of observation), using other suitable point process models that account for these biases, or assessing model performance with a block cross-validation approach (Roberts et al. 2017).
Bluemel, J.K., Lynam, C. and Ellis, J. (2020). Fish biodiversity: state and pressure indicators. Cefas Project Report for Defra, 67 pp.
Freeman, E.A. and Moisen, G.G. (2008). A comparison of the performance of threshold criteria for binary classification in terms of predicted prevalence and kappa. *Ecological Modelling*, *217*(1-2): 48-58.
Phillips, S.J., Anderson, R.P. and Schapire, R.E. (2006). Maximum entropy modelling of species geographic distributions. *Ecological Modelling*, *190*(3-4): 231-259.
Reiss, H., Cunze, S., König, K., Neumann, H. and Kröncke, I. (2011). Species distribution modelling of marine benthos: a North Sea case study. *Marine Ecology Progress Series*, *442*: 71-86.
Roberts, D.R., Bahn, V., Ciuti, S., Boyce, M.S., Elith, J., Guillera-Arroita, G., Hauenstein, S., Lahoz-Monfort, J.J., Schröder, B., Thuiller, W. and others (2017). Cross-validation strategies for data with temporal, spatial, hierarchical, or phylogenetic structure. *Ecography*, *40*: 913-929.

Spatial information

Coordinate reference system

N/A

Geographic extent

  • Latitude from: 49 to 61.5
  • Longitude from: -15 to 8
Metadata information

Language

English

Metadata identifier

73a523cb-56dc-42fd-b48b-33d103428e9e


Published by

Centre for Environment, Fisheries & Aquaculture Science

Contact publisher

data.manager@cefas.gov.uk

Dataset reference dates

Creation date

17 April 2023

Revision date

01 February 2024

Publication date

26 September 2023

Period

  • From: 01 January 2000
  • To: 01 February 2024

Data and Supporting Information
Data services and download by area of interest
The Cefas Data Portal contains metadata records and data sets available to download and connect to in support of our commitment to open science. Data is available in the following formats: Binary download.