Using the bathing water data API

The web Application Programming Interface (API) is an easy-to-use tool for selecting and retrieving data. It is useful for building applications that use bathing water profile and bathing water quality data. It is suitable, for example, for building a web page that displays a dashboard of bathing waters in a particular region, or for building a table to include in a report.

The API can be accessed using a web browser or any other program that can retrieve web pages. For example, clicking on this link will list, in a new window or tab, all of the bathing waters in Cornwall.

The previous link retrieved the data formatted as an HTML page. This xml link will retrieve the same data formatted as XML, and this json link will retrieve it in JSON format. Other formats, such as comma-separated values (CSV) and Turtle, are also supported.

As with any web-based API, the interface consists of patterns of web addresses, or URLs. In this introduction we will look at some example patterns to illustrate how the API can be used. A comprehensive description of the structure of the data and the URL patterns of the API can be found in the bathing water quality reference documentation.

A basic example

Let's look in more detail at the parts of the link used above:

http://environment.data.gov.uk/doc/bathing-water.json?_view=basic&_pageSize=200&district=http://data.ordnancesurvey.co.uk/id/7000000000043750

This URL consists of an endpoint URL and some query parameters. The endpoint URL here is http://environment.data.gov.uk/doc/bathing-water.json. The endpoint URL defines the broad category of data to be returned, in this case a list of one or more bathing waters. The .json on the end of this endpoint URL specifies the format of the data to be returned; JSON in this case.

Query parameters modify the data that is returned. Query parameters can be used to filter the data, limit the number of results, and sort the data. Let's look in more detail at the query parameters used in the example:

  • _pageSize=200: the number of results to return. The default is 10. The upper limit is 200.
  • _view=basic: which properties of each bathing water to return. There are a number of predefined views. The basic view is the smallest, containing just the name and type of each bathing water. We will see other predefined views below, as well as how to extend them.
  • district=http://data.ordnancesurvey.co.uk/id/7000000000043750: this is a filter. It says only return bathing waters whose district property has the value http://data.ordnancesurvey.co.uk/id/7000000000043750. That value is the Ordnance Survey's linked data identifier for Cornwall. Thus the filter says to only retrieve bathing waters located in Cornwall.
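The same URL can also be assembled in a program from the endpoint URL and its query parameters. The following is a minimal Python sketch using only the standard library. The helper name build_url is an illustrative assumption and is not part of the API.

```python
from urllib.parse import urlencode

# Endpoint URL from the example above; ".json" selects the JSON format.
ENDPOINT = "http://environment.data.gov.uk/doc/bathing-water.json"

# Ordnance Survey linked data identifier for Cornwall (from the text above).
CORNWALL = "http://data.ordnancesurvey.co.uk/id/7000000000043750"


def build_url(endpoint, params):
    """Join an endpoint URL and a dict of query parameters (hypothetical helper)."""
    # Keep ':' and '/' unescaped so the district URI reads as in the example.
    return endpoint + "?" + urlencode(params, safe=":/")


url = build_url(ENDPOINT, {"_view": "basic", "_pageSize": 200, "district": CORNWALL})
print(url)
```

Keeping the parameters in a dictionary makes it easy to change one of them, such as the view or the page size, without editing the URL by hand.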

The API is not limited to web browsers. Here is an example of using that URL to retrieve data without a browser. The program wget is commonly found on Linux systems (and is also available for OS X and Windows). It retrieves data from a URL, either from the command line or from a script. Here, it is used to save JSON-format data to a file:

wget -O data.json "http://environment.data.gov.uk/doc/bathing-water.json?_view=basic&_pageSize=200&district=http://data.ordnancesurvey.co.uk/id/7000000000043750"
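A script can do the same thing without an external program. The sketch below is a rough Python equivalent of the wget command, using only the standard library. The function name download and its opener parameter are assumptions made for illustration. The opener parameter exists only so that the function can be tried without network access.

```python
import shutil
from urllib.request import urlopen

URL = (
    "http://environment.data.gov.uk/doc/bathing-water.json"
    "?_view=basic&_pageSize=200"
    "&district=http://data.ordnancesurvey.co.uk/id/7000000000043750"
)


def download(url, path, opener=urlopen):
    """Save the response body of `url` to the file `path` (hypothetical helper)."""
    # Stream the response to disk rather than holding it all in memory.
    with opener(url) as response, open(path, "wb") as out:
        shutil.copyfileobj(response, out)


# Uncomment to perform the request (requires network access):
# download(URL, "data.json")
```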

Using a browser to develop API queries

Most of the functionality of the API is exposed via the web pages it generates. For example, when data is retrieved as a web page, there is a drop-down menu listing all the data formats available. Another menu shows the views available on an endpoint. The menu to the right of a property value supports sorting and filtering values.

One way to develop a URL to retrieve data is to retrieve an example using the Web interface, using the menus available to select, filter and sort the data as required. The URL in the browser bar at the end of this process can then be copied and used in an application.

Other endpoints

The http://environment.data.gov.uk/doc/bathing-water endpoint is known as a list endpoint because it returns data about a list of items. Generally, to retrieve information about a particular item, simply append a '/' character, followed by the identifier for the item to an appropriate list endpoint.

For example, the identifier for the Kingsand bathing water is ukk3101-26520. Data about Kingsand can be retrieved by appending the identifier to the URL for the bathing water list endpoint to produce http://environment.data.gov.uk/doc/bathing-water/ukk3101-26520
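Expressed in code, the rule is a simple string operation. This short Python sketch builds the Kingsand URL from the list endpoint and the identifier. The helper name item_url is an illustrative assumption.

```python
LIST_ENDPOINT = "http://environment.data.gov.uk/doc/bathing-water"


def item_url(list_endpoint, identifier):
    """Append '/' and an item identifier to a list endpoint (hypothetical helper)."""
    # rstrip guards against a list endpoint that already ends in '/'.
    return list_endpoint.rstrip("/") + "/" + identifier


kingsand = item_url(LIST_ENDPOINT, "ukk3101-26520")
print(kingsand)
```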

Other endpoints in the data that may be of interest include:

A comprehensive list of endpoints can be found in the reference documentation.

Interpreting the returned data

The data returned by the API includes metadata as well as the data requested. In the XML formatted data the data requested is included in the <items> and <item> elements. Similarly, in the JSON formatted data, the "items" and "item" members hold the requested data. The other elements and members hold metadata.
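As a rough illustration of separating the requested data from the metadata, the Python sketch below searches a parsed JSON document for its "items" member. The sample document is hypothetical and heavily abbreviated: its nesting and property names are assumptions, so check them against a real response before relying on them.

```python
import json

# Hypothetical, abbreviated response. The real payload carries more
# metadata, and the exact nesting shown here is an assumption.
SAMPLE = """
{
  "format": "linked-data-api",
  "result": {
    "itemsPerPage": 200,
    "items": [
      {"name": "Kingsand"},
      {"name": "Another bathing water"}
    ]
  }
}
"""


def find_member(node, key):
    """Return the first value stored under `key` anywhere in parsed JSON, or None."""
    if isinstance(node, dict):
        if key in node:
            return node[key]
        children = list(node.values())
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_member(child, key)
        if found is not None:
            return found
    return None


items = find_member(json.loads(SAMPLE), "items")
print(len(items))
```

Searching for the member by name, rather than hard-coding a path to it, keeps the sketch independent of where the metadata places it.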

Views

The data returned consists of data items and their properties. The value of a property might be a data value such as a string or an integer, or another item, which can also have properties.

Views are used to select which properties are included in the returned data. In our example above, we specified that we wanted to use the basic view. The basic view is the simplest view and just includes type and label properties. If no view is specified, as in this request:

http://environment.data.gov.uk/doc/bathing-water.json?_pageSize=200&district=http://data.ordnancesurvey.co.uk/id/7000000000043750

then a default view is used. The easiest way to see which views are available for a particular endpoint is to use the view drop-down menu in the web page for the endpoint.

There are some built-in views that are common to all endpoints:

  • basic: a minimal view with just type and label properties
  • description: all the properties of an item
  • all: like a description view, but adds the label properties of any sub-items

If none of the predefined views is suitable, it is possible to augment a predefined view by adding to the properties that the API returns. To do this, add a _properties query parameter to the URL. This parameter takes a comma-separated list of properties to include in the view. Here, for example, we have extended the basic view by adding the yearDesignated property:

http://environment.data.gov.uk/doc/bathing-water?_view=basic&_properties=yearDesignated&_pageSize=200&district=http://data.ordnancesurvey.co.uk/id/7000000000043750

To retrieve properties of sub-items, chain the property names together with a dot character. For example, we can add the risk level of the latest short-term pollution prediction by adding the chain of properties latestRiskPrediction.riskLevel.label to the view, like this:

http://environment.data.gov.uk/doc/bathing-water?_view=basic&_properties=yearDesignated,latestRiskPrediction.riskLevel.label&_pageSize=200&district=http://data.ordnancesurvey.co.uk/id/7000000000043750

If we had only added the property latestRiskPrediction, we would have retrieved the URI for that prediction object but none of its properties. We can instead use the dot notation to specify a chain of properties to be retrieved. The example uses this to retrieve the human-friendly name of the risk level of the latest short-term pollution prediction. A star character, *, denotes all properties. To retrieve all of the properties of the risk level of the latest pollution prediction, including the label, use the path latestRiskPrediction.riskLevel.*.
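The _properties value is a comma-separated list in which each entry may itself be a dotted chain. The Python sketch below builds such a URL from its parts. The helper names property_chain and view_url are illustrative assumptions.

```python
from urllib.parse import urlencode

ENDPOINT = "http://environment.data.gov.uk/doc/bathing-water"
CORNWALL = "http://data.ordnancesurvey.co.uk/id/7000000000043750"


def property_chain(*names):
    """Join property names into a dotted chain, e.g. a.b.label."""
    return ".".join(names)


def view_url(endpoint, view, properties, **filters):
    """Build a URL that extends `view` with extra properties (hypothetical helper)."""
    params = {"_view": view, "_properties": ",".join(properties)}
    params.update(filters)
    # Leave ':' and '/' (URIs), ',' (list separator) and '*' (all) unescaped.
    return endpoint + "?" + urlencode(params, safe=":/,*")


url = view_url(
    ENDPOINT,
    "basic",
    ["yearDesignated", property_chain("latestRiskPrediction", "riskLevel", "label")],
    _pageSize=200,
    district=CORNWALL,
)
print(url)
```

Passing "*" as the last name in property_chain produces a path that requests all properties at that point in the chain.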

The names of all the available properties can be found in the bathing water quality reference documentation.

Filters

We have seen one use of a filter so far in our example. The district=http://data.ordnancesurvey.co.uk/id/7000000000043750 parameter restricts the results to bathing waters in Cornwall. In this example, we are specifying the exact value of a property. We can also specify the minimum value of a property by adding min- to its property name, or the maximum value by adding max-. Property chains can be used too.

For example, to select all the bathing waters west of Penzance, we specify a maximum value for the longitude:

http://environment.data.gov.uk/doc/bathing-water?_pageSize=200&max-samplingPoint.long=-5.53552896653342
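The min- and max- prefixes can be generated in code. The following Python sketch builds the range filter for the Penzance example. The helper name range_filter is an illustrative assumption, and the longitude is passed as a string so that it appears in the URL exactly as written.

```python
from urllib.parse import urlencode

ENDPOINT = "http://environment.data.gov.uk/doc/bathing-water"


def range_filter(prop, minimum=None, maximum=None):
    """Return min-/max- query parameters for `prop` (hypothetical helper)."""
    params = {}
    if minimum is not None:
        params["min-" + prop] = minimum
    if maximum is not None:
        params["max-" + prop] = maximum
    return params


params = {"_pageSize": 200}
# Bathing waters west of Penzance: longitude no greater than this value.
params.update(range_filter("samplingPoint.long", maximum="-5.53552896653342"))
url = ENDPOINT + "?" + urlencode(params)
print(url)
```

Supplying both minimum and maximum yields two parameters, which together restrict the property to a range.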

Sorting and paging

The _sort parameter can be used to sort results, while the _pageSize parameter limits the number of results returned. The _pageSize parameter can be used with the _page parameter to select a particular batch of results. For example:

http://environment.data.gov.uk/doc/bathing-water?_sort=samplingPoint.lat&_pageSize=10&_page=2

selects the third batch (the first one is page zero) of ten bathing waters, ordered from south to north.

To invert the sort order, and sort from north to south, insert a '-' before the sort property like this:

http://environment.data.gov.uk/doc/bathing-water?_sort=-samplingPoint.lat&_pageSize=10&_page=2
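Both of the URLs above can be produced by one small function. This Python sketch is a minimal illustration. The helper name sorted_page_url and its arguments are assumptions, not part of the API.

```python
from urllib.parse import urlencode

ENDPOINT = "http://environment.data.gov.uk/doc/bathing-water"


def sorted_page_url(endpoint, sort_by, page, page_size=10, descending=False):
    """Build a URL for one page of sorted results (hypothetical helper)."""
    # A leading '-' before the sort property inverts the sort order.
    sort_value = ("-" if descending else "") + sort_by
    params = {"_sort": sort_value, "_pageSize": page_size, "_page": page}
    return endpoint + "?" + urlencode(params)


# Pages are numbered from zero, so page=2 selects the third batch.
south_to_north = sorted_page_url(ENDPOINT, "samplingPoint.lat", page=2)
north_to_south = sorted_page_url(ENDPOINT, "samplingPoint.lat", page=2, descending=True)
print(south_to_north)
print(north_to_south)
```

Calling the function with page values 0, 1, 2 and so on is one way to walk through a result set in batches.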

Limitations

While a lot can be done with the API, there are some things that require a full query language such as SPARQL. The limitation most often encountered is that a filter cannot express an 'or' operation. It's not possible, for example, to select bathing waters that are west of Penzance OR east of London.
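Where SPARQL is not an option, one possible approximation is to make a separate request for each alternative and combine the results in the client. The Python sketch below shows only the combining step, on hypothetical sample data. The "_about" key is an assumption: substitute whichever member uniquely identifies an item in the data you receive.

```python
def merge_results(*result_lists, key="_about"):
    """Union several lists of items, dropping duplicates by `key`.

    `key` is an assumption: it should name whichever member uniquely
    identifies an item in the returned data.
    """
    seen = set()
    merged = []
    for items in result_lists:
        for item in items:
            identifier = item[key]
            if identifier not in seen:
                seen.add(identifier)
                merged.append(item)
    return merged


# Hypothetical, abbreviated results of two separately filtered requests.
west = [{"_about": "bw/1"}, {"_about": "bw/2"}]
east = [{"_about": "bw/2"}, {"_about": "bw/3"}]
combined = merge_results(west, east)
print(len(combined))
```

This costs one request per alternative, and sorting and paging then have to be handled in the client as well.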

What next

As mentioned above, more detailed information about the bathing water API can be found in the bathing water quality reference documentation.

This API is implemented by an open source tool called Elda. See the Elda documentation for a fuller account of what it can do.