What should I consider a ‘data set’, ‘series’ and ‘service’ for the purposes of recording discovery metadata in MEDIN?

Often it is difficult to decide whether the data that have been collected constitute one data set or many; this is a question of ‘granularity’. It is important to get the granularity right, otherwise you can end up with either too many or too few records, which makes it difficult for users to find what they want. MEDIN has some practical guidance to help you decide:

 • the correct level for a dataset is a cruise, survey or a set of repeat observations with a common purpose,
 • a data set usually constitutes a specifically-funded piece of work,
 • the dataset should be easily extractable from a database for a third party,
 • if you search a portal using different combinations of time, location and parameter and the same data set is returned
   every time, then it is probably too coarse.
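
The last rule of thumb above can be sketched in code. The following is a minimal illustration only, assuming a much simplified metadata record with a year range, a bounding box and a set of parameters; the Record and Query classes, the function names and the example values are all hypothetical and are not part of the MEDIN standard or of any portal software.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class Record:
    """Hypothetical, much simplified discovery metadata record."""
    title: str
    years: Tuple[int, int]                    # (first year, last year), inclusive
    bbox: Tuple[float, float, float, float]   # (west, south, east, north) in degrees
    parameters: FrozenSet[str]


@dataclass(frozen=True)
class Query:
    """Hypothetical portal search by time, location and parameter."""
    year: int
    lon: float
    lat: float
    parameter: str


def matches(record: Record, query: Query) -> bool:
    """Return True if the record would be returned for this query."""
    west, south, east, north = record.bbox
    return (
        record.years[0] <= query.year <= record.years[1]
        and west <= query.lon <= east
        and south <= query.lat <= north
        and query.parameter in record.parameters
    )


def probably_too_coarse(record: Record, queries: List[Query]) -> bool:
    """True if the record is returned for every one of the varied queries."""
    return bool(queries) and all(matches(record, q) for q in queries)


# Illustrative (invented) records and searches.
everything = Record(
    "All data held by the organisation",
    (1950, 2020), (-12.0, 48.0, 4.0, 62.0),
    frozenset({"temperature", "salinity", "chlorophyll"}),
)
survey = Record(
    "Survey of one bay, June 2010",
    (2010, 2010), (-5.2, 50.0, -5.0, 50.2),
    frozenset({"temperature"}),
)
queries = [
    Query(1975, -3.0, 55.0, "salinity"),
    Query(2010, -5.1, 50.1, "temperature"),
    Query(2001, 1.0, 53.0, "chlorophyll"),
]

print(probably_too_coarse(everything, queries))
print(probably_too_coarse(survey, queries))
```

A record that is returned whatever the user searches for gives the user no help in narrowing down what they want, which is why such a record is better split into several datasets.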

To guide the metadata creator further, some draft examples of what should be considered a dataset are given below:

 • A monitoring programme that produces data for the same parameters at the same locations each year
 • A multidisciplinary cruise that has been specifically funded to answer a specific research question and is not anticipated
   to be carried out repeatedly
 • A number of different types of data collected over the course of one year in a specific location that together form an
   Environmental Impact Assessment for a specific activity.
 • A survey carried out over one month in a Special Area of Conservation that has been funded as one piece of work.

It is difficult to be very prescriptive, as the decision of whether a collection of information forms one or more datasets is often case specific. However, in all cases the metadata creator should ask themselves what would most effectively help a user to quickly find the information that they want via the portal.

What is a series?

Given the above definition of a dataset, a series is a collection of datasets which, as INSPIRE defines it, ‘are linked by a common specification’. In this case it is not believed that INSPIRE means an Annex 1, 2 or 3 data specification, but rather a common theme. Some draft suggestions of what constitutes a series are given below:

 • A collection of cruises that are linked by a common research question and so form part of a larger project (e.g. North Sea
   Project, RAPID)
 • A project that has collected data from a range of sources to produce a large number of GIS layers across many topics
   (e.g. MB102 - Biophysical data layers)
 • A project that collects the same theme of data on a regular basis but at distinctly separate geographical locations each
   time (e.g. MCA Civil Hydrography Programme)

Services

MEDIN considers a 'service' to be a publicly available web service (specifically an implementation of a geo-portal software architecture) that provides views of, provides access to, or processes geographic and thematic information. To achieve compliance with INSPIRE, the following definitions of service types are used:

A discovery service – Allows data resources to be found through searches of metadata which describe the service(s), series or data set(s).

A view service – Provides client applications with a visual representation in image form of geo-referenced data via a standard protocol using portrayal rules. A typical client application will present data set(s) or parts of data set(s) to the user via a map-based graphical user interface.

A download service – Gives access to data set(s) or parts of data set(s). A download service provides access to spatial objects, whether they represent discrete or continuous phenomena.

A registry service may be considered a type of 'other service' which provides access to resources describing the data thus allowing correct processing and interpretation. Registry services need to be maintained properly and must have a clear and well-defined governance model. It is important that all registers keep track of all changes so that data created with reference to an outdated register can still be interpreted and that completely superseded or retired register items remain in the register. Every item in the register must be associated with a unique, unambiguous and permanent identifier.
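
The register requirements described above (permanent identifiers, a record of every change, and superseded or retired items that stay in the register) can be illustrated with a small data structure. This is only a sketch under stated assumptions: the class names, status values, identifiers and labels are hypothetical and do not represent any actual MEDIN or INSPIRE register.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Optional, Tuple


@dataclass
class RegisterItem:
    identifier: str                       # permanent; never reused or deleted
    label: str
    status: str = "valid"                 # "valid", "superseded" or "retired"
    superseded_by: Optional[str] = None
    history: List[Tuple[date, str]] = field(default_factory=list)


class Register:
    """Hypothetical register that never deletes items and logs every change."""

    def __init__(self) -> None:
        self._items: Dict[str, RegisterItem] = {}

    def add(self, identifier: str, label: str, when: date) -> None:
        if identifier in self._items:
            raise ValueError(f"identifier {identifier!r} is already in use")
        item = RegisterItem(identifier, label)
        item.history.append((when, "added"))
        self._items[identifier] = item

    def supersede(self, old_id: str, new_id: str, when: date) -> None:
        if new_id not in self._items:
            raise KeyError(new_id)
        old = self._items[old_id]
        old.status = "superseded"
        old.superseded_by = new_id
        old.history.append((when, f"superseded by {new_id}"))

    def retire(self, identifier: str, when: date) -> None:
        item = self._items[identifier]
        item.status = "retired"
        item.history.append((when, "retired"))

    def get(self, identifier: str) -> RegisterItem:
        # Superseded and retired items are still returned, so that data
        # created against an outdated register can still be interpreted.
        return self._items[identifier]


# Invented identifiers and labels, for illustration only.
register = Register()
register.add("P001", "Sea temperature", date(2009, 1, 1))
register.add("P002", "Temperature of the water body", date(2011, 6, 1))
register.supersede("P001", "P002", date(2011, 6, 1))

old = register.get("P001")
print(old.status, old.superseded_by, len(old.history))
```

The design choice here is that changes only ever append to an item's history and alter its status; nothing is removed, so an old identifier always resolves to something meaningful.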

A transformation service – A service for carrying out data content or data structure transformations. This is an auxiliary service type normally connected with a download service. It is designed as a mechanism to enable spatial data sets to be transformed with a view to achieving interoperability. An example would be transforming between coordinate reference systems. A transformation service will usually not be made directly accessible to the general public, but a metadata record of it should exist.
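
As an illustration of the kind of operation a transformation service performs, the sketch below converts WGS 84 longitude and latitude in degrees to spherical Web Mercator coordinates in metres and back again. The choice of these two coordinate reference systems and the function names are assumptions made purely for illustration; a real transformation service would normally rely on an established geodetic library rather than hand-written formulae.

```python
import math

# WGS 84 semi-major axis in metres, used as the sphere radius by Web Mercator.
EARTH_RADIUS_M = 6378137.0


def lonlat_to_web_mercator(lon_deg: float, lat_deg: float):
    """Convert longitude/latitude in degrees to Web Mercator x/y in metres."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(
        math.tan(math.pi / 4 + math.radians(lat_deg) / 2)
    )
    return x, y


def web_mercator_to_lonlat(x: float, y: float):
    """Inverse conversion: Web Mercator x/y in metres to degrees."""
    lon = math.degrees(x / EARTH_RADIUS_M)
    lat = math.degrees(2 * math.atan(math.exp(y / EARTH_RADIUS_M)) - math.pi / 2)
    return lon, lat


# Round trip for an arbitrary example position.
x, y = lonlat_to_web_mercator(-4.1, 50.3)
lon, lat = web_mercator_to_lonlat(x, y)
print(round(lon, 6), round(lat, 6))
```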

An invoke spatial data service – A service invoking the use of spatial data service(s) that allows the definition of data inputs and data outputs expected by the spatial service and defines a workflow or service chain combining multiple services. It also allows the definition of a web service interface managing and accessing (executing) workflows or service chains. The service chains should be expressed in a standard (e.g. XML-based) notation that can be consumed by commercial and open-source orchestration engines from multiple sources. Invoke Spatial Data Services will enable a user or client application to run them without requiring the availability of a GIS.