DCIO Data Centre Inter Operability - Metadata Harvesting and Sharing

Atmospheric EO and Cal/Val data are currently available from multiple sources and data archives across the world, but there is no "one-stop shop" for searching them. To make searching simpler and faster for users, EVDC is setting up harvesting methods for sharing metadata between data archives from a number of national and international projects and programmes. This work is based on the Data Center InterOperability (DCIO) project, an initiative started by the European Space Agency (ESA) in December 2008 and continued in the GECA project (Generic Environment for Calibration/Validation Analysis).

There is growing interest in using Cal/Val data, particularly in connection with the new Sentinel missions and other upcoming satellites, as well as in Copernicus
and related initiatives. Through metadata sharing, EVDC aims to promote cooperation between the various data archives, work towards an open data policy, and
strengthen and make the best possible use of collaboration across EO disciplines.

The main objective of DCIO is to bring EO data centers closer together and foster future collaboration.

The specific objectives of DCIO are:

- Motivate collaboration of peer data centers
- Harmonize metadata standards for EO
- Provide data discovery and search for distributed resources
- Increase exposure of hosted data
- Develop joint data exchange agreements
- Enable uniform user authentication
- Allow systematic exchange of cal/val data
- Provide feedback on data use

DCIO has been developed by data centers in close contact with data providers. The data centers participating in DCIO can host data in a
collaborative manner and share metadata and data through agreed and harmonized channels. DCIO will ensure the visibility of many data resources and
enable access to databases across several EO domains. Principal investigators from networks can benefit from DCIO because their data will be available to a greater
number of data users. This may enable better collaboration between scientists, can increase the quality of data sources, and may eventually lead to more
publications. The OAI-PMH technique and "behind-the-scenes" information are described in detail below.

How to join the DCIO group and become a peer data center:
To register your archive in this initiative and to set up the required protocols, please contact the EVDC team
(see the "Contact Us" tab in the main menu).
Our database management team will help you get started and provide first-line support for setting up harvester services.

As of March 2018, DCIO is set up at the following data centers:

Data centre                                                         | Data Search | Data Harvester
AVDC - Aura Validation Data Center                                  | Search      | oaicat
EVDC - ESA atmospheric Validation Centre                            | Search      | GEOMS oaicat / non-GEOMS oaicat
NDACC - Network for the Detection of Atmospheric Composition Change | Search      | non-GEOMS
ACTRIS - Aerosol, Clouds, and Trace gases Research Infrastructure   | Search      | non-GEOMS
WOUDC - World Ozone and Ultraviolet Radiation Data Centre           | Search      | non-GEOMS CSW / non-GEOMS OWS


DCIO OAI-PMH metadata exchange structure files

DCIO metadata mapping for EO datasets: OAI-PMH XSD files, examples: EOCDCIO-Metadatamapping-V08.zip

XSD file: eocdcio_20110725.xsd


EVDC Harvester

For users already familiar with metadata harvesting, an EVDC harvester is set up for use at https://dcio.evdc.nilu.no/oaicat/


Example of use:

To see which files are new, changed, or deleted in the system since yyyy-mm-dd:

From the main OAICAT link, select ListRecords (Resumption) and type e.g. the following

  • from: 2018-05-01
  • until: 2018-05-30
  • set: dcio
  • metadataPrefix: eocdcio

Press the "Send" button.
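
The form fields above correspond to standard OAI-PMH query parameters, so the same request can also be written as a plain URL. The following is a minimal Python sketch of that mapping. Note the assumptions: the endpoint path "OAIHandler" under the oaicat link is the usual OAICat convention and has not been confirmed for this installation, and the function name list_records_url is only illustrative.

```python
from urllib.parse import urlencode

# Assumption: OAICat installations usually expose the protocol endpoint at
# "<oaicat link>/OAIHandler"; confirm the exact path with the EVDC team.
BASE_URL = "https://dcio.evdc.nilu.no/oaicat/OAIHandler"


def list_records_url(from_date, until_date, set_spec="dcio",
                     metadata_prefix="eocdcio", base_url=BASE_URL):
    """Return the OAI-PMH ListRecords URL for a date window (hypothetical helper)."""
    params = {
        "verb": "ListRecords",
        "from": from_date,       # lower bound, yyyy-mm-dd
        "until": until_date,     # upper bound, yyyy-mm-dd
        "set": set_spec,
        "metadataPrefix": metadata_prefix,
    }
    return base_url + "?" + urlencode(params)


url = list_records_url("2018-05-01", "2018-05-30")
print(url)
```

Opening the printed URL in a browser or an HTTP client should be equivalent to filling in the form, provided the endpoint assumption holds.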

The OAIHandler will return a list of records and their valid metadata. It also gives information on where to download the files.


<metadata>

<eocdcio:CorrelativeProduct xmlns:eocdcio="http://earth.esa.int/eocdcio" xmlns:str="http://exslt.org/strings" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:gml="http://www.opengis.net/gml/3.2" gml:id="EVDC_1199235" version="0.8.0">

<gml:metaDataProperty>

<eocdcio:CorrelativeProductMetaData>

<eocdcio:fileIdentifier>urn:x-eocdcio:evdc:1199235</eocdcio:fileIdentifier>

<eocdcio:metaDataUpdateDate>2017-05-16T12:04:49Z</eocdcio:metaDataUpdateDate>

<eocdcio:fileGenerationDate>2017-05-16T05:12:57Z</eocdcio:fileGenerationDate>

<eocdcio:description>

Routine stratospheric temperature profile from LIDAR at DWD-Hohenpeissenberg Observatory, Germany. See: doi:10.5194/amt-2-125-2009

</eocdcio:description>

<eocdcio:discipline>ATMOSPHERIC.CHEMISTRY</eocdcio:discipline>

<eocdcio:acquisitionType>REMOTE.SENSING</eocdcio:acquisitionType>

<eocdcio:dataLevel/>

<eocdcio:dataVariable>ALTITUDE||</eocdcio:dataVariable>

<eocdcio:dataVariable>ALTITUDE|INDEPENDENT|INITIALIZATION</eocdcio:dataVariable>

<eocdcio:dataVariable>ALTITUDE|INDEPENDENT|NORMALIZATION</eocdcio:dataVariable>

<eocdcio:dataVariable>ALTITUDE.INSTRUMENT||</eocdcio:dataVariable>

<eocdcio:dataVariable>DATETIME||</eocdcio:dataVariable>

<eocdcio:dataVariable>DATETIME.START||</eocdcio:dataVariable>

<eocdcio:dataVariable>DATETIME.STOP||</eocdcio:dataVariable>

<eocdcio:dataVariable>INTEGRATION.TIME||</eocdcio:dataVariable>

<eocdcio:dataVariable>LATITUDE.INSTRUMENT||</eocdcio:dataVariable>

<eocdcio:dataVariable>LONGITUDE.INSTRUMENT||</eocdcio:dataVariable>

<eocdcio:dataVariable>NUMBER.DENSITY|BACKSCATTER|</eocdcio:dataVariable>

<eocdcio:dataVariable>

NUMBER.DENSITY|BACKSCATTER|UNCERTAINTY.COMBINED.STANDARD

</eocdcio:dataVariable>

<eocdcio:dataVariable>

NUMBER.DENSITY|BACKSCATTER|UNCERTAINTY.RANDOM.STANDARD

</eocdcio:dataVariable>

<eocdcio:dataVariable>

NUMBER.DENSITY|BACKSCATTER|UNCERTAINTY.SYSTEMATIC.STANDARD

</eocdcio:dataVariable>

<eocdcio:dataVariable>NUMBER.DENSITY|INDEPENDENT|</eocdcio:dataVariable>

<eocdcio:dataVariable>NUMBER.DENSITY|INDEPENDENT|SOURCE</eocdcio:dataVariable>

<eocdcio:dataVariable>PRESSURE|INDEPENDENT|</eocdcio:dataVariable>

<eocdcio:dataVariable>PRESSURE|INDEPENDENT|SOURCE</eocdcio:dataVariable>

<eocdcio:dataVariable>TEMPERATURE|BACKSCATTER|</eocdcio:dataVariable>

<eocdcio:dataVariable>

TEMPERATURE|BACKSCATTER|RESOLUTION.ALTITUDE.DF.CUTOFF

</eocdcio:dataVariable>

<eocdcio:dataVariable>

TEMPERATURE|BACKSCATTER|RESOLUTION.ALTITUDE.DF.NORMALIZED.FREQUENCY

</eocdcio:dataVariable>

<eocdcio:dataVariable>

TEMPERATURE|BACKSCATTER|RESOLUTION.ALTITUDE.DF.TRANSFER.FUNCTION

</eocdcio:dataVariable>

<eocdcio:dataVariable>

TEMPERATURE|BACKSCATTER|RESOLUTION.ALTITUDE.IMPULSE.RESPONSE

</eocdcio:dataVariable>

<eocdcio:dataVariable>

TEMPERATURE|BACKSCATTER|RESOLUTION.ALTITUDE.IMPULSE.RESPONSE.FWHM

</eocdcio:dataVariable>

<eocdcio:dataVariable>

TEMPERATURE|BACKSCATTER|UNCERTAINTY.COMBINED.STANDARD

</eocdcio:dataVariable>

<eocdcio:dataVariable>

TEMPERATURE|BACKSCATTER|UNCERTAINTY.RANDOM.STANDARD

</eocdcio:dataVariable>

<eocdcio:dataVariable>

TEMPERATURE|BACKSCATTER|UNCERTAINTY.SYSTEMATIC.STANDARD

</eocdcio:dataVariable>

<eocdcio:origin>EXPERIMENTAL</eocdcio:origin>

<eocdcio:originator>

<eocdcio:PersonIdentification>

<eocdcio:personName>Steinbrecht,Wolfgang</eocdcio:personName>

</eocdcio:PersonIdentification>

</eocdcio:originator>

<eocdcio:investigator>

<eocdcio:PersonIdentification>

<eocdcio:personName>Steinbrecht,Wolfgang</eocdcio:personName>

</eocdcio:PersonIdentification>

</eocdcio:investigator>

<eocdcio:submitter>

<eocdcio:PersonIdentification>

<eocdcio:personName>Steinbrecht,Wolfgang</eocdcio:personName>

</eocdcio:PersonIdentification>

</eocdcio:submitter>

</eocdcio:CorrelativeProductMetaData>

</gml:metaDataProperty>

<gml:validTime>

<gml:TimePeriod gml:id="EVDC_1199235_T">

<gml:beginPosition>2017-05-15T20:40:12Z</gml:beginPosition>

<gml:endPosition>2017-05-16T02:38:36Z</gml:endPosition>

</gml:TimePeriod>

</gml:validTime>

<gml:using>

<eocdcio:CorrelativeProductEquipment gml:id="EVDC_1199235_E">

<eocdcio:platform>GROUNDBASED</eocdcio:platform>

<eocdcio:instrumentModelType>LIDAR.TEMPERATURE</eocdcio:instrumentModelType>

<eocdcio:instrumentModelOrganisation>DWD</eocdcio:instrumentModelOrganisation>

<eocdcio:instrumentModelIdentifier>001</eocdcio:instrumentModelIdentifier>

</eocdcio:CorrelativeProductEquipment>

</gml:using>

<gml:target>

<eocdcio:CorrelativeProductSpatialExtent gml:id="EVDC_1199235_SE">

<eocdcio:fixedLocationName>HOHENPEISSENBERG</eocdcio:fixedLocationName>

<eocdcio:pointLocation>

<gml:Point gml:id="EVDC_1199235_P1" srsName="CRS:84">

<gml:pos>47.8 11.02</gml:pos>

</gml:Point>

</eocdcio:pointLocation>

<eocdcio:lowestAltitude>14930.0</eocdcio:lowestAltitude>

<eocdcio:highestAltitude>67390.0</eocdcio:highestAltitude>

</eocdcio:CorrelativeProductSpatialExtent>

</gml:target>

<gml:resultOf>

<eocdcio:CorrelativeProductResult gml:id="EVDC_1199235_PR">

<eocdcio:product>

<eocdcio:ProductInformation>

<eocdcio:onlineResourceURL>

ftp://ftp.evdc.nilu.no/groundbased/lidar.temperature/hohenpeissenberg/groundbased_lidar.temperature_dwd001_hohenpeissenberg_20170515t204012z_20170516t023836z_001.hdf

</eocdcio:onlineResourceURL>

<eocdcio:fileAccessConstraints>This file may only be accessed by ....</eocdcio:fileAccessConstraints>

<eocdcio:dataFileHash>e46656ffc267ad2f891acdd43d7917b9</eocdcio:dataFileHash>

<eocdcio:fileFormat/>

<eocdcio:cloudCoverPercentage uom="%">12</eocdcio:cloudCoverPercentage>

<eocdcio:snowCoverPercentage uom="%">0</eocdcio:snowCoverPercentage>

</eocdcio:ProductInformation>

</eocdcio:product>

</eocdcio:CorrelativeProductResult>

</gml:resultOf>

</eocdcio:CorrelativeProduct>

</metadata>
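
A harvesting client typically needs only a few fields from such a record: the identifier, the time period, the download location, and the file hash. The sketch below reads them with Python's standard xml.etree.ElementTree module. It uses a trimmed copy of the record shown above; the function name summarize_record and the dictionary keys are illustrative choices, not part of the DCIO schema.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as declared in the sample record.
NS = {
    "eocdcio": "http://earth.esa.int/eocdcio",
    "gml": "http://www.opengis.net/gml/3.2",
}

# Trimmed copy of the sample record; only the elements read below are kept.
SAMPLE = """\
<eocdcio:CorrelativeProduct xmlns:eocdcio="http://earth.esa.int/eocdcio"
    xmlns:gml="http://www.opengis.net/gml/3.2" gml:id="EVDC_1199235" version="0.8.0">
  <gml:metaDataProperty>
    <eocdcio:CorrelativeProductMetaData>
      <eocdcio:fileIdentifier>urn:x-eocdcio:evdc:1199235</eocdcio:fileIdentifier>
      <eocdcio:dataVariable>ALTITUDE||</eocdcio:dataVariable>
      <eocdcio:dataVariable>
        TEMPERATURE|BACKSCATTER|UNCERTAINTY.COMBINED.STANDARD
      </eocdcio:dataVariable>
    </eocdcio:CorrelativeProductMetaData>
  </gml:metaDataProperty>
  <gml:validTime>
    <gml:TimePeriod gml:id="EVDC_1199235_T">
      <gml:beginPosition>2017-05-15T20:40:12Z</gml:beginPosition>
      <gml:endPosition>2017-05-16T02:38:36Z</gml:endPosition>
    </gml:TimePeriod>
  </gml:validTime>
  <gml:resultOf>
    <eocdcio:CorrelativeProductResult gml:id="EVDC_1199235_PR">
      <eocdcio:product>
        <eocdcio:ProductInformation>
          <eocdcio:onlineResourceURL>
            ftp://ftp.evdc.nilu.no/groundbased/lidar.temperature/hohenpeissenberg/groundbased_lidar.temperature_dwd001_hohenpeissenberg_20170515t204012z_20170516t023836z_001.hdf
          </eocdcio:onlineResourceURL>
          <eocdcio:dataFileHash>e46656ffc267ad2f891acdd43d7917b9</eocdcio:dataFileHash>
        </eocdcio:ProductInformation>
      </eocdcio:product>
    </eocdcio:CorrelativeProductResult>
  </gml:resultOf>
</eocdcio:CorrelativeProduct>
"""


def summarize_record(xml_text):
    """Extract a few fields from one CorrelativeProduct record (hypothetical helper)."""
    root = ET.fromstring(xml_text)

    def text(path):
        # Values may be surrounded by whitespace, as in the sample record.
        return root.findtext(path, default="", namespaces=NS).strip()

    return {
        "identifier": text(".//eocdcio:fileIdentifier"),
        "begin": text(".//gml:beginPosition"),
        "end": text(".//gml:endPosition"),
        "url": text(".//eocdcio:onlineResourceURL"),
        "hash": text(".//eocdcio:dataFileHash"),
        "variables": [
            (el.text or "").strip()
            for el in root.iterfind(".//eocdcio:dataVariable", NS)
        ],
    }


record = summarize_record(SAMPLE)
print(record["identifier"], record["begin"], record["end"])
```

Stripping whitespace matters here because several values in the record, including the download URL, sit on their own lines inside the element.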

About DCIO - OAI-PMH

DCIO chose the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) for interchanging catalog metadata. Some peer data centers are fully OAI-PMH compliant, while others have chosen to provide metadata through a custom-built mechanism. The purpose of OAI-PMH is to transfer large numbers of records from one data repository to another. All metadata becomes searchable in both archives, but the data itself remains in the primary archive.

To use the OAI Harvester module, a site administrator must first register an existing data repository by name and base URL. The base URL specifies the Internet host, port, and path of an HTTP server acting as a repository, without any parameters. After the repository is saved, the OAI Harvester module detects and saves all the necessary information, such as the descriptive information about the data provider, the list of metadata formats and sets, and the availability of the server. Once the last record has been fetched from the data provider and processed, the result is a collection of all new, updated, and deleted records in the archive.
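
The fetch-until-the-last-record behaviour described above can be sketched in a few lines of Python. In OAI-PMH, a large ListRecords result is split into pages: each response may carry a resumptionToken, the follow-up request sends only the verb and that token, and deleted records are flagged with status="deleted" in the record header. The function names, the canned responses, and the record identifiers below are all made up for illustration; this is not the OAI Harvester module's actual code.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import urlopen

OAI = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def http_fetch(url):
    """Default fetcher: return the response body for a URL as bytes."""
    with urlopen(url) as response:
        return response.read()


def harvest(base_url, request, fetch=http_fetch):
    """Yield (identifier, datestamp, is_deleted) for every record, page by page."""
    params = {"verb": "ListRecords", **request}
    while params:
        root = ET.fromstring(fetch(base_url + "?" + urlencode(params)))
        for header in root.iterfind(".//oai:record/oai:header", OAI):
            yield (
                header.findtext("oai:identifier", namespaces=OAI),
                header.findtext("oai:datestamp", namespaces=OAI),
                header.get("status") == "deleted",
            )
        token = root.findtext(".//oai:resumptionToken", default="", namespaces=OAI)
        token = token.strip()
        # A follow-up request carries only the verb and the token;
        # an empty or missing token means the list is complete.
        params = {"verb": "ListRecords", "resumptionToken": token} if token else None


# Canned two-page response so the sketch runs without network access.
PAGES = {
    "first": """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"><ListRecords>
      <record><header><identifier>oai:example:1</identifier>
        <datestamp>2018-05-02</datestamp></header></record>
      <resumptionToken>page2</resumptionToken></ListRecords></OAI-PMH>""",
    "page2": """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"><ListRecords>
      <record><header status="deleted"><identifier>oai:example:2</identifier>
        <datestamp>2018-05-03</datestamp></header></record>
      <resumptionToken/></ListRecords></OAI-PMH>""",
}


def fake_fetch(url):
    """Stand-in for http_fetch that serves the canned pages above."""
    return PAGES["page2" if "resumptionToken=page2" in url else "first"]


records = list(harvest(
    "http://repository.example/oai",  # placeholder base URL
    {"from": "2018-05-01", "until": "2018-05-30",
     "set": "dcio", "metadataPrefix": "eocdcio"},
    fetch=fake_fetch,
))
print(records)
```

Passing the fetch function as a parameter keeps the paging logic testable offline; a real harvest would use the default http_fetch with the repository's registered base URL.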

Read more at https://www.openarchives.org/pmh/