1 October 2015
Building 30.10, KIT (Campus South)
Europe/Berlin timezone

Data and software preservation for open science - connecting publications with Cyberinfrastructure

1 Oct 2015, 09:30
1h
Lecture Hall NTI (Building 30.10, KIT (Campus South))

Speaker

Jarek Nabrzyski (University of Notre Dame)

Description

Many science domains are exploring mechanisms to preserve research data artifacts so that they can be reused in the future, consistent with the scientific principle of reproducibility. For the computational sciences, research artifacts include not just data but also the software that produced the data. Without access to that software, it is difficult to place the resulting data in proper scientific context, and therefore to judge it. One possible mechanism for preserving and sharing data is to apply the principles of Linked Open Data. These principles allow data to be discovered, shared, understood and reused for scientific research using web standards for handling data, and they encourage publication of data under an open license. However, it has been observed that linked data without proper context is just more data in a different schema. This observation is particularly true in the sciences, where requirements such as provenance, quality, credit, attribution and methods are critical to the scientific process. For the computational sciences, the context and provenance of data artifacts require a connection to the software that produced those artifacts, as well as to the conceptual and mathematical models that constitute the algorithms instantiated in that software. In this talk, various US-based projects addressing the issues of scientific reproducibility will be introduced. Next, a model for publishing software that facilitates connecting data artifacts to the software algorithms that produced them, using a Linked Open Data model, will be presented in detail. This model extends work done by different scientific communities to share measurements in a standardized way that captures the provenance, methods, conditions and units of the measurement process, and it extends that conceptualization to include the model and algorithm that constitute a “computational measurement”.
