23 January 2025 to 20 February 2025
Europe/Berlin timezone

Direct DOI Dataset Access

Not scheduled
20m

Speaker

Alexander Paul Millar (IT (Research and Innovation in Scientific Co))

Description

Associating datasets with persistent identifiers (globally unique IDs) is crucial for making research outputs available for open science and compliant with the FAIR principles (Findable, Accessible, Interoperable, and Reusable). Doing so enhances the value of the data and allows others to reproduce existing work or to use the data in new research.

Currently, several platforms and institutes allow data to be stored under a DOI, a type of persistent identifier. However, no standard exists that allows a machine to access or download the data behind a DOI. Repositories typically either offer no way to download data automatically or have adopted a proprietary solution. Without a widely deployed standard, any client support for accessing or downloading data from a DOI is necessarily incomplete.

You would work on this problem. A solution has already been proposed that builds on existing approaches from other contexts (HTTP content negotiation and a dataset description language). You would implement proof-of-concept code, based on that design, to demonstrate that the approach can work. This will involve both modifying existing production software and developing new code to demonstrate the benefits of the approach.

Special Qualifications

Requirements: experience with programming and Git

Beneficial: knowledge of network protocols (HTTP in particular) and of Go (golang); experience with filesystems and FUSE.

Group IT
Project Category B5. Computing
DESY Site Hamburg

Primary authors

Alexander Paul Millar (IT (Research and Innovation in Scientific Co))
Dr Kilian Schwarz (IT (IT Scientific Computing))
Melanie Nentwich (None)
