LSDMA Technical Forum
Thursday 6 October 2016
Introduction - Marcus Hardt (KIT)
08:50 - 08:59
Room: FTU Aula
09:00
Nanoscience Foundries & Fine Analysis (NFFA) - Thomas Jejkal (KIT)
09:00 - 09:20
Room: FTU Aula
Nanoscience Foundries & Fine Analysis (NFFA) is a research project funded by the EU's H2020 framework programme for research and innovation. The vision of the NFFA project is to provide free and transnational access to the widest range of tools for research at the nanoscale. The NFFA infrastructure is distributed all over Europe, providing synchrotron, FEL and neutron radiation sources for growth, nano-lithography, nano-characterization, theory and simulation, and fine analysis. To make the research data stored in the archives attached to the different facilities manageable, retrievable and sharable, a distributed Information and Data Repository Platform (IDRP) is being built by WP8 of the NFFA project. This talk gives a short overview of the overall architecture of the IDRP, the adopted technologies and novel approaches. Finally, challenges encountered when designing the distributed repository are presented and discussed.
09:20
Automated Provenance Management for Enabling Scientific Data Reproducibility - Ajinkya Prabhune (KIT)
09:20 - 09:40
Room: FTU Aula
Provenance traces the history of data within workflows and enables researchers to validate and compare their results. Modelling workflows in the ProvONE standard provenance model is an arduous task that lacks an automated approach. To overcome this limitation, in this talk we present a novel graph drawing algorithm for generating ProvONE prospective provenance graphs. These graphs are then updated with the relevant retrospective provenance during the execution of the workflow. Finally, we show the provenance management architecture for a scientific data repository and present various queries for retrieving provenance information.
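The split the abstract describes can be illustrated with a minimal sketch: a workflow plan (prospective provenance) that is updated with execution records (retrospective provenance), which can then be queried. The class and method names below are illustrative stand-ins, not the actual ProvONE vocabulary or the algorithm presented in the talk.

```python
from dataclasses import dataclass, field

@dataclass
class Program:            # a planned workflow step (prospective)
    name: str

@dataclass
class Execution:          # one recorded run of a Program (retrospective)
    program: str
    started: str
    ended: str

@dataclass
class Workflow:
    programs: list = field(default_factory=list)     # prospective graph nodes
    channels: list = field(default_factory=list)     # (producer, consumer) edges
    executions: list = field(default_factory=list)   # retrospective records

    def add_step(self, name, after=None):
        # extend the prospective graph; 'after' adds a data-flow edge
        self.programs.append(Program(name))
        if after:
            self.channels.append((after, name))

    def record_run(self, program, started, ended):
        # attach retrospective provenance during workflow execution
        self.executions.append(Execution(program, started, ended))

    def runs_of(self, program):
        # a simple provenance query: all recorded runs of one step
        return [e for e in self.executions if e.program == program]

wf = Workflow()
wf.add_step("ingest")
wf.add_step("analyse", after="ingest")
wf.record_run("analyse", "09:20", "09:25")
print([e.started for e in wf.runs_of("analyse")])   # ['09:20']
```

A real ProvONE graph would use its standardised classes (programs, ports, channels, executions) and an RDF serialisation; the sketch only mirrors the prospective/retrospective separation.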
09:40
Services for long-term data storage - project bwDataArchiv - Jos van Wezel (Karlsruhe Institute of Technology)
09:40 - 10:00
Room: FTU Aula
Requirements from data archives and repositories form the basis of the reliable long-term storage infrastructure built in the bwDataArchiv project. Large volumes of data from scientific experiments, including HPC simulations, that must be retained but need not occupy precious on-line analysis storage can be stored easily by using common storage protocols, reliably by using end-to-end checksums, and economically by using magnetic tape. Through available technologies, in combination with developments from other LSDMA/DSIT work packages, the infrastructure is effectively used by national and international projects.
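The end-to-end checksum idea mentioned above can be sketched in a few lines: compute a digest at the source, recompute it after retrieval, and accept the copy only if both match. This is a minimal illustration; bwDataArchiv's actual choice of hash algorithm, transfer protocols and tape backend is not specified here, and the in-memory dict merely stands in for the archive.

```python
import hashlib

def sha256_of(data: bytes, chunk_size: int = 1 << 20) -> str:
    # stream the digest in chunks, as one would for large archive files
    h = hashlib.sha256()
    for i in range(0, len(data), chunk_size):
        h.update(data[i:i + chunk_size])
    return h.hexdigest()

def verified_roundtrip(payload: bytes, store: dict, key: str) -> bool:
    digest = sha256_of(payload)              # checksum at the source
    store[key] = payload                     # stand-in for the archive transfer
    return sha256_of(store[key]) == digest   # re-verify after retrieval

archive = {}
print(verified_roundtrip(b"simulation output", archive, "run-001"))  # True
```

The point of checking end to end, rather than per hop, is that a single comparison covers every intermediate copy, cache and protocol translation between the user and the tape.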
10:00
Towards Information Infrastructures in DFG Collaborative Research Centres - Richard Grunzke (TU Dresden), Rainer Stotzka (KIT)
10:00 - 10:30
Room: FTU Aula
In this slot we present two new SFBs:

Collaborative Research Centre "Volition and Cognitive Control": Data Management, Workflow Optimization and Science Gateway (Richard Grunzke)

The overarching aim of the Collaborative Research Centre (CRC) is to elucidate the cognitive and neural mechanisms underlying adaptive volitional control as well as impaired control in selected mental disorders. Researchers of the CRC collect a wide variety of data, such as MRI images, EEG, genetic, or behavioral data, from participants in various research projects. Due to the increasing number of participants, multimodal assessment, improved imaging technologies and new research projects, the amount of CRC data stored in diverse files is steadily increasing. The INF project will design and build a system that manages the data including metadata, enables its analysis using HPC resources, enables data sharing, and integrates with the existing science gateway.

----------------------------

Collaborative Research Center "Episteme in Motion": Data and Analysis Infrastructure (Danah Tonne, Rainer Stotzka)

The Collaborative Research Center "Episteme in Motion" is dedicated to the examination of processes of knowledge change in European and non-European pre-modern cultures. The INF project of the CRC develops methods and practices for the digital exploitation and visualization of epistemic changes within long-term processes of transmission of pre-modern corpora. It uses travelling manuscripts, codices, prints, albums and library inventories as examples of systemic transfer processes. The aim is to build a repository for the digital data objects and their metadata useful for the specific purposes of all the projects of the CRC. By cooperating closely with the project "Manuscripts in Motion: Tools for Documenting, Analysing and Visualising the Dynamics of Textual Topographies", we test forms of cooperation between the humanities and applied computer science. Since a cooperation between three institutions is to be established, the INF project will also serve as a pilot for DARIAH-DE for the implementation of complex institutional cooperation.
10:30
Coffee
10:30 - 11:00
Room: FTU Aula
11:00
GeRDI - Generic Research Data Infrastructures - Richard Grunzke (TU Dresden)
11:00 - 11:20
Room: FTU Aula
The new ~3 million Euro DFG project GeRDI (Generic Research Data Infrastructure) aims at building and connecting research data management systems. The project involves significant efforts in the areas of requirement analysis, implementation, pilot operation, and sustainability. Scientists across Germany will be enabled to store, search for, and re-use cross-disciplinary research data.
11:20
BigStorage - Michael Kuhn (Universität Hamburg)
11:20 - 11:40
Room: FTU Aula
BigStorage is a European Training Network (ETN) whose main goal is to train future data scientists in order to enable them to apply holistic and interdisciplinary approaches for taking advantage of a data-overwhelmed world. Such expertise is mandatory to enable researchers to propose appropriate answers to application requirements while leveraging advanced data storage solutions unifying cloud and HPC storage facilities. Four representative big data application use cases are studied to set up the foundation for the project: the Human Brain Project (HBP), the Square Kilometre Array (SKA), climate science and smart cities. More information is available at http://bigstorage-project.eu/.
11:40
Thrill - Timo Bingmann (KIT)
11:40 - 12:00
Room: FTU Aula
12:00
dCache: new and exciting features - Paul Millar (DESY)
12:00 - 12:20
Room: FTU Aula
The dCache project develops and supports software for storing large volumes of scientific data in a POSIX namespace, optionally storing some of the data on tape, with scalable performance and support for many protocols. This talk focuses on recent improvements that are already available or anticipated for the next major release. We are introducing a new token-based authorisation scheme that will allow easy sharing of data and external user management. The interface for managing the quality of service users expect for their files is being improved, along with a new web interface for managing data, which provides users with an enriched view of their data. Under the hood, core services are being updated so that they can be scaled horizontally and no longer constitute single points of failure. We are also adding support within dCache for clustered storage, initially targeting Ceph, a popular object store.
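The token-based authorisation mentioned above follows a common pattern: instead of authenticating with X.509 certificates or passwords, the client presents a bearer token in the HTTP Authorization header, and the token itself can be handed to a collaborator to share access. The sketch below only assembles such a request; the endpoint, path and token value are invented for illustration, and dCache's actual token format and API are not reproduced here.

```python
# Hedged sketch of bearer-token authorisation over HTTP.
# No network call is made; we only build the request a client would send.

def authorized_request(method: str, url: str, token: str) -> dict:
    # Any holder of the token gets the access it encodes, which is what
    # makes token-based sharing of data straightforward.
    return {
        "method": method,
        "url": url,
        "headers": {"Authorization": f"Bearer {token}"},
    }

req = authorized_request(
    "GET",
    "https://dcache.example.org/data/run-001.dat",  # hypothetical endpoint
    "shared-token-abc123",                          # hypothetical token
)
print(req["headers"]["Authorization"])  # Bearer shared-token-abc123
```

Because authorisation travels with the request rather than with the user's identity, the server can validate tokens issued by an external user-management service without maintaining local accounts.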