- Sign up for one of the available DESY or European XFEL tours that take place on Thursday afternoon.
Greetings from Senator Katharina Fegebank (BWFGB), UHH president Hauke Heekeren, DESY director Beate Heinemann, TUHH president Andreas Timm-Giel
Welcome from CDCS spokesperson Matthias Rarey
Session Chair: Matthias Rarey
Physical and natural sciences in general have entered the domain of Big Data, and the quantity, complexity, and rate at which data are produced, analyzed, or need to be simulated
are imposing considerable stress on conventional computing to extract relevant scientific information in a timely manner. Novel, disruptive techniques such as Artificial Intelligence
and, at a different level, Quantum...
In recent years, the fields of data science and bio science have been growing closer together; many state-of-the-art approaches for DNA sequencing and metagenomics, for example, make use of machine learning. However, the two disciplines often just apply techniques or use cases from the other field without truly engaging in the details, particularities, and opportunities that the most recent...
Moderated by Marc Wenskat.
A scientist introduces a topic or situation in a scientific context. Then, they present one or more statements and the audience must guess whether these statements are True or False.
The amount, size, and complexity of astronomical data sets have been growing rapidly in recent decades. Now, with new technologies and dedicated survey telescopes, the databases are growing even faster. Besides dealing with poly-structured and complex data, sparse data has become a field of growing scientific interest. By applying technologies from the fields of computer science, mathematics, and...
In the last 15 years there has been a spectacular rise in the data volumes acquired in X-ray diffraction experiments. In 2006, around the time I started my PhD, the world’s first soft X-ray free-electron laser, FLASH in Hamburg, was collecting diffraction patterns at roughly 1 Hz. Nowadays we're collecting data at the European XFEL at peak rates in the megahertz range.
This has enabled the...
Rapidly growing Omics data are providing an unprecedented opportunity to gain novel insights into biological systems and disease processes. Network modeling is a powerful approach that can be used to integrate complex information from multiple types of Omics data. In the field of network medicine, our group has developed a suite of methods that support: (1) effective integration of multi-omic...
You are invited to discuss 10 current problems from the 5 interdisciplinary units. Move freely between the standing tables in the marquee and share your thoughts.
Monitoring is a key element in guaranteeing the state of health of a system, all the more important when the system is critical, autonomous, and/or operating remotely. Anomaly detection and diagnosis are two main aspects. While model-based approaches have been around for a long time, they have been challenged in recent years by data-based approaches which proceed with an exploration of...
What is controlled and how does it work?
Current and future impact of Data Science/Machine Learning/Deep Learning on natural sciences
Moderated by Klaus Ehret.
Six panelists with expertise in natural sciences and/or computer science will discuss the similarities and differences between their fields, ubiquitous challenges of the digitalisation era, and the convergence and divergence they see between their fields and computer science.
Alternative splicing (AS) is a major contributor to transcriptome and proteome diversity in health and disease. A plethora of tools has been developed for studying alternative splicing in RNA-seq data. Previous benchmarks focused on isoform quantification and mapping. They neglected event detection tools which arguably provide the most detailed insights into the alternative splicing process....
Protein-protein interaction (PPI) networks are an important resource in systems biology. PPIs are identified in tedious experiments. Due to the high number of possible interactions, efforts are limited to testing only major protein isoforms, hence neglecting the considerable influence of alternative splicing (AS) on the interactome.
To close this gap, we developed DIGGER (Domain...
The design and study of plasma-based accelerators rely heavily on numerical simulations with particle-in-cell codes. These simulations accurately model the interaction between plasmas, lasers, and charged particle beams, but are often computationally expensive. Thus, given the wide range of physical parameters involved, optimizing the accelerator performance requires efficient methods that...
Viral infections by RNA viruses have emerged in the past decade as a global health challenge, given their enormous epidemic potential and their severe pathological outcomes. Despite a relatively high genetic similarity and overall conserved replication strategies, these viruses evolved finely tuned and divergent mechanisms of host exploitation, resulting in extraordinarily distinct tropisms and...
Bacterial insecticides are important in green agricultural pest control and the combat against arboviruses. They act very specifically on target organisms, thus harming neither other insects nor vertebrates (including humans). Their occurrence as native nanocrystals and lack of structural homologues hinder current structure determination efforts to understand their mode of action. AlphaFold v2.0...
At the time when FLASH was constructed, controlling a high-repetition SASE FEL posed a number of challenges, such as the extraordinary requirements on timing at the femtosecond scale and the high number of electron bunches accelerated by the superconducting linac. In particular, operating the FLASH1 and FLASH2 beamlines in parallel from the same accelerator requires a reliable...
DAPHNE4NFDI (DAta from PHoton and Neutron Experiments for NFDI) is one of 19 consortia receiving funding as part of the German National Research Data Infrastructure (NFDI e.V.). The aim of DAPHNE4NFDI is to create a comprehensive infrastructure to process research data from large scale photon and neutron infrastructures according to the FAIR principles (Findable, Accessible, Interoperable,...
Area X-ray detectors have become bigger (having more megapixels) and faster (measuring more frames per second). This makes it possible to measure dynamical processes in protein crystals with high resolution and on time scales below 1 ms. The price to pay is the amount of data that such detectors generate. Unfortunately, storage volume is growing much more slowly. Therefore, there is an increasing gap between the data...
Single-cell RNA sequencing (scRNA-seq) technology provides an unprecedented opportunity to understand gene functions and interactions at single-cell resolution. Various computational methods have been developed for differential expression and co-expression analysis in scRNA-seq data. However, little attention has been paid to differential co-expression analysis that potentially holds valuable...
One of the most prominent challenges in the field of diffractive imaging is the phase retrieval problem: In order to reconstruct an object from its diffraction pattern, the inverse Fourier transform must be computed. This is only possible given the full complex-valued diffraction data, i.e. magnitude and phase. However, in diffractive imaging, generally only magnitudes can be directly measured...
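The role of the missing phase can be illustrated with a small numerical sketch. It is not taken from the contribution itself: the 1D test signal, the naive DFT helpers `dft` and `idft`, and the circular shift are all illustrative assumptions. It shows that two different objects can share the same Fourier magnitudes, and that inverting the magnitudes alone does not return the original object.

```python
import cmath

def dft(signal):
    """Naive discrete Fourier transform of a 1D sequence (illustrative helper)."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def idft(spectrum):
    """Naive inverse discrete Fourier transform (illustrative helper)."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]

# Hypothetical 1D "object" and a circularly shifted copy of it.
obj = [0.0, 1.0, 3.0, 2.0, 0.0, 0.0, 0.0, 0.0]
shifted = obj[-2:] + obj[:-2]

spec = dft(obj)
spec_shifted = dft(shifted)

# A shift changes only the phases, so the measurable magnitudes are identical.
magnitudes_equal = all(
    abs(abs(a) - abs(b)) < 1e-9 for a, b in zip(spec, spec_shifted)
)

# With magnitude and phase, the inverse transform recovers the object.
recovered = [round(v.real, 6) for v in idft(spec)]

# With magnitudes only (all phases set to zero), it does not.
magnitude_only = [round(v.real, 6) for v in idft([abs(c) for c in spec])]

print(magnitudes_equal)
print(recovered == obj)
print(magnitude_only == obj)
```

Phase retrieval methods therefore need additional constraints or prior knowledge about the object to pick one solution among the many that are consistent with the measured magnitudes.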
Dynamic proton migration along a protein can induce conformational structural changes that are able to promote a folding/unfolding process. These migration processes have been investigated by challenging near-edge X-ray absorption mass spectrometry (NEXAMS) experiments and computationally expensive calculations at high ab initio theory levels. Therefore, to obtain a solid understanding of...
Since latent disease heterogeneity complicates discovery of biomarkers and elucidation of disease mechanisms, unsupervised stratification based on omics data is an extremely important problem in biomedicine. This problem is traditionally approached by clustering methods which may not be efficient for high-dimensional datasets with multiple overlapping patterns of various sizes. A promising...
The phenomenal growth of computing capabilities has accelerated the ability to combine chemistry, physics and Machine Learning (ML), as a true symbiosis, so as to precisely model and understand complex biomolecular processes at the atomistic scale. However, complexities of proteins and high computational costs of quantum mechanics methods for large systems impose a great challenge in...
Almost all areas in the physical or engineering sciences rely on computational models to some extent. The models can be based on fundamental physics processes (physics-based), which typically leads to a set of differential equations. Alternatively, machine learning techniques can be used to infer input-output relations from very large sets of data. Both approaches come with different...
In high energy physics, detailed and time-consuming simulations are used for particle interactions with detectors. For the upcoming High-Luminosity phase of the Large Hadron Collider (HL-LHC), the computational costs of conventional simulation tools exceed the projected computational resources. Generative machine learning is expected to provide a fast and accurate alternative. The CMS...
Plasma accelerators enable the acceleration of charged particles over short distances due to their multi-GeV/m field gradients, making them a compact alternative to conventional technologies. Despite great progress in beam energy and quality over the last decade, significant improvements in beam quality and stability are still required to fill the gap between promising concepts and...
Current noisy intermediate-scale quantum devices suffer from various sources of intrinsic quantum noise. Overcoming the effects of noise is a major challenge, for which different error mitigation and error correction techniques have been proposed.
In this paper, we conduct a first study of the performance of quantum Generative Adversarial Networks (qGANs) in the presence of different types of...
Cryo-EM is a popular technique for understanding the structure of biological molecules. At intermediate resolutions (worse than ~4.5 Å), building and assessing the quality of atomic models derived from cryo-EM data is particularly difficult. At this resolution range, existing X-ray models or models derived from machine-learning based structure prediction approaches such as AlphaFold2 offer...
For optimal operation, accelerators and FELs require precise control of their parameters. Lasers are of critical importance here, serving as photocathode, FEL seeding, and probe lasers. We will show our current results and plans to optimize the performance (pulse parameters, fast set-point tuning, stability) of our photocathode and pump-probe lasers using AI methods.
In an accelerator, the...
Cyber-Physical Systems (CPS) consist of embedded digital devices that interact with their physical environment. Typical examples range from simple heating systems and robotic subsystems to highly complex control systems, e.g., industrial production systems or particle accelerators and their subsystems. Understanding and modeling these systems is difficult because they consist of multiple,...
The Hamburg Leibniz ScienceCampus "Integrative Analysis of pathogen-induced Compartments" InterACt has set itself the goal of better understanding the role of compartments in the course of infection.
InterACt investigates the interaction between pathogens such as viruses, bacteria, and parasites and the affected host. During the cellular infection cycle, pathogens use the existing reaction...
In high-energy particle physics, complex Monte Carlo simulations are needed to connect the theory to measurable quantities. Often, the significant computational cost of these programs becomes a bottleneck in physics analyses.
In this contribution, we evaluate an approach based on a Deep Neural Network to reweight simulations to different models or model parameters, using the full kinematic...
Single particle cryo-electron microscopy (cryo-EM) is an increasingly important method for determining the three-dimensional structure of proteins. As a single particle technique, it allows for the elucidation of large macromolecular complexes, provides information on protein dynamics and gives access to proteins that are difficult to crystallize.
For this purpose, molecules in aqueous...
Finding new indications for approved drugs is a promising alternative to de novo drug development, an often lengthy and costly process. Systems medicine has brought forth several different approaches to tackle this important task. We recently published NeDRex, a network medicine tool for the identification of disease modules and drug repurposing. NeDRex-Web (https://web.nedrex.net) brings...
DASHH is an interdisciplinary graduate school that offers challenging PhD topics at the interface of the natural sciences, applied mathematics, and computer science. Here, highly talented graduates can do innovative data science research, acquiring and deepening unique insights with our partner institutes, the Deutsches Elektronen-Synchrotron, Universität Hamburg, Hamburg University of...
The simulation of particle showers in calorimeters is a computationally demanding process. Deep generative models have been suggested to replace these computations. One of the complexities of this approach is the dimensionality of the data produced by high granularity calorimeters. One possible solution could be progressively growing the GAN to handle this dimensionality. In this study,...
The ProteinsPlus web server (https://proteins.plus)[1] offers modelling support for numerous challenges concerning the in-depth investigation of biomolecules. Its unique tools provide easy access to various structure-based analyses for interdisciplinary researchers through an intuitive user interface. Users can perform numerous computational studies for more than 174,000 three-dimensional...
In High Energy Physics, the interaction of particles with matter in
the detectors is best simulated with the GEANT4 software. Alternatively,
less precise but faster simulations are sometimes preferred to
reach higher statistical precision. We present recent progress of refinement
of fast simulations with ML techniques to enhance the quality of
such fast simulations. We demonstrate the...
Some machine learning algorithms use statistical gradient-based learning methods in a data-driven way to solve problems. These methods find correlations in the presented datasets and can thus also address problems that are difficult to solve with classical algorithms. Lately, so-called artificial neural networks (ANNs) have become one of the most important and indispensable machine learning tools in...
The state space of a quantum-mechanical system grows exponentially in the number of its classical degrees of freedom. Thus, efficient approximations are crucial for extracting physical information from this vast space. In the variational approach, computations are performed on trial states determined by a tractable number of parameters. Recently, the so-called neural quantum states (NQS) have...
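The scaling argument can be made concrete with a short sketch. The choice of a spin-1/2 chain and of a restricted-Boltzmann-machine-style ansatz with a fixed hidden-to-visible ratio are assumptions made here for illustration; the abstract does not state which system or network architecture is used. The function names `hilbert_dim` and `rbm_param_count` are hypothetical.

```python
def hilbert_dim(n_spins):
    """Number of basis states for n_spins two-level (spin-1/2) systems."""
    return 2 ** n_spins

def rbm_param_count(n_spins, hidden_ratio=2):
    """Parameter count of an assumed RBM-style trial state:
    visible biases + hidden biases + weight matrix."""
    n_hidden = hidden_ratio * n_spins
    return n_spins + n_hidden + n_spins * n_hidden

# Compare the exponential state space with the polynomial parameter count.
for n in (10, 20, 40):
    print(n, hilbert_dim(n), rbm_param_count(n))
```

Under these assumptions the number of basis states doubles with every added spin, while the number of variational parameters grows only quadratically.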
Capillarity-driven flows in pores a few nanometers in diameter play an important role in many natural and technological processes, for example in clay swelling, frost heave, catalysis and transport across artificial nanostructures, bio-membranes and tissues [1]. Here we present molecular dynamics simulations modelling the capillary flow of water into silica nano-pores (MCM-41) of around 3 nm...
Weakly-bound complexes are very appealing for experimental investigations of resonances in dissociation dynamics, which is of vital importance to roaming reactions. Planning and elucidating experiments requires accurate quantum mechanical calculations of (ro-)vibrational energies up to dissociation, which is a challenging task for these systems because of their flexible degrees of freedom and...
Introduction: Alternative splicing (AS) drives protein and transcript diversity and is known to play a role in many diseases. The exact mechanisms controlling the AS machinery are currently insufficiently understood. During disease progression or organism development, AS may lead to isoform switches (IS) that follow temporal patterns. Several IS genes occurring at the same time point could...
Voting, or more generally making decisions in groups, is seen as a common procedure in our culture. We vote for our representatives, we find available time slots for a group meeting, we answer a survey on our favourite films, or we vote for the best poster. In this poster, we discuss different voting algorithms, their properties, important paradoxes, and concrete implementations. In particular,...
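As a minimal sketch of how the choice of algorithm can change the outcome, the following compares two textbook rules, plurality and Borda count, on the same ballots. The rules, the function names, and the ballot profile are chosen here for illustration and are not necessarily those discussed in the poster.

```python
from collections import Counter

def plurality(ballots):
    """Winner is the candidate ranked first on the most ballots."""
    counts = Counter(ballot[0] for ballot in ballots)
    # sorted() makes tie-breaking deterministic (alphabetical order).
    return max(sorted(counts), key=counts.get)

def borda(ballots):
    """Each ballot gives m-1 points to its first choice, m-2 to its second, and so on."""
    scores = Counter()
    for ballot in ballots:
        for position, candidate in enumerate(ballot):
            scores[candidate] += len(ballot) - 1 - position
    return max(sorted(scores), key=scores.get)

# Hypothetical profile: each ballot ranks the candidates from most to least preferred.
ballots = (
    [("A", "B", "C")] * 3
    + [("B", "C", "A")] * 2
    + [("C", "B", "A")] * 2
)

print(plurality(ballots))  # A has the most first-place votes (3 of 7)
print(borda(ballots))      # B has the highest Borda score (9 vs. 6 and 6)
```

On this profile the two rules disagree: A wins under plurality although a majority of voters rank A last, while B, who is ranked first or second on every ballot, wins under Borda count.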
Outrunning radiation damage, femtosecond pulses of X-ray free-electron lasers (XFELs) open up the possibility of imaging the structure and dynamics of uncrystallized single macromolecules, frozen in time at room temperature, at ultrafast timescales. Imaging light-induced ultrafast dynamics in single macromolecules in real time is one of the key applications of XFELs. However, photoactive...