2nd Pan-European Advanced School on Statistics in High Energy Physics

Europe/Berlin
SR 4a/b (DESY Hamburg)

Description

We are looking forward to welcoming you to the second Pan-European Advanced School on Statistics in High Energy Physics. The virtual school is open to Master's and PhD students as well as to postdocs. Participants should already have a good knowledge of statistical methods in data analysis. The school focuses on two topics:

  1. Modeling of Data
  2. Data Combination

The compact school (3 days of about 4 hours each) comprises lectures by renowned statisticians and particle physicists, followed by ample time for discussion.
 

The biennial series of Pan-European Advanced Schools on Statistics in High Energy Physics was founded in 2019 by the INSIGHTS Marie Sklodowska-Curie ITN and DESY. More information on INSIGHTS can be found here.

    • 14:00 - 18:30
      Modeling of data 1
      • 14:00
        Introduction 10m
        Speaker: Olaf Behnke (CMS (CMS Fachgruppe TOP))
      • 14:10
        Goodness-Of-Fit tests (Part 1) 45m

        A goodness-of-fit test is concerned with the question of whether a
        given data set was generated by a specific probability distribution
        such as an exponential. In this seminar we will discuss a variety
        of such tests. We will consider their relative merits, and how to
        run several of them simultaneously. We will also discuss tests for
        multivariate data and a number of special cases such as binned
        data and comparing Monte Carlo to data.

        Speaker: Wolfgang Rolke (UPR)
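
As a rough illustration of the kind of test described in this abstract (not taken from the lecture material), the sketch below runs a Kolmogorov-Smirnov test of a sample against a fully specified exponential model. The data values, the function names and the choice of the Kolmogorov-Smirnov test are assumptions made for this example. The asymptotic p-value used here is only valid when the model parameters are fixed in advance rather than fitted to the same data.

```python
import math


def exponential_cdf(rate):
    """Return the CDF of an exponential distribution with the given rate."""
    def cdf(x):
        return 1.0 - math.exp(-rate * x) if x > 0 else 0.0
    return cdf


def ks_statistic(data, cdf):
    """Largest distance between the empirical CDF of the data and the model CDF."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF steps from i/n to (i+1)/n at x, so check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def ks_pvalue(d, n, terms=100):
    """Asymptotic (large-n) Kolmogorov p-value for a fully specified model."""
    lam = math.sqrt(n) * d
    if lam < 0.2:
        # Here the p-value equals 1 to high precision and the series converges slowly.
        return 1.0
    series = sum((-1) ** (k - 1) * math.exp(-2.0 * (k * lam) ** 2)
                 for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * series))


# Hypothetical data: 20 values placed at the quantiles of a unit-rate exponential.
n = 20
data = [-math.log(1.0 - (i - 0.5) / n) for i in range(1, n + 1)]

for rate in (1.0, 5.0):
    d = ks_statistic(data, exponential_cdf(rate))
    print(f"rate={rate}: D={d:.3f}, p={ks_pvalue(d, n):.3g}")
```

The sample is compatible with the unit-rate model and clearly incompatible with a rate of 5, which is what the two p-values reflect.
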
      • 14:55
        Discussion time 10m
      • 15:05
        Virtual coffee break 10m
      • 15:15
        Goodness-Of-Fit tests (Part 2) 45m
        Speaker: Wolfgang Rolke (UPR - Mayaguez)
      • 16:00
        Discussion time 15m
      • 16:15
        Virtual coffee break 30m
      • 16:45
        Statistical Modeling 1h

        In this lecture I will give an overview of various aspects of
        statistical modeling. I'll start by reviewing parametric models.
        Then I will discuss methods for dealing with model
        misspecification. Next I will discuss nonparametric methods and
        then move on to universal inference, which is a new method for
        handling irregular models. I'll finish with a few quick remarks on
        semiparametric inference.

        Speaker: Larry Wasserman (CMU)
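
For readers unfamiliar with universal inference, the following is a minimal sketch of the split likelihood-ratio test, assuming that this is the construction meant in the abstract. The data are split in two halves, the parameter is estimated on one half, and the likelihood ratio is evaluated on the other half; the null hypothesis is rejected when the ratio exceeds 1/alpha. The Gaussian model with unit width, the data values and the function names are assumptions chosen to keep the example short.

```python
import math


def gauss_loglik(data, mu, sigma=1.0):
    """Log-likelihood of independent Gaussian observations with known sigma."""
    norm = math.log(sigma * math.sqrt(2.0 * math.pi))
    return sum(-0.5 * ((x - mu) / sigma) ** 2 - norm for x in data)


def split_lrt_rejects(data, mu0, alpha=0.05):
    """Split likelihood-ratio test of H0: mu = mu0.

    The split is done by index, which assumes the ordering of the data
    carries no information; in practice the split should be random.
    """
    half = len(data) // 2
    d0, d1 = data[:half], data[half:]
    mu_hat1 = sum(d1) / len(d1)  # maximum-likelihood estimate from the second half
    # Log of the ratio L0(mu_hat1) / L0(mu0), evaluated on the first half only.
    log_t = gauss_loglik(d0, mu_hat1) - gauss_loglik(d0, mu0)
    return log_t > math.log(1.0 / alpha)


# Hypothetical observations scattered around zero.
data = [0.1, -0.2, 0.3, 0.0, -0.1, 0.2, -0.3, 0.1]
print(split_lrt_rejects(data, mu0=0.0))
print(split_lrt_rejects(data, mu0=3.0))
```

The threshold 1/alpha follows from Markov's inequality and does not rely on the usual asymptotic regularity conditions, at the price of some loss of power.
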
      • 17:45
        Discussion time 15m
    • 14:00 - 20:15
      Data combination
      • 14:00
        Data combination - introduction 45m

        The lecture will address the topic of combining information
        from different sources in an analysis of Particle Physics data.
        The general formalism by which this is done in both the
        Bayesian and Frequentist approaches will first be reviewed.
        Combination of results relies fundamentally on constructing
        a likelihood that reflects all of the available data. Often this
        requires some approximations and assumptions as the
        detailed information needed to write down the full likelihood
        may not be available. An important aspect of combined (and
        individual) data analyses is the assignment of uncertainties to
        estimates of nuisance parameters. A method will be described by
        which uncertainties on the assigned uncertainties themselves can
        be incorporated, and the impact of this type of model on
        combinations will be shown.

        Links to the Python code used for the example of the a+bx fit:
        https://www.pp.rhul.ac.uk/~cowan/stat/fitCombo.py
        https://www.pp.rhul.ac.uk/~cowan/stat/fitCombo.ipynb

        Speaker: Glen Cowan (RHUL)
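
The statement that a combination rests on a likelihood reflecting all of the available data can be illustrated with a small sketch. It is not the fitCombo code linked above. It assumes two independent counting experiments that share a single signal-strength parameter mu, with hypothetical observed counts, signal yields and backgrounds; the combined negative log-likelihood is the sum of the individual ones and is minimised with a simple golden-section search.

```python
import math


def poisson_nll(n, expected):
    """Poisson negative log-likelihood, dropping the term that does not depend on mu."""
    return expected - n * math.log(expected)


def combined_nll(mu, channels):
    """Sum of the per-channel terms; each channel is (n_observed, signal, background)."""
    return sum(poisson_nll(n, mu * s + b) for n, s, b in channels)


def minimise(f, lo, hi, tol=1e-7):
    """Golden-section search for the minimum of a unimodal function on [lo, hi]."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - g * (b - a)
        d = a + g * (b - a)
        if f(c) < f(d):
            b = d  # the minimum lies in [a, d]
        else:
            a = c  # the minimum lies in [c, b]
    return 0.5 * (a + b)


def fit_mu(channels, lo=0.0, hi=10.0):
    """Maximum-likelihood estimate of mu from one or more channels."""
    return minimise(lambda mu: combined_nll(mu, channels), lo, hi)


# Hypothetical inputs: (observed count, expected signal at mu=1, expected background).
channel_a = (20, 5.0, 10.0)
channel_b = (20, 12.0, 20.0)

print("channel A alone:", round(fit_mu([channel_a]), 3))
print("combined:       ", round(fit_mu([channel_a, channel_b]), 3))
```

Channel A alone prefers mu = 2 and channel B alone prefers mu = 0, so the combined estimate lands between the two, pulled towards the more sensitive channel.
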
      • 14:45
        Discussion time 15m
      • 15:00
        Virtual coffee break 30m
      • 15:30
        Data combination - in practice 45m

        The lecture will address practical aspects and possible pitfalls
        when combining particle physics measurements or limits and will
        give pointers to methods and tools that can be used for that
        purpose. One particular focus will be the combination of
        single-valued and multiple-valued (differential) measurements with
        complex correlations between their nuisance parameters. The
        lecture will be accompanied by hands-on examples. A CERN
        account with the possibility to log in to lxplus is recommended, but
        all examples can also be followed without running the software.

        To use the Convino code on lxplus.cern.ch, run the following after logging in:

        bash
        cd /afs/cern.ch/user/j/jkiesele/public/Convino/latest
        source lxplus_env.sh
        cd
        mkdir convino_tutorial
        cd convino_tutorial
        convino /afs/cern.ch/user/j/jkiesele/public/Convino/latest/examples/exampleconfig.txt
        cp -r /afs/cern.ch/user/j/jkiesele/public/Convino/tutorial/* .

        Speaker: Jan Kieseler (CERN)
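
One pitfall of combining correlated measurements can be shown without any dedicated tool. The sketch below uses the best linear unbiased estimate (BLUE) for two measurements of the same quantity; this is a swapped-in textbook method and not the approach implemented in Convino. The input values, uncertainties and correlation coefficients are hypothetical.

```python
import math


def blue_two(x1, s1, x2, s2, rho):
    """BLUE combination of two measurements x1 +- s1 and x2 +- s2 with correlation rho."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    cov = rho * s1 * s2
    denom = s1 ** 2 + s2 ** 2 - 2.0 * cov
    w1 = (s2 ** 2 - cov) / denom  # weight of the first measurement
    w2 = 1.0 - w1                 # weights sum to one, so the estimate is unbiased
    value = w1 * x1 + w2 * x2
    variance = s1 ** 2 * s2 ** 2 * (1.0 - rho ** 2) / denom
    return value, math.sqrt(variance), (w1, w2)


# Hypothetical measurements: 10 +- 1 and 12 +- 2, for three assumed correlations.
for rho in (0.0, 0.5, 0.9):
    value, sigma, weights = blue_two(10.0, 1.0, 12.0, 2.0, rho)
    print(f"rho={rho}: {value:.3f} +- {sigma:.3f}, weights={weights[0]:.3f}, {weights[1]:.3f}")
```

Once rho exceeds the ratio of the smaller to the larger uncertainty (0.5 in this example), the less precise measurement receives a negative weight and the combined value falls outside the range spanned by the two inputs. Results of this kind are very sensitive to the assumed correlation.
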
      • 16:15
        Discussion time 15m
      • 16:30
        Virtual coffee break 30m
      • 17:00
        What to publish 45m

        The statistical model is a unique summary of a physics
        analysis from which many of the key results can be derived.
        Preserving it allows not only the reproduction of key
        results but also its reuse in statistical combinations and
        reinterpretations. In this talk we will cover how to best
        publish statistical model data to enable these use-cases.

        A different way of making measurements available to physicists
        outside of the experimental collaboration is to reverse
        the smearing effects of the detector and reconstruction.
        Various unfolding methods make it possible to publish data
        in a way that is independent of a specific experimental setup.

        Speakers: Carsten Burgard (ATLAS (ATLAS Dark Matter with Higgs)), Lukas Heinrich (CERN)
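
To make the idea of reversing detector smearing concrete, here is a deliberately small sketch of unfolding by inverting a response matrix. It assumes two bins, a hypothetical response matrix and hypothetical true yields, and it is only the simplest member of the family of unfolding methods; with realistic binning, plain inversion amplifies statistical fluctuations, which is why regularised methods are commonly used instead.

```python
def fold(response, truth):
    """Apply the detector response: response[i][j] = P(reconstructed bin i | true bin j)."""
    return [sum(r * t for r, t in zip(row, truth)) for row in response]


def unfold_2x2(response, observed):
    """Recover the true yields by inverting a 2x2 response matrix."""
    (a, b), (c, d) = response
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("response matrix is singular")
    o0, o1 = observed
    return [(d * o0 - b * o1) / det, (-c * o0 + a * o1) / det]


# Hypothetical response matrix (columns sum to one) and true yields.
response = [[0.8, 0.3],
            [0.2, 0.7]]
truth = [100.0, 50.0]

observed = fold(response, truth)
print("observed:", observed)
print("unfolded:", unfold_2x2(response, observed))
```
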
      • 17:45
        Discussion time 15m
    • 14:00 - 20:25
      Modeling of data 2
      • 14:00
        Introduction to Optimal Transport 45m

        Optimal transport (OT) is a method for mapping one probability
        distribution into another. OT also leads to a method for defining a
        geodesic between distributions which allows us to morph one
        distribution into another. I will introduce the basics of optimal
        transport and I will explain how optimal transport maps and
        morphings can be estimated from data.

        Speaker: Larry Wasserman (CMU)
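
In one dimension the optimal transport map between two samples of equal size can be estimated by sorting, which makes for a compact illustration of the map, the distance and the morphing mentioned in the abstract. The sketch assumes a quadratic cost, equal sample sizes and hypothetical data values; the function names are made up for this example.

```python
import math


def ot_pairs_1d(xs, ys):
    """Pair the i-th smallest x with the i-th smallest y (optimal in 1D for convex costs)."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes samples of equal size")
    return list(zip(sorted(xs), sorted(ys)))


def wasserstein2(xs, ys):
    """2-Wasserstein distance between the two empirical distributions."""
    pairs = ot_pairs_1d(xs, ys)
    return math.sqrt(sum((x - y) ** 2 for x, y in pairs) / len(pairs))


def morph_samples(xs, ys, t):
    """Point at fraction t along the geodesic from the x sample (t=0) to the y sample (t=1)."""
    return [(1.0 - t) * x + t * y for x, y in ot_pairs_1d(xs, ys)]


# Hypothetical samples.
xs = [0.0, 1.0, 2.0]
ys = [12.0, 10.0, 11.0]

print(wasserstein2(xs, ys))        # 10.0
print(morph_samples(xs, ys, 0.5))  # [5.0, 6.0, 7.0]
```

The intermediate sample moves each point along a straight line to its partner, so the shape of the distribution is transported rather than averaged bin by bin.
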
      • 14:45
        Discussion time 15m
      • 15:00
        Virtual coffee break 30m
      • 15:30
        Gaussian Processes 45m

        In this lecture, I will provide an introduction to Gaussian
        processes (GPs), with a view toward applications in high-energy
        physics. I will start with the basic definition of a GP and explain
        how to perform inference with these models. I will then describe
        the choice and estimation of the mean and the covariance
        function and demonstrate these ideas with simple examples.
        I will close with a brief overview of applications of GPs in
        high-energy physics.

        Speaker: Mikael Kuusela (Carnegie Mellon University)
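
The inference step for a Gaussian process can be sketched in a few lines. The example assumes a zero mean function, a squared-exponential covariance with unit amplitude and length scale, Gaussian noise of known size, and hypothetical training points; none of these choices come from the lecture. The small linear solver is included only to keep the example free of external dependencies.

```python
import math


def rbf(x1, x2, amplitude=1.0, length=1.0):
    """Squared-exponential covariance function."""
    return amplitude ** 2 * math.exp(-0.5 * ((x1 - x2) / length) ** 2)


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def gp_predict(x_train, y_train, x_star, noise=0.1):
    """Posterior mean and variance of the latent function at x_star."""
    n = len(x_train)
    # Covariance of the noisy observations: K + noise^2 * I.
    k = [[rbf(x_train[i], x_train[j]) + (noise ** 2 if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    k_star = [rbf(x_star, xi) for xi in x_train]
    alpha = solve(k, y_train)
    mean = sum(ks * a for ks, a in zip(k_star, alpha))
    v = solve(k, k_star)
    variance = rbf(x_star, x_star) - sum(ks * vi for ks, vi in zip(k_star, v))
    return mean, variance


# Hypothetical training data.
x_train = [0.0, 1.0, 2.0]
y_train = [0.0, 1.0, 0.0]

for x_star in (1.0, 1.5, 10.0):
    mean, variance = gp_predict(x_train, y_train, x_star, noise=0.01)
    print(f"x={x_star}: mean={mean:.3f}, variance={variance:.3g}")
```

Close to the training points the posterior follows the data with a small variance, while far away it reverts to the prior mean of zero with the prior variance.
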
      • 16:15
        Discussion time 15m
      • 16:30
        Virtual coffee break 30m
      • 17:00
        EFT Lagrangian Morphing 45m

        In this lecture I will discuss a method of morphing distributions
        that is useful for measuring the parameters of an Effective Field
        Theory (EFT). I will introduce EFT, which is a powerful
        theoretical framework that is used to systematically extend
        known physics Lagrangians. I will then talk about the idea
        behind morphing between distributions given the predictions
        at some points in the parameter space, which makes it possible
        to obtain a continuous prediction in terms of the EFT parameters.
        Finally, I will show a couple of examples of the implementation
        of this technique as the RooLagrangianMorphFunc class
        within the RooFit toolkit, which is available with the ROOT software.

        Speaker: Rahul Balasubramanian (Nikhef and University of Amsterdam)
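
The morphing idea can be illustrated for the simplest case of a single EFT coupling c, assuming that the predicted yield in every bin is a quadratic polynomial in c (a constant term, an interference term linear in c, and a term quadratic in c). Under that assumption, templates generated at three distinct values of c determine the prediction everywhere. The sketch below is not the RooLagrangianMorphFunc interface; the function names, coupling values and template contents are hypothetical.

```python
def morphing_weights(c, sample_points):
    """Weights w_i(c) such that sum_i w_i(c) * T(c_i) reproduces any quadratic T(c).

    For one coupling and three sample points these are the Lagrange
    interpolation weights.
    """
    weights = []
    for i, ci in enumerate(sample_points):
        w = 1.0
        for j, cj in enumerate(sample_points):
            if j != i:
                w *= (c - cj) / (ci - cj)
        weights.append(w)
    return weights


def morph_prediction(c, sample_points, templates):
    """Bin-by-bin prediction at coupling value c from the input templates."""
    weights = morphing_weights(c, sample_points)
    n_bins = len(templates[0])
    return [sum(w * t[b] for w, t in zip(weights, templates)) for b in range(n_bins)]


# Hypothetical two-bin templates generated at c = 0, 1 and -1.
sample_points = [0.0, 1.0, -1.0]
templates = [
    [10.0, 5.0],   # c = 0
    [15.0, 4.5],   # c = 1
    [11.0, 6.5],   # c = -1
]

print(morphing_weights(2.0, sample_points))             # [-3.0, 3.0, 1.0]
print(morph_prediction(2.0, sample_points, templates))  # [26.0, 5.0]
```

The templates above were built from the yields 10 + 2c + 3c^2 and 5 - c + 0.5c^2, and the morphed prediction at c = 2 reproduces both. With several couplings, or couplings entering both production and decay, more terms appear and correspondingly more input templates are needed.
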
      • 17:45
        Discussion time 15m
      • 18:00
        Closing of School 10m
        Speaker: Olaf Behnke (CMS (CMS Fachgruppe TOP))