interface types options

1. Use case

The use case that inspired this proposal is that of TrackerHits, where different detector concepts will potentially provide different tracker hit types / measurement encapsulations, e.g. a 3D space point reconstructed from a TPC measurement, or a 2D measurement from a silicon detector. A Track has OneToManyRelations to several tracker hits, which do not necessarily have the same type. In podio generated EDMs the only way of doing this currently is to add a OneToManyRelation for every existing tracker hit type. This is not really a satisfactory solution because:

  • Adding a new tracker hit type also requires modifying the Track definition.
  • It is not easily possible to iterate over all tracker hits of a track in an ordered fashion, e.g. from the IP outwards. Keeping track of the order across the different relations requires additional work.
  • In many(?) cases users are not so much interested in the exact tracker hit type they are currently dealing with, but rather in some common information available from all of them, e.g. the 3D position to calculate the track length from all tracker hits.

1.1. Solution in LCIO

In LCIO this is solved by having a hierarchy of types, where TrackerHit is a simple abstract interface that is implemented by several other types, see: https://ilcsoft.desy.de/LCIO/current/doc/doxygen_api/html/classEVENT_1_1TrackerHit.html

2. Proposed functionality

Introduce a new category of types, interfaces, into the podio grammar. They serve as generic handles that can store one of several different concrete datatypes under the hood. They provide the common (interface) functionality by simply forwarding to the actually contained value. They behave very similarly to the immutable default handles, and currently no support for mutable interface types is foreseen. The reason for this is that the objects that go into the interface types are in any case created and filled via their concrete types.

The basic features that interface types will / will not support:

  • They are usable in relations in generated EDMs just like any concrete datatype. This comprises the behavior in the yaml definition as well as in the generated code.
  • They “feel” like concrete types in their usage, i.e. they are handles that are easy to copy and that offer the same “value semantics” as the rest of the podio generated datatypes.
  • No interface type collections are foreseen. I.e. all the values in interface types have to be stored in a collection of the concrete types involved, and there is no heterogeneous collection I/O foreseen. Again, the argument here is similar to the one for why there are only immutable interface type handles.
  • For I/O handling purposes it is necessary to define a list of datatypes that can be persisted in an interface relation. This is necessary because, when reading, we first need to instantiate the concrete type value and then fill the interface type with it.

2.1. Proposed grammar

In the YAML EDM definition interfaces are defined as a third category, next to the two existing ones: components and datatypes. The grammar looks like this:

interfaces:
  edm4hep::TrackerHitInterface:
    Description: "A generic tracker hit"
    Author: "EDM4hep authors"
    Members:
      - float eDep [GeV] // the deposited energy
      - edm4hep::Vector3d position [mm] // the position
    Types:
      - edm4hep::TrackerHitPlane
      - edm4hep::TrackerHit

The Members define the interface (i.e. the available getters) of the interface types. All the types that should be covered by one interface type need to also have these getter methods defined, either through code generation or through ExtraCode (if the implementation supports it).

The Types are all the types that have I/O enabled for this interface. Depending on the implementation, other types that provide a sufficient interface might also work in memory.

2.2. Proposed features / interface

  • The interface types offer equality and inequality operators for comparison with other interface type values and also with the contained handle classes, such that it is possible to do something like:
TrackerHitPlane thp{};
TrackerHitInterface hit = thp;
REQUIRE(hit == thp);
  • The interface types also offer functionality to check the type of the currently held value and to get the contained handle back:
TrackerHitPlane thp{};
TrackerHitInterface hit = thp;

REQUIRE(hit.holds<TrackerHitPlane>()); // <---- naming up for debate
REQUIRE_FALSE(hit.holds<TrackerHit>());

auto thp2 = hit.getValue<TrackerHitPlane>();
auto th = hit.getValue<TrackerHit>(); // <---- throws an exception

3. Implementation options

Given that we want “value semantics” (like) behavior, a purely inheritance-based solution is immediately ruled out. The two main options are based either on std::variant or on type erasure. Depending on some details, the achievable behavior can be made pretty much identical for both.

3.1. Variant based solution basics

For the std::variant based solution the basic idea is to introduce a templated InterfaceWrapper class that contains the std::variant. It provides the necessary functionality of handle classes as well as other basic functionality that is more or less independent of the contained types. The actual interface types will simply inherit from a concrete instantiation of it and will define the additional functionality, mainly the getter functions from the Members list.

The basic definition is like this

template<typename... HandleT>
class InterfaceWrapper {
public:
  // common public functionality
protected:
  std::variant<...> m_value;
};

where the exact definition of the m_value variant has two main options, either effectively containing the Obj*, or directly containing the different Handle types.

The TrackerHitInterface will then look something like (generated code)

class TrackerHitInterface : public InterfaceWrapper<edm4hep::TrackerHit,
                                                    edm4hep::TrackerHitPlane> {
  // additional functionality not covered by the basics
};
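
As an illustration of how the features from section 2.2 could sit on top of such a wrapper, here is a minimal sketch for the case where the variant holds the handles directly. The TrackerHit and TrackerHitPlane structs are hypothetical stand-ins for generated handles (an int id plays the role of the underlying Obj), and the member names holds and getValue simply follow the naming used above, which is still up for debate. In this sketch getValue relies on std::get, so the exception thrown for a wrong type is std::bad_variant_access; the actual exception type is not fixed by this proposal.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>
#include <variant>

// Hypothetical stand-ins for generated immutable handles (assumption).
struct TrackerHit {
  int id{0};
  bool operator==(const TrackerHit& o) const { return id == o.id; }
};
struct TrackerHitPlane {
  int id{0};
  bool operator==(const TrackerHitPlane& o) const { return id == o.id; }
};

template <typename... HandleT>
class InterfaceWrapper {
public:
  // Implicit construction, but only from one of the listed handle types
  template <typename T,
            typename = std::enable_if_t<(std::is_same_v<T, HandleT> || ...)>>
  InterfaceWrapper(T value) : m_value(std::move(value)) {}

  // Check which concrete handle type is currently stored
  template <typename T>
  bool holds() const { return std::holds_alternative<T>(m_value); }

  // Get the stored handle back; std::get throws std::bad_variant_access
  // if T is not the currently stored type
  template <typename T>
  T getValue() const { return std::get<T>(m_value); }

  // Comparisons with handles work via the implicit constructor above
  friend bool operator==(const InterfaceWrapper& lhs, const InterfaceWrapper& rhs) {
    return lhs.m_value == rhs.m_value;
  }
  friend bool operator!=(const InterfaceWrapper& lhs, const InterfaceWrapper& rhs) {
    return !(lhs == rhs);
  }

protected:
  std::variant<HandleT...> m_value;
};

class TrackerHitInterface : public InterfaceWrapper<TrackerHit, TrackerHitPlane> {
public:
  using InterfaceWrapper::InterfaceWrapper; // inherit the converting constructor
};

// Small usage example mirroring the snippets in section 2.2
inline void interfaceWrapperExample() {
  TrackerHitPlane thp{7};
  TrackerHitInterface hit = thp;
  assert(hit == thp);
  assert(hit.holds<TrackerHitPlane>());
}
```

Two interface values compare equal here only if they hold the same handle type and the handles themselves compare equal, which is the behavior std::variant provides for operator==.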

3.1.1. Variant of Obj*

More precisely

std::variant<MaybeSharedPtr<ObjT>...>;

where ObjT is obtained from the variadic template argument list of the InterfaceWrapper via some template meta programming.

  1. Pros
    • Quite straightforward to generalize to also have mutable interface type handles
    • Some common implementation is in plain C++ and not in Jinja templates
  2. Cons
    • Can only work on the data that is available from the Obj, i.e. mainly all Data members. Not possible to accommodate types that provide some functionality through ExtraCode
    • A bit more involved in the necessary template machinery to get to the Obj types for the different handles
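
The template machinery mentioned above could look roughly like the following sketch. It assumes that every handle exposes its Obj type through a nested alias, here called obj_type, and an accessor obj(); both names are hypothetical and not existing podio API. std::shared_ptr is used as a stand-in for MaybeSharedPtr to keep the example self-contained.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>
#include <variant>

// Hypothetical Obj types, holding only the data members (assumption)
struct TrackerHitObj { float eDep{0.f}; };
struct TrackerHitPlaneObj { float eDep{0.f}; };

// Hypothetical handles that advertise their Obj type via a nested alias
class TrackerHit {
public:
  using obj_type = TrackerHitObj;
  explicit TrackerHit(std::shared_ptr<obj_type> obj) : m_obj(std::move(obj)) {}
  const std::shared_ptr<obj_type>& obj() const { return m_obj; }
private:
  std::shared_ptr<obj_type> m_obj;
};
class TrackerHitPlane {
public:
  using obj_type = TrackerHitPlaneObj;
  explicit TrackerHitPlane(std::shared_ptr<obj_type> obj) : m_obj(std::move(obj)) {}
  const std::shared_ptr<obj_type>& obj() const { return m_obj; }
private:
  std::shared_ptr<obj_type> m_obj;
};

// Map a handle type to the pointer type stored in the variant.
// std::shared_ptr stands in for MaybeSharedPtr here.
template <typename HandleT>
using ObjPtrT = std::shared_ptr<typename HandleT::obj_type>;

static_assert(std::is_same_v<ObjPtrT<TrackerHit>, std::shared_ptr<TrackerHitObj>>);

template <typename... HandleT>
class InterfaceWrapper {
public:
  template <typename T,
            typename = std::enable_if_t<(std::is_same_v<T, HandleT> || ...)>>
  InterfaceWrapper(const T& handle)
      : m_value(std::in_place_type<ObjPtrT<T>>, handle.obj()) {}

protected:
  // Expands to std::variant<shared_ptr<TrackerHitObj>, shared_ptr<TrackerHitPlaneObj>>
  std::variant<ObjPtrT<HandleT>...> m_value;
};

class TrackerHitInterface : public InterfaceWrapper<TrackerHit, TrackerHitPlane> {
public:
  using InterfaceWrapper::InterfaceWrapper;

  // Getters can only reach what is stored in the Obj, not handle-level ExtraCode
  float getEDep() const {
    return std::visit([](const auto& obj) { return obj->eDep; }, m_value);
  }
};

inline void objVariantExample() {
  auto obj = std::make_shared<TrackerHitObj>();
  obj->eDep = 0.5f;
  TrackerHitInterface hit = TrackerHit(obj);
  assert(hit.getEDep() == 0.5f);
}
```

The getter accesses the Obj member directly, which illustrates the first con listed above: functionality that only exists on the handle is out of reach.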

3.1.2. Variant of immutable handles

More precisely

std::variant<HandleT...>;

where HandleT are simply passed through from the variadic template argument list of the InterfaceWrapper

  1. Pros
    • Supports ExtraCode functionality since the std::visit calls operate on the handles directly
    • Some common implementation is in plain C++ and not in Jinja templates
  2. Cons
    • Always stores a handle inside, which might incur overhead compared to simply having the Obj*
    • Not completely straightforward to generalize to mutable handles
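
A minimal sketch of why the handle-based variant supports ExtraCode: the getter is forwarded via std::visit to whatever getPosition the stored handle offers, regardless of whether that getter is generated or hand-written. The types SpacePointHit and StripHit, the free function trackLength and the way StripHit computes its position are all hypothetical and only chosen to keep the example small.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <variant>
#include <vector>

struct Vector3d { double x{0.0}, y{0.0}, z{0.0}; };

// Hypothetical handle whose getter would come from code generation
class SpacePointHit {
public:
  explicit SpacePointHit(Vector3d pos) : m_pos(pos) {}
  Vector3d getPosition() const { return m_pos; }
private:
  Vector3d m_pos;
};

// Hypothetical handle whose getter would be provided through ExtraCode:
// the position is computed from a sensor origin and a 1D measurement
class StripHit {
public:
  StripHit(Vector3d sensorOrigin, double u) : m_origin(sensorOrigin), m_u(u) {}
  Vector3d getPosition() const { return {m_origin.x + m_u, m_origin.y, m_origin.z}; }
private:
  Vector3d m_origin;
  double m_u;
};

class HitInterface {
public:
  HitInterface(SpacePointHit hit) : m_value(hit) {}
  HitInterface(StripHit hit) : m_value(hit) {}

  // Forward to the stored handle, whichever type it is
  Vector3d getPosition() const {
    return std::visit([](const auto& hit) { return hit.getPosition(); }, m_value);
  }
private:
  std::variant<SpacePointHit, StripHit> m_value;
};

// Sum of straight-line distances between consecutive hits, in storage order
inline double trackLength(const std::vector<HitInterface>& hits) {
  double length = 0.0;
  for (std::size_t i = 1; i < hits.size(); ++i) {
    const auto a = hits[i - 1].getPosition();
    const auto b = hits[i].getPosition();
    length += std::sqrt((b.x - a.x) * (b.x - a.x) + (b.y - a.y) * (b.y - a.y) +
                        (b.z - a.z) * (b.z - a.z));
  }
  return length;
}

inline void handleVariantExample() {
  std::vector<HitInterface> hits{SpacePointHit(Vector3d{0.0, 0.0, 0.0}),
                                 StripHit(Vector3d{0.0, 0.0, 4.0}, 3.0)};
  assert(std::abs(trackLength(hits) - 5.0) < 1e-12);
}
```

This also shows the ordered iteration over mixed hit types from the use case: a single container of interface values replaces one relation per concrete type.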

3.2. Type-erasure based solution basics

The basic idea is to simply have a jinja template from which a type-erased interface implementation is generated. This basically follows the usual type-erasure technique, i.e. using the TrackerHitInterface as example

class TrackerHitInterface {
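  struct TrackerHitConcept {
    // Virtual destructor, since the model is destroyed through a base pointer
    virtual ~TrackerHitConcept() = default;
    virtual float getEDep() const = 0;
    // All the other necessary getters as pure virtual functions
  };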

  template<typename T>
  struct TrackerHitModel final : public TrackerHitConcept {
    TrackerHitModel(T val) : m_value(val) {}

    float getEDep() const final { return m_value.getEDep(); }
    // All the other necessary overrides
  private:
    T m_value{};
  };

  std::unique_ptr<TrackerHitConcept> m_self{nullptr};

public:
  template<typename T>
  TrackerHitInterface(T val) :
    m_self(std::make_unique<TrackerHitModel<T>>(val)) {}

  float getEDep() const { return m_self->getEDep(); }
  // All the other necessary getters
};
  1. Pros
    • Supports ExtraCode functionality since it simply hands through to the contained value
    • Can in principle also store other values in memory if they have the corresponding interface
  2. Cons
    • Always stores a handle inside
    • Can in principle also store other values in memory if they have the corresponding interface. This might lead to unexpected issues for I/O.

    From a technical point of view it is possible to impose the same restrictions on the type-erasure based solution as on the variant based ones, i.e. to accept only the types listed under Types.
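
One way such a restriction could be realized is sketched below: the templated constructor is constrained to an explicit list of types, so that other types with a matching interface are rejected at compile time. The sketch also adds a clone function to the concept, since a std::unique_ptr member alone would make the interface type non-copyable, which conflicts with the desired value semantics. The handle structs, OtherHit and the helper isSupported are hypothetical names used only for this example.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

// Hypothetical stand-ins for generated handles (assumption)
struct TrackerHit {
  float eDep;
  float getEDep() const { return eDep; }
};
struct TrackerHitPlane {
  float eDep;
  float getEDep() const { return eDep; }
};
// Offers the same interface, but is not in the list of supported types
struct OtherHit {
  float getEDep() const { return 0.f; }
};

class TrackerHitInterface {
  struct Concept {
    virtual ~Concept() = default;
    virtual float getEDep() const = 0;
    virtual std::unique_ptr<Concept> clone() const = 0;
  };

  template <typename T>
  struct Model final : Concept {
    explicit Model(T val) : m_value(std::move(val)) {}
    float getEDep() const override { return m_value.getEDep(); }
    std::unique_ptr<Concept> clone() const override {
      return std::make_unique<Model>(*this);
    }
    T m_value;
  };

  // The equivalent of the Types list from the YAML definition
  template <typename T>
  static constexpr bool isSupported =
      std::is_same_v<T, TrackerHit> || std::is_same_v<T, TrackerHitPlane>;

  std::unique_ptr<Concept> m_self;

public:
  // Only participates in overload resolution for the supported types
  template <typename T, typename = std::enable_if_t<isSupported<T>>>
  TrackerHitInterface(T val) : m_self(std::make_unique<Model<T>>(std::move(val))) {}

  // Deep copies via clone restore value semantics
  TrackerHitInterface(const TrackerHitInterface& other) : m_self(other.m_self->clone()) {}
  TrackerHitInterface& operator=(const TrackerHitInterface& other) {
    m_self = other.m_self->clone();
    return *this;
  }

  float getEDep() const { return m_self->getEDep(); }
};

static_assert(std::is_constructible_v<TrackerHitInterface, TrackerHit>);
static_assert(!std::is_constructible_v<TrackerHitInterface, OtherHit>);

inline void restrictedErasureExample() {
  TrackerHitInterface hit = TrackerHitPlane{2.5f};
  TrackerHitInterface copy = hit;
  assert(copy.getEDep() == 2.5f);
}
```

Dropping the constraint on the constructor recovers the unrestricted behavior described in the pros and cons above, where any type with a getEDep member would be accepted in memory.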

Author: Thomas Madlener

Created: 2023-11-30 Thu 13:05