======================================
  Type A
======================================

Abstract L1: "Observation [..] is presented" -> "The observation [..] is reported"

Abstract L8: "tZq signal significance" -> "significance of the tZq signal"

L3: "The number [..] facilitate" -> "The number [..] facilitates"

L4: "[..] is such a process." -> "One such process is [..]."

L49: ", pileup," -> we suggest using parentheses, "(pileup)", or stating it more explicitly, e.g. ", referred to as pileup,"

L139: "the tZq and background events" -> "tZq and background events", or "tZq production and background processes"

L148: "|eta|" -> "the |eta|" (L146 already uses "the |eta| of the recoiling jet")

Table 1: increase the size of the table; if it helps, consider removing the code name of each category, e.g. "(SR-2b)", from the headers.

Figure 1A, Caption L3: "|eta| of the recoiling jet" -> "the |eta| of the recoiling jet"

Figure 1A, Caption L3: "pane" -> "plot"

======================================
  Type B
======================================

Abstract L7: "previous measurements of the tZq cross section" -> wouldn't it be better to say something like "previous searches for tZq production", given that this is the paper that establishes the observation?

L61: reading the rest of the paper and looking at the AN, it seems jet energy resolution corrections (in MC) are not applied in the analysis. We understand the standard JetMET prescription is to apply them, so why was this not needed for this analysis?

L78: "clustered using the above jet finding algorithm with the tracks assigned to the PV as inputs" -> we suggest improving readability, for example by rephrasing to "obtained by clustering the tracks assigned to the PV with the same jet finding algorithm mentioned above"

L91: "which takes into account the increased particle collimation at high pT values" -> we suggest stating a bit more clearly why mini-isolation works better here, perhaps simply by rephrasing to "this pT dependence improves the efficiency for the identification of leptons originating from high-pT top quarks".
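To illustrate the pT dependence we have in mind, here is a minimal sketch of a cone radius that shrinks with the lepton pT. The function name and the default values (r_max, r_min, kt_scale_gev) are our own illustrative assumptions and are not taken from the paper; the definition actually used in the analysis may differ.

```python
def mini_iso_cone_radius(lepton_pt_gev, r_max=0.2, r_min=0.05, kt_scale_gev=10.0):
    """Return an isolation cone radius that scales as 1/pT, clamped to [r_min, r_max].

    All default values are illustrative assumptions, not numbers from the paper.
    """
    if lepton_pt_gev <= 0:
        raise ValueError("lepton pT must be positive")
    # Wide cone at low pT, narrow cone at high pT where decay products are collimated.
    return min(r_max, max(r_min, kt_scale_gev / lepton_pt_gev))
```

With these assumed defaults, the cone stays at r_max for soft leptons and narrows for leptons from boosted top quarks, so nearby decay products are less likely to spoil the isolation.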

L117: "The combination of tight leptons and [...] 'loose leptons'" -> as far as we understand, tight leptons do not necessarily pass the loose selection ("loose selection criteria on the attributes [..]"), and what you define as "loose leptons" is the OR of tight leptons and the other, looser leptons. This should be made clearer: explaining it all in one sentence, as currently done, makes the sentence hard to follow, so we suggest splitting the explanation into two shorter sentences.

L133: the purpose of having SR-2/3j-1b is briefly contextualized ("contains most tZq events"), but the same isn't done for SR-4j-1b and SR-2b and one is left wondering how these are exploited. A brief comment, as done for SR-2/3j-1b, would be useful.

L137-157: in the description of the BDT inputs, there is no mention of how/whether the BDT inputs were validated (for example, was there a requirement on good data/MC agreement for each of the BDT inputs), and how/whether the BDT settings were optimized. Although we agree this is to some extent a technical detail, we suggest considering the addition of a sentence covering these two items.

L141: "Therefore, the b-tagged jet having the closest invariant mass to the top quark mass [35], when combined with the lepton not forming the Z boson candidate (l_W) and pT-miss, is considered as originating from the top quark decay." -> we suggest improving readability, for example by rephrasing to "Therefore, the b-tagged jet which, once combined with pT-miss and the lepton not forming the Z boson candidate (l_W), gives the closest invariant mass to the top quark mass [35] is considered as originating from the top quark decay."
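For reference, our reading of the selection described in this sentence can be sketched as follows. This is only an illustration under stated assumptions: four-vectors are plain (E, px, py, pz) tuples in GeV, the neutrino four-vector built from pT-miss is supplied by the caller (how its longitudinal component is obtained is not specified here), and the reference top quark mass of 172.5 GeV is our assumption, since the value from [35] is not quoted in the text. All names are hypothetical.

```python
import math

TOP_MASS_GEV = 172.5  # assumed reference value; the paper takes its value from [35]


def invariant_mass(*four_vectors):
    """Invariant mass of the sum of (E, px, py, pz) four-vectors, in GeV."""
    e, px, py, pz = (sum(v[i] for v in four_vectors) for i in range(4))
    m2 = e * e - px * px - py * py - pz * pz
    return math.sqrt(max(m2, 0.0))  # guard against small negative rounding errors


def pick_top_bjet(b_jets, lepton_w, neutrino):
    """Index of the b-tagged jet whose mass with l_W and the neutrino is closest to the top mass.

    Raises ValueError if b_jets is empty.
    """
    return min(
        range(len(b_jets)),
        key=lambda i: abs(invariant_mass(b_jets[i], lepton_w, neutrino) - TOP_MASS_GEV),
    )
```

The rephrased sentence maps directly onto this: the jet is first combined with pT-miss and l_W, and the combination closest to the top quark mass selects the jet.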

L169: "implicitly" -> we suggest removing it; as noted in some of our other comments, it is unclear whether the normalization of ttV is left unconstrained in the fit or a prior uncertainty is used; if the latter is true, the value of this prior should be specified.

L234: we understand the ttZ and Xgamma backgrounds are taken from simulation (as stated in L169 and L185, respectively), but the sentence in L234 suggests there are no pre-fit (theoretical) uncertainties on the cross sections of these two processes. Could you please clarify what the prior uncertainties on the cross sections of ttZ and Xgamma are in the final fit?

L244: we assume Figures 1 and 1A contain the *post-fit* distributions; it might help to specify "post-fit", just once, here where Fig. 1 is introduced (or in the captions).

L255: from this sentence, we understand the lepton-ID efficiency (SF) uncertainties for 2016 and 2017 are fully correlated (unlike other experimental uncertainties like b-tagging, trigger and JES). Why is that the case?

L269: writing "asymptotic approximation of the test statistic" might leave the reader wondering what test statistic is used; please consider specifying this, for example with something along the lines of "a test statistic based on the profile likelihood ratio".
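For concreteness, the standard form we have in mind is the profile likelihood ratio test statistic for discovery and its asymptotic significance. We are assuming this is what the analysis uses; the symbols below (signal strength mu, nuisance parameters theta) are our notation and not necessarily the paper's.

```latex
q_0 =
\begin{cases}
  -2 \ln \dfrac{L(\mu = 0,\, \hat{\hat{\theta}})}{L(\hat{\mu},\, \hat{\theta})} & \text{if } \hat{\mu} \ge 0, \\[1ex]
  0 & \text{if } \hat{\mu} < 0,
\end{cases}
\qquad
Z = \sqrt{q_0}
```

Here the double-hatted theta denotes the nuisance parameters that maximize the likelihood for mu = 0, the single-hatted quantities are the global best-fit values, and Z is the significance in the asymptotic limit.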

Figure 1, right (SR-2b): what explains the larger statistical uncertainty in one of the bins (at BDT ~= 0.2)?