
A structure theorem for generalized-noncontextual ontological models

David Schmid1,2,3 , John H. Selby1 , Matthew F. Pusey4 , and Robert W. Spekkens2

1 International Centre for Theory of Quantum Technologies, University of Gdańsk, 80-308 Gdańsk, Poland
2 Perimeter Institute for Theoretical Physics, 31 Caroline Street North, Waterloo, Ontario Canada N2L 2Y5
3 Institute for Quantum Computing and Department of Physics and Astronomy, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada
4 Department of Mathematics, University of York, Heslington, York YO10 5DD, United Kingdom

arXiv:2005.07161v3 [quant-ph] 8 Mar 2024

It is useful to have a criterion for when the predictions of an operational theory should be considered classically explainable. Here we take the criterion to be that the theory admits of a generalized-noncontextual ontological model. Existing works on generalized noncontextuality have focused on experimental scenarios having a simple structure: typically, prepare-measure scenarios. Here, we formally extend the framework of ontological models as well as the principle of generalized noncontextuality to arbitrary compositional scenarios. We leverage a process-theoretic framework to prove that, under some reasonable assumptions, every generalized-noncontextual ontological model of a tomographically local operational theory has a surprisingly rigid and simple mathematical structure—in short, it corresponds to a frame representation which is not overcomplete. One consequence of this theorem is that the largest number of ontic states possible in any such model is given by the dimension of the associated generalized probabilistic theory. This constraint is useful for generating noncontextuality no-go theorems as well as techniques for experimentally certifying contextuality. Along the way, we extend known results concerning the equivalence of different notions of classicality from prepare-measure scenarios to arbitrary compositional scenarios. Specifically, we prove a correspondence between the following three notions of classical explainability of an operational theory: (i) existence of a noncontextual ontological model for it, (ii) existence of a positive quasiprobability representation for the generalized probabilistic theory it defines, and (iii) existence of an ontological model for the generalized probabilistic theory it defines.

Contents

1 Introduction
  1.1 Results
  1.2 Assumptions
2 Preliminaries
  2.1 Process theories
  2.2 Operational theories
    2.2.1 The GPT associated to an operational theory
    2.2.2 Tomographic locality of a GPT
  2.3 Representations of operational theories and GPTs
    2.3.1 Ontological models
    2.3.2 Quasiprobabilistic models
3 Three equivalent notions of classicality
4 Structure theorem
  4.1 Diagram-preserving quasiprobabilistic models are exact frame representations
    4.1.1 Diagram-preserving quasiprobabilistic models of quantum theory
  4.2 Structure theorems for ontological models
  4.3 Consequences of the dimension bound
  4.4 Diagram preservation implies ontic separability and more
  4.5 Converses to structure theorems
  4.6 Categorical reformulation
5 Revisiting our assumptions
  5.1 Revisiting diagram preservation
  5.2 Necessity of tomographic locality
6 Outlook
Acknowledgements
References
A Context-dependence in representations of GPTs
B Proof of the structure theorem (Theorem 4.1)
C Completing the proof of Proposition 3.2
D Proof of Proposition 4.4
E Proofs for preliminaries
  E.1 Proof that quotienting is diagram-preserving
  E.2 Proof of Lemma 2.5
  E.3 Proof of Lemma 2.6
  E.4 Proof of Eqs. (57) and (58)

Accepted in Quantum 2024-03-06, click title to verify. Published under CC-BY 4.0.
1 Introduction

For a given operational theory, under what circumstances is it appropriate to say that its predictions admit of a classical explanation? This article starts with the presumption that this question is best answered as follows: the operational theory must admit of an ontological model that satisfies the principle of generalized noncontextuality, defined in Ref. [1]. Admitting of a generalized-noncontextual ontological model subsumes several other notions of classical explainability, such as admitting of a positive quasiprobability representation [2, 3], being embeddable in a simplicial generalized probabilistic theory (GPT) [4–7], and admitting of a locally causal model [8, 9]. (Note that the first two of these results are first proved for general compositional scenarios in this paper.) Additionally, generalized noncontextuality can be motivated as an instance of a methodological principle for theory construction due to Leibniz, as argued in Ref. [10] and the appendix of Ref. [11]. Finally, operational theories that fail to admit of a generalized-noncontextual ontological model provide advantages for information processing relative to their classically explainable counterparts [12–19]. Because the notion of generalized noncontextuality is the only one we consider in this article, we will often refer to it simply as 'noncontextuality'.

To date, prepare-measure scenarios are the experimental arrangements for which the consequences of generalized noncontextuality have been most explored. A few works have also studied experiments where there is a transformation or an instrument intervening between the preparation and the measurement [18, 20–24]. However, generalized noncontextuality has not previously been considered in experimental scenarios wherein the component procedures are connected together in arbitrary ways, that is, in arbitrary compositional scenarios. Indeed, generalized noncontextuality has not even been formally defined at the level of compositional theories prior to this work; rather, it and several related concepts have only been formally defined for particular types of scenarios.

In this work, we give a process-theoretic [25–28] formulation of the various relevant notions of operational theories and of representations thereof, enabling the study of noncontextuality in arbitrary compositional scenarios, and indeed of the noncontextuality of operational theories themselves. We then derive a number of results regarding the structure of noncontextual representations of operational theories, and we ultimately put strong constraints on the nature of these representations.

Like Ref. [4], this work is sensitive to the distinction between operational theories and quotiented operational theories, commonly termed generalized probabilistic theories (or GPTs) [27–31]. In an operational theory, one understands the primitive processes (e.g., preparation, transformation, and measurement procedures) to be lists of laboratory instructions detailing actions that can be taken on some physical system. Such a theory also makes predictions for the statistics of outcomes in any given experimental arrangement (without making any attempt to explain these predictions). As lists of laboratory instructions, the processes in an operational theory contain details which are not relevant to the observed statistics; any such details are termed the context of the given process [1]. In contrast, a quotiented operational theory, or GPT, arises when one removes this context information by identifying any two processes that differ only by context—that is, which lead to all the same statistical predictions, and so are said to be operationally equivalent.

Our formalization of both operational theories and generalized probabilistic theories follows that of quotiented and unquotiented operational probabilistic theories (OPTs), as in Refs. [32–34]. For completeness, we provide a pedagogical introduction to our notation and conventions in Sec. 2. The framework presented here is also a precursor to a more novel framework presented in Ref. [35], which is motivated by the objective of cleanly separating the causal and inferential aspects of a theory.

There are multiple different representations of operational theories and generalized probabilistic theories that one can consider, often motivated by the aim of explaining the predictions of the theory by appealing to some underlying realist model of reality. The quintessential sort of explanation is an ontological model of an operational theory, which presumes that the systems passing between experimental devices have properties, and that the outcomes of measurements reveal information about these properties. A complete specification of these properties for a given system is termed its ontic state. The variability in this ontic state mediates causal influences between the devices. Ontological models may be defined for either operational theories or generalized probabilistic theories.

Another type of representation which has been widely considered (particularly by the quantum optics community) is that of quasiprobabilistic representations. Such representations are only defined for generalized probabilistic theories, and can be viewed as ontological models using quasiprobability distributions—that is, analogues of probability distributions in which some of the values can be negative.

Our formalization of ontological models is more general than that which is usually given, since we define them in a compositional manner, and for arbitrary theories rather than for particular scenarios. (Although note that this was already done for the special case of quantum theory in Ref. [36].) Furthermore, our formalization of quasiprobabilistic representations is more general than that which is usually given, since we define them for arbitrary GPTs (not necessarily quantum). In this latter case, our formalization was strongly influenced by the prescription suggested in Ref. [37].

We can now return to the question of when a theory's predictions admit of a classical explanation.

As argued above, our guiding principle is that of noncontextuality. The principle of noncontextuality is a constraint on ontological models of operational theories: namely, that the representation of operational processes does not depend on their context. That is, operational processes which lead to identical predictions about operational facts are represented by ontological processes which lead to identical predictions about ontological facts. If such a noncontextual ontological model exists, we take it to be a classical realist explanation of the predictions of the operational theory. Hence, the notion of classical explainability for operational theories is the existence of a generalized-noncontextual ontological model.

If one takes the processes of a generalized probabilistic theory as the domain of one's representation map, then there is no context on which a given representation (be it an ontological model or a quasiprobabilistic representation) could conceivably depend. This point was first made in Ref. [4], and we expand on it in this work, in particular in Appendix A. In such an approach, one cannot directly take noncontextuality—independence of context—as a notion of classicality for generalized probabilistic theories. Still, Ref. [4] showed that, in prepare-measure scenarios, the notion of noncontextuality for operational theories induces a natural and equivalent notion of classicality for generalized probabilistic theories. In particular, a generalized probabilistic theory admits of a classical explanation if and only if there is a simplex that embeds its state space and furthermore the hypercube of effects that is dual to this simplex embeds its effect space. Such an embedding can be viewed as an ontological model of a GPT [4, 5]. Hence, the resulting notion of classical explainability for a GPT is the existence of an ontological model for it. Our work extends this result to the case of arbitrary compositional theories and scenarios.

We also extend (from prepare-measure scenarios to arbitrary compositional scenarios) the proof that positive quasiprobabilistic models are in one-to-one correspondence with noncontextual ontological models [2, 4]. Note that our proof—like the special case given in Ref. [4]—corrects some issues with the original arguments in Ref. [2].¹

¹ Ref. [2] was not careful to distinguish between quotiented and unquotiented operational theories, and as such did not stipulate whether quantum theory was being considered as an operational theory or as a GPT. As a result, it failed to note that the most natural domain for a quasiprobabilistic representation is quantum theory as a GPT, while the domain of a noncontextual ontological representation is necessarily quantum theory as an operational theory. Also as a consequence, it argued that a positive quasiprobability representation is the same thing as a noncontextual ontological model. Our recasting of the relation between quasiprobability representations and noncontextual ontological representations is explicit about such distinctions, and consequently we show that a positive quasiprobability representation is not in and of itself a noncontextual ontological model. Rather, it is just that the sets of these are in one-to-one correspondence.

Note that simplex embeddability can also be motivated as a notion of classicality as follows. First, a simplicial GPT (i.e., one in which all of the state spaces are simplices, transformations are arbitrary convex-linear maps between these simplices, effects are elements of the hypercubes which are dual to the simplices, and which is tomographically local) has been argued to capture a notion of classicality among operational theories [29, 30]. If a GPT satisfies simplex embeddability, then it follows that the set of states and the set of effects therein can be conceptualized as a subset of those arising in a simplicial GPT, implying that every experiment describable by the GPT can be simulated within the simplicial GPT. It follows that simplex embeddability captures the possibility of simulatability within a classical operational theory, hence a notion of classicality. Furthermore, the existence of a positive quasiprobabilistic representation is a notion of classicality in the sphere of quantum optics. Hence, our results ultimately show that three independently motivated notions of classicality (namely these two, and the existence of a noncontextual ontological model of an operational theory) all coincide in general compositional situations (such as is relevant, for example, in quantum computation).

Most importantly, this equivalence allows us to prove that every noncontextual ontological model of a tomographically local operational theory which satisfies an assumption of diagram preservation has a rigid and simple mathematical structure. In particular, every such model is given by a diagram-preserving positive quasiprobabilistic model of the GPT associated with the operational theory, and we prove that every such quasiprobabilistic model is in turn a frame representation [3, 38] that is not overcomplete. As a corollary, it follows that the number of ontic states in any such model is no larger than the dimension of the GPT space.

This rigid structure theorem and bound on the number of ontic states show that there is much less freedom in constructing noncontextual ontological models than previously recognized. In particular, it means that once the representation of the states is fixed (i.e., by choice of frame), there is no remaining freedom in the representations of the measurements and transformations. Moreover, in many ontological models the number of ontic states is taken to be infinite (e.g., corresponding to points on the surface of the Bloch ball); however, all such models are immediately ruled out by our bound on the number of ontic states.

These results also imply new proofs of the fact that operational quantum theory does not admit of a generalized-noncontextual model and simplify the problem of witnessing generalized contextuality experimentally.

Categorically, we view the GPT as being a particular monoidal category, and the representations thereof as being particular strong monoidal functors into subcategories of FVectR (the category of linear maps between finite dimensional real vector spaces). In the case of tomographically local GPTs, our structure theorem states that any such functor is naturally isomorphic to a standard representation of a tomographically local GPT within FVectR. In particular, this means that ontological models for such theories, should they exist, are essentially unique.

We now summarize our key results and main assumptions in more detail.

1.1 Results

We begin by providing informal statements of our main results. The first result in this list extends the results of Ref. [4] from the case of prepare-measure scenarios to arbitrary scenarios. The second entry in the list constitutes the main technical result of this work. The third is a primary consequence of this for the study of noncontextual ontological models.

1. We refine and generalize the notions of quasiprobabilistic models and ontological models of operational theories and of GPTs to arbitrary compositional scenarios and theories, and we show a triple equivalence between:
   (a) a positive quasiprobabilistic model of the GPT associated to the operational theory,
   (b) an ontological model of the GPT associated to the operational theory, and
   (c) a noncontextual ontological model of the operational theory.

2. We then prove a structure theorem for representations of a GPT which implies that:
   (a) every diagram-preserving quasiprobabilistic model of a GPT is a frame representation that is not overcomplete, i.e., an exact frame representation,
   (b) every diagram-preserving ontological model of a GPT is a positive exact frame representation, and
   (c) every diagram-preserving noncontextual ontological model of an operational theory can be used to construct a positive exact frame representation of the associated GPT, and vice versa.

3. A key corollary of these is that the cardinality of the set of ontic states for a given system in any diagram-preserving ontological model is equal to the dimension of the state space of that system in the GPT. For instance, a noncontextual ontological model of a qudit must have exactly d² ontic states. Similarly, the dimension of the sample space of any diagram-preserving quasiprobabilistic model is the GPT dimension.

These results show that by moving beyond prepare-measure scenarios, the concept of a noncontextual ontological model of an operational theory becomes constrained to a remarkably specific and simple mathematical structure. Moreover, our bound on the number of ontic states yields new proofs of the impossibility of a noncontextual model of quantum theory (e.g., via Hardy's ontological excess baggage theorem [39]) and dramatic simplifications to algorithms for witnessing contextuality in experimental data (e.g., reducing the algorithm introduced in Ref. [4] from a hierarchy of tests to a single test).
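To spell out one instance of this bound (an illustration on our part, not an additional claim): for quantum theory, the GPT vector space associated to a qudit is the real span of the Hermitian operators on C^d, so

  dim_R Herm(C^d) = d²  ⟹  |Λ_qudit| = d²,  and in particular  |Λ_qubit| = 2² = 4.

Any putative noncontextual model of a qubit whose ontic state space is the continuum of points on the surface of the Bloch ball is therefore excluded at the outset.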
1.2 Assumptions

The assumptions that are needed to prove our results will be formally introduced as they become relevant. For the sake of having a complete list in one place, however, we provide an informal account of them here. These assumptions can be divided into two categories.

First, we have assumptions limiting the sorts of operational theories that we are considering.

1. Unique deterministic effect: We consider only operational theories in which all deterministic effects (corresponding to implementing a measurement on the system and marginalizing over its outcome) are operationally equivalent [32].

2. Arbitrary mixtures: We assume that every mixture of procedures within an operational theory is also an effective procedure within that operational theory. That is, for any pair of procedures in the theory, there exists a third procedure defined by flipping a weighted coin and choosing to implement either the first or the second, depending on the outcome of the coin flip.

3. Finite dimensionality: We assume that the dimension of the GPT associated to the operational theory is finite.

4. Tomographic locality: For some of our results, we moreover limit our analysis to operational theories whose corresponding GPT is tomographically local (namely, where all GPT processes on composite systems can be fully characterized by probing the component systems locally [29]).

Second, we have assumptions that concern the ontological model (or quasiprobabilistic model).

1. Deterministic effect preservation: Any deterministic effect in the operational theory is represented by marginalization over the sample space of the system in the ontological (or quasiprobabilistic) model.

2. Convex-Linearity: The representation of a mixture of procedures is given by the mixture of their representations, and the representation of a coarse-graining of effects is given by the coarse-graining of their representations.

3. Empirical adequacy: The ontological (or quasiprobabilistic) representations must make the same predictions as the operational theory.

4. Diagram preservation: The compositional structure of the ontological (or quasiprobabilistic) representation must be the same as the compositional structure of the operational theory. (Formally, this means that we take these representations to be strong monoidal functors.)

The most significant assumption regarding the scope of operational theories to which our results apply is that of tomographic locality. Among the assumptions concerning the nature of the ontological (or quasiprobabilistic) model, the only one that is not completely standard is that of diagram preservation.

As we will explain, however, the assumption of diagram preservation does not restrict the scope of applicability of our results; rather, it is a prescription for how one is to apply our formalism to a given scenario. Furthermore, our main results do not require the full power of diagram preservation, but rather can be derived from the application of this assumption to a few simple scenarios: the identity operation, the prepare-measure scenario, and the measure-and-reprepare operation. However, full diagram preservation is a natural generalization of these assumptions, as well as of a number of other standard assumptions that have been made throughout the literature on ontological models, and so we will build it into our definitions rather than endorsing only those particular instances that we need for the results in this paper. We discuss these points in more detail in Section 5, and provide a defense of full diagram preservation in Ref. [35].

2 Preliminaries

In this section we provide a pedagogical introduction to the diagrammatic notation that we will employ and its application to operational theories, tomographically local GPTs, and their ontological and quasiprobabilistic representations. This section should be treated largely as a review of the relevant literature, which we include to have a self-contained presentation of the necessary formalism for our main results.

2.1 Process theories

In this paper we will represent various types of theories as process theories [26], which highlights the compositional structures within these theories. We will express certain relationships that hold between these process theories in terms of diagram-preserving maps². We give a brief introduction to this formalism here. Readers who would like a deeper understanding of this approach can read, for example, Refs. [25–28, 35].

² This work could also be presented in the language of category theory. In particular, process theories can be viewed as symmetric monoidal categories [26] and diagram-preserving maps as strong monoidal functors between them [40].

A process theory P is specified by a collection of systems A, B, C, ... and a collection of processes on these systems. We will represent the processes diagrammatically, e.g.,

[Eq. (1): a set of example processes, drawn as boxes labeled f, a, c, and b with input wires entering at the bottom and output wires leaving at the top, all contained in P]

where we work with the convention that input systems are at the bottom and output systems are at the top. We will sometimes drop system labels, when it is clear from context. Processes with no input are known as states, those with no outputs as effects, and those with neither inputs nor outputs as scalars. The key feature of a process theory is that this collection of processes P is closed under wiring processes together to form diagrams; for example,

[Eq. (2): a circuit obtained by wiring the processes of Eq. (1) together, which is again a process in P]

Wirings of processes must connect outputs to inputs such that systems match and no cycles are created.

We will commonly draw 'clamp'-shaped higher-order

processes [41, 42] such as: ing the map is equivalent, that is:
M (D)
B
D
τ (3) M (D)
A D
b
B BM c
b M (B) A E M
B B c M (B) M (A)
which we call testers. These can be thought of as some-
thing which maps a process from A to B to a scalar via: A B B A
M (E)
f E = f , (8)
B B A C M
T 7→ T τ . (4) C
M (C)
A A a C E
A D M
a
M (A) M (D) D
These are not primitive notions within the framework of M (A)
M

M (D)
process theories and instead are always thought of as be-
ing built out of particular state, effect and auxiliary sys- and such that wirings are mapped to wirings, e.g.
tem3 . In other words, the tester τ is really just shorthand
notation for a triple (sτ ,eτ ,Wτ ) where: M (B) M (B) M (D) M (B) M (B) M (D)
B B
eτ D = . (9)
B B M

τ = Wτ . (5) M (B) M (B) M (D) M (B) M (B) M (D)


A A In particular, this means that M maps the identity pro-
sτ cesses in P to identity processes in P ′ .
Remark 2.1. If we interpret these process theories as
symmetric monoidal categories, then any strict monoidal
functor M̄ defines a diagram preserving map M , simply by
A diagram-preserving map M : P → P ′ from one pro- taking M (A) = M̄ (A) and M (f ) = M̄ (f ). Note that this
cess theory P to another, P ′ , is defined as a map taking latter equation is not obviously well-typed, as, according
systems in P to systems in P ′ , denoted as to Eq. (7), M (f ) : M (A1 ) ⊗ ··· ⊗ M (An ) → M (B1 ) ⊗
S → M (S), (6) ···M (Bm ) while M̄ (f ) : M (A1 ⊗···⊗An ) → M (B1 ⊗···⊗
Bm ). However, in the case of strict monoidal functors, we
have that M̄ (A1 ⊗ ··· ⊗ An ) = M̄ (A1 ) ⊗ ··· ⊗ M̄ (An ), and
and processes in P to processes in P ′ . Taking inspiration so this is not actually a problem. If, on the other hand,
from [40, 44] this will be depicted diagrammatically as one instead has a strong monoidal functor, M̄ , in which
M (B1 )··· M (Bm ) this equality is relaxed to the natural isomorphism [45] µ,
µ
B1 ··· Bm B1 ··· Bm M̄ (A1 ⊗···⊗An ) ∼ = M̄ (A1 )⊗···⊗ M̄ (An ), one can still use
M :: f 7→ f , (7) this to define a diagram preserving map. The difference is
A1 ··· An A1 ··· An M that now (following Ref. [40]) we need to use the natural
M (A1 )··· M (An ) isomorphisms µ in order to define M (f ), that is, we define
M (f ) := µ−1 ◦ M̄ (f )◦µ. In this paper we will always be
considering strong monoidal functors where M̄ (I) = I,
such that wiring together processes before or after apply- ϵ
but if one merely had that M̄ (I) ∼ = I, then one must also
incorporate this natural isomorphism when defining the
action of M on states, effects, and scalars.
We will also use the concept of a sub-process theory,
where the intuitive idea is that P ′ is a sub-process theory
3 The recent work of Ref. [43] has shown how these, and other, of P, denoted P ′ ⊆ P, if the processes in P ′ are a subset
higher order processes can actually be incorporated as primitive of the processes in P that are themselves closed under
notions within a process-theoretic framework. forming diagrams. Formally, we do not require that a


sub-process theory is such a theory, but only that it is equivalent to such a theory; that is, we say that P′ ⊆ P if and only if there exists a faithful strong monoidal functor from P′ into P.

The key process theory which underpins this work is FVectR, defined as follows:

Example 2.2 (FVectR). Systems are labeled by finite dimensional real vector spaces V, where the composition of systems V and W is given by the tensor product V ⊗ W. Processes are defined as linear maps from the input vector space to the output vector space. Composing two processes in sequence corresponds to composing the linear maps, while composing them in parallel corresponds to the tensor product of the maps. If a process lacks an input and/or an output then we view it as a linear map to or from the one-dimensional vector space R. Hence, processes with no input correspond to vectors in V and processes with no output to covectors, i.e., elements of V∗. This implies that scalars—processes with neither inputs nor outputs—correspond to real numbers. FVectR is equivalent to the process theory of real-valued matrices. However, representing the former in terms of the latter requires artificially choosing a preferred basis for the vector spaces.

Two subtheories of FVectR will be especially important in what follows. The first is the process theory of (sub)stochastic processes. Here, systems are labeled by finite sets Λ which compose via the Cartesian product. Processes with input Λ and output Λ′ correspond to (sub)stochastic maps, and can be thought of as functions

  f : Λ×Λ′ → [0,1] :: (λ,λ′) ↦ f(λ′|λ)     (10)

where for all λ ∈ Λ we have Σ_{λ′∈Λ′} f(λ′|λ) ≤ 1. When this inequality is an equality, they are said to be stochastic (rather than substochastic). For any pair of functions f : Λ×Λ′ → [0,1] and g : Λ′×Λ′′ → [0,1] (where the output type of f matches the input type of g), sequential composition is given by g◦f : Λ×Λ′′ → [0,1] via the following rule for composing the functions:

  g◦f(λ′′|λ) := Σ_{λ′∈Λ′} g(λ′′|λ′) f(λ′|λ).     (11)

For any pair of functions f : Λ×Λ′ → [0,1] and g : Λ′′×Λ′′′ → [0,1], parallel composition is given by g⊗f : (Λ′′×Λ)×(Λ′′′×Λ′) → [0,1] via:

  g⊗f((λ′′′,λ′)|(λ′′,λ)) := g(λ′′′|λ′′) f(λ′|λ).     (12)

It is sometimes more convenient or natural to take an alternative (but equivalent) point of view on this process theory (e.g., this view makes it more clear that this is a sub-process theory of FVectR). In this alternative view, the systems are not simply given by finite sets Λ, but rather are taken to be the vector space of functions from Λ to R, denoted RΛ. Then, rather than taking the processes to be functions f : Λ×Λ′ → [0,1], one takes them to be linear maps from RΛ to RΛ′, denoted by:

  f : RΛ → RΛ′ :: v ↦ f(v)     (13)

where for all λ′ ∈ Λ′, we define f(v)(λ′) := Σ_{λ∈Λ} f(λ′|λ)v(λ). It is then straightforward to show that sequential composition of the stochastic processes corresponds to composition of the associated linear maps and that parallel composition of the stochastic processes corresponds to the tensor product of the associated linear maps. For example, for sequential composition we have that for all v ∈ RΛ,

  (g◦f)(v)(λ′′) = Σ_{λ∈Λ} g◦f(λ′′|λ) v(λ)     (14)
               = Σ_{λ∈Λ} Σ_{λ′∈Λ′} g(λ′′|λ′) f(λ′|λ) v(λ)     (15)
               = Σ_{λ′∈Λ′} g(λ′′|λ′) ( Σ_{λ∈Λ} f(λ′|λ) v(λ) )     (16)
               = Σ_{λ′∈Λ′} g(λ′′|λ′) f(v)(λ′)     (17)
               = g(f(v))(λ′′).     (18)

Moreover, consider processes with no input—that is, linear maps p : R → RΛ. By using the trivial isomorphism R ≅ R⋆, where ⋆ is the singleton set ⋆ := {∗}, one can see that these correspond to functions p : ⋆×Λ → [0,1] such that Σ_{λ∈Λ} p(λ|∗) ≤ 1; following standard conventions, we can denote this as p(λ) := p(λ|∗). So p just corresponds to a subnormalised probability distribution, as expected. Similarly, processes with no output, such as r : RΛ → R, correspond to response functions, that is, functions r : Λ×⋆ → [0,1] such that for all λ one has r(λ) := r(∗|λ) ∈ [0,1]. Finally, it follows that processes with neither inputs nor outputs, s : R → R, correspond to elements of [0,1], i.e., to probabilities.

Summarizing the above, we have:

Example 2.3 (SubStoch). We define SubStoch as a subtheory of FVectR where systems are restricted to vector spaces of the form RΛ and processes are restricted to those that correspond to (sub)stochastic maps.
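To make the matrix picture of SubStoch concrete, the following minimal numerical sketch (our own illustration, using the numpy library; the particular matrices are arbitrary choices) stores (sub)stochastic processes as column-(sub)stochastic matrices, composes them sequentially as in Eq. (11) by matrix multiplication and in parallel as in Eq. (12) by the Kronecker product, and evaluates a closed diagram to a probability:

import numpy as np

# A (sub)stochastic process with input set Lambda and output set Lambda' is stored
# as a matrix whose (lam', lam) entry is f(lam'|lam); each column sums to at most 1.
f = np.array([[0.7, 0.2],        # Lambda = {0,1}  ->  Lambda' = {0,1,2}
              [0.1, 0.5],
              [0.2, 0.3]])
g = np.array([[0.4, 0.9, 0.0],   # Lambda' = {0,1,2}  ->  Lambda'' = {0,1}
              [0.6, 0.1, 1.0]])

p = np.array([0.5, 0.5])         # a state: a (sub)normalized distribution on Lambda
r = np.array([1.0, 0.0])         # a response function (effect) on Lambda''

seq = g @ f                      # sequential composition, Eq. (11): matrix product
par = np.kron(g, f)              # parallel composition, Eq. (12): Kronecker (tensor) product

print(seq.sum(axis=0))           # each column still sums to at most 1, so seq is (sub)stochastic
print(float(r @ seq @ p))        # a closed diagram (state, then process, then effect) is a probability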
The second subtheory of FVectR is QuasiSubStoch, which is the same as the process theory of (sub)stochastic processes, but where the constraint of positivity is dropped. The systems can be taken to be finite sets Λ, and the processes with input Λ and output Λ′ can be taken to be functions

  f : Λ×Λ′ → R :: (λ,λ′) ↦ f(λ′|λ).     (19)

These are said to be quasistochastic (as opposed to quasisubstochastic) if they moreover satisfy Σ_{λ′∈Λ′} f(λ′|λ) = 1 for all λ ∈ Λ. The way that these compose and are represented in FVectR is exactly the same as in the case of substochastic maps.
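It is worth previewing with a small example why QuasiSubStoch is a natural home for quantum theory viewed as a GPT, a fact invoked below (see also Ref. [37]). The following sketch is our own illustration, not a construction from this paper: it uses numpy and a qubit SIC-POVM as a frame, represents states, effects, and a unitary channel over a four-element set Λ, and exhibits the negativity that dropping positivity allows while still reproducing the Born-rule statistics.

import numpy as np

# Pauli matrices and the four qubit SIC projectors (Bloch vectors at tetrahedron vertices).
I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
bloch = np.array([[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]) / np.sqrt(3)
Pi = [(I2 + b[0]*X + b[1]*Y + b[2]*Z) / 2 for b in bloch]
frame = [P / 2 for P in Pi]                      # SIC-POVM elements; they sum to the identity

# Dual frame D_k, fixed by Tr(F_j D_k) = delta_jk, so that A = sum_k Tr(F_k A) D_k.
G = np.array([[np.trace(Fj @ Fk).real for Fk in frame] for Fj in frame])
Ginv = np.linalg.inv(G)
dual = [sum(Ginv[k, j] * frame[j] for j in range(4)) for k in range(4)]

def rep_state(rho):   # density operator -> probability vector on the 4-element set Lambda
    return np.array([np.trace(F @ rho).real for F in frame])

def rep_effect(E):    # POVM element -> response-like covector (entries need not lie in [0,1])
    return np.array([np.trace(E @ D).real for D in dual])

def rep_gate(U):      # unitary channel -> 4x4 quasistochastic matrix
    return np.array([[np.trace(frame[j] @ U @ dual[k] @ U.conj().T).real
                      for k in range(4)] for j in range(4)])

H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)   # Hadamard unitary
rho = np.array([[1, 0], [0, 0]], dtype=complex)               # the state |0><0|
E = rho                                                       # the effect |0><0|

M = rep_gate(H)
print(np.allclose(M.sum(axis=0), 1.0))      # True: columns sum to 1, so M is quasistochastic...
print(bool((M < -1e-9).any()))              # True: ...but some of its entries are negative
born = np.trace(E @ H @ rho @ H.conj().T).real
print(bool(np.isclose(rep_effect(E) @ M @ rep_state(rho), born)))   # True: Born rule reproduced

Any other informationally complete frame would serve equally well here; the SIC choice is purely for concreteness.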
Summarizing, we have must factorise over separated diagrams, for example,
Example 2.4 (QuasiSubStoch). We define E1 E2 E1 E2
QuasiSubStoch as the subtheory of FVectR where sys- = (23)
P1 P2 P1 P2
tems are restricted to vector spaces of the form RΛ and p p p
processes are those that correspond to quasi(sub)stochastic
maps. = Pr(E1 ,P1 )Pr(E2 ,P2 ). (24)
Moreover, if T1 is a procedure that is a mixture of T2 and
By construction, SubStoch ⊂ QuasiSubStoch ⊂ T3 with weights ω and 1 − ω respectively4 , then it must
FVectR ; however, in contrast to FVectR , SubStoch hold that for any tester τ , we have
and QuasiSubStoch do come equipped with a preferred
basis for each system. It is known that quantum theory
as a GPT (QT) can be represented as a subtheory of
QuasiSubStoch (see, for example, [37]). T1 τ =ω T2 τ +(1−ω) T3 τ (25)

p p p
2.2 Operational theories
Additionally, if one operational effect E1 is the coarse-
We now introduce a process-theoretic presentation of the
graining of two others, E2 and E3 , then Pr(E1 ,P ) must
framework of operational theories as defined in Ref. [1],
be the sum of Pr(E2 ,P ) and Pr(E3 ,P ) for all P .
resulting in a framework that is essentially that of (unquo-
tiented) operational probabilistic theories [34]. An oper- Our main result holds only for operational theories
ational theory Op is given by a process theory specifying satisfying the following property:
a set of physical systems and the processes which act on
them (where processes are viewed as lists of lab instruc- E1 E2 E1 E2
tions), together with a rule for assigning probabilities to
any closed process. A generic laboratory procedure has
∀E1 ,E2 ,P1 ,P2 T = T′
an associated set of inputs and outputs, and will be de- P1 P2 P1 P2
p p
noted diagrammatically as:

⇐⇒
B1 Bm
...
T . (20)
... E E
A1 An

Of special interest are processes with no inputs and pro- ∀E,P T = T′ . (26)
cesses with no outputs, depicted respectively as
P P
E p p
, . (21)
P ′
In other words, any two processes T and T that give the
same statistics for all local preparations on their inputs
The former is viewed as a preparation procedure and the
and all local measurements on their outputs also give the
latter is viewed as an effect, corresponding to some out-
same statistics in arbitrary circuits. Such operational the-
come of some measurement. We depict the probability
ories are alternatively characterized by the fact that the
rule by a map p, as
GPT defined by quotienting them satisfies tomographic
locality, as we show below.
E
= Pr(E,P ) ∈ [0,1]. (22) Two processes with the same input systems and output
P
p systems are said to be operationally equivalent [1] if they
give rise to the same probabilities no matter what closed
That is, the application of p on any closed diagram yields
diagram they are embedded in. The testers from Eq. (5)
a real number between 0 and 1. Note that this is not
facilitate a convenient diagrammatic representation of
a diagram-preserving map as it can only be applied to
this condition. That is, two processes are operationally
processes with no input and no output. (Nonetheless, we
will see shortly how it has a diagram-preserving extension
to arbitrary processes—namely, the quotienting map).
4 That
This probability rule must be compatible with certain is, either T2 or T3 is implemented, as determined by the
relations that hold between procedures [31, 41]. First, it outcome of a weighted coin-flip.

equivalent, denoted by the predictions of the operational theory, since
B B
E E E
T ≃ T′ (27) E
e

= = = = Pr(E,P ). (29)
A A P P
Pe P
∼ p
if they assign equal probabilities to every tester5 , so that ∼

It is worth noting that in the quotiented operational the-


ory, a closed diagram is equal to a real number (the prob-
∀τ T τ = T′ τ . (28) ability associated to it), while in the operational theory
these are not equal until the map p is applied to the closed
p p diagram.
We will assume that every deterministic effect for a
It is easy to see that operational equivalence defines an given system, A, in the operational theory (correspond-
equivalence relation. Hence, we can divide the space of ing to implementing a measurement on the system and
processes into equivalence classes, and each process T in marginalizing over its outcome) is operationally equiva-
the operational theory can be specified by its equivalence lent. We denote these deterministic effects as:
class Te, together with a label cT of the context of T , c
specifying which element of the equivalence class it is. For (30)
A
a given T , cT provides all the information which defines
that process which is not relevant to its equivalence class. where c labels the context.
Hence, each procedure is specified by a tuple, T := (Te,cT ),
and we will denote it as such when convenient. In the case
of closed diagrams, the equivalence class can be uniquely 2.2.1 The GPT associated to an operational theory
specified by the probability given by the map p, and so
any information beyond this forms the context of the It is well known [31–33] that a quotiented operational
closed diagram. theory, Op,
f is nothing but a generalized probabilistic the-
Next, we define a quotienting map ∼ which maps pro- ory [29, 30], and in fact for this paper we view this as
cedures into their equivalence class (exactly as is done to the definition of a GPT. We will now demonstrate this
construct quotiented operational probabilistic theories by showing that Op f is tomographic (a notion that will
in Ref. [34]). Given a characterization of each procedure be defined momentarily), is representable in real vector
as a tuple of equivalence class and context, the quotient- spaces, is convex, and has a unique deterministic effect.
ing map picks out the first element of this tuple, taking This is analogous to how quotiented OPTs arise from un-
(Te,cT ) → Te. 6 Diagrammatically, we have quotiented OPTs in [32, 33].
Firstly, note that the quotiented operational theory
is tomographic. For a generic process theory, P, being
(T̃ ,cT ) = T̃ . tomographic means that processes are characterized by

scalars. That is, given any two distinct processes f, g :
We prove that it is diagram-preserving in Appendix E.1. A → B ∈ P,
For processes which are closed diagrams, one can al-
ways choose the representative of the equivalence class f ̸= g , (31)
to be the real number specified by the probability rule.
Hence, the map ∼: Op → Op f can be viewed as a diagram- there must exist a tester h ∈ P that turns each of these
preserving extension of the probability rule p. This im- processes into a closed diagram, i.e., a scalar, such that
plies that the quotiented operational theory reproduces the scalars are distinct:

5 The recognition that (for operational theories corresponding to ∃h ∈ P : f h ̸= g h . (32)


GPTs that are not tomographically local) one must consider testers
that allow for side channels can be found in Refs. [31, 34, 46] .
6 This That Eq. (32) implies Eq. (31) for processes in a quo-
notation is very convenient in practice, but it is worth
noting that it has the awkward feature that two operationally tiented operational theory is trivial; we now give the proof
equivalent procedures S and T would map to S e and Te (respectively), that Eq. (31) implies Eq. (32). Consider two distinct pro-
which constitute two distinct labels for the same equivalence class cesses, Te and Te′ , in the quotiented operational theory—
(since S
e= Te). the images under the quotienting map of process T and

T ′ in the operational theory—such that: finite dimensional vector representation RTe of a process
B B Te, defined componentwise via
Te ̸= Te′ . (33)
[RTe]α := Te τeα (39)
A A

By definition, we know that Te ̸= Te′ implies that T ̸≃ T ′ , for α = 1,2,...,mA→B . This new representation, RTe, has
and hence there exists some tester τ such that a straightforward relation to the original vector represen-
tation, KTe; all one must do to go from the original to the
new representation is to restrict the KTe vectors to the
T τ ̸= T′ τ . (34) relevant subset of their components. We can think of this
A→B A→B
as a linear restriction map F A→B : RT → RF ::
p p
K 7→ K|F A→B . This allows us now to relate these two rep-
Since the action of p is identical to that of ∼ on closed resentations via the observation that KTe|F A→B = RTe for
diagrams, this implies that all processes Te : A → B.
The second key property of fiducial sets of testers is
that they define a linear compression of the KTe vectors.
T τ ̸= T′ τ . (35) Formally, what we mean by this is that there is a linear
A→B A→B
map E A→B : RF → RT which is the inverse to
∼ ∼ the restriction map F A→B on vectors KTe, that is, for all
Finally, we can use the fact that the quotienting map is Te we have that E A→B (RTe) = E A→B (KTe|F A→B ) = KTe.
diagram-preserving to write that there exists τe such that We reiterate that F A→B is taken to be a minimal fidu-
cial set of testers, which means that it is a minimal cardi-
Te τe ̸= Te′ τe . (36) nality set of testers satisfying these two properties. Note
that minimal fiducial sets are typically not unique.
Consider now how the sequential composition of pro-
This establishes that Eq. (31) implies Eq. (32), and so cesses is represented. Given representations of a pair of
the quotiented operational theory is tomographic. processes (RTe, RTe′ ) we know (as R is injective) that
This means that we can identify an operational equiv-
we can determine Te and Te′ , compute their composition
alence class of processes, Te, with a real vector, KTe, liv-
A→B Te′ ◦ Te, and via Eq. (39) obtain RTe′ ◦Te. We denote the se-
ing in RT , where T A→B denotes the set of testers for quential composition map on the vector representation
processes with input A and output B. Concretely, we de- as RTe′ □RTe := RTe′ ◦Te. Similarly, for parallel composition
fine these vectors component-wise via we can define RTe′ ⊠RTe := RTe′ ⊗Te. As we demonstrate in
h i Appendix E.2, both □ and ⊠ can be uniquely extended to
KTe := Te τe . (37) bilinear maps on the relevant vector spaces. Specifically,
τ
we have:
Clearly, following on from the discussion around Eq. (36), Lemma 2.5. The operation □ can be uniquely extended
f′ . This vector
we have that KTe = KTe′ if and only if Te = T to a bilinear map
space representation, however, is generally infinite dimen- 
mB→C A→B
 A→C

sional, and gives a highly inefficient characterisation of □: R ,Rm → Rm , (40)


processes. We can instead focus on some minimal subset and the operation ⊠ can be uniquely extended to a bilinear
of fiducial testers F A→B ⊂ T A→B which, for notational map
convenience, we index as F A→B := {e τα }α=1,2,...,mA→B . 
mA→B C→D
 AC→BD

The term fiducial means that this subset of testers ⊠: R ,Rm → Rm . (41)
satisfies two key properties. The first is that they must
also suffice for tomography, i.e., This implies that transformations act linearly on the
state space, and also that the summation operation dis-
B B
tributes over diagrams, i.e.:
Te ̸= Te′ ⇐⇒ ∃α ∈ {1,...,mA→B } : Te τeα ̸= Te′ τeα .
RTe RTe
A A X
= ri . (42)
(38) P
RTe
i ri RT
ei RTe′ i i
RTe′
We can therefore use these fiducial testers to define a

It is generally easier to work with this vector represen- had a unique equivalence class of deterministic effects
tation of processes rather than directly with the abstract means that the quotiented operational theory will have
process theory of operational equivalence classes of pro- a unique deterministic effect [32] for each system:
cedures. We will do so when convenient, abusing notation
by dropping the explicit symbol R, and simply denot- c
ing the vector representation of the equivalence classes in := . (48)
A A

the same way as the equivalence classes themselves. That
is, we will denote RTe by Te. For example, we will write
Eq. (42) as In summary, a quotiented operational theory satisfies
the key properties of a GPT: being tomographic, repre-
sentability of each system in Rd (for some d), convexity,
Te X Te
= ri . (43) and uniqueness of the deterministic effect for each sys-
tem. Henceforth, we will refer to the quotiented opera-
P
i ri Ti
e Te′ i Tei Te′
tional theory as the GPT associated to the operational
P theory, and we presume that every GPT can be achieved
Note that generic linear combinations such as i ri Tei
in this way.
need not correspond to any process in the operational
For example, quantum theory qua operational theory
theory. However, some linear combinations correspond to
is the process theory whose processes are laboratory pro-
mixtures and coarse-grainings, and these will correspond
cedures (including contexts), while quantum theory qua
to other processes in the operational theory. Namely, if
GPT is the process theory whose processes are com-
T1 is a procedure that is a mixture of T2 and T3 with
pletely positive [47, 48] trace-nonincreasing maps, whose
weights ω and 1−ω, then by Eq. (25) it follows that
states are density operators, and so on. When one quo-
tients quantum theory qua operational theory, one ob-
tains quantum theory qua GPT.
T1 τ =ω T2 τ +(1−ω) T3 τ (44) It is worth noting that a quotiented operational theory
should not be viewed as an instance of an operational the-
p p p
ory7 . In an operational theory, it is not merely that con-
for all τ , which in turn implies that texts are permitted, rather they are required in the defini-
tion of the procedures. The primitives in an operational
theory are laboratory procedures, and these necessarily
Te1 τe = ω Te2 τe +(1−ω) Te3 τe , (45) involve a complete description of the context of a proce-
dure. For example, “prepare the maximally mixed state”
and so by the fact that quotiented operational theories specifies a preparation when viewing quantum theory as
are tomographic, a quotiented operational theory, but not when viewing
it as an operational theory. In the latter case, one must
Te1 = ω Te2 +(1−ω)Te3 . (46)
additionally specify how this mixture is achieved, e.g.,
Hence, the mixing relations between preparation proce- which ensemble of pure states was prepared or which en-
dures in the operational theory are captured by a convex tangled state was followed by tracing of a subsystem.
structure in this representation. More generally we find
that 2.2.2 Tomographic locality of a GPT
Tomographic locality is a common assumption in the
∀τ T τ =
X
ri Ti τ ⇐⇒ Te =
X
ri Tei (47) GPT literature—indeed, in some early work on GPTs it
i i
was considered such a basic principle that it was taken as
p p
part of the framework itself [30]. Intuitively, it states that
processes can be characterized by local state preparations
for arbitrary coefficients ri . Hence, for example, the and local measurements. In this section, we will show
coarse-graining relations that hold among operational ef- that all tomographically local GPTs can be represented
fects are captured by the linear structure in this repre- as subtheories Opf ⊂ FVectR , using arguments similar to
sentation of the quotiented operational theory. those in, e.g., the duotensor formalism in Ref. [31].
If one makes the standard assumption that every pos- A GPT is said to satisfy tomographic locality if one can
sible mixture of processes in the operational theory is an- determine the identity of any process by local operations
other process in the operational theory, then it follows
that the quotiented operational theory is convex. 7 Although this is not captured by the formal definitions in this
Finally, the fact that we assumed that each system A paper, it is captured by the more refined formalization in Ref. [35].

on each of its inputs and outputs: The vector space spanned by the set {PejA }j=1,2,...,mA
A
of states is Rm , and the vector space spanned by the set
E e′
E E e′
E
e e
{Ee B }i=1,2,...,mB of effects is RmB . Note that in general,
i
∀E,
eE e ′ ,Pe,Pe′ Te = f′ e A ◦ PeA ̸= δij 8 , which generically implies that M is not
T E j i 1A
e
Pe Pe′ Pe Pe′ the identity matrix (nor is it equal to Ne 1A
), counter to in-
tuitions one might have from working with orthonormal

⇐⇒
bases. The following corollary then makes explicit some
extra structure which was implicit in the vector represen-
tation RTe of the previous section. In particular, it shows
A→B
Te = Te′ . (49) that the vector space Rm of transformations from A
to B is isomorphic to the vector space of linear maps from
A B

One can immediately verify that an operational theory Rm → Rm , where a process Te is represented as a vec-
satisfies Eq. (26) if and only if the GPT obtained by quoti- tor RTe in the former and a matrix MTe in the latter.
enting relative to operational equivalences is tomograph-
ically local.
There are many equivalent characterizations of tomo- Corollary 2.7. A GPT is tomographically local if and
graphic locality for a GPT. The most useful for us is the only if one can decompose any process Te as
following condition, first introduced in Sections 6.8 and
B
9.3 of [31], which allows us to show that tomographically
B PejB
local GPTs can be represented as subtheories of FVectR . X
Consider an arbitrary set of linearly independent and Te = [MTe]ji , (53)
spanning states {PejA }j=1,2,...,mA on system A and an ar- A
ij eA
Ei
bitrary set of linearly independent and spanning effects A
{Ee B }i=1,2,...,mB on system B. (If the systems are com-
i
posite, these should moreover be chosen as product states where MTe = Me
1B
◦NTe ◦Me
1A
.
and product effects respectively.) Define the ‘transition
matrix’ in this basis for a process Te from system A to sys-
tem B as Proof. To prove that a GPT is tomographically local if one
can decompose any process as in Eq. (53), simply note that
eB
E k
for the special case of Te = 1
eA , Eq. (53) implies Eq. (51),
B and hence, by Lemma 2.6, implies tomographic locality.
[NTe]kj := Te . (50) To prove the converse, we assume tomographic locality
A and apply Lemma 2.6 to decompose the input and output
PejA
systems of an arbitrary process Te as

PelB
Lemma 2.6. A GPT is tomographically local if and only X PelB
l
if one can decompose the identity process, denoted 1
eA , for [Me ]
X
[Me ]l
1B k
1B k
every system A as eB
E kl
B kl k

A Te = Te = [NTe]kj
PejA A PejA X
[Me ]j
X
[Me ]j
X
= [Me ]j
1A i
, (51) 1A i
ij
1A i
A ij eA
E
ij i
eA
E eA
i Ei
A
(54)
where Me 1A
is the matrix inverse of the transition matrix
Ne1A
of the identity process, that is,
Me
1A
:= N−1 . (52)
1A
e 8 This fails to be an equality whenever the spanning effects E
i
do not perfectly distinguish the spanning states Pj . Note that this
This was originally shown in [31, Sec. 6.8], and we re- will be the generic case, as it is only in a simplicial GPT that there
prove this in our notation in Appendix E.3. is a spanning set of states which can be perfectly distinguished.

which can be rewritten as: Theorem 2.8. Any tomographically local GPT has a
diagram-preserving representation in FVectR given by
B PelB
 
X X the map
Te =  [M ]lk [N ]kj [M ]ji  (55) A 7→ Rm
A
1B
e T
e 1A
e
il jk eA
A E i on systems and the map
Te 7→ MTe ◦Ne
1A
PelB
X on processes, where
= [Me
1B
◦NTe ◦Me ]l
1A i
(56)
il eA eB
Ei Ek

B
at which point we can simply identify MTe = Me
1B
◦NTe ◦ [NTe]kj := Te (59)
Me 1A
, giving the desired result. A
PejA
Given the vector representation of two equivalence
classes, RTe and RTe′ , we showed how to compute the rep-
for some basis of states {PejA } and effects {E
e B }, and where
k
resentation of either the sequential or parallel composi-
tion of these (via □ and ⊠, respectively). However, if we MTe := N−1 ◦NTe ◦N−1 . (60)
1B
e 1A
e
represent equivalence classes by matrices MTe instead,
then how must we represent the parallel and sequential This result is implicit in the work of Refs. [31, 34] and
composition of processes? It turns out that parallel com- more explicit in the quantum case of [37].
position is represented by Effectively this means that we can view any tomo-
MTe⊗Te′ = MTe ⊗MTe′ (57) graphically local GPT simply as a suitably defined sub-
f ⊂ FVectR . For the remainder of this paper
theory Op
where the ⊗ on the left represents the parallel compo-
we restrict our attention to tomographically local GPTs,
sition of equivalence classes, while on the right it repre-
and we will moreover abuse notation and simply denote
sents the tensor product of the two matrices. Meanwhile,
the linear maps in this representation by Te rather than
the sequential composition of this matrix representation
by MTe ◦Ne , and similarly, the vector spaces as A rather
is given by 1A
A
than by Rm . That is, we will neglect to make the dis-
MTe′ ◦Te′ = MTe′ ◦Ne ◦MTe (58)
1B tinction between the quotiented operational theory and
where on the left-hand side ◦ represents the sequential its representation as a subtheory of FVectR , as preserv-
composition of the equivalence classes, while on the right- ing the distinction is unwieldy and typically unhelpful.
hand side it represents matrix multiplication. These two Quantum theory is an example of an operational the-
facts are proven in Appendix E.4. ory, and it is well known that the GPT representation of
The fact that Eq. (58) is not simply the sequential com- quantum theory is tomographically local. The latter is
position rule for FVectR , namely the matrix product of a subtheory of FVectR , as B(H) is a real vector space
MTe′ and MTe, implies that this matrix representation is and completely positive trace non-increasing maps are
not a subtheory of FVectR , nor even some other diagram- just a particular class of linear maps between these vec-
preserving representation of the GPT. This form of com- tor spaces. Classical theory, the Spekkens toy model [50],
position has, however, appeared numerous times in the and the stabilizer subtheory [51] for arbitrary dimensions
literature, for example in Refs. [3, 31, 37, 49]. There is, are also tomographically local. Examples of GPTs which
moreover, a well known trick to turn this representation are not tomographically local are real quantum theory
into a diagram-preserving representation in FVectR : one [52] and the real stabilizer subtheory.
simply defines a new matrix representation by replacing
9
MTe with MTe ◦Ne 1A
. It is then easy to verify that these 2.3 Representations of operational theories and
do indeed compose using the standard composition rules
(tensor products for parallel composition and matrix mul-
GPTs
tiplication for sequential composition), and to verify that One often wishes to find alternative representations of
the identity process is represented by the identity matrix. an operational theory or a GPT, e.g., as an ontologi-
Putting all of this together we arrive at the following. cal model or a quasiprobabilistic model (to be defined
shortly). A key motivation for studying ontological mod-
9 In
the language of duotensors [31] this means that we put the els is the attempt to find an explanation for the statis-
duotensors into standard form. tics in terms of some underlying properties of the rele-

Accepted in Q
uantum 2024-03-06, click title to verify. Published under CC-BY 4.0. 13
vant systems, especially if this explanation can be said and 1−ω, respectively, then it must hold that
to be classical in some well-motivated sense. In this sec-
tion, we introduce the definition of ontological models
and quasiprobabilistic models, and in the next section T1 =ω T2 +(1−ω) T3 .
we discuss under what conditions one can say that such
ξ ξ ξ
representations provide a classical explanation of the op-
erational theory or GPT which they describe. (61)
This diagrammatic definition of an ontological model
reproduces the usual notions [1] of ontological represen-
tations of preparation procedures and of operational ef-
2.3.1 Ontological models
fects. In particular, an operational preparation procedure
is an operational process with a trivial input, and by di-
An ontological model is a map associating to every system agram preservation of ξ, this is mapped to a process in
S a set ΛS of ontic states, and associating to every process SubStoch with a trivial input, that is, to a probabil-
a stochastic map from the ontic state space associated ity distribution over the ontic states: P 7→ ξ(P ) : Λ →
to the input systems to the ontic state space associated R+ s.t. λ ξ(P )(λ) = 1. Similarly, an operational effect
P
with the output systems. is an operational process with a trivial output, and by di-
It is important to distinguish between ontological mod- agram preservation of ξ is mapped to a substochastic pro-
els of operational theories and ontological models of cess with a trivial output, that is, to a response function
GPTs, as was shown in Ref. [4]. In particular, the former over the ontic states: E 7→ ξ(E) : Λ → [0,1] s.t. ξ(E)(λ) ≤
allows for context-dependence while the latter does not. 1.
See App. A for a detailed discussion of this point. Definition 2.10 (Ontological models of GPTs). An on-
tological model ξe of a GPT Op
f is a diagram-preserving
Definition 2.9 (Ontological models of operational the- f → SubStoch , depicted as
map ξe: Op
ories). An ontological model [53] of an operational the-
RΛB
ory Op is a diagram-preserving map ξ : Op → SubStoch ,
depicted as B B

RΛB ξe:: Te 7→ Te ,
B B A A
ξ̃

RΛA
ξ :: T 7→ T ,
A A ξ
from the GPT to the process theory SubStoch, where the
RΛA map satisfies three properties:
1. It represents the deterministic effect for each system
from the operational theory to the process theory appropriately:
SubStoch, where the map satisfies three properties:
1. It represents all deterministic effects in the opera- = = 1.
tional theory appropriately: A
ξ̃
RΛ A
RΛA
c = = 1.
A
2. It reproduces the operational predictions of the GPT
ξ
RΛA (i.e., is empirically adequate), so that for all closed
RΛA
diagrams,
2. It reproduces the operational predictions of the oper-
ational theory (i.e., is empirically adequate). That E
e E
e
is, for all closed diagrams: = = Pr(E,
e Pe). (62)
Pe Pe
ξ̃

E E 3. It preserves the convex and coarse-graining relations


= = Pr(E,P ).
P P between operational procedures. E.g., if
ξ ∼

3. It preserves the convex and coarse-graining relations


between operational procedures. E.g., if T1 is a pro- Te1 = ω Te2 +(1−ω) Te3 (63)
cedure that is a mixture of T2 and T3 with weights ω

Accepted in Q
uantum 2024-03-06, click title to verify. Published under CC-BY 4.0. 14
then it must hold that diagrams,

E
e Ẽ
= = Pr(E,
e Pe).
Te1 =ω Te2 +(1−ω) Te3 . Pe P̃
ξ̂
ξ̃ ξ̃ ξ̃

3. It preserves the convex and coarse-graining relations


(64) between operational procedures. E.g., if
In analogy with the discussion above, one has that nor-
malized GPT states on some system are represented in
an ontological model by probability distributions over Te1 = ω Te2 +(1−ω) Te3 (66)
the ontic state space associated with that system, while
GPT effects are represented by response functions. then it must hold that
The state spaces in SubStoch form simplices, and so
we will sometimes refer to an ontological model of a GPT
as a simplex embedding. This terminology is a natural
Te1 =ω Te2 +(1−ω) Te3 .
extension of the definition of simplex embedding in [4].
ξ̂ ξ̂ ξ̂

2.3.2 Quasiprobabilistic models (67)


We now introduce quasiprobabilistic models of a GPT.
One could analogously define quasiprobabilistic models One can see that the only technical distinction between
of an operational theory (as diagram-preserving maps an ontological model of a GPT and a quasiprobabilistic
from Op to QuasiSubStoch). However, given that the model of a GPT is that in the latter, the probabilities are
expressive freedom offered by the possibility of context- replaced by quasiprobabilities, which are allowed to go
dependence is sufficient to ensure that every operational negative.
theory admits of an ontological model, and hence a posi-
In analogy with the discussion at the end of Sec-
tive quasiprobabilistic model, there is no need to make use
tion 2.3.1, one has that GPT states on some system are
of the additional expressive freedom offered by allowing
represented in a quasiprobabilistic model by quasidistri-
negative quasiprobabilities, and hence no motivation to
butions over the sample space associated with that sys-
introduce such models. On the other hand, in the case of
tem, that is, functions on Λ normalised to 1 but where the
GPTs, there does not always exist an ontological model,
values can be negative, while GPT effects are represented
hence quasiprobabilistic models are a useful conceptual
by arbitrary real-valued functions over the sample space.
and mathematical tool for assessing the classicality of a
GPT.
Definition 2.11 (Quasiprobabilistic models of GPTs).
A quasiprobabilistic model of a GPT Op,
f is a diagram- 3 Three equivalent notions of classicality
preserving map ξˆ: Op
f → QuasiSubStoch , depicted as
The only ontological models that constitute good classi-
ΛB
cal explanations are those that satisfy additional assump-
B B tions. One such principle is that of (generalized) noncon-
ξˆ:: Te 7→ Te , textuality [1]. It was argued in Refs. [1, 2, 4, 54] that
A A an ontological model of an operational theory should be
ξ̂

ΛA deemed a good classical explanation only if it is noncon-


textual. We now provide the definition of a noncontex-
where the map satisfies three properties: tual ontological model in the framework we have intro-
1. It represents the deterministic effect for each system duced here.
appropriately:

= = 1. (65) Definition 3.1 (A noncontextual ontological model of


an operational theory). An ontological model of an op-
A
ξ̂
RΛA erational theory ξnc : Op → SubStoch satisfies the prin-
RΛA
ciple of generalized noncontextuality if and only if every
2. It reproduces the operational predictions of the GPT two operationally equivalent procedures in the operational
(i.e., is empirically adequate), so that for all closed theory are mapped to the same substochastic map in the

Accepted in Q
uantum 2024-03-06, click title to verify. Published under CC-BY 4.0. 15
ontological model. That is, if following diagram:
RΛ B RΛB C
Op ∼
B B Op
f
′ ′ ξnc ,
T ≃ T =⇒ T = T . (68) ξe
A A
ξnc ξnc SubStoch
RΛ A RΛA where C is defined as a map, which is not diagram-
preserving (hence the dashed arrow), and which takes
Another way of stating this condition is that the map any process Te in the GPT Op f to some process T = (Te,cT )
ξnc does not depend functionally on the context of any in the operational theory. There always exists at least
processes in the operational theory, so that for all T := one such map C (in general, there exist many), and all
(Te,cT ) one has ξnc (T ) = ξnc (Te). of these satisfy ∼◦C = Id (in general, no choice of C will
satisfy C ◦∼ = Id).
Ontological models of GPTs (as we have defined
Now, consider an operational theory Op and the GPT
them) cannot be said to be either generalized-contextual
or generalized-noncontextual (in contrast to ontological Op
f it defines.
models of operational theories, which can). This is be- Given an ontological model ξe of Op, f one can define
cause the domain of our notion of an ontological model of a noncontextual model ξnc of Op via ξnc := ξe ◦ ∼. The
a GPT has no notion of a context on which the ontolog- map constructed in this manner cannot depend on the
ical representation could conceivably depend. (This was contexts of processes in the operational theory, since these
first pointed out in Ref. [4], and we explain it further in are removed by the quotienting map ∼. As such, the
Appendix A.) However, Ref. [4] showed (in the context map ξnc necessarily satisfies Eq. (68), and hence is indeed
of prepare-and-measure scenarios) that the principle of noncontextual.
noncontextuality nonetheless induces a notion of classi- Given a noncontextual ontological model ξnc of Op, one
cality within the framework of GPTs: namely, the GPT can define an ontological model ξe of Op f via ξe := ξnc ◦ C.
is said to have a classical explanation if and only if it ad- Because the map ξnc does not depend on the context, the
mits of an ontological model. (Not all GPTs admit of an map constructed in this manner does not depend on the
ontological model, even if the operational theory from choice of C, and is unique.
which they are obtained as a quotiented theory do. This For completeness, we prove in Appendix C that ξnc :=
is a consequence of the representational inflexibility re- ξe◦ ∼ indeed satisfies the relevant constraints to be an
sulting from the lack of contexts on which the represen- ontological model of an operational theory, and similarly,
tation might depend.10 ) We now extend this result (The- that ξe:= ξnc ◦C satisfies the relevant constraints to be a
orem 1 of Ref. [4]) from prepare-and-measure scenarios valid ontological model of a GPT.
to arbitrary scenarios.
Finally, we note that this notion of classicality of a
GPT is closely linked to the positivity of quasiprobabilis-
Proposition 3.2. There is a one-to-one correspondence tic models. This result can be seen as an extension of the
between noncontextual ontological models of an operational equivalence in Ref. [2] from the prepare-and-measure sce-
theory, ξnc : Op → SubStoch , and ontological models of nario to arbitrary compositional scenarios.
f → SubStoch .
the associated GPT, ξe: Op
Definition 3.3 (Positive quasiprobabilistic model of a to-
mographically local GPT). A positive quasiprobabilistic
Proof sketch. The idea of the proof is captured by the model of a tomographically local GPT Op f is a quasiproba-
bilistic model ξˆ+ : Op
f → QuasiSubStoch in which all of
10 Accordingly, the Beltrametti-Bujaski model [55] can be viewed the matrix elements of the quasisubstochastic maps in the
as an ontological model of the single qubit subtheory qua operational image of ξˆ+ are positive, that is, if and only if all of the qua-
theory, but not as an ontological model of the single qubit subtheory sisubstochastic maps in the image of ξˆ+ are substochastic.
qua GPT. This can be seen by noting that this model is explicitly
contextual while the single qubit subtheory qua GPT has no contexts. Simply by examining the definitions, it is
Equivalently, it can be seen by noting that the single qubit subtheory clear that a positive quasiprobabilistic model
qua GPT does not admit of any ontological model. The same can
be said of the 8-state model of Ref. [56] relative to the stabilizer ξˆ+ : Op
f → QuasiSubStoch of a GPT is equivalent to
qubit subtheory: it is an ontological model of the stabilizer qubit
subtheory qua operational theory (a contextual ontological model) f → SubStoch of that GPT.
an ontological model ξe: Op
but not of the stabilizer qubit subtheory qua GPT. The latter has It follows that:
no contexts and does not admit of any ontological model.

Accepted in Q
uantum 2024-03-06, click title to verify. Published under CC-BY 4.0. 16
Proposition 3.4. There exists a positive quasiprobabilis- Note that we have colored the linear maps χA to make
tic model of a GPT Opf if and only if there exists an onto- it immediately apparent that they came from the associ-
logical model of Op.
f ated diagram-preserving map.
The proof consists of three main arguments, provided
Although Proposition 3.4 follows immediately from
explicitly in Appendix B and sketched here.
the relevant definitions, we have nonetheless highlighted
it here. This is because a generic quasiprobabilistic model First, we leverage tomographic locality of the GPT, as
of a GPT has no meaningful conceptual relationship to well as convex-linearity and diagram preservation of the
an ontological model of a GPT, and so it is conceptually map, to prove that one can represent the action of the
important to understand in what special cases the two map on a generic process in terms of its action on states
notions coincide. Furthermore, we hope that highlighting and effects
this fact will encourage more dialogue between those re- VB VB
VB
searchers studying quasiprobabilistic models and those
studying ontological models. B Pei Pei
X X M
Putting Props. 3.2 and 3.4 together, one has that: Te = rij = rij . (70)
A ij ij
Corollary 3.5 (Three equivalent notions of classicality). E
ej
E
ej
M
M M
Let Op be an operational theory and Op
f the GPT obtained
VA VA VA
from Op by quotienting. Then, the following are equiva-
lent: Second, using convex-linearity of the map, we prove
(i) There exists a noncontextual ontological model of Op, that one can represent the action of M on states and
ξnc : Op → SubStoch . effects simply as some linear maps within FVectR ; that
(ii) There exists an ontological model (a.k.a. sim- is,
plex embedding) of Op, f ξe: Opf → SubStoch . (iii) VB VB
B E
There exists a positive quasiprobabilistic model of Op, χB
ej
f E
ej
Pei = and AM = ϕA , (71)
ξˆ+ : Op
f → QuasiSubStoch .
M Pei
VA VA
This generalizes the results of Refs. [2, 4, 5] from
prepare-measure scenarios to arbitrary compositional which relies on the isomorphism between vectors (or cov-
scenarios. ectors) in V and linear maps from R to V (resp. V to
R). Note that χB and ϕA are uniquely fixed by Eq. (71),
which means that there can be no other choice made for
4 Structure theorem the χA appearing in Eq. (69).
Next, we leverage empirical adequacy, that
With this framework in place, we can prove our main
results. We start with a general theorem, leveraging the E
e
f ⊂ FVectR , as stated in Theorem 2.8. We E
e
fact that Op = (72)
then specialize to the various physically relevant cases. Pe Pe
M

Theorem 4.1. Any convex-linear, empirically


for all Pe and E,
e together with the fact that they span the
adequate and diagram-preserving map (Eq. (8)),
vector space and dual, to show that
f → FVectR , where Op
M : Op f is tomographically local
A
can be represented as
ϕA
VB
VB = VA ; (73)
χB A
B χA
B
A
Te = Te , (69)
A A that is, that ϕA is the left inverse of χA .
χ−1
M
A Finally we consider the representation of the identity
VA as,
VA

where for each system A, χA : A → VA is a invertible linear


map within FVectR . Moreover, the χA are uniquely = , (74)
M
determined by Eq. (69).

Accepted in Q
uantum 2024-03-06, click title to verify. Published under CC-BY 4.0. 17
which is a consequence of diagram preservation, to prove Proposition 4.3. Any diagram-preserving quasiproba-
that, bilistic model of a tomographically local GPT can be writ-
VA ten as
RΛB
χA
RΛB
= A , (75) χB
VA B B
ϕA
Te = Te (77)
VA
A A
ξ̂
which means that ϕA is also the right inverse of χA , and χ−1
A
RΛ A
hence that it is unique such that we can write ϕA = χ−1 A . RΛA
This shows that the only freedom in the representation
is in representation of the states, via the choice of linear for invertible linear maps {χS : S → RΛS } within FVectR
maps χS , of the theory; after specifying these, one can for each system, where these satisfy
uniquely extend to the representation of arbitrary pro-
cesses. It also shows that the representation M is neces-
sarily invertible as we can always define the inverse of M χS = . (78)
by using the inverses of the χ’s.
One key consequence of this result is the following
corollary, whose significance we investigate in Section 4.3. Proof. Since ξˆ satisfies the requirements of Theorem 4.1
we immediately obtain Eq. (77). For the particular case
Corollary 4.2. The dimension of the codomain, VA of of the deterministic effect, we have that
the map χA is given by the dimension of the GPT vector
space A.
ξ̂
= χ−1
S . (79)
Proof. The linear map χA is invertible, so the dimension
of its domain and of its codomain must be equal, and its Recall that, by definition, a quasiprobabilistic model
domain is the GPT vector space. satisfies Eq. (65):
Note that the proof of the structure theorem and this
subsequent corollary do not require the full generality of = . (80)
ξ̂
diagram preservation, only the (mathematically) much
weaker conditions that: Combining these gives

Pei
Pei
M E
e E
e = χ−1
S . (81)
M
= , = , and = .
M
E
ej Pe Pe Composing both sides of this with χS gives Eq. (78).
E
ej M
M
M M

(76) The extra constraint of Eq. (78) is not part of the general
We will give justifications of these (for the case of onto- structure theorem because an abstract vector space does
logical models and quasiprobabilistic models) in Sec. 5, not have a natural notion of discarding. Such a privileged
and will discuss further consequences of general diagram notion is found within, for example, SubStoch, as the
preservation in Sec. 4.4. all ones covector, which represents marginalization.11
Since the χS are just invertible linear maps, this map
can be seen as merely transforming from one represen-
4.1 Diagram-preserving quasiprobabilistic models tation of the GPT to another. Critically, however, one
are exact frame representations must note that the vector spaces in QuasiSubStoch are
all of the form RΛ , and so they come equipped with ex-
As mentioned in the introduction, SubStoch and
tra structure—namely, a preferred basis and dual basis.
QuasiSubStoch are subprocess theories of FVectR ,
SubStoch ⊂ QuasiSubStoch ⊂ FVectR . This implies 11 Even within vector spaces with an associated physical interpre-
that our main theorem applies to these special cases. The tation of vectors as processes, there is not always a unique discard-
fact that the codomain is restricted can then equivalently ing map; for example, in the vector space of quantum channels, ap-
be expressed as a constraint on the linear maps χA . In plying the channel to any fixed input state and tracing the output
the case of quasiprobabilistic representations we obtain: constitutes a discarding operation.

Accepted in Q
uantum 2024-03-06, click title to verify. Published under CC-BY 4.0. 18
Hence, these representations are effectively singling out of χA means that
this preferred basis for the GPT.
e A′
D λ′ χA
X λ
To see this more explicitly, denote the preferred basis =
and cobasis for RΛ as λ,λ′ ∈ΛA FeλA λ χ−1
( ) ( ) A

and λ respectively, (82)
λ
λ∈Λ

λ∈Λ
=

which, in particular, means we can decompose identities as

X λ X λ′
= . (83) = δλλ′ , (89)
λ∈Λ λ λ,λ′ ∈ΛA λ

This means that, for any system A in the GPT, we can


and hence
write χA as:
e A′
D λ
λ
= δλλ′ . (90)
FeλA
λ λ
X X
χA = χA =: , (84) That is, {FeλA }λ defines a basis of VA and {D
e A′ }λ′ defines
eA λ
λ∈ΛA λ∈ΛA D ∗
λ a dual basis of VA .
We can then represent the action of ξˆ as:

As χA is invertible, then {De A }λ∈Λ is a cobasis for the RΛB RΛ B


λ A
e B′
D
GPT:   B X
λ
λ′

eA
 Te = Te (91)
D λ , (85) A λ∈ΛA ,λ′ ∈ΛB λ
  ξ̂ FeλA
λ∈ΛA RΛ A
RΛA

whereby condition (78) becomes which can be viewed as a quasistochastic map defined by
the conditional quasiprobability distribution
X eA
Dλ = . (86) e B′
D λ
λ∈ΛA
ˆ Te)(λ′ |λ) =
ξ( Te . (92)
FeλA

Similarly, we could run the same argument using χ−1


A Finally, we note that Proposition 4.3 also implies that
to single out a basis, {FeλA }λ∈ΛA for the GPT: any quasiprobability representation constructed using
 
  an overcomplete frame necessarily fail to be diagram-
A , (87) preserving.
 Fλ 
e
λ∈ΛA
4.1.1 Diagram-preserving quasiprobabilistic models of
quantum theory
where (by Eq. (81)) ∀λ,
We now consider the case of quantum theory as a GPT.
= 1. (88) The basis {Feλ }λ∈Λ is a basis for the real vector space
FeλA of Hermitian operators for the system while the cobasis,
{De λ }λ∈Λ is a basis for the space of linear functionals
on the vector space of Hermitian operators. The Riesz
Moreover, this decomposition, together with reversibility representation theorem [57] guarantees that every element

Accepted in Q
uantum 2024-03-06, click title to verify. Published under CC-BY 4.0. 19
D
e λ of the cobasis can be represented via the Hilbert- and where moreover every pair (χ−1
A ,χB ) defines a positive
Schmidt inner product with some Hermitian operator, map from the cone of transformations from A to B in the
e ∗ , such that for all ρ:
which we will denote as D GPT Op
f to the cone of substochastic maps from ΛA to ΛB
λ
in SubStoch.
D
e λ′

=D e ∗ ρ).
e λ ◦ρ = tr(D (93)
λ
ρ
The proof is given in Appendix D. Apart from positivity,
P e∗ the proof follows immediately from Proposition 4.3.
The condition in Eq. (86) becomes λ D λ = 1, the condi-
tion in Eq. (88) becomes tr(Fλ ) = 1, and the condition of
e One can interpret this map from the GPT to the ontolog-
Eq. (90) becomes ical model as an explicit embedding into a simplicial GPT
as discussed in [4], but generalized to the case in which
e λ∗ ′ Feλ ) = δλλ′ .
tr(D (94) both the GPT under consideration and the simplicial
It is clear, therefore, that {Feλ }λ and {D e ∗ }λ constitute GPT have arbitrary processes, not just states and effects.
λ
a minimal frame and its dual (in the language of, for This follows from the positivity conditions, empirical ad-
example, Refs. [3]). Hence, this representation is nothing equacy, and the preservation of the deterministic effect.
but an exact frame representation, that is, one which is not As in the case of quasiprobabilistic models, we can
overcomplete. That is, a transformation Te, represented write out χA as:
by a completely positive trace preserving map ETe, will
be represented as a quasistochastic map defined by the λ
X
conditional quasiprobability distribution: χA = , (98)
λ∈ΛA eA
D
ˆ Te)(λ′ |λ) = tr(D
ξ( e ∗ ′ E (Feλ )) (95) λ
λ T e
It is easy to see that any set of spanning and linearly where {D e A }λ forms a basis for the vector space of the
λ
independent vectors summing to identity will define a GPT defined by the operational theory. The positivity
suitable dual frame {D e ∗ }, and then the frame {Feλ } itself
λ condition for χA discussed above immediately implies
is uniquely defined by Eq. (94). (Note in particular that that De A is a linear functional which is positive on the
λ
the elements of the frame need not be pairwise orthogonal, state cone, and the normalization condition immediately
nor must those of the dual frame.) implies that their sum over λ is the deterministic effect.
It has previously been shown that all quasiprobabilistic By a similar argument, each FeλA is a vector in the vector
models of quantum theory are frame representations [3]. space of states which is positive on all GPT effects.
What we learn here is that diagram-preserving quasiproba-
In the case of a GPT which satisfies the no-restriction
bilistic models are necessarily the simplest possible frame
hypothesis [32] (e.g., quantum theory and the classical
representations, namely those that are not overcomplete.
probabilistic theory), this means that the D e A are effects
λ
A
(forming a measurement) and that the Fλ are states. In
e
4.2 Structure theorems for ontological models the quantum case, the notion of positivity that we have
expressed here reduces to positivity of the eigenvalues
In the case of ontological models of a GPT, we obtain:
of the Hermitian operators. This provides another im-
mediate proof that quantum theory, as a GPT, does not
Proposition 4.4. Any diagram-preserving ontological
admit an ontological model—it would require an exact
model of a tomographically local GPT can be written as
frame and dual frame for the space of Hermitian opera-
RΛ B tors which are all positive, but it is known that such a
RΛB basis and dual do not exist [3].
χB
B B We have shown (Prop 3.2) that every noncontextual
Te = Te , (96) ontological model of an operational theory is equivalent
A A to an ontological model of the GPT defined by the oper-
ξ̃
χ−1
A
ational theory. Combining this with proposition 4.4, it
RΛA
immediately follows that:
RΛ A

where

χA = (97) Corollary 4.5. For operational theories whose corre-


sponding GPT satisfies tomographic locality, any noncon-

Accepted in Q
uantum 2024-03-06, click title to verify. Published under CC-BY 4.0. 20
textual ontological model can be written as tuality rules out ontological excess baggage. Since Hardy
showed that all ontological models of a qubit must in fact
RΛB RΛB RΛB
have unbounded excess baggage, his result can immedi-
χB
B ately be combined with ours to give a new proof that the
T = T = T full statistics of processes on a qubit do not admit a non-

Aξ ∼ contextual model.
nc ξ̃
χ−1
A
RΛA RΛA In particular, our result implies that a diagram-
RΛA
preserving noncontextual ontological model of a qubit must
for have exactly 4 ontic states. This result extends to any sub-
RΛA theory of a qubit whose corresponding GPT is tomograph-
RΛ A λ ically local, e.g. the stabilizer subtheory. Hence, it consti-
X tutes a stringent constraint on ontological models of the
χA = ,
eA
qubit stabilizer subtheory qua operational theory. For in-
A λ∈ΛA Dλ
stance, it immediately guarantees that the 8-state model of
A
Ref. [56]—which, as the name suggests, has 8 ontic states—
e A }λ forms a basis for the vector space of the
where {D must be contextual. Indeed, the 8-state model was pre-
λ
GPT defined by the operational theory. viously shown to be contextual by a different argument
which focused on the representation of transformation
Previously, the notion of a noncontextual ontological procedures in prepare-transform-measure scenarios [20].
representation seemed to be a highly flexible concept, but
Furthermore, our bound improves an algorithm first
this corollary demonstrates that it in fact has a very rigid
proposed in Ref. [4]. In particular, Ref. [4] gave an algo-
structure. Every noncontextual ontological model can be
rithm for determining if a GPT admits of an ontological
constructed in two steps: i) quotient to the associated
model by testing whether or not the GPT embeds in a sim-
GPT, ii) pick a basis for the GPT such that it is manifestly
plicial GPT of arbitrary dimension. The lack of a bound on
an ontological model. Furthermore, the only freedom in
this dimension means that there is no guarantee that the
the representation is in representation of the states in the
algorithm will ever terminate. Ref. [58] solves this problem
theory (via this choice of basis); after specifying this, one
by providing such a bound, namely, the square of the given
can uniquely extend to the representation of arbitrary
GPT’s dimension. Our result strengthens this bound, re-
processes.
ducing it to the given GPT’s dimension. In fact, our bound
is tight, as there can never be an embedding of the GPT
4.3 Consequences of the dimension bound into a lower dimensional space. These results simplify the
algorithm dramatically: rather than testing for embed-
We can specialize Corollary 4.2 to the case of quasiproba- ding in a sequence of simplicial GPTs of increasing dimen-
bilistic representations or ontological representations of sion, one can simply perform a single test for embedding in
GPTs. In this case, it states that the dimension of the a simplicial GPT of the same dimension as the given GPT.
GPT vector space for a given system A, dim(A), is equal
to the dimension of the codomain of the map χA defining Yet another application of the dimension bound follows
the quasiprobabilistic or ontological representation of A, from the results of Ref. [59]. Ref. [59] demonstrates that
that is, the dimension of RΛA . In the case of an ontologi- the number of classical bits required to specify the ontic
cal model, this dimension is simply |ΛA |, the cardinality state in any (necessarily contextual) ontological model
of the set ΛA of ontic states of A, so the number of ontic of the qubit stabilizer subtheory is quadratic in the num-
states is equal to the dimension of the GPT space. More- ber of qubits. This is contrasted to the case of the qutrit
over, by considering Proposition 3.2, this immediately stabilizer subtheory, wherein there exists a (noncontex-
implies that for any operational theory whose correspond- tual) ontological model with linear scaling in the number
ing GPT satisfies tomographic locality, if there exists a of qutrits. The quadratic scaling result for the stabilizer
noncontextual ontological model thereof, then it must also qubit subtheory implies that, for a collection of qubits,
have a number of ontic states equal to the dimension of the number of ontic states is necessarily greater than the
the GPT state space. In the language of Hardy [39], this dimension of the space of quantum density operators. To-
exactly means, for each system A, that the “ontological gether with our dimension bound, this fact is sufficient to
excess baggage factor”, deduce the contextuality of the qubit stabilizer subthe-
ory. Moreover, the fact that there exists a noncontextual
|ΛA |
γA := , (99) ontological model for the qutrit stabilizer subtheory [60],
dim(A) together with our dimension bound, is sufficient to de-
must be exactly 1. In other words, demanding noncontex- duce the linear scaling in this case.

Accepted in Q
uantum 2024-03-06, click title to verify. Published under CC-BY 4.0. 21
4.4 Diagram preservation implies ontic separabil- Proof. To begin, let us define
ity and more VA VB
Returning to representations of a GPT by some χAB (101)
f → FVectR satisfying the conditions of Theo-
M : Op A B
rem 4.1: if we make use of additional instances of diagram
preservation beyond the three instances which we used via
in proving Theorem 4.1, then we can derive additional
χAB = . (102)
constraints on the representation.
One important and immediate consequence of diagram Pe
Pe M
preservation for composite systems is that the composite
system AB is represented by the tensor product of the Next, applying this to a product state we have
representations of the components: VAB = VA ⊗ VB . In
the special cases of quasiprobabilistic representations, χAB = . (103)
this means that RΛAB = RΛA ⊗ RΛB = RΛA ×ΛB , which
Pe Pe′ Pe Pe′
in turn means that ΛAB = ΛA ×ΛB . That is, the sample M
space of a composite system is the Cartesian product of
Since M is diagram-preserving, we have
the sample spaces of the components.
This constraint has particular significance if we con-
= . (104)
f → SubStoch ,
sider the case of ontological models, ξe: Op
Pe Pe′ Pe Pe′
as this means that for an ontological model, the ontic state M M M
space of a composite system is the Cartesian product of
Recalling the definition of χS again, we conclude that
the ontic state spaces of the components. We term the
latter condition ontic separability (See Refs. [61, 62]). It
χAB = χA χB . (105)
is a species of reductionism, asserting, in effect, that com-
posite systems have no holistic properties. More precisely, Pe Pe′ Pe Pe′
the property ascriptions to composite systems are all and
only the property ascriptions to their components. Yet In a tomographically local GPT, the product states span
another way of expressing the condition is that the proper- the entire state space, and so this implies Eq. (100).
ties of the whole supervene on the properties of the parts.
The assumption of ontic separability for ontological
models has been discussed in many prior works [61, 62],
and has been a substantive assumption in certain argu-
ments. For instance, in Ref. [63], ontic separability was Now, if we consider the case of quasiprobabilistic repre-
used to demonstrate that in a noncontextual ontological sentations, ξˆ: Op
f → QuasiSubStoch , we obtain that:
model, all and only projective measurements are repre-
sented outcome-deterministically. RΛA RΛB RΛA RΛ B
It is also worth noting that the assumption of prepara- χAB = χA χB . (106)
tion independence in the PBR theorem [64] follows from
diagram preservation (e.g. Eq. (105)). This connection A B A B
between PBR and preservation of compositional structure
has been previously explored in Sec. 4 of [36], in which Given Eq. (84), this is equivalent to
they use this connection to derive a categorical version of RΛA RΛ B RΛA RΛB
the PBR theorem. λ λ′
λ ′
λ′′

Moreover, by considering parallel composition we can X X


also obtain the following: = .
D AB
λ∈ΛA ,λ′ ∈ΛB ′) λ′ ∈ΛA ,λ′′ ∈ΛB
e(λ,λ e A′ e B′′
D λ D λ
Proposition 4.6. Via diagram preservation, parallel
composition implies an additional constraint on the linear A B A B
maps, χA , namely, (107)

VA VB VA VB That is, diagram preservation implies that the frame


representation must factorize across subsystems. In other
χAB = χA χB . (100) words, the vector basis defining the frame representation
A B A B must be a product basis.

Accepted in Q
uantum 2024-03-06, click title to verify. Published under CC-BY 4.0. 22
4.5 Converses to structure theorems representations: one simply needs to choose a family of
invertible linear maps for each fundamental system (i.e.,
In the above section we showed how all diagram-preserving one that cannot be further decomposed into subsystems),
quasiprobabilistic and ontological representations must and then define the others as the tensor product of these
have a particularly simple form given by a collection of (so that general χA factorise over subsystems). Similarly,
invertible linear maps {χA } satisfying certain constraints. it provides a simple recipe for constructing quasiproba-
We now prove what is essentially the converse to each of bilistic representations, using the same construction but
these results. where the χA preserve the deterministic effect. In the
Consider defining a map from Op f to FVectR by
case of noncontextual ontological models, however, the
VB recipe is less simple: one must not only choose invertible
linear maps which factorise over subsystems and preserve
χB
the deterministic effect; one must also check the positiv-
B B
ity condition (which is nontrivial, since for any particular
Te 7→ Te . (108)
map χA , one must check the condition for every χB ).
A A
χ−1
A

VA
4.6 Categorical reformulation
Under what conditions on the set {χA } is this map a There is an elegant categorical reframing of our structure
quasiprobabilistic or ontological representation? theorem, as suggested to us by one of the referees:
To ensure that Eq. (108) defines a diagram-preserving Theorem 4.7 (IV.1 categorical version). Any convex-
map one must simply impose that: linear, empirically adequate and diagram-preserving map
VA VB VA VB (Eq. (8)), M : Opf → FVectR , where Op f is tomographi-
χAB = χA χB . (109) cally local, is naturally isomorphic to the canonical rep-
resentation R : Opf → FVectR defined in Theorem 2.8.
A B A B
Moreover, this natural isomorphism M =⇒ R is unique.
This condition, together with invertibility and linearity
of the χA , easily implies that diagram preservation and Proof. The components of the natural isomorphism are
indeed all the assumptions of Theorem 4.1 are satisfied. given by the χA , as these go from A → VA = M (A),
To ensure that Eq. (108) defines a linear representation and where we are abusing notation by denoting R(A)
that is moreover a quasiprobabilistic representation, as simply as A. Eq. (69) ensures that these do define a
in Def. 2.11, one must impose that VA = RΛA and that natural transformation. To see this, first recall that we
are abusing notation by suppressing explicitly notating
RΛ A the canonical representation R, so Eq. (69) really tells
χA = , (110) us that M (Te) = χB ◦ R(Te) ◦ χ−1
A A which is equivalent to
A
M (T ) ◦ χA = χB ◦ R(T ), which is what we need for this
e e
which implies that the conditions in Def. 2.11 are satisfied. to be a natural transformation. Clearly it is moreover a
Finally, to ensure that Eq. (108) defines a quasiproba- natural isomorphism as every χA is invertible (Eqs. (73)
bilistic representation that is moreover an ontological rep- and (75)). Eq. (100) then ensures that this is a monoidal
resentation, as in Def. 2.10, one must introduce a positivity natural isomorphism. To see this, note that the LHS of
constraint for this map. Specifically, one must have that this equation is not quite the component of the natural
isomorphism χA⊗B , instead we have χAB := µ−1 ◦χA⊗B ◦
RΛB RΛ B
µ, and so Eq. (100) is equivalent to the condition that
χB χB χA⊗B ◦µ = µ◦(χA ⊗χB ), which is exactly what we need
B B B for this to be a monoidal natural isomorphism. Uniqueness
:: Te 7→ Te (111) of this natural isomorphism follows from the fact that
A A A the components χA are uniquely determined, as noted in
χ−1
A χ−1
A Theorem 4.1.
RΛA RΛ A
This means that ontological models, should they ex-
defines a positive map from the cone of transformations ist, are essentially unique, in that they are unique up to
from A to B in Opf to the cone of stochastic maps from a unique natural isomorphism. There is, however, an
ΛA to ΛB . important subtlety going on here. This natural isomor-
This provides a simple recipe for constructing linear phism is given by viewing ontological models as living in

Accepted in Q
uantum 2024-03-06, click title to verify. Published under CC-BY 4.0. 23
FVectR , and so the components of the natural isomor- Of special note among those that satisfy the assumption
phism are just invertible real linear maps. is Gross’ discrete Wigner representation [69], which is the
Alternatively, however, one could demand a stricter unique representation in this family that satisfies a natu-
notion of isomorphism between ontological models by ral covariance property. Ref. [19] further shows that this
viewing them as living in SubStoch, in which case the is the unique noncontextual ontological model for stabi-
components of a natural isomorphism would be invertible lizer subtheories in odd dimensions.
stochastic maps, i.e., permutations of the ontic states. Although we endorse diagram preservation in its most
This is a much stricter notion of isomorphism, and it is general form, it is worth noting that our main results
likely that in this sense there are many different ontological (given in Section 4) require only the following very specific
models for a given GPT. In certain situations, however, instances of that assumption:
even this stricter notion can be proven, see, for example, (i) diagram preservation for prepare-measure scenarios,
Ref. [19].
E
e E
e
ξ̃
ξe:: 7→ . (112)
5 Revisiting our assumptions Pe Pe
ξ̃
We have derived surprisingly strong constraints on the
form of noncontextual ontological models of operational and diagram preservation for measure-and-reprepare
theories, and so it is important to examine the assump- processes
tions that went into deriving these constraints. These
concerned both the types of operational theories under Pe
Pe
consideration and the types of ontological representations ξ̃
ξe:: 7→ , (113)
of these, and were summarized in Section 1.2. The major-
ity of these are ubiquitous and well-motivated. The only E
e
E
e
ξ̃
notable restriction on the scope of operational theories
we consider is the one induced by our assumption of to-
mographic locality (as discussed further in Section 5.2). (ii) diagram preservation for the identity process
Similarly, the only notable restriction on ontological (and
quasiprobabilistic) models that is warranting of further ξe:: 7→ . (114)
A ΛA
discussion is that they are diagram-preserving.
These are easily justified.
5.1 Revisiting diagram preservation Eq. (112) captures the idea that the ontic state is the
complete causal mediary between the preparation and the
In the case of ontological models, we will provide below effect. This assumption is built into the very definition
a motivation for the instances of diagram preservation of the standard ontological models framework (implicitly
that we required for our proofs. Since we have defined in early work [1, 62] and explicitly in later work [70, 71]),
quasiprobabilistic models as representations of opera- and is assumed in virtually every past work on ontological
tional theories wherein the only difference from an onto- models.
logical model is that the probabilities are allowed to be- Eq. (113) is a similarly natural assumption. The natural
come quasiprobabilities (i.e., drawn from the reals rather view of the process
than the interval [0,1]), it follows that these same motiva-
tions are also applicable to them. It is worth noting that
among the quasiprobabilistic representations for continu- Pe
ous variable quantum systems that are most studied in (115)
the literature, the Wigner representation satisfies our defi-
E
nition12 while the Q [65] and P representations [66, 67] do
e

not (as they are defined by overcomplete frames). There


are also examples of both types among quasiprobabilistic
representations of finite-dimensional systems in quantum is that one has observed effect E
e and then one has indepen-
theory. In particular, Ref. [68] defines a family of discrete dently implemented the preparation Pe. There need not be
Wigner representations, some of which satisfy the assump- any system acting as a causal mediary between E e and Pe.
tion of diagram preservation, and some of which do not. The natural ontological representation, therefore, is one
wherein there is no ontic state mediating the two processes,
12 Strictlyspeaking, one would need to generalize our definition as depicted in Eq. (113). Although we are not aware of
to the case of infinite-dimensional GPTs. this assumption having been made in previous works, it

Accepted in Q
uantum 2024-03-06, click title to verify. Published under CC-BY 4.0. 24
is directly analogous to the preparation-independence as- is still noncontextual when restricted to this subtheory.
sumption made in Ref. [64] (which involved two indepen- But now our structure theorem does not hold, because
dent states, rather than an independent effect and state). this model still uses 9n ontic states even though the den-
Eq. (114) can be justified by noting that within the sity matrices now live in a 12 3n (3n +1)-dimensional space.
equivalence class of procedures associated to the identity Moreover, we can show that this sort of example is
operation in the GPT, there is the one which corresponds rather generic, that is, that there is no hope of obtaining
to waiting for a vanishing amount of time. In any reason- a structure theorem with our dimension bound for any
able physical theory, no evolution is possible in vanishing theory that is not tomographically local.
time, and hence the only valid ontological representation Suppose that we have any ontological representation
of such an equivalence class of procedures is the identity ξe of some GPT wherein each GPT system of dimension
map on the ontic state space. d is represented by an ontic state space of cardinality d.
Because we consider the full assumption of diagram Then the GPT is necessarily tomographically local.
preservation to be a natural generalization of all of these To see this, consider the representation of GPT trans-
specific assumptions, we have endorsed it in our definitions. formations from this state space to itself. Then, the map
See Appendix B of Ref. [35] for a more thorough defense ξe is a linear map from the space of transformations to
of this full assumption. d×d substochastic matrices, which are d2 dimensional.
By empirical adequacy ξ is injective, and so the space of
5.2 Necessity of tomographic locality GPT transformations is at most d2 dimensional. But the
effect-prepare channels already span d2 dimensions, so
The assumption of tomographic locality is common in there cannot be any channels outside this span. Hence,
the GPT literature, so we will not attempt a defense of by Corollary 2.7, the theory is tomographically local.
it here. Nevertheless, it is natural to ask if the assump-
tion is actually necessary to obtain our structure theo-
rems. Here we provide an example which shows that it 6 Outlook
is. The operational theory we consider in our example is
the real-amplitude version of the qutrit stabilizer subthe- These results can be directly applied to the study of con-
ory of quantum theory13 . In this subtheory, two systems textuality in specific scenarios and theories. For instance,
are described by 45 parameters, whereas only 62 = 36 pa- we have already seen that our dimension bound is a use-
rameters are available from local measurements, which ful tool for obtaining novel proofs of contextuality (e.g.,
immediately implies that the theory is not tomographi- via Hardy’s ontological excess baggage theorem [39] or
cally local (just as the real-amplitude version of the full for the 8-state model of Ref. [56]), and for providing novel
quantum theory fails to be tomographically local [52]). algorithms for deriving noise-robust noncontextuality in-
To begin with, consider the standard (complex- equalities (namely, the algorithm in Ref. [4] but informed
amplitude) qutrit stabilizer subtheory. Gross’s discrete by our dimension bound). It remains to be seen whether
Wigner function [69] provides a (diagram-preserving) other algorithms for witnessing nonclassicality, such as
quasiprobability representation for qutrits for which the those in Ref. [71] or Ref [70], could be extended within
stabilizer subtheory is positively represented. By Corol- our framework to more general compositional scenarios.
lary 3.5, this corresponds to a noncontextual ontological
Our formalism is also ideally suited to understanding
model of the subtheory. Indeed, this ontological model
the information-theoretic advantages afforded by con-
has been examined in Ref. [60], where it is shown that
textual operational theories, such as for computational
it can be reconstructed from an “epistemic restriction”.
speedup, since it has the compositional flexibility to de-
Since the standard qutrit stabilizer subtheory is tomo-
scribe arbitrary scenarios, such as families of circuits
graphically local, these models obey our structure theo-
which arise in the gate-based model of computation. In
rems. In particular, the representation of n qutrits uses
fact, our structure theorem is a major first step in sim-
9n ontic states, matching the dimension of the relevant
plifying the proof that contextuality is a necessary re-
space of density matrices.
source for the state-injection model of quantum comput-
Now consider the subtheory consisting of only those
ing [19, 72]. Ref. [19] shows that such a proof can pro-
qutrit stabilizer procedures that can be represented using
ceed by applying our structure theorem to show that the
real amplitudes. This does not introduce any new oper-
only positive quasiprobabilistic models of the (classically-
ational equivalences, and so the model discussed above
simulable) stabilizer subtheory for odd dimensions are
13 Note that in order for this to be an operational theory according given by Gross’s discrete Wigner function [69]; then, the
to our definitions herein, we are here referring to the ‘convexified’ known fact that the injected resource states necessarily
stabilizer subtheory in which all convex combinations of, for example, have negative representation in this particular model es-
the pure stabilizer states are permitted. tablishes the result in a direct and elegant fashion.

Accepted in Q
uantum 2024-03-06, click title to verify. Published under CC-BY 4.0. 25
The key limitation of our results is the assumption that [3] C. Ferrie and J. Emerson, J. Phys. A: Math. Theor.
the GPT associated to the operational theory under con- 41, 352001 (2008).
sideration is tomographically local. There are two poten- [4] D. Schmid, J. H. Selby, E. Wolfe, R. Kunjwal, and
tial approaches to dealing with this limitation. On the R. W. Spekkens, PRX Quantum 2, 010331 (2021).
one hand, one could provide an argument that theories [5] F. Shahandeh, PRX Quantum 2, 010330 (2021).
which are not tomographically local are undesirable in [6] J. H. Selby, D. Schmid, E. Wolfe, A. B. Sainz,
some principled sense. For example, it seems likely that R. Kunjwal, and R. W. Spekkens, Phys. Rev. Lett.
one can rule them out on the grounds that they violate 130, 230201 (2023).
Leibniz’s methodological principle [10]. From a practical [7] J. H. Selby, D. Schmid, E. Wolfe, A. B. Sainz,
perspective, wherein the goal is to experimentally ver- R. Kunjwal, and R. W. Spekkens, Phys. Rev. A
ify nonclassicality in a theory-independent manner, one 107, 062203 (2023).
would instead be motivated to seek experimental evidence [8] J. S. Bell, Physics 1, 195 (1964).
that nature truly satisfies tomographic locality, indepen- [9] N. Brunner, D. Cavalcanti, S. Pironio, V. Scarani,
dent of the validity of quantum theory. One possible ap- and S. Wehner, Rev. Mod. Phys. 86, 419 (2014).
proach to this end would be to extend the techniques in- [10] R. W. Spekkens, arXiv:1909.04628 [physics.hist-ph]
troduced in [73] to composite systems. (2019).
[11] M. D. Mazurek, M. F. Pusey, R. Kunjwal, K. J.
Resch, and R. W. Spekkens, Nat. Commun. 7,
Acknowledgements 11780 (2016).
[12] R. W. Spekkens, D. H. Buzacott, A. J. Keehn,
We thank Matt Leifer, Lorenzo Catani, Shane Mansfield,
B. Toner, and G. J. Pryde, Phys. Rev. Lett. 102,
Martti Karvonen, Alex Wilce, and our reviewers for use-
010401 (2009).
ful discussions. We thank Giulio Chiribella for useful dis-
[13] A. Chailloux, I. Kerenidis, S. Kundu, and J. Sikora,
cussions on quotiented operational theories and their re-
New J. Phys. 18, 045003 (2016).
lation to GPTs. RWS thanks Bob Coecke for early discus-
[14] A. Ambainis, M. Banik, A. Chaturvedi,
sions on the topic of understanding noncontextuality in
D. Kravchenko, and A. Rai, Quant. Inf. Process.
a category-theoretic way. JHS thanks Lucien Hardy for
18, 111 (2019).
many discussions on tomographic locality and the duoten-
[15] D. Saha, P. Horodecki, and M. Pawłowski, New J.
sor formalism during PSI and since, and thanks John van
Phys. 21, 093057 (2019).
de Wetering for helpful discussions on his work on quasis-
[16] D. Saha and A. Chaturvedi, Phys. Rev. A 100,
tochastic representations of quantum theory. JHS also
022108 (2019).
thanks Matt Wilson for his help on understanding the re-
[17] D. Schmid and R. W. Spekkens, Phys. Rev. X 8,
lationship between diagram preserving maps and strong
011015 (2018).
monoidal functors. We also acknowledge the First Perime-
ter Institute-Chapman Workshop on Quantum Founda- [18] M. Lostaglio and G. Senno, Quantum 4, 258 (2020).
tions (PIMan) as the venue that spawned this collabo- [19] D. Schmid, H. Du, J. H. Selby, and M. F. Pusey,
rative project. DS was supported by a Vanier Canada arXiv:2101.06263 (2021).
Graduate Scholarship. MFP is supported by the Royal [20] P. Lillystone, J. J. Wallman, and J. Emerson, Phys.
Commission for the Exhibition of 1851. JHS and DS are Rev. Lett. 122, 140405 (2019).
supported by the National Science Centre, Poland (Opus [21] M. S. Leifer and R. W. Spekkens, Phys. Rev. Lett.
project, Categorical Foundations of the Non-Classicality 95, 200405 (2005), arXiv:quant-ph/0412178 .
of Nature, project no. 2021/41/B/ST2/03149). This re- [22] M. F. Pusey and M. S. Leifer, in Proceedings of the
search was supported by Perimeter Institute for Theoreti- 12th International Workshop on Quantum Physics
cal Physics. Research at Perimeter Institute is supported and Logic, Electron. Proc. Theor. Comput. Sci.,
in part by the Government of Canada through the De- Vol. 195 (2015) pp. 295–306.
partment of Innovation, Science and Economic Develop- [23] M. F. Pusey, Phys. Rev. Lett. 113, 200401 (2014).
ment Canada and by the Province of Ontario through the [24] R. Kunjwal, M. Lostaglio, and M. F. Pusey, Phys.
Ministry of Colleges and Universities. Rev. A 100, 042116 (2019).
[25] B. Coecke and A. Kissinger, in Categories for the
Working Philosopher, edited by E. Landry (Oxford
References University Press, 2017) pp. 286–328.
[26] B. Coecke and A. Kissinger, Picturing Quantum Pro-
[1] R. W. Spekkens, Phys. Rev. A 71, 052108 (2005). cesses: A First Course in Quantum Theory and Dia-
[2] R. W. Spekkens, Phys. Rev. Lett. 101, 020401 grammatic Reasoning (Cambridge University Press,
(2008). 2017).

Accepted in Q
uantum 2024-03-06, click title to verify. Published under CC-BY 4.0. 26
[27] J. H. Selby, C. M. Scandolo, and B. Coecke, Quan- [47] M. A. Nielsen and I. L. Chuang, Quantum Compu-
tum 5, 445 (2021). tation and Quantum Information (Cambridge Uni-
[28] S. Gogioso and C. M. Scandolo, in Proceedings of the versity Press, 2010).
14th International Workshop on Quantum Physics [48] D. Schmid, K. Ried, and R. W. Spekkens, Phys. Rev.
and Logic, Electron. Proc. Theor. Comput. Sci., A 100, 022112 (2019).
Vol. 266 (2018) pp. 367–385. [49] M. Appleby, C. A. Fuchs, B. C. Stacey, and H. Zhu,
[29] L. Hardy, arXiv:quant-ph/0101012 (2001). Eur. Phys. J. D 71, 197 (2017).
[30] J. Barrett, Phys. Rev. A 75, 032304 (2007).
[31] L. Hardy, arXiv:1104.2066 [quant-ph] (2011).
[32] G. Chiribella, G. M. D'Ariano, and P. Perinotti, Phys. Rev. A 81, 062348 (2010).
[33] G. Chiribella, G. M. D'Ariano, and P. Perinotti, Phys. Rev. A 84, 012311 (2011).
[34] G. Chiribella, G. M. D'Ariano, and P. Perinotti, in Quantum Theory: Informational Foundations and Foils (Springer, 2016) pp. 171–221.
[35] D. Schmid, J. H. Selby, and R. W. Spekkens, arXiv:2009.03297 (2020).
[36] A. Gheorghiu and C. Heunen, in Proceedings of the 16th International Workshop on Quantum Physics and Logic, Electron. Proc. Theor. Comput. Sci., Vol. 318 (2020) pp. 196–212.
[37] J. van de Wetering, in Proceedings of the 14th International Workshop on Quantum Physics and Logic, Electron. Proc. Theor. Comput. Sci., Vol. 266 (2018) pp. 179–196.
[38] C. Ferrie and J. Emerson, New J. Phys. 11, 063040 (2009).
[39] L. Hardy, Stud. Hist. Phil. Mod. Phys. 35, 267 (2004).
[40] P.-A. Melliès, in International Workshop on Computer Science Logic (Springer, 2006) pp. 1–30.
[41] G. Chiribella, G. M. D'Ariano, and P. Perinotti, Phys. Rev. Lett. 101, 060401 (2008).
[42] G. Chiribella, G. M. D'Ariano, and P. Perinotti, EPL (Europhysics Letters) 83, 30004 (2008).
[43] M. Wilson and G. Chiribella, in Proceedings 18th International Conference on Quantum Physics and Logic, Gdansk, Poland, and online, 7-11 June 2021, Electron. Proc. Theor. Comput. Sci., Vol. 343, edited by C. Heunen and M. Backens (Open Publishing Association, 2021) pp. 265–300.
[44] T. Fritz and P. Perrone, in Proceedings of the Thirty-Fourth Conference on the Mathematical Foundations of Programming Semantics (MFPS XXXIV), Electron. Notes Theor. Comput. Sci., Vol. 341 (2018) pp. 121–149.
[45] S. Mac Lane, Categories for the working mathematician, Vol. 5 (Springer Science & Business Media, 2013).
[46] G. Chiribella, in Proceedings of the 11th Workshop on Quantum Physics and Logic, Electron. Notes Theor. Comput. Sci., Vol. 172 (2014) pp. 1–14.
[50] R. W. Spekkens, Phys. Rev. A 75, 032110 (2007).
[51] D. Gottesman, in 22nd International Colloquium on Group Theoretical Methods in Physics (1999) pp. 32–43, arXiv:quant-ph/9807006.
[52] L. Hardy and W. K. Wootters, Found. Phys. 42, 454 (2012).
[53] N. Harrigan, T. Rudolph, and S. Aaronson, arXiv:0709.1149 (2007).
[54] R. W. Spekkens, Noncontextuality: how we should define it, why it is natural, and what to do about its failure (2017), PIRSA:17070035.
[55] E. G. Beltrametti and S. Bugajski, J. Phys. A 28, 3329 (1995).
[56] J. J. Wallman and S. D. Bartlett, Phys. Rev. A 85, 062121 (2012).
[57] F. Riesz, in Annales scientifiques de l'École Normale Supérieure, Vol. 31 (1914) pp. 9–14.
[58] V. Gitton and M. P. Woods, Quantum 6, 732 (2022).
[59] A. Karanjai, J. J. Wallman, and S. D. Bartlett, arXiv:1802.07744 (2018).
[60] R. W. Spekkens, in Quantum Theory: Informational Foundations and Foils, edited by G. Chiribella and R. W. Spekkens (Springer Netherlands, Dordrecht, 2016) pp. 83–135.
[61] R. W. Spekkens, The paradigm of kinematics and dynamics must yield to causal structure, in Questioning the Foundations of Physics: Which of Our Fundamental Assumptions Are Wrong?, edited by A. Aguirre, B. Foster, and Z. Merali (Springer International Publishing, Cham, 2015) pp. 5–16.
[62] N. Harrigan and R. W. Spekkens, Found. Phys. 40, 125 (2010).
[63] R. W. Spekkens, Found. Phys. 44, 1125 (2014).
[64] M. F. Pusey, J. Barrett, and T. Rudolph, Nat. Phys. 8, 475 (2012).
[65] K. Husimi, Proc. Physico-Mathematical Soc. Jpn. 3rd Series 22, 264 (1940).
[66] R. J. Glauber, Phys. Rev. 131, 2766 (1963).
[67] E. C. G. Sudarshan, Phys. Rev. Lett. 10, 277 (1963).
[68] K. S. Gibbons, M. J. Hoffman, and W. K. Wootters, Phys. Rev. A 70, 062101 (2004).
[69] D. Gross, J. Math. Phys. 47, 122107 (2006).
[70] A. Krishna, R. W. Spekkens, and E. Wolfe, New J. Phys. 19, 123031 (2017).
[71] D. Schmid, R. W. Spekkens, and E. Wolfe, Phys. Rev. A 97, 062103 (2018).
[72] M. Howard, J. Wallman, V. Veitch, and J. Emerson, Nature 510, 351 (2014).
[73] M. D. Mazurek, M. F. Pusey, K. J. Resch, and R. W. Spekkens, PRX Quantum 2, 020302 (2021).
A Context-dependence in representations of GPTs


In the main text, we stated that the notion of an ontological model of a GPT that we have defined cannot be said to be
either generalized-contextual or generalized-noncontextual (unlike our notion of an ontological model of an operational
theory). We will now elaborate on this point.
Consider the contexts that one may wish to associate to a GPT state. One of the examples which appears in the
literature corresponds to different decompositions of the GPT state into mixtures of other GPT states, for example:
$$\sum_i p_i\, s_i \;=\; s \;=\; \sum_j q_j\, s'_j. \qquad (116)$$

Now consider any ontological representation map which has the GPT as its domain. In the GPT, all three terms in
Eq. (116) are strictly equal, and hence all three map to the same probability distribution over Λ. As such, there is no
possibility for the map to represent an s arising from the LHS mixture differently from how it represents an s arising
from the RHS mixture.
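As a concrete illustration (ours, not drawn from the main text): in quantum theory, the maximally mixed state of a qubit admits the two distinct convex decompositions
$$\tfrac{1}{2}|0\rangle\langle 0| + \tfrac{1}{2}|1\rangle\langle 1| \;=\; \tfrac{1}{2}\mathbb{1} \;=\; \tfrac{1}{2}|+\rangle\langle +| + \tfrac{1}{2}|-\rangle\langle -|\,,$$
so the ensembles $\{(\tfrac{1}{2},|0\rangle\langle 0|),(\tfrac{1}{2},|1\rangle\langle 1|)\}$ and $\{(\tfrac{1}{2},|+\rangle\langle +|),(\tfrac{1}{2},|-\rangle\langle -|)\}$ are distinct, while the GPT state they define is one and the same density operator; any representation map whose domain is the set of density operators is therefore blind to the difference.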
A natural question one might consider in light of this is how one should represent ensembles of states ontologically.
The ensembles of relevance in the example just given are $\{(p_i, s_i)\}_i$, $\{(q_j, s'_j)\}_j$, and $\{(1, s)\}$; all of these are operationally
equivalent:
$$\{(p_i, s_i)\}_i \;\sim\; \{(1, s)\} \;\sim\; \{(q_j, s'_j)\}_j, \qquad (117)$$
but not strictly equal. If one defines a new kind of ontological representation map which acts on such objects, then it
could take these distinct ensembles to distinct probability distributions over Λ. One could then meaningfully talk about
whether such a representation depended on context or not.
However, the notion of ontological representation for a GPT that we have defined herein has as its domain processes
within the GPT (such as states), not ensembles of such processes. This is also true for the more general quasistochastic
representations of GPTs. As such, applying the notion of generalized contextuality to them is a category mistake, just
as it would be a category mistake to ask whether a variable X depends on another variable Y if Y cannot possibly
vary [4]. Because standard quasiprobability representations (such as Wigner’s or Gross’s) are instances of our definition
(and in particular, because they take the domain of the representation to be states and effects rather than ensembles of
states or ensembles of effects), it is equally meaningless to ask whether they are noncontextual or contextual.
Of course, one could define a map which has as its domain the set of ensembles of GPT processes. For such a map, it
would be appropriate to ask whether or not the map is noncontextual. This is similar to what is done in the causal-
inferential framework of Ref. [35], where the central objects of study are ensembles of processes corresponding to an
agent’s knowledge of what process occurred (although with the difference that in this case we consider ensembles of
unquotiented processes). In that context, we formalize the resulting notion of an ontological representation, as well as
the natural generalization of the notion of ‘noncontextuality’ that arises for it.
A similar story holds for the notion of context that is relevant for the study of Kochen-Specker contextuality. Consider
two measurements, M1 and M2 , which we conceptualize as processes with a GPT input and a classical output. Suppose
that these each have a particular outcome, labeled a and b respectively, which correspond to the same GPT effect:
$$[M_1 \,|\, a] \;=\; [M_2 \,|\, b]. \qquad (118)$$
The fact that the effect associated to getting outcome a in measurement M1 is strictly equal to the effect associated
to getting outcome b in measurement M2 implies that any map which has the GPT effect space as its domain must
represent the two cases identically. Again, one finds that there is no possibility for a representation map with this
specific choice of domain to depend on whether or not the effect was realized using measurement M1 or M2 .
But, also as above, one could choose to consider a different kind of ontological representation map in which the
domain is no longer the set of GPT processes per se, but something else, which includes, for instance, measurement-
outcome pairs. In this particular case, we are interested in pairs $(M_1, a)$ and $(M_2, b)$ which are operationally equivalent,
$$(M_1, a) \;\sim\; (M_2, b), \qquad (119)$$
but not strictly equal. If one defines a new kind of ontological representation map which acts on such objects, then
it could take distinct such objects to distinct response functions. One could then meaningfully talk about whether
such a representation depended on context or not. This is what is typically done (if only implicitly) in the study of
Kochen-Specker noncontextuality.
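For concreteness, here is a standard example of ours (not one discussed in the main text): in a three-dimensional quantum system, the projector $|0\rangle\langle 0|$ occurs as an outcome-effect in two different projective measurements,
$$M_1 = \{\,|0\rangle\langle 0|,\; |1\rangle\langle 1|,\; |2\rangle\langle 2|\,\} \qquad\text{and}\qquad M_2 = \Big\{\,|0\rangle\langle 0|,\; \tfrac{1}{2}(|1\rangle{+}|2\rangle)(\langle 1|{+}\langle 2|),\; \tfrac{1}{2}(|1\rangle{-}|2\rangle)(\langle 1|{-}\langle 2|)\,\Big\}.$$
The two measurement-outcome pairs singling out $|0\rangle\langle 0|$ are operationally equivalent in the sense of Eq. (119) but not identical, and a Kochen-Specker-noncontextual representation is precisely one that assigns them the same response function.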

B Proof of the structure theorem (Theorem 4.1)


We now complete the proof of Theorem 4.1, as sketched in the main text.
Proof. Since we are assuming tomographic locality of the GPT, Corollary 2.7 immediately gives

$$M\big(\tilde{T}\big) \;=\; M\Big( \sum_{ij} r_{ij}\; \tilde{P}_i \circ \tilde{E}_j \Big). \qquad (120)$$

Since M is convex-linear and preserves the zero processes14 , and since the effect-state channels span the vector space,
M can be uniquely extended to a linear map M̂ . Hence,
$$M\Big( \sum_{ij} r_{ij}\; \tilde{P}_i \circ \tilde{E}_j \Big) \;=\; \hat{M}\Big( \sum_{ij} r_{ij}\; \tilde{P}_i \circ \tilde{E}_j \Big). \qquad (121)$$

Now, using the linearity of M̂ , we have

$$\hat{M}\Big( \sum_{ij} r_{ij}\; \tilde{P}_i \circ \tilde{E}_j \Big) \;=\; \sum_{ij} r_{ij}\; \hat{M}\big( \tilde{P}_i \circ \tilde{E}_j \big). \qquad (122)$$

Noting that in this diagram, M̂ is only applied to objects in the domain of M , on which the two maps act identically
(by the fact that the former is the linear extension of the latter), one has
$$\sum_{ij} r_{ij}\; \hat{M}\big( \tilde{P}_i \circ \tilde{E}_j \big) \;=\; \sum_{ij} r_{ij}\; M\big( \tilde{P}_i \circ \tilde{E}_j \big) \;=\; \sum_{ij} r_{ij}\; M(\tilde{P}_i) \circ M(\tilde{E}_j), \qquad (123)$$

14 This follows from the fact that one can construct the zero process by composing a state and an effect with the zero scalar, as $0^B_A = \tilde{P} \circ 0 \circ \tilde{E}$. Then, by empirical adequacy of $M$, one has $M(0) = 0$, and so diagram preservation of $M$ then gives $M(0^B_A) = M(\tilde{P}) \circ M(0) \circ M(\tilde{E}) = M(\tilde{P}) \circ 0 \circ M(\tilde{E}) = 0^{V_B}_{V_A}$.

where the last step follows from the fact that M is diagram-preserving. In summary, we have shown that

$$M\big(\tilde{T}\big) \;=\; \sum_{ij} r_{ij}\; M(\tilde{P}_i) \circ M(\tilde{E}_j), \qquad (124)$$

as claimed in Eq. (70).

Next, we analyse $M$ in the specific case of a state $\tilde{P}_i$:
$$M\big(\tilde{P}_i\big). \qquad (125)$$

Since the DP map M has a unique linear extension which takes the vector space of GPT states B to the vector space VB ,
and since both of these are in FVectR , one can uniquely re-interpret the action of M as a process χ within FVectR :
$$M\big(\tilde{P}_i\big) \;=\; \chi_B \circ \tilde{P}_i. \qquad (126)$$

In particular, we are using the fact that a linear map L : L(R,V ) → L(R,V ′ ) can always be uniquely represented by a
linear map l : V → V ′ by exploiting the fact that L(R,V ) ∼= V . The fact that χB is the unique linear map satisfying
Eq. (126), means that there is no possibility for making other choices for the χA appearing in Eq. (69).

Similarly, $M$ on effects $\tilde{E}_j$ has a unique linear extension and takes functionals on GPT states to functionals on $V_A$; in
other words, $M$ is the adjoint of a process $\phi$ within $\mathrm{FVect}_{\mathbb{R}}$:
$$M\big(\tilde{E}_j\big) \;=\; \tilde{E}_j \circ \phi_A, \qquad (127)$$

In particular, we are using the fact that a linear map L : L(V,R) → L(V ′ ,R) can always be uniquely represented by a
linear map l : V ′ → V by exploiting the fact that L(V,R) ∼= V ∗ and that L(V ∗ ,V ′∗ ) ∼
= L(V ′ ,V ). Combining this with
Eq. (124), we have

$$M\big(\tilde{T}\big) \;=\; \sum_{ij} r_{ij}\; \big(\chi_B \circ \tilde{P}_i\big) \circ \big(\tilde{E}_j \circ \phi_A\big) \;=\; \chi_B \circ \tilde{T} \circ \phi_A. \qquad (128)$$

All that remains is to show that χA and ϕA are inverses. Consider the special case that Te is the identity, then
Eq. (128) becomes

$$M\big(\tilde{1}_A\big) \;=\; \chi_A \circ \phi_A. \qquad (129)$$

Accepted in Q
uantum 2024-03-06, click title to verify. Published under CC-BY 4.0. 30
Since M is diagram-preserving, it maps identity to identity, and so this becomes

$$1_{V_A} \;=\; \chi_A \circ \phi_A. \qquad (130)$$

Now consider a state $\tilde{P}$ followed by an effect $\tilde{E}$. This gives a probability, and since $M$ is empirically adequate it must
preserve this probability:
$$M\big(\tilde{E} \circ \tilde{P}\big) \;=\; \tilde{E} \circ \tilde{P}, \qquad (131)$$

and since M is diagram-preserving,

$$M\big(\tilde{E}\big) \circ M\big(\tilde{P}\big) \;=\; \tilde{E} \circ \tilde{P}. \qquad (132)$$

Combining this with Eqs. (126) and (127) gives


$$\tilde{E} \circ \phi_A \circ \chi_A \circ \tilde{P} \;=\; \tilde{E} \circ \tilde{P}. \qquad (133)$$

Since this holds for all $\tilde{E}$ and $\tilde{P}$, tomographic locality implies that the $\tilde{E}$ span $A^*$ and the $\tilde{P}$ span $A$, and we have that

$$\phi_A \circ \chi_A \;=\; 1_A. \qquad (134)$$

Combining this with Eq. (130) gives that $\chi$ and $\phi$ are inverses of each other. Hence, we can write $\phi_A = \chi_A^{-1}$ and so
rewrite Eq. (128) as
$$M\big(\tilde{T}\big) \;=\; \chi_B \circ \tilde{T} \circ \phi_A \;=\; \chi_B \circ \tilde{T} \circ \chi_A^{-1}, \qquad (135)$$

which completes the proof.
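To make Eq. (135) concrete, here is an illustration of ours for the special case of quantum theory on a qubit (it is not part of the proof): take $\chi$ to be the linear isomorphism sending an operator to its vector of Pauli coefficients, $\chi(\rho) := \big(\mathrm{Tr}[\rho],\,\mathrm{Tr}[\rho X],\,\mathrm{Tr}[\rho Y],\,\mathrm{Tr}[\rho Z]\big)^{T}$. Then, for any channel $\mathcal{E}$, the matrix $\chi \circ \mathcal{E} \circ \chi^{-1}$ is the familiar Pauli transfer matrix of $\mathcal{E}$, with entries
$$[\chi \circ \mathcal{E} \circ \chi^{-1}]_{ij} \;=\; \tfrac{1}{2}\,\mathrm{Tr}\big[\sigma_i\, \mathcal{E}(\sigma_j)\big], \qquad \sigma_0=\mathbb{1},\ \sigma_1=X,\ \sigma_2=Y,\ \sigma_3=Z.$$
By the structure theorem, any diagram-preserving quasiprobabilistic or ontological representation of quantum theory has this same conjugated form, just with $\chi$ replaced by a (generally different) linear isomorphism, such as one defined by a frame of quasiprobability operators.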

C Completing the proof of Proposition 3.2


The key argument required to establish Prop. 3.2 was given just after the proposition itself, but we now complete the proof.
We now prove that $\xi_{nc} := \tilde{\xi} \circ \sim$ is indeed a valid ontological model of an operational theory if $\tilde{\xi}$ is a valid ontological
model of a GPT. To do so, we show that each of the three properties (enumerated in Definition 2.9) that ξnc should
satisfy is implied by the corresponding property (enumerated in Definition 2.10) that ξe is assumed to satisfy by virtue
of being an ontological model of a GPT.

First, recall that we assumed that all deterministic effects in the operational theory are operationally equivalent.
Hence, the map ∼ will take any such deterministic effect to the unique deterministic effect in the GPT, which (by
property 1 of Definition 2.10) must be represented by the unit vector 1. Hence, ξnc represents all deterministic effects in
the operational theory appropriately, namely as the unit vector 1.
Second, recall that $\sim$ preserves the operational predictions of the operational theory; hence, the fact that (by property
2 of Definition 2.10) $\tilde{\xi}$ preserves the operational predictions of the GPT implies that $\xi_{nc} := \tilde{\xi} \circ \sim$ preserves the operational
predictions of the operational theory.
Third, recall that if, in the operational theory, P1 is a procedure that is a mixture of P2 and P3 with weights ω and
1−ω, then it follows that under ∼, one has
$$\tilde{P}_1 \;=\; \omega\, \tilde{P}_2 + (1-\omega)\, \tilde{P}_3. \qquad (136)$$
Hence, the fact that (by property 3 of Definition 2.10) the representations of these three processes under ξe satisfy

$$\tilde{\xi}\big(\tilde{P}_1\big) \;=\; \omega\, \tilde{\xi}\big(\tilde{P}_2\big) + (1-\omega)\, \tilde{\xi}\big(\tilde{P}_3\big) \qquad (137)$$

implies that the representations of P1 , P2 , and P3 satisfy

$$\xi_{nc}(P_1) \;=\; \omega\, \xi_{nc}(P_2) + (1-\omega)\, \xi_{nc}(P_3). \qquad (138)$$

Hence ξnc satisfies all the properties of an ontological model of an operational theory.
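To give a concrete reading of Eq. (138) (our illustration, for a preparation procedure on a single system): if $P_1$ is the procedure "flip a fair coin and implement $P_2$ on heads and $P_3$ on tails", then the distribution over ontic states that $\xi_{nc}$ assigns to it must be the corresponding mixture,
$$\xi_{nc}(P_1)(\lambda) \;=\; \tfrac{1}{2}\,\xi_{nc}(P_2)(\lambda) + \tfrac{1}{2}\,\xi_{nc}(P_3)(\lambda) \qquad \forall\, \lambda \in \Lambda,$$
which is the familiar convex-linearity requirement on ontological models, here inherited from property 3 of the GPT-level representation $\tilde{\xi}$.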
Conversely, we prove that ξe:= ξnc ◦C is a valid ontological model of a GPT if ξnc is a valid noncontextual ontological
model of an operational theory. To do so, we show that each of the three properties (enumerated in Definition 2.10)
that ξe should satisfy is implied by the corresponding property (enumerated in Definition 2.9) that ξnc is assumed to
satisfy by virtue of being an ontological model of an operational theory.
First, consider the unique deterministic effect in the GPT. Applying C to this process yields one of the many
deterministic effects in the operational theory. Because (by property 1 of Definition 2.9) ξnc maps every one of these to
the unit vector 1, it follows that ξe:= ξnc ◦C maps the unique deterministic effect to the unit vector 1.
Second, recall that the context of a process is irrelevant for the operational predictions it makes, and that consequently,
the map C preserves the operational predictions. Given that (by property 2 of Definition 2.9) ξnc preserves the
operational predictions, ξe:= ξnc ◦C also preserves the operational predictions.
Third, consider three processes Pe1 , Pe2 , and Pe3 such that Pe1 = ω Pe2 +(1−ω)Pe3 in the GPT. Under C, one has processes
$C(\tilde{P}_1) = (\tilde{P}_1, c_1)$, $C(\tilde{P}_2) = (\tilde{P}_2, c_2)$, and $C(\tilde{P}_3) = (\tilde{P}_3, c_3)$ in the operational theory, where $c_i$ are arbitrary contexts specified
by the map C. The fact that Pe1 = ω Pe2 +(1−ω)Pe3 implies that C(Pe1 ) is operationally equivalent to the effective procedure
Pmix defined as the mixture of C(Pe2 ) and C(Pe3 ) with weights ω and 1−ω, respectively. (C(Pe1 ) may not actually be this
mixture, depending on its context ci , which depends on one’s choice of C.) By property 3 of Definition 2.9, ξnc must satisfy

$$\xi_{nc}\big(P_{\mathrm{mix}}\big) \;=\; \omega\, \xi_{nc}\big(C(\tilde{P}_2)\big) + (1-\omega)\, \xi_{nc}\big(C(\tilde{P}_3)\big). \qquad (139)$$

But since ξnc is a noncontextual model and since C(Pe1 ) is operationally equivalent to Pmix , it follows that

$$\xi_{nc}\big(C(\tilde{P}_1)\big) \;=\; \omega\, \xi_{nc}\big(C(\tilde{P}_2)\big) + (1-\omega)\, \xi_{nc}\big(C(\tilde{P}_3)\big). \qquad (140)$$

Hence we see that ξe:= ξnc ◦C satisfies property 3 of Definition 2.10, as required.

D Proof of Proposition 4.4
Proof. Since the ontological model ξe satisfies the requirements of Proposition 4.3 we immediately obtain Eqs. (96) and
(97).
We however also obtain additional constraints arising from the fact that the codomain of ξe is SubStoch rather than
QuasiSubStoch. This additional constraint can be viewed as a set of positivity conditions as we will now explain.
The notion of positivity we require is defined for linear maps between ordered vector spaces. A positive cone V + for
a real vector space V defines an ordering:
v ≤ v ′ ⇐⇒ ∃w ∈ V + s.t. v+w = v ′ . (141)
Given two such ordered vector spaces, $(V, V^+)$ and $(W, W^+)$, maps between these ordered vector spaces are linear maps
on the underlying vector spaces L : V → W and are said to be positive if and only if:
L(V + ) ⊆ W + . (142)
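As a minimal example of these definitions (ours, purely for orientation): take $V = W = \mathbb{R}^2$ with $V^+ = W^+$ the nonnegative quadrant. Then $v \le v'$ is the entrywise order, and a linear map given by a matrix is positive precisely when all of its entries are nonnegative; for instance,
$$L = \begin{pmatrix} 1 & \tfrac{1}{2} \\ 0 & \tfrac{1}{3} \end{pmatrix} \ \text{is positive, whereas}\ \ L' = \begin{pmatrix} 1 & -\tfrac{1}{2} \\ 0 & 1 \end{pmatrix} \ \text{is not, since}\ \ L'\begin{pmatrix}0\\1\end{pmatrix} = \begin{pmatrix}-\tfrac{1}{2}\\1\end{pmatrix} \notin W^+.$$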
Now, the question is: what are the relevant ordered vector spaces which we want to consider here?
In the GPT $\widetilde{\mathrm{Op}}$, we can define an ordered vector space for each pair of systems $(A,B)$ as the vector space spanned by
the transformations from $A$ to $B$, which we denote by $\mathrm{Span}[\widetilde{\mathrm{Op}}^B_A]$. The positive cone is defined by the vectors in this
space which can be expressed as a positive linear combination of vectors in $\widetilde{\mathrm{Op}}^B_A$:
$$\mathrm{Cone}[\widetilde{\mathrm{Op}}^B_A] := \Big\{ \sum_i r_i \tilde{f}_i \;\Big|\; r_i \in \mathbb{R}^+,\ \tilde{f}_i \in \widetilde{\mathrm{Op}}^B_A \Big\}. \qquad (143)$$
Then it is clear that $(\mathrm{Span}[\widetilde{\mathrm{Op}}^B_A], \mathrm{Cone}[\widetilde{\mathrm{Op}}^B_A])$ defines an ordered vector space.
Similarly, in SubStoch we can define an ordered vector space for each pair of systems $(\mathbb{R}^\Lambda, \mathbb{R}^{\Lambda'})$ as the vector space
spanned by the substochastic maps from $\mathbb{R}^\Lambda$ to $\mathbb{R}^{\Lambda'}$, which we denote by $\mathrm{Span}[\mathrm{SubStoch}^{\Lambda'}_\Lambda]$, which is a subspace of the
real vector space of linear maps from $\mathbb{R}^\Lambda$ to $\mathbb{R}^{\Lambda'}$. The positive cone is defined by the vectors in this space which can be
expressed as a positive linear combination of vectors in $\mathrm{SubStoch}^{\Lambda'}_\Lambda$:
$$\mathrm{Cone}[\mathrm{SubStoch}^{\Lambda'}_\Lambda] := \Big\{ \sum_i r_i s_i \;\Big|\; r_i \in \mathbb{R}^+,\ s_i \in \mathrm{SubStoch}^{\Lambda'}_\Lambda \Big\}. \qquad (144)$$
Then it is clear that $(\mathrm{Span}[\mathrm{SubStoch}^{\Lambda'}_\Lambda], \mathrm{Cone}[\mathrm{SubStoch}^{\Lambda'}_\Lambda])$ defines an ordered vector space.
Then, for a pair of GPT systems $(A,B)$, the action of the map $\tilde{\xi}$ from $\widetilde{\mathrm{Op}}^B_A$ to $\mathrm{SubStoch}^{\Lambda_B}_{\Lambda_A}$ can be extended to a linear
map from $\mathrm{Span}[\widetilde{\mathrm{Op}}^B_A]$ to $\mathrm{Span}[\mathrm{SubStoch}^{\Lambda_B}_{\Lambda_A}]$. Moreover, it is clear that this will be a positive linear map—in the sense
that we defined above—as it maps the positive cone $\mathrm{Cone}[\widetilde{\mathrm{Op}}^B_A]$ into the positive cone $\mathrm{Cone}[\mathrm{SubStoch}^{\Lambda_B}_{\Lambda_A}]$, that is:
$$\tilde{\xi}\big(\mathrm{Cone}[\widetilde{\mathrm{Op}}^B_A]\big) \;=\; \tilde{\xi}\Big(\Big\{ \sum_i r_i \tilde{f}_i \;\Big|\; r_i \in \mathbb{R}^+,\ \tilde{f}_i \in \widetilde{\mathrm{Op}}^B_A \Big\}\Big) \qquad (145)$$
$$\;=\; \Big\{ \sum_i r_i\, \tilde{\xi}(\tilde{f}_i) \;\Big|\; r_i \in \mathbb{R}^+,\ \tilde{f}_i \in \widetilde{\mathrm{Op}}^B_A \Big\} \qquad (146)$$
$$\;=\; \Big\{ \sum_i r_i\, s_i \;\Big|\; r_i \in \mathbb{R}^+,\ s_i \in \tilde{\xi}\big(\widetilde{\mathrm{Op}}^B_A\big) \Big\} \qquad (147)$$
$$\;\subseteq\; \Big\{ \sum_i r_i\, s_i \;\Big|\; r_i \in \mathbb{R}^+,\ s_i \in \mathrm{SubStoch}^{\Lambda_B}_{\Lambda_A} \Big\} \qquad (148)$$
$$\;=\; \mathrm{Cone}[\mathrm{SubStoch}^{\Lambda_B}_{\Lambda_A}]. \qquad (149)$$
In summary, for each pair $(A,B)$, the representation map $\tilde{\xi}$ defines a positive linear map from $(\mathrm{Span}[\widetilde{\mathrm{Op}}^B_A], \mathrm{Cone}[\widetilde{\mathrm{Op}}^B_A])$
to $(\mathrm{Span}[\mathrm{SubStoch}^{\Lambda_B}_{\Lambda_A}], \mathrm{Cone}[\mathrm{SubStoch}^{\Lambda_B}_{\Lambda_A}])$.

This positivity condition is all fairly abstract so let us consider some more concrete consequences of this result.

If, rather than considering transformations from one GPT system to another, we consider just the states of a single
system A then everything simplifies considerably. The vector space we consider in the domain is simply the vector
space spanned by the GPT state space, and the positive cone is then just the standard cone of GPT states. The vector
space we consider in the codomain is simply the vector space RΛA with positive cone given by the cone of unnormalised
probability distributions. Moreover, the linearly extended action of ξe is nothing but the linear map χA so we find that
χA must be a positive map in the sense defined above.
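To see what this positivity amounts to in a familiar special case (our illustration): if the representation of quantum states is written in the frame form familiar from quasiprobability representations, $\tilde{\xi}(\rho)(\lambda) = \mathrm{Tr}[\rho\, F_\lambda]$ for some fixed operators $\{F_\lambda\}_{\lambda\in\Lambda}$, then positivity of $\chi_A$ on the cone of quantum states is the requirement
$$\mathrm{Tr}[\rho\, F_\lambda] \ge 0 \ \ \forall\, \rho \ge 0,\ \forall\, \lambda \quad\Longleftrightarrow\quad F_\lambda \ge 0 \ \ \forall\, \lambda,$$
i.e., the frame operators must all be positive semidefinite—exactly the condition that fails for the frames underlying negative quasiprobability representations.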
Similarly, if we consider the contravariant action of $\chi_A^{-1}$ on the space of GPT effects (that is, by composing the effect
onto the outgoing wire of $\chi_A^{-1}$), then we arrive at a similar result. Here we find that the contravariant action of $\chi_A^{-1}$ is a
positive linear map from the dual of the GPT vector space ordered by the effect cone to the dual of $\mathbb{R}^{\Lambda_A}$ ordered by the
cone of response functions.

E Proofs for preliminaries


E.1 Proof that quotienting is diagram-preserving
In order to see that the quotienting map is diagram-preserving, we must first define what it means for processes in the
quotiented theory to be composed. That is, given a suitable pair of equivalence classes $\tilde{T}$ and $\tilde{R}$, we must define $\tilde{R} \circ \tilde{T}$
(assuming that the relevant type-matching constraint is satisfied) and $\tilde{R} \otimes \tilde{T}$. We define these via composition of some
choice of representative elements, $r \in \tilde{R}$ and $t \in \tilde{T}$, for each equivalence class, as

$$\tilde{R} \circ \tilde{T} \;:=\; \widetilde{r \circ t} \qquad\text{and}\qquad \tilde{R} \otimes \tilde{T} \;:=\; \widetilde{r \otimes t}. \qquad (150)$$

For this to be well defined, it must be independent of the choices of representatives, i.e., for any $t_1, t_2 \in \tilde{T}$ and $r_1, r_2 \in \tilde{R}$,
one has

$$\widetilde{r_1 \circ t_1} \;=\; \widetilde{r_2 \circ t_2} \qquad\text{and}\qquad \widetilde{r_1 \otimes t_1} \;=\; \widetilde{r_2 \otimes t_2}, \qquad (151)$$

or equivalently

$$r_1 \circ t_1 \;\sim\; r_2 \circ t_2 \qquad\text{and}\qquad r_1 \otimes t_1 \;\sim\; r_2 \otimes t_2. \qquad (152)$$

If this is the case, then the quotienting map is a structure-preserving equivalence relation, or congruence relation, for
the process theory.
It is straightforward to show that the first equality in Eq. (152) is equivalent to the conditions

$$\forall r:\;\; r \circ t_1 \;\sim\; r \circ t_2 \qquad\text{and}\qquad \forall t:\;\; r_1 \circ t \;\sim\; r_2 \circ t. \qquad (153)$$

To verify the nontrivial direction of this equivalence, consider the special case of these where r = r1 (in the first) and
where t = t2 (in the second); then, one has

$$r_1 \circ t_1 \;\sim\; r_1 \circ t_2 \;=\; r_1 \circ t_2 \;\sim\; r_2 \circ t_2. \qquad (154)$$

Similarly, the second equality in Eq. (152) is equivalent to the conditions

$$\forall r:\;\; r \otimes t_1 \;\sim\; r \otimes t_2 \qquad\text{and}\qquad \forall t:\;\; r_1 \otimes t \;\sim\; r_2 \otimes t. \qquad (155)$$

We now prove that the first of these four conditions (namely, the first equivalence in Eq. (153)) holds:

$$t_1 \sim t_2 \;\iff\; \forall \tau:\;\; \tau(t_1) = \tau(t_2) \qquad (156)$$
$$\;\implies\; \forall \tau':\;\; \tau'(r \circ t_1) = \tau'(r \circ t_2) \qquad (157)$$
$$\;\iff\; r \circ t_1 \;\sim\; r \circ t_2, \qquad (158)$$

where in the second step we are noting that

$$\tau' \circ r \qquad (159)$$

is an example of a tester τ for any τ ′ and r. The argument for the other three conditions is analogous.

This establishes that the notion of composition that we have defined is independent of the choice of representative
elements, and so we can simply write

$$\tilde{R} \circ \tilde{T} \;=\; \widetilde{r \circ t} \;=\; \widetilde{R \circ T} \qquad\text{and}\qquad \tilde{R} \otimes \tilde{T} \;=\; \widetilde{r \otimes t} \;=\; \widetilde{R \otimes T}. \qquad (160)$$

Now, recalling that

$${\sim}(T) \;:=\; \tilde{T}, \qquad (161)$$

it follows that the quotienting map is diagram-preserving:

$${\sim}(R \circ T) \;=\; \widetilde{R \circ T} \;=\; \tilde{R} \circ \tilde{T} \;=\; {\sim}(R) \circ {\sim}(T), \qquad (162)$$
$${\sim}(R \otimes T) \;=\; \widetilde{R \otimes T} \;=\; \tilde{R} \otimes \tilde{T} \;=\; {\sim}(R) \otimes {\sim}(T). \qquad (163)$$
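As an illustration of ours (not part of the original argument): in quantum theory viewed as a quotiented theory, the procedure "measure in the computational basis and discard the outcome" and the procedure "apply a uniformly random phase flip" are distinct unquotiented processes, but both belong to the equivalence class of the completely dephasing channel
$$\Delta(\rho) \;=\; \sum_i \langle i|\rho|i\rangle\, |i\rangle\langle i| \;=\; \tfrac{1}{2}\big(\rho + Z\rho Z\big) \quad\text{(for a qubit)},$$
and Eqs. (151)–(152) are what guarantee that composing either representative with a representative of any other class lands in one and the same class, so that $\tilde{R} \circ \tilde{\Delta}$ and $\tilde{R} \otimes \tilde{\Delta}$ are unambiguous.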

E.2 Proof of Lemma 2.5


We now prove Lemma 2.5, restated here:

Lemma E.1. The operation $\square$ can be uniquely extended to a bilinear map
$$\square : \big(\mathbb{R}_m^{B\to C},\, \mathbb{R}_m^{A\to B}\big) \to \mathbb{R}_m^{A\to C}, \qquad (164)$$
and the operation $\boxtimes$ can be uniquely extended to a bilinear map
$$\boxtimes : \big(\mathbb{R}_m^{A\to B},\, \mathbb{R}_m^{C\to D}\big) \to \mathbb{R}_m^{AC\to BD}. \qquad (165)$$

Proof. Here we show the proof for □. The proof for ⊠ follows similarly.
To begin, note that the vectors $R_{\tilde{T}}$ with $\tilde{T} : A \to B$ span the vector space $\mathbb{R}_m^{A\to B}$, as we have taken $F^{A\to B}$ to be a
minimal fiducial set of testers. Consequently, we can always (nonuniquely) write an arbitrary $U \in \mathbb{R}_m^{B\to C}$ as $\sum_i u_i R_{\tilde{T}_i}$ for
some transformations $\tilde{T}_i : B \to C$, $u_i \in \mathbb{R}$, and can write an arbitrary $V \in \mathbb{R}_m^{A\to B}$ as $\sum_j v_j R_{\tilde{T}'_j}$ for some transformations
$\tilde{T}'_j : A \to B$, $v_j \in \mathbb{R}$. Hence, we propose the linear extension be defined by $U \square V := \sum_{ij} u_i v_j (R_{\tilde{T}_i} \square R_{\tilde{T}'_j}) = \sum_{ij} u_i v_j R_{\tilde{T}_i \circ \tilde{T}'_j}$.
For this to be a valid definition, however, it must be the case that this is independent of the chosen decomposition of U
and V. We now show that this is indeed the case.
To begin, let us consider two distinct decompositions in the second argument of □. That is, given
$$\sum_i u_i R_{\tilde{T}_i} \;=\; U \;=\; \sum_j u'_j R_{\tilde{T}'_j}, \qquad (166)$$
we want to show that
$$\sum_i u_i \big( R_{\tilde{T}} \square R_{\tilde{T}_i} \big) \;=\; \sum_j u'_j \big( R_{\tilde{T}} \square R_{\tilde{T}'_j} \big) \qquad (167)$$

for all Te.


To begin, we use linearity of E A→B (as defined in Section 2.2.1) to give us that
$$\sum_i u_i K_{\tilde{T}_i} \;=\; \sum_j u'_j K_{\tilde{T}'_j}. \qquad (168)$$

Unpacking the definition of K gives us that for all testers τe,


$$\sum_i u_i\, \tilde{\tau}\big(\tilde{T}_i\big) \;=\; \sum_j u'_j\, \tilde{\tau}\big(\tilde{T}'_j\big). \qquad (169)$$

Now, define

$$\tilde{\tau}'(\,\cdot\,) \;:=\; \tilde{\tau}\big(\tilde{T} \circ \cdot\,\big) \qquad (170)$$

for some arbitrary τe and transformation Te. Substituting tester τe′ into Eq. (169), we find that

$$\sum_i u_i\, \tilde{\tau}\big(\tilde{T} \circ \tilde{T}_i\big) \;=\; \sum_j u'_j\, \tilde{\tau}\big(\tilde{T} \circ \tilde{T}'_j\big). \qquad (171)$$

As this holds for all τe, and so, in particular, for our fiducial testers, we therefore have that
$$\sum_i u_i\, R_{\tilde{T} \circ \tilde{T}_i} \;=\; \sum_j u'_j\, R_{\tilde{T} \circ \tilde{T}'_j}. \qquad (172)$$

Finally, using the fact that $R_{\tilde{T} \circ \tilde{T}_i} = R_{\tilde{T}} \square R_{\tilde{T}_i}$ and similarly that $R_{\tilde{T} \circ \tilde{T}'_j} = R_{\tilde{T}} \square R_{\tilde{T}'_j}$, we find
$$\sum_i u_i \big( R_{\tilde{T}} \square R_{\tilde{T}_i} \big) \;=\; \sum_j u'_j \big( R_{\tilde{T}} \square R_{\tilde{T}'_j} \big), \qquad (173)$$

which is our desired result.


One can similarly show linearity in the first argument, namely,
$$\sum_k v_k \big( R_{\tilde{T}''_k} \square R_{\tilde{T}} \big) \;=\; \sum_l v'_l \big( R_{\tilde{T}'''_l} \square R_{\tilde{T}} \big). \qquad (174)$$

Putting these together, we obtain full bilinearity of □, namely


$$U \square V \;=\; \sum_{ij} u_i v_j \big( R_{\tilde{T}_i} \square R_{\tilde{T}'_j} \big) \;=\; \sum_{kl} u'_k v'_l \big( R_{\tilde{T}''_k} \square R_{\tilde{T}'''_l} \big), \qquad (175)$$

as required.

E.3 Proof of Lemma 2.6


We now prove Lemma 2.6, restated here:

Lemma E.2. A GPT is tomographically local if and only if one can decompose the identity process for every system $A$,
denoted $\tilde{1}_A$, as
$$\tilde{1}_A \;=\; \sum_{ij} [M_{\tilde{1}_A}]^j_i\; \tilde{P}^A_j \circ \tilde{E}^A_i, \qquad (176)$$
where $M_{\tilde{1}_A}$ is the matrix inverse of the transition matrix $N_{\tilde{1}_A}$ of the identity process, that is,
$$M_{\tilde{1}_A} := N_{\tilde{1}_A}^{-1}. \qquad (177)$$

Proof. First, we prove that if a GPT satisfies tomographic locality, then the identity has a decomposition of the form in
Eq. (176). We do this by defining a particular process $f$ as a linear expansion into states and effects with the carefully
chosen set of coefficients $[M_{\tilde{1}_A}]^j_i$, and then we prove that $f = \tilde{1}_A$.

Take any minimal spanning set $\{\tilde{P}^A_i\}_i$ of GPT states and spanning set $\{\tilde{E}^A_j\}_j$ of GPT effects, and consider the
transition matrix $N_{\tilde{1}_A}$ with entries given by
$$[N_{\tilde{1}_A}]^j_i \;:=\; \tilde{E}^A_j \circ \tilde{P}^A_i. \qquad (178)$$
Next, define $M_{\tilde{1}_A} := N_{\tilde{1}_A}^{-1}$, that is, the inverse of $N_{\tilde{1}_A}$, with matrix elements $[M_{\tilde{1}_A}]^j_i$ satisfying
$$\sum_j [M_{\tilde{1}_A}]^j_i\, [N_{\tilde{1}_A}]^k_j \;=\; \delta_i^k. \qquad (179)$$

The matrix inverse of $N_{\tilde{1}_A}$ exists, since the rows of $N_{\tilde{1}_A}$ are linearly independent. That is, we show that $\sum_i a_i [N_{\tilde{1}_A}]^j_i = 0$
if and only if $a_i = 0$ for all $i$. First, note that we have
$$0 \;=\; \sum_i a_i\, [N_{\tilde{1}_A}]^j_i \;=\; \sum_i a_i\; \tilde{E}^A_j \circ \tilde{P}^A_i,$$
but as the $\tilde{E}^A_j$ span the space of effects and composition is bilinear (see Lemma 2.5), this means that $\sum_i a_i \tilde{P}^A_i = 0$; then,
since the $\tilde{P}^A_i$ are linearly independent, we have $a_i = 0$ for all $i$, as desired.
Next we use this inverse to define a process f as

$$f \;:=\; \sum_{ij} [M_{\tilde{1}_A}]^j_i\; \tilde{P}^A_j \circ \tilde{E}^A_i. \qquad (180)$$

A priori, there is no reason why this process must be a physical GPT process; however, it turns out to be the identity
process, as we will now show. Consider the expression

$$\tilde{E}^A_l \circ f \circ \tilde{P}^A_k \qquad (181)$$

for some $\tilde{P}^A_k$ and some $\tilde{E}^A_l$ from the minimal spanning sets above. Substituting the expansion of $f$ followed by applying
the definition of $M_{\tilde{1}_A}$ and $N_{\tilde{1}_A}$, one has

$$\tilde{E}^A_l \circ f \circ \tilde{P}^A_k \;=\; \sum_{ij} [M_{\tilde{1}_A}]^j_i\; \big(\tilde{E}^A_l \circ \tilde{P}^A_j\big)\big(\tilde{E}^A_i \circ \tilde{P}^A_k\big) \;=\; \sum_{ij} [M_{\tilde{1}_A}]^j_i\, [N_{\tilde{1}_A}]^l_j\, [N_{\tilde{1}_A}]^i_k. \qquad (182)$$

But now it follows from Eq. (179) that

$$\sum_{ij} [M_{\tilde{1}_A}]^j_i\, [N_{\tilde{1}_A}]^l_j\, [N_{\tilde{1}_A}]^i_k \;=\; \sum_i \delta^l_i\, [N_{\tilde{1}_A}]^i_k \;=\; [N_{\tilde{1}_A}]^l_k \;=\; \tilde{E}^A_l \circ \tilde{P}^A_k. \qquad (183)$$

Hence, it holds that

$$\tilde{E}^A_l \circ f \circ \tilde{P}^A_k \;=\; \tilde{E}^A_l \circ \tilde{P}^A_k \qquad (184)$$

for all $\tilde{P}^A_k$ and $\tilde{E}^A_l$ in the minimal spanning sets above. By the fact that these sets span the state and effect spaces
respectively, it follows that
$$\tilde{E} \circ f \circ \tilde{P} \;=\; \tilde{E} \circ \tilde{P} \qquad (185)$$

for all $\tilde{P}$ and $\tilde{E}$. Now, in any GPT which satisfies tomographic locality, namely Eq. (49), two channels which give the
same statistics on all local inputs and outputs are equal, and hence $f$ is in fact the identity transformation. Hence, the
identity transformation has a linear expansion, of the form given by Eq. (176), namely

$$f \;=\; \tilde{1}_A \;=\; \sum_{ij} [M_{\tilde{1}_A}]^j_i\; \tilde{P}^A_j \circ \tilde{E}^A_i. \qquad (186)$$

Next, we prove the converse: if the identity has a linear expansion as in Eq. (176) in a given GPT, then that GPT
satisfies tomographic locality. To see this, consider two bipartite processes Te and Te′ which give rise to the same
statistics on all local inputs, so that

$$\forall\, \tilde{P}_1, \tilde{P}_2, \tilde{E}_1, \tilde{E}_2:\qquad \big(\tilde{E}_1 \otimes \tilde{E}_2\big) \circ \tilde{T} \circ \big(\tilde{P}_1 \otimes \tilde{P}_2\big) \;=\; \big(\tilde{E}_1 \otimes \tilde{E}_2\big) \circ \tilde{T}' \circ \big(\tilde{P}_1 \otimes \tilde{P}_2\big). \qquad (187)$$

For any tester τe of the appropriate type, one can write the probability generated by composing with Te as

$$\tilde{\tau}\big(\tilde{T}\big) \;=\; \sum_{ijkl\,i'j'k'l'} [M_{\tilde{1}_A}]^{i'}_{i}\, [M_{\tilde{1}_C}]^{j'}_{j}\, [M_{\tilde{1}_B}]^{k'}_{k}\, [M_{\tilde{1}_D}]^{l'}_{l}\;\; \tilde{\tau}\Big( \big(\tilde{P}^B_{k'} \otimes \tilde{P}^D_{l'}\big) \circ \big(\tilde{E}^B_{k} \otimes \tilde{E}^D_{l}\big) \circ \tilde{T} \circ \big(\tilde{P}^A_{i'} \otimes \tilde{P}^C_{j'}\big) \circ \big(\tilde{E}^A_{i} \otimes \tilde{E}^C_{j}\big) \Big) \qquad (188)$$

simply by inserting the linear expansion of the identity on each system. Similarly, one can write

$$\tilde{\tau}\big(\tilde{T}'\big) \;=\; \sum_{ijkl\,i'j'k'l'} [M_{\tilde{1}_A}]^{i'}_{i}\, [M_{\tilde{1}_C}]^{j'}_{j}\, [M_{\tilde{1}_B}]^{k'}_{k}\, [M_{\tilde{1}_D}]^{l'}_{l}\;\; \tilde{\tau}\Big( \big(\tilde{P}^B_{k'} \otimes \tilde{P}^D_{l'}\big) \circ \big(\tilde{E}^B_{k} \otimes \tilde{E}^D_{l}\big) \circ \tilde{T}' \circ \big(\tilde{P}^A_{i'} \otimes \tilde{P}^C_{j'}\big) \circ \big(\tilde{E}^A_{i} \otimes \tilde{E}^C_{j}\big) \Big). \qquad (189)$$

Noting that the RHS of Eq. (188) splits into two disconnected diagrams, and the same holds for the RHS of Eq. (189),
it follows from Eq. (187) that

$$\tilde{\tau}\big(\tilde{T}\big) \;=\; \tilde{\tau}\big(\tilde{T}'\big). \qquad (190)$$

Since this is true for any two processes satisfying Eq. (187), the principle of tomographic locality (Eq. (49)) is satisfied.
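As a quick sanity check on Lemma 2.6 (our example, not part of the proof): for the GPT of a classical bit, take the minimal spanning set of states to be the point distributions $\tilde{P}^A_1 = \delta_0$, $\tilde{P}^A_2 = \delta_1$ and the effects to be the two outcome indicators $\tilde{E}^A_1, \tilde{E}^A_2$ with $\tilde{E}^A_j \circ \tilde{P}^A_i = \delta_{ij}$. Then $N_{\tilde{1}_A} = \mathbb{1}$, so $M_{\tilde{1}_A} = N_{\tilde{1}_A}^{-1} = \mathbb{1}$, and Eq. (176) reduces to
$$\tilde{1}_A \;=\; \sum_i \tilde{P}^A_i \circ \tilde{E}^A_i, \qquad p \;\mapsto\; \sum_i p(i)\, \delta_i \;=\; p,$$
which is indeed the identity channel on distributions over a bit.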

E.4 Proof of Eqs. (57) and (58)


To prove Eq. (57), one can decompose the four identities in the diagram and perform some simple manipulations of the
resulting expression.

$$\tilde{T} \otimes \tilde{T}' \;=\; \sum_{ii'll'} \sum_{jj'kk'} [M_{\tilde{1}_B}]^{l}_{k}\, [M_{\tilde{1}_D}]^{l'}_{k'}\, [M_{\tilde{1}_A}]^{j}_{i}\, [M_{\tilde{1}_C}]^{j'}_{i'}\, [N_{\tilde{T}}]^{k}_{j}\, [N_{\tilde{T}'}]^{k'}_{j'}\;\; \big(\tilde{P}^B_{l} \otimes \tilde{P}^D_{l'}\big) \circ \big(\tilde{E}^A_{i} \otimes \tilde{E}^C_{i'}\big) \qquad (191)$$
$$\;=\; \sum_{ii'll'} \Big( \sum_{jj'kk'} [M_{\tilde{1}_B}]^{l}_{k}\, [N_{\tilde{T}}]^{k}_{j}\, [M_{\tilde{1}_A}]^{j}_{i}\;\; [M_{\tilde{1}_D}]^{l'}_{k'}\, [N_{\tilde{T}'}]^{k'}_{j'}\, [M_{\tilde{1}_C}]^{j'}_{i'} \Big)\; \big(\tilde{P}^B_{l} \otimes \tilde{P}^D_{l'}\big) \circ \big(\tilde{E}^A_{i} \otimes \tilde{E}^C_{i'}\big) \qquad (192)$$
$$\;=\; \sum_{ii'll'} \big[M_{\tilde{1}_B} \circ N_{\tilde{T}} \circ M_{\tilde{1}_A}\big]^{l}_{i}\, \big[M_{\tilde{1}_D} \circ N_{\tilde{T}'} \circ M_{\tilde{1}_C}\big]^{l'}_{i'}\;\; \big(\tilde{P}^B_{l} \otimes \tilde{P}^D_{l'}\big) \circ \big(\tilde{E}^A_{i} \otimes \tilde{E}^C_{i'}\big) \qquad (193)$$
$$\;=\; \sum_{ii'll'} \big[\big(M_{\tilde{1}_B} \circ N_{\tilde{T}} \circ M_{\tilde{1}_A}\big) \otimes \big(M_{\tilde{1}_D} \circ N_{\tilde{T}'} \circ M_{\tilde{1}_C}\big)\big]^{ll'}_{ii'}\;\; \big(\tilde{P}^B_{l} \otimes \tilde{P}^D_{l'}\big) \circ \big(\tilde{E}^A_{i} \otimes \tilde{E}^C_{i'}\big) \qquad (194)$$
$$\;=\; \sum_{ii'll'} \big[M_{\tilde{T}} \otimes M_{\tilde{T}'}\big]^{ll'}_{ii'}\;\; \big(\tilde{P}^B_{l} \otimes \tilde{P}^D_{l'}\big) \circ \big(\tilde{E}^A_{i} \otimes \tilde{E}^C_{i'}\big). \qquad (195)$$

To prove Eq. (58), one can insert four decompositions of the identity into the following diagram:

$$\tilde{T}' \circ \tilde{T} \;=\; \sum_{ijklmnop} [M_{\tilde{1}_C}]^{p}_{o}\, [M_{\tilde{1}_B}]^{n}_{m}\, [M_{\tilde{1}_B}]^{l}_{k}\, [M_{\tilde{1}_A}]^{j}_{i}\;\; \big(\tilde{E}^C_o \circ \tilde{T}' \circ \tilde{P}^B_n\big)\big(\tilde{E}^B_m \circ \tilde{P}^B_l\big)\big(\tilde{E}^B_k \circ \tilde{T} \circ \tilde{P}^A_j\big)\;\; \tilde{P}^C_p \circ \tilde{E}^A_i$$
$$\;=\; \sum_{ijklmnop} [M_{\tilde{1}_C}]^{p}_{o}\, [N_{\tilde{T}'}]^{o}_{n}\, [M_{\tilde{1}_B}]^{n}_{m}\, [N_{\tilde{1}_B}]^{m}_{l}\, [M_{\tilde{1}_B}]^{l}_{k}\, [N_{\tilde{T}}]^{k}_{j}\, [M_{\tilde{1}_A}]^{j}_{i}\;\; \tilde{P}^C_p \circ \tilde{E}^A_i \qquad (196)$$
$$\;=\; \sum_{ip} \big[M_{\tilde{1}_C} \circ N_{\tilde{T}'} \circ M_{\tilde{1}_B} \circ N_{\tilde{1}_B} \circ M_{\tilde{1}_B} \circ N_{\tilde{T}} \circ M_{\tilde{1}_A}\big]^{p}_{i}\;\; \tilde{P}^C_p \circ \tilde{E}^A_i \qquad (197)$$
$$\;=\; \sum_{ip} \big[M_{\tilde{T}'} \circ N_{\tilde{1}_B} \circ M_{\tilde{T}}\big]^{p}_{i}\;\; \tilde{P}^C_p \circ \tilde{E}^A_i. \qquad (198)$$
