Topic in measure theory
In mathematics , the category of Markov kernels , often denoted Stoch , is the category whose objects are measurable spaces and whose morphisms are Markov kernels .
[ 1] [ 2] [ 3] [ 4]
It is analogous to the category of sets and functions , but where the arrows can be interpreted as being stochastic .
Several variants of this category are used in the literature. For example, one can use subprobability kernels[ 5] instead of probability kernels, or more general s-finite kernels.[ 6]
Also, one can take as morphisms equivalence classes of Markov kernels under almost sure equality ;[ 7] see below.
Definition
Recall that a Markov kernel between measurable spaces
(
X
,
F
)
{\displaystyle (X,{\mathcal {F}})}
and
(
Y
,
G
)
{\displaystyle (Y,{\mathcal {G}})}
is an assignment
k
:
X
×
G
→
R
{\displaystyle k:X\times {\mathcal {G}}\to \mathbb {R} }
which is measurable as a function on
X
{\displaystyle X}
and which is a probability measure on
G
{\displaystyle {\mathcal {G}}}
.[ 4] We denote its values by
k
(
B
|
x
)
{\displaystyle k(B|x)}
for
x
∈
X
{\displaystyle x\in X}
and
B
∈
G
{\displaystyle B\in {\mathcal {G}}}
, which suggests an interpretation as conditional probability .
The category Stoch has:[ 4]
δ
(
A
|
x
)
=
1
A
(
x
)
=
{
1
x
∈
A
;
0
x
∉
A
{\displaystyle \delta (A|x)=1_{A}(x)={\begin{cases}1&x\in A;\\0&x\notin A\end{cases}}}
for all
x
∈
X
{\displaystyle x\in X}
and
A
∈
F
{\displaystyle A\in {\mathcal {F}}}
;
Given kernels
k
:
(
X
,
F
)
→
(
Y
,
G
)
{\displaystyle k:(X,{\mathcal {F}})\to (Y,{\mathcal {G}})}
and
h
:
(
Y
,
G
)
→
(
Z
,
H
)
{\displaystyle h:(Y,{\mathcal {G}})\to (Z,{\mathcal {H}})}
, the composite morphism
h
∘
k
:
(
X
,
F
)
→
(
Z
,
H
)
{\displaystyle h\circ k:(X,{\mathcal {F}})\to (Z,{\mathcal {H}})}
is given by
(
h
∘
k
)
(
C
|
x
)
=
∫
Y
h
(
C
|
y
)
k
(
d
y
|
x
)
{\displaystyle (h\circ k)(C|x)=\int _{Y}h(C|y)\,k(dy|x)}
for all
x
∈
X
{\displaystyle x\in X}
and
C
∈
H
{\displaystyle C\in {\mathcal {H}}}
.
This composition formula is sometimes called the Chapman-Kolmogorov equation .[ 4]
This composition is unital, and associative by the monotone convergence theorem , so that one indeed has a category .
Basic properties
Probability measures
The terminal object of Stoch is the one-point space
1
{\displaystyle 1}
.[ 4] Morphisms in the form
1
→
X
{\displaystyle 1\to X}
can be equivalently seen as probability measures on
X
{\displaystyle X}
, since they correspond to functions
1
→
P
X
{\displaystyle 1\to PX}
, i.e. elements of
P
X
{\displaystyle PX}
.
Given kernels
p
:
1
→
X
{\displaystyle p:1\to X}
and
k
:
X
→
Y
{\displaystyle k:X\to Y}
, the composite kernel
k
∘
p
:
1
→
Y
{\displaystyle k\circ p:1\to Y}
gives the probability measure on
Y
{\displaystyle Y}
with values
(
k
∘
p
)
(
B
)
=
∫
X
k
(
B
|
x
)
p
(
d
x
)
,
{\displaystyle (k\circ p)(B)=\int _{X}k(B|x)\,p(dx),}
for every measurable subset
B
{\displaystyle B}
of
Y
{\displaystyle Y}
.[ 7]
Given probability spaces
(
X
,
F
,
p
)
{\displaystyle (X,{\mathcal {F}},p)}
and
(
Y
,
G
,
q
)
{\displaystyle (Y,{\mathcal {G}},q)}
, a measure-preserving Markov kernel
(
X
,
F
,
p
)
→
(
Y
,
G
,
q
)
{\displaystyle (X,{\mathcal {F}},p)\to (Y,{\mathcal {G}},q)}
is a Markov kernel
k
:
(
X
,
F
)
→
(
Y
,
G
)
{\displaystyle k:(X,{\mathcal {F}})\to (Y,{\mathcal {G}})}
such that for every measurable subset
B
∈
G
{\displaystyle B\in {\mathcal {G}}}
,[ 7]
q
(
B
)
=
∫
X
k
(
B
|
x
)
p
(
d
x
)
.
{\displaystyle q(B)=\int _{X}k(B|x)\,p(dx).}
Probability spaces and measure-preserving Markov kernels form a category , which can be seen as the slice category
(
H
o
m
S
t
o
c
h
(
1
,
−
)
,
S
t
o
c
h
)
{\displaystyle (\mathrm {Hom} _{\mathrm {Stoch} }(1,-),\mathrm {Stoch} )}
.
Measurable functions
Every measurable function
f
:
(
X
,
F
)
→
(
Y
,
G
)
{\displaystyle f:(X,{\mathcal {F}})\to (Y,{\mathcal {G}})}
defines canonically a Markov kernel
δ
f
:
(
X
,
F
)
→
(
Y
,
G
)
{\displaystyle \delta _{f}:(X,{\mathcal {F}})\to (Y,{\mathcal {G}})}
as follows,
δ
f
(
B
|
x
)
=
1
B
(
f
(
x
)
)
=
{
1
f
(
x
)
∈
B
;
0
f
(
x
)
∉
B
{\displaystyle \delta _{f}(B|x)=1_{B}(f(x))={\begin{cases}1&f(x)\in B;\\0&f(x)\notin B\end{cases}}}
for every
x
∈
X
{\displaystyle x\in X}
and every
B
∈
G
{\displaystyle B\in {\mathcal {G}}}
. This construction preserves identities and compositions, and is therefore a functor from Meas to Stoch .
Isomorphisms
By functoriality, every isomorphism of measurable spaces (in the category Meas ) induces an isomorphism in Stoch . However, in Stoch there are more isomorphisms, and in particular, measurable spaces can be isomorphic in Stoch even when the underlying sets are not in bijection.
Relationship with other categories
H
o
m
S
t
o
c
h
(
X
,
Y
)
≅
H
o
m
M
e
a
s
(
X
,
P
Y
)
{\displaystyle \mathrm {Hom} _{\mathrm {Stoch} }(X,Y)\cong \mathrm {Hom} _{\mathrm {Meas} }(X,PY)}
between Stoch and the category of measurable spaces .
As mentioned above, one can construct a category of probability spaces and measure-preserving Markov kernels as the slice category
(
H
o
m
S
t
o
c
h
(
1
,
−
)
,
S
t
o
c
h
)
{\displaystyle (\mathrm {Hom} _{\mathrm {Stoch} }(1,-),\mathrm {Stoch} )}
.
Particular limits and colimits
Since the functor
L
:
M
e
a
s
→
S
t
o
c
h
{\displaystyle L:\mathrm {Meas} \to \mathrm {Stoch} }
is left adjoint , it preserves colimits .[ 8] Because of this, all colimits in the category of measurable spaces are also colimits in Stoch . For example,
The initial object is the empty set, with its trivial measurable structure;
The coproduct is given by the disjoint union of measurable spaces, with its canonical sigma-algebra .
The sequential colimit of a decreasing filtration is given by the intersection of sigma-algebras.
In general, the functor
L
{\displaystyle L}
does not preserve limits . This in particular implies that the product of measurable spaces is not a product in Stoch in general. Since the Giry monad is monoidal , however, the product of measurable spaces still makes Stoch a monoidal category .[ 4]
A limit of particular significance for probability theory is de Finetti's theorem , which can be interpreted as the fact that the space of probability measures (Giry monad ) is the limit in Stoch of the diagram formed by finite permutations of sequences.
Almost sure version
Sometimes it is useful to consider Markov kernels only up to almost sure equality , for example when talking about disintegrations or about regular conditional probability .
Given probability spaces
(
X
,
F
,
p
)
{\displaystyle (X,{\mathcal {F}},p)}
and
(
Y
,
G
,
q
)
{\displaystyle (Y,{\mathcal {G}},q)}
, we say that two measure-preserving kernels
k
,
h
:
(
X
,
F
,
p
)
→
(
Y
,
G
,
q
)
{\displaystyle k,h:(X,{\mathcal {F}},p)\to (Y,{\mathcal {G}},q)}
are almost surely equal if and only if for every measurable subset
B
∈
G
{\displaystyle B\in {\mathcal {G}}}
,
k
(
B
|
x
)
=
h
(
B
|
x
)
{\displaystyle k(B|x)=h(B|x)}
for
p
{\displaystyle p}
-almost all
x
∈
X
{\displaystyle x\in X}
.[ 7]
This defines an equivalence relation on the set of measure-preserving Markov kernels
k
,
h
:
(
X
,
F
,
p
)
→
(
Y
,
G
,
q
)
{\displaystyle k,h:(X,{\mathcal {F}},p)\to (Y,{\mathcal {G}},q)}
.
Probability spaces and equivalence classes of Markov kernels under the relation defined above form a category . When restricted to standard Borel probability spaces, the category is often denoted by Krn .[ 7]
See also
Citations
References
Lawvere, F. W. (1962). "The Category of Probabilistic Mappings" (PDF) .
Chentsov, N. N. (1965). "The categories of mathematical statistics" . Dokl. Akad. SSSR . 164 .
Giry, Michèle (1982). "A categorical approach to probability theory" . Categorical Aspects of Topology and Analysis . Lecture Notes in Mathematics. Vol. 915. Springer. pp. 68– 85. doi :10.1007/BFb0092872 . ISBN 978-3-540-11211-2 .
Panangaden, Prakash (1999). "The category of Markov kernels" . Electronic Notes in Theoretical Computer Science . 22 : 171– 187. doi :10.1016/S1571-0661(05)80602-4 .
Riehl, Emily (2016). Category Theory in Context . Dover. ISBN 9780486809038 .
Kallenberg, Olav (2017). Random Measures, Theory and Applications . Probability Theory and Stochastic Modelling. Vol. 77. Springer. doi :10.1007/978-3-319-41598-7 . ISBN 978-3-319-41596-3 .
Dahlqvist, Fredrik; Danos, Vincent; Garnier, Ilias; Silva, Alexandra (2018). "Borel Kernels and their Approximation, Categorically". MFPS 2018: Proceedings of Mathematical Foundations of Programming Semantics . arXiv :1803.02651 .
Fritz, Tobias (2020). "A synthetic approach to Markov kernels, conditional independence and theorems on sufficient statistics" . Advances in Mathematics . 370 . arXiv :1908.07021 . doi :10.1016/j.aim.2020.107239 . S2CID 201103837 .
Further reading