Concept in probability theory
For the notion in quantum mechanics, see scattering matrix.
In multivariate statistics and probability theory, the scatter matrix is a statistic that is used to make estimates of the covariance matrix, for instance of the multivariate normal distribution.
Definition
Given n samples of m-dimensional data, represented as the m-by-n matrix $X=[\mathbf{x}_1,\mathbf{x}_2,\ldots,\mathbf{x}_n]$, the sample mean is
$$\overline{\mathbf{x}} = \frac{1}{n}\sum_{j=1}^{n}\mathbf{x}_j$$
where $\mathbf{x}_j$ is the j-th column of $X$.[1]
The scatter matrix is the m-by-m positive semi-definite matrix
$$S = \sum_{j=1}^{n}(\mathbf{x}_j-\overline{\mathbf{x}})(\mathbf{x}_j-\overline{\mathbf{x}})^{T} = \sum_{j=1}^{n}(\mathbf{x}_j-\overline{\mathbf{x}})\otimes(\mathbf{x}_j-\overline{\mathbf{x}}) = \left(\sum_{j=1}^{n}\mathbf{x}_j\mathbf{x}_j^{T}\right) - n\,\overline{\mathbf{x}}\,\overline{\mathbf{x}}^{T}$$
where $(\cdot)^{T}$ denotes matrix transpose,[2] and each product of a column vector with a row vector is an outer product. The scatter matrix may be expressed more succinctly as
$$S = X\,C_n\,X^{T}$$
where $C_n$ is the n-by-n centering matrix.
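The three expressions for the scatter matrix can be checked numerically. The following is a minimal sketch in Python with NumPy, not a reference implementation. The function names and the small data matrix are illustrative assumptions, and the centering matrix is taken in its usual form, the identity minus the n-by-n matrix whose entries are all 1/n.

```python
import numpy as np

def scatter_outer(X):
    """Scatter matrix as a sum of outer products of centered columns."""
    X = np.asarray(X, dtype=float)
    x_bar = X.mean(axis=1, keepdims=True)   # sample mean, shape (m, 1)
    D = X - x_bar                            # centered data, shape (m, n)
    return D @ D.T                           # equals sum_j d_j d_j^T

def scatter_moments(X):
    """Scatter matrix as (sum_j x_j x_j^T) - n * x_bar x_bar^T."""
    X = np.asarray(X, dtype=float)
    n = X.shape[1]
    x_bar = X.mean(axis=1, keepdims=True)
    return X @ X.T - n * (x_bar @ x_bar.T)

def scatter_centering(X):
    """Scatter matrix as X C_n X^T with an explicit centering matrix."""
    X = np.asarray(X, dtype=float)
    n = X.shape[1]
    C_n = np.eye(n) - np.ones((n, n)) / n    # assumed form of the centering matrix
    return X @ C_n @ X.T

# Illustrative data: n = 4 samples (columns) of m = 2 dimensions (rows).
X = np.array([[1.0, 2.0, 3.0, 4.0],
              [2.0, 1.0, 4.0, 3.0]])

S = scatter_outer(X)
print(S)
```

For this data all three functions return the same 2-by-2 matrix, with 5 on the diagonal and 3 off the diagonal. The centering-matrix form is compact on paper but builds an n-by-n matrix, so for large n the centered-data form is usually the cheaper one to compute.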
Application
The maximum likelihood estimate, given n samples, for the covariance matrix of a multivariate normal distribution can be expressed as the normalized scatter matrix
$$C_{ML} = \frac{1}{n}S.$$ [3]
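As a small illustration of this normalization, the sketch below divides the scatter matrix by n and compares the result with NumPy's `np.cov`. The function name `ml_covariance` and the data are illustrative assumptions. Note that `np.cov` divides by n − 1 by default and only matches the maximum likelihood estimate when called with `bias=True`.

```python
import numpy as np

def ml_covariance(X):
    """Scatter matrix divided by n, for an m-by-n matrix of column samples."""
    X = np.asarray(X, dtype=float)
    n = X.shape[1]
    D = X - X.mean(axis=1, keepdims=True)    # center each row (variable)
    S = D @ D.T                               # scatter matrix
    return S / n

# Illustrative data: n = 4 samples (columns) of m = 2 dimensions (rows).
X = np.array([[1.0, 2.0, 3.0, 4.0],
              [2.0, 1.0, 4.0, 3.0]])

C_ml = ml_covariance(X)
C_np_biased = np.cov(X, bias=True)    # rows are variables; normalizes by n
C_np_unbiased = np.cov(X)             # default normalizes by n - 1
print(C_ml)
```

Here `C_ml` agrees with `np.cov(X, bias=True)`, while the default `np.cov(X)` is larger by the factor n / (n − 1).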
When the columns of $X$ are independently sampled from a multivariate normal distribution, $S$ has a Wishart distribution.
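One consequence that can be checked by simulation: under the standard result that this Wishart distribution has n − 1 degrees of freedom (an assumption added here, not stated above), the expected value of $S$ is (n − 1) times the true covariance matrix. The sketch below is a rough Monte Carlo check only. The covariance matrix `sigma`, the sample size, the trial count, and the seed are arbitrary illustrative choices.

```python
import numpy as np

def mean_scatter(sigma, n, trials, seed=0):
    """Average the scatter matrix over many simulated normal samples."""
    rng = np.random.default_rng(seed)
    m = sigma.shape[0]
    # samples has shape (trials, n, m): each trial holds n draws of dimension m.
    samples = rng.multivariate_normal(np.zeros(m), sigma, size=(trials, n))
    centered = samples - samples.mean(axis=1, keepdims=True)
    # One m-by-m scatter matrix per trial: sum over the n draws of outer products.
    S_all = np.einsum('tni,tnj->tij', centered, centered)
    return S_all.mean(axis=0)

sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])   # illustrative true covariance
n = 10
mean_S = mean_scatter(sigma, n=n, trials=20000)

# If E[S] = (n - 1) * sigma, this ratio should be close to sigma.
print(mean_S / (n - 1))
```

The rescaled average should lie close to `sigma`, up to Monte Carlo error. This checks only the mean of $S$, not the full distribution.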
References