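Covariance and the Covariance Matrix

1. Covariance

Definition: if the real-valued random variables $X$ and $Y$ have expected values $\mathrm{E}(X)=\mu$ and $\mathrm{E}(Y)=\nu$, their covariance is defined as: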

$$
\operatorname{cov}(X, Y)=\mathrm{E}[(X-\mu)(Y-\nu)]
$$
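
In practice the expectation is usually replaced by an average over paired observations. The following is a minimal sketch in plain Python, assuming two equal-length lists of observations; the helper name `sample_cov` and its `ddof` argument are illustrative choices, not something defined in the text above.

```python
def sample_cov(xs, ys, ddof=1):
    """Estimate cov(X, Y) from paired observations.

    ddof=1 divides by (m - 1) (unbiased estimate); ddof=0 divides by m.
    `sample_cov` is a hypothetical helper written for this illustration.
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    m = len(xs)
    mean_x = sum(xs) / m  # sample estimate of mu
    mean_y = sum(ys) / m  # sample estimate of nu
    # Sum of products of deviations: the sample analogue of E[(X - mu)(Y - nu)]
    total = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return total / (m - ddof)


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
cov_xy = sample_cov(xs, ys)  # positive, because ys grows with xs
var_x = sample_cov(xs, xs)   # cov(X, X) is the variance of X
```

Calling the helper with the same list twice gives the sample variance, which is the special case that fills the diagonal of a covariance matrix.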

2. Covariance matrix

Consider a random vector (also called a multivariate random variable) $\mathbf{X} = \left[ x_1, x_2, x_3, \ldots, x_n \right]^\top$ whose $n$ components are scalar random variables. Its covariance matrix collects the covariance of every pair of components:

$$
\mathbf{C}=\left[\begin{array}{cccc}
\operatorname{cov}\left(x_1, x_1\right) & \operatorname{cov}\left(x_1, x_2\right) & \ldots & \operatorname{cov}\left(x_1, x_n\right) \\
\operatorname{cov}\left(x_2, x_1\right) & \operatorname{cov}\left(x_2, x_2\right) & \ldots & \operatorname{cov}\left(x_2, x_n\right) \\
\vdots & \vdots & \ddots & \vdots \\
\operatorname{cov}\left(x_n, x_1\right) & \operatorname{cov}\left(x_n, x_2\right) & \ldots & \operatorname{cov}\left(x_n, x_n\right)
\end{array}\right]
$$

When each component is observed $m$ times and the expectations are replaced by sample means $\bar{x}_i$, each entry is estimated as $\hat{c}_{ij}=\frac{1}{m-1}\sum_{k=1}^{m}\left(x_{ik}-\bar{x}_i\right)\left(x_{jk}-\bar{x}_j\right)$. The factor $\frac{1}{m-1}$ belongs to this sample estimate, not to the definition of $\mathbf{C}$ itself.

The $(i, j)$ entry of the covariance matrix is defined as:

$$
c_{i j}=\operatorname{cov}\left(x_i, x_j\right)=\mathrm{E}\left[\left(x_i-\mu_i\right)\left(x_j-\mu_j\right)\right]
$$

where $\mu_i$ is the expected value of $x_i$, that is, $\mu_i=\mathrm{E}\left(x_i\right)$. In matrix form, the covariance matrix is:

$$
\mathbf{C} =\mathrm{E}\left[(\mathbf{X}-\mathrm{E}[\mathbf{X}])(\mathbf{X}-\mathrm{E}[\mathbf{X}])^{\mathrm{T}}\right]
$$
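
When the expectation is estimated from data, the matrix form above becomes a product of the centered data matrix with its transpose. The following is a minimal sketch, assuming the observations are stored as an array of shape (n, m) with one row per variable (the layout `numpy.cov` uses by default) and assuming the unbiased $\frac{1}{m-1}$ normalization; the function name `cov_matrix` is hypothetical.

```python
import numpy as np


def cov_matrix(data):
    """Sample covariance matrix of an (n, m) array: n variables, m observations."""
    data = np.asarray(data, dtype=float)
    m = data.shape[1]
    # Subtract each variable's mean (the sample estimate of E[X]) from its row
    centered = data - data.mean(axis=1, keepdims=True)
    # (X - E[X])(X - E[X])^T summed over observations, divided by (m - 1)
    return centered @ centered.T / (m - 1)


data = np.array([[1.0, 2.0, 3.0, 4.0],
                 [2.0, 1.0, 4.0, 3.0],
                 [0.0, 1.0, 0.0, 1.0]])
C = cov_matrix(data)
print(C.shape)                       # (3, 3)
print(np.allclose(C, C.T))           # checks that C is symmetric
print(np.allclose(C, np.cov(data)))  # compares against numpy.cov
```

The result is an $n \times n$ symmetric matrix whose diagonal holds the sample variances of the individual components.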

Nomenclatures differ. Some statisticians, following the probabilist William Feller in his two-volume book An Introduction to Probability Theory and Its Applications, $^{[2]}$ call the matrix $\mathbf{C}$ above, also written $\mathrm{K}_{\mathbf{X X}}$, the variance of the random vector $\mathbf{X}$, because it is the natural generalization to higher dimensions of the 1-dimensional variance. Others call it the covariance matrix, because it is the matrix of covariances between the scalar components of the vector $\mathbf{X}$.

$$
\operatorname{var}(\mathbf{X})=\operatorname{cov}(\mathbf{X}, \mathbf{X})=\mathrm{E}\left[(\mathbf{X}-\mathrm{E}[\mathbf{X}])(\mathbf{X}-\mathrm{E}[\mathbf{X}])^{\mathrm{T}}\right] .
$$

Both forms are quite standard, and there is no ambiguity between them. The matrix $\mathrm{K}_{\mathbf{X X}}$ is also often called the variance-covariance matrix, since the diagonal terms are in fact variances.
By comparison, the notation for the cross-covariance matrix between two vectors is

$$
\operatorname{cov}(\mathbf{X}, \mathbf{Y})=\mathrm{K}_{\mathbf{X Y}}=\mathrm{E}\left[(\mathbf{X}-\mathrm{E}[\mathbf{X}])(\mathbf{Y}-\mathrm{E}[\mathbf{Y}])^{\mathrm{T}}\right]
$$
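
The cross-covariance can be estimated from data in the same way as the covariance matrix, using two centered data matrices instead of one. The following is a rough sketch, assuming both vectors are observed the same number of times and stored with one row per component; `cross_cov` is a hypothetical helper name, and the random data is only there to have something to compute on.

```python
import numpy as np


def cross_cov(x, y):
    """Sample cross-covariance of x (p, m) and y (q, m); the result is (p, q)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape[1] != y.shape[1]:
        raise ValueError("x and y need the same number of observations")
    m = x.shape[1]
    xc = x - x.mean(axis=1, keepdims=True)  # X - E[X]
    yc = y - y.mean(axis=1, keepdims=True)  # Y - E[Y]
    return xc @ yc.T / (m - 1)


rng = np.random.default_rng(0)
x = rng.normal(size=(2, 50))  # 2 components, 50 observations
y = rng.normal(size=(3, 50))  # 3 components, 50 observations
K_xy = cross_cov(x, y)

# numpy.cov(x, y) stacks the rows of x and y into one (5, 50) array,
# so the top-right block of its result holds the same cross-covariances.
full = np.cov(x, y)
print(np.allclose(K_xy, full[:2, 2:]))
```

Unlike $\mathrm{K}_{\mathbf{X X}}$, the cross-covariance matrix is generally not square, and swapping the arguments transposes it.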

Example: let $x_1$ and $x_2$ be two random variables with three observations each:

$$
\begin{aligned}
x_1 &= [-2.1, -1, 4.3] \\
x_2 &= [3.0, 1.1, 0.12]
\end{aligned}
$$

These can be stacked into $X$:

X = np.stack((x1, x2), axis=0)

that is:

$$
\left[\begin{array}{ccc}
-2.1 & -1 & 4.3 \\
3.0 & 1.1 & 0.12
\end{array}\right]
$$

The covariance matrix can be computed with NumPy's numpy.cov() function:

>>> import numpy as np
>>> x1 = [-2.1, -1, 4.3]
>>> x2 = [3, 1.1, 0.12]
>>> X = np.stack((x1, x2), axis=0)  # shape (2, 3): one row per variable

>>> np.cov(X)
array([[11.71      , -4.286     ], # may vary
       [-4.286     ,  2.144133]])

>>> np.cov(x1, x2)
array([[11.71      , -4.286     ], # may vary
       [-4.286     ,  2.144133]])

>>> np.cov(x1, bias=False)
array(11.71)

>>> np.cov(x1, bias=True)
array(7.80666667)

>>> np.cov(x1, ddof=0)
array(7.80666667)
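
The single-variable results above can be checked by hand. With $m=3$ observations of $x_1$, the mean and the sum of squared deviations are:

$$
\bar{x}_1=\frac{-2.1-1+4.3}{3}=0.4, \qquad \sum_{k=1}^{3}\left(x_{1k}-\bar{x}_1\right)^2=(-2.5)^2+(-1.4)^2+3.9^2=23.42
$$

Dividing by $m-1$ gives the unbiased estimate, and dividing by $m$ gives the biased one:

$$
\frac{23.42}{3-1}=11.71, \qquad \frac{23.42}{3} \approx 7.8067
$$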

numpy.cov(m, y=None, rowvar=True, bias=False, ddof=None, fweights=None, aweights=None, *, dtype=None)
Note the default parameter values:
- With the default bias=False, the sum of products of deviations is normalized by $(m-1)$, where $m$ is the number of observations of each variable (not the array argument m in the signature); this gives the unbiased estimate. With bias=True, the normalization is by $m$ instead.
- If ddof is not None, it overrides the default implied by bias. Note that ddof=1 returns the unbiased estimate, even if both fweights and aweights are specified, and ddof=0 returns the simple average (normalization by the number of observations $m$). See the notes in the NumPy documentation for details. The default value is None.
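
The interaction between bias and ddof can be seen on the $x_1$ data from the example. This is a small sketch; the variable names are chosen here for illustration.

```python
import numpy as np

x1 = [-2.1, -1, 4.3]

unbiased = np.cov(x1)                        # default bias=False: divide by m - 1
biased = np.cov(x1, bias=True)               # bias=True: divide by m
overridden = np.cov(x1, bias=True, ddof=1)   # ddof takes precedence over bias

print(np.isclose(overridden, unbiased))      # ddof=1 restores the m - 1 divisor
print(np.isclose(np.cov(x1, ddof=0), biased))
```

Passing ddof explicitly states the divisor directly, which avoids having to remember which way the bias flag points.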