Covariance and the Covariance Matrix

1. Covariance

Definition: if the real-valued random variables $X$ and $Y$ have expected values $E(X)=\mu$ and $E(Y)=\nu$, their covariance is defined as:

$\operatorname{cov}(X, Y)=\mathrm{E}[(X-\mu)(Y-\nu)]$
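In practice the expectation is usually replaced by an average over paired observations. The following is a minimal sketch of that estimate, not a reference implementation: the helper name `sample_cov`, its `ddof` argument, and the sample values are illustrative assumptions.

```python
import numpy as np

def sample_cov(x, y, ddof=1):
    """Estimate cov(X, Y) from paired observations (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()   # deviations of X from its sample mean
    dy = y - y.mean()   # deviations of Y from its sample mean
    # Average the products of deviations; ddof=1 divides by (m - 1).
    return float(np.sum(dx * dy) / (x.size - ddof))

# Illustrative sample values (an assumption for this sketch).
x = [-2.1, -1.0, 4.3]
y = [3.0, 1.1, 0.12]
cov_xy = sample_cov(x, y)
print(cov_xy)
```

With `ddof=1` this should agree with the off-diagonal entry of `np.cov(x, y)`.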

2. Covariance Matrix

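Consider a random vector (also called a multivariate random variable) $\mathbf{X} = \left[ x_1, x_2, x_3, \ldots, x_n \right]^\top$, where $n$ is the number of component random variables $x_i$. When each $x_i$ is observed $m$ times, so that each sample contains $m$ elements, the covariance matrix of $\mathbf{X}$ is defined as: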

$\mathbf{C}=\left[\begin{array}{cccc} \operatorname{cov}\left(x_1, x_1\right)&\operatorname{cov}\left(x_1, x_2\right)&\ldots&\operatorname{cov}\left(x_1, x_n\right) \\ \operatorname{cov}\left(x_2, x_1\right)&\operatorname{cov}\left(x_2, x_2\right)&\ldots&\operatorname{cov}\left(x_2, x_n\right) \\ \vdots&\vdots&\ddots&\vdots \\ \operatorname{cov}\left(x_n, x_1\right)&\operatorname{cov}\left(x_n, x_2\right)&\ldots&\operatorname{cov}\left(x_n, x_n\right) \end{array}\right]$

No factor of $\frac{1}{m-1}$ appears here, because each $\operatorname{cov}$ entry is already an expectation. That factor enters only when an entry is estimated from $m$ observations (the unbiased sample covariance).

The $(i, j)$ entry of the covariance matrix is defined as:

$c_{i j}=\operatorname{cov}\left(x_i, x_j\right)=\mathrm{E}\left[\left(x_i-\mu_i\right)\left(x_j-\mu_j\right)\right]$

where $\mu_i$ is the expected value of $x_i$, i.e. $\mu_i=\mathrm{E}\left(x_i\right)$. In matrix form, the covariance matrix is:

$\mathbf{C} =\mathrm{E}\left[(\mathbf{X}-\mathrm{E}[\mathbf{X}])(\mathbf{X}-\mathrm{E}[\mathbf{X}])^{\mathrm{T}}\right]$
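The matrix form translates almost directly into array code once the expectation is replaced by a sample average. The sketch below assumes the NumPy convention that each row of `X` is one variable and each column one observation; the data values and the names `Xc`, `C`, and `m` are illustrative.

```python
import numpy as np

# Rows are variables x_1, x_2; columns are the m observations (assumed layout).
X = np.array([[-2.1, -1.0, 4.3],
              [3.0, 1.1, 0.12]])
m = X.shape[1]

# Subtract each variable's sample mean: the sample analogue of X - E[X].
Xc = X - X.mean(axis=1, keepdims=True)

# Averaged outer product of the centered data, normalized by (m - 1).
C = Xc @ Xc.T / (m - 1)
print(C)
```

The result is symmetric, with the sample variances of each variable on the diagonal, and should match `np.cov(X)`.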

Nomenclatures differ. Some statisticians, following the probabilist William Feller in his two-volume book An Introduction to Probability Theory and Its Applications, call the matrix $\mathrm{K}_{\mathbf{X X}}$ (written $\mathbf{C}$ above) the variance of the random vector $\mathbf{X}$, because it is the natural generalization to higher dimensions of the 1-dimensional variance. Others call it the covariance matrix, because it is the matrix of covariances between the scalar components of the vector $\mathbf{X}$.

$\operatorname{var}(\mathbf{X})=\operatorname{cov}(\mathbf{X}, \mathbf{X})=\mathrm{E}\left[(\mathbf{X}-\mathrm{E}[\mathbf{X}])(\mathbf{X}-\mathrm{E}[\mathbf{X}])^{\mathrm{T}}\right] .$

Both forms are quite standard, and there is no ambiguity between them. The matrix $\mathrm{K}_{\mathbf{X X}}$ is also often called the variance-covariance matrix, since the diagonal terms are in fact variances.
By comparison, the notation for the cross-covariance matrix between two vectors is

$\operatorname{cov}(\mathbf{X}, \mathbf{Y})=\mathrm{K}_{\mathbf{X Y}}=\mathrm{E}\left[(\mathbf{X}-\mathrm{E}[\mathbf{X}])(\mathbf{Y}-\mathrm{E}[\mathbf{Y}])^{\mathrm{T}}\right]$
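As a rough illustration of the cross-covariance formula, the sketch below estimates $\mathrm{K}_{\mathbf{X Y}}$ from samples. The dimensions (2 and 3 variables, 50 observations), the random data, and the variable names are assumptions made only for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(2, 50))   # 2 variables, 50 observations (assumed sizes)
Y = rng.normal(size=(3, 50))   # 3 variables, same 50 observations
m = X.shape[1]

# Center each variable, then average the outer products of X with Y.
Xc = X - X.mean(axis=1, keepdims=True)
Yc = Y - Y.mean(axis=1, keepdims=True)
K_xy = Xc @ Yc.T / (m - 1)     # shape (2, 3), generally not square

# np.cov(X, Y) stacks the variables of X and Y into one 5 x 5 matrix;
# its upper-right block is the same cross-covariance estimate.
joint = np.cov(X, Y)
print(np.allclose(K_xy, joint[:2, 2:]))
```

Unlike $\mathrm{K}_{\mathbf{X X}}$, the cross-covariance matrix need not be square or symmetric.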

Example: take two random vectors $x_1$ and $x_2$, with observed values:

$x_1 = [-2.1, -1, 4.3] \\ x_2 = [3.0, 1.1, 0.12]$

These can be stacked into $X$:

X = np.stack((x1, x2), axis=0)

that is:

$\left[\begin{array}{ccc} -2.1&-1&4.3 \\ 3.0&1.1&0.12 \end{array}\right]$

NumPy's covariance function numpy.cov() computes the covariance matrix:

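import numpy as np

x1 = [-2.1, -1,  4.3]
x2 = [3,  1.1,  0.12]
X = np.stack((x1, x2), axis=0)  # each row is one variable

# Covariance matrix of the stacked data (default normalization: m - 1)
>>> np.cov(X)
array([[11.71      , -4.286     ], # may vary
       [-4.286     ,  2.144133]])

# Passing the two vectors separately gives the same result
>>> np.cov(x1, x2)
array([[11.71      , -4.286     ], # may vary
       [-4.286     ,  2.144133]])

# A single vector yields its variance; bias=False divides by m - 1
>>> np.cov(x1, bias=False)
array(11.71)

# bias=True divides by m
>>> np.cov(x1, bias=True)
array(7.80666667)

# ddof=0 is equivalent to bias=True
>>> np.cov(x1, ddof=0)
array(7.80666667)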

numpy.cov(m, y=None, rowvar=True, bias=False, ddof=None, fweights=None, aweights=None, *, dtype=None)
Note the parameter defaults:
- With the default bias=False, the covariance is normalized by $(m-1)$, where $m$ is the number of observations in each random vector (the unbiased estimate). With bias=True, it is normalized by $m$ instead.
- If ddof is not None, the default value implied by bias is overridden. Note that ddof=1 will return the unbiased estimate, even if both fweights and aweights are specified, and ddof=0 will return the simple average (normalizing by the actual number of observations $m$). See the notes in the NumPy documentation for details. The default value is None.
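The effect of bias and ddof can be checked by computing both normalizations by hand. This is a small sketch using an illustrative sample; the names `ss`, `unbiased`, and `biased` are introduced here only for the example.

```python
import numpy as np

x1 = np.array([-2.1, -1.0, 4.3])   # illustrative sample
m = x1.size

# Sum of squared deviations from the sample mean.
ss = np.sum((x1 - x1.mean()) ** 2)

unbiased = ss / (m - 1)   # what bias=False (ddof=1) normalizes by
biased = ss / m           # what bias=True (ddof=0) normalizes by

print(np.isclose(np.cov(x1), unbiased))                     # default
print(np.isclose(np.cov(x1, bias=True), biased))
print(np.isclose(np.cov(x1, ddof=0), biased))
# An explicit ddof takes precedence over bias.
print(np.isclose(np.cov(x1, bias=True, ddof=1), unbiased))
```

For a single vector, the two results correspond to `np.var(x1, ddof=1)` and `np.var(x1)` respectively.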

3. Pearson Correlation Coefficient

Given the covariance matrix, the Pearson correlation coefficients can be computed as:

$R_{i j}=\frac{c_{i j}}{\sqrt{c_{i i} c_{j j}}}$

The values of $R$ are between -1 and 1, inclusive.

In NumPy, this can be computed directly with the numpy.corrcoef function.
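To connect the formula for $R_{i j}$ with numpy.corrcoef, the sketch below normalizes a covariance matrix by hand and compares the result. The data and the names `d` and `R` are illustrative assumptions.

```python
import numpy as np

# Rows are variables, columns are observations (illustrative data).
X = np.array([[-2.1, -1.0, 4.3],
              [3.0, 1.1, 0.12]])

C = np.cov(X)

# Standard deviations sqrt(c_ii) from the diagonal of C.
d = np.sqrt(np.diag(C))

# R_ij = c_ij / sqrt(c_ii * c_jj), applied to every entry at once.
R = C / np.outer(d, d)

print(R)
print(np.allclose(R, np.corrcoef(X)))
```

The diagonal of `R` is 1 by construction, since each variable is perfectly correlated with itself. Because the normalization factor cancels in the ratio, the choice of bias or ddof in the covariance estimate does not change `R`.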

References:
1. 协方差 (Covariance) - Wikipedia, the free encyclopedia
2. 协方差矩阵 (Covariance matrix) - Wikipedia, the free encyclopedia
3. numpy.cov - NumPy v1.25 Manual, 2023-09-04
4. numpy.corrcoef - NumPy v1.25 Manual
