
Banach Fixed Point Theory

Recently, while reading the 2009 GNN paper (reference 1), I came across this concept. It comes from Metric Fixed Point Theory, i.e. from functional analysis. I have not studied it in depth; the content of this blog post is only meant as background knowledge for GNNs.

Fixed Point Theory

Fixed point problem:
Suppose \(X\) is a set and \(A, B\) are two nonempty subsets of \(X\) with \(A \cap B \neq \emptyset\). Given a mapping \(f: A \rightarrow B\), if there exists a point \(x \in A\) such that \(f(x) = x\), then \(x\) is called a fixed point of \(f\).
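As a quick sanity check of the definition (a toy example of my own, not from the references), take \(A = B = [0, 1]\) and \(f(x) = x^2\):

```python
# Toy check of the fixed-point definition (my own example):
# f(x) = x**2 maps [0, 1] into [0, 1]; f(x) = x holds exactly at x = 0 and x = 1.
def f(x):
    return x ** 2

for x in [0.0, 0.5, 1.0]:
    print(f"x = {x}, f(x) = {f(x)}, fixed point: {f(x) == x}")
```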

Fixed Point Theory is divided into three main areas:
1. Topological Fixed Point Theory
2. Metric Fixed Point Theory
3. Discrete Fixed Point Theory
These three areas are marked out by three landmark theorems:
1. Brouwer’s Fixed Point Theorem
2. Banach’s Fixed Point Theorem
3. Tarski’s Fixed Point Theorem

Metric Fixed Point Theory

In 1922, Banach published a fixed point theorem now known as Banach’s Contraction Principle.

Lipschitzian mapping: let \((M, d)\) be a metric space. A mapping \(T: M \rightarrow M\) is called Lipschitzian if there exists a constant \(k > 0\) such that \(d(T(x), T(y)) \leq k\, d(x, y)\) for all \(x, y \in M\).
A Lipschitzian mapping whose Lipschitz constant \(k\) is less than 1, i.e. \(k < 1\), is called a contraction.
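A minimal numerical illustration (my own example, not from the references): \(T(x) = \cos(x)\) maps \([0, 1]\) into \([0, 1]\) and is a contraction there, since \(|T'(x)| = |\sin(x)| \leq \sin(1) \approx 0.841 < 1\) on that interval. The sketch below estimates the Lipschitz constant empirically:

```python
import math
import random

# Empirically estimate the Lipschitz constant of T(x) = cos(x) on [0, 1]
# (my own toy example; theory predicts k = sin(1) ≈ 0.841 < 1).
def T(x):
    return math.cos(x)

random.seed(0)
pairs = [(random.uniform(0, 1), random.uniform(0, 1)) for _ in range(10000)]
k_hat = max(abs(T(x) - T(y)) / abs(x - y) for x, y in pairs if x != y)
print(f"empirical k ≈ {k_hat:.4f}, sin(1) = {math.sin(1):.4f}")
```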

What this definition first brings to mind is the Lipschitz condition used to judge the smoothness of a function. (That condition appears frequently in optimization, e.g. in the line-search convergence analysis in the book Numerical Optimization.)

Banach’s Contraction Principle: suppose \((M, d)\) is a complete metric space and \(T: M \rightarrow M\) is a contraction. Then \(T\) has a unique fixed point \(x^*\); for every \(x \in M\), \(\lim_{n \rightarrow \infty} T^n(x) = x^*\), and moreover \(d(T^n(x), x^*) \leq \frac{k^n}{1-k}\, d(T(x), x)\).

Pick an arbitrary \(x_0 \in M\) and construct the sequence \(\{x_n\}\): \(x_1 = T(x_0), \ldots, x_n = T(x_{n-1}), \ldots\)

Since \(T\) is a contraction, \(d(T(x), T(y)) \leq k\, d(x, y)\). Let \(d(x_1, x_0) = c_0\); then
\(d(x_2, x_1) = d(T(x_1), T(x_0)) \leq k\, d(x_1, x_0) = k c_0\)
and, applying the contraction property repeatedly,
\(d(x_n, x_{n-1}) = d(T(x_{n-1}), T(x_{n-2})) \leq k\, d(x_{n-1}, x_{n-2}) \leq k^{n-1} c_0\)
Since \(d\) is a metric, the triangle inequality gives, for any positive integer \(m\):
\(d(x_n, x_{n+m}) \leq d(x_n, x_{n+1}) + d(x_{n+1}, x_{n+2}) + \ldots + d(x_{n+m-1}, x_{n+m}) \leq (k^n + k^{n+1} + \ldots + k^{n+m-1}) c_0 = \frac{k^n (1-k^m)}{1-k} c_0 \leq \frac{k^n}{1-k} c_0 \rightarrow 0\)

Hence \(\{x_n\}\) is a Cauchy sequence, and since \(M\) is complete it converges to some \(x^* \in M\); because a contraction is continuous, \(T(x^*) = \lim T(x_n) = \lim x_{n+1} = x^*\), so \(x^*\) is a fixed point. Letting \(m \rightarrow \infty\) in the inequality above shows that \(x_n\) approximates \(x^*\) after \(n\) iterations:
\(d(x_n, x^*) \leq \frac{k^n}{1-k}\, d(x_1, x_0) = \frac{k^n}{1-k}\, d(T(x_0), x_0)\)
That is, for every \(x \in M\): \(d(T^n(x), x^*) \leq \frac{k^n}{1-k}\, d(T(x), x)\)
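The proof is directly constructive. The sketch below (continuing my \(T(x) = \cos(x)\) example, with \(k = \sin(1)\)) runs the iteration and compares the true error against the a priori bound \(\frac{k^n}{1-k}\, d(T(x_0), x_0)\):

```python
import math

# Picard iteration for the contraction T(x) = cos(x) on [0, 1]
# (my own example); k = sin(1) is its Lipschitz constant.
T = math.cos
k = math.sin(1.0)

x0 = 1.0
c0 = abs(T(x0) - x0)                      # d(T(x0), x0)

# Iterate until the a priori bound guarantees an error below 1e-10.
x, n = x0, 0
while k ** n / (1 - k) * c0 > 1e-10:
    x, n = T(x), n + 1
x_star = x                                # ≈ 0.7390851332 (the Dottie number)

# Compare the actual error with the bound k^n/(1-k) * c0 at each step.
x = x0
for n in range(1, 11):
    x = T(x)
    print(f"n={n:2d}  error={abs(x - x_star):.3e}  bound={k**n / (1-k) * c0:.3e}")
```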

A useful generalization: suppose \((M, d)\) is a complete metric space and \(T: M \rightarrow M\) is a mapping such that \(T^N\) is a contraction for some positive integer \(N\). Then \(T\) has a unique fixed point.
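A concrete instance of this generalization (my own example): \(T(x) = Ax\) on \(\mathbb{R}^2\) with the nilpotent matrix \(A = \begin{pmatrix} 0 & 2 \\ 0 & 0 \end{pmatrix}\) is not a contraction in the Euclidean metric, yet \(T^2 = 0\) is, and \(T\) indeed has the unique fixed point \(0\):

```python
import numpy as np

# T(x) = A @ x is not a contraction (it stretches (0,1) to (2,0)),
# but T^2(x) = A @ A @ x = 0 for every x, so T^2 is one (here N = 2).
A = np.array([[0.0, 2.0],
              [0.0, 0.0]])

x, y = np.array([0.0, 1.0]), np.array([0.0, 0.0])
print(np.linalg.norm(A @ x - A @ y))   # 2.0 > d(x, y) = 1.0
print(A @ A)                           # zero matrix
print(A @ np.zeros(2))                 # the unique fixed point of T is x* = 0
```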

The Banach fixed point theorem thus yields an iterative method for computing the fixed point of a contraction on a complete metric space.

In a GNN there are two local functions: for a given node \(n\),
\(x_n = f_w(l_n, l_{co[n]}, x_{ne[n]}, l_{ne[n]})\)
\(o_n = g_w(x_n, l_n)\)
where \(l_n\) is the label of \(n\), \(l_{co[n]}\) the labels of its incident edges, and \(x_{ne[n]}\), \(l_{ne[n]}\) the states and labels of its neighbors. Stacking all the states \(x\), labels \(l\), and outputs \(o\) yields the global functions:
\(x = F_w(x, l)\)
\(o = G_w(x, l_N)\)
Because the paper constrains \(F_w\) to be a contraction with respect to the state \(x\), the Banach fixed point theorem guarantees that the global state \(x\), and hence the output \(o\), exist and are unique for the whole graph system.
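As a sketch of how this plays out computationally (a hypothetical linear \(F_w\) of my own, not the paper's actual network): once \(F_w\) is a contraction in \(x\), the global state can be obtained by the same Picard iteration as in the proof above, starting from any initial state:

```python
import numpy as np

# Hypothetical toy transition function: F_w(x) = W @ x + b with
# spectral norm ||W|| = 0.9 < 1, so F_w is a contraction in x.
# (The label-dependent terms of the real GNN are folded into b here.)
rng = np.random.default_rng(0)
n_nodes, dim = 5, 3
W = rng.normal(size=(n_nodes * dim, n_nodes * dim))
W *= 0.9 / np.linalg.norm(W, 2)       # rescale to make the map contractive
b = rng.normal(size=n_nodes * dim)

def F_w(x):
    return W @ x + b

x = np.zeros(n_nodes * dim)           # any starting state works
for _ in range(200):                  # Picard iteration x <- F_w(x)
    x = F_w(x)

print(np.linalg.norm(F_w(x) - x))     # ≈ 0: x is numerically the unique fixed point
```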

References
1. Scarselli F, Gori M, Tsoi A C, et al. The graph neural network model[J]. IEEE Transactions on Neural Networks, 2009, 20(1): 61-80.
2. Khamsi M A, Kirk W A. An introduction to metric spaces and fixed point theory[M]. John Wiley & Sons, 2011.
3. https://www.zhihu.com/question/40652658/answer/496053256
4. https://wenku.baidu.com/view/42e5378610a6f524ccbf85a8.html
5. https://www.bilibili.com/video/av20207040?p=1