Chapter 7 Push-forward, pooling and unpooling
This section shows how the push-forward operation of Definition 5.3 can be used to realize (un)pooling operations on CCs, and subsequently introduces (un)pooling operations for CCNNs. Further, this section demonstrates how CC-based pooling provides a unifying framework for image and graph-based pooling, and how shape-preserving pooling on CCs is related to the mapper on graphs.
In particular, we establish yet another unifying mathematical principle: pooling, like message passing, can be built from push-forward operations. Thus, push-forward operations form the fundamental building block from which all higher-order computations can be realized. This is important because it establishes a mathematical foundation for a unifying deep learning application programming interface (API) on complexes that combines pooling and message passing-based computations into a single operation. Indeed, in TopoModelX, one of our contributed Python packages, higher-order message passing and the pooling/unpooling operations are implemented as a single function across various topological domains.
7.1 CC-pooling and unpooling
We define a CC-based pooling operation that extends the main characteristics of image-based and graph-based pooling operations. Specifically, we build a pooling operation that 'downscales' the size of a signal supported on a CC \(\mathcal{X}\). To this end, we exploit the hierarchical nature of CCs and define the pooling operation as a push-forward operation induced by a cochain map \(G\colon\mathcal{C}^{i}(\mathcal{X})\to \mathcal{C}^{j}(\mathcal{X})\) that pushes an \(i\)-cochain to a \(j\)-cochain. To obtain a useful pooling operation that downscales the size of its input \(i\)-cochain, we impose the constraint \(j>i\). Definition 7.1 formalizes this idea of CC-pooling, and Figure 7.1 visualizes the intuition behind it. In particular, Figure 7.1 shows an example of successive applications of pooling operations on cochains supported on a CC of dimension three.
Definition 7.1 (CC-pooling operation) Let \(\mathcal{X}\) be a CC and \(G\colon\mathcal{C}^{i}(\mathcal{X})\to \mathcal{C}^{j}(\mathcal{X})\) a cochain map. The push-forward operation induced by \(G\) is called a CC-pooling operation if \(j>i\).
In Definition 7.2, we introduce an unpooling operation on CCs that pushes forward a cochain to a lower-rank cochain. Figure 7.1 shows an example of unpooling on CCs.
Definition 7.2 (CC-unpooling operation) Let \(\mathcal{X}\) be a CC and \(G\colon\mathcal{C}^{i}(\mathcal{X})\to \mathcal{C}^{j}(\mathcal{X})\) a cochain map. The push-forward operation induced by \(G\) is called a CC-unpooling operation if \(j<i\).
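Both definitions amount to multiplying a cochain by a matrix realizing the cochain map \(G\). The following minimal sketch uses hypothetical toy data; the matrix `G` here is an ad hoc vertex-to-face membership matrix chosen for illustration, not an operator from any particular library.

```python
import numpy as np

# A cochain is stored as a matrix of shape (number of cells, number of
# channels); the cochain map G : C^0 -> C^2 is a (faces x nodes) matrix.
rng = np.random.default_rng(0)
n_nodes, n_faces, channels = 6, 2, 3

H_0 = rng.standard_normal((n_nodes, channels))   # 0-cochain on 6 vertices

# G[f, v] = 1 if vertex v belongs to 2-cell f (hypothetical membership).
G = np.array([[1, 1, 1, 0, 0, 0],
              [0, 0, 0, 1, 1, 1]], dtype=float)

# CC-pooling (Definition 7.1): push the 0-cochain up to a 2-cochain.
H_2 = G @ H_0            # shape (2, 3)

# CC-unpooling (Definition 7.2): push a 2-cochain back down via G^T.
H_0_up = G.T @ H_2       # shape (6, 3)
```

Any learnable variant (e.g. with weights and a nonlinearity) follows the same matrix-multiplication pattern.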
7.2 Formulating common pooling operations as CC-pooling
In this section, we formulate common pooling operations in terms of CC-pooling. In particular, we demonstrate that graph and image pooling can be cast as CC-pooling.
7.2.1 Graph pooling as CC-pooling
Here, we briefly demonstrate that the CC-pooling operation (Definition 7.1) is consistent with graph-based pooling algorithms. Let \(\mathbf{H}_0\) be a cochain defined on the vertices of a graph \(\mathcal{G}\). Moreover, let \(\mathbf{H}^{\prime}_0\) be a cochain defined on the vertices of a coarsened version \(\mathcal{G}^{\prime}\) of \(\mathcal{G}\). Under such a setup, \(\mathbf{H}^{\prime}_0\) represents a coarsened version of \(\mathbf{H}_0\). A graph pooling function supported on the pair \((\mathcal{G},\mathbf{H}_0)\) is a function of the form \(\mathcal{POOL} \colon (\mathcal{G},\mathbf{H}_0) \to (\mathcal{G}^{\prime},\mathbf{H}^{\prime}_0)\) that sends every vertex in \(\mathcal{G}\) to a vertex in \(\mathcal{G}^{\prime}\), which corresponds to a cluster of vertices in \(\mathcal{G}\). We now elucidate how the function \(\mathcal{POOL}\) can be realized in terms of CC-pooling.
Proposition 7.1 (The role of CC-pooling) The function \(\mathcal{POOL}\) can be realized in terms of CC-pooling operations.
Proof. Each vertex in the graph \(\mathcal{G}^{\prime}\) represents a cluster of vertices in the original graph \(\mathcal{G}\). Using the membership of these clusters, we construct a CC by augmenting \(\mathcal{G}\) by a collection of 2-cells, so that each of these cells corresponds to a supernode of \(\mathcal{G}^{\prime}\). We denote the resulting CC structure by \(\mathcal{X}_{\mathcal{G}}\), consisting of \(\mathcal{G}\) augmented by the 2-cells. Hence, any 0-cochain \(\mathbf{H}^{\prime}_0\) defined on \(\mathcal{G}^{\prime}\) can be written as a 2-cochain \(\mathbf{H}_2 \in \mathcal{C}^2(\mathcal{X}_{\mathcal{G}})\). The relation between the vertices of the original graph \(\mathcal{G}\) and the vertices of the pooled graph \(\mathcal{G}^{\prime}\), or equivalently the CC \(\mathcal{X}_{\mathcal{G}}\), is described via the incidence matrix \(B_{0,2}^T\). Hence, learning the signal \(\mathbf{H}_2\) can be realized in terms of a map \(B_{0,2}^T \colon \mathcal{C}^{0} (\mathcal{X}_{\mathcal{G}}) \to \mathcal{C}^{2}(\mathcal{X}_{\mathcal{G}})\) that pushes forward the cochain \(\mathbf{H}_0\) to \(\mathbf{H}_2\).
The 2-cells defined on \(\mathcal{X}_{\mathcal{G}}\) can be practically constructed using the mapper on graphs (Hajij, Wang, and Rosen 2018), a shape-summarization tool in TDA. See Section 7.4 for more details of such a construction.
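The construction in the proof can be sketched numerically. In the toy example below, a hypothetical cluster assignment plays the role of the supernodes of \(\mathcal{G}^{\prime}\); each cluster becomes a 2-cell of the augmented CC \(\mathcal{X}_{\mathcal{G}}\), and a mean-aggregating variant of \(B_{0,2}^T\) realizes the pooling.

```python
import numpy as np

# Hypothetical coarsening: each vertex of G is assigned to a supernode of G'.
cluster = np.array([0, 0, 1, 1, 1, 2])            # vertex -> supernode id
n_nodes, n_cells = len(cluster), cluster.max() + 1

# Incidence matrix B_{0,2}: rows index vertices, columns index 2-cells.
B02 = np.zeros((n_nodes, n_cells))
B02[np.arange(n_nodes), cluster] = 1.0

H_0 = np.arange(2 * n_nodes, dtype=float).reshape(n_nodes, 2)  # toy 0-cochain

# Mean-aggregating variant of the push-forward B_{0,2}^T : C^0 -> C^2.
H_2 = (B02 / B02.sum(axis=0)).T @ H_0
# Each row of H_2 is the average feature vector of one cluster of G.
```

Normalizing the columns of `B02` before transposing yields mean pooling; using `B02.T` directly would yield sum pooling instead.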
7.2.2 Image pooling as CC-pooling
Since images can be realized as lattice graphs, a signal stored on an image grid can be realized as a 0-cochain of the lattice graph that corresponds to the image. See Figures 7.2(a–b) for an example. Here, we demonstrate that the CC-pooling operation (Definition 7.1) is consistent with the known image-pooling definition. Indeed, one may augment the lattice graph of Figure 7.2(b) by 2-cells, as shown in Figure 7.2(c), to perform the image pooling operation. Usually, these cells have a regular window size. In Figure 7.2(c), we have chosen the pooling window size, or equivalently the size of the 2-cell, to be \(2\times 2\), and the pooling stride to be 1. The image pooling operation in this case can be realized as a CC-pooling operation induced by the cochain map \(B_{0,2}^T \colon\mathcal{C}^0 \to \mathcal{C}^2\), as visualized in Figure 7.2(d). We formally record this in the following proposition.
Proposition 7.2 (Realization of image pooling) An image pooling operator can be realized in terms of a push-forward operator from the underlying image domain to a 2-dimensional CC obtained by augmenting the image by appropriate 2-cells where image pooling computations occur.
Proof. The proof is a straightforward conclusion from the definition of image pooling.
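The same matrix viewpoint applies to images. The sketch below uses a toy \(4\times 4\) image with non-overlapping \(2\times 2\) windows for simplicity (the window size and stride are illustration choices, not those of Figure 7.2): each window is a 2-cell, and a max-aggregating variant of \(B_{0,2}^T\) recovers ordinary max pooling.

```python
import numpy as np

# A 4x4 image realized as a 0-cochain on the 16 pixels of the lattice graph.
image = np.arange(16, dtype=float).reshape(4, 4)
H_0 = image.reshape(-1)                       # flattened 0-cochain on pixels

# window_of[p] = index of the 2-cell (pooling window) containing pixel p.
rows, cols = np.divmod(np.arange(16), 4)
window_of = (rows // 2) * 2 + (cols // 2)

# Max-aggregating variant of the push-forward B_{0,2}^T : C^0 -> C^2.
H_2 = np.array([H_0[window_of == w].max() for w in range(4)])
pooled = H_2.reshape(2, 2)                    # the usual max-pooled image
```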
7.3 Pooling and unpooling CCNNs
The pooling operator of Definition 7.1 considers only the special case in which the tensor diagram of a CCNN has a single edge. In what follows, we generalize the notion of pooling by identifying the defining properties that characterize a CCNN as a pooling CCNN. To this end, we start with a CCNN whose tensor diagram has a height of one.
Definition 7.3 (Pooling CCNN of height one) Consider a CCNN represented by a tensor diagram \(\mbox{CCNN}_{\mathbf{G};\mathbf{W}}\) of height one. Let \(\mathcal{C}^{i_1}\times\mathcal{C}^{i_2}\times \cdots \times \mathcal{C}^{i_m}\) be the domain and let \(\mathcal{C}^{j_1}\times\mathcal{C}^{j_2}\times \cdots \times \mathcal{C}^{j_n}\) be the codomain of the CCNN. Let \(i_{min}=\min(i_1,\ldots,i_m)\) and \(j_{min}=\min(j_1,\ldots,j_n)\). We say that the CCNN is a pooling CCNN of height one if
- \(i_{min}< j_{min}\), and
- the tensor diagram \(\mbox{CCNN}_{\mathbf{G};\mathbf{W}}\) has an edge labeled by a cochain operator \(G\colon \mathcal{C}^{i_{min}} \to \mathcal{C}^{k}\) for some \(k\geq j_{min}\).
Intuitively, a CCNN represented by a tensor diagram of height one is a pooling CCNN of height one if it pushes forward its lowest-rank signals to higher-rank cells. Observe that a readout operation can be realized as a pooling CCNN of height one; see Figure 7.3 for an illustration.
A CCNN may not perform a pooling operation at every layer; instead, a layer may preserve the rank of the lowest-rank signal. Before we give the general definition of pooling CCNNs, we first define lowest rank-preserving CCNNs of height one.
Definition 7.4 (Lowest rank-preserving CCNN of height one) Consider a CCNN represented by a tensor diagram of height one. Let \(\mathcal{C}^{i_1}\times\mathcal{C}^{i_2}\times \cdots \times \mathcal{C}^{i_m}\) be the domain and let \(\mathcal{C}^{j_1}\times\mathcal{C}^{j_2}\times \cdots \times \mathcal{C}^{j_n}\) be the codomain of the CCNN. Let \(i_{min}=\min(i_1,\ldots,i_m)\) and \(j_{min}=\min(j_1,\ldots,j_n)\). We say that the CCNN is a lowest rank-preserving CCNN of height one if \(i_{min}= j_{min}\).
Every CCNN is a composition of CCNNs that are represented by tensor diagrams of height one. Hence, pooling CCNNs can be characterized in terms of tensor diagrams of height one, as elaborated in Definition 7.5.
Definition 7.5 (Pooling CCNN) Let \(\mbox{CCNN}_{\mathbf{G};\mathbf{W}}\) be a tensor diagram representation of a CCNN. We decompose the CCNN as \[\begin{equation*} \mbox{CCNN}_{\mathbf{G};\mathbf{W}}= \mbox{CCNN}_{\mathbf{G}_N;\mathbf{W}_N} \circ \cdots \circ \mbox{CCNN}_{\mathbf{G}_1;\mathbf{W}_1}, \end{equation*}\] where \(\mbox{CCNN}_{\mathbf{G}_i;\mathbf{W}_i},i=1,\ldots,N\), is a tensor diagram of height one representing the \(i\)-th layer of the CCNN, and \(\mathbf{G}_i \subseteq \mathbf{G}\). We call the CCNN represented by \(\mbox{CCNN}_{\mathbf{G};\mathbf{W}}\) a pooling CCNN if
- every \(\mbox{CCNN}_{\mathbf{G}_i;\mathbf{W}_i}\) is either a pooling CCNN of height one or a lowest rank-preserving CCNN of height one, and
- at least one of the layers \(\mbox{CCNN}_{\mathbf{G}_i;\mathbf{W}_i}\) is a pooling CCNN of height one.
Intuitively, a pooling CCNN is a CCNN whose tensor diagram forms a ‘ladder’ that pushes signals to higher-rank cells at every layer. Figure 5.4(d) gives an example of a pooling CCNN of height two.
An unpooling CCNN of height one is defined similarly to a pooling CCNN of height one (Definition 7.3), with the only difference being that the inequality \(i_{min}<j_{min}\) becomes \(i_{min}>j_{min}\). Moreover, an unpooling CCNN (Definition 7.6) is defined analogously to a pooling CCNN (Definition 7.5).
Definition 7.6 (Unpooling CCNN) Let \(\mbox{CCNN}_{\mathbf{G};\mathbf{W}}\) be a tensor diagram representation of a CCNN. We decompose the CCNN as \[\begin{equation*} \mbox{CCNN}_{\mathbf{G};\mathbf{W}}= \mbox{CCNN}_{\mathbf{G}_N;\mathbf{W}_N} \circ \cdots \circ \mbox{CCNN}_{\mathbf{G}_1;\mathbf{W}_1}, \end{equation*}\] where \(\mbox{CCNN}_{\mathbf{G}_i;\mathbf{W}_i},i=1,\ldots,N\), is a tensor diagram of height one representing the \(i\)-th layer of the CCNN, and \(\mathbf{G}_i \subseteq \mathbf{G}\). We call the CCNN represented by \(\mbox{CCNN}_{\mathbf{G};\mathbf{W}}\) an unpooling CCNN if
- every \(\mbox{CCNN}_{\mathbf{G}_i;\mathbf{W}_i}\) is either an unpooling CCNN of height one or a lowest rank-preserving CCNN of height one, and
- at least one of the layers \(\mbox{CCNN}_{\mathbf{G}_i;\mathbf{W}_i}\) is an unpooling CCNN of height one.
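The rank comparisons underlying Definitions 7.3, 7.4 and 7.6 can be summarized in a small helper. The function below is our own illustrative naming, not part of any package, and it checks only the rank condition \(i_{min}\) versus \(j_{min}\), not the additional requirement of Definition 7.3 that the diagram contain an edge \(G\colon \mathcal{C}^{i_{min}} \to \mathcal{C}^{k}\) with \(k\geq j_{min}\).

```python
def classify_height_one_layer(domain_ranks, codomain_ranks):
    """Classify a height-one tensor diagram by its lowest domain/codomain ranks."""
    i_min, j_min = min(domain_ranks), min(codomain_ranks)
    if i_min < j_min:
        return "pooling"           # lowest-rank signal is pushed upward
    if i_min > j_min:
        return "unpooling"         # lowest-rank signal is pushed downward
    return "lowest rank-preserving"

# E.g. a layer C^0 x C^1 -> C^2 pushes its lowest-rank signal upward:
print(classify_height_one_layer([0, 1], [2]))      # pooling
print(classify_height_one_layer([2], [0, 1]))      # unpooling
print(classify_height_one_layer([0, 1], [0, 2]))   # lowest rank-preserving
```

A CCNN is then a pooling (resp. unpooling) CCNN when every layer is classified as pooling or preserving (resp. unpooling or preserving), with at least one pooling (resp. unpooling) layer.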
7.4 Mapper and the CC-pooling operation
In practice, constructing a useful CC-pooling operation on a higher-order domain is determined by the higher-rank cells in the input CC. Similar to image-based models, CC-pooling operations can be applied sequentially at the end of a higher-order network to provide a summary representation of the input domain; see Figure 7.1 for an example. Such a hierarchical summary might not be readily available in the input CC. For instance, if \(\mathcal{X}\) is a graph, then a CC-pooling operation, as given in Definition 7.1, can only push forward an input node-signal to an edge-signal, which may not always provide a compact summary of the input signal.
In such situations, one may choose to augment the input CC \(\mathcal{X}\) with a collection of new cells of dimension \(\dim(\mathcal{X})+1\) so that the new cells approximate the shape of the input CC \(\mathcal{X}\). Figure 7.4 displays an example of augmenting a graph \(\mathcal{X}\), which is a CC of dimension one, with new cells of dimension two using the mapper on graphs (MOG) construction suggested in (Dey, Mémoli, and Wang 2016; Singh et al. 2007; Hajij, Wang, and Rosen 2018). The augmented higher-rank cells obtained from the MOG construction summarize the shape features of the underlying graph, which is a desirable pooling characteristic (e.g., in shape analysis). We refer to Appendix E for details about the MOG construction of topology-preserving CC-pooling operations.
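The cover-and-cluster idea behind such constructions can be sketched in a few lines. The function below is a rough illustration under our own simplifying assumptions (a scalar function on nodes, an interval cover of its range, connected components found by depth-first search); it is not the MOG construction of Appendix E.

```python
from collections import defaultdict

def mapper_cells(edges, f, intervals):
    """Candidate 2-cells: connected components of the preimage of each interval."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    cells = []
    for lo, hi in intervals:
        nodes = {v for v, fv in f.items() if lo <= fv <= hi}
        seen = set()
        for start in nodes:                    # one DFS per unseen component
            if start in seen:
                continue
            comp, stack = set(), [start]
            while stack:
                v = stack.pop()
                if v in comp:
                    continue
                comp.add(v)
                stack.extend(adj[v] & nodes - comp)
            seen |= comp
            if len(comp) > 1:                  # keep non-trivial cells only
                cells.append(frozenset(comp))
    return cells

# A path graph 0-1-2-3 with f = node index and two overlapping intervals.
edges = [(0, 1), (1, 2), (2, 3)]
f = {0: 0.0, 1: 1.0, 2: 2.0, 3: 3.0}
cells = mapper_cells(edges, f, [(0.0, 1.5), (1.0, 3.0)])
```

Each returned vertex set would become one 2-cell augmenting the graph, and the resulting incidence matrix \(B_{0,2}^T\) induces the CC-pooling operation of Definition 7.1.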
References
While graph skeletonization using the mapper algorithm (Singh et al. 2007) has been studied in (Dey, Mémoli, and Wang 2016), our implementation and discussion here rely on the notions suggested in (Hajij, Wang, and Rosen 2018).