Linear least squares regression on graph cycles

\( \newcommand{\F}{\mathbf{F}} \newcommand{\f}{\mathbf{f}} \newcommand{\E}{\mathbf{E}} \newcommand{\e}{\mathbf{e}} \newcommand{\M}{\mathtt{M}} \newcommand{\R}{\mathbb{R}} \newcommand{\Z}{\mathbb{Z}} \renewcommand{\P}{\mathbb{P}} \renewcommand{\E}{\mathbb{E}} \renewcommand{\epsilon}{\varepsilon} \DeclareMathOperator{\nullspace}{nullspace} \DeclareMathOperator{\NullspaceMatrix}{NullspaceMatrix} \DeclareMathOperator{\IncidenceMatrix}{IncidenceMatrix} \DeclareMathOperator{\rank}{rank} \DeclareMathOperator{\diag}{diag} \DeclareMathOperator{\range}{range} \DeclareMathOperator*{\span}{span} \DeclareMathOperator{\SVD}{SingularValueDecomposition} \DeclareMathOperator*{\argmin}{arg\,min} \)

Introduction

Figure: Pre-World War II map of the city of Kaliningrad, Russia (formerly Königsberg, Prussia), featuring its seven bridges.
The resolution of the problem of the seven bridges of Königsberg sparked the field of graph theory.

We are interested in estimating the weights of a directed, weighted graph from samples, knowing that the true weights satisfy a set of linear constraints determined by the closed paths within the graph (i.e., its cycles). The worst-case time complexity of the naive approach is super-exponential in the number of vertices. By contrast, the method introduced here has polynomial complexity.

Estimation of weights

Let $G = (V, E)$ be a directed graph with $m$ vertices $V$ and $n$ edges $E \subseteq V \times V$. Our weight function is $W \colon E \to \R$. Consider the extended set of edges $$E^\prime = E \cup \{ (j, i) \, | \, (i, j) \in E \}$$ that includes edges in both directions. Let the extended weight function $w \colon E^\prime \to \R$ be $$ w(i, j) = \begin{cases} W(i, j), & \text{if $(i, j) \in E$}, \\ -W(j, i), & \text{otherwise}. \end{cases} $$

A cycle $C \subseteq E^\prime$ is a set of edges such that $(i, u) \in C$ implies $(v, i) \in C$ for some $u, v \in V$. The weight function $w$ is assumed to satisfy $$\sum_{e \in C} \, w_e = 0$$ for every cycle $C$. Abusing notation, we write $w \in \R^n$ as a vector with components $w_e = w_{ij} = w(i, j)$ for $e = (i, j) \in E^\prime$.

Example

Consider the graph $G = (V, E)$ shown below with weight function $w$. The cycle $C = \{ (1, 2), (2, 3), (3, 1) \}$ must satisfy $w_{12} + w_{23} - w_{13} = 0$.
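The constraint is easy to check numerically. Below is a minimal Python sketch, with an assumed edge set and weight values chosen purely for illustration, that implements the extended weight function $w$ and verifies that the sum around the cycle $C$ vanishes:

```python
# Minimal sketch: extended weight function and a cycle-sum check.
# The edge set and weight values are illustrative assumptions; the weights
# are chosen so that the cycle constraint w_12 + w_23 - w_13 = 0 holds.

W = {(1, 2): 1.0, (2, 3): 1.5, (1, 3): 2.5}  # original weights on E

def w(i, j):
    """Extended weight function on E': w(i, j) = W(i, j) if (i, j) is in E,
    and -W(j, i) for a reversed edge."""
    if (i, j) in W:
        return W[(i, j)]
    return -W[(j, i)]

C = [(1, 2), (2, 3), (3, 1)]          # a cycle in the extended edge set E'
print(sum(w(i, j) for i, j in C))     # -> 0.0
```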

Figure: Geometrically, the estimator $\hat{w}$ is the orthogonal projection of the observed data $\bar{w}$ onto the nullspace of $R$.
Assume the sampled weights of $G$ are of the form $$\bar{w} = w + \epsilon,$$ where $w$ is the vector of true weights and $\epsilon \in \R^n$ is a Gaussian random vector with mean zero and covariance matrix $\Sigma = \diag(\sigma_1^2, \dotsc, \sigma_n^2) \in \R^{n \times n}_{\ge 0}$. We seek the best estimate $\hat{w}$ of the true values $w$ given the observations $\bar{w}$. A naive solution consists of enumerating all of the $k$ cycles in $G$ and solving the constrained minimization problem $$\hat{w} = \argmin_{x \in \nullspace(R)} \| x - \bar{w} \|_2^2,$$ where $R \in \R^{k \times n}$ is the matrix of linear constraints imposed by the cycles.
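For a given constraint matrix $R$, the minimizer is the orthogonal projection of $\bar{w}$ onto $\nullspace(R)$ and can be computed directly with the Moore-Penrose pseudoinverse. A minimal NumPy sketch follows; the single-row $R$ and the observation vector are made-up values for illustration:

```python
import numpy as np

def project_onto_nullspace(R, w_bar):
    """Orthogonal projection of w_bar onto nullspace(R):
    w_hat = w_bar - R^+ (R w_bar), where R^+ is the Moore-Penrose pseudoinverse."""
    return w_bar - np.linalg.pinv(R) @ (R @ w_bar)

# Tiny illustrative constraint: a single cycle, x_12 - x_13 + x_23 = 0.
R = np.array([[1.0, -1.0, 1.0]])
w_bar = np.array([1.1, 2.4, 1.6])     # noisy observations (made up)
w_hat = project_onto_nullspace(R, w_bar)
print(w_hat, R @ w_hat)               # the residual R @ w_hat is ~0
```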

Example

In the preceding example, we have $$ R x = \begin{bmatrix} 1 & -1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & -1 & 0 & -1 & 0 & 0 & 0 & 0 & 0 & 0 \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & -1 & 1 \end{bmatrix} \begin{bmatrix} x_{12} \\ x_{13} \\ x_{14} \\ x_{23} \\ x_{43} \\ x_{45} \\ x_{53} \\ x_{56} \\ x_{67} \\ x_{68} \\ x_{78} \\ \end{bmatrix}. $$

While the naive approach of writing down the equations for each cycle works for small graphs, the number of cycles grows rapidly with the size of the graph, and enumerating all cycles prior to estimating $w$ quickly becomes infeasible.

The vector space of cycles

In this section we show that the set of cycles forms a vector space. This allows us to shift our focus from writing down $R \in \R^{k \times n}$ to the much more practical goal of finding a basis of cycles.

The incidence matrix $B \in \mathbb{R}^{m \times n}$ of the graph $G$ is defined by its column vectors $b_\ell \in \mathbb{R}^m$ for $\ell = 1, \dotsc, n$. If the $\ell$-th edge is $(i, j) \in E$, then the $v$-th component of $b_\ell$ is $$\begin{equation*} (b_\ell)_v = \begin{cases} -1, & \text{if $v = i$}, \\ 1, & \text{if $v = j$}, \\ 0, & \text{otherwise}. \end{cases} \end{equation*}$$

Example

The incidence matrix of $G$ as defined above is $$\begin{equation*} B = \left[ \begin{array}{rrrrrrrrrrrr} -1 & -1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\1 & 0 & -1 & -1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\0 & 1 & 0 & 0 & -1 & -1 & 0 & 0 & 0 & 0 & 0 & 0\\0 & 0 & 1 & 0 & 0 & 0 & -1 & 0 & 0 & 0 & 0 & 0\\0 & 0 & 0 & 1 & 0 & 0 & 0 & -1 & 0 & 0 & 0 & 0\\0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & -1 & 0 & 0 & 0\\0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & -1 & 0 & 0\\0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 0 & 0 & -1 & 0\\0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 0 & -1\\0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 \end{array} \right]. \end{equation*}$$
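The matrix above can be reproduced programmatically. The NumPy sketch below assembles an incidence matrix from an edge list (here, the edges are simply read off the columns of $B$ above) and reports the dimension of its nullspace:

```python
import numpy as np

def incidence_matrix(num_vertices, edges):
    """Incidence matrix B with B[i-1, l] = -1 and B[j-1, l] = +1
    for the l-th edge (i, j); vertices are numbered from 1."""
    B = np.zeros((num_vertices, len(edges)))
    for l, (i, j) in enumerate(edges):
        B[i - 1, l] = -1.0
        B[j - 1, l] = 1.0
    return B

# Edge list read off the columns of the matrix B shown above.
edges = [(1, 2), (1, 3), (2, 4), (2, 5), (3, 6), (3, 7),
         (4, 8), (5, 8), (6, 9), (7, 9), (8, 10), (9, 10)]
B = incidence_matrix(10, edges)
print(B.shape)                                   # (10, 12)
print(B.shape[1] - np.linalg.matrix_rank(B))     # dimension of the nullspace
```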

The dimension $c$ of the nullspace of $B$ is the number of basic cycles; for a graph with $m$ vertices, $n$ edges, and $p$ connected components, $c = n - m + p$. Each vector in $$\nullspace(B) = \{ x \in \R^n \, | \, B x = 0 \}$$ gives us a cycle (via its non-zero entries) and every cycle can be constructed as a linear combination of elements of $\nullspace(B)$. In particular, every row of $R$ lies in $\nullspace(B)$, and the rows of $R$ span it, so $\nullspace(R) = \nullspace(B)^\perp$. Consequently, there is no need to exhaustively enumerate all possible cycles as was previously discussed: a basis of $\nullspace(B)$ captures all of the constraints.

Example

Let $B$ be the incidence matrix with columns $b_1, b_2, b_3, b_4$, and $b_5$. The cycle on the left hand side of the diagram shown below is determined by $u = (1, 1, 1, 1, 0)^\top$ because $B u = b_1 + b_2 + b_3 + b_4 = 0$. The vector $u$ is the sum of the two cycles on the right hand side, $u^\prime = (1, 1, 0, 0, 1)^\top$ and $u^{\prime\prime} = (0, 0, 1, 1, -1)^\top$, which respectively satisfy $B u^{\prime} = b_1 + b_2 + b_5 = 0$ and $B u^{\prime\prime} = b_3 + b_4 - b_5 = 0$.
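To make the decomposition concrete, the sketch below uses one graph consistent with this example, an assumed realization on vertices $\{1, 2, 3, 4\}$ with edges $(1, 2), (2, 3), (3, 4), (4, 1)$ and the chord $(3, 1)$, and verifies that $u$, $u^\prime$, and $u^{\prime\prime}$ lie in the nullspace of $B$ and that $u = u^\prime + u^{\prime\prime}$:

```python
import numpy as np

# Assumed realization of the example: the 4-cycle 1 -> 2 -> 3 -> 4 -> 1
# with the chord 3 -> 1 as the fifth edge.
edges = [(1, 2), (2, 3), (3, 4), (4, 1), (3, 1)]
B = np.zeros((4, 5))
for l, (i, j) in enumerate(edges):
    B[i - 1, l], B[j - 1, l] = -1.0, 1.0

u = np.array([1, 1, 1, 1, 0])
u_prime = np.array([1, 1, 0, 0, 1])
u_double_prime = np.array([0, 0, 1, 1, -1])

print(B @ u, B @ u_prime, B @ u_double_prime)        # all zero vectors
print(np.array_equal(u, u_prime + u_double_prime))   # True
```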

Least squares estimator

If $Q \in \R^{n \times c}$ is a matrix whose column vectors constitute an orthonormal basis of $\nullspace(B)$, then any cycle of $G$ can be expressed as $Q x$ for some $x \in \R^c$. Moreover, $P = I - Q Q^\top \in \R^{n \times n}$ is the orthogonal projection onto $\nullspace(B)^\perp = \nullspace(R)$.

The estimator defined by $$\hat{w} = P \bar{w}$$ solves the constrained minimization problem $$\argmin_{x \in \nullspace(R)} \| x - \bar{w} \|_2^2$$ by construction. Moreover, it is unbiased, $$\E[\hat{w}] = \E[ P \bar{w} ] = \E[ P w + P \epsilon ] = P w = w,$$ since the true weight vector $w$ satisfies every cycle constraint and therefore lies in $\nullspace(R)$. Its covariance, which follows from the law of propagation of errors, is given by $$\Sigma^\prime = P \Sigma P^\top.$$
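Putting the pieces together, here is a minimal end-to-end sketch in Python (using NumPy and SciPy); the graph, the vertex potentials generating the true weights, and the noise level are all illustrative assumptions:

```python
import numpy as np
from scipy.linalg import null_space

rng = np.random.default_rng(0)

# Illustrative triangle graph: edges (1, 2), (1, 3), (2, 3).
edges = [(1, 2), (1, 3), (2, 3)]
m, n = 3, len(edges)
B = np.zeros((m, n))
for l, (i, j) in enumerate(edges):
    B[i - 1, l], B[j - 1, l] = -1.0, 1.0

# True weights derived from assumed vertex potentials, so that every
# cycle constraint holds exactly: w_ij = phi_j - phi_i.
phi = np.array([0.0, 1.0, 2.5])
w_true = np.array([phi[j - 1] - phi[i - 1] for i, j in edges])

# Noisy observations and their (diagonal) covariance.
sigma = np.full(n, 0.1)
w_bar = w_true + rng.normal(0.0, sigma)
Sigma = np.diag(sigma ** 2)

# Orthonormal basis of nullspace(B) and projector onto its orthogonal
# complement, i.e. onto the constraint set nullspace(R).
Q = null_space(B)                  # shape (n, c)
P = np.eye(n) - Q @ Q.T

w_hat = P @ w_bar                  # least squares estimate
Sigma_prime = P @ Sigma @ P.T      # covariance of the estimate

print(w_true, w_hat)
print(Q.T @ w_hat)                 # ~0: the estimate satisfies the cycle constraints
```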

The computational complexity of calculating $\hat{w}$ is dominated by the computation of a basis of the nullspace of $B$, which can be carried out in $\mathcal{O}((m + n) (c + 1))$ time from the fundamental cycles of a spanning tree of $G$ (or via the singular value decomposition of $B$), and is therefore polynomial in the size of the graph.

Last updated: June 12, 2020.