How to build a dual certificate for Synchronization over Z_2

A couple of weeks ago I gave a talk at PACM Graduate Student Seminar where I showed a neat simple proof for exact recovery of \mathbb{Z}_2 Synchronization on the presence of random outliers. Although this particular argument is not available in the literature (yet – it will appear as part of a future publication, keep an eye on this blog!), the result has been shown before by Lanhui and Amit with a different technique.

Synchronization over \mathbf{\mathbb{Z}_2} on a complete graph

Recover, up to a global sign, g_1,\dots,g_n\in\mathbb{Z}_2, given noisy \mathbb{Z}_2 relative measurements \rho_{ij}\approx g_ig_j^{-1}.

The problem is essentially the same as described in my last post with the difference that it’s over \mathbb{Z}_2 and not SO(2).  However, while in the Cheeger inequality for the Connection Laplacian paper and post we showed stability in the presence of adversarial noise, here we will show that if the noise is random one can achieve, with high probability, exact recovery.

We will assume a particular noise model: Given { 0\leq p\leq 1}, each edge measurement { \rho_{ij}} is correct with probability { p} and uniformly distributed in { \mathbb{Z}_2} with probability { 1-p} (independently), this means that

\displaystyle \rho_{ij} = \left\{ \begin{array}{ccc} g_ig_j^{-1} & \text{ with probability } & \frac{1+p}2 \\ -g_ig_j^{-1} & \text{ with probability } & \frac{1-p}2 . \end{array} \right.

Similarly to the {SO(2)} case we attempt to recover {g} by trying to find the potential that would incur the least error in the edges, solution of {\min_{x\in\{\pm1\}} \sum_{i,j=1}^n|x_i-\rho_{ij}x_j|^2}, which is equivalent to

\displaystyle \max_{x\in\{\pm1\}} \sum_{i,j=1}^n\rho_{ij}x_ix_j.

By having {\rho} be the zero-diagonal symmetric matrix representing the relative measurements, and {X=xx^\ast}, we can rewrite the problem as

\displaystyle \begin{array}{cc} \max & \mathrm{tr}(\rho X) \\ \text{s. t.} & X_{ii}=1, X\succeq 0, \mathrm{rank}(X)=1. \end{array} \ \ \ \ \ (1)

This problem is the same as the one that appears in the context of Max-Cut (see this previous post) and is known to be NP-hard, we take the same approach as Goemans and Williamson (described in the same post) and relax the problem by dropping the non-convex rank constraint,

\displaystyle \begin{array}{cc} \max & \mathrm{tr}(\rho X) \\ \text{s. t.} & X_{ii}=1, X\succeq 0]. \end{array} \ \ \ \ \ (2)

This problem is an instance of Semidefinite Programming and can be solved, to arbitrary precision, in polynomial time. Although Semidefinite Programming solvers are somewhat slow there is some (successful) work in building faster solvers exploiting the special structure of this particular Semidefinite Program, I’ll blog about this in the future.

We will show that although there is no constraint forcing the solution of (2) to be rank {1}, this will often happen with the random noise model described above, not only that but it will be the rank {1} matrix corresponding to the ground truth. This is somewhat remarkable given that the NP-hard problem (1) can actually be solved efficiently for “typical” inputs. The rest of the post is devoted to proving the Theorem below (with a neat simple proof)

Theorem 1 Let {p>0}, consider the Synchronization problem over {\mathbf{\mathbb{Z}_2}} on a complete graph with measurements randomly distributed according to the noise model described above. With high probability (tending to {1} as {n} goes to infinity), {gg^\ast} is the solution to (2).

Without loss of generality we can assume that {g=\mathbf{1}}, the all-ones vector. This means that

\displaystyle \rho_{ij} = \left\{ \begin{array}{ccc} 1 & \text{ with probability } & \frac{1+p}2 \\ -1 & \text{ with probability } & \frac{1-p}2 . \end{array} \right.

Let us define {H} as the graph of the {-1} edges, this means that if {A_H} is the adjencacy matrix of {H} and {D_H} and {L_H} respectively it’s degree and Laplacian matrix, we have that {\rho = \mathbf{1}\mathbf{1}^T-I-2A_H}. Note that {H} is an Erdos-Renyi graph with edge probability {\frac{1-p}2}.

We will argue by duality, just like in Linear Programming, each Semidefinite Program admits a dual and, under certain conditions strong duality holds. The argument below only requires weak duality (and for the sake of completeness I will show it).

The objective value for the ground truth is {\mathrm{tr}((\mathbf{1}\mathbf{1}^T-I-2A_H)\mathbf{1}\mathbf{1}^T)}, we have understand when this is in fact the optimum of the SDP.

The dual of this SDP is easy to describe and is given by

\displaystyle \min \mathrm{tr}(Z) \quad \text{ s. t. } Z\text{ is diagonal }, Z-(\mathbf{1}\mathbf{1}^T-I-2A_H) \succeq 0.

Weak duality tell us that any value of the dual needs to be larger than or equal to any value of the primal (and strong duality that, under some conditions, the optimal values match). In fact, this is very easy to show: Let {X} be a feasible point of the primal and {Z} of the dual

\displaystyle \mathrm{tr}(Z) = \mathrm{tr}(ZX) = \mathrm{tr}((\mathbf{1}\mathbf{1}^T-I-2A_H) X)+\mathrm{tr}((Z-(\mathbf{1}\mathbf{1}^T-I-2A_H))X) \geq \mathrm{tr}((\mathbf{1}\mathbf{1}^T-I-2A_H) X),

where the first equality uses the fact that {Z} is diagonal and {X_{ii}=1} for all {i} and the inequality uses the fact that since, both {Z-(\mathbf{1}\mathbf{1}^T-I-2A_H)} and {X} are PSD the trace of their products needs to be positive. If we find a matrix {Z} feasible for the dual such that {\mathrm{tr}(Z) = \mathrm{tr}((\mathbf{1}\mathbf{1}^T-I-2A_H)\mathbf{1}\mathbf{1}^T)} then we immediately get what we want, the optimality of {\mathbf{1}\mathbf{1}^T}. Because of the derivation above we know that we should look for a matrix {Z} for which {\mathrm{tr}((Z-(\mathbf{1}\mathbf{1}^T-I-2A_H))\mathbf{1}\mathbf{1}^T)=0}.  A natural guess is (obtained by enforcing the equality of all the diagonal and not just the trace)

\displaystyle Z=(n-1)I - 2D_H. \ \ \ \ \ (3)

By construction, it satisfies {\mathrm{tr}(Z) = \mathrm{tr}((\mathbf{1}\mathbf{1}^T-I-2A_H)\mathbf{1}\mathbf{1}^T)} all we need to show is that {((n-1)I - 2D_H)-(\mathbf{1}\mathbf{1}^T-I-2A_H)\succeq 0}. We can rewrite this as

\displaystyle nI-\mathbf{1}\mathbf{1}^T-2L_H \succeq 0.

Since {\mathbf{1}} is in the nullspace of the above matrix it suffices to show that {nI-2L_H \succeq 0}, which is equivalent to {\left\|L_H\right\|_s\leq \frac{n}2}, where {\|\cdot\|} denotes the spectral norm.

Since {H} is an Erdos-Renyi graph we can use known results about their spectral properties (see here) that immediately give that, almost surely,

\displaystyle \lambda_n(L_H) = (1+o(1))\left(\frac{1-p}2\right)n = \Big[(1+o(1))(1-p)\Big]\frac{n}2,

so as long as {p>0} and {n} sufficiently large, {Z\succeq 0} with high probability, ending the proof of the Theorem.

Note that I only showed that {gg^\ast} is a solution, and not that it is the unique solution, but the proof can be adapted to show uniqueness.

Advertisements

5 thoughts on “How to build a dual certificate for Synchronization over Z_2

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s