# How to build a dual certificate for Synchronization over Z_2

A couple of weeks ago I gave a talk at the PACM Graduate Student Seminar where I showed a neat, simple proof of exact recovery for $\mathbb{Z}_2$ Synchronization in the presence of random outliers. Although this particular argument is not available in the literature (yet – it will appear as part of a future publication, keep an eye on this blog!), the result has been shown before by Lanhui and Amit with a different technique.

**Synchronization over $\mathbb{Z}_2$ on a complete graph**

Recover, up to a global sign, $g_1,\dots,g_n\in\mathbb{Z}_2$, given noisy $\mathbb{Z}_2$ relative measurements $\rho_{ij}\approx g_ig_j^{-1}$.

The problem is essentially the same as the one described in my last post, with the difference that it is over $\mathbb{Z}_2$ rather than $SO(2)$. However, while in the Cheeger inequality for the Connection Laplacian paper and post we showed stability in the presence of adversarial noise, here we will show that if the noise is random one can achieve, with high probability, exact recovery.

We will assume a particular noise model: given ${0\leq p\leq 1}$, each edge measurement ${\rho_{ij}}$ is correct with probability ${p}$ and uniformly distributed in ${\mathbb{Z}_2}$ with probability ${1-p}$ (independently across edges). This means that

$\displaystyle \rho_{ij} = \left\{ \begin{array}{ccc} g_ig_j^{-1} & \text{ with probability } & \frac{1+p}2 \\ -g_ig_j^{-1} & \text{ with probability } & \frac{1-p}2 . \end{array} \right.$
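As a sanity check, this noise model is easy to simulate; here is a small numpy sketch (the function and parameter names are mine, not from any reference implementation):

```python
import numpy as np

def sample_measurements(g, p, rng):
    """Sample the symmetric, zero-diagonal matrix of relative measurements:
    rho_ij = g_i * g_j with probability (1+p)/2, and -g_i * g_j otherwise."""
    n = len(g)
    flips = np.where(rng.random((n, n)) < (1 + p) / 2, 1, -1)
    flips = np.triu(flips, 1)      # draw each edge once (strict upper triangle)
    flips = flips + flips.T        # symmetrize; the diagonal stays zero
    return np.outer(g, g) * flips

rng = np.random.default_rng(0)
n, p = 50, 0.5
g = rng.choice([-1, 1], size=n)
rho = sample_measurements(g, p, rng)
```

For ${p = 0.5}$ roughly three quarters of the edges should carry the correct relative sign.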

Similarly to the ${SO(2)}$ case, we attempt to recover ${g}$ by looking for the potential that incurs the least error on the edges, the solution of ${\min_{x\in\{\pm1\}^n} \sum_{i,j=1}^n|x_i-\rho_{ij}x_j|^2}$, which is equivalent to

$\displaystyle \max_{x\in\{\pm1\}^n} \sum_{i,j=1}^n\rho_{ij}x_ix_j.$

Taking ${\rho}$ to be the zero-diagonal symmetric matrix of relative measurements, and writing ${X=xx^\ast}$, we can rewrite the problem as

$\displaystyle \begin{array}{cc} \max & \mathrm{tr}(\rho X) \\ \text{s. t.} & X_{ii}=1, X\succeq 0, \mathrm{rank}(X)=1. \end{array} \ \ \ \ \ (1)$

This problem is the same as the one that appears in the context of Max-Cut (see this previous post) and is known to be NP-hard. We take the same approach as Goemans and Williamson (described in the same post) and relax the problem by dropping the non-convex rank constraint,

$\displaystyle \begin{array}{cc} \max & \mathrm{tr}(\rho X) \\ \text{s. t.} & X_{ii}=1, X\succeq 0. \end{array} \ \ \ \ \ (2)$

This problem is an instance of Semidefinite Programming and can be solved, to arbitrary precision, in polynomial time. Although general-purpose Semidefinite Programming solvers are somewhat slow, there is some (successful) work on building faster solvers that exploit the special structure of this particular Semidefinite Program; I'll blog about this in the future.

We will show that, although there is no constraint forcing the solution of (2) to be rank ${1}$, under the random noise model described above this will often happen; not only that, but the solution will be the rank ${1}$ matrix corresponding to the ground truth. This is somewhat remarkable, as it means the NP-hard problem (1) can actually be solved efficiently for "typical" inputs. The rest of the post is devoted to proving the theorem below (with a neat, simple proof).

Theorem 1. Let ${p>0}$ and consider the Synchronization problem over ${\mathbb{Z}_2}$ on a complete graph, with measurements randomly distributed according to the noise model described above. With high probability (tending to ${1}$ as ${n}$ goes to infinity), ${gg^\ast}$ is the solution to (2).

Without loss of generality we can assume that ${g=\mathbf{1}}$, the all-ones vector. This means that

$\displaystyle \rho_{ij} = \left\{ \begin{array}{ccc} 1 & \text{ with probability } & \frac{1+p}2 \\ -1 & \text{ with probability } & \frac{1-p}2 . \end{array} \right.$

Let us define ${H}$ as the graph of the ${-1}$ edges. This means that if ${A_H}$ is the adjacency matrix of ${H}$, and ${D_H}$ and ${L_H}$ are, respectively, its degree and Laplacian matrices, then ${\rho = \mathbf{1}\mathbf{1}^T-I-2A_H}$. Note that ${H}$ is an Erdős-Rényi graph with edge probability ${\frac{1-p}2}$.
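This identity is easy to check numerically. With ${g=\mathbf{1}}$, the ${-1}$ entries of ${\rho}$ are exactly the edges of ${H}$, so ${A_H}$ can be recovered from ${\rho}$; a small sketch (variable names are mine):

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 100, 0.5
# With g = 1, rho_ij = +1 w.p. (1+p)/2 and -1 w.p. (1-p)/2.
rho = np.where(rng.random((n, n)) < (1 + p) / 2, 1, -1)
rho = np.triu(rho, 1); rho = rho + rho.T

# H collects the -1 edges; invert rho = 11^T - I - 2 A_H for A_H:
A_H = (np.ones((n, n)) - np.eye(n) - rho) / 2
D_H = np.diag(A_H.sum(axis=1))
L_H = D_H - A_H
```

The edge density of ${A_H}$ should then be close to ${\frac{1-p}2}$, here ${0.25}$.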

We will argue by duality: just as in Linear Programming, each Semidefinite Program admits a dual and, under certain conditions, strong duality holds. The argument below only requires weak duality (which, for the sake of completeness, I will show).

The objective value at the ground truth is ${\mathrm{tr}((\mathbf{1}\mathbf{1}^T-I-2A_H)\mathbf{1}\mathbf{1}^T)}$; we need to understand when this is in fact the optimum of the SDP.

The dual of this SDP is easy to describe and is given by

$\displaystyle \min \mathrm{tr}(Z) \quad \text{ s. t. } Z\text{ is diagonal }, Z-(\mathbf{1}\mathbf{1}^T-I-2A_H) \succeq 0.$

Weak duality tells us that the value of any feasible point of the dual is larger than or equal to the value of any feasible point of the primal (and strong duality says that, under some conditions, the optimal values match). In fact, this is very easy to show: let ${X}$ be a feasible point of the primal and ${Z}$ one of the dual,

$\displaystyle \mathrm{tr}(Z) = \mathrm{tr}(ZX) = \mathrm{tr}((\mathbf{1}\mathbf{1}^T-I-2A_H) X)+\mathrm{tr}((Z-(\mathbf{1}\mathbf{1}^T-I-2A_H))X) \geq \mathrm{tr}((\mathbf{1}\mathbf{1}^T-I-2A_H) X),$

where the first equality uses the fact that ${Z}$ is diagonal and ${X_{ii}=1}$ for all ${i}$, and the inequality uses the fact that, since both ${Z-(\mathbf{1}\mathbf{1}^T-I-2A_H)}$ and ${X}$ are PSD, the trace of their product is nonnegative. If we find a matrix ${Z}$, feasible for the dual, such that ${\mathrm{tr}(Z) = \mathrm{tr}((\mathbf{1}\mathbf{1}^T-I-2A_H)\mathbf{1}\mathbf{1}^T)}$, then we immediately get what we want: the optimality of ${\mathbf{1}\mathbf{1}^T}$. Because of the derivation above, we know that we should look for a matrix ${Z}$ for which ${\mathrm{tr}((Z-(\mathbf{1}\mathbf{1}^T-I-2A_H))\mathbf{1}\mathbf{1}^T)=0}$. A natural guess (obtained by enforcing equality on each diagonal entry, and not just the trace) is
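The weak duality inequality itself can be sanity-checked numerically. Below I take the easiest dual-feasible point I know of, ${Z=\lambda_{\max}(\rho)I}$ (diagonal, and ${Z-\rho\succeq 0}$ by construction — this is only a crude stand-in, not the certificate built next), and compare against primal-feasible points ${X=xx^T}$ with ${x\in\{\pm1\}^n}$:

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 60, 0.5
rho = np.where(rng.random((n, n)) < (1 + p) / 2, 1, -1)
rho = np.triu(rho, 1); rho = rho + rho.T

# A crude dual-feasible Z: lam_max * I is diagonal and dominates rho.
lam_max = np.linalg.eigvalsh(rho)[-1]
Z = lam_max * np.eye(n)

# Every primal-feasible value tr(rho X) must stay below tr(Z).
gaps = []
for _ in range(20):
    x = rng.choice([-1, 1], size=n)
    X = np.outer(x, x)             # primal feasible: X_ii = 1, X PSD, rank 1
    gaps.append(np.trace(Z) - np.trace(rho @ X))
```

Every gap is nonnegative, as the derivation above guarantees.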

$\displaystyle Z=(n-1)I - 2D_H. \ \ \ \ \ (3)$

By construction it satisfies ${\mathrm{tr}(Z) = \mathrm{tr}((\mathbf{1}\mathbf{1}^T-I-2A_H)\mathbf{1}\mathbf{1}^T)}$, so all we need to show is that ${((n-1)I - 2D_H)-(\mathbf{1}\mathbf{1}^T-I-2A_H)\succeq 0}$. We can rewrite this as

$\displaystyle nI-\mathbf{1}\mathbf{1}^T-2L_H \succeq 0.$

Since ${\mathbf{1}}$ is in the nullspace of the above matrix, it suffices to show that ${nI-2L_H \succeq 0}$, which is equivalent to ${\left\|L_H\right\|\leq \frac{n}2}$, where ${\|\cdot\|}$ denotes the spectral norm.
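At this point the whole certificate argument can be checked numerically on a single random instance (again with ${g=\mathbf{1}}$): the trace of the candidate ${Z}$ from (3) matches the primal value at the ground truth exactly, and dual feasibility reduces to the PSD condition just derived. A sketch:

```python
import numpy as np

rng = np.random.default_rng(4)
n, p = 100, 0.5
rho = np.where(rng.random((n, n)) < (1 + p) / 2, 1, -1)
rho = np.triu(rho, 1); rho = rho + rho.T

A_H = (np.ones((n, n)) - np.eye(n) - rho) / 2
D_H = np.diag(A_H.sum(axis=1))
L_H = D_H - A_H

Z = (n - 1) * np.eye(n) - 2 * D_H          # the candidate certificate (3)
ones = np.ones((n, n))

# tr(Z) matches the primal objective at X = 11^T (an exact identity)...
trace_gap = np.trace(Z) - np.trace(rho @ ones)
# ...and dual feasibility reduces to n*I - 11^T - 2 L_H being PSD.
lam_min = np.linalg.eigvalsh(n * np.eye(n) - ones - 2 * L_H)[0]
```

At this noise level the smallest eigenvalue is comfortably nonnegative, so the certificate is feasible for this instance.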

Since ${H}$ is an Erdős-Rényi graph, we can use known results about the spectral properties of such graphs (see here), which immediately give that, almost surely,

$\displaystyle \lambda_n(L_H) = (1+o(1))\left(\frac{1-p}2\right)n = \Big[(1+o(1))(1-p)\Big]\frac{n}2,$

so, as long as ${p>0}$ and ${n}$ is sufficiently large, ${Z}$ is dual feasible (i.e., ${Z-\rho\succeq 0}$) with high probability, which finishes the proof of the Theorem.
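The asymptotic behavior of ${\lambda_{\max}(L_H)}$ can also be seen empirically. Here I check that ${\lambda_{\max}(L_H)/n}$ is already comfortably below ${1/2}$ at moderate ${n}$ (the convergence of the ${o(1)}$ term towards ${(1-p)/2}$ is slow, so I only test a loose bracket):

```python
import numpy as np

rng = np.random.default_rng(5)
n, p = 400, 0.5
q = (1 - p) / 2                    # edge probability of H

# Sample an Erdos-Renyi graph H and form its Laplacian.
A_H = np.where(rng.random((n, n)) < q, 1.0, 0.0)
A_H = np.triu(A_H, 1); A_H = A_H + A_H.T
L_H = np.diag(A_H.sum(axis=1)) - A_H

# Should approach q = 0.25 as n grows, and must stay below 1/2.
ratio = np.linalg.eigvalsh(L_H)[-1] / n
```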

Note that I only showed that ${gg^\ast}$ is a solution, and not that it is the unique solution; the proof can, however, be adapted to show uniqueness.