The rank recovery problem

Yuehaw Khoo, Amit Singer, and myself just uploaded an open problem note to the arxiv entitled “Open problem: Tightness of maximum likelihood semidefinite relaxations”. This note is about a problem/conjecture that I think is really interesting and would love to hear your thoughts on. The note is already rather short (3 pages) but I will try to give an even more brief (and informal) description below.

A few blog posts ago I talked about how to build dual certificates to show that a certain semidefinite relaxation gives exact recovery for a specific problem, called {\mathbb{Z}_2} synchronization, that amounts to recover {\pm1} labels of the vertices of a graph given pairwise measurements on the edges that correspond to noisy products between the labels of the nodes.

Essentially, these relaxation work is by first formulating a minimization problem (whose optimum solution is usually the Maximum Likelihood Estimator) that is too hard to solve and to then consider a convex relaxation of that problem. Meaning, to extend the set of feasible solutions of the problem to a (nice) convex set, where we can use the power of convex optimization. In order for the relaxation to provide exact recovery, two things must happen: (A) The ground truth (planted solution) is the minimizer of the original hard problem and (B) the optimum solution of the relaxed problem belongs to the smaller feasible set of the original problem. Interestingly, in many applications, we have observed that (B) has a tendency to happen even in regimes where (A) does not hold, meaning that the relaxation is computing the MLE. However, to the best of my knowledge there is not rigorous understanding of this phenomenon. Note that as we don’t have a way of explicitly describing the optimizer, as it is not the ground truth, dual certificate arguments become very difficult to carry out. The challenge posed consists exactly of understanding this behavior.

In order to illustrate the problem I am going to quickly describe the generalized Procrustes problem: There is an unknown underlying point cloud of {m} points in {\mathbb{R}^d}, {A\in\mathbb{R}^{d\times m}} (where columns of {A} represent the coordinates of the points) and {n} unknown orthogonal transformations in {\mathbb{R}^d}, {\{O_i\}_{i = 1}^n} satisfying {O_iO_i^T=I_{d\times d}}. We are given {n} noisy measurements of the form \(A_i = O_iA + \sigma \eta\), with {\sigma>0} and {\eta\in \mathbb{R}^{d\times n}} is a matrix with i.i.d. {\mathcal{N}(0,1)} entries. To recover the rotations (and then {A} by rotating back and averaging) we interested in minimizing (this is essentially equivalent to an MLE):

\displaystyle \min_{O_1 , \dots, O_n \in \mathbf{O}_d} \sum_{i<j} \left\|O_i^T A_i - O_j^T A_j \right\|_F^2, \ \ \ \ \ (1)

which is an intractable problem in general (note that it has a non convex and exponentially large search space).

Let us rewrite the problem: {\sum_{i<j} \left\|O_i^T A_i - O_j^T A_j \right\|_F^2} has the same solution as maximizing {\sum_{i,j=1}^n \mathrm{Tr}\left(A_jA_i^T O_iO_j^T \right)} which, by taking {X = [O_1^T,\dots, O_n^T]^T[O_1^T,\dots, O_n^T]} and {C\in\mathbb{R}^{dn\times dn}} with {d\times d} blocks {C_{ij}^T = A_jA_i^T}, is equivalent to

\displaystyle \max \mathrm{Tr}(CX) \quad \text{subject to } X\succeq 0,\ \forall_i\,X_{ii}=I_{d\times d},\ \mathrm{rank}(X)\leq d. \ \ \ \ \ (2)

A very natural semidefinite relaxation is obtained by simply dropping the non-convex rank constraint, giving the following Semidefinite Programming (which can be solved efficiently)

\displaystyle \max \mathrm{Tr}(CX) \quad \text{subject to } X\succeq 0,\ \forall_i\,X_{ii}=I_{d\times d}. \ \ \ \ \ (3)

(Note how similar this problem is to Synchronization, when {d=1}).

The issue is that, for {d>1}, the {\ell_2} nature of~(1) together with the continuous nature of the orthogonal transformations variables will make exact recovery under noise impossible, even if we solved~(2), this means that (A) never happens. Remarkably we observe that, even then, the optimum point of the relaxation~(3) is a rank {d} solution, which means that it is a feasible solution to~(2), meaning that (B) happens, we call this rank recovery. Based on these observations we conjecture that (B) happens with high probability below a certain level of noise. You can see suggestive numerical experiments in the note.

We used the generalized Procrustes to illustrate the phenomenon but the truth is that it was observed in many different problems: In the multireference alignment problem, in the global registration problem by , and in camera motion estimation by, etc. Yet, to the best of my knowledge, there is no theoretical understanding of this rank recoveryphenomenon. It would be great to understand this phenomenon for all these problems!

Advertisement

1 thought on “The rank recovery problem

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s