18.S096: Concentration Inequalities, Scalar and Matrix Versions

The fourth set of Lecture notes for my course is available here. This one is large deviation and concentration inequalities, for sums of independent scalar or matrices random variables. It also has 5 open problems related to problems involving concentration of certain random matrices. As usual, I will document the open problems here, while referring a much more detailed description of the problems on the notes, including description of partial progress.

Open Problem 4.1. (Improvement over Non-commutative Khintchine inequality)

Let A_1,\dots,A_n\in \mathbb{R}^{d\times d} be symmetric matrices and g_1,\dots,g_n\sim\mathcal{N}(0,1) i.i.d.. Does the following hold?

\mathbb{E} \left\| \sum_{k=1}^{n} g_k A_k \right\| \lesssim \sigma + \left(\log d\right)^{\frac12}\sigma_\ast.

This problem is motivated by the results here and here. See Open Problem 4.1. in these notes for references, motivation, partial progress, and discussion of potential directions for progress.


Open Problem 4.2. (Lata\la-Riemer-Schutt)

Let X\in\mathbb{R}^{d\times d} be a symmetric matrix with (otherwise) independent gaussian entries. Prove (or disprove):

\mathbb{E} \|X\| \lesssim\mathbb{E} \max_k \|Xe_k\|_2

Note that, \|X\| \geq \max_{k} \|Xe_k\|_2. This conjecture is motivated by comments and results in this, this, and this paper. If the largest variance of an entry is comparable to the one of many of the entries, the results in here imply this conjecture. However, in general, the best known upperbound, due to Ramon van Handel, and has an extra factor of \sqrt{\log\log d}The notes have a lot more information.


Open Problem 4.3. (Matrix version of 6 deviations suffice)

This open problem appears in a nice blog post by Raghu Meka. We refer to the discussion on the notes and to Raghu’s blog post for references, and context of the problem.

Prove or disprove: there exists a universal constant C such that, for any choice of n symmetric matrices H_1,\dots,H_n\in\mathbb{R}^{n\times n} satisfying \|H_k\|\leq 1 (for all k=1,\dots,n), there exists \varepsilon_1,\dots,\varepsilon_n \in \{ \pm1\} such that

\left\| \sum_{k=1}^n \varepsilon_k H_k \right\| \leq C \sqrt{n}.

In the notes we also propose investigating (and motivate) a strengthening of this conjecture, asking, in the same conditions, for

$latex \left\| \sum_{k=1}^n \eps_k H_k \right\| \lesssim \left\| \sum_{k=1}^n H_k^2 \right\|^{\frac12}$


Open Problem 4.4. (OSNAP)

This problem concerns a conjecture by Jelani Nelson and Huy L. Nguyen regarding Sparse Norm-Approximating Projections, a form of dimension reduction useful for fast linear algebra.

We defer the exposition of the full open problem to the notes and present here part (3) of it, a particularly interesting instance of the conjecture.

Part (3) of the problem: Let s\leq d\leq m and z_1,\dots,z_m\in \mathbb{R}^d i.i.d. random vectors with i.i.d. entries

\left( z_k\right)_j = \left\{ \begin{array}{rcc} -\frac1{\sqrt{s}} & \text{ with probability } & \frac{s}{2m} \\ 0 & \text{ with probability } & 1-\frac{s}{m} \\ \frac1{\sqrt{s}} & \text{ with probability } & \frac{s}{2m} \end{array} \right.

Note that \mathbb{E} z_kz_k^T = \frac1m I_{d\times d}. The conjecture is that, there exists $c_1$ and $c_2$ positive universal constants such that

\mathrm{Prob}\left\{ \left\| \sum_{k=1}^m \left[z_k z_k^T - \mathbb{E} z_k z_k^T\right] \right\| \geq \varepsilon \right\} < \delta,

for m\geq c_1 \frac{d+\log\left( \frac1{\delta} \right)}{\varepsilon^2} and $s\geq c_2 \frac{\log\left( \frac{d}{\delta} \right)}{\varepsilon^2}$.

I think this would is an interesting question even for fixed \delta, for say \delta = 0.1, or even simply understand the value of

\mathbb{E} \left\| \sum_{k=1}^m \left[z_k z_k^T - \mathbb{E} z_k z_k^T\right] \right\|.



Open Problem 4.5. (Random k-lifts of graphs)

The next problem of this lecture is inspired in this paper (see the notes for more context and current status). Given a graph G, on n nodes and with max-degree \Delta, and an integer k\geq 2 a random k lift G^{\otimes k} of G is a graph on kn nodes obtained by replacing each edge of G by a random k\times k bipartite matching. More precisely, the adjacency matrix A^{\otimes k} of G^{\otimes k} is a nk\times nk matrix with k\times k blocks given by

A^{\otimes k}_{ij} = A_{ij} \Pi_{ij},

where \Pi_{ij} is uniformly randomly drawn from the set of permutations on k elements, and all the edges are independent, except for the fact that \Pi_{ij} = \Pi_{ji}. In other words,

A^{\otimes k} = \sum_{i < j} A_{ij} \left( e_i e_j^T \otimes \Pi_{ij} + e_j e_i^T \otimes \Pi_{ij}^T \right),

where \otimes corresponds to the Kronecker product. Note that \mathbb{E} A^{\otimes k} = A \otimes \left( \frac{1}{k} J \right), where J is the all-ones matrix.

Give a tight upperbound to

\mathbb{E}\left\| A^{\otimes k} -\mathbb{E} A^{\otimes k} \right\|.


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s