It starts with a description (and derivation) of Principal Component Analysis (PCA) and then uses Random Matrix Theory to try to understand its performance: it looks at the spectrum of the sample covariance matrix when the true covariance is the identity, arriving at the Marchenko-Pastur distribution.
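A quick numerical sanity check of this picture, as a minimal sketch rather than anything taken from the notes: draw `n` Gaussian samples in dimension `p` with identity covariance and compare the extreme eigenvalues of the sample covariance with the edges of the Marchenko-Pastur support, $(1 \pm \sqrt{\gamma})^2$ with $\gamma = p/n$. The function names, the sizes `n` and `p`, and the choice of Gaussian data are assumptions made for illustration.

```python
import numpy as np

def sample_covariance_eigs(n, p, seed=0):
    """Eigenvalues of the sample covariance of n i.i.d. N(0, I_p) samples."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))   # rows are samples; the true covariance is I_p
    S = X.T @ X / n                   # sample covariance (the mean is known to be zero)
    return np.linalg.eigvalsh(S)      # returned in ascending order

def marchenko_pastur_edges(gamma):
    """Edges of the Marchenko-Pastur support for aspect ratio gamma = p/n <= 1."""
    return (1 - np.sqrt(gamma)) ** 2, (1 + np.sqrt(gamma)) ** 2

n, p = 2000, 500                      # illustrative sizes, gamma = 0.25
eigs = sample_covariance_eigs(n, p)
left_edge, right_edge = marchenko_pastur_edges(p / n)
print(f"empirical eigenvalue range: [{eigs.min():.3f}, {eigs.max():.3f}]")
print(f"Marchenko-Pastur support:   [{left_edge:.3f}, {right_edge:.3f}]")
```

Even though every population eigenvalue equals 1, the sample eigenvalues spread over the whole interval, which is why reading structure off the sample spectrum requires some care.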
A natural question is then what happens if the true covariance matrix has low-dimensional structure; a particularly simple example is a rank 1 spike, $I + \beta vv^T$. In that case we derive for which values of $\beta$ we expect to also see a spike in the eigenvalue distribution of the sample covariance matrix (an eigenvalue popping out of the support of the Marchenko-Pastur distribution), and we find a critical value of $\beta$ above which this starts happening (this is generally referred to as the BPP transition).
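The transition is easy to see in simulation. The sketch below assumes a covariance of the form $I + \beta e_1 e_1^T$ (a spike of strength `beta` along the first coordinate) and compares the top sample eigenvalue with the standard prediction for this model: $(1+\beta)(1+\gamma/\beta)$ when $\beta > \sqrt{\gamma}$, and the Marchenko-Pastur edge $(1+\sqrt{\gamma})^2$ otherwise, where $\gamma = p/n$. The function names, sizes, and values of `beta` are illustrative choices.

```python
import numpy as np

def top_eigenvalue_spiked(n, p, beta, seed=0):
    """Largest sample-covariance eigenvalue when the covariance is I + beta * e1 e1^T."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))
    X[:, 0] *= np.sqrt(1.0 + beta)    # first coordinate gets variance 1 + beta
    S = X.T @ X / n
    return np.linalg.eigvalsh(S)[-1]

def predicted_top_eigenvalue(gamma, beta):
    """Asymptotic location of the top eigenvalue in the rank 1 spiked model."""
    if beta > np.sqrt(gamma):
        return (1 + beta) * (1 + gamma / beta)   # eigenvalue pops out of the bulk
    return (1 + np.sqrt(gamma)) ** 2             # eigenvalue sticks to the bulk edge

n, p = 2000, 500                      # gamma = 0.25, so the critical value is 0.5
gamma = p / n
results = {}
for beta in (0.2, 1.0, 2.0):
    results[beta] = (top_eigenvalue_spiked(n, p, beta),
                     predicted_top_eigenvalue(gamma, beta))
    print(f"beta={beta}: empirical={results[beta][0]:.3f}, "
          f"predicted={results[beta][1]:.3f}")
```

Below the critical value the top eigenvalue is indistinguishable from the bulk edge, so the spectrum alone gives no hint that a spike is present.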
Take a look at it here!
Now for the Open problems:
Open Problem 1.1.
A problem by Mallat and Zeitouni that I discussed on this blog, see here.
Open Problem 1.2.
A problem regarding the monotonicity of the average singular value of a Gaussian matrix, also discussed previously in this blog, see here.
Open Problem 1.3.
What follows is a problem posed by Andrea Montanari and Subhabrata Sen.
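Let $W$ denote a symmetric Wigner matrix with i.i.d. entries $W_{ij} \sim \mathcal{N}(0,1)$. Also, given $B \in \mathbb{R}^{n \times n}$ symmetric, define:
$Q(B) = \max \left\{ \mathrm{Tr}(BX) : X \succeq 0, \ X_{ii} = 1 \right\}.$
What is the value of $\xi_*$, defined as
$\xi_* = \inf \left\{ \xi \geq 0 : \lim_{n \to \infty} \frac{1}{n} \mathbb{E}\, Q\left( \frac{\xi}{n} \mathbf{1}\mathbf{1}^T + \frac{1}{\sqrt{n}} W \right) > 2 \right\}$?
It is known that $1 \leq \xi_* \leq 2$ (see Montanari and Sen’s paper and/or my notes). A reasonable conjecture is that it is equal to $1$. Remarkably, this would imply that the SDP relaxation for clustering under the Stochastic Block Model on 2 clusters is also optimal for detection.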
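Two elementary bounds on the SDP value can be checked numerically without solving the SDP. The sketch below assumes the normalization $B = \frac{\xi}{n} \mathbf{1}\mathbf{1}^T + \frac{1}{\sqrt{n}} W$ and $Q(B) = \max\{\mathrm{Tr}(BX) : X \succeq 0, X_{ii} = 1\}$. Since $X = \mathbf{1}\mathbf{1}^T$ is feasible, $Q(B) \geq \mathbf{1}^T B \mathbf{1}$; since every feasible $X$ has trace $n$, $Q(B) \leq n \lambda_{\max}(B)$. The helper names and the sizes used are hypothetical.

```python
import numpy as np

def wigner(n, rng):
    """Symmetric matrix with i.i.d. N(0, 1) entries on and above the diagonal."""
    G = rng.standard_normal((n, n))
    return np.triu(G) + np.triu(G, 1).T

def sdp_value_bounds(n, xi, seed=0):
    """Lower and upper bounds on Q(B) / n for B = (xi/n) 11^T + W / sqrt(n)."""
    rng = np.random.default_rng(seed)
    ones = np.ones(n)
    B = (xi / n) * np.outer(ones, ones) + wigner(n, rng) / np.sqrt(n)
    lower = ones @ B @ ones / n           # plug in the feasible point X = 11^T
    upper = np.linalg.eigvalsh(B)[-1]     # relax X_ii = 1 to Tr(X) = n
    return lower, upper

n = 1000                                  # illustrative size
bounds = {xi: sdp_value_bounds(n, xi) for xi in (0.5, 1.5, 3.0)}
for xi, (lower, upper) in bounds.items():
    print(f"xi={xi}: {lower:.3f} <= Q(B)/n <= {upper:.3f}")
```

The lower bound is roughly $\xi$, which exceeds 2 once $\xi > 2$; the upper bound stays near 2 while $\xi \leq 1$. Neither bound says what happens for $\xi$ between 1 and 2, which is exactly the range the open problem is about.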