# 18.S096: Group Testing and Error-Correcting Codes

A new set of Lecture notes is available here. These ones are about group testing and contain a very brief “crash-course” on error-correction codes. They also include five open problems. As usual, I will document the open problems here, while referring a much more detailed description of the problems on the notes.

Originally there was a problem (numbered 7.1.) about group testing bounds. Thanks to the reader who pointed out in the comments that this problem was solved in 2014! We replaced it with problem 4.6. here.

Open Problem 7.1. (Gilbert-Varshamov bound)

The Gilbert-Varshamov bound is an important (lower) bound on the optimal rate of error-correction, however neither it is known to be tight not it has been achieved by explicit deterministic constructions of codes (see notes). This motivates two questions:

1. Explicit deterministic constructions of codes achieving the GV bound
2. Is the GV bound tight?

Open Problem 7.2. (Boolean Classification and Annulus Conjecture)

In traditional error-correcting codes, one is interested in linear codes with large distance (or, in other words, avoiding the pinched ball of non-zero points with Hamming weight $< d$). Inspired by classification applications one can relax this requirement and ask that the linear code avoids an annulus $A(a,b)=\{x\in\{0,1\}^n:\,a\leq\Delta(x)\leq b\}$ where $\Delta(x)$ is the Hamming weight. We define $latex R_A(a,b,n)$ as the optimal rate for such a code. The open problem is to resolve the conjecture on this paper:

Prove or disprove: $latex R_A(\alpha n,\beta n,n)=\alpha + (1 – \alpha) R_A(1,\beta n,(1-\alpha))+o(1)$

We refer to the notes for more details.

Open Problem 7.3. (Shannon Capacity of 7 Cycle)

What is the Shannon Capacity of the 7-Cycle?

This is a classical (and fascinating) question in graph theory. See the notes for the definitions, more details and references (see also this blog post).

Open Problem 7.4. (The Deletion Channel)

Say we have to send a binary string “10010” through a deletion channel and the first and second bits get deleted, then the message receive would be “010” and the receiver would not know which bits were deleted. This is in contrast with the erasure channel where bits are erased but the receiver knows which bits are missing (in the case above the message received would be “??010”). The so-called Trace Reconstruction problem is the problem where the same message is sent multiple times and the goal of the receiver is to find exactly the original message sent from the many observed corrupted version of it. We will be interested in the following quantity:
Draw a random binary string with $n$ bits, suppose the channel has a deletion probability of $\frac12$ for each bit (independently), define $\mathcal{D}\left(n;\frac12\right)$ has the number of times the receiver needs to receive the message (with independent corruptions) so that she can decode the message exactly, with high probability. It is easy to see that $\mathcal{D}\left(n;\frac12\right) \leq 2^n$, since roughly once in every $2^n$ times the whole message will go through the channel unharmed. It is possible to show that $\mathcal{D}\left(n;\frac12\right) \leq 2^{\sqrt{n}}$ but it is not known whether this bound is tight.

What are the asymptotics of $\mathcal{D}\left(n;\frac12\right)$?

\item An interesting aspect of the Deletion Channel is that different messages may have different difficulties of decoding. This motivates the following question: What are the two (distinct) binary sequences $x^{(1)}$ and $x^{(2)}$ that are more difficult to distinguish (let’s say that the receiver knows that either $x^{(1)}$ or $x^{(2)}$ was sent but not which)?

## 6 thoughts on “18.S096: Group Testing and Error-Correcting Codes”

1. Ugo says:

Hi, your open problem 7.1 has been solved in the paper A. G. D’’yachkov, I. V. Vorob’ev, N. A. Polyansky, and V. Yu. Shchukin:
Bounds on the rate of disjunctive codes,
Problems of Information Transmission, Vol. 50, No. 1, pp. 27––56 (2014). In this paper it is shown that the lower bound of Theorem 7.2 of your notes is tight.

1. Afonso S. Bandeira says:

Many thanks for the reference! I’ll edit the notes accordingly and replace it for another problem.
Afonso

1. Afonso S. Bandeira says:

The problem was removed and replaced by 4.6., thank you!

2. Ugo says:

It seems that the authors of A. G. D’’yachkov et al.: Bounds on the rate of disjunctive codes,
Problems of Information Transmission, Vol. 50, No. 1, pp. 27––56 (2014) have now retracted their claim. Please see their erratum at DOI: 10.1134/S0032946016020083
I apologize if I created confusion!

1. Afonso S. Bandeira says:

Thanks for letting us know! I’ll add the problem back later this semester! Afonso

3. Cyrus says:

The trace reconstruction problem is indeed very interesting, and there is still a big gap. It is worth noting that the upper bound has been improved to exp(n^{1/3}) by two independent groups
https://arxiv.org/abs/1612.03599
https://arxiv.org/abs/1612.03148

Also, your question about distinguishing two specific sequences is the basis for the recent (and best known) lower bound for the problem
https://arxiv.org/abs/1808.02336