Seminar on learning theory (106372)

Notes can be found here.

The focus of this seminar is combinatorial questions motivated by learning and teaching theories. A list of topics we aim to study: PAC learning, VC dimension, sign rank, sample compression schemes, teaching sets and recursive teaching dimension, population recovery and PID graphs, and more.

The aim is for the meetings to be interactive and full of discussion. The presentations will mostly be given by the participants of the seminar (I will provide some introduction).

The meetings will be summarized by the presenting participant: use LaTeX and send me a draft (both .tex and .pdf files) before your presentation. Please use the following files: .pdf and .tex. When preparing the talk, find a couple of exercises and add them to your notes.

Guidelines for presentations:

  • Start with a (long) introduction: What is the background for the work? What are the main questions? Why are they interesting? Provide formal definitions of the model. State the main theorems.
  • Explain the high-level structure of the proof. What is the intuition behind it?
  • Explain the proof. Write formal lemmas that are helpful in the proof. Explain how the lemmas imply the theorem. Prove each lemma separately.
  • Summarize the talk. What is the main statement to remember? What is the moral of the story? What are the key steps? What problems do you suggest investigating further?

A more detailed list of papers/topics:

  • L. G. Valiant. A theory of the learnable. Communications of the ACM, 1984.
  • VC dimension. Examples: intervals on the line, rectangles in the plane, convex bodies in the plane (a worked example follows this list).
  • (Shay) Sauer-Perles-Shelah lemma (stated after this list): two proofs (shifting and algebraic; see e.g. here). Upper bound on cover numbers using Sauer-Perles-Shelah [Dudley; see Theorem 1.3 in our paper].
  • (Anat & Yoni) Sample complexity in PAC learning (see appendix in our paper and references within).
  • (Vsevolod) Cover numbers. Haussler’s upper bound (see notes).
  • (Ahsan) Haussler’s paper on agnostic learning (Section 9.4 shows uniform convergence using cover numbers). Pollard’s paper on uniform convergence.
  • (Gadir) Boosting. Section 2 in Freund’s work.
  • (Vered) Sample compression schemes. Warmuth and Floyd's paper; see also our paper. Example: intervals on the line (a code sketch follows this list). Definition. Greedy construction of logarithmic size. Compression implies PAC learnability. Boosting yields a compression from size m to size d log(m).
  • (Ori) Sign rank. Definition (recalled after this list). Embedding using halfspaces. The sign rank is at least the VC dimension.
    Sample compression schemes [Section 3 in Ben-David and Litman].
    Most matrices have high sign rank (see Section 5.2 in our paper and references within).
  • (Igor) Population recovery. Definition of the noisy model (see here). Solution of DRP using PID graphs (same text). Connection to Fourier analysis (see here).
  • Teaching. Teaching models [Balbach, Doliwa et al., our paper, …]. Partial ID graphs [Section 2 here].
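
A worked example for the VC dimension item above (the standard argument, included for convenience): the class of intervals $\{[a,b] : a \le b\}$ on the line has VC dimension $2$. Any two points $x_1 < x_2$ are shattered, since each of the labelings $\emptyset$, $\{x_1\}$, $\{x_2\}$, $\{x_1, x_2\}$ is realized by a suitable interval. No three points $x_1 < x_2 < x_3$ are shattered: the labeling $\{x_1, x_3\}$ is not realizable, because every interval containing $x_1$ and $x_3$ also contains $x_2$.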
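
For reference, the Sauer-Perles-Shelah lemma in its standard form: if $\mathcal{C} \subseteq \{0,1\}^X$ has VC dimension $d$, then for every finite $Y \subseteq X$ of size $n$,

$$\left|\mathcal{C}|_Y\right| \;\le\; \sum_{i=0}^{d} \binom{n}{i} \;\le\; \left(\frac{en}{d}\right)^{d},$$

where the second inequality holds for $n \ge d$.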
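
To accompany the sample compression item, here is a minimal Python sketch of the size-2 scheme for intervals on the line (the representation and function names are my own, for illustration only):

    # A size-2 sample compression scheme for intervals on the line.
    # A labeled sample is a list of (point, label) pairs with labels in {0, 1};
    # we assume it is realizable by some interval [a, b].

    def compress(sample):
        # Keep only the leftmost and rightmost positively labeled points.
        positives = [x for x, y in sample if y == 1]
        if not positives:
            return []  # the empty set encodes the all-negative hypothesis
        return [min(positives), max(positives)]

    def reconstruct(kept):
        # The hypothesis is the smallest interval spanning the kept points.
        if not kept:
            return lambda x: 0
        a, b = min(kept), max(kept)
        return lambda x: int(a <= x <= b)

    # The reconstructed hypothesis is correct on the entire original sample:
    sample = [(0.1, 0), (0.5, 1), (0.9, 1), (1.4, 0)]
    h = reconstruct(compress(sample))
    assert all(h(x) == y for x, y in sample)

Correctness on the whole sample follows from realizability: a negative point lying between the two kept positive points would contradict the assumption that some interval labels the sample.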
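
Finally, the standard definition behind the sign rank item: the sign rank of a matrix $M \in \{-1,+1\}^{n \times m}$ is the minimum rank of a real matrix $R$ with $\operatorname{sign}(R_{ij}) = M_{ij}$ for all $i, j$ (in particular, all entries of $R$ are nonzero).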