Statistical Machine Learning Bootcamp

14 January, 9:15 am - 16 January, 4:30 pm EST

The goal of the Columbia Year of Statistical Machine Learning Bootcamp Lectures is to introduce students to the computational, mathematical, and statistical foundations of data science.

The focus will be on theoretical subjects of interest in modern statistical machine learning, suitable for new Ph.D. students in computer science, statistics, applied math, and related fields.

The lectures are open (free) to all, but we kindly request that you complete the following registration form so we get an accurate headcount.

Registration: https://forms.gle/dHB5Hbq4GB43eJuJ8

Waitlist: https://forms.gle/LGjzrKg9Qo2R9hCM7


Lectures are in the CS Auditorium (451 Computer Science Building).

The 10:15am-11:00am coffee breaks will be in the CS Lounge (also in the Computer Science Building).

Tuesday, January 14

Wednesday, January 15

  • 9:15-10:15: algorithmic applications of high-dimensional geometry (Alex Andoni; slides)
  • 10:15-11:00: coffee break (CS Lounge)
  • 11:00-12:00: algorithmic applications of high-dimensional geometry (Alex Andoni; slides)
  • 12:00-2:00: lunch break (on your own)
  • 2:00-3:00: optimal transport (Espen Bernton; slides)
  • 3:00-3:30: break
  • 3:30-4:30: optimal transport (Espen Bernton; slides)

Thursday, January 16

  • 9:15-10:15: stochastic gradient methods (Arian Maleki)
  • 10:15-11:00: coffee break (CS Lounge)
  • 11:00-12:00: stochastic gradient methods (Arian Maleki)
  • 12:00-2:00: lunch break (on your own)
  • 2:00-3:00: stochastic gradient methods (Arian Maleki)
  • 3:00-3:30: break
  • 3:30-4:30: nonparametric testing using optimal transport (Bodhi Sen; slides)

Lecturers and Topics:

  • Jarek Błasiok: concentration of measure (slides: [1], [2], [3])
    1. Equivalence between moment bounds/MGF bounds/tail bounds, Khintchine inequality, Bernstein inequality, Johnson-Lindenstrauss for Gaussian matrices.
    2. Subspace embedding: net argument, the volumetric argument for net constructions.
    3. Concentration inequalities for low-influence functions.
  • Alex Andoni: algorithmic applications of high-dimensional geometry (slides: [1], [2], [3])

    Many modern algorithms, especially for massive datasets, benefit from geometric techniques and tools even though the initial problem might have nothing to do with geometry. In this lecture series, we will cover a number of examples where (high-dimensional) geometric techniques lead to algorithms with significantly improved parameters, such as running time, space, and communication. For example, starting with the classic dimension reduction method, researchers developed powerful tools for storing, transmitting, and accessing data more efficiently than working with the full data directly. These tools can be seen as a form of functional compression, where we store just enough information about each piece of data to be useful for a particular task. We will see applications of these tools to problems such as similarity search (nearest neighbor search) and numerical linear algebra.

  • Espen Bernton: optimal transport (slides: [1], [2])
    • Theoretical foundations: origins of OT (the Monge and Kantorovich problems), primal and dual formulations, Wasserstein distance, some important properties.
    • Computation and applications: exact and approximate computation, some statistical properties, OT as a loss function, application to generative models.
  • Arian Maleki: stochastic gradient methods
    1. Standard stochastic gradient descent, its convergence rate, and optimality.
    2. Ruppert-Polyak averaging and its comparison with standard SGD (also the robust stochastic gradient descent of Nemirovski et al.).
    3. Averaging the gradients and variance-reduced algorithms, such as SAG, SAGA, and SVRG.
    4. Quasi-Newton stochastic gradient method.
  • Bodhi Sen: nonparametric testing using optimal transport (slides: [1])

    In this lecture I will introduce the problem of distribution-free nonparametric testing and illustrate the connection to the theory of optimal transport. I will use these ideas to develop distribution-free testing procedures for: (i) multivariate two-sample goodness-of-fit testing, and (ii) testing for independence of two random vectors.



CS Auditorium (CSB 451)
Mudd Building, 500 West 120th St
New York, NY United States