In this blog post I give a description of my GSoC 19 student project in a weekly basis. The project took place from May 2019 until August 2019. You can read the proposal here.

Abstract

Sampling algorithms and volume computation of convex polytopes are very useful in many scientific fields and applications. The package volesti is a C++ package with an R interface which provides three geometric random walks for sampling and two state-of-the-art methods for volume estimation for convex polytopes being the first package providing such a variety of options in geometric statistics. volesti currently scales to a few hundreds dimensions in contrast to other available R packages, as geometry, or C++ package VINCI that both only scale typically up to 15-20 dimensions. Thus it could be an essential tool for a quite large number of scientific applications in convex analysis, economics, biology or statistics, that need fast volume approximation or sampling in high dimensions. However, the possibility to scale from a few hundred to a few thousand dimensions was considered as a very far-reaching goal for many years. The goal of this project is to provide the first ever implementations for sampling and volume computation for convex polytopes in a few thousand dimensions. We exploit some very recent theoretical results that guarantee fast convergence and numerical stability in order to propose an efficient implementation of the current state-of-the-art geometrical random walks, i.e. Hamiltonian Monte Carlo and Vaidya walk. The proposed implementations will be a decisive contribution to other scientific fields as computational geometry, finance and optimization. We hope this project will be a decisive contribution towards the first complete and efficient tool for sampling, volume estimation and geometrical statistics in general and thus, help educational programs, research or even serve as a building block towards an international, interdisciplinary community in geometrical statistics.

Bonding period

1st week

I have created 4 Pull Requests in github repository of volesti package, to add the implementations of my preliminary work for the GSoC proposal:
1. Check if the requested error is positive.
2. Dikin walk implementation.
3. Boundary sampling with Hit-and-Run.
4. Volume computation and sampling for low dimensional convex polytopes.
2nd - 3rd week

I have read some papers that are retated to the coding project. In particular, we discussed methods given in:
[1] Algorithmic Theory of ODEs and Sampling from Well-conditioned Logconcave Densities, Yin Tat Lee, Zhao Song, Santosh S. Vempala.
[2] Convergence Rate of Riemannian Hamiltonian Monte Carlo and Faster Polytope Volume Computation, Yin Tat Lee, Santosh S. Vempala.
[3] Optimal Convergence Rate of Hamiltonian Monte Carlo for Strongly Logconcave Distributions, Zongchen Chen, Santosh S. Vempala.
[4] The Dynamics of Lagrange and Hamilton, Nisheeth K. Vishnoi.
[5] Nonconvex sampling with the Metropolis-adjusted Langevin algorithm, Oren Mangoubi, Nisheeth K. Vishnoi.
[6] Efficient Convex Optimization with Membership Oracles, Yin Tat Lee, Aaron Sidford, Santosh S. Vempala.
[7] Riemann Manifold Langevin and Hamiltonian Monte Carlo, Mark Girolami, Ben Calderhead, Siu A. Chin.

Coding period

1st week
1. I have created a new branch in my repo at github to implement Remannian Hamiltonian Monte Carlo. I have started coding.
2. I have extracted the system of non linear equations from the ODE in 1.
3. I have implemented Newton method to solve the system of non linear equations.
2nd week
1. I implement polynomial interpolation to compute the tranjectory that lies in the convex polytope in order to define a larger walk step than the theoretical result suggests in 2.
2. Compute a local approximation of the trajectory that intersects the boundary. Walk randomly on this tranjectory and show experimentally that this walk converges to the theoretical target distribution.
3rd week
1. Solve the ODE in [2] with Runge-Kutta (RK4) method for non-linear system of ODEs. In each step of the random walk we pick a step and we solve the ODE for this interval.
2. We use Runge-Kutta to compute n points in order to approximate the tranjectory with polynomial interpolation of order n+2. Then we use this “local” approximation to compute the boundary intersection and the boundary reflection of the approximate tranjectory. Then we walk on the reflected tranjectory etc. We show experimentally that this walk converges to the theoretical target distribution.
4th week
1. I implemented the billiard walk for H-polytopes. We use this walk to compute the volume of convex polytopes and we show that mixes faster that Hit-and-Run.
2. I implemented the billiard walk for V-polytopes and zonotopes.
3. I implemented some optimizations in convex polytopes’ c++ classes for the boundry intersection needed for the ray shooting.
5th week
1. Documentation for the billiard walk, update c++ interface and tests. I ran some experiments for the accuracy of billiard walk for volume computations.
2. I Implemented the HMC with reflections random walk when the hamiltonian is multidimensional spherical Gaussian distribution.
6th week
1. I added the HMC of spherical multidimensional Gaussian to Vempala - Cousin’s practical method for volume computations.
2. I completed a new implementation of HMC of gibbs sampling using RK2 method to solve the ODE.
3. I completed a new implementation of HMC of gibbs sampling using RK4 method to solve the ODE.
7th week
1. Optimize experimentally the step of Range-Kutta method to solve the ODE for both RK4 and RK2.
2. I implemented the collocation method to solve the ODE.
8th week
1. Work on the R interface and documentation for the HMC random walks.
2. Add HMC walks in C++ documentation.
9th week
1. Work on code optimizations for both HMC walks and billiard walk:
  a. Boundary oracles
  b. Scale the polytope in order the maximum inscribed ball to be the unit ball
2. Improve the boundary oracle for the HMC walk for spherical multidimensional gaussian using bisection method.
10th week
1. Implementation of the HMC of gibbs sampling by approximating the tranjectory.
2. Use the Newton-rapshon method for the boundary oracle.

Abstract

Bonding period

1st week

2nd - 3rd week

Coding period

1st week

2nd week

3rd week

4th week

5th week

6th week

7th week

8th week

9th week

10th week