Poster. Faster Non-Convex Distributed Learning with Compression.


Poster for the MARINA paper, presented at the poster session during the Communication-Efficient Distributed Optimization workshop.



Poster session

At the Communication-Efficient Distributed Optimization workshop, hosted by the TRIPODS Institute for Theoretical Foundations of Data Science, I presented a poster about MARINA, my joint work with Eduard Gorbunov, Zhize Li, and Peter Richtárik. Thank you to the organizers, who simulated an interactive poster session with a 2D game-like platform. There were a lot of interesting posters.

MARINA

MARINA employs a novel communication compression strategy based on the compression of gradient differences, which is reminiscent of, but different from, the technique used in the DIANA method of Mishchenko et al. (2019). Our methods improve on the previous state of the art in terms of oracle and communication complexity.
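To make the gradient-difference compression idea concrete, below is a minimal single-process simulation sketch. It is only an illustration, not the implementation from the paper: the Rand-K compressor, the toy quadratic objective, and all names and parameter values (`rand_k`, `grad_i`, the step size, the synchronization probability) are choices made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

def rand_k(v, k):
    """Rand-K sparsification: keep k random coordinates, rescaled so the estimate stays unbiased."""
    d = v.size
    idx = rng.choice(d, size=k, replace=False)
    out = np.zeros_like(v)
    out[idx] = v[idx] * (d / k)  # scaling gives E[out] = v
    return out

# Toy problem: n workers, each holding a local quadratic f_i(x) = 0.5 * ||A_i x - b_i||^2
n, d = 10, 50
A = rng.standard_normal((n, 20, d))
b = rng.standard_normal((n, 20))

def grad_i(i, x):
    return A[i].T @ (A[i] @ x - b[i])

def full_grad(x):
    return np.mean([grad_i(i, x) for i in range(n)], axis=0)

# MARINA-style loop (simplified, everything simulated in one process)
x = np.zeros(d)
gamma, p, k_coords = 1e-3, 0.1, 5
g = full_grad(x)                                     # g^0: full gradient at the start
g_i = np.array([grad_i(i, x) for i in range(n)])     # per-worker gradient estimators

for step in range(500):
    x_new = x - gamma * g
    if rng.random() < p:
        # rare synchronization round: every worker sends its full (uncompressed) gradient
        g_i = np.array([grad_i(i, x_new) for i in range(n)])
    else:
        # usual round: each worker sends only a compressed gradient *difference*
        for i in range(n):
            g_i[i] = g_i[i] + rand_k(grad_i(i, x_new) - grad_i(i, x), k_coords)
    g = g_i.mean(axis=0)                              # server aggregates the estimators
    x = x_new

print("final gradient norm:", np.linalg.norm(full_grad(x)))
```

The point of the sketch is the communication pattern: most rounds transmit only a few coordinates of a gradient difference per worker, while an occasional full-gradient round keeps the aggregated estimator from drifting.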

General observations

During the poster session at my booth, I got the feeling that specialized non-convex optimization algorithms that target stationary points, such as DIANA or PAGE, are not yet widely known outside specialized groups in academia and industrial research groups working on Federated Learning.

No matter which area of science we work in, I suggest that all practitioners, whatever company they work at, not forget to ask themselves two questions:

  1. Does the algorithm converge in that setting?
  2. If the algorithm converges, then to what does it converge?

I learned the importance of these two questions from Prof. Brad Osgood several years ago. Please do not forget about them, and if you have never asked yourself these questions, start doing so.


Written on April 10, 2021