Reinforcement learning for combinatorial optimization: A survey

Nina Mazyavkina, Sergey Sviridov, Sergei Ivanov, Evgeny Burnaev

Research output: Contribution to journalReview articlepeer-review

48 Citations (Scopus)


Many traditional algorithms for solving combinatorial optimization problems involve using hand-crafted heuristics that sequentially construct a solution. Such heuristics are designed by domain experts and may often be suboptimal due to the hard nature of the problems. Reinforcement learning (RL) proposes a good alternative to automate the search of these heuristics by training an agent in a supervised or self-supervised manner. In this survey, we explore the recent advancements of applying RL frameworks to hard combinatorial problems. Our survey provides the necessary background for operations research and machine learning communities and showcases the works that are moving the field forward. We juxtapose recently proposed RL methods, laying out the timeline of the improvements for each problem, as well as we make a comparison with traditional algorithms, indicating that RL models can become a promising direction for solving combinatorial problems.

Original languageEnglish
Article number105400
JournalComputers and Operations Research
Publication statusPublished - Oct 2021


  • Combinatorial optimization
  • Operations research
  • Policy-based methods
  • Reinforcement learning
  • Value-based methods


Dive into the research topics of 'Reinforcement learning for combinatorial optimization: A survey'. Together they form a unique fingerprint.

Cite this