## Abstract

A randomized online version of the mirror descent method is proposed. It differs from the existing versions by the randomization method. Randomization is performed at the stage of the projection of a subgradient of the function being optimized onto the unit simplex rather than at the stage of the computation of a subgradient, which is common practice. As a result, a componentwise subgradient descent with a randomly chosen component is obtained, which admits an online interpretation. This observation, for example, has made it possible to uniformly interpret results on weighting expert decisions and propose the most efficient method for searching for an equilibrium in a zero-sum two-person matrix game with sparse matrix.

Original language | English |
---|---|

Pages (from-to) | 580-596 |

Number of pages | 17 |

Journal | Computational Mathematics and Mathematical Physics |

Volume | 55 |

Issue number | 4 |

DOIs | |

Publication status | Published - 1 Apr 2015 |

Externally published | Yes |

## Keywords

- dual averaging method
- exponential weighting
- mirror descent method
- multi-armed bandits
- online optimization
- randomization
- stochastic optimization
- weighting of experts