End-to-End learning of cost-volume aggregation for real-time dense stereo

Andrey Kuzmin, Dmitry Mikushin, Victor Lempitsky

    Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

    14 Citations (Scopus)

    Abstract

    We present a new deep learning-based approach to dense stereo matching. Unlike previous work, our approach does not learn deep pixel appearance descriptors and instead employs very fast classical matching scores. A deep convolutional network is used to predict the local parameters of the cost-volume aggregation process, which in this paper we implement using a differentiable domain transform. By treating this transform as a recurrent neural network, we are able to train the whole system, which includes cost-volume computation, cost-volume aggregation (smoothing), and winner-takes-all disparity selection, end-to-end. The resulting method is highly efficient at test time while achieving good matching accuracy. On the KITTI 2012 and KITTI 2015 benchmarks, it achieves error rates of 5.08% and 6.34%, respectively, while running at 29 frames per second on a modern GPU.
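The three stages named in the abstract (cost-volume computation, edge-aware aggregation, winner-takes-all selection) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation, and several parts are assumptions: the abstract does not say which classical matching score is used, so absolute intensity difference stands in for it; the per-pixel filter weights, which the paper predicts with a convolutional network, are replaced here by a hand-crafted function of the image gradient; and only a horizontal two-pass recursive filter in the spirit of the domain transform is shown. All function names and the `sigma` and `base` parameters are hypothetical.

```python
import numpy as np


def cost_volume(left, right, max_disp):
    """Absolute-difference matching cost, shape (max_disp, H, W).

    Assumes rectified grayscale images with intensities in [0, 1].
    """
    h, w = left.shape
    # Columns with no valid match at disparity d keep the maximum cost 1.0.
    vol = np.ones((max_disp, h, w), dtype=np.float64)
    for d in range(max_disp):
        # Left pixel x is compared with right pixel x - d.
        vol[d, :, d:] = np.abs(left[:, d:] - right[:, : w - d])
    return vol


def edge_weights(image, sigma=0.1, base=0.9):
    """Hand-crafted stand-in for the network-predicted filter parameters.

    weights[y, x] is close to `base` in flat regions and close to 0
    across strong horizontal intensity edges.
    """
    grad = np.zeros_like(image, dtype=np.float64)
    grad[:, 1:] = np.abs(image[:, 1:] - image[:, :-1])
    return base * np.exp(-grad / sigma)


def recursive_aggregate(vol, weights):
    """Two-pass horizontal recursive filter with per-pixel weights.

    weights[y, x] in [0, 1] controls how much cost is shared between
    columns x - 1 and x. Each pass is a linear recurrence along the row,
    which is what allows it to be unrolled like a recurrent network.
    """
    out = vol.copy()
    w = out.shape[2]
    for x in range(1, w):  # left-to-right pass
        a = weights[:, x]
        out[:, :, x] = (1.0 - a) * out[:, :, x] + a * out[:, :, x - 1]
    for x in range(w - 2, -1, -1):  # right-to-left pass
        a = weights[:, x + 1]
        out[:, :, x] = (1.0 - a) * out[:, :, x] + a * out[:, :, x + 1]
    return out


def stereo_disparity(left, right, max_disp):
    """Cost volume, aggregation, then winner-takes-all disparity."""
    vol = cost_volume(left, right, max_disp)
    vol = recursive_aggregate(vol, edge_weights(left))
    return np.argmin(vol, axis=0)  # lowest aggregated cost wins
```

In a trainable version of this sketch, `edge_weights` would be replaced by a network output and the hard `argmin` would need a differentiable treatment during training; how the paper handles that is not stated in the abstract.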

    Original language: English
    Title of host publication: 2017 IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2017 - Proceedings
    Editors: Naonori Ueda, Jen-Tzung Chien, Tomoko Matsui, Jan Larsen, Shinji Watanabe
    Publisher: IEEE Computer Society
    Pages: 1-6
    Number of pages: 6
    ISBN (Electronic): 9781509063413
    Publication status: Published - 5 Dec 2017
    Event: 2017 IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2017 - Tokyo, Japan
    Duration: 25 Sep 2017 to 28 Sep 2017

    Publication series

    Name: IEEE International Workshop on Machine Learning for Signal Processing, MLSP
    Volume: 2017-September
    ISSN (Print): 2161-0363
    ISSN (Electronic): 2161-0371

    Conference

    Conference: 2017 IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2017
    Country/Territory: Japan
    City: Tokyo
    Period: 25/09/17 to 28/09/17

    Keywords

    • Convolutional neural network
    • Cost-volume aggregation
    • Edge-preserving filtering
    • Recurrent neural network
    • Stereo matching
