Efficient Indexing of Billion-Scale Datasets of Deep Descriptors

Artem Babenko (Yandex), Victor Lempitsky

    Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-reviewed

    98 Citations (Scopus)

    Abstract

    Existing billion-scale nearest neighbor search systems have mostly been compared on a single dataset of one billion SIFT vectors, where systems based on the Inverted Multi-Index (IMI) have been performing very well, achieving state-of-the-art recall in several milliseconds. SIFT-like descriptors, however, are quickly being replaced with descriptors based on deep neural networks (DNN) that provide better performance for many computer vision tasks. In this paper, we introduce a new dataset of one billion descriptors based on DNNs and reveal the relative inefficiency of IMI-based indexing for such descriptors compared to SIFT data. We then introduce two new indexing structures, the Non-Orthogonal Inverted Multi-Index (NO-IMI) and the Generalized Non-Orthogonal Inverted Multi-Index (GNO-IMI). We show that, due to their additional flexibility, the new structures adapt better to the distribution of DNN descriptors. In particular, extensive experiments on the new dataset demonstrate that these data structures provide a considerably better trade-off between retrieval speed and recall than the standard Inverted Multi-Index, given a similar amount of memory.

    Original language: English
    Title of host publication: Proceedings - 29th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016
    Publisher: IEEE Computer Society
    Pages: 2055-2063
    Number of pages: 9
    ISBN (Electronic): 9781467388504
    Publication status: Published - 9 Dec 2016
    Event: 29th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016 - Las Vegas, United States
    Duration: 26 Jun 2016 to 1 Jul 2016

    Publication series

    Name: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
    Volume: 2016-December
    ISSN (Print): 1063-6919

    Conference

    Conference: 29th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016
    Country/Territory: United States
    City: Las Vegas
    Period: 26/06/16 to 1/07/16
