Large-lexicon attribute-consistent text recognition in natural images

Tatiana Novikova, Olga Barinova, Pushmeet Kohli, Victor Lempitsky

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

70 Citations (Scopus)

Abstract

This paper proposes a new model for the task of word recognition in natural images that simultaneously models visual and lexicon consistency of words in a single probabilistic model. Our approach combines local likelihood and pairwise positional consistency priors with higher order priors that enforce consistency of characters (lexicon) and their attributes (font and colour). Unlike traditional stage-based methods, word recognition in our framework is performed by estimating the maximum a posteriori (MAP) solution under the joint posterior distribution of the model. MAP inference in our model is performed through the use of weighted finite-state transducers (WFSTs). We show how the efficiency of certain operations on WFSTs can be utilized to find the most likely word under the model in an efficient manner. We evaluate our method on a range of challenging datasets (ICDAR'03, SVT, ICDAR'11). Experimental results demonstrate that our method outperforms state-of-the-art methods for cropped word recognition.
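To make the idea of joint MAP decoding concrete, the following is a minimal toy sketch, not the authors' implementation. It replaces the WFST machinery described in the abstract with brute-force enumeration over a tiny lexicon, omits the pairwise positional terms, and treats attribute consistency as a hard constraint (all characters of a word must share one attribute label, such as a font index). All names (`map_word`, `greedy_word`, `unary`, `lexicon`) and all numbers are hypothetical and chosen only for illustration.

```python
import math

def map_word(unary, lexicon, n_attrs):
    """Return the (word, attribute, score) triple with the highest joint log-score.

    unary[i][(char, attr)] is an assumed log-likelihood of seeing `char`
    rendered with attribute `attr` at position i. lexicon maps each word
    to an assumed log-prior. Missing entries count as impossible (-inf).
    """
    best = (None, None, -math.inf)
    for word, log_prior in lexicon.items():
        if len(word) != len(unary):
            continue  # toy model: one character hypothesis per position
        for attr in range(n_attrs):
            # Hard attribute consistency: every character uses the same attr.
            score = log_prior + sum(
                unary[i].get((ch, attr), -math.inf) for i, ch in enumerate(word)
            )
            if score > best[2]:
                best = (word, attr, score)
    return best

def greedy_word(unary):
    """Stage-based baseline: pick the best character at each position independently."""
    return "".join(max(u, key=u.get)[0] for u in unary)

# Hypothetical character scores for a three-character image.
unary = [
    {("c", 0): -0.1, ("e", 1): -0.05},
    {("a", 0): -0.2, ("o", 1): -0.1},
    {("t", 0): -0.3, ("r", 1): -0.2, ("r", 0): -1.5},
]
lexicon = {"cat": math.log(0.5), "car": math.log(0.5)}

greedy = greedy_word(unary)
word, attr, score = map_word(unary, lexicon, n_attrs=2)
```

In this toy input, independent per-character decoding yields a string that is not in the lexicon, while the joint search returns a lexicon word whose characters all share one attribute. Enumeration like this scales linearly with lexicon size times the number of attribute values, which is the cost that a transducer-based formulation is intended to avoid for large lexicons.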

Original language: English
Title of host publication: Computer Vision, ECCV 2012 - 12th European Conference on Computer Vision, Proceedings
Pages: 752-765
Number of pages: 14
Edition: PART 6
Publication status: Published - 2012
Externally published: Yes
Event: 12th European Conference on Computer Vision, ECCV 2012 - Florence, Italy
Duration: 7 Oct 2012 to 13 Oct 2012

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 6
Volume: 7577 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 12th European Conference on Computer Vision, ECCV 2012
Country/Territory: Italy
City: Florence
Period: 7/10/12 to 13/10/12
