## Abstract

Given an alphabet A, a pattern p is a word v_{1}@ ⋯ @v_{m}, where v_{i} ∈ A* and @ ∉ A is a distinguished symbol called a variable length don't care symbol. Pattern p matches a text t ∈ A* if t = u_{0}v_{1}u_{1} ⋯ u_{m} - _{1}v_{m}u_{m} for some u_{0}, ⋯ , u_{m} ∈ A*. We address the following problem: given a set P of patterns and a text t, test whether one of the patterns of P matches t. We describe an algorithm that solves the problem in time O((|t| + |P|)log |P|). In contrast to most of the existing string matching algorithms (such as that of Aho-Corasick) our algorithm is not composed of two successive stages - preprocessing the pattern (resp. the text) and reading through the text (resp. the pattern) - but has these two stages essentially interleaved. Our approach is based on using the DAWG (Directed Acyclic Word Graph), a data structure studied by A. Blumer J. Blumer, Haussler, Ehrenfeucht, Crochemore, Chen, Seiferas.

Original language | English |
---|---|

Pages (from-to) | 129-154 |

Number of pages | 26 |

Journal | Theoretical Computer Science |

Volume | 178 |

Issue number | 1-2 |

DOIs | |

Publication status | Published - 30 May 1997 |

Externally published | Yes |