gsuffix-1.0.0 Documentation

The gsuffix library is a lightweight C implementation of generalized suffix-based algorithms for searching for string patterns in sets of input strings. It is released under the terms of the LGPL.

One important application is searching for motifs in biosequences (DNA or protein). The library provides a unified interface through which application code can use any of the following data structures: a standard generalized suffix tree; a k-truncated suffix tree, which is a modified suffix tree for searching for short patterns (up to k characters long) in multiple input sequences; and several versions of the suffix array, including generalized extended suffix arrays. Each of these data structures has advantages and disadvantages with respect to memory use, speed, and flexibility.
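To make the idea of a generalized suffix array concrete, the following is a minimal sketch in plain C. It does not use the gsuffix API: the names Suffix, gsa_build and gsa_find are hypothetical and exist only for this illustration, and the construction is a naive sort of all suffixes rather than whatever algorithm the library actually uses. It assumes NUL-terminated input strings.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One suffix of one input string (hypothetical type, not part of gsuffix). */
typedef struct {
    const char *text; /* first character of the suffix */
    int seq;          /* index of the input string it belongs to */
    size_t pos;       /* start offset within that string */
} Suffix;

/* qsort comparator: plain lexicographic order of the suffix texts. */
static int cmp_suffix(const void *a, const void *b)
{
    const Suffix *x = a, *y = b;
    return strcmp(x->text, y->text);
}

/* Collect every suffix of every input string and sort them.
 * Returns a malloc'd array (caller frees) and stores its length in *n_out;
 * returns NULL if the input is empty or allocation fails. */
Suffix *gsa_build(const char **seqs, int n_seqs, size_t *n_out)
{
    size_t total = 0, k = 0;
    *n_out = 0;
    for (int i = 0; i < n_seqs; i++)
        total += strlen(seqs[i]);
    if (total == 0)
        return NULL;
    Suffix *sa = malloc(total * sizeof *sa);
    if (sa == NULL)
        return NULL;
    for (int i = 0; i < n_seqs; i++) {
        size_t len = strlen(seqs[i]);
        for (size_t j = 0; j < len; j++) {
            sa[k].text = seqs[i] + j;
            sa[k].seq = i;
            sa[k].pos = j;
            k++;
        }
    }
    qsort(sa, total, sizeof *sa, cmp_suffix);
    *n_out = total;
    return sa;
}

/* Binary search on the first m characters of each suffix.
 * strict == 0: first index whose prefix is >= pat.
 * strict == 1: first index whose prefix is >  pat. */
static size_t bound(const Suffix *sa, size_t n, const char *pat,
                    size_t m, int strict)
{
    size_t lo = 0, hi = n;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        int c = strncmp(sa[mid].text, pat, m);
        if (c < 0 || (strict && c == 0))
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}

/* Count occurrences of pat across all input strings. If first is not NULL,
 * *first receives the index of the first matching entry in sa; the matches
 * occupy sa[*first] .. sa[*first + count - 1]. */
size_t gsa_find(const Suffix *sa, size_t n, const char *pat, size_t *first)
{
    assert(pat != NULL);
    size_t m = strlen(pat);
    size_t lo = bound(sa, n, pat, m, 0);
    size_t hi = bound(sa, n, pat, m, 1);
    if (first != NULL)
        *first = lo;
    return hi - lo;
}
```

A caller would build the array once with gsa_build, call gsa_find for each pattern, read the seq and pos fields of the matching entries to locate each hit, and finally free the array. Because all suffixes starting with a given pattern are adjacent in sorted order, each query costs two binary searches. The naive construction shown here is slow for long inputs; it is meant only to show the shape of the data structure.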

With gsuffix, developers can try each of these data structures in turn with only minor modifications to application code.

