The gsuffix library contains C implementations of generalized suffix-based
algorithms for searching for string patterns in sets of input strings.
It is released under the terms of the LGPL.
One important application is searching for motifs in biosequences (DNA or protein).
The library provides a unified interface for application code, which can use any of
the following data structures: a standard generalized suffix tree; a modified suffix
tree called the *k*-truncated suffix tree, which supports searches for short patterns
(up to *k*-mers) in multiple input sequences; and several versions of the suffix array,
including generalized extended suffix arrays. Each of these data structures has
advantages and disadvantages with respect to memory use, speed, and flexibility.
With gsuffix, developers can try each of the algorithms in turn with only very minor
modifications to application code.
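
To make the idea of a generalized suffix array concrete, the following is a minimal sketch in plain C. It does not use the gsuffix API: every name in it (`NaiveGSA`, `naive_gsa_build`, `naive_gsa_count`, `naive_gsa_free`) is hypothetical and exists only for illustration. It also uses a naive sort-based construction rather than the linear-time or lightweight algorithms from the papers listed further down, so it is only suitable for small inputs.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One suffix of one input string: (index of the string, offset into it). */
typedef struct {
    size_t seq;
    size_t pos;
} Suffix;

/* Hypothetical container for illustration; not a gsuffix type. */
typedef struct {
    const char **seqs; /* borrowed pointers to the input strings */
    size_t nseqs;
    Suffix *sa;        /* all suffixes of all strings, sorted */
    size_t len;        /* number of entries in sa */
} NaiveGSA;

/* qsort has no context argument, so the comparator reads this global.
 * This makes the sketch non-reentrant and not thread-safe. */
static const char **g_seqs;

static int cmp_suffix(const void *a, const void *b)
{
    const Suffix *x = a, *y = b;
    int c = strcmp(g_seqs[x->seq] + x->pos, g_seqs[y->seq] + y->pos);
    if (c != 0)
        return c;
    /* Identical suffixes from different strings: order by string index. */
    if (x->seq != y->seq)
        return x->seq < y->seq ? -1 : 1;
    return 0;
}

/* Builds the array by sorting all suffixes. Returns 0 on success, -1 on
 * allocation failure. */
int naive_gsa_build(NaiveGSA *g, const char **seqs, size_t nseqs)
{
    size_t total = 0, k = 0;
    for (size_t i = 0; i < nseqs; i++)
        total += strlen(seqs[i]);

    g->seqs = seqs;
    g->nseqs = nseqs;
    g->len = total;
    g->sa = malloc((total ? total : 1) * sizeof *g->sa);
    if (g->sa == NULL)
        return -1;

    for (size_t i = 0; i < nseqs; i++) {
        size_t n = strlen(seqs[i]);
        for (size_t p = 0; p < n; p++) {
            g->sa[k].seq = i;
            g->sa[k].pos = p;
            k++;
        }
    }
    g_seqs = seqs;
    qsort(g->sa, total, sizeof *g->sa, cmp_suffix);
    return 0;
}

/* Binary search for the first suffix whose m-character prefix is >= pat
 * (upper == 0) or > pat (upper != 0). */
static size_t bound(const NaiveGSA *g, const char *pat, size_t m, int upper)
{
    size_t lo = 0, hi = g->len;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        const Suffix *s = &g->sa[mid];
        int c = strncmp(g->seqs[s->seq] + s->pos, pat, m);
        if (c < 0 || (upper && c == 0))
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}

/* Counts all occurrences of pat across all input strings. */
size_t naive_gsa_count(const NaiveGSA *g, const char *pat)
{
    assert(g != NULL && pat != NULL);
    size_t m = strlen(pat);
    if (m == 0)
        return 0;
    return bound(g, pat, m, 1) - bound(g, pat, m, 0);
}

void naive_gsa_free(NaiveGSA *g)
{
    free(g->sa);
    g->sa = NULL;
    g->len = 0;
}
```

A caller would build the array once with `naive_gsa_build(&g, seqs, nseqs)`, query it repeatedly with `naive_gsa_count(&g, "ACG")`, and release it with `naive_gsa_free(&g)`. Because the sorted entries keep the `(seq, pos)` pair, the range between the two bounds also identifies which input sequence each hit came from, which is the property that makes the array "generalized".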

Comprehensive documentation for gsuffix is available at http://gsuffix.sf.net/gsuffix/index.html.

The source code of the library can be downloaded from the project's download area at SourceForge.
After unpacking the archive, compile gsuffix with the usual *configure* and *make* steps.
Please refer to the included README for more details. The archive also includes several example
applications, located in the *gsapps* folder.

The initial release of the gsuffix library encompasses algorithms from the following papers:

- M.I. Abouelhoda, S. Kurtz, E. Ohlebusch. Replacing suffix trees with enhanced suffix arrays. Journal of Discrete Algorithms, 2004
- J. Kärkkäinen, P. Sanders. Simple Linear Work Suffix Array Construction. Proc. 30th International Colloquium on Automata, Languages and Programming (ICALP '03), 2003
- G. Manzini, P. Ferragina. Engineering a Lightweight Suffix Array Construction Algorithm. Proc. 10th European Symposium on Algorithms (ESA '02). Springer-Verlag Lecture Notes in Computer Science 2461, pp. 698-710
- M.H. Schulz, S. Bauer, P.N. Robinson. The Generalized k-Truncated Suffix Tree for Time- and Space-Efficient Searches in Multiple DNA or Protein Sequences. Submitted for publication.