Wisent
A Python LR(1) parser generator.
Introduction
When writing a computer program, implementing methods to read data from input files with a complex structure can be surprisingly difficult. For example, if the input data comes from an untrusted source, errors in the input file often need to be handled very carefully. If your program is written in Python and the input data is sufficiently structured (i.e., the format can be described by a context-free grammar), Wisent can help you implement parts of your program's input processing.

A cave painting from the cave of Altamira, showing a wisent. The photo was taken from the Wikimedia Commons and is in the public domain.
Features
The parser generator has the following features:
- Wisent can deal with general LR(1) grammars.
- Provides helpful error messages: if there is a problem with the input grammar, Wisent generates an example input string to illustrate the problem.
- The language to specify grammars allows use of the ? (optional elements), * (zero or more copies) and + (one or more copies) operators.
- Wisent is distributed under the terms of the GNU General Public License (GPL) version 2.
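The ?, * and + operators are shorthand: each can be rewritten using plain context-free rules. The sketch below illustrates one possible rewriting, using left-recursive rules as is customary for LR grammars. It shows neither Wisent's grammar syntax nor its implementation; the helper name expand, the generated nonterminal names and the rule representation are assumptions made for illustration only.

```python
def expand(symbol, op):
    """Express `symbol op` with plain context-free rules.

    Hypothetical helper for illustration; not part of Wisent.
    Returns (head, rules), where `head` is a fresh nonterminal and each
    rule is a pair (head, body) with `body` a tuple of symbols.
    An empty body stands for the empty string.
    """
    if op == "?":
        head = symbol + "_opt"
        # either nothing, or exactly one copy
        return head, [(head, ()), (head, (symbol,))]
    if op == "*":
        head = symbol + "_star"
        # either nothing, or a shorter list followed by one more copy
        return head, [(head, ()), (head, (head, symbol))]
    if op == "+":
        head = symbol + "_plus"
        # either one copy, or a shorter list followed by one more copy
        return head, [(head, (symbol,)), (head, (head, symbol))]
    raise ValueError("unknown operator: %r" % (op,))


if __name__ == "__main__":
    head, rules = expand("item", "*")
    for lhs, body in rules:
        print(lhs, "->", " ".join(body) or "(empty)")
```

Written out by hand, such helper rules clutter a grammar, which is why having the operators available in the grammar language is convenient.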
The generated parsers have the following features:
- The generated parser is stand-alone, i.e. you can add the generated parser to your project without adding Wisent to the project dependencies.
- The generated parser is implemented as a Python class.
- Automatic error repair and good error reporting: on invalid input, the generated parser tries to fix the problem to allow continuing the parsing process. At the end of parsing, all detected errors are reported together.
- A call to the parser returns a parse tree. Wisent can create parsers which omit “uninteresting” nodes from the generated tree.
- The generated parsers can be distributed under the 3-clause BSD license. Since this license is compatible with the GPL, you can of course use the generated parsers in GPL projects.
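To give a rough idea of what omitting "uninteresting" nodes from a parse tree means, here is a small self-contained sketch. It does not use Wisent or its generated parsers; the tree representation (nested tuples whose first element is the symbol name, with plain strings as leaves), the function name prune and the symbol names are all assumptions chosen for illustration. The Users' Manual describes the format the generated parsers actually use.

```python
def prune(tree, hidden):
    """Return a copy of `tree` with nodes named in `hidden` spliced out.

    Hypothetical illustration, not Wisent code.  A node is a tuple
    (name, child, ...); any non-tuple child is treated as a leaf token.
    The children of a hidden node are attached to its parent instead.
    The root node is always kept, even if its name is in `hidden`.
    """
    if not isinstance(tree, tuple):
        return tree  # leaf token, nothing to prune
    result = [tree[0]]
    for child in tree[1:]:
        child = prune(child, hidden)
        if isinstance(child, tuple) and child[0] in hidden:
            result.extend(child[1:])  # splice grandchildren into this node
        else:
            result.append(child)
    return tuple(result)


if __name__ == "__main__":
    # A list built from a left-recursive helper symbol "_items".
    tree = ("list",
            ("_items",
             ("_items", ("item", "a")),
             ("item", "b")))
    print(prune(tree, {"_items"}))
```

In this sketch the helper symbol exists only to express repetition in the grammar, so removing it leaves a flatter tree that is easier to process.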
More information can be found in the Wisent Users’ Manual.
About the name
I called the program “Wisent” because the first parser generator I encountered was Bison and the Wisent is the European variant of the Bison. Unfortunately, I learned later that there are at least two other parser generators which use the name “Wisent”:
- Wisent by Thomas B. Preußer: a Parser Generator for C++ and Java implemented in C++.
- Wisent by David Ponce: one component of the “Semantic” package for emacs.
Installation
The source code for more recent, experimental versions of Wisent may (or may not) be available on github.com/seehuhn/wisent.
Generic installation instructions are in the file INSTALL. On most systems, the following commands should be sufficient:
./configure
make
make install
Alternatively, you can skip the make install step and run Wisent directly from the build directory.
Please send any suggestions and bug reports to Jochen Voss. Your message should include the Wisent version number, as obtained by the command wisent -V.
References
- The Wisent Users’ Manual.
- The algorithm used in Wisent to generate the parsers is based on the following article: David Pager, A practical general method for constructing LR(k) parsers. Acta Informatica, volume 7 (1977), number 3, pages 249–268.
- Wikipedia has entries about context-free grammars, LR parsers and Wisents.
- The first edition of the book Parsing Techniques — A Practical Guide by Dick Grune and Ceriel J.H. Jacobs is available online.
- The Bison parser generator is an excellent parser generator for C and C++ projects. Bison comes with an excellent manual.
- The LanguageParsing entry on wiki.python.org lists other Python parser generators.
- Xkcd knows regular expressions.
Downloads
github: seehuhn/wisent
| Version | Date | Download | Notes |
|---|---|---|---|
| 0.6.2 | 2012-04-10 | tar.gz (1.0 MB), sig, sha1: 88560d57326d8796f468173c9d4c9f1da304ed36 | |
| 0.6.1 | 2010-09-16 | tar.gz (1.0 MB), sig, sha1: 288b7e7efe7508c44d0593c7fb07583307e20d99 | bug fix release |