Welcome to the DCGen Project

DCGen is a parser generator targeting the Prolog language. Why a parser generator for Prolog? Doesn't it already have a DCG format?

It does; however, Prolog's lexing facilities are comparatively weak. Furthermore, DCG rules are written in standard BNF, whereas EBNF can make a grammar easier to specify. Finally, Prolog does nothing to handle left recursion automatically: under its default depth-first execution, a left-recursive rule can call itself without consuming any input, so these cases must be rewritten by hand. DCGen aims to extend Prolog's DCG format to provide these capabilities.
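To make the left-recursion point concrete, here is a minimal sketch of the textbook rewrite for immediate left recursion, where A -> A x | y becomes A -> y A_tail and A_tail -> x A_tail | (empty). It is written in Python purely for illustration and is not DCGen's implementation; the dictionary-based grammar representation, the function name, and the `_tail` naming scheme are all assumptions made for this example.

```python
def eliminate_immediate_left_recursion(grammar):
    """Return a new grammar with immediate left recursion removed.

    `grammar` maps each nonterminal to a list of productions, and each
    production is a list of symbols (an empty list is the empty production).
    This representation is hypothetical, chosen only for this sketch.
    """
    result = {}
    for head, productions in grammar.items():
        # Productions of the form  head -> head alpha  (keep only alpha).
        recursive = [p[1:] for p in productions if p and p[0] == head]
        # All remaining productions  head -> beta.
        others = [p for p in productions if not p or p[0] != head]
        if not recursive:
            result[head] = [list(p) for p in productions]
            continue
        # Assumes no existing nonterminal is already called head + "_tail".
        tail = head + "_tail"
        result[head] = [list(beta) + [tail] for beta in others]
        result[tail] = [list(alpha) + [tail] for alpha in recursive] + [[]]
    return result


# expr -> expr "+" term | term   is left recursive.
grammar = {
    "expr": [["expr", "+", "term"], ["term"]],
    "term": [["NUMBER"]],
}
rewritten = eliminate_immediate_left_recursion(grammar)
for head, productions in rewritten.items():
    print(head, "->", productions)
```

This handles only immediate left recursion; indirect left recursion (A -> B ..., B -> A ...) needs a more general algorithm. The rewrite also changes the shape of the parse tree, which a real tool would have to account for.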

Prolog is known to be well suited to rapidly prototyping parsers, interpreters, and compilers. Sometimes, however, we want to use a library or tool written in another language (such as Java, C++, or Python) that has not been made available to Prolog through either a rewrite or a binding. Instead of porting the tool, we would like to take our Prolog parser and output an equivalent parser in that other language. DCGen is intended to allow exactly this, starting with Python and Ruby as the first target languages.

Project Goals

  • Providing an easy-to-use lexer generator for Prolog
  • Automatically handling left recursion in grammars
  • Giving EBNF capabilities to the Prolog DCG format
  • Facilitating parser generation for other languages
    • Python
    • Ruby
    • Parrot, and its grammar format
  • Reversible parsers: once you have a parser that turns the text of a file into a tree, you can parse in reverse, from the tree back to the text
    • Using either a template file or an exhaustive sample file (one that exercises a sufficient number of the grammar's rules), the system will automatically insert whitespace to give the output text a particular format and look and feel
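
The reversible-parser goal can be illustrated with a small round trip: parse text into a tree, then unparse the tree back into text whose whitespace comes from a template. The sketch below is in Python and is only an illustration under stated assumptions, not DCGen's design. The toy rule `assignment ::= NAME "=" NUMBER ";"`, the tuple-based tree, and the format-string template standing in for a template file are all hypothetical.

```python
import re

# Toy grammar (hypothetical): assignment ::= NAME "=" NUMBER ";"
TOKEN = re.compile(r"\s*([A-Za-z_]\w*|\d+|[=;])")


def tokenize(text):
    """Split the input into tokens, discarding all whitespace."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected input at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse(text):
    """Text -> tree: a list of ("assign", name, value) nodes."""
    tokens = tokenize(text)
    tree = []
    for i in range(0, len(tokens), 4):
        chunk = tokens[i:i + 4]
        well_formed = (
            len(chunk) == 4
            and chunk[0].isidentifier()
            and chunk[1] == "="
            and chunk[2].isdigit()
            and chunk[3] == ";"
        )
        if not well_formed:
            raise ValueError(f"malformed assignment near token {i}")
        tree.append(("assign", chunk[0], int(chunk[2])))
    return tree


def unparse(tree, template="{name} = {value};", separator="\n"):
    """Tree -> text: the template decides where whitespace is inserted."""
    return separator.join(
        template.format(name=name, value=value) for _, name, value in tree
    )


source = "x=1;  y =  22 ;"
tree = parse(source)
formatted = unparse(tree)
print(formatted)
# Round trip: parsing the regenerated text gives the same tree.
print(parse(formatted) == tree)
```

The original spacing is not preserved, since the tree does not record it; the template alone determines the layout of the output. Swapping in a different template, for example `"{name}={value};"`, changes the look of the regenerated text without changing the tree.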