The Koopa Cobol Parser Generator

Koopa is a Cobol parser generator with a plan for growth. It handles Cobol source files (fixed and free format) in isolation, with no preprocessing required, and accepts CICS/SQL fragments. Its design makes it easy to extend while limiting the impact of each extension on the overall project. It achieves this through a custom DSL for specifying island grammars concisely, and through a unit testing framework for those grammars which helps detect faults quickly and accurately.
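To give a feel for what "island" parsing means, here is a minimal sketch in plain Java. It is not Koopa's DSL and not how Koopa is implemented; the class name, method name and sample source are made up for illustration. It treats embedded `EXEC SQL ... END-EXEC` blocks as the islands of interest and skips every other token as "water".

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a hand-written island scanner, NOT Koopa's DSL or API.
public class IslandSketch {

    // Hypothetical Cobol fragment with one embedded SQL block.
    static final String SAMPLE =
        "MOVE 1 TO WS-ID.\n"
      + "EXEC SQL\n"
      + "    SELECT NAME INTO :WS-NAME FROM CUSTOMER WHERE ID = :WS-ID\n"
      + "END-EXEC.\n"
      + "DISPLAY WS-NAME.\n";

    // True for END-EXEC, with or without the trailing period.
    private static boolean isEndExec(String token) {
        String t = token.endsWith(".") ? token.substring(0, token.length() - 1) : token;
        return t.equalsIgnoreCase("END-EXEC");
    }

    // Returns the body of every EXEC SQL ... END-EXEC island, whitespace-normalised.
    static List<String> findSqlIslands(String source) {
        String[] tokens = source.trim().split("\\s+");
        List<String> islands = new ArrayList<>();
        int i = 0;
        while (i < tokens.length) {
            boolean startsIsland = i + 1 < tokens.length
                && tokens[i].equalsIgnoreCase("EXEC")
                && tokens[i + 1].equalsIgnoreCase("SQL");
            if (!startsIsland) {
                i++; // water: skip without trying to understand it
                continue;
            }
            StringBuilder body = new StringBuilder();
            int j = i + 2;
            while (j < tokens.length && !isEndExec(tokens[j])) {
                if (body.length() > 0) {
                    body.append(' ');
                }
                body.append(tokens[j]);
                j++;
            }
            if (j < tokens.length) {
                islands.add(body.toString()); // complete island found
                i = j + 1;
            } else {
                i++; // unterminated block: treat the EXEC token as water
            }
        }
        return islands;
    }

    public static void main(String[] args) {
        for (String island : findSqlIslands(SAMPLE)) {
            System.out.println(island);
        }
    }
}
```

The point of the approach is that the scanner only needs rules for the constructs it cares about, so supporting a new construct means adding one more island rather than reworking a full grammar.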

Getting started

There are two ways to get started with Koopa. The first is to download one of the pre-built binaries. Double-click it and, if you have a Java runtime installed, you should be up and running.

The alternative is to grab the code from Koopa's repository and build it yourself. For this you'll need both a Java development kit and a copy of Apache Ant. With both installed, you can build and run a clean executable by invoking:

$ ant clean build jar
$ java -jar koopa.jar

Quick tour

The GUI allows you to parse a batch of Cobol source files. It reports the success or failure of doing so, showing detailed errors and warnings. You get the option of exporting the parse results to CSV, which is how we got the graph above.

[Figure: Overview of parse results in the Koopa application.]
[Figure: Source code view with syntax highlighting.]

Koopa also provides a Cobol code visualiser. This application takes the parse tree and uses the information found in it to do syntax highlighting, as well as setting up an outline of the structure of the parsed file. You can also export the syntax tree to XML.

The syntax tree can be exposed to an XPath engine, allowing you to query by means of XPath expressions. The code visualiser integrates this functionality, so you can do interactive queries and inspect the result directly in the source code.
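If you export a syntax tree to XML, you can also query it outside of Koopa with the XPath support that ships with the JDK. The sketch below assumes nothing about Koopa's own API. The element and attribute names in the sample tree (`compilationUnit`, `paragraph`, `statement`, `kind`) are invented for illustration, and Koopa's actual XML output will look different.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Illustrative only: queries a made-up XML syntax tree with the JDK's XPath engine.
public class SyntaxTreeQuery {

    // Hypothetical export; real Koopa output uses its own element names.
    static final String SAMPLE_TREE =
        "<compilationUnit>"
      + "<paragraph name='MAIN'>"
      + "<statement kind='display'/>"
      + "<statement kind='perform'/>"
      + "</paragraph>"
      + "<paragraph name='CLEANUP'>"
      + "<statement kind='display'/>"
      + "</paragraph>"
      + "</compilationUnit>";

    // Counts the nodes in the given XML document that match the XPath expression.
    static int count(String xml, String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes =
                (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // All DISPLAY statements anywhere in the tree: prints 2
        System.out.println(count(SAMPLE_TREE, "//statement[@kind='display']"));
        // Statements directly inside the CLEANUP paragraph: prints 1
        System.out.println(count(SAMPLE_TREE, "//paragraph[@name='CLEANUP']/statement"));
    }
}
```

For a real export you would read the XML from a file instead of a string and adapt the expressions to the node names that Koopa produces.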

[Figure: XPath querying of the syntax tree.]
[Figure: Exploration of the Cobol grammar.]

You can also navigate straight from the source code to the relevant rule of the Cobol grammar. A breadcrumb trail through the grammar rules is shown for the current selection in the source view. Clicking any part of that breadcrumb trail will let you browse the Cobol grammar for that grammar rule.

It is also possible to invoke Koopa from the command line. Right now there is only a single target for parsing Cobol files and dumping the syntax tree to XML.

For maximum flexibility you can, of course, interact directly with Koopa's code. There is support for partial ANTLR grammars, as well as XPath expressions for processing and querying the syntax trees. Or you can simply capture the raw trees and process them however you want.

Cobol

To give you an idea of the capabilities of the current implementation of the Cobol parser, this section shows some results of applying Koopa to several public sources of Cobol code.

These numbers apply to revision - of the codebase.

NIST Cobol 85 Testsuite

OpenJensen

Cobol unit testing framework for mainframe programs

Applewood Computers Accounting System

CobCurses

If you know of other interesting Cobol-based projects, please let me know, and I'll check them out.

Documentation

If you're interested in a high-level overview of the concepts and structure of Koopa there is a PDF guide which provides this. I would recommend reading this if you're considering using Koopa in your own projects. It should help you find your way in its structure.

There is also a section on some of the XPath queries you can throw at Koopa, with examples and tips which should help you write your own.

Support

Please check the FAQ first. If you don't find an answer there, feel free to contact the project administrators through the project's summary page.

License

The code is offered under a BSD licensing scheme.

Publications

Andy Kellens, Kris De Schutter, Theo D'Hondt, Luc Jorissen, Bart Van Passel; Cognac: a framework for documenting and verifying the design of Cobol systems; Conference on Software Maintenance and Reengineering (CSMR), 2009.

This paper has some details on the philosophy behind Koopa. The technical details, however, are out of date, as the paper describes a previous version of Koopa. The main difference lies in the approach to defining the island grammars. The version in the paper made use of standard ANTLR. I moved away from a pure ANTLR approach because island grammars turned out to be difficult to evolve in that format, and defined my own DSL instead.

Contact

Feel free to contact the project administrators through the project's summary page.

SourceForge.net