XPath Queries

The following are some examples of XPath queries you can throw at Koopa, the answers it will come up with, and some insights which should help you write your own. You can try all of these out in the code visualiser, which has a dialog for entering XPath queries (under syntax tree > Query using XPath).

//compilationUnit//programName

This will yield the program names of all compilation units in the syntax tree. The node names are the names of the rules as they appear in the grammar. E.g. see here and here. Note the use of descendant selection ('//') rather than direct child selection ('/'); this helps in coping with changes in the grammar, because the query keeps matching when rules are added or removed between the two nodes.
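The difference between '//' and '/' can be seen in a small sketch. This is not Koopa's own API: it uses the XPath engine from the Java standard library on a hand-written XML stand-in for a syntax tree. The class name and the intermediate elements (identificationDivision, programIdParagraph) are assumptions made up for the illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DescendantVsChild {
    // Hypothetical XML stand-in for a syntax tree. Only compilationUnit and
    // programName come from the text above; the elements in between are invented.
    static final String TREE =
        "<compilationUnit><identificationDivision><programIdParagraph>"
        + "<programName>HELLO</programName>"
        + "</programIdParagraph></identificationDivision></compilationUnit>";

    // Evaluates an XPath query and returns the text content of each matched node.
    static List<String> select(String query) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(TREE)));
            NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(query, doc, XPathConstants.NODESET);
            List<String> out = new ArrayList<>();
            for (int i = 0; i < hits.getLength(); i++) {
                out.add(hits.item(i).getTextContent());
            }
            return out;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Descendant selection finds the name however deeply it is nested: [HELLO]
        System.out.println(select("//compilationUnit//programName"));
        // Direct child selection finds nothing, since programName is nested deeper: []
        System.out.println(select("//compilationUnit/programName"));
    }
}
```

If the grammar later drops or adds one of the intermediate rules, the first query still matches, while any query spelling out the full chain of children would have to be rewritten.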

//compilationUnit//programName//text()

This will also yield the program names of all compilation units in the syntax tree. The difference with the previous query is that this one returns the actual tokens, whereas the other one returns the node for the 'programName' grammar rule.
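As a rough illustration of nodes versus tokens, the sketch below runs both queries with the Java standard library XPath engine on a hand-written XML stand-in, where grammar rules are modelled as elements and tokens as text. That modelling, the class name, and the wrapping 'tree' element are assumptions; Koopa's own tree is not XML.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NodesVsTokens {
    // Hypothetical stand-in: two compilation units under an invented root element.
    static final String TREE =
        "<tree>"
        + "<compilationUnit><programName><cobolWord>HELLO</cobolWord></programName></compilationUnit>"
        + "<compilationUnit><programName><cobolWord>WORLD</cobolWord></programName></compilationUnit>"
        + "</tree>";

    // Reports, for each match, whether it is a rule node or a token (text).
    static List<String> describe(String query) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(TREE)));
            NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(query, doc, XPathConstants.NODESET);
            List<String> out = new ArrayList<>();
            for (int i = 0; i < hits.getLength(); i++) {
                Node n = hits.item(i);
                out.add(n.getNodeType() == Node.TEXT_NODE
                    ? "token " + n.getNodeValue()
                    : "node " + n.getNodeName());
            }
            return out;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Prints [node programName, node programName]
        System.out.println(describe("//compilationUnit//programName"));
        // Prints [token HELLO, token WORLD]
        System.out.println(describe("//compilationUnit//programName//text()"));
    }
}
```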

//node()[text()="FILLER"]

All nodes with text "FILLER". This can match any kind of node; it is not linked to specific grammar rules.

//workingStorageSection/dataDescriptionEntry[position()=1]//dataName//text()

The name of the first data item in the working storage section.

//workingStorageSection/dataDescriptionEntry[last()]//dataName//text()

The name of the last data item in the working storage section.

//workingStorageSection/dataDescriptionEntry[position()<5]//dataName//text()

The names of the first four data items in the working storage section.

//workingStorageSection/dataDescriptionEntry[.//dataName/cobolWord[text()='FOO']]

The data description entry for the data item with name 'FOO'.
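The four working storage queries above can be compared side by side in a small sketch. It uses the Java standard library XPath engine on a hand-written XML stand-in with five data items; the class name, the item names other than FOO, and the simplified tree shape are assumptions for the illustration only.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class WorkingStorageQueries {
    // Hypothetical working storage section with five data items, in this order.
    static final String TREE = tree("REC-A", "FOO", "REC-C", "REC-D", "REC-E");

    // Builds a simplified XML stand-in: one dataDescriptionEntry per name.
    static String tree(String... names) {
        StringBuilder xml = new StringBuilder("<workingStorageSection>");
        for (String name : names) {
            xml.append("<dataDescriptionEntry><dataName><cobolWord>")
               .append(name)
               .append("</cobolWord></dataName></dataDescriptionEntry>");
        }
        return xml.append("</workingStorageSection>").toString();
    }

    // Evaluates an XPath query and returns the text content of each matched node.
    static List<String> select(String query) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(TREE)));
            NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(query, doc, XPathConstants.NODESET);
            List<String> out = new ArrayList<>();
            for (int i = 0; i < hits.getLength(); i++) {
                out.add(hits.item(i).getTextContent());
            }
            return out;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String entry = "//workingStorageSection/dataDescriptionEntry";
        System.out.println(select(entry + "[position()=1]//dataName//text()")); // [REC-A]
        System.out.println(select(entry + "[last()]//dataName//text()"));       // [REC-E]
        System.out.println(select(entry + "[position()<5]//dataName//text()")); // [REC-A, FOO, REC-C, REC-D]
        // Selects the whole entry node, not just its name token: [FOO]
        System.out.println(select(entry + "[.//dataName/cobolWord[text()='FOO']]"));
    }
}
```

Note that the positional predicates count entries among the direct children of the working storage section, so a nested step such as '//dataDescriptionEntry[1]' could behave differently on a real tree.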

//procedureDivision//statement

All statements in the procedure division. This will also match nested statements.

//node()[@line=100]

Everything found at line 100. Koopa exposes some meta information about tokens as attributes, which you can then use in your queries.

//node()[@tag="STRING_LITERAL"]

All string literals in the program. Koopa adds several tags to nodes during lexing/tokenising, which helps direct the parser. You can use this to limit the results of your queries to certain kinds of tokens.

//node()[@tag="WATER"]

Everything in the water (i.e. not recognised by the parser). This tag gets added at the end of parsing. Ideally you would never see any of these, but given the nature of Koopa's grammar you should always expect to see them. The explicit tagging can help you deal with them.
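The attribute-based queries can be sketched as follows. Plain XML cannot attach attributes to text, so this stand-in models each token as an invented 't' element carrying 'line' and 'tag' attributes; that modelling, the class name, and the sample tokens are assumptions, and Koopa's real tree may expose the attributes differently. The queries use single quotes inside the Java strings, which XPath treats the same as double quotes.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class TokenAttributes {
    // Hypothetical stand-in: tokens are 't' elements with line/tag attributes.
    static final String TREE =
        "<procedureDivision>"
        + "<statement><t line='100'>DISPLAY</t>"
        + "<t line='100' tag='STRING_LITERAL'>\"Hi\"</t></statement>"
        + "<t line='101' tag='WATER'>FOOBAR</t>"
        + "</procedureDivision>";

    // Evaluates an XPath query and returns the text content of each matched node.
    static List<String> select(String query) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(TREE)));
            NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(query, doc, XPathConstants.NODESET);
            List<String> out = new ArrayList<>();
            for (int i = 0; i < hits.getLength(); i++) {
                out.add(hits.item(i).getTextContent());
            }
            return out;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(select("//node()[@line=100]"));             // [DISPLAY, "Hi"]
        System.out.println(select("//node()[@tag='STRING_LITERAL']")); // ["Hi"]
        System.out.println(select("//node()[@tag='WATER']"));          // [FOOBAR]
    }
}
```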

//callStatement/*[1]//text()

The targets of all call statements. A query like this can be the starting point for finding inter-program dependencies.
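One way to turn this query into a dependency list is to collect the distinct targets. The sketch below does so with the Java standard library XPath engine on a hand-written XML stand-in. The class name, the 'literal' and 'identifier' element names, the program names, and the choice to model keywords as plain text (so that '*[1]' skips them) are all assumptions, not Koopa's actual tree layout.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class CallTargets {
    // Hypothetical stand-in: three CALL statements, two with the same target.
    static final String TREE =
        "<procedureDivision>"
        + "<callStatement>CALL<literal>'PAYROLL'</literal>"
        + "USING<identifier>WS-REC</identifier></callStatement>"
        + "<callStatement>CALL<literal>'AUDIT'</literal></callStatement>"
        + "<callStatement>CALL<literal>'PAYROLL'</literal></callStatement>"
        + "</procedureDivision>";

    // Returns the distinct call targets, sorted alphabetically.
    static List<String> targets() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(TREE)));
            NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate("//callStatement/*[1]//text()", doc, XPathConstants.NODESET);
            TreeSet<String> distinct = new TreeSet<>();
            for (int i = 0; i < hits.getLength(); i++) {
                distinct.add(hits.item(i).getNodeValue());
            }
            return new ArrayList<>(distinct);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The USING argument is not reported, because it is not the first child node.
        System.out.println(targets()); // ['AUDIT', 'PAYROLL']
    }
}
```

Targets given as identifiers rather than literals are resolved only at run time, so a list built this way is a starting point rather than a complete dependency graph.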