Tweets by @Actipro
Please take some time to learn more about us and our product offerings.
We also showed how the grammar automatically generates an AST of the parsed document for you. The generated tree generally has much more information than what is useful though. Today’s post is going to show how we can completely customize the AST output so that it focuses only on the information we care about.
Our grammar/AST framework has an extremely powerful tree construction mechanism with numerous built-in helper methods to do common tree construction tasks. You also have the ability to create completely custom tree construction nodes via C#/VB code if you like.
The default tree constructors will output nearly everything that is parsed. Each AST node that is generated automatically contains the offset range of the text that created it.
Any EBNF production will create a scope and will output a new hierarchy level node in an AST. The AST node that is created can have zero to many children, depending on what was matched within the production.
If the production is the root production for a non-terminal, then the AST node that is output for the production will have a value that is the non-terminal’s key. Note that productions can be used within other productions. In that case, the nested productions generate an AST node that has children but no value.
Terminals, which match tokens, will output an AST node with no children that has a value of the text that was read.
“Optional” quantifiers will output the tree construction results for the quantifier’s child term if the term matched successfully. Otherwise an empty structural AST node will be output.
For any other quantifier (“zero-or-more”, “one-or-more”, etc.), a new scope is created within the quantifier. The quantifier will output a new hierarchy level node in the AST. The AST node that is created can have zero to many children, depending on what was matched in the quantifier.
We mentioned a couple places above where scopes are created. Namely, scopes occur in productions and in quantifiers (ones other than “optional”).
Any EBNF term can be assigned a label. A label is a way for a tree constructor to identify something that was matched in its scope. Labels are only useful when creating custom tree constructors.
Terminals and non-terminals, represented by the Terminal and NonTerminal classes can be auto-converted to their EBNF term equivalents (EbnfTerminal and EbnfNonTerminal) by calling their indexer and passing the label as a string parameter.
So say we have a terminal within a production for an identifier and we want to apply a label to it. We can do so like this:
For any object that is EbnfTerm-based, there is a SetLabel method you can call instead like:
In that example, statement is a NonTerminal, its ZeroOrMore method call converted it to an EbnfNonTerminal and wrapped it with an EbnfQuantifier. Finally we set the label stmt on the quantifier via the SetLabel method.
Adding tree constructors to a production is easy. So far we’ve only seen how to indicate the EBNF pattern expression for a production. A tree constructor can be added to this by using the > operator followed by an ITreeConstructionNode instance like this:
1: expression.Production = equalityExpression["exp"] > AstFrom("exp");
In the above sample code I created a production for an expression non-terminal, which simply calls into an equalityExpression non-terminal. It applies a label to that non-terminal’s results so that the tree constructor can work with them. The AstFrom tree constructor simply looks for a match with the specified label and returns its value.
Thus the result of a match with this production is the result of the equalityExpression non-terminal. If we didn’t include the tree constructor portion of the above production, the result would be an Expression AST node whose child was the result of the equalityExpression non-terminal.
There are a number of built-in tree construction nodes that should handle nearly all your tree construction needs. These nodes are created by method calls on the Grammar class.
The ITreeConstructionNode generated by this method will create an AST node with the specified value and optional child nodes.
This example creates an AST node with value ReturnStatement that has a child node resulting from an EBNF term match whose label is exp:
1: Ast("ReturnStatement", AstFrom("exp"))
The ITreeConstructionNode generated by this method works the same way as Ast except that the root node created will have the value of the valueOfNode tree construction node result’s value.
This example for an XML attribute name and value shows how a parent AST node can be created whose value is the attribute name and the child AST node will be the attribute value:
1: AstValueOf(AstFrom("attrName"), AstFrom("attrValue"))
The ITreeConstructionNode generated by this method will create an AST node from an EBNF term match with the specified label.
This example creates an AST node resulting from an EBNF term match whose label is exp:
The ITreeConstructionNode generated by this method will create an AST node from the first child of an EBNF term match with the specified label.
This example creates an AST node resulting from the first child of an EBNF term match whose label is op:
The ITreeConstructionNode generated by this method will create an AST node from the children of an EBNF term match with the specified label.
This example creates an AST node with value Block that contains the children of an EBNF term match whose label is stmt:
1: Ast("Block", AstChildrenFrom("stmt"))
The ITreeConstructionNode generated by this method will create an AST node with the specified value and optional child nodes, but only if there is more than one child available. If one child is available, it is directly returned.
This example creates an AST node with value Parameters that has child nodes from the union of the children of EBNF term matches params1 and params2:
1: AstConditional("Parameters", AstChildrenFrom("params1"),
Note that if only one result came from the union of params1 and params2 children, that single result would be returned and would not be wrapped by a Parameters node.
The ITreeConstructionNode generated by this method will create an AST node with the specified value whose child nodes come from the children of the specified labeled EBNF term match, but only if at least one child node is available. If no child nodes are available, nothing is created.
This example creates a FunctionDeclaration AST node whose first child comes from an EBNF term match with label name. The next child will be a Parameters node but only if there was at least one child within an EBNF term match with label params. The last child will be the result of an EBNF term match with label block.
1: Ast("FunctionDeclaration", AstFrom("name"),
2: AstConditionalFrom("Parameters", "params"), AstFrom("block"))
This tree construction node is very handy when there are things such as optional parameter or argument lists in a grammar and you don’t want to clutter up the resulting AST when a particular member doesn’t have parameters or arguments.
The ITreeConstructionNode generated by this method works the same way as AstConditional except that the root node created when there is more than one child available will have the value of the valueOfNode tree construction node result’s value.
This example for a binary operator expression shows how the operator used will be the value of the root node created, but only if the leftexp and rightexp returned matches. If only one match occurred (no operator was used), then that match will be directly passed up.
1: AstValueOfConditional(AstChildFrom("op"), AstFrom("leftexp"),
This tree construction node is pretty advanced and will be easier to understand in the next post, where we show practical usage of it.
The built-in tree construction nodes should handle nearly all the tree construction scenarios you have. But what if there is something that requires some additional programmatic logic? No problem, you can simply create a class that implements ITreeConstructionNode and inject it anywhere just like you do with the results of the above methods.
In this post, we reviewed all the basics on how tree construction works. In the next post, we’ll revisit our Simple language and will apply tree construction techniques to its productions. The default tree constructors create an enormous AST, but once we apply our own tree construction, the AST will be short and concise.
August 10, 2010 at 02:11
SyntaxEditor grammar/AST framework part 4: Tree construction intro
You've been kicked (a good thing) - Trackback from DotNetKicks.com
August 27, 2010 at 01:40
FeedBurner blog post RSS feed issue fixed
FeedBurner blog post RSS feed issue fixed
The Actipro Blog - WPF and WinForms Development