The Actipro Blog

SyntaxEditor grammar/AST framework part 6: Introduction to callbacks and error handling

August 12, 2010 at 1:22 AM
by Bill Henning (Actipro)

In the previous post, we optimized the tree construction output of our Simple language to be very concise.  The next step in building a grammar is to make sure that it properly handles errors.  After all, since this grammar framework is intended to be used with SyntaxEditor, our code editor control, we have to assume that most of the time the document’s code passed to our grammar parser will be in an invalid state.  The user is continuously typing and modifying it.

In today’s post we will look at the various callbacks that are available to you, probably the most important of which are error handling callbacks.  We’ll also dig into error handling options.

What is a callback?

As we’ve seen in the previous posts in this series, our entire grammar is built directly in C# or VB code.  We do not do code generation like a lot of other parser generators do.  A benefit of this is that you can interact directly with objects in the grammar. 

One way to interact with objects is to assign callbacks to them.  All EbnfTerm-based objects support four callbacks:

  • Initialize
  • Success
  • Error
  • Complete

And as shown in earlier posts, NonTerminal objects can be assigned a can-match callback.

Callbacks are simply delegates that get called when certain events occur.  You can point them to methods you declare, or you can supply lambda expressions instead.

Let’s look at each of the five callbacks.

EbnfTerm callbacks

Initialize and Complete callbacks

The Initialize callback is called right before an EbnfTerm is about to be parsed.  The Complete callback is called right after an EbnfTerm has been parsed.  Thus they are always paired.

It is important to note that Complete is called regardless of whether the term’s parsing succeeded or failed.

Success and Error callbacks

The Success callback is called right after an EbnfTerm is parsed successfully.  If the EbnfTerm was not parsed successfully, the Error callback is called instead.  Thus every term the parser attempts to parse fires exactly one of its Success or Error callbacks.

The Success and Error callbacks occur immediately before the Complete callback does.

Summary of EbnfTerm callbacks

A term that is successfully parsed will offer this sequence of callbacks:

  • Initialize
  • (parsing attempt here)
  • Success
  • Complete

A term that is not successfully parsed will offer this sequence of callbacks:

  • Initialize
  • (parsing attempt here)
  • Error
  • Complete
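
The two sequences above can be reproduced with a small stand-alone model.  This is only a sketch: TermModel and CallbackOrderDemo are hypothetical names invented for illustration and are not framework types.  Only the ordering of the four callbacks is taken from the description above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for an EBNF term; NOT the framework's EbnfTerm class.
public sealed class TermModel {
    public Action OnInitialize = () => { };
    public Action OnSuccess = () => { };
    public Action OnError = () => { };
    public Action OnComplete = () => { };

    // Runs the callbacks in the documented order around a parsing attempt.
    public bool Parse(Func<bool> attempt) {
        OnInitialize();
        bool matched = attempt();
        if (matched) OnSuccess(); else OnError();
        OnComplete();   // fired whether the attempt succeeded or failed
        return matched;
    }
}

public static class CallbackOrderDemo {
    // Returns the names of the callbacks that fired, in order.
    public static List<string> Trace(bool attemptSucceeds) {
        var log = new List<string>();
        var term = new TermModel {
            OnInitialize = () => log.Add("Initialize"),
            OnSuccess = () => log.Add("Success"),
            OnError = () => log.Add("Error"),
            OnComplete = () => log.Add("Complete")
        };
        term.Parse(() => attemptSucceeds);
        return log;
    }
}
```

Calling CallbackOrderDemo.Trace(true) yields Initialize, Success, Complete, while Trace(false) yields Initialize, Error, Complete.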

Definitions

The Initialize, Success, and Complete callbacks have this definition:

    public delegate void ParserCallback(IParserState state);

It is passed an IParserState that gives you access to look-ahead tokens, custom data, and the matches that have been made at the current scope level.  You can update custom data, or even modify the matches collection if you wish in any of these callbacks. 

Custom data can be anything you wish.  Perhaps as you traverse through certain non-terminals, you want to maintain a stack of which ones you’ve visited.  Your custom data could contain such a stack.  In the Initialize callback for the non-terminals you wish to track, you could push an item on the stack.  In the Complete callback for the non-terminals you wish to track, you could pop an item off the stack.
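
The push/pop idea can be sketched without the framework at all.  In the model below, VisitTracker plays the role of the custom data object and the Visit helper stands in for a non-terminal whose Initialize callback pushes and whose Complete callback pops.  All type and method names here are hypothetical, and the Block and Statement non-terminal names are illustrative only.  How custom data is attached to IParserState is not shown, since that API is not covered in this post.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical custom data object; custom data can be anything you wish.
public sealed class VisitTracker {
    public Stack<string> Visited { get; } = new Stack<string>();
}

public static class VisitStackDemo {
    // Simulates one non-terminal: push in Initialize, pop in Complete.
    private static void Visit(VisitTracker data, string name,
                              List<string> paths, Action parseChildren) {
        data.Visited.Push(name);                 // what an Initialize callback would do
        string[] path = data.Visited.ToArray();  // top of the stack comes first
        Array.Reverse(path);                     // outermost non-terminal first
        paths.Add(string.Join(">", path));
        try {
            parseChildren();
        }
        finally {
            data.Visited.Pop();                  // what a Complete callback would do
        }
    }

    // Walks a nested FunctionDeclaration > Block > Statement structure and
    // returns the path recorded on entry to each non-terminal.
    public static List<string> Run(VisitTracker data) {
        var paths = new List<string>();
        Visit(data, "FunctionDeclaration", paths, () =>
            Visit(data, "Block", paths, () =>
                Visit(data, "Statement", paths, () => { })));
        return paths;
    }
}
```

Because Complete is called regardless of whether parsing succeeded, the pop is placed in a finally block in this model, which keeps the stack balanced even when an inner step fails.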

The Error callback has this definition:

    public delegate IParserErrorResult ParserErrorCallback(IParserState state);

The Error callback also gets an IParserState passed to it.  However, it differs from the others in that it must return an IParserErrorResult object.  Since the Error callback is called when an error occurs, this result tells the parser how to proceed.  There are options for preventing any errors from being reported and options for whether to continue on as if no error occurred.

The standard set of options is provided by ParserErrorResults via static properties:

  • Default – Potentially report errors and return a match failure.
  • Continue – Potentially report errors but continue on.
  • Ignore – Never report errors and continue on.
  • NoReport – Never report errors and return a match failure.
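
One way to keep the four options straight is to view each as a pair of independent choices: whether an error may be reported, and whether parsing continues or returns a match failure.  The sketch below encodes that reading of the list above.  ErrorResultModel is a hypothetical type made up for illustration and says nothing about how IParserErrorResult is actually implemented.

```csharp
// Hypothetical model: each standard result is a pair of independent choices.
public sealed class ErrorResultModel {
    public bool MayReportError { get; }
    public bool ContinueParsing { get; }

    private ErrorResultModel(bool mayReportError, bool continueParsing) {
        MayReportError = mayReportError;
        ContinueParsing = continueParsing;
    }

    // Potentially report errors and return a match failure.
    public static readonly ErrorResultModel Default = new ErrorResultModel(true, false);
    // Potentially report errors but continue on.
    public static readonly ErrorResultModel Continue = new ErrorResultModel(true, true);
    // Never report errors and continue on.
    public static readonly ErrorResultModel Ignore = new ErrorResultModel(false, true);
    // Never report errors and return a match failure.
    public static readonly ErrorResultModel NoReport = new ErrorResultModel(false, false);
}
```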

Sample callback

Callbacks can be assigned with the OnInitialize, OnSuccess, OnError, and OnComplete methods.

This root production shows how an Error callback can be assigned by calling the OnError method and passing it the delegate to use.  In this case the method that will be called is AdvanceToDefaultState.

    this.Root.Production = functionDeclaration.OnError(AdvanceToDefaultState)
        .ZeroOrMore().SetLabel("decl")
        > Ast("CompilationUnit", AstChildrenFrom("decl"));

What happens is that if an error occurs while parsing FunctionDeclaration, the AdvanceToDefaultState method is called, which does this:

    /// <summary>
    /// Advances the token reader to the next 'function' token from where parsing 
    /// can resume.
    /// </summary>
    /// <param name="state">A <see cref="IParserState"/> that provides information 
    /// about the parser's current state.</param>
    /// <returns>An <see cref="IParserErrorResult"/> value indicating a result.</returns>
    private IParserErrorResult AdvanceToDefaultState(IParserState state) {
        state.TokenReader.AdvanceTo(SimpleTokenId.Function);
        return ParserErrorResults.Continue;
    }

You can see how it tells the token reader to advance to the next Function token.  We have skipped over any potentially “bad” tokens and have gone right to the next token that we know will successfully start a FunctionDeclaration.

The callback returns ParserErrorResults.Continue, which means potentially report an error, but continue on instead of breaking out of the ZeroOrMore quantifier that contains the FunctionDeclaration non-terminal.

Built-in error callbacks

There are also some built-in Error callbacks that you can assign.  They don’t do anything other than return the various related ParserErrorResults values:

  • OnErrorContinue – Returns ParserErrorResults.Continue.
  • OnErrorIgnore – Returns ParserErrorResults.Ignore.
  • OnErrorNoReport – Returns ParserErrorResults.NoReport.

This example shows the use of OnErrorContinue, where we will report an error if the semi-colon isn’t matched, but we’ll continue on with parsing as if it were there:

    variableDeclarationStatement.Production = @var + @identifier["name"] + 
        @semiColon.OnErrorContinue()
        > Ast("VariableDeclarationStatement", AstFrom("name"));

Advanced error handling

Sometimes an error will occur at a reference to a non-terminal that is capable of starting with multiple different terminals.  In that case, the parser doesn’t report an error by default since it doesn’t know what it should say.  Here’s a perfect example:

    returnStatement.Production = @return + expression["exp"].OnErrorContinue() + 
        @semiColon.OnErrorContinue()
        > Ast("ReturnStatement", AstFrom("exp"));

Say the input for this production was return return.  Obviously that is invalid as the second return keyword doesn’t fit into an expression.  Since Expression can start with numerous terminals, an error occurs but no parse error is reported into the parse errors collection since the parser doesn’t know what to tell the UI. 

We have two options for handling this scenario.

Option 1 – Use an error alias

When a NonTerminal is assigned an ErrorAlias, it will report an error by default if it fails to match.  We only want to set error aliases on higher-level non-terminals such as Expression or Statement non-terminals.  We can do so like this:

    var expression = new NonTerminal("Expression") { ErrorAlias = "Expression" };

That will tell the parser to automatically report an Expression expected parse error if Expression fails to match.  This is the easiest way to handle this scenario.

Option 2 – Custom error callback

The second option, useful when we need to customize the error message further, is to use an error callback:

    returnStatement.Production = @return + expression["exp"].OnError(ExpressionExpected) + 
        @semiColon.OnErrorContinue()
        > Ast("ReturnStatement", AstFrom("exp"));

The error callback can be implemented like this:

    /// <summary>
    /// Occurs when an expression is expected but not found.
    /// </summary>
    /// <param name="state">A <see cref="IParserState"/> that provides information 
    /// about the parser's current state.</param>
    /// <returns>An <see cref="IParserErrorResult"/> value indicating a result.</returns>
    private IParserErrorResult ExpressionExpected(IParserState state) {
        // Report a custom error, and return a value telling the parser to not report 
        // errors and continue on
        state.ReportError(ParseErrorLevel.Error, "Expression should have been here.");
        return ParserErrorResults.Ignore;
    }

Note that here we report an error Expression should have been here instead of the Expression expected message that comes from option #1.  We also return ParserErrorResults.Ignore to ensure that no other error message is reported, and tell the parser to continue on.

Error reporting notes

We’ve now seen how both terminals and non-terminals are capable of reporting parse errors that can be displayed in the user interface.  In some scenarios, multiple errors may be reported for a given text offset.  Allowing this can really confuse the end user.  The grammar framework has built-in functionality such that it will only report the first parse error for a given offset, since that is the most important one.

The parse error collection returned in the parse data result back to the document will also be sorted by each error’s location in the document.
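
The net effect of these two rules (first error per offset wins, final list sorted by location) can be sketched with LINQ.  This is a model of the observable result only, not of how the framework implements it.  ParseErrorModel and ErrorReportingDemo are hypothetical names, and an error's location is simplified to a single integer offset.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified parse error: just an offset and a message.
public sealed class ParseErrorModel {
    public int Offset { get; }
    public string Message { get; }

    public ParseErrorModel(int offset, string message) {
        Offset = offset;
        Message = message;
    }
}

public static class ErrorReportingDemo {
    // Keeps only the first error reported at each offset, then sorts by offset.
    public static List<ParseErrorModel> Normalize(IEnumerable<ParseErrorModel> reported) {
        return reported
            .GroupBy(e => e.Offset)   // groups keep elements in reported order
            .Select(g => g.First())   // first error at that offset wins
            .OrderBy(e => e.Offset)   // sort by location in the document
            .ToList();
    }
}
```

For example, if errors are reported at offsets 20, 5, 20, and 5 in that order, the result contains two errors: the first one reported at offset 5 followed by the first one reported at offset 20.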

NonTerminal can-match callbacks

Can-match callbacks can optionally be assigned to any NonTerminal.  Since our grammar is LL(*)-based, each NonTerminal maintains a set of terminals that it knows are able to start it.  This is called the “first set”.  For instance a Simple language FunctionDeclaration production always starts with a function terminal.  Thus the FunctionDeclaration’s first set consists of a single function terminal.

Sometimes you may have an alternation EBNF term with two or more non-terminal references that have intersecting first sets.  We see this in the Simple language where both the SimpleName and FunctionAccessExpression non-terminal productions start with Identifier terminals, and the PrimaryExpression non-terminal production is an alternation that contains both of them.  This situation is called ambiguity and the grammar will warn you when it detects the scenario so that you can fix it.

When a can-match callback is specified, it effectively overrides the “first set” of the non-terminal.  Thus in the Simple language where the ambiguity occurred, the ambiguity is resolved by applying a can-match callback to one of the ambiguous non-terminals.
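
To make the notion of intersecting first sets concrete, here is a rough sketch of how such an ambiguity could be detected for the alternatives of an alternation.  It is not the framework's detection logic.  FirstSetDemo is a hypothetical helper, and the first sets are supplied by hand as sets of token names.

```csharp
using System.Collections.Generic;

public static class FirstSetDemo {
    // Returns "A / B" for every pair of alternatives whose first sets intersect.
    public static List<string> FindAmbiguousPairs(
            IList<KeyValuePair<string, HashSet<string>>> alternatives) {
        var result = new List<string>();
        for (int i = 0; i < alternatives.Count; i++) {
            for (int j = i + 1; j < alternatives.Count; j++) {
                // Overlaps is true when the two sets share at least one element
                if (alternatives[i].Value.Overlaps(alternatives[j].Value))
                    result.Add(alternatives[i].Key + " / " + alternatives[j].Key);
            }
        }
        return result;
    }
}
```

Given SimpleName and FunctionAccessExpression, each with a first set of { Identifier }, plus a third illustrative alternative whose first set is { Number }, the only pair returned is SimpleName / FunctionAccessExpression.  That is the pair the can-match callback is meant to disambiguate.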

Definition

A can-match callback has this definition:

    public delegate bool ParserCanMatchCallback(IParserState state);

It is passed an IParserState and returns a boolean value indicating whether the non-terminal can match with the current state.  Logic in the callback generally examines look-ahead tokens to see what the next several tokens are.  Our grammar is LL(*) because you are able to look ahead all the way to the end of the document if you wish.  The * means infinite look-ahead.

Sample callback

The Simple language grammar’s FunctionAccessExpression has a can-match callback.  This code can be used to assign the callback:

    functionAccessExpression.CanMatchCallback = CanMatchFunctionAccessExpression;

And here is the callback implementation:

    /// <summary>
    /// Returns whether the <c>FunctionAccessExpression</c> non-terminal can match.
    /// </summary>
    /// <param name="state">A <see cref="IParserState"/> that provides information 
    /// about the parser's current state.</param>
    /// <returns>
    /// <c>true</c> if the <see cref="NonTerminal"/> can match with the current state;
    /// otherwise, <c>false</c>.
    /// </returns>
    private bool CanMatchFunctionAccessExpression(IParserState state) {
        return (state.TokenReader.LookAheadToken.Id == SimpleTokenId.Identifier) && 
            (state.TokenReader.GetLookAheadToken(2).Id == SimpleTokenId.OpenParenthesis);
    }

CanAlwaysMatch callback

There is a built-in method called Grammar.CanAlwaysMatch, which is a can-match callback that always returns true.  It is useful for the scenario described in the next section.

Proper design of iterative productions

Very often a root compilation unit has some other set of non-terminals that repeat within it.  In this case we set up a non-terminal EBNF term with an error handler and place it in a ZeroOrMore quantifier like this:

    this.Root.Production = functionDeclaration.OnError(AdvanceToDefaultState)
        .ZeroOrMore().SetLabel("decl")
        > Ast("CompilationUnit", AstChildrenFrom("decl"));

What happens here is that if an error occurs in FunctionDeclaration it will advance to the next function token (per above) and will continue on with the next FunctionDeclaration match. 

But what happens if at the start of the document we have an invalid token instead, such as an Identifier?  FunctionDeclaration doesn’t start with an Identifier terminal.  It only starts with a Function terminal.  Thus the entire ZeroOrMore quantifier will never be entered and your CompilationUnit AST node output will be empty, even if there are a lot of valid function declarations after that initial Identifier.

We can easily handle this scenario by using the CanAlwaysMatch callback on FunctionDeclaration:

    // Make sure FunctionDeclaration will always be examined, 
    // even if the next token is not 'function'
    functionDeclaration.CanMatchCallback = CanAlwaysMatch;

Thus we have forced the “first set” of FunctionDeclaration to be overridden and even tokens like Identifier will cause us to enter FunctionDeclaration.  In that scenario, Identifier won’t match with the Function terminal and an error will be reported that indicates ‘function’ expected.  This scenario is now properly handled.
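
The difference this makes can be seen in a toy simulation.  The sketch below does not use the grammar framework.  It models a ZeroOrMore loop over a made-up rule in which a declaration is the token "function" followed by the token "name", with error recovery that skips ahead to the next "function" token and continues.  IterationDemo, the token strings, and the error messages are all assumptions for illustration.

```csharp
using System.Collections.Generic;

public static class IterationDemo {
    // Returns how many toy declarations ("function" then "name") were matched.
    public static int CountDeclarations(string[] tokens, bool canAlwaysMatch,
                                        List<string> errors) {
        int pos = 0, matched = 0;
        while (pos < tokens.Length) {
            // Without a can-match override, only the first set enters the loop
            if (!canAlwaysMatch && tokens[pos] != "function") break;

            if (tokens[pos] == "function") {
                pos++;   // consume 'function'
                if (pos < tokens.Length && tokens[pos] == "name") {
                    pos++;
                    matched++;
                    continue;
                }
                errors.Add("name expected");
            }
            else {
                errors.Add("function expected");
            }

            // Recovery: skip to the next 'function' token, then keep looping
            while (pos < tokens.Length && tokens[pos] != "function") pos++;
        }
        return matched;
    }
}
```

With the input identifier, function, name, function, name, passing canAlwaysMatch as false matches nothing because the loop is never entered.  Passing true reports a single "function expected" error and still matches both declarations.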

Advanced implementations

What about languages such as C# where you could have a using statement, namespace, or type declaration at the root compilation unit level?  We’ll apply the same concepts.

Make a new non-terminal called CompilationUnitContent that has an alternation between those other non-terminals.  Make the root production call CompilationUnitContent in the same way FunctionDeclaration was called above, with an error handler.  Then likewise, we set the CanAlwaysMatch callback on the CompilationUnitContent non-terminal.

The AdvanceToDefaultState method that we use needs to be designed to advance to the next Using, Namespace, etc. token.  This is easy as the AdvanceTo method we provide on ITokenReader can accept any number of token ID values.
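
As a rough illustration of advancing to any one of several tokens, here is a stand-alone model of a token reader.  TokenReaderModel is hypothetical and is not ITokenReader.  Tokens are plain strings, and the sketch assumes that the reader stays where it is when the current token is already one of the targets, which may or may not match the real AdvanceTo behavior.

```csharp
using System;

// Hypothetical token reader over string tokens; NOT the framework's ITokenReader.
public sealed class TokenReaderModel {
    private readonly string[] tokens;

    public int Position { get; private set; }

    public TokenReaderModel(params string[] tokens) {
        this.tokens = tokens;
    }

    public bool IsAtEnd {
        get { return Position >= tokens.Length; }
    }

    // The current token, or "<eof>" once every token has been consumed.
    public string Current {
        get { return IsAtEnd ? "<eof>" : tokens[Position]; }
    }

    // Skips tokens until the current token is one of the targets.
    // Returns false if the end was reached without finding any target.
    public bool AdvanceTo(params string[] targetIds) {
        while (!IsAtEnd && Array.IndexOf(targetIds, Current) < 0)
            Position++;
        return !IsAtEnd;
    }
}
```

With the tokens junk, }, namespace, Foo, using, a call to AdvanceTo("using", "namespace", "class") stops on the namespace token, since it is the first of the targets encountered.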

Finally we can use one of the two options listed in the Advanced error handling section above to report a helpful parse error to the end user.

Next steps

Today we’ve covered a lot of ground on callbacks and error handling.  You can see the power that our grammar framework has in these areas since grammars are written natively in C# and VB. 

In the next post, we’ll apply these techniques to our Simple language grammar so that it always provides us with as complete an AST as possible, even when there are parse errors present, as will often be the case when editing documents in a code editor control such as SyntaxEditor.

Tags: wpf, silverlight, syntaxeditor
Filed under: Actipro, In development, WPF, Silverlight