Error reporting/Handling

Error reporting/Handling

Martin MacRobert
Hi Everyone,
I'm trying to create an error mechanism for my Spirit grammar, but the
examples in the standard Boost distribution seem a bit too elementary
to be useful for my application.

I need to accomplish 3 things.
1. Generate a "nice" error, that incorporates the offending token in
the error message.
2. Show where in the script the error occurred.
3. Show all errors in a script, as best as possible, instead of one at
a time, similar to a typical compiler error report.

I had tried something based on the error-reporting example, but the
problem was that in the context of a large grammar, the error was
fired when the parser backtracks out of a rule to try the next rule.
So it reported errors for a script that did not actually contain any
grammatical errors.

Can anyone give advice as to how I can accomplish the 3 goals?
Can anyone provide example code on how to achieve this?

Regards,
Martin


_______________________________________________
Spirit-general mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/spirit-general
Re: Error reporting/Handling

Phil Endecott-9
Hi Martin,

As you probably know, over in Yacc-land they have a special 'error'
token to use for error handling.  So you can write (pseudocode):

program ::= list(command)

command ::= assignment ';'
           | function-call ';'
           | error ';'

The action for the error token will be to show an error message.  The
important thing is that error, like the other alternatives for command,
finishes with a semicolon.  So if the parser sees something that is not
an assignment or a function call it will treat it as an error *up to the
next semicolon*; on reaching a semicolon it will consider that as the
end of the (invalid) command and start parsing the next command.  This
is the key to trying to show all of the errors in a file.
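
To make the resynchronisation idea concrete, here is a minimal hand-written sketch in plain C++ (neither Yacc nor Spirit). The toy command syntax and the names is_name, is_valid_command and check_script are made up for illustration. The only point is that a bad command is reported and then skipped up to the next semicolon, so later errors are still found.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical toy language: a command is "name=name" or "name()",
// and every command is terminated by ';'.
static bool is_name(const std::string& s) {
    if (s.empty()) return false;
    for (char c : s)
        if (!std::isalpha(static_cast<unsigned char>(c))) return false;
    return true;
}

static bool is_valid_command(const std::string& cmd) {
    // Function call: a name followed by "()".
    if (cmd.size() > 2 && cmd.compare(cmd.size() - 2, 2, "()") == 0)
        return is_name(cmd.substr(0, cmd.size() - 2));
    // Assignment: name '=' name.
    std::string::size_type eq = cmd.find('=');
    return eq != std::string::npos
        && is_name(cmd.substr(0, eq)) && is_name(cmd.substr(eq + 1));
}

// Checks the whole script and returns one message per bad command.
std::vector<std::string> check_script(const std::string& script) {
    std::vector<std::string> errors;
    std::string::size_type pos = 0;
    while (pos < script.size()) {
        // A command, valid or not, extends to the next semicolon.
        std::string::size_type semi = script.find(';', pos);
        if (semi == std::string::npos) semi = script.size();
        std::string cmd;  // the command with whitespace removed
        for (std::string::size_type i = pos; i < semi; ++i)
            if (!std::isspace(static_cast<unsigned char>(script[i])))
                cmd += script[i];
        if (!cmd.empty() && !is_valid_command(cmd))
            errors.push_back("bad command '" + cmd + "'");
        pos = semi + 1;  // resynchronise just after the semicolon
    }
    return errors;
}
```

For example, check_script("foo(); a=b; 1+; c=d; ??;") collects two messages, one for '1+' and one for '??', instead of stopping at the first.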

I have tried to achieve something similar in Spirit.  Because Spirit
tries the alternatives in sequence until one matches you can just make
your error rule the last alternative.  It is tempting to write something
like this (pseudocode):

command ::= assignment ';'
           | function-call ';'
           | *anychar [display-error-message] ';'

Unfortunately this isn't quite right because *anychar won't stop when it
gets to the semicolon.  This is better:

           | *(anychar-';') [display-error-message] ';'

Unfortunately this still doesn't really work because of Spirit's
recursive-descent nature.  Imagine that your grammar had

block ::= '{' list(command) '}'

and your input was

     { foo(); a=b; }

Having parsed the assignment "a=b" the parser should terminate the list
and parse the end of the block, but the list parser will try to find one
more command.  If your command rule didn't have an error alternative it
would say "no" and the list parser would finish, but with the error
alternative it says "yes, '}' is a command matching the last
alternative" and prints an error message.

To avoid this you need to know all of the things that can follow a
command, and exclude them from the error matcher:

           | *(anychar-';'-'}') [display-error-message] ';'

With a complex grammar this quickly gets hard; you need to bring global
information about how a rule is used into the design of the rule.
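
The effect of excluding the follow characters can be sketched by hand in plain C++. This is not Spirit code; error_match and match_error are hypothetical names, and the follow set is simply passed in as a string of characters that the error alternative must not cross.

```cpp
#include <string>

// Result of trying the error alternative at a given position.
struct error_match {
    bool matched;                 // true if the alternative matched
    std::string text;             // offending text, without the ';'
    std::string::size_type next;  // where parsing should resume
};

// Hand-written analogue of *(anychar - ';' - follow...) ';':
// consume characters up to a ';', but never cross a character
// that appears in `follow`.
error_match match_error(const std::string& in, std::string::size_type pos,
                        const std::string& follow) {
    std::string::size_type i = pos;
    while (i < in.size() && in[i] != ';'
           && follow.find(in[i]) == std::string::npos)
        ++i;
    // The alternative only matches if the run ends at the terminating ';'.
    if (i < in.size() && in[i] == ';')
        return { true, in.substr(pos, i - pos), i + 1 };
    return { false, std::string(), pos };
}
```

With follow set to "}", match_error(" }", 0, "}") does not match, so the list of commands can end and the closing brace is left for the block rule. With an empty follow set, match_error(" } foo;", 0, "") matches and swallows the brace, which is the spurious error described above.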

Anyway, having written a rule to match the error, the next bit is easy.
You can display the offending content because the action is passed
iterators indicating the start and end of the matched content as
arguments in the usual way.  And you can use
position_iterator<file_iterator<char> > to get line numbers.
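
For readers who have not used a position-tracking iterator, the bookkeeping amounts to counting newlines as the input is consumed. The following plain C++ sketch computes the same kind of information from an offset. It does not use Spirit, and text_position and locate are hypothetical names chosen for this illustration.

```cpp
#include <cstddef>
#include <string>

// 1-based line and column of a character in the input.
struct text_position {
    int line;
    int column;
};

// Walks the text up to `offset`, counting newlines on the way.
text_position locate(const std::string& text, std::size_t offset) {
    text_position p = { 1, 1 };
    for (std::size_t i = 0; i < offset && i < text.size(); ++i) {
        if (text[i] == '\n') {
            ++p.line;      // a newline starts the next line...
            p.column = 1;  // ...at its first column
        } else {
            ++p.column;
        }
    }
    return p;
}
```

Given the text "a=b;\nfoo(;\nc=d;", locate(text, 5) points at the 'f' of foo and yields line 2, column 1.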

Here is an extract from some of my code.  (This is real code.
Everything above is pseudo-code.)

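#include <sstream>
#include <string>
// Spirit 1.x header locations; newer Boost releases provide these as
// boost/spirit/include/classic_*.hpp in namespace boost::spirit::classic.
#include <boost/spirit/iterator/file_iterator.hpp>
#include <boost/spirit/iterator/position_iterator.hpp>

using namespace std;
using namespace boost::spirit;

typedef file_iterator<char> base_iterator_t;
typedef position_iterator<base_iterator_t> iterator_t;

// Semantic action: reports what was expected and what was found instead.
struct parse_error {
   string what_;
   parse_error(string what): what_(what) {}
   // Formats the message and throws it, so parsing stops at this error.
   void operator()(string s, int line) const {
     ostringstream err;
     err << "Error at line " << line
         << ": expecting " << what_ << " but got '" << s << "'";
     throw err.str();
   }
   // Called by Spirit with the range matched by the error alternative.
   void operator()(iterator_t first, iterator_t last) const {
     string s(first, last);
     (*this)(s,first.get_position().line);
   }
};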

       asm_template
         = *asm_template_fragment
         | (*(anychar_p-';'-'|')) [parse_error("an asm template")] ;

       foo
         = .... >> '|' >> asm_template >> '|' >> .... >> ';'


I hope that helps.  If anyone can suggest better ways of doing this sort
of thing I would be most interested to know.

Best wishes,

--Phil.


