[Spirit2X] How to add function-like parsers and directives to Spirit2X (+ remarks)

[Spirit2X] How to add function-like parsers and directives to Spirit2X (+ remarks)

Francois Barel
Hi,

I started playing around with Spirit2X, and have a few minor remarks
as well as one "how should I" question (to which the answer is
probably "just wait for it to be implemented" :p).


First off, 3 remarks:

1. In support/placeholder.hpp I suggest applying the attached patch
(adding the boost namespace) so that the BOOST_SPIRIT_PLACEHOLDER and
BOOST_SPIRIT_DEFINE_PLACEHOLDERS macros can be used in contexts which
do not have "using namespace boost;".

2. Small copy/paste error in plus's what() function, patch attached.

3. Setting "pass" to false in an action (e.g. my_parser[pass = false])
fails parsing as expected, yet the input iterator is still advanced
(it isn't reset to the value it had before my_parser was called). Just
wondering, since this is contrary to usual Spirit behavior -- is that
intentional? (Looks like it was the same in Spirit2.)



And the question: in Spirit2 I had several custom directives with
arguments, using the following syntax:
        my_directive(first_arg, second_arg, ...)[ ...parser... ]
(think classic Spirit if_p / limit_d / ...), and also custom
"function-like" parsers:
        my_function(first_arg, second_arg, ...)
Depending on the cases, first_arg, second_arg, ... are either
"immediates" (number, ...) or Phoenix actors or Spirit parsers (well,
Proto exprs).

Figuring out how to hook those into the Spirit2 meta-grammar was a lot
of fun at the time :o) Now I am wondering how that should be
registered into Spirit2X's grammar. Basically I see two options, and
I'm wondering which one is best:

1.
- Make my_xxx an instance of a custom type (let's say my_xxx_terminal_spec)
- which has an operator() which receives first_arg, second_arg, ...
and returns an instance of another custom type which stores the values
of first_arg, second_arg, ... (let's say my_xxx_extended_terminal)
- And register that second type as a directive (with use_directive for
my_directive_extended_terminal) or as a terminal (with use_terminal
for my_function_extended_terminal)
IOW, do it much like it is done for char_ now (with terminal_spec
having an operator() which returns an extended_terminal, and that type
getting registered "normally" via use_terminal / use_directive).
That seems a little "harder" than the rest of extending Spirit2X which
is made very easy with use_terminal, use_directive, ... (having to
write an operator() by hand -- it's like for the operator() overloads
of parameterized rules and grammars, it's a little surprising that the
function-call syntax is not handled via Proto like the rest).

2.
Next to use_operator, use_terminal and use_directive, I see there is a
use_function, which is not implemented / used yet AFAICT.
Maybe it is meant to provide a way to declare this kind of
construct in the future... Can you confirm -- and should I just wait
for it to be implemented? :)

TIA,
François

-------------------------------------------------------------------------
This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
Build the coolest Linux based applications with Moblin SDK & win great prizes
Grand prize is a trip for two to an Open Source event anywhere in the world
http://moblin-contest.org/redirect.php?banner_id=100&url=/
_______________________________________________
Spirit-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/spirit-devel

Attachments: placeholder.patch (1K), plus.patch (648 bytes)

Re: [Spirit2X] How to add function-like parsers and directives to Spirit2X (+ remarks)

Joel de Guzman-2
Francois Barel wrote:

> Hi,
>
> I started playing around with Spirit2X, and have a few minor remarks
> as well as one "how should I" question (to which the answer is
> probably "just wait for it to be implemented" :p).
>
>
> First off, 3 remarks:
>
> 1. In support/placeholder.hpp I suggest applying the attached patch
> (adding the boost namespace) so that the BOOST_SPIRIT_PLACEHOLDER and
> BOOST_SPIRIT_DEFINE_PLACEHOLDERS macros can be used in contexts which
> do not have "using namespace boost;".

Fixed.

> 2. Small copy/paste error in plus's what() function, patch attached.

Ak! Fixed.

> 3. Setting "pass" to false in an action (e.g. my_parser[pass = false])
> fails parsing as expected, yet the input iterator is still advanced
> (it isn't reset to the value it had before my_parser was called). Just
> wondering, since this is contrary to usual Spirit behavior -- is that
> intentional? (Looks like it was the same in Spirit2.)

Now this is a fundamental BUG! The iterators should be reset.
But here's a fundamental question: Do you find it useful to
allow the semantic action to fail a parse? You can already do
that using semantic predicates. Having to store the iterator
and restore it later might potentially be an unneeded operation
in the general case.

> And the question: in Spirit2 I had several custom directives with
> arguments, using the following syntax:
>         my_directive(first_arg, second_arg, ...)[ ...parser... ]
> (think classic Spirit if_p / limit_d / ...), and also custom
> "function-like" parsers:
>         my_function(first_arg, second_arg, ...)
> Depending on the cases, first_arg, second_arg, ... are either
> "immediates" (number, ...) or Phoenix actors or Spirit parsers (well,
> Proto exprs).
>
> Figuring out how to hook those into the Spirit2 meta-grammar was a lot
> of fun at the time :o)

Wow. You figured that out? :-P

> Now I am wondering how that should be
> registered into Spirit2X's grammar. Basically I see two options, and
> I'm wondering which one is best:
>
> 1.
> - Make my_xxx an instance of a custom type (let's say my_xxx_terminal_spec)
> - which has an operator() which receives first_arg, second_arg, ...
> and returns an instance of another custom type which stores the values
> of first_arg, second_arg, ... (let's say my_xxx_extended_terminal)
> - And register that second type as a directive (with use_directive for
> my_directive_extended_terminal) or as a terminal (with use_terminal
> for my_function_extended_terminal)
> IOW, do it much like it is done for char_ now (with terminal_spec
> having an operator() which returns an extended_terminal, and that type
> getting registered "normally" via use_terminal / use_directive).
> That seems a little "harder" than the rest of extending Spirit2X which
> is made very easy with use_terminal, use_directive, ... (having to
> write an operator() by hand -- it's like for the operator() overloads
> of parameterized rules and grammars, it's a little surprising that the
> function-call syntax is not handled via Proto like the rest).

This is the way to go, but it's not as difficult as you outlined it.
extended_terminal is supposed to be generic and is supposed to be
reused. It already contains all the smarts. You don't have to write
your own operator()s.

I'll outline how this is done in another "anatomy of a..." installment
in the weekend.

Regards,
--
Joel de Guzman
http://www.boostpro.com
http://spirit.sf.net


Re: [Spirit2X] How to add function-like parsers and directives to Spirit2X (+ remarks)

Francois Barel
Joel de Guzman wrote:

> Francois Barel wrote:
>
>> 3. Setting "pass" to false in an action (e.g. my_parser[pass = false])
>> fails parsing as expected, yet the input iterator is still advanced
>> (it isn't reset to the value it had before my_parser was called). Just
>> wondering, since this is contrary to usual Spirit behavior -- is that
>> intentional? (Looks like it was the same in Spirit2.)
>
> Now this is a fundamental BUG! The iterators should be reset.
> But here's a fundamental question: Do you find it useful to
> allow the semantic action to fail a parse? You can already do
> that using semantic predicates. Having to store the iterator
> and restore it later might potentially be an unneeded operation
> in the general case.

On the one hand I've never used it... partly because I didn't remember
it existed until recently (and if this hasn't been noticed before, I'm
guessing very few people have actually used it :))

On the other hand, I think it's nice to have the possibility: it
simplifies syntax -- you can do all in one action instead of having to
use a separate semantic predicate right after. For example the
construct I first used to emulate classic Spirit's limit_d: uint[ _a =
_1 ] >> eps(check(_a)) can be written as: uint[ pass = check(_1), _a =
_1 ]. Not an earth-breaking change, I agree... but still, I think it's
better (lighter, more readable) since the check is directly attached
to the relevant parser.
It's like classic Spirit's if_p/else_p: you can do it in Spirit2 with
((eps(cond) >> p_true) | (eps(!cond) >> p_false)), but it's a lot less
readable than having a dedicated syntax for it (although
performance-wise I guess it's very similar, unless cond is really
complex).
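
The idea of attaching the check directly to the parser can be modelled outside Spirit with a toy example. This is a sketch under stated assumptions: parse_uint and parse_checked_uint are hypothetical helper functions over std::string iterators, not Spirit parsers, and overflow of the parsed value is ignored. It mirrors the intent of uint[ pass = check(_1), _a = _1 ], including resetting the iterator when the check fails.

```cpp
#include <cctype>
#include <functional>
#include <string>

using Iterator = std::string::const_iterator;

// Toy stand-in for a uint parser: consumes decimal digits and stores the
// value. Fails without consuming anything if there is no digit.
bool parse_uint(Iterator& first, Iterator last, unsigned& value)
{
    Iterator it = first;
    unsigned result = 0;
    while (it != last && std::isdigit(static_cast<unsigned char>(*it)))
        result = result * 10 + static_cast<unsigned>(*it++ - '0');
    if (it == first)
        return false;
    first = it;
    value = result;
    return true;
}

// Toy stand-in for "parse a uint, let the action veto it, then store it".
// On any failure the iterator is put back where it started.
bool parse_checked_uint(Iterator& first, Iterator last,
                        const std::function<bool(unsigned)>& check,
                        unsigned& out)
{
    Iterator save = first;
    unsigned value = 0;
    if (parse_uint(first, last, value) && check(value))
    {
        out = value;    // corresponds to _a = _1
        return true;
    }
    first = save;       // reset after a vetoed or failed parse
    return false;
}
```

Called with a check such as "n <= 255", this accepts "42" and advances past it, while "1000" is rejected with the iterator left at the start and the output untouched.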

So in the end I'd vote to keep "pass". Do you expect a real
performance impact for this? In the general case (pass not set to
false) it's just an extra iterator copy, right?


> This is the way to go, but it's not as difficult as you outlined it.
> extended_terminal is supposed to be generic and is supposed to be
> reused. It already contains all the smarts. You don't have to write
> your own operator()s.
>
> I'll outline how this is done in another "anatomy of a..." installment
> in the weekend.

d'oh! That might explain why it is in support/ and not just in
qi/char/, I could have thought of that :)
OK then, thanks Joel! I just wanted to make sure I was going with the
right solution, and not something which would get replaced soon by
something simpler.

Thanks,
François

Re: [Spirit2X] How to add function-like parsers and directives to Spirit2X (+ remarks)

Joel de Guzman-2
Francois Barel wrote:

> Joel de Guzman wrote:
>> Francois Barel wrote:
>>
>>> 3. Setting "pass" to false in an action (e.g. my_parser[pass = false])
>>> fails parsing as expected, yet the input iterator is still advanced
>>> (it isn't reset to the value it had before my_parser was called). Just
>>> wondering, since this is contrary to usual Spirit behavior -- is that
>>> intentional? (Looks like it was the same in Spirit2.)
>> Now this is a fundamental BUG! The iterators should be reset.
>> But here's a fundamental question: Do you find it useful to
>> allow the semantic action to fail a parse? You can already do
>> that using semantic predicates. Having to store the iterator
>> and restore it later might potentially be an unneeded operation
>> in the general case.
>
> On the one hand I've never used it... partly because I didn't remember
> it existed until recently (and if this hasn't been noticed before, I'm
> guessing very few people have actually used it :))
>
> On the other hand, I think it's nice to have the possibility: it
> simplifies syntax -- you can do all in one action instead of having to
> use a separate semantic predicate right after. For example the
> construct I first used to emulate classic Spirit's limit_d: uint[ _a =
> _1 ] >> eps(check(_a)) can be written as: uint[ pass = check(_1), _a =
> _1 ]. Not an earth-breaking change, I agree... but still, I think it's
> better (lighter, more readable) since the check is directly attached
> to the relevant parser.
> It's like classic Spirit's if_p/else_p: you can do it in Spirit2 with
> ((eps(cond) >> p_true) | (eps(!cond) >> p_false)), but it's a lot less
> readable than having a dedicated syntax for it (although
> performance-wise I guess it's very similar, unless cond is really
> complex).
>
> So in the end I'd vote to keep "pass". Do you expect a real
> performance impact for this? In the general case (pass not set to
> false) it's just an extra iterator copy, right?

Potentially, yes, it can have an impact. Here's the pseudo code
for that:

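     // Simplified sketch; Subject and Action are placeholder types.
     template <typename Iterator, typename Subject, typename Action>
     bool parse_action(Iterator& first, Iterator const& last,
                       Subject const& subject, Action action)
     {
         Iterator save = first;       // copy stays alive for the whole parse
         if (subject.parse(first, last) && action())
             return true;             // subject matched and the action passed
         first = save;                // restore on either failure
         return false;
     }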

The iterator is held for the whole duration of the parse. If the
parse involves multi-megabytes of input, and if the iterator is a
multi_pass, then all that information must be cached in memory.

The bad thing about it is that, regardless of whether the flag is
used, the client still pays.
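
The memory cost described above can be illustrated with a toy model. This is not multi_pass's implementation, only a sketch of the general principle that a buffering adapter over single-pass input cannot discard anything a live saved position might still need. BufferedSource and its members are hypothetical names, the input is synthetic, and rewinding itself is omitted.

```cpp
#include <cstddef>
#include <deque>
#include <set>

// Toy model of a buffering adapter over single-pass input.
class BufferedSource
{
public:
    // Remember the current position so the caller could rewind to it later.
    std::size_t save()
    {
        saved_.insert(pos_);
        return pos_;
    }

    // Forget a saved position; input before it may now be discarded.
    void release(std::size_t p)
    {
        auto it = saved_.find(p);
        if (it != saved_.end())
            saved_.erase(it);
        trim();
    }

    // Consume one character of (synthetic) input.
    void advance()
    {
        buffer_.push_back('x');
        ++pos_;
        trim();
    }

    std::size_t buffered() const { return buffer_.size(); }

private:
    // Drop every buffered character that no live position can reach.
    void trim()
    {
        std::size_t oldest = saved_.empty() ? pos_ : *saved_.begin();
        while (start_ < oldest)
        {
            buffer_.pop_front();
            ++start_;
        }
    }

    std::deque<char> buffer_;           // characters in [start_, pos_)
    std::multiset<std::size_t> saved_;  // live saved positions
    std::size_t start_ = 0;             // position of buffer_.front()
    std::size_t pos_ = 0;               // current read position
};
```

In this model, consuming input with no saved position keeps the buffer empty, whereas holding one saved position from the start makes the buffer grow by one entry per character consumed until the position is released.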

Regards,
--
Joel de Guzman
http://www.boostpro.com
http://spirit.sf.net

