Another example parser raises some questions

Richard-45
RFC 5322 specifies the grammar for dates and times in email message header
fields.  RFC 5322 superseded RFC 2822, which in turn superseded RFC 822.

Newer RFCs deprecated some of the variation in generating dates, but
a good parser should still accept all the stuff allowed in the older
specifications, particularly when parsing older content.

Here is the RFC 5322 date time parser:
<https://github.com/LegalizeAdulthood/ucpp-date-time-parser>

I added a few more tips based on this example compared to the JSON
example.  I will probably move the Tips closer to the top before the
date/time production rules.  Doing this parser step-by-step with TDD
really made things sane, I must say.

Parser implementation:
<https://github.com/LegalizeAdulthood/ucpp-date-time-parser/blob/master/date_time.cpp>

It handles the basics so far and isn't handling the full RFC 5322 spec
vis-a-vis the variation allowed in older formats.

While working on this parser, I ran into a couple things:

- The generalized integer parsers are used as an instance like:

    uint_parser<unsigned, 10, 2, 2> digit_2;
    start = digit_2 >> ':' >> digit_2;

  or

    start = uint_parser<unsigned, 10, 2, 2>{} >> ':'
        >> uint_parser<unsigned, 10, 2, 2>{};

  Where I've used {} to be clear about explicit initialization of
  an instance to default (C++11).

  I had this wrong on the reference card.  Looking deeper into the
  reference docs, I simply missed copying over an extra pair of ()'s.
  I will fix this in the reference card.

- Parsing the date/time isn't enough, it also needs to be validated.
  For instance, hours can't be outside the range [0,23].  I can do range
  checking after parsing or in a semantic action.  I seem to recall that
  there was a way to enforce this in spirit using only the parser and
  no semantic action?

  If semantic actions are required, is this the generally accepted
  best practice?

- I have a big fusion adapted struct holding all the chunks.  This
  mapped very nicely to a bunch of >>'s on the pieces.  However, at
  some point I would like to be able to create additional nonterminal
  rules to separate code for date parsing from code for time parsing
  without changing my structure.  I wanted to write:

    start = date_value >> time_value;
    date_value = ...;
    time_value = ...;

  However, now my attribute sequence for start is a 2-tuple instead of an
  8-tuple, so my fusion adapted struct no longer works directly.  It
  felt icky to have to write semantic actions just to assign into the
  larger structure from the two subrules.

  Are semantic actions my only "way out" here?

- Take a look closely at the rule for seconds in the implementation.
  I really wanted to write:

    start = ... >> digit_2 >> no_skip[':' >> digit_2 >> seconds];

  but this has the same problem as the 2-tuple vs. the 8-tuple; now my
  attribute for start is tuple<A,B,C,D,tuple<E,F,G>,H>.  So pushing
  the no_skip[] down over each element gave me:

    start = ... >> digit_2
        >> no_skip[lit(':')] >> no_skip[digit_2] >> no_skip[seconds]
        >> time_zone_offset;

  but this resulted in a compile error when applying no_skip[] to my
  rule for seconds.  If I applied no_skip[] around the implementation
  of the rule, then it worked as expected.  To my eyes, they appear to
  be no different, but obviously they are different to the compiler.

  What is the reason for this difference?
--
"The Direct3D Graphics Pipeline" free book <http://tinyurl.com/d3d-pipeline>
     The Computer Graphics Museum <http://ComputerGraphicsMuseum.org>
         The Terminals Wiki <http://terminals.classiccmp.org>
  Legalize Adulthood! (my blog) <http://LegalizeAdulthood.wordpress.com>

------------------------------------------------------------------------------
Comprehensive Server Monitoring with Site24x7.
Monitor 10 servers for $9/Month.
Get alerted through email, SMS, voice calls or mobile push notifications.
Take corrective actions from your mobile device.
http://pubads.g.doubleclick.net/gampad/clk?id=154624111&iu=/4140/ostg.clktrk
_______________________________________________
Spirit-general mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/spirit-general

Re: Another example parser raises some questions

teajay-2
On 11/11/2014 04:51 PM, Richard wrote:

> RFC 5322 specifies the grammar for dates and times in email message header
> fields.  RFC 5322 superseded RFC 2822, which in turn superseded RFC 822.
>
> Newer RFCs deprecated some of the variation in generating dates, but
> a good parser should still accept all the stuff allowed in the older
> specifications, particularly when parsing older content.
>
> Here is the RFC 5322 date time parser:
> <https://github.com/LegalizeAdulthood/ucpp-date-time-parser>
>
> I added a few more tips based on this example compared to the JSON
> example.  I will probably move the Tips closer to the top before the
> date/time production rules.  Doing this parser step-by-step with TDD
> really made things sane, I must say.
>
> Parser implementation:
> <https://github.com/LegalizeAdulthood/ucpp-date-time-parser/blob/master/date_time.cpp>
>
> It handles the basics so far and isn't handling the full RFC 5322 spec
> vis-a-vis the variation allowed in older formats.
>
> While working on this parser, I ran into a couple things:
>
> - The generalized integer parsers are used as an instance like:
>
>     uint_parser<unsigned, 10, 2, 2> digit_2;
>     start = digit_2 >> ':' >> digit_2;
>
>   or
>
>     start = uint_parser<unsigned, 10, 2, 2>{} >> ':'
>         >> uint_parser<unsigned, 10, 2, 2>{};
>
>   Where I've used {} to be clear about explicit initialization of
>   an instance to default (C++11).
>
>   I had this wrong on the reference card.  Looking deeper into the
>   reference docs, I simply missed copying over an extra pair of ()'s.
>   I will fix this in the reference card.
>
> - Parsing the date/time isn't enough, it also needs to be validated.
>   For instance, hours can't be outside the range [0,23].  I can do range
>   checking after parsing or in a semantic action.  I seem to recall that
>   there was a way to enforce this in spirit using only the parser and
>   no semantic action?
>
You could do that by using a special type synonym for the integers, and
provide the ranges as numeric limit traits.

Or you could also refine the numeric parser by deriving your own to
perform the range checking.

All of this is possible, but is it easier to understand and to develop?
I'm not sure.

>   If semantic actions are required, is this the generally accepted
>   best practice?
>

I would tend to say yes: it's clear, straightforward and easier to
understand.

> - I have a big fusion adapted struct holding all the chunks.  This
>   mapped very nicely to a bunch of >>'s on the pieces.  However, at
>   some point I would like to be able to create additional nonterminal
>   rules to separate code for date parsing from code for time parsing
>   without changing my structure.  I wanted to write:
>
>     start = date_value >> time_value;
>     date_value = ...;
>     time_value = ...;
>
>   However, now my attribute sequence for start is a 2-tuple instead of an
>   8-tuple, so my fusion adapted struct no longer works directly.  It
>   felt icky to have to write semantic actions just to assign into the
>   larger structure from the two subrules.
>
>   Are semantic actions my only "way out" here?
>

Take a look at my fork of your test project:
https://github.com/teajay-fr/ucpp-date-time-parser. It can be done by
using the attr_cast customization point and casting to the correct
rule type. This works because the layout of the structures is the same.

But is all this trouble worth it? It's up to you.


> - Take a look closely at the rule for seconds in the implementation.
>   I really wanted to write:
>
>     start = ... >> digit_2 >> no_skip[':' >> digit_2 >> seconds];
>
>   but this has the same problem as the 2-tuple vs. the 8-tuple; now my
>   attribute for start is tuple<A,B,C,D,tuple<E,F,G>,H>.  So pushing
>   the no_skip[] down over each element gave me:
>
>     start = ... >> digit_2
>         >> no_skip[lit(':')] >> no_skip[digit_2] >> no_skip[seconds]
>         >> time_zone_offset;
>
>   but this resulted in a compile error when applying no_skip[] to my
>   rule for seconds.  If I applied no_skip[] around the implementation
>   of the rule, then it worked as expected.  To my eyes, they appear to
>   be no different, but obviously they are different to the compiler.
>
>   What is the reason for this difference?
>
When you apply no_skip to a rule, you have to tell the rule that it
should use the unused skipper type:
boost::spirit::qi::detail::unused_skipper<skipper>.

That's what I understood from the error message.


Regards,

Thomas Bernard



Re: Another example parser raises some questions

Richard-45

Thanks very much for your responses, Thomas!  Hopefully this will help
me to answer questions as we work through this example as a group
tomorrow night :).

In article <m3u3ip$e5o$[hidden email]>,
    Thomas Bernard <[hidden email]> writes:

> > - I have a big fusion adapted struct holding all the chunks.  This
> >   mapped very nicely to a bunch of >>'s on the pieces.  However, at
> >   some point I would like to be able to create additional nonterminal
> >   rules to separate code for date parsing from code for time parsing
> >   without changing my structure.  I wanted to write:
> >
> >     start = date_value >> time_value;
> >     date_value = ...;
> >     time_value = ...;
> >
> >   However, now my attribute sequence for start is a 2-tuple instead of an
> >   8-tuple, so my fusion adapted struct no longer works directly.  It
> >   felt icky to have to write semantic actions just to assign into the
> >   larger structure from the two subrules.
> >
> >   Are semantic actions my only "way out" here?
>
> Take a look at my fork of your test project:
> https://github.com/teajay-fr/ucpp-date-time-parser. It can be done by
> using the attr_cast customization point and casting to the correct
> rule type. This works because the layout of the structures is the same.
>
> But is all this trouble worth it? It's up to you.

I confess that I looked at your solution and have no idea why it works.

The code makes mention of an 'int', but I don't see int used
anywhere...  I will look at the documentation for this Qi extension
point tonight and see if I reach a better understanding.

Thanks again for taking the time to help!

-- Richard



Re: Another example parser raises some questions

Richard-45
In reply to this post by teajay-2

In article <m3u3ip$e5o$[hidden email]>,
    Thomas Bernard <[hidden email]> writes:

> > - I have a big fusion adapted struct holding all the chunks.  This
> >   mapped very nicely to a bunch of >>'s on the pieces.  However, at
> >   some point I would like to be able to create additional nonterminal
> >   rules to separate code for date parsing from code for time parsing
> >   without changing my structure.  I wanted to write:
> >
> >     start = date_value >> time_value;
> >     date_value = ...;
> >     time_value = ...;
> >
> >   However, now my attribute sequence for start is a 2-tuple instead of an
> >   8-tuple, so my fusion adapted struct no longer works directly.  It
> >   felt icky to have to write semantic actions just to assign into the
> >   larger structure from the two subrules.
> >
> >   Are semantic actions my only "way out" here?
> >
>
> Take a look at my fork of your test project:
> https://github.com/teajay-fr/ucpp-date-time-parser. It can be done by
> using the attr_cast customization point and casting to the correct
> rule type. This works because the layout of the structures is the same.
>
> But is all this trouble worth it? It's up to you.

I think what I was hoping for was the ability to have attributes propagate
like this:

    a >> b
    a: tuple<A, B>, b: tuple<C, D> --> tuple<A, B, C, D>

However, this seems directly at odds with the ability to fusion adapt
a struct inside another struct, which would require something like:

    a >> b
    a: tuple<A, B>, b: tuple<C, D> --> tuple<tuple<A, B>, tuple<C, D>>

Maybe I'm just being lazy :-) and the code is really trying to tell me
that a moment wants to be a composite of two structs: a date part and
a time part.


Re: Another example parser raises some questions

Joel de Guzman
On 11/12/14, 9:58 AM, Richard wrote:

>
> In article <m3u3ip$e5o$[hidden email]>,
>      Thomas Bernard <[hidden email]> writes:
>
>>> - I have a big fusion adapted struct holding all the chunks.  This
>>>    mapped very nicely to a bunch of >>'s on the pieces.  However, at
>>>    some point I would like to be able to create additional nonterminal
>>>    rules to separate code for date parsing from code for time parsing
>>>    without changing my structure.  I wanted to write:
>>>
>>>      start = date_value >> time_value;
>>>      date_value = ...;
>>>      time_value = ...;
>>>
>>>    However, now my attribute sequence for start is a 2-tuple instead of an
>>>    8-tuple, so my fusion adapted struct no longer works directly.  It
>>>    felt icky to have to write semantic actions just to assign into the
>>>    larger structure from the two subrules.
>>>
>>>    Are semantic actions my only "way out" here?
>>>
>>
>> Take a look at my fork of your test project:
>> https://github.com/teajay-fr/ucpp-date-time-parser. It can be done by
>> using the attr_cast customization point and casting to the correct
>> rule type. This works because the layout of the structures is the same.
>>
>> But is all this trouble worth it? It's up to you.
>
> I think what I was hoping for was the ability to have attributes propagate
> like this:
>
>      a >> b
>      a: tuple<A, B>, b: tuple<C, D> --> tuple<A, B, C, D>
>
> However, this seems directly at odds with the ability to fusion adapt
> a struct inside another struct, which would require something like:
>
>      a >> b
>      a: tuple<A, B>, b: tuple<C, D> --> tuple<tuple<A, B>, tuple<C, D>>
>
> Maybe I'm just being lazy :-) and the code is really trying to tell me
> that a moment wants to be a composite of two structs: a date part and
> a time part.

Actually, here are some tricks that I can share that might work for you.

1) Use BOOST_FUSION_ADAPT_STRUCT_NAMED to adapt a struct X one or
    more times using different "views" of X. E.g. X1 can adapt the
    first two members and X2 the next two members.

2) Use expressions instead of names to adapt composed members. Example:

     namespace ns
     {
         struct foo
         {
             int x;
         };

         struct bar
         {
             foo foo_;
             int y;
         };
     } // namespace ns (the adaptation below refers to ns::bar)

BOOST_FUSION_ADAPT_STRUCT(
     ns::bar,
     (int, foo_.x) // test that adapted members can actually be expressions
     (int, y)
)

I just tested this and it works!

Regards,
--
Joel de Guzman
http://www.ciere.com
http://boost-spirit.com
http://www.cycfi.com/



Re: Another example parser raises some questions

teajay-2
In reply to this post by teajay-2
On 12.11.2014 00:45, Thomas Bernard wrote:

> On 11/11/2014 04:51 PM, Richard wrote:
>> RFC 5322 specifies the grammar for dates and times in email message header
>> fields.  RFC 5322 superseded RFC 2822, which in turn superseded RFC 822.
>>
>> Newer RFCs deprecated some of the variation in generating dates, but
>> a good parser should still accept all the stuff allowed in the older
>> specifications, particularly when parsing older content.
>>
>> Here is the RFC 5322 date time parser:
>> <https://github.com/LegalizeAdulthood/ucpp-date-time-parser>
>>
>> I added a few more tips based on this example compared to the JSON
>> example.  I will probably move the Tips closer to the top before the
>> date/time production rules.  Doing this parser step-by-step with TDD
>> really made things sane, I must say.
>>
>> Parser implementation:
>> <https://github.com/LegalizeAdulthood/ucpp-date-time-parser/blob/master/date_time.cpp>
>>
>> It handles the basics so far and isn't handling the full RFC 5322 spec
>> vis-a-vis the variation allowed in older formats.
>>
>> While working on this parser, I ran into a couple things:
>>
>> - The generalized integer parsers are used as an instance like:
>>
>>      uint_parser<unsigned, 10, 2, 2> digit_2;
>>      start = digit_2 >> ':' >> digit_2;
>>
>>    or
>>
>>      start = uint_parser<unsigned, 10, 2, 2>{} >> ':'
>>          >> uint_parser<unsigned, 10, 2, 2>{};
>>
>>    Where I've used {} to be clear about explicit initialization of
>>    an instance to default (C++11).
>>
>>    I had this wrong on the reference card.  Looking deeper into the
>>    reference docs, I simply missed copying over an extra pair of ()'s.
>>    I will fix this in the reference card.
>>
>> - Parsing the date/time isn't enough, it also needs to be validated.
>>    For instance, hours can't be outside the range [0,23].  I can do range
>>    checking after parsing or in a semantic action.  I seem to recall that
>>    there was a way to enforce this in spirit using only the parser and
>>    no semantic action?
>>
> You could do that by using a special type synonym for the integers, and
> provide the ranges as numeric limit traits.
>
> Or you could also refine the numeric parser by deriving your own to
> perform the range checking.
>
> All of this is possible, but is it easier to understand and to develop?
> I'm not sure.
>
>>    If semantic actions are required, is this the generally accepted
>>    best practice?
>>
>
> I would tend to say yes: it's clear, straightforward and easier to
> understand.
>
>> - I have a big fusion adapted struct holding all the chunks.  This
>>    mapped very nicely to a bunch of >>'s on the pieces.  However, at
>>    some point I would like to be able to create additional nonterminal
>>    rules to separate code for date parsing from code for time parsing
>>    without changing my structure.  I wanted to write:
>>
>>      start = date_value >> time_value;
>>      date_value = ...;
>>      time_value = ...;
>>
>>    However, now my attribute sequence for start is a 2-tuple instead of an
>>    8-tuple, so my fusion adapted struct no longer works directly.  It
>>    felt icky to have to write semantic actions just to assign into the
>>    larger structure from the two subrules.
>>
>>    Are semantic actions my only "way out" here?
>>
>
> Take a look at my fork of your test project:
> https://github.com/teajay-fr/ucpp-date-time-parser. It can be done by
> using the attr_cast customization point and casting to the correct
> rule type. This works because the layout of the structures is the same.
>
> But is all this trouble worth it? It's up to you.
>
>
>> - Take a look closely at the rule for seconds in the implementation.
>>    I really wanted to write:
>>
>>      start = ... >> digit_2 >> no_skip[':' >> digit_2 >> seconds];
>>
>>    but this has the same problem as the 2-tuple vs. the 8-tuple; now my
>>    attribute for start is tuple<A,B,C,D,tuple<E,F,G>,H>.  So pushing
>>    the no_skip[] down over each element gave me:
>>
>>      start = ... >> digit_2
>>          >> no_skip[lit(':')] >> no_skip[digit_2] >> no_skip[seconds]
>>          >> time_zone_offset;
>>
>>    but this resulted in a compile error when applying no_skip[] to my
>>    rule for seconds.  If I applied no_skip[] around the implementation
>>    of the rule, then it worked as expected.  To my eyes, they appear to
>>    be no different, but obviously they are different to the compiler.
>>
>>    What is the reason for this difference?
>>
> When you apply no_skip to a rule, you have to tell the rule that it
> should use the unused skipper type:
> boost::spirit::qi::detail::unused_skipper<skipper>.
>
> That's what I understood from the error message.
>
>
> Regards,
>
> Thomas Bernard
>

Another clarification about no_skip: no_skip only disables pre-skipping.
There is also lexeme available, which disables skipping completely. Maybe
this is more appropriate for your use case?

Regards,

Thomas




Re: Another example parser raises some questions

Richard-45

In article <m3v5l4$igu$[hidden email]>,
    Thomas Bernard <[hidden email]> writes:

> Another clarification about no_skip: no_skip only disables pre-skipping.
> There is also lexeme available, which disables skipping completely. Maybe
> this is more appropriate for your use case?

Yes, I think in my case I do want to disable pre-skipping, at least
how I used it.  For instance, the productions for the time component
allow for whitespace before the time value and after, but not before
or after the separating ':'.[*]  So I was doing:

        time = digit_2 >> no_skip[lit(':')] >> no_skip[digit_2]
                >> no_skip[seconds];

which disabled pre-skipping before the : and any skipping after the :.

After I split the definitions for date and time into separate rules
and made my date_time::moment a std::pair<date_time::date,date_time::time>
I tried using

        time = lexeme[digit_2 >> ':' >> digit_2 >> seconds]

and a bunch of my unit tests failed.  Rather than jumping down the
rabbit hole, I simply reverted the change and went back to the version
with the individual no_skips.

[*] The "obsolete" productions from RFC 822 allow for arbitrary amounts
of whitespace, including () comments, between any two tokens.  I've
written the code currently to ignore these "obsolete" forms.  A real
production parser would have to handle all those complications, but
this is meant to be an exercise to get practice with spirit and not be
a production quality parser.  Hrm.  Perhaps I should state that in the
ReadMe.md :-).


Re: Another example parser raises some questions

TONGARI J
In reply to this post by teajay-2
2014-11-12 16:27 GMT+08:00 Thomas Bernard <[hidden email]>:
> Another clarification about no_skip: no_skip only disables pre-skipping.
> There is also lexeme available, which disables skipping completely.

This is not true:  `no_skip` __also__ disables pre-skip, otherwise it is equivalent to `lexeme`.
`lexeme` does pre-skip; `no_skip` is the one that disables skipping completely.

See:


Re: Another example parser raises some questions

sehe
In reply to this post by teajay-2
On 12-11-14 00:45, Thomas Bernard wrote:
> But is all this trouble worth it? It's up to you.
not to mention sailing on the edges of tested (reliable) functionality:
http://boost.2283326.n4.nabble.com/transform-attribute-gt-segfault-td4657399.html#a4657403


Re: Another example parser raises some questions

mandrews
In reply to this post by Joel de Guzman
I've been using spirit for a few days now, so apologies if this one is obvious.

I'm trying a couple of small test examples using Richard Thomson's date_time as a starting point, together with Joel's suggestion 2) "Use expressions instead of names to adapt composed members".
For example, I want to build up pieces for various log file fields and then compose them together as needed for specific cases.
Suppose I have a composed type event_record containing a date_time and an event signature defined as:


namespace date_time
{

enum months
{
    January = 1,
    February,
...
};

struct moment
{
    months month;
    unsigned day;
    unsigned hour;
    unsigned minute;
    unsigned second;
...
};
} // namespace date_time


namespace event_signature
{

struct snort_signature
{
    unsigned gid;
    unsigned signature_id;
    unsigned context;
};
} // namespace event_signature

namespace event_record
{
struct snort_event
{
    date_time::moment moment_;
    event_signature::snort_signature sig_;
};

snort_event parse(std::string const& text);

} // namespace event_record



BOOST_FUSION_ADAPT_STRUCT(
        event_record::snort_event,
        (date_time::months, moment_.month)
        (unsigned, moment_.day)
        (unsigned, moment_.hour)
        (unsigned, moment_.minute)
        (unsigned, moment_.second)
        (unsigned, sig_.gid)
        (unsigned, sig_.signature_id)
        );

namespace
{

        typedef ascii::space_type skipper;

        template <typename Iter>
        struct date_time_grammar : grammar<Iter, date_time::moment(), skipper>
        {
                date_time_grammar() : date_time_grammar::base_type{start}
                {
                        month_names.add("Jan", date_time::January)
                                ("Feb", date_time::February)
                                ("Mar", date_time::March)
                                ("Apr", date_time::April)
                                ("May", date_time::May)
                                ("Jun", date_time::June)
                                ("Jul", date_time::July)
                                ("Aug", date_time::August)
                                ("Sep", date_time::September)
                                ("Oct", date_time::October)
                                ("Nov", date_time::November)
                                ("Dec", date_time::December);
                        uint_parser<unsigned, 10, 1, 2> digit_1_2;
                        uint_parser<unsigned, 10, 2, 2> digit_2;

                        start = month_names >> digit_1_2 >> digit_2 >> lit(':') >> digit_2 >> lit(':') >> digit_2 ;
                };

                symbols<char const, date_time::months> month_names;
                rule<Iter, date_time::moment(), skipper> start;
        };

        template <typename Iter>
        struct snort_event_grammar : grammar<Iter, event_record::snort_event(), skipper>
        {
                snort_event_grammar() : snort_event_grammar::base_type{ start }
                {
                        start = rsyslog_date >> attr(1) >> attr(2) >> attr(3);
                };

                date_time_grammar<Iter> rsyslog_date;
                rule<Iter, event_record::snort_event(), skipper> start;
        };
...


Compiling this under VS2013 Update 3 gives the error:
error C2440: 'static_cast' : cannot convert from 'const date_time::months' to 'date_time::moment'
1>          No constructor could take the source type, or constructor overload resolution was ambiguous
1>          D:\wp\eng\main\third-party\boost\1.55.0\boost/spirit/home/qi/detail/assign_to.hpp(170) : see reference to function template instantiation 'void
                boost::spirit::traits::assign_to_attribute_from_value<Attribute,T,void>::call<T>(const T_ &,Attribute &,boost::mpl::false_)' being compiled
1>          with
1>          [
1>              Attribute=date_time::moment
1>  ,            T=date_time::months
1>  ,            T_=date_time::months
1>          ]
The error is triggered by the line "start = month_names >> digit_1_2 >> digit_2 >> lit(':') >> digit_2..." in date_time_grammar.

It appears something isn't right with my use of BOOST_FUSION_ADAPT_STRUCT - why is it still looking for a moment struct attribute rather than the moment struct members?

- TIA,  MarkA

-----Original Message-----
From: Joel de Guzman [mailto:[hidden email]]
Sent: Tuesday, November 11, 2014 7:10 PM
To: [hidden email]
Subject: Re: [Spirit-general] Another example parser raises some questions

...
Actually, here are some tricks that I can share that might work for you.

1) Use BOOST_FUSION_ADAPT_STRUCT_NAMED to adapt a struct X one or
    more times using different "views" of X. E.g. X1 can adapt the
    first two members and X2 the second two members.

2) Use expressions instead of names to adapt composed members. Example:

     struct foo
     {
         int x;
     };

     struct bar
     {
         foo foo_;
         int y;
     };

BOOST_FUSION_ADAPT_STRUCT(
     ns::bar,
     (int, foo_.x) // test that adapted members can actually be expressions
     (int, y)
)

I just tested this and it works!

Regards,
--
Joel de Guzman
http://www.ciere.com
http://boost-spirit.com
http://www.cycfi.com/


------------------------------------------------------------------------------
Comprehensive Server Monitoring with Site24x7.
Monitor 10 servers for $9/Month.
Get alerted through email, SMS, voice calls or mobile push notifications.
Take corrective actions from your mobile device.
http://pubads.g.doubleclick.net/gampad/clk?id=154624111&iu=/4140/ostg.clktrk
_______________________________________________
Spirit-general mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/spirit-general


Re: Another example parser raises some questions

Richard-45

Hi Mark,

I think I see the problem in your code.  Let me see if I figured it
out :).

In article <FE246E6DF88A05429FA6E2DF54CB4ECB014E524062C9@WPEXCHANGE>,
    Mark Andrews <[hidden email]> writes:

> BOOST_FUSION_ADAPT_STRUCT(
>     event_record::snort_event,
>     (date_time::months, moment_.month)

This is saying that the first type in the fusion sequence for a
snort_event is a month of type date_time::months.

>     (unsigned, moment_.day)
>     (unsigned, moment_.hour)
>     (unsigned, moment_.minute)
>     (unsigned, moment_.second)
>     (unsigned, sig_.gid)
>     (unsigned, sig_.signature_id)
>     );
>
> namespace
> {
>
> typedef ascii::space_type skipper;
>
> template <typename Iter>
> struct date_time_grammar : grammar<Iter, date_time::moment(), skipper>

This grammar has a synthesized attribute type of date_time::moment.

> template <typename Iter>
> struct snort_event_grammar : grammar<Iter, event_record::snort_event(), skipper>
> {
> snort_event_grammar() : snort_event_grammar::base_type{ start }
> {
> start = rsyslog_date >> attr(1) >> attr(2) >> attr(3);

Spirit will synthesize the attribute type from the construct on the
right as tuple<date_time::moment, int, int, int>.  It will then
attempt to assign that to the synthesized attribute for start, which
is event_record::snort_event.

However, snort_event is adapted as seven flat members starting with a
date_time::months, not as a moment followed by three integers, so the
type signatures aren't compatible and that's why you get this error:

> error C2440: 'static_cast' : cannot convert from 'const date_time::months' to 'date_time::moment'

If we want to use the technique suggested by Joel of adapting the
struct to moment_.day, moment_.hour, etc., then you need to change your
grammar and decide who is responsible for stuffing in the components
of the moment: it's either rsyslog_date or it's start.

I think it leads to more composable parsers when you let the
sub-parsers deal with the details of assigning the pieces into their
own data type, in this case a date_time::moment.

That's why I ended up "listening to my code" and creating a struct for
the date (its own parseable concept) and a struct for the time (its
own parseable concept) and having moment simply be a
std::pair<date,time>.
--
"The Direct3D Graphics Pipeline" free book <http://tinyurl.com/d3d-pipeline>
     The Computer Graphics Museum <http://ComputerGraphicsMuseum.org>
         The Terminals Wiki <http://terminals.classiccmp.org>
  Legalize Adulthood! (my blog) <http://LegalizeAdulthood.wordpress.com>


Re: Another example parser raises some questions

mandrews
> If we want to use the technique suggested by Joel of adapting the
> struct to moment_.day, moment_.hour, etc., then you need to change
> your grammar and decide who is responsible for stuffing in the
> components of the moment: it's either rsyslog_date or it's start.
>
> I think it leads to more composable parsers when you let the
> sub-parsers deal with the details of assigning the pieces into their
> own data type, in this case a date_time::moment.

Yep, that's it. I assigned all the pieces in sub-parsers as you suggest; it works properly and seems cleaner.
Thanks,
Mark