spirit2 permutations and attributes

Joel de Guzman-2
Hi Y'all,

Ok, so we know that spirit2 collapses unused attributes.
Some terminals have unused attributes. For example literals
have unused attributes so as to be able to collapse attributes
of expressions like:

     char_ >> ',' >> int_

into:

     tuple<char, int>
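
The collapsing step can be pictured as a compile-time filter over the component attributes. What follows is a minimal sketch using only the standard library, not Spirit's actual implementation: `unused_type` is a local stand-in for Spirit's type, `std::tuple` stands in for the tuple above, and `prepend` and `collapse_sequence` are hypothetical names introduced for illustration.

```cpp
#include <tuple>
#include <type_traits>

// Local stand-in for Spirit's unused_type (assumption: an empty tag type).
struct unused_type {};

// Hypothetical helper: put T in front of an existing tuple type.
template <typename T, typename Tuple>
struct prepend;

template <typename T, typename... Ts>
struct prepend<T, std::tuple<Ts...>> {
    using type = std::tuple<T, Ts...>;
};

// Hypothetical metafunction: drop every unused_type from a sequence's
// component attributes and keep the rest in their original order.
template <typename... Ts>
struct collapse_sequence {
    using type = std::tuple<>;  // base case: nothing left to keep
};

template <typename Head, typename... Tail>
struct collapse_sequence<Head, Tail...> {
    using rest = typename collapse_sequence<Tail...>::type;
    using type = std::conditional_t<
        std::is_same_v<Head, unused_type>,
        rest,
        typename prepend<Head, rest>::type>;
};

// char_ >> ',' >> int_ : the literal ',' contributes unused_type.
static_assert(std::is_same_v<
                  collapse_sequence<char, unused_type, int>::type,
                  std::tuple<char, int>>,
              "the literal's unused attribute is dropped");
```

The block compiles as a translation unit under C++17; the `static_assert` is the whole demonstration.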

The collapsing rules work fine for sequences and alternatives.
Most of the time, we do not care about the literals. Yet, I know
at least one use case for alternatives where you want to have
an attribute for the literals. Example:

     lit("apple") | "banana" |  "cherry"

but, of course, you can use a symbol table for that. But what
if we had:

     int_ | "banana" |  "cherry"

I think this is clearly a case where we definitely want to
know if it's an int, a "banana" or "cherry" we parsed. I think
we need an attributed literal. Example:

     attr("banana")

(can anyone suggest a better name?)

I'm guessing the raw directive allows you to do the same:

     raw["banana"]

but the intent is not the same.

Anyway, so... permutations...

We all know (or do you?) that spirit2 reuses the seldom (if ever)
used ^ operator (I've never used it, has anyone?) for permutations.
It's useful for parsing items in any order. For example:

     char_('a') ^ 'b' ^ 'c'

allows us to parse:

     "a", "ab", "abc", "cba", "bca" ... etc.

yeah, permutations. (aside: we also probably need strict permutations
where elements need to occur at least once).

Now, the same problem with unused attribute collapsing arises. There's
no way to extract any usable info from the parse when some of the
attributes (all attributes in this case) are suppressed. What happens?
For example, the attribute of the above expression is unused_type. It
starts out as:

     fusion::vector<
        optional<unused_type>
      , optional<unused_type>
      , optional<unused_type>
     >

then some Spirit smarts collapses this to:

     fusion::vector<>

then, to:

    unused_type

based on Spirit's attribute collapsing rules.

This information is useless. We can never know if we parsed
"abc" or "cba" or "a" or "ab".

Again, "raw" or "attr" (as suggested above) solves this problem
at the price of extra verbosity.

Your thoughts? Is there a better way?

Regards,
--
Joel de Guzman
http://www.boostpro.com
http://spirit.sf.net


-------------------------------------------------------------------------
This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
Build the coolest Linux based applications with Moblin SDK & win great prizes
Grand prize is a trip for two to an Open Source event anywhere in the world
http://moblin-contest.org/redirect.php?banner_id=100&url=/
_______________________________________________
Spirit-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/spirit-devel

Re: spirit2 permutations and attributes

CARL BARRON-3

On Nov 4, 2008, at 11:00 AM, Joel de Guzman wrote:

> Hi Y'all,
>
> Ok, so we know that spirit2 collapses unused attributes.
> Some terminals have unused attributes. For example literals
> have unused attributes so as to be able to collapse attributes
> of expressions like:
>
>     char_ >> ',' >> int_
>
> into:
>
>     tuple<char, int>
>
> The collapsing rules work fine for sequences and alternatives.
> Most of the time, we do not care about the literals. Yet, I know
> at least one use case for alternatives where you want to have
> an attribute for the literals. Example:
>
>     lit("apple") | "banana" |  "cherry"
>
> but, of course, you can use a symbol table for that. But what
> if we had:
>
>     int_ | "banana" |  "cherry"
>
> I think this is clearly a case where we definitely want to
> know if it's an int, a "banana" or "cherry" we parsed. I think
> we need an attributed literal.
     Well, doesn't variant<A,variant<B,C> > get reduced to
variant<A,B,C>?

so
        struct b_tag{};
        struct c_tag{};
        symbols< char,variant<b_tag,c_tag> > fruits;
        //  fill the table
        rule<Iter,variant<int,b_tag,c_tag> > foo;
        foo %= int_ | fruits;

        does the job for literals like above, but not for


        foo = char_('<') >> ("banana" | "cherry")[_a = _1] > *(char_ - '<')
              >> "</" >> lit(_a) >> '>';



>
> Anyway, so... permutations...
>
> We all know (or do you?) that spirit2 reuses the seldom (if ever)
> used ^ operator (I've never used it, has anyone?) for permutations.
> It's useful for parsing items in any order. For example:
>
>     char_('a') ^ 'b' ^ 'c'
>
> allows us to parse:
>
>     "a", "ab", "abc", "cba", "bca" ... etc.
>
> yeah, permutations. (aside: we also probably need strict permutations
> where elements need to occur at least once).
>
> Now, the same problem with unused attribute collapsing arises. There's
> no way to extract any usable info from the parse when some of the
> attributes (all attributes in this case) are suppressed. What happens?
> For example, the attribute of the above expression is unused_type. It
> starts out as:
>
>     fusion::vector<
>        optional<unused_type>
>      , optional<unused_type>
>      , optional<unused_type>
>     >
>
> then some Spirit smarts collapses this to:
>
>     fusion::vector<>
>
> then, to:
>
>    unused_type
>
> based on Spirit's attribute collapsing rules.
>
> This information is useless. We can never know if we parsed
> "abc" or "cba" or "a" or "ab".
>



> Again, the "raw" or "attr" (as suggested above) solves this problem
> with the price of extra verbosity.

>
>
> Your thoughts? Is there a better way?

    What is this automatic attribute going to look like? I don't see
anything portable that is not macro-generated. template <char *p> struct
attrib; is not a solution if duplicate string literals are stored more
than once in the executable.

        I have never used a permutation operator in Spirit, so I can only
say: beware of the at least exponential growth in complexity in terms
of the number of items permuted.

        My usual solution is to add rules to work out which of a or b was
matched from

        foo = "a" | "b";

        If the code for a rule is getting unruly, one can always create a
grammar and hide the details in the class of this grammar. Then the real
grammar does not get bogged down with the details involved, which are
non-zero but generally not difficult.

       

Re: spirit2 permutations and attributes

Joel de Guzman-2
Carl Barron wrote:

>> but, of course, you can use a symbol table for that. But what
>> if we had:
>>
>>     int_ | "banana" |  "cherry"
>>
>> I think this is clearly a case where we definitely want to
>> know if it's an int, a "banana" or "cherry" we parsed. I think
>> we need an attributed literal.
>      Well  does not  variant<A,variant<B,C> > get reduced to  
> variant<A,B,C> ?

Alternates are always "flattened", so, variant<A,variant<B,C> >
won't happen. For the expression:

     a | b | c ... | x

the attribute is a flat:

     variant<A,B,C, ... X>

All unused attributes are pruned from the typelist though,
as per collapsing rules.

> so
> struct b_tag{};
> struct c_tag{};
> symbols< char,variant<b_tag,c_tag> > fruits;
> //  fill the table
> rule<Iter,variant<int,b_tag,c_tag> > foo;
> foo %= int_ | fruits;
>
> does the job for literals  like above but not for
>

Yep. That works fine.
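
For readers without Spirit at hand, the shape of the result can be sketched with `std::variant`. This is a hand-written stand-in and not Spirit code: `parse_item` is a hypothetical function, `item` is a hypothetical alias, and the tag types simply mirror the `b_tag` and `c_tag` above. The point is only that a distinct empty tag per literal keeps "banana" and "cherry" distinguishable in the attribute.

```cpp
#include <optional>
#include <string_view>
#include <variant>

// Empty tag types, one per literal, mirroring b_tag and c_tag above.
struct b_tag {};
struct c_tag {};

// Stand-in for the rule's attribute: variant<int, b_tag, c_tag>.
using item = std::variant<int, b_tag, c_tag>;

// Hypothetical hand-written equivalent of  int_ | fruits  that must
// consume the whole input; it returns nullopt when nothing matches.
constexpr std::optional<item> parse_item(std::string_view in) {
    if (in == "banana") return item{b_tag{}};
    if (in == "cherry") return item{c_tag{}};
    if (in.empty()) return std::nullopt;
    int value = 0;
    for (char ch : in) {
        if (ch < '0' || ch > '9') return std::nullopt;  // not a digit
        value = value * 10 + (ch - '0');                // no overflow check
    }
    return item{value};
}

// The variant's index tells the three cases apart.
static_assert(parse_item("42")->index() == 0, "parsed an int");
static_assert(parse_item("banana")->index() == 1, "parsed banana");
static_assert(parse_item("cherry")->index() == 2, "parsed cherry");
```

A caller would typically dispatch on the result with `std::visit` or `std::holds_alternative`.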

> foo = char_('<') >> ("banana" | "cherry")[_a = _1] > *(char_ - '<')
>       >> "</" >> lit(_a) >> '>';

Yes :( Exactly my point.

>> Anyway, so... permutations...

[snips...]

>> Your thoughts? Is there a better way?
>
>     What is this automatic attribute going to look like? I don't see
> anything portable that is not macro-generated. template <char *p> struct
> attrib; is not a solution if duplicate string literals are stored more
> than once in the executable.
>
> I have never used a permutation operator in Spirit, so I can only
> say: beware of the at least exponential growth in complexity in terms
> of the number of items permuted.
>
> My usual solution is to add rules to work out which of a or b was
> matched from
>
> foo = "a" | "b";
>
> If the code for a rule is getting unruly, one can always create a
> grammar and hide the details in the class of this grammar. Then the real
> grammar does not get bogged down with the details involved, which are
> non-zero but generally not difficult.

Oh, I'm not worried about combinatorial explosion. I've solved that
part. So, the attribute of:

     a ^ b ^ c ^ d

is just

     fusion::vector<optional<A>,optional<B>,optional<C>,optional<D> >

which works just fine, except when collapsing is involved. If
A is unused_type (say /a/ is a literal), then the attribute collapses
to:

     fusion::vector<optional<B>,optional<C>,optional<D> >

So, there's no way to know if /a/ was parsed.
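
The pruning described here can be modelled at the type level. The sketch below is an assumption-laden illustration rather than Spirit's implementation: `unused_type` is a local stand-in, `std::tuple` and `std::optional` stand in for `fusion::vector` and `optional`, and `prepend`, `prune_unused` and `permutation_attribute` are hypothetical names.

```cpp
#include <optional>
#include <tuple>
#include <type_traits>

// Local stand-in for Spirit's unused_type (assumption: an empty tag type).
struct unused_type {};

// Hypothetical helper: put T in front of an existing tuple type.
template <typename T, typename Tuple>
struct prepend;

template <typename T, typename... Ts>
struct prepend<T, std::tuple<Ts...>> {
    using type = std::tuple<T, Ts...>;
};

// Wrap each used component in optional<> and drop the unused ones.
template <typename... Ts>
struct prune_unused {
    using type = std::tuple<>;
};

template <typename Head, typename... Tail>
struct prune_unused<Head, Tail...> {
    using rest = typename prune_unused<Tail...>::type;
    using type = std::conditional_t<
        std::is_same_v<Head, unused_type>,
        rest,
        typename prepend<std::optional<Head>, rest>::type>;
};

// If nothing survives the pruning, the whole attribute becomes unused_type.
template <typename... Ts>
struct permutation_attribute {
    using pruned = typename prune_unused<Ts...>::type;
    using type = std::conditional_t<
        std::is_same_v<pruned, std::tuple<>>, unused_type, pruned>;
};

// a ^ b ^ c ^ d with /a/ a literal: the slot for /a/ disappears.
static_assert(std::is_same_v<
                  permutation_attribute<unused_type, int, char, double>::type,
                  std::tuple<std::optional<int>, std::optional<char>,
                             std::optional<double>>>,
              "no slot is left to record whether /a/ matched");
```

With every component unused, as in a permutation of three literals, `permutation_attribute<unused_type, unused_type, unused_type>::type` is `unused_type` in this model.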

Regards,
--
Joel de Guzman
http://www.boostpro.com
http://spirit.sf.net


Re: spirit2 permutations and attributes

CARL BARRON-3

On Nov 4, 2008, at 9:09 PM, Joel de Guzman wrote:

>
> Oh, I'm not worried about combinatorial explosion. I've solved that
> part. So, the attribute of:
>
>     a ^ b ^ c ^ d
>
> is just
>
>     fusion::vector<optional<A>,optional<B>,optional<C>,optional<D> >
>
> which works just fine, except when collapsing is involved. If
> A is unused_type (say /a/ is a literal), then the attribute collapses
> to:
>
>     fusion::vector<optional<B>,optional<C>,optional<D> >
>
> So, there's no way to know if /a/ was parsed.

OK, you want a type that is cheap and does not collapse results the way
unused_type does, correct?

Any old plain empty struct will prevent the entry from being
collapsed, correct?

So the problem with operator^() and unused_type becomes: transform
unused_type to attrib_type before reducing the result.

Similarly, attrib[x] can change the result type to attrib_type,
similar to what raw[] and most likely omit[] do, correct?

Seems like the solution is too simple :)



Re: spirit2 permutations and attributes

Joel de Guzman-2
Carl Barron wrote:

> On Nov 4, 2008, at 9:09 PM, Joel de Guzman wrote:
>> Oh, I'm not worried about combinatorial explosion. I've solved that
>> part. So, the attribute of:
>>
>>     a ^ b ^ c ^ d
>>
>> is just
>>
>>     fusion::vector<optional<A>,optional<B>,optional<C>,optional<D> >
>>
>> which works just fine, except when collapsing is involved. If
>> A is unused_type (say /a/ is a literal), then the attribute collapses
>> to:
>>
>>     fusion::vector<optional<B>,optional<C>,optional<D> >
>>
>> So, there's no way to know if /a/ was parsed.
>
> OK, you want a type that is cheap and does not collapse results the way
> unused_type does, correct?
>
> Any old plain empty struct will prevent the entry from being
> collapsed, correct?
>
> So the problem with operator^() and unused_type becomes: transform
> unused_type to attrib_type before reducing the result.
>
> Similarly, attrib[x] can change the result type to attrib_type,
> similar to what raw[] and most likely omit[] do, correct?
>
> Seems like the solution is too simple :)

Yes, that's the solution I suggested. I was wondering though if there's
a better way. For example, it is possible to inhibit collapsing for
certain parser expressions like a ^ b ^ c. That way, the expression:

     lit("a") ^ "b" ^ "c"

can have the attribute:

     fusion::vector<
         optional<unused_type>
       , optional<unused_type>
       , optional<unused_type>
     >

How is that useful? At least, you can check if any of "a", "b"
or "c" is parsed by checking each of the optionals. I.e. if the
input is "abc", then all optionals will check true; if the input
is "ac", only the first and third optionals will check true.
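
To make the idea concrete, here is a small standard-library sketch of what a caller would see if the optionals were kept. It is not Spirit code: `parse_abc_permutation`, `slot` and `slots` are hypothetical names, `std::array` stands in for `fusion::vector`, and for simplicity the sketch only handles single-character literals and requires the whole input to be consumed.

```cpp
#include <array>
#include <cstddef>
#include <optional>
#include <string_view>

// Local stand-in for Spirit's unused_type (assumption: an empty tag type).
struct unused_type {};

// One optional<unused_type> per literal, in the order a, b, c.
using slots = std::array<std::optional<unused_type>, 3>;

// An engaged optional means "this element was matched".
constexpr std::optional<unused_type> slot(bool matched) {
    return matched ? std::optional<unused_type>{unused_type{}}
                   : std::optional<unused_type>{};
}

// Hypothetical stand-in for  lit("a") ^ "b" ^ "c"  without collapsing:
// each literal may match at most once, in any order, at least one overall.
constexpr std::optional<slots> parse_abc_permutation(std::string_view in) {
    constexpr std::array<char, 3> literals{'a', 'b', 'c'};
    std::array<bool, 3> seen{};
    for (char ch : in) {
        bool matched = false;
        for (std::size_t i = 0; i < literals.size(); ++i) {
            if (ch == literals[i] && !seen[i]) {
                seen[i] = true;
                matched = true;
                break;
            }
        }
        if (!matched) return std::nullopt;  // unknown or repeated element
    }
    if (in.empty()) return std::nullopt;    // nothing matched at all
    return slots{slot(seen[0]), slot(seen[1]), slot(seen[2])};
}
```

In this model, the result for the input "ac" has its first and third optionals engaged and its second empty, which is exactly the information the collapsed `unused_type` attribute throws away.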

I'm not sure if it's a good idea to relax the collapsing rules
on certain occasions though. It might cause more confusion than
it's worth.

So, "decorating" parsers with unused attributes is indeed a
solution. It's not as simple as you think though. The
transformation is deep. For example, raw[xxx] works with only
literal string attributes. A generic attrib[p] directive will
have to tell all its children down every node of p to bring back
its attribute. This is not possible in certain cases.

Perhaps it's a good time to rethink about collapsing in the
first place. Or at least, the way we do automatic implicit
collapsing. Hmmm... more on that in another post.

Any comment, suggestion much appreciated!

Regards,
--
Joel de Guzman
http://www.boostpro.com
http://spirit.sf.net


Re: spirit2 permutations and attributes

Hartmut Kaiser
> > On Nov 4, 2008, at 9:09 PM, Joel de Guzman wrote:
> >> Oh, I'm not worried about combinatorial explosion. I've solved that
> >> part. So, the attribute of:
> >>
> >>     a ^ b ^ c ^ d
> >>
> >> is just
> >>
> >>     fusion::vector<optional<A>,optional<B>,optional<C>,optional<D> >
> >>
> >> which works just fine, except when collapsing is involved. If
> >> A is unused_type (say /a/ is a literal), then the attribute collapses
> >> to:
> >>
> >>     fusion::vector<optional<B>,optional<C>,optional<D> >
> >>
> >> So, there's no way to know if /a/ was parsed.
> >
> > OK, you want a type that is cheap and does not collapse results the way
> > unused_type does, correct?
> >
> > Any old plain empty struct will prevent the entry from being
> > collapsed, correct?
> >
> > So the problem with operator^() and unused_type becomes: transform
> > unused_type to attrib_type before reducing the result.
> >
> > Similarly, attrib[x] can change the result type to attrib_type,
> > similar to what raw[] and most likely omit[] do, correct?
> >
> > Seems like the solution is too simple :)
>
> Yes, that's the solution I suggested. I was wondering though if there's
> a better way. For example, it is possible to inhibit collapsing for
> certain parser expressions like a ^ b ^ c. That way, the expression:
>
>      lit("a") ^ "b" ^ "c"
>
> can have the attribute:
>
>      fusion::vector<
>          optional<unused_type>
>        , optional<unused_type>
>        , optional<unused_type>
>      >
>
> How is that useful? At least, you can check if any of "a", "b"
> or "c" is parsed by checking each of the optionals. I.e. if the
> input is "abc", then all optionals will check true; if the input
> is "ac", only the first and third optionals will check true.
>
> I'm not sure if it's a good idea to relax the collapsing rules
> on certain occasions though. It might cause more confusion than
> it's worth.
>
> So, "decorating" parsers with unused attributes is indeed a
> solution. It's not as simple as you think though. The
> transformation is deep. For example, raw[xxx] works with only
> literal string attributes. A generic attrib[p] directive will
> have to tell all its children down every node of p to bring back
> its attribute. This is not possible in certain cases.
>
> Perhaps it's a good time to rethink about collapsing in the
> first place. Or at least, the way we do automatic implicit
> collapsing. Hmmm... more on that in another post.

As already mentioned while chatting yesterday I think the only solution is
to provide explicit means of decorating the parsers which shouldn't
collapse/hide their attributes:

    attr("abc") >> int_;

where:

    a: A --> attr(a): A
    a: unused --> attr(a): A (where A is the type of a before collapsing)

even if a itself otherwise would have an unused attribute.

Perhaps using a syntax similar to raw is more appropriate:

    attr['a' >> 'b'] --> tuple<char, char>
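
The two mapping rules above can be written down as a tiny type-level model. This is purely illustrative and assumes nothing about how such a decorator would be built in Spirit: `int_parser`, `char_literal`, `attr_decorated` and the `natural`/`exposed` member names are all hypothetical, with `natural` meaning "the type before collapsing" and `exposed` meaning "what the enclosing expression sees".

```cpp
#include <type_traits>

// Local stand-in for Spirit's unused_type (assumption: an empty tag type).
struct unused_type {};

// Hypothetical component descriptions.
struct int_parser {
    using natural = int;          // what the parser could produce
    using exposed = int;          // what it contributes to its parent
};

struct char_literal {
    using natural = char;
    using exposed = unused_type;  // literals hide their attribute
};

// Hypothetical decorator: always expose the natural attribute.
template <typename P>
struct attr_decorated {
    using natural = typename P::natural;
    using exposed = typename P::natural;
};

// a: A      --> attr(a): A
static_assert(std::is_same_v<attr_decorated<int_parser>::exposed, int>,
              "a used attribute is passed through unchanged");

// a: unused --> attr(a): A
static_assert(std::is_same_v<attr_decorated<char_literal>::exposed, char>,
              "a hidden attribute is brought back");
```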

Wrt your permutation example: I think we should decorate the whole thing:

    attr[lit("a") ^ "b" ^ "c"]

and not only the first element. Or did I misunderstand your suggestion?

Regards Hartmut





Re: spirit2 permutations and attributes

Joel de Guzman-2
Hartmut Kaiser wrote:

> As already mentioned while chatting yesterday I think the only solution is
> to provide explicit means of decorating the parsers which shouldn't
> collapse/hide their attributes:
>
>     attr("abc") >> int_;
>
> where:
>
>     a: A --> attr(a): A
>     a: unused --> attr(a): A (where A is the type of a before collapsing)
>
> even if a itself otherwise would have an unused attribute.
>
> Perhaps using a syntax similar to raw is more appropriate:
>
>     attr['a' >> 'b'] --> tuple<char, char>
>
> Wrt your permutation example: I think we should decorate the whole thing:
>
>     attr[lit("a") ^ "b" ^ "c"]
>
> and not only the first element. Or did I misunderstand your suggestion?

No, but to be honest, I don't like it. I want to explore other
options. If you look closely, there are really no hard rules
on attribute collapsing. The sequence and the alternate do it
differently. The alternate, for example, does not actually strip
all unused attributes; it just reorders them such that the unused
attribute is at the beginning of the variant and, as required by
variant, there is at most one element of each type. Following
this, it can be seen that attribute collapsing, or in a
more general sense, attribute transformation, varies depending on
the composite parser. My current thinking is that, for the permutation,
stripping unused attributes is not suitable.
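
The alternate's transformation, as described in this message, can be sketched at the type level: keep one `unused_type` at the front if any component is unused, then keep each remaining type once. This is a model of the description only, not Spirit's implementation: `unused_type` is a local stand-in, `std::variant` stands in for the variant Spirit uses, and `type_list`, `append_unique`, `unique_into`, `to_variant` and `alternative_attribute` are hypothetical names.

```cpp
#include <type_traits>
#include <variant>

// Local stand-in for Spirit's unused_type (assumption: an empty tag type).
struct unused_type {};

// Plain list of types used while accumulating the result.
template <typename... Ts>
struct type_list {};

// Append T to the list unless it is already present.
template <typename List, typename T>
struct append_unique;

template <typename... Ls, typename T>
struct append_unique<type_list<Ls...>, T> {
    using type = std::conditional_t<(std::is_same_v<T, Ls> || ...),
                                    type_list<Ls...>,
                                    type_list<Ls..., T>>;
};

// Fold append_unique over Ts..., starting from List.
template <typename List, typename... Ts>
struct unique_into {
    using type = List;
};

template <typename List, typename Head, typename... Tail>
struct unique_into<List, Head, Tail...> {
    using type = typename unique_into<
        typename append_unique<List, Head>::type, Tail...>::type;
};

template <typename List>
struct to_variant;

template <typename... Ts>
struct to_variant<type_list<Ts...>> {
    using type = std::variant<Ts...>;
};

// Seed the list with unused_type when any component is unused, so that it
// ends up first; duplicates (including further unused_types) are skipped.
template <typename... Ts>
struct alternative_attribute {
    using seed = std::conditional_t<(std::is_same_v<unused_type, Ts> || ...),
                                    type_list<unused_type>,
                                    type_list<>>;
    using type =
        typename to_variant<typename unique_into<seed, Ts...>::type>::type;
};

// int_ | "banana" | "cherry" : both literals share one unused_type slot.
static_assert(std::is_same_v<
                  alternative_attribute<int, unused_type, unused_type>::type,
                  std::variant<unused_type, int>>,
              "unused_type is kept once, at the front");
```

Under this model the two literals remain indistinguishable from each other, but unlike the sequence case the attribute still records that something other than the int matched.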

Regards,
--
Joel de Guzman
http://www.boostpro.com
http://spirit.sf.net

