Parser operator: Difference

Orient
According to the documentation http://ciere.com/cppnow15/x3_docs/spirit/quick_reference/operator.html, the "minus" operator is overloaded as follows:

Expression | Attribute | Description
a - b |  A | Difference. Parse a but not b

Given two input iterators beg and end, the cited description reads to me like this: parse a on the input beg ... end; if a succeeds (so its iterator it_a has advanced), then parse b on the input beg ... it_a; if b also succeeds and stops exactly at it_a (it_b == it_a), the whole parser fails, otherwise it succeeds with the result of a. But the code tells me that b is applied first:

// Try Right first
Iterator start = first;
if (this->right.parse(first, last, context, rcontext, unused))
{
    // Right succeeds, we fail.
    first = start;
    return false;
}
// Right fails, now try Left
return this->left.parse(first, last, context, rcontext, attr);

This is not what the description intuitively suggests.

What I want is to distinguish keywords from ordinary symbols (pseudocode):
// three sets of reserved symbols:
x3::symbols< ast::constant > const constant
({
     ...
 },
 "constant");

x3::symbols< ast::intrinsic > const intrinsic
({
     ...
 },
 "intrinsic");

x3::symbols< ast::keyword > const keyword
({
     ...
 },
 "keyword");
// then I want to define "identifier" rule:
auto const identifier_def =
        x3::raw[x3::lexeme[((x3::alpha | '_') >> *(x3::alnum | '_')) - (keyword | intrinsic | constant)]]
        ;
// This would be the optimal solution if "-" behaved as I described above, but currently I do the following:
template< typename parser >
decltype(auto)
distinct_keyword(parser && _parser)
{
    return x3::lexeme[std::forward< parser >(_parser) >> !(x3::alnum | '_')];
}
auto const symbol_def =
        !distinct_keyword(keyword | intrinsic | constant)
        >> x3::raw[x3::lexeme[((x3::alpha | '_') >> *(x3::alnum | '_'))]]
        ;

The latter variant contains unnecessary code, >> !(x3::alnum | '_'), which (I think) also hurts performance.

Would the following modification be a breaking change for existing code?

Iterator start = first;
if (this->left.parse(first, last, context, rcontext, attr))
{
    if (this->right.parse(start, first, context, rcontext, unused))
    {
        return (start != first); // Does right completely match the same range as left?
    }
    return true;
}
return false;

If it is a breaking change, then maybe a separate operator / should be introduced for parsers, by analogy with the "set difference" operator from mathematics?

Re: Parser operator: Difference

Orient
To be correct, the parse function for operator - should be:

Iterator const start = first;
if (this->left.parse(first, last, context, rcontext, attr))
{
    Iterator it = start;
    if (this->right.parse(it, first, context, rcontext, unused))
    {
        if (it == first)
        {
            first = start;
            return false;
        }
    }
    return true;
}
return false;

Maybe it makes sense to rename boost/spirit/home/x3/operator/difference.hpp -> /home/user/boost/libs/spirit/include/boost/spirit/home/x3/operator/minus.hpp and implement operator / in difference.hpp, as described previously?

Currently a - b is equivalent to !b > a or !b >> a, isn't it?

Re: Parser operator: Difference

TONGARI J
2015-05-20 14:18 GMT+08:00 Orient <[hidden email]>:
Maybe it make sense to rename boost/spirit/home/x3/operator/difference.hpp
->
/home/user/boost/libs/spirit/include/boost/spirit/home/x3/operator/minus.hpp
and make *operator /* in difference.hpp, as described previously?

I like the idea, as it is more useful than the current semantics, and there is a precedent for an operator changing its semantics as Spirit evolves (e.g. "!"), so I think it should not be a burden for X3. That's my personal opinion, of course.
Note that with the new semantics, you may need hold[] to hold back the data when the right parser fails.
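
To make the rollback concern concrete, here is a minimal standard-library model, not Spirit code. It assumes one reading of the note above: under the proposed semantics the left parser has already written into the attribute by the time the right parser is consulted, so a failing difference can leave partial data behind. `AttrParser`, `lit`, `word`, `difference_proposed` and `hold` are hypothetical names for this sketch; `hold` only imitates the idea of a hold[] directive by working on a copy of the attribute.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>

// Hypothetical stand-in for a parser with an attribute: on success it advances
// `pos` and returns true; on failure it leaves `pos` untouched.
using AttrParser = std::function<bool(const std::string& in, std::size_t& pos,
                                      std::size_t end, std::string& attr)>;

// Matches a fixed word; produces no attribute.
AttrParser lit(std::string word) {
    return [word](const std::string& in, std::size_t& pos, std::size_t end,
                  std::string&) {
        if (end - pos < word.size() || in.compare(pos, word.size(), word) != 0)
            return false;
        pos += word.size();
        return true;
    };
}

// Matches one or more letters and appends each one to the attribute as it goes.
AttrParser word() {
    return [](const std::string& in, std::size_t& pos, std::size_t end,
              std::string& attr) {
        const std::size_t start = pos;
        while (pos < end && std::isalpha(static_cast<unsigned char>(in[pos])))
            attr.push_back(in[pos++]);
        return pos != start;
    };
}

// Proposed "a - b": run a first, then let b see only what a consumed.
AttrParser difference_proposed(AttrParser a, AttrParser b) {
    return [a, b](const std::string& in, std::size_t& pos, std::size_t end,
                  std::string& attr) {
        std::size_t after_a = pos;
        if (!a(in, after_a, end, attr)) return false;
        std::size_t it = pos;
        std::string ignored;
        if (b(in, it, after_a, ignored) && it == after_a)
            return false;  // b covers all of a's match; attr was already modified
        pos = after_a;
        return true;
    };
}

// Runs p on a copy of the attribute and commits the copy only on success.
AttrParser hold(AttrParser p) {
    return [p](const std::string& in, std::size_t& pos, std::size_t end,
               std::string& attr) {
        std::string copy = attr;
        if (!p(in, pos, end, copy)) return false;
        attr = std::move(copy);
        return true;
    };
}
```

In this model, `difference_proposed(word(), lit("int"))` fails on the input "int" but leaves "int" in the attribute, while the same parser wrapped in `hold(...)` fails and leaves the attribute empty.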

------------------------------------------------------------------------------
One dashboard for servers and applications across Physical-Virtual-Cloud
Widest out-of-the-box monitoring support with 50+ applications
Performance metrics, stats and reports that give you Actionable Insights
Deep dive visibility with transaction tracing using APM Insight.
http://ad.doubleclick.net/ddm/clk/290420510;117567292;y
_______________________________________________
Spirit-general mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/spirit-general

Re: Parser operator: Difference

Orient
In addition, current semantic of *a - b* is equivalent to *!b >> a*, if I am right.
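
The claimed equivalence can be checked on a small standard-library model, not Spirit itself. The sketch below only models matching and iterator movement, not attributes, and every name in it (`Parser`, `lit`, `not_`, `seq`, `difference`, `consumed`) is a hypothetical stand-in rather than an X3 API.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical stand-in for a parser: on success it advances `pos` and returns
// true; on failure it leaves `pos` untouched. `end` bounds the visible input.
using Parser = std::function<bool(const std::string& in, std::size_t& pos,
                                  std::size_t end)>;

// Matches a fixed word.
Parser lit(std::string word) {
    return [word](const std::string& in, std::size_t& pos, std::size_t end) {
        if (end - pos < word.size() || in.compare(pos, word.size(), word) != 0)
            return false;
        pos += word.size();
        return true;
    };
}

// Models "!p": succeeds without consuming input when p does not match here.
Parser not_(Parser p) {
    return [p](const std::string& in, std::size_t& pos, std::size_t end) {
        std::size_t probe = pos;
        return !p(in, probe, end);
    };
}

// Models "l >> r": both must match in order, otherwise nothing is consumed.
Parser seq(Parser l, Parser r) {
    return [l, r](const std::string& in, std::size_t& pos, std::size_t end) {
        const std::size_t save = pos;
        if (l(in, pos, end) && r(in, pos, end)) return true;
        pos = save;
        return false;
    };
}

// Models the current "a - b": try b first; if it matches, fail; else run a.
Parser difference(Parser a, Parser b) {
    return [a, b](const std::string& in, std::size_t& pos, std::size_t end) {
        std::size_t probe = pos;
        if (b(in, probe, end)) return false;
        return a(in, pos, end);
    };
}

// Number of characters consumed from the start of `in`, or -1 on failure.
long consumed(const Parser& p, const std::string& in) {
    std::size_t pos = 0;
    return p(in, pos, in.size()) ? static_cast<long>(pos) : -1;
}
```

With `a = lit("po")` and `b = lit("police")`, both `difference(a, b)` and `seq(not_(b), a)` consume two characters of "polite" and both fail on "police".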

Re: Parser operator: Difference

Joel de Guzman
In reply to this post by TONGARI J
On 5/20/15 3:01 PM, TONGARI J wrote:

> I like the idea as it is more useful than the current semantic, and there's a precedent
> that an operator changes its semantic while Spirit evolves (e.g. "!"), so I think it
> should not be a burden for X3, that's my personal opinion, of course.
> Note that with the new semantic, you may need hold[] to hold back the data when
> right-parser fails.

There was a (short) time when Spirit did it this way. I liked it too,
but alas, the reason why we went to the current semantics escapes me
now. I'll try to recall and get back to you guys on this issue :-)

Regards,
--
Joel de Guzman
http://www.ciere.com
http://boost-spirit.com
http://www.cycfi.com/



Re: Parser operator: Difference

Joel de Guzman
In reply to this post by Orient
On 5/20/15 3:28 PM, Orient wrote:
> In addition, current semantic of *a - b* is equivalent to *!b >> a*, if I am
> right.

Yes you are right.

Regards,
--
Joel de Guzman
http://www.ciere.com
http://boost-spirit.com
http://www.cycfi.com/



Re: Parser operator: Difference

TONGARI J
In reply to this post by Joel de Guzman
2015-05-20 17:12 GMT+08:00 Joel de Guzman <[hidden email]>:
> There was a (short) time when Spirit did it this way. I liked it too,
> but alas, the reason why we went to the current semantics escapes me
> now. I'll try to recall and get back to you guys on this issue :-)

Maybe the reason is just as I mentioned -- the need for data-rollback when right-parser fails?
If so, then it's already solved by "hold[a - b]".


Re: Parser operator: Difference

Joel de Guzman
On 5/20/15 5:22 PM, TONGARI J wrote:

> Maybe the reason is just as I mentioned -- the need for data-rollback when right-parser fails?
> If so, then it's already solved by "hold[a - b]".

I'd like to study this issue a bit more. Orient, could you give us specific
cases and inputs which highlight the two semantics? I suggest to make it
very simple. I do not have enough time to study the implications of
anything more elaborate than that used in the comment:

   // Unlike classic Spirit, with this version of difference, the rule
   // lit("policeman") - "police" will always fail to match.

Let us see the advantages and disadvantages of the semantics of
both and decide which one we want to have for X3.
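
As a simple starting point, the "policeman" case from that comment can be modelled with the standard library alone. This is a sketch of the two orderings discussed in this thread, not the X3 implementation; `Parser`, `lit`, `difference_current`, `difference_proposed` and `consumed` are hypothetical names, and attributes are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical stand-in for a parser: on success it advances `pos` and returns
// true; on failure it leaves `pos` untouched. `end` bounds the visible input.
using Parser = std::function<bool(const std::string& in, std::size_t& pos,
                                  std::size_t end)>;

// Matches a fixed word.
Parser lit(std::string word) {
    return [word](const std::string& in, std::size_t& pos, std::size_t end) {
        if (end - pos < word.size() || in.compare(pos, word.size(), word) != 0)
            return false;
        pos += word.size();
        return true;
    };
}

// Current ordering: try the right parser first; if it matches, fail.
Parser difference_current(Parser a, Parser b) {
    return [a, b](const std::string& in, std::size_t& pos, std::size_t end) {
        std::size_t probe = pos;
        if (b(in, probe, end)) return false;
        return a(in, pos, end);
    };
}

// Proposed ordering: run the left parser first, then let the right parser see
// only the range the left one consumed; fail only if it covers that range.
Parser difference_proposed(Parser a, Parser b) {
    return [a, b](const std::string& in, std::size_t& pos, std::size_t end) {
        std::size_t after_a = pos;
        if (!a(in, after_a, end)) return false;
        std::size_t it = pos;
        if (b(in, it, after_a) && it == after_a) return false;
        pos = after_a;
        return true;
    };
}

// Number of characters consumed from the start of `in`, or -1 on failure.
long consumed(const Parser& p, const std::string& in) {
    std::size_t pos = 0;
    return p(in, pos, in.size()) ? static_cast<long>(pos) : -1;
}
```

On the input "policeman", the model of `lit("policeman") - "police"` fails with the current ordering, because "police" is a prefix of the input, and consumes all nine characters with the proposed ordering. When both sides are `lit("police")`, both orderings fail on "police".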

Regards,
--
Joel de Guzman
http://www.ciere.com
http://boost-spirit.com
http://www.cycfi.com/



Re: Parser operator: Difference

Orient
In its current state, *a - b* makes sense (for me) with something like the char_ parser, say in a skipper grammar equivalent to the regular expression R"(\s+|(/\*[^*]*\*+([^/*][^*]*\*+)*/)|(//[^\r\n]*))":

static auto const skipper_def =
        x3::ascii::space
        | (x3::lit("/*") > ana_ > *((x3::char_ - (x3::char_('*') | x3::char_('/'))) > ana_) > x3::char_('/'))
        | (x3::lit("//") > *(x3::char_ - (x3::eol | x3::eoi)) > (x3::eol | x3::eoi))
        ;

That works for parsers *a* and *b* that consume a strictly fixed and equal range of input. But when we deal with identifiers and keywords (intrinsics, predefined constants or keywords proper), we have to distinguish them in a smarter way.
Say there is a set of keywords "function", "begin", "local", "end" (a subset of Lua) and we expect a general identifier (symbol):
auto const symbol_def =
        x3::raw[x3::lexeme[((x3::alpha | '_') >> *(x3::alnum | '_'))]] - (x3::lit("function") | "begin" | "local" | "end")
        ;

If the input is "endless", "functional", "beginning" or "locally", the rule wrongly fails, because the right-hand parser stops at the end of the keyword and does not look as far ahead as the left-hand parser can go. Therefore we have to do additional work ourselves, and make the computer do redundant work at compile time and at run time, by writing on the right-hand side:
... - x3::lexeme[(x3::lit("function") | "begin" | "local" | "end") >> !(x3::alnum | '_')]
Actually, none of the code above compiles as written (I think because I misunderstand how directives propagate).

The main drawback of the current semantics is therefore that *operator -* is fully applicable only to *a* and *b* that parse strictly equal amounts of input.

Here is the real grammar that I have in mind throughout: https://github.com/tomilov/insituc/blob/master/include/insituc/parser/implementation/parser.hpp
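
The keyword problem and the look-ahead workaround can be reproduced on a small standard-library model, without Spirit. The sketch assumes the current "right first" ordering; `Parser`, `one_of`, `identifier`, `distinct`, `difference` and `consumed` are hypothetical names for this illustration, and skipping and attributes are ignored.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a parser: on success it advances `pos` and returns
// true; on failure it leaves `pos` untouched. `end` bounds the visible input.
using Parser = std::function<bool(const std::string& in, std::size_t& pos,
                                  std::size_t end)>;

bool ident_char(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
}

// Models (alpha | '_') >> *(alnum | '_').
Parser identifier() {
    return [](const std::string& in, std::size_t& pos, std::size_t end) {
        if (pos >= end) return false;
        const unsigned char c = static_cast<unsigned char>(in[pos]);
        if (!std::isalpha(c) && c != '_') return false;
        std::size_t p = pos + 1;
        while (p < end && ident_char(in[p])) ++p;
        pos = p;
        return true;
    };
}

// Matches the first listed word found at the current position.
Parser one_of(std::vector<std::string> words) {
    return [words](const std::string& in, std::size_t& pos, std::size_t end) {
        for (const std::string& w : words) {
            if (end - pos >= w.size() && in.compare(pos, w.size(), w) == 0) {
                pos += w.size();
                return true;
            }
        }
        return false;
    };
}

// Models p >> !(alnum | '_'): p must not be followed by an identifier char.
Parser distinct(Parser p) {
    return [p](const std::string& in, std::size_t& pos, std::size_t end) {
        std::size_t probe = pos;
        if (!p(in, probe, end)) return false;
        if (probe < end && ident_char(in[probe])) return false;
        pos = probe;
        return true;
    };
}

// Current ordering of "a - b": try b first; if it matches, fail; else run a.
Parser difference(Parser a, Parser b) {
    return [a, b](const std::string& in, std::size_t& pos, std::size_t end) {
        std::size_t probe = pos;
        if (b(in, probe, end)) return false;
        return a(in, pos, end);
    };
}

// Number of characters consumed from the start of `in`, or -1 on failure.
long consumed(const Parser& p, const std::string& in) {
    std::size_t pos = 0;
    return p(in, pos, in.size()) ? static_cast<long>(pos) : -1;
}
```

With `keyword = one_of({"function", "begin", "local", "end"})`, the model of `identifier - keyword` rejects "endless", whereas `identifier - distinct(keyword)` accepts all seven characters of "endless" and still rejects a bare "end".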


Re: Parser operator: Difference

Michael Powell-2
In reply to this post by Orient
On Wed, May 20, 2015 at 1:59 AM, Orient <[hidden email]> wrote:
> According to the documentation
> http://ciere.com/cppnow15/x3_docs/spirit/quick_reference/operator.html the
> "minus" operator overloaded to be:
>
> Expression | Attribute | Description
> a - b |  A | Difference. Parse a but not b

If it were me, especially now that we are three versions in, familiarity and the user base are non-trivial matters to consider before changing semantics suddenly.

Ditto along the lines of "simple" variadically-fed tuples.

That's my two cents.


Re: Parser operator: Difference

Joel de Guzman
In reply to this post by Orient
Admin hat on: please avoid top quoting.

On 5/21/15 2:16 AM, Orient wrote:
> For current state *a - b* make sense for something using with char_ parser
> (for me), say skipper grammar equal to corresponding regular expression
> R"(\s+|(/\*[^*]*\*+([^/*][^*]*\*+)*/)|(//[^\r\n]*))":

Human parser hat on: please avoid regex. It makes my jet-lagged head
spin! :-) No, I hesitate to read that one above for my sanity :-)

> static auto const skipper_def =
>          x3::ascii::space
>          | (x3::lit("/*") > ana_ > *((x3::char_ - (x3::char_('*') |
> x3::char_('/'))) > ana_) > x3::char_('/'))
>          | (x3::lit("//") > *(x3::char_ - (x3::eol | x3::eoi)) > (x3::eol |
> x3::eoi))
>          ;

Sorry, I can't understand that either. I can try, but can you please
post something as simple as possible such as:

   lit("policeman") - "police"

Otherwise, we will be potentially confusing each other.

> That is for parsers *a* and *b*, which consumes strictly fixed and equal
> range of input. But when we dealing with identifiers and keywords
> (intrinsics, predefined constants or keywords itself), then we should to
> distinct them in a smart way.
> Say, there are a set of keywords "function", "begin", "local", "end" (really
> Lua subset) and expect a general identifier (symbol):
> auto const symbol_def =
>          x3::raw[x3::lexeme[((x3::alpha | '_') >> *(x3::alnum | '_'))]] -
> (x3::lit("function") | "begin" | "local" | "end")
>          ;
>
> and input is "endless" or "functional" or "beginning" or "locally", then
> there is an error here. Because rhs parser not finished as much further as
> left can do. Therefore we need to do an additional work oneself and to make
> an computer do redundant work at compile time and runtime by writing at rhs:
> ... - x3::lexeme[x3::lit("function") | "begin" | "local" | "end") >>
> !(x3::alnum | '_')]
> Really such all above code not compiles (I think, it is due to my
> misunderstanding of directives propagating).
>
> Main cons of current semantic therefore: *operator -* is wholesome
> applicable only for *a* and *b*, which parses strictly equal amount of
> input.
>
> Here is a real grammar, which I always imply and keep in mind:
> https://github.com/tomilov/insituc/blob/master/include/insituc/parser/implementation/parser.hpp

I'm sorry, Orient, I can't follow. Can we keep the discussion as simple
as possible with concocted and simple use cases such as the "policeman"
example?

Regards,
--
Joel de Guzman
http://www.ciere.com
http://boost-spirit.com
http://www.cycfi.com/



Re: Parser operator: Difference

Michael Powell-2


On May 20, 2015 10:54:18 PM EDT, Joel de Guzman <[hidden email]> wrote:

>Sorry, I can't understand that either. I can try, but can you please
>post something as simple as possible such as:
>
>   lit("policeman") - "police"

Could be something like...

( lit("racecar") - "race" ) | ( lit("racecar") - "car" )


Re: Parser operator: Difference

Joel de Guzman
On 5/21/15 5:23 PM, Michael wrote:
>> Sorry, I can't understand that either. I can try, but can you please
>> post something as simple as possible such as:
>>
>>    lit("policeman") - "police"
> Could be something like...
>
> ( lit("racecar") - "race" ) | ( lit("racecar") - "car" )

OK, so what about it and what are the implications of the proposed
and the old semantics of operator-?

Regards,
--
Joel de Guzman
http://www.ciere.com
http://boost-spirit.com
http://www.cycfi.com/



Re: Parser operator: Difference

TONGARI J
In reply to this post by Joel de Guzman
2015-05-21 10:54 GMT+08:00 Joel de Guzman <[hidden email]>:
On 5/21/15 2:16 AM, Orient wrote:
> static auto const skipper_def =
>          x3::ascii::space
>          | (x3::lit("/*") > ana_ > *((x3::char_ - (x3::char_('*') |
> x3::char_('/'))) > ana_) > x3::char_('/'))
>          | (x3::lit("//") > *(x3::char_ - (x3::eol | x3::eoi)) > (x3::eol |
> x3::eoi))
>          ;

> Sorry, I can't understand that either. I can try, but can you please
> post something as simple as possible such as:
>
>    lit("policeman") - "police"

Let me try. A C-like language may have rules like:
```
identifier = lexeme[char_("_a-zA-Z")  >> *char_("_a-zA-Z0-9")]

keyword = "int"

variable = identifier - keyword
```

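Given the input "internet", `variable` matches under the new semantics, because `keyword` does not consume everything that `identifier` matched; under the old semantics it doesn't match, because `keyword` is tried first and succeeds on the prefix "int".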
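
To make the difference concrete, here is a minimal sketch of the two behaviours as a toy model in plain C++, not written against Spirit itself. The names `Parser`, `lit`, `difference_current` and `difference_proposed` are invented for illustration, and `difference_proposed` is only one reading of the semantics proposed in this thread.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <functional>
#include <string>

// Toy parser type: takes the input and a start position and returns the
// position one past the match, or `fail`. A model only, not Spirit's API.
using Parser = std::function<std::size_t(const std::string&, std::size_t)>;
const std::size_t fail = std::string::npos;

// Matches the literal `word` at `pos`.
Parser lit(const std::string& word) {
    return [word](const std::string& in, std::size_t pos) -> std::size_t {
        return in.compare(pos, word.size(), word) == 0 ? pos + word.size() : fail;
    };
}

bool ident_char(char c) {
    return c == '_' || std::isalnum(static_cast<unsigned char>(c));
}

// identifier = char_("_a-zA-Z") >> *char_("_a-zA-Z0-9")
std::size_t identifier(const std::string& in, std::size_t pos) {
    if (pos >= in.size() || !ident_char(in[pos]) ||
        std::isdigit(static_cast<unsigned char>(in[pos])))
        return fail;
    while (pos < in.size() && ident_char(in[pos])) ++pos;
    return pos;
}

// Existing operator-: try b first; if b matches anything at pos, fail.
Parser difference_current(Parser a, Parser b) {
    return [a, b](const std::string& in, std::size_t pos) -> std::size_t {
        if (b(in, pos) != fail) return fail;
        return a(in, pos);
    };
}

// Proposed variant (assumed reading): parse a first, then fail only if b
// matches exactly the text that a consumed.
Parser difference_proposed(Parser a, Parser b) {
    return [a, b](const std::string& in, std::size_t pos) -> std::size_t {
        const std::size_t end = a(in, pos);
        if (end == fail) return fail;
        const std::string matched = in.substr(pos, end - pos);
        return b(matched, 0) == matched.size() ? fail : end;
    };
}

void demo() {
    const Parser keyword = lit("int");
    const Parser old_variable = difference_current(identifier, keyword);
    const Parser new_variable = difference_proposed(identifier, keyword);

    assert(old_variable("internet", 0) == fail);  // "int" matches the prefix
    assert(new_variable("internet", 0) == 8u);    // whole identifier accepted
    assert(old_variable("int", 0) == fail);       // both reject the bare keyword
    assert(new_variable("int", 0) == fail);
}
```

Calling `demo()` exercises both variants on "internet" and "int"; the only input on which they disagree here is "internet".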


Re: Parser operator: Difference

Michael Powell-2


On May 21, 2015 6:03:10 AM EDT, TONGARI J <[hidden email]> wrote:

>Let me try, a C-like lang may have rules like:
>```
>identifier = lexeme[char_("_a-zA-Z")  >> *char_("_a-zA-Z0-9")]
>
>keyword = "int"
>
>variable = identifier - keyword
>```
>
>Given the input "internet", with the new semantic, it will match
>`variable`; with the old semantic, it won't.

Does it not "see" the whitespace surrounding "int"? Skippers are always tricky to get right...


Re: Parser operator: Difference

Joel de Guzman
In reply to this post by TONGARI J
On 5/21/15 6:03 PM, TONGARI J wrote:

> Let me try, a C-like lang may have rules like:
> ```
> identifier = lexeme[char_("_a-zA-Z")  >> *char_("_a-zA-Z0-9")]
>
> keyword = "int"
>
> variable = identifier - keyword
> ```
>
> Given the input "internet", with the new semantic, it will match `variable`; with the old
> semantic, it won't.

OK, I understand now. Thanks! I still have jetlag from Aspen.
I took these sleeping pills and I feel so woozy!

Anyway...

First, if I had a keyword rule like that, I'd write it so that
it only matches whole words:

    keyword = "int" >> !char_("_a-zA-Z0-9");

Otherwise keyword will partially match inputs like "internet", which
is wrong.

Second, if that's the case, then:

    variable = identifier - keyword;

will match "internet".
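
A rough sketch of that behaviour as a toy model in plain C++, not real Spirit code. The function names are invented for illustration, and the sketch assumes that letters, digits and underscore all count as identifier characters for the word-boundary check.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Each toy parser returns the position one past the match, or `fail`.
const std::size_t fail = std::string::npos;

bool ident_char(char c) {
    return c == '_' || std::isalnum(static_cast<unsigned char>(c));
}

// identifier = char_("_a-zA-Z") >> *char_("_a-zA-Z0-9")
std::size_t identifier(const std::string& in, std::size_t pos) {
    if (pos >= in.size() || !ident_char(in[pos]) ||
        std::isdigit(static_cast<unsigned char>(in[pos])))
        return fail;
    while (pos < in.size() && ident_char(in[pos])) ++pos;
    return pos;
}

// keyword = "int" >> !identifier-character
// The not-predicate inspects the next character without consuming it.
std::size_t keyword(const std::string& in, std::size_t pos) {
    static const std::string word = "int";
    if (in.compare(pos, word.size(), word) != 0) return fail;
    const std::size_t end = pos + word.size();
    if (end < in.size() && ident_char(in[end])) return fail;  // only a prefix
    return end;
}

// variable = identifier - keyword, with the existing semantics:
// keyword is tried first, and variable fails if keyword matches.
std::size_t variable(const std::string& in, std::size_t pos) {
    if (keyword(in, pos) != fail) return fail;
    return identifier(in, pos);
}

void demo() {
    assert(variable("internet", 0) == 8u);  // keyword no longer matches the prefix
    assert(variable("int", 0) == fail);     // the keyword itself is rejected
    assert(variable("int x", 0) == fail);   // keyword followed by a blank
    assert(variable("int3", 0) == 4u);      // digits continue the word
}
```

With the whole-word check inside `keyword`, the existing "try the right-hand side first" behaviour already gives the result wanted for "internet".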

Regards,
--
Joel de Guzman
http://www.ciere.com
http://boost-spirit.com
http://www.cycfi.com/



Re: Parser operator: Difference

Orient
OK^2.
The skipper example looks complicated, but it is simply (whitespace, or /* ... */, or // ... end of line). With x3::seek it looks much neater.
My language (mentioned previously), https://github.com/tomilov/insituc, has three different kinds of reserved symbols: Lua-style keywords (local, begin, end, function, return), x87 constants (zero, one, pi, l2e, l2t, lg2, ln2), and FPU or libc-like functions (abs, chs, modf, poly, sin, exp, pow, arctg, round, max...).
Wherever an identifier (a function name or a variable name) is expected, all three kinds of reserved symbols must be rejected. Suppose I write the following (in reality by means of x3::symbols; this is pseudocode):
rule keyword = ("local" | "begin" | "end" | "function" | "return") >> !('_' | alnum);
rule intrinsic = ("exp" | "cos" | "frac" | "round" | "extract" | "scale2" ...) >> !('_' | alnum); // boundary check duplicated
rule constant = ("zero" | "one" | "pi" | "l2e" | "l2t" | "lg2" | "ln2") >> !('_' | alnum); // boundary check duplicated again
Then I get the following grammar for identifier:
rule identifier = !(keyword | intrinsic | constant) >> ('_' | alpha) >> *('_' | alnum);
At run time this performs three superfluous !('_' | alnum) checks.
The three extra expressions also add to compile time.
They make the code harder to read as well.
You could argue "You have to write (keyword >> !('_' | alnum)) in the grammar at least once anyway", but I have to write it once for every kind of reserved symbol in my grammar, whereas identifier is used both as a variable name and as a function name (and may be used in many other places).
My proposal (at least for the sake of backward compatibility) is to introduce a new operator / (if it is not already reserved for other purposes) with the semantics discussed.
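
For comparison, here is a small sketch of how the word-boundary check could be written once and shared by all three kinds of reserved symbols without a new operator. It is a toy model in plain C++, not Spirit code; `Table`, `match_table`, `at_word_end` and `reserved` are hypothetical names, and the intrinsic table lists only the names spelled out in the pseudocode above.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

const std::size_t fail = std::string::npos;
using Table = std::vector<std::string>;

bool ident_char(char c) {
    return c == '_' || std::isalnum(static_cast<unsigned char>(c));
}

// End position of the longest word in `table` that matches at `pos`, or fail.
std::size_t match_table(const Table& table, const std::string& in, std::size_t pos) {
    std::size_t best = fail;
    for (const std::string& word : table) {
        const std::size_t end = pos + word.size();
        if (in.compare(pos, word.size(), word) == 0 && (best == fail || end > best))
            best = end;
    }
    return best;
}

// The boundary check, written in exactly one place.
bool at_word_end(const std::string& in, std::size_t pos) {
    return pos >= in.size() || !ident_char(in[pos]);
}

// True if any table holds a word that occupies a whole word at `pos`.
bool reserved(const std::vector<Table>& tables, const std::string& in, std::size_t pos) {
    for (const Table& table : tables) {
        const std::size_t end = match_table(table, in, pos);
        if (end != fail && at_word_end(in, end)) return true;
    }
    return false;
}

const std::vector<Table> reserved_tables = {
    {"local", "begin", "end", "function", "return"},       // keywords
    {"zero", "one", "pi", "l2e", "l2t", "lg2", "ln2"},     // constants
    {"exp", "cos", "frac", "round", "extract", "scale2"},  // intrinsics (subset)
};

// identifier = !reserved >> ('_' | alpha) >> *('_' | alnum)
std::size_t identifier(const std::string& in, std::size_t pos) {
    if (reserved(reserved_tables, in, pos)) return fail;
    if (pos >= in.size() ||
        !(in[pos] == '_' || std::isalpha(static_cast<unsigned char>(in[pos]))))
        return fail;
    while (pos < in.size() && ident_char(in[pos])) ++pos;
    return pos;
}

void demo() {
    assert(identifier("end", 0) == fail);     // keyword
    assert(identifier("endpoint", 0) == 8u);  // merely starts with a keyword
    assert(identifier("pi", 0) == fail);      // constant
    assert(identifier("pixel", 0) == 5u);
    assert(identifier("exp_1", 0) == 5u);     // starts with an intrinsic
}
```

The tables stay free of any boundary logic, so adding a fourth kind of reserved symbol would not add another copy of the check.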

Re: Parser operator: Difference

Joel de Guzman
On 5/22/15 7:42 PM, Orient wrote:

> OK^2.
> The skipper example looks complicated, but it is simply (whitespace, or
> /* ... */, or // ... end of line). With x3::seek it looks much neater.
> My language (mentioned previously), https://github.com/tomilov/insituc,
> has three different kinds of reserved symbols: Lua-style keywords (local,
> begin, end, function, return), x87 constants (zero, one, pi, l2e, l2t, lg2,
> ln2), and FPU or libc-like functions (abs, chs, modf, poly, sin, exp, pow,
> arctg, round, max...).
> Wherever an identifier (a function name or a variable name) is expected,
> all three kinds of reserved symbols must be rejected. Suppose I write the
> following (in reality by means of x3::symbols; this is pseudocode):
> rule keyword = ("local" | "begin" | "end" | "function" | "return") >> !('_' | alnum);
> rule intrinsic = ("exp" | "cos" | "frac" | "round" | "extract" | "scale2" ...) >> !('_' | alnum); // boundary check duplicated
> rule constant = ("zero" | "one" | "pi" | "l2e" | "l2t" | "lg2" | "ln2") >> !('_' | alnum); // boundary check duplicated again
> Then I get the following grammar for identifier:
> rule identifier = !(keyword | intrinsic | constant) >> ('_' | alpha) >> *('_' | alnum);
> At run time this performs three superfluous !('_' | alnum) checks.
> The three extra expressions also add to compile time.
> They make the code harder to read as well.
> You could argue "You have to write (keyword >> !('_' | alnum)) in the
> grammar at least once anyway", but I have to write it once for every kind
> of reserved symbol in my grammar, whereas identifier is used both as a
> variable name and as a function name (and may be used in many other places).
> My proposal (at least for the sake of backward compatibility) is to
> introduce a new operator / (if it is not already reserved for other
> purposes) with the semantics discussed.

The question is: would that really remove the need for >> !('_' | alnum) ?
Wouldn't you still have them in your keyword rule in order for you to use
them independently?
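
To illustrate the concern with a toy model in plain C++ (again not Spirit code): suppose the keyword is also used on its own in a hypothetical `declaration` rule of the form keyword followed by identifier. The names `lit`, `whole_word` and `declaration` are invented for this sketch.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <functional>
#include <string>

// Toy parser type: returns the position one past the match, or `fail`.
using Parser = std::function<std::size_t(const std::string&, std::size_t)>;
const std::size_t fail = std::string::npos;

bool ident_char(char c) {
    return c == '_' || std::isalnum(static_cast<unsigned char>(c));
}

std::size_t identifier(const std::string& in, std::size_t pos) {
    if (pos >= in.size() || !ident_char(in[pos]) ||
        std::isdigit(static_cast<unsigned char>(in[pos])))
        return fail;
    while (pos < in.size() && ident_char(in[pos])) ++pos;
    return pos;
}

// Bare keyword: matches the literal with no boundary check.
Parser lit(const std::string& word) {
    return [word](const std::string& in, std::size_t pos) -> std::size_t {
        return in.compare(pos, word.size(), word) == 0 ? pos + word.size() : fail;
    };
}

// keyword >> !identifier-character
Parser whole_word(const std::string& word) {
    const Parser bare = lit(word);
    return [bare](const std::string& in, std::size_t pos) -> std::size_t {
        const std::size_t end = bare(in, pos);
        if (end == fail) return fail;
        return (end < in.size() && ident_char(in[end])) ? fail : end;
    };
}

// declaration = keyword >> identifier, skipping blanks in between.
std::size_t declaration(const Parser& keyword, const std::string& in) {
    std::size_t pos = keyword(in, 0);
    if (pos == fail) return fail;
    while (pos < in.size() && std::isspace(static_cast<unsigned char>(in[pos]))) ++pos;
    return identifier(in, pos);
}

void demo() {
    assert(declaration(lit("int"), "int x") == 5u);
    assert(declaration(whole_word("int"), "int x") == 5u);
    assert(declaration(lit("int"), "internet") == 8u);           // misread as "int ernet"
    assert(declaration(whole_word("int"), "internet") == fail);  // correctly rejected
}
```

In this model the bare keyword misparses "internet" wherever it is used outside a difference, so the boundary check is still needed in the keyword rule itself.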

Regards,
--
Joel de Guzman
http://www.ciere.com
http://boost-spirit.com
http://www.cycfi.com/



Re: Parser operator: Difference

Michael Powell-2
On Fri, May 22, 2015 at 9:07 AM, Joel de Guzman <[hidden email]> wrote:

> The question is: would that really remove the need for >> !('_' | alnum) ?
> Wouldn't you still have them in your keyword rule in order for you to use
> them independently?

I don't see them as a hindrance, but rather as an implementation
detail. True, not part of the grammar, per se. But it is something
that Spirit needs to help it discern keyword from non-keyword.

If you really needed to, tuck such things away in a separate rule,
superfluous_non_keyword_phrase, or some such.

That doesn't justify changing the semantics for everyone else, IMO.

That's my two cents.


Re: Parser operator: Difference

Joel de Guzman
On 5/22/15 9:23 PM, Michael Powell wrote:

> I don't see them as a hindrance, but rather as an implementation
> detail. True, not part of the grammar, per se. But it is something
> that Spirit needs to help it discern keyword from non-keyword.
>
> If you really needed to, tuck such things away in a separate rule,
> superfluous_non_keyword_phrase, or some such.
>
> That doesn't justify changing the semantics for everyone else, IMO.
>
> That's my two cents.

Well, to be fair, Orient is not suggesting changing the existing semantics;
instead, he's proposing a new operator / with these semantics.


Regards,
--
Joel de Guzman
http://www.ciere.com
http://boost-spirit.com
http://www.cycfi.com/

