Python parser

Python parser

Vygintas Daugmaudis
Hello list,

First, I want to congratulate the whole team on the very fine job
you have done with the Boost.Spirit library. It was a breeze to use,
and the debug facilities are quite nice as well.

I also attach a Python grammar for Spirit; it is my pleasure to donate
it to the application repository, where it may be of some interest.
Some time ago I was looking for something similar and couldn't find
it, so we rolled our own parser. That was not too complicated, since
the Python team publishes its EBNF (although the grammars differ a
little between Python versions; the most notable differences are the
keyword list in 2.6 and previous versions, and the old_expression and
old_lambda support, which current parsers still have to include, but
that crud -- hopefully -- will be removed later).

The grammar was tested on the CMS Offline configuration system, which
consists of ~6000 Python files of all levels of complexity -- from
simple few-liners to quite complicated parsers. The biggest
difference between the attached parser and the native one
(libpython2.5) is that, for some curious reason, the native parser
accepts statements like this:

a-5 = foo()

which, of course, is not correct (assignment to an arithmetic
expression). The attached parser, on the other hand, correctly rejects
such constructs. However, the indentation support in our grammar
is rather minimal -- we check just enough to parse ambiguous
statements (the try-finally family), but no more.

Another thing that could be useful on its own is the filter_eol_d
facility (it could easily be generalized), which transparently filters
eols but neither inhibits nor disturbs the actual skipper.
This is exactly what I found missing in Spirit -- some kind of
localized skipping directive. It is especially handy when, as in our
case, eol is considered part of whitespace in some contexts, while in
others eols have semantic meaning and cannot be skipped.

That's about it, I suppose. Thank you again for Spirit. The
library is truly wonderful, even if the documentation is a little bit
lacking at times. :-)

Best regards,
Justinas Vygintas D.

------------------------------------------------------------------------------
This SF.net email is sponsored by:
SourceForge Community
SourceForge wants to tell your story.
http://p.sf.net/sfu/sf-spreadtheword
_______________________________________________
Spirit-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/spirit-devel

pyparser.tar.bz2 (16K) Download Attachment

Re: Python parser

Andy Elvey
Justinas V.D. wrote:

> [snip]
Awesome!  Good stuff - good on you, Justinas!   :)
- Andy




Re: Python parser

OvermindDL1
On Thu, Jan 22, 2009 at 12:01 AM, Andy Elvey <[hidden email]> wrote:

> Justinas V.D. wrote:
>> [snip]
> Awesome!  Good stuff - good on you, Justinas!   :)
> - Andy

Very nice.  I was always curious how someone would implement a
whitespace-sensitive language in Spirit (I had ideas, just never
tested them).  :)


Re: Python parser

Joel de Guzman-2
In reply to this post by Vygintas Daugmaudis
Justinas V.D. wrote:
> Hello list,
>
> At first, I want to congratulate the whole team on the very fine job
> that you have done with Boost.Spirit library. This was a breeze to use
> and debug facilities are quite nice as well.

Thank you! I wish you could write it in Spirit2! :-)

> I also attach Python's grammar for Spirit; it is my pleasure to donate
> it to the application repository, and this may be of some interest.

[snip]

>
> The grammar was tested on CMS Offline configuration system, which
> consists of ~6000 Python files of all levels of complexity -- from the
> most simple few liners to quite complicated parsers.

That's cool! Well tested.

[snip]

> constructs. However, in our grammar Python's indentation support is
> rather minimalistic -- we check just enough to parse ambivalent
> statements (try-finally family), but not more.

How do you do it?

> Another thing that could be useful on its own is filter_eol_d facility
> (could be easily generalized), which transparently filters eols, but at
> the same time it does not inhibit nor disturbs the actual skipper. This
> is exactly what I found missing in Spirit -- some kind of localized
> skipping directive; this is especially handy when in some contexts eol

You're not the first to ask.

> -- as in our case -- is considered to be a part of whitespace, and in
> others -- eols have semantic meaning and cannot be skipped.

But can't you rig your space skipper to enable/disable skipping of
eols?

> That's about it, I suppose. Thank you again for the Spirit. The library
> is truly wonderful, even if documentation is a little bit lacking at
> times. :-)

Definitely. Do you have a URL I can use? Better yet, please
provide me with some info:

* Application Name
* Link
* Author (contact info, e.g. email)
* Short info

Regards,
--
Joel de Guzman
http://www.boostpro.com
http://spirit.sf.net



Re: Python parser

Vygintas Daugmaudis
Joel de Guzman wrote:
> Justinas V.D. wrote:
>> Hello list,
>>
>> At first, I want to congratulate the whole team on the very fine job
>> that you have done with Boost.Spirit library. This was a breeze to use
>> and debug facilities are quite nice as well.
>
> Thank you! I wish you can write it in Spirit2! :-)

I reckon this task is quite doable once Spirit2 is in Boost...
I am looking forward to that. :-)


[snip]


> That's cool! Well tested.

Seeing how the mass media reacted to the LHC opening, it would be a
wee bit irresponsible to give them any more clues that we don't know
what we are doing here. :-D

>
>> constructs. However, in our grammar Python's indentation support is
>> rather minimalistic -- we check just enough to parse ambivalent
>> statements (try-finally family), but not more.
>
> How do you do it?

The solution is not really elegant. We simply take the position once
the keyword parser completes its task, and go back until we encounter
an eol. This distance minus the keyword length is our indentation.
Now, if we describe this as a parser that consumes nothing and
simply checks whether subsequent keywords, e.g. "finally" or "except",
have the same indentation, we can "inject" it into our grammar in order
to terminate the recursive descent properly.
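
The backtracking trick described above can be sketched outside Spirit in a few lines. This is a minimal standard-library illustration, not the pyparser code: `keyword_indent` and `same_indent` are hypothetical names, positions are plain string offsets rather than scanner iterators, and it assumes that only blanks precede the keyword on its line.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Column of a keyword, measured as the post describes: start just past
// the keyword, walk back to the previous end-of-line (or the start of
// the input), and subtract the keyword's length.
// Hypothetical helper; a tab counts as a single column.
std::size_t keyword_indent(const std::string& src,
                           std::size_t pos_after_keyword,
                           std::size_t keyword_len)
{
    assert(pos_after_keyword <= src.size());
    assert(keyword_len <= pos_after_keyword);
    std::size_t line_start = pos_after_keyword;
    while (line_start > 0 && src[line_start - 1] != '\n')
        --line_start;
    return (pos_after_keyword - line_start) - keyword_len;
}

// Zero-width check: do two keywords (each given by the offset just past
// it and its length) sit at the same indentation? No input is consumed.
bool same_indent(const std::string& src,
                 std::size_t end_a, std::size_t len_a,
                 std::size_t end_b, std::size_t len_b)
{
    return keyword_indent(src, end_a, len_a)
        == keyword_indent(src, end_b, len_b);
}
```

In a grammar, a check like `same_indent` would sit in front of the "finally" or "except" alternative, so that a clause belonging to an outer "try" is not swallowed by an inner one.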


>
>> Another thing that could be useful on its own is filter_eol_d facility
>> (could be easily generalized), which transparently filters eols, but at
>> the same time it does not inhibit nor disturbs the actual skipper. This
>> is exactly what I found missing in Spirit -- some kind of localized
>> skipping directive; this is especially handy when in some contexts eol
>
> You're not the first to ask.

So there must be a need, then.

>
>> -- as in our case -- is considered to be a part of whitespace, and in
>> others -- eols have semantic meaning and cannot be skipped.
>
> But can't you rig your space skipper to enable/disable skipping of
> eols?

You know, I tried both. The variant with the space skipper has some
appeal, especially considering the "scanner business", but this was
heavily outweighed by the complexity it introduced into the skipper.
In our case, the skipper would have to become parens-aware and hold
additional state and rules governing its behavior. It is simply
expensive to check in which part of the grammar we are skipping,
instead of simply marking (statically) that at such-and-such a point we
want to filter out this particular symbol.



> Definitely. Do you have a URL I can use? Better yet, please
> provide me with some info:
>
> * Application Name
Pyparser ought to be fine, I suppose.

> * Link
Could you please host it in the Spirit repository? The only other thing
I could offer would be to keep it on my university web account, but
those are prone to disappear. How many times have we seen
blablabla.edu/~studname/something.bam unreachable simply
because ~studname completed his or her studies? :-)

> * Author (contact info, e.g. email)
Justinas Vygintas Daugmaudis  [hidden email]

> * Short info
A tool that, given a directory, recursively iterates through it and
parses all Python files found therein. Parsing consists of two
passes: in the first pass we parse a Python file with the
Boost.Spirit-based parser, and in the second pass we parse the file
with the native Python parser provided by libpython2.x. The pyparser
utility outputs a summary of agreements and disagreements between the
Boost.Spirit-based parser and the native one. Ideally, there ought to
be no disagreements.
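
The two-pass comparison can be pictured with a small sketch. Everything here is an assumption for illustration: `Source`, `Summary`, `Accepts` and `compare_parsers` are hypothetical names, the two parsers are passed in as plain callbacks, and the recursive directory walk is left out so the block stays standard-library only. It is not taken from the attached archive.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// One input file, already read into memory.
struct Source {
    std::string name;
    std::string text;
};

// Outcome of running both parsers over a set of sources.
struct Summary {
    std::size_t agreements = 0;
    std::vector<std::string> disagreements;  // names where the verdicts differ
};

// A parser is modelled as "does it accept this text?".
using Accepts = std::function<bool(const std::string&)>;

// Run both parsers on every source and tally where they agree.
Summary compare_parsers(const std::vector<Source>& sources,
                        const Accepts& first_parser,
                        const Accepts& second_parser)
{
    Summary summary;
    for (const Source& src : sources) {
        const bool first = first_parser(src.text);
        const bool second = second_parser(src.text);
        if (first == second)
            ++summary.agreements;
        else
            summary.disagreements.push_back(src.name);
    }
    return summary;
}
```

With a strict callback that rejects `a-5 = foo()` and a lax one that accepts it, such a file would show up in `disagreements`, which is the kind of difference the summary is meant to surface.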

Regards,
Justinas V.D.


Re: Python parser

cppljevans
On 01/23/09 12:37, Justinas V.D. wrote:
> Joel de Guzman wrote:
>> Justinas V.D. wrote:
[snip]

>>> constructs. However, in our grammar Python's indentation support is
>>> rather minimalistic -- we check just enough to parse ambivalent
>>> statements (try-finally family), but not more.
>> How do you do it?
>
> The solution is not really elegant. We simply take the position once
> keyword parser completes its task, and go back until we encounter eol.
> This distance minus keyword length is our indentation.
> Now, if we describe this as a parser, which consumes nothing and
> simply checks if consequent keywords i.e., "finally", "except" have
> the same indentation, we can "inject" it in our grammar in order to
> terminate recursive descend properly.
>
>
>>> Another thing that could be useful on its own is filter_eol_d facility
>>> (could be easily generalized), which transparently filters eols, but at
>>> the same time it does not inhibit nor disturbs the actual skipper. This
>>> is exactly what I found missing in Spirit -- some kind of localized
>>> skipping directive; this is especially handy when in some contexts eol
>> You're not the first to ask.

Would boost::iostreams filters be any help here?  There's a filter which
counts characters and lines:

http://www.boost.org/doc/libs/1_37_0/libs/iostreams/doc/classes/counter.html#reference

I'd think it would be easy to modify to count the number of characters
after the end-of-line until a non-blank character is encountered.
Wouldn't that be the current indentation?
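
The counting idea can be tried without any filter machinery. The sketch below is a plain function over a string, not a boost::iostreams filter; `line_indents` is a hypothetical name, a tab counts as one column, and only '\n' is treated as end-of-line.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Indentation of every non-blank line: the number of blank characters
// (space or tab) between an end-of-line and the first non-blank
// character. Lines that contain only blanks are skipped.
std::vector<std::size_t> line_indents(const std::string& src)
{
    std::vector<std::size_t> indents;
    std::size_t count = 0;
    bool in_leading_blanks = true;
    for (char c : src) {
        if (c == '\n') {
            count = 0;
            in_leading_blanks = true;
        } else if (in_leading_blanks) {
            if (c == ' ' || c == '\t') {
                ++count;
            } else {
                indents.push_back(count);   // first non-blank character
                in_leading_blanks = false;
            }
        }
    }
    return indents;
}
```

A streaming filter would keep the same two pieces of state (`count` and the "still in leading blanks" flag) between calls instead of scanning a whole string at once.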



Re: Python parser

OvermindDL1
On Fri, Jan 23, 2009 at 12:41 PM, Larry Evans <[hidden email]> wrote:

> On 01/23/09 12:37, Justinas V.D. wrote:
>> Joel de Guzman wrote:
>>> Justinas V.D. wrote:
> [snip]
>>>> constructs. However, in our grammar Python's indentation support is
>>>> rather minimalistic -- we check just enough to parse ambivalent
>>>> statements (try-finally family), but not more.
>>> How do you do it?
>>
>> The solution is not really elegant. We simply take the position once
>> keyword parser completes its task, and go back until we encounter eol.
>> This distance minus keyword length is our indentation.
>> Now, if we describe this as a parser, which consumes nothing and
>> simply checks if consequent keywords i.e., "finally", "except" have
>> the same indentation, we can "inject" it in our grammar in order to
>> terminate recursive descend properly.
>>
>>
>>>> Another thing that could be useful on its own is filter_eol_d facility
>>>> (could be easily generalized), which transparently filters eols, but at
>>>> the same time it does not inhibit nor disturbs the actual skipper. This
>>>> is exactly what I found missing in Spirit -- some kind of localized
>>>> skipping directive; this is especially handy when in some contexts eol
>>> You're not the first to ask.
>
> Would boost::iostreams filters be any help here.  There's a filter which
> counts characters and lines:
>
> http://www.boost.org/doc/libs/1_37_0/libs/iostreams/doc/classes/counter.html#reference
>
> I'd think it would be easy to modify to count the number of characters
> after the end-of-line until a non-blank character is encountered.
> Wouldn't that be the current indentation?

I had thought about parsing whitespace, and in Classic Spirit I do not
see an 'easy' way, only something like what you did.  However, thinking
about how to do it in Spirit2x, I see it being a great deal easier,
since it can be a direct part of the grammar.  Perhaps I should try
making one sometime...


[spirit2] rule-skippers [Re: Python parser]

Joel de Guzman-2
In reply to this post by Vygintas Daugmaudis
Justinas V.D. wrote:

>>> Another thing that could be useful on its own is filter_eol_d facility
>>> (could be easily generalized), which transparently filters eols, but at
>>> the same time it does not inhibit nor disturbs the actual skipper. This
>>> is exactly what I found missing in Spirit -- some kind of localized
>>> skipping directive; this is especially handy when in some contexts eol
>> You're not the first to ask.
>
> So there must be need, then.

Here's a thought...

1) In Spirit2, the rule knows the skipper. As of now, it is a compile
error if there is a mismatch (i.e. if the supplied skipper is not
compatible with the rule's declared skipper type).

Now... most skippers are default constructible. Typically, these
are quite simple. What if, instead of a compile error, we default
construct a skipper of the declared type and use that? Can anyone see
danger in such a behavior?

2) Another plausible strategy is to allow the rule to take in a
skipper. Example:

    r.skip_using(blank);

That will override the passed-in skipper.

3) Of course, yet another (orthogonal) solution is by a directive:

     skip_using(blank)[p]

I'd like to hear what you guys think. All this is doable.
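
To make the directive idea (option 3) concrete, here is a toy model in plain C++. It is not Spirit code and does not reflect Spirit's internals: `Skipper`, `Parser`, `chars`, `lit` and `seq` are hypothetical stand-ins, and `skip_using` here is an ordinary function that mimics what the proposed directive would do, namely run the enclosed parser with a different skipper than the one passed in.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// A skipper advances past characters it considers insignificant.
using Skipper = std::function<std::size_t(const std::string&, std::size_t)>;
// A parser gets the input, a position to advance, and the active skipper.
using Parser =
    std::function<bool(const std::string&, std::size_t&, const Skipper&)>;

// Skipper that skips any run of characters from `set`.
Skipper chars(const std::string& set)
{
    return [set](const std::string& in, std::size_t pos) {
        while (pos < in.size() && set.find(in[pos]) != std::string::npos)
            ++pos;
        return pos;
    };
}

// Match a literal after letting the active skipper run first.
Parser lit(const std::string& text)
{
    return [text](const std::string& in, std::size_t& pos, const Skipper& skip) {
        const std::size_t p = skip(in, pos);
        if (in.compare(p, text.size(), text) != 0) return false;
        pos = p + text.size();
        return true;
    };
}

// Sequence: both parsers must match; the position is restored on failure.
Parser seq(Parser a, Parser b)
{
    return [a, b](const std::string& in, std::size_t& pos, const Skipper& skip) {
        const std::size_t save = pos;
        if (a(in, pos, skip) && b(in, pos, skip)) return true;
        pos = save;
        return false;
    };
}

// Toy counterpart of the directive: ignore the passed-in skipper and
// run `p` with `local` instead.
Parser skip_using(Skipper local, Parser p)
{
    return [local, p](const std::string& in, std::size_t& pos, const Skipper&) {
        return p(in, pos, local);
    };
}
```

With an outer skipper `chars(" \t\n")` and a local one `chars(" \t")`, the parser `seq(lit("a"), skip_using(chars(" \t"), lit("b")))` accepts `"a \t b"` but rejects `"a\nb"`, because the eol is no longer skipped inside the wrapped part.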

Regards,
--
Joel de Guzman
http://www.boostpro.com
http://spirit.sf.net



Re: [spirit2] rule-skippers [Re: Python parser]

Hartmut Kaiser
> >>> Another thing that could be useful on its own is filter_eol_d facility
> >>> (could be easily generalized), which transparently filters eols, but at
> >>> the same time it does not inhibit nor disturbs the actual skipper. This
> >>> is exactly what I found missing in Spirit -- some kind of localized
> >>> skipping directive; this is especially handy when in some contexts eol
> >> You're not the first to ask.
> >
> > So there must be need, then.
>
> Here's a thought...
>
> 1) In Spirit2, the rule knows the skipper. As of now, it is a compile
> error if there is a mismatch (i.e if the supplied skipper is not
> compatible with the rule's declared skipper type.
>
> Now... most skippers are default constructable. Typically, these
> are quite simple. What if instead of compile error, we default
> construct a skipper type and use that? Can anyone see danger in
> such a behavior?

Implicit behavior is not good. It would be very difficult to debug if the
wrong skipper type is supplied.

> 2) Another plausible strategy is to allow the rule to take in a
> skipper. Example:
>
>     r.skip_using(blank);
>
> That will override the passed-in skipper.

Better, but syntax-wise not preferable, IMHO.

> 3) Of course, yet another (orthogonal) solution is by a directive:
>
>      skip_using(blank)[p]

That's the best. It's explicit and matches the syntax as used otherwise (see
verbatim[]).

I'd prefer skip[] and skip(...)[], though. BTW, this version has the
additional advantage of being usable as the direct opposite to verbatim[] as
well (without argument). And think about the use case where you want to
temporarily enable skipping inside a verbatim[] directive:

    verbatim[a >> skip(blank)[b >> c] >> d]

Regards Hartmut




Re: [spirit2] rule-skippers [Re: Python parser]

CARL BARRON-3
In reply to this post by Joel de Guzman-2

On Jan 23, 2009, at 8:24 PM, Joel de Guzman wrote:

> Justinas V.D. wrote:
>
>>>> Another thing that could be useful on its own is filter_eol_d facility
>>>> (could be easily generalized), which transparently filters eols, but at
>>>> the same time it does not inhibit nor disturbs the actual skipper. This
>>>> is exactly what I found missing in Spirit -- some kind of localized
>>>> skipping directive; this is especially handy when in some contexts eol
>>> You're not the first to ask.
>>
>> So there must be need, then.
>
> Here's a thought...
>
> 1) In Spirit2, the rule knows the skipper. As of now, it is a compile
> error if there is a mismatch (i.e if the supplied skipper is not
> compatible with the rule's declared skipper type.
>
> Now... most skippers are default constructable. Typically, these
> are quite simple. What if instead of compile error, we default
> construct a skipper type and use that? Can anyone see danger in
> such a behavior?
Skippers with non-default ctors are possible.

>
> 2) Another plausible strategy is to allow the rule to take in a  
> skipper. Example:
>
>   r.skip_using(blank);
>
> That will override the passed-in skipper.
>
> 3) Of course, yet another (orthogonal) solution is by a directive:
>
>    skip_using(blank)[p]
> I'd like to hear what you guys think. All this is doable.
As for 2 and 3, I say why not both: 3 allows it in the middle of a
rule, and 2 makes it cleaner if it is always the same skipper for a
rule. It seems like the 'engine' for 2 and 3 is essentially the same,
so I see no problem with having both 2 and 3.



Re: [spirit2] rule-skippers [Re: Python parser]

Joel de Guzman-2
Carl Barron wrote:

>> I'd like to hear what you guys think. All this is doable.
> As for 2 and 3 I say why not both as 3 allows it in the middle of a rule,
> and 2 makes it cleaner if it is always the same skipper for a rule.
> Seems like the 'engine' for 2 and 3 is essentially the same, so I see no
> problem with both 2 and 3.

I agree with Carl. 2 and 3 are both good solutions. I also
agree with Hartmut that the name should be "skip".

Regards,
--
Joel de Guzman
http://www.boostpro.com
http://spirit.sf.net



Re: [spirit2] rule-skippers [Re: Python parser]

Joel de Guzman-2
Joel de Guzman wrote:

> Carl Barron wrote:
>
>>> I'd like to hear what you guys think. All this is doable.
>> As for 2 and 3 I say why not both as 3 allows it in the middle of a rule,
>> and 2 makes it cleaner if it is always the same skipper for a rule.
>> Seems like the 'engine' for 2 and 3 is essentially the same, so I see no
>> problem with both 2 and 3.
>
> I agree with Carl. 2 and 3 are both good solutions. I also
> agree with Hartmut that the name should be "skip".

Oh, and I forgot to mention that the 2nd solution has an important
property: you can change the skip-parser dynamically at
runtime.
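
The runtime property can be shown with a toy "rule" that owns its skipper. This is only an illustration under assumed names: `WordList` is a hypothetical class, not Spirit's rule, and `skip_using` here is an ordinary member function modelled on the `r.skip_using(blank)` spelling from earlier in the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>

// A skipper advances past characters it considers insignificant.
using Skipper = std::function<std::size_t(const std::string&, std::size_t)>;

// Skipper that skips any run of characters from `set`.
Skipper chars(const std::string& set)
{
    return [set](const std::string& in, std::size_t pos) {
        while (pos < in.size() && set.find(in[pos]) != std::string::npos)
            ++pos;
        return pos;
    };
}

// Toy rule: counts lowercase words, running its own skipper before each.
class WordList {
public:
    explicit WordList(Skipper s) : skip_(std::move(s)) {}

    // Replace the skipper; takes effect on the next parse() call.
    void skip_using(Skipper s) { skip_ = std::move(s); }

    // Number of words matched before the first unskippable character.
    std::size_t parse(const std::string& in) const
    {
        std::size_t pos = 0, words = 0;
        for (;;) {
            pos = skip_(in, pos);
            const std::size_t start = pos;
            while (pos < in.size() && in[pos] >= 'a' && in[pos] <= 'z')
                ++pos;
            if (pos == start) return words;   // no word here: stop
            ++words;
        }
    }

private:
    Skipper skip_;
};
```

Constructed with `chars(" \t")`, the rule stops at the first eol; after `skip_using(chars(" \t\n"))` the same object reads across line breaks, with no change to the rule's type.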

Regards,
--
Joel de Guzman
http://www.boostpro.com
http://spirit.sf.net



Re: [spirit2] rule-skippers [Re: Python parser]

Vygintas Daugmaudis
Joel de Guzman wrote:

> Joel de Guzman wrote:
>> Carl Barron wrote:
>>
>>>> I'd like to hear what you guys think. All this is doable.
>>> As for 2 and 3 I say why not both as 3 allows it in the middle of a rule,
>>> and 2 makes it cleaner if it is always the same skipper for a rule.
>>> Seems like the 'engine' for 2 and 3 is essentially the same, so I see no
>>> problem with both 2 and 3.
>> I agree with Carl. 2 and 3 are both good solutions. I also
>> agree with Hartmut that the name should be "skip".

I concur that skip[P] and skip(P_0)[P_1] would be more in the spirit
of Spirit. :-)

But the name should probably be skip_d, because it would go well with
the other directives. Besides, "skip" is very often used in other
contexts, e.g., to name a skip parser, so I guess that skip_d
would collide less with user code as well.


In pyparser the construct is called filter_eol_d[P] (I imagine that
filter_eol_d[P] would be equivalent to skip_d(eol_p)[P]) and is used
like this:

    ch_p('(') >> filter_eol_d[argument_list >> ')'];

This means that white space and comment skipping is performed as would
be expected, but filter_eol_d just filters out any excess eols within
parens.

Imagine something akin to this:

    foo( arg0,
    arg1              =           bar,
    arg2, # comment
    # comment
    # comment
    # comment
    arg3[some_expression]
    )
    EOL -- Marks End of Statement; not to be filtered!
    another_statement
    EOL

Okay, the space skipper is simple and does exactly what it looks like
it does: it skips spaces and comments, so foo, after the
transformation, would look like this:

    foo(arg0,
    arg1=bar,
    arg2,


    arg3[some_expression]
    )

Additional transformation from filter_eol_d[argument_list >> ')'] (or
skip_d(eol_p)[argument_list >> ')']) would make it look like this:

    foo(arg0,arg1=bar,arg2,arg3[some_expression])

Now argument_list can parse

    arg0,arg1=bar,arg2,arg3[some_expression]

easily.
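
The two transformations above can be imitated on plain text to check the expected result. The sketch below is only a text-level model under stated assumptions: in Spirit the skipper runs between tokens rather than rewriting the input, `filter_eols_in_parens` is a hypothetical name, and string literals and brackets other than '(' and ')' are not handled.

```cpp
#include <cassert>
#include <string>

// Text-level model of the two steps from the post:
//  1. the ordinary skipper drops blanks and '#' comments everywhere;
//  2. eols are additionally dropped while inside parentheses, but kept
//     elsewhere, where they mark the end of a statement.
std::string filter_eols_in_parens(const std::string& src)
{
    std::string out;
    int depth = 0;            // current parenthesis nesting
    bool in_comment = false;
    for (char c : src) {
        if (in_comment) {
            if (c != '\n') continue;   // comment text is skipped
            in_comment = false;        // the eol itself is handled below
        }
        if (c == '#') { in_comment = true; continue; }
        if (c == ' ' || c == '\t') continue;
        if (c == '\n') {
            if (depth == 0) out += c;  // outside parens: keep the eol
            continue;                  // inside parens: filter it out
        }
        if (c == '(') ++depth;
        else if (c == ')' && depth > 0) --depth;
        out += c;
    }
    return out;
}
```

Fed the `foo( arg0, ...` example followed by `another_statement`, it yields the single-line call and keeps the eol after the closing paren, so the statement boundary survives.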


This is certainly less complicated than some convoluted space skipper
that adapts dynamically. Besides, in order to change that skipper's
behavior, we would have to inject some tags into the grammar anyway,
e.g., "from now on the space skipper has to skip eols" and "from now
on the space skipper must NOT skip eols".



>
> Oh, and I forgot to mention that the 2nd solution has an important
> property that you can change the skip-parser dynamically at
> runtime.

Yes, that is nice, too.

Regards,
Justinas V.D.


Re: [spirit2] rule-skippers [Re: Python parser]

Francois Barel
In reply to this post by Joel de Guzman-2
Joel de Guzman wrote:

> Here's a thought...
>
> 1) In Spirit2, the rule knows the skipper. As of now, it is a compile
> error if there is a mismatch (i.e., if the supplied skipper is not
> compatible with the rule's declared skipper type).
>
> Now... most skippers are default-constructible. Typically, these
> are quite simple. What if, instead of a compile error, we
> default-construct a skipper type and use that? Can anyone see danger
> in such a behavior?

No opinion either way here...
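
For readers trying to picture option 1, here is a toy sketch in plain
C++ of what "default-construct on mismatch" could look like. This is not
Spirit code: select_skipper, space_skipper and blank_skipper are
hypothetical names invented for the illustration, and the id member only
stands in for whatever state a real skipper might carry.

```cpp
#include <cassert>
#include <type_traits>

// Two unrelated, hypothetical skipper types; "id" stands in for state.
struct space_skipper {
    explicit space_skipper(int id = 0) : id(id) {}
    int id;
};
struct blank_skipper {
    explicit blank_skipper(int id = 0) : id(id) {}
    int id;
};

// Compatible case: use the skipper that was passed in.
template <typename Declared, typename Supplied>
Declared select_impl(Supplied const& supplied, std::true_type) {
    return supplied;
}

// Mismatch: instead of a compile error, default-construct the declared type.
template <typename Declared, typename Supplied>
Declared select_impl(Supplied const&, std::false_type) {
    return Declared();
}

// Picks one of the two branches above at compile time.
template <typename Declared, typename Supplied>
Declared select_skipper(Supplied const& supplied) {
    return select_impl<Declared>(
        supplied, std::is_convertible<Supplied, Declared>());
}
```

In this sketch, one possible danger is that a mismatched skipper is
replaced silently: any state held by the supplied skipper never reaches
the rule, and nothing tells the user that a different skipper is in use.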

> 2) Another plausible strategy is to allow the rule to take in a
> skipper. Example:
>
>    r.skip_using(blank);
>
> That will override the passed-in skipper.

Yes, I like the idea but not the syntax, see below. (And BTW wouldn't
this syntax be restricted to rules -- whereas the directive syntax
below would work on any parser, no?)

> 3) Of course, yet another (orthogonal) solution is by a directive:
>
>     skip_using(blank)[p]
>
> I'd like to hear what you guys think. All this is doable.

Yes, I like this one (name-wise, I like just "skip" as Hartmut suggested).
Note that IMO it also requires a complementary pseudo-parser which
gets (a reference to) the current skipper (in case it is not
default-constructible / has some context), so that when you disable
the skipper with a lexeme[...] directive, you can re-enable it (the
same instance, not a new one) inside with skip[...].

I had similar directives (named "use_skipper" and "get_skipper") in
Spirit2 and Spirit2X, with the following syntax:
    get_skipper[ _a = _1 ] >> lexeme[ x >> use_skipper(_a)[ y ] >> z ]
for a whitespace-sensitive language. (There is probably a better
syntax to be found for the "get_skipper" part, though...)

Regards,
François


Re: [spirit2] rule-skippers [Re: Python parser]

Francois Barel
Francois Barel wrote:

>> 2) Another plausible strategy is to allow the rule to take in a
>> skipper. Example:
>>
>>    r.skip_using(blank);
>>
>> That will override the passed-in skipper.
>
> Yes, I like the idea but not the syntax, see below. (And BTW wouldn't
> this syntax be restricted to rules -- whereas the directive syntax
> below would work on any parser, no?)

Please ignore this bit ^^, I misinterpreted solution 2... I got it
when I read the exchange between Carl and Joel in the other part of
this thread.

This offers an alternative to solution 3, for different use cases --
it is indeed interesting too.

Regards,
François


Re: [spirit2] rule-skippers [Re: Python parser]

OvermindDL1
On Sat, Jan 24, 2009 at 5:24 AM, Francois Barel <[hidden email]> wrote:

> Francois Barel wrote:
>>> 2) Another plausible strategy is to allow the rule to take in a
>>> skipper. Example:
>>>
>>>    r.skip_using(blank);
>>>
>>> That will override the passed-in skipper.
>>
>> Yes, I like the idea but not the syntax, see below. (And BTW wouldn't
>> this syntax be restricted to rules -- whereas the directive syntax
>> below would work on any parser, no?)
>
> Please ignore this bit ^^, I misinterpreted solution 2... I got it
> when I read the exchange between Carl and Joel in the other part of
> this thread.
>
> This offers an alternative to solution 3, for different use cases --
> it is indeed interesting too.
>
> Regards,
> François

I have my skipper do other things, though, such as counting line numbers
and so forth, so as long as I can still use non-default-constructible
ones, I am up for whatever you choose.
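
To make the concern concrete, here is a minimal sketch in plain C++ of a
skipper that counts lines and therefore cannot be default-constructed.
It is not my actual skipper and it does not use Spirit's skipper
interface; line_counting_skipper and its call signature are assumptions
made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

// Hypothetical skipper that reports into a counter owned by the caller.
// It holds a reference, so it has no default constructor.
class line_counting_skipper {
public:
    explicit line_counting_skipper(std::size_t& line_count)
        : lines_(line_count) {}

    // Advances pos past whitespace, counting every '\n' that is skipped.
    void operator()(const std::string& input, std::size_t& pos) const {
        while (pos < input.size()) {
            char c = input[pos];
            if (c == '\n') {
                ++lines_;
            } else if (c != ' ' && c != '\t' && c != '\r') {
                break;              // first non-whitespace character
            }
            ++pos;
        }
    }

private:
    std::size_t& lines_;            // counter lives outside the skipper
};

// A "default-construct the skipper" strategy could not create this type.
static_assert(!std::is_default_constructible<line_counting_skipper>::value,
              "needs a counter to refer to");
```

Any scheme that creates a fresh skipper behind the user's back would
either fail to compile with such a type or lose the shared counter.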


Re: [spirit2] rule-skippers [Re: Python parser]

OvermindDL1
In reply to this post by Vygintas Daugmaudis
On Sat, Jan 24, 2009 at 4:36 AM, Justinas V.D. <[hidden email]> wrote:

> Joel de Guzman wrote:
>> Joel de Guzman wrote:
>>> Carl Barron wrote:
>>>
>>>>> I'd like to hear what you guys think. All this is doable.
>>>>    As for 2 and 3 I say why not both as 3 allows it in the middle of
>>>> a rule, and 2 makes it cleaner if it is always the same skipper for
>>>> a rule. Seems like the 'engine' for 2 and 3 is essentially the same,
>>>> so I see no problem with both 2 and 3.
>>> I agree with Carl. 2 and 3 are both good solutions. I also
>>> agree with Hartmut that the name should be "skip".
>
> I concur that skip[P] and skip(P_0)[P_1] would  be more in the spirit
> of Spirit. :-)
>
> But the name should probably be skip_d, because it would go well with
> other directives and, on the other hand, "skip" is awfully often used
> in other contexts, e.g., to name a skip parser, so I guess that skip_d
> would collide less with user code as well.
>
>
> In pyparser the construct is called filter_eol_d[P] (I imagine that
> filter_eol_d[P] would be equivalent to skip_d(eol_p)[P]) and is used
> like this:
>    ch_p('(') >> filter_eol_d[argument_list >> ')'];
>
> This means that white space and comment skipping is performed as would
> be expected, but filter_eol_d just filters out any excess eols within
> parens.
>
> Imagine something akin to this:
>
>    foo( arg0,
>    arg1              =           bar,
>    arg2, # comment
>    # comment
>    # comment
>    # comment
>    arg3[some_expression]
>    )
>    EOL -- Marks End of Statement; not to be filtered!
>    another_statement
>    EOL
>
> Okay, the space skipper is simple and easy and does exactly what it looks
> like it does: it skips spaces and comments, so foo, after the
> transformation, would look like this:
>
>    foo(arg0,
>    arg1=bar,
>    arg2,
>
>
>    arg3[some_expression]
>    )
>
> Additional transformation from filter_eol_d[argument_list >> ')'] (or
> skip_d(eol_p)[argument_list >> ')']) would make it look like this:
>    foo(arg0,arg1=bar,arg2,arg3[some_expression])
>
> Now argument_list can parse
>
>    arg0,arg1=bar,arg2,arg3[some_expression]
>
> easily.
>
>
> This is certainly less complicated than some convoluted space skipper,
> which adapts dynamically. And, besides, in order to change the manner
> of working for that skipper, we would have to inject some tags into
> the grammar anyway, e.g., "from now on space skipper has to skip eols"
> and "from now on space skipper has NOT to skip eols".
>
>
>
>>
>> Oh, and I forgot to mention that the 2nd solution has an important
>> property that you can change the skip-parser dynamically at
>> runtime.
>
> Yes, that is nice, too.
>
> Regards,
> Justinas V.D.

Would being able to modify the skipper like this cause a speed hit,
though?  Right now, if I need to actually change my skipper (only once
thus far), I create a new parser component that just calls parse again
with a new grammar and the new rule I want, all inside the main grammar
that is already being parsed.  This incurs no real speed hit, at compile
time or otherwise, compared with just using multiple grammars anyway.


Re: [spirit2] rule-skippers [Re: Python parser]

Joel de Guzman-2
OvermindDL1 wrote:
>>
>>> Oh, and I forgot to mention that the 2nd solution has an important
>>> property that you can change the skip-parser dynamically at
>>> runtime.
>> Yes, that is nice, too.
>>
[...]
>>
>
> Would being able to modify the skipper like this cause a speed hit
> though?  Right now if I need to actually change my skipper (only once
> thus far) I just create a new thing that just calls parse again with a
> new grammar and the new rule I want, all inside the main grammar that
> is already being parsed.  This incurs no real speed hit, compile time
> or otherwise over just using multiple grammars anyway.

Implemented:

    skip(s)[p]

in Spirit2X. After thinking about it for some time, I realized that
the 2nd solution is not good. Its only advantage is that you
can supply the skipper at runtime. However, that advantage is
mitigated by the fact that the rule can only accept a skipper
type. Not good. Also, OvermindDL1 is right that there will be a
speed hit.

Regards,
--
Joel de Guzman
http://www.boostpro.com
http://spirit.sf.net



Re: [spirit2] rule-skippers [Re: Python parser]

Hartmut Kaiser
Joel,

> >>> Oh, and I forgot to mention that the 2nd solution has an important
> >>> property that you can change the skip-parser dynamically at
> >>> runtime.
> >> Yes, that is nice, too.
> >>
> [...]
> >>
> >
> > Would being able to modify the skipper like this cause a speed hit
> > though?  Right now if I need to actually change my skipper (only once
> > thus far) I just create a new thing that just calls parse again with a
> > new grammar and the new rule I want, all inside the main grammar that
> > is already being parsed.  This incurs no real speed hit, compile time
> > or otherwise over just using multiple grammars anyway.
>
> Implemented:
>
>     skip(s)[p]
>
> in Spirit2X. After thinking about it for some time, I realized that
> the 2nd solution is not good. Its only advantage is that you
> can supply the skipper at runtime. However, that advantage is
> mitigated by the fact that the rule can only accept a skipper
> type. Not good. Also, OvermindDL1 is right that there will be a
> speed hit.

Did you implement skip[] as well? I envision this as a possibility to
'reestablish' a skipper as used before:

    lexeme[a >> skip[b >> c] >> d]

If at all possible (I understand that the skipper type is lost inside
lexeme[], but perhaps it's possible to retrieve it after putting it into the
modifier list).

Regards Hartmut





Re: [spirit2] rule-skippers [Re: Python parser]

Joel de Guzman-2
Hartmut Kaiser wrote:

> Did you implement skip[] as well? I envision this as a possibility to
> 'reestablish' a skipper as used before:
>
>     lexeme[a >> skip[b >> c] >> d]
>
> If at all possible (I understand that the skipper type is lost inside
> lexeme[], but perhaps it's possible to retrieve it after putting it into the
> modifier list).

Done. It's easier than that. I just made lexeme hide the skipper
inside:

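     // Derives from unused_type, so inside lexeme[] it acts as "no
     // skipper", but it keeps a reference to the real skipper so that
     // skip[] can get the same instance back.
     template <typename Skipper>
     struct unused_skipper : unused_type
     {
         unused_skipper(Skipper const& skipper)
           : skipper(skipper) {}
         Skipper const& skipper;
     };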

Thanks for the suggestion!

Take note, though, that (as usual) lexeme and skip cannot pass
through a rule (as with all Spirit2 directives).
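
To show the hide-and-restore idea in isolation, here is a small
self-contained sketch. It only mimics the mechanism and is not the
Spirit2X source: the local unused_type is a stand-in for Spirit's type,
and blank_skipper, hide_skipper and restore_skipper are hypothetical
names chosen for this illustration.

```cpp
#include <cassert>
#include <string>

struct unused_type {};              // stand-in for Spirit's unused_type

// Same shape as the wrapper above: looks like "no skipper" to anything
// that only knows unused_type, but remembers the real one by reference.
template <typename Skipper>
struct unused_skipper : unused_type {
    explicit unused_skipper(Skipper const& skipper) : skipper(skipper) {}
    Skipper const& skipper;
};

// Hypothetical skipper type used only for this example.
struct blank_skipper {
    std::string name = "blank";
};

// What a lexeme-like directive would do on entry: hide the skipper.
template <typename Skipper>
unused_skipper<Skipper> hide_skipper(Skipper const& s) {
    return unused_skipper<Skipper>(s);
}

// What a skip[]-like directive would do inside: recover the original.
template <typename Skipper>
Skipper const& restore_skipper(unused_skipper<Skipper> const& hidden) {
    return hidden.skipper;
}
```

Because the wrapper stores a reference, the skipper recovered inside is
the very same object that was active outside, not a copy or a freshly
constructed one, which also suits skippers that carry state.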

Regards,
--
Joel de Guzman
http://www.boostpro.com
http://spirit.sf.net

