Catching up, breaking changes, Spirit2 docs & AST generator

classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

Catching up, breaking changes, Spirit2 docs & AST generator

Tobias Schwinger-2
Hi Spirit people,

it's been a while of inactivity (Boost-wise for me, that is)... So, that
is to change and I'd like to catch up with what's going on around here.

OK, bad news first: I just uploaded *breaking changes* with the overhaul
of Fusion/Functional into the trunk.

There are only two unfused.* adapter variants left (for the new
Boost.Functional/Forward utility should be used to plug two of them
together, now).
I also added a boolean non-type to 'unfused' (previously known as
'unfused_lvalue_args') to nuke out the nullary operator() overloads.
The documentation has been brought in sync plus limit headers are now
officially documented (support request from a while ago) and there is
a note about Boost.Functional/Factory can make Fusion.Functional work
on ctors. All is getting better, after all.

I've been looking into Spirit2 few months ago. There were no docs at
that time. Are there any now?

And: Can Spirit2 create ASTs?

The need for a fast AST generator, lack of docs and the rather high
compilation times on the examples sorta scared me away from using
Spirit2 and I ended up writing myself a very minimalist Spirit-like
parser framework for my particular case but with /some/ genericity in
mind (Spirit v. -0.1, if you want so :P).
The AST generator does not need any copying, memory allocation can be
reduced to the absolute minimum and backtracking effectively boils down
to a move and a conditional branch (never taken if the tuning parameters
for memory management are set properly and lookahead is finite for
realistic input). So if Spirit2 still needs AST support, we should talk
details for eventual boostification...

That should be it plus all the things I forgot.

Regards,
Tobias


------------------------------------------------------------------------------
Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA
-OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise
-Strategies to boost innovation and cut costs with open source participation
-Receive a $600 discount off the registration fee with the source code: SFAD
http://p.sf.net/sfu/XcvMzF8H
_______________________________________________
Spirit-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/spirit-devel
Reply | Threaded
Open this post in threaded view
|

Re: Catching up, breaking changes, Spirit2 docs & AST generator

Francois Barel
Tobias Schwinger wrote:
> I've been looking into Spirit2 few months ago. There were no docs at
> that time. Are there any now?

They are not complete yet:
http://www.boost.org/doc/libs/1_38_0/libs/spirit/doc/html/index.html


> And: Can Spirit2 create ASTs?
>
> The need for a fast AST generator, lack of docs and the rather high
> compilation times on the examples sorta scared me away from using
> Spirit2 and I ended up writing myself a very minimalist Spirit-like
> parser framework for my particular case but with /some/ genericity in
> mind (Spirit v. -0.1, if you want so :P).
> The AST generator does not need any copying, memory allocation can be
> reduced to the absolute minimum and backtracking effectively boils down
> to a move and a conditional branch (never taken if the tuning parameters
> for memory management are set properly and lookahead is finite for
> realistic input). So if Spirit2 still needs AST support, we should talk
> details for eventual boostification...

Absolutely: Spirit2 has generic generation capabilities, in the form
of "attributes" (it can build ASTs, and much more). You may want to
look at the examples in libs/spirit/example/qi/, e.g. employee.cpp,
calc2_ast.cpp, mini_xml*, ...

Each parser (terminal or nonterminal) can build and return an
attribute (any data structure). For example a non-terminal employee
parser "emp" could return a "struct employee", the "*emp" parser could
return a "std::vector<employee>", and so on.

You just need to decorate your grammar so that the attribute which is
generated by your whole parser is the data structure you want to
build. Often times, you can make your parser directly build the "real"
data structure (tree) you want, instead of building a "generic" AST,
which you will have to walk afterwards in order to build the data
structure you actually wanted.

In terms of performance: those attributes are 1. only built when
actually needed, and 2. built in place / swapped (but not copied
around like AST nodes in classic Spirit, which was a major cause of
inefficiency).
IOW, Spirit2 itself does not do any kind of allocation for attributes
-- only your containers (their allocators) do, if your attributes use
any. You can very well have everything pre-allocated before you call
parse, and Spirit2 will fill in your already-allocated data structure
without allocating anything.

Regards,
Fran├žois

------------------------------------------------------------------------------
Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA
-OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise
-Strategies to boost innovation and cut costs with open source participation
-Receive a $600 discount off the registration fee with the source code: SFAD
http://p.sf.net/sfu/XcvMzF8H
_______________________________________________
Spirit-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/spirit-devel
Reply | Threaded
Open this post in threaded view
|

Re: Catching up, breaking changes, Spirit2 docs & AST generator

Tobias Schwinger-2
Fran├žois,

thank you for your reply!

Francois Barel wrote:
> Tobias Schwinger wrote:
>> I've been looking into Spirit2 few months ago. There were no docs at
>> that time. Are there any now?
>
> They are not complete yet:
> http://www.boost.org/doc/libs/1_38_0/libs/spirit/doc/html/index.html
>

Just skimmed and partially read it (will send feedback once I redo in
more depth). Very nice!

Here's a summary of my Spirit-related work, so far, figuring you might
want to add it to the acknowledgments section:

Spirit 1:
o Boost.TypeOf integration
o Optimization of ASTs
o Some on-list user support
Spirit 2:
o Proposing expectation points
o GCC port of an early version
Fusion 2:
o Functional module
o Maintenance work (bug discovery/fixes)

>
>> And: Can Spirit2 create ASTs?
>>
>> The need for a fast AST generator, lack of docs and the rather high
>> compilation times on the examples sorta scared me away from using
>> Spirit2 and I ended up writing myself a very minimalist Spirit-like
>> parser framework for my particular case but with /some/ genericity in
>> mind (Spirit v. -0.1, if you want so :P).
>> The AST generator does not need any copying, memory allocation can be
>> reduced to the absolute minimum and backtracking effectively boils down
>> to a move and a conditional branch (never taken if the tuning parameters
>> for memory management are set properly and lookahead is finite for
>> realistic input). So if Spirit2 still needs AST support, we should talk
>> details for eventual boostification...
>
> Absolutely: Spirit2 has generic generation capabilities, in the form
> of "attributes" (it can build ASTs, and much more). You may want to
> look at the examples in libs/spirit/example/qi/, e.g. employee.cpp,
> calc2_ast.cpp, mini_xml*, ...

Very powerful, indeed! And quite impressive, too, knowing what's going
on inside the compiler for this stuff to work... I knew that this
capability existed, however :D.

>
> Each parser (terminal or nonterminal) can build and return an
> attribute (any data structure). For example a non-terminal employee
> parser "emp" could return a "struct employee", the "*emp" parser could
> return a "std::vector<employee>", and so on.
>
> You just need to decorate your grammar so that the attribute which is
> generated by your whole parser is the data structure you want to
> build. Often times, you can make your parser directly build the "real"
> data structure (tree) you want, instead of building a "generic" AST,
> which you will have to walk afterwards in order to build the data
> structure you actually wanted.

Yeah, but aren't we comparing apples with pairs, here?

I agree, that any kind of AST facility that won't allow pass-one-built
data structures to be attached won't be worth much (especially talking
Spirit2 which is obviously a most powerful tool for creating those).

In my case, I need an AST, because I don't know how to structure the
data until a certain section of my grammar has been parsed. Further,
(which might be rather special) I essentially parse ints and structuring
  information, so I wouldn't need the really cool stuff (would save it
for the next parser :D) and would prefer not to pay much for it, either.

So I could write myself a Fusion sequence with a single container or use
several of them to get rid of an enum I use for tagging the nodes (would
visit Variants then - dunno I really want to, I'd probably prefer better
compilation times, but depends on visitation API provided) and might
also need to write a custom container (just thinking aloud, basically).

> In terms of performance: those attributes are 1. only built when
> actually needed, and 2. built in place / swapped (but not copied
> around like AST nodes in classic Spirit, which was a major cause of
> inefficiency).
> IOW, Spirit2 itself does not do any kind of allocation for attributes
> -- only your containers (their allocators) do, if your attributes use
> any.

Sounds interesting. How exactly is backtracking handled? Or (rephrasing
the question with a widened scope): How does the call protocol on the
attributes (especially containers) look like?

Wondering whether swapping Fusion sequences (rather then keeping them in
dynamic memory and moving a pointer) is the best approach (for the very
purpose of AST creation, that is)...

> You can very well have everything pre-allocated before you call
> parse, and Spirit2 will fill in your already-allocated data structure
> without allocating anything.

That's not quite what I was talking about, since most typically we don't
know what will be parsed in advance.

My memory manager allocates block-wise, freeing up lazily (keeping one
spare block, that is) to ensure that N nodes can be created / at least N
nodes can be backtracked over without touching the allocator.

BTW. wouldn't std::deque be a more advisable default to be used in the
examples, as std::vector will copy its entire content when it needs to
grow (typical +50% growth rate might not be desired, either)?


Regards,
Tobias


------------------------------------------------------------------------------
Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA
-OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise
-Strategies to boost innovation and cut costs with open source participation
-Receive a $600 discount off the registration fee with the source code: SFAD
http://p.sf.net/sfu/XcvMzF8H
_______________________________________________
Spirit-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/spirit-devel