parsed string attribute contains \0


parsed string attribute contains \0

Olaf Peter
Hi,

for my (spare time) project I have to implement e.g. the following BNF rule:

integer ::= digit { [ underline ] digit }

The underline separator shows up in several production rules, e.g. hex
values (16'A_FF_E), bit patterns (2"0111_0100) or others
(10*42.47_11*e-6_6). After some (negative) experience with polluting the
grammar rules with lots of semantic actions, I decided to store the
literal strings, pruned of underlines, in the AST, tag them with the
context from the BNF, and evaluate them later (e.g. parse them in a second
pass specifically for int, double, hex, etc., using the tags added before).
IIRC this is the recommended way (avoid semantic actions).
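
To make the two-pass idea concrete, here is a minimal sketch of what the second pass could look like. The names `LiteralKind`, `TaggedLiteral` and `evaluate` are hypothetical and not part of my project or of Spirit; the sketch assumes the first pass has already stored the digits without underlines and covers only the integer-like kinds.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical tag recorded by the grammar alongside the pruned literal.
enum class LiteralKind { Decimal, Hex, Binary };

// Hypothetical AST node: the digits with underlines already removed.
struct TaggedLiteral {
    LiteralKind kind;
    std::string digits;
};

// Second pass: convert the stored text to a value, choosing the base by tag.
std::uint64_t evaluate(TaggedLiteral const& lit)
{
    int base = 10;
    switch (lit.kind) {
        case LiteralKind::Decimal: base = 10; break;
        case LiteralKind::Hex:     base = 16; break;
        case LiteralKind::Binary:  base = 2;  break;
    }
    std::size_t consumed = 0;
    // std::stoull throws std::invalid_argument or std::out_of_range on bad input.
    std::uint64_t value = std::stoull(lit.digits, &consumed, base);
    if (consumed != lit.digits.size()) {
        throw std::invalid_argument("trailing characters in literal: " + lit.digits);
    }
    return value;
}
```

With this, `evaluate({LiteralKind::Hex, "AFFE"})` would handle the digits of 16'A_FF_E once the underlines are pruned. Real-valued literals such as 42.47_11 would need their own branch (e.g. `std::stod`), which is left out here.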

At this point I run into a problem with the 'flattened'/'joined'
string: there is a \0 at the position where the underline occurred.

---8<----
#include <iostream>
#include <string>
#include <boost/spirit/home/x3.hpp>

int main()
{
    namespace x3 = boost::spirit::x3;
    using x3::char_;

    auto const integer =  x3::rule<struct _, std::string>{} =
        x3::lexeme[ +char_("0-9") >> *(char_("0-9") | "_") ];

    for(std::string const str: {
            "4711", "123_456"
    }) {
      auto iter = str.begin(), end = str.end();

      std::string attr;
      bool r = x3::phrase_parse(iter, end, integer, x3::space, attr);

      std::cout << "parse '" << str << "': ";
      if (r && iter == end) {
        std::cout << "succeeded: '" << attr << "'\n";
        if(attr.find('\0') != std::string::npos) {
            std::cout << "found '\\0'\n";
        }
      } else {
        std::cout << "*** failed ***\n";
      }
    }

    return 0;
}

--->8----

Depending on the capabilities of the terminal used, you may see e.g.:

parse '4711': succeeded: '4711'
parse '123_456': succeeded: '123456'
found '\0'

or even only '123 (the output is cut off where the \0 strikes).
The rule written here looks odd, I know. Normally I would write
x3::lexeme[ +char_("0-9") >> -(*char_("0-9_")) ];
(which still isn't sufficient, I guess), but then the attribute contains
the unwanted underline. The rule above keeps the literal out of the
attribute but causes the problem mentioned above.
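
As a stopgap, the stray characters could be removed from the attribute after parsing. This is only a sketch that treats the symptom and does not fix the rule; `strip_nul` is a hypothetical helper name, and it uses nothing but the standard erase-remove idiom.

```cpp
#include <algorithm>
#include <string>

// Returns a copy of `attr` with every embedded '\0' removed
// (erase-remove idiom); all other characters keep their order.
std::string strip_nul(std::string attr)
{
    attr.erase(std::remove(attr.begin(), attr.end(), '\0'), attr.end());
    return attr;
}
```

In the test program above this would be called as `attr = strip_nul(attr);` right after `phrase_parse` returns.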

How can I fix this? And how do I write the rule so that a trailing
underline fails to parse?

Thanks,
Olaf



_______________________________________________
Spirit-general mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/spirit-general

Re: parsed string attribute contains \0

Olaf Peter
> How can I fix this? And how do I write the rule so that a trailing
> underline fails to parse?

x3::lexeme[ +char_("0-9") >> *(-x3::lit("_") >> char_("0-9")) ];

My rule from the earlier topic wasn't so bad after all; this works as
expected, doesn't it? But where does the \0 in the old rule come from?
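
To check what "as expected" should mean, a hand-written mirror of the BNF rule can serve as a reference for comparing results. This is a sketch in plain C++17 without Spirit; `prune_integer` is a hypothetical name, and it assumes the whole input must match, as in the `r && iter == end` check of the test program.

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hand-written mirror of: integer ::= digit { [ underline ] digit }
// Returns the digits with underlines pruned, or std::nullopt if the whole
// input does not match (e.g. a leading, doubled or trailing '_').
std::optional<std::string> prune_integer(std::string_view input)
{
    auto is_digit = [](char c) {
        return std::isdigit(static_cast<unsigned char>(c)) != 0;
    };

    if (input.empty() || !is_digit(input.front())) {
        return std::nullopt;  // must start with a digit
    }

    std::string digits;
    std::size_t i = 0;
    while (i < input.size()) {
        if (input[i] == '_') {
            ++i;  // optional underline; a digit must follow it
            if (i == input.size()) {
                return std::nullopt;  // trailing underline
            }
        }
        if (!is_digit(input[i])) {
            return std::nullopt;  // doubled underline or foreign character
        }
        digits.push_back(input[i]);
        ++i;
    }
    return digits;
}
```

Feeding the same strings to both this function and the X3 rule should make any difference in accepted input or in the resulting attribute easy to spot.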

