position_iterator and new lines

Wander Lairson
Hello,

I already sent this message to spirit-general, but unfortunately I got
no answer.

While working with Spirit I ran into a curious problem. I wrote a skip
parser that skips whitespace characters (those for which isspace()
returns true) except new lines, because new lines are significant in my
grammar. I use position_iterator to keep track of the current position
for error reporting. When the input uses a bare line feed as its new
line, everything works fine, but when the input uses a carriage return
followed by a line feed, parsing fails. After a few hours of debugging,
I discovered that the problem comes from

iterator/position_iterator.hpp:position_iterator::increment()

Here is the code:

   void increment()
   {
       typename base_t::reference val = *(this->base());
       if (val == '\n' || val == '\r') {
           ++this->base_reference();
           if (this->base_reference() != _end) {
               typename base_t::reference val2 = *(this->base());
               if ((val == '\n' && val2 == '\r')
                   || (val == '\r' && val2 == '\n'))
               {
                   ++this->base_reference();
               }
           }
           this->next_line(_pos);
           static_cast<main_iter_t &>(*this).newline();
       }

       ..........

   }

The problem is that the line feed is never returned: two increments
happen in a single call, which skips the line feed (and if the input
has a line feed followed by a carriage return, the carriage return is
skipped). This is strange, because the only purpose of this code is to
avoid calling next_line twice. If you remove the second condition from
the outermost if, like so:

   void increment()
   {
       typename base_t::reference val = *(this->base());
       if (val == '\n') { // <== now recognizes only line feeds
           ++this->base_reference();
           if (this->base_reference() != _end) {
               typename base_t::reference val2 = *(this->base());
               if ((val == '\n' && val2 == '\r')
                   || (val == '\r' && val2 == '\n'))
               {
                   ++this->base_reference();
               }
           }
           this->next_line(_pos);
           static_cast<main_iter_t &>(*this).newline();
       }
   }

the parser works fine. Here is a little test program:

#include <iostream>
#include <cctype>
#include <cstring>
#include <string>
#include <boost/spirit/core.hpp>
#include <boost/spirit/iterator/position_iterator.hpp>

using namespace boost::spirit;
using namespace std;

// Parse space chars, except newlines
struct space_but_newline_parser : public char_parser<space_but_newline_parser>
{
   typedef space_but_newline_parser self_t;

   space_but_newline_parser() {}

   template <typename CharT>
   bool test(CharT ch) const
   {
       typedef char_traits<CharT> char_tr_t;
       return isspace(ch) && !char_tr_t::eq(ch,
                               char_tr_t::to_char_type('\n'));
   }
};

template<typename IteratorT, typename SkipT>
bool do_parse(const IteratorT &first, const IteratorT &last, const SkipT &sk)
{
   return parse(first, last, ( int_p >> eol_p >> int_p ), sk).full;
}

int main(int argc, char *argv[])
{
   typedef position_iterator<const char *> iterator_t;

   const char str_lf[] = "10 \n 10";
   const char str_crlf[] = "10 \r\n 10";

   iterator_t lf_first(str_lf, str_lf + strlen(str_lf), "");
   iterator_t crlf_first(str_crlf, str_crlf + strlen(str_crlf), "");
   iterator_t last;

   space_but_newline_parser my_skip_parser;

   if (do_parse(lf_first, last, my_skip_parser))
       cout << "line feed ok" << endl;
   else
       cout << "line feed failed" << endl;

   if (do_parse(crlf_first, last, my_skip_parser))
       cout << "carriage return line feed ok" << endl;
   else
       cout << "carriage return line feed failed" << endl;

   return 0;
}

// vim:sts=4:sw=4

Is this a bug, or a mistake of mine?

Thanks,
Wander

-------------------------------------------------------------------------
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
_______________________________________________
Spirit-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/spirit-devel

Re: position_iterator and new lines

Hartmut Kaiser
Seems to be a bug in Spirit. Can you provide a patch incl. a minimal
testcase, please?

Regards Hartmut

> -----Original Message-----
> From: [hidden email]
> [mailto:[hidden email]] On Behalf
> Of Wander Lairson
> Sent: Wednesday, July 18, 2007 11:24 AM
> To: [hidden email]
> Subject: [Spirit-devel] position_iterator and new lines



Re: position_iterator and new lines

Wander Lairson
Hi Hartmut,

Of course, yes. I'll do this as soon as possible.

Thanks,
Wander

2007/7/18, Hartmut Kaiser <[hidden email]>:

> Seems to be a bug in Spirit. Can you provide a patch incl. a minimal
> testcase, please?
>
> Regards Hartmut
