Newbie: Need help designing simple rule

Daniel Lidström
Hi,

I'm trying to parse a text-file with data in columns, like this:

66     10     0000 1  -20000 206551    0952 206008   10159 207118   14332 206950
66     10     0000 1  -20000 206551    0952 206008   10159 207118   14332 206950
(many more lines like these)

The first 4 columns are always present, then there are up to four pairs of
numbers (always at least 1 pair). Here's the piece of code I've been playing with:


#include <boost/spirit.hpp>
#include <iostream>

using namespace std;
using namespace boost::spirit;
using namespace boost;

void echo(int i)
{
   cout << i << ' ';
}

int main(int argc, char* argv[])
{
   // One input row per literal; adjacent string literals are concatenated.
   const char* s =
       "66     10     0000 1  -20000 206551    0952 206008   10159 207118   14332 206950\n"
       "66     10     0000 1  -20000 206551    0952 206008   10159 207118   14332 206950\n";
   const char* p = s;

   rule<phrase_scanner_t> r
       =
           int_p[&echo]
           && int_p[&echo]
           && int_p[&echo]
           && int_p[&echo]
           && *(int_p[&echo] & int_p[&echo])
           && eol_p
      ;
   parse_info<> info = spirit::parse(p, r, space_p);

   while( info.hit )
   {
      cout << endl << "Parsed ok!" << endl;
      p += info.length;
      info = spirit::parse(p, r, space_p);
   }

   return 0;
}

info.hit is false so the while-loop is never entered. What is printed is
this:
66 10 0 1 -20000 -20000 206551 206551 952 952 206008 206008 10159 10159
207118 207118 14332 14332 206950 206950 66 66 10 10 0 0 1 1 -20000 -20000
206551 206551 952 952 206008 206008 10159 10159 207118 207118 14332 14332
206950 206950

So in the first line every value is printed once, and in the second,
every value is printed twice.
I'm trying to match each line, one by one. Can anyone suggest how this would
be done?

Thanks in advance!
Daniel


-------------------------------------------------------
SF.Net email is sponsored by:
Tame your development challenges with Apache's Geronimo App Server. Download
it for free - -and be entered to win a 42" plasma tv or your very own
Sony(tm)PSP.  Click here to play: http://sourceforge.net/geronimo.php
_______________________________________________
Spirit-general mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/spirit-general
Re: Newbie: Need help designing simple rule

Carl Barron

On Sep 19, 2005, at 10:50 AM, Daniel Lidström wrote:

> Hi,
>
> I'm trying to parse a text-file with data in columns, like this:
>
> 66     10     0000 1  -20000 206551    0952 206008   10159 207118   14332 206950
> (many more lines like these)
>
> The first 4 columns are always present, then there are up to four pairs of
> numbers (always at least 1 pair).
>
> [...]
>
> I'm trying to match each line, one by one. Can anyone suggest how this
> would be done?

First, it appears you are looking for pairs of integers, so

   item = int_p[&echo] >> int_p[&echo]

echoes a pair of integers. Now

   r = item >> +item >> eol_p

parses two or more pairs of integers:

   rule<phrase_scanner_t> item = int_p[&echo] >> int_p[&echo];
   rule<phrase_scanner_t> r = item >> +item >> eol_p;

r is an item followed by one or more items, that is, two or more items
(an item is two integers).
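
To see the shape that rule accepts without Boost at hand, here is a minimal sketch in plain standard C++. It is not Spirit code: `count_ints` and `matches_two_or_more_pairs` are hypothetical helper names, and whitespace splitting through `std::istringstream` stands in for the skipper. It accepts a line only when it holds an even number of integers, at least four of them, which is what `item >> +item` describes.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: counts whitespace-separated integers on one line.
// Returns -1 if anything other than integers is found.
int count_ints(const std::string& line)
{
    std::istringstream in(line);
    int count = 0;
    int value;
    while (in >> value)
        ++count;
    // Extraction stopped: fine at end of input, an error otherwise.
    return in.eof() ? count : -1;
}

// Mirrors the shape of `item >> +item`: an item is two integers,
// so a matching line holds an even number of integers, at least four.
bool matches_two_or_more_pairs(const std::string& line)
{
    const int n = count_ints(line);
    return n >= 4 && n % 2 == 0;
}
```

Calling `matches_two_or_more_pairs` on a line with an odd number of values, or with a non-numeric token, returns false.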




Re: Newbie: Need help designing simple rule

Carl Barron
In reply to this post by Daniel Lidström

On Sep 19, 2005, at 10:50 AM, Daniel Lidström wrote:

> Hi,
>
> I'm trying to parse a text-file with data in columns, like this:
>
> 66     10     0000 1  -20000 206551    0952 206008   10159 207118   14332 206950
> 66     10     0000 1  -20000 206551    0952 206008   10159 207118   14332 206950
> (many more lines like these)
>
> The first 4 columns are always present, then there are up to four pairs of
> numbers (always at least 1 pair). Here's the piece of code I've been playing with:

After reading this again, your code might work if the skipper in your
parse function call is blank_p instead of space_p. space_p eats the
end-of-line indicator ('\n'), so eol_p never gets to see it.

If you have a file of these lines, I'd read it line by line using
std::getline() (the free function in <string>) and then parse each line,
collecting the results in a vector<double>. To do this easily I recommend
using a struct derived from grammar<...>. This struct can hold a reference
to the actual vector used; then, if the parse consumes the whole line, you
can process your data from this vector.

Spirit also provides some common semantic functors, one of which is
push_back_a, which pushes the result into an STL container supporting
push_back. The following grammar uses repeat_p to loop according to the
above statement of the problem:

#include <boost/spirit/core.hpp>
#include <boost/spirit/actor.hpp>
#include <boost/spirit/utility.hpp>

#include <vector>

namespace SP = boost::spirit;

// Grammar for one line: four required integers, then one to four pairs.
struct file_fmt : SP::grammar<file_fmt>
{
    std::vector<double> &out;   // receives every parsed number
    file_fmt(std::vector<double> &a) : out(a) {}

    template <class Scan>
    struct definition
    {
        definition(const file_fmt &s)
        {
            item = SP::int_p[SP::push_back_a(s.out)];
            pair = item >> item;
            required = SP::repeat_p(4)[item];
            optional = SP::repeat_p(1, 4)[pair];
            line = required >> optional;
        }
        SP::rule<Scan> line, item, required, optional, pair;
        SP::rule<Scan> const &start() const { return line; }
    };
};


Using this in a loop like this will parse the entire file:

    // needs <fstream>, <iostream> and <string>
    std::ifstream ifs(file_name);   // a read-only stream is enough here
    std::string line;
    while (std::getline(ifs, line))
    {
        std::vector<double> items;
        file_fmt fmt(items);

        // .full is true only if the whole line was consumed
        if (SP::parse(line.begin(), line.end(), fmt, SP::blank_p).full)
            process_data(items);  // use the results.
        else
        {
            std::cerr << "parse_failed\n";
            break;
        }
    }
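
The same line-by-line idea can be sketched in plain standard C++, without Spirit. This is a swapped-in technique, not the grammar above: `parse_line` and `count_good_lines` are hypothetical names, and `std::istringstream` does the tokenizing. Because `std::getline` strips the newline before the line is parsed, the end-of-line marker can never be swallowed as whitespace, which is the effect blank_p gives in the Spirit version.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: parses one line of the column format into `out`.
// Succeeds only if the line holds four leading integers followed by
// one to four pairs, i.e. 6, 8, 10 or 12 integers and nothing else.
bool parse_line(const std::string& line, std::vector<double>& out)
{
    out.clear();
    std::istringstream in(line);
    int value;
    while (in >> value)
        out.push_back(value);
    if (!in.eof())
        return false;               // stopped on a non-integer token
    const std::size_t n = out.size();
    return n >= 6 && n <= 12 && n % 2 == 0;
}

// Hypothetical driver: feeds a whole text to parse_line one line at a
// time and returns the number of lines accepted before the first failure.
std::size_t count_good_lines(const std::string& text)
{
    std::istringstream in(text);
    std::string line;
    std::vector<double> items;
    std::size_t good = 0;
    while (std::getline(in, line) && parse_line(line, items))
        ++good;
    return good;
}
```

In a real program the driver would read from a `std::ifstream` and hand `items` to whatever processes the data, as in the loop above.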




Parsing a Fixed-Width Field

Vale Group
I have to parse a field of characters which has a fixed width.  The leading
characters are terminated with a null and will be saved in a string:

*alnum_p[push_back_a(stringA)]

The remaining characters in the field after the null are undefined and will not
be saved.  There is no terminating token such as a null or newline following
those characters to mark the start of the next field.  I only know the total
length of the field.

It appears that I can't perform the calculation of the size of the remaining
field in the repeat parser:

repeat_p(boost::ref(fieldWidth) - boost::ref(stringA.size()))[anychar_p]

This performs the repeat loop fieldWidth times, not fieldWidth less the size
of the string times.

What is the best way to parse this field?

--
Charles



Re: Parsing a Fixed-Width Field

Joel de Guzman-2
Vale Group wrote:

> I have to parse a field of characters which has a fixed width.  The leading
> characters are terminated with a null and will be saved in a string:
>
> *alnum_p[push_back_a(stringA)]
>
> The remaining characters in the field after the null are undefined and will not
> be saved.  There is no terminating token such as a null or newline following
> those characters to mark the start of the next field.  I only know the total
> length of the field.
>
> It appears that I can't perform the calculation of the size of the remaining
> field in the repeat parser.
> repeat_p(boost::ref(fieldWidth) - boost::ref(stringA.size()))[anychar_p]
> performs the repeat loop fieldWidth times and not fieldWidth less the size of
> the string times.
>
> What is the best way to parse this field?

Lemme see if I get this right.

1) You want to parse a fixed width field of length L
2) The leading string is nul terminated (size N)
3) The remaining L-N chars are to be ignored.

Ok, how about:

     int n = fieldWidth;

     *alnum_p[push_back_a(stringA)][decr(n)]
      >> repeat_p(boost::ref(n))[anychar_p]

where decr:

     struct decr
     {
         decr(int& n) : n(n) {}
         template <class Iter>
         void operator()(Iter const&, Iter const&)
         {
             --n;
         }
         int& n;
     };

??
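
The counting idea can be tried out in plain standard C++ before wiring it into a grammar. What follows is a minimal sketch, not Spirit code: `read_fixed_field` is a hypothetical helper, and it assumes the whole record is already in a `std::string`. Each leading alphanumeric character decrements the remaining count, and whatever is left of the field (the nul and the undefined characters) is skipped in one step.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: reads one fixed-width field starting at `pos`.
// Leading alphanumeric characters are appended to `out`, and each one
// decrements the remaining count `n`. The last `n` characters of the
// field (the nul and the undefined padding) are skipped.
// Returns the position of the next field, or std::string::npos if the
// buffer is too short to hold the field.
std::size_t read_fixed_field(const std::string& buf, std::size_t pos,
                             std::size_t width, std::string& out)
{
    if (pos > buf.size() || buf.size() - pos < width)
        return std::string::npos;
    std::size_t n = width;   // characters of the field not yet consumed
    while (n > 0 && std::isalnum(static_cast<unsigned char>(buf[pos]))) {
        out.push_back(buf[pos]);
        ++pos;
        --n;
    }
    return pos + n;          // skip whatever is left of the field
}
```

Unlike the rule above, this sketch also stops collecting once the field width is used up, so a field with no nul at all cannot run into the next one.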

Regards,
--
Joel de Guzman
http://www.boost-consulting.com
http://spirit.sf.net



Re: Re: Parsing a Fixed-Width Field

Vale Group
Joel de Guzman wrote:
> Lemme see if I get this right.
> 1) You want to parse a fixed width field of length L
> 2) The leading string is nul terminated (size N)
> 3) There are remaining L-N chars are to be ignored.

That's correct.

> Ok, how about:
>      int n = fieldWidth;
>      *alnum_p[push_back_a(stringA)][decr(n)]
>       >> repeat_p(boost::ref(n))[anychar_p]
>
> where decr:
>      struct decr
>      {
>          decr(int& n) : n(n) {}
>          template <class Iter>
>          void operator()(Iter const&, Iter const&)
>          {
>              --n;
>          }
>          int& n;
>      };

That looks like the solution I need.  Now if I could just get it to compile.
The code looks fine to me and varies from the examples given on the Semantic
Actions page of the documentation only in the placement of some consts.  Alas my
gcc 3.4 compiler complains.  I reduced the code to this minimal file:

#include <boost/spirit.hpp>
#include <boost/ref.hpp>
#include <iostream>
#include <string>

struct decr {  // this is line 6
   decr(int& n) : n(n) {}
   template <class Iter>
   void operator()(Iter const&, Iter const&) {
      --n;
   }
   int& n;
};

using namespace std;
using namespace boost::spirit;

typedef char char_t;
typedef file_iterator<char_t> iterator_t;
typedef scanner<iterator_t> scanner_t;
typedef rule<scanner_t> rule_t;

int main() {
   iterator_t first("fwtest.txt");
   iterator_t last = first.make_end();
   const int fieldWidth = 10;
   int n = fieldWidth;
   string stringA;
   rule_t fldw = *alnum_p[push_back_a(stringA)][decr(n)]
                 >> repeat_p(boost::ref(n))[anychar_p];
   parse_info<iterator_t> info = parse(first, last, fldw);
   if (info.full) {
      std::cout << "Parse succeeded.\n";
      return 0;
   }
   else {
      std::cout << "--- Parse failed ---\n";
      return 1;
   }
}

The compiler sends me these messages:

C:/boostcvs/boost/spirit/core/scanner/scanner.hpp: In static member function
`static void boost::spirit::attributed_action_policy<AttrT>::call(const ActorT&,
AttrT&, const IteratorT&, const IteratorT&)
[with ActorT = decr, IteratorT = boost::spirit::file_iterator<char_t,
boost::spirit::fileiter_impl::mmap_file_iterator<char_t> >, AttrT = const
char]':

C:/boostcvs/boost/spirit/core/scanner/scanner.hpp:159:   instantiated from
`void boost::spirit::action_policy::do_action(const ActorT&, AttrT&, const
IteratorT&, const IteratorT&) const
[with ActorT = decr, AttrT = const char, IteratorT = iterator_t]'


C:/boostcvs/boost/spirit/core/composite/actions.hpp:109:   instantiated from
`typename boost::spirit::parser_result<boost::spirit::action<ParserT, ActionT>,
ScannerT>::type boost::spirit::action<ParserT, ActionT>::parse(const ScannerT&)
const
[with ScannerT = boost::spirit::scanner<iterator_t,
boost::spirit::scanner_policies<boost::spirit::iteration_policy,
boost::spirit::match_policy, boost::spirit::action_policy> >, ParserT =
boost::spirit::action<boost::spirit::alnum_parser,
boost::spirit::ref_value_actor<std::string, boost::spirit::push_back_action> >,
ActionT = decr]'

C:/boostcvs/boost/spirit/core/composite/kleene_star.hpp:58:   instantiated from
`typename boost::spirit::parser_result<boost::spirit::kleene_star<S>,
ScannerT>::type boost::spirit::kleene_star<S>::parse(const ScannerT&) const
[with ScannerT = boost::spirit::scanner<iterator_t,
boost::spirit::scanner_policies<boost::spirit::iteration_policy,
boost::spirit::match_policy, boost::spirit::action_policy> >, S =
boost::spirit::action<boost::spirit::action<boost::spirit::alnum_parser,
boost::spirit::ref_value_actor<std::string, boost::spirit::push_back_action> >,
decr>]'


C:/boostcvs/boost/spirit/core/composite/sequence.hpp:53:   instantiated from
`typename boost::spirit::parser_result<boost::spirit::sequence<A, B>,
ScannerT>::type boost::spirit::sequence<A, B>::parse(const ScannerT&) const
[with ScannerT = boost::spirit::scanner<iterator_t,
boost::spirit::scanner_policies<boost::spirit::iteration_policy,
boost::spirit::match_policy, boost::spirit::action_policy> >, A =
boost::spirit::kleene_star<boost::spirit::action<boost::spirit::action<boost::spirit::alnum_parser,
boost::spirit::ref_value_actor<std::string,
boost::spirit::push_back_action> >, decr> >, B =
boost::spirit::fixed_loop<boost::spirit::anychar_parser,
boost::reference_wrapper<int> >]'

C:/boostcvs/boost/spirit/core/non_terminal/impl/rule.ipp:233:   instantiated
from
`typename boost::spirit::match_result<ScannerT, ContextResultT>::type
boost::spirit::impl::concrete_parser<ParserT, ScannerT,
AttrT>::do_parse_virtual(const ScannerT&) const
[with ParserT =
boost::spirit::sequence<boost::spirit::kleene_star<boost::spirit::action<boost::spirit::action<boost::spirit::alnum_parser,
boost::spirit::ref_value_actor<std::string, boost::spirit::push_back_action> >,
decr> >, boost::spirit::fixed_loop<boost::spirit::anychar_parser,
boost::reference_wrapper<int> > >, ScannerT = scanner_t, AttrT =
boost::spirit::nil_t]'

main.cpp:6:   instantiated from here

C:/boostcvs/boost/spirit/core/scanner/scanner.hpp:128: error: no match for call
to `(const decr) (const char&)'


I would appreciate some guidance.  I fear that I'm standing too close to the
code.

--
Chuck Brockman



Re: Parsing a Fixed-Width Field

Joel de Guzman-2
Vale Group wrote:

> Joel de Guzman wrote:
>
>>Lemme see if I get this right.
>>1) You want to parse a fixed width field of length L
>>2) The leading string is nul terminated (size N)
>>3) There are remaining L-N chars are to be ignored.
>
>
> That's correct.
>
>
>>Ok, how about:
>>     int n = fieldWidth;
>>     *alnum_p[push_back_a(stringA)][decr(n)]
>>      >> repeat_p(boost::ref(n))[anychar_p]
>>
>>where decr:
>>     struct decr
>>     {
>>         decr(int& n) : n(n) {}
>>         template <class Iter>
>>         void operator()(Iter const&, Iter const&)
>>         {
>>             --n;
>>         }
>>         int& n;
>>     };
>
>
> That looks like the solution I need.  Now if I could just get it to compile.
> The code looks fine to me and varies from the examples given on the Semantic
> Actions page of the documentation only in the placement of some consts.  Alas my
> gcc 3.4 compiler complains.  I reduced the code to this minimal file:

Ahh, of course. Try this:

struct decr {  // this is line 6
    decr(int& n) : n(n) {}
    template <class Char>
    void operator()(Char) {
       --n;
    }
    int& n;
};

--
Joel de Guzman
http://www.boost-consulting.com
http://spirit.sf.net



Re: Parsing a Fixed-Width Field

Joel de Guzman-2
In reply to this post by Vale Group
Joel de Guzman wrote:
 > Vale Group wrote:
 >
 >> Joel de Guzman wrote:
 >>
 >>> Lemme see if I get this right.
 >>> 1) You want to parse a fixed width field of length L
 >>> 2) The leading string is nul terminated (size N)
 >>> 3) There are remaining L-N chars are to be ignored.
 >>
 >>
 >>
 >> That's correct.
 >>
 >>
 >>> Ok, how about:
 >>>     int n = fieldWidth;
 >>>     *alnum_p[push_back_a(stringA)][decr(n)]
 >>>      >> repeat_p(boost::ref(n))[anychar_p]
 >>>
 >>> where decr:
 >>>     struct decr
 >>>     {
 >>>         decr(int& n) : n(n) {}
 >>>         template <class Iter>
 >>>         void operator()(Iter const&, Iter const&)
 >>>         {
 >>>             --n;
 >>>         }
 >>>         int& n;
 >>>     };
 >>
 >>
 >>
 >> That looks like the solution I need.  Now if I could just get it to
 >> compile.
 >> The code looks fine to me and varies from the examples given on the
 >> Semantic
 >> Actions page of the documentation only in the placement of some
 >> consts.  Alas my
 >> gcc 3.4 compiler complains.  I reduced the code to this minimal file:
 >
 >
 > Ahh, of course. Try this:
 >
 > struct decr {  // this is line 6
 >    decr(int& n) : n(n) {}
 >    template <class Char>
 >    void operator()(Char) {
 >       --n;
 >    }
 >    int& n;
 > };
 >

Oh and please make that const:

struct decr {  // this is line 6
    decr(int& n) : n(n) {}
    template <class Char>
    void operator()(Char) const {
       --n;
    }
    int& n;
};
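
The compiler message quoted earlier, "no match for call to `(const decr) (const char&)'", shows how the actor gets called: through a const object, with the matched character as the only argument. Here is a minimal sketch of that call pattern in plain standard C++. `invoke_actor` is a hypothetical stand-in for the library's call site, not a Spirit function, and `decr_sketch` only repeats the struct above under another name.

```cpp
#include <cassert>

// Same shape as the actor in this thread: the reference member lets a
// const call operator modify the counter it refers to.
struct decr_sketch {
    explicit decr_sketch(int& n) : n(n) {}
    template <class Char>
    void operator()(Char) const { --n; }
    int& n;
};

// Hypothetical stand-in for the library's call site: the actor arrives
// as a const reference and is called with the matched character.
template <class Actor>
void invoke_actor(const Actor& actor, const char& ch)
{
    actor(ch);
}
```

Dropping the `const` from `operator()` makes `invoke_actor` fail to compile, because a non-const member function cannot be called on a const object.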

Regards,
--
Joel de Guzman
http://www.boost-consulting.com
http://spirit.sf.net



Re: Re: Parsing a Fixed-Width Field

Vale Group
Joel de Guzman wrote:

> Oh and please make that const:
>
> struct decr {  // this is line 6
>     decr(int& n) : n(n) {}
>     template <class Char>
>     void operator()(Char) const {
>        --n;
>     }
>     int& n;
> };

Yes!  It now compiles.  I'll get to a more complete test tomorrow and
incorporate it into my larger program.

--
Chuck Brockman


