[Spirit2X] UTF-8: failing no_case and case_handling tests


Francois Barel
Currently the Qi no_case and Karma case_handling tests are failing for
me (gcc-4.1 on 32-bit Linux) with the following error:

====== BEGIN OUTPUT ======
terminate called after throwing an instance of '...<std::out_of_range>...'
  what():  Invalid UTF-32 code point U+0xffffffe1 encountered while
trying to encode UTF-16 sequence

EXIT STATUS: 134
====== END OUTPUT ======

thrown inside utf8_output_iterator's push method (in
boost/regex/pending/unicode_iterator.hpp).

The attached patch solves it by ensuring that in UTF-8 conversions,
chars are first converted to unsigned types before being converted to
ints (to avoid chars >127 being erroneously converted to negative
ints).
Does this look OK to you?
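
To illustrate the failure mode described above, here is a minimal sketch, not the attached patch itself. It assumes the offending byte was 0xE1 (which matches the U+0xffffffe1 in the error message) and uses signed char to stand in for plain char on a platform where char is signed. The helper names widen_naive and widen_via_unsigned are hypothetical and are not part of Spirit or Boost.Regex.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only.

// Naive widening: the char is converted to int first, so a byte above
// 127 (negative as a signed char) is sign-extended before it becomes
// a 32-bit code point.
constexpr std::uint32_t widen_naive(signed char c)
{
    return static_cast<std::uint32_t>(static_cast<int>(c));
}

// Widening through unsigned char first keeps the original byte value.
constexpr std::uint32_t widen_via_unsigned(signed char c)
{
    return static_cast<std::uint32_t>(static_cast<unsigned char>(c));
}

// Byte 0xE1 stored in a signed char has the value -31.
static_assert(widen_naive(static_cast<signed char>(-31)) == 0xFFFFFFE1u,
              "sign extension produces an invalid code point");
static_assert(widen_via_unsigned(static_cast<signed char>(-31)) == 0xE1u,
              "going through unsigned char preserves the byte");
```

The first result is the value the UTF-8 encoder rejects; the second is the code point that was intended.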


Also, is there a reason for
boost/spirit/home/support/string_traits.hpp's remove_sign to be there?
Because:
1. AFAICT it's not used,
2. maybe it's just a name issue, but for chars it doesn't remove the
sign -- on the contrary, it puts it,
3. if removing the sign is the goal, isn't that what type_traits's
make_unsigned is for?
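
For reference, a small sketch of what point 3 has in mind. It uses std::make_unsigned from the standard <type_traits> header as a stand-in for the Boost.TypeTraits version mentioned above, on the assumption that both map character types the same way. The to_unsigned helper is a hypothetical name, not something in Spirit.

```cpp
#include <type_traits>

// Hypothetical helper: converts any integral character type to its
// unsigned counterpart, so later widening cannot sign-extend.
template <typename Char>
constexpr typename std::make_unsigned<Char>::type to_unsigned(Char c)
{
    return static_cast<typename std::make_unsigned<Char>::type>(c);
}

// make_unsigned maps both char and signed char to unsigned char.
static_assert(std::is_same<std::make_unsigned<char>::type,
                           unsigned char>::value, "char");
static_assert(std::is_same<std::make_unsigned<signed char>::type,
                           unsigned char>::value, "signed char");

// Byte 0xE1 held in a signed char (-31) comes back as 0xE1.
static_assert(to_unsigned(static_cast<signed char>(-31)) == 0xE1,
              "value preserved as an unsigned byte");
```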

Thanks,
François

------------------------------------------------------------------------------
Apps built with the Adobe(R) Flex(R) framework and Flex Builder(TM) are
powering Web 2.0 with engaging, cross-platform capabilities. Quickly and
easily build your RIAs with Flex Builder, the Eclipse(TM)based development
software that enables intelligent coding and step-through debugging.
Download the free 60 day trial. http://p.sf.net/sfu/www-adobe-com
_______________________________________________
Spirit-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/spirit-devel

Attachment: Spirit2X-utf8.patch (2K)

Re: [Spirit2X] UTF-8: failing no_case and case_handling tests

Joel de Guzman-2
Francois Barel wrote:

> Currently the Qi no_case and Karma case_handling tests are failing for
> me (gcc-4.1 on 32-bit Linux) with the following error:
>
> ====== BEGIN OUTPUT ======
> terminate called after throwing an instance of '...<std::out_of_range>...'
>   what():  Invalid UTF-32 code point U+0xffffffe1 encountered while
> trying to encode UTF-16 sequence
>
> EXIT STATUS: 134
> ====== END OUTPUT ======
>
> thrown inside utf8_output_iterator's push method (in
> boost/regex/pending/unicode_iterator.hpp).
>
> The attached patch solves it by ensuring that in UTF-8 conversions,
> chars are first converted to unsigned types before being converted to
> ints (to avoid chars >127 being erroneously converted to negative
> ints).
> Does this look OK to you?

Looks good!

> Also, is there a reason for
> boost/spirit/home/support/string_traits.hpp's remove_sign to be there?
> Because:
> 1. AFAICT it's not used,
> 2. maybe it's just a name issue, but for chars it doesn't remove the
> sign -- on the contrary, it puts it,
> 3. if removing the sign is the goal, isn't that what type_traits's
> make_unsigned is for?

Yes, I agree it has to be removed. It has no function. Probably
an artifact.

Regards,
--
Joel de Guzman
http://www.boostpro.com
http://spirit.sf.net



Re: [Spirit2X] UTF-8: failing no_case and case_handling tests

Francois Barel
Joel de Guzman wrote:
>> The attached patch solves it by ensuring that in UTF-8 conversions,
>> chars are first converted to unsigned types before being converted to
>> ints (to avoid chars >127 being erroneously converted to negative
>> ints).
>> Does this look OK to you?
>
> Looks good!

Committed in r1125.

>> Also, is there a reason for
>> boost/spirit/home/support/string_traits.hpp's remove_sign to be there?
>> Because:
>> 1. AFAICT it's not used,
>> 2. maybe it's just a name issue, but for chars it doesn't remove the
>> sign -- on the contrary, it puts it,
>> 3. if removing the sign is the goal, isn't that what type_traits's
>> make_unsigned is for?
>
> Yes, I agree it has to be removed. It has no function. Probably
> an artifact.

Done in r1126.

Regards,
François
