[spirit] get_script method

Keshav Krity

I am trying to get the script name for a given Unicode value. This is the method I am calling.

 

http://www.boost.org/doc/libs/1_49_0/boost/spirit/home/support/char_encoding/unicode/query.hpp

 

inline properties::script get_script(::boost::uint32_t ch)
{
    return static_cast<properties::script>(detail::script_lookup(ch) & 0x3F);
}

 

enum script
{
    ...
    common = 92,
    unknown = 93
};
 
This API does not return the "common" script for characters such as space. The cause appears to be the & operation with 0x3F, which restricts the return value to the range 0-63. As a result, any value in the enum "script" greater than 63 can never be returned.
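To make the arithmetic concrete, here is a minimal standalone sketch of what a 6-bit mask does to the two enum values quoted above. It does not use Boost. The constants and the helper mask_6_bits are hypothetical names introduced only for illustration, and the contents of the lookup table are not shown in this thread, so the sketch covers only the mask itself.

```cpp
#include <cassert>
#include <cstdint>

// The two enum values quoted in the post; the rest of the enum is omitted.
constexpr std::uint32_t script_common  = 92;
constexpr std::uint32_t script_unknown = 93;

// Hypothetical helper: applies the same 6-bit mask that get_script applies.
constexpr std::uint32_t mask_6_bits(std::uint32_t stored)
{
    return stored & 0x3F;  // keeps bits 0..5 only, i.e. stored % 64
}

// 92 is 0b1011100; clearing bit 6 leaves 0b0011100, which is 28.
static_assert(mask_6_bits(script_common) == 28, "common folds onto 28");
static_assert(mask_6_bits(script_unknown) == 29, "unknown folds onto 29");

// Values up to 63 pass through unchanged, so only larger values are affected.
static_assert(mask_6_bits(63) == 63, "63 survives the mask");

// Runtime version of the same checks, for use from a test or main().
inline void check_mask_examples()
{
    assert(mask_6_bits(script_common) != script_common);
    assert(mask_6_bits(script_unknown) != script_unknown);
    assert(mask_6_bits(40) == 40);
}
```

Under these assumptions, a stored value of 92 would come back as 28, so the caller would see whichever script has the value 28 rather than "common".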

 

I do not understand the logic behind this. Can somebody please explain why the values are restricted to the first 64 only?

 

Thanks

 


_______________________________________________
Spirit-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/spirit-devel
Re: [spirit] get_script method

Joel de Guzman-2
On 5/5/2012 2:39 PM, Keshav Krity wrote:
> I am trying to get the script name for a given Unicode value. This is the method I am
> calling.
>
> http://www.boost.org/doc/libs/1_49_0/boost/spirit/home/support/char_encoding/unicode/query.hpp
>
>  inline properties::script get_script(::boost::uint32_t ch) { return
> static_cast<properties::script>(detail::script_lookup(ch) & 0x3F); }

[..]

> This api is not returning "common" script for characters such as space. The reason
> being the & operation with 0x3F. It restricts the return value to the range 0-63. Thus
> the values in the enum "script" greater than 63 will never be returned.
>
> I fail to understand the logic behind this. Can somebody please explain the behavior,
> why the values are being restricted to top 64 only?

Looks like a bug indeed. Unicode support has not been fully tested.
Would you care to provide some tests for them?

Regards,
--
Joel de Guzman
http://www.boostpro.com
http://boost-spirit.com




Re: [spirit] get_script method

Joel de Guzman-2
On 5/7/2012 9:05 AM, Joel de Guzman wrote:

> On 5/5/2012 2:39 PM, Keshav Krity wrote:
>> I am trying to get the script name for a given Unicode value. This is the method I am
>> calling.
>>
>> http://www.boost.org/doc/libs/1_49_0/boost/spirit/home/support/char_encoding/unicode/query.hpp
>>
>>  inline properties::script get_script(::boost::uint32_t ch) { return
>> static_cast<properties::script>(detail::script_lookup(ch) & 0x3F); }
>
> [..]
>
>> This api is not returning "common" script for characters such as space. The reason
>> being the & operation with 0x3F. It restricts the return value to the range 0-63. Thus
>> the values in the enum "script" greater than 63 will never be returned.
>>
>> I fail to understand the logic behind this. Can somebody please explain the behavior,
>> why the values are being restricted to top 64 only?
>
> Looks like a bug indeed. Unicode support has not been fully tested.
> Would you care to provide some tests for them?

This is fixed in Boost trunk, FWIW. I'd still welcome some tests if you
care to provide some.

Regards,
--
Joel de Guzman
http://www.boostpro.com
http://boost-spirit.com



