Re: [bitfield] Initial bitfield proposal available in the vault

Re: [bitfield] Initial bitfield proposal available in the vault

Martin Bonner
----Original Message----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Andy Little
Sent: 07 March 2006 15:14
To: [hidden email]
Subject: Re: [boost] [bitfield] Initial bitfield proposal available in the vault

> "Martin Bonner" <[hidden email]> wrote in message
> news:[hidden email]...
>> ----Original Message----
>> From: Emile Cormier
>>
>>> The bitfield mechanism relies on this assumption: Unions of
>>> non-polymorphic, non-derived objects, having the exact same
>>> underlying data member type, will have the same size as this
>>> underlying data member type. I'm no language lawyer, so please let
>>> me know if this is a safe and portable assumption.
>>
>> I'm not quite sure what you mean, but given:
>> struct a { unsigned char ch; };
>> struct b { unsigned char ch; };
>> union u { a theA; b theB; };
>> then you are not guaranteed that sizeof(u) == sizeof(unsigned char).
>
> Though in practice you can use:
>
> BOOST_STATIC_ASSERT(sizeof(u) == sizeof(unsigned char))

My point was exactly that you CANNOT use that.  (On a certain class of
machine).
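
To make the distinction concrete, here is a minimal sketch of which size
relations can be asserted on every implementation and which one only holds
on some. It assumes a C++11 compiler and uses static_assert in place of
BOOST_STATIC_ASSERT; the helper names union_is_char_sized and
union_padding_bytes are invented for illustration and are not part of the
proposed library.

```cpp
#include <cassert>
#include <cstddef>

struct a { unsigned char ch; };
struct b { unsigned char ch; };
union u { a theA; b theB; };

// These relations hold on every conforming implementation.
static_assert(sizeof(unsigned char) == 1,
              "sizeof(unsigned char) is 1 by definition");
static_assert(sizeof(a) >= sizeof(unsigned char),
              "a struct is at least as large as its member");
static_assert(sizeof(u) >= sizeof(a) && sizeof(u) >= sizeof(b),
              "a union is at least as large as its largest member");

// This relation is NOT guaranteed, so it is reported rather than asserted.
// (Hypothetical helper, named here only for the example.)
constexpr bool union_is_char_sized() {
    return sizeof(u) == sizeof(unsigned char);
}

// Number of bytes the union occupies beyond the single unsigned char.
constexpr std::size_t union_padding_bytes() {
    return sizeof(u) - sizeof(unsigned char);
}

// Runtime restatement of the guaranteed part, handy in a unit test.
inline void check_guaranteed_relations() {
    assert(sizeof(u) >= sizeof(unsigned char));
}
```

On a byte-addressed machine union_is_char_sized() will normally be true; on
the word-addressed machines discussed in this thread it would be false, and a
static assertion on it would stop the build.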

>
>> On word addressed machines (which /are/ still being built), it is
>> almost certain that the minimum size for a struct is a complete
>> word.  This is because the C and C++ standards effectively promise
>> that pointers to structs are all of the same size (the size of a
>> pointer-to-struct does not depend on the contents of the struct).
>> It is desirable that a pointer-to-struct be the smaller,
>> cheaper-to-dereference pointer to word (rather than the larger
>> more-expensive-to-dereference pointer to char), so the smallest
>> struct has to occupy a whole word.
>
> I don't see why the size of a pointer to a struct affects the size of
> a struct, which in the case of an empty struct is often 1 byte?

I don't think you have understood what a word addressed machine is!  

On most modern architectures there are 8 bits stored at (for example)
0x100 and another 8 bits at 0x101.  The 32 bits at 0x100 cover 0x100,
0x101, 0x102, and 0x103.

On a word addressed machine, there may be 36 bits stored at 02000 and
another (different) 36 bits stored at 02001.  A simple 36-bit pointer
can address individual words, but not sub-units within those words.  To
address individual bytes, you need a double-word pointer.  One word
identifies the word, and a few bits within the second word identify
which byte you are addressing.

On such a machine, it makes sense for an empty struct to occupy a whole
word (which is four nine-bit bytes), so that a pointer to struct can
(always) be a single word pointer.
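
The two pointer flavours can be modelled in ordinary C++ as a rough sketch.
Everything here is an assumption made for illustration: the 36-bit word is
held in the low bits of a 64-bit integer, byte 0 is taken to be the most
significant nine bits, and the names WordPtr, BytePtr, load_word and
load_byte are invented.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed geometry from the example above: 36-bit words, four 9-bit bytes.
constexpr unsigned kBitsPerByte  = 9;
constexpr unsigned kBytesPerWord = 4;

using Word = std::uint64_t;  // one 36-bit word, kept in the low bits

// A word pointer is just a word address: it fits in one machine word.
struct WordPtr { std::size_t word; };

// A byte pointer needs the word address plus a byte index: two machine words.
struct BytePtr { std::size_t word; unsigned byte; };

// Dereferencing a word pointer is a single memory access.
inline Word load_word(const std::vector<Word>& mem, WordPtr p) {
    return mem[p.word];
}

// Dereferencing a byte pointer fetches the whole word, then shifts and masks.
inline unsigned load_byte(const std::vector<Word>& mem, BytePtr p) {
    assert(p.byte < kBytesPerWord);
    const unsigned shift = (kBytesPerWord - 1 - p.byte) * kBitsPerByte;
    return static_cast<unsigned>((mem[p.word] >> shift) & 0x1FFu);
}
```

If every struct occupies at least one whole word, a pointer to struct never
needs the byte index, so it can always be the cheaper WordPtr form.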




--
Martin Bonner
[hidden email]
Pi Technology, Milton Hall, Ely Road, Milton, Cambridge, CB4 6WZ,
ENGLAND Tel: +44 (0)1223 203894

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

Re: [bitfield] Initial bitfield proposal available in the vault

Andy Little

"Martin Bonner"  wrote

> Andy Little wrote
>> "Martin Bonner" <[hidden email]> wrote in message
>> news:[hidden email]...
>>> ----Original Message----
>>> From: Emile Cormier
>>>
>>>> The bitfield mechanism relies on this assumption: Unions of
>>>> non-polymorphic, non-derived objects, having the exact same
>>>> underlying data member type, will have the same size as this
>>>> underlying data member type. I'm no language lawyer, so please let
>>>> me know if this is a safe and portable assumption.
>>>
>>> I'm not quite sure what you mean, but given:
>>> struct a { unsigned char ch; };
>>> struct b { unsigned char ch; };
>>> union u { a theA; b theB; };
>>> then you are not guaranteed that sizeof(u) == sizeof(unsigned char).
>>
>> Though in practice you can use:
>>
>> BOOST_STATIC_ASSERT(sizeof(u) == sizeof(unsigned char))
>
> My point was exactly that you CANNOT use that.  (On a certain class of
> machine).

Why not? Will it a) compile but be incorrect, b) not compile but be incorrect,
or c) not compile but be correct?
How do you store an unsigned char then? (And whatever way that is, just pretend
to the hardware that the struct is an unsigned char.)

>>> On word addressed machines (which /are/ still being built), it is
>>> almost certain that the minimum size for a struct is a complete
>>> word.  This is because the C and C++ standards effectively promise
>>> that pointers to structs are all of the same size (the size of a
>>> pointer-to-struct does not depend on the contents of the struct).
>>> It is desirable that a pointer-to-struct be the smaller,
>>> cheaper-to-dereference pointer to word (rather than the larger
>>> more-expensive-to-dereference pointer to char), so the smallest
>>> struct has to occupy a whole word.
>>
>> I don't see why the size of a pointer to a struct affects the size of
>> a struct, which in the case of an empty struct is often 1 byte?
>
> I don't think you have understood what a word addressed machine is!
>
> On most modern architectures there are 8 bits stored at (for example)
> 0x100 and another 8 bits at 0x101.  The 32 bits at 0x100 cover 0x100,
> 0x101, 0x102, and 0x103.
>
> On a word addressed machine, there may be 36 bits stored at 02000 and
> another (different) 36 bits stored at 02001.  A simple 36-bit pointer
> can address individual words, but not sub-units within those words.  To
> address individual bytes, you need a double-word pointer.  One word
> identifies the word, and a few bits within the second word identify
> which byte you are addressing.
>
> On such a machine, it makes sense for an empty struct to occupy a whole
> word (which is four nine-bit bytes), so that a pointer to struct can
> (always) be a single word pointer.

Sounds like there is a choice: either make unsigned char 36 bits and use a small
pointer, or make unsigned char 9 bits and use a large pointer. I don't know
whether C++ will allow both. It reminds me of the old Microchip PIC architecture
though. Last I looked they were working to make their hardware compatible with C,
FWIW, by just increasing the number of address lines, because the earlier parts
were so difficult to deal with, with the separate extra bits in an address
and so on (though that was a kind of paged memory IIRC).  IOW, in their case they
realised the downside of the idea, as I understand it, and moved to one drop
linear addressing for later designs.

Maybe I have got the wrong end of the proverbial stick again though?

regards
Andy Little




Re: [bitfield] Initial bitfield proposal available in the vault

Martin Bonner
----Original Message----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Andy Little
Sent: 07 March 2006 17:43
To: [hidden email]
Subject: Re: [boost] [bitfield] Initial bitfield proposal available in the vault

> "Martin Bonner"  wrote
>> Andy Little wrote
>>> "Martin Bonner" <[hidden email]> wrote in message
>>> news:[hidden email]...
>>>> ----Original Message----
>>>> From: Emile Cormier
>>>>
>>>>> The bitfield mechanism relies on this assumption: Unions of
>>>>> non-polymorphic, non-derived objects, having the exact same
>>>>> underlying data member type, will have the same size as this
>>>>> underlying data member type. I'm no language lawyer, so please let
>>>>> me know if this is a safe and portable assumption.
>>>>
>>>> I'm not quite sure what you mean, but given:
>>>> struct a { unsigned char ch; };
>>>> struct b { unsigned char ch; };
>>>> union u { a theA; b theB; };
>>>> then you are not guaranteed that sizeof(u) == sizeof(unsigned
>>>> char).
>>>
>>> Though in practice you can use:
>>>
>>> BOOST_STATIC_ASSERT(sizeof(u) == sizeof(unsigned char))
>>
>> My point was exactly that you CANNOT use that.  (On a certain class
>> of machine).
>
> Why not? Will it a) compile but be incorrect, b) not compile but
> be incorrect, or c) not compile but be correct?

The assert will fire.

> How do you store an unsigned char then? (And whatever way that is
> just pretend to the hardware that the struct is an unsigned char)

Storing an unsigned char is expensive.  It involves extra bit twiddling.
(See below)
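
As a rough illustration of that bit twiddling, here is a sketch of a byte
store on a simulated word-addressed memory. The layout is assumed (36-bit
words held in 64-bit integers, four 9-bit bytes per word, byte 0 most
significant) and the function names are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Word = std::uint64_t;  // one 36-bit word, kept in the low bits

constexpr unsigned kBitsPerByte  = 9;
constexpr unsigned kBytesPerWord = 4;
constexpr Word     kByteMask     = 0x1FF;  // nine 1-bits

// Storing a whole word is a single memory write.
inline void store_word(std::vector<Word>& mem, std::size_t word, Word value) {
    mem[word] = value;
}

// Storing one 9-bit byte is a read-modify-write of the containing word.
inline void store_byte(std::vector<Word>& mem, std::size_t word,
                       unsigned byte, unsigned value) {
    assert(byte < kBytesPerWord);
    const unsigned shift = (kBytesPerWord - 1 - byte) * kBitsPerByte;
    Word w = mem[word];                                    // read the word
    w &= ~(kByteMask << shift);                            // clear the old byte
    w |= (static_cast<Word>(value) & kByteMask) << shift;  // merge the new byte
    mem[word] = w;                                         // write it back
}
```

The byte store costs a load, two logical operations, a shift and a store,
where the word store costs only the store.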

>
>>>> On word addressed machines (which /are/ still being built), it is
>>>> almost certain that the minimum size for a struct is a complete
>>>> word.  This is because the C and C++ standards effectively promise
>>>> that pointers to structs are all of the same size (the size of a
>>>> pointer-to-struct does not depend on the contents of the struct).
>>>> It is desirable that a pointer-to-struct be the smaller,
>>>> cheaper-to-dereference pointer to word (rather than the larger
>>>> more-expensive-to-dereference pointer to char), so the smallest
>>>> struct has to occupy a whole word.
>>>
>>> I don't see why the size of a pointer to a struct affects the size of
>>> a struct, which in the case of an empty struct is often 1 byte?
>>
>> I don't think you have understood what a word addressed machine is!
>>
>> On most modern architectures there are 8 bits stored at (for
>> example) 0x100 and another 8 bits at 0x101.  The 32 bits at 0x100
>> cover 0x100, 0x101, 0x102, and 0x103.
>>
>> On a word addressed machine, there may be 36 bits stored at 02000 and
>> another (different) 36 bits stored at 02001.  A simple 36-bit pointer
>> can address individual words, but not sub-units within those words.
>> To address individual bytes, you need a double-word pointer.  One
>> word identifies the word, and a few bits within the second word
>> identify which byte you are addressing.
>>
>> On such a machine, it makes sense for an empty struct to occupy a
>> whole word (which is four nine-bit bytes), so that a pointer to
>> struct can (always) be a single word pointer.
>
> Sounds like there is a choice. Either make unsigned char 36 bits and
> use a small pointer or make unsigned char 9 bits and use a large
> pointer.

Yup.  And the COMPILER writer gets to make that choice.

> I don't know whether C++ will allow both?

It will allow the compiler writer to make either of those choices.


> It reminds me of the old Microchip PIC architecture
> though. Last I looked they were working to make their hardware
> compatible with C  FWIW  and just increasing the number of address
> lines, because they were previously so difficult to deal with, with
> the separate extra bits in an address and so on (though that was a
> kind of paged memory IIRC).  IOW in their case they
> realised the downside of the idea as I understand it and moved to one
> drop linear addressing for later designs.

I believe it is a similar sort of idea.  This is the same sort of thing
as Prime changing their instruction set so that memset( &ptr, 0,
sizeof(ptr) ) set ptr to a null pointer (the natural representation of a
null pointer on a Prime was 07777/000000).
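
For readers who have not met the memset issue, a small sketch of the two ways
of producing a null pointer follows. The function names are invented; the
point is only that assignment is portable while zero-filling relies on the
implementation representing null as all-bits-zero.

```cpp
#include <cassert>
#include <cstring>

// Portable: the compiler emits whatever bit pattern represents a null
// pointer on this implementation.
inline int* make_null_portably() {
    int* ptr = nullptr;
    return ptr;
}

// Not portable: fills the pointer object with all-bits-zero, which yields a
// null pointer only if the implementation represents null that way.
inline int* make_null_by_memset() {
    int* ptr;
    std::memset(&ptr, 0, sizeof(ptr));
    return ptr;
}
```

On most current platforms both functions return the same value, which is why
code relying on the second form is common.
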
>
> Maybe I have got the wrong end of the proverbial stick again though ?

My point is that assuming
>>> BOOST_STATIC_ASSERT(sizeof(u) == sizeof(unsigned char))

means that there is a class of C++ implementations where the library
will not work.  It is then up to the library author to consider whether
that class is sufficiently important to change his implementation for (it
may well not be).


--
Martin Bonner
[hidden email]
Pi Technology, Milton Hall, Ely Road, Milton, Cambridge, CB4 6WZ,
ENGLAND Tel: +44 (0)1223 203894


Re: [bitfield] Initial bitfield proposal available in the vault

Martin Bonner
----Original Message----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Andy Little
Sent: 08 March 2006 03:42
To: [hidden email]
Subject: Re: [boost] [bitfield] Initial bitfield proposal available in the vault

> "Emile Cormier"  wrote
>
>> Martin is on to something about using pointers.

I'm not sure I am.  I was trying to explain (and failing) why structs
may be larger than their embedded data.  (By making the struct larger,
the pointer can be smaller).

> Surely any pointer must be capable of being converted to a void*,
> which means void* has to know about the extra bits? How does that
> work?

sizeof(struct*) <= sizeof(void*)
sizeof(char*)==sizeof(void*)
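
Those two relations can be written down in C++ roughly as follows. This is a
sketch assuming C++11; the struct S and the helper names are invented. Only
the char*/void* equality is asserted at compile time, because the standard
requires the two types to share a representation; the struct pointer relation
is merely reported.

```cpp
#include <cassert>

struct S { unsigned char ch; };

// void* and char* share a representation, so their sizes always match.
static_assert(sizeof(char*) == sizeof(void*),
              "char* and void* have the same representation");

// Reported, not asserted: a struct pointer may be smaller than void*
// on a word-addressed machine, and is the same size on common ones.
constexpr bool struct_ptr_fits_in_void_ptr() {
    return sizeof(S*) <= sizeof(void*);
}

// Any object pointer converts to void* and back without loss.
inline bool round_trips_through_void(S* p) {
    void* v = p;
    return static_cast<S*>(v) == p;
}
```

So void* does carry the extra bits: it is the large pointer form, and a small
struct pointer is widened when converted to it.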

Note that if you cast an unsigned char* to an arbitrary struct* and
back, you are **NOT** guaranteed to get your original pointer back.  On
the other hand you are guaranteed that casting an arbitrary pointer to
struct to another pointer to struct and back will return you the
original value.  
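
The loss on the char* to struct* conversion can be modelled with a sketch
like the one below. It is a simulation rather than real pointer code: WordPtr
stands in for a struct pointer, BytePtr for an unsigned char pointer, and all
names are invented for the example.

```cpp
#include <cassert>
#include <cstddef>

struct WordPtr { std::size_t word; };                 // models struct*
struct BytePtr { std::size_t word; unsigned byte; };  // models unsigned char*

// struct* -> char*: the byte index is simply 0, so nothing is lost.
inline BytePtr to_byte_ptr(WordPtr p) { return BytePtr{p.word, 0}; }

// char* -> struct*: there is nowhere to keep the byte index, so it is dropped.
inline WordPtr to_word_ptr(BytePtr p) { return WordPtr{p.word}; }

// Field-by-field comparison of two byte pointers.
inline bool same(BytePtr x, BytePtr y) {
    return x.word == y.word && x.byte == y.byte;
}
```

A byte pointer into the middle of a word does not survive the round trip
through the word pointer form, while a word pointer survives the round trip
through the byte pointer form.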

> I would have thought it would be possible to reinterpret_cast (or
> somehow convert) the struct to the inbuilt type in these situations,
> thus fooling the compiler into storing that type?
> The problem I see is that there seem to be two types of pointers
> here, which doesn't seem to be standard C++ behaviour?

It is, oh it is!

--
Martin Bonner
[hidden email]
Pi Technology, Milton Hall, Ely Road, Milton, Cambridge, CB4 6WZ,
ENGLAND Tel: +44 (0)1223 203894
