bug or documentation missing boost posix_time time_duration not_a_date_time

classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

bug or documentation missing boost posix_time time_duration not_a_date_time

Boost - Users mailing list
hi

please compile this program:

#include <boost/date_time/posix_time/ptime.hpp>
#include <iostream>
#include <vector>

using namespace boost::posix_time;
using namespace std;

#define x(v) cout << #v << " : " << (v) << '\n';

int main(int argc, char *argv[])
{
  time_duration a(not_a_date_time);
  time_duration b;
  x(a<b);
  x(a>b);
  x(a<=b);
  x(a>=b);
  x(b<a);
  x(b>a);
  x(b<=a);
  x(b>=a);
  x(a==b);
  x(b==a);
  x(a.is_special());
  x(b.is_special());
  vector<time_duration> v0, v1;
  v0.push_back(time_duration());
  v0.push_back(time_duration(not_a_date_time));
  sort(v0.begin(), v0.end());
  v1.push_back(time_duration(not_a_date_time));
  v1.push_back(time_duration());
  sort(v1.begin(), v1.end());
  x(v0==v1);
  return 0;
}

the output is:

a<b : 0
a>b : 0
a<=b : 1
a>=b : 1
b<a : 0
b>a : 0
b<=a : 1
b>=a : 1
a==b : 0
b==a : 0
a.is_special() : 1
b.is_special() : 0
v0==v1 : 0

if the time_duration datatype is a complete partial order which is for
example needed for it to be able to be sorted, then the above output is
partly illegal:

if not a<b and not b<a then must be a=b. but this is not the case.

if a<=b and b<=a then must be a=b. but this is not the case.

so as expected the two vectors v0 and v1 are not equal after being
sorted even though they certainly should be!

i do not understand why time_duration is behaving in such a strange way
here.

I am using time_duration(not_a_date_time) as a SQL-NULL value just like
i am using date(not_a_date_time) and ptime(not_a_date_time) as SQL-NULL
values.

obviously I should not use time_duration(not_a_date_time). maybe this
is the reason why time_duration() != time_duration(not_a_date_time)? I
don't understand the design here. for example
date()==date(not_a_date_time) and also ptime()==ptime(not_a_date_time)
but not with time_duration. why???

why is there no documentation that time_duration is not a complete
partial order and thus not sortable - at least when using
not_a_date_time?

cya
erik
_______________________________________________
Boost-users mailing list
[hidden email]
https://lists.boost.org/mailman/listinfo.cgi/boost-users
Reply | Threaded
Open this post in threaded view
|

Re: bug or documentation missing boost posix_time time_duration not_a_date_time

Boost - Users mailing list
On 23/05/2019 00:30, Erik Thiele wrote:
> if the time_duration datatype is a complete partial order which is for
> example needed for it to be able to be sorted, then the above output is
> partly illegal:
>
> if not a<b and not b<a then must be a=b. but this is not the case.
>
> if a<=b and b<=a then must be a=b. but this is not the case.

not_a_date_time is like a NaN.  NaNs are weird like that.

Returning true for the <= and >= operations is probably actually a bug;
it's a side effect of using less_than_comparable when it should be using
partially_ordered instead.  You should raise an issue about that on
GitHub or submit a PR to fix it.

> so as expected the two vectors v0 and v1 are not equal after being
> sorted even though they certainly should be!

A default constructed time_duration is equal to zero duration.  A
not_a_date_time time_duration is not a zero duration.  Why would you
expect them to be equal?

> i do not understand why time_duration is behaving in such a strange way
> here.
>
> I am using time_duration(not_a_date_time) as a SQL-NULL value just like
> i am using date(not_a_date_time) and ptime(not_a_date_time) as SQL-NULL
> values.

SQL NULLs are weird, just like NaNs, so this should not be surprising to
you.
_______________________________________________
Boost-users mailing list
[hidden email]
https://lists.boost.org/mailman/listinfo.cgi/boost-users
Reply | Threaded
Open this post in threaded view
|

Re: bug or documentation missing boost posix_time time_duration not_a_date_time

Boost - Users mailing list
Am Thu, 23 May 2019 11:18:43 +1200
schrieb Gavin Lambert via Boost-users <[hidden email]>:

> On 23/05/2019 00:30, Erik Thiele wrote:
> > if the time_duration datatype is a complete partial order which is
> > for example needed for it to be able to be sorted, then the above
> > output is partly illegal:
> >
> > if not a<b and not b<a then must be a=b. but this is not the case.
> >
> > if a<=b and b<=a then must be a=b. but this is not the case.  
>
> not_a_date_time is like a NaN.  NaNs are weird like that.

i don't think we should mix floating point issues with the date_time
library.

> Returning true for the <= and >= operations is probably actually a
> bug; it's a side effect of using less_than_comparable when it should
> be using partially_ordered instead.  You should raise an issue about
> that on GitHub or submit a PR to fix it.

I will raise an issue.

> > so as expected the two vectors v0 and v1 are not equal after being
> > sorted even though they certainly should be!  
>
> A default constructed time_duration is equal to zero duration.  A
> not_a_date_time time_duration is not a zero duration.  Why would you
> expect them to be equal?

I do not expect time_duration(not_a_date_time) and
time_duration(hours(0)) to be equal. instead i expect them to be not
equal. actually they are not equal, which is correctly reported by
operator==. but i expect v0 and v1 to be equal because after sorting two
vectors with the same contents must be equal. watch closely, v0 and v1
have the same elements. they are just push_backed in a different order.
so after sorting v0 and v1, they need to be equal, but aren't. this is
because operator< is not a complete partial order here.

> > i do not understand why time_duration is behaving in such a strange
> > way here.
> >
> > I am using time_duration(not_a_date_time) as a SQL-NULL value just
> > like i am using date(not_a_date_time) and ptime(not_a_date_time) as
> > SQL-NULL values.  
>
> SQL NULLs are weird, just like NaNs, so this should not be surprising
> to you.

I do not share the opinion that anything is weird :-) I was just
looking for a reason for the asymmetry here. anyway i think it is not
possible anymore to change this since there is too much code using the
library out there and everybody is now used to this asymmetry.

but the operator< issue can hopefully be fixed. i make a report.

cu
erik
_______________________________________________
Boost-users mailing list
[hidden email]
https://lists.boost.org/mailman/listinfo.cgi/boost-users
Reply | Threaded
Open this post in threaded view
|

Re: bug or documentation missing boost posix_time time_duration not_a_date_time

Boost - Users mailing list
On 23/05/2019 17:37, Erik Thiele wrote:
>> not_a_date_time is like a NaN.  NaNs are weird like that.
>
> i don't think we should mix floating point issues with the date_time
> library.

It's not implemented with an actual floating point NaN, but it is
clearly intended to have equivalent behaviour.

https://github.com/boostorg/date_time/blob/develop/include/boost/date_time/int_adapter.hpp

See the above code.  Although interestingly unlike some floating point
implementations, it looks like it does consider not_a_date_time ==
not_a_date_time.

But yes, it should either give not_a_date_time an actual consistent
relative ordering or use partially_ordered instead of less_than_comparable.

> but i expect v0 and v1 to be equal because after sorting two
> vectors with the same contents must be equal. watch closely, v0 and v1
> have the same elements. they are just push_backed in a different order.
> so after sorting v0 and v1, they need to be equal, but aren't. this is
> because operator< is not a complete partial order here.

Ah, I missed that part.  But still, no.

Sorting only uses operator<, and those operators are implemented
consistently with NaN-like behaviour (partial ordering).

In particular, it is *not* a "strict weak ordering", which is what
std::sort assumes.

not_a_date_time has no specific order, therefore sort is allowed to put
it wherever it likes; you can't make any assumptions about its position
relative to other values.  It's not even required to group them together
if they're spread out (although some sort algorithms might do that).

If you don't like that behaviour, you can pass your own different
comparator to sort which does assign a strict ordering to the values.
For example, you can declare that not_a_date_time should be treated as
strictly less than any actual duration.

The bug you noticed in the <= and >= return values has no effect on sort.
_______________________________________________
Boost-users mailing list
[hidden email]
https://lists.boost.org/mailman/listinfo.cgi/boost-users
Reply | Threaded
Open this post in threaded view
|

Re: bug or documentation missing boost posix_time time_duration not_a_date_time

Boost - Users mailing list
I assume that time_duration is implemented as an integer. Some values
of this integer are then reserved for special values like
not_a_date_time.

this can be seen here:

https://github.com/boostorg/date_time/blob/develop/include/boost/date_time/int_adapter.hpp

this should have the consequence that when studying the assembler code
for comparing two time_durations, all that happens is the assembler
code of "less than compare" for integer type. if additional comparisons
and branchings take place, then I assume the code is not meeting the
design criteria of being optimized.

Well... this is what I did, copy paste this code to justcompare.cpp:

#include <boost/date_time/posix_time/ptime.hpp>

using namespace boost::posix_time;

bool compare_time_duration(time_duration a, time_duration b)
{
  return a < b;
}

bool compare_int(int a, int b)
{
  return a < b;
}

EOF

g++ -Wall -O2 -c justcompare.cpp
objdump -S --demangle justcompare.o


justcompare.o:     file format elf64-x86-64


Disassembly of section .text:

0000000000000000
<compare_time_duration(boost::posix_time::time_duration, boost::posix_time::time_duration)>:
   0:   48 8b 17                mov    (%rdi),%rdx
   3:   48 b9 ff ff ff ff ff    movabs $0x7fffffffffffffff,%rcx
   a:   ff ff 7f
   d:   48 8d 04 0a             lea    (%rdx,%rcx,1),%rax
  11:   48 83 f8 fd             cmp    $0xfffffffffffffffd,%rax
  15:   77 29                   ja     40
  17:   48 bf fe ff ff ff ff    movabs $0x7ffffffffffffffe,%rdi
  1e:   ff ff 7f
  21:   31 c0                   xor    %eax,%eax
  23:   48 39 fa                cmp    %rdi,%rdx
  26:   74 60                   je     88
  28:   48 8b 36                mov    (%rsi),%rsi
  2b:   48 01 f1                add    %rsi,%rcx
  2e:   48 83 f9 fd             cmp    $0xfffffffffffffffd,%rcx
  32:   77 20                   ja     54
  34:   48 39 fe                cmp    %rdi,%rsi
  37:   74 0a                   je     43
  39:   48 39 f2                cmp    %rsi,%rdx
  3c:   0f 9c c0                setl   %al
  3f:   c3                      retq  
  40:   48 8b 36                mov    (%rsi),%rsi
  43:   48 b9 fe ff ff ff ff    movabs $0x7ffffffffffffffe,%rcx
  4a:   ff ff 7f
  4d:   31 c0                   xor    %eax,%eax
  4f:   48 39 ce                cmp    %rcx,%rsi
  52:   74 3c                   je     90
  54:   48 bf 00 00 00 00 00    movabs $0x8000000000000000,%rdi
  5b:   00 00 80
  5e:   48 39 fa                cmp    %rdi,%rdx
  61:   74 35                   je     98
  63:   48 b9 ff ff ff ff ff    movabs $0x7fffffffffffffff,%rcx
  6a:   ff ff 7f
  6d:   48 39 ce                cmp    %rcx,%rsi
  70:   0f 94 c0                sete   %al
  73:   48 39 ca                cmp    %rcx,%rdx
  76:   41 0f 95 c0             setne  %r8b
  7a:   44 20 c0                and    %r8b,%al
  7d:   74 21                   je     a0
  7f:   f3 c3                   repz retq
  81:   0f 1f 80 00 00 00 00    nopl   0x0(%rax)
  88:   f3 c3                   repz retq
  8a:   66 0f 1f 44 00 00       nopw   0x0(%rax,%rax,1)
  90:   f3 c3                   repz retq
  92:   66 0f 1f 44 00 00       nopw   0x0(%rax,%rax,1)
  98:   48 39 d6                cmp    %rdx,%rsi
  9b:   0f 95 c0                setne  %al
  9e:   c3                      retq  
  9f:   90                      nop
  a0:   48 39 ca                cmp    %rcx,%rdx
  a3:   74 da                   je     7f
  a5:   48 39 fe                cmp    %rdi,%rsi
  a8:   75 8f                   jne    39
  aa:   f3 c3                   repz retq
  ac:   0f 1f 40 00             nopl   0x0(%rax)

00000000000000b0 <compare_int(int, int)>:
  b0:   39 f7                   cmp    %esi,%edi
  b2:   0f 9c c0                setl   %al
  b5:   c3                      retq  

(i removed the verbose jump target strings from the output to make the
email readable)

see the difference between compare_time_duration and compare_int...

I have a bad feeling in my stomach... sorry, something is going very
wrong here :-)




cya
erik
_______________________________________________
Boost-users mailing list
[hidden email]
https://lists.boost.org/mailman/listinfo.cgi/boost-users
Reply | Threaded
Open this post in threaded view
|

Re: bug or documentation missing boost posix_time time_duration not_a_date_time

Boost - Users mailing list
On 25/05/2019 01:41, Erik Thiele wrote:
> I assume that time_duration is implemented as an integer. Some values
> of this integer are then reserved for special values like
> not_a_date_time.
>
> this can be seen here:
>
> https://github.com/boostorg/date_time/blob/develop/include/boost/date_time/int_adapter.hpp

Yes, that's what I said before.

> this should have the consequence that when studying the assembler code
> for comparing two time_durations, all that happens is the assembler
> code of "less than compare" for integer type. if additional comparisons
> and branchings take place, then I assume the code is not meeting the
> design criteria of being optimized.

No.  That would be true if it were just delegating directly to the
underlying integer comparison (which, FWIW, is what ptime does).

int_adapter, however, intentionally performs a more complex comparison
in order to treat the special values as special, and thus in an order
different from their underlying integer "implementation detail".

In particular, not_a_date_time is implemented as a NaN-like "not in any
order".  (For the purposes of sort, this tends to make it equivalent to
any other value, and thus it can sort to anywhere at all in the list,
usually remaining motionless.)

Have a look at the implementation of int_adapter::compare in the above code.

Whether this is the "right" choice, I don't know.  But it's how it
works, and it appears to have been intentional.
_______________________________________________
Boost-users mailing list
[hidden email]
https://lists.boost.org/mailman/listinfo.cgi/boost-users