[serialization] Serialisation/deserialisation of floating-point values


Re: [serialization] Serialisation/deserialisation of floating-point values

Peter Broadwell-2
Note the difference between the "definition" formula, which uses 3010/10000,
and the suggested formula, which uses 3030/10000.

Perhaps this is on purpose; if not, it may explain why the tests done
later in this thread, which use the 3030/10000 version, had trouble?

;;peter


Paul A Bristow wrote:
 > [...]
 > For C++, using numeric limits,
 >
 > So it is convenient instead to use the following formula which can be
 > calculated at compile time:
 > 2 + std::numeric_limits<double>::digits * 3010/10000;
 >
 > [...]
 > and I suggest that this should be:
 >
 > os << std::setprecision(2 + std::numeric_limits<double>::digits *
 > 3030/10000);
 >
 > HTH
 >
 > Paul
 >
 >
 >
_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

Re: [serialization] Serialisation/deserialisation of floating-point values

Janek Kozicki
In reply to this post by pabristow
Paul A Bristow said:     (by the date of Wed, 15 Mar 2006 20:15:10 -0000)

>
> std::stringstream stream;
> double num;
> stream << std::setprecision(3 +
> std::numeric_limits<double>::digits * 3030/10000);
> stream << orig_value;
> stream >> num; // <<<<<<<<<<   This is where I believe it goes wrong, sometimes, by 1 bit :-((
>
> so orig_value != num.
>

Try the following modification to the code above. If you still get a
one-bit error, then .... well, I'd be very surprised.


#include <boost/lexical_cast.hpp>

        std::stringstream stream;
        double num;
        stream << boost::lexical_cast<std::string>(orig_value);
        std::string tmp;
        stream >> tmp;
        num=boost::lexical_cast<double>(tmp);




--
Janek Kozicki                                                         |

Re: [serialization] Serialisation/deserialisation of floating-point values

Caleb Epstein-3
On 3/15/06, Janek Kozicki <[hidden email]> wrote:
>
>
> try following modifications in above code. If you still get a mistake by
> one bit, then .... well, I'd be very surprised.


Prepare to be surprised.

Here's the exact code I compiled:

#include <iostream>
#include <sstream>
#include <cassert>
#include <boost/lexical_cast.hpp>

int main ()
{
    std::stringstream stream;
    double orig_value = 0.0019075645054089487;
    stream << boost::lexical_cast<std::string> (orig_value);
    double num = boost::lexical_cast<double> (stream.str());
    assert (num == orig_value);
}

On gcc + Linux this fails:

lc: lc.cpp:12: int main(): Assertion `num == orig_value' failed.

Breakpoint 1, main () at lc.cpp:12
12          assert (num == orig_value);
(gdb) print num
$3 = 0.0019075645054089489
(gdb) print orig_value
$4 = 0.0019075645054089487

On MSVC 8, the program also asserts and the values are similarly mismatched:

        orig_value    0.0019075645054089487    double
        num    0.0019075645054089489    double

Note that boost::lexical_cast uses a precision of
std::numeric_limits<T>::digits10 + 1 in its T-to-string conversions.  For
double, this is 16, which would probably explain the mismatch in the 17th
significant digit on two separate platforms.
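
A minimal sketch of the digit-count effect described here, not the lexical_cast code itself: it writes a double to a std::stringstream at a chosen precision and reads it back. The helper name `round_trips_at` and the two constants are made up for illustration, and the outcome assumes a standard library whose decimal conversions are correctly rounded.

```cpp
#include <cassert>
#include <iomanip>
#include <limits>
#include <sstream>

// Hypothetical helper: write `value` with `precision` significant decimal
// digits, read the text back, and report whether the result is identical.
bool round_trips_at(double value, int precision)
{
    assert(precision > 0);
    std::stringstream stream;
    stream << std::setprecision(precision) << value;
    double read_back = 0.0;
    stream >> read_back;
    return read_back == value;
}

// The value from the original report.
const double reported_value = 0.0019075645054089487;

// digits10 + 1, the precision lexical_cast is described as using
// (16 for an IEEE 64-bit double).
const int lexical_cast_precision = std::numeric_limits<double>::digits10 + 1;

// Integer approximation of 2 + digits * log10(2) (17 for an IEEE double).
const int full_precision =
    2 + std::numeric_limits<double>::digits * 3010 / 10000;
```

On such a library, `round_trips_at(reported_value, lexical_cast_precision)` is expected to be false and `round_trips_at(reported_value, full_precision)` true: 16 significant digits cannot tell every pair of adjacent doubles apart, while 17 can.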

--
Caleb Epstein
caleb dot epstein at gmail dot com

Re: [serialization] Serialisation/deserialisation of floating-point values

Janek Kozicki
Caleb Epstein said:     (by the date of Wed, 15 Mar 2006 18:19:45 -0500)

> Breakpoint 1, main () at lc.cpp:12
> 12          assert (num == orig_value);
> (gdb) print num
> $3 = 0.0019075645054089489
> (gdb) print orig_value
> $4 = 0.0019075645054089487

ok, so I am surprised :) Therefore I back off ;)

But honestly - there must be a bug somewhere (I believe it's in
lexical_cast), right? Maybe the authors of lexical_cast should add one
significant place to the string? If not - then why won't this work? Is
it really impossible to avoid losing data in conversions?

It would be best if one of the boost::lexical_cast authors would answer....

--
Janek Kozicki                                                         |

[lexical_cast] bug in T -> string ? Re: [serialization] Serialisation/deserialisation of floating-point values

Kevin Wheatley
In reply to this post by Caleb Epstein-3
Caleb Epstein wrote:
> Note that boost::lexical_cast uses a precision of
> std::numeric_limits<T>::digits10
> + 1 in its T-to-string conversions.  For double, this is 16 which would
> probably explain the mismatch on the 17th significant digit on two separate
> platforms.

So it looks like lexical_cast has the same bug (repeated) as well; see
Paul Bristow's comment earlier. Here is the code in lexical_cast.hpp:

            lexical_stream()
            {
                stream.unsetf(std::ios::skipws);

                if(std::numeric_limits<Target>::is_specialized)
                    stream.precision(std::numeric_limits<Target>::digits10 + 1);
                else if(std::numeric_limits<Source>::is_specialized)
                    stream.precision(std::numeric_limits<Source>::digits10 + 1);
            }


Should both be tweaked to add the extra digits?

stream.precision(2+std::numeric_limits<Target>::digits * 3010/10000);

and

stream.precision(2+std::numeric_limits<Source>::digits * 3010/10000);

Or am I missing the boat on my quick inspection?

Kevin

--
| Kevin Wheatley, Cinesite (Europe) Ltd | Nobody thinks this      |
| Senior Technology                     | My employer for certain |
| And Network Systems Architect         | Not even myself         |

Re: [lexical_cast] bug in T -> string ? Re: [serialization] Serialisation/deserialisation of floating-point values

Kevin Wheatley
Actually, looking at this further...

boost_1_33_1/libs/numeric/conversion/test/bounds_test.cpp:
  cout << setprecision( std::numeric_limits<long double>::digits10 ) ;
boost_1_33_1/libs/numeric/conversion/test/traits_test.cpp:
  std::cout << std::setprecision( std::numeric_limits<long double>::digits10 ) ;
boost_1_33_1/libs/numeric/conversion/test/converter_test.cpp:
  std::cout << std::setprecision( std::numeric_limits<long double>::digits10 ) ;
boost_1_33_1/libs/numeric/interval/examples/io.cpp:
  boost::io::ios_precision_saver state(stream, std::numeric_limits<T>::digits10);
boost_1_33_1/libs/numeric/interval/examples/io.cpp:
  boost::io::ios_precision_saver state(stream, std::numeric_limits<T>::digits10);

These may also want to be looked at, in terms of 'recommending to the users
by example' the correct thing to do.

(this was based upon a quick grep of the code BTW)

Kevin

--
| Kevin Wheatley, Cinesite (Europe) Ltd | Nobody thinks this      |
| Senior Technology                     | My employer for certain |
| And Network Systems Architect         | Not even myself         |

Re: [serialization] Serialisation/deserialisation of floating-point values

John Maddock
In reply to this post by Janek Kozicki
> but honestly - there must be a bug somewhere (I believe it's in
> lexical_cast), right? Maybe authors of lexical_cast should add one
> significant place to the string? If not - then why this won't work? Is
> it really impossible to avoid losing data in conversions?

No, as already discussed the problem is that many iostreams libraries do not
round trip the binary floating point representation to decimal and back
again.  This is technically possible to do (albeit with quite heroic
efforts), but apparently std lib vendors don't consider it crucial :-(

That was why I suggested using the C99 style hex format for floats in the
serialisation lib: that format is both portable and readily round-trippable
since there's no binary-to-decimal conversion involved (just binary to hex).
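
As a rough illustration of the hex-format idea, and only a sketch rather than anything from the serialization library, the following uses the C99 `%a` conversion for output and `std::strtod` for input. The function names `to_hex_text` and `from_hex_text` are hypothetical, and the code assumes the default "C" locale.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <string>

// Format a double in C99 hexadecimal floating-point notation ("%a").
// Each hex digit maps directly onto four bits of the significand, so no
// binary-to-decimal rounding takes place.
std::string to_hex_text(double value)
{
    char buffer[64];
    const int written = std::snprintf(buffer, sizeof buffer, "%a", value);
    assert(written > 0 && written < static_cast<int>(sizeof buffer));
    return std::string(buffer, static_cast<std::size_t>(written));
}

// Parse the text back; strtod accepts hexadecimal floating-point input
// in C99 and C++11.
double from_hex_text(const std::string& text)
{
    return std::strtod(text.c_str(), nullptr);
}
```

With these helpers, `from_hex_text(to_hex_text(x)) == x` should hold for any finite `x` on the same machine. The exact spelling of the text (for example the number of trailing zeros) can differ between C libraries, so only the parsed value should be compared, not the string.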

John.


Re: [serialization] Serialisation/deserialisation of floating-point values

pabristow
In reply to this post by Caleb Epstein-3
 

| -----Original Message-----
| From: [hidden email]
| [mailto:[hidden email]] On Behalf Of Caleb Epstein
| Sent: 15 March 2006 23:20
| To: [hidden email]
| Subject: Re: [boost] [serialization]
| Serialisation/deserialisationoffloating-point values
|
| On 3/15/06, Janek Kozicki <[hidden email]> wrote:
| >
| >
| > try following modifications in above code. If you still get
| a mistake by
| > one bit, then .... well, I'd be very surprised.
|
|
| Prepare to be surprised.
|
| Here's the exact code I compiled:
|
| #include <iostream>
| #include <sstream>
| #include <cassert>
| #include <boost/lexical_cast.hpp>
|
| int main ()
| {
|     std::stringstream stream;
|     double orig_value = 0.0019075645054089487;
|     stream << boost::lexical_cast<std::string> (orig_value);
|     double num = boost::lexical_cast<double> (stream.str());
|     assert (num == orig_value);
| }
|
| On gcc + Linux this fails:
|
| lc: lc.cpp:12: int main(): Assertion `num == orig_value' failed.
|
| Breakpoint 1, main () at lc.cpp:12
| 12          assert (num == orig_value);
| (gdb) print num
| $3 = 0.0019075645054089489
| (gdb) print orig_value
| $4 = 0.0019075645054089487
|
| On MSVC 8, the program also asserts and the values are
| similarly mismatched:
|
|         orig_value    0.0019075645054089487    double
|         num    0.0019075645054089489    double
|
| Note that boost::lexical_cast uses a precision of
| std::numeric_limits<T>::digits10
| + 1 in its T-to-string conversions.  For double, this is 16
| which would
| probably explain the mismatch on the 17th significant digit
| on two separate
| platforms.

Well, I pointed this mistake out over two years ago, but it still hasn't
been changed.  I have to say I think that this is a bit poor.  Our testing
of this very widely used utility is not up to Boost standards either.

Sadly though, I fear this is not the only problem.  I think that there is
also a problem in Microsoft's string-to-double input, even with enough
decimal digits, for a small proportion of decimal digit strings.  Their
testing / quality aspiration is obviously not brilliant either.  I may get
round to checking this out more fully later.

Paul

--
Paul A Bristow
Prizet Farmhouse, Kendal, Cumbria UK LA8 8AB
Phone and SMS text +44 1539 561830, Mobile and SMS text +44 7714 330204
mailto: [hidden email]  http://www.hetp.u-net.com/index.html
http://www.hetp.u-net.com/Paul%20A%20Bristow%20info.html




Re: [serialization] Serialisation/deserialisation of floating-point values

pabristow
In reply to this post by John Maddock
 

| -----Original Message-----
| From: [hidden email]
| [mailto:[hidden email]] On Behalf Of John Maddock
| Sent: 16 March 2006 10:28
| To: [hidden email]
| Subject: Re: [boost] [serialization]
| Serialisation/deserialisationoffloating-point values
|
| > but honestly - there must be a bug somewhere (I believe it's in
| > lexical_cast), right? Maybe authors of lexical_cast should add one
| > significant place to the string? If not - then why this
| won't work? Is
| > it really impossible to avoid losing data in conversions?
|
| No, as already discussed the problem is that many iostreams
| libraries do not
| round trip the binary floating point representation to
| decimal and back
| again.  This is technically possible to do (albeit with quite heroic
| efforts), but apparently std lib vendors don't consider it crucial :-(
|
| That was why I suggested using the C99 style hex format for
| floats in the
| serialisation lib: that format is both portable and readily
| round-trippable
| since there's no binary-to-decimal conversion involved (just
| binary to hex).

Anything that guarantees a round trip MUST be good.

(Getting output and input right would be even better!  There are papers
which present methods for doing it which claim to be proven correct - but
these are not the methods used for popular implementations.)

Is hex fully portable?  It surely just promises to be as close as the FP
representation will allow?  Or should we store the FP representation in the
serialization and only deserialize if it matches exactly?  This sounds like a
prudent move to me.

Paul

--
Paul A Bristow
Prizet Farmhouse, Kendal, Cumbria UK LA8 8AB
Phone and SMS text +44 1539 561830, Mobile and SMS text +44 7714 330204
mailto: [hidden email]  http://www.hetp.u-net.com/index.html
http://www.hetp.u-net.com/Paul%20A%20Bristow%20info.html




Re: [serialization] Serialisation/deserialisation of floating-point values

John Maddock
> Anything that guarantees a round trip MUST be a good.
>
> (Getting output and input right would be even better!  There are
> papers which present methods for doing it which claim to be proven
> correct - but these are not the methods used for popular
> implementations.)
>
> Is a hex fully portable?  It surely just promises to be as close as
> the FP representation will allow?  Or should we store the FP
> representation in the serialization and only deserialize if it
> matches exactly?  This sounds a prudent move to me.

Um, what do you mean by fully portable?  It is in the sense that:

* If you do a write-then-read cycle on the same machine you get back exactly
the same result.
* If you do a write-then-read cycle on different machines you only get the
same result back if the machine reading the value has at least as many bits
in its mantissa as the machine used for writing.  But that goes without
saying really.

I guess I really should put my money where my mouth is and present some
sample code, I'll see what I can do later....

John.


Re: [serialization] Serialisation/deserialisation of floating-point values

pabristow
 

| -----Original Message-----
| From: [hidden email]
| [mailto:[hidden email]] On Behalf Of John Maddock
| Sent: 16 March 2006 13:43
| To: [hidden email]
| Subject: Re:
| [boost][serialization]Serialisation/deserialisationoffloating-
| point values
|
| > Anything that guarantees a round trip MUST be a good.
| >
| > (Getting output and input right would be even better!  There are
| > papers which present methods for doing it which claim to be proven
| > correct - but these are not the methods used for popular
| > implementations.)
| >
| > Is a hex fully portable?  It surely just promises to be as close as
| > the FP representation will allow?  Or should we store the FP
| > representation in the serialization and only deserialize if it
| > matches exactly?  This sounds a prudent move to me.
|
| Um, what do you mean by fully portable?  It is in the sense that:
|
| * If you do a write-then-read cycle on the same machine you
| get back exactly
| the same result.
| * If you do a write-then-read cycle on different machines you
| only get the
| same result back if the machine reading the value has at
| least as many bits
| in it's mantissa as the machine used for writing.  But that
| goes without
| saying really.
|
| I guess I really should put my money where my mouth is and
| present some
| sample code, I'll see what I can do later....
|
| John.

That would be excellent.

My suggestion is to store the FP format somehow and somewhere in the
serialization.

http://babbage.cs.qc.edu/courses/cs341/IEEE-754references.html#tables

lists half a dozen IEEE formats, so a single byte would suffice, but it
might be better to cater for User Defined Types by storing the significand
and exponent bit counts separately.  Some 128-bit types like doubledouble
on Darwin and 265-bit reals are in use, as well as arbitrary precision
like NTL ZZ.
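
One possible shape for such a stored description, purely as a sketch: the struct `fp_format` and the functions `describe`, `same_format` and `can_read_exactly` are hypothetical names, not part of Boost.Serialization, and the fields are simply the corresponding std::numeric_limits members.

```cpp
#include <limits>

// Hypothetical descriptor of the floating-point layout used by the writer.
struct fp_format
{
    int radix;            // 2 for the IEEE binary formats
    int significand_bits; // numeric_limits<T>::digits (includes implicit bit)
    int min_exponent;     // numeric_limits<T>::min_exponent
    int max_exponent;     // numeric_limits<T>::max_exponent
};

// Describe the layout of type T on the current platform.
template <typename T>
constexpr fp_format describe()
{
    return fp_format{ std::numeric_limits<T>::radix,
                      std::numeric_limits<T>::digits,
                      std::numeric_limits<T>::min_exponent,
                      std::numeric_limits<T>::max_exponent };
}

// Strict policy: deserialise only if the formats match exactly.
constexpr bool same_format(const fp_format& a, const fp_format& b)
{
    return a.radix == b.radix
        && a.significand_bits == b.significand_bits
        && a.min_exponent == b.min_exponent
        && a.max_exponent == b.max_exponent;
}

// Relaxed policy: the reader needs at least as many significand bits and
// at least as wide an exponent range as the writer.
constexpr bool can_read_exactly(const fp_format& stored, const fp_format& local)
{
    return stored.radix == local.radix
        && stored.significand_bits <= local.significand_bits
        && stored.min_exponent >= local.min_exponent
        && stored.max_exponent <= local.max_exponent;
}
```

A reader could compare the stored descriptor against `describe<double>()` before deserialising and refuse, or warn, when the chosen policy fails. A real archive format would also have to decide how to encode the descriptor itself.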

Some users might also want to use 'exact reals', for example

http://keithbriggs.info/xrc.html  (note that C and C++ implementations exist)

so it might be useful to cater for this as well.

Paul

--
Paul A Bristow
Prizet Farmhouse, Kendal, Cumbria UK LA8 8AB
Phone and SMS text +44 1539 561830, Mobile and SMS text +44 7714 330204
mailto: [hidden email]  http://www.hetp.u-net.com/index.html
http://www.hetp.u-net.com/Paul%20A%20Bristow%20info.html



Re: [serialization] Serialisation/deserialisation of floating-point values

Robert Ramey
In reply to this post by John Maddock
Hmm - I don't remember that suggestion.  How does using hex
address the placement of the decimal point?  Assuming it
managed the issues of lost precision, I would think that those
using a text type serialization format would expect an
intuitively readable floating point representation.

Robert Ramey



John Maddock wrote:

> That was why I suggested using the C99 style hex format for floats in
> the serialisation lib: that format is both portable and readily
> round-trippable since there's no binary-to-decimal conversion
> involved (just binary to hex).
>
> John.
>




Re: [serialization] Serialisation/deserialisation of floating-point values

pabristow
In reply to this post by Peter Broadwell-2
Ooops - this is a typo.

It should of course be 3010/10000.

(All this is because floating-point calculations, in particular with
log10(2) = 0.3010..., can't be done at compile time.  That is a shame,
because they could be, and the idea has received some consideration for
C++0x.)
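
To show what the integer arithmetic yields for each multiplier, here is a small sketch. The function name `decimal_digits` is made up for this example, and `constexpr` (C++11) is used only so the checks can run at compile time.

```cpp
// Integer-only approximation of 2 + binary_digits * log10(2).
// `numerator` / 10000 stands in for log10(2) = 0.30103...
constexpr int decimal_digits(int binary_digits, int numerator = 3010)
{
    return 2 + binary_digits * numerator / 10000;
}

// Intended multiplier 3010: significand widths of 24, 53, 64 and 113 bits.
static_assert(decimal_digits(24) == 9, "32-bit float");
static_assert(decimal_digits(53) == 17, "64-bit double");
static_assert(decimal_digits(64) == 21, "80-bit extended");
static_assert(decimal_digits(113) == 36, "128-bit quad");

// Mistyped multiplier 3030: only the 53-bit case changes.
static_assert(decimal_digits(24, 3030) == 9, "unchanged");
static_assert(decimal_digits(53, 3030) == 18, "one digit more than needed");
static_assert(decimal_digits(64, 3030) == 21, "unchanged");
```

So for double the 3030 version asks for 18 digits instead of 17, one more than necessary rather than one too few. On its own that seems unlikely to explain a failed round trip.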

Sorry.

Paul

| -----Original Message-----
| From: [hidden email]
| [mailto:[hidden email]] On Behalf Of Peter Broadwell
| Sent: 15 March 2006 20:43
| To: [hidden email]
| Subject: Re: [boost] [serialization]
| Serialisation/deserialisation offloating-point values
|
| Note the difference between the "definition" formulae of 3010/10000
| and the suggested formulae using 3030/10000.
|
| Perhaps this is on purpose, if not may explain why the tests done
| later in this thread  wich use the 3030/10000 version had troubles?
|
| ;;peter
|
|
| Paul A Bristow wrote:
|  > [...]
|  > For C++, using numeric limits,
|  >
|  > So it is convenient instead to use the following formula
| which can be
|  > calculated at compile time:
|  > 2 + std::numeric_limits<double>::digits * 3010/10000;
|  >
|  > [...]
|  > and I suggest that this should be:
|  >
|  > os << std::setprecision(2 + std::numeric_limits<double>::digits *
|  > 3030/10000);
|  >
|  > HTH
|  >
|  > Paul
|  >
|  >
|  >


Re: [serialization] Serialisation/deserialisation of floating-point values

Caleb Epstein-3
On 3/17/06, Paul A Bristow <[hidden email]> wrote:
>
> Ooops - this is a typo.
>
> It should of course be 3010/10000.


Which is one of the reasons magic numbers in code are best avoided.  It sure
would be nice if one could just use numeric_limits<T>::digits10 + 2 instead
of 2 + numeric_limits<T>::digits * 3010 / 10000, but the former gives a
different result for float (8 instead of 9).

Perhaps this cryptic calculation might be best addressed by a
boost::numeric_limits<T> which could extend std::numeric_limits<T> and
include Paul Bristow's proposed max_digits10 (see
http://www2.open-std.org/JTC1/SC22/WG21/docs/papers/2005/n1822.pdf)?
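
The difference between the two expressions can be checked at compile time. This is a sketch assuming IEEE float and double; the helper names `digits10_plus_2` and `formula_digits` are hypothetical. The `max_digits10` member has since become part of std::numeric_limits in C++11, so it is used here as the reference value.

```cpp
#include <limits>

// The tempting shortcut: digits10 + 2.
template <typename T>
constexpr int digits10_plus_2()
{
    return std::numeric_limits<T>::digits10 + 2;
}

// The formula from this thread: 2 + digits * 3010 / 10000.
template <typename T>
constexpr int formula_digits()
{
    return 2 + std::numeric_limits<T>::digits * 3010 / 10000;
}

// For float the shortcut is one digit short of what a round trip needs.
static_assert(digits10_plus_2<float>() == 8, "shortcut gives 8 for float");
static_assert(formula_digits<float>() == 9, "formula gives 9 for float");
static_assert(formula_digits<float>() == std::numeric_limits<float>::max_digits10,
              "formula agrees with max_digits10 for float");

// For double both happen to give 17.
static_assert(digits10_plus_2<double>() == 17, "shortcut gives 17 for double");
static_assert(formula_digits<double>() == std::numeric_limits<double>::max_digits10,
              "formula agrees with max_digits10 for double");
```

Where C++11 is available, using `std::numeric_limits<T>::max_digits10` directly avoids the magic number altogether.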

--
Caleb Epstein
caleb dot epstein at gmail dot com

Re: [serialization] Serialisation/deserialisation of floating-point values

pabristow
 
| -----Original Message-----
| From: [hidden email]
| [mailto:[hidden email]] On Behalf Of Caleb Epstein
| Sent: 17 March 2006 15:13
| To: [hidden email]
| Subject: Re: [boost] [serialization]
| Serialisation/deserialisationoffloating-point values
|
| On 3/17/06, Paul A Bristow <[hidden email]> wrote:
| >
| > Ooops - this is a typo.
| >
| > It should of course be 3010/10000.
|
| Which is one of the reasons magic numbers in code are best avoided.

Touche!  Case proven!

| It sure would be nice if one could just use
| numeric_limits<T>::digits10 + 2 instead
| of numeric_limits<T>::digits * 3010 / 10000, but the former gives a
| different result for float (8 instead of 9).
|
| Perhaps this cryptic calculation might be best addressed by a
| boost::numeric_limits<T> which could extend std::numeric_limits<T> and
| include Paul Bristow's proposed max_digits10 (see
| http://www2.open-std.org/JTC1/SC22/WG21/docs/papers/2005/n1822.pdf)?

Well this would be faster than the glacial speed of simple no-brain changes
like this to Standards.

And while we are at it, the macros I proposed to WG14 for C could also be
added, in case these are more convenient for (C++ AND C) users.

http://www2.open-std.org/JTC1/SC22/WG14/www/docs/n1151.pdf
Date: 2005-11-30, version 1

For example:
#define FLT_MAXDIG10 (2+(FLT_MANT_DIG * 3010)/10000)
#define DBL_MAXDIG10 (2+ (DBL_MANT_DIG * 3010)/10000)
#define LDBL_MAXDIG10 (2+ (LDBL_MANT_DIG * 3010)/10000)

which yield the following values on typical implementations:

FLT_DIG 6, FLT_MAXDIG10 9
DBL_DIG 15, DBL_MAXDIG10 17
LDBL_DIG 18, LDBL_MAXDIG10 21

Should it go into boost/detail/limits.hpp?

I don't feel qualified to do this, but it is long overdue to sort out this
trivial problem.

(and get on with the much more important problem of failure of
round-tripping of all uses of stringstream, crucially lexical_cast and
serialization).

Paul

--
Paul A Bristow
Prizet Farmhouse, Kendal, Cumbria UK LA8 8AB
Phone and SMS text +44 1539 561830, Mobile and SMS text +44 7714 330204
mailto: [hidden email]  http://www.hetp.u-net.com/index.html
http://www.hetp.u-net.com/Paul%20A%20Bristow%20info.html





Re: [serialization] Serialisation/deserialisation of floating-point values

pabristow
In reply to this post by Paul Giaccone
If you are using >> to convert decimal digit strings to floating-point and
expect to get **exactly** the right result, read on.

There was some discussion in this thread some weeks ago and agreement that
there was a problem with serialization of floating point (and with lexical
cast).

Although the change is only 1 bit, if you repeatedly read back and
re-serialized floating-points, the values would drift 1 bit each time.  

I've now found a (some - quite a few) moments to look into this.

The basic problem is failure to 'round-trip/loopback'

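        float f = 0.1f; // any value: should work for ALL values, both float and double.
        // 9 decimal digits for 32-bit float, 17 for 64-bit double.
        const int max_digits10 = 2 + std::numeric_limits<float>::digits * 3010 / 10000;
        std::stringstream s; // or files.
        s.precision(max_digits10);
        s.str(""); // empty the buffer (s.str().erase() only erases a copy); see note below on why.
        s.clear(); // reset any eof/fail state before the stream is reused.
        s << f; // Output to string.
        float rf;
        s >> rf; // Read back into float.
        assert(f == rf); // Check we get back **exactly** the same.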
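
A sketch of how such a check can be run over a stretch of successive float values while reusing a single stringstream. The function name `count_float_loopback_failures` is hypothetical, and `std::nextafter` (C++11) stands in for a hand-written nextafterf.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <sstream>
#include <string>

// Count how many of `count` successive float values, starting at `first`,
// fail to survive a write/read cycle through one reused stringstream.
unsigned long count_float_loopback_failures(float first, unsigned long count)
{
    std::stringstream s;
    s.precision(std::numeric_limits<float>::max_digits10); // 9 for IEEE float
    unsigned long failures = 0;
    float f = first;
    for (unsigned long i = 0; i < count; ++i)
    {
        s.str(std::string()); // discard the text of the previous value
        s.clear();            // reset the eofbit left by the previous read
        s << f;
        assert(s.good());     // the write itself should never fail
        float rf = 0.0f;
        s >> rf;
        if (!(rf == f))
            ++failures;
        // Step to the next representable float above f.
        f = std::nextafter(f, std::numeric_limits<float>::max());
    }
    return failures;
}
```

For example, `count_float_loopback_failures(0.0001f, 100000)` walks 100000 adjacent values. On a library with correctly rounded conversions the count should be zero; a non-zero count points at the kind of input problem described here.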

With MSVC, the problem is with s >> rf;  For some values, the input is a
single least significant bit wrong (greater).

The ***Good News*** is that, unlike what I found for VS 7.1, where 1/3 of
float values are read in 1 bit wrong,

VS 8.0 works correctly in release mode for ALL 32-bit float values.

(Digression - because of the memory leak in stringstream in VS 8.0 (it is
disgraceful that we haven't had an SP1 for this), the naïve test runs out of
real and virtual memory after half an hour if you try a brute-force loop
re-creating the stringstream for each value.  So it is necessary (and quicker)
to create the stringstream just once and erase its contents before each
test.)
I used my own nextafterf to test all 2130706431 float values and it took
70:53 (must get my new dual-core machine going ;-).

The ***Bad News*** is that, as shown by John Maddock, for double there is a
bizarre small range of values where every third significand value is
read in one bit wrong.  Murphy's law applies - it is a fairly popular area.

Of course, testing all the double values would take longer than some of us
are likely to be above ground to be interested in the result ;-)

So I created vaguely random double values using 5 15-bit rand() calls to
fill all the bits, and then excluding NaN and infs.
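
A rough sketch of this kind of random test, not the actual test program. The function names are hypothetical, `std::rand() & 0x7fff` is assumed to give 15 usable bits per call, and, as an extra restriction that is not part of the description above, zero and subnormal values are skipped along with NaN and infinity.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <limits>
#include <sstream>
#include <string>

// Build a 64-bit pattern from five 15-bit chunks of rand() (75 bits, of
// which the low 64 survive the shifts) and reinterpret it as a double.
double random_bits_double()
{
    static_assert(sizeof(double) == sizeof(std::uint64_t), "assumes 64-bit double");
    std::uint64_t bits = 0;
    for (int i = 0; i < 5; ++i)
        bits = (bits << 15) | static_cast<std::uint64_t>(std::rand() & 0x7fff);
    double value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}

// Draw patterns until one is a normal number.
double random_normal_double()
{
    double value;
    do
        value = random_bits_double();
    while (!std::isnormal(value));
    return value;
}

// Write and read back `count` random doubles; return how many came back changed.
unsigned long count_double_loopback_failures(unsigned long count)
{
    std::stringstream s;
    s.precision(std::numeric_limits<double>::max_digits10); // 17 for IEEE double
    unsigned long failures = 0;
    for (unsigned long i = 0; i < count; ++i)
    {
        const double d = random_normal_double();
        s.str(std::string()); // reuse one stream: clear text and state
        s.clear();
        s << d;
        assert(s.good());
        double rd = 0.0;
        s >> rd;
        if (!(rd == d))
            ++failures;
    }
    return failures;
}
```

Because the exponent bits are random too, most samples land far from any one narrow range of values, so a low overall failure fraction can still hide a dense cluster of failures.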

(Unlike the more expertly random John Maddock, I decided it was best to keep
it simple to submit as a bug report to MS rather than any of the Boost fancy
randoms - which in any case seem to have bits which never get twiddled - not
my idea of random - but then I am not a statistician or mathematician.)

For example:

Written  : 0.00019879711946838022 == 3f2a0e8640d90401
Readback : 0.00019879711946838024 == 3f2a0e8640d90402  << note 1 bit greater.
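
Listings like the one above can be produced by copying a double's bytes into a 64-bit integer. The following is a sketch with hypothetical helper names (`bits_of`, `hex_bits`, `ulp_distance`), assuming a 64-bit IEEE double and finite inputs.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <string>

// Return the raw bit pattern of a double.
std::uint64_t bits_of(double value)
{
    static_assert(sizeof(double) == sizeof(std::uint64_t), "assumes 64-bit double");
    std::uint64_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
}

// Format the pattern as 16 hexadecimal digits, most significant first.
std::string hex_bits(double value)
{
    char buffer[17];
    std::snprintf(buffer, sizeof buffer, "%016llx",
                  static_cast<unsigned long long>(bits_of(value)));
    return std::string(buffer);
}

// Number of representable values separating two finite doubles of the
// same sign; 1 means they are adjacent ("one bit" apart).
std::uint64_t ulp_distance(double a, double b)
{
    assert(std::signbit(a) == std::signbit(b));
    const std::uint64_t x = bits_of(a);
    const std::uint64_t y = bits_of(b);
    return x > y ? x - y : y - x;
}
```

For the pair shown here, the patterns 3f2a0e8640d90401 and 3f2a0e8640d90402 differ by exactly one, which is what `ulp_distance` would report for the written and read-back values.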

This shows that

failed 77 out of 100000 double values, fraction 0.0007.

The range of 'wrong' reads is roughly shown by

wrong min 0.00013372562138477771 == 3f2187165749cbef
wrong max 0.0038160481887855135 == 3f6f42d545772497

I suspect the 'bad' range is more like 0.0001 to 0.005 from some runs.

All have an exponent in the range 3f2 to 3f6.

And if you use nextafter to test successive double values in this range, each
3rd value is read in 'wrong'.

I think we really can claim this is 'a bug not a feature' (MS response to my
complaint about 7.1 floats) and I will submit this soon.  With the info
above, it should be possible to find the obscure mistake.

I suspect this problem exists in many previous MS versions. I doubt even
Dinkumware would apply an extensive random double value test like this - it
takes some time to run.

If anyone wants to test other compilers, please mail me and I will dump my
crude test in the vault.

Paul

--
Paul A Bristow
Prizet Farmhouse, Kendal, Cumbria UK LA8 8AB
Phone and SMS text +44 1539 561830, Mobile and SMS text +44 7714 330204
mailto: [hidden email]  http://www.hetp.u-net.com/index.html
http://www.hetp.u-net.com/Paul%20A%20Bristow%20info.html


| -----Original Message-----
| From: [hidden email]
| [mailto:[hidden email]] On Behalf Of Paul Giaccone
| Sent: 14 March 2006 17:39
| To: [hidden email]
| Subject: [boost] [serialization]
| Serialisation/deserialisation offloating-point values
|
| I'm having problems with deserialising floating-point (double) values
| that are written to an XML file.  I'm reading the values back in and
| comparing them to what I saved to ensure that my file has
| been written
| correctly.  However, some of the values differ in about the
| seventeenth
| significant figure (or thereabouts).
|
| I thought Boost serialization used some numerical limit to make sure
| that values are serialised exactly to full precision, so what is
| happening here?
|
| Example:
| Value in original object, written to file: 0.0019075645054089487
| Value actually stored in file (by examination of XML file):
| 0.0019075645054089487 [identical to value written to file]
| Value after deserialisation: 0.0019075645054089489
|
| It looks like there is a difference in the least-significant bit, as
| examining the memory for these two values gives:
|
| Original value: b4 83 9b ca e7 40 5f 3f
| Deserialised value: b5 83 9b ca e7 40 5f 3f
|
| (where the least-significant byte is on the left)
|
| Note the difference in the first bytes.
|
| I'm using Boost 1.33.1 with Visual Studio 7.1.3088 in debug mode.
|
| Paul




Re: [serialization] Serialisation/deserialisation of floating-point values

Edward Diener
Paul A Bristow wrote:
>
> (Digression - because of the memory leak in stringstream in VS 8.0 (it is
> disgraceful that we haven't had an SP1 for this),

Another digression.

VS2002 had a single SP1 issued in 2005, long after most people switched
to the essentially free ( shipping cost from MS of DVDs ) VS2003
upgrade. VS2003 has never had an SP issued. VC 8.0 is in VS2005 and, at
the rate which MS has released SPs for the previous releases... <g>. MS
did issue workarounds for problems in VS200(2,3) if one reported bugs to
them verbally. Whether they still do I do not know. They have an online
bug reporting system at
http://lab.msdn.microsoft.com/productfeedback/default.aspx. So far they
appear to be very responsive to any bugs reported.


Re: [serialization] Serialisation/deserialisation of floating-point values

pabristow
| -----Original Message-----
| From: [hidden email]
| [mailto:[hidden email]] On Behalf Of Edward Diener
| Sent: 04 April 2006 17:36
| To: [hidden email]
| Subject: Re: [boost] [serialization]
| Serialisation/deserialisation offloating-point values
|
| Paul A Bristow wrote:
| >
| > (Digression - because of the memory leak in stringstream in
| VS 8.0 (it is
| > disgraceful that we haven't had an SP1 for this),
|
| Another digression.
|
| They have an online bug reporting system at
| http://lab.msdn.microsoft.com/productfeedback/default.aspx.
| So far they appear to be very responsive to any bugs reported.

A workaround for the problem above has been issued - but it involves
re-compiling to produce a new .dll, a significant hassle.

My view is that this is not good enough - the very least is the issue of a
new .dll.
But an SP1 would be better - there seem to be a number of things they have
fixed.

I have expressed this view to their forum (and I am not alone!).

The response to the original report on 7.1 loopback was

http://lab.msdn.microsoft.com/productfeedback/viewfeedback.aspx?feedbackid=7bf2f26d-171f-41fe-be05-4169a54eef9e

http://tinyurl.com/mpk72

essentially "it's a feature" - but then it was fixed in 8.0!

So don't hold your breath on this - VS 2008??

I suspect that the Fair Dinkumware version may be corrected already - would
anyone like to test?

Paul

--
Paul A Bristow
Prizet Farmhouse, Kendal, Cumbria UK LA8 8AB
Phone and SMS text +44 1539 561830, Mobile and SMS text +44 7714 330204
mailto: [hidden email]  http://www.hetp.u-net.com/index.html
http://www.hetp.u-net.com/Paul%20A%20Bristow%20info.html





Re: [serialization] Serialisation/deserialisation of floating-point values

Jeff Flinn
Paul A Bristow wrote:
...

> A workaround for the problem above has been issued - but it involves
> re-compiling to produce a new .dll, a significant hassle.
>
> My view is that this is not good enough - the very least is issue of
> a new .dll.
> But an SP1 would be better - there seem to be number of things they
> have fixed.
>
> I have expressed this view to their forum (and I am not alone!).
>
> The response to the original report on 7.1 loopback was
>
> http://lab.msdn.microsoft.com/productfeedback/viewfeedback.aspx?feedbackid=7bf2f26d-171f-41fe-be05-4169a54eef9e
>
> http://tinyurl.com/mpk72
>
> essentially "it's a feature" - but then it was fixed in 8.0!

I noticed only a single vote on the importance of this bug (which is now at
2 with my vote :-) ). Perhaps if all of those interested added their voices,
something would get done. Does Herb Sutter have a louder voice in that
regard?

Also the bug (it's still listed as bug I think), only addresses float and
not double. Did you ever enter one for double?

I too would like to see a vc7.1 SP. I just finally moved the last of our
products from vc6.5, and don't have the desire to move to vc8 just yet. :)

Jeff




Re: [serialization] Serialisation/deserialisation of floating-point values

Edward Diener
In reply to this post by pabristow
Paul A Bristow wrote:

> | -----Original Message-----
> | From: [hidden email]
> | [mailto:[hidden email]] On Behalf Of Edward Diener
> | Sent: 04 April 2006 17:36
> | To: [hidden email]
> | Subject: Re: [boost] [serialization]
> | Serialisation/deserialisation offloating-point values
> |
> | Paul A Bristow wrote:
> | >
> | > (Digression - because of the memory leak in stringstream in
> | VS 8.0 (it is
> | > disgraceful that we haven't had an SP1 for this),
> |
> | Another digression.
> |
> | They have an online bug reporting system at
> | http://lab.msdn.microsoft.com/productfeedback/default.aspx.
> | So far they appear to be very responsive to any bugs reported.
>
> A workaround for the problem above has been issued - but it involves
> re-compiling to produce a new .dll, a significant hassle.
>
> My view is that this is not good enough - the very least is issue of a new
> .dll.
> But an SP1 would be better - there seem to be number of things they have
> fixed.
>
> I have expressed this view to their forum (and I am not alone!).

MS has been in no-SP mode since VS .NET has come out.

>
> The response to the original report on 7.1 loopback was
>
> http://lab.msdn.microsoft.com/productfeedback/viewfeedback.aspx?feedbackid=7bf2f26d-171f-41fe-be05-4169a54eef9e
>
> http://tinyurl.com/mpk72
>
> essentially "it's a feature" - but then it was fixed in 8.0!

It is demoralizing, but you have to argue with MS once they give one of
their inane responses justifying bugs as "as designed".

>
> So don't hold your breath on this - VS 2008??

Possibly, but of course there is no guarantee it will be fixed even in a
future new release. But I think once you force MS to admit that
something is truly a bug, they will fix it. This is much better than
Borland who, even when they knew a bug existed, simply decided to ignore
both the bug report and any future fix.

>
> I suspect that the Fair Dinkumware version may be corrected already - would
> anyone like to test?

What is Fair Dinkumware ?
