[JSON][review] Design: what is expected of a "JSON library"?

Boost - Dev mailing list
Hi Everyone,
This is not a complete Boost Review yet. In this email I wanted to discuss
the design goals of the library.

The point that I would like to raise is the tension between having
relatively small types (int64_t, uint64_t, double) represent numbers in
json::value on the one hand, and ECMA specification allowing arbitrarily
big/precise numbers in JSON format on the other.

Can I expect of a JSON library that when it converts JSON content into its
internal representation and then back to JSON, I get the same content
(modulo whitespace)? This would be possible if the internal
representation of JSON numbers was an arbitrary-precision decimal type or a
string. But when we need to squeeze any number into 24 bits, we will soon
get to the point when integer number `100000000000000000000001` after the
experiment gets changed to `1E23`. Is this acceptable for a JSON library?
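
The loss can be reproduced without any JSON library. The following is a
minimal sketch, assuming the number ends up in an IEEE-754 `double`. The
helper name `roundtrip_via_double` is made up for illustration, and
`std::strtod` / `std::snprintf` merely stand in for whatever conversion
routines a real library uses, so the exact digits produced may differ from
Boost.JSON's output.

```cpp
#include <cstdio>
#include <cstdlib>
#include <string>

// Hypothetical helper (not part of any library): parse an integer literal
// into a double, then print the double back as a plain decimal integer.
std::string roundtrip_via_double(const std::string& literal)
{
    const double d = std::strtod(literal.c_str(), nullptr);
    char buf[64];
    std::snprintf(buf, sizeof buf, "%.0f", d);
    return buf;
}
```

A small integer such as `42` comes back unchanged, while
`100000000000000000000001` cannot, because no `double` has exactly that
value.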

But maybe it is not a valid goal? Maybe the goal of the JSON format is to
have objects already created in internal representation converted to text
and then back to objects (assuming the recipient program runs in the
same environment as the sender, with no differences in word size or maximum
int representation)? That is, as long as you agree to the constraints of
json::value, whatever you manage to put inside, we guarantee that you get
the same value when you serialize it and then parse it back. Boost.JSON
does this nicely.

In that other view, if I have objects of types
`boost::multiprecision::cpp_int` my only option is to pass them as strings
in JSON protocol. But I can pass any number as string anyway, so what is
the use of numbers in JSON format? Unless it is just practical: you can
choose to use numbers and then internal representations of JSON may be
smaller.

Do you get the concern that I am seeing? I mean I have used JSON libraries
before, and this has never been a practical problem. But Boost Review puts
the bar high for the libraries, so I guess this question should be
answered: what guarantees can a JSON library give us with respect to
accuracy of numbers?

Regards,
&rzej;

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Re: [JSON][review] Design: what is expected of a "JSON library"?

On Tue, Sep 15, 2020 at 8:47 AM Andrzej Krzemienski via Boost
<[hidden email]> wrote:
> ...Boost Review puts
> the bar high for the libraries, so I guess this question should be
> answered: what guarantees can a JSON library give us with respect to
> accuracy of numbers?

This is a great question. My philosophy is that there can be no single
JSON library which satisfies all use-cases, because there are
requirements which oppose each other. Some people for example want to
parse comments and be able to serialize the comments back out. This
would be a significant challenge to implement in the json::value
container without impacting performance. It needs to be said up-front,
that there are use-cases for which Boost.JSON will be ill-suited, and
that's OK. The library targets a specific segment of use-cases and
tries to excel for those cases. In particular, Boost.JSON is designed
to be a competitor to JSON for Modern C++ ("nlohmann's json") and
RapidJSON. Both of these libraries are wildly popular.

Support for extended or arbitrary precision numbers is something that
we can consider. It could be added as a new "kind", with a custom data
type. By necessity this would require dynamic allocation to store the
mantissa and exponent, which is fine. However note that the resulting
serialized JSON from these arbitrary precision numbers is likely to be
rejected by many implementations. In particular, Javascript in the
browser and Node.js in the server would reject such numbers.

As a goal of the library is suitability as a vocabulary type,
homogeneity of interface (same integer and floating point
representation on all platforms) is prioritized over min/maxing (using
the largest representation possible). The cousin to homogeneity is
compatibility - we would like to say that ANY serialized output of the
library will be recognizable by most JSON implementations in the wild.
If we support arbitrary precision numbers, some fraction of outputs
will no longer be recognized. Here we have the aforementioned tension
between features and usability. Increasing one decreases the other.

Regards

Re: [JSON][review] Design: what is expected of a "JSON library"?

On Tue, 15 Sep 2020 at 18:09, Vinnie Falco <[hidden email]> wrote:

> On Tue, Sep 15, 2020 at 8:47 AM Andrzej Krzemienski via Boost
> <[hidden email]> wrote:
> > ...Boost Review puts
> > the bar high for the libraries, so I guess this question should be
> > answered: what guarantees can a JSON library give us with respect to
> > accuracy of numbers?
>
> This is a great question. My philosophy is that there can be no single
> JSON library which satisfies all use-cases, because there are
> requirements which oppose each other. Some people for example want to
> parse comments and be able to serialize the comments back out. This
> would be a significant challenge to implement in the json::value
> container without impacting performance. It needs to be said up-front,
> that there are use-cases for which Boost.JSON will be ill-suited, and
> that's OK. The library targets a specific segment of use-cases and
> tries to excel for those cases. In particular, Boost.JSON is designed
> to be a competitor to JSON for Modern C++ ("nlohmann's json") and
> RapidJSON. Both of these libraries are wildly popular.
>
> Support for extended or arbitrary precision numbers is something that
> we can consider. It could be added as a new "kind", with a custom data
> type. By necessity this would require dynamic allocation to store the
> mantissa and exponent, which is fine. However note that the resulting
> serialized JSON from these arbitrary precision numbers is likely to be
> rejected by many implementations. In particular, Javascript in the
> browser and Node.js in the server would reject such numbers.
>
> As a goal of the library is suitability as a vocabulary type,
> homogeneity of interface (same integer and floating point
> representation on all platforms) is prioritized over min/maxing (using
> the largest representation possible). The cousin to homogeneity is
> compatibility - we would like to say that ANY serialized output of the
> library will be recognizable by most JSON implementations in the wild.
> If we support arbitrary precision numbers, some fraction of outputs
> will no longer be recognized. Here we have the aforementioned tension
> between features and usability. Increasing one decreases the other.
>

So, I can see the design goals and where they come from. For the record, I
am not requesting support for arbitrary-precision numbers. This is
just my way of trying to determine the scope of this library. I would
appreciate it if you said something similar in the docs in some "design
decisions" section.  To me, the sentence "This library provides containers
and algorithms which implement JSON" followed by a reference to Standard
ECMA-262 <https://www.ecma-international.org/ecma-262/10.0/index.html>
somehow implied that you are able to parse just *any* JSON input.

That high-level contract -- as I understand it -- is:
1. Any json::value that you can build can be serialized and then
deserialized, and you are guaranteed that the resulting json::value will be
equal to the original.
2. JSON inputs where number values cannot be represented losslessly in
uint64_t, int64_t and double, may render different values when parsed and
then serialized back, and for extremely big number values can even fail to
parse.
3. Whatever JSON output you can produce with this library, we guarantee it
can be parsed by any common JSON implementation (probably also based on a
uint64_t+int64_t+double implementation).
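
Point 2 can be illustrated with a small classifier. This is only a sketch
under stated assumptions: `number_kind` and `classify_number` are
hypothetical names, the input is assumed to be an already validated JSON
number token, and the logic is not taken from Boost.JSON. Anything that
fits neither integer type falls through to `double`, which is where
precision may be lost.

```cpp
#include <charconv>
#include <cstdint>
#include <string_view>
#include <system_error>

// Hypothetical names, for illustration only.
enum class number_kind { int64, uint64, floating };

// Decide which of the three storage types would hold the literal.
// Assumes `text` is a syntactically valid JSON number token.
number_kind classify_number(std::string_view text)
{
    const char* first = text.data();
    const char* last = first + text.size();

    // Exact if the whole token parses as a signed 64-bit integer.
    std::int64_t i = 0;
    auto r = std::from_chars(first, last, i);
    if (r.ec == std::errc() && r.ptr == last)
        return number_kind::int64;

    // Otherwise exact if it parses as an unsigned 64-bit integer.
    std::uint64_t u = 0;
    r = std::from_chars(first, last, u);
    if (r.ec == std::errc() && r.ptr == last)
        return number_kind::uint64;

    // Fractions, exponents and out-of-range integers end up here.
    return number_kind::floating;
}
```

With this sketch `18446744073709551615` is still exact (it fits uint64_t),
whereas `100000000000000000000001` is classified as floating and would be
rounded.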

Regards,
&rzej;

Re: [JSON][review] Design: what is expected of a "JSON library"?

On Tue, Sep 15, 2020 at 4:00 PM Andrzej Krzemienski <[hidden email]> wrote:
> This is just my way of trying to determine the scope of this library.
> I would appreciate it if you said something similar in the docs in
> some "design decisions" section.
>
> That high-level contract -- as I understand it -- is:
> 1. Any json::value that you can build can be serialized and then deserialized, and you are guaranteed that the resulting json::value will be equal to the original.
> 2. JSON inputs where number values cannot be represented losslessly in uint64_t, int64_t and double, may render different values when parsed and then serialized back, and for extremely big number values can even fail to parse.
> 3. Whatever JSON output you can produce with this library, we guarantee it can be parsed by any common JSON implementation (probably also based on a uint64_t+int64_t+double implementation).

Yes this sounds about right and I agree the docs should make this clear.

Thanks

Re: [JSON][review] Design: what is expected of a "JSON library"?

> Do you get the concern that I am seeing? I mean I have used JSON libraries
> before, and this has never been a practical problem. But Boost Review puts
> the bar high for the libraries, so I guess this question should be
> answered: what guarantees can a JSON library give us with respect to
> accuracy of numbers?
>
> Regards,
> &rzej;

I would have liked to give users the freedom to choose the accuracy
through templates. Something like:

// i want speed
boost::json::parser< int  > p_int;

// i want accuracy
boost::json::parser< boost::multiprecision::cpp_int > p_infinity;

Eduardo


Re: [JSON][review] Design: what is expected of a "JSON library"?

On Wed, Sep 16, 2020 at 7:27 AM Eduardo Quintana via Boost
<[hidden email]> wrote:
> I would have liked to give users the freedom to choose the accuracy
> through templates.

The library avoids templates especially for the use-case you envision.
But that said, there are plans to improve the floating point
conversion algorithms using modern techniques. What is there now is
fast and reasonably accurate already, so in a sense you are already
getting both of the options you want.

Thanks

Re: [JSON][review] Design: what is expected of a "JSON library"?

On Wed, 16 Sep 2020 at 11:27, Eduardo Quintana via Boost
<[hidden email]> wrote:
> I would have liked to give users the freedom to choose the accuracy
> through templates. Something like:
>
> // i want speed
> boost::json::parser< int  > p_int;
>
> // i want accuracy
> boost::json::parser< boost::multiprecision::cpp_int > p_infinity;

By this example, can we assume you only care about this flexibility at
parsing level and not necessarily in the DOM-like object
(json::value)?


--
Vinícius dos Santos Oliveira
https://vinipsmaker.github.io/

Re: [JSON][review] Design: what is expected of a "JSON library"?

> > // i want speed
> > boost::json::parser< int  > p_int;
> >
> > // i want accuracy
> > boost::json::parser< boost::multiprecision::cpp_int > p_infinity;
>
> By this example, can we assume you only care about this flexibility at
> parsing level and not necessarily in the DOM-like object
> (json::value)?
>

That was just for illustration purposes.
I see that there are several objects involved.

My point is: the concept of JSON is independent of the underlying
types used for representing strings and numbers.
That's, in my opinion, a strong suggestion for using templates,
and a solution to the "precision vs speed" conundrum.

Of course a json::parser<T> has to produce a json::value<T>,
and yet json::value<T> can exist independently of a json::parser<T>
or json::serializer<T>.
It is up to the developers to choose how the abstraction is implemented.

Anyway, that was just a comment and a suggestion.
If I were writing a JSON library I would have used templates.
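
A rough sketch of what such a parameterization might look like follows.
None of these names exist in Boost.JSON (which, as noted earlier in the
thread, deliberately avoids templates); `exact_number`, `parse_number` and
`basic_number_value` are hypothetical and only show how the number
representation could become a user choice.

```cpp
#include <charconv>
#include <cstdint>
#include <string>
#include <string_view>

// Hypothetical number type that favours accuracy: it keeps the digits.
struct exact_number {
    std::string digits;
};

// Hypothetical customization point: build a number from the token text.
// Fast path: a token outside the int64_t range leaves `out` at 0.
inline void parse_number(std::string_view text, std::int64_t& out)
{
    out = 0;
    std::from_chars(text.data(), text.data() + text.size(), out);
}

// Accurate path: store the text unchanged, nothing is rounded.
inline void parse_number(std::string_view text, exact_number& out)
{
    out.digits = std::string(text);
}

// Hypothetical value type parameterized on the number representation.
template <class Number>
struct basic_number_value {
    Number number{};

    static basic_number_value parse(std::string_view text)
    {
        basic_number_value v;
        parse_number(text, v.number);
        return v;
    }
};
```

Here `basic_number_value<std::int64_t>` plays the role of the "speed"
variant and `basic_number_value<exact_number>` the "accuracy" variant. The
cost of the design is that every type built on top of the value becomes a
template as well.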


Re: [JSON][review] Design: what is expected of a "JSON library"?

On Fri, 18 Sep 2020 at 09:40, Eduardo Quintana via Boost <
[hidden email]> wrote:

> > > // i want speed
> > > boost::json::parser< int  > p_int;
> > >
> > > // i want accuracy
> > > boost::json::parser< boost::multiprecision::cpp_int > p_infinity;
> >
> > By this example, can we assume you only care about this flexibility at
> > parsing level and not necessarily in the DOM-like object
> > (json::value)?
> >
>
> That was just for illustration purposes.
> I see that there are several objects involved.
>
> My point is: the concept of JSON is independent of the underlying
> types used for representing strings and numbers.
> That's, in my opinion, a strong suggestion for using templates,
> and a solution to the "precision vs speed" conundrum.
>
> Of course a json::parser<T> has to produce a json::value<T>,
> and yet json::value<T> can exist independently of a json::parser<T>
> or json::serializer<T>.
> It is up to the developers to choose how the abstraction is implemented.
>
> Anyway, that was just a comment and a suggestion.
> If I were writing a JSON library I would have used templates.
>

Vinnie might have covered this already but here's my 2c.

JSON is in the main used as a language and platform agnostic data
interchange format. The *de-facto* reality is that every other platform and
language implementation that I can think of limits real numbers to doubles
and integers to the range +-2^63, so it seems to me that going beyond that
will achieve little in terms of general utility.

This is the situation today.

I understand the argument that this library could be an opportunity to push
the boundaries and lead other languages to better practice.

On the other hand, this is not the stated intent of the library - its
intent (in summary) is to meet the needs of the majority of C++ programmers
who wish to interoperate with the world wide web today.

Along with others here I am sure, I have significant experience in dealing
with financial and cryptocurrency data interchange using JSON. The *reality*
is that one learns *not to rely on the various interpretations of number
values at all*.
If you're sending an important value (such as a price) you very quickly
learn to encode it as a JSON string representing the exact value and
precision you want.
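
As a minimal sketch of that practice, assume the price is held as a scaled
integer (a count of minor units plus a number of decimal places). The
helper names `format_fixed` and `price_member` are hypothetical, and the
member name `price` is only an example.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: render units * 10^-scale as an exact decimal string,
// e.g. units = 1234500 and scale = 4 give "123.4500".
std::string format_fixed(std::int64_t units, unsigned scale)
{
    const bool negative = units < 0;
    // Negate in unsigned arithmetic so that INT64_MIN does not overflow.
    const std::uint64_t magnitude = negative
        ? 0 - static_cast<std::uint64_t>(units)
        : static_cast<std::uint64_t>(units);

    std::string digits = std::to_string(magnitude);
    // Left-pad with zeros so at least one digit precedes the decimal point.
    if (digits.size() <= scale)
        digits = std::string(scale + 1 - digits.size(), '0') + digits;
    if (scale > 0)
        digits.insert(digits.size() - scale, ".");
    return negative ? "-" + digits : digits;
}

// Wrap the decimal text in quotes so that it travels as a JSON string
// and no receiver re-interprets it as a binary floating point value.
std::string price_member(std::int64_t units, unsigned scale)
{
    return "{\"price\":\"" + format_fixed(units, scale) + "\"}";
}
```

The receiver then gets the digits and the precision exactly as sent,
whatever number type its JSON implementation uses internally.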

In reality, JSON would be a perfectly useful data format if it did not have
numeric, boolean or null types at all. For this reason, it seems to me that
arguing over the precision of the API representation of these types is not
likely to lead to increased utility in practice.

Although I appreciate that some niche use cases would appreciate the
ability, it is my impression that these will be rare and in any case can be
worked around with strings.

R



--
Richard Hodges
[hidden email]
office: +442032898513
home: +376841522
mobile: +376380212

Re: [JSON][review] Design: what is expected of a "JSON library"?

On Tue, 15 Sep 2020 at 16:46, Andrzej Krzemienski via Boost
<[hidden email]> wrote:

> The point that I would like to raise is the tension between having
> relatively small types (int64_t, uint64_t, double) represent numbers in
> json::value on the one hand, and ECMA specification allowing arbitrarily
> big/precise numbers in JSON format on the other.

As far as I'm concerned any JSON library that doesn't address this is defective.
It's not rocket science: just allow access to the actual string
representation, don't try to convert it to a bunch of random types.

This library doesn't even provide it in its low-level API and forces
undesired conversion to int64/double onto users.
Parsing text in general should only be in charge of segmenting the
text, not synthesizing attributes from those segments.
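
To make "segmenting only" concrete, here is a minimal sketch of a scanner
that finds the extent of a number token and hands back its text without
converting it. `scan_number` and `number_text` are hypothetical names and
not part of any library's API. The grammar is the JSON number grammar:
`-? (0 | [1-9][0-9]*) (. [0-9]+)? ([eE] [+-]? [0-9]+)?`.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical scanner: returns the length of the JSON number token at the
// start of `in`, or 0 if `in` does not start with a well-formed number.
std::size_t scan_number(std::string_view in)
{
    auto digit = [&](std::size_t k) {
        return k < in.size() && in[k] >= '0' && in[k] <= '9';
    };
    auto at = [&](std::size_t k, char c) {
        return k < in.size() && in[k] == c;
    };

    std::size_t i = 0;
    if (at(i, '-')) ++i;

    // Integer part: a single 0, or a non-zero digit followed by digits.
    if (at(i, '0')) {
        ++i;
    } else if (digit(i)) {
        while (digit(i)) ++i;
    } else {
        return 0;
    }

    // Optional fraction: '.' must be followed by at least one digit.
    if (at(i, '.')) {
        if (!digit(i + 1)) return 0;
        ++i;
        while (digit(i)) ++i;
    }

    // Optional exponent: marker, optional sign, at least one digit.
    if (at(i, 'e') || at(i, 'E')) {
        std::size_t j = i + 1;
        if (at(j, '+') || at(j, '-')) ++j;
        if (!digit(j)) return 0;
        while (digit(j)) ++j;
        i = j;
    }
    return i;
}

// The token text itself, unconverted; the caller picks the number type.
std::string_view number_text(std::string_view in)
{
    return in.substr(0, scan_number(in));
}
```

A caller could then pass the returned text to `boost::multiprecision::cpp_int`,
to a decimal type, or to a plain `double` conversion, as it sees fit.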


> In that other view, if I have objects of types
> `boost::multiprecision::cpp_int` my only option is to pass them as strings
> in JSON protocol. But I can pass any number as string anyway, so what is
> the use of numbers in JSON format? Unless it is just practical: you can
> choose to use numbers and then internal representations of JSON may be
> smaller.

I personally do encode my decimal numbers (which have values that are
not representable as double) as numbers in JSON.

Re: [JSON][review] Design: what is expected of a "JSON library"?

On Fri, 18 Sep 2020 at 09:18, Richard Hodges via Boost
<[hidden email]> wrote:

> JSON is in the main used as a language and platform agnostic data
> interchange format. The *de-facto* reality is that every other platform and
> language implementation that I can think of limits real numbers to doubles
> and integers to the range +-2^63, so it seems to me that going beyond that
> will achieve little in terms of general utility.
>
> This is the situation today.

That is just not true.
There are many implementations of JSON under the sun, some quite
low-key and homemade (it's pretty trivial after all), and quite a lot
do it correctly.

Also many people doing it wrong is not an argument.

>
> I understand the argument that this library could be an opportunity to push
> the boundaries and lead other languages to better practice.
>
> On the other hand, this is not the stated intent of the library - its
> intent (in summary) is to meet the needs of the majority of C++ programmers
> who wish to interoperate with the world wide web today.
>
> Along with others here I am sure, I have significant experience in dealing
> with financial and cryptocurrency data interchange using JSON. The *reality*
> is that one learns *not to rely on the various interpretations of number
> values at all*.
> If you're sending an important value (such as a price) you very quickly
> learn to encode it as a JSON string representing the exact value and
> precision you want.

Well I've been working in high-frequency trading for years, and I have
the opposite experience.
Representation of numbers is serious business, be they fixed- or
floating-precision, binary or decimal. If you can't even get that right
and have to use strings, you're in for a lot of problems.

Re: [JSON][review] Design: what is expected of a "JSON library"?

On Fri, 18 Sep 2020 at 11:35, Mathias Gaunard <[hidden email]>
wrote:

> On Fri, 18 Sep 2020 at 09:18, Richard Hodges via Boost
> <[hidden email]> wrote:
>
> > JSON is in the main used as a language and platform agnostic data
> > interchange format. The *de-facto* reality is that every other platform
> > and language implementation that I can think of limits real numbers to
> > doubles and integers to the range +-2^63, so it seems to me that going
> > beyond that will achieve little in terms of general utility.
> >
> > This is the situation today.
>
> That is just not true.
> There are many implementations of JSON under the sun, some quite
> low-key and homemade (it's pretty trivial after all), and quite a lot
> do it correctly.
>
> Also many people doing it wrong is not an argument.
>

I think it's fair to say that Boost.JSON's audience might be, as you put so
succinctly, "many people doing it wrong". In other words, the target users
would be people looking for easy interoperation with internet-facing
services and the like, rather than specialist applications.



>
> >
> > I understand the argument that this library could be an opportunity to
> > push the boundaries and lead other languages to better practice.
> >
> > On the other hand, this is not the stated intent of the library - its
> > intent (in summary) is to meet the needs of the majority of C++
> > programmers who wish to interoperate with the world wide web today.
> >
> > Along with others here I am sure, I have significant experience in
> > dealing with financial and cryptocurrency data interchange using JSON.
> > The *reality* is that one learns *not to rely on the various
> > interpretations of number values at all*.
> > If you're sending an important value (such as a price) you very quickly
> > learn to encode it as a JSON string representing the exact value and
> > precision you want.
>
> Well I've been working in high-frequency trading for years, and I have
> the opposite experience.
> Representation of numbers is serious business, be they fixed- or
> floating-precision, binary or decimal. If you can't even get that right
> and have to use strings, you're in for a lot of problems.
>

I think in my mind, HFT would come under the heading of "specialist
application", whereas say, distributing tick data to the millions of young
cryptocurrency enthusiasts connected by app, browser, python bot, etc (or
consuming said data) would come under the heading of "general use".

In the general case, representation of real numbers is by no means
standardised and cannot be relied upon. In this case, strings are often
chosen to represent numbers because then the actual precision of the price
of say, BTC/USDT is not in doubt.

I completely understand that this is suboptimal for HFT, but then in
fairness, so is the choice of JSON as an encoding format. For example, for
internal communication I might prefer FlatBuffers or even unadulterated
binary if performance were the major consideration.

Nevertheless, I take your point that there are some applications for which
Boost.JSON would not be suitable. I don't think the authors would argue
with that.

What I do think is that for me, Boost is the go-to repository of
functionality that is missing from the standard library. And with that in
mind, Boost.JSON fills a gaping hole in the Boost suite of libraries in
that it gives its consumers a tool that developers using other languages
have taken for granted for years. In that sense, for me, it would be a
major step forward.

--
Richard Hodges
[hidden email]
office: +442032898513
home: +376841522
mobile: +376380212
