[review][fixed_string] FixedString review results

classic Classic list List threaded Threaded
13 messages Options
Reply | Threaded
Open this post in threaded view
|

[review][fixed_string] FixedString review results

Boost - Dev mailing list
These are the results of the candidate Boost.FixedString review. Thanks
to all the people
(and there's been many) who contributed. To clear up the tension, let me
begin by
announcing that:

Candidate Boost.FixedString is ACCEPTED into Boost. Acceptance is
subject to the authors'
addressing of some issues raised during the review that I label as
"priority 1" below.

*Summary of reviews*

We've had 10 full reviews:

   * Accept: Deniz Bahadir, Emil Dotchevski, Gavin Lambert, JeanHeyd
Meneide,
     Paul A. Bristow, Peter Dimov, Peter Koch Larsen.
   * Conditionally accept: Zach Laine.
   * Reject: Andrzej Krzemienski, Phil Endecott.

Zach's conditions to acceptance seem minor and easily addressable to me.
Andrzej based his rejection on a perceived lack of clarity as to what
the actual pitch of
the library is, which ultimately boils down to the issue of
throwing/asserting on
capacity overflow: my impression is that a strong pitch, if not a
universally acclaimed
one, emerged during discussions.
Finally, Phil voted to reject but then went on to state that he would
accept based
on a number of "minimally sufficient" conditions that are trivial to
meet (or even
have been already implemented by Krystian).

All in all, I think it is a reasonable compromise to grant acceptance
based on the
resolution of a number of important issues, listed below.

*Issues*

I've classified issues in three different priorities:

   * Priority 1: to be solved *before* inclusion into Boost.
   * Priority 2: to be addressed somewhere in the future.
   * Priority 3: anecdotal or without clear support from the community,
to be addressed
     at the authors's discretion.

* Priority 1 issues*

1. CMakeLists.txt should either be made generic or removed from repo (it
is currently
a specific thing for Visual Studio).
2. Choose a name for the component. Some consensus arose that naming
should follow
std::string's and be of the form basic_foo<...> with aliases foo, wfoo,
etc. As for "foo", the
following candidates have been proposed:
   * static_string
   * fixed_capacity_string
   * array_string
   * static_capacity_string
   * inplace_string
   * capped_string
   * bounded_capacity_string
   * bounded_string
To be clear, I don't think any of these is clearly more popular than the
rest, so it is up to
the authors to make up their mind.
3. constexpr as much as possible, taking into account that, pre C++20,
this requires
zero initialization of the data array.
4. noexcept as much as possible and, in principle, at least as much as
std::string.
5. Provide hash overloads. Up to the authors if they want to cover
Boost.Hash as well.
6. Determine the exact behavior on capacity overflow. I think *some*
partial consensus
arose around throwing via boost::throw_exception so as to provide an
opt-out mechanism
for exception-handicapped scenarios. This has been the most contentious
issue during the
review but I think the authors are in their right to decide on throwing
much as anyone
can decide otherwise and propose what would be basically a different
library.
7. Make substr return a fixed_string rather than a view, and
additionally provide a
view-returning subview member function.
8. State clearly on the docs what C++ version is required.
9 There are wrong "size() + m > max_size()" checks (they could overflow)
in the code.
10. to_fixed_string should be a set of overloads mathing those of
std::string.
11. There's an issue around some detail::is_input_iterator alias that
Zach asked be
clarified.
12. Compile-fail tests should be added to verify that those functions
with constraints are
correctly constrained.

*Priority 2*

13. Determine if fixed_string<0> should be specialized or not to make it
more memory
efficient. This raises potential issues in the uniqueness of the pointer
returned by data().
14. Determine if operator+(fixed_string,fixed_string) should be
provided. My impression
is that people generally wanted this, and I've been tempted to raise
this to priority 1.
15. Optimize the type of the internal size data member when size_t is
larger than
necessary.
16. BOOST_FIXED_STRING_USE_BOOST is unconditionally defined within the
code. The
authors should clearly state what this is for and let the user configure
the library to
use with/without Boost or, else, remove the thing entirely.
17. Source code organization (three equally named files, detal and impl
folders), raised
some eyebrows. Some (including Vinnie) favor a single-file implementation.
18. Docs should be more C++ standard-like and provide a synopsis. Vinnie
has indicated
they may abandon Doxygen in favor of asciidoc to produce better-quality
docs.
In general, people were not enthusiastic about the documentation.
19. The current monolithic test should be split up in lighter, more
manegeable test
files. Peter Dimov suggested some additional improvements on the testing
part.

* Priority 3*

20. Provide a literal operator""
21. Should traits parameter be removed?
22. Store size (as capacity-size) in the last char of the data array, so
as to save one
memory byte. This might affect performance, though.
23. Add functions for checking remaining capacity.

Thanks to Krystian and Vinnie for their work! And to everybody who has
contributed
to this really interesting review.

Best regards,

Joaquín M López Muñoz
Review manager for candidate Boost.FixedString

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: [review][fixed_string] FixedString review results

Boost - Dev mailing list
On Sun, Dec 15, 2019 at 12:53 PM Joaquin M López Muñoz via Boost
<[hidden email]> wrote:
> Candidate Boost.FixedString is ACCEPTED into Boost

Thank you for your generous investment of time in managing this review.

Although my name is associated with it, I feel that Krystian should be
the principal investigator and therefore, take my comments below as
suggestions not requirements.

> 1. CMakeLists.txt should either be made generic or removed from repo (it
> is currently a specific thing for Visual Studio).

Here I must object, I need a CML that produces a proper Visual Studio
project file because that is my primary workflow. This is going to be
a recurring problem with all of my repositories (Beast, JSON come to
mind and I have several more in the pipeline).

I don't have the knowledge to solve it, but if someone was to submit a
patch that makes the CML work correctly for everyone and continues to
satisfy the needs of my workflow while still being maintainable by me,
I would be more than happy to run with that and port it to my other
projects.

>    * static_string

I regret changing the name hastily in the first place, I should have
left it as static_string because this is consistent with
boost::static_vector.

> 3. constexpr as much as possible, taking into account that, pre C++20
> this requires zero initialization of the data array.
> 4. noexcept as much as possible and, in principle, at least as much as
> std::string.
> 5. Provide hash overloads. Up to the authors if they want to cover
> Boost.Hash as well.

I agree with everything here.

> 6. Determine the exact behavior on capacity overflow.

I only just a few weeks ago saw Boost.Container's solution, committed
recently, which is to make the behavior of overflow a policy. Although
my personal view is that this is an overly complex solution, enough
people want the other behavior that it makes sense to just copy what
Ion did and do the same thing in static_string. This will make
everyone happy and it will be consistent with Boost.Container, thus a
fairly safe and uncontroversial move.

> 7. Make substr return a fixed_string rather than a view, and
> additionally provide a
> view-returning subview member function.
> 8. State clearly on the docs what C++ version is required.
> 9 There are wrong "size() + m > max_size()" checks (they could overflow)
> in the code.
> 10. to_fixed_string should be a set of overloads mathing those of
> std::string.
> 11. There's an issue around some detail::is_input_iterator alias that
> Zach asked be
> clarified.

Yes to all.

> 12. Compile-fail tests should be added to verify that those functions
> with constraints are correctly constrained.

Visual Studio doesn't easily support compile-fail tests, so I would
prefer to see these implemented as SFINAE-based static_asserts
instead, so that each compile-fail test doesn't need to go into a
separate build target.

> *Priority 2*

These are all good areas of research.

> 16. BOOST_FIXED_STRING_USE_BOOST is unconditionally defined within the
> code. The
> authors should clearly state what this is for and let the user configure
> the library to
> use with/without Boost or, else, remove the thing entirely.

The idiom I am using in all of my new libraries, which I think should
be applied identically to static_string is thus:

1. The default configuration is to build a linkable library (i.e. not
header only) and require Boost. For static_string, the linkable part
is not really applicable since the library consists only of templates.

2. Optionally, the user can define BOOST_STATIC_STRING_STANDALONE, and
the files will compile without Boost. They will not include any Boost
headers, but all declarations will still be in the boost namespace.
Users who define BOOST_STATIC_STRING_STANDALONE must still qualify
library identifiers with boost::static_string. The standalone version
of the library may have higher requirements. In this case it would
require C++17 (for string_view).

3. Optionally, the user can define BOOST_STATIC_STRING_HEADER_ONLY,
and the files will compile in header-only mode, where no separate
linker input is needed. This is not applicable to static_string but it
is how I am structuring my other libraries (except for Beast, which I
am leaving alone for now to not break users).

> 17. Source code organization (three equally named files, detal and impl
> folders), raised some eyebrows.

Generally speaking this is the source code organization I use, which
closely resembles how Boost.Asio and standalone Asio are structured,
and this is done so for 1. minimizing the amount of header material
exposed to the documentation toolchain, 2. providing the separation
needed to support the header-only configuration, and 3. Separating
public from private implementation. There's nothing generally wrong
with the equally named files, in larger projects it would be highly
annoying to have to give them all different names, it is much easier
to find the matching files when the names are the same.

For static_string though, which is template-only and rather limited in
how large it can get, putting it in one header file probably outweighs
the benefits of separation. And in fact, advertising "just copy one
file into your project to use this!" is a pretty good selling point
(especially with BOOST_STATIC_STRING_STANDALONE defined).

> Vinnie has indicated they may abandon Doxygen in favor of asciidoc
> to produce better-quality docs.

This is up to Krystian, and I support such a change.

I have no opinions on the rest.

Thanks!!

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: [review][fixed_string] FixedString review results

Boost - Dev mailing list
In reply to this post by Boost - Dev mailing list
On Sun, Dec 15, 2019 at 3:53 PM Joaquin M López Muñoz via Boost <
[hidden email]> wrote:

> Candidate Boost.FixedString is ACCEPTED into Boost.


  Thank you again for managing the review :)  Here is what has been
implemented thus far:

3. constexpr as much as possible, taking into account that, pre C++20,
> this requires
> zero initialization of the data array.
>

Implemented, but tests need to be written


> 4. noexcept as much as possible and, in principle, at least as much as
> std::string.
>

Mostly implemented (some more complex conditional noexcepts need to be done)


> 5. Provide hash overloads. Up to the authors if they want to cover
> Boost.Hash as well.
>

Support for std::hash and Boost.Hash have been implemented


> 6. Determine the exact behavior on capacity overflow.
>

I think we will stick with exceptions, and add a macro that disables
exceptions and instead makes them precondition violations


> 7. Make substr return a fixed_string rather than a view, and
> additionally provide a
> view-returning subview member function.
>

Implemented.


> 9 There are wrong "size() + m > max_size()" checks (they could overflow)
> in the code.
>

This has been addressed


> 13. Determine if fixed_string<0> should be specialized or not to make it
> more memory efficient.


This has been implemented, so feel free to play around with it


> 15. Optimize the type of the internal size data member when size_t is
> larger than
> necessary.
>

Implemented


> 16. BOOST_FIXED_STRING_USE_BOOST is unconditionally defined within the
> code.
>

This has been addressed. In addition to that, it has been renamed to
BOOST_FIXED_STRING_STANDALONE


> 22. Store size (as capacity-size) in the last char of the data array, so
> as to save one memory byte.


This has been implemented, there will be a macro to enable/disable this.

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: [review][fixed_string] FixedString review results

Boost - Dev mailing list
On 16/12/2019 13:45, Krystian Stasiowski wrote:
>> 22. Store size (as capacity-size) in the last char of the data array, so
>> as to save one memory byte.
>
> This has been implemented, there will be a macro to enable/disable this.

Using a macro to alter storage layout is a bad idea unless done very
carefully (eg. the macro selects between implementations in different
namespaces, or with different template parameters).

Otherwise you risk subtle ABI breakage problems if it (accidentally or
otherwise) ends up defined differently in different translation units
that are later linked together.

Anything that alters the ABI layout of the type needs to be reflected in
the type's mangled signature in some way.


Otherwise, just pick something and stick with it; don't make it
configurable.

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: [review][fixed_string] FixedString review results

Boost - Dev mailing list
On Sun, Dec 15, 2019 at 7:25 PM Gavin Lambert via Boost
<[hidden email]> wrote:
> don't make it configurable.

Agreed

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

[fixed_string] Designing the contract

Boost - Dev mailing list
In reply to this post by Boost - Dev mailing list
niedz., 15 gru 2019 o 22:33 Vinnie Falco via Boost <[hidden email]>
napisał(a):

> On Sun, Dec 15, 2019 at 12:53 PM Joaquin M López Muñoz via Boost
> <[hidden email]> wrote:
> > Candidate Boost.FixedString is ACCEPTED into Boost
>
> > 6. Determine the exact behavior on capacity overflow.
>
> I only just a few weeks ago saw Boost.Container's solution, committed
> recently, which is to make the behavior of overflow a policy. Although
> my personal view is that this is an overly complex solution, enough
> people want the other behavior that it makes sense to just copy what
> Ion did and do the same thing in static_string. This will make
> everyone happy and it will be consistent with Boost.Container, thus a
> fairly safe and uncontroversial move.
>

I wanted to indicate that this would in fact still be a controversial move.

I need a library to tell me: is resizing over max_size() a bug or a correct
usage? And the answer you will give me is "it is neither or both: it
depends on what policy is selected". And this makes the library's contract
more vague. If I am in a template:

```
template<size_t MaxSize, typename CharT, typename Policy>
void do_something(static_string<MaxSiz, CharT,  Policy> &str)
{
  // ???
}
```

I do not know what the contract is.

Hiding the decision under a macro doesn't seem an uncontroversial solution
either. If I configure the macro to mean "over-resizing is a bug", and I am
using a third party library that internally uses `static_string`, and I may
not even know about it, and it defines the same macro as "over-resize is
fine", I will get a ODR violation and the likely outcome will be that
either I or the third party library will get a different behavior than
requested.

My recommendation would be to just make a call that over-resizing the
string is fine and calls `boost::throw_exception()` and not allow the users
to customize it beyond what `boost::throw_exception()` already offers. If I
need a library that needs the over-resizing to be a diagnosable bug, i will
use a different one (which may be a thin wrapper over `static_string`.

Regards,
&rzej;

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: [fixed_string] Designing the contract

Boost - Dev mailing list

> Hiding the decision under a macro doesn't seem an uncontroversial solution
> either. If I configure the macro to mean "over-resizing is a bug", and I am
> using a third party library that internally uses `static_string`, and I may
> not even know about it, and it defines the same macro as "over-resize is
> fine", I will get a ODR violation and the likely outcome will be that
> either I or the third party library will get a different behavior than
> requested.

IMO the ODR violation is critical. I don't think (anymore) behavior
should be changed by (such kind of) definitions





_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

smime.p7s (6K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [fixed_string] Designing the contract

Boost - Dev mailing list
In reply to this post by Boost - Dev mailing list
On Mon, Dec 16, 2019 at 12:44 AM Andrzej Krzemienski via Boost
<[hidden email]> wrote:
> I wanted to indicate that this would in fact still be a controversial move.

You bring up very good points, and I didn't really think much about
the Policy approach, because Boost.Container uses it and I simply
assume that most Boost libraries are correct as written. Every problem
that you pointed out is also surely a problem with Boost.Container
then, does static_vector do the wrong thing here?

https://github.com/boostorg/container/blob/dd158987e6bfbb83f82f35d6e34c93577bcb160b/include/boost/container/static_vector.hpp#L38

Thanks

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: [fixed_string] Designing the contract

Boost - Dev mailing list
On Mon, Dec 16, 2019 at 8:50 AM Vinnie Falco via Boost <
[hidden email]> wrote:

> On Mon, Dec 16, 2019 at 12:44 AM Andrzej Krzemienski via Boost
> <[hidden email]> wrote:
> > I wanted to indicate that this would in fact still be a controversial
> move.
>
> You bring up very good points, and I didn't really think much about
> the Policy approach, because Boost.Container uses it and I simply
> assume that most Boost libraries are correct as written. Every problem
> that you pointed out is also surely a problem with Boost.Container
> then, does static_vector do the wrong thing here?
>
>
> https://github.com/boostorg/container/blob/dd158987e6bfbb83f82f35d6e34c93577bcb160b/include/boost/container/static_vector.hpp#L38


I think so, yes.  This is a case of not being generic, but instead being
vague.  As Andrej points out, in many contexts I may not even know what the
contract is.

Zach

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: [fixed_string] Designing the contract

Boost - Dev mailing list
In reply to this post by Boost - Dev mailing list
On 16/12/2019 21:43, Andrzej Krzemienski wrote:

> ```
> template<size_t MaxSize, typename CharT, typename Policy>
> void do_something(static_string<MaxSiz, CharT,  Policy> &str)
> {
>    // ???
> }
> ```
>
> I do not know what the contract is.
>
> Hiding the decision under a macro doesn't seem an uncontroversial solution
> either. If I configure the macro to mean "over-resizing is a bug", and I am
> using a third party library that internally uses `static_string`, and I may
> not even know about it, and it defines the same macro as "over-resize is
> fine", I will get a ODR violation and the likely outcome will be that
> either I or the third party library will get a different behavior than
> requested.

This is why, as I mentioned elsewhere in a related context, macros
should only ever select between different explicitly-named-differently
implementations (such as setting a default for that template policy
parameter, but not replacing the template policy parameter).

Otherwise you get ODR violations and inconsistent ABI.  At least when
explicitly part of the signature, both implementations can co-exist in
the same binary with minimal issues as long as they're kept strictly
separate -- and if someone tries to "cross the streams" then they'll get
linker errors.

Or, as I also said elsewhere, just pick one and stick to your guns.
Personally, I prefer throwing exceptions for that case; it's much safer
than allowing a buffer overflow.

> My recommendation would be to just make a call that over-resizing the
> string is fine and calls `boost::throw_exception()` and not allow the users
> to customize it beyond what `boost::throw_exception()` already offers. If I
> need a library that needs the over-resizing to be a diagnosable bug, i will
> use a different one (which may be a thin wrapper over `static_string`.

+1.  Although throw_exception itself is not entirely immune to the
problems above, it is at least a customisation point that only the
application author is allowed to touch, which is better than a macro.

[Although things get a bit dicier with shared objects and private
visibility.]

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: [fixed_string] Designing the contract

Boost - Dev mailing list

On 16.12.19 23:43, Gavin Lambert via Boost wrote:

> This is why, as I mentioned elsewhere in a related context, macros
> should only ever select between different explicitly-named-differently
> implementations (such as setting a default for that template policy
> parameter, but not replacing the template policy parameter).
>
> Otherwise you get ODR violations and inconsistent ABI.  At least when
> explicitly part of the signature, both implementations can co-exist in
> the same binary with minimal issues as long as they're kept strictly
> separate -- and if someone tries to "cross the streams" then they'll
> get linker errors.
>
> Or, as I also said elsewhere, just pick one and stick to your guns.
> Personally, I prefer throwing exceptions for that case; it's much
> safer than allowing a buffer overflow.
+1 to the whole. Having macros to select default template params avoids
the ODR and mixing issues. HOWEVER: What about you set
BOOST_USE_THROW=1, then include libraryX then Boost and then use
`static_string<char/*, default chosen by macro*/>`? libraryX might
redefine BOOST_USE_THROW=0 and you get the wrong type.
So IMO either template policy (only) or "stick to your guns".
I'm also pro-exception on "overflow". Often enough (especially in hot
loops where bounds are usually clear) the optimizer will remove all those.
At least, just std::abort.
> +1. Although throw_exception itself is not entirely immune to the
> problems above, it is at least a customisation point that only the
> application author is allowed to touch, which is better than a macro.
>
> [Although things get a bit dicier with shared objects and private
> visibility.]
Oh yes... Damn shared objects. And with boost::throw_exception opens the
door for ODR violations again...



_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

smime.p7s (6K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [review][fixed_string] FixedString review results

Boost - Dev mailing list
In reply to this post by Boost - Dev mailing list
On Sun, Dec 15, 2019 at 12:53 PM Joaquin M López Muñoz via Boost
<[hidden email]> wrote:
> 10. to_fixed_string should be a set of overloads
> mathing those of std::string.

What's wrong with one or two function templates constrained on say,
integers and floating point numbers?

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: [review][fixed_string] FixedString review results

Boost - Dev mailing list
On 21/12/2019 12:04, Vinnie Falco wrote:
> On Sun, Dec 15, 2019 at 12:53 PM Joaquin M López Muñoz wrote:
>> 10. to_fixed_string should be a set of overloads
>> mathing those of std::string.
>
> What's wrong with one or two function templates constrained on say,
> integers and floating point numbers?

For one, a set of overloads should compile much faster than SFINAE
templates.

It's also less prone to forgetting that bool is an integer type (and in
this context we probably don't want it to be).

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost