[fixed_string][review] Review of fixed_string

classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

[fixed_string][review] Review of fixed_string

Boost - Dev mailing list
This is my review of FixedString.

* Summary

The library should be ACCEPTED. It's useful, for more than one domain, and
specifically where compile-time string manipulation is concerned, everyone
needs to reinvent it.

The name fixed_string is fine as is. Alternatives exist, but I don't see any
of them as unquestionably better, and one becomes accustomed to
"fixed_string" after a very brief period.

Throwing on exceeding capacity is fine as is.

A function should be provided to check the remaining capacity, because
writing this check by hand is very easy to get wrong, and in fact the
library throughout its implementation consistently gets it wrong.

* Documentation

The documentation is adequate, but I don't find the reference convenient. I
would prefer a synopsis that contains the entire class definition of
`fixed_string`, so that I can see the various overloads at a glance.
Instead, the synopsis only gives me the declaration of `fixed_string`,
without even telling me what namespace it's in. (It also gives me
`basic_string_view`, also without mentioning the namespace.)

The current layout is fine if I want to look up a specific function by name.

The format should ideally follow the standard library, with Throws and
Expects clauses (the latter instead of "undefined behavior unless".)

It's not documented when constructors and assignment operators throw.

* Design

- static_capacity should be removed; capacity() and max_size() should be
static constexpr.

- all nonthrowing functions should be marked `noexcept`.

- initializer_list<CharT> overloads are of dubious utility. I know that the
design copies std::string here, but I think that we should deviate and not
provide them.

- substr() should return a fixed_string, subview() a string_view (as already
implemented.)

- operator+ should be provided, only for arguments with capacity known at
compile time (fixed_string<N>, CharT and CharT const (&)[ N ]), with the
capacity of the result being the sum of the capacities of the arguments.
This is not very useful for long strings, but it's fine for short strings,
useful for compile-time manipulation, and useful in simple cases such as
to_fixed_string( n ) + " bytes written".

- a function that checks whether the remaining capacity is at least n should
be added. I suggest the name

    bool can_fit( std::size_t n ) const noexcept;

although my previous choice of "has_capacity_for" has merit too.

This is, in my opinion, an absolutely necessary addition, because of my
already-stated observation that the check is almost always gotten wrong.
Here for instance is an example:

    if(size() + count > max_size())
        BOOST_FIXED_STRING_THROW(std::length_error{
            "size() + count > max_size()"});

The correct way to write it is max_size() - size() > count, which avoids the
integer overflow in size() + count. In this case, passing count = (size_t)-1
will pass the check (when size() > 0) and then lead to undefined behavior.

This makes the utility of reserve() dubious, because it basically invites
one to get it wrong, but I think we should retain it for consistency.

- everything should ideally be constexpr. The problem here is that constexpr
requires initializing all elements, and this will heavily penalize runtime
uses of f.ex. fixed_string<512>. On compilers with
__builtin_is_constant_evaluated (gcc 9, clang 9) we should use that;
otherwise, it might be worth it to create a specialization for N < some
suitable upper limit.

Again, any nontrivial string manipulation at compile time basically requires
a fixed_string.

* Implementation

The physical separation of the library into three headers is inconvenient. A
single header would make it possible to include the library directly on
Godbolt via https, which is useful for both demonstrations and analysis of
the generated code. In addition, hunting down the definitions is irritating
for people reading the code.

The reverse is true for the tests; they should be split into files each
testing a logical group, such as constructors, assignment, append, and so
on.

The helper functions "testAS", "testI" and so on make reading the tests
unnecessarily cryptic (and is quite annoying for reviewers.) They should
have been named "test_assign", "test_insert".

It's generally better to use `BOOST_TEST_EQ(s.size(), 0);` instead of
`BOOST_TEST(s.size() == 0);` because the former prints the values on
failure, which often gives clues as to where the error is.

The test functions shouldn't be in the boost::fixed_string namespace. Users
don't write their code in this namespace, so the tests don't test what users
use.

All capacity checks should be either audited for correctness and fixed, or
use the proposed can_fit addition.

Instead of repeating

        BOOST_FIXED_STRING_THROW(std::length_error{
            "n > max_size()"});

in every function, which leads to it being inlined there repeatedly, this
could use a helper function

    void throw_length_error( char const * msg, boost::source_location const&
loc )
    {
        boost::throw_exception( std::length_error( msg ), loc );
    }

taking advantage of the new boost::throw_exception overload that takes a
source location as documented in
https://www.boost.org/doc/libs/develop/libs/throw_exception/doc/html/throw_exception.html.

A variation could take `n` as an argument and compose a message containing
it, f.ex.

    void throw_length_error_fmt( boost::source_location const& loc, char
const * fmt, ... );

#define BOOST_FIXED_STRING_USE_BOOST should go, as already mentioned.

Since the library requires C++11, there's no need to use
BOOST_STATIC_ASSERT. static_assert is fine. Unless of course we backport it
to C++03, which might be doable, although of questionable utility.

Checking for overlap and using Traits::copy instead of Traits::move isn't
needed. Traits::move does the same check anyway, so there's no gain, just an
opportunity to get the check wrong and introduce undefined behavior.

"The behavior is undefined if `count >= npos`" doesn't seem necessary. In
this case, count will be bigger than max_size() and the constructor will
throw std::legth_error. There's no reason to gratuitously introduce
undefined behavior.

All "undefined behavior" cases should consistently BOOST_ASSERT.


_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: [fixed_string][review] Review of fixed_string

Boost - Dev mailing list
On Wed, Dec 4, 2019 at 8:01 AM Peter Dimov via Boost
<[hidden email]> wrote:
> The documentation is adequate, but I don't find the reference convenient.

I'm ok with it if Krystian wants to deploy the docs using asciidoc,
which is probably a better choice for a single-type library.

> Since the library requires C++11, there's no need to use
> BOOST_STATIC_ASSERT. static_assert is fine.

BOOST_STATIC_ASSERT only requires 1 argument :)

Thanks

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: [fixed_string][review] Review of fixed_string

Boost - Dev mailing list
Vinnie Falco wrote:
> > Since the library requires C++11, there's no need to use
> > BOOST_STATIC_ASSERT. static_assert is fine.
>
> BOOST_STATIC_ASSERT only requires 1 argument :)

The one argument form isn't used though.

https://github.com/18/fixed_string/blob/master/include/boost/fixed_string/config.hpp#L49

#define BOOST_FIXED_STRING_STATIC_ASSERT(cond, msg)
BOOST_STATIC_ASSERT_MSG(cond, msg)


_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: [fixed_string][review] Review of fixed_string

Boost - Dev mailing list
In reply to this post by Boost - Dev mailing list
> - everything should ideally be constexpr. The problem here is that
> constexpr requires initializing all elements, and this will heavily
> penalize runtime uses of f.ex. fixed_string<512>. On compilers with
> __builtin_is_constant_evaluated (gcc 9, clang 9) we should use that;
> otherwise, it might be worth it to create a specialization for N < some
> suitable upper limit.

Wait, I have an idea.

template<std::size_t N> struct fixed_string
{
private:

    std::size_t size_ = 0;

    union
    {
        char first_ = 0;
        char data_[ N+1 ];
    };

public:

    fixed_string() = default;
};

Now default initialization sets just the first char to zero, as we want, and
value initialization initializes all data_ to zero and is constexpr.

auto f1()
{
    fixed_string<24> s;
    return s;
}

auto f2()
{
    constexpr fixed_string<24> s{};
    return s;
}

https://godbolt.org/z/uhf4S4

I don't know if that's defined behavior or not, but if it isn't, it better
be.


_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: [fixed_string][review] Review of fixed_string

Boost - Dev mailing list
On Wed, Dec 4, 2019 at 6:38 PM Peter Dimov via Boost <[hidden email]>
wrote:

> I don't know if that's defined behavior or not, but if it isn't, it better
> be.
>

It is, but only in C++17 and up.

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: [fixed_string][review] Review of fixed_string

Boost - Dev mailing list
In reply to this post by Boost - Dev mailing list
Peter Dimov wrote:

> - a function that checks whether the remaining capacity is at least n should
> be added. I suggest the name
>
>     bool can_fit( std::size_t n ) const noexcept;
>
> although my previous choice of "has_capacity_for" has merit too.
>
> This is, in my opinion, an absolutely necessary addition, because of my
> already-stated observation that the check is almost always gotten wrong.
> Here for instance is an example:
>
>     if(size() + count > max_size())
>         BOOST_FIXED_STRING_THROW(std::length_error{
>             "size() + count > max_size()"});


Make this a free function template and you can use it with other containers:

template <typename CONTAINER>
bool can_fit( const CONTAINER& C, typename CONTAINER::size_type n ) noexcept
{
   return C.capacity() - C.size() >= n;
}





_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: [fixed_string][review] Review of fixed_string

Boost - Dev mailing list


On December 5, 2019 9:35:35 AM EST, Phil Endecott via Boost <[hidden email]> wrote:
> Peter Dimov wrote:
> > - a function that checks whether the remaining capacity is at least
> n should
> > be added. I suggest the name
> >
> >     bool can_fit( std::size_t n ) const noexcept;
> >
> > although my previous choice of "has_capacity_for" has merit too.

I suggest the simpler "fits".

> Make this a free function template and you can use it with other
> containers:
>
> template <typename CONTAINER>
> bool can_fit( const CONTAINER& C, typename CONTAINER::size_type n )
> noexcept
> {
>    return C.capacity() - C.size() >= n;
> }

That's a nice idea, especially with the name "fits". :) However, if you reverse the parameters, "fits_in" would be a better name as it indicates the argument order.
--
Rob

(Sent from my portable computation device.)

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost