boost::asio::buffer bug with string literal

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

boost::asio::buffer bug with string literal

Boost - Users mailing list
When using boost::asio::buffer with a string literal the terminating
null character is included in the buffer size, leading to bugs like
accidentally sending out a null character. We could fix this by adding
something like this to buffer.hpp:

/// Create a new non-modifiable buffer that represents the given
character array.
/**
  * @returns A const_buffer value equivalent to:
  * @code const_buffer(
  *     static_cast<const void*>(data),
  *     (N - 1) * sizeof(char)); @endcode
  */
template <std::size_t N>
inline BOOST_ASIO_CONST_BUFFER buffer(
    const char (&data)[N]) BOOST_ASIO_NOEXCEPT
{
  return BOOST_ASIO_CONST_BUFFER(data, (N - 1) * sizeof(char));
}

But this might cause trouble when using a const char array that is not
null-terminated, since we then basically drop the last character. Would
this be the best approach?
_______________________________________________
Boost-users mailing list
[hidden email]
https://lists.boost.org/mailman/listinfo.cgi/boost-users
Reply | Threaded
Open this post in threaded view
|

Re: boost::asio::buffer bug with string literal

Boost - Users mailing list
Would this be the best approach?

It gives me the shivers. Binary data could easily have a zero in the last byte.

Wouldn't it be more explicit to simply provide an overload for boost::utility::string_view and std::string_view?


On 28 November 2017 at 08:43, Martijn Otto via Boost-users <[hidden email]> wrote:
When using boost::asio::buffer with a string literal the terminating
null character is included in the buffer size, leading to bugs like
accidentally sending out a null character. We could fix this by adding
something like this to buffer.hpp:

/// Create a new non-modifiable buffer that represents the given
character array.
/**
  * @returns A const_buffer value equivalent to:
  * @code const_buffer(
  *     static_cast<const void*>(data),
  *     (N - 1) * sizeof(char)); @endcode
  */
template <std::size_t N>
inline BOOST_ASIO_CONST_BUFFER buffer(
    const char (&data)[N]) BOOST_ASIO_NOEXCEPT
{
  return BOOST_ASIO_CONST_BUFFER(data, (N - 1) * sizeof(char));
}

But this might cause trouble when using a const char array that is not
null-terminated, since we then basically drop the last character. Would
this be the best approach?
_______________________________________________
Boost-users mailing list
[hidden email]
https://lists.boost.org/mailman/listinfo.cgi/boost-users


_______________________________________________
Boost-users mailing list
[hidden email]
https://lists.boost.org/mailman/listinfo.cgi/boost-users
Reply | Threaded
Open this post in threaded view
|

Re: boost::asio::buffer bug with string literal

Boost - Users mailing list
Fair point, which is why I asked. The overloads for std::string_view
are already there in the 1.66 beta, but they are not used for string
literals. I think this is because it would be a conversion, where using
the template with an array of PodType is not - so this is the better
match.

Of course, with the std::string_view overload and using std::literals
one can simply add sv after their string literals to make it work,
since the overload is then an exact match.

Perhaps we could do something with std::char_traits, which has a
constexpr length function (since c++17). A quick test seems to indicate
it handles both null-terminated and non-null-terminated arrays
properly, though how it does so I have no idea. I have tested it across
different compilation units to prevent the compiler from inlining the
whole thing.

What about something like this?

/// Create a new non-modifiable buffer that represents the given
character array.
/**
  * @returns A const_buffer value equivalent to:
  * @code const_buffer(
  *     static_cast<const void*>(data),
  *     std::char_traits<char>::length(data)); @endcode
  */
inline BOOST_ASIO_CONST_BUFFER buffer(
    const char *data) BOOST_ASIO_NOEXCEPT
{
  return BOOST_ASIO_CONST_BUFFER(data,
std::char_traits<char>::length(data));
}

 Richard Hodges via Boost-users wrote:

> > Would this be the best approach?
>
> It gives me the shivers. Binary data could easily have a zero in the
> last byte.
>
> Wouldn't it be more explicit to simply provide an overload for
> boost::utility::string_view and std::string_view?
>
>
> On 28 November 2017 at 08:43, Martijn Otto via Boost-users <boost-use
> [hidden email]> wrote:
> > When using boost::asio::buffer with a string literal the
> > terminating
> > null character is included in the buffer size, leading to bugs like
> > accidentally sending out a null character. We could fix this by
> > adding
> > something like this to buffer.hpp:
> >
> > /// Create a new non-modifiable buffer that represents the given
> > character array.
> > /**
> >   * @returns A const_buffer value equivalent to:
> >   * @code const_buffer(
> >   *     static_cast<const void*>(data),
> >   *     (N - 1) * sizeof(char)); @endcode
> >   */
> > template <std::size_t N>
> > inline BOOST_ASIO_CONST_BUFFER buffer(
> >     const char (&data)[N]) BOOST_ASIO_NOEXCEPT
> > {
> >   return BOOST_ASIO_CONST_BUFFER(data, (N - 1) * sizeof(char));
> > }
> >
> > But this might cause trouble when using a const char array that is
> > not
> > null-terminated, since we then basically drop the last character.
> > Would
> > this be the best approach?
> > _______________________________________________
> > Boost-users mailing list
> > [hidden email]
> > https://lists.boost.org/mailman/listinfo.cgi/boost-users
> _______________________________________________
> Boost-users mailing list
> [hidden email]
> https://lists.boost.org/mailman/listinfo.cgi/boost-users
_______________________________________________
Boost-users mailing list
[hidden email]
https://lists.boost.org/mailman/listinfo.cgi/boost-users
Reply | Threaded
Open this post in threaded view
|

Re: boost::asio::buffer bug with string literal

Boost - Users mailing list
In reply to this post by Boost - Users mailing list
On 29/11/2017 01:57, Richard Hodges wrote:
> It gives me the shivers. Binary data could easily have a zero in the
> last byte.

In theory those should be distinguishable -- char is a distinct type
from unsigned char (even if char is unsigned by default), and
*hopefully* anyone poking binary data at the wire is doing so as an
unsigned char or uint8_t.

In practice, there might be some trouble with people abusing std::string
as if it were std::vector<uint8_t> by pointer casting.

_______________________________________________
Boost-users mailing list
[hidden email]
https://lists.boost.org/mailman/listinfo.cgi/boost-users
Reply | Threaded
Open this post in threaded view
|

Re: boost::asio::buffer bug with string literal

Boost - Users mailing list


On 29 November 2017 at 02:58, Gavin Lambert via Boost-users <[hidden email]> wrote:
On 29/11/2017 01:57, Richard Hodges wrote:
It gives me the shivers. Binary data could easily have a zero in the last byte.

In theory those should be distinguishable -- char is a distinct type from unsigned char (even if char is unsigned by default), and *hopefully* anyone poking binary data at the wire is doing so as an unsigned char or uint8_t.

In reality, in my world (e.g. protocol buffers), std::string is a more useful type for binary data buffers that std::vector<std::uint8_t>. I know it's a different type, but still, it's chars modelling bytes. This is not an uncommon idea.

The fact that you can't construct a std::vector from data without a copy means that std::vector<byte> is less suitable for buffering data than one might imagine.
 

In practice, there might be some trouble with people abusing std::string as if it were std::vector<uint8_t> by pointer casting.


_______________________________________________
Boost-users mailing list
[hidden email]
https://lists.boost.org/mailman/listinfo.cgi/boost-users


_______________________________________________
Boost-users mailing list
[hidden email]
https://lists.boost.org/mailman/listinfo.cgi/boost-users
Reply | Threaded
Open this post in threaded view
|

Re: boost::asio::buffer bug with string literal

Boost - Users mailing list
I don't think std::string should be a problem in this case, if you want
to construct a buffer from an std::string, you wouldn't be using this
overload anyway, since it's a pointer to a char, not an array of
characters.

It would only be a problem for someone who has a proper array of chars
that is not null-terminated. This is not exactly unimaginable, but it's
certainly unlikely and nothing is preventing anyone from using unsigned
char here.

On wo, 2017-11-29 at 07:47 +0100, Richard Hodges via Boost-users wrote:

>
>
> On 29 November 2017 at 02:58, Gavin Lambert via Boost-users <boost-us
> [hidden email]> wrote:
> > On 29/11/2017 01:57, Richard Hodges wrote:
> > > It gives me the shivers. Binary data could easily have a zero in
> > > the last byte.
> > >
> >  
> > In theory those should be distinguishable -- char is a distinct
> > type from unsigned char (even if char is unsigned by default), and
> > *hopefully* anyone poking binary data at the wire is doing so as an
> > unsigned char or uint8_t.
> In reality, in my world (e.g. protocol buffers), std::string is a
> more useful type for binary data buffers that
> std::vector<std::uint8_t>. I know it's a different type, but still,
> it's chars modelling bytes. This is not an uncommon idea.
>
> The fact that you can't construct a std::vector from data without a
> copy means that std::vector<byte> is less suitable for buffering data
> than one might imagine.
>  
> >
> > In practice, there might be some trouble with people abusing
> > std::string as if it were std::vector<uint8_t> by pointer casting.
> >
> >
> > _______________________________________________
> > Boost-users mailing list
> > [hidden email]
> > https://lists.boost.org/mailman/listinfo.cgi/boost-users
> >
> _______________________________________________
> Boost-users mailing list
> [hidden email]
> https://lists.boost.org/mailman/listinfo.cgi/boost-users
_______________________________________________
Boost-users mailing list
[hidden email]
https://lists.boost.org/mailman/listinfo.cgi/boost-users
Reply | Threaded
Open this post in threaded view
|

Re: boost::asio::buffer bug with string literal

Boost - Users mailing list
I take your point. But what I am alluding to is that people have used chars for bytes since 1970.

I suspect that simply assuming that any char[N] is a string will break all kinds of code.

I think it's better to leave things as they are and demand that if people are dealing in zero-terminated strings, they explicitly state intent with a string_view or similar.

I agree that it's slightly unsatisfactory state of affairs, but this is not ASIO's problem to solve. The problem is with the initial decision by Mrs Kernighan and Ritchie to mix the concerns of pointers to bytes and strings in one type.

 


On 29 November 2017 at 12:59, Martijn Otto via Boost-users <[hidden email]> wrote:
I don't think std::string should be a problem in this case, if you want
to construct a buffer from an std::string, you wouldn't be using this
overload anyway, since it's a pointer to a char, not an array of
characters.

It would only be a problem for someone who has a proper array of chars
that is not null-terminated. This is not exactly unimaginable, but it's
certainly unlikely and nothing is preventing anyone from using unsigned
char here.

On wo, 2017-11-29 at 07:47 +0100, Richard Hodges via Boost-users wrote:
>
>
> On 29 November 2017 at 02:58, Gavin Lambert via Boost-users <boost-us
> [hidden email]> wrote:
> > On 29/11/2017 01:57, Richard Hodges wrote:
> > > It gives me the shivers. Binary data could easily have a zero in
> > > the last byte.
> > >
> >  
> > In theory those should be distinguishable -- char is a distinct
> > type from unsigned char (even if char is unsigned by default), and
> > *hopefully* anyone poking binary data at the wire is doing so as an
> > unsigned char or uint8_t.
> In reality, in my world (e.g. protocol buffers), std::string is a
> more useful type for binary data buffers that
> std::vector<std::uint8_t>. I know it's a different type, but still,
> it's chars modelling bytes. This is not an uncommon idea.
>
> The fact that you can't construct a std::vector from data without a
> copy means that std::vector<byte> is less suitable for buffering data
> than one might imagine.
>  
> >
> > In practice, there might be some trouble with people abusing
> > std::string as if it were std::vector<uint8_t> by pointer casting.
> >
> >
> > _______________________________________________
> > Boost-users mailing list
> > [hidden email]
> > https://lists.boost.org/mailman/listinfo.cgi/boost-users
> >
> _______________________________________________
> Boost-users mailing list
> [hidden email]
> https://lists.boost.org/mailman/listinfo.cgi/boost-users
_______________________________________________
Boost-users mailing list
[hidden email]
https://lists.boost.org/mailman/listinfo.cgi/boost-users


_______________________________________________
Boost-users mailing list
[hidden email]
https://lists.boost.org/mailman/listinfo.cgi/boost-users