Quantcast

boost::filesystem::path operator/= is broken for char[1] in 1.48

classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

boost::filesystem::path operator/= is broken for char[1] in 1.48

Nick Zavaritsky
Hi!

Please consider the code below:

=== BUG ===

boost::filesystem::path some_path("first");
char path_component[1]; /* = "second" */
some_path /= path_component;

Some_path value is expected to be "first/second" but it is "first".
I.e operator /= had no effect.

=== END BUG ===

Concerning a longer string in a char[1] buffer.
Actually path_component is the *last* field in a heap-allocated structure.
The structure is over-allocated to enable storing a longer string in a char[1] buffer.
This technique is often used in C code.

Here is operator /= callstack.

#0  boost::filesystem3::path_traits::empty<char const, 1ul> () at path_traits.hpp:89
#1  0x0000000100001e7e in boost::filesystem3::path::append<char [1]> (this=0x7fff5fbff910, source=@0x100100270, cvt=@0x100100280) at path.hpp:655
#2  0x0000000100001f77 in boost::filesystem3::path::operator/=<char [1]> (this=0x7fff5fbff910, source=@0x100100270) at path.hpp:235

The relevant excerpt from boost::filesystem3::path_traits::empty follows:

namespace path_traits {
  ...
  template <typename T, size_t N> inline
     bool empty(T (&)[N])
       { return N <= 1; }
 ...
}

It appears that boost::filesystem::path treats char[1] as an empty path regardless of the actual content.
This is definitely wrong due to so many C libraries storing long strings in char[1] buffers.
In particular on Mac OS X fts(3) FTSENT::fts_name is char[1].

I believe the current operator =/ behavior is outright dangerous and should be fixed.
By the way the only code currently benifiting from the boost::filesystem::path being overly smart is the code appending "" (empty string literal) to pathes.
_______________________________________________
Boost-users mailing list
[hidden email]
http://lists.boost.org/mailman/listinfo.cgi/boost-users
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: boost::filesystem::path operator/= is broken for char[1] in 1.48

Ovanes Markarian
Hello Nick,

please see my answer below.

On Fri, Apr 20, 2012 at 6:08 PM, Nick Zavaritsky <[hidden email]> wrote:
Hi!

Please consider the code below:

=== BUG ===

boost::filesystem::path         some_path("first");
char                                    path_component[1]; /* = "second" */
some_path /= path_component;

Some_path value is expected to be "first/second" but it is "first".
I.e operator /= had no effect.

I think the idea behind this is that any char* related string must be null terminated. So char[1] has only place for '\0'...
 
=== END BUG ===

Concerning a longer string in a char[1] buffer.
Actually path_component is the *last* field in a heap-allocated structure.
The structure is over-allocated to enable storing a longer string in a char[1] buffer.
This technique is often used in C code.
This over-allocation is pretty much the same as reserving some space at the end of the data structure to place data together. Am I right? Why isn't it possible to just write char[128] or char[16] instead of char[1]?  And if you know that this buffer represents some special case, why don't you pass it as such to filesystem? &char[0], than boost::filesystem will not assume it got char[1] and will be looking to '\0' marker...
 

Here is operator /= callstack.

#0  boost::filesystem3::path_traits::empty<char const, 1ul> () at path_traits.hpp:89
#1  0x0000000100001e7e in boost::filesystem3::path::append<char [1]> (this=0x7fff5fbff910, source=@0x100100270, cvt=@0x100100280) at path.hpp:655
#2  0x0000000100001f77 in boost::filesystem3::path::operator/=<char [1]> (this=0x7fff5fbff910, source=@0x100100270) at path.hpp:235

The relevant excerpt from boost::filesystem3::path_traits::empty follows:

namespace path_traits {
 ...
 template <typename T, size_t N> inline
    bool empty(T (&)[N])
      { return N <= 1; }
 ...
}

It appears that boost::filesystem::path treats char[1] as an empty path regardless of the actual content.
This is definitely wrong due to so many C libraries storing long strings in char[1] buffers.
In particular on Mac OS X fts(3) FTSENT::fts_name is char[1].

I believe the current operator =/ behavior is outright dangerous and should be fixed.
By the way the only code currently benifiting from the boost::filesystem::path being overly smart is the code appending "" (empty string literal) to pathes.

Regards,
Ovanes 

_______________________________________________
Boost-users mailing list
[hidden email]
http://lists.boost.org/mailman/listinfo.cgi/boost-users
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: boost::filesystem::path operator/= is broken for char[1] in 1.48

Lars Viklund
On Fri, Apr 20, 2012 at 07:15:53PM +0200, Ovanes Markarian wrote:

> Actually path_component is the *last* field in a heap-allocated structure.
> > The structure is over-allocated to enable storing a longer string in a
> > char[1] buffer.
> > This technique is often used in C code.
> >
> This over-allocation is pretty much the same as reserving some space at the
> end of the data structure to place data together. Am I right? Why isn't it
> possible to just write char[128] or char[16] instead of char[1]?  And if
> you know that this buffer represents some special case, why don't you pass
> it as such to filesystem? &char[0], than boost::filesystem will not assume
> it got char[1] and will be looking to '\0' marker...

The trick in C is to malloc the struct with a larger size. A canonical
example is:

  typedef struct S { size_t bytes; char flex[1]; } S;
  S* s = malloc(sizeof(S) + (amount-1));
  s->bytes = amount;

That is, you have bonus bytes after the struct which you can access
through the last struct member. Variants on this is to use a
zero-length-array in the language dialects that support it.

It's a very common way of allocating a variable-size array with bundled
size in a single chunk, which makes cleanup a single free() and gives
you some coherence between the size and data.

In that world, it's perfectly sane and rational to read past the end of
an array, and some compilers have explicit exceptions to reading past
the last element of an end-of-struct 1- or 0-sized array, just because
of this.

--
Lars Viklund | [hidden email]
_______________________________________________
Boost-users mailing list
[hidden email]
http://lists.boost.org/mailman/listinfo.cgi/boost-users
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: boost::filesystem::path operator/= is broken for char[1] in 1.48

Nat Goodspeed-2
In reply to this post by Ovanes Markarian
On Fri, Apr 20, 2012 at 1:15 PM, Ovanes Markarian
<[hidden email]> wrote:

> On Fri, Apr 20, 2012 at 6:08 PM, Nick Zavaritsky <[hidden email]> wrote:

>> boost::filesystem::path         some_path("first");
>> char                                    path_component[1]; /* = "second"
>> */
>> some_path /= path_component;
>>
>> Some_path value is expected to be "first/second" but it is "first".
>> I.e operator /= had no effect.

> I think the idea behind this is that any char* related string must be null
> terminated. So char[1] has only place for '\0'...

>> This technique is often used in C code.

> if you
> know that this buffer represents some special case, why don't you pass it as
> such to filesystem? &char[0], than boost::filesystem will not assume it got
> char[1] and will be looking to '\0' marker...

I strongly agree with Ovanes.

>> I believe the current operator =/ behavior is outright dangerous and
>> should be fixed.

Nope, sorry. If you pass a fixed-length array, a proper C++ library
*should* be sensitive to its max. Anything else is unsafe and should
be discouraged.

In the early days of C, people played all sorts of tricky games
because (a) the language didn't support any better alternatives and
(b) there was this cowboy mentality of "structured assembler."

The fact that we're still dealing with legacy C-style APIs is not a
good reason for a modern C++ library to blithely run past the declared
end of a char array. C++ tries very hard to bring type-safety to the
party. It is far more important to try to keep new code from crashing
than it is to accommodate such misleading APIs.

When you know you're dealing with code that willfully lies to the
compiler -- even if it's been frozen into an OS API -- use Ovanes's
workaround to pass char* rather than char[size]. To protect your
fellow coders from the crufty details, wrap it in a function layer
that obeys language rules.
_______________________________________________
Boost-users mailing list
[hidden email]
http://lists.boost.org/mailman/listinfo.cgi/boost-users
Loading...