[program_options] Some methods take const char*, others take std::string

classic Classic list List threaded Threaded
17 messages Options
Reply | Threaded
Open this post in threaded view
|

[program_options] Some methods take const char*, others take std::string

Gabriel Redner
Hi all,

Some methods in program_options accept const char*s (e.g.
option_description constructor), while others take const std::string&s
(e.g. option_description::key).  This makes no difference to the user
if he is passing string literals, but becomes annoying if he wants to
pass strings, or expressions which return strings.  Of course he can
put (...).c_str() everywhere, but this is ugly and unnecessary, since
program_options just takes its parameters and shoves them into string
variables anyways.  Since changing the parameter types to const
std::string& should not break existing code, I do not see a downside
to doing so (plus, it would allow me to get rid of .c_str()s
everywhere :-).

Does it seem reasonable for program_options public APIs to accept
std::strings instead of const char*s?  If so I would be happy to do
the legwork and submit a patch.

Thanks,
-Gabe
_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: [program_options] Some methods take const char*, others take std::string

Vladimir Prus-3
Gabriel Redner wrote:

> Hi all,
>
> Some methods in program_options accept const char*s (e.g.
> option_description constructor), while others take const std::string&s
> (e.g. option_description::key).  This makes no difference to the user
> if he is passing string literals, but becomes annoying if he wants to
> pass strings, or expressions which return strings.  Of course he can
> put (...).c_str() everywhere, but this is ugly and unnecessary, since
> program_options just takes its parameters and shoves them into string
> variables anyways.  Since changing the parameter types to const
> std::string& should not break existing code, I do not see a downside
> to doing so (plus, it would allow me to get rid of .c_str()s
> everywhere :-).
>
> Does it seem reasonable for program_options public APIs to accept
> std::strings instead of const char*s?  If so I would be happy to do
> the legwork and submit a patch.

This is: https://svn.boost.org/trac/boost/ticket/4142

The point is that using std::string would impose code size overhead when
you pass string literal. I would be happy to see a patch, but such a
patch would have to retain overloads taking const char*.

Thanks,

--
Vladimir Prus
CodeSourcery / Mentor Graphics
+7 (812) 677-68-40

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: [program_options] Some methods take const char*, others take std::string

Yakov Galka
Another option is to define

class str_ref : public iterator_range<const char*> { ... };

Convertible from both, string literals and std::strings, and provide only
one overload that accepts it everywhere.

Just an idea.

On Wed, Aug 10, 2011 at 23:13, Vladimir Prus <[hidden email]>wrote:

> Gabriel Redner wrote:
>
> > Hi all,
> >
> > Some methods in program_options accept const char*s (e.g.
> > option_description constructor), while others take const std::string&s
> > (e.g. option_description::key).  This makes no difference to the user
> > if he is passing string literals, but becomes annoying if he wants to
> > pass strings, or expressions which return strings.  Of course he can
> > put (...).c_str() everywhere, but this is ugly and unnecessary, since
> > program_options just takes its parameters and shoves them into string
> > variables anyways.  Since changing the parameter types to const
> > std::string& should not break existing code, I do not see a downside
> > to doing so (plus, it would allow me to get rid of .c_str()s
> > everywhere :-).
> >
> > Does it seem reasonable for program_options public APIs to accept
> > std::strings instead of const char*s?  If so I would be happy to do
> > the legwork and submit a patch.
>
> This is: https://svn.boost.org/trac/boost/ticket/4142
>
> The point is that using std::string would impose code size overhead when
> you pass string literal. I would be happy to see a patch, but such a
> patch would have to retain overloads taking const char*.
>
> Thanks,
>
> --
> Vladimir Prus
> CodeSourcery / Mentor Graphics
> +7 (812) 677-68-40
>
> _______________________________________________
> Unsubscribe & other changes:
> http://lists.boost.org/mailman/listinfo.cgi/boost
>



--
Yakov
_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: [program_options] Some methods take const char*, others take std::string

Gabriel Redner
Hi Vladimir,

I have added the std::string overloads, and my patch (against trunk)
is attached to this message.  I also updated relevant tests to check
both overloads, and I have run the full test suite.  The one thing I
was not able to do was check that the new generated docs are correct -
how do I regenerate the docs?

Additionally, if you don't like the wording (or the content) of my
changes to the 'rationale' section, feel free to rewrite it.

Thanks,
-Gabe

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

patchfile.patch (33K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [program_options] Some methods take const char*, others take std::string

Gabriel Redner
In reply to this post by Yakov Galka
On Thu, Aug 11, 2011 at 4:45 AM, Yakov Galka <[hidden email]> wrote:

> Another option is to define
>
> class str_ref : public iterator_range<const char*> { ... };
>
> Convertible from both, string literals and std::strings, and provide only
> one overload that accepts it everywhere.
>
> Just an idea.
> --
> Yakov

Hi Yakov,

Thanks for your suggestion!  In the end though, I decided to add
separate std::string overloads.  Since the interfaces involved are few
and simple, it seemed better to overload in the most straightforward
way, rather than add something fancier just to save a bit of typing on
the implementation side.

Thanks,
-Gabe
_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: [program_options] Some methods take const char*, others take std::string

Yakov Galka
It's not just for saving typing, it's for making the interface more generic.
Consider the following:

void f(const char*);
void f(const std::string&);

I can use it for null-terminated strings and for std::srings just fine. But
consider the case I want to pass it a sub-string of some string, or a string
stored in std::vector, or a non-null terminated string:

std::string str1 = ...;
std::vector<char> str2 = ...;
char str3[1024];
GetBinaryPacket(str3);

f(str1.substr(1, 5)); // unnecessary copy
f(std::string(str2.begin(), str2.end())); // unnecessary copy
f(std::string(str3, str3 + 5)); // unnecessary copy

Compare it with:

typedef iterator_range<const char*> str_ref;
void f(str_ref);

f(str_ref(str1.data()+1, str1.data()+6)); // no copy
f(str_ref(&str2[0], &str2[0] + str2.size())); // no copy
f(str_ref(str3, str3 + 5)); // no copy

Ok, so the above is longer than the first because we didn't define a nice
interface of str_ref yet, but it demonstrated the generality.

Note that the most general way is to use templates accepting any Range of
chars, just as boost string algorithms do. But this is not good for separate
compilation, so the above is the best compromise of efficiency, separate
compilation and generality we can get.

Yakov

On Mon, Aug 15, 2011 at 01:45, Gabriel Redner <[hidden email]> wrote:

> On Thu, Aug 11, 2011 at 4:45 AM, Yakov Galka <[hidden email]>
> wrote:
> > Another option is to define
> >
> > class str_ref : public iterator_range<const char*> { ... };
> >
> > Convertible from both, string literals and std::strings, and provide only
> > one overload that accepts it everywhere.
> >
> > Just an idea.
> > --
> > Yakov
>
> Hi Yakov,
>
> Thanks for your suggestion!  In the end though, I decided to add
> separate std::string overloads.  Since the interfaces involved are few
> and simple, it seemed better to overload in the most straightforward
> way, rather than add something fancier just to save a bit of typing on
> the implementation side.
>
> Thanks,
> -Gabe
> _______________________________________________
> Unsubscribe & other changes:
> http://lists.boost.org/mailman/listinfo.cgi/boost
>



--
Yakov
_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: [program_options] Some methods take const char*, others take std::string

Olaf van der Spek-3
In reply to this post by Yakov Galka
On Thu, Aug 11, 2011 at 10:45 AM, Yakov Galka <[hidden email]> wrote:
> class str_ref : public iterator_range<const char*> { ... };
>
> Convertible from both, string literals and std::strings, and provide only
> one overload that accepts it everywhere.

Doesn't iterator_range treat literals as arrays by default?
The str_ref idea is great, but IMO iterator_range isn't suitable for it.
Also, doesn't iterator_range require memory allocations itself?

Olaf
_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: [program_options] Some methods take const char*, others take std::string

Christopher Jefferson
In reply to this post by Yakov Galka

On 15 Aug 2011, at 07:57, Yakov Galka wrote:

> It's not just for saving typing, it's for making the interface more generic.
> Consider the following:
>
> void f(const char*);
> void f(const std::string&);
>
> I can use it for null-terminated strings and for std::srings just fine. But
> consider the case I want to pass it a sub-string of some string, or a string
> stored in std::vector, or a non-null terminated string:
>
> std::string str1 = ...;
> std::vector<char> str2 = ...;
> char str3[1024];
> GetBinaryPacket(str3);
>
> f(str1.substr(1, 5)); // unnecessary copy
> f(std::string(str2.begin(), str2.end())); // unnecessary copy
> f(std::string(str3, str3 + 5)); // unnecessary copy
>
> Compare it with:
>
> typedef iterator_range<const char*> str_ref;
> void f(str_ref);
>
> f(str_ref(str1.data()+1, str1.data()+6)); // no copy
> f(str_ref(&str2[0], &str2[0] + str2.size())); // no copy
> f(str_ref(str3, str3 + 5)); // no copy

Is worrying about high efficiency, and lack of copying, in program_options really that important? I would prefer the simplest possible interface, even if it has some small inefficiencies.

Chris
_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: [program_options] Some methods take const char*, others take std::string

Gabriel Redner
In reply to this post by Yakov Galka
Hi Yakov,

> It's not just for saving typing, it's for making the interface more generic.
> Consider the following:
> <snip>
> Ok, so the above is longer than the first because we didn't define a nice
> interface of str_ref yet, but it demonstrated the generality.

This is all true, but it's important to keep in mind our use cases.
In general, these APIs will be called with string literals.  The
next-most-common case is for them to be called with std::strings, and
other cases are even less likely.  So it seems best to do the simplest
thing possible which covers the common cases and does not rule out the
uncommon ones.

> Note that the most general way is to use templates accepting any Range of
> chars, just as boost string algorithms do. But this is not good for separate
> compilation, so the above is the best compromise of efficiency, separate
> compilation and generality we can get.

Your compromise is missing one factor - simplicity.  This is a simple
API which does a simple job.  Decorating it with generic tools will
make it harder to understand and to maintain, in exchange for some
small benefit in uncommon use cases.

In any case, it's Vladimir's library, so he has the last word on this.

Thanks,
-Gabe
_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: [program_options] Some methods take const char*, others take std::string

Nevin Liber
On 15 August 2011 14:33, Gabriel Redner <[hidden email]> wrote:

>
> Your compromise is missing one factor - simplicity.  This is a simple
> API which does a simple job.  Decorating it with generic tools will
> make it harder to understand and to maintain, in exchange for some
> small benefit in uncommon use cases.
>

When the library is taken is isolation, that is true.  When taken are part
of a greater whole (Boost), it makes it harder, as it is now has an
interface that is inconsistent with other libraries.  Having to keep track
of which libraries support which "similar but not quite the same" interface
does not make things simpler.


> In any case, it's Vladimir's library, so he has the last word on this.
>

Of course.
--
 Nevin ":-)" Liber  <mailto:[hidden email]>  (847) 691-1404
_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: [program_options] Some methods take const char*, others take std::string

Olaf van der Spek-3
In reply to this post by Gabriel Redner
On Mon, Aug 15, 2011 at 9:33 PM, Gabriel Redner <[hidden email]> wrote:
> This is all true, but it's important to keep in mind our use cases.
> In general, these APIs will be called with string literals.  The
> next-most-common case is for them to be called with std::strings, and
> other cases are even less likely.  So it seems best to do the simplest
> thing possible which covers the common cases and does not rule out the
> uncommon ones.

Using str_ref would require only a single overload, while using const
char* and const str::string& would require two. So what is the
simplest thing possible?

> Your compromise is missing one factor - simplicity.  This is a simple
> API which does a simple job.  Decorating it with generic tools will
> make it harder to understand and to maintain, in exchange for some
> small benefit in uncommon use cases.

This isn't the only lib that takes string references. It'd be nice to
solve this properly once and for all.

Olaf
_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: [program_options] Some methods take const char*, others take std::string

Olaf van der Spek-3
On Mon, Aug 15, 2011 at 10:25 PM, Olaf van der Spek <[hidden email]> wrote:

> On Mon, Aug 15, 2011 at 9:33 PM, Gabriel Redner <[hidden email]> wrote:
>> This is all true, but it's important to keep in mind our use cases.
>> In general, these APIs will be called with string literals.  The
>> next-most-common case is for them to be called with std::strings, and
>> other cases are even less likely.  So it seems best to do the simplest
>> thing possible which covers the common cases and does not rule out the
>> uncommon ones.
>
> Using str_ref would require only a single overload, while using const
> char* and const str::string& would require two. So what is the
> simplest thing possible?

Not using str_ref would require a third overload: (const char*,
size_t). Even then, it still doesn't support for example vector<char>,
CString (MFC) or QString (QT).

Olaf
_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: [program_options] Some methods take const char*, others take std::string

Yakov Galka
In reply to this post by Olaf van der Spek-3
On Mon, Aug 15, 2011 at 19:44, Olaf van der Spek <[hidden email]> wrote:

> On Thu, Aug 11, 2011 at 10:45 AM, Yakov Galka <[hidden email]>
> wrote:
> > class str_ref : public iterator_range<const char*> { ... };
> >
> > Convertible from both, string literals and std::strings, and provide only
> > one overload that accepts it everywhere.
>
> Doesn't iterator_range treat literals as arrays by default?
> The str_ref idea is great, but IMO iterator_range isn't suitable for it.
> Also, doesn't iterator_range require memory allocations itself?
>

1) It's a proof of concept, not an implementation. iterator_range is just
one of the infinity of types which can be mapped to the Range concept and
used wherever Ranges expected.

2) Yes, it treats string literals as arrays, so what? It's exactly what we
want.
If you're talking about including or not the null terminator, then I'm still
not sure on this point. We can say that our strings don't contain zeros in
the middle (which is acceptable for function expecting const char*), then if
end() points to a null terminator we can use it as an optimization when
passing to a low-level function accepting a zero-terminated string. If end()
doesn't point to zero, then we must do a copy when passing to something like
fread(). The alternative is to never include the null terminator when
initializing from a string. This solution is also good, since only a small
fraction of our functions pass the strings to the system (where a null
terminated string is needed), and even then sometimes we *must* do a copy,
especially if we want to support Unicode on windows (consider all
boost::interprocess which cannot be used now for opening Unicode paths on
windows).

3) No, iterator_range just stores a pair of iterators, no memory allocations
are done. Also see 1), it's irrelevant to the discussion.


--
Yakov
_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: [program_options] Some methods take const char*, others take std::string

Yakov Galka
In reply to this post by Christopher Jefferson
On Mon, Aug 15, 2011 at 19:49, Christopher Jefferson
<[hidden email]>wrote:

>  [...]
> Is worrying about high efficiency, and lack of copying, in program_options
> really that important? I would prefer the simplest possible interface, even
> if it has some small inefficiencies.
>

This is not about program_options, it's about passing strings in general.
It's just this issue came up in this thread, so I proposed it here.


On Mon, Aug 15, 2011 at 22:33, Gabriel Redner <[hidden email]> wrote:

> Hi Yakov,
>
> > It's not just for saving typing, it's for making the interface more
> generic.
> > Consider the following:
> > <snip>
> > Ok, so the above is longer than the first because we didn't define a nice
> > interface of str_ref yet, but it demonstrated the generality.
>
> This is all true, but it's important to keep in mind our use cases.
> In general, these APIs will be called with string literals.  The
> next-most-common case is for them to be called with std::strings, and
> other cases are even less likely.  So it seems best to do the simplest
> thing possible which covers the common cases and does not rule out the
> uncommon ones.
>

There are dozens of other string classes people use. If anyone could map his
contiguous storage string with compatible encoding to str_ref, it could
simplify things greatly.


>
> > Note that the most general way is to use templates accepting any Range of
> > chars, just as boost string algorithms do. But this is not good for
> separate
> > compilation, so the above is the best compromise of efficiency, separate
> > compilation and generality we can get.
>
> Your compromise is missing one factor - simplicity.  This is a simple
> API which does a simple job.  Decorating it with generic tools will
> make it harder to understand and to maintain, in exchange for some
> small benefit in uncommon use cases.
>

Once str_ref is implemented, it makes everything else simpler. So I think
that in the end things are getting simpler. Also

 In any case, it's Vladimir's library, so he has the last word on this.
>

This is not just about this library. It's relevant to interprocess too, for
example.


On Mon, Aug 15, 2011 at 23:25, Olaf van der Spek <[hidden email]> wrote:

> [...]
> This isn't the only lib that takes string references. It'd be nice to
> solve this properly once and for all.
>

Exactly!

Yet I must admit that I have a very little experience with my own proposal.
I used it successfully in compiler application where passing references to
the original strings allowed one cheaply (and lazily) compute the source of
an error.

--
Yakov
_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: [program_options] Some methods take const char*, others take std::string

Olaf van der Spek-3
In reply to this post by Yakov Galka
On Wed, Aug 24, 2011 at 5:27 PM, Yakov Galka <[hidden email]> wrote:
> 2) Yes, it treats string literals as arrays, so what? It's exactly what we
> want.

No, because it'll include the null terminator.

> If you're talking about including or not the null terminator, then I'm still
> not sure on this point. We can say that our strings don't contain zeros in
> the middle (which is acceptable for function expecting const char*), then if

It'd be nice to support 'binary' strings too.

> end() points to a null terminator we can use it as an optimization when

end() will point one past the terminator.

> 3) No, iterator_range just stores a pair of iterators, no memory allocations
> are done. Also see 1), it's irrelevant to the discussion.

Ah, I think I'm confused with another type then.

Olaf
_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: [program_options] Some methods take const char*, others take std::string

Yakov Galka
On Wed, Aug 24, 2011 at 22:22, Olaf van der Spek <[hidden email]> wrote:

> On Wed, Aug 24, 2011 at 5:27 PM, Yakov Galka <[hidden email]>
> wrote:
> > 2) Yes, it treats string literals as arrays, so what? It's exactly what
> we
> > want.
>
> No, because it'll include the null terminator.
> [...]
> end() will point one past the terminator.
>

The two are exactly opposite. So do you want to include the null in the
range or not?

Yes, my mistake. When I said 'end() points to' I meant 'end()-1 points to'.

> If you're talking about including or not the null terminator, then I'm
> still
> > not sure on this point. We can say that our strings don't contain zeros
> in
> > the middle (which is acceptable for function expecting const char*), then
> if
>
It'd be nice to support 'binary' strings too.
>

I remember that you brought up this topic some time ago. However I don't
think that you need anything special to handle binary chunks of data:

typedef iterator_range<const char*> data_ref;

Unlike for str_ref, this *is* a complete, working implementation.

--
Yakov
_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: [program_options] Some methods take const char*, others take std::string

Olaf van der Spek-3
On Sat, Aug 27, 2011 at 10:26 AM, Yakov Galka <[hidden email]> wrote:

> On Wed, Aug 24, 2011 at 22:22, Olaf van der Spek <[hidden email]> wrote:
>
>> On Wed, Aug 24, 2011 at 5:27 PM, Yakov Galka <[hidden email]>
>> wrote:
>> > 2) Yes, it treats string literals as arrays, so what? It's exactly what
>> we
>> > want.
>>
>> No, because it'll include the null terminator.
>> [...]
>> end() will point one past the terminator.
>>
>
> The two are exactly opposite.

No they're not. end() is not included in the range, so if end() points
to one past the terminator, the terminator is part of the range.
if end() points to the terminator, the terminator is not part of the range.

> So do you want to include the null in the
> range or not?

I don't.

> It'd be nice to support 'binary' strings too.
>>
>
> I remember that you brought up this topic some time ago. However I don't
> think that you need anything special to handle binary chunks of data:
>
> typedef iterator_range<const char*> data_ref;
>
> Unlike for str_ref, this *is* a complete, working implementation.

The parameter types are usually (const void*, size_t). Not const char*.

Olaf
_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost