Boost.Algorithm design question

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
83 messages Options
12345
Reply | Threaded
Open this post in threaded view
|

Boost.Algorithm design question

Marshall Clow-2
So, what do people prefer (and why?):

template<typename InputIterator, typename V>
bool none_of_equal ( InputIterator first, InputIterator last, V const &val )
{
    for ( ; first != last; ++first )
        if ( val == *first )
            return false;
    return true;
}

or

template<typename InputIterator, >
bool none_of_equal ( InputIterator first, InputIterator last, iterator_traits<InputIterator>::value_type const &val )
{
    for ( ; first != last; ++first )
        if ( val == *first )
            return false;
    return true;
}

In the first case, I think there's (possible) conversion from V to iterator_traits<InputIterator>::value_type each time through the loop.
In the second case, there is not.

Of course, the (mythical?) smart compiler would do the conversion once and save the result for reuse.

-- Marshall

Marshall Clow     Idio Software   <mailto:[hidden email]>

A.D. 1517: Martin Luther nails his 95 Theses to the church door and is promptly moderated down to (-1, Flamebait).
        -- Yu Suzuki


_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: Boost.Algorithm design question

Stephan T. Lavavej-2
[Marshall Clow]
> So, what do people prefer (and why?):

> template<typename InputIterator, typename V>
> bool none_of_equal ( InputIterator first, InputIterator last, V const &val )

> or

> template<typename InputIterator, >
> bool none_of_equal ( InputIterator first, InputIterator last, iterator_traits<InputIterator>::value_type const &val )

> In the first case, I think there's (possible) conversion from V to iterator_traits<InputIterator>::value_type each time through the loop.
> In the second case, there is not.

#1 is better. It follows the STL's conventions (e.g. look at std::find()) and permits heterogeneous comparisons.

Consider vector<string> and val being "meow". string is directly comparable to const char *, without any conversions or temporaries.

Even better, consider vector<const char *> and val being a string. Here, the element type is still directly comparable to val's type, but val's type is not implicitly convertible to the element type.

It is true that #1 can result in O(N) conversions, but that's really up to the element/value types involved. Typically, whenever the value type is implicitly convertible to the element type, and the element type is comparable with itself, a direct comparison between the two types could also be provided (as is the case with const char * and string), and will be provided when performance is important.

Heterogeneous comparisons are surprisingly popular, and the STL is moving to support them better. For example, std::lower_bound() didn't support them in C++03, and does in C++11. std::map still doesn't, and users complain about it from time to time.

STL

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: Boost.Algorithm design question

Marshall Clow-2
On Oct 5, 2011, at 3:53 PM, Stephan T. Lavavej wrote:

> [Marshall Clow]
>> So, what do people prefer (and why?):
>
>> template<typename InputIterator, typename V>
>> bool none_of_equal ( InputIterator first, InputIterator last, V const &val )
>
>> or
>
>> template<typename InputIterator, >
>> bool none_of_equal ( InputIterator first, InputIterator last, iterator_traits<InputIterator>::value_type const &val )
>
>> In the first case, I think there's (possible) conversion from V to iterator_traits<InputIterator>::value_type each time through the loop.
>> In the second case, there is not.
>
> #1 is better. It follows the STL's conventions (e.g. look at std::find()) and permits heterogeneous comparisons.
>
> Consider vector<string> and val being "meow". string is directly comparable to const char *, without any conversions or temporaries.
>
> Even better, consider vector<const char *> and val being a string. Here, the element type is still directly comparable to val's type, but val's type is not implicitly convertible to the element type.
>
> It is true that #1 can result in O(N) conversions, but that's really up to the element/value types involved. Typically, whenever the value type is implicitly convertible to the element type, and the element type is comparable with itself, a direct comparison between the two types could also be provided (as is the case with const char * and string), and will be provided when performance is important.
>
> Heterogeneous comparisons are surprisingly popular, and the STL is moving to support them better. For example, std::lower_bound() didn't support them in C++03, and does in C++11. std::map still doesn't, and users complain about it from time to time.

Thanks for explaining the rationale.
That's what I was looking for.


-- Marshall

Marshall Clow     Idio Software   <mailto:[hidden email]>

A.D. 1517: Martin Luther nails his 95 Theses to the church door and is promptly moderated down to (-1, Flamebait).
        -- Yu Suzuki


_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: Boost.Algorithm design question

lcaminiti
In reply to this post by Stephan T. Lavavej-2
Stephan T. Lavavej-2 wrote
[Marshall Clow]
> So, what do people prefer (and why?):

> template<typename InputIterator, typename V> 
> bool none_of_equal ( InputIterator first, InputIterator last, V const &val )

> or

> template<typename InputIterator, > 
> bool none_of_equal ( InputIterator first, InputIterator last, iterator_traits<InputIterator>::value_type const &val )

> In the first case, I think there's (possible) conversion from V to iterator_traits<InputIterator>::value_type each time through the loop.
> In the second case, there is not.

#1 is better. It follows the STL's conventions (e.g. look at std::find()) and permits heterogeneous comparisons.
+1

--Lorenzo
Reply | Threaded
Open this post in threaded view
|

Re: Boost.Algorithm design question

thorsten.ottosen
In reply to this post by Stephan T. Lavavej-2
Den 06-10-2011 00:53, Stephan T. Lavavej skrev:

> [Marshall Clow]
>> So, what do people prefer (and why?):
>
>> template<typename InputIterator, typename V> bool none_of_equal (
>> InputIterator first, InputIterator last, V const&val )
>
>> or
>
>> template<typename InputIterator,> bool none_of_equal (
>> InputIterator first, InputIterator last,
>> iterator_traits<InputIterator>::value_type const&val )
>
>> In the first case, I think there's (possible) conversion from V to
>> iterator_traits<InputIterator>::value_type each time through the
>> loop. In the second case, there is not.
>
> #1 is better. It follows the STL's conventions (e.g. look at
> std::find()) and permits heterogeneous comparisons.
>
> Consider vector<string>  and val being "meow". string is directly
> comparable to const char *, without any conversions or temporaries.
>
> Even better, consider vector<const char *>  and val being a string.
> Here, the element type is still directly comparable to val's type,
> but val's type is not implicitly convertible to the element type.

Even better, don't use vector<const char*>. What o you need that for?

[snip]

> Heterogeneous comparisons are surprisingly popular, and the STL is
> moving to support them better. For example, std::lower_bound() didn't
> support them in C++03, and does in C++11. std::map still doesn't, and
> users complain about it from time to time.


Well, option number two has the distinct advantage that it generates
less code. Option number one would generate code for each length of a
character literal.

To avoid such mess, the implementation can try to decay the value
an then forward to the real implementation. Or the user can use
boost::as_literal().

-Thorsten

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: Boost.Algorithm design question

Dave Abrahams
In reply to this post by Marshall Clow-2

on Wed Oct 05 2011, Marshall Clow <mclow.lists-AT-gmail.com> wrote:

> So, what do people prefer (and why?):
>
> template<typename InputIterator, typename V>
> bool none_of_equal ( InputIterator first, InputIterator last, V const &val )
> {
>     for ( ; first != last; ++first )
>         if ( val == *first )
>             return false;
>     return true;
> }
>
> or
>
> template<typename InputIterator, >
> bool none_of_equal ( InputIterator first, InputIterator last, iterator_traits<InputIterator>::value_type const &val )
> {
>     for ( ; first != last; ++first )
>         if ( val == *first )
>             return false;
>     return true;
> }
>
> In the first case, I think there's (possible) conversion from V to
> iterator_traits<InputIterator>::value_type each time through the loop.
> In the second case, there is not.

I prefer the second one.  

- It's the one that matches the mental model: I don't know what the
  first algorithm /means/.
- An exact specification for the first one is more complicated
- It's more efficient in the usual case when a conversion would be
  needed

> Of course, the (mythical?) smart compiler would do the conversion once
> and save the result for reuse.

I think you know the compiler would have to be able to verify that the
conversion had no side-effects; not something we're likely to see for
some time.

--
Dave Abrahams
BoostPro Computing
http://www.boostpro.com


_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: Boost.Algorithm design question

Dave Abrahams
In reply to this post by Stephan T. Lavavej-2

on Wed Oct 05 2011, "Stephan T. Lavavej" <stl-AT-exchange.microsoft.com> wrote:

> [Marshall Clow]
>> So, what do people prefer (and why?):
>
>
>> template<typename InputIterator, typename V>
>> bool none_of_equal ( InputIterator first, InputIterator last, V const &val )
>
>> or
>
>> template<typename InputIterator, >
>> bool none_of_equal ( InputIterator first, InputIterator last,
>> iterator_traits<InputIterator>::value_type const &val )
>
>> In the first case, I think there's (possible) conversion from V to
>> iterator_traits<InputIterator>::value_type each time through the
>> loop.
>> In the second case, there is not.
>
> #1 is better. It follows the STL's conventions (e.g. look at
> std::find()) and permits heterogeneous comparisons.

Sorry, STL.  The other STL's conventions are wrong. :-)

> Consider vector<string> and val being "meow". string is directly
> comparable to const char *, without any conversions or temporaries.

You got lucky this time.

> Even better, consider vector<const char *> and val being a
> string. Here, the element type is still directly comparable to val's
> type, but val's type is not implicitly convertible to the element
> type.

That's cute.  IMO the principled thing to do in that case is to use the
version that takes the additional predicate.

> It is true that #1 can result in O(N) conversions, but that's really
> up to the element/value types involved. Typically, whenever the value
> type is implicitly convertible to the element type, and the element
> type is comparable with itself, a direct comparison between the two
> types could also be provided (as is the case with const char * and
> string), and will be provided when performance is important.
>
> Heterogeneous comparisons are surprisingly popular, and the STL is
> moving to support them better. For example, std::lower_bound() didn't
> support them in C++03, and does in C++11.

/me realizes he is responsible for this change

But that's *totally* different.

/me fumbles around for the explanation

For the version that uses "<", it's an unintentional side-effect of
making the specification uniform.  The function signature already took
an unconstrained object by reference, and my paper was not trying to
change that.  It just happened to make the /semantics/ of using a type
different from the value type of the sequence meaningful in that case.
It also make the specification *simpler* overall and easier to
understand.

The _purpose_ was to support heterogeneity for the version that uses a
function.  The description in terms of partitioning is a more principled
and generic way to describe the actual requirements of the algorithm.
There was never any intention to promote the use of heterogeneous "<",
which again, doesn't make sense.

--
Dave Abrahams
BoostPro Computing
http://www.boostpro.com


_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: Boost.Algorithm design question

Vicente Botet
In reply to this post by Marshall Clow-2
Marshall Clow-2 wrote
So, what do people prefer (and why?):

template<typename InputIterator, typename V> 
bool none_of_equal ( InputIterator first, InputIterator last, V const &val )
{
    for ( ; first != last; ++first )
        if ( val == *first )
            return false;
    return true;
}

or

template<typename InputIterator, > 
bool none_of_equal ( InputIterator first, InputIterator last, iterator_traits<InputIterator>::value_type const &val )
{
    for ( ; first != last; ++first )
        if ( val == *first )
            return false;
    return true;
}

In the first case, I think there's (possible) conversion from V to iterator_traits<InputIterator>::value_type each time through the loop.
In the second case, there is not.
Couldn't you provide both using SFINAE?

Best,
Vicente
Reply | Threaded
Open this post in threaded view
|

Re: Boost.Algorithm design question

lcaminiti
In reply to this post by Dave Abrahams
Dave Abrahams wrote
on Wed Oct 05 2011, "Stephan T. Lavavej" <stl-AT-exchange.microsoft.com> wrote:

> [Marshall Clow]
>> So, what do people prefer (and why?):
>
>
>> template<typename InputIterator, typename V> 
>> bool none_of_equal ( InputIterator first, InputIterator last, V const &val )
>
>> or
>
>> template<typename InputIterator, > 
>> bool none_of_equal ( InputIterator first, InputIterator last,
>> iterator_traits<InputIterator>::value_type const &val )
>
>> In the first case, I think there's (possible) conversion from V to
>> iterator_traits<InputIterator>::value_type each time through the
>> loop.
>> In the second case, there is not.
>
> #1 is better. It follows the STL's conventions (e.g. look at
> std::find()) and permits heterogeneous comparisons.

Sorry, STL.  The other STL's conventions are wrong. :-)

> Consider vector<string> and val being "meow". string is directly
> comparable to const char *, without any conversions or temporaries.

You got lucky this time.

> Even better, consider vector<const char *> and val being a
> string. Here, the element type is still directly comparable to val's
> type, but val's type is not implicitly convertible to the element
> type.

That's cute.  IMO the principled thing to do in that case is to use the
version that takes the additional predicate.

> It is true that #1 can result in O(N) conversions, but that's really
> up to the element/value types involved. Typically, whenever the value
> type is implicitly convertible to the element type, and the element
> type is comparable with itself, a direct comparison between the two
> types could also be provided (as is the case with const char * and
> string), and will be provided when performance is important.
>
> Heterogeneous comparisons are surprisingly popular, and the STL is
> moving to support them better. For example, std::lower_bound() didn't
> support them in C++03, and does in C++11.

/me realizes he is responsible for this change

But that's *totally* different.

/me fumbles around for the explanation

For the version that uses "<", it's an unintentional side-effect of
making the specification uniform.  The function signature already took
an unconstrained object by reference, and my paper was not trying to
change that.  It just happened to make the /semantics/ of using a type
different from the value type of the sequence meaningful in that case.
It also make the specification *simpler* overall and easier to
understand.

The _purpose_ was to support heterogeneity for the version that uses a
function.  The description in terms of partitioning is a more principled
and generic way to describe the actual requirements of the algorithm.
There was never any intention to promote the use of heterogeneous "<",
which again, doesn't make sense.
From the standard:

// 25.2, non-modifying sequence operations:
template <class InputIterator, class Predicate>
bool none_of(InputIterator first, InputIterator last, Predicate pred);
template<class InputIterator, class T>
InputIterator find(InputIterator first, InputIterator last, const T& value);

Given that I am used to how the STL works, at first I'd expect Boost.Algorithm to follow the same interface where it makes sense-- why would none_of_equal use an interface different from std::find? Now, if truly the STL interface is incorrect and Boost.Algorithm needs to use iterator_traits::value, wouldn't it make sense to also change the STL interface in the C++ standard?

I think what I am trying to ask is: Does anyone know why std::find does not use use iterator_traits::value?

BTW, I find none_equal /any_equal / all_equal / one_equal more readable than none_of_equal / any_of_equal / all_of_equal / one_of_equal. I'd keep the "of" postfix only for the predicate versions as the STL does (and similarly to find_if). What do you think?

Thanks,
--Lorenzo
Reply | Threaded
Open this post in threaded view
|

Re: Boost.Algorithm design question

Dave Abrahams

on Thu Oct 06 2011, lcaminiti <lorcaminiti-AT-gmail.com> wrote:

>From the standard:
>
> // 25.2, non-modifying sequence operations:
> template <class InputIterator, class Predicate>
> bool none_of(InputIterator first, InputIterator last, Predicate pred);
> template<class InputIterator, class T>
> InputIterator find(InputIterator first, InputIterator last, const T& value);
>
> Given that I am used to how the STL works, at first I'd expect
> Boost.Algorithm to follow the same interface where it makes sense-- why
> would none_of_equal use an interface different from std::find? Now, if truly
> the STL interface is incorrect and Boost.Algorithm needs to use
> iterator_traits::value, wouldn't it make sense to also change the STL
> interface in the C++ standard?

Yes, it would make sense.  No, we won't do it, because it would break
code.

> I think what I am trying to ask is: Does anyone know why std::find does not
> use use iterator_traits::value?

IIUC it was an oversight, and IIRC Stepanov's later work (EOP) does
restrict the type of "value".


--
Dave Abrahams
BoostPro Computing
http://www.boostpro.com


_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: Boost.Algorithm design question

Marshall Clow-2
On Oct 6, 2011, at 8:30 AM, Dave Abrahams wrote:

> on Thu Oct 06 2011, lcaminiti <lorcaminiti-AT-gmail.com> wrote:
>
>> From the standard:
>>
>> // 25.2, non-modifying sequence operations:
>> template <class InputIterator, class Predicate>
>> bool none_of(InputIterator first, InputIterator last, Predicate pred);
>> template<class InputIterator, class T>
>> InputIterator find(InputIterator first, InputIterator last, const T& value);
>>
>> Given that I am used to how the STL works, at first I'd expect
>> Boost.Algorithm to follow the same interface where it makes sense-- why
>> would none_of_equal use an interface different from std::find? Now, if truly
>> the STL interface is incorrect and Boost.Algorithm needs to use
>> iterator_traits::value, wouldn't it make sense to also change the STL
>> interface in the C++ standard?
>
> Yes, it would make sense.  No, we won't do it, because it would break
> code.
>
>> I think what I am trying to ask is: Does anyone know why std::find does not
>> use use iterator_traits::value?
>
> IIUC it was an oversight, and IIRC Stepanov's later work (EOP) does
> restrict the type of "value".

I could be misremembering, but didn't iterator_traits come after the first release of the STL?


-- Marshall

Marshall Clow     Idio Software   <mailto:[hidden email]>

A.D. 1517: Martin Luther nails his 95 Theses to the church door and is promptly moderated down to (-1, Flamebait).
        -- Yu Suzuki


_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: Boost.Algorithm design question

Christian Holmquist
In reply to this post by Stephan T. Lavavej-2
On 5 October 2011 17:53, Stephan T. Lavavej <[hidden email]>wrote:

> [Marshall Clow]
> > So, what do people prefer (and why?):
>
> > template<typename InputIterator, typename V>
> > bool none_of_equal ( InputIterator first, InputIterator last, V const
> &val )
>
> > or
>
> > template<typename InputIterator, >
> > bool none_of_equal ( InputIterator first, InputIterator last,
> iterator_traits<InputIterator>::value_type const &val )
>
> > In the first case, I think there's (possible) conversion from V to
> iterator_traits<InputIterator>::value_type each time through the loop.
> > In the second case, there is not.
>
> #1 is better. It follows the STL's conventions (e.g. look at std::find())
> and permits heterogeneous comparisons.
>
> Consider vector<string> and val being "meow". string is directly comparable
> to const char *, without any conversions or temporaries.
>
> Even better, consider vector<const char *> and val being a string. Here,
> the element type is still directly comparable to val's type, but val's type
> is not implicitly convertible to the element type.
>
>
The never ending reference to std::string vs const char* performance thing.
I'm simply amazed about the simplicity of other developers performance
issues, if this was the main bottleneck I would ever deal with I could have
more time drinking coffee =)


-IF- such a case would show up in a profiler, I think it's only fair to let
the developer optimize his scenario with some custom code, that to open up
this can of worms for everyone else.
Honestly I hadn't considered how dangerous std::find etc is until a
discussion came up related to clamp() in other thread.
There is a std::function call find_if (
http://www.sgi.com/tech/stl/find_if.html), let it handle the odd-ball of
comparing unrelated types.

I understand the rational for not changing the c++ standard, but since
Boost.Algorithms is a new library with no requirements on backward
compatibility, let's not redo the same mistake.

- Christian

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: Boost.Algorithm design question

Phil Endecott-48
In reply to this post by thorsten.ottosen
Thorsten Ottosen wrote:

> Den 06-10-2011 00:53, Stephan T. Lavavej skrev:
>> [Marshall Clow]
>>> So, what do people prefer (and why?):
>>
>>> template<typename InputIterator, typename V> bool none_of_equal (
>>> InputIterator first, InputIterator last, V const&val )
>>
>>> or
>>
>>> template<typename InputIterator,> bool none_of_equal (
>>> InputIterator first, InputIterator last,
>>> iterator_traits<InputIterator>::value_type const&val )
>>
>>> In the first case, I think there's (possible) conversion from V to
>>> iterator_traits<InputIterator>::value_type each time through the
>>> loop. In the second case, there is not.
>>
>> #1 is better. It follows the STL's conventions (e.g. look at
>> std::find()) and permits heterogeneous comparisons.
>>
>> Consider vector<string>  and val being "meow". string is directly
>> comparable to const char *, without any conversions or temporaries.
>>
>> Even better, consider vector<const char *>  and val being a string.
>> Here, the element type is still directly comparable to val's type,
>> but val's type is not implicitly convertible to the element type.
>
> Even better, don't use vector<const char*>. What o you need that for?

Here's a real motivating example where I want to use vector<const
char*>.  I have a large file containing null-terminated strings which I
memory map.  After opening the file, I construct some sort of index of
those strings in a sorted vector.  Then I search for things using
std::equal_range, std::lower_bound, or similar.

The important thing is that for efficiency the algorithms mustn't
create temporary strings; they must compare the const char* with the
string that I'm searching for directly.  std::string provides operator
overloads to do this.  I want the algorithms to use those heterogeneous comparisons.

Dave Abrahams has said a couple of times that he doesn't even know what
it means to try to do this.  This baffles me, but I trust that there is
some validity in his argument.  But it does lead me to the point that I
made in my review, and my answer to Marshall's question at the start of
this thread: I don't want Boost to waste time on either of those
none_of_equal implementations.  We are all capable of writing our own
4-line none_of_equal implementations that do precisely what we want
them to do.  Why argue about which is the "one true way" to do it, when
we can all have whichever one we prefer?  Instead, let's spend time on
things where the implementation is more than 4 lines long - like a
variant of std::list with O(1) splice!


Regards,  Phil.




_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: Boost.Algorithm design question

Stewart, Robert
In reply to this post by Christian Holmquist
Christian Holmquist wrote:

> On 5 October 2011 17:53, Stephan T. Lavavej
> <[hidden email]>wrote:
>
> > #1 is better. It follows the STL's conventions (e.g. look at
> > std::find()) and permits heterogeneous comparisons.
> >
> > Consider vector<string> and val being "meow". string is
> > directly comparable to const char *, without any conversions
> > or temporaries.
>
> The never ending reference to std::string vs const char*
> performance thing. I'm simply amazed about the simplicity of
> other developers performance issues, if this was the main
> bottleneck I would ever deal with I could have more time
> drinking coffee =)

Too much coffee is bad for you, so this is a good thing!

> -IF- such a case would show up in a profiler, I think it's
> only fair to let the developer optimize his scenario with
> some custom code, that to open up this can of worms for
> everyone else.

In the "meow" example above, forcing the construction of a std::string from the string literal implies a free store allocation which, in turn, generally implies a global lock.  That means it can be a drag on MT performance.  A library function shouldn't impose that if preventing or avoiding it is practicable.

_____
Rob Stewart                           [hidden email]
Software Engineer                     using std::disclaimer;
Dev Tools & Components
Susquehanna International Group, LLP  http://www.sig.com




________________________________

IMPORTANT: The information contained in this email and/or its attachments is confidential. If you are not the intended recipient, please notify the sender immediately by reply and immediately delete this message and all its attachments. Any review, use, reproduction, disclosure or dissemination of this message or any attachment by an unintended recipient is strictly prohibited. Neither this message nor any attachment is intended as or should be construed as an offer, solicitation or recommendation to buy or sell any security or other financial instrument. Neither the sender, his or her employer nor any of their respective affiliates makes any warranties as to the completeness or accuracy of any of the information contained herein or that this message or any of its attachments is free of viruses.

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: Boost.Algorithm design question

Dave Abrahams

on Thu Oct 06 2011, "Stewart, Robert" <Robert.Stewart-AT-sig.com> wrote:

> Christian Holmquist wrote:
>> On 5 October 2011 17:53, Stephan T. Lavavej
>
>> <[hidden email]>wrote:
>>
>> > #1 is better. It follows the STL's conventions (e.g. look at
>> > std::find()) and permits heterogeneous comparisons.
>> >
>> > Consider vector<string> and val being "meow". string is
>> > directly comparable to const char *, without any conversions
>> > or temporaries.
>>
>> The never ending reference to std::string vs const char*
>> performance thing. I'm simply amazed about the simplicity of
>> other developers performance issues, if this was the main
>> bottleneck I would ever deal with I could have more time
>> drinking coffee =)
>
> Too much coffee is bad for you, so this is a good thing!
>
>> -IF- such a case would show up in a profiler, I think it's
>> only fair to let the developer optimize his scenario with
>> some custom code, that to open up this can of worms for
>> everyone else.
>
> In the "meow" example above, forcing the construction of a std::string
> from the string literal implies a free store allocation which, in
> turn, generally implies a global lock.  That means it can be a drag on
> MT performance.  A library function shouldn't impose that if
> preventing or avoiding it is practicable.

Avoiding it is easy for the user: use the version that takes a
predicate.

I don't think you can evaluate these choices just by looking at the
implementations.  Before anyone else votes option 1, I'd like to see
somebody write the description of the algorithm, including the concept
requirements.  With option 2, we know that == has to be an equivalence
relation.  The semantics in option 1 are a lot fuzzier.

--
Dave Abrahams
BoostPro Computing
http://www.boostpro.com


_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: Boost.Algorithm design question

Eric Niebler-3
In reply to this post by Marshall Clow-2
On 10/6/2011 8:51 AM, Marshall Clow wrote:
> On Oct 6, 2011, at 8:30 AM, Dave Abrahams wrote:
>> on Thu Oct 06 2011, lcaminiti <lorcaminiti-AT-gmail.com> wrote:
>>> I think what I am trying to ask is: Does anyone know why std::find does not
>>> use use iterator_traits::value?
>>
>> IIUC it was an oversight, and IIRC Stepanov's later work (EOP) does
>> restrict the type of "value".
>
> I could be misremembering, but didn't iterator_traits come after the first release of the STL?

Marshall, I don't see how it could. Without iterator_traits, raw
pointers couldn't be valid iterators, and that was an important part of
Stepanov's vision.

--
Eric Niebler
BoostPro Computing
http://www.boostpro.com

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: Boost.Algorithm design question

lcaminiti
In reply to this post by Dave Abrahams
Dave Abrahams wrote
I don't think you can evaluate these choices just by looking at the
implementations.  Before anyone else votes option 1, I'd like to see
Well, I'm not so sure I like option 1 anymore... if option 1 was an oversight in STL then there is no sense in perpetuating the specification error in Boost.Algorithm (perhaps Boost.Algorithm should then go with option 2 and explain the rationale of why it's different than what STL does for example with std::find).

somebody write the description of the algorithm, including the concept
requirements.  With option 2, we know that == has to be an equivalence
relation.  The semantics in option 1 are a lot fuzzier.
Good point! I'll try to start :)

template< typename Iter, typename Val >
    requires
        InputIterator<Iter>,
        EqualityComparable<iterator_traits<Iter>::value_type, Val>
bool none_of_equal ( Iter first, Iter last, Val const& val ) ; // option 1a

template< typename Iter >
    requires
        InputIterator<Iter>,
        EqualityComparable<iterator_traits<Iter>::value_type>
bool none_of_equal ( Iter first, Iter last, iterator_traits<Iter>::value_type const& value ) ; // option 2

Essentially, I'd think option 1 requires operator== that satisfies has_equal_to<iterator_traits<Iter>::value_type, Val> while option 2 requires operator== that satisfies has_equal_to<iterator_traits<Iter>::value_type>.

The alternative would be for option 1 to require instead:

template< typename Iter, typename Val >
    requires
        InputIterator<Iter>,
        EqualityComparable<iterator_traits<Iter>::value_type>,
        Convertible<Val, iterator_traits<Iter>::value_type>
bool none_of_equal ( Iter first, Iter last, Val const& val ) ; // option 1b

What do you think?

Thanks,
--Lorenzo
Reply | Threaded
Open this post in threaded view
|

Re: Boost.Algorithm design question

Marshall Clow-2
On Oct 6, 2011, at 11:05 AM, lcaminiti wrote:

>
> Dave Abrahams wrote:
>> somebody write the description of the algorithm, including the concept
>> requirements.  With option 2, we know that == has to be an equivalence
>> relation.  The semantics in option 1 are a lot fuzzier.
>>
>
> Good point! I'll try to start :)
>
> template< typename Iter, typename Val >
>    requires
>        InputIterator<Iter>,
>        EqualityComparable<iterator_traits<Iter>::value_type, Val>
> bool none_of_equal ( Iter first, Iter last, Val const& val ) ; // option 1a

That seems right to me.


> template< typename Iter >
>    requires
>        InputIterator<Iter>,
>        EqualityComparable<iterator_traits<Iter>::value_type>
> bool none_of_equal ( Iter first, Iter last,
> iterator_traits<Iter>::value_type const& value ) ; // option 2

I think you're missing the requirement is that the value passed in be convertible to "iterator_traits<Iter>::value_type", since that is what will happen at the call site.

Given:
        std::vector<int> v ;
        none_of_equal ( v.begin (), v.end (), 3.2 );

you need to capture the fact that the floating point number 3.2 has to be convertible to 'int'.
[ But maybe that's "external" to the function definition ]


> Essentially, I'd think option 1 requires operator== that satisfies
> has_equal_to<iterator_traits<Iter>::value_type, Val>

Yes.

> while option 2
> requires operator== that satisfies
> has_equal_to<iterator_traits<Iter>::value_type>.

Ok - but that's not sufficient.

-- Marshall

Marshall Clow     Idio Software   <mailto:[hidden email]>

A.D. 1517: Martin Luther nails his 95 Theses to the church door and is promptly moderated down to (-1, Flamebait).
        -- Yu Suzuki


_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: Boost.Algorithm design question

Christian Holmquist
In reply to this post by Phil Endecott-48
>
>
>>>
>> Even better, don't use vector<const char*>. What o you need that for?
>>
>
> Here's a real motivating example where I want to use vector<const char*>.
>  I have a large file containing null-terminated strings which I memory map.
>  After opening the file, I construct some sort of index of those strings in
> a sorted vector.  Then I search for things using std::equal_range,
> std::lower_bound, or similar.
>

All those algorithms comes in a version that accepts a predicate, so you
would still be able to do your optimization..
And it would be clear to anyone who's reading the code what your intention
is (to avoid copies), it's simply not a 'coincidence' that there happened to
be an overload that matched your types.
For clarity and maintenance, I prefer to speak out clearly in code the
intention. If it becomes to boilerplate-ly kind of thing, wrap it up in an
own algorithm.


>
> The important thing is that for efficiency the algorithms mustn't create
> temporary strings; they must compare the const char* with the string that
> I'm searching for directly.  std::string provides operator overloads to do
> this.  I want the algorithms to use those heterogeneous comparisons.
>

I understand the first part about not creating temporaries, but not why you
want to use those heterogeneous comparisons. If it's a performance concern,
I would prefer being in control of that explicitly. If you're unlucky while
working with the code at some point in the future and happen to introduce an
implicit copy, everything will compile and run but your speed will be not
what you expected. If you're lucky it will crash and can be debugged early
on



>
> Dave Abrahams has said a couple of times that he doesn't even know what it
> means to try to do this.  This baffles me, but I trust that there is some
> validity in his argument.  But it does lead me to the point that I made in
> my review, and my answer to Marshall's question at the start of this thread:
> I don't want Boost to waste time on either of those none_of_equal
> implementations.  We are all capable of writing our own 4-line none_of_equal
> implementations that do precisely what we want them to do.  Why argue about
> which is the "one true way" to do it, when we can all have whichever one we
> prefer?  Instead, let's spend time on things where the implementation is
> more than 4 lines long - like a variant of std::list with O(1) splice!
>

If we can't get the simple ones right, there's not a  healthy ground to base
more complicated algorithms on.

Also, I don't want to write those 4 lines of code which is potentially
dangerous, simply because I wasn't aware of the implications. If I can read
in some boost rationale like 'this is how it should be done and why' (many
Boost libraries takes this higher moral ground, which I appreciate), then
I've learned something.



>
>
> Regards,  Phil.
>
>
>
cheers,
Christian


>
>

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: Boost.Algorithm design question

Dave Abrahams
In reply to this post by lcaminiti

on Thu Oct 06 2011, lcaminiti <lorcaminiti-AT-gmail.com> wrote:

> Dave Abrahams wrote:
>>
>> I don't think you can evaluate these choices just by looking at the
>> implementations.  Before anyone else votes option 1, I'd like to see
>>
>
> Well, I'm not so sure I like option 1 anymore... if option 1 was an
> oversight in STL then there is no sense in perpetuating the specification
> error in Boost.Algorithm (perhaps Boost.Algorithm should then go with option
> 2 and explain the rationale of why it's different than what STL does for
> example with std::find).
>
>> somebody write the description of the algorithm, including the concept
>> requirements.  With option 2, we know that == has to be an equivalence
>> relation.  The semantics in option 1 are a lot fuzzier.
>>
>
> Good point! I'll try to start :)
>
> template< typename Iter, typename Val >
>     requires
>         InputIterator<Iter>,
>         EqualityComparable&lt;iterator_traits&lt;Iter&gt;::value_type, Val>
> bool none_of_equal ( Iter first, Iter last, Val const& val ) ; // option 1a

What you failed to do here was to describe the semantic requirements on
EqualityComparable<A,B>, where A != B.

--
Dave Abrahams
BoostPro Computing
http://www.boostpro.com


_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
12345