lifetime of ranges vs. iterators

classic Classic list List threaded Threaded
120 messages Options
123456
Reply | Threaded
Open this post in threaded view
|

Re: lifetime of ranges vs. iterators

Steven Watanabe-4
AMDG

David Abrahams wrote:

> on Sat Aug 30 2008, "Giovanni Piero Deretta" <gpderetta-AT-gmail.com> wrote:
>
>  
>> Hum, this would means that every range needs to know how to construct
>> and deconstruct iterators of other ranges, at least if it wants to
>> guarantee that the minimum space possible is used.
>>    
>
> It would be done with associated types and functions of the iterators.
> Few iterators need to store that kind of redundant information, but when
> you build an iterator type that *does*, it would be deemed "polite" to
> specialize an appropriate trait and/or define the corresponding
> overloads to allow it to be decomposed/re-composed.
>
> I've been itching to rework Boost.Iterator recently, and this could make
> a very nice enhancement.
>  

How would this work:

concept IteratorWithEnd<typename Iterator> {
    Iterator get_end(Iterator);
    Iterator set_end(Iterator, Iterator);
}

Taking filter_iterator, for example, this would allow the space to be
minimized,
with nested filter_iterators. (untested)

template<IteratorWithEnd Iterator, class F>
class filter_iterator<Iterator, F> {
public:
    //...
    filter_iterator(Iterator begin, Iterator end, F f) :
iter(set_end(begin, end)), f(f) {}
    filter_iterator& operator++() {
        ++iter;
        while(iter != get_end(iter) && !f(*iter)) ++iter;
        return(*this);
    }
    friend Iterator get_end(const filter_iterator& self) {
        return(filter_iterator(get_end(self.iter), get_end(self.iter),
self.f))
    }
    friend Iterator set_end(const filter_iterator& self, const
filter_iterator& other) {
        return(filter_iterator(self.iter, get_end(other.iter), f));
    }
private:
    Iterator iter;
    F f;
};

In Christ,
Steven Watanabe

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: lifetime of ranges vs. iterators

Dave Abrahams

on Sat Aug 30 2008, Steven Watanabe <watanabesj-AT-gmail.com> wrote:

> How would this work:
>
>
> template<IteratorWithEnd Iterator, class F>
> class filter_iterator<Iterator, F>
> {
> public:
>    //...
>    filter_iterator(Iterator begin, Iterator end, F f)
>      : iter(set_end(begin,end)), f(f)
>    {}
>    
>    filter_iterator& operator++() {
>        ++iter;
>        while(iter != get_end(iter) && !f(*iter)) ++iter;
>        return *this;
>    }
>    
>    friend Iterator get_end(const filter_iterator& self) {
>        return
>            filter_iterator(
>                get_end(self.iter), get_end(self.iter), self.f);
>    }
>    
>    friend Iterator set_end(
>        const filter_iterator& self, const filter_iterator& other)
>    {
>        return filter_iterator(self.iter, get_end(other.iter), f);
>    }
>
> private:
>    Iterator iter;
>    F f;
> };

Hi Steven,

Say for example Iterator is strided_iterator<int>.  Then the return type
of set_end above is strided_iterator<int>, right?  So how will its body
typecheck?  Seems like this needs a little more thought.

PS: I suggest you use more linebreaks, at least in your postings; code
with linebreaks inserted by mailers is even harder to read than code
with too few linebreaks ;-)

--
Dave Abrahams
BoostPro Computing
http://www.boostpro.com
_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: lifetime of ranges vs. iterators

Steven Watanabe-4
AMDG

David Abrahams wrote:

>>    friend Iterator get_end(const filter_iterator& self) {
>>        return
>>            filter_iterator(
>>                get_end(self.iter), get_end(self.iter), self.f);
>>    }
>>    
>>    friend Iterator set_end(
>>        const filter_iterator& self, const filter_iterator& other);
>>    
>
> Hi Steven,
>
> Say for example Iterator is strided_iterator<int>.  Then the return type
> of set_end above is strided_iterator<int>, right?  So how will its body
> typecheck?  Seems like this needs a little more thought.
>  

Oops.  I intended the return type to be filter_iterator.

In Christ,
Steven Watanabe

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: lifetime of ranges vs. iterators

Dave Abrahams

on Sat Aug 30 2008, Steven Watanabe <watanabesj-AT-gmail.com> wrote:

> AMDG
>
> David Abrahams wrote:
>>>    friend Iterator get_end(const filter_iterator& self) {
>>>        return
>>>            filter_iterator(
>>>                get_end(self.iter), get_end(self.iter), self.f);
>>>    }
>>>        friend Iterator set_end(
>>>        const filter_iterator& self, const filter_iterator& other);
>>>    
>>
>> Hi Steven,
>>
>> Say for example Iterator is strided_iterator<int>.  Then the return type
>> of set_end above is strided_iterator<int>, right?  So how will its body
>> typecheck?  Seems like this needs a little more thought.
>>  
>
> Oops.  I intended the return type to be filter_iterator.

Okay then, I take it you meant something like:

    template<IteratorWithEnd Iterator, class F>
    class filter_iterator<Iterator, F>
    {
    public:
       //...
       filter_iterator(Iterator begin, Iterator end, F f)
         : iter(set_end(begin,end)), f(f)
       {}
       
       filter_iterator& operator++() {
           ++iter;
           while(iter != get_end(iter) && !f(*iter)) ++iter;
           return *this;
       }
       
       friend filter_iterator get_end(const filter_iterator& self) {
           return
               filter_iterator(
                   get_end(self.iter), get_end(self.iter), self.f);
       }
       
       friend filter_iterator set_end(
           const filter_iterator& self, const filter_iterator& other)
       {
           return filter_iterator(self.iter, get_end(other.iter), f);
       }
   
    private:
       Iterator iter;
       F f;
    };

??

I'm still not seeing it.  In set_end you have:

     return filter_iterator(self.iter, get_end(other.iter), f);

Isn't the result of get_end going to be a filter_iterator?
The 2nd argument to filter_iterator's ctor is Iterator, though.

Maybe if you wrote out what you were trying to say in english it would
be easier for me to understand your goal here.

One thing that seems missing from your design in general is some way of
getting a type that represents the common information that must be
carried by any two iterators over the same range.  If you don't provide
a way to factor that information out, I don't see a way to avoid
redundant storage.

I was thinking of something more like this:

  // Gimme a better name, please!  The abstraction is a thing that
  // is always used in pairs, where each instance contains some
  // redundant information in common.
  auto concept Factorable<typename T>
  {
       typename common_data = void;
       typename unique_data = T;

       common_data common(T) {}
       unique_data unique(T x) { return x; }
       T reconstruct(common_data, unique_data x) { return x; }
  };

  template <class I, class F>
  concept_map Factorable<filter_iterator<I,F> >
  {
       typedef compressed_pair<I,F> common_data
       typedef I unique_data;

       common_data common(filter_iterator<I,F> p)
       {
           return common_data(p.end(), p.func());
       }

       unique_data common(filter_iterator<I,F> p)
       {
           return p.base();
       }

       filter_iterator<I,F> reconstruct(common_data c, unique_data u)
       {
           return filter_iterator<I,F>( c.first(), c.second(), u);
       }
  };

--
Dave Abrahams
BoostPro Computing
http://www.boostpro.com
_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: lifetime of ranges vs. iterators

Arno Schödl
>  // Gimme a better name, please!  The abstraction is a thing that
>  // is always used in pairs, where each instance contains some
>  // redundant information in common.
>  auto concept Factorable<typename T>
>  {
>       typename common_data = void;
>       typename unique_data = T;
>
>       common_data common(T) {}
>       unique_data unique(T x) { return x; }
>       T reconstruct(common_data, unique_data x) { return x; }
>  };

How is the separation of common_data and unique_data different from a separation of ranges and iterators? If iterators of ranges can rely on their range to exist, this is where common data like functors, end iterators etc. can be stored.

Arno

--
Dr. Arno Schoedl · [hidden email]
Technical Director
 
think-cell Software GmbH · Invalidenstr. 34 · 10115 Berlin, Germany
http://www.think-cell.com · phone +49-30-666473-10 · toll-free (US) +1-800-891-8091
Directors: Dr. Markus Hannebauer, Dr. Arno Schoedl · Amtsgericht Charlottenburg, HRB 85229

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: lifetime of ranges vs. iterators

Dave Abrahams

on Sun Aug 31 2008, Arno Schödl <aschoedl-AT-think-cell.com> wrote:

>>  // Gimme a better name, please!  The abstraction is a thing that
>>  // is always used in pairs, where each instance contains some
>>  // redundant information in common.
>>  auto concept Factorable<typename T>
>>  {
>>       typename common_data = void;
>>       typename unique_data = T;
>>
>>       common_data common(T) {}
>>       unique_data unique(T x) { return x; }
>>       T reconstruct(common_data, unique_data x) { return x; }
>>  };
>
> How is the separation of common_data and unique_data different from a separation
> of ranges and iterators? If iterators of ranges can rely on their range to
> exist, this is where common data like functors, end iterators etc. can be
> stored.

Naturally the point is to store the common data once when you need to
store more than one iterator to the same sequence -- in a range, for
example.

If you're asking why I bother reconstituting the whole iterator out of
its parts, instead of simply referring to the common_data stored in the
range: an adapted iterator is a useful concept regardless of whether
anyone is using a range abstraction.

--
Dave Abrahams
BoostPro Computing
http://www.boostpro.com
_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: lifetime of ranges vs. iterators

Arno Schödl
> > How is the separation of common_data and unique_data different from a separation
> > of ranges and iterators? If iterators of ranges can rely on their range to
> > exist, this is where common data like functors, end iterators etc. can be
> > stored.

> Naturally the point is to store the common data once when you need to
> store more than one iterator to the same sequence -- in a range, for
> example.

> If you're asking why I bother reconstituting the whole iterator out of
> its parts, instead of simply referring to the common_data stored in the
> range: an adapted iterator is a useful concept regardless of whether
> anyone is using a range abstraction.

You could provide an adapted_iterator and also an adapted_range. If we subscribe to the rule that ranges must stay valid for their iterators to be valid, the adapted_range::iterator can use the common data stored in the range, while the adapted_iterator stores the common data itself. Both could even be derived from the same source code. Do you then still need a factored iterator?

Or do you want to avoid people having to use the range abstraction? If you pass iterator pairs to algorithms instead of ranges, at least this parameter passing would have to pass the common data redundantly, even if inside the algorithm, say when many iterators have to be stored, the common data is stripped and stored separately.

--
Dr. Arno Schoedl · [hidden email]
Technical Director
 
think-cell Software GmbH · Invalidenstr. 34 · 10115 Berlin, Germany
http://www.think-cell.com · phone +49-30-666473-10 · toll-free (US) +1-800-891-8091
Directors: Dr. Markus Hannebauer, Dr. Arno Schoedl · Amtsgericht Charlottenburg, HRB 85229

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: lifetime of ranges vs. iterators

Dave Abrahams

on Sun Aug 31 2008, Arno Schödl <aschoedl-AT-think-cell.com> wrote:

>> > How is the separation of common_data and unique_data different from a
> separation
>> > of ranges and iterators? If iterators of ranges can rely on their range to
>> > exist, this is where common data like functors, end iterators etc. can be
>> > stored.
>
>> Naturally the point is to store the common data once when you need to
>> store more than one iterator to the same sequence -- in a range, for
>> example.
>
>> If you're asking why I bother reconstituting the whole iterator out of
>> its parts, instead of simply referring to the common_data stored in the
>> range: an adapted iterator is a useful concept regardless of whether
>> anyone is using a range abstraction.
>
> You could provide an adapted_iterator and also an adapted_range.

My point is that the adapted_range would then have semantically
equivalent iterators with a different representation (and an extra
indirection), which seems like a waste.

> If we subscribe to the rule that ranges must stay valid for their
> iterators to be valid,

I don't.  I do subscribe to the rule that generic code cannot afford to
destroy an arbitrary range while its iterators are still in use.

> the adapted_range::iterator can use the common data stored in the
> range, while the adapted_iterator stores the common data itself. Both
> could even be derived from the same source code.

Yeah, that's still a lot of coding effort.

> Do you then still need a factored iterator?

You need to be able to take two adapted iterators and turn them into a
range.  Do you want that range to contain redundant data?  I don't.

> Or do you want to avoid people having to use the range abstraction?

I certainly don't want to force it on them.

> If you pass iterator pairs to algorithms instead of ranges, at least
> this parameter passing would have to pass the common data redundantly,
> even if inside the algorithm, say when many iterators have to be
> stored, the common data is stripped and stored separately.

Of course that's understood.

--
Dave Abrahams
BoostPro Computing
http://www.boostpro.com
_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: lifetime of ranges vs. iterators

Arno Schödl
> > If we subscribe to the rule that ranges must stay valid for their
> > iterators to be valid,

> I don't.  I do subscribe to the rule that generic code cannot afford to
> destroy an arbitrary range while its iterators are still in use.

Isn't that what I said? There may be iterators that work without their ranges, but in general they don't.

> > the adapted_range::iterator can use the common data stored in the
> > range, while the adapted_iterator stores the common data itself. Both
> > could even be derived from the same source code.

> Yeah, that's still a lot of coding effort.

I think you could write it generically, a la iterator_facade/adaptor, so it is a one-time fixed cost.

> > Do you then still need a factored iterator?

> You need to be able to take two adapted iterators and turn them into a
> range.  Do you want that range to contain redundant data?  I don't.

> > Or do you want to avoid people having to use the range abstraction?

> I certainly don't want to force it on them.

Ok, now I understand. The debate is about primacy of ranges or iterators. You propose that iterators stay the primary vehicle and to convert them to/from ranges by stripping the common information. But that would mean that there is no "lean" iterator, all iterators would contain the redundant information.

When stacking N difference_ranges, the size difference between "fat" and "lean" iterators is 2^N. Thus in fully generic code where you don't know anything about the stacking depth, even generating a single fat iterator carries a potentially exponential penalty.

This fact makes me think that range is not merely a fancy name for a pair of iterators but a concept in its own right between containers and iterators. Generic algorithms must be written on the basis of ranges rather than iterators or take a significant performance hit.

--
Dr. Arno Schoedl · [hidden email]
Technical Director
 
think-cell Software GmbH · Invalidenstr. 34 · 10115 Berlin, Germany
http://www.think-cell.com · phone +49-30-666473-10 · toll-free (US) +1-800-891-8091
Directors: Dr. Markus Hannebauer, Dr. Arno Schoedl · Amtsgericht Charlottenburg, HRB 85229

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

lifetime of ranges vs. iterators

Arno Schödl
In reply to this post by Arno Schödl
> When stacking N difference_ranges, the size difference between "fat" and "lean"
> iterators is 2^N. Thus in fully generic code where you don't know anything about
> the stacking depth, even generating a single fat iterator carries a potentially
> exponential penalty.

> This fact makes me think that range is not merely a fancy name for a pair of
> iterators but a concept in its own right between containers and iterators.
> Generic algorithms must be written on the basis of ranges rather than iterators
> or take a significant performance hit.

To be fair, this hit could be avoided by storing the sub-iterators of the difference_iterator in their factored form, e.g., m_ittrav/m_ittravEnd would be stored as m_ittrav/m_ittravEnd/m_travcommon. But this would put a big burden on the compiler to optimize the frequent compose/decompose operations. For example:

    *m_ittrav;

would become

    *compose( m_ittrav, m_travcommon ) // common data not used at all

and

    ++m_ittrav;

would become

    m_ittrav=decompose( ++compose( m_ittrav, m_travcommon ) ).first;

-Arno

--
Dr. Arno Schoedl · [hidden email]
Technical Director
 
think-cell Software GmbH · Invalidenstr. 34 · 10115 Berlin, Germany
http://www.think-cell.com · phone +49-30-666473-10 · toll-free (US) +1-800-891-8091
Directors: Dr. Markus Hannebauer, Dr. Arno Schoedl · Amtsgericht Charlottenburg, HRB 85229

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: lifetime of ranges vs. iterators

thorsten.ottosen
In reply to this post by Giovanni Piero Deretta
Giovanni Piero Deretta skrev:

>>> Stacking them naively would have 2^N storage
>>> growth, rather than 4^N, and we still don't need to worry about range
>>> lifetime.
>> I've been trying to say that generic algorithms *can't* worry about the
>> range lifetime; they have to assume that the iterators they're passed
>> will not be invalidated before they're destroyed.
>
> I think that the OP is talking about a different problem: let say you
> want to name the result of stacked applications of range adaptors:
>
> template<typename Range>
> void my_algo(Range& r) {
>   auto stacked_range = r | filtered(filter) | mapped(mapper);
>   // use stacked_range
> }
>
> This may lead iterator lifetime problems, not because r iterators may
> be invalid, but because the iterators of the range returned by
> filtered(...) may become invalid as that temporary has been
> destructed.

I don't see how this happens. The iterators are stored by value in each
new range that is generated.

-Thorsten
_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: lifetime of ranges vs. iterators

Arno Schödl
> This may lead iterator lifetime problems, not because r iterators may
> be invalid, but because the iterators of the range returned by
> filtered(...) may become invalid as that temporary has been
> destructed.

> I don't see how this happens. The iterators are stored by value in each
> new range that is generated.

Correct, but in the case of difference_range, storing iterators by value leads to 2^N storage bloat when stacking N difference_iterators. You can avoid this with factorable iterators (Dave) or by storing ranges by value and avoiding copying containers either by explicitly wrapping containers as "by ref" (Giovanni) or by range trait (me).

--
Dr. Arno Schoedl · [hidden email]
Technical Director
 
think-cell Software GmbH · Invalidenstr. 34 · 10115 Berlin, Germany
http://www.think-cell.com · phone +49-30-666473-10 · toll-free (US) +1-800-891-8091
Directors: Dr. Markus Hannebauer, Dr. Arno Schoedl · Amtsgericht Charlottenburg, HRB 85229

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: lifetime of ranges vs. iterators

Giovanni Piero Deretta
In reply to this post by thorsten.ottosen
On Mon, Sep 1, 2008 at 10:18 AM, Thorsten Ottosen
<[hidden email]> wrote:

> Giovanni Piero Deretta skrev:
>
>>>> Stacking them naively would have 2^N storage
>>>> growth, rather than 4^N, and we still don't need to worry about range
>>>> lifetime.
>>>
>>> I've been trying to say that generic algorithms *can't* worry about the
>>> range lifetime; they have to assume that the iterators they're passed
>>> will not be invalidated before they're destroyed.
>>
>> I think that the OP is talking about a different problem: let say you
>> want to name the result of stacked applications of range adaptors:
>>
>> template<typename Range>
>> void my_algo(Range& r) {
>>  auto stacked_range = r | filtered(filter) | mapped(mapper);
>>  // use stacked_range
>> }
>>
>> This may lead iterator lifetime problems, not because r iterators may
>> be invalid, but because the iterators of the range returned by
>> filtered(...) may become invalid as that temporary has been
>> destructed.
>
> I don't see how this happens. The iterators are stored by value in each new
> range that is generated.
>

But it doesn't need to be so.  And that's exactly what all the
discussion is about :)

--
gpd
_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: lifetime of ranges vs. iterators

Giovanni Piero Deretta
In reply to this post by Dave Abrahams
On Mon, Sep 1, 2008 at 5:24 AM, David Abrahams <[hidden email]> wrote:

>
> on Sun Aug 31 2008, Arno Schödl <aschoedl-AT-think-cell.com> wrote:
>
>>> > How is the separation of common_data and unique_data different from a
>> separation
>>> > of ranges and iterators? If iterators of ranges can rely on their range to
>>> > exist, this is where common data like functors, end iterators etc. can be
>>> > stored.
>>
>>> Naturally the point is to store the common data once when you need to
>>> store more than one iterator to the same sequence -- in a range, for
>>> example.
>>
>>> If you're asking why I bother reconstituting the whole iterator out of
>>> its parts, instead of simply referring to the common_data stored in the
>>> range: an adapted iterator is a useful concept regardless of whether
>>> anyone is using a range abstraction.
>>
>> You could provide an adapted_iterator and also an adapted_range.
>
> My point is that the adapted_range would then have semantically
> equivalent iterators with a different representation (and an extra
> indirection), which seems like a waste.

hum, which indirection?

<...>
>> Do you then still need a factored iterator?
>
> You need to be able to take two adapted iterators and turn them into a
> range.  Do you want that range to contain redundant data?  I don't.

Hei, maybe this is all that we need! Let's have a metafunction that
given an iterator type returns its preferred range type (the
associated range). The default might be boost.range or even std::pair.
 The developer of an iterator adaptor, can specialize the metafunction
so that the associate range is the optimum representation for an
iterator pair.

Do you think this would be enough? It seems too simple to me, so
probably I'm missing something...

--
gpd
_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: lifetime of ranges vs. iterators

Arno Schödl
> >
> > You need to be able to take two adapted iterators and turn them into a
> > range.  Do you want that range to contain redundant data?  I don't.

> Hei, maybe this is all that we need! Let's have a metafunction that
> given an iterator type returns its preferred range type (the
> associated range). The default might be boost.range or even std::pair.
>  The developer of an iterator adaptor, can specialize the metafunction
> so that the associate range is the optimum representation for an
> iterator pair.

> Do you think this would be enough? It seems too simple to me, so
> probably I'm missing something...

I wouldn't know what to do when my iterator_adaptor needs three iterators to the same range. I could store two efficiently as a range and retrieve them when needed by querying range::begin()/end(), but what do I do with the third? Dave's factored iterators go further in the same direction, basically dropping the requirement that the common data must be a range, it can be anything.

Both proposals have the disadvantage that I need to wrap/unwrap the [common data (Dave)/range (Giovanni)] whenever I change one of the wrapped iterators. I'd rather store the common data explicitly, and have indirections inside the iterators that point to it.

Correct me if I misunderstand one of the two proposals.

Arno

--
Dr. Arno Schoedl · [hidden email]
Technical Director
 
think-cell Software GmbH · Invalidenstr. 34 · 10115 Berlin, Germany
http://www.think-cell.com · phone +49-30-666473-10 · toll-free (US) +1-800-891-8091
Directors: Dr. Markus Hannebauer, Dr. Arno Schoedl · Amtsgericht Charlottenburg, HRB 85229

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: lifetime of ranges vs. iterators

Giovanni Piero Deretta
On Mon, Sep 1, 2008 at 1:56 PM, Arno Schödl <[hidden email]> wrote:

>> >
>> > You need to be able to take two adapted iterators and turn them into a
>> > range.  Do you want that range to contain redundant data?  I don't.
>
>> Hei, maybe this is all that we need! Let's have a metafunction that
>> given an iterator type returns its preferred range type (the
>> associated range). The default might be boost.range or even std::pair.
>>  The developer of an iterator adaptor, can specialize the metafunction
>> so that the associate range is the optimum representation for an
>> iterator pair.
>
>> Do you think this would be enough? It seems too simple to me, so
>> probably I'm missing something...
>
> I wouldn't know what to do when my iterator_adaptor needs three iterators to the >same range. I could store two efficiently as a range and retrieve them when needed >by querying range::begin()/end(), but what do I do with the third? Dave's factored
>terators go further in the same direction, basically dropping the requirement that the >common data must be a range, it can be anything.

You are right. It wouldn't work with three iterators.

But three iterators are a range of ranges. Maybe segmented iterators
(or, better, segmented ranges)  can come to the rescue?

Need to think about it. Could you give a practical example of an
iterator adapter needing three iterators?

>
> Both proposals have the disadvantage that I need to wrap/unwrap the [common data >(Dave)/range (Giovanni)] whenever I change one of the wrapped iterators. I'd rather >store the common data explicitly, and have indirections inside the iterators that point >to it.

You would still have to allow for storing the wrapped iterator in the
wrapper if the wrapped iterator does not support refactoring. I do not
think there is a way around it.

Also, I have this (completely unfounded) feeling that an iterator that
relies on accessing common data in the owning range might be harder to
optimize.

I think that ranges should be used for long term storage, so if the
iterator size itself is not optimal shouldn't be a big problem.

Storing all data in a range and using an indirection in the iterator
would be beneficial when that data would make the iterator expensive
to copy (for example the shared pointer in shared_container_iterator:
you could move the shared pointer in a shared_container_range and put
a plain pointer in the iterator itself).

--
gpd
_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: [SPAM (Bayesian)] - Re: lifetime of ranges vs. iterators - Bayesian Filter detected spam

Arno Schödl
> Need to think about it. Could you give a practical example of an
> iterator adapter needing three iterators?

I never needed them in practice. But I would forward this to Dave. He likes the complete generality of factored iterators, I think. I think it is a cleaner concept than saying the common data must be representable as a range, but I cannot really back this up with practical examples. The wrapping/unwrapping (see below) troubles me in either case.

> You would still have to allow for storing the wrapped iterator in the
> wrapper if the wrapped iterator does not support refactoring. I do not
> think there is a way around it.

But in my proposal (actually, yours), all iterators would be "lean", and each iterator would have access to its range, all the way up the iterator/range stack. So each iterator would itself only wrap other lean iterators. Storage bloat for stack size N would only be O(N) for the indirections, rather than O(2^N).

> Also, I have this (completely unfounded) feeling that an iterator that
> relies on accessing common data in the owning range might be harder to
> optimize.

> I think that ranges should be used for long term storage, so if the
> iterator size itself is not optimal shouldn't be a big problem.

Without any kind of storage optimization like your preferred-range templates or Dave's factored-iterators, the storage bloat would be _factor_ 2^N, so ever handling completely unwrapped iterators just does not scale.

So it comes down to:

a) with storage optimization, at every iterator access, unwrap/wrap the iterator out of their storage:

   [Dave]
   void increment() {
      m_itBase=decompose( ++compose( m_itBase, m_commondata ) ).first; // second is common_data again
   }

   or

   [Giovanni]
   void increment() {
      m_rng=TRange( ++m_rng.begin(), m_rng.end() );
   }

or

b) have the iterator carry an indirection to the outside data.

Given that the iterators coming out of a) are decomposed recursively by the next layer of iterators (that is, ItBase::operator++ inside the expression does the same thing again, all the way up), I would think that a) is actually harder to optimize than b), and the effect of no optimization would be much more dramatic.

--
Dr. Arno Schoedl · [hidden email]
Technical Director
 
think-cell Software GmbH · Invalidenstr. 34 · 10115 Berlin, Germany
http://www.think-cell.com · phone +49-30-666473-10 · toll-free (US) +1-800-891-8091
Directors: Dr. Markus Hannebauer, Dr. Arno Schoedl · Amtsgericht Charlottenburg, HRB 85229

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: [SPAM (Bayesian)] - Re: lifetime of ranges vs. iterators - Bayesian Filter detected spam

Giovanni Piero Deretta
On Mon, Sep 1, 2008 at 4:08 PM, Arno Schödl <[hidden email]> wrote:
>> Need to think about it. Could you give a practical example of an
>> iterator adapter needing three iterators?
>
> I never needed them in practice. But I would forward this to Dave. He likes the
> complete generality of factored iterators, I think. I think it is a cleaner concept than
> saying the common data must be representable as a range, but I cannot really back
> this up with practical examples. The wrapping/unwrapping (see below) troubles me in > either case.
>

Sure, full generality is always good, but if ranges cover 90% of the
cases, maybe it is good enough.

>> You would still have to allow for storing the wrapped iterator in the
>> wrapper if the wrapped iterator does not support refactoring. I do not
>> think there is a way around it.
>
> But in my proposal (actually, yours), all iterators would be "lean", and each iterator
> would have access to its range, all the way up the iterator/range stack. So each
> iterator would itself only wrap other lean iterators. Storage bloat for stack size N
> would only be O(N) for the indirections, rather than O(2^N).

You can't assume that all iterators are lean. In fact most won't.
Unless you drop compatibility with existing iterators, which is a
no-no.

My proposal was aimed at making ranges as lean as possible, not
necessarily iterators.
Anyways, as long as all parts of the stack are 'lean' ranges, you
should have only a 2x storage bloat when using iterators.

>
>> Also, I have this (completely unfounded) feeling that an iterator that
>> relies on accessing common data in the owning range might be harder to
>> optimize.
>
>> I think that ranges should be used for long term storage, so if the
>> iterator size itself is not optimal shouldn't be a big problem.
>
> Without any kind of storage optimization like your preferred-range templates or
> Dave's factored-iterators, the storage bloat would be _factor_ 2^N, so ever handling > completely unwrapped iterators just does not scale.
>
> So it comes down to:
>
> a) with storage optimization, at every iterator access, unwrap/wrap the iterator out of their storage:
>
>   [Dave]
>   void increment() {
>      m_itBase=decompose( ++compose( m_itBase, m_commondata ) ).first; // second is common_data again
>   }
>
>   or
>
>   [Giovanni]
>   void increment() {
>      m_rng=TRange( ++m_rng.begin(), m_rng.end() );
>   }
>

(Make that '++' 'boost::next()').

They are pretty equivalent. I think that my idea requires less
scafolding: you probably are going to make a specific wrapper range
anyway, so it is a mater of specializing associated_range. On the
other hand, once you have compose and decompose, you proably do not
need a specific wrapper. So, I do not know. I would have to try.

I do not think that the above looks that bad. In fact I think that
compilers should be able to optimize that wrapping and unwrapping
pretty well (even more with move semantics).

Here is a more or less complete (forward)  filter_iterator and
associated range (given appropriate definitions of front and rest)
[note: more or less pseudocode]

template<class Iter, class F>
struct filter_iterator : iterator_facade<blah,blah...> {
 filter_iterat(Iter b, iter e, F f ) :
   m_range(b,e), m_f(f) {}
private:
  void increment() {
      while(!m_f(front(m_range))
            m_range = rest(m_range);
  }

  reference dereference() const {
     // problem: front(x) is defined as *begin(x);
     // what if the reference returned by
     // begin(m_range) is invalidated when the
     // iterator goes out of scope? Yes, it
     // happens and I got bitten at least once.
     // We need a trait.
     return front(m_range);
  }

  bool equal(filter_iterator const&rhs) {
         return begin(rhs.m_range) == begin(m_range) &&
                    end(rhs.m_range) == end(m_range);
  }

   associated_range<Iter>::type m_range;
   F m_f; // use compressed pair here
};

template<class R, class F>
struct filter_range {
  typedef filter_iterator<range_iterator<R>::type, R> iterator;

  iterator begin() const{
       return iterator(begin(m_range), end(m_range), m_f);
  }

  iterator end() const{
       return iterator(end(m_range), end(m_range), m_f);
  }

   R m_range;
   F m_f;  // use compressed pair
};

// specialization
template<class Iter, class F>
class associated_range<filter_iterator<Iter, F> >  {
   typedef filter_range<associater_range<Iter>::type, F> type;
};

This should handle nested filter iterators (assuming no further
reductions) or other iterators with specialized nested_ranges quite
nicely.

> or
>
> b) have the iterator carry an indirection to the outside data.
>
> Given that the iterators coming out of a) are decomposed recursively by the next
> layer of iterators (that is, ItBase::operator++ inside the expression does the same
> thing again, all the way up), I would think that a) is actually harder to optimize than b),

compilers are quite good at eliminating all compile time recursive
functions, so I would say it doesn't matter.

> and the effect of no optimization would be much more dramatic.

this is always the case with code with high abstraction (10x is what I
usually see).

--
gpd
_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: lifetime of ranges vs. iterators

Dave Abrahams
In reply to this post by Arno Schödl

on Mon Sep 01 2008, Arno Schödl <aschoedl-AT-think-cell.com> wrote:

>> > If we subscribe to the rule that ranges must stay valid for their
>> > iterators to be valid,
>
>> I don't.  I do subscribe to the rule that generic code cannot afford to
>> destroy an arbitrary range while its iterators are still in use.
>
> Isn't that what I said?

No, but it might be what you meant.

> There may be iterators that work without their ranges,
> but in general they don't.
>
>> > the adapted_range::iterator can use the common data stored in the
>> > range, while the adapted_iterator stores the common data itself. Both
>> > could even be derived from the same source code.
>
>> Yeah, that's still a lot of coding effort.
>
> I think you could write it generically, a la iterator_facade/adaptor,
> so it is a one-time fixed cost.

I'm not sure if it's worth the effort.

>> > Do you then still need a factored iterator?
>
>> You need to be able to take two adapted iterators and turn them into a
>> range.  Do you want that range to contain redundant data?  I don't.
>
>> > Or do you want to avoid people having to use the range abstraction?
>
>> I certainly don't want to force it on them.
>
> Ok, now I understand. The debate is about primacy of ranges or
> iterators.

I'm not sure I agree.

> You propose that iterators stay the primary vehicle

Let's say, I propose that they are first-class citizens that work
without an underlying Range object.  

        char* buffer=malloc(lotsabytes);
       
How am I going to operate on that if there needs to be an underlying
range object?  There isn't one; the iterators stand alone.

(Ranges can be first-class citizens too if you like).

> and to convert
> them to/from ranges by stripping the common information.

As an optimization.

> But that would mean that there is no "lean" iterator, all iterators
> would contain the redundant information.

I think int* is plenty lean.

I don't know what you mean by "lean iterator," actually.  You can store
the common information in the iterator itself, or you'll have pointers
or references to the common information, but either way there will be
redundant bits in a pair of filter_iterators.

> When stacking N difference_ranges, the size difference between "fat"
> and "lean" iterators is 2^N. Thus in fully generic code where you
> don't know anything about the stacking depth, even generating a single
> fat iterator carries a potentially exponential penalty.

You've lost me.  Maybe you'd better spell out what you mean by "fat" and
"lean," and by "generating a fat iterator."

> This fact makes me think that range is not merely a fancy name for a
> pair of iterators but a concept in its own right between containers
> and iterators. Generic algorithms must be written on the basis of
> ranges rather than iterators or take a significant performance hit.

Have you actually measured anything yet?

I think you're being at least very hasty.  If you do something that
slows down operation on pairs of pointers, you won't be doing anyone
favors.

--
Dave Abrahams
BoostPro Computing
http://www.boostpro.com
_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: lifetime of ranges vs. iterators

Dave Abrahams
In reply to this post by Giovanni Piero Deretta

on Mon Sep 01 2008, "Giovanni Piero Deretta" <gpderetta-AT-gmail.com> wrote:

>>> You could provide an adapted_iterator and also an adapted_range.
>>
>> My point is that the adapted_range would then have semantically
>> equivalent iterators with a different representation (and an extra
>> indirection), which seems like a waste.
>
> hum, which indirection?

The iterators have to refer to the common data somehow.

> <...>
>>> Do you then still need a factored iterator?
>>
>> You need to be able to take two adapted iterators and turn them into a
>> range.  Do you want that range to contain redundant data?  I don't.
>
> Hei, maybe this is all that we need! Let's have a metafunction that
> given an iterator type returns its preferred range type (the
> associated range). The default might be boost.range or even std::pair.
>  The developer of an iterator adaptor, can specialize the metafunction
> so that the associate range is the optimum representation for an
> iterator pair.
>
> Do you think this would be enough? It seems too simple to me, so
> probably I'm missing something...

It is quite attractive, I must say.  I'll give it some thought.

--
Dave Abrahams
BoostPro Computing
http://www.boostpro.com
_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
123456