[poly_collection] User-defined polymorphism model

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

[poly_collection] User-defined polymorphism model

Boost - Dev mailing list
Hi Joaquín and all reviewers,

Before we reach the end of the review period, I wanted to comment an
(IMHO) interesting feature used in the internal design of
PolyCollection. The library offers three of polymorphism models:
base-derived, boost::any and signature-sharing callable types.

Under the hood, the library abstracts those polymorphic models into a
generic class in the detail namespace:

   template<typename Model,typename Allocator>
   class poly_collection;

and each concrete container is just implemented in terms of poly_collection:

   template<typename Base,typename Allocator>
   class base_collection
     : public detail::poly_collection<detail::base_model<Base>,Allocator>

The first template parameter (Model) is the abstraction ("concept") that
implements the non-generic part of each polymorphism model. A model
implements the following types and operations (comments on each
attribute are my interpretation, they can be completely wrong):

template<class ModelArg>
class Model
{
   //
   //Value-related attributes
   //

   //The type we want to expose as value_type of the container
   using value_type= ...;

   //Is T a subtype of the general type (acceptable as value)?
   template<class T> using is_subtype= ...

   //Is T the "most refined type"?
   template<class T> using is_terminal=...

   //Id of a value
   template<class T> static const std::type_info& subtypeid(const T& x);

   //Obtain a type-erased pointer to the value
   template<class T> static void* subaddress(T& x);
   template<typename T> static const void* subaddress(const T& x);

   //
   //Segment-related
   //

   //Segment iterators that point to type_erased value_types
   using base_iterator=...
   using const_base_iterator=...

   //const<->non-const conversions
   static base_iterator nonconst_iterator(const_base_iterator it);
   template<typename T>
   static iterator<T> nonconst_iterator(const_iterator<T> it);

   //Concrete iterators for Derived
   template<typename Derived> using iterator=...;
   template<typename Derived> using const_iterator=...

   //Generic type, comparable with base_iterator.
   //base_iterator == the last returned base_sentinel
   //means that base_iterator points one-past the last
   //element of a segment.
   using base_sentinel=...;
   using const_base_sentinel=...

   //Construct a segment for type T using the given allocator and
   //return it type-erased.
   template<typename T,typename Allocator>
   static detail::segment_backend<Model>* make(const Allocator& al);
};

I have several questions on this polymorphism model.

1) Wouldn't be a good idea to make this model-based polymorphism (or a
refined one) public? Maybe the current model needs more work to support
additional polymorphism models. Of course requirements for each
operation in a model should be detailed. A correct abstraction to define
user-provided polymorphism models is hard but this seems really a great
feature that makes this library more valuable for users.

2) If understand this model abstraction correctly, each Model chooses
the segment type to store values (it could be any sequence type, I
guess). The segment_backend abstraction does not only abstract
ConcreteTypes from value_types (using void pointers that will be casted
to real types in the implementation of the virtual function), but also
the segment type (the sequence holding the concrete values).

Current implementation uses two segment types: split_segment (stores
Concrete types and generic value_type's in two separate vectors) and
packed_segment (stores ConcreteTypes which already contain value_type's
as a subobject). function_collection and any_collection use the first
segment, base_collection uses the second.

Why are different segment types needed? Can't we store value_types and
ConcreteTypes in a struct and use only one type of segment? If this is
possible, I guess iteration performance would be improved for
function_collection and any_collection. Model definition could be
simplified.

On the other hand, if the sequence type can be left out of the model
definition, the underlying sequence type (change vector with deque,
small vector or any other) could be made configurable. Would this be
useful for users?

3) I understand that caching a base_sentinel in each segment is an
optimization to avoid a virtual function call when iterating with
base_iterators. Is this assumption correct? Why is a different
base_sentinel type needed to mark the end of each segment? Wouldn't a
"base_iterator" pointing to that on-past-the end position suffice? Is an
optimization (maybe base_sentinel is cheaper) or a must have?

Thanks,

Ion

PD: I have more design questions related to other topics but I will post
a different thread for each one.

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: [poly_collection] User-defined polymorphism model

Boost - Dev mailing list

> Em 14 may 2017, às 1:32, Ion Gaztañaga <[hidden email]> escreveu:
>
> Hi Joaquín and all reviewers,
>
> Before we reach the end of the review period, I wanted to comment an (IMHO) interesting feature used in the internal design of PolyCollection. The library offers three of polymorphism models: base-derived, boost::any and signature-sharing callable types.
>
> Under the hood, the library abstracts those polymorphic models into a generic class in the detail namespace:
>
> [...]

> I have several questions on this polymorphism model.
>
> 1) Wouldn't be a good idea to make this model-based polymorphism (or a refined one) public? Maybe the current model needs more work to support additional polymorphism models. Of course requirements for each operation in a model should be detailed. A correct abstraction to define user-provided polymorphism models is hard but this seems really a great feature that makes this library more valuable for users.

I'd be really conservative here and withhold such a big move until real use cases are found. The problem is once this is made public the design of the lib becomes effectively frozen.

>
> 2) If understand this model abstraction correctly, each Model chooses the segment type to store values (it could be any sequence type, I guess). The segment_backend abstraction does not only abstract ConcreteTypes from value_types (using void pointers that will be casted to real types in the implementation of the virtual function), but also the segment type (the sequence holding the concrete values).
>
> Current implementation uses two segment types: split_segment (stores Concrete types and generic value_type's in two separate vectors) and packed_segment (stores ConcreteTypes which already contain value_type's as a subobject). function_collection and any_collection use the first segment, base_collection uses the second.
>
> Why are different segment types needed? Can't we store value_types and ConcreteTypes in a struct and use only one type of segment? If this is possible, I guess iteration performance would be improved for function_collection and any_collection. Model definition could be simplified.
>

I tried using packed_segment with function_collection and any_collection, profiled and found split_segment is faster: basically, it gets you more values per cache line.


> On the other hand, if the sequence type can be left out of the model definition, the underlying sequence type (change vector with deque, small vector or any other) could be made configurable. Would this be useful for users?
>

This was also proposed by Thorsten, please see my answer there.


> 3) I understand that caching a base_sentinel in each segment is an optimization to avoid a virtual function call when iterating with base_iterators. Is this assumption correct?

Yes, this is it.

> Why is a different base_sentinel type needed to mark the end of each segment? Wouldn't a "base_iterator" pointing to that on-past-the end position suffice? Is an optimization (maybe base_sentinel is cheaper) or a must have?
>

An optimization. base_iterator would do.

Joaquín M López Muñoz

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: [poly_collection] User-defined polymorphism model

Boost - Dev mailing list
On 14/05/2017 9:51, Joaquín M López Muñoz via Boost wrote:

>> I have several questions on this polymorphism model.
>>
>> 1) Wouldn't be a good idea to make this model-based polymorphism
>> (or a refined one) public? Maybe the current model needs more work
>> to support additional polymorphism models. Of course requirements
>> for each operation in a model should be detailed. A correct
>> abstraction to define user-provided polymorphism models is hard but
>> this seems really a great feature that makes this library more
>> valuable for users.
>
> I'd be really conservative here and withhold such a big move until
> real use cases are found. The problem is once this is made public the
> design of the lib becomes effectively frozen.

Right. We'd need more to test more polymorphism models in order to
deduce a more generic model. Noted for a future improvement.

>> Current implementation uses two segment types: split_segment
>> (stores Concrete types and generic value_type's in two separate
>> vectors) and packed_segment (stores ConcreteTypes which already
>> contain value_type's as a subobject). function_collection and
>> any_collection use the first segment, base_collection uses the
>> second.
>>
>> Why are different segment types needed? Can't we store value_types
>> and ConcreteTypes in a struct and use only one type of segment? If
>> this is possible, I guess iteration performance would be improved
>> for function_collection and any_collection. Model definition could
>> be simplified.
>>
>
> I tried using packed_segment with function_collection and
> any_collection, profiled and found split_segment is faster:
> basically, it gets you more values per cache line.

Ok, now I see it. Since poly_collection optimizes processing and not
space or other operations, split_segment seems a good choice as objects
are tightly packed and all additional data is stored in another segment.
packed_segment is an optimization when value_type is already stored
inside the object (like a base class). Makes sense.

>> On the other hand, if the sequence type can be left out of the
>> model definition, the underlying sequence type (change vector with
>> deque, small vector or any other) could be made configurable. Would
>> this be useful for users?
>>
>
> This was also proposed by Thorsten, please see my answer there.

Right. Maybe the least intrusive approach is to require element
contiguity so something like small-vector could be used. But I agree
this is not an essential feature.

Best,

Ion

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost