return type of operator[](size_type) for sparse vectors?

classic Classic list List threaded Threaded
9 messages Options
Reply | Threaded
Open this post in threaded view
|

return type of operator[](size_type) for sparse vectors?

Toon Knapen
Given a sparse vector v of size 10, what would v[0] return? Should it
return a reference to the element at position 0 in the sparse vector.
What if that element is a structural-zero? It clearly is not as
straightforward as with dense vectors.

The options AFAICT are:
1) provide no 'operator[](size_type)' to avoid confusion

2) return the value at the corresponding position or if the position
corresponds to a structural-zero return zero. Thus in that case
operator[] will not affect/change the structure and operator[] will not
allow to change the value at the corresponding position.

3) return a proxy. If the proxy is being assigned to, the structure will
be changed in case the position corresponded with a structural-zero.

What would you find most intuitive? (of course other methods allow to
query if a specific position corresponds to a non-zero or not so
operator[] is not the only way to access elements). Also your feedback
on you experience with other libraries can be usefull to make the right
decision here.

toon

_______________________________________________
glas mailing list
[hidden email]
http://lists.boost.org/mailman/listinfo.cgi/glas
Reply | Threaded
Open this post in threaded view
|

Re: A few more thoughts about return type of operator[](size_type) for sparse vectors

Toon Knapen
Christophe Sanz wrote:

> Conceptually, a vector is dense. Sparsity would just be a feature to
> exploit for implementation efficiency. Whether the vector appear as dense upon
> outputting all the elements does not imply that he is not stored as sparse.
> In the outputting case, the reference being const would be consistent. Now the
> question would be: when should a sparse vector change number nnz?


indeed if 'operator[](size_type)' would be a _const_ member function it
could return a const_reference (or a value_type) and thus using the
operator would not affect the sparsity. But is this acceptable for the
users? The whole discussion is similar to the operator[] of the
std::map. In the case of std::map, operator[] is  only accessible on
non-const maps and only consulting the data associated to a key using
operator[] adds the key to the map (if it was not there already).


>
> I think I could see that the MTL deserve no room for changing nnz,

[snip]

What do you mean exactly by this?


toon
_______________________________________________
glas mailing list
[hidden email]
http://lists.boost.org/mailman/listinfo.cgi/glas
Reply | Threaded
Open this post in threaded view
|

Re: return type of operator[](size_type) for sparse vectors?

Russell Smiley
In reply to this post by Toon Knapen
I think I prefer option 3 (proxy that modifies a structural zero if
necessary). Presumably there is some penalty for this more sophisticated
behaviour.

If "proxy_type operator[]" is not provided how would structural zeroes
normally be modified?

Russell.

> -----Original Message-----

> Subject: [glas] return type of operator[](size_type) for
> sparse vectors?
>
>
> Given a sparse vector v of size 10, what would v[0] return? Should it
> return a reference to the element at position 0 in the sparse vector.
> What if that element is a structural-zero? It clearly is not as
> straightforward as with dense vectors.
>
> The options AFAICT are:
> 1) provide no 'operator[](size_type)' to avoid confusion
>
> 2) return the value at the corresponding position or if the position
> corresponds to a structural-zero return zero. Thus in that case
> operator[] will not affect/change the structure and
> operator[] will not
> allow to change the value at the corresponding position.
>
> 3) return a proxy. If the proxy is being assigned to, the
> structure will
> be changed in case the position corresponded with a structural-zero.
>
_______________________________________________
glas mailing list
[hidden email]
http://lists.boost.org/mailman/listinfo.cgi/glas
Reply | Threaded
Open this post in threaded view
|

Re: return type of operator[](size_type) for sparse vectors?

Peter Gottschling
I also prefer option 3 and I don't believe that it would cause much
overhead.  Also, I think that vectors are expected to change often in
applications (much more than matrices) and need the ability to change
the sparsity pattern and insert non-zeros.

Another question is how to keep the sparsity pattern small.
- When v[i]= 0, should v[i] be removed from the structure?
- What if v[i]= f(x) with f(x)~0?  One could provide user-defined
predicates that decide when an element can be dropped. How much would
this complicate the implementation?
- Do have typical sparse vector applications regular assignments v= 0
so that an element-wise cancellation is not needed?
- Should v= 0 automatically empty the structure or do we prefer to have
something like v.reset() to do this?  For dense vectors v.reset() would
set all elements to 0 to keep an unified interface.

Having no operator[] access would be strange for a vector and makes
many implementations more difficult.  The behavior of operator[] should
be consistent with dense vectors.  For efficiency reasons, the
preferred access to vectors which are possibly sparse should be with
iterators (cursors) over non-zero elements (if the vector is the driven
data structure).

For instance, multiplying a sparse matrix with a vector, one usually
iterates over the non-zero elements of the matrix and expects the
vector elements to be random-accessible (matrix-driven).  For very
sparse vectors it could be more efficient to iterate over the non-zeros
of the vector and access the corresponding matrix elements needed for
multiplication.  This requires that the columns of the matrix are
accessible efficiently, which is given for a CCS matrix but not for a
CRS matrix.

Speaking of iterations over sparse data structures.  Karl, Toon and I
had the idea to provide different traversal tags for begin and end
functions.  This idea works fine to distinguish for instance between
row-wise, column-wise or element-wise matrix traversal (under the
condition that the matrix allows such traversal).  Recently, I intended
to implement dense traversal for sparse matrices, i.e. to go over all
elements including structural zeros.  The reason I hesitated was that
the cursors/iterators need conceptionally different information,
offsets within internal data structures versus matrix indices.  While
this is still implementable, kind of, it would introduce a lot of
overloading and complicating the implementation significantly.  As an
alternative, I find it clearer to have a dense_view of a sparse matrix
or vector and use the cursor/iterator of this view to traverse all
elements.  (Loops over all matrix or vector indices and using
operator[] or operator() would be equivalent but less generic.)  Better
we have dense_matrix_view and dense_vector_view which should be both
implementable generically (without specialization).

Peter

On 26.04.2006, at 15:11, Russell Smiley wrote:

> I think I prefer option 3 (proxy that modifies a structural zero if
> necessary). Presumably there is some penalty for this more
> sophisticated
> behaviour.
>
> If "proxy_type operator[]" is not provided how would structural zeroes
> normally be modified?
>
> Russell.
>
>> -----Original Message-----
>
>> Subject: [glas] return type of operator[](size_type) for
>> sparse vectors?
>>
>>
>> Given a sparse vector v of size 10, what would v[0] return? Should it
>> return a reference to the element at position 0 in the sparse vector.
>> What if that element is a structural-zero? It clearly is not as
>> straightforward as with dense vectors.
>>
>> The options AFAICT are:
>> 1) provide no 'operator[](size_type)' to avoid confusion
>>
>> 2) return the value at the corresponding position or if the position
>> corresponds to a structural-zero return zero. Thus in that case
>> operator[] will not affect/change the structure and
>> operator[] will not
>> allow to change the value at the corresponding position.
>>
>> 3) return a proxy. If the proxy is being assigned to, the
>> structure will
>> be changed in case the position corresponded with a structural-zero.
>>
> _______________________________________________
> glas mailing list
> [hidden email]
> http://lists.boost.org/mailman/listinfo.cgi/glas
>
------------
Peter Gottschling
Research Associate
Open Systems Laboratory
Indiana University
301i Lindley Hall
Bloomington, IN 47405
Tel.: +1 812 855-8898   Fax: +1 812 856 0853
http://www.osl.iu.edu/~pgottsch

_______________________________________________
glas mailing list
[hidden email]
http://lists.boost.org/mailman/listinfo.cgi/glas
Reply | Threaded
Open this post in threaded view
|

Re: A few more thoughts about return type of operator[](size_type) for sparse vectors

Peter Gottschling
In reply to this post by Toon Knapen

On 26.04.2006, at 15:03, Toon Knapen wrote:

> Christophe Sanz wrote:
>>
>> I think I could see that the MTL deserve no room for changing nnz,
> [snip]
>
No, it does change.  In MTL 2 (the last released version) the structure
is modified implicitly each time a zero matrix element is changed.  
This behavior is very handy but not very efficient.

In the new, not yet released version, matrices are per se constant.  To
change a matrix, an inserter object has to be defined which enables
efficient insertion.  Upon destruction of the inserter the matrix is
compressed.  Other libraries use this insertion-only, read-only
approach too.  The difference is that one can have more than one
insertion phase by defining another inserter later.  The question
whether we will allow reading (less efficient) during the insertion
phase is still under consideration.  However, I will write this down in
detail and provide some performance comparisons.

Peter

------------
Peter Gottschling
Research Associate
Open Systems Laboratory
Indiana University
301i Lindley Hall
Bloomington, IN 47405
Tel.: +1 812 855-8898   Fax: +1 812 856 0853
http://www.osl.iu.edu/~pgottsch

_______________________________________________
glas mailing list
[hidden email]
http://lists.boost.org/mailman/listinfo.cgi/glas
Reply | Threaded
Open this post in threaded view
|

Re: return type of operator[](size_type) for sparse vectors?

Toon Knapen
In reply to this post by Peter Gottschling
Peter Gottschling wrote:

> I also prefer option 3 and I don't believe that it would cause much
> overhead.  Also, I think that vectors are expected to change often in
> applications (much more than matrices) and need the ability to change
> the sparsity pattern and insert non-zeros.
>
> Another question is how to keep the sparsity pattern small.
> - When v[i]= 0, should v[i] be removed from the structure?
> - What if v[i]= f(x) with f(x)~0?  One could provide user-defined
> predicates that decide when an element can be dropped. How much would
> this complicate the implementation?
> - Do have typical sparse vector applications regular assignments v= 0
> so that an element-wise cancellation is not needed?
> - Should v= 0 automatically empty the structure or do we prefer to have
> something like v.reset() to do this?  For dense vectors v.reset() would
> set all elements to 0 to keep an unified interface.


if option 3 is taken and thus if operator[] can have an effect on the
structure, the questions above are difficult to answer and semantics
will become even more subtle (and thus unclear for the user).


>
> Having no operator[] access would be strange for a vector and makes
> many implementations more difficult.  The behavior of operator[] should
> be consistent with dense vectors.  


That is impossible. Even if operator[] can change the structure (as in
option 3), operator[] might change the structure and therefore will
invalidate the iterators. The iterators for a dense-vector are however
not invalidated when using operator[] so I think it will never be
possible to have the same behavior as dense.

Anyway we also foresee to have other members to insert non-zero's
explicitly so operator[] is not the only way. The discussion about
operator[] is purely about convenience, not about features.


> For efficiency reasons, the
> preferred access to vectors which are possibly sparse should be with
> iterators (cursors) over non-zero elements (if the vector is the driven
> data structure).


agreed



>
> Speaking of iterations over sparse data structures.  Karl, Toon and I
> had the idea to provide different traversal tags for begin and end
> functions.  This idea works fine to distinguish for instance between
> row-wise, column-wise or element-wise matrix traversal (under the
> condition that the matrix allows such traversal).  Recently, I intended
> to implement dense traversal for sparse matrices, i.e. to go over all
> elements including structural zeros.  The reason I hesitated was that
> the cursors/iterators need conceptionally different information,
> offsets within internal data structures versus matrix indices.  While
> this is still implementable, kind of, it would introduce a lot of
> overloading and complicating the implementation significantly.  As an
> alternative, I find it clearer to have a dense_view of a sparse matrix
> or vector and use the cursor/iterator of this view to traverse all
> elements.  (Loops over all matrix or vector indices and using
> operator[] or operator() would be equivalent but less generic.)  Better
> we have dense_matrix_view and dense_vector_view which should be both
> implementable generically (without specialization).

The one does not exclude the other. The dense-views seem indeed
interesting but we can play around with both (if we really want) and see
what is most convenient/performant.


_______________________________________________
glas mailing list
[hidden email]
http://lists.boost.org/mailman/listinfo.cgi/glas
Reply | Threaded
Open this post in threaded view
|

Re: return type of operator[](size_type) for sparse vectors?

Karl Meerbergen
I recall similar discussions earlier on and wonder whether we have to
provide an operator[] for non-const sparse vectors in the GLAS core. It
is very interpretation dependent and therefore it is probably better to
provide containers/adaptors that implement a specific interpretation of
operator[]. A few possibilities:

* v[i] never inserts (assumes the structure does not change)
* v[i] returns a proxy and allows insertion (the most general, but
potentially most inefficient case)
Then there are issues such as: do we always need a sorted sparse
structure? Does v[i]=0 create a structural zero?
* v[i] returns a proxy, but only allows push_back to the structure

This is all application dependent. In the applications I am implementing
I use all of these strategies in different situations, so I am unable to
say which interpretation of operator[] I need. I need them all!

In my opinion, this does not mean we have to implement all these
strategies, but allow the user to make a tuned adaptor or container that
does the right job for him. The GLAS core should provide the tools to
implement strategies efficiently.

I suggest to only provide the functionalities in the GLAS core that we
need to implement the algorithms that are suppported by GLAS. We can
talk later about functionalities that make GLAS more sexy or easier to use.

Karl



Toon Knapen wrote:

>Peter Gottschling wrote:
>  
>
>>I also prefer option 3 and I don't believe that it would cause much
>>overhead.  Also, I think that vectors are expected to change often in
>>applications (much more than matrices) and need the ability to change
>>the sparsity pattern and insert non-zeros.
>>
>>Another question is how to keep the sparsity pattern small.
>>- When v[i]= 0, should v[i] be removed from the structure?
>>- What if v[i]= f(x) with f(x)~0?  One could provide user-defined
>>predicates that decide when an element can be dropped. How much would
>>this complicate the implementation?
>>- Do have typical sparse vector applications regular assignments v= 0
>>so that an element-wise cancellation is not needed?
>>- Should v= 0 automatically empty the structure or do we prefer to have
>>something like v.reset() to do this?  For dense vectors v.reset() would
>>set all elements to 0 to keep an unified interface.
>>    
>>
>
>
>if option 3 is taken and thus if operator[] can have an effect on the
>structure, the questions above are difficult to answer and semantics
>will become even more subtle (and thus unclear for the user).
>
>
>  
>
>>Having no operator[] access would be strange for a vector and makes
>>many implementations more difficult.  The behavior of operator[] should
>>be consistent with dense vectors.  
>>    
>>
>
>
>That is impossible. Even if operator[] can change the structure (as in
>option 3), operator[] might change the structure and therefore will
>invalidate the iterators. The iterators for a dense-vector are however
>not invalidated when using operator[] so I think it will never be
>possible to have the same behavior as dense.
>
>Anyway we also foresee to have other members to insert non-zero's
>explicitly so operator[] is not the only way. The discussion about
>operator[] is purely about convenience, not about features.
>
>
>  
>
>>For efficiency reasons, the
>>preferred access to vectors which are possibly sparse should be with
>>iterators (cursors) over non-zero elements (if the vector is the driven
>>data structure).
>>    
>>
>
>
>agreed
>
>
>
>  
>
>>Speaking of iterations over sparse data structures.  Karl, Toon and I
>>had the idea to provide different traversal tags for begin and end
>>functions.  This idea works fine to distinguish for instance between
>>row-wise, column-wise or element-wise matrix traversal (under the
>>condition that the matrix allows such traversal).  Recently, I intended
>>to implement dense traversal for sparse matrices, i.e. to go over all
>>elements including structural zeros.  The reason I hesitated was that
>>the cursors/iterators need conceptionally different information,
>>offsets within internal data structures versus matrix indices.  While
>>this is still implementable, kind of, it would introduce a lot of
>>overloading and complicating the implementation significantly.  As an
>>alternative, I find it clearer to have a dense_view of a sparse matrix
>>or vector and use the cursor/iterator of this view to traverse all
>>elements.  (Loops over all matrix or vector indices and using
>>operator[] or operator() would be equivalent but less generic.)  Better
>>we have dense_matrix_view and dense_vector_view which should be both
>>implementable generically (without specialization).
>>    
>>
>
>The one does not exclude the other. The dense-views seem indeed
>interesting but we can play around with both (if we really want) and see
>what is most convenient/performant.
>
>
>_______________________________________________
>glas mailing list
>[hidden email]
>http://lists.boost.org/mailman/listinfo.cgi/glas
>
>  
>


--
==============================================
Karl Meerbergen

Free Field Technologies

NEW ADDRESS FROM FEBRUARY 1st ONWARD:

Axis Park Louvain-la-Neuve
rue Emile Francqui, 1
B-1435 Mont-Saint Guibert - BELGIUM

Company Phone:  +32 10 45 12 26
Company Fax:    +32 10 45 46 26
Mobile Phone:   +32 474 26 66 59
Home Phone:     +32 2 306 38 10
mailto:[hidden email]
http://www.fft.be
============================================



_______________________________________________
glas mailing list
[hidden email]
http://lists.boost.org/mailman/listinfo.cgi/glas
Reply | Threaded
Open this post in threaded view
|

Re: return type of operator[](size_type) for sparse vectors?

Guntram Berti
In reply to this post by Toon Knapen

On Tue, Apr 25, 2006 at 09:22:23PM +0200, Toon Knapen wrote:

> Given a sparse vector v of size 10, what would v[0] return? Should it
> return a reference to the element at position 0 in the sparse vector.
> What if that element is a structural-zero? It clearly is not as
> straightforward as with dense vectors.
>
> The options AFAICT are:
> 1) provide no 'operator[](size_type)' to avoid confusion
>
> 2) return the value at the corresponding position or if the position
> corresponds to a structural-zero return zero. Thus in that case
> operator[] will not affect/change the structure and operator[] will not
> allow to change the value at the corresponding position.
>
> 3) return a proxy. If the proxy is being assigned to, the structure will
> be changed in case the position corresponded with a structural-zero.
>
> What would you find most intuitive? (of course other methods allow to
> query if a specific position corresponds to a non-zero or not so
> operator[] is not the only way to access elements). Also your feedback
> on you experience with other libraries can be usefull to make the right
> decision here.


In the GrAL library, I use both operator[] (for read/write access)
and operator() (for read-only access). For the equivalent of sparse
arrays in GrAL, operator() returns a copy of the value or the default
value, resp.
The convention of using both ops [] and () permits to choose at the call site
whether write access is needed, and thus to choose the potentially more
efficient operator() if possible. (I think that's not possible by just
providing operator[] in const and non-const form, as the non-const variant
seems to be preferred by the compiler - any language expert's comment on this?)

Of course, to make use of this idea within glas practical, one would have to
support operator() for all vectors and probably also matrices.

--guntram


--
============================================================
Dr. Guntram Berti -- NEC C&C Research Laboratories   O__  ==
Rathausallee 10, D-53757 St. Augustin, Germany      c/ /'_ ==
++49 +2241 92 52  -32(voice) -99(fax)              (*) \(*) ==
                                                  ------------
http://www.ccrl-nece.de/~berti - [hidden email]
============================================================
_______________________________________________
glas mailing list
[hidden email]
http://lists.boost.org/mailman/listinfo.cgi/glas
Reply | Threaded
Open this post in threaded view
|

Re: return type of operator[](size_type) for sparse vectors?

Toon Knapen
In reply to this post by Russell Smiley
I have implemented a sparse-vector taking into account the remarks that
have been made previously. I think this is a good compromise between
performance and functionality. I uploaded the latest state of the
documentation and the current implementation of a sparse-vector, the
mapped_vector, can be found here
http://glas.sourceforge.net/doc/models/container/mapped_vector.html

There is also a small test that you can find
glas/test/sparse_vector/mapped_vector.cpp. (remember that we switched to
subversion if you want to check it out).

toon

Russell Smiley wrote:

> I think I prefer option 3 (proxy that modifies a structural zero if
> necessary). Presumably there is some penalty for this more sophisticated
> behaviour.
>
> If "proxy_type operator[]" is not provided how would structural zeroes
> normally be modified?
>
> Russell.
>
>> -----Original Message-----
>
>> Subject: [glas] return type of operator[](size_type) for
>> sparse vectors?
>>
>>
>> Given a sparse vector v of size 10, what would v[0] return? Should it
>> return a reference to the element at position 0 in the sparse vector.
>> What if that element is a structural-zero? It clearly is not as
>> straightforward as with dense vectors.
>>
>> The options AFAICT are:
>> 1) provide no 'operator[](size_type)' to avoid confusion
>>
>> 2) return the value at the corresponding position or if the position
>> corresponds to a structural-zero return zero. Thus in that case
>> operator[] will not affect/change the structure and
>> operator[] will not
>> allow to change the value at the corresponding position.
>>
>> 3) return a proxy. If the proxy is being assigned to, the
>> structure will
>> be changed in case the position corresponded with a structural-zero.
>>
> _______________________________________________
> glas mailing list
> [hidden email]
> http://lists.boost.org/mailman/listinfo.cgi/glas
>


--
Toon Knapen

------------------------------------------------
Check out our training program on acoustics
and register on-line at http://www.fft.be/?id=35
_______________________________________________
glas mailing list
[hidden email]
http://lists.boost.org/mailman/listinfo.cgi/glas