Is someone interested in container CompressedVector, a vector-like storage for integer types

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|

Is someone interested in container CompressedVector, a vector-like storage for integer types

Alexander Bulovyatov
Hi!

Assume you want an interface as for vector<unsigned int>, but know that
elements have only N significant bits, say 17. In this case the
container will save 46% memory compared to vector<unsigned int>.

CompressedVector internally stores elements as a dynamic bitset
(vector<size_t>) and uses bitshift ops to put/get significant bits
into one or two size_t elements. It's similar to vector<bool> and
dynamic_bitset, and has the same drawback, it's not a proper STL
container.

Anyway, the container is simple, efficient and quite fast.
It will work best for very long arrays of indexes. Since its storage
is continuous, it can also be used to speed up IO (load/store to
disk, network transfers) acting as a very fast compression.

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: Is someone interested in container CompressedVector, a vector-like storage for integer types

Vicente Botet
Alexander Bulovyatov wrote
Hi!

Assume you want an interface as for vector<unsigned int>, but know that
elements have only N significant bits, say 17. In this case the
container will save 46% memory compared to vector<unsigned int>.

CompressedVector internally stores elements as a dynamic bitset
(vector<size_t>) and uses bitshift ops to put/get significant bits
into one or two size_t elements. It's similar to vector<bool> and
dynamic_bitset, and has the same drawback, it's not a proper STL
container.

Anyway, the container is simple, efficient and quite fast.
It will work best for very long arrays of indexes. Since its storage
is continuous, it can also be used to speed up IO (load/store to
disk, network transfers) acting as a very fast compression.
Hi,

Alejandro Cabrera needed something like that for counting bloom filters (GSOC project). He included a private implementation taking in account his needs.

Do you think that you could provide an implementation that is optimized for (2, 4 bits)?

Best,
Vicente

 
Reply | Threaded
Open this post in threaded view
|

Re: Is someone interested in container CompressedVector, a vector-like storage for integer types

Alexander Bulovyatov
Well, I don't see any special optimization for 2, 4 bits. They will work faster than, say 7, because a compiler replaces mul / div by power of 2 with << >> ops.
Reply | Threaded
Open this post in threaded view
|

Re: Is someone interested in container CompressedVector, a vector-like storage for integer types

Vicente Botet
Alexander Bulovyatov wrote
Well, I don't see any special optimization for 2, 4 bits. They will work faster than, say 7, because a compiler replaces mul / div by power of 2 with << >> ops.
Hin

In this cases you are sure the bits will be always on the same word, so  there is no need to take care of managing multiple words for theese cases.

Best,
Vicente
Reply | Threaded
Open this post in threaded view
|

Re: Is someone interested in container CompressedVector, a vector-like storage for integer types

Alexander Bulovyatov
You're right. Although, the second word isn't always needed, so we check a condition before considering it.
OK, an additional condition N != pow of 2 can be added. It's static, so it'll be optimized by a compiler, then we save one IF.
In my opinion, the most tricky part is to provide efficient iterators and packed ops as insert(interval), push_back(interval).