Proposal for a thread synchronisation primitive: future_queue


Proposal for a thread synchronisation primitive: future_queue

Boost - Dev mailing list
Dear boost developers,

I want to propose a thread synchronisation primitive that works like a mixture of (lock-free) queues and futures, hence the name "future_queue". It is basically a lock-free queue that can be waited on for new values (without busy waiting), but with some nice extra features such as when_any() and continuations.

The github project can be found here:
https://github.com/mhier/future_queue 

It already contains a README file with a short description, an example and results of a first performance measurement. Further documentation so far exists only inside the code:
https://github.com/mhier/future_queue/blob/master/include/future_queue.hpp 

My first question is whether this is really something new (I might have overlooked something in the many Boost libraries) and whether it makes sense to integrate it into Boost.

Another question would be whether this should really be its own library, since it is basically a single class (with a few helper classes) and a single non-member function. Maybe it would be better to make it part of an existing library?

I am happy to receive some first feedback!

Cheers,
Martin


--
Martin Hierholzer
DESY
Notkestrasse 85
22607 Hamburg
Office 55a / 123
Tel: +49 40 8998 1897
Mob: +49 176 992 46 476
Skype: mhier48

This e-mail may contain confidential and/or privileged information and is intended only for the use of the individual or entity named above. If you received it in error, please advise the sender immediately by reply e-mail and delete the original. Any further use - especially any form of publication - of this e-mail without explicit authorisation is strictly prohibited.


_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

Re: Proposal for a thread synchronisation primitive: future_queue

I like the idea. By the way, why isn't has_data() named empty(), to conform with the STL containers?
Also, it would be nice to have the member functions documented in the README as well. (I might be missing something, though.)

/Viktor


On Tue, Apr 24, 2018 at 9:13 AM Martin Hierholzer via Boost <
[hidden email]> wrote:

> Dear boost developers,
>
> I want to propose a thread synchronisation primitive, which works like a
> mixture of (lockfree) queues and futures. Therefore I gave it the name
> "future_queue". It is basically a lockfree queue which can be waited on for
> new values (without busy waiting), but it has some nice extra features like
> when_any() and continuations.
>
> The github project can be found here:
> https://github.com/mhier/future_queue
>
> It already contains a README file with a short description, an example and
> results of a first performance measurement. Further documentation so far
> exists only inside the code:
> https://github.com/mhier/future_queue/blob/master/include/future_queue.hpp
>
> My first question is, whether this is really something new (I might have
> overlooked something in the many boost libraries) and if it makes sense to
> integrate it into the boost libraries.
>
> Another question would be whether this should really be its own library,
> since it is basically a single class (with a few helper classes) and a
> single non-member function. Maybe it would be better to make it part of an
> existing library?
>
> I am happy to receive some first feedback!
>
> Cheers,
> Martin
>


Re: Proposal for a thread synchronisation primitive: future_queue

On 25/04/2018 04:01, Martin Hierholzer wrote:
> I want to propose a thread synchronisation primitive, which works
> like a mixture of (lockfree) queues and futures. Therefore I gave it
>  the name "future_queue". It is basically a lockfree queue which can
> be waited on for new values (without busy waiting), but it has some
> nice extra features like when_any() and continuations.
>
> The github project can be found here:
> https://github.com/mhier/future_queue

Looks like a nice concept at first glance (I haven't looked at the
implementation), although some of those known issues (particularly the
limited lifespan) would probably need to be solved before it could be
accepted.

> My first question is, whether this is really something new (I might
> have overlooked something in the many boost libraries) and if it makes
> sense to integrate it into the boost libraries.

I don't believe there is currently any equivalent, and it is a hole that
would be nice to fill.

It's not a completely new concept, of course; one of the original future
proposals included a future_stream that was essentially a recursive
future -- whenever it completed it produced both a value and another
future for the next value.  This didn't make it into Boost (or the
Standard) in the end, though I assume many people have reimplemented
something similar on their own.

Having said that, the usual thinking on implementing blockable lockfree
queues doesn't involve futures per se, but rather focuses on using an
"eventcount" to provide the blocking behaviour as a wrapper around the
core data structure itself, so it can be used with multiple different
kinds of queues -- lockfree is usually a balancing act between
tradeoffs, so different use cases benefit most from different structures.



Re: Proposal for a thread synchronisation primitive: future_queue

Dear Viktor,

thank you very much for your feedback!

> I like the idea. Btw, why isn't the has_data() named empty() to conform with stl
> containers?

Good point. I just changed this.

> Also it would be nice to have the member functions documented in the readme
> also. (I might be missing something though).

I have now added a second .md file containing the reference documentation of all member and non-member functions. It is linked in the README.md file.

Cheers,
Martin


Re: Proposal for a thread synchronisation primitive: future_queue

Dear Gavin,

thank you, too, for your nice feedback!

> Looks like a nice concept at first glance (I haven't looked at the
> implementation), although some of those known issues (particularly the
> limited lifespan) would probably need to be solved before it could be
> accepted.

It clearly is not yet ready, I know. I just want to get feedback as early as possible, to steer in the right direction from the beginning.


> It's not a completely new concept, of course; one of the original future
> proposals included a future_stream that was essentially a recursive
> future -- whenever it completed it produced both a value and another
> future for the next value.  This didn't make it into Boost (or the
> Standard) in the end, though I assume many people have reimplemented
> something similar on their own.

This sounds like a very similar concept indeed, but I cannot find anything about it on the web. Maybe the future_queue is a bit different in the sense that one can have multiple ready values in the queue, just like in any normal FIFO queue.


> Having said that, the usual thinking on implementing blockable lockfree
> queues doesn't involve futures per se, but rather focuses on using an
> "eventcount" to provide the blocking behaviour as a wrapper around the
> core data structure itself, so it can be used with multiple different
> kinds of queues -- lockfree is usually a balancing act between
> tradeoffs, so different use cases benefit most from different structures.

I know that, but I think one can do much more with the future_queue approach, and it is also simpler to use, since you don't have to bother with wrapping the queue and dealing with counters. I think the important features are continuations, when_any() and (added today) when_all(). With these one can build something like "processing pipelines" without even having to explicitly care about the creation of threads. I will prepare a more advanced example showing better what I mean, and will let you know when it is ready.

Cheers,
Martin


Re: Proposal for a thread synchronisation primitive: future_queue

Hi again,

I have updated the README with a second, bigger example which also shows the use of continuations, when_all() and when_any().


> Having said that, the usual thinking on implementing blockable lockfree
> queues doesn't involve futures per se, but rather focuses on using an
> "eventcount" to provide the blocking behaviour as a wrapper around the
> core data structure itself, so it can be used with multiple different
> kinds of queues -- lockfree is usually a balancing act between
> tradeoffs, so different use cases benefit most from different structures.

On second thought, maybe I could even change the implementation of the future_queue to internally use e.g. the well-known boost::lockfree queues together with a semaphore to implement the blocking behaviour. I am not 100% sure yet, but I think all current features could be implemented that way. Do you think this would improve the implementation? It might be a bit simpler and maybe even more efficient in some cases.

Cheers,
Martin


Re: Proposal for a thread synchronisation primitive: future_queue

On 27/04/2018 05:12, Martin Hierholzer wrote:
> This sounds like a very similar concept indeed, but I cannot find
> anything about this on the web. Maybe the future_queue is a bit
> different in the sense that one can have multiple ready values in
> the queue, just like in any normal fifo queue.

The recursive futures support that too, in the sense that when it
returns a "next" future it might already have a completed value which
can be extracted synchronously without blocking.

There's an example in this post
https://lists.boost.org/Archives/boost/2007/03/118742.php.

Of course, it relies on futures, shared_ptrs, and memory allocations,
which might be too heavyweight for some usages.  But it's also trivial
to implement around those existing types, which is probably why it
didn't end up making it into the standard as a separate type.

> On a second thought, maybe I could even change the implementation of
> the future_queue to internally use e.g. the well-known
> boost::lockfree queues together with some semaphore to implement the
> blocking behaviour. I am not 100% sure yet, but I think all current
> features could be implemented by that. Do you think this would
> improve the implementation? It might be a bit simpler and maybe even
> more efficient in some cases.

I would love to see a portable eventcount
(http://www.1024cores.net/home/lock-free-algorithms/eventcounts)
implementation in Boost.Lockfree.



Re: Proposal for a thread synchronisation primitive: future_queue

>> Having said that, the usual thinking on implementing blockable
>> lockfree queues doesn't involve futures per se, but rather focuses
>> on using an "eventcount" to provide the blocking behaviour as a
>> wrapper around the core data structure itself, so it can be used
>> with multiple different kinds of queues -- lockfree is usually a
>> balancing act between tradeoffs, so different use cases benefit
>> most from different structures.
>
> On a second thought, maybe I could even change the implementation of
> the future_queue to internally use e.g. the well-known
> boost::lockfree queues together with some semaphore to implement the
> blocking behaviour. I am not 100% sure yet, but I think all current
> features could be implemented by that. Do you think this would
> improve the implementation? It might be a bit simpler and maybe even
> more efficient in some cases.

one point to consider: even if the queue is lockfree, the semaphore
operations may internally acquire locks (maybe not in user-space, but in
the OS) ...

i've double checked the kernel-space implementations on linux and darwin
some time ago: posting/notifying a semaphore is not lockfree on these
platforms, as some code paths involve (spin)locks. so if lockfree
behavior is required to produce values from a real-time context, one may
have to poll a lockfree queue.

cheers,
tim




Re: Proposal for a thread synchronisation primitive: future_queue

>> On a second thought, maybe I could even change the implementation of
>> the future_queue to internally use e.g. the well-known
>> boost::lockfree queues together with some semaphore to implement the
>> blocking behaviour. I am not 100% sure yet, but I think all current
>> features could be implemented by that. Do you think this would
>> improve the implementation? It might be a bit simpler and maybe even
>> more efficient in some cases.
>
> I would love to see a portable eventcount
> (http://www.1024cores.net/home/lock-free-algorithms/eventcounts)
> implementation in Boost.Lockfree.

i'm not sure if boost.lockfree is a good place for it (is it even
possible to implement an eventcount without relying on system calls,
which may acquire locks in kernel-space?)



Re: Proposal for a thread synchronisation primitive: future_queue

On 9/05/2018 17:30, Tim Blechmann wrote:

>>> On a second thought, maybe I could even change the implementation of
>>> the future_queue to internally use e.g. the well-known
>>> boost::lockfree queues together with some semaphore to implement the
>>> blocking behaviour. I am not 100% sure yet, but I think all current
>>> features could be implemented by that. Do you think this would
>>> improve the implementation? It might be a bit simpler and maybe even
>>> more efficient in some cases.
>>
>> I would love to see a portable eventcount
>> (http://www.1024cores.net/home/lock-free-algorithms/eventcounts)
>> implementation in Boost.Lockfree.
>
> i'm not sure, if boost.lockfree is a good place for it (is it even
> possible to implement them without relying on system calls, which may
> acquire locks in kernel-space?)

Even the Boost.LockFree queue may not always be lock-free -- e.g. if you
push more nodes than the queue's capacity (and it's not fixed_sized) it
will allocate more, which will often take a Big System Lock (though
modern allocators try to use smaller, less-contended locks to reduce the
impact).

And these things are usually implemented with a fast-path that does not
perform any kernel operations, and a slow-path that does (along with the
hope that the slow-path is not triggered too often).

As soon as you start talking about wanting to block, you have to involve
the kernel (or a userspace scheduler).  This is a choice to gain some
convenience while sacrificing some "lock-free-ness", just like the
choice to use a non-fixed queue.

The basic "lock-free but blocking" expected behaviour is:

   1. If pushing but the queue is full, block until it isn't (blocking path).
   2. If a popper is waiting, push and then wake them up (slow path).
   3. Otherwise just push, without kernel operations (fast path).

And similarly on the consuming end for an empty queue.  (And a
non-fixed-size queue might omit step #1 entirely, though still end up
blocking a little if it hits an allocator.)

The main trick is in the wakeups (and reliably knowing whether someone
is waiting); you can never really avoid making kernel calls to wake up
another thread, but you want to do so only when actually needed (erring
on the side of doing an unnecessary wakeup rather than never doing a
wakeup), and you need to use constructs that (while perhaps not
completely lock-free) are at least wait-free.

Semaphores and Win32 events are usually good candidates for this,
because the scheduler generally treats them as atomic operations as far
as user code is concerned.  Traditional mutexes (and condition
variables, since they rely on those) are bad candidates because the
thread might be preempted while a lock is held, which prevents forward
progress of another thread.

But you also usually have to be careful to use the "true"
synchronisation primitives of the platform, not emulations defined in
libraries, because those might use locks or other constructs that are
problematic.  When you're implementing a platform-specific abstraction,
it's one of the rare cases that it's not a good idea to depend on other
abstractions.



Re: Proposal for a thread synchronisation primitive: future_queue

----- Original Message -----
> From: "boost" <[hidden email]>
> To: "boost" <[hidden email]>
> Cc: "Gavin Lambert" <[hidden email]>
> Sent: Wednesday, 9 May, 2018 08:43:34
> Subject: Re: [boost] Proposal for a thread synchronisation primitive: future_queue

> Even the Boost.LockFree queue may not always be lock-free

Even std::atomic is not guaranteed to be lock-free; see std::atomic::is_lock_free(). I think it is mainly important to specify precisely when to expect lock-free behaviour and when locks might occur internally.


> But it's also trivial
> to implement around those existing types, which is probably why it
> didn't end up making it into the standard as a separate type.

Sure, it is possible to combine certain primitives into more complex primitives, and users can always do this themselves. Still, my proposal goes beyond just a combination of an eventcount/semaphore and a lock-free queue. Just like futures, it knows continuations, when_all()/when_any() and similar things. If you say it is too trivial to put into a fundamental library, one could say the same about a future: a future is also just a combination of a value and a semaphore. Yet it is a widely used primitive everyone is happy to have.

My actual use case for the future_queue is a class of applications which can perhaps be described as a very complex, multi-threaded soft-PLC (programmable logic controller) application. For now we only have soft real-time requirements, which are met even by the current spsc_queue<future> construction. We have many threads running (around 100 or 200), basically one thread per task or action. This simplifies the program structure a lot, since we have independent modules, each with its own thread. This may sound crazy, but the performance is actually much better than I originally anticipated (sleeping threads don't cost performance; only context switches matter).

The future_queue would make it much easier to implement constructions like this. Tasks can be implemented as continuations. Since we have when_all() and when_any(), which can themselves be continued, one can even create tasks which use multiple future_queues as input.

Maybe this explains a bit better, what I am aiming at here?


> I would love to see a portable eventcount
> (http://www.1024cores.net/home/lock-free-algorithms/eventcounts)
> implementation in Boost.Lockfree.

I would offer to work on a portable semaphore anyway, but I am afraid I don't have much experience with platform-independent implementations. I will probably soon extend the simple semaphore class from the future_queue library to work with values other than 0 and 1 (since I need that for a better implementation of the future_queue), and I can also try to provide a good implementation for Mac OS X, if that helps. I no longer have access to development tools on Windows, and I have basically no experience on other platforms. Maybe someone else can add more implementations?

Cheers,
Martin


Re: Proposal for a thread synchronisation primitive: future_queue

>> I would love to see a portable eventcount
>> (http://www.1024cores.net/home/lock-free-algorithms/eventcounts)
>> implementation in Boost.Lockfree.
>
> I would anyway offer to work on a portable semaphore, but I am afraid
> I don't have much experience with platform independent
> implementations. I will probably soon extend my simple semaphore
> class from the future_queue library to work with other values than 0
> or 1 (since I need that for a better implementation of the
> future_queue) and I can also try to make a good implementation for
> Mac OS X, if this helps you. I don't really have access to
> development tools on Windows any more and basically no experience on
> other platforms. Maybe someone else can add more implementations?

https://github.com/boostorg/sync/blob/develop/include/boost/sync/semaphore.hpp

unfortunately we never got it finished/reviewed. iirc obtaining the
semaphore count doesn't really map to all OS APIs ...

as for macos: iirc the different semaphore APIs have quite different
performance characteristics (unnamed semaphores were not implemented,
while dispatch semaphores had the best throughput).

