Boost.Asio as "thread-pool": How to apply back-pressure?

Boost.Asio as "thread-pool": How to apply back-pressure?

Boost - Users mailing list
Hi,

I'm using Boost.Asio, not for networking, but simply to parallelize work items,
with a simple graph of "processing nodes" as detailed below. It's working fine,
but uses too much memory. I'd like insights on limiting memory use via "throttling",
or what I've also seen called "back-pressure".

At a high level, I process two files (A and B) composed of several "entries" each,
extracting a subset or a transformation of those entries, that I then write into an output file (C).
(those 3 files run into many GBs, hence the need for parallelism and for limiting memory use).

Extracting entries from A and from B are two independent operations, each implemented single-threaded,
producing independent work items (one to subset or transform each entry): A#1...A#n and B#1...B#m.
That's the "fan-out" part, with each work-item (task) scheduled on any thread of the ASIO pool, since independent.

Writing to C is also single-threaded, and needs to "fan-in" the work posted by the A#n and B#m functors,
so I serialize it by posting to a C-specific strand (it can still run on any thread; as long as the writes are serialized, which thread doesn't matter).
Let's call all those tasks writing to C the C#n+m tasks; they are posted to the strand by the A# and B# tasks.

My issue is that Boost.Asio seems to schedule an awful lot of A# and B# tasks before getting to the C# tasks,
which results in too many C# tasks accumulating in memory, and thus too much memory being used.

I don't see a way to force more "downstream" C tasks to be scheduled before so many A and B tasks are processed,
which accumulates pending C tasks in the work queue and, again, uses too much memory.

Could someone please recommend a way to have a more balanced flow of tasks in the graph?
Or alternative designs even, if what I do above is not ideal. Is Boost.Asio even suitable in this case?
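The fan-out/fan-in shape described above can be sketched with just the standard library. This is a hypothetical toy, not the actual program: the real A#/B# tasks go to a Boost.Asio thread pool and the C# writes to a strand, whereas here two producer threads stand in for the extraction loops and one writer thread plays the strand's role (all names are made up):

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Toy fan-out/fan-in: A#/B# "transform" tasks feed a queue of pending
// C# "write" tasks, which a single writer drains in serialized fashion.
struct fan_out_fan_in {
    std::mutex m;
    std::condition_variable cv;
    std::queue<int> c_queue;   // pending C# tasks (this is what grows unbounded)
    bool done = false;
    std::vector<int> output;   // stands in for file C

    void run(const std::vector<int>& a, const std::vector<int>& b) {
        // "strand": one thread serializes all writes to C
        std::thread writer([this] {
            std::unique_lock<std::mutex> lk(m);
            for (;;) {
                cv.wait(lk, [this] { return !c_queue.empty() || done; });
                while (!c_queue.empty()) {
                    output.push_back(c_queue.front());  // the C# "write"
                    c_queue.pop();
                }
                if (done) return;
            }
        });
        // fan-out: each entry becomes an independent transform task
        auto extract = [this](const std::vector<int>& src) {
            for (int entry : src) {
                int transformed = entry * 2;  // the "subset/transform" step
                {
                    std::lock_guard<std::mutex> lk(m);
                    c_queue.push(transformed);
                }
                cv.notify_one();
            }
        };
        std::thread ta(extract, std::cref(a));
        std::thread tb(extract, std::cref(b));
        ta.join();
        tb.join();
        {
            std::lock_guard<std::mutex> lk(m);
            done = true;
        }
        cv.notify_one();
        writer.join();
    }
};
```

The memory problem is visible in `c_queue`: nothing bounds how far the producers can run ahead of the single writer.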

Thanks, --DD


_______________________________________________
Boost-users mailing list
[hidden email]
https://lists.boost.org/mailman/listinfo.cgi/boost-users

Re: Boost.Asio as "thread-pool": How to apply back-pressure?

Boost - Users mailing list
On 09/02/2021 13:19, Dominique Devienne via Boost-users wrote:

> My issue is that Boost.Asio seems to schedule an awful lots of A# and B#
> tasks, before getting to the C# tasks,
> which results in accumulating in memory too many C# tasks, and thus
> using too much memory.
>
> I don't see a way to force more C "downstream" tasks to be scheduled,
> before processing so much A and B tasks,
> and accumulating pending C tasks in the work queue, thus using too much
> memory again.
>
> Could someone please recommend a way to have a more balanced flow of
> tasks in the graph?
> Or alternative designs even, if what I do above is not ideal. Is
> Boost.Asio even suitable in this case?

I think choosing a better framework for your use case would make your
life awfully easier. Grand Central Dispatch works very well on Mac OS
and FreeBSD, and is built into the system. The port of GCD (libdispatch)
to Linux is acceptable. On Windows, you want the Win32 thread pool,
totally different API, but does the same thing.

The difference with GCD-like thread pools is firstly that they are
global across the whole system, managed systemically across all
processes by your OS kernel. Secondly, you can assign priority per work
item submitted, so in your case you would assign the highest priority to
the end-most work items, thus ensuring they get selected preferentially
for execution and therefore don't build up in memory. Thirdly, they come
with i/o integration, so there is a highly efficient integration between
your OS i/o reactor and the global whole-system thread pool.

Had ASIO been constructed after these became common, it would
undoubtedly have been designed around them. As it stands, some of us on
WG21 are hoping to target a Networking v2 design which is based around
GCD-type designs for C++ 26 or 29.

You may find the unfinished low level prototype platform abstraction of
these facilities at https://github.com/ned14/llfio/pull/68 useful to
study when designing your own integration. It works well on GCD and
Win32 thread pools. Its native Linux backend is not usable yet. I hope
to send it WG21's way for study before April.

Niall

Re: Boost.Asio as "thread-pool": How to apply back-pressure?

Boost - Users mailing list
On Tue, Feb 9, 2021 at 5:20 AM Dominique Devienne via Boost-users
<[hidden email]> wrote:
> Could someone please recommend a way to have a more balanced flow of tasks in the graph?
> Or alternative designs even, if what I do above is not ideal. Is Boost.Asio even suitable in this case?

Yes. Only queue A and B tasks when there are N or fewer C tasks
scheduled. Alternatively, run A and B each in their own thread and
block when there are more than N C tasks scheduled.

Regards

Re: Boost.Asio as "thread-pool": How to apply back-pressure?

Boost - Users mailing list
On Tue, Feb 9, 2021 at 3:56 PM Niall Douglas via Boost-users <[hidden email]> wrote:
On 09/02/2021 13:19, Dominique Devienne via Boost-users wrote:

> My issue is that Boost.Asio seems to schedule an awful lots of A# and B#
> tasks, before getting to the C# tasks, [...], and thus using too much memory.
 
I think choosing a better framework for your use case would make your
life awfully easier. Grand Central Dispatch works very well on Mac OS
and FreeBSD, and is built into the system. The port of GCD (libdispatch)
to Linux is acceptable. On Windows, you want the Win32 thread pool,
totally different API, but does the same thing.

Thanks for your answer Niall. But it sounds to me that what you are proposing
is a large project in and of itself, versus tweaking an existing, portable
Boost.Asio-based program to use less memory.

I agree abusing Boost.Asio is not ideal here, although it works OK enough,
and I'm looking for more pragmatic and immediately applicable advice, if possible.

I've thought of a few approaches myself, but none seem very appealing, and
some would likely stall/block some tasks (via cond-vars for example),
starving ASIO of some of its threads, so less than ideal.

Surely I'm not the only one who ever tried something like this, no?

You may find the unfinished low level prototype platform abstraction of
these facilities at https://github.com/ned14/llfio/pull/68 useful to
study when designing your own integration. It works well on GCD and
Win32 thread pools. Its native Linux backend is not usable yet. I hope
to send it WG21's way for study before April.

I'll have a look, out of curiosity. But I need Windows/Linux portability,
and as outlined above something a bit more approachable for a mere mortal like me.

Thanks, --DD


Re: Boost.Asio as "thread-pool": How to apply back-pressure?

Boost - Users mailing list
On 09/02/2021 15:13, Dominique Devienne via Boost-users wrote:

>     I think choosing a better framework for your use case would make your
>     life awfully easier. Grand Central Dispatch works very well on Mac OS
>     and FreeBSD, and is built into the system. The port of GCD (libdispatch)
>     to Linux is acceptable. On Windows, you want the Win32 thread pool,
>     totally different API, but does the same thing.
>
>
> Thanks for your answer Niall. But sounds to me that what you are proposing
> is a large project in and of itself, versus tweaking a portable
> Boost.Asio-based
> existing program, to use less memory.

I agree that the learning curve for a new API will be quite steep
initially. But GCD's API is quite nicely designed, it's intuitive, and
it "just works".

I can't say that the Win32 thread API is as nicely designed. It *is*
very flexible and performant, but a lot of it is "non-obvious" relative
to GCD's API design.

On the other hand, if you write a GCD-based implementation, you'll be able
to #ifdef in a Win32 thread API implementation quite easily.

> I've thought myself of a few approaches, but none seem very appealing, and
> some likely would stall / block some task (via cond-vars for example),
> starving
> ASIO of some of its threads, so less than ideal.
>
> Surely I'm not the only one who ever tried something like this, no?

This is the beauty of GCD-like implementations. You don't need to think
about load or scheduling, except at the highest level. You just feed
work to GCD, tell it the priority for each work item, GCD figures out
how best to execute it given your system's *current* conditions.

In other words, if half your CPUs are currently at 100%, GCD _only_
occupies the other half of your CPUs. It automatically scales up, or
down, concurrency according to your workload e.g. if your work item
stalls in a sleep, or a syscall, GCD automatically notices and increases
concurrency. If there are too many work items currently running for
system resources, GCD automatically notices and decreases concurrency.

This is very nice, and getting ASIO to do the same is a lot of extra work.

>     You may find the unfinished low level prototype platform abstraction of
>
>     these facilities at https://github.com/ned14/llfio/pull/68
>     <https://github.com/ned14/llfio/pull/68> useful to
>     study when designing your own integration. It works well on GCD and
>     Win32 thread pools. Its native Linux backend is not usable yet. I hope
>     to send it WG21's way for study before April.
>
>
> I'll have a look, out of curiosity. But I need Windows/Linux portability,
> and as outlined above something a bit more approachable for a mere
> mortal like me.

If you ignore the Linux native backend, i.e. assume
LLFIO_DYNAMIC_THREAD_POOL_GROUP_USING_GCD=1 or _WIN32=1, you'll note
that the implementation reduces to not much code at all. There is
approximately a one-to-one relationship between GCD and Win32 thread API calls.

My suggestion is to concentrate on GCD, and to use my code in the PR to
"translate" the GCD APIs to their equivalents in the Win32 thread API.

You may want to try out a little toy prototype to get a feel for the GCD
API. I think you'll find you like it. apt install libdispatch-dev should
do it. Apple have good API documentation for it on their website.

Niall

Re: Boost.Asio as "thread-pool": How to apply back-pressure?

Boost - Users mailing list
On 2/9/2021 10:59 AM, Niall Douglas via Boost-users wrote:

> On 09/02/2021 15:13, Dominique Devienne via Boost-users wrote:
>
>>      I think choosing a better framework for your use case would make your
>>      life awfully easier. Grand Central Dispatch works very well on Mac OS
>>      and FreeBSD, and is built into the system. The port of GCD (libdispatch)
>>      to Linux is acceptable. On Windows, you want the Win32 thread pool,
>>      totally different API, but does the same thing.
>>
>>
>> Thanks for your answer Niall. But sounds to me that what you are proposing
>> is a large project in and of itself, versus tweaking a portable
>> Boost.Asio-based
>> existing program, to use less memory.
> I agree that the learning curve for a new API will be quite steep
> initially. But GCD's API is quite nicely designed, it's intuitive, and
> it "just works".
>
> I can't say that the Win32 thread API is as nicely designed. It *is*
> very flexible and performant, but a lot of it is "non-obvious" relative
> to GCD's API design.
>
> On the other hand, if you implement a GCD based implementation, you'll
> #ifdef in a Win32 thread API implementation quite easily.

Is this something that can be done with libunifex, which Eric Niebler and
colleagues are working on?

https://github.com/facebookexperimental/libunifex

Damien

Re: Boost.Asio as "thread-pool": How to apply back-pressure?

Boost - Users mailing list
On 09/02/2021 18:13, Damien via Boost-users wrote:

>> I agree that the learning curve for a new API will be quite steep
>> initially. But GCD's API is quite nicely designed, it's intuitive, and
>> it "just works".
>>
>> I can't say that the Win32 thread API is as nicely designed. It *is*
>> very flexible and performant, but a lot of it is "non-obvious" relative
>> to GCD's API design.
>>
>> On the other hand, if you implement a GCD based implementation, you'll
>> #ifdef in a Win32 thread API implementation quite easily.
>
> Is this something that can be done with libunifex, that Eric Niebler and
> colleagues are working on?
>
> https://github.com/facebookexperimental/libunifex

To my best knowledge, LLFIO's dynamic_thread_pool_group is the first
attempt to create a portable, standards-aspiring API abstraction wrapping
all the major proprietary dynamic thread pool implementations.

That said, libunifex watches LLFIO closely, indeed they borrowed quite
heavily from LLFIO's non-public Windows async i/o abstraction, so it
would not surprise me if there has been a recent addition to libunifex
in this area (I haven't been able to keep up with libunifex since covid
began, to be honest).

Equally, FB really only cares (in production deployment terms) about
libunifex on Linux; other platforms aren't deployed in production. And
as Linux lacks a kernel-supported GCD implementation, and what they
need is very high concurrency socket i/o on Linux, there isn't a
strong need for a GCD implementation.

I'll put this another way: you can make do without a GCD-like
implementation if you're socket i/o bound, whereas a GCD-like
implementation is ideal if you're compute bound. If you're file i/o
bound, traditionally one avoids a GCD-like implementation like the
plague because of i/o congestion blowout, but the proposed
llfio::dynamic_thread_pool_group is intended to prove to WG21 SG1 that
in fact GCD is great for file i/o, especially lots of memory mapped file
i/o. You just need an i/o load aware work item pacer.

(Future WG21 paper forthcoming will have lots of pretty graphs proving
this on all the major platforms)

Niall

Re: Boost.Asio as "thread-pool": How to apply back-pressure?

Boost - Users mailing list
On Tue, Feb 9, 2021, 10:20 AM Damien via Boost-users <[hidden email]> wrote:
On 2/9/2021 10:59 AM, Niall Douglas via Boost-users wrote:
> On 09/02/2021 15:13, Dominique Devienne via Boost-users wrote:
>
 
I find Niall's advice to be often wrong, unhelpful, and noisy. Just one data point for you to consider.

Regards


Re: Boost.Asio as "thread-pool": How to apply back-pressure?

Boost - Users mailing list
On Tue, Feb 9, 2021 at 7:20 AM Dominique Devienne <[hidden email]> wrote:
> ...

At the end of the day, back-pressure means that you limit the rate at
which you launch the upstream tasks (the A and B tasks). You can do
this in two ways. Either make the downstream task responsible for
launching more upstream work every time it completes and some condition
is met (for example, that there are fewer than N downstream tasks
active). Or have the upstream task block until some condition is met
(again, until there are fewer than N downstream tasks active).

There is no need to resort to exotic libraries or operating system
facilities, this is a straightforward problem which can be solved
entirely using the C++11 standard library.
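A sketch of the first option, reduced to a single thread so the control flow is visible (all names are hypothetical; in the real program "launching an extraction" would be a post to the Asio pool, and the completion step would run on the C strand):

```cpp
#include <cstddef>
#include <queue>
#include <vector>

// The downstream (C) side launches more upstream (A/B) work each time it
// completes an item, keeping at most `window` items in flight.
class windowed_pipeline {
public:
    windowed_pipeline(std::vector<int> entries, std::size_t window)
        : entries_(std::move(entries)), window_(window) {}

    std::vector<int> run() {
        // prime the window: launch at most `window` extractions up front
        while (pending_ < window_ && next_ < entries_.size()) launch_extract();
        while (!ready_.empty()) {
            int item = ready_.front();
            ready_.pop();
            output_.push_back(item);  // the C "write", serialized here
            --pending_;
            if (next_ < entries_.size()) launch_extract();  // C relaunches A/B
        }
        return output_;
    }

private:
    void launch_extract() {  // stands in for posting an A#/B# task
        ready_.push(entries_[next_++] * 2);  // "transform", queued for C
        ++pending_;
    }

    std::vector<int> entries_;
    std::vector<int> output_;
    std::queue<int> ready_;
    std::size_t window_;
    std::size_t next_ = 0;
    std::size_t pending_ = 0;
};
```

In a multithreaded version the pending counter and the relaunch decision would need to run under the strand (or a mutex), but the invariant is the same: never more than `window` C items outstanding.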

Thanks

Re: Boost.Asio as "thread-pool": How to apply back-pressure?

Boost - Users mailing list


On Tue, Feb 9, 2021 at 7:07 PM Vinnie Falco via Boost-users <[hidden email]> wrote:
On Tue, Feb 9, 2021, 10:20 AM Damien via Boost-users <[hidden email]> wrote:
On 2/9/2021 10:59 AM, Niall Douglas via Boost-users wrote:
> On 09/02/2021 15:13, Dominique Devienne via Boost-users wrote:
>
 
I find Niall's advice to be often wrong, unhelpful, and noisy. Just one data point for you to consider.

Honestly!


Re: Boost.Asio as "thread-pool": How to apply back-pressure?

Boost - Users mailing list
On Tue, 9 Feb 2021 at 14:56, Niall Douglas via Boost-users
<[hidden email]> wrote:
> Had ASIO been constructed after these became common, it would
> undoubtedly have been designed around them. As it stands, some of us on
> WG21 are hoping to target a Networking v2 design which is based around
> GCD-type designs for C++ 26 or 29.

You are killing me. We got Networking delayed from C++20 to C++23
(hopefully) just to get it deprecated in C++26?
What does Chris think about this? He has arguably been open to changes
to the Networking TS. Why don't you aim directly for such a "Networking
v2", maybe for C++26 (by now I don't mind waiting another 3 years),
and skip "Networking V1"?
I am not going to pretend to get the details, but I understand there
are a lot of more or less related moving parts (executors, io_uring,
GCD, continuations...). So if it gets delayed so be it, we already
have ASIO for the time being. But in 2021 work on trying to get
something in C++23 with the idea of deprecating it in C++26 sounds...
strange.

Re: Boost.Asio as "thread-pool": How to apply back-pressure?

Boost - Users mailing list


On Wed, 10 Feb 2021 at 04:27, Cristian Morales Vega via Boost-users <[hidden email]> wrote:

What does Chris think about this?

Chris has been using Asio in production code for something like 15 years. It processes billions of transactions per day in financial exchanges around the world.

It is demonstrably excellent at what it was designed to do, which is to process a large number of messages while fairly sharing communications resources across all connections.

From my point of view the meddling by WG21 is most unwelcome, as it serves to destabilise the interface of Asio, which is confusing for users of Asio and Beast, and also for implementers of dependent libraries - whom I regularly coach. The jumps from Boost 1.69 to 1.70 (executors TS) and from 1.73 to 1.74 (unified executors) have been particularly problematic.

As far as I am able to tell from attending some of the meetings, the motivation for changes amongst certain actors in WG21 seems to me to be driven by either malice or willful ignorance of the impact on the user community.

Of course, like many Boost libraries, it should already have been standardised without the butchering.

As things stand, it seems to me that the whole process of standardising networking is so infected with self-interest and externalisation of costs that the enterprise is probably doomed. No doubt this is intended by a small cadre of participants.


 
He is arguably been open to changes
to Networking TS. Why don't you aim directly for such a "Networking
v2", maybe for C++26 (by now I don't mind waiting another 3 years),
and skip "Networking V1"?
I am not going to pretend to get the details, but I understand there
are a lot of more or less related moving parts (executors, io_uring,
GCD, continuations...). So if it gets delayed so be it, we already
have ASIO for the time being. But in 2021 work on trying to get
something in C++23 with the idea of deprecating it on C++26 sounds...
strange.
--
Richard Hodges
office: +442032898513
home: +376841522
mobile: +376380212

Re: Boost.Asio as "thread-pool": How to apply back-pressure?

Boost - Users mailing list
On 10/02/2021 03:26, Cristian Morales Vega wrote:
> On Tue, 9 Feb 2021 at 14:56, Niall Douglas via Boost-users
> <[hidden email]> wrote:
>> Had ASIO been constructed after these became common, it would
>> undoubtedly have been designed around them. As it stands, some of us on
>> WG21 are hoping to target a Networking v2 design which is based around
>> GCD-type designs for C++ 26 or 29.
>
> You are killing me. We got Networking delayed from C++20 to C++23
> (hopefully) just to get it deprecated in C++26?

Nobody said anything about deprecation. "Orthogonal" would be a better
description, and you choose whichever Networking suits your problem
best. Same as iostreams being perfectly fine for 95% of file i/o, but
for some problems it's a really poorly fitting solution.

> What does Chris think about this? He is arguably been open to changes
> to Networking TS. Why don't you aim directly for such a "Networking
> v2", maybe for C++26 (by now I don't mind waiting another 3 years),
> and skip "Networking V1"?

Chris thinks that what is broadly envisaged for Networking v2 isn't
particularly useful for the average C++ developer. It would suit people
with specialised needs only, in his opinion.

I would only have a weak disagreement with that opinion; it does make
sense, from a certain perspective, and it's certainly no showstopper for
me personally. I have always been a strong supporter of only
standardising existing practice. My biggest objection to present
Networking v1 is how much has been changed through design by committee
from original ASIO. The present Networking v1 is very far away from
standard practice, and most of the changes have not been, in my opinion,
for the better. Traditional ASIO was much better.

Pre-covid we had expected that Networking would ship for 23. Many feel
that won't be possible now due to covid's impact on productivity, not
least because the Executors saga keeps on turning, and WG21 is about to
refactor it yet again, and that will have yet more knock-on effects on
Networking, which probably means it will slip to 26.

> I am not going to pretend to get the details, but I understand there
> are a lot of more or less related moving parts (executors, io_uring,
> GCD, continuations...). So if it gets delayed so be it, we already
> have ASIO for the time being. But in 2021 work on trying to get
> something in C++23 with the idea of deprecating it on C++26 sounds...
> strange.

ASIO is a throughput-maximising design built around latency-unpredictable
i/o. If you have that problem to solve (e.g. HTTP servers
facing public internet), it is probably the best *portably available*
design currently known.

If you have other needs, then the ASIO design becomes increasingly less
appropriate. Most of the concerted opposition to Networking v1 comes
from the tech multinationals on WG21 because they care most about
non-public intra datacenter networking mixed with parallelised compute
within strictly bounded response times, for which using ASIO is
substantially below what's achievable, hence Facebook funding libunifex,
or Apple funding GCD, or Google funding Executors, or Netflix funding
BSD pthread pools, and so on. A lot of the rage and anger from certain
members of the C++ community has been disproportionately directed at a
tiny subset of those opposing Networking v1 on WG21; what they have not
realised is how much broader-based, but also much weaker than is
supposed, that opposition actually is.

Everybody I know of is trying their best to support and aid Chris in
getting Networking v1 done, even if they personally would not be
strongly in favour of what it has mutated into since it started
standardisation. Most of that mutation has been imposed by committee,
very little of it with preexisting direct empirical experience, and it's
absolutely no fun to be on the receiving side of that. He's done a ton
of work to get it this far, and absolutely nobody wants Networking to
become the next Filesystem. I know I was certainly dismayed when it
began to look like the 23 ship date could slip to 26, and I can't think
of anyone else who wouldn't be dismayed as well.

Niall

Re: Boost.Asio as "thread-pool": How to apply back-pressure?

Boost - Users mailing list
On 10/02/2021 09:19, Richard Hodges via Boost-users wrote:

> As far as I am able to tell from attending some of the meetings, the
> motivation for changes amongst certain actors in WG21 seems to me to be
> driven by either malice or willful ignorance of the impact on the user
> community.

I think this is too strong. People are sent by their employers to represent
their employer's interests on WG21. I can't think of a major tech
multinational whose representatives have not voiced serious concerns
about how poorly Networking maps onto their platform's proprietary
networking technologies, which is true, but equally very few of them
have been willing to fund a reference implementation which does improve
that mapping AND is completely portable to all the other platforms. They
have accepted that critique of their critiques, and everybody has
(mostly) moved on.

Most of what is delaying Networking in the past year or so has not been
directly related to Networking (that ship has sailed, the committee has
conclusively voted that Networking = ASIO). The present problems are the
concurrency model, specifically what and how and why Executors
ought/could/should/might be. That's the exact same problem which has
bedevilled the Networking proposal since its very beginning, but I want
to be absolutely clear that it isn't just Networking being roadblocked
by this. Several major proposals are blocked by Executors.

I don't think either malice or wilful ignorance is behind the churn in
Executors. Rather, if WG21 gets it wrong, it's a huge footgun, so they
rightly won't progress it until it's done. Everything which depends on it
is therefore blocked.

I would also remind everybody that there was an option to progress the
blocking-only API of ASIO into the standard, which could have been
achieved quickly, but Chris elected not to do so at that time. I think
he was right in that choice, had the blocking API been standardised, it
would be unlikely the async API would have been revisited quickly.

Niall

Re: Boost.Asio as "thread-pool": How to apply back-pressure?

Boost - Users mailing list


On Wed, 10 Feb 2021 at 10:49, Niall Douglas via Boost-users <[hidden email]> wrote:
On 10/02/2021 09:19, Richard Hodges via Boost-users wrote:

> As far as I am able to tell from attending some of the meetings, the
> motivation for changes amongst certain actors in WG21 seems to me to be
> driven by either malice or willful ignorance of the impact on the user
> community.

I think this too strong. People are sent by their employers to represent
their employer's interests on WG21. I can't think of a major tech
multinational whose representatives have not voiced serious concerns
about how poorly Networking maps onto their platform's proprietary
networking technologies, which is true, but equally very few of them
have been willing to fund a reference implementation which does improve
that mapping AND is completely portable to all the other platforms. They
have accepted that critique of their critiques, and everybody has
(mostly) moved on.

Most of what is delaying Networking in the past year or so has not been
directly related to Networking (that ship has sailed, the committee has
conclusively voted that Networking = ASIO). The present problems are the
concurrency model, specifically what and how and why Executors
ought/could/should/might be. That's the exact same problem which has
bedevilled the Networking proposal since its very beginning, but I want
to be absolutely clear that it isn't just Networking being roadblocked
by this. Several major proposals are blocked by Executors.

It seems to me that what is holding up executors is the insistence on the sender/receiver nonsense, the use case for which seems to be (from looking at the proposals) unmaintainable and unintelligible write-once, multi-page compound statements describing what could easily be expressed in a simple coroutine.
 

I don't think malice nor wilful ignorance is behind the churn in
Executors. Rather, if WG21 gets it wrong, it's a huge footgun, so they
rightly won't progress it until its done. Everything which depends on it
is therefore blocked.

I would also remind everybody that there was an option to progress the
blocking only API of ASIO into the standard which could have been
achieved quickly, but Chris elected not to do so at that time. I think
he was right in that choice, had the blocking API been standardised, it
would be unlikely the async API would have been revisited quickly.

A blocking-only API after half a decade of pontificating would be a risible outcome, which would reflect even more poorly on the competence of an already distrusted standards committee. 

We already have a standardised blocking-only API in Berkeley Sockets. What on earth would be the point of choosing C++ as the language for your program only to have it spend all its time blocked on a socket, and dependent on IO-specific mechanisms for cancellation?

Such a thing does not belong in the standard at all.
 

Niall
--
Richard Hodges

Re: Boost.Asio as "thread-pool": How to apply back-pressure?

Boost - Users mailing list
On 10/02/2021 11:54, Richard Hodges via Boost-users wrote:

>     Most of what is delaying Networking in the past year or so has not been
>     directly related to Networking (that ship has sailed, the committee has
>     conclusively voted that Networking = ASIO). The present problems are the
>     concurrency model, specifically what and how and why Executors
>     ought/could/should/might be. That's the exact same problem which has
>     bedevilled the Networking proposal since its very beginning, but I want
>     to be absolutely clear that it isn't just Networking being roadblocked
>     by this. Several major proposals are blocked by Executors.
>
>
> It seems to me that what is holding up executors is the insistence on
> the sender/receiver nonsense.
>
> The use case for which seems to be (from looking at the proposals)
> unmaintainable and unintelligible write-once multi-page compound
> statements describing what could easily be expressed in a simple coroutine.

It's not Sender-Receiver. The committee voted on that two years ago and
that's a done deal. Chris has bought into it, ASIO already ships with
support for it, that's water under the bridge.

Now, to make a horrible oversimplification, the current problem is about
what a Scheduler ought to be, and what an Executor ought to be if a
Scheduler is split away from an Executor. They're currently rejigging
the execution policies around a Scheduler-Executor split, and that has
knock-on consequences for anything which touches execution policies.

I'm not the right person to say any more detail on this. My opinions on
Executors are widely known (they are not favourable), and I don't keep
up to date with the latest from SG1 in this topic, mainly because doing
so gets me rather angry. If anybody listened to me on this, I'd just
plaster "implementation defined" over most of the entire execution thing
for now, let people actually build working implementations, then look
again later at standardising what emerges from that implementation
practice. I have failed to persuade anybody of that course of action.

> A blocking-only API after half a decade of pontificating would be a
> risible outcome, which would reflect even more poorly on the competence
> of an already distrusted standards committee. 
>
> We already have a standardised blocking-only API in Berkeley Sockets.
> What on earth would be the point of choosing C++ as the language for
> your program only to have it spend all its time blocked on a socket, and
> dependent on IO-specific mechanisms for cancellation?
>
> Such a thing does not belong in the standard at all.

Personally speaking, I would have plenty of time for a blocking API
which acts over Coroutines. So, it's blocking from the perspective of
the Coroutine, but it's actually a suspend-resume point and all the
socket i/o across many Coroutines gets multiplexed.

That solution is no ASIO of course, but you could proceed with
standardising it without needing to refer to Executors in any way i.e.
timely progress would be possible.

Niall

Re: Boost.Asio as "thread-pool": How to apply back-pressure?

Boost - Users mailing list


On Wed, 10 Feb 2021 at 13:34, Niall Douglas via Boost-users <[hidden email]> wrote:
On 10/02/2021 11:54, Richard Hodges via Boost-users wrote:

>     Most of what is delaying Networking in the past year or so has not been
>     directly related to Networking (that ship has sailed, the committee has
>     conclusively voted that Networking = ASIO). The present problems are the
>     concurrency model, specifically what and how and why Executors
>     ought/could/should/might be. That's the exact same problem which has
>     bedevilled the Networking proposal since its very beginning, but I want
>     to be absolutely clear that it isn't just Networking being roadblocked
>     by this. Several major proposals are blocked by Executors.
>
>
> It seems to me that what is holding up executors is the insistence on
> the sender/receiver nonsense.
>
> The use case for which seems to be (from looking at the proposals)
> unmaintainable and unintelligible write-once multi-page compound
> statements describing what could easily be expressed in a simple coroutine.

It's not Sender-Receiver. The committee voted on that two years ago and
that's a done deal. Chris has bought into it, ASIO already ships with
support for it, that's water under the bridge.

Chris included sender-receiver in asio out of politeness, as far as I can tell.
I asked him about use cases on two occasions. One during an executors
round-table and another time privately. I won't put any words in 
his mouth, because there weren't any.

The Committee certainly has not "bought into it". Discussions continue, with
Ville (IIRC) recently (correctly) suggesting that it be hived off into its own paper.

What is actually happening is that a vociferous minority interest group is 
pushing its own idea of what is fanciful, without regard for the wider user 
base. 
 

Now, to make a horrible oversimplification, the current problem is about
what a Scheduler ought to be, and what an Executor ought to be if a
Scheduler is split away from an Executor. They're currently rejigging
the execution policies around a Scheduler-Executor split, and that has
knock-on consequences for anything which touches execution policies.

I'm not the right person to say any more detail on this. My opinions on
Executors are widely known (they are not favourable), and I don't keep
up to date with the latest from SG1 in this topic, mainly because doing
so gets me rather angry. If anybody listened to me on this, I'd just
plaster "implementation defined" over most of the entire execution thing
for now, let people actually build working implementations, then look
again later at standardising what emerges from that implementation
practice. I have failed to persuade anybody of that course of action.

> A blocking-only API after half a decade of pontificating would be a
> risible outcome, which would reflect even more poorly on the competence
> of an already distrusted standards committee. 
>
> We already have a standardised blocking-only API in Berkeley Sockets.
> What on earth would be the point of choosing C++ as the language for
> your program only to have it spend all its time blocked on a socket, and
> dependent on IO-specific mechanisms for cancellation?
>
> Such a thing does not belong in the standard at all.

Personally speaking, I would have plenty of time for a blocking API
which acts over Coroutines. So, it's blocking from the perspective of
the Coroutine, but it's actually a suspend-resume point and all the
socket i/o across many Coroutines gets multiplexed.


This is what Asio gives us today through a trivial replacement of 
a completion token, and it is manifestly _not_ a blocking API if 
it involves coroutines.


That solution is no ASIO of course, but you could proceed with
standardising it without needing to refer to Executors in any way i.e.
timely progress would be possible.

Asio with current asio executors works pretty well, is nicely flexible and 
works extremely well with coroutines calling asynchronously 
between disparate executor types. I have some demos of this in my 
C++ Alliance blog.

Fancy words mean nothing Niall, what is at stake here is C++ being useful 
for networking in server applications out of the box. A perfectly acceptable,
ubiquitous, production tested, solution was proposed by one brilliant 
concerned individual. Then a horde of untalented engineers mobbed
the agenda to push something no-one else wants.
 

Niall


--
Richard Hodges



Networking @ WG21 (was: Re: Boost.Asio as "thread-pool": How to apply back-pressure?)

Boost - Users mailing list
NOTE: This is becoming off topic for boost-users.

On 10/02/2021 13:10, Richard Hodges wrote:

> Chris included sender-receive into asio out of politeness as far as I
> can tell.

No, the committee took a vote and said they were to be supported. As a
proposal champion, you then have a choice: (a) abandon your proposal;
(b) write a paper proposing an alternative, least-worst formulation of
the committee's choice; or (c) adopt them as understood when the vote
was taken.

Chris chose (b) for Sender-Receiver, the committee accepted his
alternative formulation, we all moved on.

> The Committee certainly has not "bought into it". Discussions continue, with
> Ville (IIRC) recently (correctly) suggesting that it be hived off into
> its own paper.

Until the committee takes a vote changing direction, preceding votes are
all that matters. Suggestions by any individual, even one as esteemed as
Ville, are merely ideas until voted upon.

> What is actually happening is that a vociferous minority interest group is 
> pushing its own idea of what is fanciful, without regard for the wider user 
> base.

That's not how things work. There is, by definition, a small group of
domain experts in any particular topic on WG21. Of that group of domain
experts, I would say there is an overwhelming majority dissatisfied with
the current proposal. What the committee ends up adopting will always be
somewhere in the middle where a consensus could be reached, and
everybody fairly equally dislikes the proposal.

From the outside, it looks like only a few dozen people ever do
anything, but internally one tends to trust the opinions of the domain
experts and follow their leads on a particular topic. Me personally, I
have been consistent in wanting ASIO standardised as-was but for it to
not monopolise the whole of the Networking, Concurrency, Execution and
Parallelism space i.e. it should "plug into" everything else, but
otherwise be left alone. So looking at the current proposals, I've got
much of what I'd prefer, but not all of it. I am extremely sure Chris is
in a very similar position, as is everybody on WG21. Consensus building
means those sorts of outcomes.

The biggest problem with consensus building is it leads to overwhelming
complexity in order to please enough people. The whole Executors and
Networking space as currently proposed is hideously more complex than it
could be, and that will only get worse as WG21 grows in number.

> Fancy words mean nothing Niall, what is at stake here is C++ being useful 
> for networking in server applications out of the box. A perfectly
> acceptable,
> ubiquitous, production tested, solution was proposed by one brilliant 
> concerned individual.

You don't need to keep making things personal with me or others. I have
no axe to grind with you. I'm taking the time to inform you that what
you think is the situation may not be the situation. Of course it's all
opinions and interpretations, and if Ville or Chris were doing this
they'd have quite different opinions and interpretations to mine. But
I'd like to think that mine are not a wildly inaccurate interpretation.
Just different from those of some echo chambers that I suspect you spend
a lot of time in.

As I've often said, ASIO is great for what it was designed for. But it's
not great as you get away from that. At work we run our custom database
inside an isolated segment of AWS where S3 performance is highly
predictable and the network is entirely ours. We currently use ASIO
running over a thread pool. The AWS instance has a 25Gbps network
connection, and we currently transfer a few dozen Mb per S3 request, so
that's about 1000-5000 us of latency, which is easily lost in the ASIO
latency noise. However, in the near future we shall be making 100x more
requests at 100x smaller S3 request sizes. The i/o latency then becomes
as low as 10-50 us, and replacing ASIO with something with stronger
guarantees starts to look profitable.

I'm not saying that we shall actually replace ASIO, it's all wider cost
benefit in the end. I am saying that ASIO's design is not as ideal as
others for bounded latency i/o of lots of small packets where something
like libunifex has a much more appropriate design for that particular
use case (and indeed, this is exactly what FB intends libunifex for).

> Then a horde of untalented engineers mobbed
> the agenda to push something no-one else wants.

Original ASIO had a very incompatible conceptualisation of concurrency
and parallelism to where WG21, especially SG1, is currently at and
indeed where it intends to go. Reconciling the two conceptualisations
was never going to be easy, indeed a large part of original opposition
to ASIO becoming Networking was precisely ASIO's strong and fixed
opinions on how concurrency and parallelism ought to be framed.

The "horde of untalented engineers" has a wealth of domain experience in
a far wider range of computing systems, ones that ASIO's original
concurrency model would not work well with: e.g. ASIO's original model
would not work at all well on a GPU, on a thousand-CPU machine, on
non-cache-coherent systems, and so on. They are trying to generify that
model into something less fixed.

I may not agree with the outcomes, but I agree with the intent and
spirit and goodwill behind the effort. And I most definitely consider
the engineering talent involved to be world class. Some of the people
who have contributed to this know more about atomics than quite
literally anybody else on the planet, for example. And they're not the
exception in there, that's the level of talent common in WG21.

Just because they're not doing things the way you want, in the timeframe
you want, doesn't make them untalented or mean they're pushing things
no one else wants.
They're doing their jobs, representing their employer's interests, and
there is a convincing argument that those employer interests point
broadly in the direction travelled to date in this area by WG21.
Remember, after all, who pays for WG21 to operate; by definition it is
their interests that are served first, before all others.

Niall