* On small data sets a CPU algorithm would run faster than a GPGPU one. How about implicitly dispatching the algorithm to the CPU in those cases?
- The call on whether to execute the algorithm on the CPU or the GPU should be left up to the user. While the library author agrees that this would be a useful feature, he just doesn't think Compute is the right place for that logic.
* How about providing a way to chain async operations?
- This is a big task that will be solved some day.
* How about providing Boost.ASIO-like error handling via throw and error_code?
- Implementing an approach like ASIO's wouldn't be that difficult.
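
For readers unfamiliar with the ASIO convention mentioned in the last item, here is a minimal sketch of the pattern in standard C++: each operation gets one overload that reports failure through a std::error_code out-parameter and one that throws std::system_error. The function copy_n_checked and its behaviour are hypothetical, chosen only for illustration; nothing here is part of Boost.Compute or a proposal for its actual interface.

```cpp
#include <cstddef>
#include <system_error>
#include <vector>

// Hypothetical operation (not a Boost.Compute function): copies the first
// `count` elements of `src` into `dst`.
// Non-throwing overload: failure is reported through `ec`.
void copy_n_checked(const std::vector<int>& src, std::size_t count,
                    std::vector<int>& dst, std::error_code& ec)
{
    if (count > src.size()) {
        // Asking for more elements than exist is reported, not thrown.
        ec = std::make_error_code(std::errc::invalid_argument);
        return;
    }
    ec.clear();
    dst.assign(src.begin(),
               src.begin() + static_cast<std::ptrdiff_t>(count));
}

// Throwing overload: forwards to the error_code version and converts a
// reported failure into a std::system_error exception.
void copy_n_checked(const std::vector<int>& src, std::size_t count,
                    std::vector<int>& dst)
{
    std::error_code ec;
    copy_n_checked(src, count, dst, ec);
    if (ec) {
        throw std::system_error(ec, "copy_n_checked");
    }
}
```

A caller that prefers exceptions uses the three-argument form, while a caller that wants to handle failures inline passes an error_code and checks it afterwards. Keeping the throwing overload as a thin wrapper means the actual logic lives in one place.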
Thanks to all the reviewers for spending their time and providing useful comments so far!