Richard's March Update
Mar 31, 2020
Coding in the time of a Pandemic
It has been an interesting month, there having been the minor distraction of a lockdown of our little country. The borders with Spain and France were closed about three weeks ago and all residents have been asked to stay at home other than to buy groceries or walk their dogs. Fortunately I have dogs so I at least have a legitimate reason to see the sun.
One of the advantages of living in a tiny country is that the government has been able to secure the supply of 150,000 COVID-19 testing kits, which represents two tests per resident. They are also working on supplying every resident with masks for use when shopping. I am hoping to report in my next blog that we are allowed outside subject to a negative test and the wearing of a mask and gloves.
Fortunately, until today, our internet has been uninterrupted. Communication with my friends and colleagues at the C++ Alliance and the wider developer community has continued.
Boost Release
The Boost 1.73 release is imminent. Thus much of my focus in the latter half of the month has been on addressing any remaining issues in Beast that represent an easy win in terms of demonstrating progress between releases.
This brings to a close my first quarter as a maintainer of the Beast library. I would have liked to have produced more in terms of feature development and architectural improvements, but a few interesting things came up which delayed this; some of which I will share with you here.
(Possibly) Interesting Asio Things
To say that Boost.Beast has a strong dependency on Boost.Asio would be an understatement. It should therefore come as no surprise that the Beast team spend a lot of time working with Asio and (certainly in my case) a lot of time working to understand the internals.
We had cause to reach out to Chris Kohlhoff, Asio's author, on two occasions in recent times. If you read my February blog you would have seen the issues we have faced with the `DynamicBuffer` concept. This month it was about the thread-safety of composed operations and IO objects.
But first, the result of a question I asked myself:
Is it possible to write an asynchronous composed operation entirely as a lambda?
In short, if you're using C++14 or better, the answer is happily yes!
Here is the smallest program I could think of that is:

- Implemented asynchronously
- Targeting a POSIX system (just because I happen to know more about POSIX than Windows)

This program simply copies the contents of `stdin` to `stdout`:
```cpp
#include <boost/asio.hpp>
#include <cstdlib>
#include <iostream>
#include <unistd.h>

namespace asio = boost::asio;

int
main()
{
    asio::io_context ioc;
    auto exec = ioc.get_executor();

    // Duplicate the process's stdin/stdout file descriptors and wrap them
    // in Asio POSIX stream descriptors.
    auto in  = asio::posix::stream_descriptor(exec, ::dup(STDIN_FILENO));
    auto out = asio::posix::stream_descriptor(exec, ::dup(STDOUT_FILENO));

    async_copy_all(in, out, [](auto&& ec, auto total) {
        std::cout << "\ntransferred " << total << " bytes\n";
        if (ec.failed())
        {
            std::cerr << "transfer failure: " << ec.message() << std::endl;
            std::exit(ec.value());
        }
    });

    ioc.run();
    return 0;
}
```
People who are unused to writing composed operations (asynchronous operations that fit into the Asio ecosystem), or people who last wrote one more than a year ago, might at this stage feel their hearts sinking in anticipation of the complex horror show awaiting them when writing the function `async_copy_all`.
Fortunately, Asio's new(ish) `async_compose` template function makes this reasonably painless:
```cpp
// Assumes the includes above, plus <memory> for std::make_unique.
namespace system = boost::system;   // for system::error_code below

template<class InStream, class OutStream, class CompletionToken>
auto
async_copy_all(
    InStream& fd_in,
    OutStream& fd_out,
    CompletionToken&& completion)
{
    return asio::async_compose<
        CompletionToken,
        void(system::error_code const&, std::size_t)>(
        [&fd_in, &fd_out,
         coro  = asio::coroutine(),
         total = std::size_t(0),
         store = std::make_unique<char[]>(4096)]
        (auto& self,
         system::error_code ec = {},
         std::size_t bytes_transferred = 0) mutable
        {
            BOOST_ASIO_CORO_REENTER(coro)
            for (;;)
            {
                // Read a chunk from the input stream.
                BOOST_ASIO_CORO_YIELD
                {
                    auto buf = asio::buffer(store.get(), 4096);
                    fd_in.async_read_some(buf, std::move(self));
                }
                if (ec.failed() || bytes_transferred == 0)
                {
                    // eof is informational, not a failure.
                    if (ec == asio::error::eof)
                        ec.clear();
                    return self.complete(ec, total);
                }

                // Write the chunk to the output stream.
                BOOST_ASIO_CORO_YIELD
                {
                    auto buf = asio::buffer(store.get(), bytes_transferred);
                    fd_out.async_write_some(buf, std::move(self));
                }
                total += bytes_transferred;
                if (ec.failed())
                    return self.complete(ec, total);
            }
        },
        completion, fd_in, fd_out);
}
```
There are a few things to note in the implementation.
- The first is that the entire asynchronous operation's implementation state is captured in the capture block of the lambda (this is why we need C++14 or higher).
- Secondly, the lambda is mutable. This is so we can update the state and then `move` it into the completion handler of each internal asynchronous operation.
- The second and third arguments of the lambda's function signature are defaulted. This is because `async_compose` will cause the implementation (in this case, our lambda) to be called once with no arguments (other than `self`) during initiation.
- There is an explicit check for `eof` after the yielding call to `fd_in.async_read_some`. In Asio, `eof` is one of a few error codes that represent an informational condition rather than an actual error. Another is `connection_aborted`, which can occur during an `accept` operation on a TCP socket. Failing to check for this error-that-is-not-an-error can result in Asio-based servers suddenly going quiet for 'no apparent reason'.
- Notice that the unnamed object created by `async_compose` intercepts every invocation on it and transfers control to our lambda by prepending a reference to itself to the argument list. The type of `self` is actually a specialisation of `asio::detail::composed_op<...>` (as at Boost 1.72). However, since this class is in the detail namespace, it should never be relied upon in any program or library.
- Note that I create the buffer object `buf` in a separate statement from the initiation of each async operation on the streams. This is because the `unique_ptr` called `store` is going to be `move`d during the initiating function call. Remember that arguments to function calls are evaluated in unspecified order in C++, so accessing `store` in the same statement in which the entire completion handler is `move`d would result in UB.
- Finally, `async_compose` is passed both the input and output streams (in addition to their references being captured in the lambda) so that both streams' associated executors can be informed that there is outstanding work. It may be surprising to some that the input and output streams may legally be associated with different executors.
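Because the operation is expressed in terms of `async_compose`, it accepts any completion token, not just a callback. Here is a quick sketch of my own (assuming the `async_copy_all` definition and includes above) driving the same copy with `asio::use_future`:

```cpp
// Sketch (mine): the same operation driven by a different completion token.
// Assumes the async_copy_all definition above.
#include <boost/asio.hpp>
#include <future>
#include <iostream>
#include <unistd.h>

namespace asio = boost::asio;

int main()
{
    asio::io_context ioc;
    auto in  = asio::posix::stream_descriptor(ioc.get_executor(), ::dup(STDIN_FILENO));
    auto out = asio::posix::stream_descriptor(ioc.get_executor(), ::dup(STDOUT_FILENO));

    // use_future turns the (error_code, size_t) completion into a
    // std::future<std::size_t>; a failure surfaces as a thrown system_error.
    std::future<std::size_t> fut = async_copy_all(in, out, asio::use_future);

    ioc.run();
    std::cout << "transferred " << fut.get() << " bytes\n";
}
```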
Actually, now that I write this, it occurs to me that it is unclear what the 'associated executor' of the composed operation we have just created actually is. Asio's documentation is silent on the subject.
Inspecting the code while single-stepping through a debug build revealed that the executor is taken from the first of the `io_objects_or_executors&&...` arguments to `async_compose` which itself has an associated executor. If none of them do, then the `system_executor` is chosen as the default executor (more on why this may cause surprises and headaches later). Note that, as always, wrapping the lambda in a call to `bind_executor` will force the composed operation's intermediate invocations to happen on the bound executor.
In our case, it is `fd_in` which will be providing the executor, and as a result every invocation of our lambda (except the first) is guaranteed to happen as if invoked by `post(fd_in.get_executor(), <lambda>(...))`.
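To make the `bind_executor` point concrete, here is a sketch of my own (assuming the `async_copy_all` definition above) that forces all intermediate and final invocations onto a strand by binding the completion handler to it:

```cpp
// Sketch (mine): binding the completion handler to a strand so that the
// composed operation's intermediate invocations run through that strand.
// Assumes the async_copy_all definition above.
#include <boost/asio.hpp>
#include <cstdio>

namespace asio = boost::asio;

void copy_on_strand(
    asio::posix::stream_descriptor& in,
    asio::posix::stream_descriptor& out,
    asio::strand<asio::io_context::executor_type> strand)
{
    async_copy_all(in, out,
        asio::bind_executor(strand,
            [](boost::system::error_code ec, std::size_t total)
            {
                // This handler - and every intermediate invocation of the
                // composed operation's lambda - now runs through the strand,
                // because the strand is the handler's associated executor.
                std::printf("copied %zu bytes (%s)\n",
                            total, ec.message().c_str());
            }));
}
```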
`system_executor` and "What Could Possibly Go Wrong?"
Once upon a time, when I first started using Asio, there were no `executor`s at all. In fact, there were no `io_context`s either. There was an `io_service` object. At some point (I don't remember the exact version of Asio, but it was at least five years ago) the `io_service` was replaced with `io_context`, an object which did basically the same job.
More recently, the `io_context` represents the shared state of a model of the Executor Named Type Requirement (aka Concept). The state of the art is moving towards passing copies of `Executor`s rather than references to `io_context`s.
Asio now contains a concrete type, the `executor`, which is a type-erased wrapper that may be assigned any class which models an `Executor`.
As you might expect, we are heading into a world where there might be more than one model of `Executor`. In anticipation of this, by default, all Asio IO objects are now associated with the polymorphic wrapper type `executor` rather than an `io_context::executor_type`.
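As an illustration, here is a small sketch of my own, reflecting the Boost 1.72-era interface described above, showing how an IO object's executor type can be pinned to a concrete executor rather than the polymorphic wrapper:

```cpp
// Sketch (mine), reflecting the Boost 1.72-era interface described above.
#include <boost/asio.hpp>
#include <type_traits>

namespace asio = boost::asio;

int main()
{
    asio::io_context ioc;

    // By default the socket's executor_type is the polymorphic wrapper
    // (asio::executor in Boost 1.72; later releases use any_io_executor).
    asio::ip::tcp::socket poly_sock(ioc);

    // Supplying the Executor template argument pins the executor_type to the
    // concrete io_context::executor_type instead, avoiding the type erasure.
    asio::basic_stream_socket<asio::ip::tcp, asio::io_context::executor_type>
        concrete_sock(ioc);

    static_assert(
        std::is_same<decltype(concrete_sock)::executor_type,
                     asio::io_context::executor_type>::value,
        "concrete executor type");
}
```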
One such model of `Executor` supplied by Asio is the `system_executor`, which is actually chosen as the default associated executor of any completion handler. That is, if you initiate an asynchronous operation in Asio today against a hypothetical IO object that does not have an associated executor, and you do not bind your handler to an executor of your own, then your handler will be invoked as if by `post(asio::system_executor(), <handler>)` - that is, it will be called on some implementation-defined thread.
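A quick way to observe this default association (a sketch of my own, using `asio::get_associated_executor`):

```cpp
// Sketch (mine) showing the default handler association described above.
#include <boost/asio.hpp>
#include <type_traits>

namespace asio = boost::asio;

int main()
{
    // A plain handler with no bound executor...
    auto handler = [](boost::system::error_code, std::size_t) {};

    // ...is associated with the system_executor by default.
    auto ex = asio::get_associated_executor(handler);
    static_assert(
        std::is_same<decltype(ex), asio::system_executor>::value,
        "default association is system_executor");

    // Binding the handler to an io_context's executor changes the association.
    asio::io_context ioc;
    auto bound = asio::bind_executor(ioc.get_executor(), handler);
    auto ex2 = asio::get_associated_executor(bound);
    static_assert(
        std::is_same<decltype(ex2), asio::io_context::executor_type>::value,
        "bound executor wins");
}
```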
Now that the basics are covered, back to what could possibly go wrong?
Well imagine a hypothetical home-grown IO Object or AsyncStream. Older versions of the Asio documentation used to include an example user IO Object, the logging socket.
The basic premise of our logging socket is that it will do everything a socket will do, plus log the sending and receiving of data, along with the error codes associated with each read or write operation.
Clearly the implementation of this object will contain an asio socket object and some kind of logger. The internal state must be touched on every asynchronous operation initiation (to actually initiate the underlying operation and record the event) and during every completion handler invocation, in order to update the logger with the results of the asynchronous operation.
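To make the shape of the problem concrete, here is a rough, hypothetical skeleton of my own (the class name, members and logger are illustrative and not taken from the old Asio example), using the same `async_compose` technique as before:

```cpp
// Hypothetical sketch (mine): a logging_socket whose shared implementation
// state is touched both at initiation and at completion.
#include <boost/asio.hpp>
#include <iostream>
#include <string>
#include <utility>

namespace asio = boost::asio;

class logging_socket
{
    struct impl
    {
        asio::ip::tcp::socket socket;
        std::ostream&         log;   // stands in for "some kind of logger"

        void update_logger(std::string const& event) { log << event << '\n'; }
    };

    impl impl_;

public:
    logging_socket(asio::io_context& ioc, std::ostream& log)
        : impl_{asio::ip::tcp::socket(ioc), log}
    {
    }

    template<class ConstBufferSequence, class CompletionToken>
    auto async_write_some(ConstBufferSequence const& buffers, CompletionToken&& token)
    {
        return asio::async_compose<CompletionToken,
            void(boost::system::error_code, std::size_t)>(
            [this, buffers, coro = asio::coroutine()]
            (auto& self, boost::system::error_code ec = {}, std::size_t n = 0) mutable
            {
                BOOST_ASIO_CORO_REENTER(coro)
                {
                    // Touches impl_ on the initiating thread...
                    impl_.update_logger("write initiated");
                    BOOST_ASIO_CORO_YIELD
                    {
                        auto b = buffers;   // copy before self is moved
                        impl_.socket.async_write_some(b, std::move(self));
                    }
                    // ...and touches impl_ again on whichever executor runs
                    // the completion handler.
                    impl_.update_logger("write completed: " + ec.message());
                    self.complete(ec, n);
                }
            },
            token, impl_.socket);
    }

    // async_read_some would mirror the above, logging reads instead of writes.
};
```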
As we know, invocations of intermediate completion handlers happen on the executor associated with the final completion handler provided by the user, so in our case, the actions will be something like this:
```
on the initiating thread:
  logging_socket::async_write_some
    logging_socket::async_write_some_op::operator()()
      logging_socket::impl::update_logger(...)
      socket::async_write_some(...)

... time passes ...

on a thread associated with the associated executor:
  logging_socket::async_write_some_op::operator()(ec, bytes_transferred)
    logging_socket::impl::update_logger()
    user_completion_handler(ec, bytes_transferred)
```
The situation will be similar for a read operation.
Now consider the following code (`ls` is an object of our hypothetical type `logging_socket`):
```cpp
ls.async_write_some(
    get_tx_buffer(),
    net::bind_executor(
        net::system_executor(),
        [](auto ec, auto size){
            /* what happens here is not relevant */
        }));

ls.async_read_some(
    get_rx_buffer(),
    net::bind_executor(
        net::system_executor(),
        [](auto ec, auto size){
            /* what happens here is not relevant */
        }));
```
What have I done? Not much, simply initiated a read and a write at the same time - a perfectly normal state of affairs for a socket. The interesting part is that I have bound both asynchronous completion handlers to the `system_executor`. This means that each of the handlers will be invoked (without synchronisation) on two arbitrary threads. Looking at our pseudo-code above, it becomes clear that there will be a race for the `logging_socket`'s implementation:
- Between the initiation of the read and the completion of the write, and
- between the completion of the read and the completion of the write
Again the Asio documentation is silent on the correct method of mitigating this situation. Two possible workarounds have occurred to me so far:
- Never use a `system_executor` unless first wrapping it in a `strand` (see the sketch after this list).
- Ensure that all composed operations of IO objects are thread-safe with respect to mutation of the implementation. If this is made true, it almost inevitably follows that the entire IO Object may as well be made thread-safe (which Asio IO Objects are not).
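As a sketch of the first workaround (my own code, building on the hypothetical snippet above, with `net` again standing for `boost::asio`): binding both completion handlers to the same strand means the completion-side work of the two operations can no longer run concurrently.

```cpp
// Wrap the system_executor in a strand...
auto strand = net::strand<net::system_executor>(net::system_executor());

// ...and bind both handlers to it. The handlers (and the intermediate
// completion work that runs on their associated executor) are now serialised.
ls.async_write_some(
    get_tx_buffer(),
    net::bind_executor(strand,
        [](auto ec, auto size){ /* ... */ }));

ls.async_read_some(
    get_rx_buffer(),
    net::bind_executor(strand,
        [](auto ec, auto size){ /* ... */ }));
```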
I have reached out to Chris for final judgement and will update the blog (and possibly much of Beast!) in response to a definitive answer.
Unified Web Client
I have been given the go-ahead to make a start on exploring a unified web-client library which will eventually become a candidate for inclusion into Boost.
The obvious course of action, building directly on top of Beast, is a no-go. If the library is to be used on platforms such as tablets and phones, or appear in the various app stores of vendors, there are restrictions on which implementations of communications libraries may be used. To cut a long story short, vendors want to minimise the risk of security vulnerabilities being introduced by people's home-grown communications and encryption code.
So my initial focus will be on establishing an object model that:
- Provides a high degree of utility (make simple things simple).
- Emulates or captures the subtleties of vendors' web client frameworks.
- Efficiently slots into the Asio asynchronous completion model.
Of course, Linux and proprietary embedded systems do not have mandated communications libraries, so there will certainly be heavy use of Beast in the unconstrained platform-specific code.
More information as it becomes available.