Asynchronous programming. Await the Future

This is the third post in a series on asynchronous programming. The whole series tries to answer a simple question: "What is asynchrony?". In the beginning, when I first started digging into the question, I thought I knew what it was. It turned out that I didn't know the slightest thing about asynchrony. So let's find out!

Whole series:

  • Asynchronous programming. Blocking I/O and non-blocking I/O
  • Asynchronous programming. Cooperative multitasking
  • Asynchronous programming. Await the Future
  • Asynchronous programming. Python3.5+


Some applications implement parallelism using several processes instead of several threads. Although the implementation details differ, conceptually it is the same model, so in this post I use the term threads, but you can easily substitute processes.


Also, here we will speak only in terms of explicit cooperative multitasking with callbacks, since this is the most common and widely used variant for implementing asynchronous frameworks. But I think it is also interchangeable with cooperative threads.

The most common activity of modern applications is working with input and output operations (I/O) rather than heavy number crunching. The problem with I/O functions is that they are blocking: actually writing to a hard disk or reading from a network takes a lot of time compared to the speed of the CPU. A function cannot return until the operation is done, and meanwhile your application does nothing. For applications that require high performance, this is a major obstacle, because other actions and other I/O operations keep waiting.
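
To make the cost concrete, here is a minimal Python sketch, using `time.sleep` as a stand-in for a blocking read or write (the labels and durations are made up for illustration):

```python
import time

def blocking_io(label: str, seconds: float) -> str:
    # Stand-in for a blocking call such as a disk write or socket recv():
    # the calling thread can do nothing else until it returns.
    time.sleep(seconds)
    return f"{label} done"

start = time.monotonic()
results = [blocking_io("read", 0.1), blocking_io("write", 0.1)]
elapsed = time.monotonic() - start

# The two operations run back to back, so the total time is
# roughly the sum of the individual waits (~0.2 s here).
```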

One of the standard solutions is to use threads. Each blocking I/O operation is started in a separate thread. When a thread calls a blocking function, the operating system schedules another thread that actually needs the CPU.
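
As a rough Python sketch (again with `time.sleep` standing in for a blocking call), two waits in separate threads overlap instead of adding up:

```python
import threading
import time

def blocking_io(seconds: float, results: list, index: int) -> None:
    # Stand-in for a blocking call; while this thread sleeps,
    # the OS can schedule the other thread.
    time.sleep(seconds)
    results[index] = "done"

results = [None, None]
threads = [
    threading.Thread(target=blocking_io, args=(0.1, results, i))
    for i in range(2)
]

start = time.monotonic()
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.monotonic() - start

# Both 0.1 s waits overlap, so the total is ~0.1 s, not ~0.2 s.
```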


In this execution model, a thread is assigned to one task and starts executing its commands. When the task is complete, the thread takes the next task and does the same: it executes all of that task's commands one after another. In such a system, a thread cannot leave a task halfway done and move on to the next one. So we can be sure that when a function is being executed, it cannot be put on hold: it will finish completely before another function starts (and possibly changes the data the current function is working with).

Single thread


If a system runs in a single thread and there are several tasks associated with it, they will be executed in this one thread sequentially, one after another.

Single threaded


And if tasks are always executed in a certain order, the execution of a later task can assume that all earlier tasks finished without errors, with all their results available for use, which simplifies the logic.


And if one of the commands is slow, the whole system will wait for the completion of this command — it is impossible to bypass it.

Multiple threads


In a multi-threaded system, the principle is preserved — one thread is assigned to one task and works on it until it is completed.


But in a multi-threaded system, each task is executed in a separate thread of control. Threads are managed by the operating system and can run in parallel on a system with several processors or cores, or be multiplexed on a single processor.

Now we have more than one thread and several different tasks to be executed at the same time. And a thread that has completed work on one of its tasks can proceed to the next one.

Synchronizing access to data is one of the first things you will encounter when taking on multithreaded programming. Low-level languages like C don't have any built-in primitives for access synchronization at all, and you will at least need to use POSIX semaphores or write your own solution.

The problem is that in most non-isolated languages, two threads can read or write the same variable without warning. If you do not handle such situations, you can easily get undefined behavior in your program, and most likely it will crash.

Overall, multithreaded programs are more complex and tend to be more error-prone, suffering from common issues: race conditions, deadlocks, and resource exhaustion. Additional concepts and primitives (locks, semaphores, timeouts, etc.) are introduced to solve these problems.
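
A classic illustration in Python (a minimal sketch; the thread and iteration counts are arbitrary): without a lock, the read-modify-write hidden inside `counter += 1` can interleave between threads and lose increments, while a `threading.Lock` makes the update safe:

```python
import threading

counter = 0
lock = threading.Lock()

def add_many(n: int) -> None:
    global counter
    for _ in range(n):
        # Without this lock, `counter += 1` is a separate read,
        # add, and write; two threads can interleave and lose updates.
        with lock:
            counter += 1

threads = [threading.Thread(target=add_many, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# With the lock, the result is exactly 4 * 100_000 = 400_000.
```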


Another execution model uses a different style: the asynchronous style.

A small note: do not confuse non-blocking and asynchronous I/O. Asynchronous is the opposite of synchronous, while non-blocking I/O is the opposite of blocking. The two are quite similar, but asynchronous applies to a wider range of operations, while non-blocking is mainly used with I/O.

Most modern operating systems provide event notification subsystems. For example, a usual read call on a socket blocks until the sender actually sends something. Instead, the application can ask the operating system to watch the socket and queue a notification event until the data is ready. The application can check for events whenever it likes (perhaps doing other work first, using the processor to the maximum) and process other things in the meantime. This is asynchronous because the application expresses interest at one point and uses the data at another point (in time and space).
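
In Python, the standard `selectors` module wraps these OS facilities (epoll, kqueue, and friends). A minimal sketch using a local socket pair, so no real network is involved:

```python
import selectors
import socket

sel = selectors.DefaultSelector()

# A connected pair of sockets, so the example needs no network.
server, client = socket.socketpair()
client.setblocking(False)

# Express interest: ask the OS to watch the socket for readability
# instead of blocking in recv().
sel.register(client, selectors.EVENT_READ)

server.sendall(b"hello")          # now the client side has data waiting

# Later, ask which registered sockets are ready; the application
# could have been doing other work in the meantime.
events = sel.select(timeout=1)
for key, _mask in events:
    msg = key.fileobj.recv(4096)  # guaranteed not to block now

sel.close()
server.close()
client.close()
```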


Asynchronous code removes the blocking operation from the main thread of the application, so the main thread continues to run while the operation completes some time later (and perhaps somewhere else). Simply put, the main thread saves the task and defers its execution to a later time.

Asynchrony and context switching

Although asynchronous programming can speed up I/O tasks and solve thread synchronization problems, it was actually designed to deal with a completely different problem: frequent switching of processor context.

When several threads are launched, each processor core can still execute only one thread at a time. To let all threads/processes share resources, CPU context switching happens very frequently: to simplify, the operating system constantly saves all the information about a thread at (from the thread's perspective) random intervals and switches to another thread. Threads are also resources; they are not free. This leads to frequent dumps and loads of thread data, and therefore to frequent misses in the low-level CPU caches, and therefore to potential performance problems.

Asynchronous programming is essentially cooperative multitasking with user-space threading, where the application, not the OS, manages the threads and switches the context. Basically, in the asynchronous world, context switching occurs only at defined switching points, not at undetermined intervals.
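
Generators make these switching points explicit in Python. In this toy round-robin scheduler (the names and step counts are made up), a context switch can happen only where a task `yield`s:

```python
from collections import deque

def task(name: str, steps: int):
    for i in range(steps):
        # A switching point: the task voluntarily gives up control here.
        yield f"{name} step {i}"

def run_cooperatively(tasks) -> list:
    order = []
    queue = deque(tasks)
    while queue:
        current = queue.popleft()
        try:
            order.append(next(current))  # run until the next yield
            queue.append(current)        # context switch happens only here
        except StopIteration:
            pass                         # the task is finished
    return order

order = run_cooperatively([task("a", 2), task("b", 2)])
# order == ["a step 0", "b step 0", "a step 1", "b step 1"]
```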


Compared to the synchronous model, the asynchronous model works best when:

  • There is a large number of tasks, so there is probably always at least one task that can make progress;
  • Tasks perform many I/O operations, which causes the synchronous program to spend a lot of time in blocking mode when other tasks can be performed;
  • Tasks are largely independent of each other, so there is no need for intertask interaction (and therefore, no need to wait for one task from another).

These conditions almost perfectly characterize a typical busy server (for example, a web server) in a client-server system. Each task is a single client request, with I/O in the form of receiving the request and sending the response. Server implementations are prime candidates for the asynchronous model, which is why Twisted and Node.js, along with other asynchronous libraries, have become so popular in recent years.

Why not just use more threads? If one thread is blocked by I/O operation, the other thread can make progress, right?

Threads are still a resource: they are not free, and their number is limited by the OS. As the number of threads grows, your server may begin to experience performance problems. Each new thread adds memory overhead for creating and maintaining the thread's state.

Another performance gain of the asynchronous model is that it avoids context switching: every time the operating system transfers control from one thread to another, it must store all the relevant registers, the memory map, stack pointers, processor context, etc., so that the other thread can resume execution where it stopped. The cost of this can be quite significant.

Event loop

How can an event of a new task arrival reach the application if the execution thread is busy processing another task?

Everything depends on the implementation. Different libraries use different approaches: some use cooperative multitasking and simply yield to coroutines, and some separate event receipt and event processing into different threads (OS threads or user-level threads).

But as we saw in the previous post, the reactor/proactor patterns have their advantages.

And how do you manage all the events? With the analog of our reactor/proactor patterns: the scheduler, or the event loop.


An event loop is exactly what it sounds like: there is a queue of events (where all the events that have happened are stored; in the figure above it is called a "task queue") and a loop that constantly pulls events out of the queue and calls the callbacks attached to those events (all execution goes through the call stack). The API here stands for calls to asynchronous functions, such as waiting for a response from a client or a database.

In this flow, all function calls first go to the call stack, then asynchronous commands are executed through the API, and after they complete, their callbacks go to the task queue and then back to the call stack.

The coordination of this process takes place in the event loop.
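
A toy version of this loop in Python (the class and method names are my own, loosely mirroring asyncio's `call_soon`): completed operations enqueue their callbacks, and the loop pops and runs them one at a time on the call stack:

```python
from collections import deque

class EventLoop:
    """A toy event loop: a FIFO task queue plus a loop that pops
    callbacks off the queue and runs them until the queue is empty."""

    def __init__(self):
        self.task_queue = deque()

    def call_soon(self, callback, *args):
        # The "task queue" from the figure: finished async operations
        # put their callbacks here.
        self.task_queue.append((callback, args))

    def run(self):
        while self.task_queue:
            callback, args = self.task_queue.popleft()
            callback(*args)  # the callback executes on the call stack

log = []
loop = EventLoop()
loop.call_soon(log.append, "first")
loop.call_soon(log.append, "second")
loop.run()
# log == ["first", "second"]
```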

You see how this differs from the reactor pattern we talked about [in the last post](https://luminousmen.com/post/asynchronous-programming-cooperative-multitasking)? That's right — they are the same.

When the event loop forms the central control flow of the program, as often happens, it can be called the main loop or main event loop. This name is appropriate, because such an event loop sits at the highest level of control inside the application.

In event-driven programming, the application expresses interest in certain events and responds to them when they occur. It is the responsibility of the event loop to collect events from the operating system or monitor other event sources, and the user can register callbacks that will be called when an event occurs.

The event loop usually runs forever.

For a great explanation of the JS event loop concept, see the talk "What the heck is the event loop anyway?" by Philip Roberts at JSConf EU.


Summing up the whole theoretical series:

  • Asynchronous operations in applications can make them more efficient, and most importantly, faster for the user.

  • OS threads are cheaper than processes, but it is still very expensive to use one thread per task. Reusing threads is more efficient, and this is what asynchronous programming gives us.

  • The asynchronous model is one of the most important approaches to optimizing and scaling I/O-bound applications (yes, it will not help with CPU-bound tasks).

  • Asynchronous programs may be difficult to write and debug.