
The rise of coroutines

(Image credit: StartupStockPhotos / Pexels)

Coroutines have been known as a concept and used in various niches for ages, since Melvin Conway coined the term in 1958. Coroutines are lightweight, independent instances of code that can be launched to perform certain tasks, suspended (usually to wait for asynchronous events), and resumed to continue their jobs. Coroutines make it easier to build highly concurrent software that performs many tasks at the same time or keeps track of many independent event streams.
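To make that life cycle concrete, here is a minimal Kotlin sketch, assuming the kotlinx.coroutines library; the printed messages are invented for illustration, but the launch/suspend/resume flow is exactly what the definition above describes:

```kotlin
import kotlinx.coroutines.*

// Each launch starts a lightweight coroutine. delay() suspends it without
// blocking a thread, and the coroutine is resumed automatically once the
// wait is over (here the delay stands in for an asynchronous event).
fun main() = runBlocking {
    repeat(3) { id ->
        launch {
            println("coroutine $id: started")
            delay(100) // suspend, e.g. waiting for an asynchronous event
            println("coroutine $id: resumed and finished")
        }
    }
}
```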

A short history

Coroutines used to be a popular concept in the programming languages of the 1960s-1980s era, but they were largely forgotten and fell out of curricula as multithreading became widespread. Traditionally, mainstream programming languages of the 1990s-2010s provided two chief ways to do things concurrently. One is to start an OS-hosted thread for each parallel activity, which works fine as long as you don’t need thousands of them. The other is some form of event-loop programming with callbacks, extremely popular for UI programming and sometimes also used as the basis for highly scalable input/output libraries on the backend. However, callbacks make the code quite complicated, hard to reason about and to debug, especially when different concurrent activities have to keep and update their own state.

In domains that are extraordinarily rich in fine-grained concurrent activities, coroutines have lingered despite this general neglect. Take games, for example. Lua, the popular scripting language for game development (created in 1993, even before Java and JavaScript), supports coroutines, because they are indispensable for code where many independent objects and characters with their own state concurrently execute their own scripts and interact with each other.

The advance of asynchronous programming

What used to be a niche problem is now becoming a mainstream one in both backend and frontend development. Most software used to be CPU-bound, locally solving its own problems. Asynchrony in software used to center on interaction with a person sitting at the keyboard, and the callback-based approach to coding these interactions served well. Nowadays, everything is networked. Mobile and web applications use dozens of services, and monolithic architectures on the backend are being replaced with hundreds of interacting services. A software system that used to spend its time computing something on a local machine now often waits for some other service to return the result of the computation.

Code that waits for asynchronous events has become the programming norm, not the exception; concurrent communication is standard, and modern software does not tend to show us a blocking “please wait” message the way it used to in the past. With threads being too expensive and callbacks too cumbersome for this problem, interest in the concept of coroutines is rising afresh.

Thread-like solutions

For many developers coroutines are a new concept. Developers are either not taught any programming practices for concurrency at all, or are taught the classic thread-based and event-based approaches. So, there are two main directions from which modern programming languages approach this emerging problem of lightweight concurrency.

One approach is to give programmers a very thread-like and familiar programming model, but make the threads lightweight. Most notably this approach is taken by the Go programming language (2009), which does not have the concept of a thread in the language, but a goroutine, which is essentially a coroutine dressed in a very thread-like form. A somewhat similar approach is being worked on by the Java team under the codename Project Loom. At the time of writing, the plan is to leave threads directly available to developers but to introduce the additional concept of a virtual thread, which is lightweight but is programmed very much like a thread from a developer’s standpoint. That is the main advantage of this approach, making it easy to learn and easy to port legacy thread-based software to, but it is also its main weakness, because programming reliable software in a world that expects ubiquitous concurrency requires different engineering practices from a world where threads were few.

In particular, in the modern world of networked software it becomes quite useful to distinguish between local CPU-bound computations, which are usually fast or at least take predictable time, and asynchronous requests, which are orders of magnitude slower and may take an unpredictably long time due to network congestion and third-party service slowness. Ironically, the now-classic paper “A Note on Distributed Computing” from Sun Microsystems argued back in 1994 that the two should never be conflated, yet this insight was largely ignored in the design of systems for almost two decades, during an era of failed attempts to build distributed communication architectures that make remote operations indistinguishable from local ones for developers.
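To illustrate that distinction in code, here is a small Kotlin sketch; the function names are invented for the example, and delay merely stands in for a real network request, but it shows how a language can keep ordinary CPU-bound code and potentially slow asynchronous requests visibly apart:

```kotlin
import kotlinx.coroutines.delay

// CPU-bound work: an ordinary function, fast and predictable.
fun computeChecksum(data: ByteArray): Int =
    data.fold(0) { acc, byte -> (acc * 31 + byte) and 0x7fffffff }

// A remote request: marked `suspend`, because it may wait for an
// unpredictably long time on the network or a slow third-party service.
// (The function name is hypothetical; delay() stands in for a real call.)
suspend fun fetchRemoteReport(serviceUrl: String): String {
    delay(250)
    return "report from $serviceUrl"
}
```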

Rebirth of coroutines

The opposite of the thread-like approach is the introduction of coroutines into a programming language as a separate concept, specifically tailored and distinctly shaped for the world of massively asynchronous software. Initially, this approach was popular chiefly among single-threaded scripting languages that do not provide an option to use threads to their full extent.

Most notably, Python started adding coroutines as a mechanism for writing concurrent code in 2005 and evolved this approach over the years. Meanwhile, the milestone for coroutine adoption in mainstream programming languages came in 2012, when C# added coroutines in the form of async functions and await expressions. This particular syntactic form quickly gained popularity and was adopted by Python in 2015, by JavaScript in 2017, and by many other programming languages. Even the old guard, C++, added coroutine support in 2020, with the keywords spelled co_await and co_return, thus cementing the mainstream status of coroutines in programming. The concept of coroutines has now fully risen back from obscurity.

The color of your function

The main disadvantage of async/await concurrency, one that the thread-like approach does not have, is now known as the problem of red/blue code, as explained by Bob Nystrom in his 2015 blog post “What Color is Your Function?”. When using async/await you have to write asynchronous code in a visually quite different manner from regular, computational code.

This concern leads to a variation of the async/await approach to coroutines that takes a different syntactic form to mitigate the problem, so that asynchronous calls look the same as regular calls in the source, yet the code retains the advantage of marking the parts that could end up waiting indefinitely for external events. This path was taken by Kotlin in 2017. Kotlin coroutines use a suspend keyword to mark functions that can suspend the execution of a coroutine, without mandating any kind of distinct await expression in the logic of the program itself. In essence, it is an async/await-inspired design: there is a marker for asynchronous functions in the code, but their calls do not have to be marked with await.
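A minimal sketch of what this looks like in practice (the function names are invented, and delay stands in for real network calls): the functions that can suspend carry the suspend modifier, yet calling them reads exactly like calling ordinary functions, with no await expressions in sight:

```kotlin
import kotlinx.coroutines.delay

// Functions that may wait for external events are marked with `suspend`...
suspend fun fetchOrder(id: Int): String {
    delay(100) // stands in for a network call
    return "order-$id"
}

suspend fun fetchPrice(orderId: String): Int {
    delay(100) // stands in for another network call
    return 42
}

// ...but calling them looks exactly like calling ordinary functions:
// straight-line code, no distinct await expressions.
suspend fun orderTotal(id: Int): Int {
    val order = fetchOrder(id)
    return fetchPrice(order)
}
```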

A road to structured concurrency

Coroutines are useful, are gaining popularity, and are here to stay, which means that developers will need to learn the best practices of using them. One particular trend that is gaining traction because of coroutines is the structured concurrency paradigm.

Coroutines make it seemingly easy to write highly concurrent and asynchronous software, yet every coroutine the code launches risks being accidentally suspended for a long time, waiting for events or responses that might never happen. That creates a new way to leak resources in software, one that developers are unequipped and unprepared to deal with. Structured concurrency is a discipline of encapsulating concurrent pieces of code in such a way as to prevent those kinds of leaks and to make concurrent software easier for humans to reason about.
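In Kotlin this discipline is expressed through coroutine scopes. The following sketch (with invented function names, and delay standing in for remote calls) cannot leak its children: coroutineScope does not return until both nested coroutines complete, and cancelling the caller cancels them as well:

```kotlin
import kotlinx.coroutines.*

// A structured-concurrency sketch: both lookups run concurrently, but they
// are confined to the enclosing scope. If one of them fails or the caller
// is cancelled, the other is cancelled too; nothing is left running in the
// background after loadDashboard returns.
suspend fun loadDashboard(userId: Int): Pair<String, String> = coroutineScope {
    val profile = async { fetchProfile(userId) } // hypothetical remote call
    val feed = async { fetchFeed(userId) }       // hypothetical remote call
    profile.await() to feed.await()
}

suspend fun fetchProfile(userId: Int): String { delay(100); return "profile-$userId" }
suspend fun fetchFeed(userId: Int): String { delay(100); return "feed-$userId" }
```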

This paradigm shift, which is happening right now, is akin to the ascent of the structured programming paradigm that was sparked by Dijkstra’s famous “Go to statement considered harmful” letter in 1968 and culminated in the universal adoption of structured programming in all the languages we program in today.

We are still living in a world where most concurrent software is written in an unstructured way, an analogy with the old days of GOTO-ridden code that was aptly captured by Nathaniel Smith in his “Notes on structured concurrency, or: Go statement considered harmful”. Yet all the languages that are introducing lightweight concurrency are also adding library abstractions for structured concurrency. Just as happened with structured programming in the past, we can foresee that in the future a structured approach to concurrency will become the default, enforced by programming languages and their concurrency libraries.

Roman Elizarov, Kotlin Libraries Team Lead, Kotlin