Async Ruby - RubyConf 2021 talk transcript
This talk was given at RubyConf 2021. Below is the slightly edited talk transcript. You can also watch the video.
Table of contents
1. Introduction
2. Asynchronous programming
  2.1. Asynchronous programming benefits
3. Ruby is synchronous
  3.1. Ruby threads
  3.2. Ruby thread downsides
4. Async Ruby
  4.1. Async gem
  4.2. Async ecosystem
5. Basic example
  5.1. Async tasks
  5.2. Async program structure
6. Advanced example
  6.1. URI.open
  6.2. HTTParty
  6.3. Redis
  6.4. SSH
  6.5. SQL queries
  6.6. Blocking operations
  6.7. Spawning processes
7. Advanced scaling example
  7.1. Scalability limits
8. Understanding Async Ruby
  8.1. Event reactor
  8.2. Fibers
  8.3. Fiber scheduler
9. Common questions
  9.1. Async Rails?
  9.2. Production ready?
  9.3. How to get started?
10. Async Ruby creator
11. Conclusion
Introduction
Async Ruby is an awesome addition to the Ruby language. It's been available for some time now, but relatively few people know about it and it has stayed out of the Ruby mainstream.
The goal is to show you, at a high level, what Async Ruby is about. Whether you're a beginner or an advanced rubyist, I hope to show you something you didn't know about Ruby.
We're going to go through a couple simple examples that show the power of asynchronous programming, and we'll also explain the core concepts of how it all works.
I've been a Ruby programmer for 10 years now and this is, in my opinion, by far the most exciting addition to the Ruby language during this time.
My name is Bruno Sutic. I'm an Async Ruby early adopter, and I've made a couple small contributions to it. You can find me on GitHub as @bruno-. You can also find my contact info on my webpage, brunosutic.com.
Asynchronous programming
Before jumping into Async Ruby, let's explore what async really means. What is asynchronous programming?
It's commonly accepted that JavaScript brought async programming to the mainstream developer's consciousness, so it would be fitting to explain asynchronous programming with a simple JavaScript example. I assume a lot of you have written at least a little JavaScript, because it's so unavoidable these days.
Let's look at this example:
fetch("https://httpbin.org/delay/2").then((res) => {
  console.log(`Status is ${res.status}`)
})
console.log("runs first")
- We make a simple HTTP GET request to httpbin.org.
- We register a callback on the returned promise; it runs when the response is received. This function just prints the response status.
- On the last, 4th line of this example, we're printing a string.
The output, shown below, is expected:
runs first
Status is 200
This is the simplest example of an async program, in which we typically make an I/O request, and then something happens later in a callback when the request is complete.
One thing to note in the output here is:
- First, the program runs the code on the last line and prints the string.
- Later, when the request is done, it prints the response status.
If you think about it, it's unusual for simple programs to run backwards, such as:
- line 1
- line 4
- then back to line 2
To us, developers, and humans, programs that run
The point I'm trying to make here is: async programs are harder to follow and understand. Programs that run
In the case of JavaScript, as programs become more complex, they may end up in an infamous state called "callback hell" or "promise hell", or even "async await hell".
Asynchronous programming benefits
So then, why would we want to make our programs asynchronous? Why not just stick to a linear,
The answer is simple: performance. To understand this, let's look at the following JavaScript example:
fetch("https://httpbin.org/delay/2").then(...)
fetch("https://httpbin.org/delay/2").then(...)
fetch("https://httpbin.org/delay/2").then(...)
Here, we're making 3 HTTP GET requests, and each one takes 2 seconds to run. How long will this whole program run? Surprise, surprise - the program will run for 2 seconds total!
In this example we're firing 3 HTTP requests at practically the same time. The trick is that waiting for the responses happens in parallel. Asynchronous programming enables this to happen, and that's how we achieve these big performance gains.
Ruby is synchronous
If we look at the equivalent code in Ruby, we'll see that the same example takes 3x longer to run.
require "open-uri"
URI.open("https://httpbin.org/delay/2")
URI.open("https://httpbin.org/delay/2")
URI.open("https://httpbin.org/delay/2")
In this case the math is predictable: 3 x 2 = 6 seconds. The reason for this is that there's no parallel waiting on the responses. Ruby is synchronous.
Ruby threads
So, how do you make 3, or 5, or 100 requests in Ruby more performant? You use threads.
This example shows how to speed up our program with 3 requests in Ruby.
require "open-uri"
1.upto(3).map {
  Thread.new do
    URI.open("https://httpbin.org/delay/2")
  end
}.each(&:join)
The whole program finishes in 2 seconds!
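Here's a minimal sketch of the same comparison that runs without a network connection, assuming sleep is a fair stand-in for a slow HTTP request. The 0.2-second delay (instead of 2 seconds) and the elapsed helper are made up for illustration.

```ruby
# Measure how long a block takes, using a monotonic clock.
def elapsed
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

DELAY = 0.2 # assumption: shortened stand-in for the 2-second request

# One after another: the waits add up (about 3 x DELAY).
sequential = elapsed { 3.times { sleep DELAY } }

# One thread per "request": the waits overlap (about 1 x DELAY).
threaded = elapsed do
  3.times.map { Thread.new { sleep DELAY } }.each(&:join)
end

puts format("sequential: %.2fs, threaded: %.2fs", sequential, threaded)
```

The sequential run should take roughly three times as long as the threaded one, which mirrors the 6 seconds versus 2 seconds above.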
Ruby thread downsides
And now you may be wondering: Ruby isn't asynchronous by design, but it has threads, so are we good?
If you've done any
There are two specific problems with them:
- Language-level race conditions - These are particularly nasty and hard to debug. This type of problem can occur with even the simplest of thread programs.
- Maximum number of threads - This matters when you want to make a large number of parallel requests.
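To make the first problem concrete, here's a small sketch of a check-then-act race, assuming a short sleep is enough to stand in for I/O between the check and the update. The balance scenario and all variable names are invented for illustration.

```ruby
balance = 100

# Unsafe: each thread checks the balance, waits, then withdraws.
# All three pass the check before any of them updates the value.
unsafe_threads = 3.times.map do
  Thread.new do
    if balance >= 100
      sleep 0.05 # simulates I/O between the check and the update
      balance -= 100
    end
  end
end
unsafe_threads.each(&:join)
unsafe_balance = balance # negative: the account was overdrawn

# Safe: a Mutex makes the check and the update one atomic step.
balance = 100
lock = Mutex.new
safe_threads = 3.times.map do
  Thread.new do
    lock.synchronize do
      if balance >= 100
        sleep 0.05
        balance -= 100
      end
    end
  end
end
safe_threads.each(&:join)
safe_balance = balance # 0: only one withdrawal went through

puts "without a lock: #{unsafe_balance}, with a lock: #{safe_balance}"
```

The catch is that nothing forces you to add the lock, and the unsafe version can look fine in testing.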
I just tried maxing out the number of threads on my machine, which is a
Async Ruby
Async Ruby is a new type of concurrency in Ruby. If you ever think "I want to do multiple things at the same time in Ruby", then Async may be a good fit.
Here are a couple of examples:
- Serving more requests per second with the same hardware.
- Making more requests with your API client at the same time.
- Handling more websocket connections concurrently.
Ruby has a couple options when you want to do multiple things at the same time:
- Processes
- Ractors
- Threads
- Async
Async is the new addition to the above list.
Async gem
So, how do you run Async Ruby? Async is just a gem, and you install it with gem install async - that's it.
It's a very nice gem, because Matz invited it to Ruby's standard library. The invite has not yet been accepted.
The gem creator is Samuel Williams, a Ruby core committer. He also wrote the Fiber scheduler, an important Ruby 3.0 feature that makes Async integrate with Ruby in a
So, you can kinda feel that the Ruby core team, including Matz himself, are backing this gem.
Async ecosystem
Async Ruby is also an ecosystem of gems. Here's a couple of them:
- async-http - A powerful HTTP client.
- async-await - Adds some syntax sugar to Async.
- falcon - A highly scalable asynchronous HTTP server built around the Async core.
- async-redis - Redis client.
- async-websocket - The name says it all.
This talk focuses on the core async gem and the accompanying Ruby language integration.
Basic example
Let's do an Async Ruby example that is equivalent to the JavaScript example we had before.
require "async"
require "async/http/internet"
time = Time.now
Async do |task|
  internet = Async::HTTP::Internet.new
  task.async do
    internet.get("https://httpbin.org/delay/2")
  end
  task.async do
    internet.get("https://httpbin.org/delay/2")
  end
  task.async do
    internet.get("https://httpbin.org/delay/2")
  end
end
puts "Duration: #{Time.now - time}"
In this example we're using the async-http gem. The only thing you have to know about it is that it's an HTTP client. You call get on it, and it makes a request.
The actual code starts with a capitalized Async - a kernel method with a block. All the asynchronous code in a Ruby program is always wrapped in an Async block.
Async tasks
Async Ruby has a concept of tasks, and we spin up multiple tasks when we want to make things run concurrently. In this example we're running three requests at the same time.
And just like in the previous JavaScript example, all three requests start at virtually the same time. The big win is that waiting on the responses happens in parallel. The example output confirms this:
Duration: 2.428274121
The total running time of this example is slightly more than the expected 2 seconds because of the network latency.
Async program structure
The basic example above shows the general structure of Async Ruby programs:
- You start with an Async block that is passed a main task.
- That main task is usually used to spawn more Async sub-tasks.
- These sub-tasks run concurrently to each other and to the main task.
Just to make it explicitly clear: Async tasks can be nested indefinitely. So, a task block is passed a
Another thing to clarify is that it's all just Ruby. Async does not contain any DSL - nor does it do gimmicks, like monkey patching. The previous example performs only HTTP requests within tasks, but you can run any Ruby code anywhere - in a main task or
Advanced example
Hopefully the above example has given you a positive first impression of Async Ruby. Once you get a little used to how things work, you see it's actually really neat, and the performance benefits are awesome. Let's now see another code example. If you're not impressed yet, this may just blow. your. mind.
URI.open
You may not have liked that we're using a new HTTP client in the first example. The truth is, you can use Ruby's URI.open to achieve the same result.
require "async"
require "open-uri"
start = Time.now
Async do |task|
  task.async do
    URI.open("https://httpbin.org/delay/2")
  end
  task.async do
    URI.open("https://httpbin.org/delay/2")
  end
end
puts "Duration: #{Time.now - start}"
The example output:
Duration: 2.440876417
Here, we see that two requests triggered with URI.open are completed in about 2 seconds. Since it's the same result as before, we know the requests ran at the same time.
HTTParty
But, URI.open may also not be your favorite tool. The brilliant thing about Async Ruby is that any HTTP client is supported. Let's try running the HTTParty gem and see how that works.
require "async"
require "open-uri"
require "httparty"
start = Time.now
Async do |task|
  task.async do
    URI.open("https://httpbin.org/delay/2")
  end
  task.async do
    HTTParty.get("https://httpbin.org/delay/2")
  end
end
puts "Duration: #{Time.now - start}"
And the output is:
Duration: 2.415048833
Ok, the program ran in about 2 seconds which means that all requests ran concurrently.
Redis
So far, we've only seen examples making HTTP requests. But, what about other network requests? Let's try Redis which has its own protocol built on top of TCP.
This example extends the previous one with another task at the bottom of the Async block.
require "async"
require "open-uri"
require "httparty"
require "redis"
start = Time.now
Async do |task|
  task.async do
    URI.open("https://httpbin.org/delay/2")
  end
  task.async do
    HTTParty.get("https://httpbin.org/delay/2")
  end
  task.async do
    Redis.new.blpop("abc123", 2)
  end
end
puts "Duration: #{Time.now - start}"
The added Redis command runs for 2 seconds before returning.
We run the example:
Duration: 2.410222604
It completes in about 2 seconds. Wow! We can also make Redis commands asynchronous.
In fact, any I/O operation can be made asynchronous. All existing, synchronous code is fully compatible with Async. You don't have to use async-http or async-redis. You can just continue using the libraries you are already familiar with.
SSH
Let's add another example to the mix. I'll use the net-ssh gem to execute an SSH command on a remote server.
This example extends the previous one with another task at the bottom of the Async block.
require "async"
require "open-uri"
require "httparty"
require "redis"
require "net/ssh"
start = Time.now
Async do |task|
  task.async do
    URI.open("https://httpbin.org/delay/2")
  end
  task.async do
    HTTParty.get("https://httpbin.org/delay/2")
  end
  task.async do
    Redis.new.blpop("abc123", 2)
  end
  task.async do
    Net::SSH.start("164.90.237.21").exec!("sleep 1.5")
  end
end
puts "Duration: #{Time.now - start}"
This SSH command runs sleep 1.5 on the target server. Because of some overhead, it finishes in about 2 seconds total.
And the output is:
Duration: 2.400152144
Ok, there you have it. We added SSH to the mix and it works seamlessly with other network requests.
SQL queries
You may be wondering - what about databases? We connect to the databases over the network. Does Async support SQL queries?
I'll use the sequel gem to check if asynchronous database operations are supported. The query added to the example takes exactly 2 seconds to run.
This example extends the previous one with another task at the bottom of the Async block.
require "async"
require "open-uri"
require "httparty"
require "redis"
require "net/ssh"
require "sequel"
DB = Sequel.postgres
Sequel.extension(:fiber_concurrency)
start = Time.now
Async do |task|
  task.async do
    URI.open("https://httpbin.org/delay/2")
  end
  task.async do
    HTTParty.get("https://httpbin.org/delay/2")
  end
  task.async do
    Redis.new.blpop("abc123", 2)
  end
  task.async do
    Net::SSH.start("164.90.237.21").exec!("sleep 1.5")
  end
  task.async do
    DB.run("SELECT pg_sleep(2)")
  end
end
puts "Duration: #{Time.now - start}"
The output:
Duration: 2.465881664
And yes, database queries are supported as well. Cool, right?
Blocking operations
Let's see another example that uses Ruby's sleep method.
This example extends the previous one with another task at the bottom of the Async block.
require "async"
require "open-uri"
require "httparty"
require "redis"
require "net/ssh"
require "sequel"
DB = Sequel.postgres
Sequel.extension(:fiber_concurrency)
start = Time.now
Async do |task|
  task.async do
    URI.open("https://httpbin.org/delay/2")
  end
  task.async do
    HTTParty.get("https://httpbin.org/delay/2")
  end
  task.async do
    Redis.new.blpop("abc123", 2)
  end
  task.async do
    Net::SSH.start("164.90.237.21").exec!("sleep 1.5")
  end
  task.async do
    DB.run("SELECT pg_sleep(2)")
  end
  task.async do
    sleep 2
  end
end
puts "Duration: #{Time.now - start}"
What do you expect this sleep will do? Will it increase the total example duration by 2 seconds? Let's check the output:
Duration: 2.397805105
The whole program runs in about 2 seconds, which indicates this sleep ran concurrently with other tasks. Nice! So, not only can we run network I/O asynchronously, we can also run other blocking operations async.
Spawning processes
What other, often used, blocking operations do we run in Ruby? How about we try spawning new child processes?
This example extends the previous one with another task at the bottom of the Async block.
require "async"
require "open-uri"
require "httparty"
require "redis"
require "net/ssh"
require "sequel"
DB = Sequel.postgres
Sequel.extension(:fiber_concurrency)
start = Time.now
Async do |task|
  task.async do
    URI.open("https://httpbin.org/delay/2")
  end
  task.async do
    HTTParty.get("https://httpbin.org/delay/2")
  end
  task.async do
    Redis.new.blpop("abc123", 2)
  end
  task.async do
    Net::SSH.start("164.90.237.21").exec!("sleep 1.5")
  end
  task.async do
    DB.run("SELECT pg_sleep(2)")
  end
  task.async do
    sleep 2
  end
  task.async do
    `sleep 2`
  end
end
puts "Duration: #{Time.now - start}"
I'm using a sleep system command in this example. Don't get confused, this is actually running an external system command. It could be any other executable, but I chose sleep so I can easily control the duration.
The output is:
Duration: 2.396816366
And there you have it: system commands can run async as well.
Advanced scaling example
We've covered a lot so far, and hopefully these features look exciting to you. You saw something new, something really innovative in Ruby. But that's not all. Let me show you how easily Async Ruby scales.
This example extends the previous one by repeating the content of the Async block 10.times.
require "async"
require "open-uri"
require "httparty"
require "redis"
require "net/ssh"
require "sequel"
DB = Sequel.postgres(max_connections: 10)
Sequel.extension(:fiber_concurrency)
start = Time.now
Async do |task|
  10.times do
    task.async do
      URI.open("https://httpbin.org/delay/2")
    end
    task.async do
      HTTParty.get("https://httpbin.org/delay/2")
    end
    task.async do
      Redis.new.blpop("abc123", 2)
    end
    # task.async do
    #   Net::SSH.start("164.90.237.21").exec!("sleep 1.5")
    # end
    task.async do
      DB.run("SELECT pg_sleep(2)")
    end
    task.async do
      sleep 2
    end
    task.async do
      `sleep 2`
    end
  end
end
puts "Duration: #{Time.now - start}"
Quick note about Net::SSH. I had to remove that operation because I couldn't figure out the correct SSH configuration for this example.
Let's see how long this example runs:
Duration: 2.82646708
Under 3 seconds! Yes, we're running 60 tasks, each lasting 2 seconds, and the total run time of the example is about 2.8 seconds.
How about cranking things up? How about we repeat this 100 times? Let's see what happens.
This example is almost the same as the previous one, but the content of the Async block is repeated 100.times.
require "async"
require "open-uri"
require "httparty"
require "redis"
require "net/ssh"
require "sequel"
DB = Sequel.postgres(max_connections: 100)
Sequel.extension(:fiber_concurrency)
start = Time.now
Async do |task|
  100.times do
    task.async do
      URI.open("https://httpbin.org/delay/2")
    end
    task.async do
      HTTParty.get("https://httpbin.org/delay/2")
    end
    task.async do
      Redis.new.blpop("abc123", 2)
    end
    # task.async do
    #   Net::SSH.start("164.90.237.21").exec!("sleep 1.5")
    # end
    task.async do
      DB.run("SELECT pg_sleep(2)")
    end
    task.async do
      sleep 2
    end
    task.async do
      `sleep 2`
    end
  end
end
puts "Duration: #{Time.now - start}"
We're now running 600 concurrent operations. Example duration is:
Duration: 3.753404045
The total program run time increased by a second because of the overhead of establishing so many network connections. Still, I find this pretty impressive.
So, there you have it: easy scaling with Async. You can crank the numbers up, but in my case the Redis server and the PostgreSQL database started complaining, so I left it at that.
Scalability limits
You can argue we could make the last example work with threads - creating 600 threads. I think that's really pushing the limits with threads. My hunch is the thread scheduling overhead would be just too high. When using threads, it's more common to limit the number of threads to, say, 50 or 100.
On the other hand, 600 concurrent Async tasks is a common thing to do. The upper limit on the number of Async tasks per process is in the single digit millions. Some users have successfully done that.
This limit, of course, depends on the system and what you're trying to do. For example, if you're making or receiving network requests, you'll probably run out of ports at
In any case, I hope that you get the idea that Async Ruby is a very, very, powerful tool.
Understanding Async Ruby
To me, the biggest part of the magic is running 3 HTTP requests with URI.open. With vanilla Ruby that takes 6 seconds. And then, by using the same method within the Async block, the program runs for 2 seconds.
It's the same with other examples: sleep, Redis etc. They all normally run in a blocking way, but then inside an Async block they work asynchronously. It's a great example of keeping Ruby code fully backwards compatible. But how does that work?
There's a lot to learn about Async Ruby, but I think there are 3 main concepts to understand:
- Event reactor
- Fibers
- Fiber scheduler
Each of these 3 topics is very broad, so I'll just provide a summary here.
Event reactor
The event reactor is sometimes called by other names: event system or event loop. Every async implementation, in every language, say JavaScript, always has some kind of event reactor behind it.
Async Ruby is no exception. The current version of the async gem uses the nio4r gem as an event reactor backend. nio4r then uses libev to wrap systems' native APIs - epoll on Linux, kqueue on macOS etc.
What does the event reactor do? It efficiently waits for I/O events. When an event happens, it performs an action we programmed it to do. On a very high level:
- We make an HTTP request and then we wait.
- The event reactor notifies us when the response for that request is ready and can be read from the underlying socket.
- We read from the socket.
These notifications are very efficient with resource usage and allow for high scalability. For example, if you hear a server can handle 10 thousand connections at the same time or a crawler can make a large number of concurrent requests - an event reactor is probably the technology behind that.
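The three steps above can be sketched with plain Ruby and IO.select. This is only a toy loop, not how async or nio4r are implemented: pipes stand in for sockets, two threads stand in for remote servers, and all names are invented for illustration.

```ruby
# Two pipes stand in for two sockets that are waiting on responses.
reader_a, writer_a = IO.pipe
reader_b, writer_b = IO.pipe
names = { reader_a => "A", reader_b => "B" }

# Simulated servers: B answers before A.
producers = [
  Thread.new { sleep 0.3; writer_a.write("slow response"); writer_a.close },
  Thread.new { sleep 0.1; writer_b.write("fast response"); writer_b.close }
]

pending = [reader_a, reader_b]
events = []

until pending.empty?
  # Sleeps without burning CPU until at least one reader has data.
  ready, = IO.select(pending)
  ready.each do |io|
    events << "#{names[io]}: #{io.read}" # read until the writer closes
    pending.delete(io)
    io.close
  end
end

producers.each(&:join)
puts events
```

The loop handles responses in the order they arrive, not in the order the requests were made. A real reactor does the same thing with epoll or kqueue instead of IO.select, which is what lets it watch thousands of sockets at once.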
Fibers
You saw that Async has tasks. Tasks are just wrappers around fibers. The event reactor drives the execution of these fibers. For example:
- When a response in task 1 is ready, the event reactor resumes task or fiber number 1.
- Later on, when response in task 2 is ready, it resumes task or fiber number 2.
You get the idea.
Due to the decision to register fibers with event reactor we get a really nice property that code within a single task behaves completely synchronously. This means you can read it
The code behaves asynchronously only if you use task.async. There's no way you can get "callback hell" with Async Ruby.
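If you haven't used fibers directly, here's a bare-bones sketch of the pause-and-resume mechanics using only Ruby's built-in Fiber class. The manual resume calls at the bottom play the role of the event reactor, and the log messages are made up for illustration.

```ruby
log = []

# Each fiber stands in for one Async task.
task1 = Fiber.new do
  log << "task 1: request sent"
  Fiber.yield # pause here, as if waiting for a response
  log << "task 1: response handled"
end

task2 = Fiber.new do
  log << "task 2: request sent"
  Fiber.yield
  log << "task 2: response handled"
end

# Start both tasks; each runs until its first Fiber.yield.
task1.resume
task2.resume

# Pretend the reactor saw response 2 arrive first, then response 1.
task2.resume
task1.resume

puts log
```

Notice that the code inside each fiber reads top to bottom, even though the two fibers interleave.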
Fiber scheduler
The last piece of the puzzle, and the last big concept, is the fiber scheduler. Fiber scheduler was listed as one of the big Ruby 3.0 features. It provides hooks for blocking functions inside Ruby. Examples of those blocking functions are:
- Waiting for an I/O read or write
- Waiting on a sleep method to finish
In essence, fiber scheduler turns blocking behavior into
Let's take the sleep method for example. If you're running sleep 2 in an Async block, instead of blocking the whole program for 2 seconds, the fiber scheduler will run that sleep in a
Now you know the big benefit of fiber scheduler. Along with fibers and event reactor, it makes Async Ruby seem like magic.
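To show what such a hook looks like, here's a toy scheduler that only handles sleep, assuming Ruby 3.0 or newer. ToySleepScheduler is a made-up name and this is not how the async gem is implemented; it skips I/O, timeouts and error handling entirely.

```ruby
class ToySleepScheduler
  def initialize
    @sleepers = {} # fiber => monotonic time at which to wake it
  end

  # Fiber.schedule delegates here: run the block in a non-blocking fiber.
  def fiber(&block)
    Fiber.new(blocking: false, &block).tap(&:resume)
  end

  # Ruby calls this instead of really sleeping when a non-blocking
  # fiber calls sleep. We note the deadline and pause the fiber.
  def kernel_sleep(duration = nil)
    @sleepers[Fiber.current] = now + duration.to_f
    Fiber.yield
  end

  # Wake sleeping fibers in deadline order.
  def run
    until @sleepers.empty?
      fiber, wake_at = @sleepers.min_by { |_fiber, time| time }
      @sleepers.delete(fiber)
      pause = wake_at - now
      sleep(pause) if pause.positive? # a real sleep: the main fiber is blocking
      fiber.resume
    end
  end

  def close
    run
  end

  # Required by the scheduler interface, unused in this sketch.
  def block(_blocker, _timeout = nil)
    raise NotImplementedError
  end

  def unblock(_blocker, _fiber)
    raise NotImplementedError
  end

  def io_wait(_io, _events, _timeout)
    raise NotImplementedError
  end

  private

  def now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end
end

scheduler = ToySleepScheduler.new
Fiber.set_scheduler(scheduler)

finished = []
started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
3.times do |i|
  Fiber.schedule do
    sleep 0.2 # routed to ToySleepScheduler#kernel_sleep
    finished << i
  end
end
scheduler.run # drive the fibers until every sleep is over
total = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
Fiber.set_scheduler(nil)

puts "3 sleeps of 0.2s finished in #{total.round(2)}s"
```

The three sleeps overlap, so the whole thing takes about 0.2 seconds rather than 0.6. The code inside the fibers is an ordinary sleep call; only the scheduler underneath changed.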
Common questions
Async Rails?
It's time for the big question: does Async work with Ruby on Rails? The answer is, currently, no. The reason is that ActiveRecord needs more work to support the async gem.
Production ready?
Another big question you may have: is Async Ruby production ready? The answer to that question is - yes! Async Ruby is production ready and a number of people are running it in production. Everyone using it has nothing but praise for Async.
As an example, Trevor Turk recently started using async on helloweather.com. They've replaced Puma and Typhoeus::Hydra with Falcon and Async::HTTP. They immediately cut their server costs to one third, and their overall system is now more stable.
How to get started?
If you're excited about what you've seen so far, you're probably asking: how do I get started? I think the single best starting point to learn Async Ruby is the async gem's GitHub repo, github.com/socketry/async. From there you'll find a link to the project documentation.
Async Ruby creator
I've already mentioned Samuel Williams, but I think it doesn't hurt to say this again: this guy is the sole creator of Async Ruby and the Async ecosystem, and he's a Ruby core committer who implemented the fiber scheduler feature.
Huge thanks to Samuel! He's making an awesome contribution to all of us Ruby developers.
Conclusion
I hope you liked what you saw in this talk. Async is an exciting new addition to Ruby. It's a whole new type of concurrency added to the language. As you saw, it's super powerful and very scalable.
This changes what's possible with Ruby. It changed the way I think about designing programs and apps.
One of the best things is that it does not obsolete any of the existing code.
Just like Ruby itself, Async Ruby is beautifully designed and a joy to use.
Happy hacking with Async Ruby!