Concurrency in Ruby is all the rage these days, and people can't seem to agree
whether Threads, Fibers, event loops, or actors are the best solution. But did you ever consider that your *sequential* Ruby program might be concurrent, with nary a Thread, Fiber, or callback in sight? Well, it happened to me.
This is the story of how accidental concurrency (also known as re-entrancy) broke my brain multiple times over the course of two years, spawned flamewars on Twitter and long blog posts, and of the various solutions I tried along the way. We'll also illuminate some subtleties of concurrent programming in Ruby, differences between several Ruby implementations, and how we can all write code that is friendlier when accidental concurrency strikes.
15. Streaming Ops

    client                 server
      | --- stream_me --->   |
      | <--- result ------   |
      | <--- result ------   |
      | <--- result ------   |
      | <--- done --------   |

list-keys & MapReduce
16. Stream in Ruby
# Request a streamed operation
client.stream_something do |result|
  process(result)
end
17. Stream in Ruby
# Request a streamed operation
client.stream_something do |result|
  process(result)
end

# Stream via curb
if block_given?
  curl.on_body {|c| yield c; c.size }
else
  curl.on_body # Clear out the callback
end
21. Curling Back
def curl
  @curl ||= Curl::Easy.new
end

def curl
  Thread.current[:curl] ||= Curl::Easy.new
end
22. "The first basic rule is that you must
never simultaneously share a libcurl
handle (be it easy or multi or whatever)
between multiple threads."
-- libcurl docs
23. NO THREADS
"The first basic rule is that you must
never simultaneously share a libcurl
handle (be it easy or multi or whatever)
between multiple threads."
-- libcurl docs
24. Realization
• curl yields to block BEFORE return
• block tries to reuse handle while connection is processing
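The realization above can be reproduced with a toy sketch. Handle and Client here are hypothetical stand-ins (NOT the real curb API): because the callback fires before the request returns, a nested request finds the shared handle still busy.

```ruby
# Toy stand-ins (not the real curb API) for a shared connection handle
# that yields chunks to a block BEFORE its request finishes.
class Handle
  def initialize
    @busy = false
  end

  def perform(chunks)
    raise "handle already in use" if @busy  # roughly libcurl's complaint
    @busy = true
    chunks.each { |c| yield c }             # block runs before perform returns
  ensure
    @busy = false
  end
end

class Client
  def handle
    @handle ||= Handle.new                  # the memoized, shared handle
  end

  def stream(&blk)
    handle.perform(%w[a b c], &blk)
  end

  def get(key)
    handle.perform([key]) { |c| c }
  end
end

client = Client.new
begin
  client.stream { |chunk| client.get(chunk) } # nested call re-enters the handle
rescue RuntimeError => e
  puts e.message                              # handle already in use
end
```

Sequential calls work fine; only the nested call from inside the streaming block blows up, which matches the behavior described above.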
25. Re-entrant
"In computing, a computer program or
subroutine is called reentrant if it can
be interrupted in the middle of its
execution and then safely called again
("re-entered") before its previous
invocation's complete execution."
Wikipedia
26. Re-entrant
"...a subroutine can fail to be reentrant
if it relies on a global variable to remain
unchanged but that variable is modified
when the subroutine is recursively
invoked."
Wikipedia
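The Wikipedia failure mode above fits in a few lines of Ruby. This is a minimal illustration (all names here are made up): the subroutine keeps its working state in a global, so a nested call clobbers the outer call's partial result.

```ruby
# Minimal non-reentrant subroutine: it keeps its working state in a
# global, so a nested ("re-entered") call corrupts the outer call.
$buffer = nil

def collect(words, &interrupt)
  $buffer = []                  # global scratch state -- the culprit
  words.each do |w|
    $buffer << w.upcase
    interrupt&.call             # simulate being interrupted mid-execution
  end
  $buffer.join(" ")
end

puts collect(%w[hello world])                        # HELLO WORLD
puts collect(%w[hello world]) { collect(%w[oops]) }  # OOPS -- outer result lost
```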
29. Solution #1: Fibers
if block_given?
  _curl = curl
  Fiber.new {
    f = Fiber.current
    _curl.on_body do |chunk|
      f.resume(chunk); chunk.size
    end
    loop do
      yield Fiber.yield
    end
  }.resume
else
  # ...
30-35. Solution #1: Fibers (annotated)
if block_given?
  _curl = curl                      # local curl
  Fiber.new {                       # open fiber
    f = Fiber.current               # this fiber
    _curl.on_body do |chunk|
      f.resume(chunk); chunk.size   # resume on chunk
    end
    loop do                         # wait for chunk,
      yield Fiber.yield             #   yield chunk,
    end                             #   repeat
  }.resume                          # start the fiber/loop
else
  # ...
36. How it Works
• Stream block runs INSIDE the Fiber
• Fibers have Thread locals
• Thread.current[:curl] is isolated
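The middle bullet is the key fact, and it's easy to demonstrate: Thread#[] storage is fiber-local in Ruby 1.9+, so a block running inside a fresh Fiber cannot see (or clobber) the outer request's Thread.current[:curl].

```ruby
# Thread#[] storage is fiber-local in Ruby 1.9+.
Thread.current[:curl] = "outer handle"

inner = Fiber.new do
  seen = Thread.current[:curl]       # nil: a new fiber starts with empty locals
  Thread.current[:curl] = "inner handle"
  seen
end

puts inner.resume.inspect            # nil
puts Thread.current[:curl]           # outer handle -- untouched
```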
39. class Pump
  def initialize(block)
    @fiber = Fiber.new do
      loop do
        block.call Fiber.yield
      end
    end
    @fiber.resume
  end

  def pump(input)
    @fiber.resume input
    input.size if input.respond_to?(:size)
  end

  def to_proc
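The slide cuts off at to_proc. Here is a plausible completion plus a usage sketch of the Pump in isolation; the to_proc body is my assumption, not the slide's.

```ruby
class Pump
  def initialize(block)
    @fiber = Fiber.new do
      loop do
        block.call Fiber.yield   # park until pumped, then feed the block
      end
    end
    @fiber.resume                # start the loop; it parks at Fiber.yield
  end

  def pump(input)
    @fiber.resume input
    input.size if input.respond_to?(:size)
  end

  def to_proc                    # assumed: lets the Pump stand in for a callback
    method(:pump).to_proc
  end
end

chunks = []
pump = Pump.new(->(c) { chunks << c })
pump.pump("abc")                 # => 3 (the byte count curb's on_body wants back)
pump.pump("de")
p chunks                         # ["abc", "de"]
```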
45. class Pool::Element
  attr_accessor :object, :owner

  def initialize(object)
    @object = object
    @owner = nil
  end

  def lock
    @owner = Thread.current
  end

  def unlock
    @owner = nil
  end
end
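The take method on the next slide calls Element#unlocked?, which this slide elides; presumably it just checks for a nil owner. A standalone sketch (the class is un-namespaced here for brevity, and unlocked? is my assumed implementation):

```ruby
class Element
  attr_accessor :object, :owner

  def initialize(object)
    @object = object
    @owner = nil
  end

  def unlocked?        # assumed: no owner means available
    @owner.nil?
  end

  def lock
    @owner = Thread.current
  end

  def unlock
    @owner = nil
  end
end

e = Element.new(:conn)
p e.unlocked?   # true
e.lock
p e.unlocked?   # false
```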
46. def take(opts = {}) # alias >> take
  result = nil
  begin
    element = nil
    @lock.synchronize do
      element = pool.find {|e| e.unlocked? }
      unless element
        resource = opts[:default] || @open.call
        element = Element.new(resource)
        @elements << element
      end
      element.lock
    end
    result = yield element.object
  rescue BadResource
    delete_element(element) and raise
  ensure
    element.unlock if element
  end
  result
end
47-56. def take(opts = {}) # alias >> take (annotated)
  result = nil                               # result var
  begin
    element = nil                            # pool element
    @lock.synchronize do                     # grab pool lock
      element = pool.find {|e| e.unlocked? } # find a connection
      unless element                         # all claimed!
        resource = opts[:default] || @open.call
        element = Element.new(resource)
        @elements << element                 # add new conn to pool
      end
      element.lock                           # claim it
    end
    result = yield element.object            # call block
  rescue BadResource
    delete_element(element) and raise        # cleanup baddies
  ensure
    element.unlock if element                # release
  end
  result
end
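A trimmed, self-contained version of the pool makes the payoff runnable. TinyPool and its names are mine, and the BadResource and :default handling are omitted; what it shows is that a nested take from inside the block claims a DIFFERENT connection instead of re-entering the one in use.

```ruby
class TinyPool
  class Element
    attr_reader :object
    def initialize(object)
      @object = object
      @owner = nil
    end
    def unlocked?; @owner.nil?; end
    def lock;      @owner = Thread.current; end
    def unlock;    @owner = nil; end
  end

  def initialize(&open)
    @open = open            # how to create a new resource
    @lock = Mutex.new
    @elements = []
  end

  def take
    element = nil
    @lock.synchronize do
      element = @elements.find { |e| e.unlocked? }
      unless element        # all claimed (or empty): open a new one
        element = Element.new(@open.call)
        @elements << element
      end
      element.lock
    end
    yield element.object
  ensure
    element.unlock if element
  end
end

conns = 0
pool = TinyPool.new { conns += 1 }   # each "connection" is just a counter value
pair = pool.take do |outer|
  pool.take { |inner| [outer, inner] }
end
p pair    # [1, 2] -- the nested take opened its own connection
```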
57. # If you actually do this request, you'll get a
# warning. So don't.
objects = []
pool.take do |conn|
  conn.list_keys("ruby") do |keys|
    keys.each do |k|
      pool.>> {|c| objects << c.get("ruby", k) }
    end
  end
end
58. # If you actually do this request, you'll get a
# warning. So don't.
objects = []
client.list_keys("ruby") do |keys|
  keys.each do |k|
    objects << client.get("ruby", k)
  end
end
My name is Sean Cribbs; you can find me on Twitter @seancribbs. Today I'm going to tell you a story about how accidental concurrency happened to me, and how you can prepare for it.
...or alternatively titled "ACCIDENTAL? FFFFFFFUUUUUUUUUUUU"

Fair warning: I'm including a bunch of memes in this presentation to lighten the mood. If you don't get them, I'm sorry. Just nod and smile and humor the crazy man.

But first, a little more about my background and how I came to this problem.
I work for Basho Technologies, a company originally based in Cambridge, Massachusetts, USA, but now geographically distributed. We make Riak, the awesome distributed, fault-tolerant database. Riak is used at Kiip (where Mitchell works) as well as at other startups like Yammer, Voxer, Bump, and GitHub, but also Fortune 500 companies like Comcast (an American telco/cable provider), America Online (YES AOL), and Citigroup. Of course, there are many others we don't even know about, because Riak is free and open-source.
At Basho we like to think of ourselves as experts on distributed systems. A number of our founding engineers built Akamai, the content distribution network, in the late '90s -- a sort of cloud before we called them that. Most of our stuff is in Erlang, where a lot of concurrency problems are solved for us, or with minimal effort.
But this talk is about concurrency in Ruby, not Erlang. What kinds of tools do we have for concurrency in Ruby? We're all pretty familiar with doing concurrency with multiple processes, where we simply start new instances of our program -- we've been doing that for a long time. We're also familiar with threads, which until 1.9 weren't really as useful as they could have been. To solve some of the problems with threads before 1.9, people created reactor libraries like EventMachine, mostly to handle IO-heavy work (not CPU-heavy), but they tend to require a different programming model, with callbacks and such. More recently, we got Fibers, which let you have multiple streams of work that cooperate more explicitly, and Actor libraries like Celluloid that make your objects feel like isolated processes.

But there's another type of "concurrency" (emphasize the quotes) that can happen to sequential Ruby code! If you're like I was when I first ran into this, you're probably saying "WAT?!?!"
This case of accidental concurrency struck while I was implementing, for the Ruby client, several streaming request/response operations that Riak supports. Let's look at how streaming operations work.
To perform a streaming operation on Riak, the client sends the request, and then the server sends partial results to the client in multiple packets. The client waits around, accepting the partial results, until the server signals that it's done.

Streaming operations are great for large responses because both the client and the server can do things more efficiently -- the client can proactively process intermediate results from the server, and the server can reduce the amount of data it has to buffer by sending portions of the result immediately. Riak supports two such operations, list-keys and MapReduce. In HTTP, this is implemented as a chunked response, whereas the Protocol Buffers-based binary protocol simply sends multiple message frames, followed by a final "done" frame.

If you look at the sequence of events, it looks a lot like an iteration over the response chunks. So that's how I implemented the interface, as if you were calling #each on an Enumerable.
Here's the basic format of a streaming operation in the Riak Ruby client. You call the streaming operation, passing a block that will receive intermediate results and do something with them.

Now, initially I was using the "curb" library for HTTP, which provides direct bindings to libcurl. You can give a Ruby block to curb that will be called for each chunk of the response. This made a natural transition to streaming the results through the user's code block, with a little wrapping. Here's roughly what that code looked like.
That's awesome, let's STREAM ALL THE THINGS!
...which works fine until -- for example, when listing keys -- you want to make ADDITIONAL requests to the server from within the streaming block. You get an exception! What happened?
As it turns out, that "curl" thing was an instance variable containing a Curl::Easy connection handle, initialized something like this. The idea was that when you made multiple requests in a row, you could keep the same connection around and avoid setup/teardown costs. Well, it didn't work in the streaming case.

So I thought to make it a Thread-local variable instead, like so. NOPE! Still broken! Why was it being corrupted?
Here's what the libcurl docs say about the problem. (read excerpt) I read this and thought, WAT? I'm not using any threads!
I didn't really understand the problem until I realized that libcurl yields to the block BEFORE the request finishes and returns. That means that, through the lexically bound scope of the block, it has access to the ORIGINAL connection from which it initiated the first request, and so it tries to reuse the handle CONCURRENTLY (that is, recursively) while the original request is still being processed. Whoops!
This informative (but ugly) sequence diagram shows what I just mentioned, the key parts being in purple and red, and the huge leftward line before those parts. If you really want to read it later, you can find it on my blog, which I'll link to later.

But there's another term for this problem of "accidental concurrency": the code is not re-entrant.
Re-entrancy is a classic CS problem, defined here by Wikipedia: (read snippet)
Furthermore: (read snippet) Essentially, I was attempting to reuse the "curl" instance variable or Thread-local WHILE it was already in use and unavailable. I was unintentionally allowing recursion via the passed block and not protecting that curl handle from improper accesses. It was effectively a global variable. I was disappoint.

So now what do I do? I wanted the solution to be simple to implement and maintain, but also look synchronous to the user.
I had two goals for the solution:

- keep the call looking like a synchronous iteration, with no extra concurrency work required of the caller
- make the solution as simple as possible
The first solution I came up with -- this was way back in 2010 -- was to use Fibers to isolate the local connection state. I don't remember where I found this solution, but it might have been Stack Overflow or Dave Thomas' blog posts on the subject.
Here's what the solution looks like. It's a little hard to follow, so I'll walk you through it. For now we'll only focus on the streaming case, since the sequential case is the same as before.

1) First we get a local copy of the currently unsullied curl handle.
2) Then we open a new fiber - remember, fibers are cooperative and must be explicitly started/stopped. This block won't execute until we call resume further down.
3) Inside the block, we make a local copy of the current fiber so we can refer to it in the callback.
4) Now we apply the same on_body callback as before, but instead of yielding to the passed block, we resume the fiber. You'll see why in a second.
5) Then we start an infinite loop that immediately yields control of the fiber (halting to wait for a message chunk). When it gets that chunk, it passes it to the original block via yield, then starts all over again.
6) Finally, we start the fiber using "resume" so that the infinite loop can begin.

That's great, but why does this work?
Here&#x2019;s what the solution looks like. It&#x2019;s a little hard to follow so I&#x2019;ll walk you through it. For now we&#x2019;ll only focus on the streaming case since the sequential case is still the same as before.\n\n1) First we get a local copy of the currently unsullied curl handle.\n2) Then we open a new fiber - remember fibers are cooperative and must be explicitly started/stopped. This block won&#x2019;t execute until we call resume later down.\n3) Inside the block, we make a local copy of the current fiber so we can refer to it in the callback.\n4) Now we apply the same on_body callback as before, but instead of yielding to the passed block, we resume the fiber. You&#x2019;ll see why in a second.\n5) Then we start an infinite loop that immediately yields control of the fiber (halts waiting for a message chunk). When it gets that chunk, it passes it to the original block via yield, then starts all over again.\n6) Finally, we start the fiber using &#x201C;resume&#x201D; so that the infinite loop can begin.\n\nThat&#x2019;s great, but why does this work?\n
Here&#x2019;s what the solution looks like. It&#x2019;s a little hard to follow so I&#x2019;ll walk you through it. For now we&#x2019;ll only focus on the streaming case since the sequential case is still the same as before.\n\n1) First we get a local copy of the currently unsullied curl handle.\n2) Then we open a new fiber - remember fibers are cooperative and must be explicitly started/stopped. This block won&#x2019;t execute until we call resume later down.\n3) Inside the block, we make a local copy of the current fiber so we can refer to it in the callback.\n4) Now we apply the same on_body callback as before, but instead of yielding to the passed block, we resume the fiber. You&#x2019;ll see why in a second.\n5) Then we start an infinite loop that immediately yields control of the fiber (halts waiting for a message chunk). When it gets that chunk, it passes it to the original block via yield, then starts all over again.\n6) Finally, we start the fiber using &#x201C;resume&#x201D; so that the infinite loop can begin.\n\nThat&#x2019;s great, but why does this work?\n
Here&#x2019;s what the solution looks like. It&#x2019;s a little hard to follow so I&#x2019;ll walk you through it. For now we&#x2019;ll only focus on the streaming case since the sequential case is still the same as before.\n\n1) First we get a local copy of the currently unsullied curl handle.\n2) Then we open a new fiber - remember fibers are cooperative and must be explicitly started/stopped. This block won&#x2019;t execute until we call resume later down.\n3) Inside the block, we make a local copy of the current fiber so we can refer to it in the callback.\n4) Now we apply the same on_body callback as before, but instead of yielding to the passed block, we resume the fiber. You&#x2019;ll see why in a second.\n5) Then we start an infinite loop that immediately yields control of the fiber (halts waiting for a message chunk). When it gets that chunk, it passes it to the original block via yield, then starts all over again.\n6) Finally, we start the fiber using &#x201C;resume&#x201D; so that the infinite loop can begin.\n\nThat&#x2019;s great, but why does this work?\n
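That walkthrough might be sketched roughly like this. This is a reconstruction from the description, not the actual library code: FakeHandle stands in for the Curb handle, and its on_body/deliver methods are stand-ins for the real streaming machinery (in the real code, the handle would be the thread-local curl handle copied into a local variable).

```ruby
require 'fiber'  # for Fiber.current on older Rubies

# FakeHandle simulates the Curb handle: it stores an on_body callback
# and fires it once per arriving chunk.
class FakeHandle
  def on_body(&callback)
    @callback = callback
  end

  def deliver(chunk)  # simulates a chunk arriving from the network
    @callback.call(chunk)
  end
end

def stream(handle, &block)
  fiber = Fiber.new do
    f = Fiber.current
    # 4) resume the fiber with each chunk instead of yielding directly
    handle.on_body { |chunk| f.resume(chunk); chunk.size }
    # 5) park waiting for a chunk, hand it to the original block, repeat
    loop { block.call(Fiber.yield) }
  end
  fiber.resume  # 6) start the fiber so the loop parks at Fiber.yield
end

chunks = []
handle = FakeHandle.new
stream(handle) { |c| chunks << c }
handle.deliver("a")
handle.deliver("b")
# chunks is now ["a", "b"]
```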
For 1.8 and JRuby I used @tmm1’s “Poor Man’s Fibers”, which is essentially a thread with two queues. The passed streaming block runs inside the context of the created Fiber.

Matz could correct me on this, but I think the fiber-locality of thread-local variables is simply an artifact of the Fiber design and not an explicit feature. Rubinius, for example, didn’t have it last I checked.

Anyway, as a result, the Thread-local version of the handle is isolated while streaming, and things work out peachy.
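A “poor man’s fiber” along those lines can be sketched as a worker thread synchronized by two queues shuttling values in and out. This is a toy reconstruction of the idea, not @tmm1’s actual shim:

```ruby
# Toy "poor man's fiber": a thread parked on an inbox queue, with an
# outbox queue carrying yielded values back to the caller.
class PoorMansFiber
  def initialize(&block)
    @in  = Queue.new
    @out = Queue.new
    @thread = Thread.new do
      @in.pop                       # park until the first resume
      @out.push(block.call(self))   # run the body; its return is the final value
    end
  end

  # Called from outside: wake the "fiber" and wait for its next yield
  def resume(value = nil)
    @in.push(value)
    @out.pop
  end

  # Called from inside the block: hand a value out and park again
  def yield(value = nil)
    @out.push(value)
    @in.pop
  end
end

f = PoorMansFiber.new { |fib| fib.yield(:first); :done }
f.resume  # => :first
f.resume  # => :done
```

Because the body runs on its own thread, any Thread-local state it touches is automatically isolated from the caller’s thread, which is exactly the property that made the curl handle safe.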
With this, I thought: THIS IS AWESOME, FIBERS NAILED IT. For a long time, this remained the solution. Although it met the first goal of appearing synchronous, it wasn’t the simplest thing to understand or maintain.
Almost a year later, I wanted to add new HTTP libraries behind the Riak client so that people could use it more effectively. I wanted the same benefits Curb gave in terms of long-lived connections, but some quirks in its implementation made it work poorly on certain types of operations.

Incidentally, the new HTTP library I picked ALSO used a Thread-local variable for keep-alive connections. I had the same problem all over again, so this time I wanted to generalize the solution.
The idea was to extract the fibery isolation bits from the HTTP-library-specific bits. What I came up with looks strikingly like the event loop in a Reactor library like EventMachine, so I called it the Pump. This is the entire implementation.

Notice how we create and start the Fiber just as we did before, using yield and resume with the infinite loop.

The “pump” method is the inside of the “on_body” callback block from before: it receives a chunk and resumes the fiber with it.

Finally, the to_proc method makes it easier to treat a Pump object like a block, by binding the pump method and returning it as a proc.
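From that description, the Pump might be sketched like this (a reconstruction, not the exact riak-client source):

```ruby
# Pump: hides the fiber dance behind a small, block-like object.
class Pump
  def initialize(&block)
    @fiber = Fiber.new do
      # Park waiting for a chunk, hand it to the block, repeat
      loop { block.call(Fiber.yield) }
    end
    @fiber.resume  # start the fiber so it parks at the first Fiber.yield
  end

  # The body of the old on_body callback: feed one chunk to the fiber
  def pump(chunk)
    @fiber.resume(chunk)
  end

  # Lets a Pump stand in for a block, e.g. handle.on_body(&pump)
  def to_proc
    method(:pump).to_proc
  end
end
```

With this, the streaming callback setup collapses to something like `curl.on_body(&Pump.new(&block))`, give or take returning the chunk size that Curb expects.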
That means our previous ugly Curb-related code looks like this instead! Wow, that’s a lot easier to follow. It also paved the way for me to do similar things with Riak’s binary protocol, which I was just beginning to implement and where I had to manage sockets myself.

This solution pretty much met both goals: hiding concurrency away from the user and having a simple implementation. Still, it doesn’t appear that simple or clear unless you understand where it came from.
At this point I was proud of my achievement and started to talk about it on Twitter. I drew a lot of confusion and ire from Evan Phoenix and Yehuda Katz, respectively. Unfortunately, thanks to the expiration policy for tweets, I was unable to exhume the conversation before this talk. In the course of trying to explain myself, I ended up writing a blog post which is the source of most of this talk. You can find it at the second link above, or go to seancribbs.com/tech and open the archive for March 2011.

Evan didn’t understand what I was trying to solve -- “what is this code good for?” he said -- and, as is often the case, Yehuda was right. The real solution to the concurrency/re-entrancy problem, he explained, is to implement a proper connection pool that protects connections in use and creates new ones as needed.
On the other hand, my coworker Andy is also right. <read quote, emphasis mine> Making a connection pool is hard, and understanding all the edge cases of concurrent access, locks, and condition variables is difficult, so I procrastinated on the issue.
Because I procrastinated, I was able to pawn the work off onto one of my awesome committers, Kyle Kingsbury.
This is Kyle. He works at Boundary. He is awesome. Give him a high-five on Twitter @aphyr.

He and I discussed the code a lot, but most of the work in the current version is his. In addition to creating a proper connection pool, he also wrote a bunch of neat logic to load-balance connections across Riak hosts in the cluster, with error-sensitive host selection and retries. But let’s look at how just the pool bit works.
We’ll start with the smallest part: the individual items in the pool, which we call elements. They are essentially a wrapper around the thing we want to allocate - that is, connections. To claim or lock an element, the “owner” instance variable is set to the current thread; to unlock it, the owner is set back to nil. There are also query methods to find out whether an element is locked, which simply test whether the owner is nil, but I’ve omitted those here.
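An Element along those lines might look like this. This is a sketch based on the description above, with the omitted query methods included for completeness:

```ruby
# Wrapper around a pooled object (a connection). A non-nil owner
# marks the element as claimed.
class Element
  attr_reader :object

  def initialize(object)
    @object = object
    @owner  = nil
  end

  # Claim the element for the current thread
  def lock
    @owner = Thread.current
  end

  # Release the element for the next user
  def unlock
    @owner = nil
  end

  def locked?
    !@owner.nil?
  end

  def unlocked?
    @owner.nil?
  end
end
```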
Now let’s look at the core of what the pool does: finding and allocating elements to users of the pool and automatically releasing them when they’re finished. We called the method that handles this task “take”, and it has a fun alias to the “right-shift” operator, which makes your calls look sort of like Haskell’s monadic “bind” operator.

Again, I’ve simplified the code here so it’s easier to understand. First we set up a result variable to receive the return value of the passed block, which we will return at the end of the method. We also set up an element variable to receive a connection that we grab from the pool.

Inside the begin/rescue/end section we obtain a lock on the elements in the pool. Once we have the lock, we find an element that is not locked. If none is available, we create a new one and add it to the pool. Finally, we lock the element so that no other section of code -- even in the same Thread -- will try to use it until we’re done. Then we exit the critical section. Notice how we’ve kept the critical section quite small - this is important if you want your code to move forward reliably. Only lock when necessary! Now we get to the real work: we yield the connection to the block, which can do whatever it needs to service the request.

The rescue clause lets us raise exceptions when we find that the connection is bad (disconnected or otherwise), remove that element from the pool, and then re-raise the exception so the caller can do something with it, like decrement a retry count.

Finally, we ensure that the element is unlocked and ready for the next user before returning the result. Note that we don’t bother grabbing the lock here: we’re releasing the connection, not claiming it, and we’re not modifying the pool membership.
Now let&#x2019;s look at the core of what the pool does, which is to find and allocate elements to users of the pool and automatically release them when they are finished. We called the method that handles this task &#x201C;take&#x201D; and it has a fun alias to the &#x201C;right-shift&#x201D; operator which makes your calls look sort of like Haskell&#x2019;s monadic &#x201C;bind&#x201D; operator.\n\nAgain, I&#x2019;ve simplified the code here so it&#x2019;s easier to understand. First we set up a result variable to receive the return value of the passed block, which we will return at the end of the method. We also set up an element variable to receive a connection that we grab from the pool. \n\nInside the begin/rescue/end section we obtain a lock on the elements in the pool. Once we have the lock, we find an element that is not locked. If there are none available, we create a new one and add it to the pool. Finally, we lock the element so that no other sections of code -- even in the same Thread -- will try to use it until we&#x2019;re done. Then we exit the critical section. Notice how we&#x2019;ve kept the critical section quite small - this is important if you want your code to move forward reliably. Only lock when necessary! Now we actually get to the real work - we yield the connection to the block, which can do whatever it needs to service the request.\n\nThe rescue clause here lets us raise exceptions when we find that the connection is bad (disconnected or otherwise) and remove that element from the pool, then re-raise the exception so that the caller can do something with it, like decrement a retry count.\n\nFinally, we ensure that the element is unlocked and ready for the next user before returning the result. Note that we don&#x2019;t care about grabbing the lock here because we&#x2019;re already done with the connection, not claiming it, and we&#x2019;re not modifying the pool membership.\n
Now let&#x2019;s look at the core of what the pool does, which is to find and allocate elements to users of the pool and automatically release them when they are finished. We called the method that handles this task &#x201C;take&#x201D; and it has a fun alias to the &#x201C;right-shift&#x201D; operator which makes your calls look sort of like Haskell&#x2019;s monadic &#x201C;bind&#x201D; operator.\n\nAgain, I&#x2019;ve simplified the code here so it&#x2019;s easier to understand. First we set up a result variable to receive the return value of the passed block, which we will return at the end of the method. We also set up an element variable to receive a connection that we grab from the pool. \n\nInside the begin/rescue/end section we obtain a lock on the elements in the pool. Once we have the lock, we find an element that is not locked. If there are none available, we create a new one and add it to the pool. Finally, we lock the element so that no other sections of code -- even in the same Thread -- will try to use it until we&#x2019;re done. Then we exit the critical section. Notice how we&#x2019;ve kept the critical section quite small - this is important if you want your code to move forward reliably. Only lock when necessary! Now we actually get to the real work - we yield the connection to the block, which can do whatever it needs to service the request.\n\nThe rescue clause here lets us raise exceptions when we find that the connection is bad (disconnected or otherwise) and remove that element from the pool, then re-raise the exception so that the caller can do something with it, like decrement a retry count.\n\nFinally, we ensure that the element is unlocked and ready for the next user before returning the result. Note that we don&#x2019;t care about grabbing the lock here because we&#x2019;re already done with the connection, not claiming it, and we&#x2019;re not modifying the pool membership.\n
Now let&#x2019;s look at the core of what the pool does, which is to find and allocate elements to users of the pool and automatically release them when they are finished. We called the method that handles this task &#x201C;take&#x201D; and it has a fun alias to the &#x201C;right-shift&#x201D; operator which makes your calls look sort of like Haskell&#x2019;s monadic &#x201C;bind&#x201D; operator.\n\nAgain, I&#x2019;ve simplified the code here so it&#x2019;s easier to understand. First we set up a result variable to receive the return value of the passed block, which we will return at the end of the method. We also set up an element variable to receive a connection that we grab from the pool. \n\nInside the begin/rescue/end section we obtain a lock on the elements in the pool. Once we have the lock, we find an element that is not locked. If there are none available, we create a new one and add it to the pool. Finally, we lock the element so that no other sections of code -- even in the same Thread -- will try to use it until we&#x2019;re done. Then we exit the critical section. Notice how we&#x2019;ve kept the critical section quite small - this is important if you want your code to move forward reliably. Only lock when necessary! Now we actually get to the real work - we yield the connection to the block, which can do whatever it needs to service the request.\n\nThe rescue clause here lets us raise exceptions when we find that the connection is bad (disconnected or otherwise) and remove that element from the pool, then re-raise the exception so that the caller can do something with it, like decrement a retry count.\n\nFinally, we ensure that the element is unlocked and ready for the next user before returning the result. Note that we don&#x2019;t care about grabbing the lock here because we&#x2019;re already done with the connection, not claiming it, and we&#x2019;re not modifying the pool membership.\n
Now let&#x2019;s look at the core of what the pool does, which is to find and allocate elements to users of the pool and automatically release them when they are finished. We called the method that handles this task &#x201C;take&#x201D; and it has a fun alias to the &#x201C;right-shift&#x201D; operator which makes your calls look sort of like Haskell&#x2019;s monadic &#x201C;bind&#x201D; operator.\n\nAgain, I&#x2019;ve simplified the code here so it&#x2019;s easier to understand. First we set up a result variable to receive the return value of the passed block, which we will return at the end of the method. We also set up an element variable to receive a connection that we grab from the pool. \n\nInside the begin/rescue/end section we obtain a lock on the elements in the pool. Once we have the lock, we find an element that is not locked. If there are none available, we create a new one and add it to the pool. Finally, we lock the element so that no other sections of code -- even in the same Thread -- will try to use it until we&#x2019;re done. Then we exit the critical section. Notice how we&#x2019;ve kept the critical section quite small - this is important if you want your code to move forward reliably. Only lock when necessary! Now we actually get to the real work - we yield the connection to the block, which can do whatever it needs to service the request.\n\nThe rescue clause here lets us raise exceptions when we find that the connection is bad (disconnected or otherwise) and remove that element from the pool, then re-raise the exception so that the caller can do something with it, like decrement a retry count.\n\nFinally, we ensure that the element is unlocked and ready for the next user before returning the result. Note that we don&#x2019;t care about grabbing the lock here because we&#x2019;re already done with the connection, not claiming it, and we&#x2019;re not modifying the pool membership.\n
Now let&#x2019;s look at the core of what the pool does, which is to find and allocate elements to users of the pool and automatically release them when they are finished. We called the method that handles this task &#x201C;take&#x201D; and it has a fun alias to the &#x201C;right-shift&#x201D; operator which makes your calls look sort of like Haskell&#x2019;s monadic &#x201C;bind&#x201D; operator.\n\nAgain, I&#x2019;ve simplified the code here so it&#x2019;s easier to understand. First we set up a result variable to receive the return value of the passed block, which we will return at the end of the method. We also set up an element variable to receive a connection that we grab from the pool. \n\nInside the begin/rescue/end section we obtain a lock on the elements in the pool. Once we have the lock, we find an element that is not locked. If there are none available, we create a new one and add it to the pool. Finally, we lock the element so that no other sections of code -- even in the same Thread -- will try to use it until we&#x2019;re done. Then we exit the critical section. Notice how we&#x2019;ve kept the critical section quite small - this is important if you want your code to move forward reliably. Only lock when necessary! Now we actually get to the real work - we yield the connection to the block, which can do whatever it needs to service the request.\n\nThe rescue clause here lets us raise exceptions when we find that the connection is bad (disconnected or otherwise) and remove that element from the pool, then re-raise the exception so that the caller can do something with it, like decrement a retry count.\n\nFinally, we ensure that the element is unlocked and ready for the next user before returning the result. Note that we don&#x2019;t care about grabbing the lock here because we&#x2019;re already done with the connection, not claiming it, and we&#x2019;re not modifying the pool membership.\n
Now let&#x2019;s look at the core of what the pool does, which is to find and allocate elements to users of the pool and automatically release them when they are finished. We called the method that handles this task &#x201C;take&#x201D; and it has a fun alias to the &#x201C;right-shift&#x201D; operator which makes your calls look sort of like Haskell&#x2019;s monadic &#x201C;bind&#x201D; operator.\n\nAgain, I&#x2019;ve simplified the code here so it&#x2019;s easier to understand. First we set up a result variable to receive the return value of the passed block, which we will return at the end of the method. We also set up an element variable to receive a connection that we grab from the pool. \n\nInside the begin/rescue/end section we obtain a lock on the elements in the pool. Once we have the lock, we find an element that is not locked. If there are none available, we create a new one and add it to the pool. Finally, we lock the element so that no other sections of code -- even in the same Thread -- will try to use it until we&#x2019;re done. Then we exit the critical section. Notice how we&#x2019;ve kept the critical section quite small - this is important if you want your code to move forward reliably. Only lock when necessary! Now we actually get to the real work - we yield the connection to the block, which can do whatever it needs to service the request.\n\nThe rescue clause here lets us raise exceptions when we find that the connection is bad (disconnected or otherwise) and remove that element from the pool, then re-raise the exception so that the caller can do something with it, like decrement a retry count.\n\nFinally, we ensure that the element is unlocked and ready for the next user before returning the result. Note that we don&#x2019;t care about grabbing the lock here because we&#x2019;re already done with the connection, not claiming it, and we&#x2019;re not modifying the pool membership.\n
Now let&#x2019;s look at the core of what the pool does, which is to find and allocate elements to users of the pool and automatically release them when they are finished. We called the method that handles this task &#x201C;take&#x201D; and it has a fun alias to the &#x201C;right-shift&#x201D; operator which makes your calls look sort of like Haskell&#x2019;s monadic &#x201C;bind&#x201D; operator.\n\nAgain, I&#x2019;ve simplified the code here so it&#x2019;s easier to understand. First we set up a result variable to receive the return value of the passed block, which we will return at the end of the method. We also set up an element variable to receive a connection that we grab from the pool. \n\nInside the begin/rescue/end section we obtain a lock on the elements in the pool. Once we have the lock, we find an element that is not locked. If there are none available, we create a new one and add it to the pool. Finally, we lock the element so that no other sections of code -- even in the same Thread -- will try to use it until we&#x2019;re done. Then we exit the critical section. Notice how we&#x2019;ve kept the critical section quite small - this is important if you want your code to move forward reliably. Only lock when necessary! Now we actually get to the real work - we yield the connection to the block, which can do whatever it needs to service the request.\n\nThe rescue clause here lets us raise exceptions when we find that the connection is bad (disconnected or otherwise) and remove that element from the pool, then re-raise the exception so that the caller can do something with it, like decrement a retry count.\n\nFinally, we ensure that the element is unlocked and ready for the next user before returning the result. Note that we don&#x2019;t care about grabbing the lock here because we&#x2019;re already done with the connection, not claiming it, and we&#x2019;re not modifying the pool membership.\n
Now let&#x2019;s look at the core of what the pool does, which is to find and allocate elements to users of the pool and automatically release them when they are finished. We called the method that handles this task &#x201C;take&#x201D; and it has a fun alias to the &#x201C;right-shift&#x201D; operator which makes your calls look sort of like Haskell&#x2019;s monadic &#x201C;bind&#x201D; operator.\n\nAgain, I&#x2019;ve simplified the code here so it&#x2019;s easier to understand. First we set up a result variable to receive the return value of the passed block, which we will return at the end of the method. We also set up an element variable to receive a connection that we grab from the pool. \n\nInside the begin/rescue/end section we obtain a lock on the elements in the pool. Once we have the lock, we find an element that is not locked. If there are none available, we create a new one and add it to the pool. Finally, we lock the element so that no other sections of code -- even in the same Thread -- will try to use it until we&#x2019;re done. Then we exit the critical section. Notice how we&#x2019;ve kept the critical section quite small - this is important if you want your code to move forward reliably. Only lock when necessary! Now we actually get to the real work - we yield the connection to the block, which can do whatever it needs to service the request.\n\nThe rescue clause here lets us raise exceptions when we find that the connection is bad (disconnected or otherwise) and remove that element from the pool, then re-raise the exception so that the caller can do something with it, like decrement a retry count.\n\nFinally, we ensure that the element is unlocked and ready for the next user before returning the result. Note that we don&#x2019;t care about grabbing the lock here because we&#x2019;re already done with the connection, not claiming it, and we&#x2019;re not modifying the pool membership.\n
Now let&#x2019;s look at the core of what the pool does, which is to find and allocate elements to users of the pool and automatically release them when they are finished. We called the method that handles this task &#x201C;take&#x201D; and it has a fun alias to the &#x201C;right-shift&#x201D; operator which makes your calls look sort of like Haskell&#x2019;s monadic &#x201C;bind&#x201D; operator.\n\nAgain, I&#x2019;ve simplified the code here so it&#x2019;s easier to understand. First we set up a result variable to receive the return value of the passed block, which we will return at the end of the method. We also set up an element variable to receive a connection that we grab from the pool. \n\nInside the begin/rescue/end section we obtain a lock on the elements in the pool. Once we have the lock, we find an element that is not locked. If there are none available, we create a new one and add it to the pool. Finally, we lock the element so that no other sections of code -- even in the same Thread -- will try to use it until we&#x2019;re done. Then we exit the critical section. Notice how we&#x2019;ve kept the critical section quite small - this is important if you want your code to move forward reliably. Only lock when necessary! Now we actually get to the real work - we yield the connection to the block, which can do whatever it needs to service the request.\n\nThe rescue clause here lets us raise exceptions when we find that the connection is bad (disconnected or otherwise) and remove that element from the pool, then re-raise the exception so that the caller can do something with it, like decrement a retry count.\n\nFinally, we ensure that the element is unlocked and ready for the next user before returning the result. Note that we don&#x2019;t care about grabbing the lock here because we&#x2019;re already done with the connection, not claiming it, and we&#x2019;re not modifying the pool membership.\n
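The description above can be sketched roughly as follows. This is my own simplified reconstruction, not the real implementation: the names Pool and PoolElement are stand-ins, and per-host selection, the close hook, and iteration support are all omitted.

```ruby
# Hypothetical pool element: wraps a connection plus a per-element lock flag.
class PoolElement
  attr_reader :conn

  def initialize(conn)
    @conn = conn
    @locked = false
  end

  def locked?; @locked; end
  def lock!;   @locked = true; end
  def unlock!; @locked = false; end
end

class Pool
  def initialize(&open)
    @open = open          # block that opens a new connection
    @lock = Mutex.new     # guards pool membership and element locks
    @elements = []
  end

  # Find (or create) an unlocked element, lock it, yield its connection,
  # and always unlock it afterward. Aliased to >> for a bind-like feel.
  def take
    result = nil
    element = nil
    begin
      @lock.synchronize do               # keep the critical section small
        element = @elements.find { |e| !e.locked? }
        unless element
          element = PoolElement.new(@open.call)
          @elements << element
        end
        element.lock!
      end
      result = yield element.conn        # real work happens outside the lock
    rescue
      # Treat a failing block as a bad connection: evict it, then
      # re-raise so the caller can react (e.g. decrement a retry count).
      @lock.synchronize { @elements.delete(element) }
      raise
    ensure
      element.unlock! if element         # no pool lock needed: we still own it
    end
    result
  end
  alias_method :>>, :take
end
```

Note that the unlock in the ensure clause touches only the element we already own, so it deliberately skips the pool mutex, exactly as described above.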
Here's an example of how you might use the pool to perform a request. Notice how we stream the list-keys operation through the block, taking another connection to fetch the values pointed to by the keys we have received.

But we don't need to expose the pool in the top-level interface; that's an implementation detail.
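The nested-take pattern just described looks roughly like this. The client API (a streaming list_keys and a fetch) and the stand-in pool are my own stubs for illustration, not the real library; the point is that the inner take grabs a second connection while the first is still busy streaming.

```ruby
# Stub connection standing in for a real streaming client connection.
class FakeConn
  def list_keys(bucket)
    yield %w[k1 k2]                  # stream a batch of keys to the block
  end

  def fetch(bucket, key)
    "value-of-#{key}"
  end
end

# Minimal stand-in pool with the same take contract as described earlier.
class TinyPool
  def initialize
    @idle = []
  end

  def take
    conn = @idle.pop || FakeConn.new # busy conns aren't in @idle, so the
    begin                            # nested take below gets a second one
      yield conn
    ensure
      @idle.push(conn)
    end
  end
end

pool = TinyPool.new
results = []
pool.take do |conn|
  conn.list_keys("bucket") do |keys|
    keys.each do |key|
      pool.take { |conn2| results << conn2.fetch("bucket", key) }
    end
  end
end
# results == ["value-of-k1", "value-of-k2"]
```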
Now we've achieved the same level of abstraction we originally aimed for, but we are thread-safe and our connections are protected from corruption.
There are some things I left out of the presented code which deserve mentioning:

* The pool creator specifies "open" and "close" callables that are used to allocate and deallocate elements in the pool.
* When you call take, you can also provide a filter which lets you choose whether a given element is valid. This lets us do error-sensitive host selection, so we retry requests on a different host.
* You can iterate over the pool elements in a thread-safe way to do things like modify all of them or clear the pool. This uses a separate lock for iteration and a condition variable to detect when elements are released. You can also just get the size of the pool, which is useful for metrics.
* To make this solution actually work, we had to monkey-patch (duck-punch) the HTTP libraries so they don't handle the keep-alive bits themselves.
* There's also some anecdotal evidence that the pool gives a performance improvement, but I chalk that up to the fact that it is actually thread-safe, letting you take advantage of IO concurrency, and that it spreads load via the multi-host feature.
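The filtered take mentioned above might look something like this. The FilterPool class and its seed argument are my own illustration, with locking omitted for brevity; the real pool combines the filter with the mutex and condition-variable machinery described earlier.

```ruby
# Sketch of a pool whose take accepts a validity filter. Elements the
# filter rejects stay idle, and a fresh one is opened when none match --
# which is how error-sensitive host selection can skip a failing host.
class FilterPool
  def initialize(seed = [], &open)
    @open = open     # block that opens a new connection
    @idle = seed     # pre-seeded idle "connections" (for the demo)
  end

  # Yield a connection that passes the optional filter predicate.
  def take(filter = nil)
    conn = @idle.find { |c| filter.nil? || filter.call(c) }
    if conn
      @idle.delete(conn)
    else
      conn = @open.call
    end
    begin
      yield conn
    ensure
      @idle << conn  # release back to the idle set
    end
  end
end

# "Connections" are plain host-name strings for this demo.
pool = FilterPool.new(["bad-host", "good-host"]) { "fresh-host" }
pool.take(->(host) { host != "bad-host" }) do |host|
  # host == "good-host": the filter skipped the host we marked bad
end
```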
So how can we, as a community that develops awesome Ruby libraries, protect against accidental concurrency?
First, we need to be careful about leaky scope. Throwing an exception when something is corrupted is not sufficient; we need to make sure precious and sensitive resources aren't leaked outside their intended scope. Both thread-locals and ivars can leak if we're not careful.

If your code yields to user-supplied blocks, make sure the things you created in the yielding code path are unreachable from the block unless explicitly passed. If you do leak scope, protect your valuables with things like thread-safe resource pools.
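Here is a toy demonstration of the leak, under assumed names: Handle stands in for something like a Curl::Easy, and LeakyClient for a library that memoizes it in an ivar. Because the cached handle is reachable through the client object, an innocent-looking nested call inside the yielded block re-enters it mid-request.

```ruby
# Stand-in for a handle that cannot be used re-entrantly (like a libcurl easy handle).
class Handle
  def initialize
    @busy = false
  end

  def request
    raise "handle re-entered mid-request!" if @busy
    @busy = true
    begin
      yield            # user code runs while the handle is busy
    ensure
      @busy = false
    end
  end
end

class LeakyClient
  def handle
    @handle ||= Handle.new   # memoized in an ivar: shared across calls
  end

  def stream
    handle.request { yield "chunk" }
  end
end

client = LeakyClient.new
begin
  client.stream do |chunk|
    client.stream { |c| }    # nested call reuses the busy handle...
  end
rescue RuntimeError => e
  puts e.message             # → handle re-entered mid-request!
end
```

No Threads or Fibers anywhere, yet the handle is "shared" with concurrent-looking consequences: that is accidental concurrency in a sequential program.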
Second, we can be more explicit about the behavior our code exhibits around cached resources. Does it only work in certain circumstances? Document that. Don't just assume everyone is going to want ONE connection per host and port.

To fix the problems I had with one HTTP library, I had to do a deep monkey-patch on a private method, which both feels wrong and potentially breaks any other code that uses the library. Instead, the library could either make the caching transparent AND SAFE for the user, or let the user inject the desired caching behavior (which might be none at all).
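The injection idea could look something like this. Every name here (HttpClient, with_connection, the strategy classes) is hypothetical, a sketch of the design rather than any real library's API: the library delegates connection handling to an injected strategy, so callers choose pooling, one cached connection, or no caching at all.

```ruby
# Library code: takes any object responding to #with_connection.
class HttpClient
  def initialize(strategy)
    @strategy = strategy             # injected caching/pooling behavior
  end

  def get(path)
    @strategy.with_connection { |conn| conn.call(path) }
  end
end

# One strategy: no caching, a fresh connection every request.
class FreshEachTime
  def initialize(&open)
    @open = open
  end

  def with_connection
    yield @open.call
  end
end

# Another: a single memoized connection -- the behavior many libraries
# hard-code today, but here it is the caller's explicit choice.
class SingleCached
  def initialize(&open)
    @open = open
  end

  def with_connection
    @conn ||= @open.call
    yield @conn
  end
end

# "Connections" are lambdas for the demo.
fake_conn = ->(path) { "response for #{path}" }
client = HttpClient.new(SingleCached.new { fake_conn })
client.get("/ping")   # → "response for /ping"
```

A pool like the one from earlier in the talk would simply be a third strategy, and the user could pick it without the library knowing or caring.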
Thanks for listening. I'd be happy to take questions now.