Concurrency in Ruby is all the rage these days, and people can't seem to agree
whether Threads, Fibers, event loops, or actors are the best solution. But did you ever consider that your *sequential* Ruby program might be concurrent, with nary a Thread, Fiber, or callback in sight? Well, it happened to me.
This is the story of how accidental concurrency (also known as re-entrancy) broke my brain multiple times over the course of two years, spawned flamewars on Twitter and long blog posts, and of the various solutions I tried along the way. We'll also illuminate some subtleties of concurrent programming in Ruby, differences between several Ruby implementations, and how we can all write code that is friendlier when accidental concurrency strikes.
15. Streaming Ops

    client                 server
      | --- stream_me --->   |
      | <--- result ------   |
      | <--- result ------   |
      | <--- result ------   |
      | <--- done --------   |

list-keys & MapReduce
16. Stream in Ruby
# Request a streamed operation
client.stream_something do |result|
  process(result)
end
17. Stream in Ruby
# Request a streamed operation
client.stream_something do |result|
  process(result)
end

# Stream via curb
if block_given?
  curl.on_body {|c| yield c; c.size }
else
  curl.on_body # Clear out the callback
end
21. Curling Back
def curl
  @curl ||= Curl::Easy.new
end

def curl
  Thread.current[:curl] ||= Curl::Easy.new
end
22. "The first basic rule is that you must
never simultaneously share a libcurl
handle (be it easy or multi or whatever)
between multiple threads."
-- libcurl docs
23. NO THREADS
"The first basic rule is that you must
never simultaneously share a libcurl
handle (be it easy or multi or whatever)
between multiple threads."
-- libcurl docs
24. Realization
• curl yields to block BEFORE return
• block tries to reuse handle while connection is processing
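The realization above can be reproduced with a toy sketch. Handle and Client here are hypothetical stand-ins (NOT the real curb API): because the callback fires before the request returns, a nested request finds the shared handle still busy.

```ruby
# Toy stand-ins (not the real curb API) for a shared connection handle
# that yields chunks to a block BEFORE its request finishes.
class Handle
  def initialize
    @busy = false
  end

  def perform(chunks)
    raise "handle already in use" if @busy  # roughly libcurl's complaint
    @busy = true
    chunks.each { |c| yield c }             # block runs before perform returns
  ensure
    @busy = false
  end
end

class Client
  def handle
    @handle ||= Handle.new                  # the memoized, shared handle
  end

  def stream(&blk)
    handle.perform(%w[a b c], &blk)
  end

  def get(key)
    handle.perform([key]) { |c| c }
  end
end

client = Client.new
begin
  client.stream { |chunk| client.get(chunk) } # nested call re-enters the handle
rescue RuntimeError => e
  puts e.message                              # handle already in use
end
```

Sequential calls work fine; only the nested call from inside the streaming block blows up, which matches the behavior described above.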
25. Re-entrant
"In computing, a computer program or
subroutine is called reentrant if it can
be interrupted in the middle of its
execution and then safely called again
("re-entered") before its previous
invocation's complete execution."
Wikipedia
26. Re-entrant
"...a subroutine can fail to be reentrant
if it relies on a global variable to remain
unchanged but that variable is modified
when the subroutine is recursively
invoked."
Wikipedia
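The Wikipedia failure mode above fits in a few lines of Ruby. This is a minimal illustration (all names here are made up): the subroutine keeps its working state in a global, so a nested call clobbers the outer call's partial result.

```ruby
# Minimal non-reentrant subroutine: it keeps its working state in a
# global, so a nested ("re-entered") call corrupts the outer call.
$buffer = nil

def collect(words, &interrupt)
  $buffer = []                  # global scratch state -- the culprit
  words.each do |w|
    $buffer << w.upcase
    interrupt&.call             # simulate being interrupted mid-execution
  end
  $buffer.join(" ")
end

puts collect(%w[hello world])                        # HELLO WORLD
puts collect(%w[hello world]) { collect(%w[oops]) }  # OOPS -- outer result lost
```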
29. Solution #1: Fibers
if block_given?
  _curl = curl
  Fiber.new {
    f = Fiber.current
    _curl.on_body do |chunk|
      f.resume(chunk); chunk.size
    end
    loop do
      yield Fiber.yield
    end
  }.resume
else
  # ...
30-35. Solution #1: Fibers (annotated)
if block_given?
  _curl = curl                      # local curl
  Fiber.new {                       # open fiber
    f = Fiber.current               # this fiber
    _curl.on_body do |chunk|
      f.resume(chunk); chunk.size   # resume on chunk
    end
    loop do                         # wait for chunk,
      yield Fiber.yield             #   yield chunk,
    end                             #   repeat
  }.resume                          # start the fiber/loop
else
  # ...
36. How it Works
• Stream block runs INSIDE the Fiber
• Fibers have Thread locals
• Thread.current[:curl] is isolated
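The middle bullet is the key fact, and it's easy to demonstrate: Thread#[] storage is fiber-local in Ruby 1.9+, so a block running inside a fresh Fiber cannot see (or clobber) the outer request's Thread.current[:curl].

```ruby
# Thread#[] storage is fiber-local in Ruby 1.9+.
Thread.current[:curl] = "outer handle"

inner = Fiber.new do
  seen = Thread.current[:curl]       # nil: a new fiber starts with empty locals
  Thread.current[:curl] = "inner handle"
  seen
end

puts inner.resume.inspect            # nil
puts Thread.current[:curl]           # outer handle -- untouched
```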
39. class Pump
  def initialize(block)
    @fiber = Fiber.new do
      loop do
        block.call Fiber.yield
      end
    end
    @fiber.resume
  end

  def pump(input)
    @fiber.resume input
    input.size if input.respond_to?(:size)
  end

  def to_proc
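The slide cuts off at to_proc. Here is a plausible completion plus a usage sketch of the Pump in isolation; the to_proc body is my assumption, not the slide's.

```ruby
class Pump
  def initialize(block)
    @fiber = Fiber.new do
      loop do
        block.call Fiber.yield   # park until pumped, then feed the block
      end
    end
    @fiber.resume                # start the loop; it parks at Fiber.yield
  end

  def pump(input)
    @fiber.resume input
    input.size if input.respond_to?(:size)
  end

  def to_proc                    # assumed: lets the Pump stand in for a callback
    method(:pump).to_proc
  end
end

chunks = []
pump = Pump.new(->(c) { chunks << c })
pump.pump("abc")                 # => 3 (the byte count curb's on_body wants back)
pump.pump("de")
p chunks                         # ["abc", "de"]
```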
45. class Pool::Element
  attr_accessor :object, :owner

  def initialize(object)
    @object = object
    @owner = nil
  end

  def lock
    @owner = Thread.current
  end

  def unlock
    @owner = nil
  end
end
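The take method on the next slide calls Element#unlocked?, which this slide elides; presumably it just checks for a nil owner. A standalone sketch (the class is un-namespaced here for brevity, and unlocked? is my assumed implementation):

```ruby
class Element
  attr_accessor :object, :owner

  def initialize(object)
    @object = object
    @owner = nil
  end

  def unlocked?        # assumed: no owner means available
    @owner.nil?
  end

  def lock
    @owner = Thread.current
  end

  def unlock
    @owner = nil
  end
end

e = Element.new(:conn)
p e.unlocked?   # true
e.lock
p e.unlocked?   # false
```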
46. def take(opts = {}) # alias >> take
  result = nil
  begin
    element = nil
    @lock.synchronize do
      element = pool.find {|e| e.unlocked? }
      unless element
        resource = opts[:default] || @open.call
        element = Element.new(resource)
        @elements << element
      end
      element.lock
    end
    result = yield element.object
  rescue BadResource
    delete_element(element) and raise
  ensure
    element.unlock if element
  end
  result
end
47-56. def take(opts = {}) # alias >> take (annotated)
  result = nil                               # result var
  begin
    element = nil                            # pool element
    @lock.synchronize do                     # grab pool lock
      element = pool.find {|e| e.unlocked? } # find a connection
      unless element                         # all claimed!
        resource = opts[:default] || @open.call
        element = Element.new(resource)
        @elements << element                 # add new conn to pool
      end
      element.lock                           # claim it
    end
    result = yield element.object            # call block
  rescue BadResource
    delete_element(element) and raise        # cleanup baddies
  ensure
    element.unlock if element                # release
  end
  result
end
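A trimmed, self-contained version of the pool makes the payoff runnable. TinyPool and its names are mine, and the BadResource and :default handling are omitted; what it shows is that a nested take from inside the block claims a DIFFERENT connection instead of re-entering the one in use.

```ruby
class TinyPool
  class Element
    attr_reader :object
    def initialize(object)
      @object = object
      @owner = nil
    end
    def unlocked?; @owner.nil?; end
    def lock;      @owner = Thread.current; end
    def unlock;    @owner = nil; end
  end

  def initialize(&open)
    @open = open            # how to create a new resource
    @lock = Mutex.new
    @elements = []
  end

  def take
    element = nil
    @lock.synchronize do
      element = @elements.find { |e| e.unlocked? }
      unless element        # all claimed (or empty): open a new one
        element = Element.new(@open.call)
        @elements << element
      end
      element.lock
    end
    yield element.object
  ensure
    element.unlock if element
  end
end

conns = 0
pool = TinyPool.new { conns += 1 }   # each "connection" is just a counter value
pair = pool.take do |outer|
  pool.take { |inner| [outer, inner] }
end
p pair    # [1, 2] -- the nested take opened its own connection
```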
57. # If you actually do this request, you'll get a
# warning. So don't.
objects = []
pool.take do |conn|
  conn.list_keys("ruby") do |keys|
    keys.each do |k|
      pool.>> {|c| objects << c.get("ruby", k) }
    end
  end
end
58. # If you actually do this request, you'll get a
# warning. So don't.
objects = []
client.list_keys("ruby") do |keys|
  keys.each do |k|
    objects << client.get("ruby", k)
  end
end
My name is Sean Cribbs; you can find me on Twitter @seancribbs. Today I'm going to tell you a story about how accidental concurrency happened to me, and how you can prepare for it.
...or alternatively titled "ACCIDENTAL? FFFFFFFUUUUUUUUUUUU"

Fair warning: I'm including a bunch of memes in this presentation to lighten the mood. If you don't get them, I'm sorry. Just nod and smile and humor the crazy man.

But first, a little more about my background and how I came to this problem.
I work for Basho Technologies, a company originally based in Cambridge, Massachusetts, USA, but now geographically distributed. We make Riak, the awesome distributed, fault-tolerant database. Riak is used at Kiip (where Mitchell works) as well as at other startups like Yammer, Voxer, Bump, and GitHub, but also Fortune 500 companies like Comcast (an American telco/cable provider), America Online (YES AOL), and Citigroup. Of course, there are many others we don't even know about, because Riak is free and open-source.
At Basho we like to think of ourselves as experts on distributed systems. A number of our founding engineers built Akamai, the content distribution network, in the late '90s -- a sort of cloud before we called them that. Most of our stuff is in Erlang, where a lot of concurrency problems are solved for us, or with minimal effort.
But this talk is about concurrency in Ruby, not Erlang. What kinds of tools do we have for concurrency in Ruby? We're all pretty familiar with doing concurrency with multiple processes, where we simply start new instances of our program -- we've been doing that for a long time. We're also familiar with threads, which until 1.9 weren't really as useful as they could have been. To solve some of the problems with threads before 1.9, people created reactor libraries like EventMachine, mostly to handle IO-heavy work (not CPU-heavy), but they tend to require a different programming model, with callbacks and such. More recently, we got Fibers, which let you have multiple streams of work that cooperate more explicitly, and Actor libraries like Celluloid that make your objects feel like isolated processes.

But there's another type of "concurrency" (emphasize the quotes) that can happen to sequential Ruby code! If you're like I was when I first ran into this, you're probably saying "WAT?!?!"
This case of accidental concurrency struck while I was implementing, for the Ruby client, several streaming request/response operations that Riak supports. Let's look at how streaming operations work.
To perform a streaming operation on Riak, the client sends the request, and then the server sends partial results to the client in multiple packets. The client waits around, accepting the partial results, until the server signals that it's done.

Streaming operations are great for large responses because both the client and the server can do things more efficiently -- the client can proactively process intermediate results from the server, and the server can reduce the amount of data it has to buffer by sending portions of the result immediately. Riak supports two such operations, list-keys and MapReduce. In HTTP, this is implemented as a chunked response, whereas the Protocol Buffers-based binary protocol simply sends multiple message frames, followed by a final "done" frame.

If you look at the sequence of events, it looks a lot like an iteration over the response chunks. So that's how I implemented the interface, as if you were calling #each on an Enumerable.
Here's the basic format of a streaming operation in the Riak Ruby client. You call the streaming operation, passing a block that will receive intermediate results and do something with them.

Now, initially I was using the "curb" library for HTTP, which provides direct bindings to libcurl. You can give a Ruby block to curb that will be called for each chunk of the response. This made a natural transition to streaming the results through the user's code block, with a little wrapping. Here's roughly what that code looked like.
That's awesome, let's STREAM ALL THE THINGS!
...which works fine until -- for example, when listing keys -- you want to make ADDITIONAL requests to the server from within the streaming block. You get an exception! What happened?
As it turns out, that "curl" thing was an instance variable containing a Curl::Easy connection handle, initialized something like this. The idea was that when you made multiple requests in a row, you could keep the same connection around and avoid setup/teardown costs. Well, it didn't work in the streaming case.

So I thought to make it a Thread-local variable instead, like so. NOPE! Still broken! Why was it being corrupted?
Here's what the libcurl docs say about the problem. (read excerpt) I read this and thought, WAT? I'm not using any threads!
I didn't really understand the problem until I realized that libcurl yields to the block BEFORE the request finishes and returns. That means that, through the lexically bound scope of the block, it has access to the ORIGINAL connection from which it initiated the first request, and so it tries to reuse the handle CONCURRENTLY (that is, recursively) while the original request is still being processed. Whoops!
This informative (but ugly) sequence diagram shows what I just mentioned, the key parts being in purple and red, and the huge leftward line before those parts. If you really want to read it later, you can find it on my blog, which I'll link to later.

But there's another term for this problem of "accidental concurrency": the code is not re-entrant.
Re-entrancy is a classic CS problem, defined here by Wikipedia: (read snippet)
Furthermore: (read snippet) Essentially, I was attempting to reuse the "curl" instance variable or Thread-local WHILE it was already in use and unavailable. I was unintentionally allowing recursion via the passed block and not protecting that curl handle from improper accesses. It was effectively a global variable. I was disappoint.

So now what do I do? I wanted the solution to be simple to implement and maintain, but also look synchronous to the user.
I had two goals for the solution:

- keep the call looking like a synchronous iteration, with no extra concurrency work required of the caller
- make the solution as simple as possible
The first solution I came up with -- this was way back in 2010 -- was to use Fibers to isolate the local connection state. I don't remember where I found this solution, but it might have been Stack Overflow or Dave Thomas' blog posts on the subject.
Here's what the solution looks like. It's a little hard to follow, so I'll walk you through it. For now we'll only focus on the streaming case, since the sequential case is the same as before.

1) First we get a local copy of the currently unsullied curl handle.
2) Then we open a new fiber - remember, fibers are cooperative and must be explicitly started/stopped. This block won't execute until we call resume further down.
3) Inside the block, we make a local copy of the current fiber so we can refer to it in the callback.
4) Now we apply the same on_body callback as before, but instead of yielding to the passed block, we resume the fiber. You'll see why in a second.
5) Then we start an infinite loop that immediately yields control of the fiber (halting to wait for a message chunk). When it gets that chunk, it passes it to the original block via yield, then starts all over again.
6) Finally, we start the fiber using "resume" so that the infinite loop can begin.

That's great, but why does this work?
Here&#x2019;s what the solution looks like. It&#x2019;s a little hard to follow so I&#x2019;ll walk you through it. For now we&#x2019;ll only focus on the streaming case since the sequential case is still the same as before.\n\n1) First we get a local copy of the currently unsullied curl handle.\n2) Then we open a new fiber - remember fibers are cooperative and must be explicitly started/stopped. This block won&#x2019;t execute until we call resume later down.\n3) Inside the block, we make a local copy of the current fiber so we can refer to it in the callback.\n4) Now we apply the same on_body callback as before, but instead of yielding to the passed block, we resume the fiber. You&#x2019;ll see why in a second.\n5) Then we start an infinite loop that immediately yields control of the fiber (halts waiting for a message chunk). When it gets that chunk, it passes it to the original block via yield, then starts all over again.\n6) Finally, we start the fiber using &#x201C;resume&#x201D; so that the infinite loop can begin.\n\nThat&#x2019;s great, but why does this work?\n
Here&#x2019;s what the solution looks like. It&#x2019;s a little hard to follow so I&#x2019;ll walk you through it. For now we&#x2019;ll only focus on the streaming case since the sequential case is still the same as before.\n\n1) First we get a local copy of the currently unsullied curl handle.\n2) Then we open a new fiber - remember fibers are cooperative and must be explicitly started/stopped. This block won&#x2019;t execute until we call resume later down.\n3) Inside the block, we make a local copy of the current fiber so we can refer to it in the callback.\n4) Now we apply the same on_body callback as before, but instead of yielding to the passed block, we resume the fiber. You&#x2019;ll see why in a second.\n5) Then we start an infinite loop that immediately yields control of the fiber (halts waiting for a message chunk). When it gets that chunk, it passes it to the original block via yield, then starts all over again.\n6) Finally, we start the fiber using &#x201C;resume&#x201D; so that the infinite loop can begin.\n\nThat&#x2019;s great, but why does this work?\n
Here&#x2019;s what the solution looks like. It&#x2019;s a little hard to follow so I&#x2019;ll walk you through it. For now we&#x2019;ll only focus on the streaming case since the sequential case is still the same as before.\n\n1) First we get a local copy of the currently unsullied curl handle.\n2) Then we open a new fiber - remember fibers are cooperative and must be explicitly started/stopped. This block won&#x2019;t execute until we call resume later down.\n3) Inside the block, we make a local copy of the current fiber so we can refer to it in the callback.\n4) Now we apply the same on_body callback as before, but instead of yielding to the passed block, we resume the fiber. You&#x2019;ll see why in a second.\n5) Then we start an infinite loop that immediately yields control of the fiber (halts waiting for a message chunk). When it gets that chunk, it passes it to the original block via yield, then starts all over again.\n6) Finally, we start the fiber using &#x201C;resume&#x201D; so that the infinite loop can begin.\n\nThat&#x2019;s great, but why does this work?\n
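That walkthrough might be sketched roughly like this. This is a reconstruction from the description, not the actual library code: FakeHandle stands in for the Curb handle, and its on_body/deliver methods are stand-ins for the real streaming machinery (in the real code, the handle would be the thread-local curl handle copied into a local variable).

```ruby
require 'fiber'  # for Fiber.current on older Rubies

# FakeHandle simulates the Curb handle: it stores an on_body callback
# and fires it once per arriving chunk.
class FakeHandle
  def on_body(&callback)
    @callback = callback
  end

  def deliver(chunk)  # simulates a chunk arriving from the network
    @callback.call(chunk)
  end
end

def stream(handle, &block)
  fiber = Fiber.new do
    f = Fiber.current
    # 4) resume the fiber with each chunk instead of yielding directly
    handle.on_body { |chunk| f.resume(chunk); chunk.size }
    # 5) park waiting for a chunk, hand it to the original block, repeat
    loop { block.call(Fiber.yield) }
  end
  fiber.resume  # 6) start the fiber so the loop parks at Fiber.yield
end

chunks = []
handle = FakeHandle.new
stream(handle) { |c| chunks << c }
handle.deliver("a")
handle.deliver("b")
# chunks is now ["a", "b"]
```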
For 1.8 and JRuby I used @tmm1’s “Poor Man’s Fibers”, which is essentially a thread with two queues. The passed streaming block runs inside the context of the created Fiber.

Matz could correct me on this, but I think the fiber-locality of thread-local variables is simply an artifact of the Fiber design and not an explicit feature. Rubinius, for example, didn’t have it last I checked.

Anyway, as a result, the Thread-local version of the handle is isolated while streaming, and things work out peachy.
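A “poor man’s fiber” along those lines can be sketched as a worker thread synchronized by two queues shuttling values in and out. This is a toy reconstruction of the idea, not @tmm1’s actual shim:

```ruby
# Toy "poor man's fiber": a thread parked on an inbox queue, with an
# outbox queue carrying yielded values back to the caller.
class PoorMansFiber
  def initialize(&block)
    @in  = Queue.new
    @out = Queue.new
    @thread = Thread.new do
      @in.pop                       # park until the first resume
      @out.push(block.call(self))   # run the body; its return is the final value
    end
  end

  # Called from outside: wake the "fiber" and wait for its next yield
  def resume(value = nil)
    @in.push(value)
    @out.pop
  end

  # Called from inside the block: hand a value out and park again
  def yield(value = nil)
    @out.push(value)
    @in.pop
  end
end

f = PoorMansFiber.new { |fib| fib.yield(:first); :done }
f.resume  # => :first
f.resume  # => :done
```

Because the body runs on its own thread, any Thread-local state it touches is automatically isolated from the caller’s thread, which is exactly the property that made the curl handle safe.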
With this, I thought: THIS IS AWESOME, FIBERS NAILED IT. For a long time, this remained the solution. Although it met the first goal of appearing synchronous, it wasn’t the simplest thing to understand or maintain.
Almost a year later, I wanted to add new HTTP libraries behind the Riak client so that people could use it more effectively. I wanted the same benefits Curb gave in terms of long-lived connections, but some quirks in its implementation made it work poorly on certain types of operations.

Incidentally, the new HTTP library I picked ALSO used a Thread-local variable for keep-alive connections. I had the same problem all over again, so this time I wanted to generalize the solution.
The idea was to extract the fibery isolation bits from the HTTP-library-specific bits. What I came up with looks strikingly like the event loop in a Reactor library like EventMachine, so I called it the Pump. This is the entire implementation.

Notice how we create and start the Fiber just as we did before, using yield and resume with the infinite loop.

The “pump” method is the inside of the “on_body” callback block from before: it receives a chunk and resumes the fiber with it.

Finally, the to_proc method makes it easier to treat a Pump object like a block, by binding the pump method and returning it as a proc.
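From that description, the Pump might be sketched like this (a reconstruction, not the exact riak-client source):

```ruby
# Pump: hides the fiber dance behind a small, block-like object.
class Pump
  def initialize(&block)
    @fiber = Fiber.new do
      # Park waiting for a chunk, hand it to the block, repeat
      loop { block.call(Fiber.yield) }
    end
    @fiber.resume  # start the fiber so it parks at the first Fiber.yield
  end

  # The body of the old on_body callback: feed one chunk to the fiber
  def pump(chunk)
    @fiber.resume(chunk)
  end

  # Lets a Pump stand in for a block, e.g. handle.on_body(&pump)
  def to_proc
    method(:pump).to_proc
  end
end
```

With this, the streaming callback setup collapses to something like `curl.on_body(&Pump.new(&block))`, give or take returning the chunk size that Curb expects.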
That means our previous ugly Curb-related code looks like this instead! Wow, that’s a lot easier to follow. It also paved the way for me to do similar things with Riak’s binary protocol, which I was just beginning to implement and where I had to manage sockets myself.

This solution pretty much met both goals: hiding concurrency away from the user and having a simple implementation. Still, it doesn’t appear that simple or clear unless you understand where it came from.
At this point I was proud of my achievement and started to talk about it on Twitter. I drew a lot of confusion and ire from Evan Phoenix and Yehuda Katz, respectively. Unfortunately, thanks to the expiration policy for tweets, I was unable to exhume the conversation before this talk. In the course of trying to explain myself, I ended up writing a blog post which is the source of most of this talk. You can find it at the second link above, or go to seancribbs.com/tech and open the archive for March 2011.

Evan didn’t understand what I was trying to solve -- “what is this code good for?” he said -- and, as is often the case, Yehuda was right. The real solution to the concurrency/re-entrancy problem, he explained, is to implement a proper connection pool that protects connections in use and creates new ones as needed.
On the other hand, my coworker Andy is also right. <read quote, emphasis mine> Making a connection pool is hard, and understanding all the edge cases of concurrent access, locks, and condition variables is difficult, so I procrastinated on the issue.
Because I procrastinated, I was able to pawn the work off onto one of my awesome committers, Kyle Kingsbury.
This is Kyle. He works at Boundary. He is awesome. Give him a high-five on Twitter @aphyr.

He and I discussed the code a lot, but most of the work in the current version is his. In addition to creating a proper connection pool, he also wrote a bunch of neat logic to load-balance connections across Riak hosts in the cluster, with error-sensitive host selection and retries. But let’s look at how just the pool bit works.
We’ll start with the smallest part: the individual items in the pool, which we call elements. They are essentially a wrapper around the thing we want to allocate - that is, connections. To claim or lock an element, the “owner” instance variable is set to the current thread; to unlock it, the owner is set back to nil. There are also query methods to find out whether an element is locked, which simply test whether the owner is nil, but I’ve omitted those here.
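An Element along those lines might look like this. This is a sketch based on the description above, with the omitted query methods included for completeness:

```ruby
# Wrapper around a pooled object (a connection). A non-nil owner
# marks the element as claimed.
class Element
  attr_reader :object

  def initialize(object)
    @object = object
    @owner  = nil
  end

  # Claim the element for the current thread
  def lock
    @owner = Thread.current
  end

  # Release the element for the next user
  def unlock
    @owner = nil
  end

  def locked?
    !@owner.nil?
  end

  def unlocked?
    @owner.nil?
  end
end
```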
Now let’s look at the core of what the pool does: finding and allocating elements to users of the pool and automatically releasing them when they’re finished. We called the method that handles this task “take”, and it has a fun alias to the “right-shift” operator, which makes your calls look sort of like Haskell’s monadic “bind” operator.

Again, I’ve simplified the code here so it’s easier to understand. First we set up a result variable to receive the return value of the passed block, which we will return at the end of the method. We also set up an element variable to receive a connection that we grab from the pool.

Inside the begin/rescue/end section we obtain a lock on the elements in the pool. Once we have the lock, we find an element that is not locked. If none is available, we create a new one and add it to the pool. Finally, we lock the element so that no other section of code -- even in the same Thread -- will try to use it until we’re done. Then we exit the critical section. Notice how we’ve kept the critical section quite small - this is important if you want your code to move forward reliably. Only lock when necessary! Now we get to the real work: we yield the connection to the block, which can do whatever it needs to service the request.

The rescue clause lets us raise exceptions when we find that the connection is bad (disconnected or otherwise), remove that element from the pool, and then re-raise the exception so the caller can do something with it, like decrement a retry count.

Finally, we ensure that the element is unlocked and ready for the next user before returning the result. Note that we don’t bother grabbing the lock here: we’re releasing the connection, not claiming it, and we’re not modifying the pool membership.
Now let&#x2019;s look at the core of what the pool does, which is to find and allocate elements to users of the pool and automatically release them when they are finished. We called the method that handles this task &#x201C;take&#x201D; and it has a fun alias to the &#x201C;right-shift&#x201D; operator which makes your calls look sort of like Haskell&#x2019;s monadic &#x201C;bind&#x201D; operator.\n\nAgain, I&#x2019;ve simplified the code here so it&#x2019;s easier to understand. First we set up a result variable to receive the return value of the passed block, which we will return at the end of the method. We also set up an element variable to receive a connection that we grab from the pool. \n\nInside the begin/rescue/end section we obtain a lock on the elements in the pool. Once we have the lock, we find an element that is not locked. If there are none available, we create a new one and add it to the pool. Finally, we lock the element so that no other sections of code -- even in the same Thread -- will try to use it until we&#x2019;re done. Then we exit the critical section. Notice how we&#x2019;ve kept the critical section quite small - this is important if you want your code to move forward reliably. Only lock when necessary! Now we actually get to the real work - we yield the connection to the block, which can do whatever it needs to service the request.\n\nThe rescue clause here lets us raise exceptions when we find that the connection is bad (disconnected or otherwise) and remove that element from the pool, then re-raise the exception so that the caller can do something with it, like decrement a retry count.\n\nFinally, we ensure that the element is unlocked and ready for the next user before returning the result. Note that we don&#x2019;t care about grabbing the lock here because we&#x2019;re already done with the connection, not claiming it, and we&#x2019;re not modifying the pool membership.\n
Now let&#x2019;s look at the core of what the pool does, which is to find and allocate elements to users of the pool and automatically release them when they are finished. We called the method that handles this task &#x201C;take&#x201D; and it has a fun alias to the &#x201C;right-shift&#x201D; operator which makes your calls look sort of like Haskell&#x2019;s monadic &#x201C;bind&#x201D; operator.\n\nAgain, I&#x2019;ve simplified the code here so it&#x2019;s easier to understand. First we set up a result variable to receive the return value of the passed block, which we will return at the end of the method. We also set up an element variable to receive a connection that we grab from the pool. \n\nInside the begin/rescue/end section we obtain a lock on the elements in the pool. Once we have the lock, we find an element that is not locked. If there are none available, we create a new one and add it to the pool. Finally, we lock the element so that no other sections of code -- even in the same Thread -- will try to use it until we&#x2019;re done. Then we exit the critical section. Notice how we&#x2019;ve kept the critical section quite small - this is important if you want your code to move forward reliably. Only lock when necessary! Now we actually get to the real work - we yield the connection to the block, which can do whatever it needs to service the request.\n\nThe rescue clause here lets us raise exceptions when we find that the connection is bad (disconnected or otherwise) and remove that element from the pool, then re-raise the exception so that the caller can do something with it, like decrement a retry count.\n\nFinally, we ensure that the element is unlocked and ready for the next user before returning the result. Note that we don&#x2019;t care about grabbing the lock here because we&#x2019;re already done with the connection, not claiming it, and we&#x2019;re not modifying the pool membership.\n
Now let&#x2019;s look at the core of what the pool does, which is to find and allocate elements to users of the pool and automatically release them when they are finished. We called the method that handles this task &#x201C;take&#x201D; and it has a fun alias to the &#x201C;right-shift&#x201D; operator which makes your calls look sort of like Haskell&#x2019;s monadic &#x201C;bind&#x201D; operator.\n\nAgain, I&#x2019;ve simplified the code here so it&#x2019;s easier to understand. First we set up a result variable to receive the return value of the passed block, which we will return at the end of the method. We also set up an element variable to receive a connection that we grab from the pool. \n\nInside the begin/rescue/end section we obtain a lock on the elements in the pool. Once we have the lock, we find an element that is not locked. If there are none available, we create a new one and add it to the pool. Finally, we lock the element so that no other sections of code -- even in the same Thread -- will try to use it until we&#x2019;re done. Then we exit the critical section. Notice how we&#x2019;ve kept the critical section quite small - this is important if you want your code to move forward reliably. Only lock when necessary! Now we actually get to the real work - we yield the connection to the block, which can do whatever it needs to service the request.\n\nThe rescue clause here lets us raise exceptions when we find that the connection is bad (disconnected or otherwise) and remove that element from the pool, then re-raise the exception so that the caller can do something with it, like decrement a retry count.\n\nFinally, we ensure that the element is unlocked and ready for the next user before returning the result. Note that we don&#x2019;t care about grabbing the lock here because we&#x2019;re already done with the connection, not claiming it, and we&#x2019;re not modifying the pool membership.\n
Now let&#x2019;s look at the core of what the pool does, which is to find and allocate elements to users of the pool and automatically release them when they are finished. We called the method that handles this task &#x201C;take&#x201D; and it has a fun alias to the &#x201C;right-shift&#x201D; operator which makes your calls look sort of like Haskell&#x2019;s monadic &#x201C;bind&#x201D; operator.\n\nAgain, I&#x2019;ve simplified the code here so it&#x2019;s easier to understand. First we set up a result variable to receive the return value of the passed block, which we will return at the end of the method. We also set up an element variable to receive a connection that we grab from the pool. \n\nInside the begin/rescue/end section we obtain a lock on the elements in the pool. Once we have the lock, we find an element that is not locked. If there are none available, we create a new one and add it to the pool. Finally, we lock the element so that no other sections of code -- even in the same Thread -- will try to use it until we&#x2019;re done. Then we exit the critical section. Notice how we&#x2019;ve kept the critical section quite small - this is important if you want your code to move forward reliably. Only lock when necessary! Now we actually get to the real work - we yield the connection to the block, which can do whatever it needs to service the request.\n\nThe rescue clause here lets us raise exceptions when we find that the connection is bad (disconnected or otherwise) and remove that element from the pool, then re-raise the exception so that the caller can do something with it, like decrement a retry count.\n\nFinally, we ensure that the element is unlocked and ready for the next user before returning the result. Note that we don&#x2019;t care about grabbing the lock here because we&#x2019;re already done with the connection, not claiming it, and we&#x2019;re not modifying the pool membership.\n
Now let&#x2019;s look at the core of what the pool does, which is to find and allocate elements to users of the pool and automatically release them when they are finished. We called the method that handles this task &#x201C;take&#x201D; and it has a fun alias to the &#x201C;right-shift&#x201D; operator which makes your calls look sort of like Haskell&#x2019;s monadic &#x201C;bind&#x201D; operator.\n\nAgain, I&#x2019;ve simplified the code here so it&#x2019;s easier to understand. First we set up a result variable to receive the return value of the passed block, which we will return at the end of the method. We also set up an element variable to receive a connection that we grab from the pool. \n\nInside the begin/rescue/end section we obtain a lock on the elements in the pool. Once we have the lock, we find an element that is not locked. If there are none available, we create a new one and add it to the pool. Finally, we lock the element so that no other sections of code -- even in the same Thread -- will try to use it until we&#x2019;re done. Then we exit the critical section. Notice how we&#x2019;ve kept the critical section quite small - this is important if you want your code to move forward reliably. Only lock when necessary! Now we actually get to the real work - we yield the connection to the block, which can do whatever it needs to service the request.\n\nThe rescue clause here lets us raise exceptions when we find that the connection is bad (disconnected or otherwise) and remove that element from the pool, then re-raise the exception so that the caller can do something with it, like decrement a retry count.\n\nFinally, we ensure that the element is unlocked and ready for the next user before returning the result. Note that we don&#x2019;t care about grabbing the lock here because we&#x2019;re already done with the connection, not claiming it, and we&#x2019;re not modifying the pool membership.\n
Now let&#x2019;s look at the core of what the pool does, which is to find and allocate elements to users of the pool and automatically release them when they are finished. We called the method that handles this task &#x201C;take&#x201D; and it has a fun alias to the &#x201C;right-shift&#x201D; operator which makes your calls look sort of like Haskell&#x2019;s monadic &#x201C;bind&#x201D; operator.\n\nAgain, I&#x2019;ve simplified the code here so it&#x2019;s easier to understand. First we set up a result variable to receive the return value of the passed block, which we will return at the end of the method. We also set up an element variable to receive a connection that we grab from the pool. \n\nInside the begin/rescue/end section we obtain a lock on the elements in the pool. Once we have the lock, we find an element that is not locked. If there are none available, we create a new one and add it to the pool. Finally, we lock the element so that no other sections of code -- even in the same Thread -- will try to use it until we&#x2019;re done. Then we exit the critical section. Notice how we&#x2019;ve kept the critical section quite small - this is important if you want your code to move forward reliably. Only lock when necessary! Now we actually get to the real work - we yield the connection to the block, which can do whatever it needs to service the request.\n\nThe rescue clause here lets us raise exceptions when we find that the connection is bad (disconnected or otherwise) and remove that element from the pool, then re-raise the exception so that the caller can do something with it, like decrement a retry count.\n\nFinally, we ensure that the element is unlocked and ready for the next user before returning the result. Note that we don&#x2019;t care about grabbing the lock here because we&#x2019;re already done with the connection, not claiming it, and we&#x2019;re not modifying the pool membership.\n
Now let&#x2019;s look at the core of what the pool does, which is to find and allocate elements to users of the pool and automatically release them when they are finished. We called the method that handles this task &#x201C;take&#x201D; and it has a fun alias to the &#x201C;right-shift&#x201D; operator which makes your calls look sort of like Haskell&#x2019;s monadic &#x201C;bind&#x201D; operator.\n\nAgain, I&#x2019;ve simplified the code here so it&#x2019;s easier to understand. First we set up a result variable to receive the return value of the passed block, which we will return at the end of the method. We also set up an element variable to receive a connection that we grab from the pool. \n\nInside the begin/rescue/end section we obtain a lock on the elements in the pool. Once we have the lock, we find an element that is not locked. If there are none available, we create a new one and add it to the pool. Finally, we lock the element so that no other sections of code -- even in the same Thread -- will try to use it until we&#x2019;re done. Then we exit the critical section. Notice how we&#x2019;ve kept the critical section quite small - this is important if you want your code to move forward reliably. Only lock when necessary! Now we actually get to the real work - we yield the connection to the block, which can do whatever it needs to service the request.\n\nThe rescue clause here lets us raise exceptions when we find that the connection is bad (disconnected or otherwise) and remove that element from the pool, then re-raise the exception so that the caller can do something with it, like decrement a retry count.\n\nFinally, we ensure that the element is unlocked and ready for the next user before returning the result. Note that we don&#x2019;t care about grabbing the lock here because we&#x2019;re already done with the connection, not claiming it, and we&#x2019;re not modifying the pool membership.\n
Now let&#x2019;s look at the core of what the pool does, which is to find and allocate elements to users of the pool and automatically release them when they are finished. We called the method that handles this task &#x201C;take&#x201D; and it has a fun alias to the &#x201C;right-shift&#x201D; operator which makes your calls look sort of like Haskell&#x2019;s monadic &#x201C;bind&#x201D; operator.\n\nAgain, I&#x2019;ve simplified the code here so it&#x2019;s easier to understand. First we set up a result variable to receive the return value of the passed block, which we will return at the end of the method. We also set up an element variable to receive a connection that we grab from the pool. \n\nInside the begin/rescue/end section we obtain a lock on the elements in the pool. Once we have the lock, we find an element that is not locked. If there are none available, we create a new one and add it to the pool. Finally, we lock the element so that no other sections of code -- even in the same Thread -- will try to use it until we&#x2019;re done. Then we exit the critical section. Notice how we&#x2019;ve kept the critical section quite small - this is important if you want your code to move forward reliably. Only lock when necessary! Now we actually get to the real work - we yield the connection to the block, which can do whatever it needs to service the request.\n\nThe rescue clause here lets us raise exceptions when we find that the connection is bad (disconnected or otherwise) and remove that element from the pool, then re-raise the exception so that the caller can do something with it, like decrement a retry count.\n\nFinally, we ensure that the element is unlocked and ready for the next user before returning the result. Note that we don&#x2019;t care about grabbing the lock here because we&#x2019;re already done with the connection, not claiming it, and we&#x2019;re not modifying the pool membership.\n
Now let&#x2019;s look at the core of what the pool does, which is to find and allocate elements to users of the pool and automatically release them when they are finished. We called the method that handles this task &#x201C;take&#x201D; and it has a fun alias to the &#x201C;right-shift&#x201D; operator which makes your calls look sort of like Haskell&#x2019;s monadic &#x201C;bind&#x201D; operator.\n\nAgain, I&#x2019;ve simplified the code here so it&#x2019;s easier to understand. First we set up a result variable to receive the return value of the passed block, which we will return at the end of the method. We also set up an element variable to receive a connection that we grab from the pool. \n\nInside the begin/rescue/end section we obtain a lock on the elements in the pool. Once we have the lock, we find an element that is not locked. If there are none available, we create a new one and add it to the pool. Finally, we lock the element so that no other sections of code -- even in the same Thread -- will try to use it until we&#x2019;re done. Then we exit the critical section. Notice how we&#x2019;ve kept the critical section quite small - this is important if you want your code to move forward reliably. Only lock when necessary! Now we actually get to the real work - we yield the connection to the block, which can do whatever it needs to service the request.\n\nThe rescue clause here lets us raise exceptions when we find that the connection is bad (disconnected or otherwise) and remove that element from the pool, then re-raise the exception so that the caller can do something with it, like decrement a retry count.\n\nFinally, we ensure that the element is unlocked and ready for the next user before returning the result. Note that we don&#x2019;t care about grabbing the lock here because we&#x2019;re already done with the connection, not claiming it, and we&#x2019;re not modifying the pool membership.\n
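The description above can be sketched roughly as follows. This is my own simplified reconstruction, not the real implementation: the names Pool and PoolElement are stand-ins, and per-host selection, the close hook, and iteration support are all omitted.

```ruby
# Hypothetical pool element: wraps a connection plus a per-element lock flag.
class PoolElement
  attr_reader :conn

  def initialize(conn)
    @conn = conn
    @locked = false
  end

  def locked?; @locked; end
  def lock!;   @locked = true; end
  def unlock!; @locked = false; end
end

class Pool
  def initialize(&open)
    @open = open          # block that opens a new connection
    @lock = Mutex.new     # guards pool membership and element locks
    @elements = []
  end

  # Find (or create) an unlocked element, lock it, yield its connection,
  # and always unlock it afterward. Aliased to >> for a bind-like feel.
  def take
    result = nil
    element = nil
    begin
      @lock.synchronize do               # keep the critical section small
        element = @elements.find { |e| !e.locked? }
        unless element
          element = PoolElement.new(@open.call)
          @elements << element
        end
        element.lock!
      end
      result = yield element.conn        # real work happens outside the lock
    rescue
      # Treat a failing block as a bad connection: evict it, then
      # re-raise so the caller can react (e.g. decrement a retry count).
      @lock.synchronize { @elements.delete(element) }
      raise
    ensure
      element.unlock! if element         # no pool lock needed: we still own it
    end
    result
  end
  alias_method :>>, :take
end
```

Note that the unlock in the ensure clause touches only the element we already own, so it deliberately skips the pool mutex, exactly as described above.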
Here's an example of how you might use the pool to perform a request. Notice how we stream the list-keys operation through the block, taking another connection to fetch the values pointed to by the keys we have received.

But we don't need to expose the pool in the top-level interface; that's an implementation detail.
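The nested-take pattern just described looks roughly like this. The client API (a streaming list_keys and a fetch) and the stand-in pool are my own stubs for illustration, not the real library; the point is that the inner take grabs a second connection while the first is still busy streaming.

```ruby
# Stub connection standing in for a real streaming client connection.
class FakeConn
  def list_keys(bucket)
    yield %w[k1 k2]                  # stream a batch of keys to the block
  end

  def fetch(bucket, key)
    "value-of-#{key}"
  end
end

# Minimal stand-in pool with the same take contract as described earlier.
class TinyPool
  def initialize
    @idle = []
  end

  def take
    conn = @idle.pop || FakeConn.new # busy conns aren't in @idle, so the
    begin                            # nested take below gets a second one
      yield conn
    ensure
      @idle.push(conn)
    end
  end
end

pool = TinyPool.new
results = []
pool.take do |conn|
  conn.list_keys("bucket") do |keys|
    keys.each do |key|
      pool.take { |conn2| results << conn2.fetch("bucket", key) }
    end
  end
end
# results == ["value-of-k1", "value-of-k2"]
```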
Now we've achieved the same level of abstraction we originally aimed for, but we are thread-safe and our connections are protected from corruption.
There are some things I left out of the presented code which deserve mentioning:

* The pool creator specifies "open" and "close" callables that are used to allocate and deallocate elements in the pool.
* When you call take, you can also provide a filter which lets you choose whether a given element is valid. This lets us do error-sensitive host selection, so we retry requests on a different host.
* You can iterate over the pool elements in a thread-safe way to do things like modify all of them or clear the pool. This uses a separate lock for iteration and a condition variable to detect when elements are released. You can also just get the size of the pool, which is useful for metrics.
* To make this solution actually work, we had to monkey-patch (duck-punch) the HTTP libraries so they don't handle the keep-alive bits themselves.
* There's also some anecdotal evidence that the pool gives a performance improvement, but I chalk that up to the fact that it is actually thread-safe, letting you take advantage of IO concurrency, and that it spreads load via the multi-host feature.
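The filtered take mentioned above might look something like this. The FilterPool class and its seed argument are my own illustration, with locking omitted for brevity; the real pool combines the filter with the mutex and condition-variable machinery described earlier.

```ruby
# Sketch of a pool whose take accepts a validity filter. Elements the
# filter rejects stay idle, and a fresh one is opened when none match --
# which is how error-sensitive host selection can skip a failing host.
class FilterPool
  def initialize(seed = [], &open)
    @open = open     # block that opens a new connection
    @idle = seed     # pre-seeded idle "connections" (for the demo)
  end

  # Yield a connection that passes the optional filter predicate.
  def take(filter = nil)
    conn = @idle.find { |c| filter.nil? || filter.call(c) }
    if conn
      @idle.delete(conn)
    else
      conn = @open.call
    end
    begin
      yield conn
    ensure
      @idle << conn  # release back to the idle set
    end
  end
end

# "Connections" are plain host-name strings for this demo.
pool = FilterPool.new(["bad-host", "good-host"]) { "fresh-host" }
pool.take(->(host) { host != "bad-host" }) do |host|
  # host == "good-host": the filter skipped the host we marked bad
end
```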
So how can we, as a community that develops awesome Ruby libraries, protect against accidental concurrency?
First, we need to be careful about leaky scope. Throwing an exception when something is corrupted is not sufficient; we need to make sure precious and sensitive resources aren't leaked outside their intended scope. Both thread-locals and ivars can leak if we're not careful.

If your code yields to user-supplied blocks, make sure the things you created in the yielding code path are unreachable from the block unless explicitly passed. If you do leak scope, protect your valuables with things like thread-safe resource pools.
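Here is a toy demonstration of the leak, under assumed names: Handle stands in for something like a Curl::Easy, and LeakyClient for a library that memoizes it in an ivar. Because the cached handle is reachable through the client object, an innocent-looking nested call inside the yielded block re-enters it mid-request.

```ruby
# Stand-in for a handle that cannot be used re-entrantly (like a libcurl easy handle).
class Handle
  def initialize
    @busy = false
  end

  def request
    raise "handle re-entered mid-request!" if @busy
    @busy = true
    begin
      yield            # user code runs while the handle is busy
    ensure
      @busy = false
    end
  end
end

class LeakyClient
  def handle
    @handle ||= Handle.new   # memoized in an ivar: shared across calls
  end

  def stream
    handle.request { yield "chunk" }
  end
end

client = LeakyClient.new
begin
  client.stream do |chunk|
    client.stream { |c| }    # nested call reuses the busy handle...
  end
rescue RuntimeError => e
  puts e.message             # → handle re-entered mid-request!
end
```

No Threads or Fibers anywhere, yet the handle is "shared" with concurrent-looking consequences: that is accidental concurrency in a sequential program.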
Second, we can be more explicit about the behavior our code exhibits around cached resources. Does it only work in certain circumstances? Document that. Don't just assume everyone is going to want ONE connection per host and port.

To fix the problems I had with one HTTP library, I had to do a deep monkey-patch on a private method, which both feels wrong and potentially breaks any other code that uses the library. Instead, the library could either make the caching transparent AND SAFE for the user, or let the user inject the desired caching behavior (which might be none at all).
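The injection idea could look something like this. Every name here (HttpClient, with_connection, the strategy classes) is hypothetical, a sketch of the design rather than any real library's API: the library delegates connection handling to an injected strategy, so callers choose pooling, one cached connection, or no caching at all.

```ruby
# Library code: takes any object responding to #with_connection.
class HttpClient
  def initialize(strategy)
    @strategy = strategy             # injected caching/pooling behavior
  end

  def get(path)
    @strategy.with_connection { |conn| conn.call(path) }
  end
end

# One strategy: no caching, a fresh connection every request.
class FreshEachTime
  def initialize(&open)
    @open = open
  end

  def with_connection
    yield @open.call
  end
end

# Another: a single memoized connection -- the behavior many libraries
# hard-code today, but here it is the caller's explicit choice.
class SingleCached
  def initialize(&open)
    @open = open
  end

  def with_connection
    @conn ||= @open.call
    yield @conn
  end
end

# "Connections" are lambdas for the demo.
fake_conn = ->(path) { "response for #{path}" }
client = HttpClient.new(SingleCached.new { fake_conn })
client.get("/ping")   # → "response for /ping"
```

A pool like the one from earlier in the talk would simply be a third strategy, and the user could pick it without the library knowing or caring.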
Thanks for listening. I'd be happy to take questions now.