SlideShare une entreprise Scribd logo
1  sur  55
Télécharger pour lire hors ligne
A Python Petting Zoo
   Python and ZooKeeper
        Devon Jones
  Senior Software Engineer

  knewt.ly/pettingzoo-slides
Knewton is an education technology company with a goal of
bringing adaptive education to the masses. Knewton makes it
possible to break courses into tiny parts that are delivered to
each student as personalized, real-time recommendations.

Knewton recommends the best work for an individual learner by
calculating data on what we know about a student, similar
students, the learning objective, and the content itself at a given
point in time.

The Knewton platform today has tens of thousands of students,
will have over 600k students starting Sept. 2012, and will soon
have millions of students.
What is this about?
● What is ZooKeeper, how is it useful
● State of ZooKeeper on python
● The release of PettingZoo, Knewton's
  ZooKeeper recipes for managing a
  distributed machine learning cluster
What is ZooKeeper?
According to ZooKeeper's Apache Site:

ZooKeeper is a centralized service for maintaining configuration information,
naming, providing distributed synchronization, and providing group services. All
of these kinds of services are used in some form or another by distributed
applications. Each time they are implemented there is a lot of work that goes
into fixing the bugs and race conditions that are inevitable. Because of the
difficulty of implementing these kinds of services, applications initially usually
skimp on them, which make them brittle in the presence of change and difficult
to manage. Even when done correctly, different implementations of these
services lead to management complexity when the applications are deployed.




                           Right, but what is it?
What is ZooKeeper?
ZooKeeper is a distributed filesystem based on
the PAXOS algorithim with a few valuable
features.
● No single point of failure
● Strictly ordered, observable state
● Events
● Sequential and ephemeral primitives
● ACLs
PAXOS
http://groups.csail.mit.edu/tds/papers/DePrisco/WDAG97.
pdf
● Distributed Finite State Machine
● Pairwise connections among hosts
● Rounds of Consensus Propositions
  conducted by leader
● Time bound contracts on interactions
● Useful to have internal concurrency
  contracts
What is ZooKeeper for?
ZooKeeper is a platform for creating protocols
for synchronization of distributed systems.
Uses include:
● Distributed configuration
● Queues
● Implementation of distributed concurrency
   primitives such as locks, barriers, latches,
   counters, etc.
In short, it's a system for managing shared
state between distributed systems.
Consistency Guarantees
ZooKeeper makes a number of promises:
● Sequential Consistency
● Atomicity
● Single System Image
● Reliability
● Timeliness
Watches
● Can be set by any read operation
● Will fire an event to the client who set them
  once and only once
● Can be set on nodes or on their data.
● Will always be sent to you in a fixed order
Sequential Nodes
● Appends a monotonically increasing number
  to the end of the znode (file)
● Can be used directly for leader election
● Provides communication of server's view of
  ordering
Ephemeral Nodes
● Only exist for the life of a connection
● If your connection does not respond to a
  keepalive request, will disappear
● Used to ensure reliability against service
  disruption for most recipes
● Used to trigger events, such as
  reconfiguration if a service goes down that
  published discovery configs
Recipes
ZooKeeper's documentation explains how to
implement a number of recipes:
● Barriers
● Locks
● Queues
● Counters
● Two Phase Commit
● Leader Election
Recipes
As a community, we need well tested versions
of these recipes as well as other valuable
protocols built on ZooKeeper
State of ZooKeeper
● Very low level
● Coding for it is very complex
● Lots of edge cases
● ZooKeeper needs high level, well tested
  libraries
● Very few complete, high level solutions exist
ZooKeeper Libraries
The first high level library with significant recipe
implementations has emerged from Netflix. It's
name is Curator.

Unfortunately for us, it's in Java, not Python.
Curator
https://github.com/Netfix/curator
● State
   ○   High level api
   ○   Non-Resilient Client
   ○   Documentation
   ○   Tests, Embeddable ZooKeeper for testing
● Recipes
   ○   Leader Latch, Leader Election
   ○   Multiple Locks and Semaphores
   ○   Multiple Queues
   ○   Barrier, Double Barrier
   ○   Shared Counter/Distributed Atomic Long
State of ZooKeeper on Python
● Current state is in a lot of flux
● Went from only low level bindings to a
  number of incomplete bindings in first half of
  2012
● So far nothing like Curator has emerged (but
  it appears to be brewing)
One of the top ranked results for 'Python ZooKeeper'
Summary of Python ZooKeeper
            bindings
● There are about 10 presently
● Many suffer from not handling known edge
  cases in ZooKeeper
● Some suffer problems with resilient
  connections
● The following is derived from the python
  ZooKeeper binding census of Ben Bangert
Official Bindings
● State
  ○ Complete access to the ZooKeeper C bindings
  ○ Full of sharp edges
  ○ Not a resilient client
  ○ No recipes
  ○ Threads communication with ZooKeeper in a C
    thread
  ○ Foundation for most other libraries
  ○ Very low level
Bindings
Twisted:
● txzookeeper

gevent:
● gevent-zookeeper
● kazoo
Bindings (Cont)
High level clients:
● zc.zk
● kazoo
● twitter zookeeper

Others:
● zkpython (low level)
● pykeeper (not resiliant)
● zoop (not resiliant)
Bindings (Cont)
Recipes only:
● zktools
● PettingZoo
State of ZooKeeper on Python:
             Kind of a Mess
A project exists to merge some of the high level
bindings in an attempt to create a python
equivalent of Curator: https://github.
com/python-zk

Started by Ben Bangert to merge Kazoo & zc.
zk with an attempt to implement all Curator
recipes.
PettingZoo
https://github.com/Knewton/pettingzoo-python
● State
  ○   Relies on zc.zk
  ○   Documented, doc strings
  ○   Tests (mock ZooKeeper)
  ○   All recipes implemented in a Java version as well
● Recipes
  ○   Distributed Config
  ○   Distributed Bag
  ○   Leader Queue
  ○   Role Match
PettingZoo
● In heavy development
● Distributed Discovery, Distributed Bag are
  well tested and used in production
● Leader Queue and Role Match are tested,
  but undeployed
● PettingZoo will be ported to or merged with
  the kazoo effort when it is ready
Our Problem
Need to be able to do stream processing of
observations of student interactions with course
material. This involves multiple models that
have interdependent parameters. This requires:
● Sharding along different axes dependent
  upon the models
● Subscriptions between models for
  parameters
● Dynamic reconfiguration of the environment
  to deal with current load
Distributed Discovery
Allows services in a dynamic, distributed
environment to be able to be quickly alerted of
service address changes.
● Most service discovery recipes only contain
   host:port, Distributed Discovery can share
   arbitrary data as well (using yaml)
● Can handle load balancing through random
   selection of config
● Handles rebalancing on pool change
How does this help us scale?
● Makes discovery of dependencies simple
● Adds to reliability of system by quickly
  removing dead resources
● Makes dynamic reconfiguration simple as
  additional resources become available
Distributed Discovery Example:
                  Write
from pettingzoo.discovery import write_distributed_config
from pettingzoo.utils import connect_to_zk
conn = connect_to_zk( 'localhost:2181' )
config = {
    'header': {
         'service_class' : 'kestrel', 'metadata' : {
             'protocol' : 'memcached' , 'version': 1.0
         }
    },
    'host': 'localhost' ,
    'port': 22133
}
write_distributed_config(conn, 'kestrel', 'platform' , config)
Distributed Discovery Example:
                  Read
from pettingzoo.discovery import DistributedMultiDiscovery
from pettingzoo.utils import connect_to_zk
conn = connect_to_zk( 'localhost:2181' )
dmd = DistributedMultiDiscovery(conn)
conf = None
def update_configs (path, updated_configs):
    conf.update(updated_configs)
conf = dmd.load_config( 'kestrel', 'platform' , callback=update_configs)
Distributed Bag
Recipe for a distributed bag (dbag) that allows
processes to share a collection. Any
participant can post or remove data, alerting all
others.
 ● Used as a part of Role Match

 ● Useful for any case where

   processes need to share
   configuration determined at
   runtime
How does this help us scale?
● Can quickly alert processes as to who is
  subscribing to them
● Reduces load by quickly yanking dead
  subscriptions
● Provides event based subscriptions, making
  implementation simpler
Distributed Bag
                                 ● Sequential items
                    Item
                                   contain the actual data
        Items
                      1Item
                         2Item   ● Can be ephemeral
                            3    ● Clients set delete watch
                                   on discrete items
<bag>                            ● Token is set to id of
                                   highest item
                                 ● Clients set a child
        Tokens
                    Token          watch on the "Tokens"
                      3
                                   node
                                 ● Can determine exact
                                   adds and deletes with a
                                   constant number of
                                   messages per delta
Distributed Bag Example
import yaml
from pettingzoo.dbag import DistributedBag
from pettingzoo.utils import connect_to_zk

...
conn = connect_to_zk( 'localhost:2181' )
dbag = DistributedBag(conn, '/some/data' )
docs = {}
def acb(cb_dbag, cb_id):
    docs[cb_id] = cb_dbag.get(cb_id)
def rcb(cb_dbag, cb_id):
    docs.remove(cb.id)

dbag.add_listeners(add_callback=acb, remove_callback=rcb)
add_id = dbag.add(yaml.load(document), ephemeral=True)
docs[add_id] = document
Leader Queue
Recipe is similar to Leader Election, but makes
it easy to monitor your spare capacity.
● Used in Role Match
● As services are ready to do work, they
    create an ephemeral, sequential node in the
    queue.
● Any member always knows if either they are
    in the queue or at the front
● Watch lets leader know when it is elected
How does this help us scale?
● Gives a convenient method of assigning
  work
● Makes monitoring current excess capacity
  easy
Leader Queue
                ● Candidates register with
                  sequential, ephemeral nodes
          C_1
                ● Candidate sets delete watch
                  on predecessor
                ● Candidate is elected when it
                  is the smallest node
<queue>   C_3   ● When elected, candidate
                  takes over its new role
                ● When ready, candidate
                  removes itself from the
                  queue
          C_4
                ● Only one candidate needs to
                  call get_children upon any
                  node exiting
Leader Queue Usage Example
from pettingzoo.leader_queue import LeaderQueue, Candidate
from pettingzoo.utils import connect_to_zk

class SomeCandidate (Candidate):
    def on_elected (self):
        <do something sexy>

conn = connect_to_zk( 'localhost:2181' )
leaderq = LeaderQueue(conn)
leaderq.add_candidate(SomeCandidate())
Role Match
Allows systems to expose needed, long lived
jobs, and for services to take over those jobs
until all are filled.
● Dbag used to expose jobs
● Leader queue used to hold applicants
● Records which jobs are presently held with
   ephemeral node
● Lets a new process take over if a worker
   dies
● We use it for sharding/segmentation to
   dynamically adjust the shards as needed
How does this help us scale?
● Core of our ability to dynamically adjust
  shards
● Lets the controlling process adjust problem
  spaces and have those tasks become
  automatically filled
● Monitoring is easy to identify who is working
  on what, when
Role Match

             job
                          Distributed   ● Leader monitors for
                             Bag
                                            open jobs
                                        ●   Job holder creates
                                            an ephemeral
                                            assignment
<match>   applicant      Leader Queue
                                        ●   Assignment id
                                            matches job id,
                                            indicating that it is
                           Assgn
                                            claimed
          assignment
                             1
                             A_2
Future: Distributed Config
Next project is Distributed Config.
● Allows service config to be recorded and
  changed with a yaml config
● Every process that connects creates a child
  node of the appropriate service
● Any change in a child node's config
  overrides the overall service config for that
  process
● Any change of the parent or child fires a
  watch to let the process know that it's config
  has changed
Questions?
  devon@knewton.com

knewt.ly/pettingzoo-slides

   knewt.ly/pettingzoo
Appendix: Official Bindings
● State
  ○ Complete access to the ZooKeeper C bindings
  ○ Full of sharp edges
  ○ Not a resilient client
  ○ No recipes
  ○ Threads communication with ZooKeeper in a C
    thread
  ○ Foundation for most other libraries
  ○ Very low level
Appendix: zkpython
https://github.com/duncf/zkpython/
improvements to a fork of the official bindings
● Status
   ○ Resilient Client
   ○ No docs beyond the official bindings
   ○ Good tests
● Recipes
   ○ Basic Lock (Using unique id rather than UUID)
Appendix: pykeeper
https://github.com/nkvoll/pykeeper
● State
  ○   Non-resilient Client (not resilient to errors)
  ○   High level client
  ○   Documented, No doc strings
  ○   Tests (Requires running ZooKeeper)
Appendix: zoop
https://github.com/davidmiller/zoop
● State
   ○   Doesn't handle node create edge-case
   ○   Doesn't handle retryable exceptions
   ○   Documented, doc strings
   ○   Tests (Requires running ZooKeeper)
● Recipes
   ○ Revocable Lock (Doesn't handle create node edge-
     case, uses a permanent node instead of ephemeral)
Appendix: txzookeeper
https://launchpad.net/txzookeeper
● State
  ○   Resilient Client
  ○   Doc strings, no additional documentation
  ○   Supports twisted (only)
  ○   Well tested
● Recipes
  ○   Basic Lock (Not using UUID)
  ○   Queue
  ○   ReliableQueue
  ○   SerializedQueue
Appendix: twitter zookeeper
https://github.com/twitter/commons/tree/master/src/python/twitter/common/zookeeper

● State
      ○    Resilient Client
      ○    Handles node create edge-case
      ○    Some documentation
      ○    Well Tested
      ○    Tied to a lot of twitter commons code, difficult to
           extract
● Recipes
      ○ Service Registration/Discovery
Appendix: gevent-zookeeper
https://github.com/jrydberg/gevent-zookeeper/
● State
  ○ Supports gevent
  ○ No documentation
  ○ No tests
Appendix: Kazoo
https://github.com/nimbusproject/kazoo
● Status
  ○   Resilient Client
  ○   Doc strings, no additional documentation
  ○   Supports gevent
  ○   Minimal tests
● Recipes
  ○ Basic Lock (Uses UUID properly)
Appendix: zc.zk
https://github.com/python-zk/zc.zk
● State
  ○   Resilient Client
  ○   High level client
  ○   Higher level automatic watch functionality
  ○   Higher level restoration of ephemeral nodes
  ○   Tests (Mock ZooKeeper)
  ○   Documentation, no doc strings
● Recipes
  ○ Service Registration/Discovery
Appendix: zktools
https://github.com/mozilla-services/zktools
● State
   ○ Relies on zc.zk
   ○ Documented, doc strings
   ○ Tests (Requires running ZooKeepe)
● Recipes
   ○ Shared Read/Write Locks
   ○ AsyncLock
   ○ Revocable Locks
Appendix: PettingZoo
https://github.com/Knewton/pettingzoo-python
● State
  ○   Relies on zc.zk
  ○   Documented, doc strings
  ○   Tests (mock ZooKeeper)
  ○   All Recipes implemented in a java version as well
● Recipes
  ○   Distributed Config
  ○   Distributed Bag
  ○   Leader Queue
  ○   Role Match

Contenu connexe

Tendances

Winter is coming? Not if ZooKeeper is there!
Winter is coming? Not if ZooKeeper is there!Winter is coming? Not if ZooKeeper is there!
Winter is coming? Not if ZooKeeper is there!Joydeep Banik Roy
 
Apache ZooKeeper TechTuesday
Apache ZooKeeper TechTuesdayApache ZooKeeper TechTuesday
Apache ZooKeeper TechTuesdayAndrei Savu
 
Zookeeper Introduce
Zookeeper IntroduceZookeeper Introduce
Zookeeper Introducejhao niu
 
How Yelp does Service Discovery
How Yelp does Service DiscoveryHow Yelp does Service Discovery
How Yelp does Service DiscoveryJohn Billings
 
Python and cassandra
Python and cassandraPython and cassandra
Python and cassandraJon Haddad
 
使用ZooKeeper打造軟體式負載平衡
使用ZooKeeper打造軟體式負載平衡使用ZooKeeper打造軟體式負載平衡
使用ZooKeeper打造軟體式負載平衡Lawrence Huang
 
Meetup on Apache Zookeeper
Meetup on Apache ZookeeperMeetup on Apache Zookeeper
Meetup on Apache ZookeeperAnshul Patel
 
Introduction to Apache ZooKeeper
Introduction to Apache ZooKeeperIntroduction to Apache ZooKeeper
Introduction to Apache ZooKeeperknowbigdata
 
ZooKeeper - wait free protocol for coordinating processes
ZooKeeper - wait free protocol for coordinating processesZooKeeper - wait free protocol for coordinating processes
ZooKeeper - wait free protocol for coordinating processesJulia Proskurnia
 
Introduction to Apache ZooKeeper | Big Data Hadoop Spark Tutorial | CloudxLab
Introduction to Apache ZooKeeper | Big Data Hadoop Spark Tutorial | CloudxLabIntroduction to Apache ZooKeeper | Big Data Hadoop Spark Tutorial | CloudxLab
Introduction to Apache ZooKeeper | Big Data Hadoop Spark Tutorial | CloudxLabCloudxLab
 
Zookeeper In Action
Zookeeper In ActionZookeeper In Action
Zookeeper In Actionjuvenxu
 
Shall we play a game?
Shall we play a game?Shall we play a game?
Shall we play a game?Maciej Lasyk
 
Docker and Maestro for fun, development and profit
Docker and Maestro for fun, development and profitDocker and Maestro for fun, development and profit
Docker and Maestro for fun, development and profitMaxime Petazzoni
 
Introduction to .Net Driver
Introduction to .Net DriverIntroduction to .Net Driver
Introduction to .Net DriverDataStax Academy
 
Centralized Application Configuration with Spring and Apache Zookeeper
Centralized Application Configuration with Spring and Apache ZookeeperCentralized Application Configuration with Spring and Apache Zookeeper
Centralized Application Configuration with Spring and Apache ZookeeperRyan Gardner
 
Reactive programming with examples
Reactive programming with examplesReactive programming with examples
Reactive programming with examplesPeter Lawrey
 
Erik Skytthe - Monitoring Mesos, Docker, Containers with Zabbix | ZabConf2016
Erik Skytthe - Monitoring Mesos, Docker, Containers with Zabbix | ZabConf2016Erik Skytthe - Monitoring Mesos, Docker, Containers with Zabbix | ZabConf2016
Erik Skytthe - Monitoring Mesos, Docker, Containers with Zabbix | ZabConf2016Zabbix
 
zookeeperProgrammers
zookeeperProgrammerszookeeperProgrammers
zookeeperProgrammersHiroshi Ono
 

Tendances (20)

Apache ZooKeeper
Apache ZooKeeperApache ZooKeeper
Apache ZooKeeper
 
Winter is coming? Not if ZooKeeper is there!
Winter is coming? Not if ZooKeeper is there!Winter is coming? Not if ZooKeeper is there!
Winter is coming? Not if ZooKeeper is there!
 
Apache ZooKeeper TechTuesday
Apache ZooKeeper TechTuesdayApache ZooKeeper TechTuesday
Apache ZooKeeper TechTuesday
 
Zookeeper Introduce
Zookeeper IntroduceZookeeper Introduce
Zookeeper Introduce
 
How Yelp does Service Discovery
How Yelp does Service DiscoveryHow Yelp does Service Discovery
How Yelp does Service Discovery
 
Python and cassandra
Python and cassandraPython and cassandra
Python and cassandra
 
使用ZooKeeper打造軟體式負載平衡
使用ZooKeeper打造軟體式負載平衡使用ZooKeeper打造軟體式負載平衡
使用ZooKeeper打造軟體式負載平衡
 
Meetup on Apache Zookeeper
Meetup on Apache ZookeeperMeetup on Apache Zookeeper
Meetup on Apache Zookeeper
 
Introduction to Apache ZooKeeper
Introduction to Apache ZooKeeperIntroduction to Apache ZooKeeper
Introduction to Apache ZooKeeper
 
ZooKeeper - wait free protocol for coordinating processes
ZooKeeper - wait free protocol for coordinating processesZooKeeper - wait free protocol for coordinating processes
ZooKeeper - wait free protocol for coordinating processes
 
Introduction to Apache ZooKeeper | Big Data Hadoop Spark Tutorial | CloudxLab
Introduction to Apache ZooKeeper | Big Data Hadoop Spark Tutorial | CloudxLabIntroduction to Apache ZooKeeper | Big Data Hadoop Spark Tutorial | CloudxLab
Introduction to Apache ZooKeeper | Big Data Hadoop Spark Tutorial | CloudxLab
 
Zookeeper In Action
Zookeeper In ActionZookeeper In Action
Zookeeper In Action
 
Shall we play a game?
Shall we play a game?Shall we play a game?
Shall we play a game?
 
Curator intro
Curator introCurator intro
Curator intro
 
Docker and Maestro for fun, development and profit
Docker and Maestro for fun, development and profitDocker and Maestro for fun, development and profit
Docker and Maestro for fun, development and profit
 
Introduction to .Net Driver
Introduction to .Net DriverIntroduction to .Net Driver
Introduction to .Net Driver
 
Centralized Application Configuration with Spring and Apache Zookeeper
Centralized Application Configuration with Spring and Apache ZookeeperCentralized Application Configuration with Spring and Apache Zookeeper
Centralized Application Configuration with Spring and Apache Zookeeper
 
Reactive programming with examples
Reactive programming with examplesReactive programming with examples
Reactive programming with examples
 
Erik Skytthe - Monitoring Mesos, Docker, Containers with Zabbix | ZabConf2016
Erik Skytthe - Monitoring Mesos, Docker, Containers with Zabbix | ZabConf2016Erik Skytthe - Monitoring Mesos, Docker, Containers with Zabbix | ZabConf2016
Erik Skytthe - Monitoring Mesos, Docker, Containers with Zabbix | ZabConf2016
 
zookeeperProgrammers
zookeeperProgrammerszookeeperProgrammers
zookeeperProgrammers
 

Similaire à A Python Petting Zoo

Introduction to ZooKeeper - TriHUG May 22, 2012
Introduction to ZooKeeper - TriHUG May 22, 2012Introduction to ZooKeeper - TriHUG May 22, 2012
Introduction to ZooKeeper - TriHUG May 22, 2012mumrah
 
NetflixOSS Open House Lightning talks
NetflixOSS Open House Lightning talksNetflixOSS Open House Lightning talks
NetflixOSS Open House Lightning talksRuslan Meshenberg
 
Kubernetes and CoreOS @ Athens Docker meetup
Kubernetes and CoreOS @ Athens Docker meetupKubernetes and CoreOS @ Athens Docker meetup
Kubernetes and CoreOS @ Athens Docker meetupMist.io
 
Implementing Observability for Kubernetes.pdf
Implementing Observability for Kubernetes.pdfImplementing Observability for Kubernetes.pdf
Implementing Observability for Kubernetes.pdfJose Manuel Ortega Candel
 
BISSA: Empowering Web gadget Communication with Tuple Spaces
BISSA: Empowering Web gadget Communication with Tuple SpacesBISSA: Empowering Web gadget Communication with Tuple Spaces
BISSA: Empowering Web gadget Communication with Tuple SpacesSrinath Perera
 
First steps with kubernetes
First steps with kubernetesFirst steps with kubernetes
First steps with kubernetesVinícius Kroth
 
A Closer Look at Kubernetes Pods and Replica Sets
A Closer Look at Kubernetes Pods and Replica SetsA Closer Look at Kubernetes Pods and Replica Sets
A Closer Look at Kubernetes Pods and Replica SetsJanakiram MSV
 
Introduction to kubernetes
Introduction to kubernetesIntroduction to kubernetes
Introduction to kubernetesRishabh Indoria
 
Netflix Data Pipeline With Kafka
Netflix Data Pipeline With KafkaNetflix Data Pipeline With Kafka
Netflix Data Pipeline With KafkaSteven Wu
 
Comparison between zookeeper, etcd 3 and other distributed coordination systems
Comparison between zookeeper, etcd 3 and other distributed coordination systemsComparison between zookeeper, etcd 3 and other distributed coordination systems
Comparison between zookeeper, etcd 3 and other distributed coordination systemsImesha Sudasingha
 
Strimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUp
Strimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUpStrimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUp
Strimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUpJosé Román Martín Gil
 
Testing Django APIs
Testing Django APIsTesting Django APIs
Testing Django APIstyomo4ka
 
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)Kubernetes @ Squarespace (SRE Portland Meetup October 2017)
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)Kevin Lynch
 
Docker Madison, Introduction to Kubernetes
Docker Madison, Introduction to KubernetesDocker Madison, Introduction to Kubernetes
Docker Madison, Introduction to KubernetesTimothy St. Clair
 
QConSF18 - Disenchantment: Netflix Titus, its Feisty Team, and Daemons
QConSF18 - Disenchantment: Netflix Titus, its Feisty Team, and DaemonsQConSF18 - Disenchantment: Netflix Titus, its Feisty Team, and Daemons
QConSF18 - Disenchantment: Netflix Titus, its Feisty Team, and Daemonsaspyker
 
Kubernetes Me this Batman
Kubernetes Me this BatmanKubernetes Me this Batman
Kubernetes Me this BatmanSonatype
 
Scaling Up Logging and Metrics
Scaling Up Logging and MetricsScaling Up Logging and Metrics
Scaling Up Logging and MetricsRicardo Lourenço
 

Similaire à A Python Petting Zoo (20)

Introduction to ZooKeeper - TriHUG May 22, 2012
Introduction to ZooKeeper - TriHUG May 22, 2012Introduction to ZooKeeper - TriHUG May 22, 2012
Introduction to ZooKeeper - TriHUG May 22, 2012
 
NetflixOSS Open House Lightning talks
NetflixOSS Open House Lightning talksNetflixOSS Open House Lightning talks
NetflixOSS Open House Lightning talks
 
Kubernetes and CoreOS @ Athens Docker meetup
Kubernetes and CoreOS @ Athens Docker meetupKubernetes and CoreOS @ Athens Docker meetup
Kubernetes and CoreOS @ Athens Docker meetup
 
Implementing Observability for Kubernetes.pdf
Implementing Observability for Kubernetes.pdfImplementing Observability for Kubernetes.pdf
Implementing Observability for Kubernetes.pdf
 
BISSA: Empowering Web gadget Communication with Tuple Spaces
BISSA: Empowering Web gadget Communication with Tuple SpacesBISSA: Empowering Web gadget Communication with Tuple Spaces
BISSA: Empowering Web gadget Communication with Tuple Spaces
 
First steps with kubernetes
First steps with kubernetesFirst steps with kubernetes
First steps with kubernetes
 
A Closer Look at Kubernetes Pods and Replica Sets
A Closer Look at Kubernetes Pods and Replica SetsA Closer Look at Kubernetes Pods and Replica Sets
A Closer Look at Kubernetes Pods and Replica Sets
 
Introduction to kubernetes
Introduction to kubernetesIntroduction to kubernetes
Introduction to kubernetes
 
Netflix Data Pipeline With Kafka
Netflix Data Pipeline With KafkaNetflix Data Pipeline With Kafka
Netflix Data Pipeline With Kafka
 
Netflix Data Pipeline With Kafka
Netflix Data Pipeline With KafkaNetflix Data Pipeline With Kafka
Netflix Data Pipeline With Kafka
 
Comparison between zookeeper, etcd 3 and other distributed coordination systems
Comparison between zookeeper, etcd 3 and other distributed coordination systemsComparison between zookeeper, etcd 3 and other distributed coordination systems
Comparison between zookeeper, etcd 3 and other distributed coordination systems
 
Strimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUp
Strimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUpStrimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUp
Strimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUp
 
Testing Django APIs
Testing Django APIsTesting Django APIs
Testing Django APIs
 
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)Kubernetes @ Squarespace (SRE Portland Meetup October 2017)
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)
 
Docker Madison, Introduction to Kubernetes
Docker Madison, Introduction to KubernetesDocker Madison, Introduction to Kubernetes
Docker Madison, Introduction to Kubernetes
 
QConSF18 - Disenchantment: Netflix Titus, its Feisty Team, and Daemons
QConSF18 - Disenchantment: Netflix Titus, its Feisty Team, and DaemonsQConSF18 - Disenchantment: Netflix Titus, its Feisty Team, and Daemons
QConSF18 - Disenchantment: Netflix Titus, its Feisty Team, and Daemons
 
Distributed fun with etcd
Distributed fun with etcdDistributed fun with etcd
Distributed fun with etcd
 
Kubernetes Me this Batman
Kubernetes Me this BatmanKubernetes Me this Batman
Kubernetes Me this Batman
 
Kubernetes Me This Batman
Kubernetes Me This BatmanKubernetes Me This Batman
Kubernetes Me This Batman
 
Scaling Up Logging and Metrics
Scaling Up Logging and MetricsScaling Up Logging and Metrics
Scaling Up Logging and Metrics
 

Dernier

Irene Moetsana-Moeng: Stakeholders in Cybersecurity: Collaborative Defence fo...
Irene Moetsana-Moeng: Stakeholders in Cybersecurity: Collaborative Defence fo...Irene Moetsana-Moeng: Stakeholders in Cybersecurity: Collaborative Defence fo...
Irene Moetsana-Moeng: Stakeholders in Cybersecurity: Collaborative Defence fo...itnewsafrica
 
Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024TopCSSGallery
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesManik S Magar
 
Accelerating Enterprise Software Engineering with Platformless
Accelerating Enterprise Software Engineering with PlatformlessAccelerating Enterprise Software Engineering with Platformless
Accelerating Enterprise Software Engineering with PlatformlessWSO2
 
Microservices, Docker deploy and Microservices source code in C#
Microservices, Docker deploy and Microservices source code in C#Microservices, Docker deploy and Microservices source code in C#
Microservices, Docker deploy and Microservices source code in C#Karmanjay Verma
 
Landscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdfLandscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdfAarwolf Industries LLC
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Mark Goldstein
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observabilityitnewsafrica
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesThousandEyes
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch TuesdayIvanti
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Hiroshi SHIBATA
 
Kuma Meshes Part I - The basics - A tutorial
Kuma Meshes Part I - The basics - A tutorialKuma Meshes Part I - The basics - A tutorial
Kuma Meshes Part I - The basics - A tutorialJoão Esperancinha
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPathCommunity
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Alkin Tezuysal
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentPim van der Noll
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...itnewsafrica
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI AgeCprime
 

Dernier (20)

Irene Moetsana-Moeng: Stakeholders in Cybersecurity: Collaborative Defence fo...
Irene Moetsana-Moeng: Stakeholders in Cybersecurity: Collaborative Defence fo...Irene Moetsana-Moeng: Stakeholders in Cybersecurity: Collaborative Defence fo...
Irene Moetsana-Moeng: Stakeholders in Cybersecurity: Collaborative Defence fo...
 
Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
 
Accelerating Enterprise Software Engineering with Platformless
Accelerating Enterprise Software Engineering with PlatformlessAccelerating Enterprise Software Engineering with Platformless
Accelerating Enterprise Software Engineering with Platformless
 
Microservices, Docker deploy and Microservices source code in C#
Microservices, Docker deploy and Microservices source code in C#Microservices, Docker deploy and Microservices source code in C#
Microservices, Docker deploy and Microservices source code in C#
 
Landscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdfLandscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdf
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch Tuesday
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024
 
Kuma Meshes Part I - The basics - A tutorial
Kuma Meshes Part I - The basics - A tutorialKuma Meshes Part I - The basics - A tutorial
Kuma Meshes Part I - The basics - A tutorial
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI Age
 

A Python Petting Zoo

  • 1. A Python Petting Zoo Python and ZooKeeper Devon Jones Senior Software Engineer knewt.ly/pettingzoo-slides
  • 2. Knewton is an education technology company with a goal of bringing adaptive education to the masses. Knewton makes it possible to break courses into tiny parts that are delivered to each student as personalized, real-time recommendations. Knewton recommends the best work for an individual learner by calculating data on what we know about a student, similar students, the learning objective, and the content itself at a given point in time. The Knewton platform today has tens of thousands of students, will have over 600k students starting Sept. 2012, and will soon have millions of students.
  • 3. What is this about? ● What is ZooKeeper, how is it useful ● State of ZooKeeper on python ● The release of PettingZoo, Knewton's ZooKeeper recipes for managing a distributed machine learning cluster
  • 4. What is ZooKeeper? According to ZooKeeper's Apache Site: ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. All of these kinds of services are used in some form or another by distributed applications. Each time they are implemented there is a lot of work that goes into fixing the bugs and race conditions that are inevitable. Because of the difficulty of implementing these kinds of services, applications initially usually skimp on them, which make them brittle in the presence of change and difficult to manage. Even when done correctly, different implementations of these services lead to management complexity when the applications are deployed. Right, but what is it?
  • 5. What is ZooKeeper? ZooKeeper is a distributed filesystem based on the PAXOS algorithim with a few valuable features. ● No single point of failure ● Strictly ordered, observable state ● Events ● Sequential and ephemeral primitives ● ACLs
  • 6. PAXOS http://groups.csail.mit.edu/tds/papers/DePrisco/WDAG97. pdf ● Distributed Finite State Machine ● Pairwise connections among hosts ● Rounds of Consensus Propositions conducted by leader ● Time bound contracts on interactions ● Useful to have internal concurrency contracts
  • 7. What is ZooKeeper for? ZooKeeper is a platform for creating protocols for synchronization of distributed systems. Uses include: ● Distributed configuration ● Queues ● Implementation of distributed concurrency primitives such as locks, barriers, latches, counters, etc. In short, it's a system for managing shared state between distributed systems.
  • 8. Consistency Guarantees ZooKeeper makes a number of promises: ● Sequential Consistency ● Atomicity ● Single System Image ● Reliability ● Timeliness
  • 9. Watches ● Can be set by any read operation ● Will fire an event to the client who set them once and only once ● Can be set on nodes or on their data. ● Will always be sent to you in a fixed order
  • 10. Sequential Nodes ● Appends a monotonically increasing number to the end of the znode (file) ● Can be used directly for leader election ● Provides communication of server's view of ordering
  • 11. Ephemeral Nodes ● Only exist for the life of a connection ● If your connection does not respond to a keepalive request, will disappear ● Used to ensure reliability against service disruption for most recipes ● Used to trigger events, such as reconfiguration if a service goes down that published discovery configs
  • 12. Recipes ZooKeeper's documentation explains how to implement a number of recipes: ● Barriers ● Locks ● Queues ● Counters ● Two Phase Commit ● Leader Election
  • 13. Recipes As a community, we need well tested versions of these recipes as well as other valuable protocols built on ZooKeeper
  • 14. State of ZooKeeper ● Very low level ● Coding for it is very complex ● Lots of edge cases ● ZooKeeper needs high level, well tested libraries ● Very few complete, high level solutions exist
  • 15. ZooKeeper Libraries The first high level library with significant recipe implementations has emerged from Netflix. It's name is Curator. Unfortunately for us, it's in Java, not Python.
  • 16. Curator https://github.com/Netfix/curator ● State ○ High level api ○ Non-Resilient Client ○ Documentation ○ Tests, Embeddable ZooKeeper for testing ● Recipes ○ Leader Latch, Leader Election ○ Multiple Locks and Semaphores ○ Multiple Queues ○ Barrier, Double Barrier ○ Shared Counter/Distributed Atomic Long
  • 17. State of ZooKeeper on Python ● Current state is in a lot of flux ● Went from only low level bindings to a number of incomplete bindings in first half of 2012 ● So far nothing like Curator has emerged (but it appears to be brewing)
  • 18. One of the top ranked results for 'Python ZooKeeper'
  • 19. Summary of Python ZooKeeper bindings ● There are about 10 presently ● Many suffer from not handling known edge cases in ZooKeeper ● Some suffer problems with resilient connections ● The following is derived from the python ZooKeeper binding census of Ben Bangert
  • 20. Official Bindings ● State ○ Complete access to the ZooKeeper C bindings ○ Full of sharp edges ○ Not a resilient client ○ No recipes ○ Threads communication with ZooKeeper in a C thread ○ Foundation for most other libraries ○ Very low level
  • 22. Bindings (Cont) High level clients: ● zc.zk ● kazoo ● twitter zookeeper Others: ● zkpython (low level) ● pykeeper (not resiliant) ● zoop (not resiliant)
  • 23. Bindings (Cont) Recipes only: ● zktools ● PettingZoo
  • 24. State of ZooKeeper on Python: Kind of a Mess A project exists to merge some of the high level bindings in an attempt to create a python equivalent of Curator: https://github. com/python-zk Started by Ben Bangert to merge Kazoo & zc. zk with an attempt to implement all Curator recipes.
  • 25. PettingZoo https://github.com/Knewton/pettingzoo-python ● State ○ Relies on zc.zk ○ Documented, doc strings ○ Tests (mock ZooKeeper) ○ All recipes implemented in a Java version as well ● Recipes ○ Distributed Config ○ Distributed Bag ○ Leader Queue ○ Role Match
  • 26. PettingZoo ● In heavy development ● Distributed Discovery, Distributed Bag are well tested and used in production ● Leader Queue and Role Match are tested, but undeployed ● PettingZoo will be ported to or merged with the kazoo effort when it is ready
  • 27. Our Problem Need to be able to do stream processing of observations of student interactions with course material. This involves multiple models that have interdependent parameters. This requires: ● Sharding along different axes dependent upon the models ● Subscriptions between models for parameters ● Dynamic reconfiguration of the environment to deal with current load
  • 28. Distributed Discovery Allows services in a dynamic, distributed environment to be able to be quickly alerted of service address changes. ● Most service discovery recipes only contain host:port, Distributed Discovery can share arbitrary data as well (using yaml) ● Can handle load balancing through random selection of config ● Handles rebalancing on pool change
  • 29. How does this help us scale? ● Makes discovery of dependencies simple ● Adds to reliability of system by quickly removing dead resources ● Makes dynamic reconfiguration simple as additional resources become available
  • 30. Distributed Discovery Example: Write from pettingzoo.discovery import write_distributed_config from pettingzoo.utils import connect_to_zk conn = connect_to_zk( 'localhost:2181' ) config = { 'header': { 'service_class' : 'kestrel', 'metadata' : { 'protocol' : 'memcached' , 'version': 1.0 } }, 'host': 'localhost' , 'port': 22133 } write_distributed_config(conn, 'kestrel', 'platform' , config)
  • 31. Distributed Discovery Example: Read from pettingzoo.discovery import DistributedMultiDiscovery from pettingzoo.utils import connect_to_zk conn = connect_to_zk( 'localhost:2181' ) dmd = DistributedMultiDiscovery(conn) conf = None def update_configs (path, updated_configs): conf.update(updated_configs) conf = dmd.load_config( 'kestrel', 'platform' , callback=update_configs)
  • 32. Distributed Bag Recipe for a distributed bag (dbag) that allows processes to share a collection. Any participant can post or remove data, alerting all others. ● Used as a part of Role Match ● Useful for any case where processes need to share configuration determined at runtime
  • 33. How does this help us scale? ● Can quickly alert processes as to who is subscribing to them ● Reduces load by quickly yanking dead subscriptions ● Provides event based subscriptions, making implementation simpler
  • 34. Distributed Bag ● Sequential items Item contain the actual data Items 1Item 2Item ● Can be ephemeral 3 ● Clients set delete watch on discrete items <bag> ● Token is set to id of highest item ● Clients set a child Tokens Token watch on the "Tokens" 3 node ● Can determine exact adds and deletes with a constant number of messages per delta
  • 35. Distributed Bag Example import yaml from pettingzoo.dbag import DistributedBag from pettingzoo.utils import connect_to_zk ... conn = connect_to_zk( 'localhost:2181' ) dbag = DistributedBag(conn, '/some/data' ) docs = {} def acb(cb_dbag, cb_id): docs[cb_id] = cb_dbag.get(cb_id) def rcb(cb_dbag, cb_id): docs.remove(cb.id) dbag.add_listeners(add_callback=acb, remove_callback=rcb) add_id = dbag.add(yaml.load(document), ephemeral=True) docs[add_id] = document
  • 36. Leader Queue Recipe is similar to Leader Election, but makes it easy to monitor your spare capacity. ● Used in Role Match ● As services are ready to do work, they create an ephemeral, sequential node in the queue. ● Any member always knows if either they are in the queue or at the front ● Watch lets leader know when it is elected
  • 37. How does this help us scale? ● Gives a convenient method of assigning work ● Makes monitoring current excess capacity easy
  • 38. Leader Queue ● Candidates register with sequential, ephemeral nodes C_1 ● Candidate sets delete watch on predecessor ● Candidate is elected when it is the smallest node <queue> C_3 ● When elected, candidate takes over its new role ● When ready, candidate removes itself from the queue C_4 ● Only one candidate needs to call get_children upon any node exiting
  • 39. Leader Queue Usage Example from pettingzoo.leader_queue import LeaderQueue, Candidate from pettingzoo.utils import connect_to_zk class SomeCandidate (Candidate): def on_elected (self): <do something sexy> conn = connect_to_zk( 'localhost:2181' ) leaderq = LeaderQueue(conn) leaderq.add_candidate(SomeCandidate())
  • 40. Role Match Allows systems to expose needed, long lived jobs, and for services to take over those jobs until all are filled. ● Dbag used to expose jobs ● Leader queue used to hold applicants ● Records which jobs are presently held with ephemeral node ● Lets a new process take over if a worker dies ● We use it for sharding/segmentation to dynamically adjust the shards as needed
  • 41. How does this help us scale? ● Core of our ability to dynamically adjust shards ● Lets the controlling process adjust problem spaces and have those tasks become automatically filled ● Monitoring is easy to identify who is working on what, when
  • 42. Role Match job Distributed ● Leader monitors for Bag open jobs ● Job holder creates an ephemeral assignment <match> applicant Leader Queue ● Assignment id matches job id, indicating that it is Assgn claimed assignment 1 A_2
  • 43. Future: Distributed Config Next project is Distributed Config. ● Allows service config to be recorded and changed with a yaml config ● Every process that connects creates a child node of the appropriate service ● Any change in a child node's config overrides the overall service config for that process ● Any change of the parent or child fires a watch to let the process know that it's config has changed
  • 45. Appendix: Official Bindings ● State ○ Complete access to the ZooKeeper C bindings ○ Full of sharp edges ○ Not a resilient client ○ No recipes ○ Threads communication with ZooKeeper in a C thread ○ Foundation for most other libraries ○ Very low level
  • 46. Appendix: zkpython https://github.com/duncf/zkpython/ improvements to a fork of the official bindings ● Status ○ Resilient Client ○ No docs beyond the official bindings ○ Good tests ● Recipes ○ Basic Lock (Using unique id rather than UUID)
  • 47. Appendix: pykeeper https://github.com/nkvoll/pykeeper ● State ○ Non-resilient Client (not resilient to errors) ○ High level client ○ Documented, No doc strings ○ Tests (Requires running ZooKeeper)
  • 48. Appendix: zoop https://github.com/davidmiller/zoop ● State ○ Doesn't handle node create edge-case ○ Doesn't handle retryable exceptions ○ Documented, doc strings ○ Tests (Requires running ZooKeeper) ● Recipes ○ Revocable Lock (Doesn't handle create node edge- case, uses a permanent node instead of ephemeral)
  • 49. Appendix: txzookeeper https://launchpad.net/txzookeeper ● State ○ Resilient Client ○ Doc strings, no additional documentation ○ Supports twisted (only) ○ Well tested ● Recipes ○ Basic Lock (Not using UUID) ○ Queue ○ ReliableQueue ○ SerializedQueue
  • 50. Appendix: twitter zookeeper https://github.com/twitter/commons/tree/master/src/python/twitter/common/zookeeper ● State ○ Resilient Client ○ Handles node create edge-case ○ Some documentation ○ Well Tested ○ Tied to a lot of twitter commons code, difficult to extract ● Recipes ○ Service Registration/Discovery
  • 51. Appendix: gevent-zookeeper https://github.com/jrydberg/gevent-zookeeper/ ● State ○ Supports gevent ○ No documentation ○ No tests
  • 52. Appendix: Kazoo https://github.com/nimbusproject/kazoo ● Status ○ Resilient Client ○ Doc strings, no additional documentation ○ Supports gevent ○ Minimal tests ● Recipes ○ Basic Lock (Uses UUID properly)
  • 53. Appendix: zc.zk https://github.com/python-zk/zc.zk ● State ○ Resilient Client ○ High level client ○ Higher level automatic watch functionality ○ Higher level restoration of ephemeral nodes ○ Tests (Mock ZooKeeper) ○ Documentation, no doc strings ● Recipes ○ Service Registration/Discovery
  • 54. Appendix: zktools https://github.com/mozilla-services/zktools ● State ○ Relies on zc.zk ○ Documented, doc strings ○ Tests (Requires running ZooKeepe) ● Recipes ○ Shared Read/Write Locks ○ AsyncLock ○ Revocable Locks
  • 55. Appendix: PettingZoo https://github.com/Knewton/pettingzoo-python ● State ○ Relies on zc.zk ○ Documented, doc strings ○ Tests (mock ZooKeeper) ○ All Recipes implemented in a java version as well ● Recipes ○ Distributed Config ○ Distributed Bag ○ Leader Queue ○ Role Match