Multi-Master Synchronous Replication
Galera Cluster for MySQL
     August 2012




     Alex Yu
     VP Products

     alex@severalnines.com




                             Confidential
Copyright Severalnines AB




Agenda

     About Severalnines

     What is Galera Replication?

     Galera Concepts

     Node Provisioning

     Network partitioning/Split brain

     Configuration Example

     Benchmarks & Performance Metrics

     Best Practices

     Monitoring and Management






About Us
    Stockholm, Tokyo and Singapore

    Database Automation and DBaaS software vendor

    Over 7,000 deployments to date

    Commercial product launched Q1 2011

    Winner Best Startup EuroCloud Europe 2011

    Launched Europe’s first Data Cloud in Nov 2011

    Press coverage 2011: CIO Magazine, eWeek, PC-World, IDG
     News, Le Figaro, LeMondeInformatique, heise.de, Computerwelt,
     silicon.de, etc …





What is Galera Replication?

      (Virtually) Synchronous Multi-Master Replication
          Read and Write on any Node
          No Master Failover! No Slave Lag!
          Guaranteed write consistency
          Cluster-wide conflict resolution (certification)

      Highly Available and Scalable
          No SPOF
          Read and Write (Parallel Applier threads) scalability
          Geographical Replication (Mix MySQL Async & Galera Sync)

      Cluster (Group Communication Protocol)
          Automatic Node Provisioning, QoS

      [Diagram: two stacks. Left: Application, WSREP API, WSREP Provider, Replication. Right: MySQL Server, WSREP API, wsrep plugin, Replication.]




Galera Cluster for MySQL

     Codership patches for MySQL
          Binaries and source available at Launchpad

     InnoDB (& MyISAM experimental)
          No need to change DB schema/queries
          Local queries

     Parallel Replication!
          Multiple Applier Threads (1-512)
          Row events, row level locks

     Asynchronous Replication
          In/Out of the cluster

     [Diagram: three clients connect through a load balancer (LB) to three MySQL [WSREP] nodes, each marked R/W, linked by Galera Replication (Synchronous).]




Galera Cluster for MySQL cont.

      Higher probability for “deadlocks”
          Cluster wide optimistic locking
          Locking conflicts detected at commit
          First to commit succeeds

      Minimum 3 nodes required
          “Donor” node blocks writes during full sync of joining/recovering node
          3rd node then is available for service
          Gotchas: 2 recovering nodes will block the last node

      Replication performance dependent on
          Network latency
          Performance of the “slowest” or the farthest Node (RTT)
          Number of deployed nodes

      [Diagram: three clients connect through a load balancer (LB) to three MySQL [WSREP] nodes, each marked R/W, linked by Galera Replication (Synchronous).]
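
Because locking conflicts are only detected at commit, an application that writes to several nodes should be prepared to retry a transaction that the cluster rolls back. A minimal sketch, assuming a hypothetical `DeadlockError` raised by your own driver layer when MySQL reports a deadlock (error 1213); `run_with_retry` and its parameters are illustrative names, not part of Galera or of any MySQL driver:

```python
import time


class DeadlockError(Exception):
    """Hypothetical stand-in for a driver error carrying MySQL code 1213."""


def run_with_retry(transaction, max_attempts=3, backoff_seconds=0.0):
    """Call transaction() until it commits or max_attempts is exhausted."""
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except DeadlockError:
            if attempt == max_attempts:
                raise  # give up and surface the conflict to the caller
            time.sleep(backoff_seconds * attempt)  # linear backoff before retrying
```

The callable passed in should contain the whole transaction (BEGIN through COMMIT), since a rolled-back transaction has to be replayed from the start.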




Synchronous Replication
  [Timeline diagram for transaction t1]
  Node 1: BEGIN, statements, COMMIT (REQ). The write set (WS) is sent as a replication
          event and the result is OK or Conflict, leading to COMMIT or Rollback.
          COMMIT (ACK) then returns. The interval from COMMIT (REQ) to COMMIT (ACK)
          is the commit response time.
  Node 2: WS received, Certification, Apply event. Transaction applied (virtually synchronous).
  Node 3: WS received, Certification, Apply event. Transaction applied (virtually synchronous).
  All nodes 100% sync
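
The certification step can be pictured with a toy model. This is a simplified sketch and not Galera's actual implementation: each write set carries the sequence number it last saw (`snapshot_seqno`, a hypothetical field) and the keys it modified, and it fails certification if any of those keys was changed by a write set certified after that snapshot. Because every node runs the same check over the same ordered stream, all nodes reach the same OK/Conflict verdict.

```python
class Certifier:
    """Toy first-committer-wins certification over an ordered write-set stream."""

    def __init__(self):
        self.seqno = 0          # last certified sequence number
        self.last_writer = {}   # key -> seqno of the write set that last changed it

    def certify(self, snapshot_seqno, keys):
        """Return True (OK) or False (Conflict) for one write set."""
        for key in keys:
            if self.last_writer.get(key, 0) > snapshot_seqno:
                return False    # a later commit already touched this key
        self.seqno += 1
        for key in keys:
            self.last_writer[key] = self.seqno
        return True
```

Two transactions that start from the same snapshot and touch the same row illustrate "first to commit succeeds": the first call returns True, the second returns False.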




Galera Concepts

    Application State
         The set of data that the application decides to replicate
         Default is all MySQL databases. Every node is a complete replica
         Application state is identified by a Global Transaction ID

    Global Transaction ID (GTID)
         f7720ae0-6f9b-11e1-0800-598d1b386dce:32520198989
             CLUSTER/HISTORY/STATE UUID:TRX/STATE/SEQNO
         All replicated transactions can be uniquely referenced in any node

    Initial state: f7720ae0-6f9b-11e1-0800-598d1b386dce:0

    Undefined state: 00000000-0000-0000-0000-000000000000:-1
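
A GTID of this form splits cleanly at the last colon into the state UUID and the sequence number. A small sketch using only the Python standard library; the names `Gtid` and `parse_gtid` are illustrative and not part of any Galera tooling:

```python
from typing import NamedTuple
from uuid import UUID


class Gtid(NamedTuple):
    state_uuid: UUID
    seqno: int

    @property
    def is_undefined(self) -> bool:
        # The undefined state is the all-zero UUID paired with seqno -1.
        return self.state_uuid.int == 0 and self.seqno == -1


def parse_gtid(text: str) -> Gtid:
    """Split 'UUID:SEQNO' at the last colon and validate both parts."""
    uuid_part, seqno_part = text.rsplit(":", 1)
    return Gtid(UUID(uuid_part), int(seqno_part))
```

For example, `parse_gtid("f7720ae0-6f9b-11e1-0800-598d1b386dce:0")` yields the initial state of that cluster, with `seqno` equal to 0.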




Galera Concepts cont.

     Primary Component - PC
          The whole cluster is a PC during normal operation
          Node and network failures
              Split the cluster into several components

     Only the PC can continue to modify state

     Quorum algorithm invoked to select a PC during cluster partitioning
          Majority rules
          Minority tries to reconnect with PC

     [Diagram: three MySQL [WSREP] nodes forming one Primary Component.]








Galera Concepts cont.

     State Snapshot Transfer - SST
          A transfer of a consistent snapshot of a node state corresponding to a
           certain GTID
          Initialize the state of a newly joining cluster node from an already
           initialized node (donor)

     Incremental State Transfer - IST
          Catch up with the cluster by replaying missing transactions
              Known initial node state
              Enough transactions cached at the donor
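
The two IST preconditions can be read as a decision rule. A rough sketch with hypothetical inputs: the joiner's last known state (UUID and seqno) and the lowest seqno the donor still holds in its cache. Galera makes this choice internally, so the function below only illustrates the logic:

```python
UNDEFINED_SEQNO = -1


def choose_state_transfer(joiner_uuid, joiner_seqno, cluster_uuid, donor_cache_first_seqno):
    """Return 'IST' when the joiner can catch up from the donor's cache, else 'SST'."""
    known_state = joiner_uuid == cluster_uuid and joiner_seqno != UNDEFINED_SEQNO
    if not known_state:
        return "SST"  # unknown or foreign history: a full snapshot is needed
    next_needed = joiner_seqno + 1
    if donor_cache_first_seqno is not None and donor_cache_first_seqno <= next_needed:
        return "IST"  # every missing transaction is still cached at the donor
    return "SST"
```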








Galera Concepts cont.

     Node Failures
          A peer crash is indistinguishable from network failure
          A node is considered failed when it can no longer be reached

     Node health verified by receiving messages or keepalives
          evs.inactive_timeout
              sets the timeout after which node is considered inactive (dead)
          evs.suspect_timeout
              sets the timeout after which the node can be pronounced dead if
               everyone else agrees








Galera Concepts cont.

     LAN vs WAN replication
          No notion of local or remote node
          Works as long as TCP works

     May need tuning to be more tolerant to network latency/issues

     Network params sample
          evs.keepalive_period = PT3S
          evs.inactive_check_period = PT10S
          evs.suspect_timeout = PT30S
          evs.inactive_timeout = PT1M
          evs.consensus_timeout = PT1M
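
These settings are provider options, so they are passed as a semicolon-separated list in `wsrep_provider_options`. A small sketch that assembles that string from the sample values above; the helper name `format_provider_options` is made up for illustration:

```python
def format_provider_options(options):
    """Join key/value pairs into the 'key=value;key=value' form."""
    return ";".join(f"{key}={value}" for key, value in options.items())


# Sample WAN-tolerant values taken from the slide above.
wan_options = {
    "evs.keepalive_period": "PT3S",
    "evs.inactive_check_period": "PT10S",
    "evs.suspect_timeout": "PT30S",
    "evs.inactive_timeout": "PT1M",
    "evs.consensus_timeout": "PT1M",
}

print(f"wsrep_provider_options='{format_provider_options(wan_options)}'")
```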





Node Provisioning

     Automatic node (re)synchronization

     A ‘donor’ is chosen to provision a ‘joiner’ node
          ‘Donor’ node is blocked (write operations) until SST completes

     State Snapshot Transfer - SST
          Scriptable interface
              mysqldump (slow)
              rsync (fast)
              Percona Xtrabackup (faster and non-blocking)








Node Provisioning cont.

     [Diagram: three clients connect through a load balancer to Node 1 and Node 2, each running MySQL [WSREP].]




Node Provisioning cont.

     [Diagram: three clients, load balancer, Node 1 and Node 2 (MySQL [WSREP]). A new ‘Joiner’ Node 3 (MySQL [WSREP]) appears next to Node 2.]




Node Provisioning cont.

     [Diagram: ‘Joiner’ Node 3, configured with wsrep_cluster_address=Node 2, sends an SST Request to Node 2 and waits in rsync receive.]




Node Provisioning cont.

     [Diagram: Node 2 is in ‘donor mode’ with write operations blocked. It runs rsync send while ‘Joiner’ Node 3 runs rsync receive.]




Node Provisioning cont.

     [Diagram: Node 3 catches up with Node 1 and Node 2 after the state transfer.]




Network Partitioning/Split Brain

     Quorum based system
          “Majority >50%” partition continues operation
          “Minority” partition blocks operations
              Until reconnected with Primary Component

     Use odd number of nodes
          Minimum 3 (5, 7, 9 etc)

     Galera Arbitrator (garbd)
          Useful if you have even number of nodes
          Nodes across DCs
          Replication relay
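
The majority rule is easy to check by hand. A simplified sketch that counts every member equally and ignores any node weighting; `has_quorum` is an illustrative name and not a Galera function:

```python
def has_quorum(partition_size, cluster_size):
    """True when the partition holds a strict majority (> 50%) of the members."""
    return 2 * partition_size > cluster_size


# A 4-node cluster split 2/2: neither side can continue.
print(has_quorum(2, 4))
# Adding an arbitrator makes 5 members: the side that still sees it (2 + 1) continues.
print(has_quorum(3, 5))
```

This is why an even number of nodes benefits from garbd: the arbitrator counts as a member for quorum purposes and breaks the tie.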





Network Partitioning/Split Brain cont.
     [Diagram: three clients and a load balancer in front of three MySQL [WSREP] nodes, two in DC1 and one in DC2. Label: 1 Primary Component.]




Network Partitioning/Split Brain cont.
     [Diagram: same topology, two nodes in DC1 and one in DC2. Label: Primary Component ? The node in DC2 blocks operations until reconnected with the PC.]




Network Partitioning/Split Brain cont.
     [Diagram: two MySQL [WSREP] nodes in DC1, one in DC2, and a Galera Arbitrator in DC3.]




Network Partitioning/Split Brain cont.
     [Diagram: two MySQL [WSREP] nodes in DC1, one in DC2, and a Galera Arbitrator in DC3. Label between DC1 and DC2: Replication Relay.]




Network Partitioning/Split Brain cont.
     [Diagram: two MySQL [WSREP] nodes in DC1, one in DC2, and a Galera Arbitrator in DC3. Label: Primary Component ?]




Galera Configuration Example
 [mysqld]
 wsrep_cluster_address=gcomm:// # NOTE: This must be changed to peer address ASAP!
 wsrep_node_address=<this node's address>
 wsrep_node_name=node1
 wsrep_provider='/usr/lib64/galera/libgalera_smm.so'
 wsrep_provider_options='gcache.size=1G;socket.ssl_key=my_key;socket.ssl_cert=my_cert'
 wsrep_slave_threads=16
 wsrep_sst_method=xtrabackup
 wsrep_sst_auth=root:

 innodb_buffer_pool_size=1G
 innodb_log_file_size=256M
 innodb_autoinc_lock_mode=2
 innodb_flush_log_at_trx_commit=0
 innodb_doublewrite=0
 innodb_file_per_table=1
 binlog_format=ROW
 datadir=/var/lib/mysql
 log-bin = mysql-bin
 server-id = 2
 relay-log = mysql-relay-bin
 #read-only = 1
 log-slave-updates = 1
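
A configuration like this can be sanity-checked before a node is started. The sketch below is a deliberately naive parser (it treats every `#` as the start of a comment, so it would mangle values that contain one) and checks two settings Galera depends on, `binlog_format=ROW` and `innodb_autoinc_lock_mode=2`, plus a leftover bootstrap address. The function names are hypothetical and the list of checks is not exhaustive:

```python
REQUIRED = {"binlog_format": "ROW", "innodb_autoinc_lock_mode": "2"}


def parse_mysqld_section(text):
    """Collect key=value pairs from the [mysqld] section of a my.cnf-style string."""
    settings = {}
    in_mysqld = False
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # naive comment stripping
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            in_mysqld = line == "[mysqld]"
            continue
        if in_mysqld and "=" in line:
            key, value = line.split("=", 1)
            # MySQL accepts dashes and underscores interchangeably in option names.
            settings[key.strip().replace("-", "_")] = value.strip().strip("'\"")
    return settings


def check_galera_settings(settings):
    """Return a list of human-readable problems; an empty list means no findings."""
    problems = []
    for key, expected in REQUIRED.items():
        actual = settings.get(key)
        if actual is None or actual.upper() != expected:
            problems.append(f"{key} should be {expected}, found {actual}")
    if settings.get("wsrep_cluster_address") == "gcomm://":
        problems.append("wsrep_cluster_address is still gcomm://; point it at a peer")
    return problems
```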





wsrep variables

     wsrep_provider
          Path to wsrep provider library

     wsrep_cluster_address
          URI form: 'gcomm://another_node_address?opt1=val1&opt2=val2'
          'gcomm://' has a special meaning: it initializes a new cluster (never leave it in my.cnf)

     wsrep_node_address
          An optional address of the node. A short-cut way to configure listen
           addresses for replication and state transfers
          By default it will be initialized to the first network interface returned by
           ifconfig. This could be unreliable.
          For best results initialize it explicitly
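
The URI form of `wsrep_cluster_address` can be put together mechanically. A minimal sketch, assuming a single peer address and plain `key=value` options joined with `&` as in the form shown above; `build_cluster_address` is an illustrative helper, not part of Galera:

```python
def build_cluster_address(peer, options=None):
    """Return a gcomm:// URI; an empty peer yields the bootstrap form 'gcomm://'."""
    address = "gcomm://" + peer
    if options:
        # Options are appended verbatim, without percent-encoding.
        address += "?" + "&".join(f"{key}={value}" for key, value in options.items())
    return address
```

Calling it with an empty peer reproduces the special bootstrap value, which is a reminder of why that value should not stay in my.cnf once the cluster is running.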




wsrep variables cont.

     wsrep_node_name
          An optional name for the node. It will be used in logging and to identify the
           desired donor for state transfer
          By default it is initialized to the hostname

     wsrep_provider_options
          Semicolon-separated list of options specific to provider
          Ex:
              gcache.size – the size of the permanent on-disk transaction cache
              socket.ssl_key, socket.ssl_cert – SSL key and certificate files








wsrep variables cont.

     wsrep_slave_threads
          Parallel applying threads (1-512)
          >1 requires certain InnoDB settings. Applying of STATEMENT-based
           events is always serialized

     wsrep_sst_method
          Base package contains scripts for mysqldump, rsync and xtrabackup
           based state snapshot transfers. Own scripts can be used
          Default is mysqldump








Performance Metrics

     wsrep_flow_control_paused
         Fraction of the time replication was paused

     wsrep_flow_control_sent
         How many times this node paused replication

     wsrep_local_recv_queue_avg
         Average length of slave trx queue – a sign of slave side bottleneck

     wsrep_cert_deps_distance
         How many transactions can be applied in parallel

     wsrep_local_send_queue_avg
         A sign of network bottleneck
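
These counters come from `SHOW STATUS LIKE 'wsrep_%'` and can be turned into simple warnings. A sketch that takes one snapshot as a dictionary; the thresholds `paused_limit` and `queue_limit` are arbitrary placeholders chosen for illustration, not recommended values:

```python
def diagnose(status, paused_limit=0.1, queue_limit=0.5):
    """Return human-readable warnings for one snapshot of wsrep status values."""
    warnings = []
    if float(status.get("wsrep_flow_control_paused", 0)) > paused_limit:
        warnings.append("replication is paused too often (flow control)")
    if float(status.get("wsrep_local_recv_queue_avg", 0)) > queue_limit:
        warnings.append("receive queue is backing up (slave side bottleneck)")
    if float(status.get("wsrep_local_send_queue_avg", 0)) > queue_limit:
        warnings.append("send queue is backing up (network bottleneck)")
    return warnings
```

Status values arrive from MySQL as strings, which is why each one is converted with `float()` before the comparison.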




Number of conflicts/“deadlocks”

     wsrep_last_committed
          Last committed transaction

     wsrep_local_cert_failures, wsrep_local_bf_aborts
          Rollbacks, conflicts detected
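
Sampling these counters twice gives a rough conflict ratio for the interval. A sketch under the assumption that both snapshots come from the same node; note that the failure counters are local to that node while `wsrep_last_committed` advances with every transaction in the cluster, so the result is an indicator rather than an exact rate:

```python
def conflict_rate(before, after):
    """Local conflicts per committed transaction between two status snapshots."""
    def total_conflicts(sample):
        return int(sample["wsrep_local_cert_failures"]) + int(sample["wsrep_local_bf_aborts"])

    committed = int(after["wsrep_last_committed"]) - int(before["wsrep_last_committed"])
    if committed <= 0:
        return 0.0  # nothing committed in the interval, so no meaningful ratio
    return (total_conflicts(after) - total_conflicts(before)) / committed
```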








Benchmarks: sysbench, tps




                http://codership.com/content/whats-difference-kenneth




Benchmarks: sysbench, latency




                http://codership.com/content/whats-difference-kenneth




Benchmarks: Comparing NDB vs Galera




                    Note: No optimizations were done for the NDB storage engine (neither DB schema nor queries)

                http://codership.com/content/whats-difference-kenneth








Best Practices

      Dedicated switch/network for Galera Nodes (1 GBit min)

      Connection pools/Load balancing with applications
          Gives best performance
          Use static/elastic IPs for the Galera nodes
          Con: Need to handle node membership changes
          Con: JDBC/PHP etc are not aware of Galera specific Node states

      Load Balancers
          Hardware, e.g., F5
          SW load balancer
              HAProxy with Galera specific health check scripts
              IP dispatching in the kernel, for example Linux LVS
              GLB (Galera Load Balancer)
              Con: Need to set up LB redundancy
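
A Galera-specific health check usually boils down to reading a few wsrep status variables on the node. A sketch of that decision, assuming the status has already been fetched into a dictionary; `is_node_available` and the `allow_donor` switch are illustrative and do not reproduce any particular HAProxy check script:

```python
SYNCED = "4"  # wsrep_local_state value for a fully synced node
DONOR = "2"   # wsrep_local_state value for a donor/desynced node


def is_node_available(status, allow_donor=False):
    """Decide whether a load balancer should route traffic to this node."""
    if status.get("wsrep_cluster_status") != "Primary":
        return False  # node is outside the Primary Component
    if status.get("wsrep_ready") != "ON":
        return False  # node is not accepting queries
    state = str(status.get("wsrep_local_state"))
    return state == SYNCED or (allow_donor and state == DONOR)
```

A check script would typically return HTTP 200 when this is true and 503 otherwise, so the load balancer can take joining or partitioned nodes out of rotation.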





Best Practices cont.

     Reference Node
          Act as a ‘donor’ node
          Backup node
          No client connections

     [Diagram: three clients and a load balancer (LB) in front of MySQL [WSREP] nodes marked R/W; the last node is labelled Donor & Backup Node.]








Best Practices cont.

     Minimize probability of deadlocks
          Writes go only to 1 Node
          Applications use connection pool or load balancer on read only nodes
          Have 1 “reference” Node for write failover and donor

     [Diagram: three clients and a load balancer (LB) in front of MySQL [WSREP] nodes; two are marked R and the last, labelled “Master” Node, is marked W.]








Galera Limitations

      MyISAM replication is experimental
          DDL statements are replicated at statement level
          Any writes to other table types, including system (mysql.*) tables, are not replicated
          CREATE USER... is replicated, but INSERT INTO mysql.user... will not be replicated
          Non-deterministic functions like NOW() are not supported

      Query log cannot be directed to a table

      LOCK/UNLOCK TABLES cannot be supported in multi-master setups
          The same applies to lock functions (GET_LOCK(), RELEASE_LOCK(), ...)

      Maximum allowed transaction size is defined by wsrep_max_ws_rows
       and wsrep_max_ws_size

      XA transactions cannot be supported due to possible rollback on commit





Monitoring and Management








ClusterControl

  Host Monitoring (CPU, RAM, Disk, Network)                   Configuration Management

  DB Metrics Monitoring                                       Performance Management

  DB Resources Monitoring                                     Database Upgrades/Downgrades

  Cluster-wide Query Analyzer                                 Online Scaling of MySQL Servers

  Schema Management                                           Configurable Resource Thresholds

  Replication Fail-over                                       Alarms and Email Notifications

  Clusterware – Process Management and Automated Recovery     Backup Scheduling

  Manual start/stop of Nodes

  Real-time Performance Probes





Configurators








Galera Configurator












Deploy Galera Cluster with HAProxy

     cd ~/s9s-galera-2.10/mysql/scripts/install

     ./deploy.sh 2>&1 | tee -a cc.log

     wget http://severalnines.com/downloads/s9s-haproxy.tar.gz

     tar zxvf s9s-haproxy.tar.gz

     cd haproxy

     ./install-haproxy.sh <lb host> <rhel|debian> galera

     done...









Resources

     Severalnines MySQL Galera Configurator
         http://www.severalnines.com/resources/configurator

     Supported platforms (MySQL Galera)
         http://support.severalnines.com/entries/21589522-verified-and-supported-operating-systems

     Galera limitations
         http://support.severalnines.com/entries/21692388-limitations-in-galera-replication-for-mysql

     ClusterControl server requirements
         http://support.severalnines.com/entries/20614858-server-requirements-on-premise-amis-other-images

                                            Confidential                                            48

Galera cluster for MySQL - Introduction Slides

  • 1. Multi-Master Synchronous Replication: Galera Cluster for MySQL. August 2012. Alex Yu, VP Products, alex@severalnines.com
  • 2. Agenda: About Severalnines; What is Galera Replication?; Galera Concepts; Node Provisioning; Network Partitioning/Split Brain; Configuration Example; Benchmarks & Performance Metrics; Best Practices; Monitoring and Management.
  • 3. About Us. Stockholm, Tokyo and Singapore. Database automation and DBaaS software vendor. Over 7,000 deployments to date. Commercial product launched Q1 2011. Winner, Best Startup, EuroCloud Europe 2011. Launched Europe’s first Data Cloud in Nov 2011. Press coverage 2011: CIO Magazine, eWeek, PC-World, IDG News, Le Figaro, LeMondeInformatique, heise.de, Computerwelt, silicon.de, etc.
  • 4. What is Galera Replication? Synchronous (virtually) multi-master replication: read and write on any node; no master failover, no slave lag; guaranteed write consistency; cluster-wide conflict resolution (certification). Highly available and scalable: no SPOF; read and write scalability (parallel applier threads); geographical replication (mix of MySQL async and Galera sync). Cluster (group communication protocol): automatic node provisioning, QoS. [Diagram labels: Application, MySQL Server, WSREP API, WSREP Provider, wsrep plugin, Replication.]
  • 5. Galera Cluster for MySQL. Codership patches for MySQL: binaries and source available at Launchpad. InnoDB (MyISAM support experimental): no need to change DB schema or queries; local queries. Parallel replication: multiple applier threads (1-512); row events, row-level locks. Asynchronous replication: in/out of the cluster. [Diagram: three clients behind a load balancer (LB) reading and writing (R/W) to three MySQL [WSREP] nodes linked by synchronous Galera replication.]
  • 6. Galera Cluster for MySQL cont. Higher probability of “deadlocks”: cluster-wide optimistic locking; locking conflicts detected at commit; first to commit succeeds. Minimum 3 nodes required: the “donor” node blocks writes during full sync of a joining/recovering node; the 3rd node is then available for service. Gotcha: 2 recovering nodes will block the last node. Replication performance depends on: network latency; performance of the “slowest” or the farthest node (RTT); number of deployed nodes. [Diagram: three clients behind a load balancer (LB) reading and writing (R/W) to three MySQL [WSREP] nodes linked by synchronous Galera replication.]
  • 7. Synchronous Replication. [Timeline diagram: transaction t1 on Node 1 runs BEGIN, its statements, then COMMIT (REQ); the write set (WS) is sent as a replication event; certification yields OK or Conflict; the transaction then commits or rolls back and COMMIT (ACK) returns. The commit response time runs from COMMIT (REQ) to COMMIT (ACK). On Node 2 and Node 3 the WS goes through certification and an apply event, so the transaction is applied virtually synchronously.] All nodes 100% in sync.
  • 8. Galera Concepts. Application state: a set of data that the application decides to replicate. The default is all MySQL databases; every node is a complete replica. Application state is identified by a Global Transaction ID (GTID), for example f7720ae0-6f9b-11e1-0800-598d1b386dce:32520198989, in the form CLUSTER/HISTORY/STATE UUID:TRX/STATE/SEQNO. All replicated transactions can be uniquely referenced on any node. Initial state: f7720ae0-6f9b-11e1-0800-598d1b386dce:0. Undefined state: 00000000-0000-0000-0000-000000000000:-1.
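The GTID format on slide 8 can be illustrated with a short parser. This is a minimal sketch, not part of Galera: the names `Gtid`, `parse_gtid` and `is_undefined` are hypothetical, and it only assumes the `UUID:SEQNO` form and the two special values quoted on the slide.

```python
import uuid
from typing import NamedTuple

# The all-zero UUID that the slide pairs with seqno -1 for the undefined state.
UNDEFINED_UUID = uuid.UUID(int=0)


class Gtid(NamedTuple):
    """Hypothetical container for a 'UUID:SEQNO' identifier."""
    state_uuid: uuid.UUID
    seqno: int


def parse_gtid(text: str) -> Gtid:
    # Split on the last colon: the UUID part contains hyphens but no colons.
    uuid_part, sep, seqno_part = text.strip().rpartition(":")
    if not sep:
        raise ValueError(f"expected 'UUID:SEQNO', got {text!r}")
    return Gtid(uuid.UUID(uuid_part), int(seqno_part))


def is_undefined(gtid: Gtid) -> bool:
    # Matches the 'undefined state' value shown on the slide.
    return gtid.state_uuid == UNDEFINED_UUID and gtid.seqno == -1


example = parse_gtid("f7720ae0-6f9b-11e1-0800-598d1b386dce:32520198989")
print(example.state_uuid, example.seqno)
```

Two identifiers with the same UUID can then be compared by `seqno` to see which node has applied more of the same history.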
  • 9. Galera Concepts cont. Primary Component (PC): the whole cluster is a PC during normal operation. Node and network failures split the cluster into several components. Only the PC can continue to modify state. A quorum algorithm is invoked to select a PC during cluster partitioning: majority rules; the minority tries to reconnect with the PC. [Diagram: three MySQL [WSREP] nodes forming the Primary Component.]
  • 10. Galera Concepts cont. State Snapshot Transfer (SST): a transfer of a consistent snapshot of a node state corresponding to a certain GTID; initializes the state of a newly joining cluster node from an already initialized node (donor). Incremental State Transfer (IST): catch up with the cluster by replaying missing transactions; requires a known initial node state and enough transactions cached at the donor.
  • 11. Galera Concepts cont. Node failures: a peer crash is indistinguishable from a network failure; a node is considered failed when it can no longer be communicated with. Node health is verified by receiving messages or keepalives. evs.inactive_timeout sets the timeout after which a node is considered inactive (dead). evs.suspect_timeout sets the timeout after which a node can be pronounced dead if everyone else agrees.
  • 12. Galera Concepts cont. LAN vs WAN replication: no notion of local or remote node; works as long as TCP works. May need tuning to be more tolerant of network latency/issues. Sample network parameters: evs.keepalive_period = PT3S; evs.inactive_check_period = PT10S; evs.suspect_timeout = PT30S; evs.inactive_timeout = PT1M; evs.consensus_timeout = PT1M.
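The sample values on slide 12 are written as ISO 8601 durations (PT3S is 3 seconds, PT1M is 1 minute). As a rough sketch for reading them, the helper below converts the hour/minute/second subset of that notation to seconds; `duration_seconds` is a hypothetical name, and the parser deliberately handles only the `PT…` forms used on the slide.

```python
import re

# Only the time part of ISO 8601 durations (hours, minutes, seconds) is handled.
_DURATION = re.compile(r"PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+(?:\.\d+)?)S)?")


def duration_seconds(text: str) -> float:
    """Convert a duration such as 'PT30S' or 'PT1M' to seconds."""
    match = _DURATION.fullmatch(text.strip())
    # A bare 'PT' matches the pattern but carries no value, so reject it too.
    if match is None or not any(match.groups()):
        raise ValueError(f"unsupported duration: {text!r}")
    hours, minutes, seconds = match.groups()
    return int(hours or 0) * 3600 + int(minutes or 0) * 60 + float(seconds or 0)


# The sample parameters from the slide.
sample = {
    "evs.keepalive_period": "PT3S",
    "evs.inactive_check_period": "PT10S",
    "evs.suspect_timeout": "PT30S",
    "evs.inactive_timeout": "PT1M",
    "evs.consensus_timeout": "PT1M",
}
for name, value in sample.items():
    print(f"{name}: {duration_seconds(value):g} s")
```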
  • 13. Node Provisioning. Automatic node (re)synchronization: a ‘donor’ is chosen to provision a ‘joiner’ node; the ‘donor’ node is blocked (write operations) until SST completes. State Snapshot Transfer (SST): scriptable interface; mysqldump (slow); rsync (fast); Percona XtraBackup (faster and non-blocking).
  • 14. Node Provisioning cont. [Diagram: clients, load balancer, Node 1 and Node 2, each MySQL [WSREP].]
  • 15. Node Provisioning cont. [Diagram: a ‘joiner’ Node 3 appears alongside Node 1 and Node 2.]
  • 16. Node Provisioning cont. [Diagram: ‘joiner’ Node 3 is started with wsrep_cluster_address=Node 2 and sends an SST request; rsync receive runs on Node 3.]
  • 17. Node Provisioning cont. [Diagram: Node 2 is in ‘donor mode’ with write operations blocked; rsync send on Node 2, rsync receive on ‘joiner’ Node 3.]
  • 18. Node Provisioning cont. [Diagram: Node 3 catches up and runs alongside Node 1 and Node 2.]
  • 19. Network Partitioning/Split Brain. Quorum-based system: the “majority” (>50%) partition continues operation; the “minority” partition blocks operations until reconnected with the Primary Component. Use an odd number of nodes: minimum 3 (then 5, 7, 9, etc.). Galera Arbitrator (garbd): useful if you have an even number of nodes; nodes across DCs; replication relay.
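The majority rule on slide 19 comes down to one comparison. The sketch below is a simplification that assumes every member (database node or arbitrator) counts as one equal vote; `has_quorum` is a hypothetical helper, not a Galera API.

```python
def has_quorum(reachable_members: int, last_known_size: int) -> bool:
    """Return True if a partition holds a strict majority (>50%) of the cluster.

    Assumption: every member, including an arbitrator, has exactly one vote.
    """
    if not 0 <= reachable_members <= last_known_size:
        raise ValueError("reachable_members must be between 0 and last_known_size")
    # Integer arithmetic avoids rounding problems with a 0.5 threshold.
    return 2 * reachable_members > last_known_size


# 3 nodes split 2/1: the side with 2 continues, the side with 1 blocks.
print(has_quorum(2, 3), has_quorum(1, 3))
# 4 nodes split 2/2: neither side has a majority, so both block.
print(has_quorum(2, 4))
# 4 nodes plus an arbitrator, split 3/2: the side of 3 continues.
print(has_quorum(3, 5))
```

The 2/2 case shows why the slide recommends an odd number of members, and why an arbitrator helps when the number of database nodes is even.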
  • 20. Network Partitioning/Split Brain cont. [Diagram: clients and load balancer; three MySQL [WSREP] nodes spread over DC1 and DC2, forming 1 Primary Component.]
  • 21. Network Partitioning/Split Brain cont. [Diagram: same layout after a partition between DC1 and DC2; labels “Primary Component ?” and “Block operations until reconnected with PC”.]
  • 22. Network Partitioning/Split Brain cont. [Diagram: MySQL [WSREP] nodes in DC1 and DC2, with a Galera Arbitrator in DC3.]
  • 23. Network Partitioning/Split Brain cont. [Diagram: MySQL [WSREP] nodes in DC1 and DC2, with a Galera Arbitrator in DC3; label “Replication Relay”.]
  • 24. Network Partitioning/Split Brain cont. [Diagram: MySQL [WSREP] nodes in DC1 and DC2, with a Galera Arbitrator in DC3; label “Primary Component ?”.]
  • 25. Galera Configuration Example
      [mysqld]
      # NOTE: This must be changed to peer address ASAP!
      wsrep_cluster_address=gcomm://
      wsrep_node_name=node1
      wsrep_provider='/usr/lib64/galera/libgalera_smm.so'
      wsrep_provider_options='gcache.size=1G;socket.ssl_key=my_key;socket.ssl_cert=my_cert'
      wsrep_slave_threads=16
      wsrep_sst_method=xtrabackup
      wsrep_sst_auth=root:
      innodb_buffer_pool_size=1G
      innodb_log_file_size=256M
      innodb_autoinc_lock_mode=2
      innodb_flush_log_at_trx_commit=0
      innodb_doublewrite=0
      innodb_file_per_table=1
      binlog_format=ROW
      datadir=/var/lib/mysql
      log-bin = mysql-bin
      server-id = 2
      relay-log = mysql-relay-bin
      #read-only = 1
      log-slave-updates = 1
  • 26. wsrep variables. wsrep_provider: path to the wsrep provider library. wsrep_cluster_address: URI of the form 'gcomm://another_node_address?opt1=val1&opt2=val2'; a bare 'gcomm://' has a special meaning: it initializes the cluster (never leave it in my.cnf). wsrep_node_address: an optional address of the node, a shortcut for configuring the listen addresses for replication and state transfers. By default it is initialized to the first network interface returned by ifconfig, which can be unreliable; for best results set it explicitly.
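The URI form on slide 26 can be assembled programmatically, for example when generating my.cnf from a template. This is a minimal sketch: `cluster_address` is a hypothetical helper, and `another_node_address`, `opt1` and `opt2` are the slide's own placeholders rather than real option names.

```python
from urllib.parse import urlencode


def cluster_address(peer, options=None):
    """Build a 'gcomm://peer?opt1=val1&opt2=val2' style address."""
    if not peer:
        # A bare 'gcomm://' initializes a new cluster, so refuse to emit it by accident.
        raise ValueError("empty peer: a bare 'gcomm://' would initialize a new cluster")
    address = f"gcomm://{peer}"
    if options:
        # urlencode joins the pairs with '&' and percent-encodes special characters.
        address += "?" + urlencode(options)
    return address


print(cluster_address("another_node_address", {"opt1": "val1", "opt2": "val2"}))
```

Rejecting an empty peer is a design choice that follows the slide's warning never to leave 'gcomm://' in my.cnf.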
  • 27. wsrep variables cont. wsrep_node_name: an optional name for the node, used in logging and to identify the desired donor for state transfer; by default it is initialized to the hostname. wsrep_provider_options: semicolon-separated list of provider-specific options, for example gcache.size (size of the permanent on-disk transaction cache) and socket.ssl_key, socket.ssl_cert (SSL key and certificate files).
  • 28. wsrep variables cont.
      • wsrep_slave_threads
          • Number of parallel applying threads (1-512)
          • Values >1 require certain InnoDB settings. Applying of STATEMENT-based events is always serialized
      • wsrep_sst_method
          • The base package contains scripts for mysqldump-, rsync- and xtrabackup-based state snapshot transfers. Custom scripts can also be used
          • Default is mysqldump
  • 29. Performance Metrics
      • wsrep_flow_control_paused: fraction of the time replication was paused
      • wsrep_flow_control_sent: how many times this node paused replication
      • wsrep_local_recv_queue_avg: average length of the slave trx queue; a sign of a slave-side bottleneck
      • wsrep_cert_deps_distance: how many transactions can be applied in parallel
      • wsrep_local_send_queue_avg: a sign of a network bottleneck
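These counters can be turned into simple findings once they have been read from the server (for example with SHOW GLOBAL STATUS LIKE 'wsrep_%'). The sketch below takes the values as a plain dictionary of strings. The threshold values are arbitrary assumptions for illustration, not recommendations from the slides, and `diagnose` is a hypothetical helper.

```python
def diagnose(status: dict, paused_limit: float = 0.1, queue_limit: float = 0.5) -> list:
    """Map wsrep status values to the bottleneck hints listed on the slide."""
    findings = []
    if float(status.get("wsrep_flow_control_paused", 0)) > paused_limit:
        findings.append("replication is paused by flow control too often")
    if float(status.get("wsrep_local_recv_queue_avg", 0)) > queue_limit:
        findings.append("slave-side (applying) bottleneck")
    if float(status.get("wsrep_local_send_queue_avg", 0)) > queue_limit:
        findings.append("network bottleneck")
    return findings


# Hypothetical sample values, as strings the way the server reports them.
sample = {
    "wsrep_flow_control_paused": "0.25",
    "wsrep_local_recv_queue_avg": "3.2",
    "wsrep_local_send_queue_avg": "0.0",
}
report = diagnose(sample)
```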
  • 30. Number of conflicts/"deadlocks"
      • wsrep_last_committed: last committed transaction
      • wsrep_local_cert_failures, wsrep_local_bf_aborts: rollbacks, conflicts detected
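Since these are cumulative counters, a conflict rate only makes sense as a difference between two samples. The following is a rough sketch under that assumption; `conflict_rate` is a hypothetical helper, the sample numbers are invented, and the ratio is only an approximate indicator because the failure counters are local to one node.

```python
def conflict_rate(before: dict, after: dict) -> float:
    """Conflicts per committed transaction between two status samples."""
    commits = int(after["wsrep_last_committed"]) - int(before["wsrep_last_committed"])
    conflicts = sum(
        int(after[name]) - int(before[name])
        for name in ("wsrep_local_cert_failures", "wsrep_local_bf_aborts")
    )
    # No commits in the interval: report 0.0 rather than dividing by zero.
    return conflicts / commits if commits > 0 else 0.0


# Invented samples taken some time apart on the same node.
first = {"wsrep_last_committed": "1000", "wsrep_local_cert_failures": "3", "wsrep_local_bf_aborts": "1"}
second = {"wsrep_last_committed": "1500", "wsrep_local_cert_failures": "8", "wsrep_local_bf_aborts": "6"}
rate = conflict_rate(first, second)
```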
  • 31. Benchmarks: sysbench, tps (chart; source: http://codership.com/content/whats-difference-kenneth)
  • 32. Benchmarks: sysbench, latency (chart; source: http://codership.com/content/whats-difference-kenneth)
  • 33. Benchmarks: Comparing NDB vs Galera (chart; source: http://codership.com/content/whats-difference-kenneth). Note: no optimizations were done for the NDB storage engine (neither DB schema nor queries)
  • 34. Benchmarks: Comparing NDB vs Galera (chart; source: http://codership.com/content/whats-difference-kenneth). Note: no optimizations were done for the NDB storage engine (neither DB schema nor queries)
  • 35. Best Practices
      • Dedicated switch/network for Galera nodes (1 Gbit min)
      • Connection pools/load balancing within applications
          • Gives best performance
          • Use static/elastic IPs for the Galera nodes
          • Con: Need to handle node membership changes
          • Con: JDBC/PHP etc. are not aware of Galera-specific node states
      • Load balancers
          • Hardware, e.g., F5
          • SW load balancer
              • HAProxy with Galera-specific health check scripts
              • IP dispatching in the kernel, for example Linux LVS
              • GLB (Galera Load Balancer)
          • Con: Need to set up LB redundancy
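The "Galera-specific health check" mentioned above usually boils down to a decision on a few wsrep status variables. The sketch below shows only that decision logic, assuming the values were already fetched with SHOW GLOBAL STATUS; `node_is_healthy` and the `allow_donor` switch are illustrative, and this is not the script shipped with any particular load balancer.

```python
SYNCED = "4"  # wsrep_local_state value for a fully synced node
DONOR = "2"   # wsrep_local_state value for a donor/desynced node


def node_is_healthy(status: dict, allow_donor: bool = False) -> bool:
    """Decide whether a load balancer should send traffic to this node."""
    if status.get("wsrep_cluster_status") != "Primary":
        return False  # node is not part of the primary component
    if status.get("wsrep_ready") != "ON":
        return False  # node does not accept queries
    accepted = {SYNCED, DONOR} if allow_donor else {SYNCED}
    return str(status.get("wsrep_local_state")) in accepted


synced = {"wsrep_cluster_status": "Primary", "wsrep_ready": "ON", "wsrep_local_state": "4"}
donor = {"wsrep_cluster_status": "Primary", "wsrep_ready": "ON", "wsrep_local_state": "2"}
partitioned = {"wsrep_cluster_status": "non-Primary", "wsrep_ready": "OFF", "wsrep_local_state": "4"}
```

A health check endpoint would typically answer HTTP 200 when this function returns True and 503 otherwise, so that the load balancer removes the node from rotation.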
  • 36. Best Practices cont.
      • Reference Node
          • Acts as a 'donor' node
          • Backup node
          • No client connections
      • [Diagram: three clients connect through a load balancer (LB) to read/write (R/W) MySQL [WSREP] nodes; a separate MySQL [WSREP] node serves as the Donor & Backup Node]
  • 37. Best Practices cont.
      • Minimize probability of deadlocks
          • Writes go to only 1 node
          • Applications use a connection pool or load balancer on read-only nodes
          • Have 1 "reference" node for write failover and as donor
      • [Diagram: three clients connect through a load balancer (LB) to MySQL [WSREP] nodes; two nodes serve reads (R) and one "Master" node serves writes (W)]
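The single-writer pattern can be sketched as a tiny router that sends every write to one node and spreads reads over the others. The class and host names below are hypothetical and the example ignores failover; it only illustrates the routing rule.

```python
from itertools import cycle


class SingleWriterRouter:
    """Send writes to one node and rotate reads across the read-only nodes."""

    def __init__(self, writer: str, readers: list):
        if not readers:
            raise ValueError("at least one read node is required")
        self.writer = writer
        self._readers = cycle(readers)  # simple round-robin over read nodes

    def route(self, is_write: bool) -> str:
        return self.writer if is_write else next(self._readers)


# Hypothetical host names.
router = SingleWriterRouter("galera-w", ["galera-r1", "galera-r2"])
targets = [router.route(False), router.route(True), router.route(False), router.route(False)]
```

Keeping all writes on one node avoids certification conflicts between nodes, at the cost of not scaling writes across the cluster.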
  • 38. Galera Limitations
      • MyISAM replication is experimental
      • DDL statements are replicated at statement level
      • Writes to other table types, including system (mysql.*) tables, are not replicated
          • CREATE USER... is replicated (as DDL), but INSERT INTO mysql.user... is not
      • Non-deterministic functions like NOW() are not supported
      • Query log cannot be directed to a table
      • LOCK/UNLOCK TABLES cannot be supported in multi-master setups
          • The same applies to lock functions (GET_LOCK(), RELEASE_LOCK()...)
      • Maximum allowed transaction size is defined by wsrep_max_ws_rows and wsrep_max_ws_size
      • XA transactions cannot be supported due to possible rollback on commit
  • 39. Monitoring and Management
  • 40. ClusterControl
      • Host Monitoring (CPU, RAM, Disk, Network)
      • DB Metrics Monitoring
      • DB Resources Monitoring
      • Cluster-wide Query Analyzer
      • Schema Management
      • Replication Fail-over
      • Clusterware: Process Management and Automated Recovery
      • Manual start/stop of Nodes
      • Configuration Management
      • Performance Management
      • Database Upgrades/Downgrades
      • Online Scaling of MySQL Servers
      • Configurable Resource Thresholds
      • Alarms and Email Notifications
      • Backup Scheduling
      • Real-time Performance Probes
  • 42. Galera Configurator (screenshot)
  • 43. Galera Configurator cont. (screenshot)
  • 44. Galera Configurator cont. (screenshot)
  • 45. Deploy Galera Cluster with HAProxy
      cd ~/s9s-galera-2.10/mysql/scripts/install
      ./deploy.sh 2>&1 | tee -a cc.log
      wget http://severalnines.com/downloads/s9s-haproxy.tar.gz
      tar zxvf s9s-haproxy.tar.gz
      cd haproxy
      ./install-haproxy.sh <lb host> <rhel|debian> galera
      done...
  • 48. Resources
      • Severalnines MySQL Galera Configurator
          • http://www.severalnines.com/resources/configurator
      • Supported platforms (MySQL Galera)
          • http://support.severalnines.com/entries/21589522-verified-and-supported-operating-systems
      • Galera limitations
          • http://support.severalnines.com/entries/21692388-limitations-in-galera-replication-for-mysql
      • ClusterControl server requirements
          • http://support.severalnines.com/entries/20614858-server-requirements-on-premise-amis-other-images