SQLFire




Jags Ramnarayan – Chief Architect, SQLFire
Carter Shanklin – Product Manager, SQLFire
Sponsor Sessions Suck
Speed Matters




Users demand fast applications and fast websites.
The database is the hardest thing to scale.
SQLFire: Speed, Scale, SQL
Speed
• In-memory for maximum speed and minimum latency.
Scale
• Horizontally scalable.
• Add or remove nodes at any time for more capacity or availability.
SQL
• Familiar SQL interface.
• SQL 92 compliant.
• JDBC and ADO.NET interfaces.
How does SQLFire get scale and speed?
SQLFire at Strata 2012
Diverging needs for online and analytics
SQLFire: What does it really look like?
SQLFire Tables Are Replicated By Default.
    CREATE TABLE sales
      (product_id int, store_id int,
      price float);
[Diagram: the sales table is stored as a full replica on SQLFire Node 1 and on SQLFire Node 2.]
Best for small and frequently accessed data.
Partitioned Tables Are Split Among Members.
    CREATE TABLE sales
      (product_id int, store_id int,
      price float)
    PARTITION BY
      COLUMN (product_id);
[Diagram: the sales table is split so that SQLFire Node 1 holds Partition 1 and SQLFire Node 2 holds Partition 2.]
Best for large data sets.
Types Of Partitioning In SQLFire.
Hash Partitioning (Default)
  Purpose: Built-in hashing algorithm splits data at random across available servers.
  Example: PARTITION BY COLUMN (customer_id);
List
  Purpose: Manually divide data across servers based on discrete criteria.
  Example: PARTITION BY LIST (home_state)
           (VALUES ('CA', 'WA'),
            VALUES ('TX', 'OK'));
Range
  Purpose: Manually divide data across servers based on continuous criteria.
  Example: PARTITION BY RANGE (date)
           (VALUES BETWEEN '2008-01-01' AND '2008-12-31',
            VALUES BETWEEN '2009-01-01' AND '2009-12-30');
Expression
  Purpose: Fully dynamic division of data based on function execution. Can use UDFs.
  Example: PARTITION BY (MONTH(date));
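The four partitioning types can be pictured as four ways of turning a column value into a bucket. The Python sketch below is only an illustration of that idea, not SQLFire's implementation: the server names, the CRC32 hash, the inclusive range bounds, and the choice to return None for uncovered values are all assumptions made for the example.

```python
import zlib
from datetime import date

SERVERS = ["server-1", "server-2"]  # hypothetical member names


def hash_partition(key, servers=SERVERS):
    """Hash partitioning: a stable hash of the column value picks the server."""
    digest = zlib.crc32(str(key).encode("utf-8"))
    return servers[digest % len(servers)]


def list_partition(home_state):
    """List partitioning: each explicit value list maps to one bucket."""
    buckets = [("CA", "WA"), ("TX", "OK")]
    for index, values in enumerate(buckets):
        if home_state in values:
            return index
    return None  # assumption: values outside every list are left unassigned here


def range_partition(day):
    """Range partitioning: each date range maps to one bucket (bounds inclusive here)."""
    ranges = [
        (date(2008, 1, 1), date(2008, 12, 31)),
        (date(2009, 1, 1), date(2009, 12, 30)),
    ]
    for index, (low, high) in enumerate(ranges):
        if low <= day <= high:
            return index
    return None  # assumption: dates outside every range are left unassigned here


def expression_partition(day, bucket_count=4):
    """Expression partitioning: the bucket is derived from MONTH(date)."""
    return day.month % bucket_count
```

The same key always lands on the same server under hash partitioning, which is what lets a primary-key lookup go straight to one member.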
How does it scale for queries?
[Chart: partitioned table, PK queries per second (1kb rows) against number of servers N, with # clients = 2*N. Labeled throughput: N=2: 200k, N=4: 420k, N=6: 604k, N=8: 790k, N=10: 1M.]
How does it scale for updates?
[Chart: partitioned table, updates per second (3 columns) against number of servers N, with # clients = 2*N. Labeled throughput: N=2: 220k, N=4: 490k, N=6: 750k, N=8: 950k, N=10: 1.3M. 85% of updates complete with < 1ms latency.]
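One way to read the two scaling charts is to divide each throughput figure by the number of servers. The sketch below does that arithmetic in Python. The pairing of each figure with a server count is an assumption based on the left-to-right order of the chart labels, and the helper names are made up for the example.

```python
# Throughput figures read off the two charts (operations per second).
# Assumption: the labels map to N = 2, 4, 6, 8, 10 in left-to-right order.
QUERIES = {2: 200_000, 4: 420_000, 6: 604_000, 8: 790_000, 10: 1_000_000}
UPDATES = {2: 220_000, 4: 490_000, 6: 750_000, 8: 950_000, 10: 1_300_000}


def per_server(throughput):
    """Operations per second handled by each server at every cluster size."""
    return {n: total / n for n, total in throughput.items()}


def speedup(throughput):
    """Throughput relative to the smallest cluster in the series."""
    base = throughput[min(throughput)]
    return {n: total / base for n, total in throughput.items()}


if __name__ == "__main__":
    for name, series in (("queries", QUERIES), ("updates", UPDATES)):
        print(name, per_server(series), speedup(series))
```

Under that reading, per-server query throughput stays close to 100k per second from 2 to 10 servers, which is what linear scaling looks like.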
Redundancy Increases Availability.
    CREATE TABLE sales
      (product_id int, store_id int,
      price float)
    PARTITION BY
      COLUMN (product_id)
      REDUNDANCY 1;
[Diagram: SQLFire Node 1 holds Partition 1 and a redundant copy of Partition 2 (marked *); SQLFire Node 2 holds Partition 2 and a redundant copy of Partition 1 (marked *).]
All data is available if Node 1 fails.
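The effect of REDUNDANCY 1 can be sketched as a placement rule: every partition gets a primary node plus one copy on a different node. The Python below is a toy model under that assumption, with a simple round-robin placement chosen for the example. It is not how SQLFire actually assigns buckets, and the node and partition names are hypothetical.

```python
def place_partitions(nodes, partitions, redundancy):
    """Give each partition a primary node and `redundancy` copies on other nodes."""
    if redundancy >= len(nodes):
        raise ValueError("redundancy must be smaller than the number of nodes")
    placement = {}
    for index, partition in enumerate(partitions):
        # copies[0] is the primary; the rest are redundant copies.
        copies = [nodes[(index + offset) % len(nodes)]
                  for offset in range(redundancy + 1)]
        placement[partition] = copies
    return placement


def available_partitions(placement, failed_node):
    """Partitions that still have at least one copy after `failed_node` is lost."""
    return {partition for partition, copies in placement.items()
            if any(node != failed_node for node in copies)}


if __name__ == "__main__":
    layout = place_partitions(["node-1", "node-2"],
                              ["partition-1", "partition-2"], redundancy=1)
    print(layout)
    print(available_partitions(layout, "node-1"))
```

With two nodes and redundancy 1 the model reproduces the layout in the diagram, and both partitions survive the loss of either node.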
Partitioning and redundancy
• Replication is synchronous but done in parallel.
• Replication can be "rack aware".
• Redundancy = 2 (but tunable).
• Single owner for any row at any point in time.
SQLFire: Derp-Proof Database
"Was that cord supposed to be in the wall?"
Linearly scaling joins


Partition Aware DB Design
Collocate Data For Fast Joins.
    CREATE TABLE sales
      (product_id int, store_id int,
      price float)
    PARTITION BY
      COLUMN (product_id)
      COLOCATE WITH customers;
Related data placed on the same node.
[Diagram: SQLFire Node 1 holds Customer 1 (C1) and Customer 1 Sales; SQLFire Node 2 holds Customer 2 (C2) and Customer 2 Sales.]
SQLFire can join tables without network hops.
Collocate Data For Fast Joins.
[Diagram: same node layout as above, with the query run as a parallel scatter-gather across SQLFire Node 1 and SQLFire Node 2.]
In parallel, each node does the hash join and aggregation locally.
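The scatter-gather pattern on colocated data can be sketched as follows: each node joins its own customers to its own sales with a local hash join, aggregates, and the caller merges the partial results. This Python example is a simplified model with made-up data, and a thread pool stands in for the cluster's parallel execution. It is not SQLFire's query engine.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical colocated data: each node holds a customer and that customer's sales.
NODES = {
    "node-1": {"customers": [(1, "Customer 1")], "sales": [(1, 10.0), (1, 5.0)]},
    "node-2": {"customers": [(2, "Customer 2")], "sales": [(2, 7.5)]},
}


def local_join_and_aggregate(node_data):
    """Hash join customers to sales on one node, then sum price per customer."""
    names = {cid: name for cid, name in node_data["customers"]}  # build side
    totals = defaultdict(float)
    for cid, price in node_data["sales"]:  # probe side
        if cid in names:
            totals[names[cid]] += price
    return dict(totals)


def scatter_gather(nodes):
    """Run the local step on every node in parallel and merge the partial totals."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(local_join_and_aggregate, nodes.values()))
    merged = {}
    for partial in partials:
        for name, total in partial.items():
            merged[name] = merged.get(name, 0.0) + total
    return merged


if __name__ == "__main__":
    print(scatter_gather(NODES))
```

Because a customer and its sales live on the same node, no row has to cross the network before the join; only the small per-node totals are gathered.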
Dynamic Data Colocation
• Dynamic entity group formation.
• Based on foreign key relationships.
• Redundancy = 2.
• Single master for any entity group.
Data-Aware Stored Procs
Like Map/Reduce But Different
Scaling Stored Procedures
    CALL maxSales(arguments)
    ON TABLE sales
    WHERE (Location in ('CA','WA','OR'))
    WITH RESULT PROCESSOR
    maxSalesReducer
SQLFire uses data-aware routing to route processing to the data.
Result Processors give map/reduce functionality.
[Diagram: maxSales runs on local data on each node, and the per-node results flow into maxSalesReducer.]
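The procedure-plus-result-processor pairing can be read as a map step that runs on each node's local rows and a reduce step that combines the per-node answers. The Python sketch below models that flow with made-up data. The function names mirror the slide, but the bodies and the row layout are assumptions, and this toy version visits every node instead of routing only to the nodes that hold matching rows.

```python
# Hypothetical per-node contents of the sales table.
NODE_SALES = {
    "node-1": [{"location": "CA", "price": 120.0},
               {"location": "TX", "price": 900.0}],
    "node-2": [{"location": "WA", "price": 310.0},
               {"location": "OR", "price": 75.0}],
}


def max_sales(local_rows, locations):
    """Runs against one node's local rows only (the 'map' step)."""
    prices = [row["price"] for row in local_rows if row["location"] in locations]
    return max(prices) if prices else None


def max_sales_reducer(partial_results):
    """Combines the per-node maxima into one answer (the 'reduce' step)."""
    present = [value for value in partial_results if value is not None]
    return max(present) if present else None


def call_on_table(node_sales, locations):
    """Stand-in for CALL ... ON TABLE ... WITH RESULT PROCESSOR."""
    partials = [max_sales(rows, locations) for rows in node_sales.values()]
    return max_sales_reducer(partials)


if __name__ == "__main__":
    print(call_on_table(NODE_SALES, {"CA", "WA", "OR"}))
```

Only one number per node travels to the reducer, which is the point of moving the processing to the data.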
Scalability: Consistency




Assumes:
• Most transactions are small in space and time.
• Write-write conflicts are rare.
Scalability: High performance persistence
[Diagram: on each of two nodes, memory tables write Record1, Record2 and Record3 through OS buffers into append-only operation logs, with a log compressor running alongside.]
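The combination of append-only operation logs and a log compressor can be illustrated with a small model: every change is appended as a record, the in-memory table is rebuilt by replaying the log, and compression drops records that later ones have superseded. The Python below is a sketch under those assumptions. The record format and function names are invented for the example and say nothing about SQLFire's on-disk format.

```python
def append(log, operation, key, value=None):
    """Record an operation at the end of the log; earlier records are never rewritten."""
    log.append((operation, key, value))


def replay(log):
    """Rebuild the in-memory table by applying the log from start to end."""
    table = {}
    for operation, key, value in log:
        if operation == "put":
            table[key] = value
        elif operation == "delete":
            table.pop(key, None)
    return table


def compress(log):
    """Return a shorter log that rebuilds the same table as the original."""
    return [("put", key, value) for key, value in replay(log).items()]


if __name__ == "__main__":
    log = []
    append(log, "put", "Record1", "a")
    append(log, "put", "Record2", "b")
    append(log, "put", "Record1", "c")
    append(log, "delete", "Record2")
    print(replay(log), compress(log))
```

Appending keeps writes sequential, and compression keeps the log from growing without bound.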
Demos!
Demo: Distributed Procedures
Demo: Caching
:sigh:
Download: Try SQLFire Today! Free for developer use up to 3 nodes. (Just Google it.)
Forum: Got questions? Get answers.
Twitter: I need more followers to get a promotion.
Demo Details
Scaling Stored Procs (1)
[Diagram: a single Ubuntu (database) node. Step shown: Insert Timeseries.]
Scaling Stored Procs (2)
[Diagram: a single Ubuntu (database) node. Steps shown: Insert Timeseries, Compute Autocorrelations, Complete.]
Scaling Stored Procs (3)
[Diagram: three Ubuntu (database) nodes. Insert Timeseries runs on the first node, followed by a Rebalance onto the second and third nodes; each node then runs Compute Autocorrelations and reaches Complete.]
All using standard SQL APIs.
Caching Analytics (1)




[Diagram: Continuous Batch Processing.]
Caching Analytics (2)
[Diagram: an Ubuntu (database) node provides low-latency, in-memory caching; a JDBC row loader connects it to Continuous Batch Processing.]
Caching Analytics (3)
[Diagram: an Ubuntu (database) node provides low-latency, in-memory caching with scalable and tunable cache policies, in front of Continuous Batch Processing.]
Caching Policies
• LRU Count
  – Overflow to disk or destroy.
• Time To Live
  – Counter ticks as soon as the row is loaded.
• Idle Time
  – Destroy rows when they are not accessed for a while.
• Specified in CREATE TABLE syntax.
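The three policies differ mainly in which clock or counter triggers removal. The Python sketch below models all three on a small in-memory cache, purely as an illustration: the RowCache class and its parameters are hypothetical, time is passed in explicitly to keep the example deterministic, and only the "destroy" action is modeled (overflow to disk is left out).

```python
from collections import OrderedDict


class RowCache:
    """Toy row cache with LRU count, time-to-live and idle-time policies."""

    def __init__(self, max_rows, time_to_live=None, idle_time=None):
        self.max_rows = max_rows
        self.time_to_live = time_to_live
        self.idle_time = idle_time
        self._rows = OrderedDict()  # key -> (value, loaded_at, last_access)

    def put(self, key, value, now):
        self._rows[key] = (value, now, now)
        self._rows.move_to_end(key)
        # LRU count: destroy the least recently used rows beyond the limit.
        while len(self._rows) > self.max_rows:
            self._rows.popitem(last=False)

    def get(self, key, now):
        entry = self._rows.get(key)
        if entry is None:
            return None
        value, loaded_at, last_access = entry
        # Time to live counts from the moment the row was loaded.
        too_old = (self.time_to_live is not None
                   and now - loaded_at >= self.time_to_live)
        # Idle time counts from the most recent access.
        too_idle = (self.idle_time is not None
                    and now - last_access >= self.idle_time)
        if too_old or too_idle:
            del self._rows[key]
            return None
        self._rows[key] = (value, loaded_at, now)
        self._rows.move_to_end(key)
        return value
```

Reading a row resets its idle timer but not its time-to-live, which matches the note above that the time-to-live counter starts ticking as soon as the row is loaded.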

Contenu connexe

Tendances

AD116 XPages Extension Library: Making Application Development Even Easier
AD116 XPages Extension Library: Making Application Development Even EasierAD116 XPages Extension Library: Making Application Development Even Easier
AD116 XPages Extension Library: Making Application Development Even Easierpdhannan
 
Bank Data Frank Peterson DB2 10-Early_Experiences_pdf
Bank Data   Frank Peterson DB2 10-Early_Experiences_pdfBank Data   Frank Peterson DB2 10-Early_Experiences_pdf
Bank Data Frank Peterson DB2 10-Early_Experiences_pdfSurekha Parekh
 
MySQL Cluster NoSQL Memcached API
MySQL Cluster NoSQL Memcached APIMySQL Cluster NoSQL Memcached API
MySQL Cluster NoSQL Memcached APIMat Keep
 
Cassandra Summit 2014: A Train of Thoughts About Growing and Scalability — Bu...
Cassandra Summit 2014: A Train of Thoughts About Growing and Scalability — Bu...Cassandra Summit 2014: A Train of Thoughts About Growing and Scalability — Bu...
Cassandra Summit 2014: A Train of Thoughts About Growing and Scalability — Bu...DataStax Academy
 
Throughput comparison: Dell PowerEdge R720 drive options
Throughput comparison: Dell PowerEdge R720 drive optionsThroughput comparison: Dell PowerEdge R720 drive options
Throughput comparison: Dell PowerEdge R720 drive optionsPrincipled Technologies
 
Microsoft SQL Server Distributing Data with R2 Bertucci
Microsoft SQL Server Distributing Data with R2 BertucciMicrosoft SQL Server Distributing Data with R2 Bertucci
Microsoft SQL Server Distributing Data with R2 BertucciMark Ginnebaugh
 
DB2 and storage management
DB2 and storage managementDB2 and storage management
DB2 and storage managementCraig Mullins
 
Gustavo Garnica: Evolución de la Plataforma Java y lo que Significa para Ti
Gustavo Garnica: Evolución de la Plataforma Java y lo que Significa para TiGustavo Garnica: Evolución de la Plataforma Java y lo que Significa para Ti
Gustavo Garnica: Evolución de la Plataforma Java y lo que Significa para TiSoftware Guru
 
SQL Server 2008 R2 Parallel Data Warehouse
SQL Server 2008 R2 Parallel Data WarehouseSQL Server 2008 R2 Parallel Data Warehouse
SQL Server 2008 R2 Parallel Data WarehouseMark Ginnebaugh
 
Improve DB2 z/OS Test Data Management
Improve DB2 z/OS Test Data ManagementImprove DB2 z/OS Test Data Management
Improve DB2 z/OS Test Data Managementsoftbasemarketing
 
Dell Acceleration Appliance for Databases 2.0 and Microsoft SQL Server 2014: ...
Dell Acceleration Appliance for Databases 2.0 and Microsoft SQL Server 2014: ...Dell Acceleration Appliance for Databases 2.0 and Microsoft SQL Server 2014: ...
Dell Acceleration Appliance for Databases 2.0 and Microsoft SQL Server 2014: ...Principled Technologies
 
Increased database performance and reduced costs with Dell PowerEdge FX2 & VM...
Increased database performance and reduced costs with Dell PowerEdge FX2 & VM...Increased database performance and reduced costs with Dell PowerEdge FX2 & VM...
Increased database performance and reduced costs with Dell PowerEdge FX2 & VM...Principled Technologies
 
Dell PowerEdge M520 server solution: Energy efficiency and database performance
Dell PowerEdge M520 server solution: Energy efficiency and database performanceDell PowerEdge M520 server solution: Energy efficiency and database performance
Dell PowerEdge M520 server solution: Energy efficiency and database performancePrincipled Technologies
 
Dell PowerEdge R820 and R910 servers: Performance and reliability
Dell PowerEdge R820 and R910 servers: Performance and reliabilityDell PowerEdge R820 and R910 servers: Performance and reliability
Dell PowerEdge R820 and R910 servers: Performance and reliabilityPrincipled Technologies
 
The IBM eX5 Memory Advantage How Additional Memory Capacity on eX5 Can Benefi...
The IBM eX5 Memory Advantage How Additional Memory Capacity on eX5 Can Benefi...The IBM eX5 Memory Advantage How Additional Memory Capacity on eX5 Can Benefi...
The IBM eX5 Memory Advantage How Additional Memory Capacity on eX5 Can Benefi...IBM India Smarter Computing
 
Ari Zilka Cluster Architecture Patterns
Ari Zilka Cluster Architecture PatternsAri Zilka Cluster Architecture Patterns
Ari Zilka Cluster Architecture Patternsdeimos
 
Solving the DB2 LUW Administration Dilemma
Solving the DB2 LUW Administration DilemmaSolving the DB2 LUW Administration Dilemma
Solving the DB2 LUW Administration DilemmaRandy Goering
 

Tendances (18)

Ta3
Ta3Ta3
Ta3
 
AD116 XPages Extension Library: Making Application Development Even Easier
AD116 XPages Extension Library: Making Application Development Even EasierAD116 XPages Extension Library: Making Application Development Even Easier
AD116 XPages Extension Library: Making Application Development Even Easier
 
Bank Data Frank Peterson DB2 10-Early_Experiences_pdf
Bank Data   Frank Peterson DB2 10-Early_Experiences_pdfBank Data   Frank Peterson DB2 10-Early_Experiences_pdf
Bank Data Frank Peterson DB2 10-Early_Experiences_pdf
 
MySQL Cluster NoSQL Memcached API
MySQL Cluster NoSQL Memcached APIMySQL Cluster NoSQL Memcached API
MySQL Cluster NoSQL Memcached API
 
Cassandra Summit 2014: A Train of Thoughts About Growing and Scalability — Bu...
Cassandra Summit 2014: A Train of Thoughts About Growing and Scalability — Bu...Cassandra Summit 2014: A Train of Thoughts About Growing and Scalability — Bu...
Cassandra Summit 2014: A Train of Thoughts About Growing and Scalability — Bu...
 
Throughput comparison: Dell PowerEdge R720 drive options
Throughput comparison: Dell PowerEdge R720 drive optionsThroughput comparison: Dell PowerEdge R720 drive options
Throughput comparison: Dell PowerEdge R720 drive options
 
Microsoft SQL Server Distributing Data with R2 Bertucci
Microsoft SQL Server Distributing Data with R2 BertucciMicrosoft SQL Server Distributing Data with R2 Bertucci
Microsoft SQL Server Distributing Data with R2 Bertucci
 
DB2 and storage management
DB2 and storage managementDB2 and storage management
DB2 and storage management
 
Gustavo Garnica: Evolución de la Plataforma Java y lo que Significa para Ti
Gustavo Garnica: Evolución de la Plataforma Java y lo que Significa para TiGustavo Garnica: Evolución de la Plataforma Java y lo que Significa para Ti
Gustavo Garnica: Evolución de la Plataforma Java y lo que Significa para Ti
 
SQL Server 2008 R2 Parallel Data Warehouse
SQL Server 2008 R2 Parallel Data WarehouseSQL Server 2008 R2 Parallel Data Warehouse
SQL Server 2008 R2 Parallel Data Warehouse
 
Improve DB2 z/OS Test Data Management
Improve DB2 z/OS Test Data ManagementImprove DB2 z/OS Test Data Management
Improve DB2 z/OS Test Data Management
 
Dell Acceleration Appliance for Databases 2.0 and Microsoft SQL Server 2014: ...
Dell Acceleration Appliance for Databases 2.0 and Microsoft SQL Server 2014: ...Dell Acceleration Appliance for Databases 2.0 and Microsoft SQL Server 2014: ...
Dell Acceleration Appliance for Databases 2.0 and Microsoft SQL Server 2014: ...
 
Increased database performance and reduced costs with Dell PowerEdge FX2 & VM...
Increased database performance and reduced costs with Dell PowerEdge FX2 & VM...Increased database performance and reduced costs with Dell PowerEdge FX2 & VM...
Increased database performance and reduced costs with Dell PowerEdge FX2 & VM...
 
Dell PowerEdge M520 server solution: Energy efficiency and database performance
Dell PowerEdge M520 server solution: Energy efficiency and database performanceDell PowerEdge M520 server solution: Energy efficiency and database performance
Dell PowerEdge M520 server solution: Energy efficiency and database performance
 
Dell PowerEdge R820 and R910 servers: Performance and reliability
Dell PowerEdge R820 and R910 servers: Performance and reliabilityDell PowerEdge R820 and R910 servers: Performance and reliability
Dell PowerEdge R820 and R910 servers: Performance and reliability
 
The IBM eX5 Memory Advantage How Additional Memory Capacity on eX5 Can Benefi...
The IBM eX5 Memory Advantage How Additional Memory Capacity on eX5 Can Benefi...The IBM eX5 Memory Advantage How Additional Memory Capacity on eX5 Can Benefi...
The IBM eX5 Memory Advantage How Additional Memory Capacity on eX5 Can Benefi...
 
Ari Zilka Cluster Architecture Patterns
Ari Zilka Cluster Architecture PatternsAri Zilka Cluster Architecture Patterns
Ari Zilka Cluster Architecture Patterns
 
Solving the DB2 LUW Administration Dilemma
Solving the DB2 LUW Administration DilemmaSolving the DB2 LUW Administration Dilemma
Solving the DB2 LUW Administration Dilemma
 

Similaire à SQLFire at Strata 2012

Solving performance problems in MySQL without denormalization
Solving performance problems in MySQL without denormalizationSolving performance problems in MySQL without denormalization
Solving performance problems in MySQL without denormalizationdmcfarlane
 
Akiban Technologies: Renormalize
Akiban Technologies: RenormalizeAkiban Technologies: Renormalize
Akiban Technologies: RenormalizeAriel Weil
 
Akiban Technologies: Renormalize
Akiban Technologies: RenormalizeAkiban Technologies: Renormalize
Akiban Technologies: RenormalizeAriel Weil
 
NewSQL Database Overview
NewSQL Database OverviewNewSQL Database Overview
NewSQL Database OverviewSteve Min
 
Modernización del manejo de datos con v fabric
Modernización del manejo de datos con v fabricModernización del manejo de datos con v fabric
Modernización del manejo de datos con v fabricSoftware Guru
 
SQL Server Developer 70-433
SQL Server Developer 70-433SQL Server Developer 70-433
SQL Server Developer 70-433jasonyousef
 
VoltDB : A Technical Overview
VoltDB : A Technical OverviewVoltDB : A Technical Overview
VoltDB : A Technical OverviewTim Callaghan
 
My sql cluster_taipei_event
My sql cluster_taipei_eventMy sql cluster_taipei_event
My sql cluster_taipei_eventIvan Tu
 
Drilling Deep Into Exadata Performance
Drilling Deep Into Exadata PerformanceDrilling Deep Into Exadata Performance
Drilling Deep Into Exadata PerformanceEnkitec
 
Clustrix Database Overview
Clustrix Database OverviewClustrix Database Overview
Clustrix Database OverviewClustrix
 
NewSQL - Deliverance from BASE and back to SQL and ACID
NewSQL - Deliverance from BASE and back to SQL and ACIDNewSQL - Deliverance from BASE and back to SQL and ACID
NewSQL - Deliverance from BASE and back to SQL and ACIDTony Rogerson
 
MySQL Cluster Scaling to a Billion Queries
MySQL Cluster Scaling to a Billion QueriesMySQL Cluster Scaling to a Billion Queries
MySQL Cluster Scaling to a Billion QueriesBernd Ocklin
 
Galera Cluster for MySQL vs MySQL (NDB) Cluster: A High Level Comparison
Galera Cluster for MySQL vs MySQL (NDB) Cluster: A High Level Comparison Galera Cluster for MySQL vs MySQL (NDB) Cluster: A High Level Comparison
Galera Cluster for MySQL vs MySQL (NDB) Cluster: A High Level Comparison Severalnines
 
Building a High-Volume Reporting System on Amazon AWS with MySQL, Tungsten, a...
Building a High-Volume Reporting System on Amazon AWS with MySQL, Tungsten, a...Building a High-Volume Reporting System on Amazon AWS with MySQL, Tungsten, a...
Building a High-Volume Reporting System on Amazon AWS with MySQL, Tungsten, a...Jeff Malek
 
Clustrix Database Percona Ruby on Rails benchmark
Clustrix Database Percona Ruby on Rails benchmarkClustrix Database Percona Ruby on Rails benchmark
Clustrix Database Percona Ruby on Rails benchmarkClustrix
 
Building and deploying large scale real time news system with my sql and dist...
Building and deploying large scale real time news system with my sql and dist...Building and deploying large scale real time news system with my sql and dist...
Building and deploying large scale real time news system with my sql and dist...Tao Cheng
 
Scaling massive elastic search clusters - Rafał Kuć - Sematext
Scaling massive elastic search clusters - Rafał Kuć - SematextScaling massive elastic search clusters - Rafał Kuć - Sematext
Scaling massive elastic search clusters - Rafał Kuć - SematextRafał Kuć
 

Similaire à SQLFire at Strata 2012 (20)

Solving performance problems in MySQL without denormalization
Solving performance problems in MySQL without denormalizationSolving performance problems in MySQL without denormalization
Solving performance problems in MySQL without denormalization
 
Akiban Technologies: Renormalize
Akiban Technologies: RenormalizeAkiban Technologies: Renormalize
Akiban Technologies: Renormalize
 
Akiban Technologies: Renormalize
Akiban Technologies: RenormalizeAkiban Technologies: Renormalize
Akiban Technologies: Renormalize
 
NewSQL Database Overview
NewSQL Database OverviewNewSQL Database Overview
NewSQL Database Overview
 
Modernización del manejo de datos con v fabric
Modernización del manejo de datos con v fabricModernización del manejo de datos con v fabric
Modernización del manejo de datos con v fabric
 
SQL Server Developer 70-433
SQL Server Developer 70-433SQL Server Developer 70-433
SQL Server Developer 70-433
 
VoltDB : A Technical Overview
VoltDB : A Technical OverviewVoltDB : A Technical Overview
VoltDB : A Technical Overview
 
My sql cluster_taipei_event
My sql cluster_taipei_eventMy sql cluster_taipei_event
My sql cluster_taipei_event
 
Drilling Deep Into Exadata Performance
Drilling Deep Into Exadata PerformanceDrilling Deep Into Exadata Performance
Drilling Deep Into Exadata Performance
 
Clustrix Database Overview
Clustrix Database OverviewClustrix Database Overview
Clustrix Database Overview
 
NewSQL - Deliverance from BASE and back to SQL and ACID
NewSQL - Deliverance from BASE and back to SQL and ACIDNewSQL - Deliverance from BASE and back to SQL and ACID
NewSQL - Deliverance from BASE and back to SQL and ACID
 
MySQL Cluster Scaling to a Billion Queries
MySQL Cluster Scaling to a Billion QueriesMySQL Cluster Scaling to a Billion Queries
MySQL Cluster Scaling to a Billion Queries
 
Galera Cluster for MySQL vs MySQL (NDB) Cluster: A High Level Comparison
Galera Cluster for MySQL vs MySQL (NDB) Cluster: A High Level Comparison Galera Cluster for MySQL vs MySQL (NDB) Cluster: A High Level Comparison
Galera Cluster for MySQL vs MySQL (NDB) Cluster: A High Level Comparison
 
Building a High-Volume Reporting System on Amazon AWS with MySQL, Tungsten, a...
Building a High-Volume Reporting System on Amazon AWS with MySQL, Tungsten, a...Building a High-Volume Reporting System on Amazon AWS with MySQL, Tungsten, a...
Building a High-Volume Reporting System on Amazon AWS with MySQL, Tungsten, a...
 
Clustrix Database Percona Ruby on Rails benchmark
Clustrix Database Percona Ruby on Rails benchmarkClustrix Database Percona Ruby on Rails benchmark
Clustrix Database Percona Ruby on Rails benchmark
 
Building and deploying large scale real time news system with my sql and dist...
Building and deploying large scale real time news system with my sql and dist...Building and deploying large scale real time news system with my sql and dist...
Building and deploying large scale real time news system with my sql and dist...
 
Scaling massive elastic search clusters - Rafał Kuć - Sematext
Scaling massive elastic search clusters - Rafał Kuć - SematextScaling massive elastic search clusters - Rafał Kuć - Sematext
Scaling massive elastic search clusters - Rafał Kuć - Sematext
 
The Very Very Latest In Database Development - Lucas Jellema - Oracle OpenWor...
The Very Very Latest In Database Development - Lucas Jellema - Oracle OpenWor...The Very Very Latest In Database Development - Lucas Jellema - Oracle OpenWor...
The Very Very Latest In Database Development - Lucas Jellema - Oracle OpenWor...
 
Aditi
AditiAditi
Aditi
 
Aditi
AditiAditi
Aditi
 

Dernier

Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxMatsuo Lab
 
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostKubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostMatt Ray
 
OpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability AdventureOpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability AdventureEric D. Schabell
 
Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024SkyPlanner
 
PicPay - GenAI Finance Assistant - ChatGPT for Customer Service
PicPay - GenAI Finance Assistant - ChatGPT for Customer ServicePicPay - GenAI Finance Assistant - ChatGPT for Customer Service
PicPay - GenAI Finance Assistant - ChatGPT for Customer ServiceRenan Moreira de Oliveira
 
UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8DianaGray10
 
Computer 10: Lesson 10 - Online Crimes and Hazards
Computer 10: Lesson 10 - Online Crimes and HazardsComputer 10: Lesson 10 - Online Crimes and Hazards
SQLFire at Strata 2012

  • 1. SQLFire. Jags Ramnarayan – Chief Architect, SQLFire; Carter Shanklin – Product Manager, SQLFire.
  • 3. Speed Matters. Users demand fast applications and fast websites. The database is the hardest thing to scale.
  • 4. SQLFire: Speed, Scale, SQL. Speed: in-memory for maximum speed and minimum latency. Scale: horizontally scalable; add or remove nodes at any time for more capacity or availability. SQL: familiar SQL interface; SQL 92 compliant; JDBC and ADO.NET interfaces.
  • 5. How does SQLFire get scale and speed?
  • 7. Diverging needs for online and analytics
  • 13. SQLFire: What does it really look like?
  • 14. SQLFire Tables Are Replicated By Default. CREATE TABLE sales (product_id int, store_id int, price float); [Diagram: the sales table is held as a full replica on SQLFire Node 1 and on SQLFire Node 2.] Best for small and frequently accessed data.
  • 15. Partitioned Tables Are Split Among Members. CREATE TABLE sales (product_id int, store_id int, price float) PARTITION BY COLUMN (product_id); [Diagram: Partition 1 of the sales table on SQLFire Node 1, Partition 2 on SQLFire Node 2.] Best for large data sets.
  • 16. Types Of Partitioning In SQLFire.
      – Hash (default). Purpose: built-in hashing algorithm splits data at random across available servers. Example: PARTITION BY COLUMN (customer_id);
      – List. Purpose: manually divide data across servers based on discrete criteria. Example: PARTITION BY LIST (home_state) (VALUES ('CA', 'WA'), VALUES ('TX', 'OK'));
      – Range. Purpose: manually divide data across servers based on continuous criteria. Example: PARTITION BY RANGE (date) (VALUES BETWEEN '2008-01-01' AND '2008-12-31', VALUES BETWEEN '2009-01-01' AND '2009-12-30');
      – Expression. Purpose: fully dynamic division of data based on function execution; can use UDFs. Example: PARTITION BY (MONTH(date));
  • 17. How does it scale for queries? [Chart: partitioned table, PK queries per second (1 KB rows) against number of servers N = 2, 4, 6, 8, 10, with # clients = 2*N. Plotted values, in increasing order: 200k, 420k, 604k, 790k, 1M.]
  • 18. How does it scale for updates? [Chart: partitioned table, updates per second (3 columns) against number of servers N = 2, 4, 6, 8, 10, with # clients = 2*N. Plotted values, in increasing order: 220k, 490k, 750k, 950k, 1.3M. 85% < 1 ms latency.]
  • 19. Redundancy Increases Availability. CREATE TABLE sales (product_id int, store_id int, price float) PARTITION BY COLUMN (product_id) REDUNDANCY 1; [Diagram: SQLFire Node 1 holds Partition 1 and a redundant copy, Partition 2*; SQLFire Node 2 holds Partition 2 and a redundant copy, Partition 1*.] All data is available if Node 1 fails.
  • 20. Partitioning and redundancy. Replication is synchronous but done in parallel. Replication can be "rack aware". Single owner for any row at any point in time. Redundancy = 2 (but tunable).
  • 21. SQLFire: Derp-Proof Database. Was that cord supposed to be in the wall?
  • 23. Partition Aware DB Design
  • 24. Collocate Data For Fast Joins. CREATE TABLE sales (product_id int, store_id int, price float) PARTITION BY COLUMN (product_id) COLOCATE WITH customers; Related data placed on the same node. [Diagram: SQLFire Node 1 holds Customer 1 (C1) and Customer 1 Sales; SQLFire Node 2 holds Customer 2 (C2) and Customer 2 Sales.] SQLFire can join tables without network hops.
  • 25. Collocate Data For Fast Joins. Related data placed on the same node. [Diagram: SQLFire Node 1 holds Customer 1 (C1) and Customer 1 Sales; SQLFire Node 2 holds Customer 2 (C2) and Customer 2 Sales.] SQLFire can join tables without network hops.
  • 26. Collocate Data For Fast Joins. Related data placed on the same node. [Diagram: SQLFire Node 1 holds Customer 1 (C1) and Customer 1 Sales; SQLFire Node 2 holds Customer 2 (C2) and Customer 2 Sales.] Parallel scatter-gather: in parallel, each node does hash join and aggregation locally.
  • 27. Dynamic Data Colocation. Dynamic entity group formation. Based on foreign key relationships. Single master for any entity group. Redundancy = 2.
  • 28. Data-Aware Stored Procs. Like Map/Reduce But Different.
  • 29. Scaling Stored Procedures. CALL maxSales(arguments) ON TABLE sales WHERE (Location in ('CA', 'WA', 'OR')) WITH RESULT PROCESSOR maxSalesReducer. SQLFire uses data-aware routing to route processing to the data. Result Processors give map/reduce functionality. [Diagram: maxSales runs on local data on each node and the results feed maxSalesReducer.]
  • 30. Scalability: Consistency. Assumes: most transactions are small in space and time; write-write conflicts are rare.
  • 31. Scalability: High performance persistence. [Diagram: on each of two nodes, in-memory tables feed a log compressor and OS buffers, which write records (Record1, Record2, Record3) to append-only operation logs.]
  • 35. :sigh: Download: Just Google it. Try SQLFire Today! Free for developer use to 3 nodes. Forum: Got questions? Get answers. Twitter: I need more followers to get a promotion.
  • 37. Scaling Stored Procs (1). [Diagram: one Ubuntu (database) node.] Insert Timeseries.
  • 38. Scaling Stored Procs (2). [Diagram: one Ubuntu (database) node.] Insert Timeseries. Compute Autocorrelations. Complete.
  • 39. Scaling Stored Procs (3). [Diagram: three Ubuntu (database) nodes.] Insert Timeseries. Rebalance. Compute Autocorrelations on each node. Complete. All using standard SQL APIs.
  • 40. Caching Analytics (1). Continuous Batch Processing.
  • 41. Caching Analytics (2). Ubuntu (database). Low latency in-memory caching. JDBC row loader. Continuous Batch Processing.
  • 42. Caching Analytics (3). Ubuntu (database). Low latency in-memory caching. Scalable + Tunable Cache Policies. Continuous Batch Processing.
  • 43. Caching Policies. LRU Count: overflow to disk or destroy. Time To Live: counter ticks as soon as the row is loaded. Idle Time: destroy rows when they are not accessed for a while. Specified in CREATE TABLE syntax.
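
The three eviction policies on the last slide can be illustrated with a small in-process cache. This is a minimal Python sketch of the ideas only, not SQLFire's implementation or its CREATE TABLE syntax. The class name `PolicyCache` and its parameters are hypothetical, time is passed in explicitly to keep the example deterministic, and eviction always destroys the row (the overflow-to-disk option is not modeled).

```python
from collections import OrderedDict


class PolicyCache:
    """Toy row cache with LRU-count, time-to-live and idle-time eviction."""

    def __init__(self, max_rows=None, ttl=None, idle=None):
        self.max_rows = max_rows  # LRU count: maximum number of rows kept
        self.ttl = ttl            # seconds a row may live after it is loaded
        self.idle = idle          # seconds a row may go without being read
        self._rows = OrderedDict()  # key -> (value, loaded_at, last_access)

    def put(self, key, value, now):
        self._rows[key] = (value, now, now)
        self._rows.move_to_end(key)
        # LRU count: destroy the least recently used rows beyond the limit.
        while self.max_rows is not None and len(self._rows) > self.max_rows:
            self._rows.popitem(last=False)

    def get(self, key, now):
        entry = self._rows.get(key)
        if entry is None:
            return None
        value, loaded_at, last_access = entry
        expired = self.ttl is not None and now - loaded_at >= self.ttl
        stale = self.idle is not None and now - last_access >= self.idle
        if expired or stale:
            del self._rows[key]
            return None
        # A read refreshes the idle timer and the LRU position, not the TTL.
        self._rows[key] = (value, loaded_at, now)
        self._rows.move_to_end(key)
        return value
```

The difference between the two timers is the point of the sketch: the time-to-live clock starts when the row is loaded and is never reset, while the idle clock restarts on every read.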

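The partitioning and redundancy slides (15, 16 and 19) describe how rows are assigned to nodes. The following is a minimal Python sketch of that idea under simple assumptions: modulo placement over a CRC32 hash, half-open ranges, and redundant copies placed on the next nodes in order. The slides do not describe SQLFire's actual hashing algorithm or bucket layout, and all function names here are hypothetical.

```python
import zlib


def hash_partition(key, num_nodes):
    """Hash partitioning: map a partitioning-column value to a primary node."""
    # crc32 is used only because it is deterministic across runs.
    return zlib.crc32(str(key).encode("utf-8")) % num_nodes


def list_partition(value, lists):
    """List partitioning: each node owns an explicit set of discrete values."""
    for node, values in enumerate(lists):
        if value in values:
            return node
    raise KeyError(f"no partition lists {value!r}")


def range_partition(value, ranges):
    """Range partitioning: each node owns a [low, high) interval (assumed)."""
    for node, (low, high) in enumerate(ranges):
        if low <= value < high:
            return node
    raise KeyError(f"no partition range covers {value!r}")


def owners(primary, num_nodes, redundancy):
    """Primary node plus `redundancy` extra copies on distinct nodes."""
    copies = min(redundancy, num_nodes - 1)
    return [(primary + i) % num_nodes for i in range(copies + 1)]
```

With two nodes and a redundancy of 1, `owners` returns both nodes for every partition, which matches the slide 19 picture where all data stays available if one node fails.
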
Editor's notes

  1. Let's turn now to a hands-on look at some SQLFire features. On the left we're going to have the SQL code you can use in SQLFire, and on the right we'll talk about what the code actually does. For starters we'll create a very simple table, in just the same way you would create it in other databases. By default, tables in SQLFire are replicated across all nodes in the SQLFire cluster. That means, for one thing, that if a server crashes all the data in that table is still available. This approach is best for small datasets and data that is frequently accessed or used in joins.
  2. Partitioning data is more sophisticated and more interesting. SQLFire has a keyword, "PARTITION BY", which tells SQLFire that the data in that table should be split up across all available nodes. This approach is a must for large datasets.
  3. There are a lot of different ways to partition data in SQLFire. By default SQLFire will try to evenly distribute data at random across all servers. If that's not good enough, you can exert a lot of control over how data is divided and distributed using list, range or expression-based partitioning.
  4. Partitioning creates a challenge: by default data lives only on one node, and if you lose that node the data is offline. We can solve that with the REDUNDANCY keyword. Using this causes SQLFire to keep multiple copies of the data on different servers so that if you lose a node, all the data is still available. Redundancy is usually a good idea, and you can even keep data on 3 or 4 different servers at once. Most typically you're going to want a redundancy of 1.
  5. Co-location is a key feature that allows SQLFire to be a real SQL database and horizontally scalable at the same time. When I talk to people who know distributed databases they usually ask, "How do you do distributed joins?" The answer is, we don't. Instead we allow related data to be grouped together on the same physical node. This is done with the COLOCATE WITH keyword, which associates tables together based on a foreign key and keeps related rows on the same server. In this example we have customer 1 and customer 2 stored on different nodes. The COLOCATE WITH keyword lets me ensure that sales records from customer 1 end up on node 1 and records from customer 2 end up on node 2.
  6. Map-reduce is great when you have to sequentially apply an operation to every record, for instance text tokenization or indexing. SQLFire DAP (data-aware procedures), by contrast, is a generic distributed RPC mechanism that brings the power of SQL searches to each partition node, for instance for data mining or scoring, where tasks are continuously looking for data of interest using queries. By having each node return the result from its "in-process" memory and parallelizing the work on any number of processors, it becomes a highly efficient way to process data in parallel.
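
The pattern in the last note, running a procedure against each node's local data and combining the partial results with a result processor, can be sketched in plain Python. This illustrates the scatter-gather idea only: the names `max_sales`, `max_sales_reducer` and `call_on_table` are hypothetical and merely mirror the `maxSales` example from slide 29, not a SQLFire API. Threads stand in for cluster nodes, and pruning nodes by the WHERE clause is not modeled, so every partition is visited.

```python
from concurrent.futures import ThreadPoolExecutor


def max_sales(local_rows, locations):
    """Runs on one node: highest price among local rows in the given locations."""
    prices = [r["price"] for r in local_rows if r["location"] in locations]
    return max(prices) if prices else None


def max_sales_reducer(partials):
    """Result processor: combine the per-node maxima into one answer."""
    found = [p for p in partials if p is not None]
    return max(found) if found else None


def call_on_table(partitions, procedure, reducer, *args):
    """Scatter the procedure to every partition in parallel, then gather."""
    with ThreadPoolExecutor(max_workers=max(1, len(partitions))) as pool:
        partials = list(pool.map(lambda rows: procedure(rows, *args), partitions))
    return reducer(partials)
```

Each partition returns a single partial result rather than its rows, so only small values cross the (simulated) network before the reducer runs.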