Pulsar Virtual Summit North America 2021
Apache Pulsar:
Why Unified Messaging
and Streaming Is the
Future
Matteo Merli, Sijie Guo
@ Pulsar PMC
Who are we?
● Sijie Guo (@sijieg)
● CEO, StreamNative
● PMC Member of Pulsar/BookKeeper
● Ex Co-Founder, Streamlio
● Ex Twitter
● Matteo Merli (@merlimat)
● CTO, StreamNative
● Co-creator and PMC chair of Pulsar
● Ex Co-Founder, Streamlio
● Ex Yahoo!
StreamNative
Founded by the creators of Apache Pulsar, StreamNative provides a
cloud-native, unified messaging and streaming platform powered by
Apache Pulsar to support multi-cloud and hybrid-cloud strategies
Announcing StreamNative Platform 1.0
✓ Pulsar Transactions
✓ Kafka-on-Pulsar
✓ Function Mesh for serverless streaming
✓ Enterprise-ready security
✓ Pulsar Operators
✓ Seamless StreamNative Cloud experience
Pulsar Trends
● Kafka -> Pulsar
● Scale
● Cloud-Native
● Pulsar + Flink
Pulsar at Scale
More companies in Production
Pulsar at Scale
Hit Trillion Messages Per Day
Cloud-Native
Kubernetes Drives Adoption of Pulsar
✓ 80% of Pulsar users deploy Pulsar in a cloud environment
✓ 62% of Pulsar users deploy Pulsar on Kubernetes
✓ 49% noted Pulsar’s Cloud-Native capabilities as one of the
top reasons they chose to adopt Pulsar
Cloud-Native
Built for Kubernetes
VM / Early Cloud Era
● Single Cloud Provider
● Monolithic Architectures
● Single Tenant Systems
● No Geo-replication
Containers / Modern Cloud Era
● Containers
● Cloud Native
● Hybrid & MultiCloud
● Microservices
Pulsar + Flink
Unified Stream and Batch
Kafka to Pulsar
More and More Kafka Users Adopt Pulsar
✓ 68% of respondents use Kafka in addition to Pulsar
✓ 34% of respondents use or plan to use Kafka-on-Pulsar
✓ Kafka and Pulsar serve different use cases
✓ Once adopted, Pulsar usage expands across organizations
Pulsar Adoption Use Cases
Adopted Pulsar to replace Kafka in their DSP (Data Streaming Platform).
● 1.5-2x lower in capex cost
● 5-50x improvement in latency
● 2-3x lower in opex due
● 10 PB / day
Adopted Pulsar to power their billing platform, Midas, which processes hundreds of billions of financial transactions daily. Adoption then expanded to Tencent’s Federated Learning Platform and Tencent Gaming.
Use cases require a scalable message queue for serving mission-critical business applications to replace RabbitMQ.
In the process of expanding use cases to build data streaming services.
Modern Data Needs
Messaging + Streaming
Messaging
● Queueing systems are ideal for work
queues that do not require tasks to
be performed in a particular order—
for example, sending one email
message to many recipients.
● RabbitMQ and Amazon SQS are
examples of popular queue-based
message systems.
Streaming
● Streaming works best in situations
where the order of messages is
important—for example, data
ingestion.
● Kafka and Amazon Kinesis are
examples of messaging systems that
use streaming semantics for
consuming messages.
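The contrast between the two models can be sketched in a few lines. This is a toy in-memory model, not code for any of the systems named above; the class names `WorkQueue` and `StreamLog` are made up for illustration.

```python
from collections import deque


class WorkQueue:
    """Queue semantics: each message is handed to one consumer and then removed."""

    def __init__(self):
        self._items = deque()

    def publish(self, msg):
        self._items.append(msg)

    def take(self):
        # Whichever worker calls take() first gets the next message.
        return self._items.popleft() if self._items else None


class StreamLog:
    """Stream semantics: messages stay in an ordered log; readers track their own offset."""

    def __init__(self):
        self._log = []

    def publish(self, msg):
        self._log.append(msg)

    def read(self, offset):
        # Reading does not remove anything, so every reader sees the same order.
        return self._log[offset:]


queue = WorkQueue()
log = StreamLog()
for m in ["m0", "m1", "m2"]:
    queue.publish(m)
    log.publish(m)

worker_a = queue.take()   # one worker gets m0
worker_b = queue.take()   # another worker gets m1
reader_1 = log.read(0)    # replays the whole log in order
reader_2 = log.read(1)    # an independent reader starting at offset 1
```

In the queue, work is divided and consumed once; in the log, order is preserved and any number of readers can replay it.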
Data in motion
Typical Architecture
E-Commerce w/o Pulsar
✓ Separate storage
✓ Tiering outside toolset
✓ Separate application and data domains
✓ Different tech stacks
Why not a system that is able to support messaging and streaming?
E-Commerce with Pulsar
✓ Unified storage for in-motion data
✓ Native tiered storage
✓ Single system to exchange data
✓ Teams share toolset
Build Apache Pulsar for unified messaging and streaming
Step 1: A scalable storage for streams of data
Step 2: Separate serving from storage
Apache Pulsar (serving): Producer -> Broker 0, Broker 1, Broker 2 -> Consumer
Apache BookKeeper (storage): Bookie 0, Bookie 1, Bookie 2, Bookie 3, Bookie 4
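One way to picture why storage can scale independently of the brokers is to spread the entries of a stream over the bookies. The sketch below is a simplified illustration under assumed parameters (five bookies, two copies per entry, round-robin placement); the function name `place_entry` is hypothetical and this is not BookKeeper's actual code.

```python
from collections import Counter


def place_entry(entry_id, ensemble, write_quorum):
    """Return the bookies that hold one entry, striping round-robin over the ensemble."""
    n = len(ensemble)
    return [ensemble[(entry_id + i) % n] for i in range(write_quorum)]


# Assumed setup for illustration: five bookies, each entry stored on two of them.
bookies = [f"bookie-{i}" for i in range(5)]
placement = {
    entry_id: place_entry(entry_id, bookies, write_quorum=2)
    for entry_id in range(5)
}

# Count how many entry copies land on each bookie.
copies_per_bookie = Counter(b for replicas in placement.values() for b in replicas)
```

Because no broker owns the data on local disk in this model, any broker can serve a topic by reading from the bookies, and adding a bookie adds capacity without moving a broker.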
Step 3: Unified API
Streaming / Messaging
Producer 1, Producer 2 -> Pulsar Topic/Partition (m0, m1, m2, m3, m4)
● Subscription A (Exclusive): Consumer A-0 receives m0 to m4; Consumer A-1 is rejected (X)
● Subscription B (Failover): Consumer B-0 receives m0 to m4; Consumer B-1 takes over in case of failure in Consumer B-0
● Subscription C (Shared): Consumer C-1, Consumer C-2, Consumer C-3
● Subscription D (Key-Shared): Consumer D-1, Consumer D-2, Consumer D-3
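The four subscription types in the diagram differ only in how messages on one topic are handed to consumers. The following toy dispatcher sketches that difference. It is not Pulsar's broker logic: the `dispatch` function is hypothetical, `consumers` is assumed to list only currently connected consumers in priority order, and the hash-modulo rule for Key-Shared is a simplification chosen for illustration.

```python
import zlib


def dispatch(messages, consumers, mode):
    """Assign (key, payload) messages to consumers; returns {consumer: [payload, ...]}."""
    out = {c: [] for c in consumers}
    for i, (key, payload) in enumerate(messages):
        if mode in ("exclusive", "failover"):
            # One active consumer receives every message, in order.
            target = consumers[0]
        elif mode == "shared":
            # Messages are spread round-robin; ordering across consumers is not kept.
            target = consumers[i % len(consumers)]
        elif mode == "key_shared":
            # Messages with the same key always go to the same consumer.
            target = consumers[zlib.crc32(key.encode()) % len(consumers)]
        else:
            raise ValueError(f"unknown mode: {mode}")
        out[target].append(payload)
    return out


messages = [("k1", "m0"), ("k2", "m1"), ("k1", "m2"), ("k3", "m3"), ("k2", "m4")]

exclusive = dispatch(messages, ["A-0"], "exclusive")
# After Consumer B-0 fails, only B-1 is connected and takes over the stream.
failover_after_crash = dispatch(messages, ["B-1"], "failover")
shared = dispatch(messages, ["C-1", "C-2", "C-3"], "shared")
key_shared = dispatch(messages, ["D-1", "D-2", "D-3"], "key_shared")
```

Exclusive and Failover keep stream ordering with a single active reader, while Shared and Key-Shared trade some ordering for parallel consumption.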
Reader and Batch API
Pub/Sub API (Publisher, Subscriber)
Stream Processor Applications
Microservices or Event-Driven Architecture
Step 4: Schema API
Step 5: Functions and IO API
● Functions API
● Pulsar IO/Connectors: Prebuilt Connectors, Custom Connectors
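As a rough idea of how small a function can be, the sketch below follows the language-native Python style described in the Pulsar Functions documentation, where a function is a plain `process` callable that receives one message and returns the output. It is run here as ordinary Python; deploying it to a cluster (input and output topics, packaging) is not shown, and the exclamation logic is only a placeholder.

```python
def process(input):
    """Append an exclamation mark to each incoming message."""
    return "{}!".format(input)


# Local check only; no Pulsar cluster is involved here.
result = process("hello")
```

The function itself carries no client, topic, or connection code, which is the point of the Functions API layer in the diagram.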
Step 6: Tiered Storage
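The idea behind tiered storage can be sketched as a topic whose older, sealed segments move to a cheaper tier while reads stay transparent. This is a toy in-memory model with made-up names (`TieredTopic`, `segment_size`, `hot_segments`), not Pulsar's offloader.

```python
class TieredTopic:
    """Toy topic: recent segments stay 'hot', older sealed segments are offloaded."""

    def __init__(self, segment_size, hot_segments):
        self.segment_size = segment_size
        self.hot_segments = hot_segments
        self.open_segment = []   # segment currently being written
        self.hot = []            # sealed segments on fast storage
        self.cold = []           # sealed segments offloaded to cheap storage

    def append(self, msg):
        self.open_segment.append(msg)
        if len(self.open_segment) == self.segment_size:
            # Seal the full segment and start a new one.
            self.hot.append(self.open_segment)
            self.open_segment = []
            # Offload the oldest sealed segments beyond the hot limit.
            while len(self.hot) > self.hot_segments:
                self.cold.append(self.hot.pop(0))

    def read_all(self):
        # Readers see one continuous stream regardless of where a segment lives.
        segments = self.cold + self.hot + [self.open_segment]
        return [msg for segment in segments for msg in segment]


topic = TieredTopic(segment_size=2, hot_segments=2)
for i in range(10):
    topic.append(f"m{i}")
```

With ten messages, three sealed segments end up in the cold tier and two stay hot, yet `read_all()` still returns m0 through m9 in order.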
Step 7: Protocol Handlers
Apache Pulsar
● Pulsar Protocol Handler: Pulsar Clients (queue + stream)
● Kafka Protocol Handler: Kafka Clients
● AMQP Protocol Handler: AMQP Clients
● MQTT Protocol Handler: MQTT Clients
Step 8: Transaction API
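What a transaction adds is atomic visibility: messages sent to several topics become visible together or not at all. The sketch below models only that property with buffered writes. The `Transaction` class is hypothetical; a real Pulsar transaction also covers acknowledgments and is managed by the cluster, which this toy does not attempt to reproduce.

```python
class Transaction:
    """Toy transaction: buffer sends, then publish all of them or none."""

    def __init__(self, topics):
        self._topics = topics
        self._pending = []
        self.state = "open"

    def send(self, topic, msg):
        if self.state != "open":
            raise RuntimeError("transaction is closed")
        # Nothing is visible to consumers until commit.
        self._pending.append((topic, msg))

    def commit(self):
        for topic, msg in self._pending:
            self._topics[topic].append(msg)
        self._pending = []
        self.state = "committed"

    def abort(self):
        # Drop every buffered message.
        self._pending = []
        self.state = "aborted"


topics = {"orders": [], "payments": []}

txn = Transaction(topics)
txn.send("orders", "o-1")
txn.send("payments", "p-1")
visible_before_commit = {name: list(msgs) for name, msgs in topics.items()}
txn.commit()

aborted = Transaction(topics)
aborted.send("orders", "o-2")
aborted.abort()
```

After the commit both topics contain their message; the aborted transaction leaves no trace in either topic.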
Pulsar 2.8 towards a
complete vision of unified
messaging and streaming
The future of Pulsar
Towards a self-adjusting
data platform
✓ Tuning data platforms to run at scale is hard
✓ Lots of configurations
✓ Requires in-depth knowledge of internals
✓ Workloads are constantly changing
Topic auto-partitioning
✓ Partitions are an artifact of implementation
✓ It’s not a natural property of the data
✓ Abstract the partitioning away from users
✓ Partitions are automatically split / merged based
✓ Rethink what an API should look like
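The slides do not say how automatic splitting would work. As one possible reading, the sketch below splits a hash range at its midpoint when its load passes a threshold. Everything here is an assumption for illustration: the function name `split_hot_ranges`, the 16-bit hash space, the load figures, and the threshold.

```python
def split_hot_ranges(ranges, load, threshold):
    """Split every half-open hash range whose load exceeds the threshold."""
    result = []
    for start, end in ranges:
        if load.get((start, end), 0) > threshold and end - start > 1:
            mid = (start + end) // 2
            result.extend([(start, mid), (mid, end)])
        else:
            result.append((start, end))
    return result


# Assumed example: two ranges over a 16-bit hash space, load in messages per second.
ranges = [(0, 32768), (32768, 65536)]
load = {(0, 32768): 900, (32768, 65536): 100}
new_ranges = split_hot_ranges(ranges, load, threshold=500)
```

In this reading, users keep publishing by key and never name a partition; only the ranges behind the topic change as load shifts.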
Self-Adjusting Storage
✓ Ensure optimal utilization of hardware
✓ No configuration
✓ Automatically adjust strategies based on changing conditions:
✓ Disk access
✓ Cache management
✓ Queue sizes
Pulsar Functions
✓ The foundation is now mature — UX is still poor
✓ Simpler tooling to create & manage functions
✓ CI/CD integration — Versioning — A/B testing
✓ Observability & Debuggability
✓ Improve support for Go and Python functions
✓ DSL — Provide higher level constructs to process data
Stream Storage
✓ Evolve the current state of Tiered Storage
✓ Integrate with data lake technologies
Working with the data
community

Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...
 
Understanding Broker Load Balancing - Pulsar Summit SF 2022
Understanding Broker Load Balancing - Pulsar Summit SF 2022Understanding Broker Load Balancing - Pulsar Summit SF 2022
Understanding Broker Load Balancing - Pulsar Summit SF 2022
 
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
 
Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022
Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022
Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022
 
Event-Driven Applications Done Right - Pulsar Summit SF 2022
Event-Driven Applications Done Right - Pulsar Summit SF 2022Event-Driven Applications Done Right - Pulsar Summit SF 2022
Event-Driven Applications Done Right - Pulsar Summit SF 2022
 
Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022
Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022
Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022
 
Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022
Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022
Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022
 
Beam + Pulsar: Powerful Stream Processing at Scale - Pulsar Summit SF 2022
Beam + Pulsar: Powerful Stream Processing at Scale - Pulsar Summit SF 2022Beam + Pulsar: Powerful Stream Processing at Scale - Pulsar Summit SF 2022
Beam + Pulsar: Powerful Stream Processing at Scale - Pulsar Summit SF 2022
 
Welcome and Opening Remarks - Pulsar Summit SF 2022
Welcome and Opening Remarks - Pulsar Summit SF 2022Welcome and Opening Remarks - Pulsar Summit SF 2022
Welcome and Opening Remarks - Pulsar Summit SF 2022
 
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
 
MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...
MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...
MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...
 

Dernier

Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Alkin Tezuysal
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI AgeCprime
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integrationmarketing932765
 
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...BookNet Canada
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...panagenda
 
JET Technology Labs White Paper for Virtualized Security and Encryption Techn...
JET Technology Labs White Paper for Virtualized Security and Encryption Techn...JET Technology Labs White Paper for Virtualized Security and Encryption Techn...
JET Technology Labs White Paper for Virtualized Security and Encryption Techn...amber724300
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Kaya Weers
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Nikki Chapple
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityIES VE
 
Digital Tools & AI in Career Development
Digital Tools & AI in Career DevelopmentDigital Tools & AI in Career Development
Digital Tools & AI in Career DevelopmentMahmoud Rabie
 
Kuma Meshes Part I - The basics - A tutorial
Kuma Meshes Part I - The basics - A tutorialKuma Meshes Part I - The basics - A tutorial
Kuma Meshes Part I - The basics - A tutorialJoão Esperancinha
 
Microservices, Docker deploy and Microservices source code in C#
Microservices, Docker deploy and Microservices source code in C#Microservices, Docker deploy and Microservices source code in C#
Microservices, Docker deploy and Microservices source code in C#Karmanjay Verma
 
Infrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platformsInfrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platformsYoss Cohen
 
Landscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdfLandscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdfAarwolf Industries LLC
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024TopCSSGallery
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesThousandEyes
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesThousandEyes
 
Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...itnewsafrica
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Farhan Tariq
 

Dernier (20)

Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI Age
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
 
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
 
JET Technology Labs White Paper for Virtualized Security and Encryption Techn...
JET Technology Labs White Paper for Virtualized Security and Encryption Techn...JET Technology Labs White Paper for Virtualized Security and Encryption Techn...
JET Technology Labs White Paper for Virtualized Security and Encryption Techn...
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 
Digital Tools & AI in Career Development
Digital Tools & AI in Career DevelopmentDigital Tools & AI in Career Development
Digital Tools & AI in Career Development
 
Kuma Meshes Part I - The basics - A tutorial
Kuma Meshes Part I - The basics - A tutorialKuma Meshes Part I - The basics - A tutorial
Kuma Meshes Part I - The basics - A tutorial
 
Microservices, Docker deploy and Microservices source code in C#
Microservices, Docker deploy and Microservices source code in C#Microservices, Docker deploy and Microservices source code in C#
Microservices, Docker deploy and Microservices source code in C#
 
Infrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platformsInfrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platforms
 
Landscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdfLandscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdf
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 

Apache Pulsar: Why Unified Messaging and Streaming Is the Future - Pulsar Summit NA 2021 Keynote

  • 1. Pulsar Virtual Summit North America 2021 Apache Pulsar: Why Unified Messaging and Streaming Is the Future Matteo Merli, Sijie Guo @ Pulsar PMC
  • 2. Who are we? ● Sijie Guo (@sijieg) ● CEO, StreamNative ● PMC Member of Pulsar/BookKeeper ● Ex Co-Founder, Streamlio ● Ex Twitter ● Matteo Merli (@merlimat) ● CTO, StreamNative ● Co-creator and PMC chair of Pulsar ● Ex Co-Founder, Streamlio ● Ex Yahoo!
  • 3. StreamNative Founded by the creators of Apache Pulsar, StreamNative provides a cloud-native, unified messaging and streaming platform powered by Apache Pulsar to support multi-cloud and hybrid-cloud strategies
  • 4. Announcing StreamNative Platform 1.0 ✓ Pulsar Transactions ✓ Kafka-on-Pulsar ✓ Function Mesh for serverless streaming ✓ Enterprise-ready security ✓ Pulsar Operators ✓ Seamless StreamNative Cloud experience
  • 5. Pulsar Trends Kafka -> Pulsar Scale Cloud-Native Pulsar + Flink
  • 6. Pulsar at Scale More companies in Production
  • 7. Pulsar at Scale Hit Trillion Messages Per Day
  • 8. Cloud-Native Kubernetes Drives Adoption of Pulsar ✓ 80% of Pulsar users deploy Pulsar in a cloud environment ✓ 62% of Pulsar users deploy Pulsar on Kubernetes ✓ 49% noted Pulsar’s Cloud-Native capabilities as one of the top reasons they chose to adopt Pulsar
  • 9. Cloud-Native Built for Kubernetes Containers Cloud Native Hybrid & MultiCloud ● Single Cloud Provider ● Monolithic Architectures ● Single Tenant Systems ● No Geo-replication VM / Early Cloud Era Containers / Modern Cloud Era Microservices
  • 10. Pulsar + Flink Unified Stream and Batch
  • 11. Kafka to Pulsar More and More Kafka Users Adopt Pulsar ✓ 68% of respondents use Kafka in addition to Pulsar ✓ 34% of respondents use or plan to use Kafka-on-Pulsar ✓ Kafka and Pulsar serve different use cases ✓ Once adopted, Pulsar usage expands across organizations
  • 12. Pulsar Adoption Use Cases Adopted Pulsar to replace Kafka in their DSP (Data Streaming Platform). ● 1.5-2x lower in capex cost ● 5-50x improvement in latency ● 2-3x lower in opex due ● 10 PB / day Adopted Pulsar to power their billing platform, Midas, which processes hundreds of billions of financial transactions daily. Adoption then expanded to Tencent’s Federated Learning Platform and Tencent Gaming. Use cases require a scalable message queue for serving mission-critical business applications to replace RabbitMQ. In the process of expanding use cases to build data streaming services
  • 15. Messaging ● Queueing systems are ideal for work queues that do not require tasks to be performed in a particular order— for example, sending one email message to many recipients. ● RabbitMQ and Amazon SQS are examples of popular queue-based message systems. Streaming ● Streaming works best in situations where the order of messages is important—for example, data ingestion. ● Kafka and Amazon Kinesis are examples of messaging systems that use streaming semantics for consuming messages. Data in motion
  • 17. E-Commerce w/o Pulsar ✓ Separate storage ✓ Tiering outside toolset ✓ Separate application and data domains ✓ Different tech stacks
  • 18. Why not a system that is able to support messaging and streaming?
  • 19. E-Commerce with Pulsar ✓ Unified storage for in-motion data ✓ Native tiered storage ✓ Single system to exchange data ✓ Teams share toolset
  • 20. Build Apache Pulsar for unified messaging and streaming
  • 21. Step 1: A scalable storage for streams of data
  • 22. Step 2: Separate serving from storage [Diagram: producers and consumers connect to Apache Pulsar brokers (Broker 0 to Broker 2), which store data in Apache BookKeeper bookies (Bookie 0 to Bookie 4)]
  • 23. Step 3: Unified API, Streaming and Messaging [Diagram: Producer 1 and Producer 2 publish messages m0 to m4 to one Pulsar topic/partition, consumed through four subscriptions: Subscription A (Exclusive; Consumer A-0 is attached and Consumer A-1 is rejected), Subscription B (Failover; Consumer B-1 takes over in case of failure in Consumer B-0), Subscription C (Shared; Consumers C-1 to C-3), Subscription D (Key-Shared; Consumers D-1 to D-3)]
  • 24. Reader and Batch API Pub/Sub API Publisher Subscriber Step 3: Unified API Stream Processor Applications Microservices or Event-Driven Architecture
  • 25. Step 4: Schema API Reader and Batch API Pub/Sub API Publisher Subscriber Stream Processor Applications Microservices or Event-Driven Architecture Schema API Schema API
  • 26. Step 5: Functions and IO API Reader and Batch API Pub/Sub API Publisher Subscriber Stream Processor Applications Microservices or Event-Driven Architecture Schema API Schema API Functions API Pulsar IO/Connectors Prebuilt Connectors Custom Connectors
  • 27. Step 6: Tiered Storage Reader and Batch API Pub/Sub API Publisher Subscriber Stream Processor Applications Microservices or Event-Driven Architecture Schema API Schema API Functions API Pulsar IO/Connectors Prebuilt Connectors Custom Connectors Tiered Storage
  • 28. Step 7: Protocol Handlers Apache Pulsar Pulsar Protocol Handler Pulsar Clients (queue + stream) Kafka Protocol Handler AMQP Protocol Handler MQTT Protocol Handler Kafka Clients AMQP Clients MQTT Clients
  • 29. Reader and Batch API Pub/Sub API Publisher Subscriber Stream Processor Applications Microservices or Event-Driven Architecture Schema API Schema API Functions API Pulsar IO/Connectors Prebuilt Connectors Custom Connectors Tiered Storage Step 8: Transaction API Transaction API
  • 30. Pulsar 2.8 towards a complete vision of unified messaging and streaming
  • 31. The future of Pulsar
  • 32. Towards a self-adjusting data platform ✓ Tuning data platforms to run at scale is hard ✓ Lots of configurations ✓ Requires in-depth knowledge of internals ✓ Workloads are constantly changing
  • 33. Topic auto-partitioning ✓ Partitions are an artifact of implementation ✓ It’s not a natural property of the data ✓ Abstract the partitioning away from users ✓ Partitions are automatically split / merged based ✓ Rethink what an API should look like
  • 34. Self-Adjusting Storage ✓ Ensure optimal utilization of hardware ✓ No configuration ✓ Automatically adjust strategies based on changing conditions: ✓ Disk access ✓ Cache management ✓ Queue sizes
  • 35. Pulsar Functions ✓ The foundation is now mature — UX is still poor ✓ Simpler tooling to create & manage functions ✓ CI/CD integration — Versioning — A/B testing ✓ Observability & Debuggability ✓ Improve support for Go and Python functions ✓ DSL — Provide higher level constructs to process data
  • 36. Stream Storage ✓ Evolve the current state of Tiered Storage ✓ Integrate with data lake technologies
  • 37. Working with the data community

Editor's notes

  1. Before diving into “Unified Messaging and Streaming”, let’s take a look at the trends in the Pulsar community.
  2. To understand what is happening behind the scenes, we need to rewind to the early days of Pulsar. Back in 2012, when we first set out to build Pulsar, we thought there should be a global geo-replicated infrastructure for all the messaging data. We didn’t start with the idea of making our own software, but started by observing the gaps in the existing technologies available at the time and realized how they were insufficient to serve the needs of a data-driven organization.
  3. Talking about these two different worlds. Messaging (read slide): messages are like commands that represent changes that need to be made to the system. For example, we send a message that says “Process this order” or “change user to be deleted”, but we don’t actually perform that change, we just notify. Messaging systems are selected when synchronous communication breaks down. In contrast, streaming systems deal with events: the state changes themselves. So instead of sending a message saying this user wants to update their email, we actually perform the update. Events are interlinked and may be persisted, replayed, or aggregated.
  5. Instructor notes: What we have here is an example of what we might see in a modern organization that has run into both of these issues. We have basically two different regimes, or two different worlds, with different teams. Historically, these worlds often seemed very different, with entirely different tech stacks and entirely different teams. However, as data becomes more critical in informing applications, applications need to make more use of what data teams and data services are producing. Likewise, getting the data out of applications and into the data realm has forced organizations to get better at doing both of these things really well. This can be a real challenge. On the left we have the application side: applications that interact via messages, deal with the aspects of running your systems, and provide capabilities focused on business concerns. On the right side we have services that deal with the data, in bulk and at large volume. Sometimes the right side includes real-time or batch processes such as sending large amounts of data, putting it into data lakes, computing answers from it, sending data to other services, or providing that data to other orgs that need it. These two worlds generally use different technologies, different tools, and different processes, all leading to more complexity and cost.
  6. Read slide. Separate storage/transport systems for messaging, streaming, and big data, with a focus on separate ETL processes. Messaging helps decouple apps and provides reliable async communication and work queues in core applications. Streaming allows for “medium-term” storage of streams (~30 days), aggregating streams of data, and real-time processing for near-real-time analytics. Batch processing and long-term object storage (S3, HDFS, etc.) allow for processing historical data to learn from the past. “Tiering” of data from messaging -> streaming -> object storage is outside of the core toolset and is maintained explicitly. Application and data domains are separated, and data is replicated into the data domain. Results from the data domain are loaded (ETL) back into the application domain. Multiple teams work with very different technology stacks. ==== To show how Pulsar can be transformative, here is a common example of an e-commerce system stack that contains both a streaming set of services and data processing. On the application side we have the order, inventory, and fulfillment services. Talk to each service (think Amazon). On the data side we have Spark for some batch processing and Flink for real-time inventory analysis. Another use case may be long-term storage needs versus short-term ones (30 days), followed by a data warehouse layer. Imagine a person ordering something, and then we check inventory and it isn’t there. Do you delete the order or put it on backorder? Once the inventory gets replenished, how do we notify the customers that their order is now coming? So we need to join both sides together.
  7. It is very natural to merge both. Talk about how the technologies have evolved to be able to support both. Read slide and add more context. “Unified” storage/transport of messages and streams with access to the underlying data: Messaging - Decoupled applications with pub/sub, shared subscriptions for work queues, exclusive subscriptions for fanout and point-to-point messaging, with flexible, large numbers of non-partitioned topics. Streaming - Ordered, scalable partitioned topics with failover and key-shared subscriptions. Pub/sub (broker-controlled) or Reader API (client-controlled) for advanced stream processing, replay, etc. Big-data batch access - Underlying segments of topics can be read directly, allowing for scale-out parallelism. Tiered storage is core to Pulsar, with no need for external tools. Application and data domains use a single system to exchange data, with converged “messaging” and “streaming”. One or many teams, with a shared toolset. Talk to the diagram: on the left side, say how Pulsar can process real-time streams; on the right, it can do batch processing, offload to tiered storage, read back in parallel batch fashion, and even provide a stream back to other systems for consumption. The order, inventory, and fulfillment services still work from the messaging domain (the use cases are not too different). But now they can support processing at much higher scale: any messages they have are kept in Pulsar as a single source of truth, and these messages can be offloaded via Pulsar to long-term storage. Pulsar also provides the power to enable a unified batch and streaming job that can do batch processing by reading from underlying storage and combine that with real-time streams, all with a single technology.
  8. Let's take a retrospective look at how Pulsar has evolved through the years. When we started designing Pulsar as a new platform, we always had this idea of supporting both the Pub-Sub semantics as well as the data streaming pipelines, which at the time were a new and emerging thing. But it would be a lie to say we had everything pre-planned since the beginning. Instead, we spent a lot of time observing how people used these platforms and we tried to fill all the gaps we were seeing, evolving Pulsar with the changing needs of data applications.
  9. At the very core of Pulsar there has always been the concept of the "log": a distributed, replicated and immutable ledger where all the events are appended. BookKeeper has proved, throughout the years, to be the best storage solution for streams of data. It scales to a very large number of logs, it offers consistency, durability, low latency and high throughput and, more importantly, very convenient operational tooling. To summarize: using the log as a building block does a lot of the heavy lifting required to build a truly scalable system.
  10. Another architectural choice that came naturally from using BookKeeper has been the separation of the storage from the data serving layer. This comes from BookKeeper because BookKeeper requires a single writer for each log. In our case the Broker acts as that single writer. This multi-layer architecture was exactly what we needed because it allows Pulsar to have: 1. Stateless brokers - Topics can be easily moved across brokers without copying any data, for example when expanding the cluster or adjusting the topic assignments after conditions change. 2. Data locality - Because of this broker layer, the data for a single topic or partition does not have to be stored in one single storage node. Instead we can fully utilize the resources of the entire cluster.
  11. We just said that the log is the building block of Pulsar... but the log on its own is a very low-level construct. Applications very often need much more sophisticated ways of interacting with the data than just reading through the log of events. Instead, we wanted to capture the right level of semantics needed to support a wide range of pub-sub and streaming use cases. The core idea was to leave the flexibility to consume data from topics in multiple different ways, depending on what the application needs. We ended up having 4 subscription types with different semantics and different properties, each one with its own merits.
  12. After the Pub/Sub API, the next addition was the Reader API. You can think of it as the "unmanaged" way to consume data from a topic. While there are many reasons for using a reader, the main users are typically Stream Processing frameworks because they tend to have their own checkpointing mechanisms or, similarly, batch systems that want to do a scan of the historical data.
  13. A common theme across the APIs exposed by Pulsar is the support for schemas. Having direct support for schemas inside Pulsar means that brokers can validate the schema of the data being published and that the expectations of consumers are met as well. But it also means that it becomes very easy to "discover" the schema of the data. The discoverability of the schema means that you can write fully type-safe generic consumers that don't need to be aware of one specific schema.
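Both halves of that idea, validation on publish and schema discovery on consume, can be shown in a small model. Everything here is hypothetical: the `Topic` class, the dict-of-types schema format and the strict-equality check are simplifications chosen for the sketch, whereas a real schema registry applies configurable compatibility rules.

```python
class Topic:
    """Toy topic that stores its schema next to its messages."""

    def __init__(self, schema):
        self.schema = schema  # field name -> Python type
        self.messages = []

    def publish(self, producer_schema, record):
        # Reject producers whose declared schema differs from the topic's.
        if producer_schema != self.schema:
            raise TypeError("producer schema does not match the topic schema")
        # Reject records whose fields do not have the declared types.
        for field, field_type in self.schema.items():
            if not isinstance(record.get(field), field_type):
                raise TypeError(f"field {field!r} must be {field_type.__name__}")
        self.messages.append(record)


def generic_consume(topic):
    """Render rows using only the schema discovered from the topic."""
    fields = list(topic.schema)
    return [tuple(message[field] for field in fields) for message in topic.messages]


payment_schema = {"user": str, "amount": int}
payments = Topic(payment_schema)
payments.publish(payment_schema, {"user": "alice", "amount": 3})
rows = generic_consume(payments)
```

`generic_consume` never mentions `user` or `amount`: it learns the fields from the topic, which is what makes a generic consumer possible.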
  14. Next we looked at what people were trying to do with messaging platforms, and the realization was that there was always some portion of computation involved. Applications very often need to do simple data transformations, enrichment and similar things. Functions were designed to provide the simplicity of the "Serverless" model with a very tight integration in the Pulsar platform. One example of how powerful Pulsar Functions are is that we have created a connector framework, Pulsar IO, entirely based on Pulsar Functions. With Pulsar IO, you can choose from a large set of pre-built connectors, both sources and sinks, or build your own custom connectors.
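To give a feel for how small such a function can be: the Python runtime for Pulsar Functions accepts a plain module-level `process` function that receives one message and returns the output. The enrichment logic, the record format and the two in-memory lists standing in for the input and output topics are assumptions made up for this sketch.

```python
def process(input):
    """Enrich a comma-separated 'user,amount' record with a size label."""
    user, amount = input.split(",")
    label = "large" if float(amount) >= 100 else "small"
    return f"{user},{amount},{label}"


# Hypothetical local stand-in for the runtime: apply the function to each
# message of an input topic and collect the results as the output topic.
input_topic = ["alice,250", "bob,12.5"]
output_topic = [process(message) for message in input_topic]
```

The function contains no messaging code at all; subscribing to the input topic and publishing the result are left to the platform.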
  15. After that, the next trend we saw is that more and more users wanted to use the "stream" concept not just as a temporary buffer, a way to decouple data ingestion from processing. Instead, they increasingly want to keep the stream as a permanent, or at least long-term, "storage of record". Tiered storage was the missing link to enable this. By offloading cold data to cloud storage providers, we can have large-scale data retention at a very effective cost, all while maintaining the stream view of the data and the same APIs.
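The key property, offloading old segments without changing how the stream is read, can be sketched as follows. `TieredLog` is a hypothetical toy: the two dicts stand in for local storage and a cloud object store, and the "keep the N newest segments" policy is an assumption, not Pulsar's offload configuration.

```python
class TieredLog:
    """Toy log split into fixed-size segments across a hot and a cold tier."""

    def __init__(self, segment_size, hot_segments):
        self.segment_size = segment_size
        self.hot_segments = hot_segments  # how many segments stay local
        self.hot = {}   # segment id -> entries (stands in for local disks)
        self.cold = {}  # segment id -> entries (stands in for cloud storage)
        self.count = 0

    def append(self, entry):
        segment = self.count // self.segment_size
        self.hot.setdefault(segment, []).append(entry)
        self.count += 1
        # Offload the oldest segments once too many are held locally.
        while len(self.hot) > self.hot_segments:
            oldest = min(self.hot)
            self.cold[oldest] = self.hot.pop(oldest)

    def read(self, position):
        """Same call for hot and cold data: the reader never sees the tier."""
        segment, offset = divmod(position, self.segment_size)
        tier = self.hot if segment in self.hot else self.cold
        return tier[segment][offset]


log = TieredLog(segment_size=2, hot_segments=1)
entries = ["e0", "e1", "e2", "e3", "e4"]
for entry in entries:
    log.append(entry)

replayed = [log.read(position) for position in range(len(entries))]
```

Here the two oldest segments end up in the cold tier, yet replaying from position 0 returns every entry in order through the same `read` call.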
  16. Another realization was that, because of its nature, messaging is always the integration point for different applications and components. This makes migration from other platforms a bit harder: you often have to coordinate that migration across different teams or organizations. To make it easier, we extended the Pulsar brokers to be able to speak several protocols, in addition to the Pulsar native protocol. Protocol Handlers provide a pluggable way to add more ways to interact with the Pulsar service and the same topic data. We started with KoP, Kafka on Pulsar, then followed up with AMQP and MQTT. It is a very powerful mechanism for a few reasons: 1. Applications can use existing client libraries with no code or dependency changes. 2. You can mix all sorts of different protocols to interact with the same topic. 3. It's exposed directly in Pulsar brokers, so data is stored only once and there is no "proxy overhead".
  17. To really complete the full picture, in Pulsar 2.8 we introduced support for transactions. It's now possible to do very complex interactions and take advantage of the transactional properties, for example publishing messages atomically across multiple topics, or consuming and producing atomically.
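The all-or-nothing behaviour of publishing across multiple topics can be illustrated with a toy model. The `Transaction` class below is hypothetical and simply buffers writes until commit. It is meant to show the visible semantics only and says nothing about how Pulsar's transaction machinery is implemented.

```python
class Transaction:
    """Toy transaction: writes become visible on commit, or never."""

    def __init__(self, topics):
        self.topics = topics  # topic name -> list of visible messages
        self.pending = []     # (topic, message) pairs not yet visible
        self.open = True

    def publish(self, topic, message):
        if not self.open:
            raise RuntimeError("transaction already finished")
        self.pending.append((topic, message))

    def commit(self):
        if not self.open:
            raise RuntimeError("transaction already finished")
        # Apply every buffered write; consumers see all of them together.
        for topic, message in self.pending:
            self.topics[topic].append(message)
        self.pending = []
        self.open = False

    def abort(self):
        # Drop every buffered write; consumers see none of them.
        self.pending = []
        self.open = False


topics = {"orders": [], "payments": []}

aborted = Transaction(topics)
aborted.publish("orders", "order-1")
aborted.publish("payments", "payment-1")
aborted.abort()
visible_after_abort = {name: list(msgs) for name, msgs in topics.items()}

committed = Transaction(topics)
committed.publish("orders", "order-2")
committed.publish("payments", "payment-2")
committed.commit()
```

After the abort neither topic shows a message; after the commit both do. There is no state in which only one of the two topics has received its message.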
  18. We can say that Pulsar 2.8 is a big milestone in the journey toward completing this vision of a unified messaging and streaming platform. We are very excited and very proud of this release. It is the culmination of months and months of work by a “larger than ever” group of committers and contributors. And while transaction support is the biggest new feature, it is certainly not the only one. We have features like exclusive producer support, which I will be talking about tomorrow in a dedicated session; a new API for package management, to improve the way we manage the code artifacts of functions and connectors; and, finally, a simplified way to configure memory limits in Pulsar clients.
  19. After looking at the past, let's now take a look at some of the items that we want to focus on in the very near future.
  20. A problem that we're seeing overall in the data ecosystem is that these platforms can be very difficult to tune and operate when running at a large scale. This is not a problem specific to Pulsar, but it is something that we believe should be addressed. Typically, there are a lot of configuration options and each of them requires in-depth knowledge of the internals of the system. Worse, when integrating multiple systems, such as a compute framework, it might be very hard to predict how a change in the configuration will affect the overall stability and performance. Finally, the workloads are increasingly dynamic and constantly changing. It's not possible to have a static configuration that will have "optimal" performance in every condition.
  21. The first item I want to discuss is partitioning. People are used to seeing partitioning and sharding, but these are really artifacts of how systems are implemented. Partitions are usually not a natural property of the data. Because of that, we want to abstract the partition concept away from the user's sight. Application developers should not have to worry about partitions, and operators should not have to think about how many partitions are needed for a certain use case. Instead, the system should be able to figure it out on its own, internally splitting and merging partitions, while maintaining the fundamental ordering guarantees.
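One way to picture splitting and merging without breaking ordering is to route keys by hash range, so that every key belongs to exactly one range at any moment. The sketch below is purely illustrative: `RangeRouter`, the 65536-slot hash space and the CRC32 hash are assumptions for this example and do not describe a design the Pulsar project has committed to.

```python
import bisect
import zlib

HASH_SPACE = 65536


class RangeRouter:
    """Toy router: keys map to contiguous hash ranges that can split or merge."""

    def __init__(self):
        # Sorted start of each range; range i ends where range i + 1 starts.
        self.starts = [0]

    def bounds(self, index):
        last = index + 1 >= len(self.starts)
        end = HASH_SPACE if last else self.starts[index + 1]
        return (self.starts[index], end)

    def range_for(self, key):
        slot = zlib.crc32(key.encode()) % HASH_SPACE
        index = bisect.bisect_right(self.starts, slot) - 1
        return self.bounds(index)

    def split(self, index):
        """Cut one range in half; all other ranges are untouched."""
        start, end = self.bounds(index)
        if end - start < 2:
            raise ValueError("range is too small to split")
        self.starts.insert(index + 1, (start + end) // 2)

    def merge(self, index):
        """Merge a range with its right-hand neighbour."""
        if index + 1 >= len(self.starts):
            raise ValueError("no right-hand neighbour to merge with")
        del self.starts[index + 1]


router = RangeRouter()
before = router.range_for("user-42")
router.split(0)
router.split(1)
after_split = [router.bounds(i) for i in range(len(router.starts))]
owner = router.range_for("user-42")
router.merge(1)
router.merge(0)
```

Because the ranges always tile the whole hash space without overlap, a key has a single owner before and after every split or merge, which is the precondition for keeping per-key ordering while the layout changes underneath the application.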
  22. Tuning a storage system can also be a very complex task. In particular, it can be very hard to predict the impact of configuration on the overall performance when we're crossing multiple layers: there is the operating system, the disk device and the disk controller. In a similar way, the idea we have is to make it work with no configuration, so that the storage system is able to automatically adjust its strategies based on the changing conditions of the traffic. This covers all aspects of the disk access pattern, the cache eviction strategy and so on.
  23. When we introduced Pulsar Functions, we had the idea of making it a frictionless platform for developers to do data processing. Over the past few years, the foundation of the Pulsar Functions runtime has really matured into a solid platform, although the user experience is still not great. While it is very easy for developers to write functions, we should strive to make it much easier to actually deploy and manage them. For example, functions tooling should be well integrated with CI/CD platforms, with support for versioning and out-of-the-box support for A/B testing. Another aspect is observability and debuggability. The tooling and the platform need to make it super easy for users to discover issues in their own code or to detect performance issues. Finally, we are thinking about a higher-level DSL that can support higher-level constructs to further simplify writing data processing functions.
  24. We talked before about Tiered Storage and how it has enabled completely new use cases to be supported by Pulsar. The next step here is to make sure we can integrate with existing data lake technologies, like Delta Lake and Apache Hudi. The vision is to use the Data Lake as the tiered storage backend, so that the same data can be consumed as a stream or with the data lake tooling.
  25. As a final note, given the very nature of Pulsar, which sits between different systems and platforms and links all of them together, we want to reaffirm our commitment to work with the larger data community to ensure that Pulsar is supported everywhere, out of the box, as a first-class citizen. We have been partnering with many open source communities like Trino, Druid, Pinot, Spark and Flink. We will continue to do so, and more, in the future. We believe that this will benefit Pulsar, its users and the overall data ecosystem.