SlideShare une entreprise Scribd logo
1  sur  32
Télécharger pour lire hors ligne
© 2020 SPLUNK INC.
The Evolution of Data
Infrastructure at
Splunk
Flink Forward SF/Virtual 2020
Eric Sammer - VP, Distinguished Engineer
esammer@splunk.com
© 2020 SPLUNK INC.
What is Splunk?
A platform for the collection, storage, query, and analysis of event and time series data.
Logs, but other kinds of events too
Tons of query-time processing features
Core platform experience, increasingly domain-specific applications
© 2020 SPLUNK INC.
Ad hoc query - Human and apps
Scheduled query - Materialized view maintenance, app processes, alerting
Realtime query - Ad hoc (“Live tail”), app processes
Classes of Workload
© 2020 SPLUNK INC.
> +
© 2020 SPLUNK INC.
two and a half years ago...
© 2020 SPLUNK INC.
Connector
Agent
Indexer
Pipeline
Index
All data lands in an index
Index
© 2020 SPLUNK INC.
Index
Index
Query Application
Applications and platform services use
scheduled queries to build views or
process data
Query
View ApplicationQuery
© 2020 SPLUNK INC.
Challenges
Latency
Polling query workload overhead - View maintenance, alerts / triggers, application
processing
Insufficient programming model - State management, failure and recovery semantics,
parallelization, optimization, orchestration left to applications in many cases
© 2020 SPLUNK INC.
Polling query overhead: how bad is it, really?
© 2020 SPLUNK INC.
Bad.
This cohort:
p99: 94.91%
p90: 94.29%
p75: 79.63%
p50: 63.21%
Anonymized real deployments.
QueriesexecutedoverT
© 2020 SPLUNK INC.
If we could ... we’d get ...
Move view building, alerting, and app processing to the ingest stream
Up to 94% query workload reduction / more time for ad hoc queries
Reduced latency to data visibility / faster time to act
Safer, more consistent and efficient programming model for developers
© 2020 SPLUNK INC.
If we could ... we’d get ...
Let developers / end users tap into the ingest stream
Deeper integration opportunities with other data infra
New ingest-time extension points / support for more use cases
© 2020 SPLUNK INC.
If we could ... we’d get ...
Support additional ingest-time functionality
Better data quality downstream
Better data governance and control opportunities
Better DR/BCP/scaling primitives and function
© 2020 SPLUNK INC.
What would Splunk look like if we added stream processing as a platform primitive?
© 2020 SPLUNK INC.
phase 1: ingest
© 2020 SPLUNK INC.
Focus on primitives and ingestion process
Optional - support incremental adoption by on prem customers
Minimal disruption to current semantics (only improvements)
Easy wins
© 2020 SPLUNK INC.
Connector
Agent
Indexer
Index
View
Query
Connector
Agent
Indexer
Index
View
Query
Firehose
Pubsub Topic
DSP
Flink-Powered
Before
After
3rd Party
System
© 2020 SPLUNK INC.
Data Stream Processor (DSP) is Splunk’s Flink-powered stream processing engine
All connectors produce to the firehose pubsub topic in a common envelope
DSP has both system- and user-configured pipelines that process data
Pipeline = Source(s) → Function(s) → Sink(s)
Parse, route, enrich, filter, aggregate, join, union, branch, etc.
Functions can be written in DSL or “native” SDK
Streaming Infrastructure
© 2020 SPLUNK INC.
The Firehose
Common envelope for all data
“Body” can be any data type, “sourcetype” provides semantic information
Firehose represented as a source in DSP
More specific sources are firehose + filter
© 2020 SPLUNK INC.
Pipeline Patterns
“Upgrade” of semantic information
“Routing” implemented as branches into filter + sink pairs
“External functions” use queues to produce to / consume from external systems
StateFun 2.0 is more interesting though - adds state mgmt, removes overhead
“Group functions” are fragments of a DAG that can be inlined
Function-level RBAC + group functions = efficiently limit data access
© 2020 SPLUNK INC.
Phase 1 gives us...
Deeper integration opportunities with other data infra
New ingest-time extension points / support for more use cases
Better data quality downstream
Better data governance and control opportunities
Better DR/BCP/scaling primitives and function
© 2020 SPLUNK INC.
phase 2: move processing
upstream
© 2020 SPLUNK INC.
Push polling operations upstream (“to the left”)
Allow developers to migrate batch processes to the stream, in part or whole
Make a number of platform functions (e.g. alerting) streaming by default
© 2020 SPLUNK INC.
Connector
Agent
Indexer
Index
View
Query
Firehose
Pubsub Topic
DSP
Flink-Powered
3rd Party
System
Before
Connector
Agent
Indexer Index
View
Query
Firehose
Pubsub Topic
DSP
Flink-Powered
3rd Party
System
After
Application
Functions
Application
Data Infra
© 2020 SPLUNK INC.
Critial enablers
DSP uses a dynamic function registry to load plugins
Pre-built content system for applications to add pipeline templates, functions
Pubsub firehose lets applications see whatever they need
Auto-scale Flink clusters based on slot availability
© 2020 SPLUNK INC.
Phase 2 gives us...
Up to 94% query workload reduction / more time for ad hoc queries
Reduced latency to data visibility / faster time to act
Safer, more consistent and efficient programming model for developers
Even more
Deeper integration opportunities with other data infra
New ingest-time extension points / support for more use cases
© 2020 SPLUNK INC.
phase 3: realtime query
© 2020 SPLUNK INC.
In phase 2 we moved application workloads; this is about ad hoc realtime queries
Splunk has realtime query today, but there are some gotchas
Computationally expensive (but still better than many systems)
Complex data guarantees
Can we make it cheap to run thousands of concurrent realtime queries?
With rapid start/stop?
© 2020 SPLUNK INC.
I’ll let you know when I do. Sorry.
© 2020 SPLUNK INC.
adding it up
© 2020 SPLUNK INC.
When working with data, streams are the right primitive
Concrete semantics, patterns, and architecture are key
The details are important (e.g. state management)
As patterns evolve, so must we
Thank You
© 2020 SPLUNK INC.
esammer@splunk.com | @esammer

Contenu connexe

Tendances

Virtual Flink Forward 2020: Integrate Flink with Kubernetes natively - Yang Wang
Virtual Flink Forward 2020: Integrate Flink with Kubernetes natively - Yang WangVirtual Flink Forward 2020: Integrate Flink with Kubernetes natively - Yang Wang
Virtual Flink Forward 2020: Integrate Flink with Kubernetes natively - Yang WangFlink Forward
 
Towards Flink 2.0: Unified Batch & Stream Processing - Aljoscha Krettek, Ver...
Towards Flink 2.0:  Unified Batch & Stream Processing - Aljoscha Krettek, Ver...Towards Flink 2.0:  Unified Batch & Stream Processing - Aljoscha Krettek, Ver...
Towards Flink 2.0: Unified Batch & Stream Processing - Aljoscha Krettek, Ver...Flink Forward
 
Flink Forward San Francisco 2018: Andrew Gao & Jeff Sharpe - "Finding Bad Ac...
Flink Forward San Francisco 2018: Andrew Gao &  Jeff Sharpe - "Finding Bad Ac...Flink Forward San Francisco 2018: Andrew Gao &  Jeff Sharpe - "Finding Bad Ac...
Flink Forward San Francisco 2018: Andrew Gao & Jeff Sharpe - "Finding Bad Ac...Flink Forward
 
Virtual Flink Forward 2020: Production-Ready Flink and Hive Integration - wha...
Virtual Flink Forward 2020: Production-Ready Flink and Hive Integration - wha...Virtual Flink Forward 2020: Production-Ready Flink and Hive Integration - wha...
Virtual Flink Forward 2020: Production-Ready Flink and Hive Integration - wha...Flink Forward
 
Flink Forward San Francisco 2019: Towards Flink 2.0: Rethinking the stack and...
Flink Forward San Francisco 2019: Towards Flink 2.0: Rethinking the stack and...Flink Forward San Francisco 2019: Towards Flink 2.0: Rethinking the stack and...
Flink Forward San Francisco 2019: Towards Flink 2.0: Rethinking the stack and...Flink Forward
 
Flink Forward San Francisco 2018 keynote: Srikanth Satya - "Stream Processin...
Flink Forward San Francisco 2018 keynote:  Srikanth Satya - "Stream Processin...Flink Forward San Francisco 2018 keynote:  Srikanth Satya - "Stream Processin...
Flink Forward San Francisco 2018 keynote: Srikanth Satya - "Stream Processin...Flink Forward
 
Streaming your Lyft Ride Prices - Flink Forward SF 2019
Streaming your Lyft Ride Prices - Flink Forward SF 2019Streaming your Lyft Ride Prices - Flink Forward SF 2019
Streaming your Lyft Ride Prices - Flink Forward SF 2019Thomas Weise
 
A stream: Ad-hoc Shared Stream Processing - Jeyhun Karimov, DFKI GmbH
A stream: Ad-hoc Shared Stream Processing - Jeyhun Karimov, DFKI GmbH A stream: Ad-hoc Shared Stream Processing - Jeyhun Karimov, DFKI GmbH
A stream: Ad-hoc Shared Stream Processing - Jeyhun Karimov, DFKI GmbH Flink Forward
 
Flink Forward San Francisco 2018: Gregory Fee - "Bootstrapping State In Apach...
Flink Forward San Francisco 2018: Gregory Fee - "Bootstrapping State In Apach...Flink Forward San Francisco 2018: Gregory Fee - "Bootstrapping State In Apach...
Flink Forward San Francisco 2018: Gregory Fee - "Bootstrapping State In Apach...Flink Forward
 
Scaling stream data pipelines with Pravega and Apache Flink
Scaling stream data pipelines with Pravega and Apache FlinkScaling stream data pipelines with Pravega and Apache Flink
Scaling stream data pipelines with Pravega and Apache FlinkTill Rohrmann
 
Flink Forward San Francisco 2018: Dave Torok & Sameer Wadkar - "Embedding Fl...
Flink Forward San Francisco 2018:  Dave Torok & Sameer Wadkar - "Embedding Fl...Flink Forward San Francisco 2018:  Dave Torok & Sameer Wadkar - "Embedding Fl...
Flink Forward San Francisco 2018: Dave Torok & Sameer Wadkar - "Embedding Fl...Flink Forward
 
FlinkDTW: Time-series Pattern Search at Scale Using Dynamic Time Warping - Ch...
FlinkDTW: Time-series Pattern Search at Scale Using Dynamic Time Warping - Ch...FlinkDTW: Time-series Pattern Search at Scale Using Dynamic Time Warping - Ch...
FlinkDTW: Time-series Pattern Search at Scale Using Dynamic Time Warping - Ch...Flink Forward
 
KEYNOTE Flink Forward San Francisco 2019: From Stream Processor to a Unified ...
KEYNOTE Flink Forward San Francisco 2019: From Stream Processor to a Unified ...KEYNOTE Flink Forward San Francisco 2019: From Stream Processor to a Unified ...
KEYNOTE Flink Forward San Francisco 2019: From Stream Processor to a Unified ...Flink Forward
 
Flink Forward San Francisco 2018: Andrew Torson - "Extending Flink metrics: R...
Flink Forward San Francisco 2018: Andrew Torson - "Extending Flink metrics: R...Flink Forward San Francisco 2018: Andrew Torson - "Extending Flink metrics: R...
Flink Forward San Francisco 2018: Andrew Torson - "Extending Flink metrics: R...Flink Forward
 
Flink Forward San Francisco 2018: Jörg Schad and Biswajit Das - "Operating Fl...
Flink Forward San Francisco 2018: Jörg Schad and Biswajit Das - "Operating Fl...Flink Forward San Francisco 2018: Jörg Schad and Biswajit Das - "Operating Fl...
Flink Forward San Francisco 2018: Jörg Schad and Biswajit Das - "Operating Fl...Flink Forward
 
Flink Forward San Francisco 2018: Xu Yang - "Alibaba’s common algorithm platf...
Flink Forward San Francisco 2018: Xu Yang - "Alibaba’s common algorithm platf...Flink Forward San Francisco 2018: Xu Yang - "Alibaba’s common algorithm platf...
Flink Forward San Francisco 2018: Xu Yang - "Alibaba’s common algorithm platf...Flink Forward
 
How to Improve Performance Testing Using InfluxDB and Apache JMeter
How to Improve Performance Testing Using InfluxDB and Apache JMeterHow to Improve Performance Testing Using InfluxDB and Apache JMeter
How to Improve Performance Testing Using InfluxDB and Apache JMeterInfluxData
 
Stream Processing with Apache Flink
Stream Processing with Apache FlinkStream Processing with Apache Flink
Stream Processing with Apache FlinkC4Media
 

Tendances (20)

Virtual Flink Forward 2020: Integrate Flink with Kubernetes natively - Yang Wang
Virtual Flink Forward 2020: Integrate Flink with Kubernetes natively - Yang WangVirtual Flink Forward 2020: Integrate Flink with Kubernetes natively - Yang Wang
Virtual Flink Forward 2020: Integrate Flink with Kubernetes natively - Yang Wang
 
Towards Flink 2.0: Unified Batch & Stream Processing - Aljoscha Krettek, Ver...
Towards Flink 2.0:  Unified Batch & Stream Processing - Aljoscha Krettek, Ver...Towards Flink 2.0:  Unified Batch & Stream Processing - Aljoscha Krettek, Ver...
Towards Flink 2.0: Unified Batch & Stream Processing - Aljoscha Krettek, Ver...
 
Flink Forward San Francisco 2018: Andrew Gao & Jeff Sharpe - "Finding Bad Ac...
Flink Forward San Francisco 2018: Andrew Gao &  Jeff Sharpe - "Finding Bad Ac...Flink Forward San Francisco 2018: Andrew Gao &  Jeff Sharpe - "Finding Bad Ac...
Flink Forward San Francisco 2018: Andrew Gao & Jeff Sharpe - "Finding Bad Ac...
 
Virtual Flink Forward 2020: Production-Ready Flink and Hive Integration - wha...
Virtual Flink Forward 2020: Production-Ready Flink and Hive Integration - wha...Virtual Flink Forward 2020: Production-Ready Flink and Hive Integration - wha...
Virtual Flink Forward 2020: Production-Ready Flink and Hive Integration - wha...
 
Flink Forward San Francisco 2019: Towards Flink 2.0: Rethinking the stack and...
Flink Forward San Francisco 2019: Towards Flink 2.0: Rethinking the stack and...Flink Forward San Francisco 2019: Towards Flink 2.0: Rethinking the stack and...
Flink Forward San Francisco 2019: Towards Flink 2.0: Rethinking the stack and...
 
Flink Forward San Francisco 2018 keynote: Srikanth Satya - "Stream Processin...
Flink Forward San Francisco 2018 keynote:  Srikanth Satya - "Stream Processin...Flink Forward San Francisco 2018 keynote:  Srikanth Satya - "Stream Processin...
Flink Forward San Francisco 2018 keynote: Srikanth Satya - "Stream Processin...
 
Streaming your Lyft Ride Prices - Flink Forward SF 2019
Streaming your Lyft Ride Prices - Flink Forward SF 2019Streaming your Lyft Ride Prices - Flink Forward SF 2019
Streaming your Lyft Ride Prices - Flink Forward SF 2019
 
A stream: Ad-hoc Shared Stream Processing - Jeyhun Karimov, DFKI GmbH
A stream: Ad-hoc Shared Stream Processing - Jeyhun Karimov, DFKI GmbH A stream: Ad-hoc Shared Stream Processing - Jeyhun Karimov, DFKI GmbH
A stream: Ad-hoc Shared Stream Processing - Jeyhun Karimov, DFKI GmbH
 
Flink Forward San Francisco 2018: Gregory Fee - "Bootstrapping State In Apach...
Flink Forward San Francisco 2018: Gregory Fee - "Bootstrapping State In Apach...Flink Forward San Francisco 2018: Gregory Fee - "Bootstrapping State In Apach...
Flink Forward San Francisco 2018: Gregory Fee - "Bootstrapping State In Apach...
 
Scaling stream data pipelines with Pravega and Apache Flink
Scaling stream data pipelines with Pravega and Apache FlinkScaling stream data pipelines with Pravega and Apache Flink
Scaling stream data pipelines with Pravega and Apache Flink
 
Flink Forward San Francisco 2018: Dave Torok & Sameer Wadkar - "Embedding Fl...
Flink Forward San Francisco 2018:  Dave Torok & Sameer Wadkar - "Embedding Fl...Flink Forward San Francisco 2018:  Dave Torok & Sameer Wadkar - "Embedding Fl...
Flink Forward San Francisco 2018: Dave Torok & Sameer Wadkar - "Embedding Fl...
 
dA Platform Overview
dA Platform OverviewdA Platform Overview
dA Platform Overview
 
FlinkDTW: Time-series Pattern Search at Scale Using Dynamic Time Warping - Ch...
FlinkDTW: Time-series Pattern Search at Scale Using Dynamic Time Warping - Ch...FlinkDTW: Time-series Pattern Search at Scale Using Dynamic Time Warping - Ch...
FlinkDTW: Time-series Pattern Search at Scale Using Dynamic Time Warping - Ch...
 
KEYNOTE Flink Forward San Francisco 2019: From Stream Processor to a Unified ...
KEYNOTE Flink Forward San Francisco 2019: From Stream Processor to a Unified ...KEYNOTE Flink Forward San Francisco 2019: From Stream Processor to a Unified ...
KEYNOTE Flink Forward San Francisco 2019: From Stream Processor to a Unified ...
 
Flink Forward San Francisco 2018: Andrew Torson - "Extending Flink metrics: R...
Flink Forward San Francisco 2018: Andrew Torson - "Extending Flink metrics: R...Flink Forward San Francisco 2018: Andrew Torson - "Extending Flink metrics: R...
Flink Forward San Francisco 2018: Andrew Torson - "Extending Flink metrics: R...
 
Flink Forward San Francisco 2018: Jörg Schad and Biswajit Das - "Operating Fl...
Flink Forward San Francisco 2018: Jörg Schad and Biswajit Das - "Operating Fl...Flink Forward San Francisco 2018: Jörg Schad and Biswajit Das - "Operating Fl...
Flink Forward San Francisco 2018: Jörg Schad and Biswajit Das - "Operating Fl...
 
Flink Forward San Francisco 2018: Xu Yang - "Alibaba’s common algorithm platf...
Flink Forward San Francisco 2018: Xu Yang - "Alibaba’s common algorithm platf...Flink Forward San Francisco 2018: Xu Yang - "Alibaba’s common algorithm platf...
Flink Forward San Francisco 2018: Xu Yang - "Alibaba’s common algorithm platf...
 
How to Improve Performance Testing Using InfluxDB and Apache JMeter
How to Improve Performance Testing Using InfluxDB and Apache JMeterHow to Improve Performance Testing Using InfluxDB and Apache JMeter
How to Improve Performance Testing Using InfluxDB and Apache JMeter
 
Stream Processing with Apache Flink
Stream Processing with Apache FlinkStream Processing with Apache Flink
Stream Processing with Apache Flink
 
Flink. Pure Streaming
Flink. Pure StreamingFlink. Pure Streaming
Flink. Pure Streaming
 

Similaire à Virtual Flink Forward 2020: Keynote: The Evolution of Data Infrastructure at Splunk - Eric Sammer

Delivering New Visibility and Analytics for IT Operations
Delivering New Visibility and Analytics for IT OperationsDelivering New Visibility and Analytics for IT Operations
Delivering New Visibility and Analytics for IT OperationsGabrielle Knowles
 
SplunkLive Auckland - Operational Intelligence
SplunkLive Auckland - Operational IntelligenceSplunkLive Auckland - Operational Intelligence
SplunkLive Auckland - Operational IntelligenceSplunk
 
SplunkLive Wellington 2015 - Operational Intelligence
SplunkLive Wellington 2015 - Operational IntelligenceSplunkLive Wellington 2015 - Operational Intelligence
SplunkLive Wellington 2015 - Operational IntelligenceSplunk
 
SAM - Streaming Analytics Made Easy
SAM - Streaming Analytics Made EasySAM - Streaming Analytics Made Easy
SAM - Streaming Analytics Made EasyDataWorks Summit
 
SplunkLive! London 2017 - Happy Apps, Happy Users
SplunkLive! London 2017 - Happy Apps, Happy UsersSplunkLive! London 2017 - Happy Apps, Happy Users
SplunkLive! London 2017 - Happy Apps, Happy UsersSplunk
 
Next gen tooling for building streaming analytics apps: code-less development...
Next gen tooling for building streaming analytics apps: code-less development...Next gen tooling for building streaming analytics apps: code-less development...
Next gen tooling for building streaming analytics apps: code-less development...DataWorks Summit
 
Next Generation Tooling for building streaming analytics app
Next Generation Tooling for building streaming analytics appNext Generation Tooling for building streaming analytics app
Next Generation Tooling for building streaming analytics appgvetticaden
 
SplunkLive! Frankfurt 2018 - Monitoring the End User Experience with Splunk
SplunkLive! Frankfurt 2018 - Monitoring the End User Experience with SplunkSplunkLive! Frankfurt 2018 - Monitoring the End User Experience with Splunk
SplunkLive! Frankfurt 2018 - Monitoring the End User Experience with SplunkSplunk
 
Splunk Data Onboarding Overview - Splunk Data Collection Architecture
Splunk Data Onboarding Overview - Splunk Data Collection ArchitectureSplunk Data Onboarding Overview - Splunk Data Collection Architecture
Splunk Data Onboarding Overview - Splunk Data Collection ArchitectureSplunk
 
Splunk Cloud and Splunk Enterprise 7.2
Splunk Cloud and Splunk Enterprise 7.2 Splunk Cloud and Splunk Enterprise 7.2
Splunk Cloud and Splunk Enterprise 7.2 Splunk
 
Splunk Cloud and Splunk Enterprise 7.2
Splunk Cloud and Splunk Enterprise 7.2 Splunk Cloud and Splunk Enterprise 7.2
Splunk Cloud and Splunk Enterprise 7.2 Splunk
 
Splunk Cloud and Splunk Enterprise 7.2
Splunk Cloud and Splunk Enterprise 7.2Splunk Cloud and Splunk Enterprise 7.2
Splunk Cloud and Splunk Enterprise 7.2Splunk
 
Spark and machine learning in microservices architecture
Spark and machine learning in microservices architectureSpark and machine learning in microservices architecture
Spark and machine learning in microservices architectureStepan Pushkarev
 
What's New with the Latest Splunk Platform Release
What's New with the Latest Splunk Platform ReleaseWhat's New with the Latest Splunk Platform Release
What's New with the Latest Splunk Platform ReleaseSplunk
 
SplunkLive! Munich 2018: Monitoring the End-User Experience with Splunk
SplunkLive! Munich 2018: Monitoring the End-User Experience with SplunkSplunkLive! Munich 2018: Monitoring the End-User Experience with Splunk
SplunkLive! Munich 2018: Monitoring the End-User Experience with SplunkSplunk
 
SplunkLive! Zurich 2018: Monitoring the End User Experience with Splunk
SplunkLive! Zurich 2018: Monitoring the End User Experience with SplunkSplunkLive! Zurich 2018: Monitoring the End User Experience with Splunk
SplunkLive! Zurich 2018: Monitoring the End User Experience with SplunkSplunk
 
Splunk Enterprise 6.3 - Splunk Tech Day
Splunk Enterprise 6.3 - Splunk Tech DaySplunk Enterprise 6.3 - Splunk Tech Day
Splunk Enterprise 6.3 - Splunk Tech DayZivaro Inc
 
Spring and Pivotal Application Service - SpringOne Tour Dallas
Spring and Pivotal Application Service - SpringOne Tour DallasSpring and Pivotal Application Service - SpringOne Tour Dallas
Spring and Pivotal Application Service - SpringOne Tour DallasVMware Tanzu
 

Similaire à Virtual Flink Forward 2020: Keynote: The Evolution of Data Infrastructure at Splunk - Eric Sammer (20)

Delivering New Visibility and Analytics for IT Operations
Delivering New Visibility and Analytics for IT OperationsDelivering New Visibility and Analytics for IT Operations
Delivering New Visibility and Analytics for IT Operations
 
SplunkLive Auckland - Operational Intelligence
SplunkLive Auckland - Operational IntelligenceSplunkLive Auckland - Operational Intelligence
SplunkLive Auckland - Operational Intelligence
 
SplunkLive Wellington 2015 - Operational Intelligence
SplunkLive Wellington 2015 - Operational IntelligenceSplunkLive Wellington 2015 - Operational Intelligence
SplunkLive Wellington 2015 - Operational Intelligence
 
SAM - Streaming Analytics Made Easy
SAM - Streaming Analytics Made EasySAM - Streaming Analytics Made Easy
SAM - Streaming Analytics Made Easy
 
SplunkLive! London 2017 - Happy Apps, Happy Users
SplunkLive! London 2017 - Happy Apps, Happy UsersSplunkLive! London 2017 - Happy Apps, Happy Users
SplunkLive! London 2017 - Happy Apps, Happy Users
 
Next gen tooling for building streaming analytics apps: code-less development...
Next gen tooling for building streaming analytics apps: code-less development...Next gen tooling for building streaming analytics apps: code-less development...
Next gen tooling for building streaming analytics apps: code-less development...
 
Next Generation Tooling for building streaming analytics app
Next Generation Tooling for building streaming analytics appNext Generation Tooling for building streaming analytics app
Next Generation Tooling for building streaming analytics app
 
SplunkLive! Frankfurt 2018 - Monitoring the End User Experience with Splunk
SplunkLive! Frankfurt 2018 - Monitoring the End User Experience with SplunkSplunkLive! Frankfurt 2018 - Monitoring the End User Experience with Splunk
SplunkLive! Frankfurt 2018 - Monitoring the End User Experience with Splunk
 
Splunk Data Onboarding Overview - Splunk Data Collection Architecture
Splunk Data Onboarding Overview - Splunk Data Collection ArchitectureSplunk Data Onboarding Overview - Splunk Data Collection Architecture
Splunk Data Onboarding Overview - Splunk Data Collection Architecture
 
Streaming analytics manager
Streaming analytics managerStreaming analytics manager
Streaming analytics manager
 
Splunk Cloud and Splunk Enterprise 7.2
Splunk Cloud and Splunk Enterprise 7.2 Splunk Cloud and Splunk Enterprise 7.2
Splunk Cloud and Splunk Enterprise 7.2
 
Splunk Cloud and Splunk Enterprise 7.2
Splunk Cloud and Splunk Enterprise 7.2 Splunk Cloud and Splunk Enterprise 7.2
Splunk Cloud and Splunk Enterprise 7.2
 
Splunk Cloud and Splunk Enterprise 7.2
Splunk Cloud and Splunk Enterprise 7.2Splunk Cloud and Splunk Enterprise 7.2
Splunk Cloud and Splunk Enterprise 7.2
 
Spark and machine learning in microservices architecture
Spark and machine learning in microservices architectureSpark and machine learning in microservices architecture
Spark and machine learning in microservices architecture
 
What's New with the Latest Splunk Platform Release
What's New with the Latest Splunk Platform ReleaseWhat's New with the Latest Splunk Platform Release
What's New with the Latest Splunk Platform Release
 
Modern Monitoring
Modern MonitoringModern Monitoring
Modern Monitoring
 
SplunkLive! Munich 2018: Monitoring the End-User Experience with Splunk
SplunkLive! Munich 2018: Monitoring the End-User Experience with SplunkSplunkLive! Munich 2018: Monitoring the End-User Experience with Splunk
SplunkLive! Munich 2018: Monitoring the End-User Experience with Splunk
 
SplunkLive! Zurich 2018: Monitoring the End User Experience with Splunk
SplunkLive! Zurich 2018: Monitoring the End User Experience with SplunkSplunkLive! Zurich 2018: Monitoring the End User Experience with Splunk
SplunkLive! Zurich 2018: Monitoring the End User Experience with Splunk
 
Splunk Enterprise 6.3 - Splunk Tech Day
Splunk Enterprise 6.3 - Splunk Tech DaySplunk Enterprise 6.3 - Splunk Tech Day
Splunk Enterprise 6.3 - Splunk Tech Day
 
Spring and Pivotal Application Service - SpringOne Tour Dallas
Spring and Pivotal Application Service - SpringOne Tour DallasSpring and Pivotal Application Service - SpringOne Tour Dallas
Spring and Pivotal Application Service - SpringOne Tour Dallas
 

Plus de Flink Forward

Building a fully managed stream processing platform on Flink at scale for Lin...
Building a fully managed stream processing platform on Flink at scale for Lin...Building a fully managed stream processing platform on Flink at scale for Lin...
Building a fully managed stream processing platform on Flink at scale for Lin...Flink Forward
 
Evening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in FlinkEvening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in FlinkFlink Forward
 
“Alexa, be quiet!”: End-to-end near-real time model building and evaluation i...
“Alexa, be quiet!”: End-to-end near-real time model building and evaluation i...“Alexa, be quiet!”: End-to-end near-real time model building and evaluation i...
“Alexa, be quiet!”: End-to-end near-real time model building and evaluation i...Flink Forward
 
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...Flink Forward
 
Introducing the Apache Flink Kubernetes Operator
Introducing the Apache Flink Kubernetes OperatorIntroducing the Apache Flink Kubernetes Operator
Introducing the Apache Flink Kubernetes OperatorFlink Forward
 
Autoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive ModeAutoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive ModeFlink Forward
 
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...Flink Forward
 
One sink to rule them all: Introducing the new Async Sink
One sink to rule them all: Introducing the new Async SinkOne sink to rule them all: Introducing the new Async Sink
One sink to rule them all: Introducing the new Async SinkFlink Forward
 
Tuning Apache Kafka Connectors for Flink.pptx
Tuning Apache Kafka Connectors for Flink.pptxTuning Apache Kafka Connectors for Flink.pptx
Tuning Apache Kafka Connectors for Flink.pptxFlink Forward
 
Flink powered stream processing platform at Pinterest
Flink powered stream processing platform at PinterestFlink powered stream processing platform at Pinterest
Flink powered stream processing platform at PinterestFlink Forward
 
Apache Flink in the Cloud-Native Era
Apache Flink in the Cloud-Native EraApache Flink in the Cloud-Native Era
Apache Flink in the Cloud-Native EraFlink Forward
 
Where is my bottleneck? Performance troubleshooting in Flink
Where is my bottleneck? Performance troubleshooting in FlinkWhere is my bottleneck? Performance troubleshooting in Flink
Where is my bottleneck? Performance troubleshooting in FlinkFlink Forward
 
Using the New Apache Flink Kubernetes Operator in a Production Deployment
Using the New Apache Flink Kubernetes Operator in a Production DeploymentUsing the New Apache Flink Kubernetes Operator in a Production Deployment
Using the New Apache Flink Kubernetes Operator in a Production DeploymentFlink Forward
 
The Current State of Table API in 2022
The Current State of Table API in 2022The Current State of Table API in 2022
The Current State of Table API in 2022Flink Forward
 
Flink SQL on Pulsar made easy
Flink SQL on Pulsar made easyFlink SQL on Pulsar made easy
Flink SQL on Pulsar made easyFlink Forward
 
Dynamic Rule-based Real-time Market Data Alerts
Dynamic Rule-based Real-time Market Data AlertsDynamic Rule-based Real-time Market Data Alerts
Dynamic Rule-based Real-time Market Data AlertsFlink Forward
 
Exactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and PinotExactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and PinotFlink Forward
 
Processing Semantically-Ordered Streams in Financial Services
Processing Semantically-Ordered Streams in Financial ServicesProcessing Semantically-Ordered Streams in Financial Services
Processing Semantically-Ordered Streams in Financial ServicesFlink Forward
 
Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...Flink Forward
 
Batch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & IcebergBatch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & IcebergFlink Forward
 

Plus de Flink Forward (20)

Building a fully managed stream processing platform on Flink at scale for Lin...
Building a fully managed stream processing platform on Flink at scale for Lin...Building a fully managed stream processing platform on Flink at scale for Lin...
Building a fully managed stream processing platform on Flink at scale for Lin...
 
Evening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in FlinkEvening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in Flink
 
“Alexa, be quiet!”: End-to-end near-real time model building and evaluation i...
“Alexa, be quiet!”: End-to-end near-real time model building and evaluation i...“Alexa, be quiet!”: End-to-end near-real time model building and evaluation i...
“Alexa, be quiet!”: End-to-end near-real time model building and evaluation i...
 
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
 
Introducing the Apache Flink Kubernetes Operator
Introducing the Apache Flink Kubernetes OperatorIntroducing the Apache Flink Kubernetes Operator
Introducing the Apache Flink Kubernetes Operator
 
Autoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive ModeAutoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive Mode
 
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
 
One sink to rule them all: Introducing the new Async Sink
One sink to rule them all: Introducing the new Async SinkOne sink to rule them all: Introducing the new Async Sink
One sink to rule them all: Introducing the new Async Sink
 
Tuning Apache Kafka Connectors for Flink.pptx
Tuning Apache Kafka Connectors for Flink.pptxTuning Apache Kafka Connectors for Flink.pptx
Tuning Apache Kafka Connectors for Flink.pptx
 
Flink powered stream processing platform at Pinterest
Flink powered stream processing platform at PinterestFlink powered stream processing platform at Pinterest
Flink powered stream processing platform at Pinterest
 
Apache Flink in the Cloud-Native Era
Apache Flink in the Cloud-Native EraApache Flink in the Cloud-Native Era
Apache Flink in the Cloud-Native Era
 
Where is my bottleneck? Performance troubleshooting in Flink
Where is my bottleneck? Performance troubleshooting in FlinkWhere is my bottleneck? Performance troubleshooting in Flink
Where is my bottleneck? Performance troubleshooting in Flink
 
Using the New Apache Flink Kubernetes Operator in a Production Deployment
Using the New Apache Flink Kubernetes Operator in a Production DeploymentUsing the New Apache Flink Kubernetes Operator in a Production Deployment
Using the New Apache Flink Kubernetes Operator in a Production Deployment
 
The Current State of Table API in 2022
The Current State of Table API in 2022The Current State of Table API in 2022
The Current State of Table API in 2022
 
Flink SQL on Pulsar made easy
Flink SQL on Pulsar made easyFlink SQL on Pulsar made easy
Flink SQL on Pulsar made easy
 
Dynamic Rule-based Real-time Market Data Alerts
Dynamic Rule-based Real-time Market Data AlertsDynamic Rule-based Real-time Market Data Alerts
Dynamic Rule-based Real-time Market Data Alerts
 
Exactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and PinotExactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and Pinot
 
Processing Semantically-Ordered Streams in Financial Services
Processing Semantically-Ordered Streams in Financial ServicesProcessing Semantically-Ordered Streams in Financial Services
Processing Semantically-Ordered Streams in Financial Services
 
Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...
 
Batch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & IcebergBatch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & Iceberg
 

Dernier

Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesZilliz
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 

Dernier (20)

Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector Databases
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 

Virtual Flink Forward 2020: Keynote: The Evolution of Data Infrastructure at Splunk - Eric Sammer

  • 1. © 2020 SPLUNK INC. The Evolution of Data Infrastructure at Splunk Flink Forward SF/Virtual 2020 Eric Sammer - VP, Distinguished Engineer esammer@splunk.com
  • 2. © 2020 SPLUNK INC. What is Splunk? A platform for the collection, storage, query, and analysis of event and time series data. Logs, but other kinds of events too Tons of query-time processing features Core platform experience, increasingly domain-specific applications
  • 3. © 2020 SPLUNK INC. Ad hoc query - Human and apps Scheduled query - Materialized view maintenance, app processes, alerting Realtime query - Ad hoc (“Live tail”), app processes Classes of Workload
  • 4. © 2020 SPLUNK INC. > +
  • 5. © 2020 SPLUNK INC. two and a half years ago...
  • 6. © 2020 SPLUNK INC. Connector Agent Indexer Pipeline Index All data lands in an index Index
  • 7. © 2020 SPLUNK INC. Index Index Query Application Applications and platform services use scheduled queries to build views or process data Query View ApplicationQuery
  • 8. © 2020 SPLUNK INC. Challenges Latency Polling query workload overhead - View maintenance, alerts / triggers, application processing Insufficient programming model - State management, failure and recovery semantics, parallelization, optimization, orchestration left to applications in many cases
  • 9. © 2020 SPLUNK INC. Polling query overhead: how bad is it, really?
  • 10. © 2020 SPLUNK INC. Bad. This cohort: p99: 94.91% p90: 94.29% p75: 79.63% p50: 63.21% Anonymized real deployments. QueriesexecutedoverT
  • 11. © 2020 SPLUNK INC. If we could ... we’d get ... Move view building, alerting, and app processing to the ingest stream Up to 94% query workload reduction / more time for ad hoc queries Reduced latency to data visibility / faster time to act Safer, more consistent and efficient programming model for developers
  • 12. © 2020 SPLUNK INC. If we could ... we’d get ... Let developers / end users tap into the ingest stream Deeper integration opportunities with other data infra New ingest-time extension points / support for more use cases
  • 13. © 2020 SPLUNK INC. If we could ... we’d get ... Support additional ingest-time functionality Better data quality downstream Better data governance and control opportunities Better DR/BCP/scaling primitives and function
  • 14. © 2020 SPLUNK INC. What would Splunk look like if we added stream processing as a platform primitive?
  • 15. © 2020 SPLUNK INC. phase 1: ingest
  • 16. © 2020 SPLUNK INC. Focus on primitives and ingestion process Optional - support incremental adoption by on prem customers Minimal disruption to current semantics (only improvements) Easy wins
  • 17. © 2020 SPLUNK INC. Connector Agent Indexer Index View Query Connector Agent Indexer Index View Query Firehose Pubsub Topic DSP Flink-Powered Before After 3rd Party System
  • 18. © 2020 SPLUNK INC. Data Stream Processor (DSP) is Splunk’s Flink-powered stream processing engine All connectors produce to the firehose pubsub topic in a common envelope DSP has both system- and user-configured pipelines that process data Pipeline = Source(s) → Function(s) → Sink(s) Parse, route, enrich, filter, aggregate, join, union, branch, etc. Functions can be written in DSL or “native” SDK Streaming Infrastructure
  • 19. © 2020 SPLUNK INC. The Firehose Common envelope for all data “Body” can be any data type, “sourcetype” provides semantic information Firehose represented as a source in DSP More specific sources are firehose + filter
  • 20. © 2020 SPLUNK INC. Pipeline Patterns “Upgrade” of semantic information “Routing” implemented as branches into filter + sink pairs “External functions” use queues to produce to / consume from external systems StateFun 2.0 is more interesting though - adds state mgmt, removes overhead “Group functions” are fragments of a DAG that can be inlined Function-level RBAC + group functions = efficiently limit data access
  • 21. © 2020 SPLUNK INC. Phase 1 gives us... Deeper integration opportunities with other data infra New ingest-time extension points / support for more use cases Better data quality downstream Better data governance and control opportunities Better DR/BCP/scaling primitives and function
  • 22. © 2020 SPLUNK INC. phase 2: move processing upstream
  • 23. © 2020 SPLUNK INC. Push polling operations upstream (“to the left”) Allow developers to migrate batch processes to the stream, in part or whole Make a number of platform functions (e.g. alerting) streaming by default
  • 24. © 2020 SPLUNK INC. Connector Agent Indexer Index View Query Firehose Pubsub Topic DSP Flink-Powered 3rd Party System Before Connector Agent Indexer Index View Query Firehose Pubsub Topic DSP Flink-Powered 3rd Party System After Application Functions Application Data Infra
  • 25. © 2020 SPLUNK INC. Critial enablers DSP uses a dynamic function registry to load plugins Pre-built content system for applications to add pipeline templates, functions Pubsub firehose lets applications see whatever they need Auto-scale Flink clusters based on slot availability
  • 26. © 2020 SPLUNK INC. Phase 2 gives us... Up to 94% query workload reduction / more time for ad hoc queries Reduced latency to data visibility / faster time to act Safer, more consistent and efficient programming model for developers Even more Deeper integration opportunities with other data infra New ingest-time extension points / support for more use cases
  • 27. © 2020 SPLUNK INC. phase 3: realtime query
  • 28. © 2020 SPLUNK INC. In phase 2 we moved application workloads; this is about ad hoc realtime queries Splunk has realtime query today, but there are some gotchas Computationally expensive (but still better than many systems) Complex data guarantees Can we make it cheap to run thousands of concurrent realtime queries? With rapid start/stop?
  • 29. © 2020 SPLUNK INC. I’ll let you know when I do. Sorry.
  • 30. © 2020 SPLUNK INC. adding it up
  • 31. © 2020 SPLUNK INC. When working with data, streams are the right primitive Concrete semantics, patterns, and architecture are key The details are important (e.g. state management) As patterns evolve, so must we
  • 32. Thank You © 2020 SPLUNK INC. esammer@splunk.com | @esammer