© Cloudera, Inc. All rights reserved.
INTRODUCING
CLOUDERA DATAFLOW (CDF)
Dinesh Chandrasekhar
Product Marketing Lead, Data-in-Motion BU
Cloudera
@AppInt4All
George Vetticaden
Product Management Lead, Data-in-Motion BU
Cloudera
@gvetticaden
MARKET OPPORTUNITIES
Cloud: ~$410B
Streaming: ~$1.65B
Data Science: ~$180B
Big Data: ~$210B
IoT: ~$1.2T
IOT MARKET
• 24.9B: By 2024, more than 24.9 billion IoT connections will be established
• $70B: An estimated $70 billion will be spent by global manufacturers on IoT solutions in 2020
• 646M: An estimated 646 million healthcare devices (excluding fitness trackers and wearable devices) will be connected by 2020
• 78%: An estimated 78% of cars shipped globally will be built with hardware that connects to the internet by 2020
• 50%: 50% of decision-makers in IT, services, utilities, and manufacturing have either deployed IoT or will deploy it in the next 12-24 months
KEY CUSTOMER CHALLENGES
Visibility: Lack of visibility into end-to-end streaming data flows; inability to troubleshoot bottlenecks, consumption patterns, etc.
Data Ingestion: High-volume streaming sources, multiple message formats, diverse protocols, and multi-vendor devices create data ingestion challenges
Real-time Insights: Analyzing the continuous, rapid inflow (velocity) of streaming data at high volumes creates major challenges for gaining real-time insights
CLOUDERA DATAFLOW
WHAT IS CLOUDERA DATAFLOW (CDF)?
Cloudera DataFlow (CDF) is a scalable, real-time
streaming data platform that collects, curates, and
analyzes data so customers gain key insights for
immediate actionable intelligence.
HISTORY OF CDF
Mid-2000s: NiFi was developed and used at NSA
2015: Onyara is acquired; HDF is born
2018: Strong Streaming Platform
- Support for Kafka 2.0
- SMM is introduced
2019: Cloudera merger
Tomorrow: Edge-to-AI
Bring this to the edge with connected platforms
Data-in-Motion:
• Comprehensive real-time streaming data platform
• Manage data-in-motion from edge-to-enterprise
• Power IoT-scale streaming architectures
Enable next generation Modern Data Architecture
Enable Edge Intelligence
COMMON USE CASES
Data Movement
Optimize resource utilization by moving data
between data centers or between on-premises
infrastructure and cloud infrastructure
Optimize Log Collection & Analysis
Optimize log analytics solutions by using CDF
as a single platform to collect and deliver
multiple data sources
Gain key insights with Streaming Analytics
Accelerate big data ROI by analyzing
streaming data for patterns, comparing with ML
models and delivering actionable intelligence
Single view / 360° view of customer
Ingest, transform and combine customer
data from multiple sources into a single data
view / lake
Stream Processing
Combine multiple streams of data in real-
time, enrich the data and route it to different
end points based on rules
Capture IoT Data
Ingest sensor data from IoT devices and
stream it for further processing and
comprehensive analysis
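The Stream Processing use case above (combine streams, enrich, route by rules) can be sketched in a few lines. This is only an illustration, not how NiFi or CDF implement it: the `DEVICE_REGION` lookup table, the rule list, the field names, and the endpoint names are all hypothetical.

```python
from itertools import chain

# Hypothetical reference data used for enrichment (not from the deck).
DEVICE_REGION = {"truck-1": "us-west", "truck-2": "us-east"}

def enrich(event):
    """Return a copy of the event with a 'region' attribute added."""
    enriched = dict(event)
    enriched["region"] = DEVICE_REGION.get(event["device_id"], "unknown")
    return enriched

# Ordered (predicate, endpoint) rules; the first matching rule wins.
RULES = [
    (lambda e: e["speed_mph"] > 80, "alerts"),
    (lambda e: e["region"] == "us-west", "west-analytics"),
]

def route(event, default="archive"):
    """Pick the endpoint for an already-enriched event."""
    for predicate, endpoint in RULES:
        if predicate(event):
            return endpoint
    return default

def process(streams):
    """Combine several streams, enrich each event, and group events by endpoint."""
    routed = {}
    for event in chain.from_iterable(streams):
        enriched = enrich(event)
        routed.setdefault(route(enriched), []).append(enriched)
    return routed
```

Calling `process` with two lists of events returns a dictionary keyed by endpoint name. Here the streams are simply concatenated; a real deployment would interleave them as events arrive.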
COMMON IOT USE CASES BY INDUSTRY
Industries: Public Sector, Transportation, Utilities, Healthcare, Manufacturing, Retail
Top 5 Use cases
• Fleet Management
• Connected Cars
• Smart Cities
• Predictive Analytics
• Inventory/Material Tracking
• Utility Monitoring
• Predictive Maintenance
• Patient Monitoring
• Usage-based Insurance
• Asset Tracking / Monitoring
• Edge Data Collection
• IoT is a $1.13T market opportunity in 2021.
• Americas - $329B IoT spending. Manufacturing and Transportation are top industries, accounting for 26% of total spending.
• APAC - $500B IoT spending. Manufacturing, Utilities and Transportation are top industries.
• EMEA - $264B IoT spending. Manufacturing is top industry, powered by Industry 4.0 initiatives.
• Worldwide IoT Analytics and Information Management Market = $573M
CUSTOMERS
Improving Healthcare with SMART data
Combine multi-format data
streams, with hundreds of
sources, into one platform
• Needed a platform that could
combine multi-format data
streaming
• Data scarcity & latency
problems
• Machine learning & data
science
• First to deliver SMART real-
time streaming data
• Clearsense’s Inception™
product enables fast decisions
for clinicians
• Customers have access to all
data sources with HDP & CDF
Cloud-based systems
architected to deliver
SMART data, using HDP
and CDF
• Mission critical data is now
available for doctors to make
critical decisions
• Cost efficiencies led to access for
2,000 rural providers
• Real-time data helps prevent
“Code Blue”
Mission-critical data and
relevant insight for 2,000
rural providers
Photo by rawpixel on Unsplash
Lack of medical
expertise around
patient care, post
surgery
• Patient Code Blue status
• Possible cardiac arrest 4–
6 hours post surgery
CHALLENGE / RESULT / SOLUTION / IMPACT
Positioning technology products & services empower companies worldwide
Provide accurate data for
small carriers to improve
business results
• 95% of small carriers (fewer
than 50 trucks) have a deficit
of data available
• Estimated data, price points
and revenue base
opportunity for controlling
fuel cost
• Understanding of freight and
lane movement
• Leveraging big data powering
Blockchain, with machine
learning, to revolutionize
Transportation and Logistics
industries
• Analyzed fuel data; can
consolidate data set for small
carriers to generate community
data lake
Big Data in the Cloud
with HDP, CDF, and
Microsoft Azure
• Managing for 4 million
trucks daily
• $31 billion in freight
movement guides
customers to profitability
• Blockchain driven
architecture
Double digit revenue
increase, year over year
CHALLENGE
Photo by rawpixel.com on Unsplash
Continuing on current
path would slow
organizational growth and
impact customers
• Being unable to predict
weather patterns would lead to
delays and decreased product
quality
• Operational inefficiencies
prevent reaching business
revenue goals, lack of insights
• Loss of product during
transportation
RESULT / SOLUTION / IMPACT
PRODUCT OVERVIEW
CLOUDERA DATAFLOW
CLOUDERA DATAFLOW Data-in-motion platform
EDGE DATA MANAGEMENT
• Edge data collection powered by Apache MiNiFi
• MiNiFi – smaller footprint than NiFi
• Guaranteed delivery
• Data buffering
• Prioritized queuing
• Flow-specific QoS
• Data provenance
• Designed for extension
• C++ / Java agents
• Designed for IoT
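To make the buffering, prioritized queuing, and guaranteed-delivery bullets concrete, here is a minimal sketch of an edge-side buffer. It is not MiNiFi's implementation; the `EdgeBuffer` class, its methods, and the `send` callback are hypothetical names chosen for illustration. A record is only removed after `send` reports success, which gives at-least-once behavior.

```python
import heapq
import itertools

class EdgeBuffer:
    """Bounded priority buffer that keeps a record until delivery is acknowledged."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._heap = []                 # entries: (priority, sequence, record)
        self._seq = itertools.count()   # tie-breaker: FIFO within one priority

    def offer(self, record, priority):
        """Buffer a record (lower number = higher priority). Returns False when full."""
        if len(self._heap) >= self.capacity:
            return False                # back-pressure: the caller decides what to do
        heapq.heappush(self._heap, (priority, next(self._seq), record))
        return True

    def drain(self, send):
        """Send records in priority order; stop at the first failure and keep the rest."""
        delivered = 0
        while self._heap:
            _, _, record = self._heap[0]    # peek without removing
            if not send(record):
                break                       # record stays buffered for a later retry
            heapq.heappop(self._heap)       # remove only after a successful send
            delivered += 1
        return delivered

    def __len__(self):
        return len(self._heap)
```

If the uplink fails partway through `drain`, the undelivered records remain in the buffer and go out, still in priority order, on the next call.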
CLOUDERA DATAFLOW Data-in-motion platform
FLOW MANAGEMENT
• Web-based user interface
• Highly configurable
• Out-of-the-box data provenance
• Designed for extensibility
• Secure
• NiFi Registry
• DevOps support
• FDLC
• Versioning
• Deployment
280+ PROCESSORS FOR DEEPER ECOSYSTEM INTEGRATION
Hash
Extract
Merge
Duplicate
Scan
GeoEnrich
Replace
Convert
Split
Translate
Route Content
Route Context
Route Text
Control Rate
Distribute Load
Generate Table Fetch
Jolt Transform JSON
Prioritized Delivery
Encrypt
Tail
Evaluate
Execute
All Apache project logos are trademarks of the ASF and the respective projects.
Fetch
HTTP
Syslog
Email
HTML
Image
HL7
FTP
UDP
XML
SFTP
AMQP
WebSocket
CLOUDERA DATAFLOW Data-in-motion platform
Streaming Analytics Reference Architecture
Data Flow Apps
Powered by NiFi
Kafka is Everywhere. Critical Component of Streaming Architectures
Kafka Producers, Kafka Topics, Kafka Topics, Kafka Consumers & Producers, Kafka Consumers
US West Fleet: Truck Sensors, C++ Agent
US Central Fleet: Truck Sensors, C++ Agent
US East Fleet: Truck Sensors, C++ Agent
Analytics App 1
Analytics App 2
Analytics App 5
Analytics App 3
Analytics App 4
Cloudera Streams Messaging Manager (SMM)
What is SMM?
• Kafka Management and Monitoring tool
• Cure the “Kafka Blindness”
• Single Monitoring Dashboard for all your Kafka Clusters across 4 entities
– Broker
– Producer
– Topic
– Consumer
• REST as a First Class Citizen
• Alerting
• Schema Management
• Integration with Schema Registry
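One metric a Kafka monitoring dashboard commonly surfaces for the consumer entity is lag: the gap between a partition's log end offset and the consumer group's committed offset. The sketch below shows that arithmetic only. It does not use SMM's REST API; the function names and the `(topic, partition)` dictionary layout are assumptions for illustration.

```python
def consumer_lag(end_offsets, committed):
    """Per-partition lag = log end offset - committed offset (never negative).

    Both arguments map (topic, partition) -> offset. A partition with no
    committed offset is treated as committed at 0 in this sketch.
    """
    return {
        tp: max(end - committed.get(tp, 0), 0)
        for tp, end in end_offsets.items()
    }

def lagging_partitions(lag, threshold):
    """Return the partitions whose lag exceeds the alert threshold, sorted."""
    return sorted(tp for tp, value in lag.items() if value > threshold)
```

A threshold check like `lagging_partitions` is the kind of rule an alerting feature could evaluate on each refresh.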
CLOUDERA DATAFLOW Data-in-motion platform
STREAMING ANALYTICS
• Pattern matching
• Predictive and Prescriptive Analytics
• Complex Event Processing
• Continuous & Real-time Insights
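As a small illustration of pattern matching over a continuous stream, the sketch below flags the moment a reading has stayed above a threshold for a fixed number of consecutive events. It is a generic sliding-window example, not the API of any CDF analytics engine; the function name, the `(timestamp, value)` event shape, and the parameters are hypothetical.

```python
from collections import deque

def detect_sustained(events, threshold, window):
    """Yield the timestamp whenever the last `window` readings all exceed `threshold`.

    `events` is an iterable of (timestamp, value) pairs in arrival order.
    """
    recent = deque(maxlen=window)   # holds only the most recent `window` values
    for timestamp, value in events:
        recent.append(value)
        if len(recent) == window and all(v > threshold for v in recent):
            yield timestamp
```

Because it is a generator, it processes events one at a time as they arrive and holds only `window` values in memory.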
3 Kafka Analytics Access Patterns
• Streaming Access Pattern
• SQL Access Pattern: Kafka SQL (New)
• OLAP Access Pattern: Kafka OLAP (New)
Streaming Event Storage Substrate: Kafka Topics (Topic A, Topic B, Topic C, Topic D, Topic X)
CLOUDERA DATAFLOW Data-in-motion platform
ENTERPRISE SERVICES
• Provisioning
• Management
• Monitoring
• Unified Security
• Single Sign-on
• Audit
• Compliance
• Edge-to-Enterprise Governance
CLOUDERA DATAFLOW Data-in-motion platform
KEY DIFFERENTIATORS
• Comprehensive streaming platform – Only big data vendor to offer a comprehensive streaming platform from real-time data ingestion, transformation, routing to descriptive, prescriptive and predictive analytics
• 100% open source technology – Only vendor with this strategy; prevents vendor lock-in
• 280+ pre-built processors – Only product to offer such comprehensive connectivity from edge to enterprise
• Built-in data provenance – Only product in the market to offer out-of-the-box data provenance on data-in-motion
• 3 Streaming analytics engines – Only vendor to offer a choice of three streaming analytics engines to customers for all their streaming architecture needs
DEMO
QUESTIONS?

Contenu connexe

Tendances

Data Mesh at CMC Markets: Past, Present and Future
Data Mesh at CMC Markets: Past, Present and FutureData Mesh at CMC Markets: Past, Present and Future
Data Mesh at CMC Markets: Past, Present and FutureLorenzo Nicora
 
Databricks for Dummies
Databricks for DummiesDatabricks for Dummies
Databricks for DummiesRodney Joyce
 
Zero to Snowflake Presentation
Zero to Snowflake Presentation Zero to Snowflake Presentation
Zero to Snowflake Presentation Brett VanderPlaats
 
Pipelines and Packages: Introduction to Azure Data Factory (DATA:Scotland 2019)
Pipelines and Packages: Introduction to Azure Data Factory (DATA:Scotland 2019)Pipelines and Packages: Introduction to Azure Data Factory (DATA:Scotland 2019)
Pipelines and Packages: Introduction to Azure Data Factory (DATA:Scotland 2019)Cathrine Wilhelmsen
 
Pipelines and Data Flows: Introduction to Data Integration in Azure Synapse A...
Pipelines and Data Flows: Introduction to Data Integration in Azure Synapse A...Pipelines and Data Flows: Introduction to Data Integration in Azure Synapse A...
Pipelines and Data Flows: Introduction to Data Integration in Azure Synapse A...Cathrine Wilhelmsen
 
Enabling a Data Mesh Architecture with Data Virtualization
Enabling a Data Mesh Architecture with Data VirtualizationEnabling a Data Mesh Architecture with Data Virtualization
Enabling a Data Mesh Architecture with Data VirtualizationDenodo
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Databricks
 
Data Mesh using Microsoft Fabric
Data Mesh using Microsoft FabricData Mesh using Microsoft Fabric
Data Mesh using Microsoft FabricNathan Bijnens
 
Delta lake and the delta architecture
Delta lake and the delta architectureDelta lake and the delta architecture
Delta lake and the delta architectureAdam Doyle
 
Architect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh ArchitectureArchitect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh ArchitectureDatabricks
 
Large Scale Lakehouse Implementation Using Structured Streaming
Large Scale Lakehouse Implementation Using Structured StreamingLarge Scale Lakehouse Implementation Using Structured Streaming
Large Scale Lakehouse Implementation Using Structured StreamingDatabricks
 
Making Data Timelier and More Reliable with Lakehouse Technology
Making Data Timelier and More Reliable with Lakehouse TechnologyMaking Data Timelier and More Reliable with Lakehouse Technology
Making Data Timelier and More Reliable with Lakehouse TechnologyMatei Zaharia
 
Snowflake Data Science and AI/ML at Scale
Snowflake Data Science and AI/ML at ScaleSnowflake Data Science and AI/ML at Scale
Snowflake Data Science and AI/ML at ScaleAdam Doyle
 
The Modern Data Team for the Modern Data Stack: dbt and the Role of the Analy...
The Modern Data Team for the Modern Data Stack: dbt and the Role of the Analy...The Modern Data Team for the Modern Data Stack: dbt and the Role of the Analy...
The Modern Data Team for the Modern Data Stack: dbt and the Role of the Analy...Databricks
 
Moving to Databricks & Delta
Moving to Databricks & DeltaMoving to Databricks & Delta
Moving to Databricks & DeltaDatabricks
 
Free Training: How to Build a Lakehouse
Free Training: How to Build a LakehouseFree Training: How to Build a Lakehouse
Free Training: How to Build a LakehouseDatabricks
 
Building modern data lakes
Building modern data lakes Building modern data lakes
Building modern data lakes Minio
 
Technical Deck Delta Live Tables.pdf
Technical Deck Delta Live Tables.pdfTechnical Deck Delta Live Tables.pdf
Technical Deck Delta Live Tables.pdfIlham31574
 

Tendances (20)

Data Mesh at CMC Markets: Past, Present and Future
Data Mesh at CMC Markets: Past, Present and FutureData Mesh at CMC Markets: Past, Present and Future
Data Mesh at CMC Markets: Past, Present and Future
 
Databricks for Dummies
Databricks for DummiesDatabricks for Dummies
Databricks for Dummies
 
Zero to Snowflake Presentation
Zero to Snowflake Presentation Zero to Snowflake Presentation
Zero to Snowflake Presentation
 
Pipelines and Packages: Introduction to Azure Data Factory (DATA:Scotland 2019)
Pipelines and Packages: Introduction to Azure Data Factory (DATA:Scotland 2019)Pipelines and Packages: Introduction to Azure Data Factory (DATA:Scotland 2019)
Pipelines and Packages: Introduction to Azure Data Factory (DATA:Scotland 2019)
 
Pipelines and Data Flows: Introduction to Data Integration in Azure Synapse A...
Pipelines and Data Flows: Introduction to Data Integration in Azure Synapse A...Pipelines and Data Flows: Introduction to Data Integration in Azure Synapse A...
Pipelines and Data Flows: Introduction to Data Integration in Azure Synapse A...
 
Enabling a Data Mesh Architecture with Data Virtualization
Enabling a Data Mesh Architecture with Data VirtualizationEnabling a Data Mesh Architecture with Data Virtualization
Enabling a Data Mesh Architecture with Data Virtualization
 
How to build a successful Data Lake
How to build a successful Data LakeHow to build a successful Data Lake
How to build a successful Data Lake
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
 
Data Mesh using Microsoft Fabric
Data Mesh using Microsoft FabricData Mesh using Microsoft Fabric
Data Mesh using Microsoft Fabric
 
Delta lake and the delta architecture
Delta lake and the delta architectureDelta lake and the delta architecture
Delta lake and the delta architecture
 
Architect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh ArchitectureArchitect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh Architecture
 
Large Scale Lakehouse Implementation Using Structured Streaming
Large Scale Lakehouse Implementation Using Structured StreamingLarge Scale Lakehouse Implementation Using Structured Streaming
Large Scale Lakehouse Implementation Using Structured Streaming
 
Making Data Timelier and More Reliable with Lakehouse Technology
Making Data Timelier and More Reliable with Lakehouse TechnologyMaking Data Timelier and More Reliable with Lakehouse Technology
Making Data Timelier and More Reliable with Lakehouse Technology
 
Snowflake Data Science and AI/ML at Scale
Snowflake Data Science and AI/ML at ScaleSnowflake Data Science and AI/ML at Scale
Snowflake Data Science and AI/ML at Scale
 
Webinar Data Mesh - Part 3
Webinar Data Mesh - Part 3Webinar Data Mesh - Part 3
Webinar Data Mesh - Part 3
 
The Modern Data Team for the Modern Data Stack: dbt and the Role of the Analy...
The Modern Data Team for the Modern Data Stack: dbt and the Role of the Analy...The Modern Data Team for the Modern Data Stack: dbt and the Role of the Analy...
The Modern Data Team for the Modern Data Stack: dbt and the Role of the Analy...
 
Moving to Databricks & Delta
Moving to Databricks & DeltaMoving to Databricks & Delta
Moving to Databricks & Delta
 
Free Training: How to Build a Lakehouse
Free Training: How to Build a LakehouseFree Training: How to Build a Lakehouse
Free Training: How to Build a Lakehouse
 
Building modern data lakes
Building modern data lakes Building modern data lakes
Building modern data lakes
 
Technical Deck Delta Live Tables.pdf
Technical Deck Delta Live Tables.pdfTechnical Deck Delta Live Tables.pdf
Technical Deck Delta Live Tables.pdf
 

Similaire à Introducing Cloudera DataFlow (CDF) 2.13.19

Addressing Challenges with IoT Edge Management
Addressing Challenges with IoT Edge ManagementAddressing Challenges with IoT Edge Management
Addressing Challenges with IoT Edge ManagementDataWorks Summit
 
Powering the Internet of Things with Apache Hadoop
Powering the Internet of Things with Apache HadoopPowering the Internet of Things with Apache Hadoop
Powering the Internet of Things with Apache HadoopCloudera, Inc.
 
Getting started with Hadoop on the Cloud with Bluemix
Getting started with Hadoop on the Cloud with BluemixGetting started with Hadoop on the Cloud with Bluemix
Getting started with Hadoop on the Cloud with BluemixNicolas Morales
 
CWIN17 Frankfurt / Cloudera
CWIN17 Frankfurt / ClouderaCWIN17 Frankfurt / Cloudera
CWIN17 Frankfurt / ClouderaCapgemini
 
Cloudera - IoT & Smart Cities
Cloudera - IoT & Smart CitiesCloudera - IoT & Smart Cities
Cloudera - IoT & Smart CitiesCloudera, Inc.
 
Paris FOD Meetup #5 Cognizant Presentation
Paris FOD Meetup #5 Cognizant PresentationParis FOD Meetup #5 Cognizant Presentation
Paris FOD Meetup #5 Cognizant PresentationAbdelkrim Hadjidj
 
Digital Business Transformation for Energy & Utility company
Digital Business Transformation for Energy & Utility companyDigital Business Transformation for Energy & Utility company
Digital Business Transformation for Energy & Utility companyIlham Ahmed
 
Conquering Disaster Recovery Challenges and Out-of-Control Data with the Hybr...
Conquering Disaster Recovery Challenges and Out-of-Control Data with the Hybr...Conquering Disaster Recovery Challenges and Out-of-Control Data with the Hybr...
Conquering Disaster Recovery Challenges and Out-of-Control Data with the Hybr...actualtechmedia
 
Stl meetup cloudera platform - january 2020
Stl meetup   cloudera platform  - january 2020Stl meetup   cloudera platform  - january 2020
Stl meetup cloudera platform - january 2020Adam Doyle
 
BIG Data & Hadoop Applications in Logistics
BIG Data & Hadoop Applications in LogisticsBIG Data & Hadoop Applications in Logistics
BIG Data & Hadoop Applications in LogisticsSkillspeed
 
CL2015 - Datacenter and Cloud Strategy and Planning
CL2015 - Datacenter and Cloud Strategy and PlanningCL2015 - Datacenter and Cloud Strategy and Planning
CL2015 - Datacenter and Cloud Strategy and PlanningCisco
 
Why Infrastructure matters?!
Why Infrastructure matters?!Why Infrastructure matters?!
Why Infrastructure matters?!Gabi Bauer
 
Enabling Next Gen Analytics with Azure Data Lake and StreamSets
Enabling Next Gen Analytics with Azure Data Lake and StreamSetsEnabling Next Gen Analytics with Azure Data Lake and StreamSets
Enabling Next Gen Analytics with Azure Data Lake and StreamSetsStreamsets Inc.
 
Serverless service adoption for Thailand
Serverless service adoption for ThailandServerless service adoption for Thailand
Serverless service adoption for ThailandWatcharin Yang-Ngam
 
Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera, Inc.
 
IoT Connected Brewery
IoT Connected BreweryIoT Connected Brewery
IoT Connected BreweryJason Hubbard
 
Pivotal Big Data Suite: A Technical Overview
Pivotal Big Data Suite: A Technical OverviewPivotal Big Data Suite: A Technical Overview
Pivotal Big Data Suite: A Technical OverviewVMware Tanzu
 

Similaire à Introducing Cloudera DataFlow (CDF) 2.13.19 (20)

Addressing Challenges with IoT Edge Management
Addressing Challenges with IoT Edge ManagementAddressing Challenges with IoT Edge Management
Addressing Challenges with IoT Edge Management
 
Powering the Internet of Things with Apache Hadoop
Powering the Internet of Things with Apache HadoopPowering the Internet of Things with Apache Hadoop
Powering the Internet of Things with Apache Hadoop
 
Big Data and Analytics
Big Data and AnalyticsBig Data and Analytics
Big Data and Analytics
 
Big Data and Analytics
Big Data and AnalyticsBig Data and Analytics
Big Data and Analytics
 
Getting started with Hadoop on the Cloud with Bluemix
Getting started with Hadoop on the Cloud with BluemixGetting started with Hadoop on the Cloud with Bluemix
Getting started with Hadoop on the Cloud with Bluemix
 
CWIN17 Frankfurt / Cloudera
CWIN17 Frankfurt / ClouderaCWIN17 Frankfurt / Cloudera
CWIN17 Frankfurt / Cloudera
 
Cloudera - IoT & Smart Cities
Cloudera - IoT & Smart CitiesCloudera - IoT & Smart Cities
Cloudera - IoT & Smart Cities
 
Paris FOD Meetup #5 Cognizant Presentation
Paris FOD Meetup #5 Cognizant PresentationParis FOD Meetup #5 Cognizant Presentation
Paris FOD Meetup #5 Cognizant Presentation
 
Digital Business Transformation for Energy & Utility company
Digital Business Transformation for Energy & Utility companyDigital Business Transformation for Energy & Utility company
Digital Business Transformation for Energy & Utility company
 
Conquering Disaster Recovery Challenges and Out-of-Control Data with the Hybr...
Conquering Disaster Recovery Challenges and Out-of-Control Data with the Hybr...Conquering Disaster Recovery Challenges and Out-of-Control Data with the Hybr...
Conquering Disaster Recovery Challenges and Out-of-Control Data with the Hybr...
 
Stl meetup cloudera platform - january 2020
Stl meetup   cloudera platform  - january 2020Stl meetup   cloudera platform  - january 2020
Stl meetup cloudera platform - january 2020
 
BIG Data & Hadoop Applications in Logistics
BIG Data & Hadoop Applications in LogisticsBIG Data & Hadoop Applications in Logistics
BIG Data & Hadoop Applications in Logistics
 
CL2015 - Datacenter and Cloud Strategy and Planning
CL2015 - Datacenter and Cloud Strategy and PlanningCL2015 - Datacenter and Cloud Strategy and Planning
CL2015 - Datacenter and Cloud Strategy and Planning
 
Why Infrastructure matters?!
Why Infrastructure matters?!Why Infrastructure matters?!
Why Infrastructure matters?!
 
Hybrid Cloud Strategy for Big Data and Analytics
Hybrid Cloud Strategy for Big Data and Analytics Hybrid Cloud Strategy for Big Data and Analytics
Hybrid Cloud Strategy for Big Data and Analytics
 
Enabling Next Gen Analytics with Azure Data Lake and StreamSets
Enabling Next Gen Analytics with Azure Data Lake and StreamSetsEnabling Next Gen Analytics with Azure Data Lake and StreamSets
Enabling Next Gen Analytics with Azure Data Lake and StreamSets
 
Serverless service adoption for Thailand
Serverless service adoption for ThailandServerless service adoption for Thailand
Serverless service adoption for Thailand
 
Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists
 
IoT Connected Brewery
IoT Connected BreweryIoT Connected Brewery
IoT Connected Brewery
 
Pivotal Big Data Suite: A Technical Overview
Pivotal Big Data Suite: A Technical OverviewPivotal Big Data Suite: A Technical Overview
Pivotal Big Data Suite: A Technical Overview
 

Plus de Cloudera, Inc.

Partner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxPartner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxCloudera, Inc.
 
2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards FinalistsCloudera, Inc.
 
Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Cloudera, Inc.
 
Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Cloudera, Inc.
 
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Cloudera, Inc.
 
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Cloudera, Inc.
 
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Cloudera, Inc.
 
Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Cloudera, Inc.
 
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Cloudera, Inc.
 
Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Cloudera, Inc.
 
Extending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformExtending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformCloudera, Inc.
 
Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Cloudera, Inc.
 
Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Cloudera, Inc.
 
Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Cloudera, Inc.
 
Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Cloudera, Inc.
 
Introducing Workload XM 8.7.18
Introducing Workload XM 8.7.18Introducing Workload XM 8.7.18
Introducing Workload XM 8.7.18Cloudera, Inc.
 

Plus de Cloudera, Inc. (20)

Partner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxPartner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptx
 
2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists2020 Cloudera Data Impact Awards Finalists



Introducing Cloudera DataFlow (CDF) 2.13.19

  • 1. © Cloudera, Inc. All rights reserved. INTRODUCING CLOUDERA DATAFLOW (CDF)
      - Dinesh Chandrasekhar, Product Marketing Lead, Data-in-Motion BU, Cloudera (@AppInt4All)
      - George Vetticaden, Product Management Lead, Data-in-Motion BU, Cloudera (@gvetticaden)
  • 2. MARKET OPPORTUNITIES
      - Cloud: ~$410 B
      - Streaming: ~$1.65 B
      - Data Science: ~$180 B
      - Big Data: ~$210 B
      - IoT: ~$1.2 T
  • 3. IOT MARKET
      - 24.9B: By 2024, more than 24.9 billion IoT connections will be established
      - $70B: An estimated $70 billion will be spent by global manufacturers on IoT solutions in 2020
      - 646M: An estimated 646 million healthcare devices (excluding fitness trackers and wearable devices) will be connected by 2020
      - 78%: An estimated 78% of cars shipped globally will be built with hardware that connects to the internet by 2020
      - 50%: 50% of decision-makers in IT, services, utilities, and manufacturing have either deployed IoT or will deploy it in the next 12-24 months
  • 4. KEY CUSTOMER CHALLENGES
      - Visibility: Lack of visibility into end-to-end streaming data flows; inability to troubleshoot bottlenecks, consumption patterns, etc.
      - Data Ingestion: High-volume streaming sources, multiple message formats, diverse protocols and multi-vendor devices create data ingestion challenges
      - Real-time Insights: Analyzing the continuous and rapid inflow (velocity) of streaming data at high volumes creates major challenges for gaining real-time insights
  • 5. CLOUDERA DATAFLOW
  • 6. WHAT IS CLOUDERA DATAFLOW (CDF)? Cloudera DataFlow (CDF) is a scalable, real-time streaming data platform that collects, curates, and analyzes data so customers gain key insights for immediate actionable intelligence.
  • 7. HISTORY OF CDF
      - Mid-2000s: NiFi was developed and used at NSA
      - 2015: Onyara is acquired; HDF is born
      - 2018: Strong Streaming Platform (support for Kafka 2.0; SMM is introduced)
      - 2019: Cloudera merger; Enable Edge Intelligence
      - Tomorrow: Edge-to-AI. Bring this to the edge with connected platforms
      - Data-in-Motion: comprehensive real-time streaming data platform; manage data-in-motion from edge to enterprise; power IoT-scale streaming architectures
      - Enable next-generation Modern Data Architecture
  • 8. COMMON USE CASES
      - Data Movement: Optimize resource utilization by moving data between data centers or between on-premises infrastructure and cloud infrastructure
      - Optimize Log Collection & Analysis: Optimize log analytics solutions by using CDF as a single platform to collect and deliver multiple data sources
      - Gain key insights with Streaming Analytics: Accelerate big data ROI by analyzing streaming data for patterns, comparing with ML models and delivering actionable intelligence
      - Single view / 360° view of customer: Ingest, transform and combine customer data from multiple sources into a single data view / lake
      - Stream Processing: Combine multiple streams of data in real time, enrich the data and route it to different end points based on rules
      - Capture IoT Data: Ingest sensor data from IoT devices and stream it for further processing and comprehensive analysis
  • 9. COMMON IOT USE CASES BY INDUSTRY
      - Industries: Public Sector, Transportation, Utilities, Healthcare, Manufacturing, Retail
      - Use cases: Fleet Management, Connected Cars, Smart Cities, Predictive Analytics, Inventory / Material Tracking
      - IoT is a $1.13T market opportunity in 2021.
      - Americas: $329B IoT spending. Manufacturing and Transportation are top industries, accounting for 26% of total spending.
      - APAC: $500B IoT spending. Manufacturing, Utilities and Transportation are top industries.
      - EMEA: $264B IoT spending. Manufacturing is top industry, powered by Industry 4.0 initiatives.
      - Worldwide IoT Analytics and Information Management market: $573M
      - Further slide labels, in transcript order: Top 5 Use cases; Utility Monitoring; Predictive Maintenance; Patient Monitoring; Usage-based Insurance; Asset Tracking / Monitoring; Edge Data Collection
  • 10. CUSTOMERS
  • 11. Improving Healthcare with SMART data
      - CHALLENGE: Combine multi-format data streams, with hundreds of sources, into one platform
          - Needed a platform that could combine multi-format data streaming
          - Data scarcity & latency problems
          - Machine learning & data science
      - SOLUTION: Cloud-based systems architected to deliver SMART data, using HDP and CDF
          - First to deliver SMART real-time streaming data
          - Clearsense’s Inception™ product enables fast decisions for clinicians
          - Customers have access to all data sources with HDP & CDF
      - RESULT: Mission-critical data and relevant insight for 2,000 rural providers
          - Mission-critical data is now available for doctors to make critical decisions
          - Cost efficiencies led to access for 2,000 rural providers
          - Real-time data helps prevent “Code Blue”
      - IMPACT: Lack of medical expertise around patient care, post surgery
          - Patient Code Blue status
          - Possible cardiac arrest 4-6 hours post surgery
      - Photo by rawpixel on Unsplash
  • 12. Positioning technology products & services empower companies worldwide
      - CHALLENGE: Provide accurate data for small carriers to improve business results
          - 95% of small carriers (fewer than 50 trucks) have a deficit of data available
          - Estimated data, price points and revenue base opportunity for controlling fuel cost
          - Understanding of freight and lane movement
      - SOLUTION: Big Data in the Cloud with HDP, CDF, and Microsoft Azure
          - Leveraging big data powering Blockchain, with machine learning, to revolutionize Transportation and Logistics industries
          - Analyzed fuel data; can consolidate data set for small carriers to generate community data lake
      - RESULT: Double-digit revenue increase, year over year
          - Managing for 4 million trucks daily
          - $31 billion in freight movement guides customers to profitability
          - Blockchain-driven architecture
      - IMPACT: Continuing on current path would slow organizational growth and impact customers
          - Being unable to predict weather patterns would lead to delays and decreased product quality
          - Operational inefficiencies prevent reaching business revenue goals, lack of insights
          - Loss of product during transportation
      - Photo by rawpixel.com on Unsplash
  • 13. PRODUCT OVERVIEW
  • 14. CLOUDERA DATAFLOW
  • 15. CLOUDERA DATAFLOW: Data-in-motion platform
  • 16. EDGE DATA MANAGEMENT
      - Edge data collection powered by Apache MiNiFi
      - MiNiFi: smaller footprint than NiFi
      - Guaranteed delivery
      - Data buffering
      - Prioritized queuing
      - Flow-specific QoS
      - Data provenance
      - Designed for extension
      - C++ / Java agents
      - Designed for IoT
  • 17. CLOUDERA DATAFLOW: Data-in-motion platform
  • 18. FLOW MANAGEMENT
      - Web-based user interface
      - Highly configurable
      - Out-of-the-box data provenance
      - Designed for extensibility
      - Secure
      - NiFi Registry
      - DevOps support
      - FDLC
      - Versioning
      - Deployment
  • 19. 280+ PROCESSORS FOR DEEPER ECOSYSTEM INTEGRATION
      - Hash, Extract, Merge, Duplicate, Scan, GeoEnrich, Replace, Convert, Split, Translate, Route Content, Route Context, Route Text, Control Rate, Distribute Load, Generate Table Fetch, Jolt Transform JSON, Prioritized Delivery, Encrypt, Tail, Evaluate, Execute
      - Fetch, HTTP, Syslog, Email, HTML, Image, HL7, FTP, UDP, XML, SFTP, AMQP, WebSocket
      - All Apache project logos are trademarks of the ASF and the respective projects.
  • 20. CLOUDERA DATAFLOW: Data-in-motion platform
  • 21. Streaming Analytics Reference Architecture
      - Kafka is Everywhere. Critical Component of Streaming Architectures
      - Data Flow Apps Powered by NiFi
      - Edge sources: US West Fleet, US Central Fleet and US East Fleet truck sensors, each with a C++ agent
      - Kafka stages shown: Kafka Producers, Kafka Topics, Kafka Consumers & Producers, Kafka Topics, Kafka Consumers
      - Consumers: Analytics App 1, Analytics App 2, Analytics App 3, Analytics App 4, Analytics App 5
  • 22. Cloudera Streams Messaging Manager (SMM): What is SMM?
      - Kafka management and monitoring tool
      - Cure the “Kafka Blindness”
      - Single monitoring dashboard for all your Kafka clusters across 4 entities: Broker, Producer, Topic, Consumer
      - REST as a first-class citizen
      - Alerting
      - Schema management
      - Integration with Schema Registry
  • 23. CLOUDERA DATAFLOW: Data-in-motion platform
  • 24. STREAMING ANALYTICS
      - Pattern matching
      - Predictive and Prescriptive Analytics
      - Complex Event Processing
      - Continuous & Real-time Insights
  • 25. 3 Kafka Analytics Access Patterns
      - Streaming Event Storage Substrate: Kafka topics (Topic A, Topic B, Topic C, Topic D, Topic X)
      - Streaming Access Pattern
      - SQL Access Pattern (KAFKA SQL)
      - OLAP Access Pattern (KAFKA OLAP)
      - The slide carries three "New" badges
  • 26. CLOUDERA DATAFLOW: Data-in-motion platform
  • 27. ENTERPRISE SERVICES
      - Provisioning
      - Management
      - Monitoring
      - Unified Security
      - Single Sign-on
      - Audit
      - Compliance
      - Edge-to-Enterprise Governance
  • 28. CLOUDERA DATAFLOW: Data-in-motion platform
  • 29. KEY DIFFERENTIATORS
      - Comprehensive streaming platform: Only big data vendor to offer a comprehensive streaming platform, from real-time data ingestion, transformation and routing to descriptive, prescriptive and predictive analytics.
      - 100% open source technology: Only vendor with this strategy; prevents vendor lock-in
      - 280+ pre-built processors: Only product to offer such comprehensive connectivity from edge to enterprise
      - Built-in data provenance: Only product in the market to offer out-of-the-box data provenance on data-in-motion
      - 3 streaming analytics engines: Only vendor to offer a choice of three streaming analytics engines to customers for all their streaming architecture needs
  • 30. DEMO
  • 31. QUESTIONS?
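
The "Stream Processing" use case on slide 8 (enrich the data and route it to different end points based on rules) can be illustrated with a minimal sketch in plain Python. This is not NiFi or CDF code. The event fields, the fleet-to-region table, the rules and the destination names are all hypothetical, chosen only to echo the truck-sensor example in the deck.

```python
def enrich(event, fleet_regions):
    """Return a copy of the event with a region looked up from its fleet id."""
    enriched = dict(event)
    enriched["region"] = fleet_regions.get(event["fleet"], "unknown")
    return enriched


def route(event, rules, default="archive"):
    """Return the destination of the first rule whose predicate matches."""
    for predicate, destination in rules:
        if predicate(event):
            return destination
    return default


# Hypothetical lookup table and rules, for illustration only.
FLEET_REGIONS = {"west-1": "US West", "east-7": "US East"}
RULES = [
    (lambda e: e["speed_mph"] > 80, "alerts"),
    (lambda e: e["region"] == "US West", "west-analytics"),
]

events = [
    {"fleet": "west-1", "speed_mph": 62},
    {"fleet": "east-7", "speed_mph": 91},
    {"fleet": "north-3", "speed_mph": 55},
]

# Enrich each event first, then pick its destination.
destinations = [route(enrich(e, FLEET_REGIONS), RULES) for e in events]
print(destinations)
```

Rules are checked in order, so a speeding truck goes to the alert destination even when a region rule would also match. Events that match no rule fall through to the default destination.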

Editor's notes

  1. Product summary:
      - Data ingestion, transformation and routing done visually with no code using Apache NiFi & 260+ processors
      - Build streaming apps and analytics from edge to data lake / EDW using builder
      - Enable edge data collection and intelligence through MiNiFi agents
      - Support massive IoT infrastructures
      - Deliver perishable insights with pattern matching and Complex Event Processing (CEP) from real-time streams
      - Manage, monitor, secure and govern streaming data
  2. What it actually is, and what is the main use/goal of [product]?
  3. Provide context for why we added this to our stack at the time. For CDF, it was to a) create more value from HDP by making it easier to get data into HDP, and also to take advantage of growing IoT market opportunities and to address a more encompassing view of data. It was then foundational for the next step (DataPlane). History can help strengthen mental models of where this fits.
  4. TALK TRACK: We usually help our customers get started with one of these CDF use cases:
      - They augment their Splunk systems with a wider variety of data (via CDF).
      - They ingest logs for cyber security and threat detection.
      - They feed data to streaming analytics engines like Apache Spark or Apache Storm.
      - They move their own data internally between data centers on premises or to the cloud.
      - And of course, they capture data from the Internet of Things.
     CDF was originally designed to be robust, so that it could continue to move data despite varying device footprints or fluctuating power or connectivity levels. The data keeps flowing, without being lost in transit. [NEXT SLIDE]
  5. Clearsense public case study, https://hortonworks.com/customers/clearsense/
      - Challenge:
          - Needed a viable, economic, and secure platform that could combine multi-format data streaming
          - Data scarcity/latency problems for healthcare organizations
          - Clinicians wanted to use machine learning/data science to store/analyze data, but the technology didn’t exist.
      - Solution:
          - First to deliver SMART real-time streaming data to healthcare customers.
          - Inception product makes data available for clinical, financial and operational decisions.
          - Customers have access to all data sources, ingested with CDF, stored in HDP, delivered to the point of decision.
      - Result:
          - Doctors and nurses now have a new level of mission-critical data and relevant insight that can be incorporated into clinical decisions.
          - Cost efficiencies from running in the cloud have allowed Clearsense to offer healthcare predictive analytics to 2,000 rural providers that otherwise wouldn’t have access.
          - Real-time data is displayed on the “Mission Control” dashboard, which helps prevent Code Blue with patients.
  6. TMW/Trimble case study, https://hortonworks.com/customers/tmw-systems/
      - Challenge: Accurate data for small carriers needed to improve business results
          - 95% of small carriers have a deficit in the data available to them
          - They are estimating data, price points, revenue-based opportunities and controlling fuel cost
      - Solution: New approach enables advanced analytics leveraging Big Data.
          - Analytics like market rate index, national rate, fuel surcharge, and maintenance cost are important because small businesses were growing at a fast rate.
          - Leveraging big data powering Blockchain, with machine learning, to revolutionize Transportation and Logistics industries
          - Analyzed fuel data; can consolidate data set for small carriers to generate community data lake to drive revenue, fuel and freight cost, lane analysis, and pricing ranges.
      - Results:
          - Double-digit revenue Y/Y
          - Managing 4M trucks on the nation/state roads, daily
          - $31 billion in freight movement guides customers to profitability
          - Blockchain-driven architecture
  7. Product summary:
      - Data ingestion, transformation and routing done visually with no code using Apache NiFi & 260+ processors
      - Build streaming apps and analytics from edge to data lake / EDW using builder
      - Enable edge data collection and intelligence through MiNiFi agents
      - Support massive IoT infrastructures
      - Deliver perishable insights with pattern matching and Complex Event Processing (CEP) from real-time streams
      - Manage, monitor, secure and govern streaming data
  8. Flow management features:
      - Web-based user interface: design, control, feedback & monitoring
      - Highly configurable: loss tolerant vs guaranteed delivery; low latency vs high throughput; dynamic prioritization; flow can be modified at runtime; back pressure
      - Data provenance: track dataflow from beginning to end
      - Designed for extension: build your own processors
      - Secure: SSL, SSH, HTTPS, etc.
  9. Flow management features:
      - Web-based user interface: design, control, feedback & monitoring
      - Highly configurable: loss tolerant vs guaranteed delivery; low latency vs high throughput; dynamic prioritization; flow can be modified at runtime; back pressure
      - Data provenance: track dataflow from beginning to end
      - Designed for extension: build your own processors
      - Secure: SSL, SSH, HTTPS, etc.
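
The notes above name "back pressure" and "dynamic prioritization" without showing what they mean in practice. The toy queue below is a rough sketch of the two ideas in plain Python, not NiFi's implementation. The class name, the `max_items` threshold and the sample items are hypothetical. It refuses new items once a threshold is reached (back pressure) and always releases the most urgent item first (prioritization).

```python
import heapq
import itertools


class BoundedPriorityQueue:
    """Toy connection queue: refuses new items at a threshold (back pressure)
    and releases the most urgent item first (prioritization)."""

    def __init__(self, max_items):
        self.max_items = max_items
        self._heap = []
        # Tie-breaker keeps first-in, first-out order within one priority.
        self._order = itertools.count()

    def offer(self, item, priority):
        """Queue the item, or return False when the threshold is reached."""
        if len(self._heap) >= self.max_items:
            return False  # signal the upstream sender to pause or slow down
        heapq.heappush(self._heap, (priority, next(self._order), item))
        return True

    def poll(self):
        """Remove and return the item with the lowest priority number, or None."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


queue = BoundedPriorityQueue(max_items=2)
accepted = [
    queue.offer("routine reading", priority=5),
    queue.offer("engine alarm", priority=1),
    queue.offer("late reading", priority=5),  # refused: the queue is full
]
first = queue.poll()
second = queue.poll()
print(accepted, first, second)
```

Returning `False` from `offer` rather than dropping the item leaves the decision with the sender, which can hold the data and retry. That is the sense in which back pressure differs from loss-tolerant delivery.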