TECHNICAL OVERVIEW:
Pivotal Big Data Suite
Les Klein
Field CTO Data
Pivotal
@LesKlein #PivotalForum #Istanbul #BigData #Analytics
Forward Looking Statements
This presentation contains “forward-looking statements” as defined under the Federal Securities Laws. Actual results could differ materially
from those projected in the forward-looking statements as a result of certain risk factors, including but not limited to: (i) adverse changes in
general economic or market conditions; (ii) delays or reductions in information technology spending; (iii) the relative and varying rates of
product price and component cost declines and the volume and mixture of product and services revenues; (iv) competitive factors,
including but not limited to pricing pressures and new product introductions; (v) component and product quality and availability; (vi)
fluctuations in VMware, Inc.’s operating results and risks associated with trading of VMware stock; (vii) the transition to new products, the
uncertainty of customer acceptance of new product offerings and rapid technological and market change; (viii) risks associated with
managing the growth of our business, including risks associated with acquisitions and investments and the challenges and costs of
integration, restructuring and achieving anticipated synergies; (ix) the ability to attract and retain highly qualified employees; (x) insufficient,
excess or obsolete inventory; (xi) fluctuating currency exchange rates; (xii) threats and other disruptions to our secure data centers and
networks; (xiii) our ability to protect our proprietary technology; (xiv) war or acts of terrorism; and (xv) other one-time events and other
important factors disclosed previously and from time to time in the filings of EMC Corporation, the parent company of Pivotal, with the U.S.
Securities and Exchange Commission. EMC and Pivotal disclaim any obligation to update any such forward-looking statements after the
date of this release.
© 2016 Pivotal Software, Inc. All rights reserved.
Pivotal Big Data Suite
Complete platform
Hadoop Native SQL
Deployment options
Based on open source
Flexible licensing
Advanced data services
PIVOTAL GREENPLUM DATABASE: Data warehouse database based on open source Greenplum Database
PIVOTAL HDB: Open source analytical database for Apache Hadoop based on Apache HAWQ
PIVOTAL GEMFIRE: Open source application and transaction data grid based on Apache Geode
Pivotal Big Data Suite
Open source data management portfolio
Great software companies leverage Big Data to fundamentally change the consumer experience and pioneer entirely new business models
Financial Services: $4BN
Hospitality: $26BN
Transportation: $50BN
Entertainment: $54BN
Automotive: $30BN
Industrial Products: $3.2BN
CLOUD NATIVE SOFTWARE IS CHANGING INDUSTRIES
Data is Fueling Software
© Copyright 2015 Pivotal. All rights reserved.
Hundreds of thousands of “trip” events each day
400+ billion viewing-related events per day
Five billion training data points for the Price Tip feature
Disruptors Use a LOT of Data
“We’ve found that when a host selects a price that’s within 5% of their tip, they’re nearly 4 times more likely to get booked”
“The importance of accuracy and efficiency […], will continue to rise as we expand and improve products like uberPOOL and beyond.”
“Over 75% of what people watch come from our recommendations”
Data manifests as features in an app
(Data) Microservices: Loosely coupled services architecture, bounded by context
Cloud-Native Platforms: Enabling continuous delivery & automated operations
Open Source Database Innovation: Extreme scale & performance advantages, built for the cloud
Machine Learning: Use of predictive analytics to build smart apps
How are they accomplishing this?
These companies…
Release new features in minutes, multiple times a day
Support a micro-services architecture
Consume a wide range of data sources and protocols
Store and Analyze all their data
Update algorithms and predictive models daily
Continuously ask lots of questions of their data
Modify data pipelines and add processing steps daily
…but most enterprises are not quite there yet
• Applications scalability limited by databases
• Real-time data insights limited by disconnected OLTP and OLAP systems
• Data services are not ready for cloud platforms
[Diagram: App 1, App 2 and App 3 share one Transactional Database (Bottleneck); ETL / ELT Batches move data from the Transactional Database (TRANSACTIONS) to the Analytic Database (ANALYTICS) with a delay Δt; Continuous Delivery]
Stream + Batch Processing
Programming + Operating Model
Cloud-Native Platform
Microservices Framework
Platform Runtime
Hadoop
DW
Spark
Microservices and Polyglot Persistence
IMDG
K/V Store
Relational DB
Big Data &
Machine Learning
Modern Cloud-Native Data Architecture
Cloud Infrastructure
New pressures are breaking fragile systems
Apps scalability limited by scalability of databases
DB scalability limitations are aggravated by additional devices, clients and apps
[Diagram: existing applications (App 1, App 2, App 3), new devices and clients, and new cloud native scalable data apps all depend on one Transactional Database, the Bottleneck]
Scale-out applications vs Scale-up databases
GemFire: Cloud-scale high performance transactional data
• Horizontally scalable
• Ultra fast, low-latency in-memory transactions
• Fully configurable data consistency
• Reliable eventing and notification model
• Highly Available, auto-healing
• Inter-cluster WAN replication
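The eventing and notification model in the list above can be pictured with a small sketch. This is not the GemFire or Apache Geode API: the `Region` class, `subscribe` and the listener signature are hypothetical names chosen for illustration, and the sketch assumes listeners are called synchronously on every put.

```python
class Region:
    """Hypothetical in-memory key/value region that notifies subscribers on change."""

    def __init__(self):
        self._entries = {}
        self._listeners = []

    def subscribe(self, listener):
        # listener is any callable taking (key, old_value, new_value)
        self._listeners.append(listener)

    def put(self, key, value):
        old = self._entries.get(key)
        self._entries[key] = value
        # Push the change to every subscriber instead of waiting to be polled
        for listener in self._listeners:
            listener(key, old, value)

    def get(self, key):
        return self._entries.get(key)


# Usage: an app registers interest once and then receives pushed updates
received = []
prices = Region()
prices.subscribe(lambda key, old, new: received.append((key, old, new)))
prices.put("listing-1", 100)
prices.put("listing-1", 95)
```

The point of the design is that applications react to data changes as they happen rather than re-querying the store.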
[Diagram: custom apps (multiple instances of App 1 and App 2) access Pivotal GemFire for transactional data through the Native API or REST / HTTP, and receive Push Updates]
Batch-mode latency prevents real-time analysis
Real-time data analytics is limited by data integration batches
Overnight ETL / ELT jobs expose data that is already outdated
• Analytical processes don’t have access to the latest data
• ETL/ELT processes are expensive and hard to maintain
• Batch process windows limit data scalability
[Diagram: App 1, App 2 and App 3 write to a Transactional Database (TRANSACTIONS); ETL / ELT Batches move the data, after a delay Δt, to an MPP database (ANALYTICS); a Data Temperature axis runs from Hot to Cold]
Operationalized data insights need an event-driven architecture
Combination of SQL Analytics and NoSQL event-driven transactions is needed
• Data insights must be immediately pushed to applications
• Apps should be able to react in real time to analytical findings
[Diagram: App 1, App 2 and App 3 use a Transactional Database (TRANSACTIONS) through APIs / NoSQL; an MPP database (ANALYTICS) serves Machine Learning and Advanced Analytics through ANSI SQL; Data Insights link the two sides]
GemFire and GPDB - Big Data meets Fast Data
[Diagram: custom apps (multiple instances of App 1 and App 2) access Pivotal GemFire for transactional data through the Native API or REST / HTTP and receive Push Updates; data science, analytics & ML tools access Pivotal Greenplum for analytical data through ANSI SQL. Between the two systems: Parallel Configurable Data Load, transactional data by write behind, analytical data to cache. A Data Temperature axis is labeled Warm and Hot]
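The write-behind link in the diagram can be sketched as follows. This is only an illustration of the pattern, not GemFire configuration: `WriteBehindCache`, `sink` and `batch_size` are hypothetical names, and the "warehouse" is a plain Python list standing in for an analytical database.

```python
from collections import deque


class WriteBehindCache:
    """Hypothetical cache: writes are acknowledged in memory, then flushed in batches."""

    def __init__(self, sink, batch_size=3):
        self._data = {}
        self._pending = deque()
        self._sink = sink            # callable that receives a list of (key, value)
        self._batch_size = batch_size

    def put(self, key, value):
        # The caller only waits for the in-memory update
        self._data[key] = value
        self._pending.append((key, value))
        if len(self._pending) >= self._batch_size:
            self.flush()

    def get(self, key):
        return self._data.get(key)

    def flush(self):
        # Ship queued changes to the slower backing store in one batch
        if not self._pending:
            return
        batch = list(self._pending)
        self._pending.clear()
        self._sink(batch)


# Usage: the list plays the role of the analytical store
warehouse = []
cache = WriteBehindCache(warehouse.extend, batch_size=2)
cache.put("txn-1", 10)
cache.put("txn-2", 20)   # second put reaches the batch size and triggers a flush
```

A production system would also need durability for the pending queue and retry handling, which this sketch leaves out.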
Cloud Native apps are better suited to NoSQL
Enabling fast and scalable event-driven data services
• Unidirectional, request-response SQL: monolithic apps needed complex, schema-based SQL databases
• Bidirectional, event-driven APIs: microservices need much simpler schemas, but much better scalability
GemFire for Pivotal Cloud Foundry
Lightning fast in-memory persistence for cloud native apps
• One-click provisioning
• Pre-packaged configuration
• Embedded monitoring by Pulse
• Auto application binding
• Multi-cloud support
• Reliable data replication between PCF sites
[Diagram: Pivotal GemFire, Click to Deploy]
Next-generation databases must keep up with cloud native apps
Can your database do all of this? GemFire IMDG DOES.
• Cloud-ready, infrastructure agnostic
• Horizontal scalability
• Automatic fail-over
• Reliable eventing model
• Multi-site high availability
• Seamless integration with analytical databases
Pivotal Greenplum
World’s First Open Source Massively Parallel Data Warehouse
Greenplum Database Mission & Strategy
• Relational database system for big data and data warehousing
• Mission critical & system of record product with supporting tools and ecosystem
• Fully open source with a global community of developers and users
• Large industrial focused system
• PostgreSQL based
• Multi-platform technology: on-premise, cloud, enterprise appliance
• It’s a software product
Highlighted Greenplum Successes
Government
• Tax & benefits fraud detection
• Economic statistics research
Financial Services
• Wealth management data science and product development for Commercial Banking
• Risk and trade repositories reporting
• 401K providers analytics on investment choices
Pharmaceutical
• Vaccine potency prediction based on manufacturing sensors
IoT
• Predictive maintenance for auto manufacturer, industrial equipment and government agencies
• Semiconductor Fab sensor analytics and reporting
Cyber Security & Surveillance
• Internal email and communication surveillance and reporting
• Corporate network anomalous behavior and intrusion detections
Oil & Gas
• Drilling equipment predictive maintenance
Communications
• Mobile telephone company enterprise data warehouse
• Network performance and availability analytics
Retail
• Customer purchases analytics
Transportation
• Airlines loyalty program analytics
Greenplum Overview
Greenplum DB layers: System Access, Data Processing, Data Storage
• CLIENT ACCESS: PSQL, ODBC, JDBC
• BULK LOAD/UNLOAD: GPLoad, GPFdist, External Tables, GPHDFS
• ADMIN TOOLS: GP Perfmon, GP Support
• 3rd PARTY TOOLS: Compatible with Industry Standard BI & ETL Tools
• SQL STANDARD COMPLIANCE
• MASSIVELY PARALLEL PROCESSING (MPP)
• BIG DATA QUERY OPTIMIZER
• IN-DATABASE PROGRAMMING LANGUAGES: PL/pgSQL, PL/Python, PL/R, PL/Perl, PL/Java, PL/C
• IN-DATABASE ANALYTICS & EXTENSIONS: MADlib, PostGIS, PGCrypto
• FULLY ACID COMPLIANT TRANSACTIONAL DATABASE
• MULTI-VERSION CONCURRENCY CONTROL (MVCC)
• POLYMORPHIC STORAGE: HEAP, Append Only, Columnar, External, Compression
• INDEXES: B-Tree, Bitmap, GiST
PostgreSQL Heritage
Greenplum Open Source Launch
• Widely used
• Open Source
• PostgreSQL License
• Enterprise class open source relational engine
MPP Shared Nothing Architecture
Flexible framework for processing large datasets
• Master Host and Standby Master Host: the Master coordinates work with Segment Hosts
• Segment Host with one or more Segment Instances: Segment Instances process queries in parallel
• Segment Hosts have their own CPU, disk and memory (shared nothing)
• High speed interconnect for continuous pipelining of data processing
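The division of labor described above (a master that coordinates, segments that process in parallel on their own data) can be sketched as a scatter-gather aggregate. This is a toy model, not Greenplum code: the function names are hypothetical, threads stand in for segment instances, and each inner list stands in for the rows stored on one segment.

```python
from concurrent.futures import ThreadPoolExecutor


def segment_partial(rows):
    """Work done on one segment instance: aggregate only its local rows."""
    return sum(amount for _, amount in rows), len(rows)


def master_query(segments):
    """Master dispatches the same step to every segment, then combines the partials."""
    with ThreadPoolExecutor(max_workers=len(segments)) as pool:
        partials = list(pool.map(segment_partial, segments))
    total = sum(p[0] for p in partials)
    count = sum(p[1] for p in partials)
    return total, count, total / count


# Usage: three "segments", each holding its own slice of (order_id, amount) rows
segments = [
    [(1, 10.0), (2, 20.0)],
    [(3, 30.0)],
    [(4, 40.0)],
]
total, count, average = master_query(segments)
```

Only the small partial results travel back to the master, which is why the master does not become a bottleneck for this kind of query.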
[Diagram: Greenplum DB. SQL enters through the Master Host; an Interconnect links it to Segment Hosts node1, node2, node3 ... nodeN, each running four Segment Instances]
Loading: Massively-Parallel Ingest
Extreme speed and immediate usability from files, ETL, Hadoop & S3
Fast Parallel Load & Unload
• No Master Node bottleneck
• 10+ TB/Hour per Rack
• Linear scalability
Low Latency
• Data immediately available
• No intermediate stores
• No data “reorganization”
Load/Unload To & From:
• File Systems
• Any ETL Product
• Hadoop & Amazon S3
[Diagram: Greenplum DB. External Sources (ETL, File Systems; loading, streaming, etc.) connect over the Network Interconnect to the Segment Servers (query processing & data storage); Master Servers handle query planning & dispatch]
Polymorphic Storage™
User Definable Storage Layout
• Columnar storage compresses better
• Optimized for retrieving a subset of the columns when querying
• Compression can be set differently per column: gzip (1-9), quicklz, delta, RLE
• Row oriented faster when returning all columns
• HEAP for many updates and deletes
• Use indexes for drill through queries
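Why a column-oriented layout tends to compress better can be seen with run-length encoding (RLE), one of the options named above. The sketch below is a plain-Python illustration under simple assumptions (a tiny hypothetical `sales` table, values already sorted by month); it is not how Greenplum implements its storage.

```python
from itertools import groupby


def rle_encode(values):
    """Collapse consecutive repeats into (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(values)]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original sequence."""
    return [value for value, length in runs for _ in range(length)]


# Row-oriented layout: each tuple is one row of (month, region, amount)
sales = [
    ("2014-06", "EU", 10),
    ("2014-06", "EU", 12),
    ("2014-06", "US", 9),
    ("2014-07", "US", 11),
]

# Column-oriented layout: one sequence per column, so similar values sit together
month_col, region_col, amount_col = (list(col) for col in zip(*sales))

month_runs = rle_encode(month_col)    # [("2014-06", 3), ("2014-07", 1)]
region_runs = rle_encode(region_col)  # [("EU", 2), ("US", 2)]
```

A query that needs only `month` and `amount` can also skip the `region` column entirely, which is the "subset of the columns" benefit from the list.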
[Diagram: TABLE ‘SALES’ with monthly partitions Jun, Jul, Aug, Sep, Oct, Nov, Dec stored column-oriented or row-oriented, and older partitions (Year -1, Year -2) on external storage]
External HDFS or S3
• Keep less accessed partitions on external storage and seamlessly query all data
• All major Hadoop distributions
• Amazon S3 storage
• Others in development
Partitions and External Partitions
[Diagram: parent table with partitions labeled Jan 2013, Jan 2014, Feb 2014 ... Dec 2014, including an external partition]
• Hash Distribution to evenly spread data across all segment instances
• Range Partition within a segment instance to minimize scan work
• Partitioned Tables Support for External Tables as a Partition
– Readable external table
– Host file system, NFS mount, HDFS or Amazon S3
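The two placement ideas above work at different levels, and a short sketch may make that clearer. Everything here is an assumption for illustration: four segments, CRC32 as the hash (Greenplum uses its own hash function), monthly range partitions, and hypothetical helper names.

```python
import zlib
from collections import defaultdict
from datetime import date

NUM_SEGMENTS = 4  # hypothetical cluster size


def segment_for(distribution_key: str) -> int:
    """Hash distribution: the key alone decides which segment stores the row."""
    return zlib.crc32(distribution_key.encode("utf-8")) % NUM_SEGMENTS


def partition_for(sale_date: date) -> str:
    """Range partitioning: inside a segment, rows are grouped by month."""
    return f"{sale_date.year:04d}-{sale_date.month:02d}"


def load(rows):
    """rows: iterable of (customer_id, sale_date, amount)."""
    storage = defaultdict(lambda: defaultdict(list))
    for customer_id, sale_date, amount in rows:
        storage[segment_for(customer_id)][partition_for(sale_date)].append(
            (customer_id, sale_date, amount)
        )
    return storage


def scan_month(storage, year, month):
    """Read only the matching partition on each segment (partition elimination)."""
    wanted = f"{year:04d}-{month:02d}"
    hits = []
    for partitions in storage.values():
        if wanted in partitions:
            hits.extend(partitions[wanted])
    return hits


rows = [
    ("c1", date(2014, 1, 5), 10.0),
    ("c2", date(2014, 1, 9), 5.0),
    ("c1", date(2014, 2, 1), 7.0),
]
storage = load(rows)
january = scan_month(storage, 2014, 1)
```

Hashing spreads the work across segments, while the date range lets each segment skip partitions that cannot contain matching rows.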
Hybrid Queries: Pivotal External Tables
• Readable Ext-Table MVP
• Readable Gzip Files
• Writable Ext-Table
• Investigation: Enhanced Security/Roles
• Investigation: Additional File Formats
S3 External Tables
GemFire External Tables
• High Speed Ingestion
• High Concurrency Query Cache
GPHDFS
Roadmap
Greenplum Database Features for Data Scientists
• Window functions: Perform calculations across a set of table rows that are somehow related to the current row
• Analytics extensions: In-database machine learning at scale using MADlib
• Procedural language extensions: Extended functionality using non-SQL programming languages and packages (e.g. Python and R)
• Client Access: ODBC and JDBC access to support connections to 3rd party tools
* Only a subset of Greenplum Database features
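To make the window function idea from the list above concrete, the sketch below computes a per-region running total in plain Python. It mirrors the spirit of a SQL expression such as `SUM(amount) OVER (PARTITION BY region ORDER BY day)`; the column names are hypothetical, and the sketch assumes each region has at most one row per day, so ties in the ordering do not arise.

```python
from itertools import groupby
from operator import itemgetter


def running_total(rows):
    """Return each row with a running total of `amount` within its region."""
    result = []
    ordered = sorted(rows, key=itemgetter("region", "day"))
    # groupby yields one group per region, playing the role of PARTITION BY
    for _, partition in groupby(ordered, key=itemgetter("region")):
        total = 0
        for row in partition:
            total += row["amount"]
            # Every input row is kept; the window only adds a computed column
            result.append({**row, "running_total": total})
    return result


rows = [
    {"region": "EU", "day": 2, "amount": 5},
    {"region": "EU", "day": 1, "amount": 3},
    {"region": "US", "day": 1, "amount": 7},
]
with_totals = running_total(rows)
```

Unlike a GROUP BY aggregate, the output has as many rows as the input, which is what makes window functions useful for rankings, moving averages and running sums.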
Procedural Languages
• User Defined Types
• User Defined Functions
• User Defined Aggregates
• Import of libraries from open source
Scalable, In-Database Machine Learning
• Open source https://github.com/apache/incubator-madlib
• Downloads and docs http://madlib.incubator.apache.org/
• Wiki https://cwiki.apache.org/confluence/display/MADLIB/
Functions
Linear Systems
• Sparse and Dense Solvers
• Linear Algebra
Matrix Factorization
• Singular Value Decomposition (SVD)
• Low Rank
Generalized Linear Models
• Linear Regression
• Logistic Regression
• Multinomial Logistic Regression
• Ordinal Regression
• Cox Proportional Hazards Regression
• Elastic Net Regularization
• Robust Variance (Huber-White), Clustered Variance, Marginal Effects
Other Machine Learning Algorithms
• Principal Component Analysis (PCA)
• Association Rules (Apriori)
• Topic Modeling (Parallel LDA)
• Decision Trees
• Random Forest
• Conditional Random Field (CRF)
• Clustering (K-means)
• Cross Validation
• Naïve Bayes
• Support Vector Machines (SVM)
Descriptive Statistics
Sketch-Based Estimators
• CountMin (Cormode-Muthukrishnan)
• FM (Flajolet-Martin)
• MFV (Most Frequent Values)
Correlation and Covariance
Summary
Utility Modules
Array and Matrix Operations
Sparse Vectors
Random Sampling
Probability Functions
Data Preparation
PMML Export
Conjugate Gradient
Stemming
Inferential Statistics
Hypothesis Tests
Time Series
• ARIMA
April 2016
Path Functions
• Operations on Pattern Matches
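As an illustration of the sketch-based estimators listed above, here is a minimal Count-Min sketch in plain Python. It is not MADlib's implementation: the class name, the default `width` and `depth`, and the use of SHA-256 to derive row hashes are all assumptions made for the example. The `merge` method shows why such sketches suit a parallel database: each segment can build its own sketch and the results can be added together.

```python
import hashlib


class CountMinSketch:
    """Approximate frequency counter that never underestimates a count."""

    def __init__(self, width=256, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, item, row):
        # Derive one hash per row by salting the item with the row number
        digest = hashlib.sha256(f"{row}:{item}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def add(self, item, count=1):
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += count

    def estimate(self, item):
        # Collisions can only inflate a cell, so the minimum is the best guess
        return min(self.table[row][self._index(item, row)] for row in range(self.depth))

    def merge(self, other):
        if (self.width, self.depth) != (other.width, other.depth):
            raise ValueError("sketches must share width and depth")
        merged = CountMinSketch(self.width, self.depth)
        for row in range(self.depth):
            merged.table[row] = [a + b for a, b in zip(self.table[row], other.table[row])]
        return merged


# Usage: two "segments" count their local values, then the sketches are combined
segment_a, segment_b = CountMinSketch(), CountMinSketch()
for value in ["a", "a", "b"]:
    segment_a.add(value)
for value in ["a", "c"]:
    segment_b.add(value)
combined = segment_a.merge(segment_b)
```

Memory use is fixed by `width` and `depth` regardless of how many distinct values are seen, at the cost of possible overestimates.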
GPDB Geospatial
Current Key Features:
• Points, Lines, Polygons, Perimeter, Area, Intersection, Contains, Distance, Long/Lat
• Spatial Indexes & Bounding Boxes
• Round earth calculations
• Raster Image Processing
Ability to store geospatial data and query with joins and operators
Pivotal HDB
Hadoop Native SQL Database
Making the Hadoop Data Lake More Consumable
Enabling data science and machine learning at scale
1) Important people and tools are cut off because of SQL completeness or performance.
2) Data scientists still have to resort to sampling if they can't run analytics in-database at scale
3) There are multiple data sets and formats within Hadoop
[Diagram: BUSINESS ANALYSTS (SQL, App) and DATA SCIENTISTS working against DATA LAKE stores (Hive, HBase, etc.)]
As the lingua franca of analytics, SQL can't be ignored. Neither can performance.
Lack of interactive, ANSI SQL capabilities inhibits adoption and value
Hadoop data lakes sit underutilized
• Producing complex queries, large joins, interactive queries
• Existing investments in visualization and BI tools
• Large population of users with SQL skills
High performance, interactive SQL queries on Hadoop
HDB: The Hadoop Native SQL Database
● Highly efficient MPP (massively parallel processing)
● Low-latency
● Petabyte scalability
● ACID transaction support
● SQL-92, 99, 2003 compatibility
● Advanced cost-based optimizer
Integrate SQL and data science tools into an interactive, operationalized environment
Using traditional, single-node Python or R for analytics means using subsets because of the lack of parallelization
Predictive analytics not scaling with Python or R
<...>
Implications
• Time-consuming data movement
• Working with small sample sizes requires extra testing cycles against larger data sets
• Slow feature generation limits algorithm development
[Diagram: SAMPLE 1, SAMPLE 2 ... SAMPLE n drawn from the DATA LAKE]
Apache™ MADlib® (incubating) is an open-source library for scalable in-database analytics
In-database analytics speeds predictive modeling
Scale-out mathematical, statistical and machine learning methods for structured and unstructured data
• SQL-based
• Analyze without sampling
• Open source
• Runs on HDB, Greenplum, and Postgres
• Complements support for procedural languages: PL/R, PL/Python, PL/Java
Train a model...
Predict for new data...
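One way to see how training can run over all the data without sampling is to note that some models only need sums that each segment can compute locally. The sketch below fits a one-variable linear regression from per-segment partial sums and then predicts for a new value. It is a plain-Python illustration of the idea, not MADlib's implementation or API; all function names are hypothetical.

```python
def partial_stats(points):
    """Per-segment step: sufficient statistics over local (x, y) pairs."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    return n, sx, sy, sxx, sxy


def combine(stats):
    """Master step: add the partial sums from all segments element-wise."""
    return tuple(sum(column) for column in zip(*stats))


def train(stats):
    """Ordinary least squares for y = slope * x + intercept."""
    n, sx, sy, sxx, sxy = stats
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return slope, intercept


def predict(model, x):
    slope, intercept = model
    return slope * x + intercept


# Train a model: two "segments" hold different rows of the same table
segments = [
    [(0, 1), (1, 3)],
    [(2, 5), (3, 7)],
]
model = train(combine(partial_stats(seg) for seg in segments))

# Predict for new data
forecast = predict(model, 4)
```

The raw rows never leave their segment; only five numbers per segment are combined, so the full data set contributes to the model.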
Overcome complexity
Schema Read
HDB’s Pivotal eXtension Framework (PXF) and HCatalog integration
Simplifying the data lake with data federation
• Enables connectivity between Pivotal HDB and other stores (Hive, HBase, HDFS files).
• Provides an extensible framework to add support for custom services
• Low latency on large data sets
• Considers cost model of federated sources
[Diagram: HDFS DATA LAKE with HCatalog; formats CSV, TXT, Avro; Custom Extensions]
HDB as part of an architecture: Next Likely Purchase
Providing information in context with the right architecture and the right algorithms
1. Ingest, transform, and land data into HDFS
2. Score streaming data and serve to application
[Diagram: CUSTOMER APP (PURCHASE, NEXT OFFER); INTERNAL APP (REAL-TIME VIEW OF TRANSACTIONS AND OFFERS); TRANSACTIONS; HDFS Staging; HDB Tables; Model Creation & Training; PMML; DATA SCIENCE & AD HOC QUERIES; REPORTS]
Pivotal HDB Advantages
• Advanced Analytics Performance: Exceptional MPP performance, low latency, petabyte scalability, ACID reliability, fault tolerance
• Most Complete Language Compliance: Higher degree of SQL compatibility, SQL-92, 99, 2003, OLAP, leverage existing SQL skills
• Advanced Query Optimizer: Maximize performance and do advanced queries with confidence
• Elastic Architecture for Scalability: Scale-up/down or scale-in/out, expand/shrink clusters on the fly
• Integrated w/MADlib Machine Learning: Advanced MPP analytics, data science at scale, directly on Hadoop data
“Companies need to learn how to catch people or things in the act of doing something and affect the outcome”
PAUL MARITZ, Executive Chairman, Pivotal
Real-time and Personalised Information in Context is what Wins!
Pivotal Big Data Suite: A Technical Overview

Contenu connexe

Tendances

On-premise to Microsoft Azure Cloud Migration.
 On-premise to Microsoft Azure Cloud Migration. On-premise to Microsoft Azure Cloud Migration.
On-premise to Microsoft Azure Cloud Migration.Emtec Inc.
 
Serverless and Design Patterns In GCP
Serverless and Design Patterns In GCPServerless and Design Patterns In GCP
Serverless and Design Patterns In GCPOliver Fierro
 
Platform & Application Modernization
Platform & Application ModernizationPlatform & Application Modernization
Platform & Application ModernizationJK Tech
 
A Roadmap to Cloud Center of Excellence Adoption
A Roadmap to Cloud Center of Excellence AdoptionA Roadmap to Cloud Center of Excellence Adoption
A Roadmap to Cloud Center of Excellence AdoptionAmazon Web Services
 
Azure Cloud Adoption Framework + Governance - Sana Khan and Jay Kumar
Azure Cloud Adoption Framework + Governance - Sana Khan and Jay Kumar Azure Cloud Adoption Framework + Governance - Sana Khan and Jay Kumar
Azure Cloud Adoption Framework + Governance - Sana Khan and Jay Kumar Timothy McAliley
 
Cloud workload migration guidelines
Cloud workload migration guidelinesCloud workload migration guidelines
Cloud workload migration guidelinesJen Wei Lee
 
Oracle Cloud Infrastructure Overview Deck.pptx
Oracle Cloud Infrastructure Overview Deck.pptxOracle Cloud Infrastructure Overview Deck.pptx
Oracle Cloud Infrastructure Overview Deck.pptxLabibKhairi
 
Emerging Trends in Hybrid-Cloud & Multi-Cloud Strategies
Emerging Trends in Hybrid-Cloud & Multi-Cloud StrategiesEmerging Trends in Hybrid-Cloud & Multi-Cloud Strategies
Emerging Trends in Hybrid-Cloud & Multi-Cloud StrategiesChaitanya Atreya
 
Cloud comparison - AWS vs Azure vs Google
Cloud comparison - AWS vs Azure vs GoogleCloud comparison - AWS vs Azure vs Google
Cloud comparison - AWS vs Azure vs GooglePatrick Pierson
 
Microsoft Azure Databricks
Microsoft Azure DatabricksMicrosoft Azure Databricks
Microsoft Azure DatabricksSascha Dittmann
 
Microservices Part 3 Service Mesh and Kafka
Microservices Part 3 Service Mesh and KafkaMicroservices Part 3 Service Mesh and Kafka
Microservices Part 3 Service Mesh and KafkaAraf Karsh Hamid
 
Cloud dw benchmark using tpd-ds( Snowflake vs Redshift vs EMR Hive )
Cloud dw benchmark using tpd-ds( Snowflake vs Redshift vs EMR Hive )Cloud dw benchmark using tpd-ds( Snowflake vs Redshift vs EMR Hive )
Cloud dw benchmark using tpd-ds( Snowflake vs Redshift vs EMR Hive )SANG WON PARK
 
Data Engineer's Lunch #83: Strategies for Migration to Apache Iceberg
Data Engineer's Lunch #83: Strategies for Migration to Apache IcebergData Engineer's Lunch #83: Strategies for Migration to Apache Iceberg
Data Engineer's Lunch #83: Strategies for Migration to Apache IcebergAnant Corporation
 
Webinar: Simplifying the Enterprise Hybrid Cloud with Azure Stack HCI
Webinar: Simplifying the Enterprise Hybrid Cloud with Azure Stack HCIWebinar: Simplifying the Enterprise Hybrid Cloud with Azure Stack HCI
Webinar: Simplifying the Enterprise Hybrid Cloud with Azure Stack HCIStorage Switzerland
 
Capgemini Cloud Assessment - A Pathway to Enterprise Cloud Migration
Capgemini Cloud Assessment - A Pathway to Enterprise Cloud MigrationCapgemini Cloud Assessment - A Pathway to Enterprise Cloud Migration
Capgemini Cloud Assessment - A Pathway to Enterprise Cloud MigrationFloyd DCosta
 
DevOps for Databricks
DevOps for DatabricksDevOps for Databricks
DevOps for DatabricksDatabricks
 
Meet up roadmap cloudera 2020 - janeiro
Meet up   roadmap cloudera 2020 - janeiroMeet up   roadmap cloudera 2020 - janeiro
Meet up roadmap cloudera 2020 - janeiroThiago Santiago
 
Spark (Structured) Streaming vs. Kafka Streams
Spark (Structured) Streaming vs. Kafka StreamsSpark (Structured) Streaming vs. Kafka Streams
Spark (Structured) Streaming vs. Kafka StreamsGuido Schmutz
 

Tendances (20)

On-premise to Microsoft Azure Cloud Migration.
 On-premise to Microsoft Azure Cloud Migration. On-premise to Microsoft Azure Cloud Migration.
On-premise to Microsoft Azure Cloud Migration.
 
Serverless and Design Patterns In GCP
Serverless and Design Patterns In GCPServerless and Design Patterns In GCP
Serverless and Design Patterns In GCP
 
Platform & Application Modernization
Platform & Application ModernizationPlatform & Application Modernization
Platform & Application Modernization
 
A Roadmap to Cloud Center of Excellence Adoption
A Roadmap to Cloud Center of Excellence AdoptionA Roadmap to Cloud Center of Excellence Adoption
A Roadmap to Cloud Center of Excellence Adoption
 
Azure Cloud Adoption Framework + Governance - Sana Khan and Jay Kumar
Azure Cloud Adoption Framework + Governance - Sana Khan and Jay Kumar Azure Cloud Adoption Framework + Governance - Sana Khan and Jay Kumar
Azure Cloud Adoption Framework + Governance - Sana Khan and Jay Kumar
 
왜 네이버클라우드플랫폼인가?(박기은 CTO) - 대구 Cloud Innovation summit
왜 네이버클라우드플랫폼인가?(박기은 CTO) - 대구 Cloud Innovation summit왜 네이버클라우드플랫폼인가?(박기은 CTO) - 대구 Cloud Innovation summit
왜 네이버클라우드플랫폼인가?(박기은 CTO) - 대구 Cloud Innovation summit
 
Cloud workload migration guidelines
Cloud workload migration guidelinesCloud workload migration guidelines
Cloud workload migration guidelines
 
Oracle Cloud Infrastructure Overview Deck.pptx
Oracle Cloud Infrastructure Overview Deck.pptxOracle Cloud Infrastructure Overview Deck.pptx
Oracle Cloud Infrastructure Overview Deck.pptx
 
Emerging Trends in Hybrid-Cloud & Multi-Cloud Strategies
Emerging Trends in Hybrid-Cloud & Multi-Cloud StrategiesEmerging Trends in Hybrid-Cloud & Multi-Cloud Strategies
Emerging Trends in Hybrid-Cloud & Multi-Cloud Strategies
 
Implementing a Data Lake
Implementing a Data LakeImplementing a Data Lake
Implementing a Data Lake
 
Cloud comparison - AWS vs Azure vs Google
Cloud comparison - AWS vs Azure vs GoogleCloud comparison - AWS vs Azure vs Google
Cloud comparison - AWS vs Azure vs Google
 
Microsoft Azure Databricks
Microsoft Azure DatabricksMicrosoft Azure Databricks
Microsoft Azure Databricks
 
Microservices Part 3 Service Mesh and Kafka
Microservices Part 3 Service Mesh and KafkaMicroservices Part 3 Service Mesh and Kafka
Microservices Part 3 Service Mesh and Kafka
 
Cloud dw benchmark using tpd-ds( Snowflake vs Redshift vs EMR Hive )
Cloud dw benchmark using tpd-ds( Snowflake vs Redshift vs EMR Hive )Cloud dw benchmark using tpd-ds( Snowflake vs Redshift vs EMR Hive )
Cloud dw benchmark using tpd-ds( Snowflake vs Redshift vs EMR Hive )
 
Data Engineer's Lunch #83: Strategies for Migration to Apache Iceberg
Data Engineer's Lunch #83: Strategies for Migration to Apache IcebergData Engineer's Lunch #83: Strategies for Migration to Apache Iceberg
Data Engineer's Lunch #83: Strategies for Migration to Apache Iceberg
 
Webinar: Simplifying the Enterprise Hybrid Cloud with Azure Stack HCI
Webinar: Simplifying the Enterprise Hybrid Cloud with Azure Stack HCIWebinar: Simplifying the Enterprise Hybrid Cloud with Azure Stack HCI
Webinar: Simplifying the Enterprise Hybrid Cloud with Azure Stack HCI
 
Capgemini Cloud Assessment - A Pathway to Enterprise Cloud Migration
Capgemini Cloud Assessment - A Pathway to Enterprise Cloud MigrationCapgemini Cloud Assessment - A Pathway to Enterprise Cloud Migration
Capgemini Cloud Assessment - A Pathway to Enterprise Cloud Migration
 
DevOps for Databricks
DevOps for DatabricksDevOps for Databricks
DevOps for Databricks
 
Meet up roadmap cloudera 2020 - janeiro
Meet up   roadmap cloudera 2020 - janeiroMeet up   roadmap cloudera 2020 - janeiro
Meet up roadmap cloudera 2020 - janeiro
 
Spark (Structured) Streaming vs. Kafka Streams
Spark (Structured) Streaming vs. Kafka StreamsSpark (Structured) Streaming vs. Kafka Streams
Spark (Structured) Streaming vs. Kafka Streams
 

Pivotal Big Data Suite: A Technical Overview

  • 2. TECHNICAL OVERVIEW: Pivotal Big Data Suite. Les Klein, Field CTO Data, Pivotal. @LesKlein #PivotalForum #Istanbul #BigData #Analytics
  • 3. Forward Looking Statements. This presentation contains “forward-looking statements” as defined under the Federal Securities Laws. Actual results could differ materially from those projected in the forward-looking statements as a result of certain risk factors, including but not limited to: (i) adverse changes in general economic or market conditions; (ii) delays or reductions in information technology spending; (iii) the relative and varying rates of product price and component cost declines and the volume and mixture of product and services revenues; (iv) competitive factors, including but not limited to pricing pressures and new product introductions; (v) component and product quality and availability; (vi) fluctuations in VMware, Inc.’s operating results and risks associated with trading of VMware stock; (vii) the transition to new products, the uncertainty of customer acceptance of new product offerings and rapid technological and market change; (viii) risks associated with managing the growth of our business, including risks associated with acquisitions and investments and the challenges and costs of integration, restructuring and achieving anticipated synergies; (ix) the ability to attract and retain highly qualified employees; (x) insufficient, excess or obsolete inventory; (xi) fluctuating currency exchange rates; (xii) threats and other disruptions to our secure data centers and networks; (xiii) our ability to protect our proprietary technology; (xiv) war or acts of terrorism; and (xv) other one-time events and other important factors disclosed previously and from time to time in the filings of EMC Corporation, the parent company of Pivotal, with the U.S. Securities and Exchange Commission. EMC and Pivotal disclaim any obligation to update any such forward-looking statements after the date of this release.
  • 4. Pivotal Big Data Suite: open source data management portfolio. Complete platform; Hadoop Native SQL; deployment options; based on open source; flexible licensing; advanced data services. PIVOTAL GREENPLUM DATABASE: data warehouse database based on open source Greenplum Database. PIVOTAL HDB: open source analytical database for Apache Hadoop, based on Apache HAWQ. PIVOTAL GEMFIRE: open source application and transaction data grid, based on Apache Geode. © 2016 Pivotal Software, Inc. All rights reserved.
  • 5. Great software companies leverage Big Data to fundamentally change the consumer experience and pioneer entirely new business models
  • 6. CLOUD NATIVE SOFTWARE IS CHANGING INDUSTRIES. Data is Fueling Software. $4BN Financial Services; $26BN Hospitality; $50BN Transportation; $54BN Entertainment; $30BN Automotive; $3.2BN Industrial Products.
  • 7. Disruptors Use a LOT of Data. Hundreds of thousands of “trip” events each day; 400+ billion viewing-related events per day; five billion training data points for the Price Tip feature. © Copyright 2015 Pivotal. All rights reserved.
  • 8. Data manifests as features in an app. “We’ve found that when a host selects a price that’s within 5% of their tip, they’re nearly 4 times more likely to get booked” “The importance of accuracy and efficiency […], will continue to rise as we expand and improve products like uberPOOL and beyond.” “Over 75% of what people watch come from our recommendations”
  • 9. How are they accomplishing this? (Data) Microservices: loosely coupled services architecture, bounded by context. Cloud-Native Platforms: enabling continuous delivery & automated operations. Open Source Database Innovation: extreme scale & performance advantages, built for the cloud. Machine Learning: use of predictive analytics to build smart apps.
  • 10. These companies… release new features in minutes, multiple times a day; support a micro-services architecture; consume a wide range of data sources and protocols; store and analyze all their data; update algorithms and predictive models daily; continuously ask lots of questions of their data; modify data pipelines and add processing steps daily.
  • 11. …but most enterprises are not quite there yet. Application scalability is limited by databases. Real-time data insights are limited by disconnected OLTP and OLAP systems. Data services are not ready for cloud platforms. [Diagrams: several apps bottlenecked on one transactional database; a transactional database feeding an analytic database through ETL / ELT batches with delay Δt; continuous delivery.]
  • 12. Modern Cloud-Native Data Architecture. [Diagram labels: Cloud Infrastructure; Cloud-Native Platform; Programming + Operating Model; Microservices Framework; Platform Runtime; Microservices and Polyglot Persistence; IMDG; K/V Store; Relational DB; Big Data & Machine Learning; Stream + Batch Processing; Hadoop; DW; Spark.]
  • 13. New pressures are breaking fragile systems. Application scalability is limited by databases. Real-time data insights are limited by disconnected OLTP and OLAP systems. Data services are not ready for cloud platforms. [Diagrams: several apps bottlenecked on one transactional database; a transactional database feeding an analytic database through ETL / ELT batches with delay Δt; continuous delivery.]
  • 14. Apps scalability limited by scalability of databases. DB scalability limitations are aggravated by additional devices, clients and apps. Scale-out applications vs scale-up databases. [Diagram: existing applications, new devices and clients, and new cloud native scalable data apps all converge on one transactional database, which becomes the bottleneck.]
  • 15. GemFire: cloud-scale high performance transactional data. • Horizontally scalable • Ultra fast, low-latency in-memory transactions • Fully configurable data consistency • Reliable eventing and notification model • Highly available, auto-healing • Inter-cluster WAN replication. [Diagram: custom apps reach Pivotal GemFire through the transactional native API or REST / HTTP and receive push updates.]
  • 16. Batch-mode latency prevents real-time analysis. Application scalability is limited by databases. Real-time data insights are limited by disconnected OLTP and OLAP systems. Data services are not ready for cloud platforms. [Diagrams: several apps bottlenecked on one transactional database; a transactional database feeding an analytic database through ETL / ELT batches with delay Δt; continuous delivery.]
  • 17. Real-time data analytics is limited by data integration batches. Overnight ETL / ELT jobs expose data that is already outdated. • Analytical processes don’t have access to the latest data • ETL/ELT processes are expensive and hard to maintain • Batch process windows limit data scalability. [Diagram: apps write hot data to a transactional database (TRANSACTIONS); ETL / ELT batches move it, after delay Δt, to an MPP database holding cold data (ANALYTICS); axis: data temperature.]
  • 18. Operationalized data insights need an event-driven architecture. A combination of SQL analytics and NoSQL event-driven transactions is needed. • Data insights must be immediately pushed to applications • Apps should be able to react in real-time to analytical findings. [Diagram: apps use APIs / NoSQL against a transactional database (TRANSACTIONS); an MPP database with machine learning and advanced analytics over ANSI SQL (ANALYTICS) sends data insights back.]
  • 19. GemFire and GPDB: Big Data meets Fast Data. [Diagram: custom apps reach Pivotal GemFire (hot data) through the transactional native API or REST / HTTP and receive push updates; data science, analytics & ML reach Pivotal Greenplum (warm data) through analytical ANSI SQL. Between the two: parallel configurable data load, transactional data written behind to Greenplum, analytical data loaded to the cache.]
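
The "write behind" arrow on this slide can be illustrated with a minimal sketch. It assumes nothing about GemFire's real API: `WriteBehindCache`, `sink` and `batch_size` are hypothetical names, and a Python list stands in for the analytical database. Reads and writes hit memory immediately, while changes are queued and forwarded in batches.

```python
from collections import deque


class WriteBehindCache:
    """Toy in-memory store that queues writes for a slower analytical store."""

    def __init__(self, sink, batch_size=3):
        self.data = {}              # hot, in-memory key/value state
        self.pending = deque()      # writes not yet propagated downstream
        self.sink = sink            # callable receiving a list of (key, value)
        self.batch_size = batch_size

    def put(self, key, value):
        # The caller only waits for the in-memory update.
        self.data[key] = value
        self.pending.append((key, value))
        if len(self.pending) >= self.batch_size:
            self.flush()

    def get(self, key):
        return self.data.get(key)

    def flush(self):
        # Forward everything queued so far as one batch.
        if not self.pending:
            return
        batch = list(self.pending)
        self.pending.clear()
        self.sink(batch)


warehouse = []  # hypothetical stand-in for the analytical database
cache = WriteBehindCache(sink=warehouse.extend, batch_size=3)
for i in range(4):
    cache.put(f"order-{i}", {"amount": 10 * i})
```

After the loop, three orders have been forwarded as one batch and the fourth is still queued, yet all four are readable from the cache.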
  • 20. …but most enterprises are not quite there yet. Application scalability is limited by databases. Real-time data insights are limited by disconnected OLTP and OLAP systems. Data services are not ready for cloud platforms. [Diagrams: several apps bottlenecked on one transactional database; a transactional database feeding an analytic database through ETL / ELT batches with delay Δt; continuous delivery.]
  • 21. Cloud Native apps are better suited to NoSQL. Enabling fast and scalable event-driven data services. Monolithic apps needed complex schema-based SQL databases (unidirectional, request-response SQL). Micro-services need much simpler schemas, but much better scalability (bidirectional, event-driven APIs).
  • 22. GemFire for Pivotal Cloud Foundry. Lightning fast in-memory persistence for cloud native apps. • One-click provisioning • Pre-packaged configuration • Embedded monitoring by Pulse • Auto application binding • Multi-cloud support • Reliable data replication between PCF sites. [Diagram: Pivotal GemFire, click to deploy on Pivotal Cloud Foundry.]
  • 23. Next-generation databases must keep up with cloud native apps. Can your database do all of this? GemFire IMDG DOES. Cloud-ready and infrastructure-agnostic; horizontal scalability; automatic fail-over; reliable eventing model; multi-site high availability; seamless integration with analytical databases.
  • 24. Pivotal Greenplum: World’s First Open Source Massively Parallel Data Warehouse
  • 25. Greenplum Database Mission & Strategy. • Relational database system for big data and data warehousing • Mission critical & system of record product with supporting tools and ecosystem • Fully open source with a global community of developers and users • Large industrial focused system • PostgreSQL based • Multi-platform technology: on-premise, cloud, enterprise appliance • It’s a software product
  • 26. Highlighted Greenplum Successes. Government: tax & benefits fraud detection; economic statistics research. Financial Services: wealth management data science and product development for Commercial Banking Risk and trade repositories reporting; 401K providers analytics on investment choices. Pharmaceutical: vaccine potency prediction based on manufacturing sensors. IoT: predictive maintenance for auto manufacturer, industrial equipment and government agencies. Semiconductor: fab sensor analytics and reporting. Cyber Security & Surveillance: internal email and communication surveillance and reporting; corporate network anomalous behavior and intrusion detections. Oil & Gas: drilling equipment predictive maintenance. Communications: mobile telephone company enterprise data warehouse; network performance and availability analytics. Retail: customer purchases analytics. Transportation: airlines loyalty program analytics.
  • 27. Greenplum Overview. Greenplum DB layers: SYSTEM ACCESS, DATA PROCESSING, DATA STORAGE. CLIENT ACCESS: PSQL, ODBC, JDBC. BULK LOAD/UNLOAD: GPLoad, GPFdist, External Tables, GPHDFS. ADMIN TOOLS: GP Perfmon, GP Support. 3rd PARTY TOOLS: compatible with industry standard BI & ETL tools. SQL STANDARD COMPLIANCE. BIG DATA QUERY OPTIMIZER. MASSIVELY PARALLEL PROCESSING (MPP). IN-DATABASE PROGRAMMING LANGUAGES: PL/pgSQL, PL/Python, PL/R, PL/Perl, PL/Java, PL/C. IN-DATABASE ANALYTICS & EXTENSIONS: MADlib, PostGIS, PGCrypto. POLYMORPHIC STORAGE: HEAP, Append Only, Columnar, External, Compression. MULTI-VERSION CONCURRENCY CONTROL (MVCC). FULLY ACID COMPLIANT TRANSACTIONAL DATABASE. INDEXES: B-Tree, Bitmap, GiST.
  • 28. Greenplum Open Source Launch. PostgreSQL heritage: • Widely used • Open Source • PostgreSQL License • Enterprise class open source relational engine
  • 29. MPP Shared Nothing Architecture. Flexible framework for processing large datasets. Master Host and Standby Master Host: the master receives SQL and coordinates work with the Segment Hosts. Each Segment Host runs one or more Segment Instances. Segment Instances process queries in parallel. Segment Hosts have their own CPU, disk and memory (shared nothing). A high speed interconnect provides continuous pipelining of data processing. [Diagram: Greenplum DB with a master host connected over the interconnect to segment hosts node1, node2, node3 … nodeN, each running four segment instances.]
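
As a rough illustration of the master/segment split described on this slide, the sketch below has a "master" function hand the same filter-and-aggregate task to every "segment" in parallel and then merge the partial results. It is a toy model under stated assumptions, not Greenplum's executor: `segment_scan`, `master_query` and the sample rows are hypothetical, and threads stand in for separate hosts.

```python
from concurrent.futures import ThreadPoolExecutor


def segment_scan(rows, predicate):
    """Each segment filters and pre-aggregates only the rows it owns."""
    matching = [r["amount"] for r in rows if predicate(r)]
    return sum(matching), len(matching)


def master_query(segments, predicate):
    """Dispatch the same work to every segment, then merge the partials."""
    with ThreadPoolExecutor(max_workers=len(segments)) as pool:
        partials = list(
            pool.map(lambda rows: segment_scan(rows, predicate), segments)
        )
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total, count


# Three segments, each holding its own slice of the table (shared nothing).
segments = [
    [{"region": "EU", "amount": 10}, {"region": "US", "amount": 5}],
    [{"region": "EU", "amount": 7}],
    [{"region": "US", "amount": 3}, {"region": "EU", "amount": 1}],
]
eu_total, eu_count = master_query(segments, lambda r: r["region"] == "EU")
```

Only small partial results travel back to the master; the row data itself never leaves the segment that stores it.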
  • 30. Loading: Massively-Parallel Ingest. Extreme speed and immediate usability from files, ETL, Hadoop & S3. Fast parallel load & unload: no master node bottleneck; 10+ TB/hour per rack; linear scalability. Low latency: data immediately available; no intermediate stores; no data “reorganization”. Load/unload to & from: file systems; any ETL product; Hadoop & Amazon S3. [Diagram: external sources (ETL, file systems; loading, streaming, etc.) connect over the network interconnect to Greenplum DB master servers (query planning & dispatch) and segment servers (query processing & data storage).]
  • 31. Polymorphic Storage™: User Definable Storage Layout. Row-oriented: faster when returning all columns; HEAP for many updates and deletes; use indexes for drill through queries. Column-oriented: columnar storage compresses better; optimized for retrieving a subset of the columns when querying; compression can be set differently per column: gzip (1-9), quicklz, delta, RLE. External HDFS or S3: keep less-accessed partitions on external storage and still query all data seamlessly; all major Hadoop distributions; Amazon S3 storage; others in development. [Diagram: partitions of TABLE ‘SALES’ (Jun through Dec, Year - 1, Year - 2) stored row-oriented, column-oriented, or externally.]
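
To see why a column-oriented layout tends to compress better, the following sketch pivots a few rows into columns and applies run-length encoding (RLE), one of the schemes named on the slide. It is a simplified illustration with made-up sample data; `to_columns`, `rle_encode` and `rle_decode` are hypothetical helpers, not Greenplum functions.

```python
from itertools import groupby

# Row-oriented layout: one tuple per sale (month, region, amount).
rows = [
    ("2015-06", "EU", 10),
    ("2015-06", "EU", 12),
    ("2015-06", "EU", 9),
    ("2015-07", "US", 4),
    ("2015-07", "US", 6),
    ("2015-07", "APAC", 8),
]


def to_columns(rows):
    """Pivot rows into one list per column (column-oriented layout)."""
    return [list(col) for col in zip(*rows)]


def rle_encode(values):
    """Collapse each run of equal neighbouring values into (value, length)."""
    return [(value, len(list(run))) for value, run in groupby(values)]


def rle_decode(runs):
    """Expand (value, length) pairs back into the original column."""
    return [value for value, length in runs for _ in range(length)]


month, region, amount = to_columns(rows)
month_runs = rle_encode(month)
region_runs = rle_encode(region)
```

Values within a single column repeat far more often than values across a whole row, so the `month` column shrinks from six entries to two runs. A query that needs only `amount` can also skip the other two columns entirely.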
  • 32. Partitions and External Partitions. • Hash Distribution to evenly spread data across all segment instances • Range Partition within a segment instance to minimize scan work • Partitioned Tables Support for External Tables as a Partition – Readable external table – Host file system, NFS mount, HDFS or Amazon S3. [Diagram: Greenplum DB parent table with monthly partitions from Jan 2013 to Dec 2014, some of them external.]
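
The two ideas on this slide combine as follows: a hash of the distribution key decides which segment stores a row, and a range partition inside each segment lets a query skip data it cannot need. The sketch below is a toy model under loud assumptions: `zlib.crc32` is only a stand-in for whatever hash function the database uses, and `segment_for`, `partition_for`, `insert` and `scan_month` are hypothetical names.

```python
import zlib
from collections import defaultdict
from datetime import date

NUM_SEGMENTS = 4


def segment_for(distribution_key):
    """A stable hash of the distribution key picks the segment."""
    return zlib.crc32(str(distribution_key).encode()) % NUM_SEGMENTS


def partition_for(sale_date):
    """Monthly range partition label, for example '2014-02'."""
    return f"{sale_date.year:04d}-{sale_date.month:02d}"


# segment number -> partition label -> rows
storage = defaultdict(lambda: defaultdict(list))


def insert(row):
    segment = storage[segment_for(row["customer_id"])]
    segment[partition_for(row["sale_date"])].append(row)


def scan_month(year, month):
    """Every segment scans only the one partition that can match."""
    wanted = f"{year:04d}-{month:02d}"
    hits = []
    for partitions in storage.values():
        hits.extend(partitions.get(wanted, []))
    return hits


for customer_id in range(1, 21):
    for month in (1, 2, 3):
        insert({
            "customer_id": customer_id,
            "sale_date": date(2014, month, 15),
            "amount": customer_id * month,
        })

february = scan_month(2014, 2)
```

The February query touches one partition per segment and never reads the January or March rows.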
  • 33. Roadmap. Hybrid Queries: Pivotal External Tables • Readable Ext-Table MVP • Readable Gzip Files • Writable Ext-Table • Investigation: Enhanced Security/Roles • Investigation: Additional File Formats S3 External Tables GemFire External Tables • Hi Speed Ingestion • Hi Concurrency Query Cache GPHDFS
  • 34. Greenplum Database Features for Data Scientists* • Window functions: perform calculations across a set of table rows that are somehow related to the current row • Analytics extensions: in-database machine learning at scale using MADlib • Procedural language extensions: extended functionality using non-SQL programming languages and packages (e.g. Python and R) • Client access: ODBC and JDBC access to support connections to 3rd party tools. * Only a subset of Greenplum Database features
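
The window function bullet is easier to picture with a worked example. The sketch below mimics, in plain Python, what a running total computed per region and ordered by day would return; in SQL this is typically expressed with `SUM(...) OVER (PARTITION BY ... ORDER BY ...)`. The `running_total` helper and the sample rows are hypothetical and only illustrate the semantics.

```python
from itertools import groupby
from operator import itemgetter


def running_total(rows, partition_key, order_key, value_key):
    """Return every input row plus a running total within its partition."""
    out = []
    ordered = sorted(rows, key=itemgetter(partition_key, order_key))
    for _, group in groupby(ordered, key=itemgetter(partition_key)):
        total = 0
        for row in group:
            total += row[value_key]
            # Unlike GROUP BY, a window function keeps one output row
            # per input row.
            out.append({**row, "running_total": total})
    return out


sales = [
    {"region": "EU", "day": 2, "amount": 7},
    {"region": "US", "day": 1, "amount": 5},
    {"region": "EU", "day": 1, "amount": 10},
]
result = running_total(sales, "region", "day", "amount")
```

Each output row carries both its own `amount` and a value derived from related rows, which is the "related to the current row" idea in the bullet.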
  • 35. Procedural Languages • User Defined Types • User Defined Functions • User Defined Aggregates • Import of libraries from open source
  • 36. Scalable, In-Database Machine Learning • Open source: https://github.com/apache/incubator-madlib • Downloads and docs: http://madlib.incubator.apache.org/ • Wiki: https://cwiki.apache.org/confluence/display/MADLIB/
  • 37. Functions (April 2016). Linear Systems: • Sparse and Dense Solvers • Linear Algebra. Matrix Factorization: • Singular Value Decomposition (SVD) • Low Rank. Generalized Linear Models: • Linear Regression • Logistic Regression • Multinomial Logistic Regression • Ordinal Regression • Cox Proportional Hazards Regression • Elastic Net Regularization • Robust Variance (Huber-White), Clustered Variance, Marginal Effects. Other Machine Learning Algorithms: • Principal Component Analysis (PCA) • Association Rules (Apriori) • Topic Modeling (Parallel LDA) • Decision Trees • Random Forest • Conditional Random Field (CRF) • Clustering (K-means) • Cross Validation • Naïve Bayes • Support Vector Machines (SVM). Descriptive Statistics: Sketch-Based Estimators • CountMin (Cormode-Muth.) • FM (Flajolet-Martin) • MFV (Most Frequent Values); Correlation and Covariance; Summary. Utility Modules: Array and Matrix Operations; Sparse Vectors; Random Sampling; Probability Functions; Data Preparation; PMML Export; Conjugate Gradient; Stemming. Inferential Statistics: Hypothesis Tests. Time Series: • ARIMA. Path Functions: • Operations on Pattern Matches.
  • 38. GPDB Geospatial. Current key features: • Points, Lines, Polygons, Perimeter, Area, Intersection, Contains, Distance, Long/Lat, Spatial Indexes & Bounding Boxes • Round earth calculations • Ability to store geospatial data and query it with joins and operators • Raster image processing
  • 39. Pivotal HDB: Hadoop Native SQL Database
  • 41. Making the Hadoop Data Lake More Consumable. Enabling data science and machine learning at scale. 1) Important people and tools are cut off because of SQL completeness or performance. 2) Data scientists still have to resort to sampling if they can't run analytics in-database at scale. 3) There are multiple data sets and formats within Hadoop. [Diagram: business analysts and data scientists, with SQL and apps, facing data lakes (Hive, HBase, etc.).]
  • 42. Making the Hadoop Data Lake More Consumable. As the lingua franca of analytics, SQL can't be ignored. Neither can performance. 1) Important people and tools are cut off because of SQL completeness or performance. 2) Data scientists still have to resort to sampling if they can't run analytics in-database at scale. 3) There are multiple data sets and formats within Hadoop. [Diagram: business analysts and data scientists, with SQL and apps, facing data lakes (Hive, HBase, etc.).]
  • 43. Hadoop data lakes sit underutilized. Lack of interactive, ANSI SQL capabilities inhibits adoption and value: producing complex queries, large joins, interactive queries; existing investments in visualization and BI tools; large population of users with SQL skills. [Diagram: data scientists and business analysts, with SQL and apps, facing the data lake.]
  • 44. HDB: The Hadoop Native SQL Database. High performance, interactive SQL queries on Hadoop. ● Highly efficient MPP (massively parallel processing) ● Low-latency ● Petabyte scalability ● ACID transaction support ● SQL-92, 99, 2003 compatibility ● Advanced cost-based optimizer. [Diagram: business analysts and data scientists, with SQL and apps, querying the data lake.]
  • 45. Making the Hadoop Data Lake More Consumable. Integrate SQL and data science tools into an interactive, operationalized environment. 1) Important people and tools are cut off because of SQL completeness or performance. 2) Data scientists still have to resort to sampling if they can't run analytics in-database at scale. 3) There are multiple data sets and formats within Hadoop. [Diagram: business analysts and data scientists, with SQL and apps, facing data lakes (Hive, HBase, etc.).]
  • 46. Predictive analytics not scaling with Python or R. Using traditional, single-node Python or R for analytics means using subsets because of the lack of parallelization. Implications: • Time-consuming data movement • Working with small sample sizes requires extra testing cycles against larger data sets • Slow feature generation limits algorithm development. [Diagram: data lakes reduced to SAMPLE 1, SAMPLE 2, … SAMPLE n.]
  • 47. In-database analytics speeds predictive modeling. Apache™ MADlib® (incubating) is an open-source library for scalable in-database analytics. Scale-out mathematical, statistical and machine learning methods for structured and unstructured data. • SQL-based • Analyze without sampling • Open source • Runs on HDB, Greenplum, and Postgres • Complements support for procedural languages: PL/R, PL/Python, PL/Java. [Diagram: train a model, then predict for new data, directly on the data lake.]
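
One way to picture "analyze without sampling" is a model whose training reduces to sums that each segment can compute over its own rows. The sketch below fits a one-variable linear regression that way. It is an illustration of the general technique only, not MADlib's implementation; `transition`, `merge`, `final`, `train` and `predict` are hypothetical names, and the data is synthetic (y = 2x + 3).

```python
EMPTY = (0, 0.0, 0.0, 0.0, 0.0)  # n, sum(x), sum(y), sum(x*x), sum(x*y)


def transition(state, x, y):
    """Fold one row into a segment-local state."""
    n, sx, sy, sxx, sxy = state
    return (n + 1, sx + x, sy + y, sxx + x * x, sxy + x * y)


def merge(a, b):
    """Combine the states of two segments; order does not matter."""
    return tuple(p + q for p, q in zip(a, b))


def final(state):
    """Turn the merged sums into least-squares slope and intercept."""
    n, sx, sy, sxx, sxy = state
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return slope, intercept


def train(segments):
    merged = EMPTY
    for rows in segments:  # in a real cluster this loop runs in parallel
        state = EMPTY
        for x, y in rows:
            state = transition(state, x, y)
        merged = merge(merged, state)
    return final(merged)


# Full data set spread over three segments; no sampling involved.
segments = [[(1, 5), (2, 7)], [(3, 9)], [(4, 11), (5, 13)]]
slope, intercept = train(segments)


def predict(x):
    return slope * x + intercept
```

Because only five numbers per segment are exchanged, the cost of combining results does not grow with the number of rows, so the whole data set can be used for training.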
  • 48. Making the Hadoop Data Lake More Consumable. Overcome complexity. 1) Important people and tools are cut off because of SQL completeness or performance. 2) Data scientists still have to resort to sampling if they can't run analytics in-database at scale. 3) There are multiple data sets and formats within Hadoop. [Diagram: business analysts and data scientists, with SQL and apps, facing data lakes (Hive, HBase, etc.).]
  • 49. Simplifying the data lake with data federation. HDB’s Pivotal eXtension Framework (PXF) and HCatalog integration. • Enables connectivity between Pivotal HDB and other stores (Hive, HBase, HDFS files) • Provides an extensible framework to add support for custom services • Low latency on large data sets • Considers cost model of federated sources. [Diagram labels: Schema, Read, HCatalog, HDFS data lake, CSV, TXT, Avro, Custom Extensions.]
  • 50. HDB as part of an architecture: Next Likely Purchase. Providing information in context with the right architecture and the right algorithms. [Diagram labels: CUSTOMER APP (PURCHASE, NEXT OFFER); INTERNAL APP (REAL-TIME VIEW OF TRANSACTIONS AND OFFERS); REPORTS.]
  • 51. HDB as part of an architecture: Next Likely Purchase. Providing information in context with the right architecture and the right algorithms. 1. Ingest, transform, and land data into HDFS. 2. Score streaming data and serve to application. [Diagram labels: CUSTOMER APP (PURCHASE, NEXT OFFER); INTERNAL APP (REAL-TIME VIEW OF TRANSACTIONS AND OFFERS); TRANSACTIONS; PMML; Model Creation & Training; HDB Tables; HDFS Staging; DATA SCIENCE & AD HOC QUERIES; REPORTS.]
  • 52. Pivotal HDB Advantages. Advanced Analytics Performance: exceptional MPP performance, low latency, petabyte scalability, ACID reliability, fault tolerance. Most Complete Language Compliance: higher degree of SQL compatibility, SQL-92, 99, 2003, OLAP, leverage existing SQL skills. Advanced Query Optimizer: maximize performance and do advanced queries with confidence. Elastic Architecture for Scalability: scale-up/down or scale-in/out, expand/shrink clusters on the fly. Integrated w/MADlib Machine Learning: advanced MPP analytics, data science at scale, directly on Hadoop data.
  • 53. Real-time and Personalised Information in Context is what Wins! “Companies need to learn how to catch people or things in the act of doing something and affect the outcome” PAUL MARITZ, Executive Chairman, Pivotal