Slide 1: Modern Data Architecture
Alexey Grishchenko
Pivotal Confidential–Internal Use Only
Slide 2: About me
Enterprise Architect @ Pivotal
• 7 years in data processing
• 5 years with MPP
• 4 years with Hadoop
• Spark contributor
• http://0x0fff.com
Slides 3-11: How it started…
Diagram, built up one element per slide: Front End, Back End, DBMS.
"What about BI?" "Just put it there!" BI is added to the diagram.
"Was it fast?" Latencies shown: 10ms, 100ms, 200ms, 1-2 min.
"yes, single server…"
Slides 12-19: First Issues
Starting point: Front End, Back End, DBMS and BI; latencies 10ms, 100ms, 200ms, 1-2 min.
"More users got workstations": latencies become 10ms, 400ms, 800ms, 1-2 min.
"Split!": 10ms, 300ms, 600ms, 1-2 min.
"Even more users?" "Split!": three Front End / Back End pairs; 10ms, 100ms, 400ms, 1-2 min.
"What about automated systems?"
Slides 20-25: First Issues
Five Front End / Back End pairs on one DBMS, plus BI; latencies 10ms, 100ms, 1 sec, 5-10 min.
"Database, please, live!"
Latencies then become 10ms, 100ms, 800ms, 15-20 min.
"What if “split” didn’t help this time?" "Split more! Eventually it will help…"
Slides 26-31: First Issues
The DBMS is split as well: five Front End / Back End pairs, five DBMS instances and BI; latencies 10ms, 100ms, 300ms, 35-40 min.
"Sales went 10% up!" "Sales went 20% down!"
Latencies then become 10ms, 100ms, 600ms, 2-3 hrs.
"Stop loading my system with your stupid reports!"
Slides 32-43: The Era of Data Warehouse
Five FE / BE / DBMS stacks feed a DWH through ETL (labelled 1 day), with BI on top. Latencies shown: 100ms, 300ms, 2 days.
"We need more reports!" Data Mining and OLAP… join BI; 2 days becomes 3-4 days.
"We need secondary site!" A second set of FE / BE / DBMS stacks is added, linked by WAL Replication, 3-5 minutes late.
"Where is our DWH? We need this data now!" A second DWH with BI, Data Mining, OLAP… is added (5-7 days), linked through NAS by Backup / Restore, 3 days late.
"Why is this data so old?"
Slides 44-45: Advanced Architecture – ELT
Overview unchanged: latencies 100ms, 300ms, 3-4 days; DWH 1 day; WAL Replication, 3-5 minutes late; Backup / Restore through NAS, 3 days late; secondary DWH 5-7 days.
ETL panel: DBMS DBMS DBMS…, ETL, DDS, Data Marts, Reports, Aggregates, OLAP
ELT panel: DBMS DBMS DBMS…, ODS ODS ODS…, ELT, DDS, Data Marts, Reports, Aggregates, OLAP
On slide 45 the ETL labels in the overview are replaced by ELT.
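The difference between the ETL and ELT panels on slide 44 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not something taken from the talk: an in-memory SQLite database stands in for the DWH, and the table names (`ods_sales`, `dds_sales_etl`, `dds_sales_elt`) and sample rows are hypothetical.

```python
import sqlite3

# Hypothetical rows standing in for an extract from the OLTP databases.
SOURCE_ROWS = [
    ("2015-01-01", "eu", 120.0),
    ("2015-01-01", "us", 80.0),
    ("2015-01-02", "eu", 50.0),
]


def etl(conn, rows):
    """ETL: aggregate outside the warehouse, then load only the result."""
    totals = {}
    for day, _region, amount in rows:
        totals[day] = totals.get(day, 0.0) + amount
    conn.execute("CREATE TABLE dds_sales_etl (day TEXT PRIMARY KEY, total REAL)")
    conn.executemany("INSERT INTO dds_sales_etl VALUES (?, ?)", sorted(totals.items()))


def elt(conn, rows):
    """ELT: load raw rows into an ODS table, then transform with SQL in the warehouse."""
    conn.execute("CREATE TABLE ods_sales (day TEXT, region TEXT, amount REAL)")
    conn.executemany("INSERT INTO ods_sales VALUES (?, ?, ?)", rows)
    conn.execute(
        "CREATE TABLE dds_sales_elt AS "
        "SELECT day, SUM(amount) AS total FROM ods_sales GROUP BY day"
    )


def run_both():
    """Run both flows against one in-memory database and return their results."""
    conn = sqlite3.connect(":memory:")
    etl(conn, SOURCE_ROWS)
    elt(conn, SOURCE_ROWS)
    etl_result = conn.execute("SELECT day, total FROM dds_sales_etl ORDER BY day").fetchall()
    elt_result = conn.execute("SELECT day, total FROM dds_sales_elt ORDER BY day").fetchall()
    conn.close()
    return etl_result, elt_result
```

Both flows end with the same aggregate. The practical difference is where the raw data lives and where the transformation runs: in the ELT variant the untransformed rows stay available in the ODS layer and the warehouse engine does the work.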
Slides 46-48: Advanced Architecture – CDC
Two panels, labelled 1 day and 1 hour: DBMS DBMS DBMS…, ODS ODS ODS…, ELT, DDS, Data Marts, Reports, Aggregates, OLAP, once without and once with CDC.
Overview with ELT and CDC: latencies 100ms, 300ms, 1-4 days; DWH 3-24 hrs; WAL Replication, 3-5 minutes late; Backup / Restore through NAS, 3 days late; secondary DWH with BI, Data Mining, OLAP… 4-7 days.
"Why is our secondary site’s DWH so old?"
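The idea behind CDC (change data capture) on these slides is to ship only the changes recorded by the source database instead of re-extracting whole tables. A minimal sketch in Python, assuming a hypothetical change-log format: the `Change` fields (`lsn`, `op`, `key`, `row`) and the `Replica` class are illustrative names, not the API of any particular CDC tool.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Change:
    lsn: int                    # position of the change in the source log (hypothetical)
    op: str                     # "insert", "update" or "delete"
    key: int                    # primary key of the affected row
    row: Optional[dict] = None  # new row image; None for deletes


@dataclass
class Replica:
    rows: dict = field(default_factory=dict)
    applied_lsn: int = 0        # highest log position already applied

    def apply(self, changes):
        """Apply changes newer than applied_lsn; return how many were applied."""
        applied = 0
        for change in sorted(changes, key=lambda c: c.lsn):
            if change.lsn <= self.applied_lsn:
                continue        # already applied in an earlier run
            if change.op == "delete":
                self.rows.pop(change.key, None)
            else:
                self.rows[change.key] = dict(change.row)
            self.applied_lsn = change.lsn
            applied += 1
        return applied
```

Because the replica remembers the last applied position, each run only processes the new tail of the log, which is what makes short load intervals feasible.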
Slides 49-53: Moving Forward
Overview as on slide 47.
Our problems are
• Time to action takes up to 7 days
• Amount of data is growing
• DWH MPP storage is expensive
Slides 54-55: Modern Architectures
Same overview and problem list; labels added: Data Lake, Lambda.
Slides 56-58: Modern Architectures – Data Lake
Panel: DBMS DBMS DBMS…, CDC, ODS ODS ODS…, ELT, DDS, OLAP, Data Marts, Aggregates, Reports in the DWH; Hadoop with ODS, UDS and Analytical Archives; BI, Data Mining, OLAP; SQL-on-Hadoop; Data Mining At Scale.
Overview on slide 58: three FE / BE / DBMS stacks per site; latencies 100ms, 300ms, 1-4 days; DWH 3-24 hrs; WAL Replication, 3-5 minutes late; Backup / Restore through NAS, 2 days late; secondary site 3-6 days; Hadoop is added on both sites, with an open "?".
Slides 59-61: Modern Architectures – Lambda
Panel: Source Data; Speed Layer and Batch Layer; Master Dataset; Batch Views; Real-time Views; Serving Layer; Query.
Overview on slide 61, with an In-Memory Data Store added on both sites: latencies 100ms, 300ms, 0-4 days; DWH 0-24 hrs; OLAP, Data Mining, BI…; RTDM, BI, Data Mining; WAL Replication, 3-5 minutes late; Backup / Restore through NAS, 2 days late; secondary site 3-6 days; the "?" next to Hadoop remains.
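The Lambda panel (Master Dataset, Batch Views, Real-time Views, Query) can be sketched as a small Python class. This is an illustrative toy under simplifying assumptions, not the design from the talk: events are `(timestamp, key)` pairs, the views are plain counters, and the class and method names are hypothetical.

```python
from collections import Counter


class LambdaCounts:
    """Counts events per key using a batch view plus a real-time view."""

    def __init__(self):
        self.master = []                # append-only master dataset of (ts, key)
        self.batch_view = Counter()     # recomputed from scratch by the batch layer
        self.realtime_view = Counter()  # updated incrementally by the speed layer
        self.batch_cutoff = 0           # events with ts <= cutoff are in the batch view

    def ingest(self, ts, key):
        """New data goes to the master dataset and to the speed layer."""
        self.master.append((ts, key))
        self.realtime_view[key] += 1

    def run_batch(self, cutoff):
        """Rebuild the batch view up to cutoff; keep only newer events in real time."""
        self.batch_view = Counter(key for ts, key in self.master if ts <= cutoff)
        self.realtime_view = Counter(key for ts, key in self.master if ts > cutoff)
        self.batch_cutoff = cutoff

    def query(self, key):
        """Serving layer: merge the batch view and the real-time view."""
        return self.batch_view[key] + self.realtime_view[key]
```

The point of the split is that queries see fresh data through the real-time view while the batch layer periodically recomputes accurate views from the full master dataset.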
Slides 62-67: Modern Architectures
Overview as on slide 61.
Our problems are
• Too many standby systems
• How to replicate Hadoop cluster?
• How to sync data in real-time systems?
• How to better sync DWH?
Label added on slide 67: Pipelining.
Slides 68-86: Modern Data Architecture – Pipelining
Overview as on slide 61, next to a detailed data-flow diagram built up step by step:
• FE: App, App, App, …; HTTP
• BE: Srv, Srv, Srv, …; SOAP
• OLTP: SP, JDBC, Table, Log
• CDC: copy, Parse, Batch
• ETL: cp, Batch, ETL, load
• DWH: ODS, DDS, DataMart
• BI: JDBC
From slide 79 the SOAP label no longer appears. From slide 80 the ETL / cp labels are replaced by API and Queue, followed by ETL, Batch and load labels; slide 82 adds App.
Slide 83 adds STG, Batch, App and Hadoop with HDFS and SQL On Hadoop; slide 84 adds RTI and App; slide 85 adds Replicate.
Slide 86 returns to the overview alone.
Slide 87: Modern Data Architecture – Pipelining
Resulting overview: FE / BE / DBMS stacks with ELT and CDC; DWH; Hadoop; In-Memory Data Store; OLAP, Data Mining, RTBI…, BI on both sites.
Links between the sites: Replication Queue, 3-5 minutes late; WAL Replication between the DBMS instances, 3-5 minutes late.
The NAS Backup / Restore path, the day-scale latency labels and the "?" no longer appear.
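One way to read the Replication Queue on slide 87 is as a shared, ordered stream of events that each downstream system consumes at its own pace. A minimal Python sketch of that idea, assuming an in-process list in place of a real message queue; `EventLog`, `publish` and `poll` are hypothetical names, not the API of any product mentioned in the deck.

```python
class EventLog:
    """Ordered event stream with an independent read offset per consumer."""

    def __init__(self):
        self._events = []   # all published events, in order
        self._offsets = {}  # consumer name -> index of the next event to read

    def publish(self, event):
        """Append an event for every consumer to pick up later."""
        self._events.append(event)

    def poll(self, consumer, max_events=100):
        """Return the next batch for this consumer and advance its offset."""
        start = self._offsets.get(consumer, 0)
        batch = self._events[start:start + max_events]
        self._offsets[consumer] = start + len(batch)
        return batch
```

With this shape the DWH, Hadoop and the in-memory store could each read the same events without coordinating with one another, and a slow consumer only delays itself.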
Slides 88-98: Pivotal and Modern Data Architecture
The pipelining diagram relabelled with products. Labels: BI; Pivotal Cloud Foundry (HTTP, FE and BE Apps, Queue); Pivotal GemFire (App); Spring XD (Streaming); Pivotal HD with Pivotal HAWQ (Streaming Data, Data Mart); Pivotal Greenplum (ODS, DDS, DataMart); PostgreSQL (SP, Table); ES; ETL.
Slide 89:
• Pivotal Labs – agile software development for next-generation applications
• Pivotal Cloud Foundry – PaaS for customer applications
• RabbitMQ – distributed message queue service on top of PCF
• Spring IO – foundation platform for modern applications
Slide 90:
• Pivotal GemFire and Apache Geode (incubating) – in-memory data grid enabling real-time data processing and real-time decision making for enterprises
Slide 91:
• Spring XD – unified, distributed and extensible framework for data pipelining: ingesting, batching, processing and exporting
Slide 92:
• Pivotal HD – leading Hadoop distribution based on ODP
• Pivotal HAWQ and Apache HAWQ (incubating) – bringing the power of MPP to the Hadoop cluster, best in class SQL-on-Hadoop solution
• Apache Spark – component of the Pivotal HD distribution, modern framework for distributed data processing
Slide 93:
• Pivotal PostgreSQL – open source distribution of PostgreSQL, commercially supported by Pivotal
Slide 94:
• Pivotal Greenplum – leading analytical MPP database, foundation for the enterprise data warehousing systems and advanced analytics
Slides 95-97 show the same diagram labelled Data Lake, Lambda Architecture and Pipelining respectively; slide 98 repeats the full diagram.
Questions?
BUILT FOR THE SPEED OF BUSINESS

Modern Data Architecture

  • 1. 1Pivotal Confidential–Internal Use Only 1Pivotal Confidential–Internal Use Only Modern Data Architecture Alexey Grishchenko
  • 2. 2Pivotal Confidential–Internal Use Only About me Enterprise Architect @ Pivotal  7 years in data processing  5 years with MPP  4 years with Hadoop  Spark contributor  http://0x0fff.com
  • 3. 3Pivotal Confidential–Internal Use Only How it started… Front End
  • 4. 4Pivotal Confidential–Internal Use Only How it started… Front End Back End
  • 5. 5Pivotal Confidential–Internal Use Only How it started… Front End Back End DBMS
  • 6. 6Pivotal Confidential–Internal Use Only How it started… Front End Back End DBMS What about BI?
  • 7. 7Pivotal Confidential–Internal Use Only How it started… Front End Back End DBMS Just put it there!
  • 8. 8Pivotal Confidential–Internal Use Only How it started… Front End Back End DBMS BI
  • 9. 9Pivotal Confidential–Internal Use Only How it started… Front End Back End DBMS BI Was it fast?
  • 10. 10Pivotal Confidential–Internal Use Only How it started… Front End 10ms Back End DBMS BI 100ms 200ms 1-2 min
  • 11. 11Pivotal Confidential–Internal Use Only How it started… Front End 10ms Back End DBMS BI 100ms 200ms 1-2 min yes, single server…
  • 12. 12Pivotal Confidential–Internal Use Only First Issues Front End 10ms Back End DBMS BI 100ms 200ms 1-2 min More users got workstations
  • 13. 13Pivotal Confidential–Internal Use Only First Issues Front End 10ms Back End DBMS BI 400ms 800ms 1-2 min
  • 14. 14Pivotal Confidential–Internal Use Only First Issues Front End 10ms Back End DBMS BI 400ms 800ms 1-2 min Split!
  • 15. 15Pivotal Confidential–Internal Use Only First Issues Front End 10ms Back End DBMS BI 300ms 600ms 1-2 min
  • 16. 16Pivotal Confidential–Internal Use Only First Issues Front End 10ms Back End DBMS BI 300ms 600ms 1-2 min Even more users?
  • 17. 17Pivotal Confidential–Internal Use Only First Issues Front End 10ms Back End DBMS BI 300ms 600ms 1-2 min Split!
  • 18. 18Pivotal Confidential–Internal Use Only First Issues Front End 10ms Back End DBMS BI 100ms 400ms 1-2 min Front End Back End Front End Back End
  • 19. 19Pivotal Confidential–Internal Use Only First Issues Front End 10ms Back End DBMS BI 100ms 400ms 1-2 min Front End Back End Front End Back End What about automated systems?
  • 20. 20Pivotal Confidential–Internal Use Only First Issues Front End 10ms Back End DBMS BI 100ms 1 sec 5-10 min Front End Back End Front End Back End Front End Back End Front End Back End
  • 21. 21Pivotal Confidential–Internal Use Only First Issues Front End 10ms Back End DBMS BI 100ms 1 sec 5-10 min Front End Back End Front End Back End Front End Back End Front End Back End Database, please, live!
  • 22. 22Pivotal Confidential–Internal Use Only First Issues Front End 10ms Back End DBMS BI 100ms 1 sec 5-10 min Front End Back End Front End Back End Front End Back End Front End Back End
  • 23. 23Pivotal Confidential–Internal Use Only First Issues Front End 10ms Back End DBMS BI 100ms 800ms 15-20 min Front End Back End Front End Back End Front End Back End Front End Back End
  • 24. 24Pivotal Confidential–Internal Use Only First Issues Front End 10ms Back End DBMS BI 100ms 800ms 15-20 min Front End Back End Front End Back End Front End Back End Front End Back End What if “split” didn’t help this time?
  • 25. 25Pivotal Confidential–Internal Use Only First Issues Front End 10ms Back End DBMS BI 100ms 800ms 15-20 min Front End Back End Front End Back End Front End Back End Front End Back End Split more! Eventually it will help…
  • 26. First Issues. Front End 10ms Back End DBMS BI 100ms 300ms 35-40 min Front End Back End Front End Back End Front End Back End Front End Back End DBMS DBMS DBMS DBMS
  • 27. First Issues. Front End 10ms Back End DBMS BI 100ms 300ms 35-40 min Front End Back End Front End Back End Front End Back End Front End Back End DBMS DBMS DBMS DBMS
  • 28. First Issues. Front End 10ms Back End DBMS BI 100ms 300ms 35-40 min Front End Back End Front End Back End Front End Back End Front End Back End DBMS DBMS DBMS DBMS. Sales went 10% up!
  • 29. First Issues. Front End 10ms Back End DBMS BI 100ms 300ms 35-40 min Front End Back End Front End Back End Front End Back End Front End Back End DBMS DBMS DBMS DBMS. Sales went 10% up! Sales went 20% down!
  • 30. First Issues. Front End 10ms Back End DBMS BI 100ms 600ms 2-3 hrs Front End Back End Front End Back End Front End Back End Front End Back End DBMS DBMS DBMS DBMS. Sales went 10% up! Sales went 20% down!
  • 31. First Issues. Front End 10ms Back End DBMS BI 100ms 600ms 2-3 hrs Front End Back End Front End Back End Front End Back End Front End Back End DBMS DBMS DBMS DBMS. Sales went 10% up! Sales went 20% down! Stop loading my system with your stupid reports!
  • 32. The Era of Data Warehouse. BI 100ms DBMS 300ms 2 days FE BE DBMS DBMS DBMS DBMS FE BE FE BE FE BE FE BE ETL DWH 1 day
  • 33. The Era of Data Warehouse. BI 100ms DBMS 300ms 2 days FE BE DBMS DBMS DBMS DBMS FE BE FE BE FE BE FE BE ETL DWH 1 day. We need more reports!
  • 34. The Era of Data Warehouse. BI 100ms DBMS 300ms 3-4 days FE BE DBMS DBMS DBMS DBMS FE BE FE BE FE BE FE BE ETL DWH 1 day Data Mining OLAP…
  • 35. The Era of Data Warehouse. BI 100ms DBMS 300ms 3-4 days FE BE DBMS DBMS DBMS DBMS FE BE FE BE FE BE FE BE ETL DWH 1 day Data Mining OLAP… We need a secondary site!
  • 36. The Era of Data Warehouse. 100ms 300ms 3-4 days FE BE DBMS DBMS FE BE DBMS FE BE DBMS FE BE DBMS FE BE ETL DWH 1 day BI Data Mining OLAP…
  • 37. The Era of Data Warehouse. 100ms 300ms 3-4 days FE BE DBMS DBMS FE BE DBMS FE BE DBMS FE BE DBMS FE BE ETL DWH 1 day BI Data Mining OLAP… FE BE DBMS DBMS FE BE DBMS FE BE DBMS FE BE DBMS FE BE WAL Replication 3-5 minutes late
  • 38. The Era of Data Warehouse. 100ms 300ms 3-4 days FE BE DBMS DBMS FE BE DBMS FE BE DBMS FE BE DBMS FE BE ETL DWH 1 day BI Data Mining OLAP… FE BE DBMS DBMS FE BE DBMS FE BE DBMS FE BE DBMS FE BE WAL Replication 3-5 minutes late
  • 39. The Era of Data Warehouse. 100ms 300ms 3-4 days FE BE DBMS DBMS FE BE DBMS FE BE DBMS FE BE DBMS FE BE ETL DWH 1 day BI Data Mining OLAP… FE BE DBMS DBMS FE BE DBMS FE BE DBMS FE BE DBMS FE BE WAL Replication 3-5 minutes late. Where is our DWH? We need this data now!
  • 40. The Era of Data Warehouse. 100ms 300ms 3-4 days FE BE DBMS DBMS FE BE DBMS FE BE DBMS FE BE DBMS FE BE ETL DWH 1 day BI Data Mining OLAP… FE BE DBMS DBMS FE BE DBMS FE BE DBMS FE BE DBMS FE BE WAL Replication 3-5 minutes late
  • 41. The Era of Data Warehouse. ETL 100ms 300ms 3-4 days FE BE DBMS DBMS FE BE DBMS FE BE DBMS FE BE DBMS FE BE ETL DWH 1 day BI Data Mining OLAP… FE BE FE BE FE BE FE BE FE BE WAL Replication 3-5 minutes late NAS NAS Backup / Restore 3 days late DWH BI Data Mining OLAP… 5-7 days DBMS DBMS DBMS DBMS DBMS
  • 42. The Era of Data Warehouse. ETL 100ms 300ms 3-4 days FE BE DBMS DBMS FE BE DBMS FE BE DBMS FE BE DBMS FE BE ETL DWH 1 day BI Data Mining OLAP… FE BE FE BE FE BE FE BE FE BE WAL Replication 3-5 minutes late NAS NAS Backup / Restore 3 days late DWH BI Data Mining OLAP… 5-7 days DBMS DBMS DBMS DBMS DBMS. Why is this data so old?
  • 43. The Era of Data Warehouse. ETL 100ms 300ms 3-4 days FE BE DBMS DBMS FE BE DBMS FE BE DBMS FE BE DBMS FE BE ETL DWH 1 day BI Data Mining OLAP… FE BE FE BE FE BE FE BE FE BE WAL Replication 3-5 minutes late NAS NAS Backup / Restore 3 days late DWH BI Data Mining OLAP… 5-7 days DBMS DBMS DBMS DBMS DBMS
  • 44. Advanced Architecture – ELT. ETL 100ms 300ms 3-4 days FE BE DBMS DBMS FE BE DBMS FE BE DBMS FE BE DBMS FE BE ETL DWH 1 day BI Data Mining OLAP… FE BE FE BE FE BE FE BE FE BE WAL Replication 3-5 minutes late NAS NAS Backup / Restore 3 days late DWH BI Data Mining OLAP… 5-7 days DBMS DBMS DBMS DBMS DBMS DBMS DBMS DBMS… ETL DDS Data Marts Reports Aggregates OLAP DBMS DBMS DBMS… ELT DDS Data Marts Reports Aggregates OLAP ODS ODS ODS…
  • 45. Advanced Architecture – ELT. ELT 100ms 300ms 3-4 days FE BE DBMS DBMS FE BE DBMS FE BE DBMS FE BE DBMS FE BE ELT DWH 1 day BI Data Mining OLAP… FE BE FE BE FE BE FE BE FE BE WAL Replication 3-5 minutes late NAS NAS Backup / Restore 3 days late DWH BI Data Mining OLAP… 5-7 days DBMS DBMS DBMS DBMS DBMS
  • 46. Advanced Architecture – CDC. ELT 100ms 300ms 3-4 days FE BE DBMS DBMS FE BE DBMS FE BE DBMS FE BE DBMS FE BE ELT DWH 1 day BI Data Mining OLAP… FE BE FE BE FE BE FE BE FE BE WAL Replication 3-5 minutes late NAS NAS Backup / Restore 3 days late DWH BI Data Mining OLAP… 5-7 days DBMS DBMS DBMS DBMS DBMS DBMS DBMS DBMS… ELT DDS Data Marts Reports Aggregates OLAP ODS ODS ODS… DBMS DBMS DBMS… ELT DDS Data Marts Reports Aggregates OLAP ODS ODS ODS… CDC 1 day 1 hour
  • 47. Advanced Architecture – CDC. ELT CDC 100ms 300ms 1-4 days FE BE DBMS DBMS FE BE DBMS FE BE DBMS FE BE DBMS FE BE ELT DWH 3-24 hrs BI Data Mining OLAP… FE BE FE BE FE BE FE BE FE BE WAL Replication 3-5 minutes late NAS NAS Backup / Restore 3 days late BI Data Mining OLAP… 4-7 days DBMS DBMS DBMS DBMS DBMS CDC DWH
  • 48. Advanced Architecture – CDC. ELT CDC 100ms 300ms 1-4 days FE BE DBMS DBMS FE BE DBMS FE BE DBMS FE BE DBMS FE BE ELT DWH 3-24 hrs BI Data Mining OLAP… FE BE FE BE FE BE FE BE FE BE WAL Replication 3-5 minutes late NAS NAS Backup / Restore 3 days late BI Data Mining OLAP… 4-7 days DBMS DBMS DBMS DBMS DBMS CDC DWH. Why is our secondary site’s DWH so old?
  • 49. Moving Forward. ELT CDC 100ms 300ms 1-4 days FE BE DBMS DBMS FE BE DBMS FE BE DBMS FE BE DBMS FE BE ELT DWH 3-24 hrs BI Data Mining OLAP… FE BE FE BE FE BE FE BE FE BE WAL Replication 3-5 minutes late NAS NAS Backup / Restore 3 days late BI Data Mining OLAP… 4-7 days DBMS DBMS DBMS DBMS DBMS CDC DWH
  • 50. Moving Forward. ELT CDC 100ms 300ms 1-4 days FE BE DBMS DBMS FE BE DBMS FE BE DBMS FE BE DBMS FE BE ELT DWH 3-24 hrs BI Data Mining OLAP… FE BE FE BE FE BE FE BE FE BE WAL Replication 3-5 minutes late NAS NAS Backup / Restore 3 days late BI Data Mining OLAP… 4-7 days DBMS DBMS DBMS DBMS DBMS CDC DWH. Our problems are:
  • 51. Moving Forward. ELT CDC 100ms 300ms 1-4 days FE BE DBMS DBMS FE BE DBMS FE BE DBMS FE BE DBMS FE BE ELT DWH 3-24 hrs BI Data Mining OLAP… FE BE FE BE FE BE FE BE FE BE WAL Replication 3-5 minutes late NAS NAS Backup / Restore 3 days late BI Data Mining OLAP… 4-7 days DBMS DBMS DBMS DBMS DBMS CDC DWH. Our problems are: Time to action takes up to 7 days.
  • 52. Moving Forward. ELT CDC 100ms 300ms 1-4 days FE BE DBMS DBMS FE BE DBMS FE BE DBMS FE BE DBMS FE BE ELT DWH 3-24 hrs BI Data Mining OLAP… FE BE FE BE FE BE FE BE FE BE WAL Replication 3-5 minutes late NAS NAS Backup / Restore 3 days late BI Data Mining OLAP… 4-7 days DBMS DBMS DBMS DBMS DBMS CDC DWH. Our problems are: Time to action takes up to 7 days. Amount of data is growing.
  • 53. Moving Forward. ELT CDC 100ms 300ms 1-4 days FE BE DBMS DBMS FE BE DBMS FE BE DBMS FE BE DBMS FE BE ELT DWH 3-24 hrs BI Data Mining OLAP… FE BE FE BE FE BE FE BE FE BE WAL Replication 3-5 minutes late NAS NAS Backup / Restore 3 days late BI Data Mining OLAP… 4-7 days DBMS DBMS DBMS DBMS DBMS CDC DWH. Our problems are: Time to action takes up to 7 days. Amount of data is growing. DWH MPP storage is expensive.
  • 54. Modern Architectures. ELT CDC 100ms 300ms 1-4 days FE BE DBMS DBMS FE BE DBMS FE BE DBMS FE BE DBMS FE BE ELT DWH 3-24 hrs BI Data Mining OLAP… FE BE FE BE FE BE FE BE FE BE WAL Replication 3-5 minutes late NAS NAS Backup / Restore 3 days late BI Data Mining OLAP… 4-7 days DBMS DBMS DBMS DBMS DBMS CDC DWH. Our problems are: Time to action takes up to 7 days. Amount of data is growing. DWH MPP storage is expensive. Data Lake
  • 55. Modern Architectures. ELT CDC 100ms 300ms 1-4 days FE BE DBMS DBMS FE BE DBMS FE BE DBMS FE BE DBMS FE BE ELT DWH 3-24 hrs BI Data Mining OLAP… FE BE FE BE FE BE FE BE FE BE WAL Replication 3-5 minutes late NAS NAS Backup / Restore 3 days late BI Data Mining OLAP… 4-7 days DBMS DBMS DBMS DBMS DBMS CDC DWH. Our problems are: Time to action takes up to 7 days. Amount of data is growing. DWH MPP storage is expensive. Lambda; Data Lake
  • 56. Modern Architectures – Data Lake. ELT CDC 100ms 300ms 1-4 days FE BE DBMS DBMS FE BE DBMS FE BE DBMS FE BE DBMS FE BE ELT DWH 3-24 hrs BI Data Mining OLAP… FE BE FE BE FE BE FE BE FE BE WAL Replication 3-5 minutes late NAS NAS Backup / Restore 3 days late BI Data Mining OLAP… 4-7 days DBMS DBMS DBMS DBMS DBMS CDC DWH Hadoop DBMS DBMS DBMS… ELT DDS OLAP Data Marts Aggregates Reports ODS ODS ODS… CDC DWH ODS UDS Analytical Archives BI Data Mining OLAP SQL-on-Hadoop Data Mining At Scale
  • 57. Modern Architectures – Data Lake. ELT CDC 100ms 300ms 1-4 days FE BE DBMS DBMS FE BE DBMS FE BE DBMS FE BE DBMS FE BE ELT DWH 3-24 hrs BI Data Mining OLAP… FE BE FE BE FE BE FE BE FE BE WAL Replication 3-5 minutes late NAS NAS Backup / Restore 3 days late BI Data Mining OLAP… 4-7 days DBMS DBMS DBMS DBMS DBMS CDC DWH
  • 58. Modern Architectures – Data Lake. ELT CDC 100ms 300ms 1-4 days FE BE DBMS DBMS FE BE DBMS FE BE ELT DWH 3-24 hrs OLAP Data Mining BI… FE BE FE BE FE BE NAS NAS Backup / Restore 2 days late Data Mining BI OLAP… 3-6 days DBMS DBMS DBMS WAL Replication 3-5 minutes late CDC DWH Hadoop Hadoop ?
  • 59. Modern Architectures – Lambda. ELT CDC 100ms 300ms 1-4 days FE BE DBMS DBMS FE BE DBMS FE BE ELT DWH 3-24 hrs OLAP Data Mining BI… FE BE FE BE FE BE NAS NAS Backup / Restore 2 days late Data Mining BI OLAP… 3-6 days DBMS DBMS DBMS WAL Replication 3-5 minutes late CDC DWH Hadoop Hadoop ? Source Data Speed Layer Batch Layer Serving Layer Query Query Master Dataset Batch View Batch View Batch View Real-time View Real-time View Real-time View
  • 60. Modern Architectures – Lambda. ELT CDC 100ms 300ms 1-4 days FE BE DBMS DBMS FE BE DBMS FE BE ELT DWH 3-24 hrs OLAP Data Mining BI… FE BE FE BE FE BE NAS NAS Backup / Restore 2 days late Data Mining BI OLAP… 3-6 days DBMS DBMS DBMS WAL Replication 3-5 minutes late CDC DWH Hadoop Hadoop ?
  • 61. Modern Architectures – Lambda. In-Memory Data Store ELT CDC 100ms 300ms 0-4 days FE BE DBMS DBMS FE BE DBMS FE BE ELT DWH 0-24 hrs OLAP Data Mining BI… FE BE FE BE FE BE NAS NAS Backup / Restore 2 days late OLAP… 3-6 days DBMS DBMS DBMS WAL Replication 3-5 minutes late CDC DWH Hadoop Hadoop ? In-Memory Data Store RTDM BI Data Mining
  • 62. Modern Architectures. In-Memory Data Store ELT CDC 100ms 300ms 0-4 days FE BE DBMS DBMS FE BE DBMS FE BE ELT DWH 0-24 hrs OLAP Data Mining BI… FE BE FE BE FE BE NAS NAS Backup / Restore 2 days late OLAP… 3-6 days DBMS DBMS DBMS WAL Replication 3-5 minutes late CDC DWH Hadoop Hadoop ? In-Memory Data Store RTDM BI Data Mining. Our problems are:
  • 63. Modern Architectures. In-Memory Data Store ELT CDC 100ms 300ms 0-4 days FE BE DBMS DBMS FE BE DBMS FE BE ELT DWH 0-24 hrs OLAP Data Mining BI… FE BE FE BE FE BE NAS NAS Backup / Restore 2 days late OLAP… 3-6 days DBMS DBMS DBMS WAL Replication 3-5 minutes late CDC DWH Hadoop Hadoop ? In-Memory Data Store RTDM BI Data Mining. Our problems are: Too many standby systems.
  • 64. Modern Architectures. In-Memory Data Store ELT CDC 100ms 300ms 0-4 days FE BE DBMS DBMS FE BE DBMS FE BE ELT DWH 0-24 hrs OLAP Data Mining BI… FE BE FE BE FE BE NAS NAS Backup / Restore 2 days late OLAP… 3-6 days DBMS DBMS DBMS WAL Replication 3-5 minutes late CDC DWH Hadoop Hadoop ? In-Memory Data Store RTDM BI Data Mining. Our problems are: Too many standby systems. How to replicate a Hadoop cluster?
  • 65. Modern Architectures. In-Memory Data Store ELT CDC 100ms 300ms 0-4 days FE BE DBMS DBMS FE BE DBMS FE BE ELT DWH 0-24 hrs OLAP Data Mining BI… FE BE FE BE FE BE NAS NAS Backup / Restore 2 days late OLAP… 3-6 days DBMS DBMS DBMS WAL Replication 3-5 minutes late CDC DWH Hadoop Hadoop ? In-Memory Data Store RTDM BI Data Mining. Our problems are: Too many standby systems. How to replicate a Hadoop cluster? How to sync data in real-time systems?
  • 66. Modern Architectures. In-Memory Data Store ELT CDC 100ms 300ms 0-4 days FE BE DBMS DBMS FE BE DBMS FE BE ELT DWH 0-24 hrs OLAP Data Mining BI… FE BE FE BE FE BE NAS NAS Backup / Restore 2 days late OLAP… 3-6 days DBMS DBMS DBMS WAL Replication 3-5 minutes late CDC DWH Hadoop Hadoop ? In-Memory Data Store RTDM BI Data Mining. Our problems are: Too many standby systems. How to replicate a Hadoop cluster? How to sync data in real-time systems? How to better sync DWH?
  • 67. Modern Architectures. In-Memory Data Store ELT CDC 100ms 300ms 0-4 days FE BE DBMS DBMS FE BE DBMS FE BE ELT DWH 0-24 hrs OLAP Data Mining BI… FE BE FE BE FE BE NAS NAS Backup / Restore 2 days late OLAP… 3-6 days DBMS DBMS DBMS WAL Replication 3-5 minutes late CDC DWH Hadoop Hadoop ? In-Memory Data Store RTDM BI Data Mining. Our problems are: Too many standby systems. How to replicate a Hadoop cluster? How to sync data in real-time systems? How to better sync DWH? Pipelining
  • 68. Modern Data Architecture – Pipelining. In-Memory Data Store ELT CDC 100ms 300ms 0-4 days FE BE DBMS DBMS FE BE DBMS FE BE ELT DWH 0-24 hrs OLAP Data Mining BI… FE BE FE BE FE BE NAS NAS Backup / Restore 2 days late OLAP… 3-6 days DBMS DBMS DBMS WAL Replication 3-5 minutes late CDC DWH Hadoop Hadoop ? In-Memory Data Store RTDM BI Data Mining
  • 69. Modern Data Architecture – Pipelining. In-Memory Data Store ELT CDC 100ms 300ms 0-4 days FE BE DBMS DBMS FE BE DBMS FE BE ELT DWH 0-24 hrs OLAP Data Mining BI… FE BE FE BE FE BE NAS NAS Backup / Restore 2 days late OLAP… 3-6 days DBMS DBMS DBMS WAL Replication 3-5 minutes late CDC DWH Hadoop Hadoop ? In-Memory Data Store RTDM BI Data Mining FE App App App … HTTP
  • 70. Modern Data Architecture – Pipelining. In-Memory Data Store ELT CDC 100ms 300ms 0-4 days FE BE DBMS DBMS FE BE DBMS FE BE ELT DWH 0-24 hrs OLAP Data Mining BI… FE BE FE BE FE BE NAS NAS Backup / Restore 2 days late OLAP… 3-6 days DBMS DBMS DBMS WAL Replication 3-5 minutes late CDC DWH Hadoop Hadoop ? In-Memory Data Store RTDM BI Data Mining FE App App App … HTTP BE Srv Srv Srv … SOAP
  • 71. Modern Data Architecture – Pipelining. In-Memory Data Store ELT CDC 100ms 300ms 0-4 days FE BE DBMS DBMS FE BE DBMS FE BE ELT DWH 0-24 hrs OLAP Data Mining BI… FE BE FE BE FE BE NAS NAS Backup / Restore 2 days late OLAP… 3-6 days DBMS DBMS DBMS WAL Replication 3-5 minutes late CDC DWH Hadoop Hadoop ? In-Memory Data Store RTDM BI Data Mining FE App App App … HTTP BE Srv Srv Srv … SOAP OLTP SP JDBC Table
  • 72. Modern Data Architecture – Pipelining. In-Memory Data Store ELT CDC 100ms 300ms 0-4 days FE BE DBMS DBMS FE BE DBMS FE BE ELT DWH 0-24 hrs OLAP Data Mining BI… FE BE FE BE FE BE NAS NAS Backup / Restore 2 days late OLAP… 3-6 days DBMS DBMS DBMS WAL Replication 3-5 minutes late CDC DWH Hadoop Hadoop ? In-Memory Data Store RTDM BI Data Mining FE App App App … HTTP BE Srv Srv Srv … SOAP OLTP SP JDBC Log Table
  • 73. Modern Data Architecture – Pipelining. In-Memory Data Store ELT CDC 100ms 300ms 0-4 days FE BE DBMS DBMS FE BE DBMS FE BE ELT DWH 0-24 hrs OLAP Data Mining BI… FE BE FE BE FE BE NAS NAS Backup / Restore 2 days late OLAP… 3-6 days DBMS DBMS DBMS WAL Replication 3-5 minutes late CDC DWH Hadoop Hadoop ? In-Memory Data Store RTDM BI Data Mining FE App App App … HTTP BE Srv Srv Srv … SOAP OLTP SP JDBC Log Table CDC copy Parse Batch
  • 74. Modern Data Architecture – Pipelining. In-Memory Data Store ELT CDC 100ms 300ms 0-4 days FE BE DBMS DBMS FE BE DBMS FE BE ELT DWH 0-24 hrs OLAP Data Mining BI… FE BE FE BE FE BE NAS NAS Backup / Restore 2 days late OLAP… 3-6 days DBMS DBMS DBMS WAL Replication 3-5 minutes late CDC DWH Hadoop Hadoop ? In-Memory Data Store RTDM BI Data Mining FE App App App … HTTP BE Srv Srv Srv … SOAP OLTP SP JDBC Log Table CDC copy Parse Batch ETL cp Batch ETL
  • 75. Modern Data Architecture – Pipelining. In-Memory Data Store ELT CDC 100ms 300ms 0-4 days FE BE DBMS DBMS FE BE DBMS FE BE ELT DWH 0-24 hrs OLAP Data Mining BI… FE BE FE BE FE BE NAS NAS Backup / Restore 2 days late OLAP… 3-6 days DBMS DBMS DBMS WAL Replication 3-5 minutes late CDC DWH Hadoop Hadoop ? In-Memory Data Store RTDM BI Data Mining FE App App App … HTTP BE Srv Srv Srv … SOAP OLTP SP JDBC Log Table CDC copy Parse Batch ETL cp Batch ETL load ODS DWH
  • 76. Modern Data Architecture – Pipelining. In-Memory Data Store ELT CDC 100ms 300ms 0-4 days FE BE DBMS DBMS FE BE DBMS FE BE ELT DWH 0-24 hrs OLAP Data Mining BI… FE BE FE BE FE BE NAS NAS Backup / Restore 2 days late OLAP… 3-6 days DBMS DBMS DBMS WAL Replication 3-5 minutes late CDC DWH Hadoop Hadoop ? In-Memory Data Store RTDM BI Data Mining FE App App App … HTTP BE Srv Srv Srv … SOAP OLTP SP JDBC Log Table CDC copy Parse Batch ETL cp Batch ETL load ODS DDS DWH
  • 77. Modern Data Architecture – Pipelining. In-Memory Data Store ELT CDC 100ms 300ms 0-4 days FE BE DBMS DBMS FE BE DBMS FE BE ELT DWH 0-24 hrs OLAP Data Mining BI… FE BE FE BE FE BE NAS NAS Backup / Restore 2 days late OLAP… 3-6 days DBMS DBMS DBMS WAL Replication 3-5 minutes late CDC DWH Hadoop Hadoop ? In-Memory Data Store RTDM BI Data Mining FE App App App … HTTP BE Srv Srv Srv … SOAP OLTP SP JDBC Log Table CDC copy Parse Batch ETL cp Batch ETL load ODS DDS DataMart DWH
  • 78. Modern Data Architecture – Pipelining. In-Memory Data Store ELT CDC 100ms 300ms 0-4 days FE BE DBMS DBMS FE BE DBMS FE BE ELT DWH 0-24 hrs OLAP Data Mining BI… FE BE FE BE FE BE NAS NAS Backup / Restore 2 days late OLAP… 3-6 days DBMS DBMS DBMS WAL Replication 3-5 minutes late CDC DWH Hadoop Hadoop ? In-Memory Data Store RTDM BI Data Mining FE BI App App App … HTTP BE Srv Srv Srv … SOAP OLTP SP JDBC Log Table CDC copy Parse Batch ETL cp Batch ETL load ODS DDS DataMart DWH JDBC
  • 79. Modern Data Architecture – Pipelining. In-Memory Data Store ELT CDC 100ms 300ms 0-4 days FE BE DBMS DBMS FE BE DBMS FE BE ELT DWH 0-24 hrs OLAP Data Mining BI… FE BE FE BE FE BE NAS NAS Backup / Restore 2 days late OLAP… 3-6 days DBMS DBMS DBMS WAL Replication 3-5 minutes late CDC DWH Hadoop Hadoop ? In-Memory Data Store RTDM BI Data Mining FE BI App App App … HTTP BE Srv Srv Srv … OLTP SP JDBC Log Table CDC copy Parse Batch ETL cp Batch ETL ODS DDS DataMart DWH JDBC
  • 80. Modern Data Architecture – Pipelining. In-Memory Data Store ELT CDC 100ms 300ms 0-4 days FE BE DBMS DBMS FE BE DBMS FE BE ELT DWH 0-24 hrs OLAP Data Mining BI… FE BE FE BE FE BE NAS NAS Backup / Restore 2 days late OLAP… 3-6 days DBMS DBMS DBMS WAL Replication 3-5 minutes late CDC DWH Hadoop Hadoop ? In-Memory Data Store RTDM BI Data Mining FE BI App App App … HTTP BE Srv Srv Srv … OLTP SP JDBC Log Table CDC copy Parse Batch load ODS DDS DataMart DWH JDBC API Queue ETL ETL Batch
  • 81. Modern Data Architecture – Pipelining. In-Memory Data Store ELT CDC 100ms 300ms 0-4 days FE BE DBMS DBMS FE BE DBMS FE BE ELT DWH 0-24 hrs OLAP Data Mining BI… FE BE FE BE FE BE NAS NAS Backup / Restore 2 days late OLAP… 3-6 days DBMS DBMS DBMS WAL Replication 3-5 minutes late CDC DWH Hadoop Hadoop ? In-Memory Data Store RTDM BI Data Mining FE BI App App App … HTTP BE Srv Srv Srv … OLTP SP JDBC Log Table CDC copy Parse Batch load ODS DDS DataMart DWH JDBC API Queue ETL ETL Batch load ETL
  • 82. Modern Data Architecture – Pipelining. In-Memory Data Store ELT CDC 100ms 300ms 0-4 days FE BE DBMS DBMS FE BE DBMS FE BE ELT DWH 0-24 hrs OLAP Data Mining BI… FE BE FE BE FE BE NAS NAS Backup / Restore 2 days late OLAP… 3-6 days DBMS DBMS DBMS WAL Replication 3-5 minutes late CDC DWH Hadoop Hadoop ? In-Memory Data Store RTDM BI Data Mining FE BI App App App … HTTP BE Srv Srv Srv … OLTP SP JDBC Log Table CDC copy Parse Batch load ODS DDS DataMart DWH JDBC API Queue ETL ETL Batch App ETL Batch load load ETL
  • 83. Modern Data Architecture – Pipelining. In-Memory Data Store ELT CDC 100ms 300ms 0-4 days FE BE DBMS DBMS FE BE DBMS FE BE ELT DWH 0-24 hrs OLAP Data Mining BI… FE BE FE BE FE BE NAS NAS Backup / Restore 2 days late OLAP… 3-6 days DBMS DBMS DBMS WAL Replication 3-5 minutes late CDC DWH Hadoop Hadoop ? In-Memory Data Store RTDM BI Data Mining FE BI App App App … HTTP BE Srv Srv Srv … OLTP SP JDBC Log Table CDC copy Parse Batch load ODS DDS DataMart DWH JDBC API Queue ETL ETL Batch App ETL Batch load load ETL STG Batch App Hadoop HDFS SQL On Hadoop
  • 84. Modern Data Architecture – Pipelining. In-Memory Data Store ELT CDC 100ms 300ms 0-4 days FE BE DBMS DBMS FE BE DBMS FE BE ELT DWH 0-24 hrs OLAP Data Mining BI… FE BE FE BE FE BE NAS NAS Backup / Restore 2 days late OLAP… 3-6 days DBMS DBMS DBMS WAL Replication 3-5 minutes late CDC DWH Hadoop Hadoop ? In-Memory Data Store RTDM BI Data Mining FE BI App App App … HTTP BE Srv Srv Srv … OLTP SP JDBC Log Table CDC copy Parse Batch load ODS DDS DataMart DWH JDBC API Queue ETL ETL Batch App ETL Batch load load ETL STG Batch App Hadoop HDFS SQL On Hadoop RTI App
  • 85. Modern Data Architecture – Pipelining. In-Memory Data Store ELT CDC 100ms 300ms 0-4 days FE BE DBMS DBMS FE BE DBMS FE BE ELT DWH 0-24 hrs OLAP Data Mining BI… FE BE FE BE FE BE NAS NAS Backup / Restore 2 days late OLAP… 3-6 days DBMS DBMS DBMS WAL Replication 3-5 minutes late CDC DWH Hadoop Hadoop ? In-Memory Data Store RTDM BI Data Mining FE BI App App App … HTTP BE Srv Srv Srv … OLTP SP JDBC Log Table CDC copy Parse Batch load ODS DDS DataMart DWH JDBC API Queue ETL ETL Batch App ETL Batch load load ETL STG Batch App Hadoop HDFS SQL On Hadoop RTI App Replicate
  • 86. Modern Data Architecture – Pipelining. In-Memory Data Store ELT CDC 100ms 300ms 0-4 days FE BE DBMS DBMS FE BE DBMS FE BE ELT DWH 0-24 hrs OLAP Data Mining BI… FE BE FE BE FE BE NAS NAS Backup / Restore 2 days late OLAP… 3-6 days DBMS DBMS DBMS WAL Replication 3-5 minutes late CDC DWH Hadoop Hadoop ? In-Memory Data Store RTDM BI Data Mining
  • 87. Modern Data Architecture – Pipelining. ELT CDC FE BE DBMS DBMS FE BE DBMS FE BE ELT DWH OLAP Data Mining RTBI… FE BE FE BE FE BE CDC Hadoop In-Memory Data Store BI Replication Queue 3-5 minutes late In-Memory Data Store OLAP… DWH Hadoop BI Data Mining RTBI DBMS DBMS DBMS WAL Replication 3-5 minutes late
  • 88. Pivotal and Modern Data Architecture. BI Pivotal Cloud Foundry HTTP FE … App App App Queue BE … App App App Pivotal GemFire App Spring XD Streaming Streaming Data Pivotal HD Pivotal HAWQ ES DDS DataMart Pivotal Greenplum Data Mart PostgreSQL SP Table ODS ETL ETL
  • 89. Pivotal and Modern Data Architecture. BI HTTP Pivotal GemFire App Spring XD Streaming Streaming Data Pivotal HD Pivotal HAWQ ES DDS DataMart Pivotal Greenplum Data Mart PostgreSQL SP Table ODS ETL ETL Pivotal Cloud Foundry FE … App App App Queue BE … App App App. Pivotal Labs – agile software development for next-generation applications; Pivotal Cloud Foundry – PaaS for customer applications; RabbitMQ – distributed message queue service on top of PCF; Spring IO – foundation platform for modern applications
  • 90. Pivotal and Modern Data Architecture. BI Pivotal Cloud Foundry HTTP FE … App App App Queue BE … App App App Spring XD Streaming Streaming Data Pivotal HD Pivotal HAWQ ES DDS DataMart Pivotal Greenplum Data Mart PostgreSQL SP Table ODS ETL ETL Pivotal GemFire App. Pivotal GemFire and Apache Geode (incubating) – in-memory data grid enabling real-time data processing and real-time decision making for enterprises
  • 91. Pivotal and Modern Data Architecture. BI Pivotal Cloud Foundry HTTP FE … App App App Queue BE … App App App Pivotal GemFire App Streaming Data Pivotal HD Pivotal HAWQ ES DDS DataMart Pivotal Greenplum Data Mart PostgreSQL SP Table ODS ETL ETL Spring XD Streaming. Spring XD – unified, distributed and extensible framework for data pipelining: ingesting, batching, processing and exporting
  • 92. Pivotal and Modern Data Architecture. BI Pivotal Cloud Foundry HTTP FE … App App App Queue BE … App App App Pivotal GemFire App Spring XD Streaming ES DDS DataMart Pivotal Greenplum PostgreSQL SP Table ODS ETL ETL Streaming Data Pivotal HD Pivotal HAWQ Data Mart. Pivotal HD – leading Hadoop distribution based on ODP; Pivotal HAWQ and Apache HAWQ (incubating) – bringing the power of MPP to the Hadoop cluster, best-in-class SQL-on-Hadoop solution; Apache Spark – component of the Pivotal HD distribution, modern framework for distributed data processing
  • 93. Pivotal and Modern Data Architecture. BI Pivotal Cloud Foundry HTTP FE … App App App Queue BE … App App App Pivotal GemFire App Spring XD Streaming Streaming Data Pivotal HD Pivotal HAWQ ES DDS DataMart Pivotal Greenplum Data Mart ODS ETL ETL PostgreSQL SP Table. Pivotal PostgreSQL – open source distribution of PostgreSQL, commercially supported by Pivotal
  • 94. Pivotal and Modern Data Architecture. BI Pivotal Cloud Foundry HTTP FE … App App App Queue BE … App App App Pivotal GemFire App Spring XD Streaming Streaming Data Pivotal HD Pivotal HAWQ Data Mart PostgreSQL SP Table ETL ETL ES DDS DataMart Pivotal Greenplum ODS. Pivotal Greenplum – leading analytical MPP database, foundation for enterprise data warehousing systems and advanced analytics
  • 95. Pivotal and Modern Data Architecture. Pivotal GemFire App Spring XD Streaming BI Pivotal Cloud Foundry HTTP FE … App App App Queue BE … App App App Streaming Data Pivotal HD Pivotal HAWQ ES DDS DataMart Pivotal Greenplum Data Mart PostgreSQL SP Table ODS ETL ETL. Data Lake
  • 96. Pivotal and Modern Data Architecture. Pivotal Cloud Foundry HTTP FE … App App App Queue BE … App App App Spring XD Streaming ES DDS DataMart Pivotal Greenplum PostgreSQL SP Table ODS ETL ETL Pivotal GemFire App Streaming Data Pivotal HD Pivotal HAWQ Data Mart BI. Lambda Architecture
  • 97. Pivotal and Modern Data Architecture. ES DDS DataMart Pivotal Greenplum PostgreSQL SP Table ODS ETL ETL Pivotal Cloud Foundry HTTP FE … App App App Queue BE … App App App Streaming Pivotal HD BI Pivotal GemFire App Spring XD Streaming Data Pivotal HAWQ Data Mart. Pipelining
  • 98. Pivotal and Modern Data Architecture. BI Pivotal Cloud Foundry HTTP FE … App App App Queue BE … App App App Pivotal GemFire App Spring XD Streaming Streaming Data Pivotal HD Pivotal HAWQ ES DDS DataMart Pivotal Greenplum Data Mart PostgreSQL SP Table ODS ETL ETL
  • 99. Questions?
  • 100. BUILT FOR THE SPEED OF BUSINESS
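
The Lambda diagram on slide 59 names a Master Dataset, a Batch Layer producing Batch Views, a Speed Layer producing Real-time Views, and a Serving Layer that answers queries from both. A minimal Python sketch of that flow follows. It is only an illustration under stated assumptions: the class `LambdaSketch`, its method names, and the choice of a per-key event count as the "view" are hypothetical and do not come from the slides, and the sketch is single-process with no concurrency or fault tolerance.

```python
from collections import Counter


class LambdaSketch:
    """Toy, single-process illustration of the Lambda layers (names are hypothetical)."""

    def __init__(self):
        self.master_dataset = []        # append-only record of every event ever ingested
        self.batch_view = Counter()     # recomputed from scratch by the batch layer
        self.realtime_view = Counter()  # incremental counts for events not yet batched

    def ingest(self, key):
        """Source data goes to both the master dataset and the speed layer."""
        self.master_dataset.append(key)
        self.realtime_view[key] += 1

    def run_batch(self):
        """Batch layer: rebuild the batch view from the full master dataset.

        Once the batch view covers every ingested event, the real-time view
        can be dropped. This is safe here only because nothing is ingested
        while the batch runs (an assumption of this sketch).
        """
        self.batch_view = Counter(self.master_dataset)
        self.realtime_view.clear()

    def query(self, key):
        """Serving layer: merge the batch view with the real-time view."""
        return self.batch_view[key] + self.realtime_view[key]


sketch = LambdaSketch()
for event in ["a", "a", "b"]:
    sketch.ingest(event)
before_batch = sketch.query("a")   # answered from the real-time view only
sketch.run_batch()
sketch.ingest("a")                 # arrives after the batch run
after_batch = sketch.query("a")    # batch view plus real-time view
```

The point of the design is that a query never waits for the slow batch recomputation: recent events are covered by the small real-time view until the next batch run absorbs them.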