DataDevOps:
CREATING A DATA & ANALYTICS CULTURE AT SCALE
[Map: PL, S, RUS, UA, RO, CZ, D, NL, B, F, A, HR, I, E, BG, TR]
18 COUNTRIES
2.4M+ CARS & MOTOS
10M+ USERS PER MONTH
ROAD TO MICROSERVICE ARCHITECTURE
DataDevOps | Sean Gustafson & Arif Wider
HOW WE STARTED IN 2007
[Diagram, 2007: WEB, MIDDLE TIER, CORE DB, CRM, STAGING, DWH, BI TOOL, ANALYST, BI DEV]
HOW THINGS GOT COMPLICATED IN 2011
[Diagram, 2011: APP (x2), API, WEB, MIDDLE TIER, CORE DB, MYSQL, CRM, STAGING, DWH, BI TOOL, ANALYST, BI DEV, $$$; timeline: 2007, 2011]
HOW WE SLICED THE MONOLITH IN 2013
[Diagram, 2013: WEB, SEA, APP (x3), API (x4), SYNC, ELASTIC, MYSQL (x2), CORE DB, CRM, HADOOP, REST API, STAGING, DWH, BI TOOL, ANALYST, BI DEV, DE; timeline: 2007, 2011, 2013]
HOW A CENTRAL DATA TEAM DOESN’T SCALE
[Diagram, 2015: AWS, WEB, EXP, SEA, APP (x8), API (x4), SYNC, MONGO, ELASTIC, MYSQL (x4), CORE DB, CRM, HADOOP, REST API, STAGING, DWH, BI TOOL, ANALYST, BI DEV, DE; timeline: 2007, 2011, 2013, 2015]
HOW WE REARCHITECTED OUR DATA LANDSCAPE
[Diagram, 2017: AWS (x2), CENTRAL DATA LAKE ON S3, APP (x8), API (x3), CORE DB (x2), CRM, REST API, DWH, BI TOOL, ANALYST, BI DEV, DE; timeline: 2007, 2011, 2013, 2015, 2017]
SCOUT24 WANTS TO BECOME A TRULY DATA-DRIVEN COMPANY
[Figure: PLAN / DO / CHECK / ACT cycle]
Fast & easy data-driven product development…
…supported by Data & Analytics
Everywhere in the company... ...without bloating up D’n’A
[Figure: PLAN / DO / CHECK / ACT cycle, repeated ten times]
SCOUT24 DATA LANDSCAPE MANIFESTO
ROLES, RESPONSIBILITIES, AND VALUES FOR A DATA-DRIVEN COMPANY AT SCALE
#1 Preamble
Data is a key asset of our company.
#2 Central Data Team’s Responsibility
We, Data & Analytics, are responsible for providing a solid Data Platform as well as clear guidelines and training how to participate in the Data Landscape.
[Diagram: D’n’A, DATA PLATFORM, DATA LANDSCAPE]
#3 Data Autonomy, Not Anarchy
Data autonomy puts data producers & data consumers in control of their data & of their metrics and thereby allows us to be data-driven at scale, but this comes with responsibility.
[Diagram: DATA LANDSCAPE, DATA PLATFORM, D’n’A, PRODUCER, DATA, CONSUMER, METRIC]
ROLES & RESPONSIBILITIES
[Diagram: CENTRAL DATA LAKE ON S3 (AWS), DATA CATALOG, D’n’A, CHECKOUT SERVICE (PRODUCER), SPECIAL OFFER SERVICE (CONSUMER)]
#4 Producer’s Responsibility
Data producers are responsible for publishing data to the central Data Lake, for the data's quality, and for publishing metadata that makes it easy to find and consume the data.
[Diagram: DATA LANDSCAPE, DATA PLATFORM, D’n’A, PRODUCER, DATA, METADATA]
ROLES & RESPONSIBILITIES
[Diagram: CENTRAL DATA LAKE ON S3 (AWS), DATA CATALOG, D’n’A, CHECKOUT SERVICE (PRODUCER), ORDER EVENTS, EVENT METADATA, SPECIAL OFFER SERVICE (CONSUMER)]
ROLES & RESPONSIBILITIES
[Diagram: CENTRAL DATA LAKE ON S3 (AWS), DATA CATALOG, D’n’A, INGESTION TEMPLATE, CHECKOUT SERVICE (PRODUCER), ORDER EVENTS, EVENT METADATA, SPECIAL OFFER SERVICE (CONSUMER)]
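The producer side of this picture could be sketched roughly as follows. This is a minimal illustration, not Scout24's actual ingestion template: the `publish` function, the `DatasetMetadata` fields, the key layout, and the plain dicts standing in for the S3 data lake and the data catalog are all assumptions made for the example.

```python
import json
from dataclasses import asdict, dataclass
from datetime import date


@dataclass(frozen=True)
class DatasetMetadata:
    """Hypothetical catalog entry that makes a dataset findable."""
    name: str
    owner: str
    description: str
    fields: dict  # field name -> human-readable type


def publish(lake, catalog, meta, day, events):
    """Write one day of events to the lake and register the metadata.

    `lake` and `catalog` are plain dicts standing in for S3 and the
    data catalog (an assumption for this sketch).
    """
    # Producer-side quality check: every event must carry all documented fields.
    for event in events:
        missing = set(meta.fields) - set(event)
        if missing:
            raise ValueError(f"event is missing fields: {sorted(missing)}")

    # Hypothetical key layout: one object per dataset and day.
    key = f"{meta.name}/date={day.isoformat()}/events.json"
    lake[key] = "\n".join(json.dumps(e, sort_keys=True) for e in events)
    catalog[meta.name] = asdict(meta)
    return key


order_events_meta = DatasetMetadata(
    name="order_events",
    owner="checkout-service",
    description="One record per completed order",
    fields={
        "order_id": "string",
        "user_id": "string",
        "ordered_at": "ISO 8601 timestamp",
    },
)

lake, catalog = {}, {}
key = publish(
    lake,
    catalog,
    order_events_meta,
    date(2017, 5, 1),
    [{"order_id": "o-1", "user_id": "u-1", "ordered_at": "2017-05-01T09:30:00"}],
)
```

The point of the sketch is that data and metadata are published together by the producing team, and that the quality check lives with the producer rather than with a central data team.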
#5 Consumer’s Responsibility
Data consumers are responsible for the definition & visualization of metrics and for driving the implementation and maintenance of these metrics.
[Diagram: DATA LANDSCAPE, DATA PLATFORM, D’n’A, PRODUCER, CONSUMER, METRIC]
ROLES & RESPONSIBILITIES
[Diagram: CENTRAL DATA LAKE ON S3 (AWS), DATA CATALOG, D’n’A, INGESTION TEMPLATE, CHECKOUT SERVICE (PRODUCER), ORDER EVENTS, EVENT METADATA, SPECIAL OFFER SERVICE (CONSUMER), VIEW: ORDER HISTORY BY USER]
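A consumer-owned view such as "order history by user" might look like the sketch below. The event fields (`order_id`, `user_id`, `ordered_at`) and the `orders_per_user` metric are hypothetical; the slides do not specify the schema or how the view is implemented.

```python
from collections import defaultdict


def order_history_by_user(order_events):
    """Group order IDs per user, oldest order first."""
    history = defaultdict(list)
    # ISO 8601 timestamps in the same format sort chronologically as strings.
    for event in sorted(order_events, key=lambda e: e["ordered_at"]):
        history[event["user_id"]].append(event["order_id"])
    return dict(history)


def orders_per_user(history):
    """Example metric defined by the consumer on top of the view."""
    return {user: len(orders) for user, orders in history.items()}


events = [
    {"order_id": "o-2", "user_id": "u-1", "ordered_at": "2017-05-02T10:00:00"},
    {"order_id": "o-1", "user_id": "u-1", "ordered_at": "2017-05-01T09:30:00"},
    {"order_id": "o-3", "user_id": "u-2", "ordered_at": "2017-05-01T12:00:00"},
]
history = order_history_by_user(events)
```

Here the consuming team owns both the view and the metric derived from it, which is the split of responsibilities the manifesto describes.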
#6 Transparency Over Continuity
We value data transparency over data continuity, which means we may break metric comparability if it is for the cause of enabling better insights.
[Diagram: DATA LANDSCAPE, DATA PLATFORM, D’n’A, PRODUCER, CONSUMER, METRIC, CORE METRIC]
The Ultimate Goal
A federal landscape of data producers and consumers with just enough norms to ensure seamless co-operation without severely impeding autonomy.
[Diagram: DATA LANDSCAPE, DATA PLATFORM, D’n’A, PRODUCER, DATA, METADATA, CONSUMER, METRIC]
Consequences for Product Teams
‣ Think about data & reporting
‣ Deliver your data to the lake
‣ Provide metadata
‣ Eat your own dog food: Consume your own data
Benefits for Product Teams
‣ Independently work with data
‣ No dependencies on data teams
‣ It’s easy to consume data produced by other teams
‣ Faster product & measurement iterations
DevOps
#DataDevOps
How to convince everyone to go along?
‣ ‘Nudge’ them to participate
‣ Promote the platform
‣ Refuse new use cases in the Data Warehouse
Design ‘nudges’ into the Platform
Make the Data Lake easier than the alternatives:
‣ automatic table publishing, partition detection
‣ backup and disaster recovery
‣ access control for restricted data
‣ optimized file formats (e.g. Parquet) for efficiency
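As one illustration of such a nudge, partition detection can be sketched as scanning object keys for `key=value` path segments. The slides do not say how the platform detects partitions; the Hive-style `key=value` layout, the `detect_partitions` helper, and the sample keys are assumptions made for this example.

```python
def detect_partitions(object_keys, table_root):
    """Return the sorted, de-duplicated partition specs found under table_root."""
    prefix = table_root.rstrip("/") + "/"
    partitions = set()
    for key in object_keys:
        if not key.startswith(prefix):
            continue
        # Drop the file name; keep only directory segments shaped like key=value.
        segments = key[len(prefix):].split("/")[:-1]
        spec = tuple(tuple(s.split("=", 1)) for s in segments if "=" in s)
        if spec:
            partitions.add(spec)
    return sorted(partitions)


keys = [
    "order_events/date=2017-05-01/part-0.parquet",
    "order_events/date=2017-05-01/part-1.parquet",
    "order_events/date=2017-05-02/part-0.parquet",
    "other_table/date=2017-05-01/part-0.parquet",
]
partitions = detect_partitions(keys, "order_events")
```

A producer who writes files into this layout gets the table's partitions registered without extra work, which is the kind of convenience that makes the lake the path of least resistance.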
Lessons learned
‣ Change needs to be technological, organizational and cultural
‣ Build features whose benefits counteract resistance
‣ Communication is key
MOST IMPORTANTLY:
Have a strong opinion about how the company should use data and build a platform that pushes toward that vision.
Dr. Sean Gustafson (@seangustafson) & Dr. Arif Wider (@arifwider)
THANKS

Data DevOps - Arif Wider and Sean Gustafson (ThoughtWorks Live)

6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...Dr Arash Najmaei ( Phd., MBA, BSc)
 
Data DevOps - Arif Wider and Sean Gustafson (ThoughtWorks Live)

  • 1. DataDevOps: CREATING A DATA & ANALYTICS CULTURE AT SCALE
  • 4. DataDevOps | Sean Gustafson & Arif Wider. HOW WE STARTED IN 2007. Diagram labels (2007): web, middle tier, core DB, CRM, staging, DWH, BI tool, analyst, BI dev.
  • 5. HOW THINGS GOT COMPLICATED IN 2011. Diagram labels (2011, added to the 2007 picture): app, API, web, middle tier, core DB, MySQL, CRM, staging, DWH, BI tool ($$$), analyst, BI dev.
  • 6. HOW WE SLICED THE MONOLITH IN 2013. Diagram labels (2013, added to 2011 and 2007): Hadoop, REST API, apps, web, SEA, APIs, sync, Elastic, MySQL, core DB, CRM, staging, DWH, BI tool, analyst, BI dev, DE.
  • 7. HOW A CENTRAL DATA TEAM DOESN’T SCALE. Diagram labels (2015, added to 2013, 2011 and 2007): AWS, Hadoop, REST API, apps, web, EXP, SEA, APIs, sync, Mongo, Elastic, MySQL, core DB, CRM, staging, DWH, BI tool, analyst, BI dev, DE.
  • 8. HOW WE RE-ARCHITECTED OUR DATA LANDSCAPE. Diagram labels (2017, added to 2015, 2013, 2011 and 2007): AWS, central data lake on S3, apps, APIs, REST API, core DBs, CRM, DWH, BI tool, analyst, BI dev, DE.
  • 9. SCOUT24 WANTS TO BECOME A TRULY DATA-DRIVEN COMPANY
  • 10. Fast & easy data-driven product development, supported by Data & Analytics. (Diagram: a Plan, Do, Check, Act cycle.)
  • 11. Everywhere in the company, without bloating up D’n’A. (Diagram: many Plan, Do, Check, Act cycles side by side.)
  • 12. SCOUT24 DATA LANDSCAPE MANIFESTO: ROLES, RESPONSIBILITIES, AND VALUES FOR A DATA-DRIVEN COMPANY AT SCALE
  • 13. SCOUT24 DATA LANDSCAPE MANIFESTO. #1 Preamble: Data is a key asset of our company.
  • 14. #2 Central Data Team’s Responsibility: We, Data & Analytics, are responsible for providing a solid Data Platform as well as clear guidelines and training on how to participate in the Data Landscape. (Diagram: D’n’A, Data Platform, Data Landscape.)
  • 15. #3 Data Autonomy, Not Anarchy: Data autonomy puts data producers & data consumers in control of their data & of their metrics and thereby allows us to be data-driven at scale, but this comes with responsibility. (Diagram: D’n’A, Data Platform, Data Landscape, Producer with data, Consumer with metric.)
  • 16. ROLES & RESPONSIBILITIES. Diagram: central data lake on S3 (AWS) and data catalog (D’n’A); checkout service (producer); special offer service (consumer).
  • 17. #4 Producer’s Responsibility: Data producers are responsible for publishing data to the central Data Lake, for the data's quality, and for publishing metadata that makes it easy to find and consume the data. (Diagram: D’n’A, Data Platform, Data Landscape, Producer with data and metadata.)
  • 18. ROLES & RESPONSIBILITIES. Diagram: the checkout service (producer) publishes order events to the central data lake on S3 (AWS) and event metadata to the data catalog (D’n’A); special offer service (consumer).
  • 19. ROLES & RESPONSIBILITIES. Diagram: as on the previous slide, with an ingestion template added (D’n’A) alongside the checkout service (producer), order events, event metadata, data catalog, and special offer service (consumer).
  • 20. #5 Consumer’s Responsibility: Data consumers are responsible for the definition & visualization of metrics and for driving the implementation and maintenance of these metrics. (Diagram: D’n’A, Data Platform, Data Landscape, Producer, Consumer with metric.)
  • 21. ROLES & RESPONSIBILITIES. Diagram: as on slide 19, with a view added for the special offer service (consumer): order history by user.
  • 22. #6 Transparency Over Continuity: We value data transparency over data continuity, which means we may break metric comparability if it is for the cause of enabling better insights. (Diagram: D’n’A, Data Platform, Data Landscape, Producer, Consumer with metric, core metric.)
  • 23. The Ultimate Goal: A federal landscape of data producers and consumers with just enough norms to ensure seamless co-operation without severely impeding autonomy. (Diagram: D’n’A, Data Platform, Data Landscape, Producer with data and metadata, Consumer with metric.)
  • 24. Consequences for Product Teams
      ‣ Think about data & reporting
      ‣ Deliver your data to the lake
      ‣ Provide metadata
      ‣ Eat your own dog food: consume your own data
  • 25. Benefits for Product Teams
      ‣ Work with data independently
      ‣ No dependencies on data teams
      ‣ It’s easy to consume data produced by other teams
      ‣ Faster product & measurement iterations
  • 28. How to convince everyone to go along?
      ‣ ‘Nudge’ them to participate
      ‣ Promote the platform
      ‣ Refuse new use cases in the Data Warehouse
  • 29. Design ‘nudges’ into the Platform. Make the Data Lake easier to use than the alternatives:
      ‣ automatic table publishing, partition detection
      ‣ backup and disaster recovery
      ‣ access control for restricted data
      ‣ optimized file formats (e.g. Parquet) for efficiency
  • 30. Learnings and lessons
      ‣ Change needs to be technological, organizational and cultural
      ‣ Build features to give benefits that counteract resistance
      ‣ Communication is the key
  • 31. MOST IMPORTANTLY: Have a strong opinion about how the company should use data and build a platform that pushes toward that vision.
  • 32. THANKS. Dr. Sean Gustafson (@seangustafson) & Dr. Arif Wider (@arifwider)
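
The producer responsibility in manifesto point #4 (publish data, vouch for its quality, publish metadata) and the "partition detection" nudge from slide 29 can be illustrated with a small sketch. This is not Scout24's ingestion template, which the slides do not show. The in-memory `lake` and `catalog` dictionaries, the `CatalogEntry` class, the `dt=` key layout, and all function names are assumptions made for illustration only.

```python
import json
from dataclasses import dataclass, field
from datetime import date


@dataclass
class CatalogEntry:
    """Hypothetical metadata record a producer publishes alongside its data."""
    dataset: str
    owner: str
    description: str
    schema: dict  # field name -> type name, as declared by the producer
    partitions: set = field(default_factory=set)


def partition_key(dataset: str, day: date, filename: str) -> str:
    """Build a Hive-style object key such as orders/dt=2017-05-01/part-0.json."""
    return f"{dataset}/dt={day.isoformat()}/{filename}"


def detect_partition(key: str) -> dict:
    """Recover partition columns from the name=value segments of a key."""
    parts = {}
    for segment in key.split("/")[:-1]:  # skip the file name itself
        if "=" in segment:
            name, value = segment.split("=", 1)
            parts[name] = value
    return parts


def publish(lake: dict, catalog: dict, entry: CatalogEntry,
            day: date, events: list) -> str:
    """Check events against the declared schema, then store data and metadata."""
    for event in events:
        missing = set(entry.schema) - set(event)
        if missing:
            # The producer owns data quality, so bad events are rejected here.
            raise ValueError(f"event is missing fields: {sorted(missing)}")
    key = partition_key(entry.dataset, day, "part-0.json")
    lake[key] = json.dumps(events)
    registered = catalog.setdefault(entry.dataset, entry)
    registered.partitions.add(day.isoformat())
    return key
```

In this sketch a checkout service would call `publish` with its order events, and a consumer such as the special offer service would look up `order_events` in `catalog` to find the schema, owner, and available partitions before reading from `lake`.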