4. Finding Your Ideal Data Architecture: Data Fabric, Data Mesh or Both?
Mike Ferguson
Managing Director
Intelligent Business Strategies
Denodo Sponsored Webinar
May 2023
5.
Copyright © Intelligent Business Strategies 1992-2023
About Intelligent Business Strategies
▪ UK-based independent IT analyst and consulting firm, founded in 1992, specialising in data management and analytics
▪ Mike Ferguson is co-founder of the Data and Analytics Retreat and Conference Chairman of Big Data LDN
▪ Three main lines of business
Education
• Centralised Data Governance of a Distributed Data Landscape
• Practical Guidelines for Implementing a Data Mesh
• DW Modernisation
• DW Migration to the Cloud
• Machine Learning & Advanced Analytics
• Embedded Analytics, Intelligent Apps & AI Automation
• Public classes (anyone)
• On-site classes (single client)
• Customers, vendors, systems integrators
• On-line (public & on-sites)
Consulting
• Customers
• D&A Strategy, Data Architecture
• D&A Technology selection
• D&A Reviews, Data Governance
• Project advisory
• Vendors
• Product strategy
• Product positioning
• Marketing support
• Speaking at vendor events
• White papers
• Webinars
• Venture Capitalists
• Due-diligence, Asset advisory
Research
• Market research
• 4th Industrial Revolution Survey
• D&A product research
• Data Catalogs
• Data Fabric
• Data Governance
www.intelligentbusiness.biz
6.
Topics
• What is Data Fabric and how can it help when you have a distributed data estate?
• What capabilities should you look for in Data Fabric?
• What do these capabilities make possible?
• What is Data Mesh?
• What are data products in a Data Mesh and how can you build them?
• What are the implications of decentralising data engineering, and how do you co-ordinate data
product development?
• Is federated governance of data products really possible?
• How Does Data Fabric Help in Building a Data Mesh?
7.
Many Companies Today Have Data Housed In Multiple Data Stores Across a Hybrid,
Multi-Cloud Distributed Data Estate
[Diagram: a distributed data estate of cloud-based applications, on-premises systems and edge devices across multiple clouds, holding SaaS applications, OLTP systems, analytical systems, files, content and IoT data]
8.
Technology Requirements – Need A Data Catalog And Data Fabric Software To Connect To
and Discover Data, Build Data Pipelines, Produce, Publish And Govern Data Products
Data Fabric software helps avoid or reduce the chances of data silos
Manage and organise storage, Data Discovery, Data Classification, Data Catalog, Data Governance (Data Quality,
Security, Privacy, Retention), Data Preparation / Integration, Data Virtualisation, APIs, Metadata, Data Marketplace
Enterprise Data Fabric Software
[Diagram: a data catalog and data fabric layered over cloud-based applications, on-premises systems and edge devices across multiple clouds]
9.
Enterprise Data Fabric Software
Key Requirements – We Need A Data Catalog To Automatically Discover What Data Is
Available, Its Quality, Sensitivity And Where It Is Across The Landscape
Automatic data discovery, classification & data quality profiling
Automatically discover, classify, profile the quality of and catalogue data
[Diagram: a data catalog scanning cloud-based applications, on-premises systems and edge devices across multiple clouds]
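The automatic discovery, classification and quality profiling named on this slide can be illustrated with a minimal Python sketch. It is not how any particular catalog product works: the rule set `RULES`, the function names and the sample table are all hypothetical, and real catalogs use far richer classifiers than two regular expressions.

```python
import re
from typing import Dict, List, Optional

# Hypothetical classification rules: label -> pattern a value must fully match.
RULES = {
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
    "card_number": re.compile(r"\d{4}-\d{4}-\d{4}-\d{4}"),
}


def profile_column(values: List[Optional[str]]) -> Dict[str, object]:
    """Return completeness (a simple quality measure) and a sensitivity label."""
    non_null = [v for v in values if v not in (None, "")]
    completeness = len(non_null) / len(values) if values else 0.0
    label = "unclassified"
    for name, pattern in RULES.items():
        # Classify only when every populated value matches the rule.
        if non_null and all(pattern.fullmatch(v) for v in non_null):
            label = name
            break
    return {"completeness": completeness, "classification": label}


def build_catalog(table: Dict[str, List[Optional[str]]]) -> Dict[str, Dict[str, object]]:
    """Profile every column of a table held as column name -> values."""
    return {column: profile_column(values) for column, values in table.items()}


# Made-up sample data standing in for a discovered source table.
customers = {
    "contact": ["ann@example.com", "bob@example.org", None, "cy@example.net"],
    "card": ["1234-5678-9012-3456", "1111-2222-3333-4444",
             "5555-6666-7777-8888", "0000-0000-0000-0000"],
    "city": ["Leeds", "York", "", "Bath"],
}
catalog = build_catalog(customers)
```

Each catalog entry then records where sensitive data sits and how complete it is, which is the information governance policies need later on.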
10.
Why Data Fabric? IT And Business User Role-Based User Interfaces With Shared
Metadata Plus APIs So Developers Don’t Have To Code Everything Themselves
[Diagram: citizen data engineers, IT developers and IT data architects, plus IDEs, use role-based UIs to the same data fabric platform. The microservices-based data fabric exposes user interfaces, APIs and a catalog of shared metadata over cloud-based applications, on-premises systems and edge devices across multiple clouds.]
11.
Topics – Where Are We?
• What is Data Fabric and how can it help when you have a distributed data estate?
➢ What capabilities should you look for in Data Fabric?
• What do these capabilities make possible?
• What is Data Mesh?
• What are data products in a Data Mesh and how can you build them?
• What are the implications of decentralising data engineering, and how do you co-ordinate data
product development?
• Is federated governance of data products really possible?
• How Does Data Fabric Help in Building a Data Mesh?
12.
What Capabilities Should You Expect to Find in Data Fabric Platforms? – 1
• Flexible deployment options: on-premises or cloud (pay-as-you-use, scalability)
• Secure data & application connectors
• Data catalog (auto data discovery & data classification)
• Collaborative development (shared metadata, components, templates)
13.
What Capabilities Should You Expect to Find in Data Fabric Platforms? - 2
• Data cleaning & integration (CDC & batch, data in motion & data at rest)
• Orchestration
• DataOps (CI/CD, Git, test, deploy)
• Resilience (auto-detect schema & infrastructure change, version management)
• Data Virtualisation
14.
What Capabilities Should You Expect to Find in Data Fabric Platforms? - 3
• Analytical services (ML models to predict/classify & recommend, NLP, deep learning)
• Decision services to decide & act
• Unified Data Governance (data masking & encryption, IAM, policies, observability)
• Data Marketplace for business-ready data products
• Extensibility (custom code)
15.
Topics – Where Are We?
• What is Data Fabric and how can it help when you have a distributed data estate?
• What capabilities should you look for in Data Fabric?
➢ What do these capabilities make possible?
• What is Data Mesh?
• What are data products in a Data Mesh and how can you build them?
• What are the implications of decentralising data engineering, and how do you co-ordinate data
product development?
• Is federated governance of data products really possible?
• How Does Data Fabric Help in Building a Data Mesh?
16.
What Do Data Fabric Capabilities Make Possible?
[Diagram: multiple teams using a Data Fabric and data catalog over cloud-based applications, on-premises systems and edge devices across multiple clouds]
You can
• Leave data where it is
• Connect to a broad range of data sources across a distributed data estate
• Automatically discover and classify data across multiple data sources using a data catalog
• Support multiple teams of data producers, business analysts and data scientists who need to find, access,
transform, integrate and analyse data
• Share business ready data and metadata across the enterprise in a compliant manner
• Provision data virtually and physically
• Govern access to data from a common platform
17.
Topics – Where Are We?
• What is Data Fabric and how can it help when you have a distributed data estate?
• What capabilities should you look for in Data Fabric?
• What do these capabilities make possible?
➢ What is Data Mesh?
• What are data products in a Data Mesh and how can you build them?
• What are the implications of decentralising data engineering, and how do you co-ordinate data
product development?
• Is federated governance of data products really possible?
• How Does Data Fabric Help in Building a Data Mesh?
18.
What Is Data Mesh?
– A Decentralised Domain-Oriented Approach To Data Product Development
[Diagram: in the Claims business domain, a data product team runs a Claims DataOps pipeline (ingest, process, serve), with governance designed in, from a source interface (application, microservice, data stream or another data product; here the Claims application) to the Claims data product.]
Data product = interface (e.g. SQL, REST, GraphQL) + infrastructure runtime spec + metadata (glossary terms & lineage)
Data products in the Data Mesh: Mortgages, Products, Customers, Claims, Car loans, Current account, Savings, Premium payments, Credit card payments, Interactions
Each domain owns and creates data products and needs to support
• History
• Schema change detection
• Data product versioning
• De-identification of sensitive data in data products
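Two of the requirements listed above, schema change detection and data product versioning, can be sketched together in a few lines of Python. This is only an illustration under stated assumptions: the schema representation (column name to type name), the function names and the version-bump rule (breaking changes raise the major number, additive changes raise the minor number) are all hypothetical conventions, not something the slide prescribes.

```python
from typing import Dict, List, Tuple

Schema = Dict[str, str]  # column name -> type name


def diff_schema(old: Schema, new: Schema) -> Dict[str, List[str]]:
    """Detect added, removed and retyped columns between two schema versions."""
    added = sorted(c for c in new if c not in old)
    removed = sorted(c for c in old if c not in new)
    retyped = sorted(c for c in new if c in old and old[c] != new[c])
    return {"added": added, "removed": removed, "retyped": retyped}


def next_version(version: Tuple[int, int], changes: Dict[str, List[str]]) -> Tuple[int, int]:
    """Assumed rule: removed or retyped columns are breaking, added ones are not."""
    major, minor = version
    if changes["removed"] or changes["retyped"]:
        return (major + 1, 0)
    if changes["added"]:
        return (major, minor + 1)
    return version


# Made-up schemas for a Claims data product before and after a source change.
v1 = {"claim_id": "int", "amount": "decimal", "customer_email": "string"}
v2 = {"claim_id": "int", "amount": "float", "status": "string", "customer_email": "string"}

changes = diff_schema(v1, v2)
new_version = next_version((1, 3), changes)
```

Keeping the detected changes alongside the version number gives consumers of the data product a record of why a new version was issued.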
19.
What Is Data Mesh?
– A Decentralised Approach to Data Engineering and Data Product Development
[Diagram: four business domains, each with its own team, application and DataOps pipeline producing a data product with governance designed in. Each data product exposes an interface plus an infrastructure runtime spec and metadata (glossary terms & lineage).]
Data products in the Data Mesh: Mortgages, Products, Customers, Claims, Car loans, Current account, Savings, Premium payments, Credit card payments, Interactions
20.
Topics – Where Are We?
• What is Data Fabric and how can it help when you have a distributed data estate?
• What capabilities should you look for in Data Fabric?
• What do these capabilities make possible?
• What is Data Mesh?
➢ What are data products in a Data Mesh and how can you build them?
• What are the implications of decentralising data engineering, and how do you co-ordinate data
product development?
• How Does Data Fabric Help in Building a Data Mesh?
• Is federated governance of data products really possible?
21.
What Is A Data Product?
Data Products
• Ready made data products (e.g. Customers, Orders, Payments, Products)
• Virtual data products
Analytical Products Created Using Data Products
• SQL query script (as-a-service)
• BI reports & spreadsheets
• Dashboards / stories / conversations
• Key performance indicators
• Analytical ML models (as-a-service)
• Analytic apps
• Decision services (e.g. recommendations, alerts, problem predictors, opportunity predictors)
• Action services (e.g. alerts, transactions, task automations)
• AI-driven intelligent processes
• AI-driven automated tasks
• Plans
22.
What Is Data Mesh? – Domains Can Consume Data Products Produced By Other
Domains And Then Create New Data Products To Add To A Data Mesh
[Diagram: a business domain consumes data products from the Data Mesh, uses a data integration pipeline with governance designed in to produce a new data product, and publishes it to the Data Mesh. The data product exposes an interface plus an infrastructure runtime spec and metadata (glossary terms & lineage).]
Data products in the Data Mesh: Mortgages, Products, Customers, Claims, Car loans, Current account, Savings, Premium payments, Credit card payments, Interactions
• The data product is the basic building block of a data mesh
• A data product can be virtual or physical
• Data governance policies must stay with the data product
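One way to picture a data product as a building block is as a small descriptor that carries its interface, metadata and policies with it. The Python sketch below is hypothetical throughout: the `DataProduct` fields, the `derive` helper and the policy names are illustrative, and the rule that a derived product inherits the union of its sources' policies is an assumption about how "policies must stay with the data product" might be applied.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class DataProduct:
    name: str
    domain: str
    interfaces: Tuple[str, ...]                # e.g. "SQL", "REST", "GraphQL"
    virtual: bool                              # virtual or physical
    glossary_terms: FrozenSet[str] = frozenset()
    policies: FrozenSet[str] = frozenset()     # governance policies travel with the product
    lineage: Tuple[str, ...] = ()              # names of upstream data products


def derive(name: str, domain: str, sources: List[DataProduct],
           interfaces: Tuple[str, ...] = ("SQL",), virtual: bool = True) -> DataProduct:
    """Create a new data product from consumed ones, keeping lineage and policies."""
    policies = frozenset().union(*(s.policies for s in sources))
    terms = frozenset().union(*(s.glossary_terms for s in sources))
    return DataProduct(name=name, domain=domain, interfaces=interfaces, virtual=virtual,
                       glossary_terms=terms, policies=policies,
                       lineage=tuple(s.name for s in sources))


# Made-up products and policy names for illustration.
customers = DataProduct("Customers", "Sales", ("SQL", "REST"), False,
                        frozenset({"Customer"}), frozenset({"mask_pii"}))
claims = DataProduct("Claims", "Claims", ("SQL",), False,
                     frozenset({"Claim"}), frozenset({"retain_7y"}))

customer_claims = derive("Customer claims", "Claims", [customers, claims])
```

Because the descriptor is immutable, a consuming domain cannot silently drop a policy; it can only publish a new product whose lineage points back to its sources.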
23.
Potential Analytical Consumers Of Reusable Data Products In A Data Mesh
Data products in the Data Mesh (virtual or physical): Mortgages, Products, Customers, Claims, Car loans, Current account, Savings, Premium payments, Credit card payments, Interactions
Consuming analytical systems (fed by pipelines or data virtualisation): cloud storage (could be a lakehouse), feature store, graph DB, DW, mart, data science sandboxes, deployed ML model
24.
Topics – Where Are We?
• What is Data Fabric and how can it help when you have a distributed data estate?
• What capabilities should you look for in Data Fabric?
• What do these capabilities make possible?
• What is Data Mesh?
• What are data products in a Data Mesh and how can you build them?
➢ What are the implications of decentralising data engineering, and how do you co-ordinate data
product development?
• How Does Data Fabric Help in Building a Data Mesh?
• Is federated governance of data products really possible?
25.
Federated Operating Model – Business Strategy Aligned Teams Producing Data
Products And ML Models Using Common Extensible Data Fabric Software
[Diagram: a federated organisational set-up. C-Suite business strategy sits above a CIO or CDO Programme Office plus a Data & Analytics Centre of Enablement and several data product producer teams. The teams use a Common Extensible Data Fabric (D&A Platform for data & analytics pipeline development and data governance) with a catalog to create data products and analytical products over an analytical ecosystem (streaming analytics, EDW, MDM, NoSQL, Graph DB) and a data estate of cloud-based applications, on-premises systems and edge devices across multiple clouds.]
Data producer teams are not central IT and should be able to use the Data Fabric platform without needing IT
26.
Democratisation Has a Massive Impact on IT – Forces IT to Reorganise to Embed
Expertise in Domain-Oriented Teams if Companies Democratise Data Engineering
[Diagram: partners, customers, suppliers, employees and things, and eight business domains (HR, Sales, Marketing, Service, Finance, Procurement, Operations, Distribution). Each domain has a business domain team running a DataOps pipeline that turns raw data into a trusted data product.]
IT Must Reorganise! A federated IT set-up is needed, with D&A experts ready to help enable the domains:
• Embed expertise
• Establish best practices
• Upskill citizen data engineers
• Take complexity away from citizen data engineers
• Push expertise into business teams
27.
Topics – Where Are We?
• What is Data Fabric and how can it help when you have a distributed data estate?
• What capabilities should you look for in Data Fabric?
• What do these capabilities make possible?
• What is Data Mesh?
• What are data products in a Data Mesh and how can you build them?
• What are the implications of decentralising data engineering, and how do you co-ordinate data product
development?
➢ How Does Data Fabric Help in Building a Data Mesh?
• Is federated governance of data products really possible?
28.
Creating Reusable Data Products Is An Approach Fast Gaining Momentum as
a Way of Incrementally Building Up a Secure and Compliant Data Foundation
[Diagram: multiple teams use a Data Fabric and data catalogue to build high quality, compliant data products from cloud-based applications, on-premises systems and edge devices across multiple clouds]
29.
Organised Democratisation – Need A Collaborative And Governed Environment To
Rapidly Produce Trusted Data Products Using Data Fabric
Need to work together for competitive advantage
[Diagram: four project teams, each made up of domain experts, a business analyst and an IT data architect, work as a community on an Enterprise Data Fabric with a data catalog. They turn source data (IoT, RDBMS, office docs, social, cloud, clickstream, web logs, XML, JSON, web services, NoSQL, files) into data products, which are published as ready made data & analytical products in a data marketplace.]
30.
What Do Data Fabric Capabilities Make Possible? - An Enterprise Data Marketplace?
Enterprise Data Marketplace
A catalog application that governs the sharing of published, ready made, trusted data and analytical products.
These products are available as services, with common data names documented in a business glossary and full
metadata lineage, and are tagged and organised to make them easy to find, access, share and reuse across the
enterprise
31.
Data Fabric Enterprise Data Marketplace Operations
[Diagram: data producers publish data & analytical products to the Data Marketplace, which is backed by the data catalog; data consumers shop for ready made data & analytical products.]
Published products include:
• Trusted internal data
• Trusted external data
• Trusted queries
• Trusted virtual data products
Consumers can: Find, Access, Use, Rate, Enrich, Share, Follow
32.
Consumers Can Create New BI And Analytical Products From Trusted Data And Then
Publish Them In The Marketplace
information
consumers shop for
ready-to-go data
and analytical
assets
shop
for data
Data &
Analytics
Catalog
Data
Marketplace
Trusted data
service
Query service
BI report /
dashboard /
story
BI Insights pipeline
Trusted data
service
Analytical
service
Predictive insights pipeline (rapid assembly)
Trusted data
service
Analytical
service
Decision
service
Prescriptive analytical pipeline (rapid assembly)
BI report /
dashboard /
story
Trusted data
service
New virtual
data service
Enrich data
Trusted data
service
publish newly created analytical products into a catalog
Data Products
33.
Types Of Data Marketplace
▪ Public data marketplaces
• Enables data from multiple data providers to be made available to multiple consumers in one place
• Data is typically immediately accessible via query
• Marketplace provider contractually agrees with data providers to publish data
• Marketplace provider is responsible for loading data and keeping it up to date
• Data can be monetised
▪ Internal data marketplaces
• Enables teams of internal data producers to publish and share data products
• Data is published by data owners or data stewards
• Includes standard processes to govern publishing, sharing and tracking of data products
• Data can be shared internally and with other parties, e.g., suppliers, partners
• Can also hold data purchased from external data providers
• Increasingly also includes analytical ML models available for use across the enterprise
• Data can be monetised
34.
Data Sharing – It is Often Not Good To Provision Data Physically!
[Diagram: ordered data products are each provisioned to consumers as separate physical copies]
Provisioning lots of physical copies could create more chaos
All copies of data need to be catalogued, otherwise you will lose control of governing the data
35.
Data Virtualisation Plays A Key Role In Data Fabric Enterprise Data Marketplace
Operations Because It Minimises The Need To Physically Provision Data
[Diagram: ordered data products are served through a Data Virtualisation layer in place of multiple physical copies]
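The difference between physical and virtual provisioning can be shown with a small analogy in Python using the standard `sqlite3` module. A view inside a single SQLite database is only a stand-in for a data virtualisation layer, which would federate many data stores; the table, view and column names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL);
INSERT INTO orders VALUES (1, 'Ann', 120.0), (2, 'Bob', 80.0);

-- Physical provisioning: a point-in-time copy that must be catalogued and refreshed.
CREATE TABLE orders_copy AS SELECT * FROM orders;

-- Virtual provisioning: a view that reads the source at query time.
CREATE VIEW orders_product AS SELECT order_id, customer, amount FROM orders;
""")

# A new order arrives in the source after both products were provisioned.
conn.execute("INSERT INTO orders VALUES (3, 'Cy', 50.0)")

copy_rows = conn.execute("SELECT COUNT(*) FROM orders_copy").fetchone()[0]
view_rows = conn.execute("SELECT COUNT(*) FROM orders_product").fetchone()[0]
conn.close()
```

The copy has drifted from its source and is one more object to govern, whereas the view has no data of its own to go stale.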
36.
Data Fabric Vs Best of Breed Tools - What Are The Issues?
– Best of Breed Tool Overlap, Lack of Integration, Re-Invention, Don’t Fit Together
[Diagram: overlapping Tools A to G]
• Tool overlap often leads to reinvention across tools
• Components developed in different tools don’t work together
• Lack of integration between tools stalls productivity, can cause rework / reinvention and slows development
37.
Creation of Data Today is Similar to the Picture on the Left When Most Want the Picture on
the Right - We Need Findable, Trusted, Compliant and Re-Usable Data Products!
[Two images shown side by side, separated by "OR"]
Image source: https://ebcwblog.wordpress.com/2014/10/02/how-to-decorate-with-books/
Image source: Maughan Library, London (King's College London Library)
38.
Topics – Where Are We?
• What is Data Fabric and how can it help when you have a distributed data estate?
• What capabilities should you look for in Data Fabric?
• What do these capabilities make possible?
• What is Data Mesh?
• What are data products in a Data Mesh and how can you build them?
• What are the implications of decentralising data engineering, and how do you co-ordinate data
product development?
• How Does Data Fabric Help in Building a Data Mesh?
➢ Is federated governance of data products really possible?
39.
Data Virtualisation Can Offer Universal Access Control Across Multiple Data
Stores and Also Enforce Data Privacy All In One Place
[Diagram: policy definition & maintenance is carried out against classified data in the data catalog. Enforcement happens in a Logical Data Fabric, which provides unified data access control through virtual views and APIs over RDBMS, NoSQL, files, cloud storage, Hadoop, edge and streaming sources. Personal data is dynamically masked at the point of access (shown as XXXX-XXXX).]
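Dynamic masking at the point of access, as shown on this slide, can be sketched as a virtual view that applies one policy to rows from any source. The Python below is a hypothetical illustration: the set of classified columns, the role names and the `virtual_view` function are assumptions, and a real data virtualisation platform would enforce such policies in its query engine.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical catalog output: columns classified as personal data.
CLASSIFIED_PERSONAL: Set[str] = {"card_number"}
# Hypothetical policy: roles allowed to see personal data unmasked.
UNMASKED_ROLES: Set[str] = {"fraud_investigator"}


def mask(value: str) -> str:
    """Replace every letter or digit with X, keeping separators such as '-'."""
    return "".join("X" if ch.isalnum() else ch for ch in value)


def virtual_view(rows: Iterable[Dict[str, str]], role: str) -> List[Dict[str, str]]:
    """Return rows as the given role may see them; the source rows are not modified."""
    if role in UNMASKED_ROLES:
        return [dict(row) for row in rows]
    return [
        {col: mask(val) if col in CLASSIFIED_PERSONAL else val for col, val in row.items()}
        for row in rows
    ]


# Made-up source rows; the same policy would apply whichever store they came from.
source_rows = [{"customer": "Ann", "card_number": "1234-5678"}]

analyst_rows = virtual_view(source_rows, "analyst")
investigator_rows = virtual_view(source_rows, "fraud_investigator")
```

Because masking happens when the view is read, the policy is defined once and the stored data never has to be duplicated in masked form.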
40.
About Mike Ferguson
www.intelligentbusiness.biz
mferguson@intelligentbusiness.biz
@mikeferguson1
(+44) 1625 520700
Mike Ferguson is Managing Director of Intelligent Business Strategies Limited. As an
independent IT industry analyst and consultant, he specialises in BI / analytics and data
management. With over 40 years of IT experience, Mike has consulted for dozens of
companies on BI/Analytics, data strategy, technology selection, enterprise architecture,
and data management. Mike is also conference chairman of Big Data LDN, the largest
data and analytics conference in Europe and a member of the EDM Council CDMC
Executive Advisory Board. He has spoken at events all over the world and written
numerous articles. Formerly he was a principal and co-founder of Codd and Date – the
inventors of the Relational Model, Chief Architect at Teradata on the Teradata DBMS and
European Managing Director of Database Associates. He teaches popular master
classes in Data Warehouse Modernisation, Practical Guidelines for Implementing a Data
Mesh, Big Data, Centralised Data Governance of a Distributed Data Landscape,
Machine Learning and Advanced Analytics, and Embedded Analytics, Intelligent Apps
and AI Automation.
Thank You!
45. Master Data Analytical Data Retail Data
Connection layer
Integration layer
Data Products layer
Reporting layer
JDBC ODBC REST GraphQL OData
52. DENODO DATAFEST EMEA 2023
The Agile Data Management
and Analytics Conference
www.denododatafest.com/EMEA