In this conference session we share how we are using Tableau “out of the box” and describe how it fits into our overall data environment. In addition, we’ll describe how we expect to use the Data Catalog and Object Model, our explorations of large-scale data stores, and the challenges we are working on, including governance and data lineage. A video of the session can be viewed here: https://youtu.be/Nr24tw3dmZQ
13. ● Data and Analytics are embraced across the company
○ Engineering, UX, Customer Service, Finance, & more
● A/B Testing of almost everything...
○ Product, Signup Methods, Payments, Messaging, & more
● Algorithms for...
○ Recommendations, Content, Marketing, & more
Data is Ubiquitous
BLAKE IRVINE | TABLEAU CONFERENCE 2018
14. Employees
5000 employees
300 in data teams
200+ in dedicated analytic teams
40. ● Vertical Teams
Organization
Content, Marketing, Growth, Tech
Data Engineering
Science & Analytics
Business Teams
Analytic Teams
Engineering Teams
#content-analytics
#marketing-analytics
#growth-analytics
#tech-analytics
41. ● Growing user base
● We’ve started up:
○ A Tableau User Group
○ Education tracks
● Early days... much more to do here!
○ Office Hours
○ Tableau Days
○ Data Doctor & more
Community
47. ● The vast majority of our data sources are Extracts
○ Very few live connections
● Why?
○ BIG DATA
○ Some direct connections to Presto or MPP
● Extracts provide an aggregation and caching layer
We Love Data Extracts!
49. 1 Use Big Data Portal to develop query
2 Commit query to ETL repository & deploy
3 Configure ETL workflow so data dependencies are met
4 Use ETL job to publish TDE to server
5 Connect to TDE, Develop Viz, Publish to server, Share
“Best Practice” Pattern
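Step 3 of this pattern (making sure data dependencies are met) can be sketched as a topological sort over the jobs in a workflow. This is only an illustration using Python's standard library; the job names are hypothetical and the actual ETL tooling is not described in the session.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical job names. Each key maps to the set of jobs it depends on.
dependencies = {
    "load_raw_events": set(),
    "build_daily_aggregate": {"load_raw_events"},
    "publish_tde_to_server": {"build_daily_aggregate"},
}


def run_order(deps):
    """Return the jobs in an order that satisfies every dependency.

    Raises graphlib.CycleError if the dependencies contain a cycle.
    """
    return list(TopologicalSorter(deps).static_order())


order = run_order(dependencies)
print(order)
```

The publish step comes last, so the extract is only pushed to the server once the data it depends on has been built.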
50. 1 Use Big Data Portal to develop query
2 Paste the query into Tableau
3 Develop Viz
4 Publish, and Schedule data refresh on Tableau Server
“Self-Serve” Pattern
51. ● “Best Practice” Pattern is:
○ More robust
○ But complex
● “Self-Serve” Pattern is:
○ Easy and convenient
○ Less scalable
○ Harder to manage
Dilemma...
57. We have REALLY big data
1 Trillion New Data Events Daily
150 Petabyte Warehouse
300 Terabytes Written Daily
5 Petabytes Read Daily
58. ● Data volume
● Level of Detail
Constantly Balancing
● Speed of access
● Data prep
59. Development Choices
              Choice 1             Choice 2       Choice 3
Data Engine   MPP                  Cloud          TDE
Data Size     < 1B rows            < 10B rows     < 100M rows
Performance   Up to many minutes   Many minutes   Up to many seconds
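The size limits in this comparison can be read as a simple lookup. The sketch below is one possible reading, not a rule stated in the session: it assumes you pick the engine with the smallest size limit that still fits the data, and the function name and fallback label are hypothetical.

```python
def choose_data_engine(row_count: int) -> str:
    """Pick a data engine from the row limits on the Development Choices slide.

    Assumption: prefer the engine with the smallest limit that still fits,
    since the slide lists TDE as the fastest of the three.
    """
    if row_count < 0:
        raise ValueError("row_count must be non-negative")
    if row_count < 100_000_000:        # TDE: < 100M rows
        return "TDE"
    if row_count < 1_000_000_000:      # MPP: < 1B rows
        return "MPP"
    if row_count < 10_000_000_000:     # Cloud: < 10B rows
        return "Cloud"
    return "Custom analytic tool"      # larger than any of the three choices


for rows in (5_000_000, 500_000_000, 5_000_000_000, 50_000_000_000):
    print(rows, choose_data_engine(rows))
```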
60. ● For REALLY big data use cases
● For very fast interactivity
● For custom UI/UX/dataviz
● Custom Analytic Tools
○ Web app built with Javascript
○ Data stored in Druid
Choice 4...
61. ● Druid
○ An open source data system for analytic applications
○ Distributed, horizontally scalable architecture
○ VERY, VERY fast
○ Queries are in JSON format to REST endpoint
Druid white paper: http://static.druid.io/docs/druid.pdf
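To illustrate "queries are in JSON format to REST endpoint", here is a minimal sketch that builds a Druid native timeseries query and wraps it in an HTTP POST. The broker URL, the data source name, and the interval are assumptions made for the example; the request is only constructed, not sent.

```python
import json
import urllib.request

# Assumption: a Druid broker reachable locally on its default port.
BROKER_URL = "http://localhost:8082/druid/v2/"


def build_timeseries_request(datasource: str, interval: str) -> urllib.request.Request:
    """Build a POST request carrying a Druid native timeseries query."""
    query = {
        "queryType": "timeseries",
        "dataSource": datasource,
        "granularity": "day",
        "intervals": [interval],  # ISO 8601 interval, start/end
        "aggregations": [{"type": "count", "name": "rows"}],
    }
    body = json.dumps(query).encode("utf-8")
    return urllib.request.Request(
        BROKER_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# "playback_events" is a hypothetical data source name.
request = build_timeseries_request("playback_events", "2018-10-01/2018-10-02")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` would send it to the broker, which answers with a JSON array of results.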
62. ● Can we connect Tableau to Druid?
○ All the performance benefits of Druid...
○ Tableau or web apps use same data store…
● We are exploring this...
○ There is now a Druid SQL layer based on Apache Calcite
○ Have done some testing, finding limitations
Tableau ?
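For reference, the Druid SQL layer accepts a JSON object with a `query` field, posted to the broker's `/druid/v2/sql` path. The sketch below only builds that payload; the table name is hypothetical, and nothing here reflects how Tableau itself would connect.

```python
import json

# Path of the Druid SQL endpoint on the broker.
SQL_ENDPOINT_PATH = "/druid/v2/sql"


def build_sql_payload(sql: str) -> bytes:
    """Wrap a SQL string in the JSON body expected by the Druid SQL endpoint."""
    return json.dumps({"query": sql}).encode("utf-8")


# "playback_events" is a hypothetical table (Druid data source) name.
sql_payload = build_sql_payload(
    'SELECT COUNT(*) AS "rows" FROM "playback_events" '
    "WHERE __time >= CURRENT_TIMESTAMP - INTERVAL '1' DAY"
)
print(SQL_ENDPOINT_PATH, sql_payload.decode("utf-8"))
```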
63. ● TDE -> Hyper with 2018.2 upgrade
○ Happening now(ish)
○ Expectations: faster for small and medium data (<100M)
● Snowflake
○ Fast for “large” data stores (1B+)
● Data scale is always a challenge!
In the meantime...
65. ● Where did this data come from?
● Can I trust this data?
Challenge 2: Data Lineage
● Tableau PRO: very easy to pull in data, analyze, and publish
● Tableau CON: very easy to pull in data, analyze, and publish
68. ● ...but not about Tableau
We have Data Lineage...
69. ● Can the upcoming Metadata APIs and Object Model help?
● Metadata APIs:
○ Inventory of workbooks, data sources, and metrics
○ Identify similar existing data and workbooks?
● Automate building of similar insights, and integrate with our existing data lineage system
Metadata APIs
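One way to picture "identify similar existing data and workbooks" is to group inventoried data sources by their underlying query. The sketch below is purely illustrative: the inventory records, names, and the whitespace-and-case normalization are assumptions, and it does not call any Tableau API.

```python
import re
from collections import defaultdict


def normalize_sql(sql: str) -> str:
    """Collapse whitespace and lowercase so trivially different queries match."""
    return re.sub(r"\s+", " ", sql).strip().lower()


def group_similar(sources):
    """Return lists of data source names that share the same normalized query."""
    groups = defaultdict(list)
    for name, sql in sources:
        groups[normalize_sql(sql)].append(name)
    return [names for names in groups.values() if len(names) > 1]


# Hypothetical inventory of (data source name, custom SQL) pairs.
inventory = [
    ("Signups by Day", "SELECT day, COUNT(*) FROM signups GROUP BY day"),
    ("Daily Signups", "select day,  count(*)\nfrom signups group by day"),
    ("Payments", "SELECT * FROM payments"),
]

duplicates = group_similar(inventory)
print(duplicates)
```

A real comparison would need something stronger than text normalization, for example comparing the upstream tables and columns each data source reads.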
76. ● Improved layout & pagination
● Export to different formats
● Distribution management: what, who, and when
What we’d like