SlideShare une entreprise Scribd logo
1  sur  22
Télécharger pour lire hors ligne
Real-time
Analytics Using
Apache Pinot
How LinkedIn, Uber Eats and Stripe create
Real Time Dashboards for millions of users.
Agenda
Who is Barkha? (why would you want to listen to me?)
The evolution of Analytics
How LinkedIn Solved their Problem
Try some Pinot with me
Overheard @ Big Data Fest 2023
• 5 Year trends in Big data will see
• Streaming APIs
• Will Data Warehouse Survive?
• Integration with LLM/AI/ML
• Thiago de Faria
• 5 Year trends in Big data will see
• Democratization of Data Warehousing
• Commoditized Data Warehousing
• Most companies are barely doing BI let alone AI.
• Joe Reis
About Barkha
• Founder South Florida Women in
Technology
• Developer Advocate @StarTree
• Linkedin.com/in/BarkhaHerman
• Twitter @BarkhaH
Analytics?
Real Time?
Scale?
OH WHY?
Why do we need Real-time
Analytics? Or Analytics? Or at
Scale?
Historic
Analytics
Batch
Shared Data
No Scale Concerns
Modern
Analytics
Data Freshness
Daily reports vs.
How late is my food
delivery?
Query Performance
Reports < 2 minutes vs.
Dashboards take < 10
millisecond to load
Scale
All division managers
worldwide access report
(> 1000) vs.
Millions of users access
dashboard
How LinkedIn
solved Analytics @
Scale
By inventing Pinot
LinkedIn: Who Viewed
your Profile? • Capture profile view information
and its deduplication
• Compute view sources (e.g.,
search, profile page, etc.)
• View relevance (e.g., a senior
leader viewed your profile)
• View obfuscations based on the
viewing member’s privacy settings
Before Pinot
• Elastic Search based solution
• 1000 Nodes
• 1500 queries / sec
• 20+ million users
After Pinot
• 75 Nodes
• 5000 queries / sec
• 70+ million users
Pinot
Building
Blocks • Segment is the physical
store.
• Table are conceptual and
accept both real-time and
batch data.
• Tenants provide
functional segregation.
• Cluster allow for scale
based on use.
Pinot
Building
Blocks
Indexes
Pinot
supports
the
following
indexing
techniques
Inverted index - Used for exact lookups
Range index - Used for range queries.
Text index - Used for phrase, term, Boolean, prefix, or regex queries.
Geospatial index - Based on H3, a hexagon-based hierarchical gridding.
Used for finding points that exist within a certain distance from another point.
JSON index - Used for querying columns in JSON documents.
Star-Tree index - Pre-aggregates results across multiple columns.
StarTree Index
Don’t pre cube everything…
Apache Pinot Architecture
Demo
Pizza Shop Demo
https://github.com/startreedata/pizza-shop-demo
Overheard @ Big Data Fest 2023
• 5 Year trends in Big data will see
• Streaming APIs  Apache Pinot is built to solve Streaming First Problems
• Will Data Warehouse Survive?  Apache Pinot builds Customer Facing Analytics which is on the rise
• Integration with LLM/AI/ML  Apps built on top of Pinot such as ThirdEye use Statistics and allow for AI/ML Add Ons.
• Thiago de Faria
• 5 Year trends in Big data will see
• Democratization of Data Warehousing  Apache Pinot builds Customer Facing Analytics which is on the rise
• Commoditized Data Warehousing  Apache Pinot builds Customer Facing Analytics which is on the rise
• Most companies are barely doing BI let alone AI.  Easy Analytics + Apps built on top of Pinot such as ThirdEye.
• Joe Reis
Using Real-
time Analytics
@ Scale
What can you do with it?
Who Uses
Apache
Pinot?
What’s Next?
Please Connect!!!!! I need brownie points.
Thank you for listening!

Contenu connexe

Similaire à Data Engineer's Lunch 96: Intro to Real Time Analytics Using Apache Pinot

Big Data Developer Career Path: Job & Interview Preparation
Big Data Developer Career Path: Job & Interview PreparationBig Data Developer Career Path: Job & Interview Preparation
Big Data Developer Career Path: Job & Interview PreparationIntellipaat
 
Big Data Developer Career Path: Job & Interview Preparation
Big Data Developer Career Path: Job & Interview PreparationBig Data Developer Career Path: Job & Interview Preparation
Big Data Developer Career Path: Job & Interview PreparationIntellipaat
 
Future-Proof Your Streaming Analytics Architecture- StreamAnalytix Webinar
Future-Proof Your Streaming Analytics Architecture- StreamAnalytix WebinarFuture-Proof Your Streaming Analytics Architecture- StreamAnalytix Webinar
Future-Proof Your Streaming Analytics Architecture- StreamAnalytix WebinarImpetus Technologies
 
Future-Proof Your Streaming Analytics Architecture- StreamAnalytix Webinar
Future-Proof Your Streaming Analytics Architecture- StreamAnalytix WebinarFuture-Proof Your Streaming Analytics Architecture- StreamAnalytix Webinar
Future-Proof Your Streaming Analytics Architecture- StreamAnalytix WebinarImpetus Technologies
 
OC Big Data Monthly Meetup #6 - Session 1 - IBM
OC Big Data Monthly Meetup #6 - Session 1 - IBMOC Big Data Monthly Meetup #6 - Session 1 - IBM
OC Big Data Monthly Meetup #6 - Session 1 - IBMBig Data Joe™ Rossi
 
SD Big Data Monthly Meetup #4 - Session 1 - IBM
SD Big Data Monthly Meetup #4 - Session 1 - IBMSD Big Data Monthly Meetup #4 - Session 1 - IBM
SD Big Data Monthly Meetup #4 - Session 1 - IBMBig Data Joe™ Rossi
 
OC Big Data Monthly Meetup #6 - Session 1 - IBM
OC Big Data Monthly Meetup #6 - Session 1 - IBMOC Big Data Monthly Meetup #6 - Session 1 - IBM
OC Big Data Monthly Meetup #6 - Session 1 - IBMBig Data Joe™ Rossi
 
SD Big Data Monthly Meetup #4 - Session 1 - IBM
SD Big Data Monthly Meetup #4 - Session 1 - IBMSD Big Data Monthly Meetup #4 - Session 1 - IBM
SD Big Data Monthly Meetup #4 - Session 1 - IBMBig Data Joe™ Rossi
 
Your Data Nerd Friends Need You!
Your Data Nerd Friends Need You!Your Data Nerd Friends Need You!
Your Data Nerd Friends Need You! DataKitchen
 
Your Data Nerd Friends Need You!
Your Data Nerd Friends Need You!Your Data Nerd Friends Need You!
Your Data Nerd Friends Need You! DataKitchen
 
Hadoop 2015: what we larned -Think Big, A Teradata Company
Hadoop 2015: what we larned -Think Big, A Teradata CompanyHadoop 2015: what we larned -Think Big, A Teradata Company
Hadoop 2015: what we larned -Think Big, A Teradata CompanyDataWorks Summit
 
Top Reasons to Choose Python for Your Next Web Development Project
Top Reasons to Choose Python for Your Next Web Development ProjectTop Reasons to Choose Python for Your Next Web Development Project
Top Reasons to Choose Python for Your Next Web Development ProjectCapital Numbers
 
Hadoop 2015: what we larned -Think Big, A Teradata Company
Hadoop 2015: what we larned -Think Big, A Teradata CompanyHadoop 2015: what we larned -Think Big, A Teradata Company
Hadoop 2015: what we larned -Think Big, A Teradata CompanyDataWorks Summit
 
Top Reasons to Choose Python for Your Next Web Development Project
Top Reasons to Choose Python for Your Next Web Development ProjectTop Reasons to Choose Python for Your Next Web Development Project
Top Reasons to Choose Python for Your Next Web Development ProjectCapital Numbers
 
A developer's introduction to big data processing with Azure Databricks
A developer's introduction to big data processing with Azure DatabricksA developer's introduction to big data processing with Azure Databricks
A developer's introduction to big data processing with Azure DatabricksMicrosoft Tech Community
 
A developer's introduction to big data processing with Azure Databricks
A developer's introduction to big data processing with Azure DatabricksA developer's introduction to big data processing with Azure Databricks
A developer's introduction to big data processing with Azure DatabricksMicrosoft Tech Community
 
The State of Streaming Analytics: The Need for Speed and Scale
The State of Streaming Analytics: The Need for Speed and ScaleThe State of Streaming Analytics: The Need for Speed and Scale
The State of Streaming Analytics: The Need for Speed and ScaleVoltDB
 
The State of Streaming Analytics: The Need for Speed and Scale
The State of Streaming Analytics: The Need for Speed and ScaleThe State of Streaming Analytics: The Need for Speed and Scale
The State of Streaming Analytics: The Need for Speed and ScaleVoltDB
 

Similaire à Data Engineer's Lunch 96: Intro to Real Time Analytics Using Apache Pinot (20)

Big Data Developer Career Path: Job & Interview Preparation
Big Data Developer Career Path: Job & Interview PreparationBig Data Developer Career Path: Job & Interview Preparation
Big Data Developer Career Path: Job & Interview Preparation
 
Big Data Developer Career Path: Job & Interview Preparation
Big Data Developer Career Path: Job & Interview PreparationBig Data Developer Career Path: Job & Interview Preparation
Big Data Developer Career Path: Job & Interview Preparation
 
Future-Proof Your Streaming Analytics Architecture- StreamAnalytix Webinar
Future-Proof Your Streaming Analytics Architecture- StreamAnalytix WebinarFuture-Proof Your Streaming Analytics Architecture- StreamAnalytix Webinar
Future-Proof Your Streaming Analytics Architecture- StreamAnalytix Webinar
 
Future-Proof Your Streaming Analytics Architecture- StreamAnalytix Webinar
Future-Proof Your Streaming Analytics Architecture- StreamAnalytix WebinarFuture-Proof Your Streaming Analytics Architecture- StreamAnalytix Webinar
Future-Proof Your Streaming Analytics Architecture- StreamAnalytix Webinar
 
Intro big data analytics
Intro big data analyticsIntro big data analytics
Intro big data analytics
 
Intro big data analytics
Intro big data analyticsIntro big data analytics
Intro big data analytics
 
OC Big Data Monthly Meetup #6 - Session 1 - IBM
OC Big Data Monthly Meetup #6 - Session 1 - IBMOC Big Data Monthly Meetup #6 - Session 1 - IBM
OC Big Data Monthly Meetup #6 - Session 1 - IBM
 
SD Big Data Monthly Meetup #4 - Session 1 - IBM
SD Big Data Monthly Meetup #4 - Session 1 - IBMSD Big Data Monthly Meetup #4 - Session 1 - IBM
SD Big Data Monthly Meetup #4 - Session 1 - IBM
 
OC Big Data Monthly Meetup #6 - Session 1 - IBM
OC Big Data Monthly Meetup #6 - Session 1 - IBMOC Big Data Monthly Meetup #6 - Session 1 - IBM
OC Big Data Monthly Meetup #6 - Session 1 - IBM
 
SD Big Data Monthly Meetup #4 - Session 1 - IBM
SD Big Data Monthly Meetup #4 - Session 1 - IBMSD Big Data Monthly Meetup #4 - Session 1 - IBM
SD Big Data Monthly Meetup #4 - Session 1 - IBM
 
Your Data Nerd Friends Need You!
Your Data Nerd Friends Need You!Your Data Nerd Friends Need You!
Your Data Nerd Friends Need You!
 
Your Data Nerd Friends Need You!
Your Data Nerd Friends Need You!Your Data Nerd Friends Need You!
Your Data Nerd Friends Need You!
 
Hadoop 2015: what we larned -Think Big, A Teradata Company
Hadoop 2015: what we larned -Think Big, A Teradata CompanyHadoop 2015: what we larned -Think Big, A Teradata Company
Hadoop 2015: what we larned -Think Big, A Teradata Company
 
Top Reasons to Choose Python for Your Next Web Development Project
Top Reasons to Choose Python for Your Next Web Development ProjectTop Reasons to Choose Python for Your Next Web Development Project
Top Reasons to Choose Python for Your Next Web Development Project
 
Hadoop 2015: what we larned -Think Big, A Teradata Company
Hadoop 2015: what we larned -Think Big, A Teradata CompanyHadoop 2015: what we larned -Think Big, A Teradata Company
Hadoop 2015: what we larned -Think Big, A Teradata Company
 
Top Reasons to Choose Python for Your Next Web Development Project
Top Reasons to Choose Python for Your Next Web Development ProjectTop Reasons to Choose Python for Your Next Web Development Project
Top Reasons to Choose Python for Your Next Web Development Project
 
A developer's introduction to big data processing with Azure Databricks
A developer's introduction to big data processing with Azure DatabricksA developer's introduction to big data processing with Azure Databricks
A developer's introduction to big data processing with Azure Databricks
 
A developer's introduction to big data processing with Azure Databricks
A developer's introduction to big data processing with Azure DatabricksA developer's introduction to big data processing with Azure Databricks
A developer's introduction to big data processing with Azure Databricks
 
The State of Streaming Analytics: The Need for Speed and Scale
The State of Streaming Analytics: The Need for Speed and ScaleThe State of Streaming Analytics: The Need for Speed and Scale
The State of Streaming Analytics: The Need for Speed and Scale
 
The State of Streaming Analytics: The Need for Speed and Scale
The State of Streaming Analytics: The Need for Speed and ScaleThe State of Streaming Analytics: The Need for Speed and Scale
The State of Streaming Analytics: The Need for Speed and Scale
 

Plus de Anant Corporation

QLoRA Fine-Tuning on Cassandra Link Data Set (1/2) Cassandra Lunch 137
QLoRA Fine-Tuning on Cassandra Link Data Set (1/2) Cassandra Lunch 137QLoRA Fine-Tuning on Cassandra Link Data Set (1/2) Cassandra Lunch 137
QLoRA Fine-Tuning on Cassandra Link Data Set (1/2) Cassandra Lunch 137Anant Corporation
 
Kono.IntelCraft.Weekly.AI.LLM.Landscape.2024.02.28.pdf
Kono.IntelCraft.Weekly.AI.LLM.Landscape.2024.02.28.pdfKono.IntelCraft.Weekly.AI.LLM.Landscape.2024.02.28.pdf
Kono.IntelCraft.Weekly.AI.LLM.Landscape.2024.02.28.pdfAnant Corporation
 
NoCode, Data & AI LLM Inside Bootcamp: Episode 6 - Design Patterns: Retrieval...
NoCode, Data & AI LLM Inside Bootcamp: Episode 6 - Design Patterns: Retrieval...NoCode, Data & AI LLM Inside Bootcamp: Episode 6 - Design Patterns: Retrieval...
NoCode, Data & AI LLM Inside Bootcamp: Episode 6 - Design Patterns: Retrieval...Anant Corporation
 
Automate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPT
Automate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPTAutomate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPT
Automate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPTAnant Corporation
 
Episode 2: The LLM / GPT / AI Prompt / Data Engineer Roadmap
Episode 2: The LLM / GPT / AI Prompt / Data Engineer RoadmapEpisode 2: The LLM / GPT / AI Prompt / Data Engineer Roadmap
Episode 2: The LLM / GPT / AI Prompt / Data Engineer RoadmapAnant Corporation
 
Machine Learning Orchestration with Airflow
Machine Learning Orchestration with AirflowMachine Learning Orchestration with Airflow
Machine Learning Orchestration with AirflowAnant Corporation
 
Cassandra Lunch 130: Recap of Cassandra Forward Talks
Cassandra Lunch 130: Recap of Cassandra Forward TalksCassandra Lunch 130: Recap of Cassandra Forward Talks
Cassandra Lunch 130: Recap of Cassandra Forward TalksAnant Corporation
 
Data Engineer's Lunch 90: Migrating SQL Data with Arcion
Data Engineer's Lunch 90: Migrating SQL Data with ArcionData Engineer's Lunch 90: Migrating SQL Data with Arcion
Data Engineer's Lunch 90: Migrating SQL Data with ArcionAnant Corporation
 
Data Engineer's Lunch 89: Machine Learning Orchestration with AirflowMachine ...
Data Engineer's Lunch 89: Machine Learning Orchestration with AirflowMachine ...Data Engineer's Lunch 89: Machine Learning Orchestration with AirflowMachine ...
Data Engineer's Lunch 89: Machine Learning Orchestration with AirflowMachine ...Anant Corporation
 
Cassandra Lunch 129: What’s New: Apache Cassandra 4.1+ Features & Future
Cassandra Lunch 129: What’s New:  Apache Cassandra 4.1+ Features & FutureCassandra Lunch 129: What’s New:  Apache Cassandra 4.1+ Features & Future
Cassandra Lunch 129: What’s New: Apache Cassandra 4.1+ Features & FutureAnant Corporation
 
Data Engineer's Lunch #86: Building Real-Time Applications at Scale: A Case S...
Data Engineer's Lunch #86: Building Real-Time Applications at Scale: A Case S...Data Engineer's Lunch #86: Building Real-Time Applications at Scale: A Case S...
Data Engineer's Lunch #86: Building Real-Time Applications at Scale: A Case S...Anant Corporation
 
Data Engineer's Lunch #85: Designing a Modern Data Stack
Data Engineer's Lunch #85: Designing a Modern Data StackData Engineer's Lunch #85: Designing a Modern Data Stack
Data Engineer's Lunch #85: Designing a Modern Data StackAnant Corporation
 
Data Engineer's Lunch #83: Strategies for Migration to Apache Iceberg
Data Engineer's Lunch #83: Strategies for Migration to Apache IcebergData Engineer's Lunch #83: Strategies for Migration to Apache Iceberg
Data Engineer's Lunch #83: Strategies for Migration to Apache IcebergAnant Corporation
 
Apache Cassandra Lunch 120: Apache Cassandra Monitoring Made Easy with AxonOps
Apache Cassandra Lunch 120: Apache Cassandra Monitoring Made Easy with AxonOpsApache Cassandra Lunch 120: Apache Cassandra Monitoring Made Easy with AxonOps
Apache Cassandra Lunch 120: Apache Cassandra Monitoring Made Easy with AxonOpsAnant Corporation
 
Apache Cassandra Lunch 119: Desktop GUI Tools for Apache Cassandra
Apache Cassandra Lunch 119: Desktop GUI Tools for Apache CassandraApache Cassandra Lunch 119: Desktop GUI Tools for Apache Cassandra
Apache Cassandra Lunch 119: Desktop GUI Tools for Apache CassandraAnant Corporation
 
Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...
Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...
Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...Anant Corporation
 
Data Engineer's Lunch #60: Series - Developing Enterprise Consciousness
Data Engineer's Lunch #60: Series - Developing Enterprise ConsciousnessData Engineer's Lunch #60: Series - Developing Enterprise Consciousness
Data Engineer's Lunch #60: Series - Developing Enterprise ConsciousnessAnant Corporation
 
Data Engineer's Lunch #81: Reverse ETL Tools for Modern Data Platforms
Data Engineer's Lunch #81: Reverse ETL Tools for Modern Data PlatformsData Engineer's Lunch #81: Reverse ETL Tools for Modern Data Platforms
Data Engineer's Lunch #81: Reverse ETL Tools for Modern Data PlatformsAnant Corporation
 

Plus de Anant Corporation (20)

QLoRA Fine-Tuning on Cassandra Link Data Set (1/2) Cassandra Lunch 137
QLoRA Fine-Tuning on Cassandra Link Data Set (1/2) Cassandra Lunch 137QLoRA Fine-Tuning on Cassandra Link Data Set (1/2) Cassandra Lunch 137
QLoRA Fine-Tuning on Cassandra Link Data Set (1/2) Cassandra Lunch 137
 
Kono.IntelCraft.Weekly.AI.LLM.Landscape.2024.02.28.pdf
Kono.IntelCraft.Weekly.AI.LLM.Landscape.2024.02.28.pdfKono.IntelCraft.Weekly.AI.LLM.Landscape.2024.02.28.pdf
Kono.IntelCraft.Weekly.AI.LLM.Landscape.2024.02.28.pdf
 
NoCode, Data & AI LLM Inside Bootcamp: Episode 6 - Design Patterns: Retrieval...
NoCode, Data & AI LLM Inside Bootcamp: Episode 6 - Design Patterns: Retrieval...NoCode, Data & AI LLM Inside Bootcamp: Episode 6 - Design Patterns: Retrieval...
NoCode, Data & AI LLM Inside Bootcamp: Episode 6 - Design Patterns: Retrieval...
 
Automate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPT
Automate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPTAutomate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPT
Automate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPT
 
YugabyteDB Developer Tools
YugabyteDB Developer ToolsYugabyteDB Developer Tools
YugabyteDB Developer Tools
 
Episode 2: The LLM / GPT / AI Prompt / Data Engineer Roadmap
Episode 2: The LLM / GPT / AI Prompt / Data Engineer RoadmapEpisode 2: The LLM / GPT / AI Prompt / Data Engineer Roadmap
Episode 2: The LLM / GPT / AI Prompt / Data Engineer Roadmap
 
Machine Learning Orchestration with Airflow
Machine Learning Orchestration with AirflowMachine Learning Orchestration with Airflow
Machine Learning Orchestration with Airflow
 
Cassandra Lunch 130: Recap of Cassandra Forward Talks
Cassandra Lunch 130: Recap of Cassandra Forward TalksCassandra Lunch 130: Recap of Cassandra Forward Talks
Cassandra Lunch 130: Recap of Cassandra Forward Talks
 
Data Engineer's Lunch 90: Migrating SQL Data with Arcion
Data Engineer's Lunch 90: Migrating SQL Data with ArcionData Engineer's Lunch 90: Migrating SQL Data with Arcion
Data Engineer's Lunch 90: Migrating SQL Data with Arcion
 
Data Engineer's Lunch 89: Machine Learning Orchestration with AirflowMachine ...
Data Engineer's Lunch 89: Machine Learning Orchestration with AirflowMachine ...Data Engineer's Lunch 89: Machine Learning Orchestration with AirflowMachine ...
Data Engineer's Lunch 89: Machine Learning Orchestration with AirflowMachine ...
 
Cassandra Lunch 129: What’s New: Apache Cassandra 4.1+ Features & Future
Cassandra Lunch 129: What’s New:  Apache Cassandra 4.1+ Features & FutureCassandra Lunch 129: What’s New:  Apache Cassandra 4.1+ Features & Future
Cassandra Lunch 129: What’s New: Apache Cassandra 4.1+ Features & Future
 
Data Engineer's Lunch #86: Building Real-Time Applications at Scale: A Case S...
Data Engineer's Lunch #86: Building Real-Time Applications at Scale: A Case S...Data Engineer's Lunch #86: Building Real-Time Applications at Scale: A Case S...
Data Engineer's Lunch #86: Building Real-Time Applications at Scale: A Case S...
 
Data Engineer's Lunch #85: Designing a Modern Data Stack
Data Engineer's Lunch #85: Designing a Modern Data StackData Engineer's Lunch #85: Designing a Modern Data Stack
Data Engineer's Lunch #85: Designing a Modern Data Stack
 
CL 121
CL 121CL 121
CL 121
 
Data Engineer's Lunch #83: Strategies for Migration to Apache Iceberg
Data Engineer's Lunch #83: Strategies for Migration to Apache IcebergData Engineer's Lunch #83: Strategies for Migration to Apache Iceberg
Data Engineer's Lunch #83: Strategies for Migration to Apache Iceberg
 
Apache Cassandra Lunch 120: Apache Cassandra Monitoring Made Easy with AxonOps
Apache Cassandra Lunch 120: Apache Cassandra Monitoring Made Easy with AxonOpsApache Cassandra Lunch 120: Apache Cassandra Monitoring Made Easy with AxonOps
Apache Cassandra Lunch 120: Apache Cassandra Monitoring Made Easy with AxonOps
 
Apache Cassandra Lunch 119: Desktop GUI Tools for Apache Cassandra
Apache Cassandra Lunch 119: Desktop GUI Tools for Apache CassandraApache Cassandra Lunch 119: Desktop GUI Tools for Apache Cassandra
Apache Cassandra Lunch 119: Desktop GUI Tools for Apache Cassandra
 
Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...
Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...
Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...
 
Data Engineer's Lunch #60: Series - Developing Enterprise Consciousness
Data Engineer's Lunch #60: Series - Developing Enterprise ConsciousnessData Engineer's Lunch #60: Series - Developing Enterprise Consciousness
Data Engineer's Lunch #60: Series - Developing Enterprise Consciousness
 
Data Engineer's Lunch #81: Reverse ETL Tools for Modern Data Platforms
Data Engineer's Lunch #81: Reverse ETL Tools for Modern Data PlatformsData Engineer's Lunch #81: Reverse ETL Tools for Modern Data Platforms
Data Engineer's Lunch #81: Reverse ETL Tools for Modern Data Platforms
 

Dernier

SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Neo4j
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024BookNet Canada
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Science&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfScience&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfjimielynbastida
 
costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentationphoebematthew05
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 

Dernier (20)

SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort ServiceHot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Science&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfScience&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdf
 
costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentation
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 

Data Engineer's Lunch 96: Intro to Real Time Analytics Using Apache Pinot

  • 1. Real-time Analytics Using Apache Pinot How LinkedIn, Uber Eats and Stripe create Real Time Dashboards for millions of users.
  • 2. Agenda Who is Barkha? (why would you want to listen to me?) The evolution of Analytics How LinkedIn Solved their Problem Try some Pinot with me
  • 3. Overheard @ Big Data Fest 2023 • 5 Year trends in Big data will see • Streaming APIs • Will Data Warehouse Survive? • Integration with LLM/AI/ML • Thiago de Faria • 5 Year trends in Big data will see • Democratization of Data Warehousing • Commoditized Data Warehousing • Most companies are barely doing BI let alone AI. • Joe Reis
  • 4. About Barkha • Founder South Florida Women in Technology • Developer Advocate @StarTree • Linkedin.com/in/BarkhaHerman • Twitter @BarkhaH
  • 5. Analytics? Real Time? Scale? OH WHY? Why do we need Real-time Analytics? Or Analytics? Or at Scale?
  • 7. Modern Analytics Data Freshness Daily reports vs. How late is my food delivery? Query Performance Reports < 2 minutes vs. Dashboards take < 10 millisecond to load Scale All division managers worldwide access report (> 1000) vs. Millions of users access dashboard
  • 8. How LinkedIn solved Analytics @ Scale By inventing Pinot
  • 9. LinkedIn: Who Viewed your Profile? • Capture profile view information and its deduplication • Compute view sources (e.g., search, profile page, etc.) • View relevance (e.g., a senior leader viewed your profile) • View obfuscations based on the viewing member’s privacy settings
  • 10. Before Pinot • Elastic Search based solution • 1000 Nodes • 1500 queries / sec • 20+ million users
  • 11. After Pinot • 75 Nodes • 5000 queries / sec • 70+ million users
  • 12. Pinot Building Blocks • Segment is the physical store. • Table are conceptual and accept both real-time and batch data. • Tenants provide functional segregation. • Cluster allow for scale based on use.
  • 14. Indexes Pinot supports the following indexing techniques Inverted index - Used for exact lookups Range index - Used for range queries. Text index - Used for phrase, term, Boolean, prefix, or regex queries. Geospatial index - Based on H3, a hexagon-based hierarchical gridding. Used for finding points that exist within a certain distance from another point. JSON index - Used for querying columns in JSON documents. Star-Tree index - Pre-aggregates results across multiple columns.
  • 15. StarTree Index Don’t pre cube everything…
  • 18. Overheard @ Big Data Fest 2023 • 5 Year trends in Big data will see • Streaming APIs  Apache Pinot is built to solve Streaming First Problems • Will Data Warehouse Survive?  Apache Pinot builds Customer Facing Analytics which is on the rise • Integration with LLM/AI/ML  Apps built on top of Pinot such as ThirdEye use Statistics and allow for AI/ML Add Ons. • Thiago de Faria • 5 Year trends in Big data will see • Democratization of Data Warehousing  Apache Pinot builds Customer Facing Analytics which is on the rise • Commoditized Data Warehousing  Apache Pinot builds Customer Facing Analytics which is on the rise • Most companies are barely doing BI let alone AI.  Easy Analytics + Apps built on top of Pinot such as ThirdEye. • Joe Reis
  • 19. Using Real- time Analytics @ Scale What can you do with it?
  • 21. What’s Next? Please Connect!!!!! I need brownie points.
  • 22. Thank you for listening!