SlideShare une entreprise Scribd logo
1  sur  24
State of the Project
Martin Traverso July 16, 2018
Presto turns 6!
Presto @ Facebook
• Multiple uses

• Warehouse ETL and ad-hoc analytics

• Dashboards

• Analytics backend for A/B testing system

• Analytics backed for user facing products

• 1000s of nodes across several clusters

• 100s of PBs and quadrillions of rows processed/day

• > 70% of new warehouse ETL workloads on Presto
Since last year's meetup...
• 30 releases (0.176 to 0.205)

• ~3,500 commits (13,800 total)

• ~140 contributors (330 total)
Workload management
• Fairness in local scheduler

• Resource group improvements

• Query runtime limits
Since last year's meetup...
Security
• Authentication

• Client certificates

• Password

• JSON Web Tokens (RFC 7519)

• Worker-to-worker encryption
Since last year's meetup...
Security
• Column-level access control
public interface ConnectorAccessControl
{
...
default void checkCanSelectFromColumns(
ConnectorTransactionHandle transaction,
Identity identity,
SchemaTableName table,
Set<String> columns)
}
Since last year's meetup...
Execution
• Partitioned (a.k.a "bucket by bucket") query execution

• Distributed sort

• Dynamic writer scaling

• Support for JOIN spilling

• Improved memory accounting

• Java 9 support

• ... many, many performance and reliability improvements
Since last year's meetup...
Language
• Geospatial functions and optimizations (SQL/MM Part 3)

• Language constructs

• GROUPING

• CURRENT_USER

• New data types

• IPADDRESS

• SET_DIGEST
Since last year's meetup...
Language
• Decimal literals
123.45678 DECIMAL(8,5)
.12345 DECIMAL(5,5)
1e5 DOUBLE
0.3e-4 DOUBLE
123.456e2 DOUBLE
Since last year's meetup...
Language
• Ordered aggregates
SELECT id, array_agg(value ORDER BY index)
FROM t,
UNNEST(t.elements) WITH ORDINALITY
AS u(index, value)
GROUP BY id
Since last year's meetup...
Language
• LATERAL join
SELECT orderkey, total
FROM orders o, LATERAL (
SELECT sum(totalprice) total
FROM orders
WHERE custkey = o.custkey
AND orderdate <= o.orderdate
AND orderpriority > o.orderpriority
) t
Since last year's meetup...
Connectors
• TPC-DS

• Redshift

• Thrift

• Cassandra

• INSERT support
Since last year's meetup...
What's next
Scalability
• Larger clusters (> ~1000 nodes)

• HTTP/2

• Multiple query coordinators
What's next
Language
• TIMESTAMP semantics

• (c1, c2, ..., cN) IN <subquery>

• Improved correlated subqueries

• Equality for complex types containing NULLs

• Function namespaces (SQL path)
What's next
Connectors
• Elasticsearch

• Apache Phoenix

• Kudu
What's next
and more..
• Cost-based join type selection and reordering

• Dynamic filtering

• Warnings framework

• Role management
What's next
Future Projects
Execution
• Failure recovery 

• Automatic skew handling

• Workload re-balancing

• Temporary tables

• Multi-phase query executions
Future Projects
Language
• User-defined functions

• SQL 2016 features

• Polymorphic table functions

• JSON syntax

• Row pattern matching
Future Projects
Others
• Client protocol overhaul

• Plan IR overhaul

• Extensible optimizer rules

• User-defined functions
Future Projects
State of Presto Project Overview

Contenu connexe

Tendances

Presto Fast SQL on Anything
Presto Fast SQL on AnythingPresto Fast SQL on Anything
Presto Fast SQL on AnythingAlluxio, Inc.
 
Presto Summit 2018 - 09 - Netflix Iceberg
Presto Summit 2018  - 09 - Netflix IcebergPresto Summit 2018  - 09 - Netflix Iceberg
Presto Summit 2018 - 09 - Netflix Icebergkbajda
 
Presto Summit 2018 - 08 - FINRA
Presto Summit 2018  - 08 - FINRAPresto Summit 2018  - 08 - FINRA
Presto Summit 2018 - 08 - FINRAkbajda
 
Presto @ Facebook: Past, Present and Future
Presto @ Facebook: Past, Present and FuturePresto @ Facebook: Past, Present and Future
Presto @ Facebook: Past, Present and FutureDataWorks Summit
 
Presto Summit 2018 - 04 - Netflix Containers
Presto Summit 2018 - 04 - Netflix ContainersPresto Summit 2018 - 04 - Netflix Containers
Presto Summit 2018 - 04 - Netflix Containerskbajda
 
presto-at-netflix-hadoop-summit-15
presto-at-netflix-hadoop-summit-15presto-at-netflix-hadoop-summit-15
presto-at-netflix-hadoop-summit-15Zhenxiao Luo
 
Presto: Distributed sql query engine
Presto: Distributed sql query engine Presto: Distributed sql query engine
Presto: Distributed sql query engine kiran palaka
 
Presto @ Treasure Data - Presto Meetup Boston 2015
Presto @ Treasure Data - Presto Meetup Boston 2015Presto @ Treasure Data - Presto Meetup Boston 2015
Presto @ Treasure Data - Presto Meetup Boston 2015Taro L. Saito
 
Presto Summit 2018 - 10 - Qubole
Presto Summit 2018  - 10 - QubolePresto Summit 2018  - 10 - Qubole
Presto Summit 2018 - 10 - Qubolekbajda
 
Presto @ Uber Hadoop summit2017
Presto @ Uber Hadoop summit2017Presto @ Uber Hadoop summit2017
Presto @ Uber Hadoop summit2017Zhenxiao Luo
 
Bleeding Edge Databases
Bleeding Edge DatabasesBleeding Edge Databases
Bleeding Edge DatabasesLynn Langit
 
Superset druid realtime
Superset druid realtimeSuperset druid realtime
Superset druid realtimearupmalakar
 
Netflix running Presto in the AWS Cloud
Netflix running Presto in the AWS CloudNetflix running Presto in the AWS Cloud
Netflix running Presto in the AWS CloudZhenxiao Luo
 
Presto Summit 2018 - 02 - LinkedIn
Presto Summit 2018  - 02 - LinkedInPresto Summit 2018  - 02 - LinkedIn
Presto Summit 2018 - 02 - LinkedInkbajda
 
Securing Data in Hadoop at Uber
Securing Data in Hadoop at UberSecuring Data in Hadoop at Uber
Securing Data in Hadoop at UberDataWorks Summit
 
Presto meetup 2015-03-19 @Facebook
Presto meetup 2015-03-19 @FacebookPresto meetup 2015-03-19 @Facebook
Presto meetup 2015-03-19 @FacebookTreasure Data, Inc.
 
Spark Summit EU 2015: Revolutionizing Big Data in the Enterprise with Spark
Spark Summit EU 2015: Revolutionizing Big Data in the Enterprise with SparkSpark Summit EU 2015: Revolutionizing Big Data in the Enterprise with Spark
Spark Summit EU 2015: Revolutionizing Big Data in the Enterprise with SparkDatabricks
 

Tendances (20)

Presto Fast SQL on Anything
Presto Fast SQL on AnythingPresto Fast SQL on Anything
Presto Fast SQL on Anything
 
Presto Summit 2018 - 09 - Netflix Iceberg
Presto Summit 2018  - 09 - Netflix IcebergPresto Summit 2018  - 09 - Netflix Iceberg
Presto Summit 2018 - 09 - Netflix Iceberg
 
Presto Summit 2018 - 08 - FINRA
Presto Summit 2018  - 08 - FINRAPresto Summit 2018  - 08 - FINRA
Presto Summit 2018 - 08 - FINRA
 
Presto @ Facebook: Past, Present and Future
Presto @ Facebook: Past, Present and FuturePresto @ Facebook: Past, Present and Future
Presto @ Facebook: Past, Present and Future
 
Presto Summit 2018 - 04 - Netflix Containers
Presto Summit 2018 - 04 - Netflix ContainersPresto Summit 2018 - 04 - Netflix Containers
Presto Summit 2018 - 04 - Netflix Containers
 
presto-at-netflix-hadoop-summit-15
presto-at-netflix-hadoop-summit-15presto-at-netflix-hadoop-summit-15
presto-at-netflix-hadoop-summit-15
 
Presto: Distributed sql query engine
Presto: Distributed sql query engine Presto: Distributed sql query engine
Presto: Distributed sql query engine
 
Presto@Uber
Presto@UberPresto@Uber
Presto@Uber
 
Presto @ Treasure Data - Presto Meetup Boston 2015
Presto @ Treasure Data - Presto Meetup Boston 2015Presto @ Treasure Data - Presto Meetup Boston 2015
Presto @ Treasure Data - Presto Meetup Boston 2015
 
Presto Summit 2018 - 10 - Qubole
Presto Summit 2018  - 10 - QubolePresto Summit 2018  - 10 - Qubole
Presto Summit 2018 - 10 - Qubole
 
Presto @ Uber Hadoop summit2017
Presto @ Uber Hadoop summit2017Presto @ Uber Hadoop summit2017
Presto @ Uber Hadoop summit2017
 
Bleeding Edge Databases
Bleeding Edge DatabasesBleeding Edge Databases
Bleeding Edge Databases
 
Quark Virtualization Engine for Analytics
Quark Virtualization Engine for Analytics Quark Virtualization Engine for Analytics
Quark Virtualization Engine for Analytics
 
Hugfr SPARK & RIAK -20160114_hug_france
Hugfr  SPARK & RIAK -20160114_hug_franceHugfr  SPARK & RIAK -20160114_hug_france
Hugfr SPARK & RIAK -20160114_hug_france
 
Superset druid realtime
Superset druid realtimeSuperset druid realtime
Superset druid realtime
 
Netflix running Presto in the AWS Cloud
Netflix running Presto in the AWS CloudNetflix running Presto in the AWS Cloud
Netflix running Presto in the AWS Cloud
 
Presto Summit 2018 - 02 - LinkedIn
Presto Summit 2018  - 02 - LinkedInPresto Summit 2018  - 02 - LinkedIn
Presto Summit 2018 - 02 - LinkedIn
 
Securing Data in Hadoop at Uber
Securing Data in Hadoop at UberSecuring Data in Hadoop at Uber
Securing Data in Hadoop at Uber
 
Presto meetup 2015-03-19 @Facebook
Presto meetup 2015-03-19 @FacebookPresto meetup 2015-03-19 @Facebook
Presto meetup 2015-03-19 @Facebook
 
Spark Summit EU 2015: Revolutionizing Big Data in the Enterprise with Spark
Spark Summit EU 2015: Revolutionizing Big Data in the Enterprise with SparkSpark Summit EU 2015: Revolutionizing Big Data in the Enterprise with Spark
Spark Summit EU 2015: Revolutionizing Big Data in the Enterprise with Spark
 

Similaire à State of Presto Project Overview

Webinar 2017. Supercharge your analytics with ClickHouse. Alexander Zaitsev
Webinar 2017. Supercharge your analytics with ClickHouse. Alexander ZaitsevWebinar 2017. Supercharge your analytics with ClickHouse. Alexander Zaitsev
Webinar 2017. Supercharge your analytics with ClickHouse. Alexander ZaitsevAltinity Ltd
 
WyspaIT 2016 - Azure Stream Analytics i Azure Machine Learning w analizie str...
WyspaIT 2016 - Azure Stream Analytics i Azure Machine Learning w analizie str...WyspaIT 2016 - Azure Stream Analytics i Azure Machine Learning w analizie str...
WyspaIT 2016 - Azure Stream Analytics i Azure Machine Learning w analizie str...Łukasz Grala
 
Time Series Databases for IoT (On-premises and Azure)
Time Series Databases for IoT (On-premises and Azure)Time Series Databases for IoT (On-premises and Azure)
Time Series Databases for IoT (On-premises and Azure)Ivo Andreev
 
A Workshop on R
A Workshop on RA Workshop on R
A Workshop on RAjay Ohri
 
Google app engine - Soft Uni 19.06.2014
Google app engine - Soft Uni 19.06.2014Google app engine - Soft Uni 19.06.2014
Google app engine - Soft Uni 19.06.2014Dimitar Danailov
 
Introduction to PostgreSQL
Introduction to PostgreSQLIntroduction to PostgreSQL
Introduction to PostgreSQLJim Mlodgenski
 
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...Flink Forward
 
TiDB Introduction - San Francisco MySQL Meetup
TiDB Introduction - San Francisco MySQL MeetupTiDB Introduction - San Francisco MySQL Meetup
TiDB Introduction - San Francisco MySQL MeetupMorgan Tocker
 
In memory databases presentation
In memory databases presentationIn memory databases presentation
In memory databases presentationMichael Keane
 
APEX 5 IR: Guts & Performance
APEX 5 IR:  Guts & PerformanceAPEX 5 IR:  Guts & Performance
APEX 5 IR: Guts & PerformanceKaren Cannell
 
APEX 5 IR Guts and Performance
APEX 5 IR Guts and PerformanceAPEX 5 IR Guts and Performance
APEX 5 IR Guts and PerformanceKaren Cannell
 
Advanced SQL For Data Scientists
Advanced SQL For Data ScientistsAdvanced SQL For Data Scientists
Advanced SQL For Data ScientistsDatabricks
 
Apache Tajo: Query Optimization Techniques and JIT-based Vectorized Engine
Apache Tajo: Query Optimization Techniques and JIT-based Vectorized EngineApache Tajo: Query Optimization Techniques and JIT-based Vectorized Engine
Apache Tajo: Query Optimization Techniques and JIT-based Vectorized EngineDataWorks Summit
 
Skillwise - Enhancing dotnet app
Skillwise - Enhancing dotnet appSkillwise - Enhancing dotnet app
Skillwise - Enhancing dotnet appSkillwise Group
 
APEX 5 Interactive Reports: Guts and PErformance
APEX 5 Interactive Reports: Guts and PErformanceAPEX 5 Interactive Reports: Guts and PErformance
APEX 5 Interactive Reports: Guts and PErformanceKaren Cannell
 
Scalable Event Processing with WSO2CEP @ WSO2Con2015eu
Scalable Event Processing with WSO2CEP @  WSO2Con2015euScalable Event Processing with WSO2CEP @  WSO2Con2015eu
Scalable Event Processing with WSO2CEP @ WSO2Con2015euSriskandarajah Suhothayan
 
Stream Processing in Uber
Stream Processing in UberStream Processing in Uber
Stream Processing in UberC4Media
 
New Features in Neo4j 3.4 / 3.3 - Graph Algorithms, Spatial, Date-Time & Visu...
New Features in Neo4j 3.4 / 3.3 - Graph Algorithms, Spatial, Date-Time & Visu...New Features in Neo4j 3.4 / 3.3 - Graph Algorithms, Spatial, Date-Time & Visu...
New Features in Neo4j 3.4 / 3.3 - Graph Algorithms, Spatial, Date-Time & Visu...jexp
 
Practical End-to-End Learning to Rank Using Fusion - Andy Liu, Lucidworks
Practical End-to-End Learning to Rank Using Fusion - Andy Liu, Lucidworks Practical End-to-End Learning to Rank Using Fusion - Andy Liu, Lucidworks
Practical End-to-End Learning to Rank Using Fusion - Andy Liu, Lucidworks Lucidworks
 

Similaire à State of Presto Project Overview (20)

Webinar 2017. Supercharge your analytics with ClickHouse. Alexander Zaitsev
Webinar 2017. Supercharge your analytics with ClickHouse. Alexander ZaitsevWebinar 2017. Supercharge your analytics with ClickHouse. Alexander Zaitsev
Webinar 2017. Supercharge your analytics with ClickHouse. Alexander Zaitsev
 
WyspaIT 2016 - Azure Stream Analytics i Azure Machine Learning w analizie str...
WyspaIT 2016 - Azure Stream Analytics i Azure Machine Learning w analizie str...WyspaIT 2016 - Azure Stream Analytics i Azure Machine Learning w analizie str...
WyspaIT 2016 - Azure Stream Analytics i Azure Machine Learning w analizie str...
 
Time Series Databases for IoT (On-premises and Azure)
Time Series Databases for IoT (On-premises and Azure)Time Series Databases for IoT (On-premises and Azure)
Time Series Databases for IoT (On-premises and Azure)
 
A Workshop on R
A Workshop on RA Workshop on R
A Workshop on R
 
Google app engine - Soft Uni 19.06.2014
Google app engine - Soft Uni 19.06.2014Google app engine - Soft Uni 19.06.2014
Google app engine - Soft Uni 19.06.2014
 
Introduction to PostgreSQL
Introduction to PostgreSQLIntroduction to PostgreSQL
Introduction to PostgreSQL
 
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
 
TiDB Introduction - San Francisco MySQL Meetup
TiDB Introduction - San Francisco MySQL MeetupTiDB Introduction - San Francisco MySQL Meetup
TiDB Introduction - San Francisco MySQL Meetup
 
In memory databases presentation
In memory databases presentationIn memory databases presentation
In memory databases presentation
 
APEX 5 IR: Guts & Performance
APEX 5 IR:  Guts & PerformanceAPEX 5 IR:  Guts & Performance
APEX 5 IR: Guts & Performance
 
APEX 5 IR Guts and Performance
APEX 5 IR Guts and PerformanceAPEX 5 IR Guts and Performance
APEX 5 IR Guts and Performance
 
Advanced SQL For Data Scientists
Advanced SQL For Data ScientistsAdvanced SQL For Data Scientists
Advanced SQL For Data Scientists
 
TiDB Introduction
TiDB IntroductionTiDB Introduction
TiDB Introduction
 
Apache Tajo: Query Optimization Techniques and JIT-based Vectorized Engine
Apache Tajo: Query Optimization Techniques and JIT-based Vectorized EngineApache Tajo: Query Optimization Techniques and JIT-based Vectorized Engine
Apache Tajo: Query Optimization Techniques and JIT-based Vectorized Engine
 
Skillwise - Enhancing dotnet app
Skillwise - Enhancing dotnet appSkillwise - Enhancing dotnet app
Skillwise - Enhancing dotnet app
 
APEX 5 Interactive Reports: Guts and PErformance
APEX 5 Interactive Reports: Guts and PErformanceAPEX 5 Interactive Reports: Guts and PErformance
APEX 5 Interactive Reports: Guts and PErformance
 
Scalable Event Processing with WSO2CEP @ WSO2Con2015eu
Scalable Event Processing with WSO2CEP @  WSO2Con2015euScalable Event Processing with WSO2CEP @  WSO2Con2015eu
Scalable Event Processing with WSO2CEP @ WSO2Con2015eu
 
Stream Processing in Uber
Stream Processing in UberStream Processing in Uber
Stream Processing in Uber
 
New Features in Neo4j 3.4 / 3.3 - Graph Algorithms, Spatial, Date-Time & Visu...
New Features in Neo4j 3.4 / 3.3 - Graph Algorithms, Spatial, Date-Time & Visu...New Features in Neo4j 3.4 / 3.3 - Graph Algorithms, Spatial, Date-Time & Visu...
New Features in Neo4j 3.4 / 3.3 - Graph Algorithms, Spatial, Date-Time & Visu...
 
Practical End-to-End Learning to Rank Using Fusion - Andy Liu, Lucidworks
Practical End-to-End Learning to Rank Using Fusion - Andy Liu, Lucidworks Practical End-to-End Learning to Rank Using Fusion - Andy Liu, Lucidworks
Practical End-to-End Learning to Rank Using Fusion - Andy Liu, Lucidworks
 

Plus de kbajda

Presto talk @ Global AI conference 2018 Boston
Presto talk @ Global AI conference 2018 BostonPresto talk @ Global AI conference 2018 Boston
Presto talk @ Global AI conference 2018 Bostonkbajda
 
Presto Summit 2018 - 07 - Lyft
Presto Summit 2018 - 07 - LyftPresto Summit 2018 - 07 - Lyft
Presto Summit 2018 - 07 - Lyftkbajda
 
Presto Summit 2018 - 06 - Facebook Geospatial
Presto Summit 2018 - 06 - Facebook GeospatialPresto Summit 2018 - 06 - Facebook Geospatial
Presto Summit 2018 - 06 - Facebook Geospatialkbajda
 
Presto Summit 2018 - 05 - Uber Elasticsearch
Presto Summit 2018 - 05 - Uber ElasticsearchPresto Summit 2018 - 05 - Uber Elasticsearch
Presto Summit 2018 - 05 - Uber Elasticsearchkbajda
 
Presto: Distributed SQL on Anything - Strata Hadoop 2017 San Jose, CA
Presto: Distributed SQL on Anything -  Strata Hadoop 2017 San Jose, CAPresto: Distributed SQL on Anything -  Strata Hadoop 2017 San Jose, CA
Presto: Distributed SQL on Anything - Strata Hadoop 2017 San Jose, CAkbajda
 
Presto Strata Hadoop SJ 2016 short talk
Presto Strata Hadoop SJ 2016 short talkPresto Strata Hadoop SJ 2016 short talk
Presto Strata Hadoop SJ 2016 short talkkbajda
 

Plus de kbajda (6)

Presto talk @ Global AI conference 2018 Boston
Presto talk @ Global AI conference 2018 BostonPresto talk @ Global AI conference 2018 Boston
Presto talk @ Global AI conference 2018 Boston
 
Presto Summit 2018 - 07 - Lyft
Presto Summit 2018 - 07 - LyftPresto Summit 2018 - 07 - Lyft
Presto Summit 2018 - 07 - Lyft
 
Presto Summit 2018 - 06 - Facebook Geospatial
Presto Summit 2018 - 06 - Facebook GeospatialPresto Summit 2018 - 06 - Facebook Geospatial
Presto Summit 2018 - 06 - Facebook Geospatial
 
Presto Summit 2018 - 05 - Uber Elasticsearch
Presto Summit 2018 - 05 - Uber ElasticsearchPresto Summit 2018 - 05 - Uber Elasticsearch
Presto Summit 2018 - 05 - Uber Elasticsearch
 
Presto: Distributed SQL on Anything - Strata Hadoop 2017 San Jose, CA
Presto: Distributed SQL on Anything -  Strata Hadoop 2017 San Jose, CAPresto: Distributed SQL on Anything -  Strata Hadoop 2017 San Jose, CA
Presto: Distributed SQL on Anything - Strata Hadoop 2017 San Jose, CA
 
Presto Strata Hadoop SJ 2016 short talk
Presto Strata Hadoop SJ 2016 short talkPresto Strata Hadoop SJ 2016 short talk
Presto Strata Hadoop SJ 2016 short talk
 

Dernier

LLMs, LMMs, their Improvement Suggestions and the Path towards AGI
LLMs, LMMs, their Improvement Suggestions and the Path towards AGILLMs, LMMs, their Improvement Suggestions and the Path towards AGI
LLMs, LMMs, their Improvement Suggestions and the Path towards AGIThomas Poetter
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxMike Bennett
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样vhwb25kk
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改yuu sss
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理e4aez8ss
 
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesConf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesTimothy Spann
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Seán Kennedy
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一F sss
 
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一F La
 
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Boston Institute of Analytics
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Cathrine Wilhelmsen
 
Identifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanIdentifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanMYRABACSAFRA2
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Thomas Poetter
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...Boston Institute of Analytics
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 217djon017
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queensdataanalyticsqueen03
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxaleedritatuxx
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max PrincetonTimothy Spann
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPTBoston Institute of Analytics
 
毕业文凭制作#回国入职#diploma#degree美国加州州立大学北岭分校毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#de...
毕业文凭制作#回国入职#diploma#degree美国加州州立大学北岭分校毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#de...毕业文凭制作#回国入职#diploma#degree美国加州州立大学北岭分校毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#de...
毕业文凭制作#回国入职#diploma#degree美国加州州立大学北岭分校毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#de...ttt fff
 

Dernier (20)

LLMs, LMMs, their Improvement Suggestions and the Path towards AGI
LLMs, LMMs, their Improvement Suggestions and the Path towards AGILLMs, LMMs, their Improvement Suggestions and the Path towards AGI
LLMs, LMMs, their Improvement Suggestions and the Path towards AGI
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptx
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
 
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesConf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
 
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一
 
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)
 
Identifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanIdentifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population Mean
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queens
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max Princeton
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
 
毕业文凭制作#回国入职#diploma#degree美国加州州立大学北岭分校毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#de...
毕业文凭制作#回国入职#diploma#degree美国加州州立大学北岭分校毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#de...毕业文凭制作#回国入职#diploma#degree美国加州州立大学北岭分校毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#de...
毕业文凭制作#回国入职#diploma#degree美国加州州立大学北岭分校毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#de...
 

State of Presto Project Overview

  • 1. State of the Project Martin Traverso July 16, 2018
  • 3. Presto @ Facebook • Multiple uses • Warehouse ETL and ad-hoc analytics • Dashboards • Analytics backend for A/B testing system • Analytics backed for user facing products • 1000s of nodes across several clusters • 100s of PBs and quadrillions of rows processed/day • > 70% of new warehouse ETL workloads on Presto
  • 4.
  • 5. Since last year's meetup... • 30 releases (0.176 to 0.205) • ~3,500 commits (13,800 total) • ~140 contributors (330 total)
  • 6. Workload management • Fairness in local scheduler • Resource group improvements • Query runtime limits Since last year's meetup...
  • 7. Security • Authentication • Client certificates • Password • JSON Web Tokens (RFC 7519) • Worker-to-worker encryption Since last year's meetup...
  • 8. Security • Column-level access control public interface ConnectorAccessControl { ... default void checkCanSelectFromColumns( ConnectorTransactionHandle transaction, Identity identity, SchemaTableName table, Set<String> columns) } Since last year's meetup...
  • 9. Execution • Partitioned (a.k.a "bucket by bucket") query execution • Distributed sort • Dynamic writer scaling • Support for JOIN spilling • Improved memory accounting • Java 9 support • ... many, many performance and reliability improvements Since last year's meetup...
  • 10. Language • Geospatial functions and optimizations (SQL/MM Part 3) • Language constructs • GROUPING • CURRENT_USER • New data types • IPADDRESS • SET_DIGEST Since last year's meetup...
  • 11. Language • Decimal literals 123.45678 DECIMAL(8,5) .12345 DECIMAL(5,5) 1e5 DOUBLE 0.3e-4 DOUBLE 123.456e2 DOUBLE Since last year's meetup...
  • 12. Language • Ordered aggregates SELECT id, array_agg(value ORDER BY index) FROM t, UNNEST(t.elements) WITH ORDINALITY AS u(index, value) GROUP BY id Since last year's meetup...
  • 13. Language • LATERAL join SELECT orderkey, total FROM orders o, LATERAL ( SELECT sum(totalprice) total FROM orders WHERE custkey = o.custkey AND orderdate <= o.orderdate AND orderpriority > o.orderpriority ) t Since last year's meetup...
  • 14. Connectors • TPC-DS • Redshift • Thrift • Cassandra • INSERT support Since last year's meetup...
  • 16. Scalability • Larger clusters (> ~1000 nodes) • HTTP/2 • Multiple query coordinators What's next
  • 17. Language • TIMESTAMP semantics • (c1, c2, ..., cN) IN <subquery> • Improved correlated subqueries • Equality for complex types containing NULLs • Function namespaces (SQL path) What's next
  • 18. Connectors • Elasticsearch • Apache Phoenix • Kudu What's next
  • 19. and more.. • Cost-based join type selection and reordering • Dynamic filtering • Warnings framework • Role management What's next
  • 21. Execution • Failure recovery • Automatic skew handling • Workload re-balancing • Temporary tables • Multi-phase query executions Future Projects
  • 22. Language • User-defined functions • SQL 2016 features • Polymorphic table functions • JSON syntax • Row pattern matching Future Projects
  • 23. Others • Client protocol overhaul • Plan IR overhaul • Extensible optimizer rules • User-defined functions Future Projects