3. Apache Tez Introduction - Apache Kylin Meetup @Shanghai
Luke Han
1.
© Hortonworks Inc. 2015
Apache Tez - Next-Generation Execution Engine on Hadoop
Jeff Zhang (@zjffdu)
2.
Who's this guy
• Started using Pig in 2009; became a Pig committer in Nov 2009
• Joined Hortonworks in 2014
• Tez committer since Oct 2014
3.
Agenda
• Tez Introduction
• Tez Feature Deep Dive
• Tez Status & Roadmap
4.
Example of MR versus Tez
[Figure: the same Hive query executed two ways.
Hive - MR: four jobs, Job 1 (Join a & b), Job 2 (Group by of a Join b), Job 3 (Group by of c), Job 4 (Join of S & R), separated by I/O synchronization barriers.
Hive - Tez: a single job containing Join a & b, Group by of a Join b, Group by of c, and Join of S & R.]
5.
Tez – Introduction
• Distributed execution framework targeted towards data-processing applications.
• Based on expressing a computation as a dataflow graph (DAG).
• Highly customizable to meet a broad spectrum of use cases.
• Built on top of YARN, the resource management framework for Hadoop.
• Open source Apache project and Apache licensed.
6.
What is DAG & Why DAG
• Directed Acyclic Graph
• Any complicated DAG can be composed of the following 3 basic paradigms
– Sequential (Projection, Filter, GroupBy, …)
– Merge (Join, Union, Intersect, …)
– Divide (Split, …)
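As a rough illustration of the three paradigms, the sketch below builds a small DAG as a plain adjacency map and orders it topologically. It is a toy model in Python, not the Tez API; the vertex names and the `topological_order` helper are hypothetical.

```python
from collections import deque

# Toy DAG: vertex name -> list of downstream vertices (hypothetical names).
dag = {
    "load_a": ["filter_a"],              # sequential: one input, one output
    "filter_a": ["join"],
    "load_b": ["join"],                  # merge: two upstream vertices feed the join
    "join": ["split"],
    "split": ["group_by", "store_raw"],  # divide: one vertex fans out to two
    "group_by": [],
    "store_raw": [],
}


def topological_order(graph):
    """Return vertices so that every edge points forward (Kahn's algorithm)."""
    in_degree = {v: 0 for v in graph}
    for targets in graph.values():
        for t in targets:
            in_degree[t] += 1
    ready = deque(v for v, d in in_degree.items() if d == 0)
    order = []
    while ready:
        v = ready.popleft()
        order.append(v)
        for t in graph[v]:
            in_degree[t] -= 1
            if in_degree[t] == 0:
                ready.append(t)
    if len(order) != len(graph):
        raise ValueError("graph has a cycle, so it is not a DAG")
    return order


order = topological_order(dag)
```

A vertex becomes runnable only once all of its upstream vertices have been emitted, which is why the acyclic property matters: a cycle would leave some vertex waiting forever.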
7.
Expressing DAG in Tez API
• DAG API (Logic View)
– Allows the user to build a DAG
– Topological structure of the data computation flow
• Runtime API (Runtime View)
– Application logic of each computation unit (vertex)
– How to move/read/write data between vertices
8.
DAG API (Logic View)
• Vertex (Processor, Parallelism, Resource, etc.)
• Edge (EdgeProperty)
– DataMovement
  – Scatter Gather (Join, GroupBy, …)
  – Broadcast (Pig Replicated Join / Hive Broadcast Join)
  – One-to-One (Pig Order by)
  – Custom
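The three built-in data movement types differ in which destination tasks see a source task's output. The sketch below models that routing in plain Python; it is not Tez code, the `move` function is hypothetical, and the modulo partitioner on integer keys is an assumption made to keep the example deterministic.

```python
def move(movement, src_index, records, num_dst_tasks):
    """Return {destination task index: records it receives} for one source task."""
    if movement == "scatter_gather":
        # Partition by key: each record goes to exactly one destination task.
        out = {d: [] for d in range(num_dst_tasks)}
        for key, value in records:
            out[key % num_dst_tasks].append((key, value))
        return out
    if movement == "broadcast":
        # Every destination task receives the full output.
        return {d: list(records) for d in range(num_dst_tasks)}
    if movement == "one_to_one":
        # Source task i feeds only destination task i.
        return {src_index: list(records)}
    raise ValueError(f"unknown movement: {movement}")


records = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]
scattered = move("scatter_gather", 0, records, 2)
broadcasted = move("broadcast", 0, records, 2)
paired = move("one_to_one", 1, records, 2)
```

Scatter Gather suits joins and group-bys because all records with the same key meet in one task; Broadcast suits a small side input that every task needs in full.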
9.
Runtime API (Runtime View)
[Figure: Input -> Processor -> Output]
• Input
– Through which the processor receives data on an edge
– A vertex can have multiple inputs
• Processor
– Application logic (one processor per vertex)
– Consumes the inputs and produces the outputs
• Output
– Through which the processor writes data to an edge
– A vertex can have multiple outputs
• Examples of Input/Output/Processor
– MRInput & MROutput (InputFormat/OutputFormat)
– OrderedGroupedKVInput & OrderedPartitionedKVOutput (Scatter Gather)
– UnorderedKVInput & UnorderedKVOutput (Broadcast & One-to-One)
– PigProcessor/HiveProcessor
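To make the Input/Processor/Output split concrete, here is a minimal Python model of one vertex's task. The classes `ListInput`, `ListOutput`, and `WordCountProcessor` are hypothetical stand-ins invented for this sketch; they only mirror the roles described above and are not the Tez runtime classes.

```python
class ListInput:
    """Toy input: hands the processor the records that arrived on one edge."""

    def __init__(self, records):
        self._records = list(records)

    def read(self):
        return iter(self._records)


class ListOutput:
    """Toy output: collects the records the processor writes to one edge."""

    def __init__(self):
        self.records = []

    def write(self, key, value):
        self.records.append((key, value))


class WordCountProcessor:
    """Toy processor: consumes every input and writes (word, count) pairs."""

    def run(self, inputs, outputs):
        counts = {}
        for inp in inputs.values():
            for line in inp.read():
                for word in line.split():
                    counts[word] = counts.get(word, 0) + 1
        for word in sorted(counts):
            for out in outputs.values():
                out.write(word, counts[word])


# One vertex with two named inputs and one named output.
inputs = {"lines_a": ListInput(["tez dag", "tez"]), "lines_b": ListInput(["dag yarn"])}
outputs = {"counts": ListOutput()}
WordCountProcessor().run(inputs, outputs)
```

The processor never learns how its data travels: swapping the input or output implementation changes the data movement without touching the application logic.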
10.
Benefit of DAG
• Easier to express computation as a DAG
• No intermediate data written to HDFS
• Less pressure on the NameNode
• No resource queuing effort & less resource contention
• More optimization opportunities with a more global context
11.
Agenda
• Tez Introduction
• Tez Feature Deep Dive
• Tez Improvement & Debuggability
• Tez Status & Roadmap
12.
Container-Reuse
• Reuse the same container across DAGs/Vertices/Tasks
• Benefits of Container-Reuse
– Fewer resources consumed
– Reduced overhead of launching JVMs
– Reduced overhead of negotiating with the ResourceManager
– Reduced overhead of resource localization
– Reduced network IO
– Object caching (object sharing)
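The effect of container reuse and object caching can be sketched with a toy pool. Everything here (`ContainerPool`, `run_tasks`, the `"shared"` cache key) is hypothetical and only models the idea; it does not reflect how Tez or YARN implement containers.

```python
class ContainerPool:
    """Toy pool: hands out an idle container if one exists, else launches a new one."""

    def __init__(self):
        self.idle = []
        self.launched = 0

    def acquire(self):
        if self.idle:
            return self.idle.pop()
        self.launched += 1  # stands in for the JVM launch and resource negotiation cost
        return {"id": self.launched, "cache": {}}

    def release(self, container):
        self.idle.append(container)


def run_tasks(pool, task_names, load_shared):
    """Run tasks one after another, returning each container to the pool when done."""
    results = []
    for name in task_names:
        container = pool.acquire()
        cache = container["cache"]
        if "shared" not in cache:
            # Object cache: only the first task in a container pays the load cost.
            cache["shared"] = load_shared()
        results.append((name, container["id"], cache["shared"]))
        pool.release(container)
    return results


loads = []


def load_shared():
    loads.append(1)
    return "lookup-table"


pool = ContainerPool()
results = run_tasks(pool, ["map_1", "map_2", "reduce_1"], load_shared)
```

In this run the three tasks share one container and the shared object is loaded once; without reuse each task would pay for its own launch and its own load.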
13.
Tez Session
• Multiple Jobs/DAGs in one AM
• Container-reuse across Jobs/DAGs
• Data sharing between Jobs/DAGs
14.
Dynamic Parallelism Estimation
• VertexManager
– Listens to the status of other vertices
– Coordinates and schedules its tasks
– Handles communication between vertices
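The slide does not say how the estimate is computed. One plausible approach, sketched below under that assumption, is to project the total upstream output from the tasks that have finished so far and divide it by a target input size per downstream task. The function name and all parameters are hypothetical.

```python
import math


def estimate_parallelism(completed_output_bytes, total_source_tasks,
                         desired_bytes_per_task, max_parallelism):
    """Estimate the downstream task count from the output sizes seen so far."""
    if not completed_output_bytes:
        # Nothing observed yet: keep the configured parallelism.
        return max_parallelism
    average = sum(completed_output_bytes) / len(completed_output_bytes)
    projected_total = average * total_source_tasks
    needed = math.ceil(projected_total / desired_bytes_per_task)
    # Never go below one task or above the configured maximum.
    return max(1, min(max_parallelism, needed))


# Two of ten upstream tasks have finished, producing 100 and 300 bytes.
chosen = estimate_parallelism([100, 300], total_source_tasks=10,
                              desired_bytes_per_task=500, max_parallelism=20)
```

Here the projected total is 2000 bytes, so four downstream tasks are enough instead of the configured twenty.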
15.
ATS Integration
• Tez is fully integrated with YARN ATS (Application Timeline Service)
– DAG Status, DAG Metrics, Task Status, Task Metrics are captured
• Diagnostics & Performance analysis
– Data source for monitoring & diagnostics
– Data source for performance analysis
16.
Recovery
• AM can crash in corner cases
– OOM
– Node failure
– …
• Continue from the last checkpoint
• Transparent to end users
[Figure: AM Crash]
17.
Order By of Pig
f = Load 'foo' as (x, y);
o = Order f by x;
[Figure: execution plans for the script above. Stages: Load, Sample (Calculate Histogram), Partition, Sort. Labels: HDFS, Broadcast, One-to-One, Scatter Gather.]
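The Sample, Partition, and Sort stages follow the general pattern of sample-based range partitioning. The Python sketch below is a simplified model of that pattern, not Pig's actual implementation; the helper names, the sample size, and the fixed random seed are all assumptions.

```python
import bisect
import random


def pick_boundaries(sample, num_partitions):
    """Choose num_partitions - 1 split points from a sample (the histogram step)."""
    ordered = sorted(sample)
    if not ordered:
        return []
    return [ordered[len(ordered) * i // num_partitions] for i in range(1, num_partitions)]


def partition_index(key, boundaries):
    """Range partitioner: keys below the first boundary go to partition 0, and so on."""
    return bisect.bisect_right(boundaries, key)


def order_by(records, num_partitions, sample_size, seed=0):
    """Globally sort (x, y) records by x using sample, partition, and local sort."""
    keys = [x for x, _ in records]
    sample = random.Random(seed).sample(keys, min(sample_size, len(keys)))
    # The boundaries are small, so every partitioning task can receive a full copy.
    boundaries = pick_boundaries(sample, num_partitions)
    partitions = [[] for _ in range(num_partitions)]
    for record in records:
        partitions[partition_index(record[0], boundaries)].append(record)
    # Each partition sorts locally; concatenating them in order yields a total order.
    return [r for part in partitions for r in sorted(part, key=lambda rec: rec[0])]


records = [(5, "e"), (1, "a"), (4, "d"), (2, "b"), (3, "c")]
result = order_by(records, num_partitions=3, sample_size=4)
```

The sample only affects how evenly the partitions are balanced. The output is correctly ordered for any choice of boundaries, because every key in one partition is no larger than any key in the next.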
18.
Tez UI
19.
Tez UI
20.
Tez UI
Download data from ATS
21.
RoadMap
• Shared output edges
– Same output to multiple vertices
• Local mode stabilization
• Optimizing (include/exclude) vertex at runtime
• Partial completion VertexManager
• Co-Scheduling
• Framework stats for better runtime decisions
22.
Tez – Adoption
• Apache Hive
– Starting from Hive 0.13
– set hive.execution.engine=tez;
• Apache Pig
– Starting from Pig 0.14
– pig -x tez
• Cascading
• Flink
23.
Tez Community
• Useful Links
– http://tez.apache.org/
– JIRA: https://issues.apache.org/jira/browse/TEZ
– Code Repository: https://git-wip-us.apache.org/repos/asf/tez.git
– Mailing Lists
  – Dev List: dev@tez.apache.org
  – User List: user@tez.apache.org
  – Issues List: issues@tez.apache.org
• Tez Meetup
– http://www.meetup.com/Apache-Tez-User-Group
24.
Thank You!
Questions & Answers