HADOOP EAGLE
Full-stack real-time monitoring framework for eBay Hadoop
Edward Zhang yonzhang@ebay.com , @yonzhang2012
Hao Chen hchen9@ebay.com, @ihaoch
Use case: Detect anomalous nodes by analyzing the task failure ratio across all nodes
Assumption: the task failure ratio of every node should be approximately equal
Algorithm: node-by-node comparison (symmetry violation) and per-node trend
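The node-by-node comparison can be illustrated with a minimal sketch. It assumes the median failure ratio across nodes as the baseline and flags nodes that exceed a multiple of it; the class name, the multiplier, and the absolute floor are hypothetical and not taken from Eagle, and the per-node trend check is not covered.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class NodeAnomalySketch {
    /** Failure ratio = failed tasks / total tasks; 0 when the node ran no tasks. */
    public static double failureRatio(long failed, long total) {
        return total == 0 ? 0.0 : (double) failed / total;
    }

    /** Median of a non-empty list of values. */
    public static double median(List<Double> values) {
        List<Double> sorted = new ArrayList<>(values);
        Collections.sort(sorted);
        int n = sorted.size();
        return n % 2 == 1
                ? sorted.get(n / 2)
                : (sorted.get(n / 2 - 1) + sorted.get(n / 2)) / 2.0;
    }

    /**
     * Returns the hosts (sorted by name) whose ratio is above both
     * factor * median and the absolute floor minRatio (both are assumptions).
     */
    public static List<String> anomalousNodes(Map<String, Double> ratioByHost,
                                              double factor, double minRatio) {
        List<String> result = new ArrayList<>();
        if (ratioByHost.isEmpty()) {
            return result;
        }
        double baseline = median(new ArrayList<>(ratioByHost.values()));
        for (Map.Entry<String, Double> e : new TreeMap<>(ratioByHost).entrySet()) {
            double ratio = e.getValue();
            if (ratio > minRatio && ratio > factor * baseline) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Double> ratios = new TreeMap<>();
        ratios.put("node1", failureRatio(2, 100));
        ratios.put("node2", failureRatio(3, 100));
        ratios.put("node3", failureRatio(40, 100));
        ratios.put("node4", failureRatio(1, 100));
        System.out.println(anomalousNodes(ratios, 3.0, 0.05)); // prints [node3]
    }
}
```

The median is used here because a single bad node barely moves it, whereas a mean would be pulled toward the outlier.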
HADOOP EAGLE – EBAY INC 2
Background – initial use cases
Host: Task failure based anomaly host detection
Anomaly Detection & Alerting Analysis Auto-Remediation
Scale Challenges @ eBay Hadoop Monitoring
• 10+ large Hadoop clusters
• 10,000+ data nodes
• 50,000+ jobs per day
• 50,000,000+ tasks per day
• 500+ types of Hadoop/HBase native metrics
• Billions of audit events and metrics per day
Use case challenges @ eBay Hadoop Monitoring
• Host
  • Task failure ratio based machine anomaly detection
• Job monitoring across its lifetime
  • Real-time running job performance analysis
  • Near real-time job history analytics
  • Data skew detection
• Hadoop native metrics
  • HDFS
  • HBase
  • M/R
• Logs
  • GC log
  • Hadoop daemon log
  • Audit log
• HDFS image file
• YARN framework
  • Queue
Engineering Challenges @ eBay Hadoop Monitoring
• Varieties of data sources
M/R job history, running jobs, GC log, NameNode log, Hadoop native metrics, YARN queue, audit log, HDFS image file, etc.
• Varieties of data collectors
pull from HDFS, pull from YARN API, ship logs, …
• Complex business logic
join with outside data, pre-aggregations, memory windows, …
• Alert rules can't be hot deployed
• Scalability issues with a single process
Job History Performance Analyzer
• Monitor job history files in near real-time
• Crawl each job history file immediately after the job completes
• Apply expert rules for job performance suggestions
• Job history trend for jobs of the same type
[Diagram: job history event stream: Job Start Event → Task Start Event → Task End Event → Task roll-up → Task2 Start Event → Task2 End Event → Task roll-up → Job End Event → Job Suggestion Rules]
Job real-time monitoring
• Monitor running jobs in real time
• Minute-level job progress snapshots
• Minute-level resource usage snapshots
• CPU, HDFS I/O, disk I/O, slot seconds
• Roll up to user/queue/cluster level
• Sliding-window based alerts
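A sliding-window alert over minute-level snapshots can be sketched as follows. This is a minimal illustration, assuming samples arrive in timestamp order and that the alert condition is "window average above a threshold"; the class and its parameters are hypothetical, not Eagle's API.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class SlidingWindowAlertSketch {
    /** One timestamped snapshot value, for example CPU usage for a minute. */
    private static final class Sample {
        final long timestamp;
        final double value;

        Sample(long timestamp, double value) {
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    private final long windowMillis;
    private final double threshold;
    private final Deque<Sample> window = new ArrayDeque<>();
    private double sum = 0.0;

    public SlidingWindowAlertSketch(long windowMillis, double threshold) {
        if (windowMillis <= 0) {
            throw new IllegalArgumentException("windowMillis must be positive");
        }
        this.windowMillis = windowMillis;
        this.threshold = threshold;
    }

    /**
     * Adds a sample, evicts samples that fell out of the window, and returns
     * true when the average of the remaining samples exceeds the threshold.
     * Assumes timestamps are non-decreasing.
     */
    public boolean offer(long timestamp, double value) {
        window.addLast(new Sample(timestamp, value));
        sum += value;
        // The newest sample is always inside the window, so it is never evicted.
        while (window.peekFirst().timestamp <= timestamp - windowMillis) {
            sum -= window.removeFirst().value;
        }
        return sum / window.size() > threshold;
    }

    public static void main(String[] args) {
        SlidingWindowAlertSketch alert = new SlidingWindowAlertSketch(180_000L, 0.8);
        double[] usage = {0.5, 0.9, 0.95, 0.95};
        for (int minute = 0; minute < usage.length; minute++) {
            boolean fired = alert.offer(minute * 60_000L, usage[minute]);
            System.out.println("minute " + minute + " alert=" + fired);
        }
    }
}
```

With a three-minute window, the alert fires only on the fourth sample, once the low first reading has slid out of the window.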
Service: GC Log / Server Log
• GC event detection and prediction
• Log metrics statistics
• Real-time log anomaly detection
Why Eagle Monitoring Framework
• Data collector -> data processing -> metric pre-aggregation/alert engine -> storage -> dashboards
• We need to create a framework that covers the full stack of a monitoring system
Programming Paradigm and Abstraction
As a framework, Eagle does not assume:
• Data source (where, what)
• Business logic execution path (how)
• Policy engine implementation (how)
• Data sink (where, what)
Eagle as a Framework
As a framework, Eagle provides the following:
• SQL-like service API
• High-performance query framework
• Lightweight streaming process Java API
• Extensible policy engine implementation
• Scalable and distributed rule evaluation
• Native HBase data storage support
• Metadata driven stream processing
• Data source extensibility
• Data sink extensibility
• Interactive dashboard
Eagle Overall Architecture
Eagle Monitoring Framework Internals
• Lightweight Streaming Process Framework
• Extensible & Scalable Policy Framework for Alert
• Eagle Query Framework
• Interactive Dashboards
Facts
• Computation is based on single events, which constitute an endless continuous stream
• Computation can be aggregation, time-window, length-window, join with outside data, etc.
• The filter design pattern is used to modularize code from the beginning
Lightweight Streaming Process Framework
Abstraction
• Inspired by the Cascading framework, we abstract a lightweight streaming programming API that is independent of the execution environment
• A streaming process is a directed acyclic graph
• This layer of indirection is for code modularization, code reuse, and avoiding coupling with a specific execution environment
• Runs in a single process, on Storm, or on other streaming technology such as Spark
Step 1: Task DAG graph setup
Eagle Stream Data Processing API
@Override
protected void buildDependency(FlowDef def, DataProcessConfig config) {
    // Source task that feeds words into the flow
    Task header = Task.newTask("wordgenerator").setExecutor(source).completeBuild();
    // Upper-cases each word received from the source task
    Task uppertask = Task.newTask("uppercase").setExecutor(new UppercaseExecutor())
            .connectFrom(header).completeBuild();
    // Groups the upper-cased words and counts them
    Task groupbyUppercaseTask = Task.newTask("groupby_uppercase").setExecutor(new GroupbyCountExecutor())
            .connectFrom(uppertask).completeBuild();
    // Marks the last task of the DAG
    def.endBy(groupbyUppercaseTask);
}
Step 2: Inter-task data exchange protocol
Execution Graph development, compile and deploy
Development / Compile Phase Deployment / Runtime Phase
Extensible & Scalable Policy framework
Scalability
• Dynamic policy partitioning across compute nodes based on configurable partition class
• Dynamic policy deployment
• Event partitioning by Storm and policy partitioning by Eagle (N events * M policies)
Extensibility
• Supports new policy evaluation engines, for example Siddhi, Esper, machine learning, etc.
Features
• Policy CRUD
• Stream metadata (event attribute name, attribute type, attribute value resolver, …)
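The policy partitioning idea listed under Scalability can be sketched as follows: a pluggable partitioner maps each policy to exactly one evaluator, so each evaluator loads only its share of the M policies. The interface name, its signature, and the hash-based default are assumptions made for illustration and are not confirmed by this deck.

```java
import java.util.ArrayList;
import java.util.List;

public class PolicyPartitionSketch {
    /** Pluggable strategy that decides which evaluator owns a policy (hypothetical). */
    public interface PolicyPartitioner {
        int partition(int numPartitions, String policyId);
    }

    /** Default strategy: non-negative hash of the policy id modulo the partition count. */
    public static final class HashPartitioner implements PolicyPartitioner {
        @Override
        public int partition(int numPartitions, String policyId) {
            return Math.floorMod(policyId.hashCode(), numPartitions);
        }
    }

    /** Returns the policies that the evaluator with the given index should load. */
    public static List<String> policiesFor(int evaluatorIndex, int numPartitions,
                                           List<String> allPolicyIds,
                                           PolicyPartitioner partitioner) {
        List<String> owned = new ArrayList<>();
        for (String id : allPolicyIds) {
            if (partitioner.partition(numPartitions, id) == evaluatorIndex) {
                owned.add(id);
            }
        }
        return owned;
    }

    public static void main(String[] args) {
        List<String> policies = List.of("highFailureRatio", "longGcPause",
                "queueSaturated", "skewedJob", "auditDenied");
        PolicyPartitioner partitioner = new HashPartitioner();
        int evaluators = 3;
        for (int i = 0; i < evaluators; i++) {
            System.out.println("evaluator " + i + " -> "
                    + policiesFor(i, evaluators, policies, partitioner));
        }
    }
}
```

`Math.floorMod` is used instead of `%` so that negative hash codes still map to a valid partition index.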
Dynamic Policy Partitioning
Scalability of Policy Evaluation
Extensibility of policy framework
public interface PolicyEvaluatorServiceProvider {
    public String getPolicyType();
    public Class<? extends PolicyEvaluator> getPolicyEvaluator();
    public Class<? extends PolicyDefinitionParser> getPolicyDefinitionParser();
    public Class<? extends PolicyEvaluatorBuilder> getPolicyEvaluatorBuilder();
    public List<Module> getBindingModules();
}
Policy evaluator providers use SPI (service provider interface) to register policy engine implementations
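A minimal sketch of SPI-style registration, assuming "SPI" here means Java's standard `java.util.ServiceLoader` mechanism. The `PolicyProvider` interface is a simplified, hypothetical stand-in for the provider interface, reduced to the policy type so the example stays self-contained.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.ServiceLoader;

public class PolicyProviderLookupSketch {
    /** Simplified stand-in for a policy evaluator provider (hypothetical). */
    public interface PolicyProvider {
        String getPolicyType();
    }

    /**
     * Indexes providers by policy type. If two providers report the same type,
     * the one seen later replaces the earlier one.
     */
    public static Map<String, PolicyProvider> index(Iterable<? extends PolicyProvider> providers) {
        Map<String, PolicyProvider> byType = new HashMap<>();
        for (PolicyProvider provider : providers) {
            byType.put(provider.getPolicyType(), provider);
        }
        return byType;
    }

    public static void main(String[] args) {
        // ServiceLoader discovers implementations that are listed in a
        // META-INF/services/<binary name of the interface> file on the classpath.
        // With no such file present, the loader is simply empty.
        Map<String, PolicyProvider> byType = index(ServiceLoader.load(PolicyProvider.class));
        System.out.println("registered policy types: " + byType.keySet());
    }
}
```

A new engine would then be added by shipping a jar that contains an implementation plus the matching `META-INF/services` entry, without touching the framework code.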
Eagle Query Framework
Persistence
• Metric
• Event
• Metadata
• Alert
• Log
• Customized structure
• …
Query
• Search
• Filter
• Aggregation
• Sort
• Expression
• ….
Features
• Simple API
• Powerful query
• High performance
• Scalability
• Pluggable
• …
A lightweight, metadata-driven store layer that serves the storage and query requirements shared by most monitoring systems
Eagle Query Framework
• Metadata definition ORM
• High performance RESTful API supporting CRUD
• SQL-like declarative query syntax
• Generic service client library
• Native support for HBase and RDBMS
• Interactive and customizable dashboard
• Annotations are the metadata of an entity
• Metadata-driven query compiling and response rendering
• Metadata-driven serialization/deserialization
  • Rename columns to shorter strings (HBase)
• Entity metadata primitives
  • Table
  • ColumnFamily
  • Prefix (the very first partition key)
  • Service (entity identifier)
  • Partition
  • Tags
  • Indexes
  • Column
Metadata definition ORM
@Table("alertdef")
@ColumnFamily("f")
@Prefix("alertdef")
@Service(AlertConstants.ALERT_DEFINITION_SERVICE_ENDPOINT_NAME)
@TimeSeries(false)
@Partition({"cluster", "datacenter"})
@Tags({"programId", "alertExecutorId", "policyId", "policyType"})
@Indexes({
    @Index(name="Index_1_alertExecutorId", columns = { "alertExecutorID" })
})
public class AlertDefinitionAPIEntity extends TaggedLogAPIEntity {
    // Each field is stored under a short column qualifier to save space in HBase
    @Column("a")
    private String desc;
    @Column("b")
    private String policyDef;
    @Column("c")
    private String dedupeDef;
    @Column("d")
    private String notificationDef;
    @Column("e")
    private String remediationDef;
    @Column("f")
    private boolean enabled;
    // The slide ends here; the remainder of the class is not shown
}
Generic RESTful API & Query
eagle-service/rest/entities?query=<Query>
<Query> ::= <EntityName> "[" <FilterCondition> "]" "<" <GroupbyFields> ">" "{" <AggregatedFunctions> "}" [ "." "{" <SortbyOptions> "}" ]
Generic RESTful API Query Syntax
Search Query:
query=JobExecutionService[@cluster="xyz" AND @datacenter="abc"]{@startTime,@numTotalMaps}&startTime=&endTime=&pageSize=100
Aggregate Query:
Aggregation Query ::= <EntityName> [QueryCondition]<GroupbyFields>{AggregatedFunctions}.{SortbyOptions}
query=JobExecutionService[@cluster="xyz" AND @datacenter="abc"]<@user>{count, min(endTime-startTime)}&startTime=&endTime=&pageSize=100
TimeSeries Histogram Query:
query=GenericMetricService[@cluster="ares" AND @datacenter="lvs"]<@user>{sum(value)}.{sum(value) desc}&timeSeries=true&intervmin=1440&pageSize=10000000&startTime=2014-07-01 00:00:00&endTime=2014-08-01 00:00:00&metricName=eagle.hdfs.spacesize.cluster
Operators:
CONTAINS, IN, !=, =, <, <=, >, >=
Numeric Filters:
query=TaskFailureCountService[@cluster="xyz" AND @datacenter="abc" AND @failureCount>10]{@startTime,@failureCount}&startTime=&endTime=&pageSize=100
Paginations:
query=TaskFailureCountService[@cluster="xyz" AND @datacenter="abc" AND @failureCount>10]{@startTime,@failureCount}&startTime=&endTime=&pageSize=100&startRowkey=BgVz-9R…….
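Since the query expressions contain characters such as `[`, `{`, `@` and quotes, they need to be percent-encoded before being placed in a URL. The following is a minimal sketch of building a search query request; the base URL is a hypothetical placeholder, the helper names are invented for this example, and the `startTime`/`endTime` parameters are omitted for brevity.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class EagleQueryUrlSketch {
    /** Builds Entity[filter]{field1,field2}, following the search query form. */
    public static String searchQuery(String entity, String filter, String... fields) {
        return entity + "[" + filter + "]{" + String.join(",", fields) + "}";
    }

    /** Percent-encodes the query expression and appends the page size. */
    public static String toUrl(String baseUrl, String query, int pageSize) {
        // URLEncoder applies form encoding: reserved characters become %XX, spaces become '+'.
        String encoded = URLEncoder.encode(query, StandardCharsets.UTF_8);
        return baseUrl + "/rest/entities?query=" + encoded + "&pageSize=" + pageSize;
    }

    public static void main(String[] args) {
        String query = searchQuery("JobExecutionService",
                "@cluster=\"xyz\" AND @datacenter=\"abc\"",
                "@startTime", "@numTotalMaps");
        // Hypothetical host and port, used only for illustration.
        System.out.println(toUrl("http://localhost:8080/eagle-service", query, 100));
    }
}
```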
Generic Eagle Service Client Library
• Basic CRUD
• Fluent DSL
• Metric Builder API
• Parallel Client
• Asynchronous Client
// Metric builder: batch several data points and flush them together
client.metric("unit.test.metrics")
    .batch(5)
    .tags(tags)
    .send("unit.test.metrics", System.currentTimeMillis(), tags, 0.1, 0.2, 0.3)
    .send(System.currentTimeMillis(), 0.1)
    .send(System.currentTimeMillis(), 0.1, 0.2)
    .send(System.currentTimeMillis(), tags, 0.1, 0.2, 0.3)
    .send("unit.test.anothermetrics", System.currentTimeMillis(), tags, 0.1, 0.2, 0.3)
    .flush();

// Fluent search: inner quotes of the query string must be escaped
client.search("GenericMetricService[@cluster=\"cluster4ut\" AND @datacenter = \"datacenter4ut\"]<@cluster>{sum(value)}")
    .startTime(0)
    .endTime(System.currentTimeMillis() + 24 * 3600 * 1000)
    .metricName("unit.test.metrics")
    .pageSize(1000)
    .send();
HBase Storage Design
Uniform rowkey design
Rowkey ::= Prefix | Partition Keys | timestamp | tagName | tagValue | …
• Metric
Rowkey ::= Metric Name | Partition Keys | timestamp | tagName | tagValue | …
• Entity
Rowkey ::= Default Prefix | Partition Keys | timestamp | tagName | tagValue | …
• Log
Rowkey ::= Log Type | Partition Keys | timestamp | tagName | tagValue | …
Rowvalue ::= Log Content
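The rowkey layout (prefix, partition keys, timestamp, then tag name/value pairs) can be sketched as a byte array built in that order. The deck does not say how each component is encoded; this sketch assumes fixed-width 4-byte string hashes and an 8-byte timestamp purely for illustration, and the class name is hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

public class RowkeySketch {
    /**
     * Layout: prefix | partition values | timestamp | (tagName, tagValue)*.
     * Strings are reduced to their 4-byte hashCode (an assumption of this sketch);
     * tags come from a SortedMap so the same tag set always yields the same key.
     */
    public static byte[] rowkey(String prefix, List<String> partitionValues,
                                long timestamp, SortedMap<String, String> tags) {
        int size = 4 + 4 * partitionValues.size() + 8 + 8 * tags.size();
        ByteBuffer buffer = ByteBuffer.allocate(size);
        buffer.putInt(prefix.hashCode());
        for (String partitionValue : partitionValues) {
            buffer.putInt(partitionValue.hashCode());
        }
        buffer.putLong(timestamp);
        for (Map.Entry<String, String> tag : tags.entrySet()) {
            buffer.putInt(tag.getKey().hashCode());
            buffer.putInt(tag.getValue().hashCode());
        }
        return buffer.array();
    }

    public static void main(String[] args) {
        SortedMap<String, String> tags = new TreeMap<>();
        tags.put("user", "alice");
        byte[] key = rowkey("eagle.hdfs.spacesize.cluster",
                List.of("ares", "lvs"), 1404172800000L, tags);
        System.out.println("rowkey length = " + key.length);
    }
}
```

Putting the prefix and partition keys first keeps rows of one metric, cluster and datacenter adjacent, so a time-range query becomes a contiguous scan.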
HBase Coprocessor
com.ebay.eagle.coprocessor.AggregateProtocol
[Chart: avg, count, max, min and sum aggregations compared across three series: no coprocessor in single region, coprocessor in single region, estimated in cluster; y-axis from 0 to 20000, unit not given]
• Uniform HBase rowkey design for all types of monitoring data sources
• Logically partition data by tags, which are defined in the annotation @Partition({"cluster", "datacenter"})
• Physically shard data by HBase native features: rowkey range and region mapping
• Write throughput optimized by using HBase multi-put
• Coprocessor to maximize query performance
• Push evaluation of numeric filters down to HBase
• Secondary index support
• Inspection of RESTful resources and entity metadata
• Numeric filters
• Expression evaluation in output fields
• Rowkey inspection
Tuning for HBase Storage
• Interactive: IPython notebook-like interactive visualization, analysis, and troubleshooting.
• Dashboard: customizable dashboard layout and drill-down path; persist and share.
Generic Dashboard Analytics for Eagle Store
Open Source Soon …
• First use case: securing the Hadoop platform, built on the Eagle framework
• Work closely with Hortonworks, Dataguise, …
• Share with the community and get the community's support
• Continue to open source job monitoring, GC monitoring, etc.
Q & A

Iac d.damyanov 4.pptx
 
Cloud Platforms for Java
Cloud Platforms for JavaCloud Platforms for Java
Cloud Platforms for Java
 
Łukasz Romaszewski on Internet of Things Raspberry Pi and Java Embedded JavaC...
Łukasz Romaszewski on Internet of Things Raspberry Pi and Java Embedded JavaC...Łukasz Romaszewski on Internet of Things Raspberry Pi and Java Embedded JavaC...
Łukasz Romaszewski on Internet of Things Raspberry Pi and Java Embedded JavaC...
 
Spring boot
Spring bootSpring boot
Spring boot
 
OpenStack Preso: DevOps on Hybrid Infrastructure
OpenStack Preso: DevOps on Hybrid InfrastructureOpenStack Preso: DevOps on Hybrid Infrastructure
OpenStack Preso: DevOps on Hybrid Infrastructure
 
2016 - Open Mic - IGNITE - Open Infrastructure = ANY Infrastructure
2016 - Open Mic - IGNITE - Open Infrastructure = ANY Infrastructure2016 - Open Mic - IGNITE - Open Infrastructure = ANY Infrastructure
2016 - Open Mic - IGNITE - Open Infrastructure = ANY Infrastructure
 
Google app-engine-cloudcamplagos2011
Google app-engine-cloudcamplagos2011Google app-engine-cloudcamplagos2011
Google app-engine-cloudcamplagos2011
 
Integrating Splunk into your Spring Applications
Integrating Splunk into your Spring ApplicationsIntegrating Splunk into your Spring Applications
Integrating Splunk into your Spring Applications
 
Google App Engine
Google App EngineGoogle App Engine
Google App Engine
 
What to expect from Java 9
What to expect from Java 9What to expect from Java 9
What to expect from Java 9
 

Plus de DataWorks Summit

Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisDataWorks Summit
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiDataWorks Summit
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...DataWorks Summit
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...DataWorks Summit
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal SystemDataWorks Summit
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExampleDataWorks Summit
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberDataWorks Summit
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixDataWorks Summit
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiDataWorks Summit
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsDataWorks Summit
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureDataWorks Summit
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EngineDataWorks Summit
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...DataWorks Summit
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudDataWorks Summit
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiDataWorks Summit
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerDataWorks Summit
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...DataWorks Summit
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouDataWorks Summit
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkDataWorks Summit
 

Plus de DataWorks Summit (20)

Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
 
Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache Ratis
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal System
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist Example
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at Uber
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability Improvements
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant Architecture
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything Engine
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google Cloud
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near You
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
 

Dernier

"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rick Flair
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxBkGupta21
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...AliaaTarek5
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Visualising and forecasting stocks using Dash
Visualising and forecasting stocks using DashVisualising and forecasting stocks using Dash
Visualising and forecasting stocks using Dashnarutouzumaki53779
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 

Dernier (20)

"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Visualising and forecasting stocks using Dash
Visualising and forecasting stocks using DashVisualising and forecasting stocks using Dash
Visualising and forecasting stocks using Dash
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 

Hadoop Eagle - Real Time Monitoring Framework for eBay Hadoop

  • 1. HADOOP EAGLE: Full-stack real-time monitoring framework for eBay Hadoop. Edward Zhang (yonzhang@ebay.com, @yonzhang2012), Hao Chen (hchen9@ebay.com, @ihaoch)
  • 2. Background – initial use cases. Use case: detect node anomalies by analyzing the task failure ratio across all nodes. Assumption: the task failure ratio should be approximately equal on every node. Algorithm: node-by-node comparison (symmetry violation) and per-node trend.
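The node-by-node comparison on this slide could be sketched roughly as follows. This is a minimal illustration, not Eagle's implementation: the class name `FailureRatioAnomalyDetector`, the use of the cluster median as the reference point, and the `maxDeviation` threshold are all assumptions, and the per-node trend check is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: flags nodes that break the "all nodes roughly equal" assumption. */
class FailureRatioAnomalyDetector {

    /** Returns failed / total, or 0 when the node ran no tasks. */
    static double failureRatio(long failedTasks, long totalTasks) {
        return totalTasks == 0 ? 0.0 : (double) failedTasks / totalTasks;
    }

    /** Median of the given values; the input array is not modified. */
    static double median(double[] values) {
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    /**
     * Node-by-node comparison: a node is reported when its failure ratio exceeds
     * the cluster median by more than maxDeviation (an assumed threshold).
     * Node names are returned in sorted order.
     */
    static List<String> findAnomalousNodes(Map<String, Double> ratioByNode, double maxDeviation) {
        List<String> anomalies = new ArrayList<>();
        if (ratioByNode.isEmpty()) {
            return anomalies;
        }
        double[] ratios = ratioByNode.values().stream().mapToDouble(Double::doubleValue).toArray();
        double clusterMedian = median(ratios);
        for (Map.Entry<String, Double> entry : new TreeMap<>(ratioByNode).entrySet()) {
            if (entry.getValue() - clusterMedian > maxDeviation) {
                anomalies.add(entry.getKey());
            }
        }
        return anomalies;
    }
}
```

The median is used here only because a single bad node cannot drag it far; a mean or a percentile band would fit the same assumption.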
  • 3. Host: Task failure based anomaly host detection. Anomaly Detection & Alerting Analysis Auto-Remediation
  • 4. Scale Challenges @ eBay Hadoop Monitoring • 10+ large Hadoop clusters • 10,000+ data nodes • 50,000+ jobs per day • 50,000,000+ tasks per day • 500+ types of Hadoop/HBase native metrics • Billions of audit events and metrics per day
  • 5. Use Case Challenges @ eBay Hadoop Monitoring • Host • Task failure ratio based machine anomaly detection • Job monitoring across its lifetime • Real-time running job performance analysis • Near real-time job history analytics • Data skew detection • Hadoop native metrics • HDFS • HBase • M/R • Logs • GC log • Hadoop daemon log • Audit log • HDFS image file • YARN Framework • Queue
  • 6. Engineering Challenges @ eBay Hadoop Monitoring • Varieties of data sources: M/R history jobs, running jobs, GC log, NameNode log, Hadoop native metrics, YARN queue, audit log, HDFS image file, etc. • Varieties of data collectors: pull from HDFS, pull from the YARN API, ship logs, … • Complex business logic: join outside data, pre-aggregations, memory window, … • Alert rules can’t be hot deployed • Scalability issue with single process
  • 7. Job History Performance Analyzer • Monitor job history files in near real-time • Crawl each job history file immediately after it is completed • Apply expertise rules for job performance suggestions • Job history trend for the same type of job. [Diagram: Job Start Event, Task Start Event, Task End Event, Task roll-up, Task2 Start Event, Task2 End Event, Task roll-up, Job End Event, Job Suggestion Rules]
  • 8. Job real-time monitoring • Monitor running jobs in real time • Minute-level job progress snapshots • Minute-level resource usage snapshots • CPU, HDFS I/O, disk I/O, slot seconds • Roll up to user/queue/cluster level • Sliding-window-based alerts
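A sliding-window alert over minute-level snapshots might look like the sketch below. It is an assumption-laden illustration rather than Eagle code: the class `SlidingWindowAlert`, the "window average above threshold" rule, and the requirement that timestamps arrive in non-decreasing order are all hypothetical choices.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch of a time-based sliding-window alert over metric snapshots. */
class SlidingWindowAlert {

    /** One snapshot: when it was taken and the metric value at that time. */
    private static final class Sample {
        final long timestampMillis;
        final double value;

        Sample(long timestampMillis, double value) {
            this.timestampMillis = timestampMillis;
            this.value = value;
        }
    }

    private final Deque<Sample> window = new ArrayDeque<>();
    private final long windowMillis;
    private final double threshold;
    private double sum = 0.0;

    SlidingWindowAlert(long windowMillis, double threshold) {
        if (windowMillis <= 0) {
            throw new IllegalArgumentException("windowMillis must be positive");
        }
        this.windowMillis = windowMillis;
        this.threshold = threshold;
    }

    /**
     * Adds a snapshot, evicts snapshots that fell out of the window, and returns
     * true when the average of the remaining snapshots exceeds the threshold.
     * Assumes timestamps are offered in non-decreasing order.
     */
    boolean offer(long timestampMillis, double value) {
        window.addLast(new Sample(timestampMillis, value));
        sum += value;
        while (window.peekFirst().timestampMillis <= timestampMillis - windowMillis) {
            sum -= window.removeFirst().value;
        }
        return sum / window.size() > threshold;
    }
}
```

With a three-minute window and a threshold of 80, a run of snapshots 50, 90, 100, 100 taken one minute apart raises the alert only on the last one, once the low first sample has been evicted.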
  • 9. Service: GC Log / Server Log • GC event detection and prediction • Log metrics statistics • Real-time log anomaly detection
  • 10. Why Eagle Monitoring Framework
  • 11. Programming Paradigm and Abstraction • Data collector -> data processing -> metric pre-aggregation/alert engine -> storage -> dashboards • We need to create a framework that covers the full stack of a monitoring system
  • 12. Eagle as a Framework. As a framework, Eagle does not assume: • Data source (where, what) • Business logic execution path (how) • Policy engine implementation (how) • Data sink (where, what). As a framework, Eagle does the following: • SQL-like service API • High-performing query framework • Lightweight streaming process Java API • Extensible policy engine implementation • Scalable and distributed rule evaluation • Native HBase data storage support • Metadata-driven stream processing • Data source extensibility • Data sink extensibility • Interactive dashboard
  • 13. Eagle Overall Architecture
  • 14. Eagle Monitoring Framework Internals • Lightweight Streaming Process Framework • Extensible & Scalable Policy Framework for Alert • Eagle Query Framework • Interactive Dashboards
  • 15. Lightweight Streaming Process Framework. Facts • Computation is based on single events, which together constitute an endless continuous stream • Computation can be aggregation, time window, length window, join with outside data, etc. • The filter design pattern is used for modularizing code at the beginning. Abstraction • Inspired by the Cascading framework, we abstract a lightweight streaming programming API that is independent of the execution environment • A streaming process is a directed acyclic graph • This layer of indirection is for code modularization, code reuse, and prevention of coupling with a specific execution environment • Runs on a single process, Storm, or other streaming technology such as Spark
  • 16. Eagle Stream Data Processing API. Step 1: Task DAG setup
      @Override
      protected void buildDependency(FlowDef def, DataProcessConfig config) {
          Task header = Task.newTask("wordgenerator").setExecutor(source).completeBuild();
          Task uppertask = Task.newTask("uppercase").setExecutor(new UppercaseExecutor()).connectFrom(header).completeBuild();
          Task groupbyUppercaseTask = Task.newTask("groupby_uppercase").setExecutor(new GroupbyCountExecutor()).connectFrom(uppertask).completeBuild();
          def.endBy(groupbyUppercaseTask);
      }
    Step 2: Inter-task data exchange protocol
  • 17. Execution Graph: development, compile and deploy. Development / Compile Phase; Deployment / Runtime Phase
  • 18. Eagle Monitoring Framework Internals • Lightweight Streaming Process Framework • Extensible & Scalable Policy Framework for Alert • Eagle Query Framework • Interactive Dashboards
  • 19. Extensible & Scalable Policy Framework. Scalability • Dynamic policy partitioning across compute nodes based on a configurable partition class • Dynamic policy deployment • Event partitioning by Storm and policy partitioning by Eagle (N events * M policies). Extensibility • Supports new policy evaluation engines, for example Siddhi, Esper, machine learning, etc. Features • Policy CRUD • Stream metadata (event attribute name, attribute type, attribute value resolver, …)
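One way to picture policy partitioning with a configurable partition class is the sketch below. The interface `PolicyPartitioner`, the hash-based default, and the helper `policiesFor` are hypothetical names chosen for illustration; the slide does not show Eagle's actual partitioner contract.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: spreads M policies over a fixed number of evaluator partitions. */
class PolicyPartitioningSketch {

    /** Stand-in for the "configurable partition class" mentioned on the slide. */
    interface PolicyPartitioner {
        /** Returns a partition index in the range [0, totalPartitions). */
        int partition(String policyId, int totalPartitions);
    }

    /** Assumed default strategy: hash of the policy id modulo the partition count. */
    static final class HashPolicyPartitioner implements PolicyPartitioner {
        @Override
        public int partition(String policyId, int totalPartitions) {
            // floorMod keeps the result non-negative even for negative hash codes.
            return Math.floorMod(policyId.hashCode(), totalPartitions);
        }
    }

    /** Selects the policies that a given evaluator partition is responsible for. */
    static List<String> policiesFor(int partitionIndex, int totalPartitions,
                                    List<String> allPolicyIds, PolicyPartitioner partitioner) {
        List<String> owned = new ArrayList<>();
        for (String policyId : allPolicyIds) {
            if (partitioner.partition(policyId, totalPartitions) == partitionIndex) {
                owned.add(policyId);
            }
        }
        return owned;
    }
}
```

Under this scheme each evaluator applies only the policies it owns to the events routed to it, so no single evaluator has to apply all M policies.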
  • 20. Dynamic Policy Partitioning
  • 21. Scalability of Policy Evaluation
  • 22. Extensibility of policy framework. The Policy Evaluator Provider uses SPI to register policy engine implementations.
      public interface PolicyEvaluatorServiceProvider {
          public String getPolicyType();
          public Class<? extends PolicyEvaluator> getPolicyEvaluator();
          public Class<? extends PolicyDefinitionParser> getPolicyDefinitionParser();
          public Class<? extends PolicyEvaluatorBuilder> getPolicyEvaluatorBuilder();
          public List<Module> getBindingModules();
      }
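In standard Java, SPI registration is usually done with `java.util.ServiceLoader` and a `META-INF/services` entry. The sketch below assumes that mechanism, which the slide does not confirm, and uses a simplified, hypothetical `Provider` interface with only `getPolicyType()` instead of the full interface shown above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.ServiceLoader;

/** Hypothetical sketch: discovers policy engine providers through Java's ServiceLoader. */
class PolicyProviderRegistry {

    /** Simplified stand-in for the provider interface on the slide. */
    public interface Provider {
        String getPolicyType();
    }

    /** Indexes providers by policy type; a later provider replaces an earlier one of the same type. */
    static Map<String, Provider> index(Iterable<Provider> providers) {
        Map<String, Provider> byType = new HashMap<>();
        for (Provider provider : providers) {
            byType.put(provider.getPolicyType(), provider);
        }
        return byType;
    }

    /**
     * Loads every implementation listed in a META-INF/services file named after
     * the Provider interface's binary name on the classpath.
     */
    static Map<String, Provider> discover() {
        return index(ServiceLoader.load(Provider.class));
    }
}
```

With this approach a new engine is added by shipping a jar that contains the implementation and its services file; the framework code that calls `discover()` does not change.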
  • 23. Eagle Monitoring Framework Internals • Lightweight Streaming Process Framework • Extensible & Scalable Policy Framework for Alert • Eagle Query Framework • Interactive Dashboards
  • 24. Eagle Query Framework. A lightweight, metadata-driven store layer that serves the storage and query requirements commonly shared by most monitoring systems. Persistence • Metric • Event • Metadata • Alert • Log • Customized Structure • … Query • Search • Filter • Aggregation • Sort • Expression • … Features • Simple API • Powerful query • High performance • Scalability • Pluggable • …
  • 25. Eagle Query Framework
  • 26. Eagle Query Framework • Metadata definition ORM • High-performance RESTful API supporting CRUD • SQL-like declarative query syntax • Generic service client library • Native support for HBase and RDBMS • Interactive and customizable dashboard
  • 27. Metadata definition ORM • Annotations are metadata for entities • Metadata-driven query compiling and response rendering • Metadata-driven ser/deser • Rename columns to shorter strings (HBase) • Entity metadata primitives • Table • ColumnFamily • Prefix (the very first partition key) • Service (entity identifier) • Partition • Tags • Indexes • Column
      @Table("alertdef")
      @ColumnFamily("f")
      @Prefix("alertdef")
      @Service(AlertConstants.ALERT_DEFINITION_SERVICE_ENDPOINT_NAME)
      @TimeSeries(false)
      @Partition({"cluster", "datacenter"})
      @Tags({"programId", "alertExecutorId", "policyId", "policyType"})
      @Indexes({ @Index(name="Index_1_alertExecutorId", columns = { "alertExecutorID" }) })
      public class AlertDefinitionAPIEntity extends TaggedLogAPIEntity {
          @Column("a") private String desc;
          @Column("b") private String policyDef;
          @Column("c") private String dedupeDef;
          @Column("d") private String notificationDef;
          @Column("e") private String remediationDef;
          @Column("f") private boolean enabled;
      }
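Metadata-driven mapping of this kind typically relies on runtime-retained annotations read through reflection. The sketch below shows that general technique only: the `Table` and `Column` annotations are redefined locally as hypothetical stand-ins, and `AlertDef` is a cut-down example entity, not Eagle's `AlertDefinitionAPIEntity`.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: derives table and column-qualifier metadata from annotations. */
class EntityMetadataSketch {

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface Table {
        String value();
    }

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface Column {
        String value();
    }

    /** Cut-down example entity; only annotated fields are persisted. */
    @Table("alertdef")
    static class AlertDef {
        @Column("a") String desc;
        @Column("b") String policyDef;
        String notPersisted;
    }

    /** Returns the table name declared on the entity class. */
    static String tableOf(Class<?> entity) {
        Table table = entity.getAnnotation(Table.class);
        if (table == null) {
            throw new IllegalArgumentException(entity.getName() + " has no @Table annotation");
        }
        return table.value();
    }

    /** Maps each annotated field name to its short column qualifier, sorted by field name. */
    static Map<String, String> columnQualifiers(Class<?> entity) {
        Map<String, String> qualifiers = new TreeMap<>();
        for (Field field : entity.getDeclaredFields()) {
            Column column = field.getAnnotation(Column.class);
            if (column != null) {
                qualifiers.put(field.getName(), column.value());
            }
        }
        return qualifiers;
    }
}
```

The short qualifiers ("a", "b") correspond to the slide's point about renaming columns to shorter strings for HBase, while the descriptive field names stay visible to application code.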
  • 28. Generic RESTful API & Query
    Endpoint: eagle-service/rest/entities?query=<Query>
    <Query> ::= <EntityName> "[" <FilterCondition> "]" "<" <GroupbyFields> ">" "{" <AggregatedFunctions> "}" [ "." "{" <SortbyOptions> "}" ]
  • 29. Generic RESTful API Query Syntax
    Search query:
        query=JobExecutionService[@cluster="xyz" AND @datacenter="abc"]{@startTime,@numTotalMaps}&startTime=&endTime=&pageSize=100
    Aggregation query, of the form <EntityName>[QueryCondition]<GroupbyFields>{AggregatedFunctions}.{SortbyOptions}:
        query=JobExecutionService[@cluster="xyz" AND @datacenter="abc"]<@user>{count, min(endTime-startTime)}&startTime=&endTime=&pageSize=100
    Numeric filters:
        query=TaskFailureCountService[@cluster="xyz" AND @datacenter="abc" AND @failureCount>10]{@startTime,@failureCount}&startTime=&endTime=&pageSize=100
    Operators: CONTAINS, IN, !=, =, <, <=, >, >=
    Pagination:
        query=TaskFailureCountService[@cluster="xyz" AND @datacenter="abc" AND @failureCount>10]{@startTime,@failureCount}&startTime=&endTime=&pageSize=100&startRowkey=BgVz-9R…….
    Time-series histogram query:
        query=GenericMetricService[@cluster="ares" AND @datacenter="lvs"]<@user>{sum(value)}.{sum(value) desc}&timeSeries=true&intervmin=1440&pageSize=10000000&startTime=2014-07-01 00:00:00&endTime=2014-08-01 00:00:00&metricName=eagle.hdfs.spacesize.cluster
  • 30. Generic Eagle Service Client Library: Basic CRUD • Fluent DSL • Metric Builder API • Parallel Client • Asynchronous Client
    Sending metrics in batches:
        client.metric("unit.test.metrics")
            .batch(5)
            .tags(tags)
            .send("unit.test.metrics", System.currentTimeMillis(), tags, 0.1, 0.2, 0.3)
            .send(System.currentTimeMillis(), 0.1)
            .send(System.currentTimeMillis(), 0.1, 0.2)
            .send(System.currentTimeMillis(), tags, 0.1, 0.2, 0.3)
            .send("unit.test.anothermetrics", System.currentTimeMillis(), tags, 0.1, 0.2, 0.3)
            .flush();
    Searching (the inner double quotes are escaped so the Java string literal compiles):
        client.search("GenericMetricService[@cluster=\"cluster4ut\" AND @datacenter=\"datacenter4ut\"]<@cluster>{sum(value)}")
            .startTime(0)
            .endTime(System.currentTimeMillis() + 24 * 3600 * 1000)
            .metricName("unit.test.metrics")
            .pageSize(1000)
            .send();
  • 31. HBase Storage Design: uniform rowkey design
    General: Rowkey ::= Prefix | Partition Keys | timestamp | tagName | tagValue | …
    Metric:  Rowkey ::= Metric Name | Partition Keys | timestamp | tagName | tagValue | …
    Entity:  Rowkey ::= Default Prefix | Partition Keys | timestamp | tagName | tagValue | …
    Log:     Rowkey ::= Log Type | Partition Keys | timestamp | tagName | tagValue | …
             Rowvalue ::= Log Content
  • 32. HBase Coprocessor: com.ebay.eagle.coprocessor.AggregateProtocol
    [Chart: avg, count, max, min, and sum aggregations compared across three series: no coprocessor in a single region, coprocessor in a single region, and estimated in cluster. The axis runs from 0 to 20000; the unit is not given.]
  • 33. Tuning for HBase Storage: Uniform HBase rowkey design for all types of monitoring data sources • Data logically partitioned by the tags declared in the annotation @Partition({"cluster", "datacenter"}) • Data physically sharded by HBase's native mapping of rowkey ranges to regions • Write throughput optimized with HBase multi-put • Coprocessor to maximize query performance • Evaluation of numeric filters pushed down to HBase • Secondary index support • Inspection of RESTful resources and entity metadata • Numeric filters • Expression evaluation in output fields • Rowkey inspection
  • 34. Eagle Monitoring Framework Internals: Lightweight Streaming Process Framework • Extensible & Scalable Policy Framework for Alert • Eagle Query Framework • Interactive Dashboards
  • 35. Generic Dashboard Analytics for Eagle Store: Interactive: IPython notebook-like interactive visualization for analysis and troubleshooting • Dashboard: customizable dashboard layout and drill-down path that can be persisted and shared
  • 36. Open Source Soon …: First use case: securing the Hadoop platform with the Eagle framework • Work closely with Hortonworks, Dataguise, … • Share with the community and gain its support • Continue by open-sourcing job monitoring, GC monitoring, etc.
  • 37. Q & A
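The query syntax on slides 28 and 29 can be assembled programmatically. The following is a minimal sketch, not Eagle's client code: the class `EagleQueryUrl` and its methods are hypothetical names introduced for illustration, and the base URL is a placeholder. Only the `eagle-service/rest/entities?query=` path, the bracket and brace syntax, and the `pageSize` parameter come from the slides.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical helper (not part of Eagle) that builds query strings in the
// form shown on slide 29 and URL-encodes them for an HTTP GET.
final class EagleQueryUrl {

    // Produces: Entity[filter]<groupBy1,groupBy2>{fn1,fn2}
    static String aggregateQuery(String entity, String filter,
                                 List<String> groupBy, List<String> functions) {
        return entity
                + "[" + filter + "]"
                + "<" + String.join(",", groupBy) + ">"
                + "{" + String.join(",", functions) + "}";
    }

    // Appends the encoded query to the REST path taken from slide 28.
    // baseUrl is an assumption, e.g. "http://localhost:8080".
    static String toUrl(String baseUrl, String query, int pageSize) {
        String encoded = URLEncoder.encode(query, StandardCharsets.UTF_8);
        return baseUrl + "/eagle-service/rest/entities?query=" + encoded
                + "&pageSize=" + pageSize;
    }

    public static void main(String[] args) {
        String query = aggregateQuery(
                "JobExecutionService",
                "@cluster=\"xyz\" AND @datacenter=\"abc\"",
                List.of("@user"),
                List.of("count", "min(endTime-startTime)"));
        System.out.println(query);
        System.out.println(toUrl("http://localhost:8080", query, 100));
    }
}
```

Encoding the query matters because characters such as `[`, `"`, `<`, and `@` are not safe inside a URL query string; the slides show the unencoded form for readability.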

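The uniform rowkey layout on slide 31 (prefix, partition keys, timestamp, then tag name and value pairs) can be illustrated with a small sketch. The slides do not say how each field is encoded, so everything about the byte layout here is an assumption made for illustration: the class `RowkeySketch` is hypothetical, each string is reduced to a 4-byte `hashCode`, the timestamp is written as an 8-byte long, and tags are sorted by name so the key is deterministic.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of: Prefix | Partition Keys | timestamp | tagName | tagValue | ...
// The field encodings are assumptions, not Eagle's actual format.
final class RowkeySketch {

    static byte[] build(String prefix, List<String> partitionValues,
                        long timestamp, Map<String, String> tags) {
        // Sort tags by name so the same tag set always yields the same key.
        Map<String, String> sortedTags = new TreeMap<>(tags);
        int size = 4 + 4 * partitionValues.size() + 8 + 8 * sortedTags.size();
        ByteBuffer buf = ByteBuffer.allocate(size);
        buf.putInt(prefix.hashCode());               // prefix
        for (String value : partitionValues) {
            buf.putInt(value.hashCode());            // partition keys, in declared order
        }
        buf.putLong(timestamp);                      // timestamp
        for (Map.Entry<String, String> tag : sortedTags.entrySet()) {
            buf.putInt(tag.getKey().hashCode());     // tagName
            buf.putInt(tag.getValue().hashCode());   // tagValue
        }
        return buf.array();
    }

    public static void main(String[] args) {
        byte[] key = build("alertdef", List.of("cluster1", "dc1"),
                System.currentTimeMillis(), Map.of("policyId", "p1"));
        System.out.println(key.length + " bytes: " + Arrays.toString(key));
    }
}
```

Because HBase stores rows sorted by rowkey bytes, rows that share a prefix and partition values share their leading bytes and therefore sit next to each other, which is what makes the range-to-region sharding mentioned on slide 33 possible.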
Editor's notes

  1. Anomaly detection algorithm: Job history files are crawled continuously, each one immediately after its job completes, and a minute-level job failure ratio is calculated for each node. A node is identified as anomalous when either of the following two conditions holds: (a) the node continuously fails the tasks that run on it; (b) the node has a significantly higher failure ratio than the rest of the nodes in the cluster.
  2. Inspired by TSDB, Ganglia, Nagios, Zabbix, etc. Most of them focus on infrastructure-level data collection and alerting, but they do not address business logic complexity, that is, how to prepare the data.
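The two anomaly conditions in note 1 can be sketched as follows. This is an illustration under stated assumptions, not Eagle's algorithm: the class `NodeAnomalySketch`, all four thresholds, and the use of the peer median as the comparison baseline are hypothetical choices. The input is assumed to be a per-node list of minute-level failure ratios, oldest first.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of the two conditions in note 1. Thresholds are made up.
final class NodeAnomalySketch {
    static final int CONSECUTIVE_WINDOWS = 3; // assumed: windows that count as "continuously"
    static final double TREND_RATIO = 0.5;    // assumed: ratio that counts as "failing"
    static final double PEER_FACTOR = 3.0;    // assumed: "significantly higher" multiplier
    static final double MIN_RATIO = 0.1;      // assumed: ignore very small ratios

    // Condition (a): the node kept failing tasks over the most recent windows.
    static boolean failsContinuously(List<Double> ratios) {
        if (ratios.size() < CONSECUTIVE_WINDOWS) return false;
        return ratios.subList(ratios.size() - CONSECUTIVE_WINDOWS, ratios.size())
                .stream().allMatch(r -> r >= TREND_RATIO);
    }

    static double median(List<Double> values) {
        List<Double> sorted = new ArrayList<>(values);
        Collections.sort(sorted);
        int n = sorted.size();
        if (n == 0) return 0.0;
        return n % 2 == 1 ? sorted.get(n / 2)
                          : (sorted.get(n / 2 - 1) + sorted.get(n / 2)) / 2.0;
    }

    // Condition (b): the node's latest ratio is far above the median of its peers.
    static boolean exceedsPeers(double ratio, List<Double> peerRatios) {
        if (peerRatios.isEmpty()) return false;
        return ratio >= MIN_RATIO && ratio > PEER_FACTOR * median(peerRatios);
    }

    static Set<String> anomalousNodes(Map<String, List<Double>> ratiosByNode) {
        Set<String> result = new TreeSet<>();
        for (Map.Entry<String, List<Double>> node : ratiosByNode.entrySet()) {
            List<Double> history = node.getValue();
            if (history.isEmpty()) continue;
            double latest = history.get(history.size() - 1);
            // Collect the latest ratio of every other node that has data.
            List<Double> peers = new ArrayList<>();
            for (Map.Entry<String, List<Double>> other : ratiosByNode.entrySet()) {
                List<Double> h = other.getValue();
                if (!other.getKey().equals(node.getKey()) && !h.isEmpty()) {
                    peers.add(h.get(h.size() - 1));
                }
            }
            if (failsContinuously(history) || exceedsPeers(latest, peers)) {
                result.add(node.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(anomalousNodes(Map.of(
                "node-a", List.of(0.01, 0.02, 0.01),
                "node-b", List.of(0.6, 0.7, 0.8))));
    }
}
```

The median is used for the peer comparison so that one badly failing node does not raise the baseline for every other node; a mean or a percentile would be equally valid choices for this sketch.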