SlideShare a Scribd company logo
1 of 17
A Developers Guide To Coprocessors
Hbasecon 2013John Weatherford
https://github.com/jweatherford
Telescope is the leading provider of interactive
television, audience participation and customer engagement
solutions.
Clients include TV networks, producers, digital
platforms, studios, and sponsors seeking to
reach, engage, and retain mass-audiences and consumers in
real-time.
Who Is Telescope?
Arbitrary code that can run on each server
Extendthe functionality of Hbase
Avoid bothering the core committers
What Is A Coprocessor
Region 2
Endpoint
Region 3
Post-Action
Endpoint
Endpoints
Call a function explicitly
Execute code on all regions
Action
Observers
React to an event
Run code before or after
Two Types of Coprocessors
Pre-ActionClient
Region 1
Endpoint
Client
What Can I Do With Coprocessors
Ideas
what can be done
Access Control
Secondary Indexes
Optimized Search
Data Aggregation
Control compaction times
Real Time Analytics
Reduce result sets
Cache Request
Email split alerts
Getting Started With Code
preGet(ObserverContext<RegionCoprocessorEnvironment> c, Get get,
List<KeyValue> result)
postGet(ObserverContext<RegionCoprocessorEnvironment> c, Get get,
List<KeyValue> result)
prePut(ObserverContext<RegionCoprocessorEnvironment> c, Put put,
WALEdit edit, boolean writeToWAL)
postPut(ObserverContext<RegionCoprocessorEnvironment> c, Put put,
WALEdit edit, boolean writeToWAL)
preDelete(ObserverContext<RegionCoprocessorEnvironment> c, Delete delete,
WALEdit edit, boolean writeToWAL)
postDelete(ObserverContext<RegionCoprocessorEnvironment> c, Delete delete,
WALEdit edit, boolean writeToWAL)
Our First Observer
Intercept and modify the action
Consider all circumstances that will trigger the observer
Compile your jar to the same version of Java running your
Hbase Regions
Look for output from the coprocessor
key: id-1332343
twitter:name: “loljk4u”
twitter:message: “<3”
twitter:length: 0x2
twitter:registered: 0xFF
favorite:name: “Taylor”
favorite:song: “I knew
you were trouble”
Our First Observer
Motivation Apache flume only writes one column per put
{twitter:
{ name: “loljk4u”,
message: “<3”,
length: 2,
registered: true
},
favorite:
{ name: “Taylor”
...
JSON
key: id-1332343
family: twitter
qualifier: json_raw
value: “{twitter:
{name: “loljk4u”,
message: “<3”,
length: 2,
registered: true
...
Single
Row Put
preput()
put
JsonColumnExpander
//get the arguments on the coprocessor
public void start(CoprocessorEnvironment env) throws IOException {
Configuration c = env.getConfiguration();
families = c.get("families", "").split(":");
}
public void prePut(ObserverContext<…> e, Put put, WALEdit edit, boolean waL) {
if(!put.has(FAMILY, JSON_COLUMN)) { return; } //check for the json_raw column
String json = Bytes.toString(put.get(FAMILY, JSON_COLUMN).get(0).getValue());
for(Entry<String, ?> column : columns.entrySet()) { //loop through the json
String value = (String) column.getValue();
put.add(family, Bytes.toBytes(column.getKey()), Bytes.toBytes(value));
}
//remove the original json from the put
put.add(FAMILY, JSON_COLUMN, "--removed--".getBytes());
}
Loading the Coprocessor
Push the jar to where your cluster can find it
$>hadoop fs –put JsonColumnExpander.jar /
Alter the table to enable the coprocessor
$> alter „test', METHOD =>
'table_att', 'coprocessor'=>'hdfs:///JsonColumnExpander.jar|telesco
pe.hbase.JsonColumnExpander|1001|arg1=1,arg2=2„
Verify the load by checking the master web UI.
Running The Code
Trigger the coprocessor with a put on the table
Put put = new Put(“rowkey”);
Put.add(“twitter”.toBytes(), “json_raw”.toBytes(), json_data);
Check each server’s local logs
http://regionnode:60030/logs/
hbase-hbase-regionserver-node2.
dev-hadoop.telescope.tv.out
Creating Your First Endpoint
Define the available methods a protocol
Implement the protocol
Extend BaseRegionEndpoint
Load the endpoint on the table
Endpoint Example
public interface TrendsProtocol extends CoprocessorProtocol{
HashMap<String, Long> getData() throws IOException;
}
//The endpoint class implements the protocol we wrote above
public class TrendsEndpoint extends BaseEndpointCoprocessor implements TrendsProtocol {
@Override
public HashMap<String, Long> getTrends() throws IOException {
RegionCoprocessorEnvironment environment = getEnvironment();
InternalScanner scanner = environment.getRegion().getScanner(s);
try {
List<KeyValue> curVals = new ArrayList<KeyValue>();
do {
curVals.clear();
for(KeyValue pair : curVals){
//loop through values on the region and process
}
}while(!done);
}
}
}
Endpoint Returned Results
htable = HBaseDB.getTable(connection, “hbase_demo");
Map<byte[], HashMap<String, Long>> results = null;
results = m_analytics.coprocessorExec(
TrendsProtocol.class,
null, //start row
null, //end row
new Batch.Call<TrendsProtocol, HashMap<String, Long>>(){
@Override
public HashMap<String, Long> call(TrendsProtocol trends)throws IOException {
return trends.getData();
}
}
);
for (Map.Entry<byte[], Boolean> entry : results.entrySet()) {
//process results from each region server
}
Addendum to Endpoints
0.96 is changing Endpoints to use protobuf
public static abstract class RowCountService
implements com.google.protobuf.Service {
...
public interface Interface {
public abstract void getRowCount(
com.google.protobuf.RpcController controller,
CountRequest request,
com.google.protobuf.RpcCallback done);
public abstract void getKeyValueCount(
com.google.protobuf.RpcController controller,
CountRequest request,
com.google.protobuf.RpcCallback done);
}
}
Telescope’s Coprocessors
Observers collect real time analytics data for our
moderation platform as well as to create aggregate tables
for the steaming data
Endpoints optimize searches and transmit only the
necessary data. Perform simple reporting queries that
don’t need the full power of mapreduce.
Questions?
Alreadyusing coprocessors? I would love to hear about it.
Curious to know more about a specific part?
All code samples and table definitions can be found at
https://github.com/jweatherford

More Related Content

What's hot

006 performance tuningandclusteradmin
006 performance tuningandclusteradmin006 performance tuningandclusteradmin
006 performance tuningandclusteradminScott Miao
 
HBaseCon 2012 | Learning HBase Internals - Lars Hofhansl, Salesforce
HBaseCon 2012 | Learning HBase Internals - Lars Hofhansl, SalesforceHBaseCon 2012 | Learning HBase Internals - Lars Hofhansl, Salesforce
HBaseCon 2012 | Learning HBase Internals - Lars Hofhansl, SalesforceCloudera, Inc.
 
HBase 0.20.0 Performance Evaluation
HBase 0.20.0 Performance EvaluationHBase 0.20.0 Performance Evaluation
HBase 0.20.0 Performance EvaluationSchubert Zhang
 
HBaseCon 2013: How to Get the MTTR Below 1 Minute and More
HBaseCon 2013: How to Get the MTTR Below 1 Minute and MoreHBaseCon 2013: How to Get the MTTR Below 1 Minute and More
HBaseCon 2013: How to Get the MTTR Below 1 Minute and MoreCloudera, Inc.
 
HBaseCon 2012 | Base Metrics: What They Mean to You - Cloudera
HBaseCon 2012 | Base Metrics: What They Mean to You - ClouderaHBaseCon 2012 | Base Metrics: What They Mean to You - Cloudera
HBaseCon 2012 | Base Metrics: What They Mean to You - ClouderaCloudera, Inc.
 
Meet hbase 2.0
Meet hbase 2.0Meet hbase 2.0
Meet hbase 2.0enissoz
 
HBaseCon 2013: Scalable Network Designs for Apache HBase
HBaseCon 2013: Scalable Network Designs for Apache HBaseHBaseCon 2013: Scalable Network Designs for Apache HBase
HBaseCon 2013: Scalable Network Designs for Apache HBaseCloudera, Inc.
 
PGConf.ASIA 2019 Bali - Building PostgreSQL as a Service with Kubernetes - Ta...
PGConf.ASIA 2019 Bali - Building PostgreSQL as a Service with Kubernetes - Ta...PGConf.ASIA 2019 Bali - Building PostgreSQL as a Service with Kubernetes - Ta...
PGConf.ASIA 2019 Bali - Building PostgreSQL as a Service with Kubernetes - Ta...Equnix Business Solutions
 
HBaseCon2017 Improving HBase availability in a multi tenant environment
HBaseCon2017 Improving HBase availability in a multi tenant environmentHBaseCon2017 Improving HBase availability in a multi tenant environment
HBaseCon2017 Improving HBase availability in a multi tenant environmentHBaseCon
 
Background Tasks in Node - Evan Tahler, TaskRabbit
Background Tasks in Node - Evan Tahler, TaskRabbitBackground Tasks in Node - Evan Tahler, TaskRabbit
Background Tasks in Node - Evan Tahler, TaskRabbitRedis Labs
 
Out of the box replication in postgres 9.4(pg confus)
Out of the box replication in postgres 9.4(pg confus)Out of the box replication in postgres 9.4(pg confus)
Out of the box replication in postgres 9.4(pg confus)Denish Patel
 
HBaseCon2017 Removable singularity: a story of HBase upgrade in Pinterest
HBaseCon2017 Removable singularity: a story of HBase upgrade in PinterestHBaseCon2017 Removable singularity: a story of HBase upgrade in Pinterest
HBaseCon2017 Removable singularity: a story of HBase upgrade in PinterestHBaseCon
 
Feed Burner Scalability
Feed Burner ScalabilityFeed Burner Scalability
Feed Burner Scalabilitydidip
 
Streaming replication in PostgreSQL
Streaming replication in PostgreSQLStreaming replication in PostgreSQL
Streaming replication in PostgreSQLAshnikbiz
 
Postgres & Redis Sitting in a Tree- Rimas Silkaitis, Heroku
Postgres & Redis Sitting in a Tree- Rimas Silkaitis, HerokuPostgres & Redis Sitting in a Tree- Rimas Silkaitis, Heroku
Postgres & Redis Sitting in a Tree- Rimas Silkaitis, HerokuRedis Labs
 
Taming GC Pauses for Humongous Java Heaps in Spark Graph Computing-(Eric Kacz...
Taming GC Pauses for Humongous Java Heaps in Spark Graph Computing-(Eric Kacz...Taming GC Pauses for Humongous Java Heaps in Spark Graph Computing-(Eric Kacz...
Taming GC Pauses for Humongous Java Heaps in Spark Graph Computing-(Eric Kacz...Spark Summit
 
Shipping Data from Postgres to Clickhouse, by Murat Kabilov, Adjust
Shipping Data from Postgres to Clickhouse, by Murat Kabilov, AdjustShipping Data from Postgres to Clickhouse, by Murat Kabilov, Adjust
Shipping Data from Postgres to Clickhouse, by Murat Kabilov, AdjustAltinity Ltd
 
Hug Hbase Presentation.
Hug Hbase Presentation.Hug Hbase Presentation.
Hug Hbase Presentation.Jack Levin
 
HBaseCon 2015: HBase at Scale in an Online and High-Demand Environment
HBaseCon 2015: HBase at Scale in an Online and  High-Demand EnvironmentHBaseCon 2015: HBase at Scale in an Online and  High-Demand Environment
HBaseCon 2015: HBase at Scale in an Online and High-Demand EnvironmentHBaseCon
 

What's hot (20)

006 performance tuningandclusteradmin
006 performance tuningandclusteradmin006 performance tuningandclusteradmin
006 performance tuningandclusteradmin
 
HBaseCon 2012 | Learning HBase Internals - Lars Hofhansl, Salesforce
HBaseCon 2012 | Learning HBase Internals - Lars Hofhansl, SalesforceHBaseCon 2012 | Learning HBase Internals - Lars Hofhansl, Salesforce
HBaseCon 2012 | Learning HBase Internals - Lars Hofhansl, Salesforce
 
HBase 0.20.0 Performance Evaluation
HBase 0.20.0 Performance EvaluationHBase 0.20.0 Performance Evaluation
HBase 0.20.0 Performance Evaluation
 
HBaseCon 2013: How to Get the MTTR Below 1 Minute and More
HBaseCon 2013: How to Get the MTTR Below 1 Minute and MoreHBaseCon 2013: How to Get the MTTR Below 1 Minute and More
HBaseCon 2013: How to Get the MTTR Below 1 Minute and More
 
HBaseCon 2012 | Base Metrics: What They Mean to You - Cloudera
HBaseCon 2012 | Base Metrics: What They Mean to You - ClouderaHBaseCon 2012 | Base Metrics: What They Mean to You - Cloudera
HBaseCon 2012 | Base Metrics: What They Mean to You - Cloudera
 
Meet hbase 2.0
Meet hbase 2.0Meet hbase 2.0
Meet hbase 2.0
 
HBaseCon 2013: Scalable Network Designs for Apache HBase
HBaseCon 2013: Scalable Network Designs for Apache HBaseHBaseCon 2013: Scalable Network Designs for Apache HBase
HBaseCon 2013: Scalable Network Designs for Apache HBase
 
PGConf.ASIA 2019 Bali - Building PostgreSQL as a Service with Kubernetes - Ta...
PGConf.ASIA 2019 Bali - Building PostgreSQL as a Service with Kubernetes - Ta...PGConf.ASIA 2019 Bali - Building PostgreSQL as a Service with Kubernetes - Ta...
PGConf.ASIA 2019 Bali - Building PostgreSQL as a Service with Kubernetes - Ta...
 
HBaseCon2017 Improving HBase availability in a multi tenant environment
HBaseCon2017 Improving HBase availability in a multi tenant environmentHBaseCon2017 Improving HBase availability in a multi tenant environment
HBaseCon2017 Improving HBase availability in a multi tenant environment
 
Background Tasks in Node - Evan Tahler, TaskRabbit
Background Tasks in Node - Evan Tahler, TaskRabbitBackground Tasks in Node - Evan Tahler, TaskRabbit
Background Tasks in Node - Evan Tahler, TaskRabbit
 
Out of the box replication in postgres 9.4(pg confus)
Out of the box replication in postgres 9.4(pg confus)Out of the box replication in postgres 9.4(pg confus)
Out of the box replication in postgres 9.4(pg confus)
 
HBaseCon2017 Removable singularity: a story of HBase upgrade in Pinterest
HBaseCon2017 Removable singularity: a story of HBase upgrade in PinterestHBaseCon2017 Removable singularity: a story of HBase upgrade in Pinterest
HBaseCon2017 Removable singularity: a story of HBase upgrade in Pinterest
 
Feed Burner Scalability
Feed Burner ScalabilityFeed Burner Scalability
Feed Burner Scalability
 
Hbase Nosql
Hbase NosqlHbase Nosql
Hbase Nosql
 
Streaming replication in PostgreSQL
Streaming replication in PostgreSQLStreaming replication in PostgreSQL
Streaming replication in PostgreSQL
 
Postgres & Redis Sitting in a Tree- Rimas Silkaitis, Heroku
Postgres & Redis Sitting in a Tree- Rimas Silkaitis, HerokuPostgres & Redis Sitting in a Tree- Rimas Silkaitis, Heroku
Postgres & Redis Sitting in a Tree- Rimas Silkaitis, Heroku
 
Taming GC Pauses for Humongous Java Heaps in Spark Graph Computing-(Eric Kacz...
Taming GC Pauses for Humongous Java Heaps in Spark Graph Computing-(Eric Kacz...Taming GC Pauses for Humongous Java Heaps in Spark Graph Computing-(Eric Kacz...
Taming GC Pauses for Humongous Java Heaps in Spark Graph Computing-(Eric Kacz...
 
Shipping Data from Postgres to Clickhouse, by Murat Kabilov, Adjust
Shipping Data from Postgres to Clickhouse, by Murat Kabilov, AdjustShipping Data from Postgres to Clickhouse, by Murat Kabilov, Adjust
Shipping Data from Postgres to Clickhouse, by Murat Kabilov, Adjust
 
Hug Hbase Presentation.
Hug Hbase Presentation.Hug Hbase Presentation.
Hug Hbase Presentation.
 
HBaseCon 2015: HBase at Scale in an Online and High-Demand Environment
HBaseCon 2015: HBase at Scale in an Online and  High-Demand EnvironmentHBaseCon 2015: HBase at Scale in an Online and  High-Demand Environment
HBaseCon 2015: HBase at Scale in an Online and High-Demand Environment
 

Similar to HBaseCon 2013: A Developer’s Guide to Coprocessors

OSGi and Eclipse RCP
OSGi and Eclipse RCPOSGi and Eclipse RCP
OSGi and Eclipse RCPEric Jain
 
Release with confidence
Release with confidenceRelease with confidence
Release with confidenceJohn Congdon
 
Monitoring InfluxEnterprise
Monitoring InfluxEnterpriseMonitoring InfluxEnterprise
Monitoring InfluxEnterpriseInfluxData
 
What the CRaC - Superfast JVM startup
What the CRaC - Superfast JVM startupWhat the CRaC - Superfast JVM startup
What the CRaC - Superfast JVM startupGerrit Grunwald
 
Do you know what your drupal is doing? Observe it!
Do you know what your drupal is doing? Observe it!Do you know what your drupal is doing? Observe it!
Do you know what your drupal is doing? Observe it!Luca Lusso
 
Average- An android project
Average- An android projectAverage- An android project
Average- An android projectIpsit Dash
 
Hazelcast and MongoDB at Cloud CMS
Hazelcast and MongoDB at Cloud CMSHazelcast and MongoDB at Cloud CMS
Hazelcast and MongoDB at Cloud CMSuzquiano
 
How to go the extra mile on monitoring
How to go the extra mile on monitoringHow to go the extra mile on monitoring
How to go the extra mile on monitoringTiago Simões
 
Bang-Bang, you have been hacked - Yonatan Levin, KolGene
Bang-Bang, you have been hacked - Yonatan Levin, KolGeneBang-Bang, you have been hacked - Yonatan Levin, KolGene
Bang-Bang, you have been hacked - Yonatan Levin, KolGeneDroidConTLV
 
Burn down the silos! Helping dev and ops gel on high availability websites
Burn down the silos! Helping dev and ops gel on high availability websitesBurn down the silos! Helping dev and ops gel on high availability websites
Burn down the silos! Helping dev and ops gel on high availability websitesLindsay Holmwood
 
Camel one v3-6
Camel one v3-6Camel one v3-6
Camel one v3-6wxdydx
 
Dev fest 2020 taiwan how to debug microservices on kubernetes as a pros (ht...
Dev fest 2020 taiwan   how to debug microservices on kubernetes as a pros (ht...Dev fest 2020 taiwan   how to debug microservices on kubernetes as a pros (ht...
Dev fest 2020 taiwan how to debug microservices on kubernetes as a pros (ht...KAI CHU CHUNG
 
An introduction to Test Driven Development on MapReduce
An introduction to Test Driven Development on MapReduceAn introduction to Test Driven Development on MapReduce
An introduction to Test Driven Development on MapReduceAnanth PackkilDurai
 
Node.js - async for the rest of us.
Node.js - async for the rest of us.Node.js - async for the rest of us.
Node.js - async for the rest of us.Mike Brevoort
 
Bringing order to the chaos! - Paulo Lopes - Codemotion Amsterdam 2018
Bringing order to the chaos! - Paulo Lopes - Codemotion Amsterdam 2018Bringing order to the chaos! - Paulo Lopes - Codemotion Amsterdam 2018
Bringing order to the chaos! - Paulo Lopes - Codemotion Amsterdam 2018Codemotion
 
Automated Abstraction of Flow of Control in a System of Distributed Software...
Automated Abstraction of Flow of Control in a System of Distributed  Software...Automated Abstraction of Flow of Control in a System of Distributed  Software...
Automated Abstraction of Flow of Control in a System of Distributed Software...nimak
 
Real World Lessons on the Pain Points of Node.JS Application
Real World Lessons on the Pain Points of Node.JS ApplicationReal World Lessons on the Pain Points of Node.JS Application
Real World Lessons on the Pain Points of Node.JS ApplicationBen Hall
 
The unconventional devices for the Android video streaming
The unconventional devices for the Android video streamingThe unconventional devices for the Android video streaming
The unconventional devices for the Android video streamingMatteo Bonifazi
 

Similar to HBaseCon 2013: A Developer’s Guide to Coprocessors (20)

OSGi and Eclipse RCP
OSGi and Eclipse RCPOSGi and Eclipse RCP
OSGi and Eclipse RCP
 
Release with confidence
Release with confidenceRelease with confidence
Release with confidence
 
Monitoring InfluxEnterprise
Monitoring InfluxEnterpriseMonitoring InfluxEnterprise
Monitoring InfluxEnterprise
 
Air superiority for Android Apps
Air superiority for Android AppsAir superiority for Android Apps
Air superiority for Android Apps
 
What the CRaC - Superfast JVM startup
What the CRaC - Superfast JVM startupWhat the CRaC - Superfast JVM startup
What the CRaC - Superfast JVM startup
 
Do you know what your drupal is doing? Observe it!
Do you know what your drupal is doing? Observe it!Do you know what your drupal is doing? Observe it!
Do you know what your drupal is doing? Observe it!
 
Average- An android project
Average- An android projectAverage- An android project
Average- An android project
 
Hazelcast and MongoDB at Cloud CMS
Hazelcast and MongoDB at Cloud CMSHazelcast and MongoDB at Cloud CMS
Hazelcast and MongoDB at Cloud CMS
 
How to go the extra mile on monitoring
How to go the extra mile on monitoringHow to go the extra mile on monitoring
How to go the extra mile on monitoring
 
Lesson 2
Lesson 2Lesson 2
Lesson 2
 
Bang-Bang, you have been hacked - Yonatan Levin, KolGene
Bang-Bang, you have been hacked - Yonatan Levin, KolGeneBang-Bang, you have been hacked - Yonatan Levin, KolGene
Bang-Bang, you have been hacked - Yonatan Levin, KolGene
 
Burn down the silos! Helping dev and ops gel on high availability websites
Burn down the silos! Helping dev and ops gel on high availability websitesBurn down the silos! Helping dev and ops gel on high availability websites
Burn down the silos! Helping dev and ops gel on high availability websites
 
Camel one v3-6
Camel one v3-6Camel one v3-6
Camel one v3-6
 
Dev fest 2020 taiwan how to debug microservices on kubernetes as a pros (ht...
Dev fest 2020 taiwan   how to debug microservices on kubernetes as a pros (ht...Dev fest 2020 taiwan   how to debug microservices on kubernetes as a pros (ht...
Dev fest 2020 taiwan how to debug microservices on kubernetes as a pros (ht...
 
An introduction to Test Driven Development on MapReduce
An introduction to Test Driven Development on MapReduceAn introduction to Test Driven Development on MapReduce
An introduction to Test Driven Development on MapReduce
 
Node.js - async for the rest of us.
Node.js - async for the rest of us.Node.js - async for the rest of us.
Node.js - async for the rest of us.
 
Bringing order to the chaos! - Paulo Lopes - Codemotion Amsterdam 2018
Bringing order to the chaos! - Paulo Lopes - Codemotion Amsterdam 2018Bringing order to the chaos! - Paulo Lopes - Codemotion Amsterdam 2018
Bringing order to the chaos! - Paulo Lopes - Codemotion Amsterdam 2018
 
Automated Abstraction of Flow of Control in a System of Distributed Software...
Automated Abstraction of Flow of Control in a System of Distributed  Software...Automated Abstraction of Flow of Control in a System of Distributed  Software...
Automated Abstraction of Flow of Control in a System of Distributed Software...
 
Real World Lessons on the Pain Points of Node.JS Application
Real World Lessons on the Pain Points of Node.JS ApplicationReal World Lessons on the Pain Points of Node.JS Application
Real World Lessons on the Pain Points of Node.JS Application
 
The unconventional devices for the Android video streaming
The unconventional devices for the Android video streamingThe unconventional devices for the Android video streaming
The unconventional devices for the Android video streaming
 

More from Cloudera, Inc.

Partner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxPartner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxCloudera, Inc.
 
Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera, Inc.
 
2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards FinalistsCloudera, Inc.
 
Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Cloudera, Inc.
 
Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Cloudera, Inc.
 
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Cloudera, Inc.
 
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Cloudera, Inc.
 
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Cloudera, Inc.
 
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Cloudera, Inc.
 
Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Cloudera, Inc.
 
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Cloudera, Inc.
 
Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Cloudera, Inc.
 
Extending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformExtending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformCloudera, Inc.
 
Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Cloudera, Inc.
 
Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Cloudera, Inc.
 
Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Cloudera, Inc.
 
Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Cloudera, Inc.
 

More from Cloudera, Inc. (20)

Partner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxPartner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptx
 
Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists
 
2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists
 
Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019
 
Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19
 
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
 
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19
 
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19
 
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
 
Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19
 
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
 
Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18
 
Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3
 
Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2
 
Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1
 
Extending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformExtending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the Platform
 
Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18
 
Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360
 
Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18
 
Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18
 

Recently uploaded

Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 

Recently uploaded (20)

Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 

HBaseCon 2013: A Developer’s Guide to Coprocessors

  • 1. A Developers Guide To Coprocessors Hbasecon 2013John Weatherford https://github.com/jweatherford
  • 2. Telescope is the leading provider of interactive television, audience participation and customer engagement solutions. Clients include TV networks, producers, digital platforms, studios, and sponsors seeking to reach, engage, and retain mass-audiences and consumers in real-time. Who Is Telescope?
  • 3. Arbitrary code that can run on each server Extendthe functionality of Hbase Avoid bothering the core committers What Is A Coprocessor
  • 4. Region 2 Endpoint Region 3 Post-Action Endpoint Endpoints Call a function explicitly Execute code on all regions Action Observers React to an event Run code before or after Two Types of Coprocessors Pre-ActionClient Region 1 Endpoint Client
  • 5. What Can I Do With Coprocessors Ideas what can be done Access Control Secondary Indexes Optimized Search Data Aggregation Control compaction times Real Time Analytics Reduce result sets Cache Request Email split alerts
  • 6. Getting Started With Code preGet(ObserverContext<RegionCoprocessorEnvironment> c, Get get, List<KeyValue> result) postGet(ObserverContext<RegionCoprocessorEnvironment> c, Get get, List<KeyValue> result) prePut(ObserverContext<RegionCoprocessorEnvironment> c, Put put, WALEdit edit, boolean writeToWAL) postPut(ObserverContext<RegionCoprocessorEnvironment> c, Put put, WALEdit edit, boolean writeToWAL) preDelete(ObserverContext<RegionCoprocessorEnvironment> c, Delete delete, WALEdit edit, boolean writeToWAL) postDelete(ObserverContext<RegionCoprocessorEnvironment> c, Delete delete, WALEdit edit, boolean writeToWAL)
  • 7. Our First Observer Intercept and modify the action Consider all circumstances that will trigger the observer Compile your jar to the same version of Java running your Hbase Regions Look for output from the coprocessor
  • 8. key: id-1332343 twitter:name: “loljk4u” twitter:message: “<3” twitter:length: 0x2 twitter:registered: 0xFF favorite:name: “Taylor” favorite:song: “I knew you were trouble” Our First Observer Motivation Apache flume only writes one column per put {twitter: { name: “loljk4u”, message: “<3”, length: 2, registered: true }, favorite: { name: “Taylor” ... JSON key: id-1332343 family: twitter qualifier: json_raw value: “{twitter: {name: “loljk4u”, message: “<3”, length: 2, registered: true ... Single Row Put preput() put
  • 9. JsonColumnExpander //get the arguments on the coprocessor public void start(CoprocessorEnvironment env) throws IOException { Configuration c = env.getConfiguration(); families = c.get("families", "").split(":"); } public void prePut(ObserverContext<…> e, Put put, WALEdit edit, boolean waL) { if(!put.has(FAMILY, JSON_COLUMN)) { return; } //check for the json_raw column String json = Bytes.toString(put.get(FAMILY, JSON_COLUMN).get(0).getValue()); for(Entry<String, ?> column : columns.entrySet()) { //loop through the json String value = (String) column.getValue(); put.add(family, Bytes.toBytes(column.getKey()), Bytes.toBytes(value)); } //remove the original json from the put put.add(FAMILY, JSON_COLUMN, "--removed--".getBytes()); }
  • 10. Loading the Coprocessor Push the jar to where your cluster can find it $>hadoop fs –put JsonColumnExpander.jar / Alter the table to enable the coprocessor $> alter „test', METHOD => 'table_att', 'coprocessor'=>'hdfs:///JsonColumnExpander.jar|telesco pe.hbase.JsonColumnExpander|1001|arg1=1,arg2=2„ Verify the load by checking the master web UI.
  • 11. Running The Code Trigger the coprocessor with a put on the table Put put = new Put(“rowkey”); Put.add(“twitter”.toBytes(), “json_raw”.toBytes(), json_data); Check each server’s local logs http://regionnode:60030/logs/ hbase-hbase-regionserver-node2. dev-hadoop.telescope.tv.out
  • 12. Creating Your First Endpoint Define the available methods a protocol Implement the protocol Extend BaseRegionEndpoint Load the endpoint on the table
  • 13. Endpoint Example public interface TrendsProtocol extends CoprocessorProtocol{ HashMap<String, Long> getData() throws IOException; } //The endpoint class implements the protocol we wrote above public class TrendsEndpoint extends BaseEndpointCoprocessor implements TrendsProtocol { @Override public HashMap<String, Long> getTrends() throws IOException { RegionCoprocessorEnvironment environment = getEnvironment(); InternalScanner scanner = environment.getRegion().getScanner(s); try { List<KeyValue> curVals = new ArrayList<KeyValue>(); do { curVals.clear(); for(KeyValue pair : curVals){ //loop through values on the region and process } }while(!done); } } }
  • 14. Endpoint Returned Results htable = HBaseDB.getTable(connection, “hbase_demo"); Map<byte[], HashMap<String, Long>> results = null; results = m_analytics.coprocessorExec( TrendsProtocol.class, null, //start row null, //end row new Batch.Call<TrendsProtocol, HashMap<String, Long>>(){ @Override public HashMap<String, Long> call(TrendsProtocol trends)throws IOException { return trends.getData(); } } ); for (Map.Entry<byte[], Boolean> entry : results.entrySet()) { //process results from each region server }
  • 15. Addendum to Endpoints 0.96 is changing Endpoints to use protobuf public static abstract class RowCountService implements com.google.protobuf.Service { ... public interface Interface { public abstract void getRowCount( com.google.protobuf.RpcController controller, CountRequest request, com.google.protobuf.RpcCallback done); public abstract void getKeyValueCount( com.google.protobuf.RpcController controller, CountRequest request, com.google.protobuf.RpcCallback done); } }
  • 16. Telescope’s Coprocessors Observers collect real time analytics data for our moderation platform as well as to create aggregate tables for the steaming data Endpoints optimize searches and transmit only the necessary data. Perform simple reporting queries that don’t need the full power of mapreduce.
  • 17. Questions? Alreadyusing coprocessors? I would love to hear about it. Curious to know more about a specific part? All code samples and table definitions can be found at https://github.com/jweatherford

Editor's Notes

  1. Thank you for comingI am John WeatherfordThis is going to have Java code
  2. We create digital products and applications for multiple devices and deliver campaigns and solutions across multiple platforms, live events, and more.Our major campaigns are Idol, Voice
  3. There are two different types of Coprocessors, endpoints and observers.Observers are code that is triggered by an Hbase operation. In the relational DB model, this is logically similar to a trigger.Endpoints are code that is called explicitly as a function on the server. In the relational DB model, this is logically similar to a stored proceedureCoprocessors can be run on all regions, just the regions of a particular table or just on the master VERIFY THIS IS TRUE
  4. RegionObserverMasterObserverWALObserver
  5. RidiulousHbase exampleRestrict data changes after midnight? Rick Roll a random data requests
  6. Coprocessor class? What does all this extend?
  7. Example: AsyncHbase writer for Apache Flume doesn’t allow more than a single column write per operation. The goal of this observer is to allow flume to send all the column data we need and simply organize it when we get to the server
  8. There are two ways to load the Jar, through the hbase-site.xml and altering the table. For demonstration we will be altering the table.Check the github repo for a base script that can be used to load the coprocessor through each stepSHOW: PICTURE: Insert picture of the loaded coprocessor
  9. Each server has local logs that can be accessed through the master UI. Should the coprocessor have some sort of error, we can find the output here.
  10. Explain what a protocol is.Endpoints aren’t triggered by actIt is important to remember endpoints run on all servers that contain any key within the start and end key passed.ions on the table, but called directly from the client.
  11. Remember the endpoint runs on all the region servers so we are returned a set of results in a mapCall the endpoint in your client code.https://hbase.apache.org/0.94/apidocs/org/apache/hadoop/hbase/coprocessor/BaseEndpointCoprocessor.html
  12. Remember the endpoint runs on all the region servers so we are returned a set of results in a mapCall the endpoint in your client code.https://hbase.apache.org/0.94/apidocs/org/apache/hadoop/hbase/coprocessor/BaseEndpointCoprocessor.html