History of Event Collector
One of the legacy systems in Treasure Data
Mitsunori Komatsu
About me
• Mitsunori Komatsu (@komamitsu), Software engineer (Backend team)
• Joined Treasure Data almost 5 years ago
• Hive, Presto, PlazmaDB, Mobile SDKs, Datatank,
Workflow, … Event Collector, Bigdam (Pig,
Impala…)
• Favorite language: OCaml
• RE OSS dev
• MessagePack-Java, Digdag, Fluency
Retired legacy systems in Treasure Data
• Apache Pig integration
• Apache Impala integration
• Prestogres
• Event Collector (retirement candidate)
What’s Event Collector?
• HTTP server application that receives events from the JavaScript SDK, Mobile SDKs, etc…
• Buffers events for several minutes on local disk and uploads them to the existing import endpoint in Treasure Data
• The original data ingestion endpoint, td-api, isn’t good at handling frequent small uploads
• Consists of Fluentd in/out plugins (similar to in_http / out_tdlog) so that it can rely on Fluentd’s buffering mechanism
• It’s been developed ad hoc and improved ad hoc…
System architecture (2014-06)
[Diagram: JS SDK / Mobile SDKs / Postback from SaaS → HTTPS (event w/ UUID) → Fluentd: in_event_collector (dedup against Redis holding UUID#0, UUID#1, UUID#2, UUID#3, …, UUID#N) → 1 event, 1 event, … → buffer chunks (retention time: 1min) → out_event_collector → HTTPS (events set) → td-api]
Problem #0 (2014-11)
• Event Collector stores the UUID of each request from Mobile SDKs in Redis for de-duplication
• Event Collector needed to scale out, but that required Redis to scale out too…
Sharded Redis
• UUIDs of requests are hash-partitioned and stored in a sharded Redis cluster
• We wrote a slightly smarter Redis client that can
• fail over to a secondary Redis instance
• double-write UUIDs to another Redis instance as well as the currently assigned one, so that re-partitioning can be done w/o duplicated data
Sharded Redis
[Diagram: Event Collector holds the Redis list (Redis#0, Redis#1). event#0 w/ UUID#0 (=>1000): 1000 % 2 = 0 => Redis#0, so UUID#0 is stored in Redis#0.]
Sharded Redis
[Diagram: Redis list (Redis#0, Redis#1). event#1 w/ UUID#1 (=>1001): 1001 % 2 = 1 => Redis#1. Redis#0 holds UUID#0; Redis#1 holds UUID#1.]
Sharded Redis
[Diagram: Redis list (Redis#0, Redis#1); New Redis list (Redis#0, Redis#1, Redis#2). event#4 w/ UUID#4 (=>1004): 1004 % 2 = 0 => Redis#0; 1004 % 3 = 2 => Redis#2 for replication. Redis#0 holds UUID#0, UUID#4; Redis#1 holds UUID#1; Redis#2 holds UUID#4 (UUID replicated in advance; not used for dedup for now).]
Sharded Redis
[Diagram: Redis list (Redis#0, Redis#1, Redis#2). event#11 w/ UUID#11 (=>1011): 1011 % 3 = 2 => Redis#2. Redis#0 holds UUID#0, UUID#4; Redis#1 holds UUID#1, UUID#7; Redis#2 holds UUID#4, UUID#7, UUID#11. For UUID#4/7, this Redis stored them in advance and can dedup them.]
System architecture (2014-11)
[Diagram: JS SDK / Mobile SDKs / Postback from SaaS → HTTPS (event w/ UUID) → Fluentd: in_event_collector (dedup against Sharded Redis holding UUID#0, UUID#1, UUID#2, UUID#3, …, UUID#N; Redis#0 holds UUID#0, UUID#4, UUID#8, UUID#12, …, UUID#N) → 1 event, 1 event, … → buffer chunks (retention time: 1min) → out_event_collector → HTTPS (events set) → td-api]
Problem #1 (2015-01)
• Usage of Event Collector kept growing, and it sometimes went down…
• kern.log showed “TCP: Possible SYN flooding on port xxxx. Sending cookies.”
• “$ netstat -s” showed:
19855 times the listen queue of a socket overflowed
19855 SYNs to LISTEN sockets dropped
Listen socket backlog
• The default listen socket backlog queue length was only 1024.
• Such a short queue was vulnerable to traffic spikes
• Increased it to 8192.
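As a rough illustration of where this setting lives, the sketch below opens a listening socket with a backlog of 8192. It is not Event Collector's code (which is a Fluentd plugin in Ruby); `open_listener` and `effective_backlog` are hypothetical helpers. On Linux the kernel silently truncates the requested backlog to `net.core.somaxconn`, so that sysctl may need raising as well.

```python
import socket


def effective_backlog(requested: int, somaxconn: int) -> int:
    """Linux silently truncates the listen() backlog to net.core.somaxconn."""
    return min(requested, somaxconn)


def open_listener(host: str, port: int, backlog: int = 8192) -> socket.socket:
    """Open a TCP listening socket with an enlarged accept queue."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind((host, port))
    server.listen(backlog)  # the value requested here is what the slide raised
    return server
```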
System architecture (2015-01)
[Diagram: JS SDK / Mobile SDKs / Postback from SaaS → HTTPS (event w/ UUID) → Fluentd: in_event_collector (LISTEN socket backlog: 8192; dedup against Sharded Redis holding UUID#0, UUID#1, UUID#2, UUID#3, …, UUID#N; Redis#0 holds UUID#0, UUID#4, UUID#8, UUID#12, …, UUID#N) → 1 event, 1 event, … → buffer chunks (retention time: 1min) → out_event_collector → HTTPS (events set) → td-api]
Problem #2 (2015-05)
• Usage of Event Collector grew even further. It still sometimes went down…
• There were 2 options: optimize the source code, or run Event Collector as multiple processes
Optimization of source code
• First…, profile! profile! profile!
• The 2 performance bottlenecks were:
• sending metrics per request to another Fluentd
➡ Aggregated metrics in memory over 5-second ranges to reduce the number of messages sent to another Fluentd
• parsing UserAgent
➡ Cached 100 UserAgentParser (ua-parser) instances with LRU eviction
• Performance improved 50 times
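Both optimizations can be sketched in a few lines. This is an illustrative Python sketch, not the Ruby plugin code: `parse_user_agent` is a cheap stand-in for a real ua-parser call, and `MetricsAggregator` with its `send` callback is a hypothetical name for whatever forwards metrics to the other Fluentd.

```python
import time
from collections import Counter
from functools import lru_cache
from typing import Callable, Dict


@lru_cache(maxsize=100)  # keep the 100 most recently used results (LRU eviction)
def parse_user_agent(user_agent: str) -> str:
    # Stand-in for an expensive ua-parser call: returns the product token.
    return user_agent.split("/", 1)[0]


class MetricsAggregator:
    """Accumulate counters in memory and send one message per time window."""

    def __init__(self, send: Callable[[Dict[str, int]], None],
                 window_sec: float = 5.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._send = send
        self._window_sec = window_sec
        self._clock = clock
        self._counts: Counter = Counter()
        self._window_start = clock()

    def incr(self, name: str, amount: int = 1) -> None:
        self._counts[name] += amount
        if self._clock() - self._window_start >= self._window_sec:
            self.flush()

    def flush(self) -> None:
        if self._counts:
            self._send(dict(self._counts))  # one message instead of one per request
            self._counts.clear()
        self._window_start = self._clock()
```

In this sketch a flush is triggered only by the next `incr` call after the window has elapsed; a real implementation would likely also flush from a timer so that idle periods do not hold metrics back.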
In multiprocess
• Only 1 Event Collector process was running, and Ruby can’t make the best of multiple cores with multiple threads
• It was time to run Event Collector in multiprocess!
fluent-plugin-multiprocess
• With fluent-plugin-multiprocess, 8 sets of Event Collector input / output plugin workers can run in multiple processes
• Performance improved 6 times with it
• Drawback:
• The number of output plugin workers also increased
• As a result, the number of chunks uploaded to td-api increased significantly. td-api sometimes suffered from a lot of tiny uploaded chunk files…
• In other words, td-api was a bottleneck in Event Collector’s scalability
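A back-of-the-envelope calculation shows why the drawback matters. The sketch assumes each output plugin worker flushes exactly one chunk per retention interval (1 minute, per the architecture diagrams) and ignores size-triggered flushes; `chunks_per_hour` is a hypothetical helper.

```python
def chunks_per_hour(output_workers: int, flush_interval_sec: int) -> int:
    """Chunks uploaded per hour by one host, assuming one chunk per worker per flush."""
    flushes_per_hour = 3600 // flush_interval_sec
    return output_workers * flushes_per_hour


single = chunks_per_hour(output_workers=1, flush_interval_sec=60)
multi = chunks_per_hour(output_workers=8, flush_interval_sec=60)
print(single, multi)
```

Under these assumptions the same traffic is split into 8 times as many, and correspondingly smaller, chunk files per host.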
fluent-plugin-multiprocess
[Diagram: events → Nginx → Event Collector (three pairs of input plugin worker + output plugin worker) → td-api]
System architecture (2015-05)
[Diagram: JS SDK / Mobile SDKs / Postback from SaaS → HTTPS (event w/ UUID) → Nginx → Fluentd (multi process #0) ×3: in_event_collector (dedup against Sharded Redis holding UUID#0, UUID#1, UUID#2, UUID#3, …, UUID#N; Redis#0 holds UUID#0, UUID#4, UUID#8, UUID#12, …, UUID#N) → 1 event, 1 event, … → buffer chunks (retention time: 1min) → out_event_collector → HTTPS (events set) → td-api]
- LISTEN socket backlog : 8192
- Avoid instantiating Ruby object
- Reduce metrics requests to Fluentd
Problem #3 (2015-12)
• td-api sometimes became unstable because the number of uploaded chunk files had increased…
• We needed to reduce the number of chunk files uploaded to td-api from Event Collector
detach_process
• Found a Fluentd configuration item “detach_process” for input plugins
• With it, multiple input_plugin workers can run in multiple processes while keeping the number of output_plugin workers at 1
• The number of chunk files uploaded to td-api would be reduced!
detach_process
[Diagram: events → Nginx → Event Collector (input plugin worker (parent) plus three detached input plugin workers, all feeding a single output plugin worker) → td-api]
System architecture (2015-12)
[Diagram: JS SDK / Mobile SDKs / Postback from SaaS → HTTPS (event w/ UUID) → Nginx → Fluentd: in_event_collector (detach_process #0) ×3 (dedup against Sharded Redis holding UUID#0, UUID#1, UUID#2, UUID#3, …, UUID#N; Redis#0 holds UUID#0, UUID#4, UUID#8, UUID#12, …, UUID#N) → 1 event, 1 event, … → buffer chunks (retention time: 1min) → out_event_collector → HTTPS (events set) → td-api]
- LISTEN socket backlog : 8192
- Avoid instantiating Ruby object
- Reduce metrics requests to Fluentd
Problem #4 (2016-04)
• “detach_process” had some issues caused by a race condition
• Also, “detach_process” became a deprecated feature!
https://www.fluentd.org/blog/fluentd-v0.14.9-has-been-released
Went back to fluent-plugin-multiprocess…
System architecture (2016-04)
[Diagram: JS SDK / Mobile SDKs / Postback from SaaS → HTTPS (event w/ UUID) → Nginx → Fluentd (multi process #0) ×3: in_event_collector (dedup against Sharded Redis holding UUID#0, UUID#1, UUID#2, UUID#3, …, UUID#N; Redis#0 holds UUID#0, UUID#4, UUID#8, UUID#12, …, UUID#N) → 1 event, 1 event, … → buffer chunks (retention time: 1min) → out_event_collector → HTTPS (events set) → td-api]
- LISTEN socket backlog : 8192
- Avoid instantiating Ruby object
- Reduce metrics requests to Fluentd
Problem #5 (2016-05)
• The throughput of the Redis cluster sometimes became a bottleneck
• When dedup with the Redis cluster for requests from Mobile SDK got stuck, processing of requests from JavaScript SDK / Postback from SaaS got stuck too…
dedup in another thread
• If de-duplication runs in a different thread from the input_plugin worker, the input_plugin worker can continue to process requests even when access to Redis gets stuck
• The existing output_plugin runs in another thread, so doing dedup there might be an option
• But output_plugin handles large chunk files. So if it retries around the end of a chunk file, all records in the chunk file are handled as duplicated records.
• We needed to mitigate the impact of this case
• Let’s insert a new thin output plugin worker!
dedup in another thread
[Diagram, Before: input_plugin worker receives an event w/ UUID, calls GetAndSet: UUID on the Redis Cluster, emits the event to Fluentd's Buffer, and returns response: OK; output_plugin worker takes chunked events from the buffer and uploads them to td-api. If dedup gets stuck, it affects processing of all requests…]
dedup in another thread
[Diagram, After: input_plugin worker receives an event w/ UUID, emits it to Fluentd's Buffer#0 (small chunked events, retention time: 5s), and returns response: OK; the thin output_plugin worker calls GetAndSet: UUID on the Redis Cluster and emits the event to Fluentd's Buffer#1 (chunked events, retention time: 1m); output_plugin worker uploads to td-api. Even if dedup gets stuck, it doesn’t affect processing of all requests! Made input_plugin return response ASAP w/o dedup.]
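The decoupling shown in the "After" diagram can be sketched with a queue and a worker thread. This is only an analogy for the thin output plugin: a `queue.Queue` stands in for Fluentd's Buffer#0, a set for the Redis cluster, a list for Buffer#1, and the function names are hypothetical.

```python
import queue
import threading
from typing import Dict, List, Optional, Set

Event = Dict[str, str]


def handle_request(event: Event, handover: "queue.Queue[Optional[Event]]") -> str:
    """Input side: hand the event over and respond immediately, without dedup."""
    handover.put(event)
    return "OK"


def dedup_worker(handover: "queue.Queue[Optional[Event]]",
                 seen: Set[str], upload_buffer: List[Event]) -> None:
    """Thin worker: dedup in its own thread, then pass events on for upload."""
    while True:
        event = handover.get()
        if event is None:  # sentinel: stop the worker
            return
        # If this check blocks (e.g. a slow dedup store), only this thread waits.
        if event["uuid"] not in seen:
            seen.add(event["uuid"])
            upload_buffer.append(event)
```

Because the handover units are small, a retry in the thin worker re-checks only a few events, which limits how many records could be misjudged as duplicates.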
System architecture (2016-05)
[Diagram: JS SDK / Mobile SDKs / Postback from SaaS → HTTPS (event w/ UUID) → Nginx → Fluentd (multi process #0) ×3: in_event_collector → 1 event, 1 event, … → buffer chunks (retention time: 5s) → out_object_handover (small events set; dedup against Sharded Redis holding UUID#0, UUID#1, UUID#2, UUID#3, …, UUID#N; Redis#0 holds UUID#0, UUID#4, UUID#8, UUID#12, …, UUID#N) → buffer chunks (retention time: 1min) → out_event_collector → HTTPS (events set) → td-api]
- LISTEN socket backlog : 8192
- Avoid instantiating Ruby object
- Reduce metrics requests to Fluentd
Problem #6 (2016-06)
• De-duplication with Redis tended to be delayed
• Even if we upgraded the instance type of the Redis cluster, Redis runs on a single core and can’t benefit from multiple cores…
• Actually, we used Redis as just a KVS. Redis’s complex data types weren’t needed after all
Memcached
• Replaced the dedup cluster with Memcached
• No problems occurred during the migration thanks to the double-write feature
• Based on benchmark results using the actual access pattern of Event Collector, performance doubled and the memory consumption of the dedup cluster dropped to 68% of what Redis used
System architecture (2016-06)
[Diagram: JS SDK / Mobile SDKs / Postback from SaaS → HTTPS (event w/ UUID) → Nginx → Fluentd (multi process #0) ×3: in_event_collector → 1 event, 1 event, … → buffer chunks (retention time: 5s) → out_object_handover (small events set; dedup against Sharded Memcached holding UUID#0, UUID#1, UUID#2, UUID#3, …, UUID#N; Memcached#0 holds UUID#0, UUID#4, UUID#8, UUID#12, …, UUID#N) → buffer chunks (retention time: 1min) → out_event_collector → HTTPS (events set) → td-api]
- LISTEN socket backlog : 8192
- Avoid instantiating Ruby object
- Reduce metrics requests to Fluentd
Problem #7 (2016-12)
• Needed to support a 36-hour TTL on the dedup cluster for one of our customers while keeping the default 1-hour TTL for other customers’ requests
• It sounded easy since Memcached’s APIs support TTL
• But the Memcached dedup cluster stopped reclaiming expired entries that had the 1-hour TTL, and memory consumption increased drastically…
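To make the requirement concrete, here is a minimal sketch of per-customer TTLs on top of an add-if-absent operation. It is an assumption that dedup can be expressed this way: the slides only mention "GetAndSet", and `TtlStore` is an in-memory stand-in that mimics the semantics of Memcached's `add` command (store only if the key does not already exist). The customer name and key format are hypothetical.

```python
import time
from typing import Callable, Dict

DEFAULT_TTL_SEC = 1 * 3600
CUSTOM_TTL_SEC = {"customer-x": 36 * 3600}  # hypothetical customer id


class TtlStore:
    """In-memory stand-in for a cache with add-if-absent and per-key TTL."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._expiry: Dict[str, float] = {}

    def add(self, key: str, ttl_sec: float) -> bool:
        """Store the key unless a live entry already exists; True if stored."""
        now = self._clock()
        expires_at = self._expiry.get(key)
        if expires_at is not None and expires_at > now:
            return False
        self._expiry[key] = now + ttl_sec
        return True


def is_duplicate(store: TtlStore, customer: str, uuid: str) -> bool:
    ttl = CUSTOM_TTL_SEC.get(customer, DEFAULT_TTL_SEC)
    return not store.add(f"{customer}:{uuid}", ttl)
```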
Reclamation mechanism in Memcached
• Found the cause by reading the Memcached source code
• Memcached removes expired entries as long as it keeps finding expired entries in a row. This works fine when all entries have similar TTLs
• But if Memcached holds both 1-hour TTL and 36-hour TTL entries, it stops reclaiming when it hits a 36-hour TTL entry, even if many expired 1-hour TTL entries sit behind it
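The behaviour described above can be reproduced with a toy model. This is a deliberately simplified simulation, not Memcached's actual code: the LRU is a plain list of expiry times ordered tail-first, and both function names are hypothetical.

```python
from typing import List


def reclaim_from_tail(expiries: List[int], now: int) -> int:
    """Remove expired entries from the tail, stopping at the first live entry."""
    removed = 0
    while expiries and expiries[0] <= now:
        expiries.pop(0)
        removed += 1
    return removed


def reclaim_full_crawl(expiries: List[int], now: int) -> int:
    """Walk the whole list and remove every expired entry, like a full crawl."""
    before = len(expiries)
    expiries[:] = [expiry for expiry in expiries if expiry > now]
    return before - len(expiries)


# Tail-first expiry times: two short-TTL entries, one long-TTL entry, two more short ones.
lru = [10, 10, 100, 10, 10]
print(reclaim_from_tail(list(lru), now=50))   # stops at the long-TTL entry
print(reclaim_full_crawl(list(lru), now=50))  # also finds the entries behind it
```

In this model the tail-only pass frees 2 of the 4 expired entries, while the full crawl frees all 4.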

“lru_crawler crawl all”
• Found that Memcached provides an “lru_crawler” API:
- Takes a single, or a list of, numeric classids (ie: 1,3,10). This instructs the crawler to start at the tail of each of these classids and run to the head. The crawler cannot be stopped or restarted until it completes the previous request.
The special keyword "all" instructs it to crawl all slabs with items in them.
https://github.com/memcached/memcached/blob/master/doc/protocol.txt
• Let’s call “lru_crawler crawl all” repeatedly!
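Issuing the command is a one-line exchange over Memcached's text protocol. The sketch below is an assumption about how one might do it from Python rather than what Event Collector's tooling does; `crawl_all` is a hypothetical helper that takes an already connected socket.

```python
import socket


def crawl_all(conn: socket.socket) -> str:
    """Send "lru_crawler crawl all" and return the server's one-line reply."""
    conn.sendall(b"lru_crawler crawl all\r\n")
    with conn.makefile("rb") as reader:
        reply = reader.readline()
    return reply.decode("ascii").strip()


# Usage sketch (host and port are assumptions; 11211 is Memcached's default port):
#   with socket.create_connection(("127.0.0.1", 11211)) as conn:
#       print(crawl_all(conn))
# A scheduler such as cron or a timer loop would repeat this call periodically.
```

Since the crawler cannot be restarted until the previous request completes, a periodic caller should expect an occasional non-OK reply and simply try again on the next cycle.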
System architecture (2016-12)
[Diagram: JS SDK / Mobile SDKs / Postback from SaaS → HTTPS (event w/ UUID) → Nginx → Fluentd (multi process #0) ×3: in_event_collector → 1 event, 1 event, … → buffer chunks (retention time: 5s) → out_object_handover (small events set; dedup against Sharded Memcached holding UUID#0, UUID#1, UUID#2, UUID#3, …, UUID#N; Memcached#0 holds UUID#0, UUID#4, UUID#8, UUID#12, …, UUID#N; lru_crawler) → buffer chunks (retention time: 3min) → out_event_collector → HTTPS (events set) → td-api]
- LISTEN socket backlog : 8192
- Avoid instantiating Ruby object
- Reduce metrics requests to Fluentd
Usage change (requests/sec)
• 2015-05: 910
• 2018-02: 19300
- Increased more than 21 times!
- Event Collector is very important as a part of CDP service
[Graph annotation: Triggered an alert!]
But it still has problems…
• Buffered data stored on local disk isn’t replicated
➡ Bigdam will resolve it!
• Uploads of many small chunk files to td-api
• Aggregating those small ones with the in_forward plugin is an option, though...
➡ Bigdam will resolve it!
• Further performance improvement
➡ Bigdam will resolve it!
Monitoring and Scaling Redis at DataDog - Ilan Rabinovitch, DataDog
 
Getting More Out of the Node.js, PHP, and Python Agents - AppSphere16
Getting More Out of the Node.js, PHP, and Python Agents - AppSphere16Getting More Out of the Node.js, PHP, and Python Agents - AppSphere16
Getting More Out of the Node.js, PHP, and Python Agents - AppSphere16
 

Dernier

Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...stazi3110
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxTier1 app
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...soniya singh
 
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024StefanoLambiase
 
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样umasea
 
React Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaReact Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaHanief Utama
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataBradBedford3
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideChristina Lin
 
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWave PLM
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEOrtus Solutions, Corp
 
Folding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesFolding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesPhilip Schwarz
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...MyIntelliSource, Inc.
 
Intelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmIntelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmSujith Sukumaran
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)OPEN KNOWLEDGE GmbH
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantAxelRicardoTrocheRiq
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software DevelopersVinodh Ram
 
Implementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureImplementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureDinusha Kumarasiri
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsAhmed Mohamed
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...OnePlan Solutions
 

Dernier (20)

Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
 
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
 
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort ServiceHot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
 
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
 
React Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaReact Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief Utama
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
 
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need It
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
 
Folding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesFolding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a series
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
Intelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmIntelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalm
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service Consultant
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software Developers
 
Implementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureImplementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with Azure
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML Diagrams
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...
 

History of Event Collector in Treasure Data

  • 1. History of Event Collector: One of the legacy systems in Treasure Data. Mitsunori Komatsu
  • 2. About me • Mitsunori Komatsu (@komamitsu), Software engineer (Backend team) • Joined Treasure Data almost 5 years ago • Hive, Presto, PlazmaDB, Mobile SDKs, Datatank, Workflow, … Event Collector, Bigdam (Pig, Impala…) • Favorite language: OCaml • RE OSS dev • MessagePack-Java, Digdag, Fluency
  • 3. Retired legacy systems in Treasure Data Apache Pig integration Apache Impala integration Prestogres ?????????
  • 4. Retired legacy systems in Treasure Data Apache Pig integration Apache Impala integration Prestogres Event Collector (retirement candidate)
  • 5. What’s Event Collector? • An HTTP server application that receives events from the JavaScript SDK, Mobile SDKs, etc. • Buffers events for several minutes on local disk and uploads them to the existing import endpoint in Treasure Data • The original data ingestion endpoint, td-api, isn’t good at handling frequent small uploads • Consists of Fluentd in/out plugins (similar to in_http / out_tdlog) so that it can rely on Fluentd’s buffering mechanism • It’s been developed ad hoc and improved ad hoc…
  • 6. System architecture (2014-06) [diagram] JS SDK / Mobile SDKs / Postback from SaaS → HTTPS (event w/ UUID) → Fluentd: in_event_collector → buffer chunks (retention time: 1min; 1 event, 1 event, …) → out_event_collector → HTTPS (events set) → td-api. in_event_collector dedups against Redis (UUID#0, UUID#1, UUID#2, UUID#3, …, UUID#N)
  • 11. Problem #0 (2014-11) • Event Collector stores the UUID of each request from Mobile SDKs in Redis for de-duplication • Event Collector had to be scaled out, which meant Redis needed to be scaled out too…
  • 12. Sharded Redis • UUIDs of requests are hash-partitioned and stored in a sharded Redis cluster • We created a somewhat intelligent Redis client that can • fail over to a secondary Redis instance • double-write UUIDs to another Redis instance as well as the currently assigned one, so that re-partitioning can be done w/o duplicated data
  • 13. Sharded Redis [diagram] Redis list: Redis#0, Redis#1. Event Collector receives event#0 w/ UUID#0 (=>1000); 1000 % 2 = 0 => Redis#0. Stored: Redis#0 = {UUID#0}
  • 14. Sharded Redis [diagram] Redis list: Redis#0, Redis#1. Event Collector receives event#1 w/ UUID#1 (=>1001); 1001 % 2 = 1 => Redis#1. Stored: Redis#0 = {UUID#0}, Redis#1 = {UUID#1}
  • 15. Sharded Redis [diagram] Redis list: Redis#0, Redis#1. New Redis list: Redis#0, Redis#1, Redis#2. Event Collector receives event#4 w/ UUID#4 (=>1004); 1004 % 2 = 0 => Redis#0; 1004 % 3 = 2 => Redis#2 for replication. Stored: Redis#0 = {UUID#0, UUID#4}, Redis#1 = {UUID#1}, Redis#2 = {UUID#4}. The UUID is replicated to Redis#2 in advance; Redis#2 is not used for dedup for now
  • 16. Sharded Redis [diagram] Redis list: Redis#0, Redis#1, Redis#2. Event Collector receives event#11 w/ UUID#11 (=>1011); 1011 % 3 = 2 => Redis#2. Stored: Redis#0 = {UUID#0, UUID#4}, Redis#1 = {UUID#1, UUID#7}, Redis#2 = {UUID#4, UUID#7, UUID#11}. For UUID#4/7, this Redis stores them in advance and can dedup them
  • 18. System architecture (2014-11) [diagram] JS SDK / Mobile SDKs / Postback from SaaS → HTTPS (event w/ UUID) → Fluentd: in_event_collector → buffer chunks (retention time: 1min; 1 event, 1 event, …) → out_event_collector → HTTPS (events set) → td-api. in_event_collector dedups against Sharded Redis (UUID#0, UUID#1, UUID#2, UUID#3, …, UUID#N; Redis#0: UUID#0, UUID#4, UUID#8, UUID#12, …, UUID#N)
  • 19. Problem #1 (2015-01) • Usage of Event Collector kept increasing, and it sometimes went down… • “TCP: Possible SYN flooding on port xxxx. Sending cookies.” in kern.log • “$ netstat -s” reported “19855 times the listen queue of a socket overflowed” and “19855 SYNs to LISTEN sockets dropped”
  • 20. Listen socket backlog • The default listen socket backlog queue length was only 1024 • Such a short queue was vulnerable to traffic spikes • Increased it to 8192
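As a rough illustration of the backlog change, a server can request a longer accept queue when it calls `listen`. The sketch below uses Python's standard `socket` module, not the Ruby code Event Collector actually runs, and the function name is hypothetical. On Linux the kernel caps the requested value at `net.core.somaxconn`, so that sysctl may need raising as well.

```python
import socket


def open_listener(host="127.0.0.1", port=0, backlog=8192):
    """Open a TCP listen socket and ask for a large accept backlog."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    # The OS may silently cap this value (net.core.somaxconn on Linux).
    sock.listen(backlog)
    return sock


if __name__ == "__main__":
    listener = open_listener()
    print("listening on", listener.getsockname())
    listener.close()
```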
  • 21. System architecture (2015-01) [diagram] JS SDK / Mobile SDKs / Postback from SaaS → HTTPS (event w/ UUID) → Fluentd: in_event_collector → buffer chunks (retention time: 1min; 1 event, 1 event, …) → out_event_collector → HTTPS (events set) → td-api. in_event_collector dedups against Sharded Redis (UUID#0, UUID#1, UUID#2, UUID#3, …, UUID#N; Redis#0: UUID#0, UUID#4, UUID#8, UUID#12, …, UUID#N). Tuning: LISTEN socket backlog: 8192
  • 22. Problem #2 (2015-05) • Usage of Event Collector kept growing, and it still sometimes went down… • There were 2 options: optimizing the source code and running Event Collector in multiple processes
  • 23. Optimization of source code • First…, profile! profile! profile! • The 2 performance bottlenecks were: • sending metrics per request to another Fluentd ➡ Aggregated metrics in memory over 5-second ranges to reduce the number of messages sent to another Fluentd • parsing UserAgent ➡ Cached 100 UserAgentParser (ua-parser) instances with LRU eviction • Performance improved 50 times
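The two optimizations can be sketched in a few lines. This is a minimal Python illustration under stated assumptions, not the original Ruby code: `MetricsAggregator` and `LruCache` are hypothetical names, the `send` callback stands in for forwarding to another Fluentd, and the cache is keyed by the raw User-Agent string, which is one plausible reading of the slide.

```python
import time
from collections import Counter, OrderedDict


class MetricsAggregator:
    """Accumulate counters in memory and send one message per time window."""

    def __init__(self, send, window_sec=5.0, clock=time.monotonic):
        self._send = send
        self._window = window_sec
        self._clock = clock
        self._counts = Counter()
        self._window_start = clock()

    def incr(self, name, value=1):
        self._counts[name] += value
        if self._clock() - self._window_start >= self._window:
            self.flush()

    def flush(self):
        if self._counts:
            self._send(dict(self._counts))
        self._counts = Counter()
        self._window_start = self._clock()


class LruCache:
    """Keep at most `capacity` built values, evicting the least recently used."""

    def __init__(self, build, capacity=100):
        self._build = build
        self._capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key in self._items:
            self._items.move_to_end(key)      # mark as most recently used
            return self._items[key]
        value = self._build(key)              # expensive step, e.g. UA parsing
        self._items[key] = value
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)   # drop the least recently used
        return value
```

With this shape, per-request work shrinks to a counter increment and a dictionary lookup in the common case, which is the effect the slide describes.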
  • 24. Running in multiple processes • Only 1 Event Collector process was running, and Ruby can’t make the best of multiple cores with multiple threads • It was time to run Event Collector in multiple processes!
  • 25. fluent-plugin-multiprocess • With fluent-plugin-multiprocess, 8 sets of Event Collector input / output plugin workers can run in multiple processes • Performance improved 6 times with it • Drawback: • The number of output plugin workers also increased • As a result, the number of chunks uploaded to td-api increased significantly. td-api sometimes suffered from a lot of tiny uploaded chunk files… • In other words, td-api was a bottleneck for Event Collector’s scalability
  • 26. fluent-plugin-multiprocess [diagram] events → Nginx → Event Collector (3 pairs of input plugin worker and output plugin worker) → td-api
  • 27. System architecture (2015-05) [diagram] JS SDK / Mobile SDKs / Postback from SaaS → HTTPS (event w/ UUID) → Nginx → Fluentd (multiple processes): in_event_collector → buffer chunks (retention time: 1min; 1 event, 1 event, …) → out_event_collector → HTTPS (events set) → td-api. in_event_collector dedups against Sharded Redis (UUID#0, UUID#1, UUID#2, UUID#3, …, UUID#N; Redis#0: UUID#0, UUID#4, UUID#8, UUID#12, …, UUID#N). Tuning: LISTEN socket backlog: 8192; avoid instantiating Ruby objects; reduce metrics requests to Fluentd
  • 28. Problem #3 (2015-12) • td-api sometimes became unstable because the number of uploaded chunk files had increased… • We needed to reduce the number of chunk files uploaded to td-api from Event Collector
  • 29. detach_process • Found a Fluentd configuration item “detach_process” for input plugins • With it, multiple input_plugin workers can run in multiple processes while keeping the number of output_plugin workers at 1 • The number of chunk files uploaded to td-api would be reduced!
  • 30. detach_process [diagram] events → Nginx → Event Collector (input plugin worker (parent) plus 3 more input plugin workers, 1 output plugin worker) → td-api
  • 31. System architecture (2015-12) [diagram] JS SDK / Mobile SDKs / Postback from SaaS → HTTPS (event w/ UUID) → Nginx → Fluentd: in_event_collector (multiple detach_process processes) → buffer chunks (retention time: 1min; 1 event, 1 event, …) → out_event_collector → HTTPS (events set) → td-api. in_event_collector dedups against Sharded Redis (UUID#0, UUID#1, UUID#2, UUID#3, …, UUID#N; Redis#0: UUID#0, UUID#4, UUID#8, UUID#12, …, UUID#N). Tuning: LISTEN socket backlog: 8192; avoid instantiating Ruby objects; reduce metrics requests to Fluentd
  • 32. Problem #4 (2016-04) • There were some race-condition issues with “detach_process” • Also, “detach_process” became a deprecated feature! https://www.fluentd.org/blog/fluentd-v0.14.9-has-been-released
  • 34. System architecture (2016-04) [diagram] JS SDK / Mobile SDKs / Postback from SaaS → HTTPS (event w/ UUID) → Nginx → Fluentd (multiple processes): in_event_collector → buffer chunks (retention time: 1min; 1 event, 1 event, …) → out_event_collector → HTTPS (events set) → td-api. in_event_collector dedups against Sharded Redis (UUID#0, UUID#1, UUID#2, UUID#3, …, UUID#N; Redis#0: UUID#0, UUID#4, UUID#8, UUID#12, …, UUID#N). Tuning: LISTEN socket backlog: 8192; avoid instantiating Ruby objects; reduce metrics requests to Fluentd
  • 35. Problem #5 (2016-05) • The throughput of the Redis cluster sometimes became a bottleneck • When dedup against the Redis cluster for requests from Mobile SDKs got stuck, processing of requests from the JavaScript SDK / Postback from SaaS got stuck too…
  • 36. dedup in another thread • If de-duplication runs in a different thread from the input_plugin worker, the input_plugin worker can continue to process requests even when accesses to Redis get stuck • The existing output_plugin already runs in another thread, so deduping there might be an option • But output_plugin handles large chunk files. So if it retries around the end of a chunk file, all records in the chunk file are handled as duplicated records • We needed to mitigate the impact of this case • Let’s insert a new thin output plugin worker!
  • 37. dedup in another thread (Before) [sequence diagram] input_plugin worker receives event w/ UUID → GetAndSet: UUID on Redis Cluster → Emit event to Fluentd's Buffer → response: OK. output_plugin worker takes Chunked events → Upload to td-api. If dedup gets stuck, it affects processing of all requests…
  • 38. dedup in another thread (After) [sequence diagram] input_plugin worker receives event w/ UUID → Emit event to Fluentd's Buffer#0 (Small Chunked events, Retention time: 5s) → response: OK. thin output_plugin worker → GetAndSet: UUID on Redis Cluster → Emit event to Fluentd's Buffer#1 (Chunked events, Retention time: 1m). output_plugin worker → Upload to td-api. Even if dedup gets stuck, it doesn’t affect processing of all requests! Made input_plugin return response ASAP w/o dedup
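The "After" flow can be approximated with a queue and a worker thread. This is a simplified Python sketch under stated assumptions, not the Fluentd plugin itself: an in-memory `queue.Queue` stands in for Fluentd's small buffer (Buffer#0), and `DedupStage`, `seen_before`, and `forward` are hypothetical names for the thin output worker, the GetAndSet call against the dedup store, and the emit into the larger buffer.

```python
import queue
import threading


class DedupStage:
    """Accept events immediately; de-duplicate them on a separate thread."""

    def __init__(self, seen_before, forward):
        self._queue = queue.Queue()          # stand-in for the small buffer
        self._seen_before = seen_before      # returns True for a known UUID
        self._forward = forward              # hands new events downstream
        threading.Thread(target=self._run, daemon=True).start()

    def accept(self, event):
        # Called on the request path: no dedup here, so a slow dedup store
        # cannot delay the HTTP response.
        self._queue.put(event)
        return "OK"

    def _run(self):
        while True:
            event = self._queue.get()
            try:
                if not self._seen_before(event["uuid"]):
                    self._forward(event)
            finally:
                self._queue.task_done()

    def drain(self):
        """Block until every accepted event has been processed."""
        self._queue.join()
```

Because the thin stage handles small chunks, a retry in it re-checks only a few records, which limits how many records get treated as duplicates.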
  • 39. System architecture (2016-05) [diagram] JS SDK / Mobile SDKs / Postback from SaaS → HTTPS (event w/ UUID) → Nginx → Fluentd (multiple processes): in_event_collector → buffer chunks (retention time: 5s; small events set) → out_object_handover → buffer chunks (retention time: 1min; 1 event, 1 event, …) → out_event_collector → HTTPS (events set) → td-api. Dedup against Sharded Redis (UUID#0, UUID#1, UUID#2, UUID#3, …, UUID#N; Redis#0: UUID#0, UUID#4, UUID#8, UUID#12, …, UUID#N). Tuning: LISTEN socket backlog: 8192; avoid instantiating Ruby objects; reduce metrics requests to Fluentd
  • 40. Problem #6 (2016-06) • De-duplication with Redis tended to be delayed • Even if we upgraded the instance type of the Redis cluster, Redis runs on a single core and can’t benefit from multiple cores… • Actually, we used Redis as just a KVS. The complex data types of Redis weren’t needed in the end
  • 41. Memcached • Replaced the dedup cluster with Memcached • No problems occurred during the migration thanks to the double-write feature • Based on benchmark results using the actual access pattern of Event Collector, performance doubled and the memory consumption of the dedup cluster was reduced to 68% compared to Redis
  • 42. System architecture (2016-06) [diagram] JS SDK / Mobile SDKs / Postback from SaaS → HTTPS (event w/ UUID) → Nginx → Fluentd (multiple processes): in_event_collector → buffer chunks (retention time: 5s; small events set) → out_object_handover → buffer chunks (retention time: 1min; 1 event, 1 event, …) → out_event_collector → HTTPS (events set) → td-api. Dedup against Sharded Memcached (UUID#0, UUID#1, UUID#2, UUID#3, …, UUID#N; Memcached#0: UUID#0, UUID#4, UUID#8, UUID#12, …, UUID#N). Tuning: LISTEN socket backlog: 8192; avoid instantiating Ruby objects; reduce metrics requests to Fluentd
  • 43. Problem #7 (2016-12) • Needed to support a 36-hour TTL on the dedup cluster for one of our customers while using the default 1-hour TTL for other customers’ requests • It sounded easy since Memcached’s APIs support TTL • But the Memcached dedup cluster stopped reclaiming expired entries that had a 1-hour TTL, and memory consumption increased drastically…
  • 44. Reclamation mechanism in Memcached • Reading the Memcached source code revealed the cause • Memcached keeps removing expired entries only as long as it finds expired entries in a row. This works fine when all entries have similar TTLs • But if Memcached holds both 1-hour TTL and 36-hour TTL entries, it stops reclaiming when it hits a 36-hour TTL entry, even if many expired 1-hour TTL entries sit behind it
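A toy model makes the mixed-TTL problem easier to see. The Python sketch below is not Memcached's code; it only mimics the behavior described on the slide, with hypothetical function names and a plain list of `(key, expires_at)` pairs ordered from the end where the scan starts.

```python
HOUR = 3600


def reclaim_in_a_row(entries, now):
    """Reclaim expired entries, stopping at the first unexpired one."""
    reclaimed = []
    for key, expires_at in entries:
        if expires_at > now:
            break                     # an unexpired entry blocks the scan
        reclaimed.append(key)
    return reclaimed


def reclaim_full_crawl(entries, now):
    """Reclaim every expired entry regardless of position."""
    return [key for key, expires_at in entries if expires_at <= now]


if __name__ == "__main__":
    entries = [
        ("short-0", 1 * HOUR),
        ("long-0", 36 * HOUR),        # 36-hour TTL entry in the middle
        ("short-1", 1 * HOUR + 10),
        ("short-2", 1 * HOUR + 20),
    ]
    now = 2 * HOUR
    print(reclaim_in_a_row(entries, now))     # only the entry before long-0
    print(reclaim_full_crawl(entries, now))   # all three short-TTL entries
```

The second function corresponds to what a full crawl achieves: expired short-TTL entries are found even when a long-TTL entry sits in front of them.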

  • 46. “lru_crawler crawl all” • Found that Memcached provides an API “lru_crawler”: “- Takes a single, or a list of, numeric classids (ie: 1,3,10). This instructs the crawler to start at the tail of each of these classids and run to the head. The crawler cannot be stopped or restarted until it completes the previous request. The special keyword "all" instructs it to crawl all slabs with items in them.” https://github.com/memcached/memcached/blob/master/doc/protocol.txt • Let’s call “lru_crawler crawl all” repeatedly!
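Calling the crawler repeatedly could look roughly like this. The sketch sends the `lru_crawler crawl all` command from the quoted protocol text over a plain TCP connection; the function names, the 60-second interval, and the default port 11211 are assumptions for illustration, and the slides do not say how the command was actually scheduled. The crawler also has to be enabled on the Memcached side for the command to be accepted.

```python
import socket
import time


def request_full_crawl(sock):
    """Send 'lru_crawler crawl all' and return the one-line reply."""
    sock.sendall(b"lru_crawler crawl all\r\n")
    reply = b""
    while not reply.endswith(b"\r\n"):
        chunk = sock.recv(1024)
        if not chunk:
            break                      # connection closed before a full line
        reply += chunk
    return reply.decode("ascii").strip()


def crawl_periodically(host, port=11211, interval_sec=60.0, rounds=None):
    """Trigger a full crawl every interval_sec seconds (hypothetical schedule)."""
    done = 0
    while rounds is None or done < rounds:
        with socket.create_connection((host, port), timeout=5.0) as sock:
            print(request_full_crawl(sock))
        done += 1
        time.sleep(interval_sec)
```

A reply other than `OK` (for example when a previous crawl is still running) is simply printed here; a real deployment would probably log it and retry on the next round.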
  • 47. System architecture (2016-12) [diagram] JS SDK / Mobile SDKs / Postback from SaaS → HTTPS (event w/ UUID) → Nginx → Fluentd (multiple processes): in_event_collector → buffer chunks (retention time: 5s; small events set) → out_object_handover → buffer chunks (retention time: 3min; 1 event, 1 event, …) → out_event_collector → HTTPS (events set) → td-api. Dedup against Sharded Memcached with lru_crawler (UUID#0, UUID#1, UUID#2, UUID#3, …, UUID#N; Memcached#0: UUID#0, UUID#4, UUID#8, UUID#12, …, UUID#N). Tuning: LISTEN socket backlog: 8192; avoid instantiating Ruby objects; reduce metrics requests to Fluentd
  • 48. System architecture (2014-06) [diagram, shown again for comparison] JS SDK / Mobile SDKs / Postback from SaaS → HTTPS (event w/ UUID) → Fluentd: in_event_collector → buffer chunks (retention time: 1min; 1 event, 1 event, …) → out_event_collector → HTTPS (events set) → td-api. in_event_collector dedups against Redis (UUID#0, UUID#1, UUID#2, UUID#3, …, UUID#N)
  • 49. Usage change (requests/sec): 2015-05: 910; 2018-02: 19300 [chart from 2015-05 to 2018-02, annotated “Triggered an alert!”] - Increased more than 21 times! - Event Collector is very important as a part of CDP service
  • 50. But it still has problems… • Buffered data stored on local disk isn’t replicated • Uploads of many small chunk files to td-api • Aggregating those small ones with the in_forward plugin is an option, though... • Further performance improvement
  • 53. But it still has problems… • Buffered data stored on local disk isn’t replicated ➡ Bigdam will resolve it! • Uploads of many small chunk files to td-api • Aggregating those small ones with the in_forward plugin is an option, though... ➡ Bigdam will resolve it! • Further performance improvement ➡ Bigdam will resolve it!