11. Systems Track / Streams Track / Pipelines Track
Data Processing at LinkedIn with Apache Kafka
Portable Streaming Pipelines with Apache Beam
Capture the Streams of Database Changes
Kafka in the Enterprise: What if it Fails?
The Best Thing Since Partitioned Bread: Rethinking Stream Processing with Apache Kafka's New Streams API
Billions of Messages a Day - Yelp's Real-time Data Pipeline
Apache Kafka Core Internals: A Deep Dive
Microservices with Kafka: An Introduction to Kafka Streams with a Real-Life Example
California Schemin'! How the Schema Registry has Ancestry Basking in Data
Simplifying Omni-Channel Retail at Scale
Hanging Out with Your Past Self in VR: Time-Shifted Avatar Replication Using Kafka Streams
Achieving Predictability and Compliance with the Data Distribution Hub at Bank of New York Mellon
How to Lock Down Apache Kafka and Keep Your Streams Safe
Building Advanced Streaming Applications using the Latest from Apache Flink and Kafka
Every Message Counts: Kafka as Foundation for Highly Reliable Logging at Airbnb
Running Hundreds of Kafka Clusters with 5 People
The Data Dichotomy: Rethinking Data & Services with Streams
The Source of Truth: Why the New York Times Stores Every Piece of Content Ever Published in Kafka
Hardening Kafka for New Use Cases with Venice
Easy, Scalable, Fault-tolerant Stream Processing with Kafka and Spark's Structured Streaming
Single Message Transformations Are Not the Transformations You're Looking For
Introducing Exactly Once Semantics in Apache Kafka
Scalable Real-time Complex Event Processing at Uber
Cloud Native Data Streaming Microservices with Spring Cloud and Kafka
12. Systems Track Streams Track Pipelines Track
Ways to ingest service databases in real time
Microservices built on Kafka
An introduction to Kafka Streams
Beyond those: case studies on operating systems reliably,
and ways to combine Hadoop data with real-time events
13. I'll cover the details, and how our team is applying them, later on.
First, a quick introduction to Kafka for those who aren't familiar with it.
14. Apache Kafka
publish / subscribe
distributed system
durable storage
low latency
high throughput
real-time processing
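All of the properties above stem from Kafka's core abstraction: an append-only log, split into partitions, that consumers read by offset. A minimal in-memory sketch of that idea (plain Python, no Kafka client; the class and method names are illustrative, not Kafka's API):

```python
class PartitionedLog:
    """Toy model of a Kafka topic: an append-only log split into partitions.

    Producers append to a partition chosen by key; consumers poll by
    (partition, offset). Reads are non-destructive, so many independent
    subscribers can replay the same data at their own pace -- this is the
    publish/subscribe model on top of durable storage.
    """

    def __init__(self, num_partitions):
        self.partitions = [[] for _ in range(num_partitions)]

    def produce(self, key, value):
        # Same key -> same partition, so per-key ordering is preserved.
        p = hash(key) % len(self.partitions)
        self.partitions[p].append(value)
        return p, len(self.partitions[p]) - 1   # (partition, offset)

    def consume(self, partition, offset, max_records=100):
        # Each consumer tracks its own offset; the log is never mutated.
        return self.partitions[partition][offset:offset + max_records]

log = PartitionedLog(num_partitions=2)
p, first_offset = log.produce("user-42", "clicked")
log.produce("user-42", "purchased")

# Two subscribers read the same partition independently, from any offset.
assert log.consume(p, first_offset) == ["clicked", "purchased"]
```

Scaling throughput then becomes a matter of adding partitions and spreading them over brokers, which is where the "distributed system" and "high throughput" points come from.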
38. We had to track ingestion status and progress per IDC and per cluster,
trace which side a problem was on when something went wrong,
and know the recovery point after an outage and whether recovery actually worked.
We also needed a history of events such as when each topic was created,
and evidence to back us up whenever an issue was raised...
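Much of the monitoring described above boils down to one number per partition: consumer lag, the gap between the log-end offset and the consumer group's committed offset. A sketch of that calculation (the offsets below are made-up sample values; in practice they come from Kafka's AdminClient or the kafka-consumer-groups tool):

```python
def consumer_lag(log_end_offsets, committed_offsets):
    """Lag per partition: how far a consumer group is behind the log head.

    A lag that keeps growing on one partition points at where the problem
    is -- exactly the kind of tracing described above.
    """
    return {
        partition: log_end_offsets[partition] - committed_offsets.get(partition, 0)
        for partition in log_end_offsets
    }

# Hypothetical snapshot for one topic with three partitions.
log_end = {0: 1500, 1: 1480, 2: 9200}
committed = {0: 1500, 1: 1479, 2: 3100}

lag = consumer_lag(log_end, committed)
# Partition 2 is far behind: that is where to look first.
assert lag == {0: 0, 1: 1, 2: 6100}
```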
48. Kafka Streams
The Kafka project's own stream-processing library
Released in 2016
It has clear advantages over other streaming frameworks,
but it was so new that it hadn't been proven yet.
49. Kafka Streams
Proven to some extent by an impressive success story:
applying Kafka Streams to LINE Corp.'s internal pipeline
https://engineering.linecorp.com/en/blog/detail/80
Peak: 1,000,000 messages/sec
59. Kafka Streams
What are Kafka Streams' advantages?
It is a library.
No resource manager such as YARN is needed;
all you need are servers that can run the daemons.
The codebase is lightweight.
There is no need to implement HA, scalability, or load balancing yourself: Kafka handles all of it.
Productivity improves.
Debugging is easy, and you can focus purely on business logic.
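The "Kafka handles load balancing" point comes from consumer-group rebalancing: the topic's partitions are divided among however many application instances happen to be running, so scaling out is just starting another copy of the daemon. A simplified round-robin sketch of the idea (real Kafka uses pluggable partition assignors; this only illustrates the principle):

```python
def assign_partitions(partitions, instances):
    """Spread topic partitions round-robin across running app instances.

    Start one more instance and re-run the assignment: the work
    rebalances automatically, with no resource manager involved.
    """
    assignment = {instance: [] for instance in instances}
    for i, partition in enumerate(partitions):
        assignment[instances[i % len(instances)]].append(partition)
    return assignment

partitions = list(range(6))  # a topic with 6 partitions

# Two instances of the Streams app, running as plain daemons.
assert assign_partitions(partitions, ["app-1", "app-2"]) == {
    "app-1": [0, 2, 4], "app-2": [1, 3, 5],
}

# Scaling out means simply starting a third instance.
assert assign_partitions(partitions, ["app-1", "app-2", "app-3"]) == {
    "app-1": [0, 3], "app-2": [1, 4], "app-3": [2, 5],
}
```

If an instance dies, the same mechanism reassigns its partitions to the survivors, which is where the HA claim comes from.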
62. One repository with a million lines of code.
Monolithic app: the Yelp main service
2011
2013
2017
Moved to microservices with Kafka as the backbone.
More than 70 production services.
Saved $1M in R&D costs compared to 2013.
Code complexity went down, and so did downtime.