In this webinar, Dustin Lyons, Engineering Manager at Credit Karma, discusses how not long ago his team faced a challenge common to many financial services architects and engineering leaders: how to move from offline, batch-mode Big Data processing to streaming Fast Data, and how to enable real-time decision making based on the information flowing in from over 60 million members.
Dustin reviews how his team migrated away from PHP and successfully implemented Akka Streams with Apache Kafka to ingest, process and route real-time events throughout their data ecosystem. At the end of this presentation, you’ll better understand:
* The design considerations for new Fast Data architectures, from streaming to microservices to real-time analysis.
* Lessons learned when progressing from batch to streaming with Akka, Spark, and Kafka.
* Why Akka's self-healing actor model and the resilience it provides matter most when delivering real-time customer experiences.
How Credit Karma Makes Real-Time Decisions For 60 Million Users With Akka Streams And Actors
1. Proprietary & Confidential
Using Akka Streams
For Real Time Decision Making
Dustin Lyons
Engineering Manager, Data Platform
2.
● Engineer turned Engineering Manager
at Credit Karma
● Data & Analytics on the Platform team
● Build things that make decisions on
where data should go
● Lover of science fiction, sushi, and
electronic music
Who I am
3.
Credit Karma is a free financial assistant, helping over
60 million people make progress.
4.
1. Data Infrastructure at Credit Karma: Past and current
2. Mo’ data, mo’ problems
3. Akka Streams saves the day
4. Results and learnings
5. Q&A
Agenda for today
27.
What is backpressure?
Backpressure refers to the buildup of data at an I/O switch
when buffers are full and not able to receive additional data.
No additional data packets are transferred until the
bottleneck of data has been eliminated or the buffer has been
emptied.
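The buffer-and-blocking behavior described above can be modeled in a few lines of plain Scala (this is an illustrative toy, not Akka code): a bounded `ArrayBlockingQueue` sits between a fast producer and a slow consumer, and `put()` blocking on a full buffer is the backpressure signal that slows the producer down.

```scala
import java.util.concurrent.ArrayBlockingQueue
import scala.collection.mutable.ArrayBuffer

object BackpressureDemo {
  // Push `n` items from a fast producer through a bounded buffer of
  // size `capacity` to a slower consumer; returns the consumed items.
  def run(n: Int, capacity: Int): Vector[Int] = {
    val buffer   = new ArrayBlockingQueue[Int](capacity)
    val consumed = ArrayBuffer.empty[Int]

    val consumer = new Thread(() => {
      for (_ <- 1 to n) {
        val item = buffer.take() // blocks while the buffer is empty
        Thread.sleep(1)          // simulate a slow downstream stage
        consumed.synchronized { consumed += item }
      }
    })
    consumer.start()

    // put() blocks whenever the buffer is full, so the producer can
    // never run more than `capacity` elements ahead of the consumer:
    // that blocking is the backpressure.
    for (i <- 1 to n) buffer.put(i)

    consumer.join()
    consumed.toVector
  }
}
```

Calling `BackpressureDemo.run(20, 4)` delivers all 20 items in order, with the producer throttled to at most 4 elements ahead of the consumer at any moment.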
28.
Analytics export service
[Diagram: HTTP Ingest Server, Coordinator, Data Transformer workers, Kafka Importer workers]
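The slide names the stages of the analytics export service (HTTP Ingest Server, Coordinator, Data Transformer workers, Kafka Importer workers) without showing their internals. As a rough plain-Scala sketch of how such stages compose, with invented stage bodies and names (the real service runs these as Akka Streams stages feeding Kafka):

```scala
object AnalyticsExportSketch {
  // Hypothetical event types for illustration only.
  final case class RawEvent(payload: String)
  final case class AnalyticsRecord(topic: String, value: String)

  // "Data Transformer" worker: turn a raw ingested event into a
  // routed record. The topic and normalization here are assumptions.
  def transform(e: RawEvent): AnalyticsRecord =
    AnalyticsRecord(topic = "analytics-events", value = e.payload.trim.toLowerCase)

  // "Kafka Importer" worker: batch records, standing in for a
  // producer that would write each batch to Kafka.
  def importBatches(records: Seq[AnalyticsRecord], batchSize: Int): Seq[Seq[AnalyticsRecord]] =
    records.grouped(batchSize).toSeq

  // "Coordinator": drive events through transform, then batch import.
  def runPipeline(events: Seq[RawEvent], batchSize: Int): Seq[Seq[AnalyticsRecord]] =
    importBatches(events.map(transform), batchSize)
}
```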
31.
Data warehouse import
[Diagram: Data Warehouse Import Service — Reader, Deduplicator, Processor, Extractors]
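A plain-Scala sketch of the stages the slide names for the warehouse import path (Reader, Deduplicator, Processor, Extractors). The dedup key and stage bodies are assumptions for illustration; in particular, a production deduplicator would bound its seen-set (e.g. an LRU cache or a time window) rather than grow it forever.

```scala
object WarehouseImportSketch {
  // Hypothetical event shape for illustration only.
  final case class Event(id: String, body: String)

  // "Deduplicator": drop events whose id was already seen,
  // keeping the first occurrence of each id.
  def deduplicate(events: Seq[Event]): Seq[Event] = {
    val seen = scala.collection.mutable.HashSet.empty[String]
    events.filter(e => seen.add(e.id)) // add returns false on duplicates
  }

  // "Processor"/"Extractor": pull the fields the warehouse needs.
  def extract(e: Event): Map[String, String] =
    Map("id" -> e.id, "body" -> e.body)

  def importRun(events: Seq[Event]): Seq[Map[String, String]] =
    deduplicate(events).map(extract)
}
```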
32.
Akka Streams: Backpressure in action
[Diagram: two actors — data flows downstream while demand flows upstream]
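The diagram's point is that in Akka Streams data only moves downstream in response to demand signaled upstream. A toy plain-Scala model of that pull protocol (not the Akka or Reactive Streams API): the downstream requests small batches and processes each one before asking for more, so a fast source can never flood a slow sink.

```scala
object DemandDemo {
  // Upstream: hands out at most `n` elements per demand request.
  final class Upstream(all: Vector[Int]) {
    private var pos = 0
    def request(n: Int): Seq[Int] = {
      val batch = all.slice(pos, pos + n)
      pos += batch.length
      batch
    }
  }

  // Downstream: repeatedly signals demand and drains the source,
  // processing each batch before requesting the next.
  def drain(up: Upstream, demandPerPull: Int): Vector[Int] = {
    val out = Vector.newBuilder[Int]
    var batch = up.request(demandPerPull)
    while (batch.nonEmpty) {
      out ++= batch                     // "process" the batch
      batch = up.request(demandPerPull) // signal more demand
    }
    out.result()
  }
}
```

Because elements flow only when requested, the upstream emits at the downstream's pace; that inversion of control is what the demand arrow in the diagram represents.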
41.
Analytics export service heap (before)
[Chart: heap usage in GiB over time, with a 28 GiB maximum. Red: heap space, Blue: used heap space, Purple: max heap space]
42.
Analytics export service heap (after)
[Chart: heap usage in GiB over time, with a 28 GiB maximum]
46.
• Akka Streams allowed us to move data with increased throughput and optimal
performance
• No longer getting paged for JVM out-of-memory errors or spending time tuning our
services
• Reduced the SLA for data delivery to our business stakeholders
Final results
47.
• Akka Actors: Great for low latency
• Akka Streams: Optimized for high throughput and solving backpressure
• Built on top of Akka Actors
• Don’t try to build high-throughput systems on a raw actor system; you’ll just end up
rebuilding Akka Streams
Lessons learned