Lambda Architecture
with Apache Spark
About Me
https://ua.linkedin.com/in/tarasmatyashovsky
Apache Hadoop: A Brief History
http://www.slideshare.net/fadicce/hadoop-user-group-uae-meeting
Many customers have implemented
successful Hadoop-based MapReduce pipelines
that are still operating today
Examples from Real Life
• Oozie workflow that runs daily and processes up to
150 TB to generate analytics
• Bash-managed workflow that runs daily and processes
up to 8 TB to generate analytics
Examples from Real Life
http://www.thoughtworks.com/insights/blog/hadoop-or-not-hadoop
Lambda Architecture
A data-processing architecture
designed to handle massive quantities of data
by taking advantage of both
batch and stream processing methods
http://lambda-architecture.net/
https://www.manning.com/books/big-data
Layers of Lambda Architecture
Batch layer
• manages the master dataset (an immutable, append-only set of
raw data)
• pre-computes the batch views
Serving layer
• indexes the batch views so that they can be queried ad hoc with
low latency
Speed layer
• deals with recent data only
http://lambda-architecture.net/
https://speakerdeck.com/mhausenblas/lambda-architecture-with-apache-spark
Relevance of Data
http://www.slideshare.net/helenaedelson/lambda-architecture-with-spark-spark-streaming-kafka-cassandra-akka-and-scala
query = function(batch view, real time view)
real time view = function(real time view, new data)
batch view = function(all data)
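The three definitions above can be sketched in a few lines of plain Python. This is only an illustration under assumptions: the function names, the hashtag-count views, and the sample records are hypothetical and are not taken from the talk's sample application. The check at the end shows the property the architecture relies on: merging the batch view with an incrementally maintained real-time view gives the same answer as recomputing over all data.

```python
from collections import Counter


def batch_view(all_data):
    """batch view = function(all data): full recomputation over the master dataset."""
    return Counter(tag for record in all_data for tag in record)


def update_real_time_view(real_time_view, new_data):
    """real time view = function(real time view, new data): incremental update."""
    updated = Counter(real_time_view)  # copy, so the previous view is left untouched
    updated.update(new_data)
    return updated


def query(batch, real_time):
    """query = function(batch view, real time view): merge at read time."""
    return batch + real_time


# Hypothetical data: each record is the list of hashtags found in one tweet.
master_dataset = [["spark", "lambda"], ["spark"]]
recent_records = [["lambda", "jeeconf"], ["spark"]]

batch = batch_view(master_dataset)
real_time = Counter()
for record in recent_records:
    real_time = update_real_time_view(real_time, record)

result = query(batch, real_time)
```

Because counting is additive, `query` can simply sum the two views; a metric that is not additive would need a different merge step.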
Trade-offs
Full recomputation vs. partial recomputation
e.g. using Bloom filters
Additive algorithms vs. approximation algorithms
e.g. HyperLogLog for count-distinct problem
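To make the count-distinct trade-off concrete, here is a minimal HyperLogLog sketch in plain Python. It is an illustration under assumptions, not the implementation used by Spark or by the talk: the class name, the choice of SHA-1 as the hash, the precision `p=10`, and the synthetic user ids are all hypothetical. The point it shows is that two sketches (for example one per view) can be merged by taking the register-wise maximum, which gives exactly the sketch of the union, whereas two exact distinct counts cannot simply be added.

```python
import hashlib
import math


class HyperLogLog:
    """Minimal HyperLogLog with 2**p registers (illustrative only)."""

    def __init__(self, p=10):
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    def add(self, item):
        # Take the first 64 bits of a SHA-1 digest as the hash value.
        h = int.from_bytes(hashlib.sha1(item.encode("utf-8")).digest()[:8], "big")
        index = h >> (64 - self.p)               # top p bits select a register
        rest = h & ((1 << (64 - self.p)) - 1)    # remaining 64 - p bits
        # Rank = position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[index] = max(self.registers[index], rank)

    def merge(self, other):
        # The sketch of a union is the register-wise maximum of the two sketches.
        merged = HyperLogLog(self.p)
        merged.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]
        return merged

    def estimate(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)    # bias constant, valid for m >= 128
        raw = alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction (linear counting).
            return self.m * math.log(self.m / zeros)
        return raw


# Hypothetical data: 12,000 users in the batch view, 12,000 in the real-time
# view, 4,000 of them in both, so 20,000 distinct users overall.
batch_users = HyperLogLog()
for i in range(0, 12000):
    batch_users.add(f"user-{i}")

recent_users = HyperLogLog()
for i in range(8000, 20000):
    recent_users.add(f"user-{i}")

merged = batch_users.merge(recent_users)
estimate = merged.estimate()
```

With 1,024 registers the expected relative error is roughly 3%, so the estimate is approximate by design; the memory footprint stays fixed no matter how many users are added.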
Implementation of Lambda Architecture
https://speakerdeck.com/mhausenblas/lambda-architecture-with-apache-spark
Integrated solution for processing
on all lambda architecture layers
Apache Spark: a Brief History
Spark Streaming
Enables scalable, high-throughput, fault-tolerant
stream processing of live data streams
50% of users consider it the most important part of Spark
http://spark.apache.org/docs/latest/streaming-programming-guide.html
Streaming Architecture
http://spark.apache.org/docs/latest/streaming-programming-guide.html
https://databricks.com/blog/2015/02/09/learning-spark-book-available-from-oreilly.html
http://spark.apache.org/docs/latest/streaming-programming-guide.html#input-dstreams-and-receivers
http://spark.apache.org/docs/latest/streaming-programming-guide.html#discretized-streams-dstreams
DStream as a Continuous Series of RDDs
http://spark.apache.org/docs/latest/streaming-programming-guide.html#discretized-streams-dstreams
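The idea of a stream discretized into a series of small batches can be imitated in plain Python. This is a conceptual sketch only: it does not use Spark, and the function name, the `(timestamp, value)` event shape, and the sample events are assumptions made for illustration. Each inner list plays the role of one RDD covering one batch interval.

```python
from collections import defaultdict


def micro_batches(events, batch_interval):
    """Group (timestamp, value) events into consecutive fixed-width batches.

    Returns one list per batch interval, in time order; intervals that
    received no events are represented by empty lists.
    """
    batches = defaultdict(list)
    for timestamp, value in events:
        batches[int(timestamp // batch_interval)].append(value)
    if not batches:
        return []
    first, last = min(batches), max(batches)
    return [batches.get(i, []) for i in range(first, last + 1)]


# Hypothetical events with timestamps in seconds and a 1-second batch interval.
events = [(0.2, "a"), (0.9, "b"), (1.1, "c"), (3.5, "d")]
stream = micro_batches(events, batch_interval=1.0)
```

Here `stream` holds four batches, one of them empty, which mirrors how a batch interval with no incoming data still produces a (possibly empty) batch.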
Sample Application
Provide statistics on hashtags
used in #jeeconf tweets
All time till today + right now
https://github.com/tmatyashovsky/lambda-architecture-jeeconf-kyiv
Batch View
apache – 6
architecture – 12
aws – 3
java – 4
jeeconf – 7
lambda – 6
morningatlohika – 15
simpleworkflow – 14
spark – 5
https://github.com/tmatyashovsky/lambda-architecture-jeeconf-kyiv
Real-time View
“Cool presentation by @tmatyashovsky about
#lambda #architecture using #apache #spark
at #jeeconf”
apache – 1
architecture – 1
jeeconf – 1
lambda – 1
spark – 1
https://github.com/tmatyashovsky/lambda-architecture-jeeconf-kyiv
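The real-time view for the example tweet can be reproduced with a short Python sketch. The regular expression and the function name are assumptions for illustration; the sample application linked above may extract hashtags differently.

```python
import re
from collections import Counter

# Assumed hashtag pattern: '#' followed by one or more word characters.
HASHTAG = re.compile(r"#(\w+)")


def hashtag_counts(tweets):
    """Count lower-cased hashtags across a collection of tweet texts."""
    counts = Counter()
    for text in tweets:
        counts.update(tag.lower() for tag in HASHTAG.findall(text))
    return counts


tweet = ("Cool presentation by @tmatyashovsky about "
         "#lambda #architecture using #apache #spark at #jeeconf")
real_time_view = hashtag_counts([tweet])
```

The mention `@tmatyashovsky` is ignored because only tokens starting with `#` match, so each of the five hashtags ends up with a count of 1.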
Batch View + Real-time View
apache – 7
architecture – 13
aws – 3
java – 4
jeeconf – 8
lambda – 7
morningatlohika – 15
simpleworkflow – 14
spark – 6
https://github.com/tmatyashovsky/lambda-architecture-jeeconf-kyiv
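The merged numbers follow from adding the two views key by key. A minimal Python sketch using the counts from the slides (the variable and function names are assumptions, not names from the sample application):

```python
from collections import Counter

# Batch view: counts up to today, as listed on the "Batch View" slide.
batch_view = Counter({
    "apache": 6, "architecture": 12, "aws": 3, "java": 4, "jeeconf": 7,
    "lambda": 6, "morningatlohika": 15, "simpleworkflow": 14, "spark": 5,
})

# Real-time view: counts from the single example tweet.
real_time_view = Counter({
    "apache": 1, "architecture": 1, "jeeconf": 1, "lambda": 1, "spark": 1,
})


def query(batch, real_time):
    """Merge the two views on the fly by summing counts per hashtag."""
    return batch + real_time


merged_view = query(batch_view, real_time_view)
```

Hashtags that appear in only one view (such as `aws` or `java`) keep their original count, since a missing key counts as zero.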
Simplified Steps
• Create batch view (.parquet) via Apache Spark
• Cache batch view in Apache Spark
• Start streaming application connected to Twitter
• Focus on real-time #jeeconf tweets*
• Build incremental real-time views
• Query, i.e. merge batch and real-time views on the fly
* Stream from file system (used for testing) can be used as a backup
https://github.com/tmatyashovsky/lambda-architecture-jeeconf-kyiv
Demo Time
https://github.com/tmatyashovsky/lambda-architecture-jeeconf-kyiv
http://spark.apache.org/docs/latest/streaming-programming-guide.html#fault-tolerance-semantics
Structured Streaming in Spark 2.0
The simplest way to perform streaming analytics
is not having to reason about streaming
Static DataFrame API = Infinite DataFrame API
http://www.slideshare.net/rxin/the-future-of-realtime-in-spark
Structured Streaming
• Introduces streaming API built on top of Spark SQL
• Unifies streaming, interactive and batch queries
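# Pre-release syntax as shown in the talk; the released Spark 2.0 API
# uses readStream/writeStream with load()/start() instead of stream().
logs = (context.read.format("json")
        .stream("s3://logs"))
(logs.groupBy(logs.user_id)
     .agg(sum(logs.time))
     .write.format("jdbc")
     .stream("jdbc:mysql://..."))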
https://www.youtube.com/watch?v=oXkxXDG0gNk
Instead of Epilogue
http://milinda.pathirage.org/kappa-architecture.com/
Taras Matyashovsky
taras.matyashovsky@gmail.com
@tmatyashovsky
http://www.filevych.com/
Thank you!
References
http://www.thoughtworks.com/insights/blog/hadoop-or-not-hadoop
https://speakerdeck.com/mhausenblas/lambda-architecture-with-apache-spark
https://www.manning.com/books/big-data
Learning Spark, by Holden Karau, Andy Konwinski, Patrick Wendell and Matei Zaharia (early release ebook from O'Reilly Media)
http://spark.apache.org/docs/latest/streaming-programming-guide.html
http://www.slideshare.net/helenaedelson/lambda-architecture-with-spark-spark-streaming-kafka-cassandra-akka-and-scala
http://www.rittmanmead.com/2015/08/combining-spark-streaming-and-data-frames-for-near-real-time-log-analysis/
https://databricks.com/blog/2015/07/30/diving-into-spark-streamings-execution-model.html
https://docs.cloud.databricks.com/docs/spark/1.6/index.html#examples/Streaming%20mapWithState.html
http://spark.apache.org/docs/latest/cluster-overview.html
http://milinda.pathirage.org/kappa-architecture.com/
http://www.slideshare.net/databricks/2016-spark-summit-east-keynote-matei-zaharia
http://www.slideshare.net/rxin/the-future-of-realtime-in-spark
http://thenewstack.io/spark-2-0-will-offer-interactive-querying-live-data/
http://www.slideshare.net/spark-project/deep-divewithsparkstreaming-tathagatadassparkmeetup20130617
https://databricks.com/blog/2015/10/13/interactive-audience-analytics-with-spark-and-hyperloglog.html
https://www.youtube.com/watch?v=ZFBgY0PwUeY
https://www.youtube.com/watch?v=oXkxXDG0gN
https://databricks.com/blog/2015/01/15/improved-driver-fault-tolerance-and-zero-data-loss-in-spark-streaming.html
http://www.slideshare.net/Typesafe_Inc/four-things-to-know-about-reliable-spark-streaming-with-typesafe-and-databricks
http://spark.apache.org/docs/latest/configuration.html#spark-streaming

JEEConf 2016 - Lambda Architecture with Apache Spark

Editor's Notes

  1. Micro-batch architecture
     • series of batch computations on small chunks of data
     • batch interval is configurable
     • exactly-once semantics
  2. Receiver
     • task that collects data from the input source and represents it as RDDs
     • is launched automatically for each input source
     • replicates data to another executor for fault tolerance
  3. Streaming configuration
     • spark.streaming.backpressure.enabled
     • spark.streaming.receiver.maxRate (number of records per second)
     • spark.streaming.blockInterval (default 200ms)
  4. Spark 2.0
     • Project Tungsten 2.0
     • whole-stage code generation
     • optimized input / output -> Parquet + built-in cache
     • Spark Streaming DataFrame API unified with Dataset API