SlideShare une entreprise Scribd logo
1  sur  25
Télécharger pour lire hors ligne
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Tara E. Walker,
AWS Technical Evangelist
06.21.2016
Real-time Data Processing Using
AWS Lambda
Agenda
AWS Services for Data Processing
AWS Lambda
Amazon Kinesis
Architecture & Workflow for Streaming Data Processing
Streaming Data Processing Demo
Best Practices in Building Data Processing Solutions
AWS Services for Data
Processing
Amazon
Kinesis
AWS
Lambda
Amazon Lambda: Overview
Serverless compute service that runs code in response to events
without need to provision and manage servers
No Servers to
Manage
Automatically runs code
without Provisioning or
Managing servers. Just
write the code and upload
it to Lambda.
Continuous Scaling
Auto scales Application
precisely with the size of
the workload. Code runs
in Parallel & Processes
each trigger individually
Subsecond Metering
Starts Code within
milliseconds of an Event
Executes only when event is
triggered. Triggers from AWS
services or directly from web
or mobile applications
AWS Lambda: Serverless Compute in the Cloud
Compute service that runs your code in response to events
Easy to author, deploy, maintain, secure and manage
Allows for focus on business logic, not infrastructure
Stateless, event-driven code with native support for
Node.js, Java, and Python languages
Compute & Code without managing infrastructure
like EC2 instances and auto scaling groups
Makes it easy to Build back-end services that
perform at scale
High performance at any scale;
Cost-effective and efficient
No Infrastructure to
manage
Pay only for what you use: Lambda
automatically matches capacity to
your request rate. Purchase compute
in 100ms increments.
Bring Your
Own Code
“Productivity focused compute platform to build powerful, dynamic,
modular applications in the cloud”
Run code in a choice of
standard languages. Use
threads, processes, files, and
shell scripts normally.
Focus on business logic, not
infrastructure. You upload code;
AWS Lambda handles everything
else.
Benefits of AWS Lambda for building a server-
less data processing engine
1 2 3
Amazon Kinesis: Overview
A managed service for streaming data ingestion and processing
Amazon Kinesis
Streams
Build your own
custom applications
that process or
analyze streaming
data
Amazon Kinesis
Firehose
Easily load massive
volumes of streaming
data into Amazon S3
and Redshift
Amazon Kinesis
Analytics
Easily analyze data
streams using
standard SQL queries
Amazon Kinesis: Streaming data done the AWS way
Makes it easy to capture, deliver, and process real-time data streams
Pay as you go, no up-front costs
Elastically scalable
Right services for your specific use cases
Real-time latencies
Easy to provision, deploy, and manage
Benefits of Amazon Kinesis for stream data
ingestion and continuous processing
Real-time Ingest
Highly Scalable
Durable
Elastic
Replay-able Reads
Continuous Processing FX
Elastic
Load-balancing incoming streams
Fault-tolerance, Checkpoint / Replay
Enable multiple processing apps in
parallel
Enable data movement into Stores/ Processing Engines
Managed Service
Low end-to-end latency
Data Processing/Streaming
Architecture & Workflow
Smart
Devices
Click
Stream
Log
Data
AWS Lambda and Amazon Kinesis integration
How it Works
Stream-based model:
▪ Lambda polls the stream; When new records detected Lambda function
invoked.
▪ New records are passed by Kinesis as parameter.
▪ Kinesis mapped as Event source in Lambda
Synchronous invocation:
▪ Lambda invoked by RequestResponse invocation i.e. polling Kinesis
Stream.
▪ Lambda function is executed once.
Event structure:
▪ Event received by Lambda function is a collection of records from
Kinesis stream.
▪ Lambda Kinesis Event source: Batch size/Max records configured to be
received per invocation.
Streaming Architecture Workflow: Lambda + Kinesis
Data Input Kinesis Action Lambda Data Output
IT application activity
Capture the
stream
Audit
Process the
stream
SNS
Metering records Condense Redshift
Change logs Backup S3
Financial data Store RDS
Transaction orders Process SQS
Server health metrics Monitor EC2
User clickstream Analyze EMR
IoT device data Respond Backend endpoint
Custom data Custom action Custom application
Common Architecture: Lambda + Kinesis
Real Time Data Processing
Amazon
Kinesis
AWS
Lambda 1
Amazon
CloudWatch
Amazon
DynamoDB
AWS
Lambda 2 Amazon
S3
1. Real-time event data sent to Amazon
Kinesis, allows multiple AWS Lambda
functions to process the same events.
2. In AWS Lambda, Function 1
processes the incoming events and
stores event data in Amazon
DynamoDB
3. Lambda Function 1 also sends values
to Amazon CloudWatch for simple
monitoring of metrics.
4. In AWS Lambda function, Function 2
stores incoming data events in
Amazon S3
https://s3.amazonaws.com/awslambda-reference-
architectures/stream-processing/lambda-refarch-
streamprocessing.pdf
Common Architecture: Lambda + Kinesis
Data Processing for Data Storage/Analysis
Use Lambda to “fan out” to
other AWS services i.e.
Storage, Database, and
BI/analytics
Amazon Kinesis stream can
continuously capture and
store terabytes of data per
hour from hundreds of
thousands of sources
Grant AWS Lambda
permissions for the relevant
stream actions via IAM
(Execution Role) during
function creation
IAM
IAM
IAM
Demo: Real time processing of
Amazon Kinesis data streams with
AWS Lambda
Data Processing:
Best Practices & Tips
Best Practices: Kinesis
Creating a Kinesis stream
Streams
▪ Made up of Shards
▪ Each Shard ingests data up to 1MB/sec
▪ Each Shard emits data up to 2MB/sec
Data
▪ All data is stored for 24 hours, Replay data inside of 24hr window
▪ A Partition Key is supplied by producer and used to distribute the PUTs across
Shards
▪ A unique Sequence # is returned to the Producer upon a successful PUT call
Best Practices: Lambda
Create different Lambda functions for each task, associate to same Kinesis stream
Log to
CloudWatch
Logs
Push to SNS
Best Practices: Lambda
Attaching a Lambda function to a Kinesis Stream
▪ Shards: One Lambda function concurrently invoked per Kinesis shard
▪ Increasing shards will cause more Lambda functions invoked
concurrently
▪ Each individual shard follows ordered processing
… …
Source
Kinesis
Destination
1
Lambda
Destination
2
Pollers FunctionsShards
Lambda will scale automaticallyScale Kinesis by adding shards
Best Practices: Kinesis
Performance tuning Kinesis as an event source
Batch size:
▪ Number of records that AWS Lambda
will retrieve from Kinesis at the time of
invoking your function
▪ Increasing batch size will cause fewer
Lambda function invocations with more
data processed per function
Starting Position:
▪ The position in the stream where
Lambda starts reading
▪ Set to “Trim Horizon” for reading from
start of stream (all data)
▪ Set to “Latest” for reading most
recent data (LIFO) (latest data)
Best Practices: Lambda
Creating Lambda functions
Memory:
▪ CPU and disk proportional to the memory configured
▪ Increasing memory makes your code execute faster (if CPU bound)
▪ Increasing memory allows for larger record sizes processed
Timeout:
▪ Increasing timeout allows for longer functions, but more wait in case of
errors
Retries:
▪ For Kinesis, Lambda has unlimited retries (until data expires)
Permission model:
• Lambda pulls data from Kinesis, so no invocation role needed, only
execution role
Best Practices: Lambda
Monitoring and Debugging Lambda functions
•Monitoring: available in Amazon CloudWatch Metrics
• Invocation count
• Duration
• Error count
• Throttle count
•Debugging: available in Amazon CloudWatch Logs
• All Metrics
• Custom logs
• RAM consumed
• Search for log events
Get Started: Data Processing with AWS
Next Steps
1. Create your first Kinesis stream. Configure hundreds of thousands
of data producers to put data into an Amazon Kinesis stream. Ex.
data from Social media feeds.
2. Create and test your first Lambda function. Use any third party
library, even native ones. First 1M requests each month are on us!
3. Read the Developer Guide, AWS Lambda and Kinesis Tutorial, and
resources on GitHub at AWS Labs
• http://docs.aws.amazon.com/lambda/latest/dg/with-kinesis.html
• https://github.com/awslabs/lambda-streams-to-firehose lambda-
streams-to-firehose
Thank You!
Tara E. Walker
AWS Technical Evangelist
@taraw

Contenu connexe

Plus de Amazon Web Services

OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...Amazon Web Services
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsAmazon Web Services
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareAmazon Web Services
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSAmazon Web Services
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAmazon Web Services
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareAmazon Web Services
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWSAmazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckAmazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without serversAmazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...Amazon Web Services
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceAmazon Web Services
 
Come costruire un'architettura Serverless nel Cloud AWS
Come costruire un'architettura Serverless nel Cloud AWSCome costruire un'architettura Serverless nel Cloud AWS
Come costruire un'architettura Serverless nel Cloud AWSAmazon Web Services
 
AWS Serverless per startup: come innovare senza preoccuparsi dei server
AWS Serverless per startup: come innovare senza preoccuparsi dei serverAWS Serverless per startup: come innovare senza preoccuparsi dei server
AWS Serverless per startup: come innovare senza preoccuparsi dei serverAmazon Web Services
 
Crea dashboard interattive con Amazon QuickSight
Crea dashboard interattive con Amazon QuickSightCrea dashboard interattive con Amazon QuickSight
Crea dashboard interattive con Amazon QuickSightAmazon Web Services
 
Costruisci modelli di Machine Learning con Amazon SageMaker Autopilot
Costruisci modelli di Machine Learning con Amazon SageMaker AutopilotCostruisci modelli di Machine Learning con Amazon SageMaker Autopilot
Costruisci modelli di Machine Learning con Amazon SageMaker AutopilotAmazon Web Services
 
Migra le tue file shares in cloud con FSx for Windows
Migra le tue file shares in cloud con FSx for Windows Migra le tue file shares in cloud con FSx for Windows
Migra le tue file shares in cloud con FSx for Windows Amazon Web Services
 
La tua organizzazione è pronta per adottare una strategia di cloud ibrido?
La tua organizzazione è pronta per adottare una strategia di cloud ibrido?La tua organizzazione è pronta per adottare una strategia di cloud ibrido?
La tua organizzazione è pronta per adottare una strategia di cloud ibrido?Amazon Web Services
 
Protect your applications from DDoS/BOT & Advanced Attacks
Protect your applications from DDoS/BOT & Advanced AttacksProtect your applications from DDoS/BOT & Advanced Attacks
Protect your applications from DDoS/BOT & Advanced AttacksAmazon Web Services
 

Plus de Amazon Web Services (20)

OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 
Come costruire un'architettura Serverless nel Cloud AWS
Come costruire un'architettura Serverless nel Cloud AWSCome costruire un'architettura Serverless nel Cloud AWS
Come costruire un'architettura Serverless nel Cloud AWS
 
AWS Serverless per startup: come innovare senza preoccuparsi dei server
AWS Serverless per startup: come innovare senza preoccuparsi dei serverAWS Serverless per startup: come innovare senza preoccuparsi dei server
AWS Serverless per startup: come innovare senza preoccuparsi dei server
 
Crea dashboard interattive con Amazon QuickSight
Crea dashboard interattive con Amazon QuickSightCrea dashboard interattive con Amazon QuickSight
Crea dashboard interattive con Amazon QuickSight
 
Costruisci modelli di Machine Learning con Amazon SageMaker Autopilot
Costruisci modelli di Machine Learning con Amazon SageMaker AutopilotCostruisci modelli di Machine Learning con Amazon SageMaker Autopilot
Costruisci modelli di Machine Learning con Amazon SageMaker Autopilot
 
Migra le tue file shares in cloud con FSx for Windows
Migra le tue file shares in cloud con FSx for Windows Migra le tue file shares in cloud con FSx for Windows
Migra le tue file shares in cloud con FSx for Windows
 
La tua organizzazione è pronta per adottare una strategia di cloud ibrido?
La tua organizzazione è pronta per adottare una strategia di cloud ibrido?La tua organizzazione è pronta per adottare una strategia di cloud ibrido?
La tua organizzazione è pronta per adottare una strategia di cloud ibrido?
 
Protect your applications from DDoS/BOT & Advanced Attacks
Protect your applications from DDoS/BOT & Advanced AttacksProtect your applications from DDoS/BOT & Advanced Attacks
Protect your applications from DDoS/BOT & Advanced Attacks
 

Dernier

My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 

Dernier (20)

My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 

Real-time Data Processing Using AWS Lambda

  • 1. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Tara E. Walker, AWS Technical Evangelist 06.21.2016 Real-time Data Processing Using AWS Lambda
  • 2.
  • 3. Agenda AWS Services for Data Processing AWS Lambda Amazon Kinesis Architecture & Workflow for Streaming Data Processing Streaming Data Processing Demo Best Practices in Building Data Processing Solutions
  • 4. AWS Services for Data Processing Amazon Kinesis AWS Lambda
  • 5. Amazon Lambda: Overview Serverless compute service that runs code in response to events without need to provision and manage servers No Servers to Manage Automatically runs code without Provisioning or Managing servers. Just write the code and upload it to Lambda. Continuous Scaling Auto scales Application precisely with the size of the workload. Code runs in Parallel & Processes each trigger individually Subsecond Metering Starts Code within milliseconds of an Event Executes only when event is triggered. Triggers from AWS services or directly from web or mobile applications
  • 6. AWS Lambda: Serverless Compute in the Cloud Compute service that runs your code in response to events Easy to author, deploy, maintain, secure and manage Allows for focus on business logic, not infrastructure Stateless, event-driven code with native support for Node.js, Java, and Python languages Compute & Code without managing infrastructure like EC2 instances and auto scaling groups Makes it easy to Build back-end services that perform at scale
  • 7. High performance at any scale; Cost-effective and efficient No Infrastructure to manage Pay only for what you use: Lambda automatically matches capacity to your request rate. Purchase compute in 100ms increments. Bring Your Own Code “Productivity focused compute platform to build powerful, dynamic, modular applications in the cloud” Run code in a choice of standard languages. Use threads, processes, files, and shell scripts normally. Focus on business logic, not infrastructure. You upload code; AWS Lambda handles everything else. Benefits of AWS Lambda for building a server- less data processing engine 1 2 3
  • 8. Amazon Kinesis: Overview A managed service for streaming data ingestion and processing Amazon Kinesis Streams Build your own custom applications that process or analyze streaming data Amazon Kinesis Firehose Easily load massive volumes of streaming data into Amazon S3 and Redshift Amazon Kinesis Analytics Easily analyze data streams using standard SQL queries
  • 9. Amazon Kinesis: Streaming data done the AWS way Makes it easy to capture, deliver, and process real-time data streams Pay as you go, no up-front costs Elastically scalable Right services for your specific use cases Real-time latencies Easy to provision, deploy, and manage
  • 10. Benefits of Amazon Kinesis for stream data ingestion and continuous processing Real-time Ingest Highly Scalable Durable Elastic Replay-able Reads Continuous Processing FX Elastic Load-balancing incoming streams Fault-tolerance, Checkpoint / Replay Enable multiple processing apps in parallel Enable data movement into Stores/ Processing Engines Managed Service Low end-to-end latency
  • 11. Data Processing/Streaming Architecture & Workflow Smart Devices Click Stream Log Data
  • 12. AWS Lambda and Amazon Kinesis integration How it Works Stream-based model: ▪ Lambda polls the stream; When new records detected Lambda function invoked. ▪ New records are passed by Kinesis as parameter. ▪ Kinesis mapped as Event source in Lambda Synchronous invocation: ▪ Lambda invoked by RequestResponse invocation i.e. polling Kinesis Stream. ▪ Lambda function is executed once. Event structure: ▪ Event received by Lambda function is a collection of records from Kinesis stream. ▪ Lambda Kinesis Event source: Batch size/Max records configured to be received per invocation.
  • 13. Streaming Architecture Workflow: Lambda + Kinesis Data Input Kinesis Action Lambda Data Output IT application activity Capture the stream Audit Process the stream SNS Metering records Condense Redshift Change logs Backup S3 Financial data Store RDS Transaction orders Process SQS Server health metrics Monitor EC2 User clickstream Analyze EMR IoT device data Respond Backend endpoint Custom data Custom action Custom application
  • 14. Common Architecture: Lambda + Kinesis Real Time Data Processing Amazon Kinesis AWS Lambda 1 Amazon CloudWatch Amazon DynamoDB AWS Lambda 2 Amazon S3 1. Real-time event data sent to Amazon Kinesis, allows multiple AWS Lambda functions to process the same events. 2. In AWS Lambda, Function 1 processes the incoming events and stores event data in Amazon DynamoDB 3. Lambda Function 1 also sends values to Amazon CloudWatch for simple monitoring of metrics. 4. In AWS Lambda function, Function 2 stores incoming data events in Amazon S3 https://s3.amazonaws.com/awslambda-reference- architectures/stream-processing/lambda-refarch- streamprocessing.pdf
  • 15. Common Architecture: Lambda + Kinesis Data Processing for Data Storage/Analysis Use Lambda to “fan out” to other AWS services i.e. Storage, Database, and BI/analytics Amazon Kinesis stream can continuously capture and store terabytes of data per hour from hundreds of thousands of sources Grant AWS Lambda permissions for the relevant stream actions via IAM (Execution Role) during function creation IAM IAM IAM
  • 16. Demo: Real time processing of Amazon Kinesis data streams with AWS Lambda
  • 18. Best Practices: Kinesis Creating a Kinesis stream Streams ▪ Made up of Shards ▪ Each Shard ingests data up to 1MB/sec ▪ Each Shard emits data up to 2MB/sec Data ▪ All data is stored for 24 hours, Replay data inside of 24hr window ▪ A Partition Key is supplied by producer and used to distribute the PUTs across Shards ▪ A unique Sequence # is returned to the Producer upon a successful PUT call
  • 19. Best Practices: Lambda Create different Lambda functions for each task, associate to same Kinesis stream Log to CloudWatch Logs Push to SNS
  • 20. Best Practices: Lambda Attaching a Lambda function to a Kinesis Stream ▪ Shards: One Lambda function concurrently invoked per Kinesis shard ▪ Increasing shards will cause more Lambda functions invoked concurrently ▪ Each individual shard follows ordered processing … … Source Kinesis Destination 1 Lambda Destination 2 Pollers FunctionsShards Lambda will scale automaticallyScale Kinesis by adding shards
  • 21. Best Practices: Kinesis Performance tuning Kinesis as an event source Batch size: ▪ Number of records that AWS Lambda will retrieve from Kinesis at the time of invoking your function ▪ Increasing batch size will cause fewer Lambda function invocations with more data processed per function Starting Position: ▪ The position in the stream where Lambda starts reading ▪ Set to “Trim Horizon” for reading from start of stream (all data) ▪ Set to “Latest” for reading most recent data (LIFO) (latest data)
  • 22. Best Practices: Lambda Creating Lambda functions Memory: ▪ CPU and disk proportional to the memory configured ▪ Increasing memory makes your code execute faster (if CPU bound) ▪ Increasing memory allows for larger record sizes processed Timeout: ▪ Increasing timeout allows for longer functions, but more wait in case of errors Retries: ▪ For Kinesis, Lambda has unlimited retries (until data expires) Permission model: • Lambda pulls data from Kinesis, so no invocation role needed, only execution role
  • 23. Best Practices: Lambda Monitoring and Debugging Lambda functions •Monitoring: available in Amazon CloudWatch Metrics • Invocation count • Duration • Error count • Throttle count •Debugging: available in Amazon CloudWatch Logs • All Metrics • Custom logs • RAM consumed • Search for log events
  • 24. Get Started: Data Processing with AWS Next Steps 1. Create your first Kinesis stream. Configure hundreds of thousands of data producers to put data into an Amazon Kinesis stream. Ex. data from Social media feeds. 2. Create and test your first Lambda function. Use any third party library, even native ones. First 1M requests each month are on us! 3. Read the Developer Guide, AWS Lambda and Kinesis Tutorial, and resources on GitHub at AWS Labs • http://docs.aws.amazon.com/lambda/latest/dg/with-kinesis.html • https://github.com/awslabs/lambda-streams-to-firehose lambda- streams-to-firehose
  • 25. Thank You! Tara E. Walker AWS Technical Evangelist @taraw