Lynn Langit
New AWS Services
For bioinformatics pipelines
Feb 2017
New AWS Services
• Useful for scaling bioinformatics pipelines
• Announced at re:Invent (Nov 2016)
• Athena
• Step Functions
• Batch
• Glue
• QuickSight
Starting Point for CSIRO
Serverless AWS Lambda Application
Public Genomic Datasets
About AWS Athena
Serverless SQL queries on S3 data
AWS Athena Information
• Add table (structure) to database via DDL from input file(s)
• Write and execute SQL query
• Optionally save query
• Optionally review query history
• View results
• Optionally download result set to .csv
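The first two steps above can be sketched as two SQL strings built in Python: a DDL statement that lays a table structure over files in S3, and a query against that table. This is a minimal sketch, not the query shown in the demo. The database, table, column names and bucket path are hypothetical, the tab-delimited layout is an assumption about the input files, and nothing here calls AWS.

```python
# Hypothetical names throughout: database, table, columns and S3 path are illustrative.
DATABASE = "genomics_demo"
TABLE = "variants"
S3_LOCATION = "s3://example-bucket/variants/"  # assumed folder of tab-delimited files


def build_ddl(database: str, table: str, location: str) -> str:
    """Return a Hive-style DDL statement that defines a table over S3 files."""
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {database}.{table} (\n"
        "  chromosome STRING,\n"
        "  position BIGINT,\n"
        "  reference STRING,\n"
        "  alternate STRING\n"
        ")\n"
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY '\\t'\n"
        f"LOCATION '{location}'"
    )


def build_query(database: str, table: str) -> str:
    """Return a SQL query that counts variants per chromosome."""
    return (
        "SELECT chromosome, COUNT(*) AS variant_count\n"
        f"FROM {database}.{table}\n"
        "GROUP BY chromosome\n"
        "ORDER BY variant_count DESC"
    )


if __name__ == "__main__":
    print(build_ddl(DATABASE, TABLE, S3_LOCATION))
    print()
    print(build_query(DATABASE, TABLE))
```

Both strings could be pasted into the Athena query editor once the names are replaced with real ones.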
Athena - Demo
Athena Genomics Query Example
About AWS Step Functions
Serverless visual workflows for Lambdas
AWS Step Functions
1. Define steps and services (activities or lambdas)
2. Verify step execution(s)
3. Monitor and scale
“Your application as a state machine.”
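To make the state machine idea concrete, here is a small sketch of a two-step workflow written as a Python dictionary in the shape of an Amazon States Language definition. The state names, Lambda ARNs and account ID are placeholders, and the helper that walks the states is illustrative only; it handles simple linear chains and is not part of any AWS SDK.

```python
import json

# Hypothetical Lambda ARNs: account ID, region and function names are placeholders.
ALIGN_ARN = "arn:aws:lambda:us-east-1:123456789012:function:align-reads"
CALL_ARN = "arn:aws:lambda:us-east-1:123456789012:function:call-variants"

state_machine = {
    "Comment": "Illustrative two-step genomics workflow",
    "StartAt": "AlignReads",
    "States": {
        "AlignReads": {"Type": "Task", "Resource": ALIGN_ARN, "Next": "CallVariants"},
        "CallVariants": {"Type": "Task", "Resource": CALL_ARN, "End": True},
    },
}


def linear_path(definition: dict) -> list:
    """Follow Next pointers from StartAt until a state marked End is reached."""
    path = []
    name = definition["StartAt"]
    while True:
        path.append(name)
        state = definition["States"][name]
        if state.get("End"):
            return path
        name = state["Next"]


if __name__ == "__main__":
    # The JSON text is what would be supplied as the state machine definition.
    print(json.dumps(state_machine, indent=2))
    print(linear_path(state_machine))
```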
AWS Step Functions – 1. Define Steps/Services
AWS Step Functions – 2. Verify step execution
Step Functions - Demo
About AWS Batch
Fully managed batch processing at scale
What is batch computing?
Run jobs asynchronously and automatically across one or
more computers.
Jobs may have dependencies, making the sequencing and
scheduling of multiple jobs complex and challenging.
What is AWS Batch?
Fully Managed
No software to install or
servers to manage.
Integrated with AWS
Batch jobs can easily and
securely interact with
services such as Amazon S3,
DynamoDB, and Rekognition
Cost-optimized Resource Provisioning
Automatically provisions compute
resources tailored to the job's
needs using EC2 and EC2 Spot
AWS Batch Concepts
1. Jobs
   1. Job Definitions
   2. Job Queues
   3. Job States
2. Compute Environments
3. Scheduler
Short Video -- here
Jobs
Jobs are the unit of work executed by AWS Batch as containerized
applications running on Amazon EC2.
Containerized jobs can reference a container image, command, and
parameters or users can simply provide a .zip containing their
application and we will run it on a default Amazon Linux container.
$ aws batch submit-job --job-name variant-calling \
    --job-definition gatk --job-queue genomics
Massively parallel jobs
• Now - users can submit a large number of independent “simple jobs.”
• Soon – AWS will add support for “array jobs” that run many copies of an
application against an array of elements.
Array jobs are an efficient way to run:
• Parametric sweeps
• Monte Carlo simulations
• Processing a large collection of objects
NOTE: These use cases are possible today, simply submit more jobs.
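As the note says, a parametric sweep can be expressed today as many independent job submissions. The sketch below builds one job request per parameter combination. The chromosome list, the `min_quality` parameter and the job-name pattern are hypothetical; the queue and job definition names reuse the `genomics` and `gatk` examples from the slides. The code only builds the requests and does not submit anything.

```python
from itertools import product

CHROMOSOMES = ["chr1", "chr2", "chr3"]  # illustrative subset
MIN_QUALITIES = [20, 30]                # hypothetical sweep parameter


def build_sweep(job_queue: str, job_definition: str) -> list:
    """Return one job request per (chromosome, min_quality) combination."""
    requests = []
    for chrom, quality in product(CHROMOSOMES, MIN_QUALITIES):
        requests.append({
            "jobName": f"variant-calling-{chrom}-q{quality}",
            "jobQueue": job_queue,
            "jobDefinition": job_definition,
            # Parameter keys are hypothetical; values are passed as strings.
            "parameters": {"chromosome": chrom, "min_quality": str(quality)},
        })
    return requests


if __name__ == "__main__":
    for request in build_sweep("genomics", "gatk"):
        print(request["jobName"])
```

Each dictionary could then be handed to whatever submission call you use, one call per job.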
Example Genomics
Workflow
Workflows, Pipelines, and Job Dependencies
Jobs can express a dependency on the successful
completion of other jobs or specific elements of an
array job.
Use your preferred workflow engine and language to
submit jobs. Flow-based systems simply submit jobs
serially, while DAG-based systems submit many jobs
at once, identifying inter-job dependencies.
$ aws batch submit-job --depends-on jobId=606b3ad1-aa31-48d8-92ec-f154bfc8215f ...
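A dependent job can only name its upstream jobs once they have been submitted and have IDs, so a DAG-based engine needs a submission order in which every job comes after the jobs it depends on. The following sketch computes such an order with Python's standard-library `graphlib` (Python 3.9+). The pipeline steps and their dependencies are hypothetical, and the ordering step is an illustration of the idea, not how any particular workflow engine is implemented.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical pipeline: each job maps to the set of jobs it depends on.
DEPENDENCIES = {
    "align": set(),
    "sort": {"align"},
    "call-variants": {"sort"},
    "annotate": {"call-variants"},
    "qc-report": {"align"},
}


def submission_order(dependencies: dict) -> list:
    """Return job names ordered so every job follows its dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    for job in submission_order(DEPENDENCIES):
        upstream = sorted(DEPENDENCIES[job])
        print(f"submit {job} (depends on: {upstream or 'nothing'})")
```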
Job Definitions
Batch Job Definitions specify how jobs are to be run. While each job
must reference a job definition, many parameters can be overridden.
Some of the attributes specified in a job definition:
• IAM role associated with the job
• vCPU and memory requirements
• Mount points
• Container properties
• Environment variables
$ aws batch register-job-definition --job-definition-name gatk \
    --container-properties ...
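The attributes listed above can be pictured as one request document. The sketch below assembles a job definition as a plain Python dictionary. The field names follow the RegisterJobDefinition request shape as I understand it and should be checked against the current API reference; the image name, role ARN, command, sizes and paths are all hypothetical. Nothing is sent to AWS.

```python
def gatk_job_definition(image: str, role_arn: str) -> dict:
    """Build an illustrative job definition covering the attributes listed above."""
    return {
        "jobDefinitionName": "gatk",
        "type": "container",
        "containerProperties": {
            "image": image,                      # container image to run
            "vcpus": 4,                          # vCPU requirement (hypothetical size)
            "memory": 16384,                     # memory requirement in MiB (hypothetical)
            "jobRoleArn": role_arn,              # IAM role associated with the job
            "command": ["run-variant-calling.sh"],  # hypothetical entry script
            "environment": [
                {"name": "REFERENCE", "value": "s3://example-bucket/ref/"},
            ],
            # A volume on the host, mounted into the container below.
            "volumes": [{"name": "scratch", "host": {"sourcePath": "/scratch"}}],
            "mountPoints": [
                {"sourceVolume": "scratch", "containerPath": "/scratch", "readOnly": False},
            ],
        },
    }


if __name__ == "__main__":
    definition = gatk_job_definition(
        image="example/gatk:latest",
        role_arn="arn:aws:iam::123456789012:role/example-batch-job-role",
    )
    for key, value in definition["containerProperties"].items():
        print(f"{key}: {value}")
```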
Job Queues
Jobs are submitted to a Job Queue, where they reside until they are
able to be scheduled to a compute resource. Information related to
completed jobs persists in the queue for 24 hours.
$ aws batch create-job-queue --job-queue-name genomics \
    --priority 500 --compute-environment-order ...
Compute Environments
Mapped from job queues to run containerized batch jobs.
• Managed CEs - you describe your requirements (instance types,
min/max/desired vCPUs, and EC2 Spot bid as a % of On-Demand),
AWS launches & scales resources for you. Pick specific instance types,
instance families or simply choose “optimal”
• Unmanaged CEs - you can launch and manage your own resources. Your
instances need to include the ECS agent and run supported versions of Linux
and Docker. AWS Batch will then create an Amazon ECS cluster which can
accept the instances you launch. Jobs can be scheduled to your Compute
Environment as soon as your instances are healthy and register with the
ECS Agent.
$ aws batch create-compute-environment \
    --compute-environment-name unmanagedce --type UNMANAGED ...
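For the managed case, the requirements you describe (instance types, min/max/desired vCPUs and a Spot bid as a percentage of On-Demand) can be sketched as a request document. The field names below follow the CreateComputeEnvironment request shape as I understand it; the environment name and the numbers are hypothetical. A real request also needs networking and IAM settings, which are omitted here, and the validation rules are an assumption added for illustration.

```python
def managed_spot_environment(name: str, min_vcpus: int, max_vcpus: int,
                             desired_vcpus: int, bid_percentage: int) -> dict:
    """Describe a managed, Spot-backed compute environment (illustrative subset)."""
    if not 0 <= min_vcpus <= desired_vcpus <= max_vcpus:
        raise ValueError("expected 0 <= min_vcpus <= desired_vcpus <= max_vcpus")
    if not 1 <= bid_percentage <= 100:
        raise ValueError("bid_percentage is a percentage of the On-Demand price")
    return {
        "computeEnvironmentName": name,
        "type": "MANAGED",
        "computeResources": {
            "type": "SPOT",
            "minvCpus": min_vcpus,
            "maxvCpus": max_vcpus,
            "desiredvCpus": desired_vcpus,
            "instanceTypes": ["optimal"],   # or specific instance types or families
            "bidPercentage": bid_percentage,
            # Subnets, security groups and IAM roles are required in practice
            # but left out of this sketch.
        },
    }


if __name__ == "__main__":
    print(managed_spot_environment("managedce", 0, 256, 16, 40))
```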
AWS Batch Scheduler
The Scheduler evaluates when, where, and
how to run jobs that have been submitted to
a job queue.
Jobs run in approximately the order in which
they are submitted as long as all
dependencies on other jobs have been met.
Queued Job States
• SUBMITTED: Accepted into the queue, but not yet evaluated for execution
• PENDING: Your job has dependencies on other jobs which have not yet
completed
• RUNNABLE: Your job has been evaluated by the scheduler and is ready to run
• STARTING: Your job is in the process of being scheduled to a compute
resource
• RUNNING: Your job is currently running
• SUCCEEDED: Your job has finished with exit code 0
• FAILED: Your job finished with a non-zero exit code or was cancelled or
terminated.
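The states above can be modelled as a small enumeration. In the sketch below the exit-code rule comes straight from the SUCCEEDED and FAILED descriptions; the table of allowed forward transitions is my reading of the list, not an official specification, so treat it as an assumption.

```python
from enum import Enum


class JobState(Enum):
    SUBMITTED = "SUBMITTED"
    PENDING = "PENDING"
    RUNNABLE = "RUNNABLE"
    STARTING = "STARTING"
    RUNNING = "RUNNING"
    SUCCEEDED = "SUCCEEDED"
    FAILED = "FAILED"


# Assumed forward transitions, inferred from the state descriptions.
TRANSITIONS = {
    JobState.SUBMITTED: {JobState.PENDING, JobState.RUNNABLE, JobState.FAILED},
    JobState.PENDING: {JobState.RUNNABLE, JobState.FAILED},
    JobState.RUNNABLE: {JobState.STARTING, JobState.FAILED},
    JobState.STARTING: {JobState.RUNNING, JobState.FAILED},
    JobState.RUNNING: {JobState.SUCCEEDED, JobState.FAILED},
    JobState.SUCCEEDED: set(),
    JobState.FAILED: set(),
}


def can_transition(current: JobState, target: JobState) -> bool:
    """Return True if the assumed transition table allows current -> target."""
    return target in TRANSITIONS[current]


def final_state(exit_code: int) -> JobState:
    """Exit code 0 means SUCCEEDED; any non-zero exit code means FAILED."""
    return JobState.SUCCEEDED if exit_code == 0 else JobState.FAILED


if __name__ == "__main__":
    print(final_state(0).value, final_state(137).value)
    print(can_transition(JobState.PENDING, JobState.RUNNABLE))
```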
AWS Batch Actions
• CancelJob: Marks jobs that are not yet STARTING as
FAILED.
• TerminateJob: Cancels jobs that are currently waiting in the
queue. Stops jobs that are in a STARTING or RUNNING state
and transitions them to FAILED.
NOTE: Requires a “reason” which is viewable via DescribeJobs
$ aws batch cancel-job --reason "Submitted to wrong queue" \
    --job-id 8a767ac8-e28a-4c97-875b-e5c0bcf49eb8
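The difference between the two actions is easiest to see as a function of the job's current state. The sketch below is a local model of the rules stated above, not a call to the service: the function names are hypothetical, and leaving finished jobs untouched is an assumption.

```python
WAITING = {"SUBMITTED", "PENDING", "RUNNABLE"}   # queued, not yet STARTING
ACTIVE = {"STARTING", "RUNNING"}


def _require_reason(reason: str) -> None:
    """Both actions require a reason, which is later visible via DescribeJobs."""
    if not reason or not reason.strip():
        raise ValueError("a reason is required")


def cancel_job(state: str, reason: str) -> str:
    """CancelJob only affects jobs that have not yet reached STARTING."""
    _require_reason(reason)
    return "FAILED" if state in WAITING else state


def terminate_job(state: str, reason: str) -> str:
    """TerminateJob cancels waiting jobs and stops STARTING or RUNNING jobs."""
    _require_reason(reason)
    return "FAILED" if state in WAITING | ACTIVE else state


if __name__ == "__main__":
    for state in ["RUNNABLE", "RUNNING", "SUCCEEDED"]:
        print(state, "->",
              "cancel:", cancel_job(state, "Submitted to wrong queue"),
              "terminate:", terminate_job(state, "Submitted to wrong queue"))
```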
AWS Batch Data Types
• ComputeEnvironmentDetail
• ComputeEnvironmentOrder
• ComputeResource
• ContainerProperties
• ContainerPropertiesResource
• CounterProperties
• Host
• Job
• JobDefinition
• JobQueueDetail
• MountPoint
• Parameter
• Ulimit
• Volume
Batch - Demo
AWS Batch Pricing and Functionality
There is no charge for AWS Batch; you only pay for the
underlying resources that you consume!
NOTE: Support for Array Jobs, retries, and jobs executed as AWS Lambda
functions coming soon!
Use the Right Tool for the Job
Not all batch workloads are the same…
• ETL and Big Data processing/analytics?
• Consider EMR, Data Pipeline, Redshift, and related services.
• Lots of small Cron jobs? AWS Batch is a great way to execute these jobs, but
you will likely want a workflow or job-scheduling system to orchestrate job
submissions.
• Efficiently run lots of big and small compute jobs on heterogeneous
compute resources? Use AWS Batch
Example: DNA Sequencing
Example: Genomics on Unmanaged Compute Environments
AWS Batch summarized
• Fully Managed
• Integrated with AWS
• Cost-optimized Resource Provisioning
About AWS Glue
Serverless managed, scalable ETL
AWS Glue
1. Build a data catalog
   1. Discover and use your datasets via a Hive-compatible metastore
   2. Store versions, connection and credential info
   3. Use crawlers to auto-generate schema from S3 data & partitions
2. Generate and edit transforms using PySpark
3. Schedule and run your jobs
   1. On schedule, event or Lambda
NOTE: Glue is announced, but no beta as of yet…video from re:Invent -- here
An aside…
EC2 Elastic GPUs
About AWS QuickSight
Quick and easy data dashboards
Resources for new AWS Services
• Athena (SQL query on S3) – here
• Batch (Optimized, chained EC2 batches) – here
• Glue (Scaled ETL) -- here
• Step Functions (Lambda workflows) – here
• QuickSight (Data Dashboards) – here
• Full list of AWS services announced at re:Invent 2016 -- here

Teaching Kids Programming for Developers
 
Cloud-centric Internet of Things
Cloud-centric Internet of ThingsCloud-centric Internet of Things
Cloud-centric Internet of Things
 
Building AWS Redshift Data Warehouse with Matillion and Tableau
Building AWS Redshift Data Warehouse with Matillion and TableauBuilding AWS Redshift Data Warehouse with Matillion and Tableau
Building AWS Redshift Data Warehouse with Matillion and Tableau
 
TKPJava Eclipse and Codenvy IDE Keyboard Shortcuts
TKPJava Eclipse and Codenvy IDE Keyboard ShortcutsTKPJava Eclipse and Codenvy IDE Keyboard Shortcuts
TKPJava Eclipse and Codenvy IDE Keyboard Shortcuts
 
Understanding Codenvy - for Containerized Developer Workspaces
Understanding Codenvy - for Containerized Developer WorkspacesUnderstanding Codenvy - for Containerized Developer Workspaces
Understanding Codenvy - for Containerized Developer Workspaces
 

Dernier

Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfSeasiaInfotech2
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 

Dernier (20)

Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdf
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 

New AWS Services for Bioinformatics

  • 1. Lynn Langit New AWS Services For bioinformatics pipelines Feb 2017
  • 2. New AWS Services • Useful for scaling bioinformatics pipelines • Announced at re:Invent (Nov 2016) • Athena • Step Functions • Batch • Glue • QuickSight
  • 4. Serverless AWS Lambda Application
  • 6. About AWS Athena Serverless SQL queries on S3 data
  • 8. AWS Athena Information • Add table (structure) to database via DDL from input file(s) • Write and execute SQL query • Optionally save query • Optionally review query history • View results • Optionally download result set to .csv
  • 11. About AWS Step Functions Serverless visual workflows for Lambdas
  • 12. AWS Step Functions 1. Define steps and services (activities or lambdas) 2. Verify step execution(s) 3. Monitor and scale “Your application as a state machine.”
  • 13. AWS Step Functions – 1. Define Steps/Services
  • 14. AWS Step Functions – 2. Verify step execution
  • 17. About AWS Batch Fully managed batch processing at scale
  • 18. What is batch computing? Run jobs asynchronously and automatically across one or more computers. Jobs may have dependencies, making the sequencing and scheduling of multiple jobs complex and challenging.
  • 19. What is AWS Batch? Fully Managed No software to install or servers to manage. Integrated with AWS Batch jobs can easily and securely interact with services such as Amazon S3, DynamoDB, and Rekognition Cost-optimized Provisioning Auto provisions compute resources tailored to the job needs using EC2 & EC2 Spot
  • 20. AWS Batch Concepts 1. Jobs 1. Job Definitions 2. Job Queues 3. Job States 2. Compute Environments 3. Scheduler Short Video -- here
  • 21. Jobs Jobs are the unit of work executed by AWS Batch as containerized applications running on Amazon EC2. Containerized jobs can reference a container image, command, and parameters or users can simply provide a .zip containing their application and we will run it on a default Amazon Linux container.
      $ aws batch submit-job --job-name variant-calling --job-definition gatk --job-queue genomics
  • 22. Massively parallel jobs • Now - users can submit a large number of independent “simple jobs.” • Soon – AWS will add support for “array jobs” that run many copies of an application against an array of elements. Array jobs are an efficient way to run: • Parametric sweeps • Monte Carlo simulations • Processing a large collection of objects NOTE: These use cases are possible today, simply submit more jobs.
  • 24. Workflows, Pipelines, and Job Dependencies Jobs can express a dependency on the successful completion of other jobs or specific elements of an array job. Use your preferred workflow engine and language to submit jobs. Flow-based systems simply submit jobs serially, while DAG-based systems submit many jobs at once, identifying inter-job dependencies.
      $ aws batch submit-job --depends-on jobId=606b3ad1-aa31-48d8-92ec-f154bfc8215f ...
  • 25. Job Definitions Batch Job Definitions specify how jobs are to be run. While each job must reference a job definition, many parameters can be overridden. Some of the attributes specified in a job definition:
      • IAM role associated with the job
      • vCPU and memory requirements
      • Mount points
      • Container properties
      • Environment variables
      $ aws batch register-job-definition --job-definition-name gatk --container-properties ...
  • 26. Job Queues Jobs are submitted to a Job Queue, where they reside until they are able to be scheduled to a compute resource. Information related to completed jobs persists in the queue for 24 hours.
      $ aws batch create-job-queue --job-queue-name genomics --priority 500 --compute-environment-order ...
  • 27. Compute Environments Mapped from job queues to run containerized batch jobs.
      • Managed CEs - you describe your requirements (instance types, min/max/desired vCPUs, and EC2 Spot bid as a % of On-Demand), and AWS launches and scales resources for you. Pick specific instance types or instance families, or simply choose “optimal”.
      • Unmanaged CEs - you can launch and manage your own resources. Your instances need to include the ECS agent and run supported versions of Linux and Docker. AWS Batch will then create an Amazon ECS cluster which can accept the instances you launch. Jobs can be scheduled to your Compute Environment as soon as your instances are healthy and register with the ECS Agent.
      $ aws batch create-compute-environment --compute-environment-name unmanagedce --type UNMANAGED ...
  • 28. AWS Batch Scheduler The Scheduler evaluates when, where, and how to run jobs that have been submitted to a job queue. Jobs run in approximately the order in which they are submitted as long as all dependencies on other jobs have been met.
  • 29. Queued Job States
      • SUBMITTED: Accepted into the queue, but not yet evaluated for execution
      • PENDING: Your job has dependencies on other jobs which have not yet completed
      • RUNNABLE: Your job has been evaluated by the scheduler and is ready to run
      • STARTING: Your job is in the process of being scheduled to a compute resource
      • RUNNING: Your job is currently running
      • SUCCEEDED: Your job has finished with exit code 0
      • FAILED: Your job finished with a non-zero exit code or was cancelled or terminated.
  • 30. AWS Batch Actions
      • CancelJob: Marks jobs that are not yet STARTING as FAILED.
      • TerminateJob: Cancels jobs that are currently waiting in the queue. Stops jobs that are in a STARTING or RUNNING state and transitions them to FAILED.
      NOTE: Requires a “reason” which is viewable via DescribeJobs
      $ aws batch cancel-job --reason "Submitted to wrong queue" --job-id 8a767ac8-e28a-4c97-875b-e5c0bcf49eb8
  • 31. AWS Batch Data Types • ComputeEnvironmentDetail • ComputeEnvironmentOrder • ComputeResource • ContainerProperties • ContainerPropertiesResource • CounterProperties • Host • Job • JobDefinition • JobQueueDetail • MountPoint • Parameter • Ulimit • Volume
  • 33. AWS Batch Pricing and Functionality There is no charge for AWS Batch; you only pay for the underlying resources that you consume! NOTE: Support for Array Jobs, retries, and jobs executed as AWS Lambda functions coming soon!
  • 34. Use the Right Tool for the Job Not all batch workloads are the same…
      • ETL and Big Data processing/analytics? Consider EMR, Data Pipeline, Redshift, and related services.
      • Lots of small Cron jobs? AWS Batch is a great way to execute these jobs, but you will likely want a workflow or job-scheduling system to orchestrate job submissions.
      • Efficiently run lots of big and small compute jobs on heterogeneous compute resources? Use AWS Batch.
  • 36. Example: Genomics on Unmanaged Compute Environments
  • 37. AWS Batch summarized: Fully Managed • Integrated with AWS • Cost-optimized Resource Provisioning
  • 38. About AWS Glue Serverless managed, scalable ETL
  • 39. AWS Glue 1. Build a data catalog 1. Discover and use your datasets via a Hive-compatible metastore 2. Store versions, connection and credential info 3. Use crawlers to auto-generate schema from S3 data & partitions 2. Generate and edit transforms using PySpark 3. Schedule and run your jobs 1. On schedule, event or lambda NOTE: Glue is announced, but no beta as of yet…video from re:Invent -- here
  • 45. About AWS QuickSight Quick and easy data dashboards
  • 48. Resources for new AWS Services • Athena (SQL query on S3) – here • Batch (Optimized, chained EC2 batches) – here • Glue (Scaled ETL) -- here • Step Functions (Lambda workflows) – here • QuickSight (Data Dashboards) – here • Full list of AWS services announced at re:Invent 2016 -- here
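
Slide 8 describes the Athena workflow as adding a table structure via DDL and then running a SQL query. The sketch below only composes the two statements as strings; it does not contact AWS. The database name, table name, columns, and bucket path are all hypothetical, and the Hive-style `CREATE EXTERNAL TABLE` form for tab-delimited text is an assumption about how the input files are laid out.

```python
def build_create_table_ddl(database, table, columns, s3_location, delimiter="\\t"):
    """Compose a Hive-style CREATE EXTERNAL TABLE statement for delimited text files."""
    column_lines = ",\n".join(f"  `{name}` {sql_type}" for name, sql_type in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {database}.{table} (\n"
        f"{column_lines}\n"
        ")\n"
        f"ROW FORMAT DELIMITED FIELDS TERMINATED BY '{delimiter}'\n"
        f"LOCATION '{s3_location}'"
    )


# Hypothetical schema for a variants table stored as tab-separated files on S3.
ddl = build_create_table_ddl(
    database="genomics",
    table="variants",
    columns=[
        ("chromosome", "string"),
        ("start_position", "bigint"),
        ("reference_allele", "string"),
        ("alternate_allele", "string"),
    ],
    s3_location="s3://example-bucket/variants/",
)

# Hypothetical query: count variants per chromosome.
query = (
    "SELECT chromosome, COUNT(*) AS variant_count\n"
    "FROM genomics.variants\n"
    "GROUP BY chromosome\n"
    "ORDER BY variant_count DESC\n"
    "LIMIT 10"
)

print(ddl)
print(query)
```

Both statements could be pasted into the Athena query editor once the names are replaced with real ones.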
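
Slide 12 summarizes Step Functions as "your application as a state machine" built from defined steps. A minimal sketch of what such a definition might look like follows, written as a Python dictionary in the shape of the Amazon States Language and serialized to JSON. The state names, the account number, and the Lambda function names are placeholders invented for illustration; the small `validate` helper is not part of any AWS library.

```python
import json

# Placeholder ARN prefix; the account id and region are made up.
ARN_PREFIX = "arn:aws:lambda:us-east-1:123456789012:function:"


def task_state(lambda_arn, next_state=None):
    """Build a Task state that either hands off to next_state or ends the workflow."""
    state = {"Type": "Task", "Resource": lambda_arn}
    if next_state is None:
        state["End"] = True
    else:
        state["Next"] = next_state
    return state


# Hypothetical three-step pipeline, one Lambda function per step.
state_machine = {
    "Comment": "Illustrative genomics workflow",
    "StartAt": "AlignReads",
    "States": {
        "AlignReads": task_state(ARN_PREFIX + "align-reads", "CallVariants"),
        "CallVariants": task_state(ARN_PREFIX + "call-variants", "AnnotateVariants"),
        "AnnotateVariants": task_state(ARN_PREFIX + "annotate-variants"),
    },
}


def validate(definition):
    """Check that StartAt and every Next point at a defined state."""
    states = definition["States"]
    if definition["StartAt"] not in states:
        raise ValueError("StartAt does not name a defined state")
    for name, state in states.items():
        has_next = "Next" in state
        is_end = state.get("End") is True
        if has_next == is_end:
            raise ValueError(f"{name} must have exactly one of Next or End")
        if has_next and state["Next"] not in states:
            raise ValueError(f"{name} points at an undefined state")


validate(state_machine)
print(json.dumps(state_machine, indent=2))
```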
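
Slide 22 notes that parametric sweeps are possible today by submitting many independent simple jobs. The sketch below shows one way that could be organized: it expands a parameter grid into one job request per combination. Nothing is submitted; the dictionaries only mirror the shape of a job submission request. The `gatk` job definition and `genomics` queue come from the slides, while the grid values and the `sweep-NNN` naming scheme are hypothetical.

```python
from itertools import product


def sweep_jobs(job_definition, job_queue, grid):
    """Yield one job request per combination of the values in grid."""
    names = sorted(grid)  # fixed ordering so job numbering is reproducible
    combinations = product(*(grid[name] for name in names))
    for index, values in enumerate(combinations):
        yield {
            "jobName": f"sweep-{index:03d}",
            "jobQueue": job_queue,
            "jobDefinition": job_definition,
            # Parameter values are passed as strings.
            "parameters": {name: str(value) for name, value in zip(names, values)},
        }


# Hypothetical sweep: two chromosomes times three quality thresholds.
grid = {
    "chromosome": ["chr1", "chr2"],
    "min_base_quality": [10, 20, 30],
}

requests = list(sweep_jobs("gatk", "genomics", grid))
for request in requests:
    print(request["jobName"], request["parameters"])
```

Each generated request is independent of the others, which is what lets the scheduler run them in parallel.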
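
Slide 24 says that DAG-based systems submit many jobs at once while identifying inter-job dependencies. A minimal sketch of that idea, using only the Python standard library (`graphlib`, Python 3.9+): jobs are ordered so that every job is planned after the jobs it depends on, and each planned job records the ids of its parents. The pipeline shape and the `job-NNNN` ids are hypothetical stand-ins for the ids a real submission would return.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: job name -> names of the jobs it depends on.
pipeline = {
    "align": [],
    "call-variants": ["align"],
    "annotate": ["call-variants"],
    "qc-report": ["align"],
    "publish": ["annotate", "qc-report"],
}


def plan_submissions(dependencies):
    """Return job requests in an order where parents always precede children."""
    job_ids = {}
    plan = []
    # static_order() yields each node only after all of its predecessors.
    for number, name in enumerate(TopologicalSorter(dependencies).static_order()):
        job_ids[name] = f"job-{number:04d}"  # stand-in for a returned job id
        plan.append(
            {
                "jobName": name,
                "jobId": job_ids[name],
                "dependsOn": [{"jobId": job_ids[parent]} for parent in dependencies[name]],
            }
        )
    return plan


plan = plan_submissions(pipeline)
for job in plan:
    parents = [dep["jobId"] for dep in job["dependsOn"]]
    print(job["jobId"], job["jobName"], "depends on", parents)
```

A cycle in the dependency map raises `graphlib.CycleError`, which is a convenient early check before anything is submitted.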
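
Slide 29 lists the job states from SUBMITTED through SUCCEEDED or FAILED. The sketch below models them as a small state machine. The state names are taken from the slide; the table of allowed transitions is an assumption inferred from the slide's descriptions (for example, that any unfinished job can move to FAILED when cancelled or terminated), not a documented AWS specification.

```python
from enum import Enum


class JobState(Enum):
    SUBMITTED = "SUBMITTED"
    PENDING = "PENDING"
    RUNNABLE = "RUNNABLE"
    STARTING = "STARTING"
    RUNNING = "RUNNING"
    SUCCEEDED = "SUCCEEDED"
    FAILED = "FAILED"


S = JobState

# Assumed transitions, inferred from the state descriptions on the slide.
ALLOWED = {
    S.SUBMITTED: {S.PENDING, S.RUNNABLE, S.FAILED},
    S.PENDING: {S.RUNNABLE, S.FAILED},
    S.RUNNABLE: {S.STARTING, S.FAILED},
    S.STARTING: {S.RUNNING, S.FAILED},
    S.RUNNING: {S.SUCCEEDED, S.FAILED},
    S.SUCCEEDED: set(),  # terminal
    S.FAILED: set(),     # terminal
}


def advance(current, target):
    """Return target if the move is allowed, otherwise raise ValueError."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target


def finish(exit_code):
    """Map a container exit code to the final state: 0 succeeds, anything else fails."""
    return S.SUCCEEDED if exit_code == 0 else S.FAILED


# Walk a job with unmet dependencies through to completion.
state = S.SUBMITTED
for target in (S.PENDING, S.RUNNABLE, S.STARTING, S.RUNNING):
    state = advance(state, target)
state = advance(state, finish(0))
print(state.value)
```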

Editor's Notes

  1. https://www.csiro.au/en/Locations/NSW/North-Ryde Riverside Life Sciences Centre reception Our North Ryde site is in the heart of Sydney's high-tech hub and co-locates researchers from diverse disciplines. Our NSW science education centre is located on this site.
  2. https://aws.amazon.com/blogs/aws/genome-engineering-applications-early-adopters-of-the-cloud/
  3. https://aws.amazon.com/public-datasets/
  4. https://aws.amazon.com/blogs/big-data/interactive-analysis-of-genomic-datasets-using-amazon-athena/
  5. https://aws.amazon.com/blogs/aws/new-aws-step-functions-build-distributed-applications-using-visual-workflows/
  6. https://aws.amazon.com/blogs/aws/aws-batch-run-batch-computing-jobs-on-aws/ Jamie Kinney, Principal Product Manager, AWS Batch
  7. High-Throughput: Can process as many concurrent genomic workflows as needed (>1000 day). Flexible: You define your containers, dependencies, and resource requirements. Batch takes care of the rest. Elastic and Scalable: Treat each workflow like a burst compute. Pay only for what you need when you need it. Cost-Optimized: Runs on spot-fleet to significantly reduce cost of genomic analysis.
  8. https://aws.amazon.com/glue/
  9. https://aws.amazon.com/ec2/Elastic-GPUs/
  10. https://aws.amazon.com/glue/