The Age of Data-Driven
Personalized Medicine
Ketan Paranjape
Worldwide Director, Health & Life Sciences
Intel Corporation
www.intel.com/healthcare/bigdata
Notice and Disclaimers
• Notice: This document contains information on products in the design phase of development. The information here is
subject to change without notice. Do not finalize a design with this information. Contact your local Intel sales office or
your distributor to obtain the latest specification before placing your product order.
• INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL® PRODUCTS. EXCEPT AS PROVIDED
IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER,
AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY RELATING TO SALE AND/OR USE OF INTEL PRODUCTS,
INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR
INFRINGEMENT OF ANY PATENT, COPYRIGHT, OR OTHER INTELLECTUAL PROPERTY RIGHT. Intel products are not
intended for use in medical, life saving, or life sustaining applications. Intel may make changes to specifications,
product descriptions, and plans at any time, without notice.
• All products, dates, and figures are preliminary for planning purposes and are subject to change without notice.
• Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or
"undefined." Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or
incompatibilities arising from future changes to them.
• Performance tests and ratings are measured using specific computer systems and/or components and reflect the
approximate performance of Intel products as measured by those tests. Any difference in system hardware or
software design or configuration may affect actual performance.
• The Intel products discussed herein may contain design defects or errors known as errata which may cause the
product to deviate from published specifications. Current characterized errata are available on request.
• Knights Corner, Knights Ferry, Aubrey Isle and other code names featured are used internally within Intel to identify
products that are in development and not yet publicly announced for release. Customers, licensees and other third
parties are not authorized by Intel to use code names in advertising, promotion or marketing of any product or
services and any such use of Intel's internal code names is at the sole risk of the user.
• Copies of documents which have an order number and are referenced in this document, or other Intel literature, may
be obtained by calling 1-800-548-4725, or by visiting Intel's website at http://www.intel.com.
• Intel®, Itanium®, Xeon®, Pentium®, and the Intel logo are trademarks or registered trademarks of Intel
Corporation or its subsidiaries in the United States and other countries.
• Copyright © 2011-13, Intel Corporation. All rights reserved.
• *Other names and brands may be claimed as the property of others.
Compute for Personalized Medicine
a.k.a Big Data Analytics in Healthcare and Analytics
ACTIONABLE EHR ANALYTICS - PAYER, PROVIDER
Regional Health Information Network
RHIN – China (Jinzhou, Pop 3M)
• Challenge: RHIN has
challenges with scalability,
performance and
maintenance. Data storage
is expensive
• Solution: EMR data and
healthcare services running
on Intel Hadoop Distribution
and Xeon E5 servers.
• Benefits: High
performance and scalability
demonstrated via POC and
stress testing. Significantly
reduced storage cost
• 1/5 Reduction in
Response Time; 5x
Concurrent Users
Data processing flow of RHIN platform
http://hadoop.intel.com/pdfs/IntelChinaHealthyCityAnalyticsCaseStudy.pdf
GE-Medical Quality Improvement
Consortium (MQIC)
• Challenge – Gaining value from
data in EMRs/EHRs and other
digital health information tools
• Solution – De-identified data
from Centricity EMRs; Analytics
capabilities to enhance their
quality and reporting activities.
• 1.6 billion documents
representing 30 million de-
identified patient records and
209 million office visits.
• Benefits - Physician practices
and ambulatory care clinics
deliver their best care more
efficiently, along with
population-based research and
public health activities.
DEMO - http://visualization.geblogs.com/visualization/network/
NHS Trust – Leeds Teaching
Hospitals
• Challenge – Capture data at the
point of admission, throughout the
patient care cycle and use natural
language processing (NLP) to make
sense of unstructured care notes and
combine with structured care data for
analysis
• Solution – Partnering with ISVs –
Ascribe, Two10degrees, Microsoft
and machines powered by Intel Xeon
processor E5 family;
• 30M patients, > 7M attendances
each year worth of records
• Benefits – Billing optimizations
(doctors log the correct data),
Resource Optimizations (learning
patient trends for resource planning)
“The use of big data analysis on
our patient care notes enables us
to prove things our clinical intuition
was telling us. In the new world
anecdotal evidence isn’t enough.
What we think isn’t sufficient to
spend money. We need proof.”
Iain MacBrairdy,
Business Manager,
Emergency Medicine,
Leeds Teaching Hospitals
Charité “Real-time” Cancer Analysis –
Matching proper therapies to patients
• Challenge: Real-time
analysis of cancer patients
using the in-memory SAP
HANA Oncolyzer database
that is running on mission-critical
Intel Xeon family infrastructure
(3.5M data points per patient, up
to 20 TB of data per patient)
• Solution: Collecting and analyzing
tables of structured and
unstructured data used to take
up to two days; it now takes
seconds
• Benefits: Improves medical
quality in disruptive way for
– Patient
– Doctor
– Hospital
– Research
http://moss.ger.ith.intel.com/sites/SAP/SAP%20account%20team%20documents/Marketing/SAP%20HANA/SAPHANA_Charite_case_study_HI.PDF
HANA Oncolyzer
• Ad-hoc Analysis of heterogeneous
tumor data for cancer research
• Medical records from decades of tens
of thousands of patients
• Structured and unstructured data
(records, time series, free text, etc.)
Solution
• Integrated into condensed but
exhaustive view
• On-the-fly analyses
(e.g. Kaplan-Meier estimation, cohort statistics)
• Includes external data sources
(e.g. PubMed, pharmaceutical databases)
• Attributes can be native, views,
freetext-extracted, calculated
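The Kaplan-Meier estimation mentioned above can be sketched in a few lines. This is a minimal illustration of the estimator itself, not the HANA Oncolyzer implementation; the function name `kaplan_meier` and the toy follow-up data are assumptions made for this example.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: follow-up time per patient
    events:    1 if the event was observed at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(durations, events) if e)
    exits = Counter(durations)  # every patient leaves the risk set at their time
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical cohort: five patients, two of them censored.
curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
for t, s in curve:
    print(f"t={t}: S(t)={s:.2f}")
```

Censored patients reduce the number at risk without lowering the survival estimate, which is why the estimator suits clinical records with incomplete follow-up.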
LIFE SCIENCES, PHARMA,
GENOMICS
Life Sciences: At the intersection of
transformative forces
HPC: Enabling exascale computing on massive data sets
Cloud: Helping enterprises build open interoperable clouds
Open Source: Contributing code and fostering ecosystem
Genomics Is A Big Data Problem
Affecting Factors
Cell Response
313 Exabytes
if everyone in the
US has their genes
sequenced
495 Exabytes
if every cancer
patient in the US
has their genes
sequenced every 2
weeks
A complex
interaction of varied
& changing intrinsic
and extrinsic factors
determine cell
response
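As a rough sanity check on the 313 exabyte figure, the implied data volume per person can be worked out directly. The population value below is an assumption (roughly the 2012 US population) and is not taken from the slide; decimal units (1 EB = 10^18 bytes) are also assumed.

```python
EXABYTE = 10**18   # decimal exabyte, in bytes
TERABYTE = 10**12  # decimal terabyte, in bytes


def implied_tb_per_person(total_exabytes, population):
    """Data volume per person, in terabytes, implied by a population-wide total."""
    return total_exabytes * EXABYTE / population / TERABYTE


# 313 EB comes from the slide; 314 million is an ASSUMED US population.
print(f"{implied_tb_per_person(313, 314_000_000):.2f} TB per person")
```

Under these assumptions the slide's total corresponds to about one terabyte of sequencing data per person.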
Life Sciences:
Key Industry Challenges and Solutions
• Many (most) applications are single-threaded, single address space
– Intel is delivering optimizations, working with the open source community and developing NGS+HPC curriculum
• Some algorithms scale quadratically with the size of the problem. Large data sets exceed available memory and storage
– Innovations in acceleration, compute, storage, networking, security, and *-as-a-service.
• International collaboration is an imperative, bioinformatics expertise is scarce
– Intel is working closely with the ecosystem to address enterprise to cloud transmission of terabyte payloads
• Databases are distributed, data is siloed and will likely stay that way
– Tools like Hadoop, Lustre, GraphLab, In-Memory Analytics, etc.
Need for Balanced Compute Infrastructure
Dell Active Infrastructure for
HPC Life Sciences
• Challenge: Experiment processing takes
7 days with current infrastructure.
Delays treatment for sick patients
• Solution: Dell Next Generation
Sequencing Appliance
– Single Rack Solution
– 9 Teraflops of Sandy Bridge Processors
– Lustre File Storage
– Intel SW tools and engineers
• Benefits: RNA-Seq processing
reduced to 4 hours
• Includes everything you need for NGS -
compute, storage, software, networking,
infrastructure, installation, deployment,
training, service & support
[Figure: rack layout. Components: Dell HSS (Lustre), up to 360TB; Dell NSS (NFS), up to 180TB; M420 (Compute), up to 32 nodes; Infrastructure: Dell PE, PC & F10. Rack labels: 2U Plenum, NSS-HA Pair, NSS User Data, HSS Metadata Pair, HSS OSS Pair, HSS User Data. Actual placement in racks may vary.]
IBM, CLC bio Genomics Sequencing
Analytics Solution
• Challenge: Need for processing power and
storage capacity in order to correlate the
variants in the genome with the relevant patient
symptoms
• Solution: IBM®, CLC Genomics server SW,
Genomics Workbench client SW; Small (48
Cores, 192 GB), Medium, Large (192 Cores,
768 GB) Analytics Solutions
• Benefits:
– Reference Mapping for 37x coverage human
genome – ~9hr (1 node) to ~30mins (37 nodes)
– Variant Calling and annotation for 37x coverage –
~40 hrs (1 node) to ~3hrs (23 nodes)
• Infrastructure
– IBM System x® 3550 M4, E5-2650; from 48 CPU cores and 192 GB of memory
up to 192 CPU cores and 768 GB of memory
– IBM Storwize® V7000
– CLC Genomics Server 5.0.2 , Workbench 6.0.1
– 7x 3TB SAS 6 Gbps HDD (16 TB usable)
http://www-148.ibm.com/bin/newsletter/tool/landingPage.cgi?lpId=6155
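The benefit figures on this slide imply a speedup and a parallel efficiency (speedup divided by node count). The sketch below only restates that arithmetic using the approximate timings quoted above; the helper name is an assumption for this example.

```python
def speedup_and_efficiency(single_node_hours, multi_node_hours, nodes):
    """Return (speedup, parallel efficiency) for a multi-node run."""
    speedup = single_node_hours / multi_node_hours
    return speedup, speedup / nodes


# Approximate timings quoted on the slide (all values are "~" figures).
cases = {
    "Reference mapping (37 nodes)": (9.0, 0.5, 37),
    "Variant calling and annotation (23 nodes)": (40.0, 3.0, 23),
}
for name, (t1, tn, n) in cases.items():
    s, e = speedup_and_efficiency(t1, tn, n)
    print(f"{name}: speedup {s:.1f}x, efficiency {e:.0%}")
```

Because the source timings are rounded, the resulting efficiencies are indicative only.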
NGS Appliances
BioTeam “SlipStream”
• Challenge: Significant IT overhead,
limited bioinformatics support, changing
landscape
• Solution: “Slipstream” Appliance
• Benefits:
– Minimize lab IT startup costs
– Integrate and standardize data management
including security, easily traceable results
– Adaptable to any Laboratory, Workflow-
based Lab Management
– Seamless Sequencer Integration
• Infrastructure
– Dell PowerEdge T620 Desktop Server
– 2x Intel Xeon 8 Core Processors (16 cores)
– 16x 32GB RAM (512GB), 1x 100GB SSD
– 7x 3TB SAS 6 Gbps HDD (16 TB usable)
Convey Computing’s Hybrid Core Architecture
to Accelerate Algorithms
• Challenge: Advances in sequencing
technology have significantly increased data
generation and require similar computational
advances for bioinformatics analysis
• Solution: Convey Hybrid-Core (HC)
architecture - Intel® x86 microprocessors
with a coprocessor comprised of
reconfigurable hardware (FPGAs)
• Benefits: Accelerated BWA pipeline up to
18x compared to a standard x86 system
• Project Characteristics:
HC-1: Intel L5408, Xilinx Virtex-5 FPGAs, 1TB
SATA disks
HC-2: Intel X5670, Xilinx Virtex-5 FPGAs, 1TB
SATA disks
HC-2ex: 128GB (host), 64GB (coprocessor),
1TB SATA disks
Genomics & Health Analytics Appliances
Scale through independent solutions,
each targeting a different segment & usage model
Ultra High-Speed
Networking Optimizations – Aspera Labs
• Challenge: Improving big data transfer to
and from the backend data center
• Solution: Optimize ultra high-speed (10
Gbps and beyond) data transfer solutions
built on Aspera’s FASP ™ transport
technology and Intel’s innovative hardware
platform
• Benefits with Intel Xeon E5-2600
(DDIO, SR-IOV)
– 300% improvement in Aspera transfer
throughput
– Same transfer speed performance in both
physical and virtualized computing
environments
– Both LAN and WAN transfer speeds had
similar results
• Infrastructure and Data Characteristics:
– Xeon E5 2687, 32GB DDR3 with Non-Uniform Memory Access (NUMA)
Data Direct IO (DDIO), Intel 910 SSD, Intel 82599EB 10 GbE
– Aspera Enterprise server 3.1.1.66573, Aspera Performance Automation
Suite
High-Performance Interconnect (InfiniBand)
and HPC – Intel® True Scale Fabric
• Challenge: Can high-performance interconnect
technology (InfiniBand) keep up with the increase in
the number of processor cores?
• Workloads: VASP, WIEN2K
• Benchmarks: MVAPICH (MPI over InfiniBand),
IMPI (Intel MPI)
• Results:
– Scale-up research – 5- to 10-fold
improvement in performance when scaling
from a single node to 16 nodes
– Intel® True Scale Fabric QDR-40 shows
excellent price/performance results
• Infrastructure and Data Characteristics:
– 1 head + 16 compute nodes, dual Xeon® E5 2680 2.7GHz per node
– 32GB of RAM 1666MHz per node
– RHEL, Compiler, MPI variations available
– Intel® Cluster Suite, Intel® Fabric Suite
Data Life Cycle Management
with iRODS – EMC, RENCI
4.3M WGS for all US newborns/yr. ≈ 100 PB*
Can I describe it? Can I find it? Can I access it? Can I move it?
* Chris Mason, Weill Cornell Medical College, WGA Mtg. Nov. 2012
High Performance Scale-out Storage for
Wellcome Trust Sanger Institute
Challenge: Exponential increases in the volume of data being generated – but
storage budgets are flat or growing slowly.
Large data sets are difficult to proactively manage, and can easily overwhelm
storage resources. Un-optimized storage has a direct, negative impact on
application performance – slowing the time for breakthrough results.
Solution: Exploit the power and scale of HPC-class storage, powered by Intel®
Enterprise Edition for Lustre* software for unprecedented performance with
unmatched management simplicity.
Benefits of storage solutions powered by Intel EE for Lustre software:
– Openness – Developed and enhanced by the Lustre experts
– Global namespace – all clients can access all data
– Performance – Upwards of 1 TB/s
– Virtually unlimited file system and per file sizes
– Management simplicity using Intel® Manager for Lustre*
Heterogeneous Clusters for Biomedical Computing at
Virginia Bioinformatics Institute (VBI)
• Challenge: Rapid data growth and the
need to run varied applications call for
scalable infrastructure and novel
computing approaches.
• Solution: Combination of Intel® Xeon®,
Intel® True Scale QDR InfiniBand and SGI’s
InfiniteStorage platform was deployed to
deliver a 300% speedup. Overall reduction
in cost resulted in the purchase of additional
compute nodes.
• VBI Cluster – Symmetric multiprocessing
(SMP) nodes (large memory Xeon E7) with
1 TB of RAM, massively parallel processing
(MPP) nodes (Xeon E5) with 64 GB. 50 PB of
tape storage, 600 TB of HDD. Using SGI’s
IS16000 platform and Intel True Scale
fabric, VBI moves data through the storage
systems at 2 GB per second.
“The amazing thing is that we see
almost a three times performance
increase on 48 nodes compared to 56
nodes of the previous generation,
even though the processors are
slightly slower clock speed. The Intel®
QuickPath Interconnect and Intel®
True Scale Fabric have had a big
impact.”
Dr. Kevin Shinpaugh,
Director of IT and HPC,
Virginia Bioinformatics Institute
Top-5 Pharmaceutical Company -
SAS Grid
• Challenge: Need to accelerate and
optimize “time to results” clinical trial
simulation environment; resource
allocation and job prioritization was
manual/ad-hoc
• Solution: “Scale-Out” architecture:
– SAS Visual Analytics, Enterprise Miner, Grid
Manager
– Red Hat Enterprise Linux
– Xeon E5 servers (HP)
• Benefits: Clinical trial simulation
exercises reduced from hours to < 5
minutes; registration decisions
accelerated with multi-hundred million
USD impact
http://www.intel.com/content/www/us/en/cloud-computing/cloud-computing-xeon-e5-carestream-imaging-brief.html
Mitsui Knowledge Industry (MKI)
• Challenge: Reduce the amount of
time it takes to do complete genomic
analysis and deliver results to
patients
• Solution: Real-Time Big Data
Platform
– R (Revolution Analytics)
– SAP HANA
– Hadoop
• Benefits: Genomic analysis
shortened from several days to 20
minutes; performance for some
queries improved 400,000 X
http://www.intel.com/content/www/us/en/cloud-computing/cloud-computing-xeon-e5-carestream-imaging-brief.html
http://www.saphana.com/docs/DOC-3641
Value
• Enable researchers to discover biomarkers
and drug targets by correlating genomic data
sets
• 90% gain in throughput; 6X data compression
Analytics
• Provide curated data sets with pre-computed
analysis (classification, correlation,
biomarkers)
• Provide APIs for applications to combine and
analyze public and private data sets
Data Management
• Use Hive and Hadoop for query and search
• Dynamically partition and scale HBase
• 10-node cluster / Intel Xeon E5 processors
• 10GbE network
Data-Intensive Discovery: Genomics
Intel Distribution
• Solution: Intel Distribution for
Hadoop (IDH), MapReduce,
HBase, Hive
• Benefits: Ability to compare 14
million proteins and more,
reducing the processing time
from days to hours.
• Project Characteristics:
Hadoop: 5-node cluster
Storage: 16TB (internal storage) per server
Servers: Xeon E5, 2-socket, 8 cores, 64GB RAM
SLA: reduce processing time from 30 days to less than a day and scale to a 4x4 million samples comparison
Data: Multi-terabyte database
Problem Statement:
Back in 2008, a genome research team
faced compute and scalability issues in
comparing all pairs of 4 million
proteins: the BLAST search results
overwhelmed a single database table.
Today they need to compare 14 million
proteins, a requirement that cannot
be addressed with existing
technology.
[Figure: Big Data, Bioinformatics. Labels: team website; BLAST program; genome data; proteins comparison; high-performance scalable Hadoop/HBase cluster]
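The move from 4 million to 14 million proteins matters because an all-pairs comparison grows quadratically with the number of proteins. The sketch below counts unordered pairs; whether the team compared ordered or unordered pairs is not stated in the source, so that choice is an assumption (the growth ratio is essentially the same either way).

```python
def unordered_pairs(n):
    """Number of distinct pairs among n items: n * (n - 1) / 2."""
    return n * (n - 1) // 2


pairs_2008 = unordered_pairs(4_000_000)
pairs_today = unordered_pairs(14_000_000)

print(f"4M proteins:  {pairs_2008:.3e} pairs")
print(f"14M proteins: {pairs_today:.3e} pairs")
print(f"growth factor: {pairs_today / pairs_2008:.2f}x")
```

A 3.5x increase in proteins therefore means roughly a 12x increase in comparisons, which is consistent with the need to scale out rather than rely on a single database table.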
High Throughput Science:
Embracing Cloud-based Analytics
• Challenge: Team of cancer
researchers had to screen a drug
concept with a list of tens of millions
of molecules working with a tight
deadline, a fixed budget, and strict
security and compliance requirements.
Schrödinger’s existing in-house
servers would be tied up for weeks
• Solution: Schrödinger leveraged
software from AWS partner, Cycle
Computing, to provision a fully
secured cluster of 50,000 cores,
powered by the Intel® Xeon®
processor E5 family.
– This configuration enabled the
team to run 16 million
molecular simulations an hour.
– Developed 1000 molecule list
in < 8hrs.
High Throughput Science:
Large Scale Computational Chemistry Simulation
• Challenge: Sustaining access to
50000+ compute cores for large scale
computational chemistry simulation
results in under a week. Ability to
monitor and re-launch jobs, no
additional capital expenditure with
internal HPCC already running at
capacity.
• Solution: Novartis leveraged software
from AWS partner, Cycle Computing,
and MolSoft to provision a fully
secured cluster of 30,000 CPUs,
powered by the Intel® Xeon®
processor E5 family.
– Completed screening of 3.2
million compounds in
approximately 9 hrs, compared
to 4-14 days on existing
resources.
Virtual Screening
Goals and Current applications target
• Focus on improving
genomics pipelines
• Optimize individual
applications
• Work with code
authors to release
optimizations
• Intel® Xeon®
processor focus
• Selectively experiment with
Intel® Xeon Phi™ coprocessor
Domain: Genomics
Application            Intel® Architecture Target
Bowtie 1*, Bowtie 2*   Xeon® processor
BWA*                   Xeon® processor
BLAST*                 Xeon® processor
GATK*                  Xeon® processor
HMMER*                 Xeon® processor, Xeon® Phi™ coprocessor
Abyss*                 Xeon® processor
Velvet*                Xeon® processor
*Other names and brands may be claimed as the property of others.
TGen* RNA sequencing pipeline
Partnership between Intel®, DELL*, TGen*
1.8x
** 2-socket Intel(R) Xeon(R) CPU E5-2687W / 3.1 GHz
Goals and Current applications target
• Optimize for Intel®
Xeon® processor and
Intel® Xeon Phi™
coprocessor (node
and cluster)
• Increase availability
of applications on
Intel® Xeon Phi™
coprocessor
• Work with code
authors to release
optimizations
Domain: Molecular Dynamics / Chemistry
Intel® Architecture Targets: Xeon® processor, Xeon® Phi™ coprocessor
Applications: AMBER*, NAMD*, GROMACS*, GAMESS*, Quantum Espresso*, Gaussian*, VASP*, CP2K*, QBOX*, CPMD*, LAMMPS*
Intel® Xeon® processor: new platforms,
architecture improve Life Science applications
2-socket “Ivy Bridge” vs. “Sandy Bridge”
Ivy Bridge 12c/24T 2.7GHz, Sandy Bridge 8c/16T 2.9GHz
http://www.intel.com/content/www/us/en/benchmarks/server/xeon-e5-2600-v2/xeon-e5-v2-hpc-life-sciences.html
Optimizing/Accelerating the DNA Pipeline
Compression – IPP library – HW Acceleration – Custom library
FPGA Acceleration
Incorporating Intel IPP Deflater into Picard Tools
Picard MarkDuplicates Optimizations
Two Fold Approach:
1. Added optional tag ‘MC’ to SAM Specification
• Tag ‘MC’ is used to store Mate Cigar for a Paired Read, where mate is mapped.
• SAM JDK extended to support tag ‘MC’
• MergeBamAlignment modified to include the new ‘MC’ tag within each relevant
record of the SAM/BAM file
2. Redesign of MarkDuplicates
• Inclusion of ‘MC’ tag provides opportunity for algorithmic redesign of
MarkDuplicates
• Overall speedup ~2x for MergeBamAlignment/MarkDuplicates
Additional Gains: Enables streaming of records for the entire pre-GATK phase (from
‘bwa mem’ to ‘MarkDuplicates’ ) in a typical bwa_mem+GATK workflow
MarkDuplicates
BASELINE (each record, RdX_1, RdY_1, RdX_2, RdY_2, carries only its own CIGAR)
1) Store information per each read:
Used to determine unclipped 5’ coordinate for
both ends & orientation of pair
2) Sort reads within the entire file by unclipped
5’ coordinate and MarkDuplicates
3) Write out BAM file
OPTIMIZED (each record carries its CIGAR plus the MC tag; reads grouped as PairX, PairY)
1) Sort reads within a small window by
unclipped 5’ coordinate:
- MarkDuplicates
- Write out
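The unclipped 5’ coordinate used above can be derived from a read's position, CIGAR string and strand, and with the MC tag the mate's coordinate can be derived from the same record. The following is an illustrative sketch, not code from Picard; the function names are assumptions, while the SAM conventions it relies on (1-based POS, flag bit 0x20 for a reverse-strand mate, PNEXT for the mate position) come from the SAM format.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # operations that advance along the reference
CLIPS = set("SH")             # soft and hard clips


def unclipped_five_prime(pos, cigar, reverse):
    """Unclipped 5' reference coordinate of a read (pos is 1-based leftmost)."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    if not reverse:
        # Forward strand: 5' end is the alignment start minus leading clips.
        lead = 0
        for n, op in ops:
            if op not in CLIPS:
                break
            lead += n
        return pos - lead
    # Reverse strand: 5' end is the alignment end plus trailing clips.
    ref_len = sum(n for n, op in ops if op in REF_CONSUMING)
    trail = 0
    for n, op in reversed(ops):
        if op not in CLIPS:
            break
        trail += n
    return pos + ref_len - 1 + trail


def mate_unclipped_five_prime(flag, pnext, mate_cigar):
    """Mate's unclipped 5' coordinate, using only fields of this record."""
    mate_reverse = bool(flag & 0x20)
    return unclipped_five_prime(pnext, mate_cigar, mate_reverse)


print(unclipped_five_prime(100, "5S95M", reverse=False))
print(mate_unclipped_five_prime(flag=0x21, pnext=300, mate_cigar="10S90M"))
```

Because both ends of a pair can be keyed from a single record, duplicates can be resolved within a small sorted window instead of after a pass over the entire file.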
DNA Pipeline: BWA+GATK: Whole Genome Sample: ~65x Coverage
Collaborating with our Partners and Medical community
Process-level parallelism, thread-level parallelism:
Step                                     # of Threads   Runtime (hours)
Read Alignment (bwa mem)                 24             7
View (samtools)                          24             2
Sort + Index (samtools)                  24             3
MarkDuplicates (picardtools) + Index     1              11
RealignerTargetCreator (GATK)            24             1
IndelRealigner* (GATK) + Index           24             6.5
BaseRecalibrator (GATK)                  24             1.3
PrintReads* (GATK) + Index + Flagstat    24             12.3
TOTAL (hours)                                           44

Step / Tool                              # of Threads   Runtime (hours)
Read Alignment (bwa)                     16             8
Sampe (bwa)                              1              24
Import (samtools)                        1              11
Sort + Index (samtools)                  1              14.5
MarkDuplicates (picardtools) + Index     1              11.5
UnifiedGenotyper* (GATK)                 16             7.5
SomaticIndelDetector (GATK)              1              3
RealignerTargetCreator (GATK)            16             0.8
IndelRealigner* (GATK) + Index           1              17.5
BaseRecalibrator* (GATK)                 1              62
PrintReads* (GATK) + Index + Flagstat    1              25
TOTAL (hours)                                           177

Algorithmic improvement (redesign of MarkDuplicates + MergeBamAlignment): 30-36 hours
6X improvement so far: 4X without major code changes and the rest with code changes.
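The improvement factors quoted for this pipeline follow from the stated totals (177 hours, 44 hours, and 30-36 hours after the algorithmic redesign). The sketch below only restates that arithmetic; the variable names are assumptions for this example.

```python
def speedup(before_hours, after_hours):
    """Ratio of the original runtime to the improved runtime."""
    return before_hours / after_hours


original_total = 177.0        # total of the 177-hour table
parallel_total = 44.0         # total of the 44-hour table
redesign_range = (30.0, 36.0) # stated range after the MarkDuplicates redesign

print(f"parallelism only: {speedup(original_total, parallel_total):.1f}x")
best = speedup(original_total, redesign_range[0])
worst = speedup(original_total, redesign_range[1])
print(f"with redesign:    {worst:.1f}x to {best:.1f}x")
```

The 44-hour total corresponds to the "4X" figure, and the lower end of the 30-36 hour range corresponds to the "6X" figure.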
Profiling: Single Instance Run – Lower Latency
# of Machines = 1
# of cores/Machine = 24
Temporary Storage – RAID0 2x4TB HDD
Input Dataset: G15512.HCC1954.1, coverage: 65x
Average CPU utilization is very low. Most cores are not being used
Average I/O bandwidth is very low. Application is not I/O bound
Average memory footprint is small. Application is not using memory available in newer systems
There is a lot of room to improve
Smith Waterman Acceleration
Working on accelerating two versions of Smith Waterman:
1. Simplified version where gap open, gap extension, and mismatch penalties are identical
2. Affine gap penalty (as implemented in BWA-MEM)
Initial results on #1 seem promising
Speed up measured in terms of
throughput for these runs.
Banded Smith Waterman
implementation
Bitwise parallelism:
Packed32: 32-bit uint
Packed64: 64-bit uint
AVX: 256-bit vector
Xeon Phi: 512-bit vector
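For reference, the simplified variant (version 1, where gap open, gap extension and mismatch penalties are identical) can be written as a plain scalar recurrence. This is a minimal sketch for orientation only: it is neither banded nor bit-parallel, and the scoring values (+1 match, 1 penalty) are assumptions rather than the parameters used in the measured runs.

```python
def smith_waterman_score(a, b, match=1, penalty=1):
    """Best local alignment score with one shared penalty for
    mismatches, gap opens and gap extensions."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else -penalty)
            up = prev[j] - penalty        # gap in b
            left = curr[j - 1] - penalty  # gap in a
            curr[j] = max(0, diag, up, left)
            best = max(best, curr[j])
        prev = curr
    return best


print(smith_waterman_score("GGACGTGG", "TTACGTTT"))
```

Each cell depends only on its left, upper and upper-left neighbours, which is the property the packed-integer and vector implementations listed above exploit.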
Optimizing/Accelerating
Compression – IPP library – HW Acceleration – Custom library
FPGA Acceleration
Genomics - Big Data Problem
Affecting Factors
Cell Response
313 Exabytes
if everyone in the US has
their genes sequenced
495 Exabytes
if every cancer patient in the US has
their genes sequenced every 2 weeks.
Images, Assays and Drug
response data will push it
further up as shown in Blue line
Complex interaction of
varied & changing intrinsic
and extrinsic factors
determine cell response
Source: Knight Cancer Institute, Oregon Health & Science University & Intel
Proliferation
Apoptosis
Differentiation
DNA Repair
Motility
Senescence
With genomic data growing rapidly, hospitals and research centers need to access both local data (not shared) and
centralized public/private data for analysis and analytics in genomic research, development and medicine.
Compute has to be done “where data is” and needs to be consistent locally and in the cloud.
Energy and total cost of operation are key.
Invasion, Metastasis &
therapeutic response
The day when every newborn gets their DNA sequenced is not far away: http://www.nih.gov/news/health/sep2013/nhgri-04.htm.
PairHMM Matrix Dependencies
Wave-Front Computation in AVX
[Figure: matrix cells numbered by anti-diagonal wave-front step: 1; 2 2; 3 3 3; 4 4 4 4; 5 5 5 5]
Pair HMM Acceleration using AVX
• Computation kernel and bottleneck in GATK Haplotype Caller
• AVX enables 8 floating point SIMD operations in parallel
• 2 Ways to vectorize HMM computation
• Intra-Sequence – Parallelize computation within one HMM matrix
operation. Run multiple (8) computations concurrently along diagonal
• Inter-Sequence – Perform multiple (8) HMM matrix operations at once
Configuration                Time (seconds)   Speedup (C++ / Java)
Serial C++                   1540             1x / 9x
1 core with AVX (Intra)      340              4.5x / 40.7x
1 core with AVX (Inter)      285              5.4x / 48.6x
24 cores with AVX (Inter)    14.3             108x / 970x
24 cores hybrid (Inter)      15.7             98x / 882x
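The intra-sequence approach relies on wave-front ordering: every cell on one anti-diagonal depends only on cells from earlier anti-diagonals, so cells on the same diagonal can be computed together. The sketch below shows that traversal order in scalar Python. The recurrence is a toy stand-in (sum of the three neighbours), not the PairHMM equations, and both function names are assumptions.

```python
def antidiagonals(rows, cols):
    """Yield lists of (i, j) cells, one list per anti-diagonal i + j = d."""
    for d in range(rows + cols - 1):
        lo = max(0, d - cols + 1)
        hi = min(rows, d + 1)
        yield [(i, d - i) for i in range(lo, hi)]


def wavefront_fill(rows, cols):
    """Fill a grid in wave-front order using a toy three-neighbour recurrence."""
    grid = [[0] * cols for _ in range(rows)]
    for cells in antidiagonals(rows, cols):
        # Cells in this list are mutually independent; a SIMD kernel
        # could evaluate several of them with one vector instruction.
        for i, j in cells:
            if i == 0 or j == 0:
                grid[i][j] = 1
            else:
                grid[i][j] = grid[i - 1][j] + grid[i][j - 1] + grid[i - 1][j - 1]
    return grid


for step, cells in enumerate(antidiagonals(3, 3), start=1):
    print(step, cells)
```

The inter-sequence approach avoids this diagonal bookkeeping by assigning each vector lane to a different HMM matrix, which may explain why it is slightly faster in the timings above.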
Policy – United States, European
Union
Snapshot of US, EU Recommendations
Develop an ICT-enabled European Strategy for Personalised
Medicine
2014-2020
Driving research to unleash the potential of ICT at the point-of-care
EU R&D initiatives must address:
• Interoperability of technical standards for managing and sharing sequence data in
research and clinical samples;
• Development of hardware, software and workflow algorithms to accelerate cost
efficient analysis of genetic abnormalities that cause cancer and other complex
diseases;
• Research to ensure convergence of Big Data and Cloud Computing infrastructure to
meet the requirements of High Performance Computing and data throughout the life
sciences and healthcare value chains
The eHealth Action Plan 2020 should include Personalised Medicine as a
priority
• Gain knowledge of the challenges and barriers (technical, organizational, legal and
political) to the adoption of ICT in support of Personalised Medicine leveraged by
genomic information;
• Evaluate how to change workflows and education requirements to facilitate adoption
of ICT mediated personalized medicine in clinical practice;
• Expand collaboration with other regions of the world in matters of common interest,
e.g. by leveraging the eHealth MoU with the United States of America;
• Study, evaluate and disseminate technology neutral risk assessment frameworks for
data privacy and security, covering the entire ICT enabled Personalised Medicine
delivery chain;
• Develop effective methods for enabling the use of medical information for public health
and research
Intel Assets for Life Sciences
Intel Xeon E5
• Up to 80% greater performance
• Up to 70% more energy efficiency
• Up to 30% less network latency
• Hardware-accelerated security (AES-NI)
• Broad industry adoption
• Consistent performance gains each generation
Intel Xeon Phi
• Performance and programmability for highly-parallel workloads
• Programming continuity and scalable parallel programming models: common source code and software tools between multicore Intel® Xeon® and manycore Intel® Xeon Phi™
• Partner ecosystem continues growing and making progress
Intel Software
• Intel® Cluster Studio XE compilers, libraries, analysis tools, OpenMP and MPI
• Intel® Hadoop Distribution
• Intel® Data Center Manager and Intel® Node Manager (NM)
• Intel® Expressway Service Gateway for Cloud usage models
Intel Fabric
• Intel® True Scale Fabric designed from the ground up for HPC
• QDR-40 and QDR-80 deliver performance that scales - high MPI message rates and end-to-end latency that stays low at scale
• Optimized support for Intel® Xeon® E5 and Xeon® Phi processors
• Intel Fabric Suite – IB Fabric Management & FastFabric Management tools
Intel Storage
• Intel® Xeon® processors and platforms are enabled with beneficial storage optimizations
• Solid State Drives (SSD) and other NVM technologies improve storage performance
• Intel® Cache Acceleration Software
• Intel’s open source Lustre file-system support/development and Chroma management/provisioning tools
Summary
• Enabling ecosystem of partners to innovate and make
Personalized Medicine vision a reality
• Delivering hardware-enhanced capabilities and software to
accelerate science, translate results, deliver today.
• Looking for collaboration opportunities to take
Personalized Medicine mainstream by 2020
• Big Data/Analytics in Health & Life Sciences
• www.intel.com/healthcare
• hadoop.intel.com
INFORMATION IN THIS DOCUMENT IS PROVIDED “AS IS”. NO LICENSE, EXPRESS OR IMPLIED, BY
ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS
DOCUMENT. INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR
IMPLIED WARRANTY, RELATING TO THIS INFORMATION INCLUDING LIABILITY OR WARRANTIES RELATING
TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT,
COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.
Software and workloads used in performance tests may have been optimized for performance only on Intel
microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer
systems, components, software, operations and functions. Any change to any of those factors may cause the results
to vary. You should consult other information and performance tests to assist you in fully evaluating your
contemplated purchases, including the performance of that product when combined with other products.
Copyright © , Intel Corporation. All rights reserved. Intel, the Intel logo, Xeon, Core, VTune, and Cilk are trademarks
of Intel Corporation in the U.S. and other countries.
Optimization Notice
Intel’s compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that
are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and
other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on
microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended
for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for
Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information
regarding the specific instruction sets covered by this notice.
Notice revision #20110804
Legal Disclaimer & Optimization Notice
Copyright© 2012, Intel Corporation. All rights reserved.
*Other brands and names are the property of their respective owners.
49

History and Development of Pharmacovigilence.pdf
 
Monoclonal antibody production by hybridoma technology
Monoclonal antibody production by hybridoma technologyMonoclonal antibody production by hybridoma technology
Monoclonal antibody production by hybridoma technology
 
ANTI-DIABETICS DRUGS - PTEROCARPUS AND GYMNEMA
ANTI-DIABETICS DRUGS - PTEROCARPUS AND GYMNEMAANTI-DIABETICS DRUGS - PTEROCARPUS AND GYMNEMA
ANTI-DIABETICS DRUGS - PTEROCARPUS AND GYMNEMA
 

Intel big data analytics in health and life sciences personalized medicine

  • 1. The Age of Data-Driven Personalized Medicine Ketan Paranjape Worldwide Director, Health & Life Sciences Intel Corporation www.intel.com/healthcare/bigdata
  • 2. Notice and Disclaimers • Notice: This document contains information on products in the design phase of development. The information here is subject to change without notice. Do not finalize a design with this information. Contact your local Intel sales office or your distributor to obtain the latest specification before placing your product order. • INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL® PRODUCTS. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY RELATING TO SALE AND/OR USE OF INTEL PRODUCTS, INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT, OR OTHER INTELLECTUAL PROPERTY RIGHT. Intel products are not intended for use in medical, life saving, or life sustaining applications. Intel may make changes to specifications, product descriptions, and plans at any time, without notice. • All products, dates, and figures are preliminary for planning purposes and are subject to change without notice. • Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined.“ Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them. •Performance tests and ratings are measured using specific computer systems and/or components and reflect the approximate performance of Intel products as measured by those tests. Any difference in system hardware or software design or configuration may affect actual performance. • The Intel products discussed herein may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request. 
• Knights Corner, Knights Ferry, Aubrey Isle and other code names featured are used internally within Intel to identify products that are in development and not yet publicly announced for release. Customers, licensees and other third parties are not authorized by Intel to use code names in advertising, promotion or marketing of any product or services and any such use of Intel's internal code names is at the sole risk of the user. • Copies of documents which have an order number and are referenced in this document, or other Intel literature, may be obtained by calling 1-800-548-4725, or by visiting Intel's website at http://www.intel.com. • Intel®, Itanium®, Xeon®, Pentium®, and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. • Copyright © 2011-13, Intel Corporation. All rights reserved. • *Other names and brands may be claimed as the property of others.
  • 3. Compute for Personalized Medicine a.k.a Big Data Analytics in Healthcare and Analytics
  • 5. Regional Health Information Network RHIN – China (Jinzhou, Pop 3M) • Challenge: RHIN has challenges with scalability, performance and maintenance. Data storage is expensive • Solution: EMR data and healthcare services running on Intel Hadoop Distribution and Xeon E5 servers. • Benefits: High performance and scalability demonstrated via POC and stress testing. Significantly reduced storage cost • 1/5 Reduction in Response Time; 5x Concurrent Users Data processing flow of RHIN platform http://hadoop.intel.com/pdfs/IntelChinaHealthyCityAnalyticsCaseStudy.pdf
  • 6. GE-Medical Quality Improvement Consortium (MQIC) • Challenge – Gaining value from data in EMRs/EHRs and other digital health information tools • Solution – De-identified data from Centricity EMRs; analytics capabilities to enhance quality and reporting activities. • 1.6 billion documents representing 30 million de-identified patient records and 209 million office visits. • Benefits – Physician practices and ambulatory care clinics deliver their best care more efficiently, along with population-based research and public health activities. DEMO - http://visualization.geblogs.com/visualization/network/
  • 7. NHS Trust – Leeds Teaching Hospitals • Challenge – Capture data at the point of admission and throughout the patient care cycle, use natural language processing (NLP) to make sense of unstructured care notes, and combine them with structured care data for analysis • Solution – Partnering with ISVs (Ascribe, Two 10degrees, Microsoft) and machines powered by the Intel Xeon processor E5 family • Records covering 30M patients and more than 7M attendances each year • Benefits – Billing optimizations (doctors log the correct data), resource optimizations (learning patient trends for resource planning) DEMO - http://visualization.geblogs.com/visualization/network/ “The use of big data analysis on our patient care notes enables us to prove things our clinical intuition was telling us. In the new world anecdotal evidence isn’t enough. What we think isn’t sufficient to spend money. We need proof.” Iain MacBrairdy, Business Manager, Emergency Medicine, Leeds Teaching Hospitals
  • 8. Charite “Real-time” Cancer Analysis – Matching proper therapies to patients • Challenge: Real-time analysis of cancer patients using the in-memory SAP HANA Oncolyzer database running on mission-critical Intel Xeon family infrastructure (3.5M data points per patient, up to 20 TB of data per patient) • Solution: Using structured and unstructured data; collecting and analyzing tables used to take up to two days and now takes seconds • Benefits: Improves medical quality in a disruptive way for – Patient – Doctor – Hospital – Research http://moss.ger.ith.intel.com/sites/SAP/SAP%20account%20team%20documents/Marketing/SAP%20HANA/SAPHANA_Charite_case_study_HI.PDF
  • 9. HANA Oncolyzer • Ad-hoc analysis of heterogeneous tumor data for cancer research • Decades of medical records from tens of thousands of patients • Structured and unstructured data (records, time series, free text, etc.) • Solution: – Integrated into a condensed but exhaustive view – On-the-fly analyses (e.g. Kaplan-Meier estimation, cohort statistics) – Includes external data sources (e.g. PubMed, pharmaceutical databases) – Attributes can be native, views, free-text-extracted, or calculated
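The Kaplan-Meier estimation named on this slide can be sketched in a few lines. This is a minimal illustration in plain Python, not the Oncolyzer implementation; the function name `kaplan_meier` and the toy cohort are assumptions made for the example.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: follow-up time per patient
    events:    1 if the event (e.g. death) was observed, 0 if censored
    """
    deaths = Counter(t for t, e in zip(durations, events) if e)
    exits = Counter(durations)  # every patient leaves the risk set at their own time
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical toy cohort: months of follow-up, 1 = event observed, 0 = censored.
curve = kaplan_meier([5, 8, 8, 12, 15], [1, 1, 0, 1, 0])
for time, prob in curve:
    print(time, round(prob, 3))
```

Censored patients still count in the risk set up to their last follow-up, which is why the estimator suits clinical records with incomplete follow-up.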
  • 11. Life Sciences: At the intersection of transformative forces • HPC: Enabling exascale (10^18) computing on massive data sets • Cloud: Helping enterprises build open interoperable clouds • Open Source: Contributing code and fostering ecosystem
  • 12. Genomics Is A Big Data Problem • 313 Exabytes if everyone in the US has their genes sequenced • 495 Exabytes if every cancer patient in the US has their genes sequenced every 2 weeks • A complex interaction of varied and changing intrinsic and extrinsic factors determines cell response [Figure: affecting factors and cell response]
  • 13. Life Sciences: Key Industry Challenges and Solutions • Many (most) applications are single-threaded, single address space. Intel is delivering optimizations working with the open source community and developing an NGS+HPC curriculum • Some algorithms scale quadratically with the size of the problem. Large data sets exceed available memory and storage. Innovations in acceleration, compute, storage, networking, security, and *-as-a-service • International collaboration is an imperative; bioinformatics expertise is scarce • Intel is working closely with the ecosystem to address enterprise-to-cloud transmission of terabyte payloads • Databases are distributed; data is siloed and will likely stay that way. Tools like Hadoop, Lustre, GraphLab, In-Memory Analytics, etc. • Need for Balanced Compute Infrastructure
  • 14. Dell Active Infrastructure for HPC Life Sciences • Challenge: Experiment processing takes 7 days with current infrastructure, which delays treatment for sick patients • Solution: Dell Next Generation Sequencing Appliance – Single Rack Solution – 9 Teraflops of Sandy Bridge Processors – Lustre File Storage – Intel SW tools and engineers • Benefits: RNA-Seq processing reduced to 4 hours • Includes everything you need for NGS - compute, storage, software, networking, infrastructure, installation, deployment, training, service & support • Rack components: Dell HSS (Lustre) (up to 360TB); Dell NSS (NFS) (up to 180TB); Infrastructure: Dell PE, PC & F10; M420 (Compute) (up to 32 nodes) [Rack diagram labels: 2U Plenum, NSS-HA Pair, NSS User Data, HSS Metadata Pair, HSS OSS Pair, HSS User Data. Actual placement in racks may vary.]
  • 15. IBM, CLC bio Genomics Sequencing Analytics Solution • Challenge: Need for processing power and storage capacity in order to correlate the variants in the genome with the relevant patient symptoms • Solution: IBM®, CLC Genomics server SW, Genomics Workbench client SW; Small (48 Cores, 192 GB), Medium, Large (192 Cores, 768 GB) Analytics Solutions • Benefits: – Reference Mapping for 37x coverage human genome – ~9hr (1 node) to ~30mins (37 nodes) – Variant Calling and annotation for 37x coverage – ~40 hrs (1 node) to ~3hrs (23 nodes) • Infrastructure – IBM System x® 3550 M4, E5-2650; 48 CPU cores and 192 GBs of memory to 192 CPU cores and 768 GBs of memory – IBM Storwize® V7000 – CLC Genomics Server 5.0.2 , Workbench 6.0.1 – 7x 3TB SAS 6 Gbps HDD (16 TB usable) http://www-148.ibm.com/bin/newsletter/tool/landingPage.cgi?lpId=6155
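The node counts and run times quoted on this slide imply a speedup and a parallel efficiency that are easy to work out. The sketch below only does that arithmetic on the slide's approximate figures; the helper name `scaling` is an assumption for the example.

```python
def scaling(t_single_hours, t_parallel_hours, nodes):
    """Return (speedup, parallel efficiency) for a run spread over `nodes` nodes."""
    speedup = t_single_hours / t_parallel_hours
    efficiency = speedup / nodes
    return speedup, efficiency


# Figures as quoted on the slide (all approximate).
mapping = scaling(9.0, 0.5, 37)   # reference mapping: ~9 hr on 1 node -> ~30 min on 37 nodes
calling = scaling(40.0, 3.0, 23)  # variant calling:   ~40 hr on 1 node -> ~3 hr on 23 nodes

print(f"mapping: {mapping[0]:.1f}x speedup, {mapping[1]:.0%} efficiency")
print(f"calling: {calling[0]:.1f}x speedup, {calling[1]:.0%} efficiency")
```

Both steps land at roughly half of ideal linear scaling, which is consistent with the deck's later point that parts of these pipelines remain serial.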
  • 16. NGS Appliances BioTeam “SlipStream” • Challenge: Significant IT overhead, limited bioinformatics support, changing landscape • Solution: “Slipstream” Appliance • Benefits: – Minimize lab IT startup costs – Integrate and standardize data management including security, easily traceable results – Adaptable to any Laboratory, Workflow- based Lab Management – Seamless Sequencer Integration • Infrastructure – Dell PowerEdge T620 Desktop Server – 2x Intel Xeon 8 Core Processors (16 cores) – 16x 32GB RAM (512GB), 1x 100GB SSD – 7x 3TB SAS 6 Gbps HDD (16 TB usable)
  • 17. Convey Computing’s Hybrid Core Architecture to Accelerate Algorithms • Challenge: Advances in sequencing technology have significantly increased data generation and require similar computational advances for bioinformatics analysis • Solution: Convey Hybrid-Core (HC) architecture - Intel® x86 microprocessors with a coprocessor comprised of reconfigurable hardware (FPGAs) • Benefits: Accelerated BWA pipeline up to 18x compared to a standard x86 system • Project Characteristics: HC-1: Intel L5408, Xilinx Virtex-5 FPGAs, 1TB SATA disks HC-2: Intel X5670, Xilinx Virtex-5 FPGAs, 1TB SATA disks HC-2ex: 128GB (host), 64GB (coprocessor), 1TB SATA disks
  • 18. Genomics & Health Analytics Appliances • Scale through independent solutions, each targeting a different segment & usage model [Rack diagram; actual placement in racks may vary]
  • 19. Ultra High-Speed Networking Optimizations – Aspera Labs • Challenge: Improving big data transfer to and from the backend data center • Solution: Optimize ultra high-speed (10 Gbps and beyond) data transfer solutions built on Aspera’s FASP ™ transport technology and Intel’s innovative hardware platform • Benefits with Intel Xeon E5-2600 (DDIO, SR-IOV) – 300% improvement in Aspera transfer throughput – Same transfer speed performance in both physical and virtualized computing environments – Both LAN and WAN transfer speeds had similar results • Infrastructure and Data Characteristics: – Xeon E5 2687, 32GB DDR3 with Non-Uniform Memory Access (NUMA) Data Direct IO (DDIO), Intel 910 SSD, Intel 82599EB 10 GbE – Aspera Enterprise server 3.1.1.66573, Aspera Performance Automation Suite
  • 20. High-Performance Interconnect (InfiniBand) and HPC – Intel® True Scale Fabric • Challenge: Can high performance interconnect technology (InfiniBand) keep up with the increase in the number of processor cores? • Workloads: VASP, WIEN2K • Benchmarks: MVAPICH (MPI over InfiniBand), IMPI (Intel MPI) • Results: – Scale-up research – 5- to 10-fold improvement in run time when scaling from a single node to 16 nodes – Intel® True Scale Fabric QDR-40 shows excellent price/performance results • Infrastructure and Data Characteristics: – 1 Head + 16 compute nodes, Dual Xeon® E5 2680 2.7GHz p/node – 32GB of RAM 1666MHz p/node – RHEL, Compiler, MPI variations available – Intel® Cluster Suite, Intel® Fabric Suite
  • 21. Data Life Cycle Management with iRODS – EMC, RENCI 4.3M WGS for all US newborns/yr. ~= 100 PB* Can I describe it? Can I find it? Can I access it? Can I move it? * Chris Mason, Weill Cornell Medical College, WGA Mtg. Nov. 2012
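The slide's estimate of roughly 100 PB per year for 4.3M newborn genomes implies a per-genome storage footprint. The sketch below just divides the two quoted figures, assuming decimal units (1 PB = 10^15 bytes, 1 GB = 10^9 bytes); the variable names are illustrative.

```python
PB = 10**15  # decimal petabyte (assumption: the slide does not say decimal vs. binary)
GB = 10**9   # decimal gigabyte

newborns_per_year = 4_300_000  # "4.3M WGS for all US newborns/yr."
total_bytes = 100 * PB         # "~= 100 PB"

per_genome_gb = total_bytes / newborns_per_year / GB
print(f"~{per_genome_gb:.0f} GB per whole-genome sequence")
```

The result is on the order of tens of gigabytes per genome, which is the scale at which the slide's questions about describing, finding, accessing and moving data start to matter.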
  • 22. High Performance Scale-out Storage for Wellcome Trust Sanger Institute • Challenge: Exponential increases in the volume of data being generated, while storage budgets are flat or growing slowly. Large data sets are difficult to proactively manage, and can easily overwhelm storage resources. Un-optimized storage has a direct, negative impact on application performance, slowing the time to breakthrough results. • Solution: Exploit the power and scale of HPC-class storage, powered by Intel® Enterprise Edition for Lustre* software, for unprecedented performance with unmatched management simplicity. • Benefits of storage solutions powered by Intel EE for Lustre software: – Openness – Developed and enhanced by the Lustre experts – Global namespace – all clients can access all data – Performance – Upwards of 1 TB/s – Virtually unlimited file system and per file sizes – Management simplicity using Intel® Manager for Lustre*
  • 23. Heterogeneous Clusters for Biomedical Computing at Virginia Bioinformatics Institute (VBI) • Challenge: Rapid data growth and the need to run varied applications call for scalable infrastructure and novel computing approaches. • Solution: A combination of Intel® Xeon®, Intel® True Scale QDR InfiniBand and SGI’s InfiniteStorage platform was deployed to deliver a 300% speedup. The overall reduction in cost allowed the purchase of additional compute nodes. • VBI Cluster – Symmetric multiprocessing (SMP) nodes (large memory Xeon E7) with 1 TB of RAM, massively parallel processing (MPP) nodes (Xeon E5) with 64 GB. 50 PB of tape storage, 600 TB of HDD. Using SGI’s IS16000 platform and Intel TrueScale fabric, VBI moves data through the storage systems at 2 GB per second. “The amazing thing is that we see almost a three times performance increase on 48 nodes compared to 56 nodes of the previous generation, even though the processors are slightly slower clock speed. The Intel® QuickPath Interconnect and Intel® TrueScale Fabric have had a big impact.” Dr. Kevin Shinpaugh, Director of IT and HPC, Virginia Bioinformatics Institute
  • 24. Top-5 Pharmaceutical Company - SAS Grid • Challenge: Need to accelerate and optimize “time to results” clinical trial simulation environment; resource allocation and job prioritization was manual/ad-hoc • Solution: “Scale-Out” architecture: – SAS Visual Analytics, Enterprise Miner, Grid Manager – Red Hat Enterprise Linux – Xeon E5 servers (HP) • Benefits: Clinical trial simulation exercises reduced from hours to < 5 minutes; registration decisions accelerated with multi-hundred million USD impact http://www.intel.com/content/www/us/en/cloud-computing/cloud-computing-xeon-e5-carestream-imaging-brief.html
  • 25. Mitsui Knowledge Industry (MKI) • Challenge: Reduce the amount of time it takes to do complete genomic analysis and deliver results to patients • Solution: Real-Time Big Data Platform – R (Revolution Analytics) – SAP HANA – Hadoop • Benefits: Genomic analysis shortened from several days to 20 minutes; performance for some queries improved 400,000 X http://www.intel.com/content/www/us/en/cloud-computing/cloud-computing-xeon-e5-carestream-imaging-brief.html http://www.saphana.com/docs/DOC-3641
  • 26. Data-Intensive Discovery: Genomics, Intel Distribution • Value: – Enable researchers to discover biomarkers and drug targets by correlating genomic data sets – 90% gain in throughput; 6X data compression • Analytics: – Provide curated data sets with pre-computed analysis (classification, correlation, biomarkers) – Provide APIs for applications to combine and analyze public and private data sets • Data Management: – Use Hive and Hadoop for query and search – Dynamically partition and scale HBase – 10-node cluster / Intel Xeon E5 processors – 10GbE network
  • 27. Problem Statement: Back in 2008 a genome research team faced compute and scalability issues in comparing all pairs of 4 million proteins; the BLAST search results overwhelmed a single database table. Today they need to compare 14 million proteins, and this requirement cannot be addressed with existing technology. • Solution: Intel Distribution for Hadoop (IDH), MapReduce, HBase, Hive • Benefits: Ability to compare 14 million proteins and more, reducing the processing time from days to hours. • Project Characteristics: – Hadoop: 5-node cluster – Storage: 16TB (internal storage) per server – Servers: Xeon E5, 2 socket, 8 cores, 64GB RAM – SLA: reduce processing time from 30 days to less than a day and scale to 4x4 million samples comparison – Data: multi-terabyte database [Diagram: Big Data, Bioinformatics Team website, BLAST program, genome data, protein comparison, high performance scalable Hadoop/HBase cluster]
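To see why going from 4 million to 14 million proteins is more than a 3.5x problem, it helps to count the comparisons. The sketch below assumes "all pairs" means unordered pairs of distinct proteins, which the slide does not spell out; `all_pairs` is a name chosen for the example.

```python
def all_pairs(n):
    """Number of unordered pairs among n items: n choose 2."""
    return n * (n - 1) // 2


pairs_2008 = all_pairs(4_000_000)
pairs_now = all_pairs(14_000_000)
growth = pairs_now / pairs_2008

print(f"{pairs_2008:,} pairs -> {pairs_now:,} pairs ({growth:.2f}x)")
```

Under that assumption the work grows about 12x for a 3.5x increase in input, which is the quadratic scaling the deck lists among its key industry challenges.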
  • 28. High Throughput Science: Embracing Cloud-based Analytics • Challenge: Team of cancer researchers had to screen a drug concept with a list of tens of millions of molecules working with a tight deadline, a fixed budget, and strict security and compliance requirements. Schrödinger’s existing in-house servers would be tied up for weeks • Solution: Schrödinger leveraged software from AWS partner, Cycle Computing, to provision a fully secured cluster of 50,000 cores, powered by the Intel® Xeon® processor E5 family. – This configuration enabled the team to run 16 million molecular simulations an hour. – Developed 1000 molecule list in < 8hrs.
  • 29. High Throughput Science: Large Scale Computational Chemistry Simulation • Challenge: Sustaining access to 50,000+ compute cores for large scale computational chemistry simulation results in under a week. Ability to monitor and re-launch jobs, no additional capital expenditure with internal HPCC already running at capacity. • Solution: Novartis leveraged software from AWS partner, Cycle Computing, and MolSoft to provision a fully secured cluster of 30,000 CPUs, powered by the Intel® Xeon® processor E5 family. – Completed screening of 3.2 million compounds in approximately 9 hrs, compared to 4-14 days on existing resources. [Figure: Virtual Screening]
  • 30. Goals and Current applications target • Focus on improving genomics pipelines • Optimize individual applications • Work with code authors to release optimizations • Intel® Xeon® processor focus – Selectively experiment with Intel® Xeon Phi™ coprocessor
      Domain: Genomics
      Application | Intel® Architecture Target
      Bowtie 1*, Bowtie 2* | Xeon® processor
      BWA* | Xeon® processor
      BLAST* | Xeon® processor
      GATK* | Xeon® processor
      HMMER* | Xeon® processor, Xeon® Phi™ coprocessor
      Abyss* | Xeon® processor
      Velvet* | Xeon® processor
      *Other names and brands may be claimed as the property of others.
  • 31. TGen* RNA sequencing pipeline • Partnership between Intel®, DELL*, TGen* [Chart: 1.8x**] **2-socket Intel® Xeon® CPU E5-2687W / 3.1 GHz *Other names and brands may be claimed as the property of others.
  • 32. Goals and Current applications target • Optimize for Intel® Xeon® processor and Intel® Xeon Phi™ coprocessor (node and cluster) • Increase availability of applications on Intel® Xeon Phi™ coprocessor • Work with code authors to release optimizations
      Domain: Molecular Dynamics/Chemistry
      Applications: AMBER*, NAMD*, GROMACS*, GAMESS*, Quantum Espresso*, Gaussian*, VASP*, CP2K*, QBOX*, CPMD*, LAMMPS*
      Intel® Architecture Targets: Xeon® processor, Xeon® Phi™ coprocessor
      *Other names and brands may be claimed as the property of others.
  • 33. Intel® Xeon® processor: new platforms, architecture improve Life Science applications 2-socket “Ivybridge vs. Sandybridge” Ivybridge 12c/24T 2.7Ghz, Sandybridge 8c/16T 2.9GHz http://www.intel.com/content/www/us/en/benchmarks/server/xeon-e5-2600-v2/xeon-e5-v2-hpc-life-sciences.html *Other names and brands may be claimed as the property of others.
  • 34. Optimizing/Accelerating the DNA Pipeline • Compression – IPP library – HW Acceleration – Custom library • FPGA Acceleration
  • 35. Incorporating Intel IPP Deflater into Picard Tools
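BAM files are DEFLATE-compressed, so the deflater is on the critical path of every Picard write. The sketch below uses Python's standard `zlib` module purely as a stand-in to show the speed/size knob a deflater exposes; it is not the Intel IPP library or Picard's API, and the input buffer is a hypothetical, highly repetitive stand-in for read data.

```python
import zlib

# Hypothetical stand-in for a block of read data (highly repetitive on purpose).
block = b"ACGTTGCA" * 8192

fast = zlib.compress(block, 1)   # level 1: favors speed
small = zlib.compress(block, 9)  # level 9: favors compressed size

# Any DEFLATE implementation must round-trip to the identical bytes.
restored_fast = zlib.decompress(fast)
restored_small = zlib.decompress(small)

print(len(block), len(fast), len(small))
```

Because every level produces a standard DEFLATE stream, a faster deflater can be swapped in without changing what downstream readers see, which is presumably what makes a drop-in replacement attractive here.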
  • 36. Picard MarkDuplicates Optimizations • Two-Fold Approach:
      1. Added optional tag ‘MC’ to SAM Specification
          – Tag ‘MC’ is used to store Mate Cigar for a Paired Read, where mate is mapped.
          – SAM JDK extended to support tag ‘MC’
          – MergeBamAlignment modified to include the new ‘MC’ tag within each relevant record of the SAM/BAM file
      2. Redesign of MarkDuplicates
          – Inclusion of ‘MC’ tag provides opportunity for algorithmic redesign of MarkDuplicates
          – Overall speedup ~2x for MergeBamAlignment/MarkDuplicates
      • Additional Gains: Enables streaming of records for the entire pre-GATK phase (from ‘bwa mem’ to ‘MarkDuplicates’) in a typical bwa_mem+GATK workflow
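Why the mate's CIGAR helps: duplicate marking keys each pair on the unclipped 5' coordinate of both ends, and that coordinate can be derived from a position plus a CIGAR string alone. The sketch below is an illustrative pure-Python version of that calculation, not Picard's code; the function name and the example values are assumptions.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # operations that advance along the reference
CLIPS = set("SH")             # soft and hard clips


def unclipped_five_prime(pos, cigar, reverse):
    """Unclipped 5' reference coordinate of a read.

    pos:     1-based leftmost aligned position (SAM POS, or PNEXT for the mate)
    cigar:   CIGAR string (the read's own, or the mate's from the MC tag)
    reverse: True if the read maps to the reverse strand
    """
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    if not reverse:
        # Forward strand: 5' end is on the left, so subtract leading clips.
        leading = 0
        for n, op in ops:
            if op not in CLIPS:
                break
            leading += n
        return pos - leading
    # Reverse strand: 5' end is on the right, so add the aligned span and trailing clips.
    ref_len = sum(n for n, op in ops if op in REF_CONSUMING)
    trailing = 0
    for n, op in reversed(ops):
        if op not in CLIPS:
            break
        trailing += n
    return pos + ref_len - 1 + trailing


forward_key = unclipped_five_prime(100, "5S45M", reverse=False)
reverse_key = unclipped_five_prime(200, "40M2D8M2S", reverse=True)
```

With the mate's CIGAR available in ‘MC’, the same function can be applied to the mate's position from the current record, so both ends of the pair key are known without first locating the mate record.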
  • 37. MarkDuplicates
      • BASELINE (each read record RdX_1, RdY_1, RdX_2, RdY_2 carries only its own Cigar):
          1) Store information per each read: used to determine unclipped 5’ coordinate for both ends & orientation of pair
          2) Sort reads within the entire file by unclipped 5’ coordinate and MarkDuplicates
          3) Write out BAM file
      • OPTIMIZED (each read record carries its own Cigar plus the mate’s Cigar in MC, so PairX and PairY can be resolved from either read):
          1) Sort reads within a small window by unclipped 5’ coordinate: – MarkDuplicates – Write out
  • 38. DNA Pipeline: BWA+GATK: Whole Genome Sample: ~65x Coverage • Collaborating with our Partners and Medical community • Process level Parallelism • Thread-level Parallelism • Algorithmic Improvement
      Step | # of Threads | Runtime (hours)
      Read Alignment (bwa mem) | 24 | 7
      View (samtools) | 24 | 2
      Sort + Index (samtools) | 24 | 3
      MarkDuplicates (picardtools) + Index | 1 | 11
      RealignerTargetCreator (GATK) | 24 | 1
      IndelRealigner* (GATK) + Index | 24 | 6.5
      BaseRecalibrator (GATK) | 24 | 1.3
      PrintReads* (GATK) + Index + Flagstat | 24 | 12.3
      TOTAL (hours) | | 44

      Step (Tool) | # of Threads | Runtime (hours)
      Read Alignment (bwa) | 16 | 8
      Sampe (bwa) | 1 | 24
      Import (samtools) | 1 | 11
      Sort + Index (samtools) | 1 | 14.5
      MarkDuplicates (picardtools) + Index | 1 | 11.5
      UnifiedGenotyper* (GATK) | 16 | 7.5
      SomaticIndelDetector (GATK) | 1 | 3
      RealignerTargetCreator (GATK) | 16 | 0.8
      IndelRealigner* (GATK) + Index | 1 | 17.5
      BaseRecalibrator* (GATK) | 1 | 62
      PrintReads* (GATK) + Index + Flagstat | 1 | 25
      TOTAL (hours) | | 177

      • 6X improvement so far: 4X without major code change and the rest with code changes. • Redesign of Mark Duplicates + Merge Bam Align: 30-36 hours
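The "4X" and "6X" figures on this slide follow from the stated totals. The sketch below reproduces that arithmetic, assuming the 177-hour table is the original run, the 44-hour table is the 24-thread run, and 30-36 hours is the range after the MarkDuplicates and MergeBamAlignment redesign; the variable names are illustrative.

```python
# Per-step runtimes (hours) from the 24-thread table on the slide.
threaded_steps = {
    "Read Alignment (bwa mem)": 7,
    "View (samtools)": 2,
    "Sort + Index (samtools)": 3,
    "MarkDuplicates (picardtools) + Index": 11,
    "RealignerTargetCreator (GATK)": 1,
    "IndelRealigner (GATK) + Index": 6.5,
    "BaseRecalibrator (GATK)": 1.3,
    "PrintReads (GATK) + Index + Flagstat": 12.3,
}

baseline_total = 177       # hours, stated total of the mostly single-threaded run
threaded_total = 44        # hours, stated total of the 24-thread run
redesign_range = (30, 36)  # hours, stated range after the algorithmic redesign

speedup_threads = baseline_total / threaded_total
speedup_redesign = [baseline_total / hours for hours in redesign_range]

print(f"sum of threaded steps: {sum(threaded_steps.values()):.1f} h")
print(f"threading alone: {speedup_threads:.1f}x")
print(f"with redesign: {speedup_redesign[1]:.1f}x to {speedup_redesign[0]:.1f}x")
```

In the 24-thread table, the single-threaded MarkDuplicates step accounts for 11 of the 44 hours, which is why the redesign of that step is where the remaining gain comes from.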
  • 39. Profiling: Single Instance Run – Lower Latency • # of Machines = 1 • # of cores/Machine = 24 • Temporary Storage – RAID0 2x4TB HDD • Input Dataset: G15512.HCC1954.1, coverage: 65x • Average CPU utilization is very low: most cores are not being used • Average I/O bandwidth is very low: the application is not I/O bound • Average memory footprint is small: the application is not using the memory available in newer systems • There is a lot of room to improve
  • 40. Smith Waterman Acceleration • Working on accelerating two versions of Smith Waterman: 1. Simplified version where gap open, gap extension, and mismatch penalties are identical 2. Affine gap penalty (as implemented in BWA-MEM) • Initial results on #1 seem promising • Speedup measured in terms of throughput for these runs • Banded Smith Waterman implementation • Bitwise parallelism: – Packed32: 32-bit uint – Packed64: 64-bit uint – AVX: 256-bit vector – Xeon Phi: 512-bit vector
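For readers unfamiliar with the "simplified" variant, the sketch below is a scalar, banded Smith-Waterman scorer in which one penalty value is charged for a mismatch, a gap open and a gap extension alike. It is a plain-Python illustration of the recurrence only; it does not use the bit-packed or vector techniques the slide describes, and the function name and default scores are assumptions.

```python
def banded_smith_waterman(a, b, band=8, match=1, penalty=1):
    """Best local alignment score, restricted to cells with |i - j| <= band.

    A single `penalty` is charged for a mismatch, a gap open and a gap
    extension alike. Cells outside the band are treated as 0, which is a
    valid local-alignment restart, so the result never exceeds the
    unbanded score.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        lo = max(1, i - band)
        hi = min(len(b), i + band)
        for j in range(lo, hi + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else -penalty)
            up = prev[j] - penalty       # gap in b
            left = cur[j - 1] - penalty  # gap in a
            cur[j] = max(0, diag, up, left)
            best = max(best, cur[j])
        prev = cur
    return best


score = banded_smith_waterman("ACGTACGT", "ACGTTCGT")
```

Restricting work to a band around the main diagonal reduces the cost from the full matrix to roughly the sequence length times the band width, and the small integer scores of the simplified variant are what make packing many cells into one machine word plausible.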
  • 41. Optimizing/Accelerating • Compression – IPP library – HW Acceleration – Custom library • FPGA Acceleration
  • 42. Genomics - Big Data Problem • 313 Exabytes if everyone in the US has their genes sequenced • 495 Exabytes if every cancer patient in the US has their genes sequenced every 2 weeks. Images, assays and drug response data will push it further up, as shown by the blue line • A complex interaction of varied and changing intrinsic and extrinsic factors determines cell response (proliferation, apoptosis, differentiation, DNA repair, motility, senescence; invasion, metastasis and therapeutic response). Source: Knight Cancer Institute, Oregon Health & Science University & Intel • With genomic data growing rapidly, hospitals and research centers need to access both local (unshared) data and centralized public/private data for analysis and analytics in genomic research, development and medicine. • Compute has to be done “where the data is” and needs to be consistent locally and in the cloud. Energy and total cost of operation are key. • The day when every newborn gets their DNA sequenced is not far away: http://www.nih.gov/news/health/sep2013/nhgri-04.htm.
  • 43. PairHMM Matrix Dependencies • Wave-Front Computation in AVX [Figure: matrix cells labeled 1 to 5 by anti-diagonal wave]
  • 44. Pair HMM Acceleration using AVX • Computation kernel and bottleneck in GATK Haplotype Caller • AVX enables 8 floating point SIMD operations in parallel • 2 ways to vectorize HMM computation: – Intra-Sequence: Parallelize computation within one HMM matrix operation. Run multiple (8) computations concurrently along diagonal – Inter-Sequence: Perform multiple (8) HMM matrix operations at once
      Configuration | Time (seconds) | Speedup (C++/Java)
      Serial C++ | 1540 | 1x / 9x
      1 core with AVX (Intra) | 340 | 4.5x / 40.7x
      1 core with AVX (Inter) | 285 | 5.4x / 48.6x
      24 cores with AVX (Inter) | 14.3 | 108x / 970x
      24 cores hybrid (Inter) | 15.7 | 98x / 882x
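The intra-sequence approach relies on the fact that cells on the same anti-diagonal of the matrix do not depend on each other. The sketch below only enumerates that wave-front order in plain Python; it performs no HMM arithmetic and uses no AVX, and the function name is an assumption.

```python
def wavefronts(rows, cols):
    """Group matrix cells (i, j) by anti-diagonal d = i + j.

    Each cell depends only on (i-1, j), (i, j-1) and (i-1, j-1), which lie
    on waves d-1 and d-2, so all cells within one wave can be computed
    concurrently (for example, 8 at a time in one 256-bit AVX register of
    single-precision floats).
    """
    waves = []
    for d in range(rows + cols - 1):
        i_lo = max(0, d - cols + 1)
        i_hi = min(rows - 1, d)
        waves.append([(i, d - i) for i in range(i_lo, i_hi + 1)])
    return waves


waves = wavefronts(2, 3)
for number, cells in enumerate(waves, start=1):
    print(number, cells)
```

Waves near the corners of the matrix hold fewer cells than the vector width, so some lanes sit idle there; that may be part of why the inter-sequence variant, which always fills all 8 lanes with different matrices, is faster in the table above.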
  • 46. Policy – United States, European Union • Snapshot of US, EU Recommendations
      • Develop an ICT-enabled European Strategy for Personalised Medicine 2014-2020: Driving research to unleash the potential of ICT at the point-of-care. EU R&D initiatives must address:
          – Interoperability of technical standards for managing and sharing sequence data in research and clinical samples;
          – Development of hardware, software and workflow algorithms to accelerate cost efficient analysis of genetic abnormalities that cause cancer and other complex diseases;
          – Research to ensure convergence of Big Data and Cloud Computing infrastructure to meet the requirements of High Performance Computing and data throughout the life sciences and healthcare value chains
      • The eHealth Action Plan 2020 should include Personalised Medicine as a priority:
          – Gain knowledge of the challenges and barriers (technical, organizational, legal and political) to the adoption of ICT in support of Personalised Medicine leveraged by genomic information;
          – Evaluate how to change workflows and education requirements to facilitate adoption of ICT mediated personalized medicine in clinical practice;
          – Expand collaboration with other regions of the world in matters of common interest, e.g. by leveraging the eHealth MoU with the United States of America;
          – Study, evaluate and disseminate technology neutral risk assessment frameworks for data privacy and security, covering the entire ICT enabled Personalised Medicine delivery chain;
          – Develop effective methods for enabling the use of medical information for public health and research
  • 47. Intel Assets for Life Sciences
      • Intel Xeon E5: – Up to 80% greater performance – Up to 70% more energy efficiency – Up to 30% less network latency – Hardware-accelerated security (AES-NI) – Broad industry adoption – Consistent performance gains each generation
      • Intel Xeon Phi: – Performance and programmability for highly-parallel workloads – Programming continuity and scalable parallel programming models: common source code and software tools between multicore Intel® Xeon® and manycore Intel® Xeon Phi™ – Partner ecosystem continues growing and making progress
      • Intel Fabric: – Intel® True Scale Fabric designed from the ground up for HPC – QDR-40 and QDR-80 deliver performance that scales: high MPI message rates and end-to-end latency that stays low at scale – Optimized support for Intel® Xeon® E5 and Xeon® Phi processors – Intel Fabric Suite: IB Fabric Management & FastFabric Management tools
      • Intel Storage: – Intel® Xeon® processors and platforms are enabled with beneficial storage optimizations – Solid State Drives (SSD) and other NVM technologies improve storage performance – Intel® Cache Acceleration Software – Intel’s open source Lustre file-system support/development and Chroma management/provisioning tools
      • Intel Software: – Intel® Cluster Studio XE compilers, libraries, analysis tools, OpenMP and MPI – Intel® Hadoop Distribution – Intel® Data Center Manager and Intel® Node Manager (NM) – Intel® Expressway Service Gateway for Cloud usage models
  • 48. Summary • Enabling ecosystem of partners to innovate and make Personalized Medicine vision a reality • Delivering hardware-enhanced capabilities and software to accelerate science, translate results, deliver today. • Looking for collaboration opportunities to take Personalized Medicine mainstream by 2020 • Big Data/Analytics in Health & Life Sciences • www.intel.com/healthcare • hadoop.intel.com
  • 49. INFORMATION IN THIS DOCUMENT IS PROVIDED “AS IS”. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO THIS INFORMATION INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. Copyright © , Intel Corporation. All rights reserved. Intel, the Intel logo, Xeon, Core, VTune, and Cilk are trademarks of Intel Corporation in the U.S. and other countries. Optimization Notice Intel’s compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.
Notice revision #20110804 Legal Disclaimer & Optimization Notice Copyright© 2012, Intel Corporation. All rights reserved. *Other brands and names are the property of their respective owners.