Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.
! 
Mohammad Quraishi (IT Senior Principal - Cigna) 
atif71@gmail.com 
! 
Leading a Healthcare Company 
to the 
Big Data Pr...
About me 
•BS in Computer Science and Engineering from University of 
Connecticut 
•In the Healthcare Industry for over 19...
Breakdown of the Hadoop Journey 
1 2 3 
The blowback 
What we 
accomplished 
Roadmap to the 
future 
Lessons Learned 
Ques...
The Elephant in the room 
Image Credit: Guian Bolisay/Flickr 4
What’s the problem? 
5 
We already have a mature data analysis infrastructure
And it looks something like this… 
What we already do 
•We have independent data marts 
•We have the Hub-and-spoke archite...
What is the vision? 
7 
The ability to perform 
•Descriptive, Predictive and Prescriptive Analytics 
! 
Remove the traditi...
Benefits of Big Data 
8 
•Hadoop has the lowest cost per TB ratio of any 
data technology available 
! 
•Getting started w...
9 
Benefits of Big Data 
! 
You don’t have to throw away data anymore!
Vision - Reference Architecture 
Analysis 
Microstrategy 
10 
Logs 
Web 
IVR 
Portal 
Mobile 
NoSQL 
Storing 
weblogs 
Rea...
The Initial Evaluation 
11 
•Vendor Evaluation: Which relationship best fits 
our needs without lock-in? 
! 
•Selection of...
Use Case 1 
12
Use Case 2 
13
Success! 
14 
•Ready to tackle tougher more 
complicated problems 
! 
•Went out looking for more use cases
Ran into misconceptions 
15 
“Let’s use Hadoop as ETL!” 
! 
“Help us move data.” 
! 
“Can we back up data for archiving?” ...
… & Challenges 
16
But Why? 
17 
•Overuse of the words “Big” & “Data” 
! 
•There was an overlap with other tools and 
platforms 
! 
•Hadoop l...
Broader impact - Business Benefits 
18 
!! 
•Building a Customer Persona 
!! 
•Service Ops efficiency 
!! 
•Being Customer...
Broader impact - IT Benefits 
19 
!!! 
•Predictive threat modeling 
!! 
•Data Archival 
!! 
•Network Efficiency
Hadoop and Big Data 
20 
! 
•Big Data = Hadoop + Relational + other 
suitable task related technologies 
! 
•Hadoop is com...
Hadoop is Complementary 
21 
•Hadoop excels at processing and analyzing large volumes of 
distributed, unstructured, struc...
Embrace the Most Important Change: 
Culture 
22 
Democratize your data and 
reap the benefits!
Why is Hadoop Complementary? 
23
What we accomplished? 
24 
•Evangelized Hadoop 
! 
•Linked Hadoop to BI Tools 
! 
•R on Hadoop 
! 
•A fail fast iterative ...
Lambda Architecture as the foundation 
Credit Nathan Marz - Big Data 
25 
The master dataset is the only part of the Lambd...
What we accomplished? 
26 
•ETL - Ingest, Transform and Move patterns 
! 
•Logs generated from consumer channels were 
ing...
A Custom NLP Framework 
27
A Roadmap to the Future 
28
A Roadmap to the Future 
29 
! 
Data Driven Solutions + FP 
! 
! 
“Functional Programming: I came for the 
concurrency, bu...
The Hadoop Stack – Advanced View 
There’s also Workflow Management with Oozie. 
30
Lessons Learned 
31 
•Overuse of the words “Big” & “Data” 
! 
•The overlap 
! 
•Everyone found a use for Hadoop 
! 
•Big C...
Healthcare company needs 
32 
•Security 
! 
•Vendors 
! 
•Vendor Partnerships
WWYS 
33 
“Difficult to see. Always in motion 
is the future…” 
Yoda 
! 
“Many of the truths that we cling to depend on 
o...
34 
Questions? 
! 
Mohammad Quraishi (IT Senior Principal - Cigna) 
atif71@gmail.com
You’ve finished this document.
Download and read it offline.
Upcoming SlideShare
Apache Pig: Making data transformation easy
Next
Upcoming SlideShare
Apache Pig: Making data transformation easy
Next
Download to read offline and view in fullscreen.

Share

Big Data Everywhere Chicago: Leading a Healthcare Company to the Big Data Promised Land -- A Case Study of Hadoop in Healthcare

Download to read offline

Mohammad Quraishi, Senior IT Principal, Cigna

Like Moses seeing the Promised Land from afar, we knew the big data journey would be worth it, but we didn't know how hard it would be. In this talk, I'll delve into the details of our big data and analytics initiative at Cigna,

  • Be the first to like this

Big Data Everywhere Chicago: Leading a Healthcare Company to the Big Data Promised Land -- A Case Study of Hadoop in Healthcare

  1. 1. ! Mohammad Quraishi (IT Senior Principal - Cigna) atif71@gmail.com ! Leading a Healthcare Company to the Big Data Promised Land: !!! A Case Study of Hadoop in Healthcare
  2. 2. About me •BS in Computer Science and Engineering from University of Connecticut •In the Healthcare Industry for over 19 years •Programmer most of my career - Architect, Designer •Worked in the SOA space for a number of years •Lead engineer in the mobile application space •Now Lead engineer in the Big Data Analytics Space - Hadoop ! In my spare time • Love to travel with the family • Video games, music, movies • Community relations work • Fan of College basketball 2
  3. 3. Breakdown of the Hadoop Journey 1 2 3 The blowback What we accomplished Roadmap to the future Lessons Learned Questions? 3 Making the case Vision Architecture
  4. 4. The Elephant in the room Image Credit: Guian Bolisay/Flickr 4
  5. 5. What’s the problem? 5 We already have a mature data analysis infrastructure
  6. 6. And it looks something like this… What we already do •We have independent data marts •We have the Hub-and-spoke architecture, the centralized warehouse 6
  7. 7. What is the vision? 7 The ability to perform •Descriptive, Predictive and Prescriptive Analytics ! Remove the traditional IT barriers separating the business users from insights
  8. 8. Benefits of Big Data 8 •Hadoop has the lowest cost per TB ratio of any data technology available ! •Getting started with Hadoop is fairly inexpensive •“Entry-level” clusters relatively inexpensive •Grow in small steps
  9. 9. 9 Benefits of Big Data ! You don’t have to throw away data anymore!
  10. 10. Vision - Reference Architecture Analysis Microstrategy 10 Logs Web IVR Portal Mobile NoSQL Storing weblogs Real%me'Data'Store'or'event'processing SAS Pentaho R ? Analysis/Modeling'Tools Tableau Spotfire Platfora Cognos ? Data'Science'Tools' Teradata filestack RDBMS External'Hadoop'Output Or'in'HDFS *Use Spark Streaming and append to Hadoop output - Realtime events Live Data Streams Web Analytics Event detection(Storm) Scalding (Scala) MapReduce'Distributed'Programming'Framework HDFS 2 Hadoop'Cluster'running'HDFS'and'MapReduce Includes'Management,'Monitoring'and'Security Visualization Map Reduce jobs Batch Log files to HDFS Data from tables to HDFS CED/ Claims Clinical Data RDBMS SQOOP Back'up'or'copy'data'from'HDFS'to'a' redundant'cluster'for'quick'recovery *'For'future'implementa%on'TBD Realtime Feed Flume Teradata Hive/Impala (SQL) Cascading (Java) Python 1 5 3 7 6 4 Flume SQOOP Edge'Node'For'Hadoop'Client Jobs SQOOP/ Flume Chronos Hadoop' Cluster'#2 Hadoop' Cluster'#3 8
  11. 11. The Initial Evaluation 11 •Vendor Evaluation: Which relationship best fits our needs without lock-in? ! •Selection of use cases for demonstration ! •Visualization of those use cases
  12. 12. Use Case 1 12
  13. 13. Use Case 2 13
  14. 14. Success! 14 •Ready to tackle tougher more complicated problems ! •Went out looking for more use cases
  15. 15. Ran into misconceptions 15 “Let’s use Hadoop as ETL!” ! “Help us move data.” ! “Can we back up data for archiving?” ! ! !
  16. 16. … & Challenges 16
  17. 17. But Why? 17 •Overuse of the words “Big” & “Data” ! •There was an overlap with other tools and platforms ! •Hadoop looked like a swiss army knife ! •Will it take over the world and replace other platforms?
  18. 18. Broader impact - Business Benefits 18 !! •Building a Customer Persona !! •Service Ops efficiency !! •Being Customer Centric !! •Product Efficiency !! •Brand Impact
  19. 19. Broader impact - IT Benefits 19 !!! •Predictive threat modeling !! •Data Archival !! •Network Efficiency
  20. 20. Hadoop and Big Data 20 ! •Big Data = Hadoop + Relational + other suitable task related technologies ! •Hadoop is complementary
  21. 21. Hadoop is Complementary 21 •Hadoop excels at processing and analyzing large volumes of distributed, unstructured, structured and semi-structured data in batch or near real-time fashion for analysis ! •NoSQL databases are adept at storing and serving up multi-structured data in near-real time for web-based applications ! •Massively parallel OLAP databases are best at providing analysis of large volumes of mainly structured data - Teradata ! •SAS/R - Modeling and Business Intelligence ! •Tableau - Visualization
  22. 22. Embrace the Most Important Change: Culture 22 Democratize your data and reap the benefits!
  23. 23. Why is Hadoop Complementary? 23
  24. 24. What we accomplished? 24 •Evangelized Hadoop ! •Linked Hadoop to BI Tools ! •R on Hadoop ! •A fail fast iterative analytics approach
  25. 25. Lambda Architecture as the foundation Credit Nathan Marz - Big Data 25 The master dataset is the only part of the Lambda Architecture that absolutely must be safeguarded from corruption. So for this λ
  26. 26. What we accomplished? 26 •ETL - Ingest, Transform and Move patterns ! •Logs generated from consumer channels were ingested with Flume ! •Standardized on Parquet (Storage) and Snappy (Compression) ! •Lifecycle and organization of Data on HDFS ! •LUKS - dm-crypt — for data at rest encryption ! •Sentry and LDAP for Role Based Access Control
  27. 27. A Custom NLP Framework 27
  28. 28. A Roadmap to the Future 28
  29. 29. A Roadmap to the Future 29 ! Data Driven Solutions + FP ! ! “Functional Programming: I came for the concurrency, but I stayed for the Data Science” Dean Wampler
  30. 30. The Hadoop Stack – Advanced View There’s also Workflow Management with Oozie. 30
  31. 31. Lessons Learned 31 •Overuse of the words “Big” & “Data” ! •The overlap ! •Everyone found a use for Hadoop ! •Big Change/Baby Steps ! •Agility + Process = Cognitive Dissonance
  32. 32. Healthcare company needs 32 •Security ! •Vendors ! •Vendor Partnerships
  33. 33. WWYS 33 “Difficult to see. Always in motion is the future…” Yoda ! “Many of the truths that we cling to depend on our point of view.” Yoda ! The Journey of a thousand miles begins with one cluster…
  34. 34. 34 Questions? ! Mohammad Quraishi (IT Senior Principal - Cigna) atif71@gmail.com

Mohammad Quraishi, Senior IT Principal, Cigna Like Moses seeing the Promised Land from afar, we knew the big data journey would be worth it, but we didn't know how hard it would be. In this talk, I'll delve into the details of our big data and analytics initiative at Cigna,

Views

Total views

1,512

On Slideshare

0

From embeds

0

Number of embeds

13

Actions

Downloads

32

Shares

0

Comments

0

Likes

0

×