BIG DATA SCIENCE
Chandan Rajah – CEO, Parallel AI
“The price of light is far less than the cost of darkness”
BENEFITS OF BIG DATA
COST SPEED
AGILITY CAPABILITY
BIG DATA JOURNEY
WHERE
WHAT WHY
HOW
What is Big Data ?
Big Data ≠ Data Volume
Big Data = Crude Oil
Think of data like ‘Crude Oil’
Big Data is about extracting ‘crude oil’; transporting it in ‘pipelines’; storing it in ‘mega tanks’
Source: Data Science London
What is Data Science ?
Data Science ≠ Statistical Analysis
Data Science = Oil Refinery
Data science is about ‘treating’ data; applying ‘science’ to the data;
refining the ‘results’; and combining them to form ‘insight’
Source: Data Science London
What is the Big Data Science Toolkit ?
• Scala, Java, Python, R… (bonus: Clojure, Haskell, Erlang)
• Hadoop, HDFS, MapReduce… (bonus: Spark, Storm, Tez)
• Scalding, HBase, Hive… (bonus: Shark, Titan, Giraph)
• Flume, Sqoop, ETL, Webscrapers… (bonus: Hume)
• SQL, RDBMS, DW, OLAP… (bonus: SOLR, ElasticSearch)
• KNIME, Weka, RapidMiner… (bonus: SciPy, NumPy, Pandas)
• D3.js, Kibana, ggplot2, Flare… (bonus: Shiny, Flare, Datameer)
• NoSQL, MongoDB, Cassandra, CouchDB
• And sometimes… MS Excel
Source: Data Science London
Knowns, Unknowns & DIKUW FTW!
known knowns
we know we know
known unknowns
we know we don’t know
unknown unknowns
we don’t know we don’t know
D I K U W
DATA: raw; numbers; letters; symbols; signals
INFORMATION: what; description; context; relationship; reports
KNOWLEDGE: how to; experience; tested; instruction; programs
UNDERSTANDING: why; cause & effect; proven; models
WISDOM: when; prediction; what’s best
PAST FUTURE
Data Engineer Data Analyst Data Miner Data Scientist
known knowns
known unknowns unknown unknowns
Source: Data Science London
Business Intelligence to Data Discovery ?
data you know
data you don’t know
questions you’re asking
questions you’re not asking
Data Analyst
Data Scientist
Business
Intelligence
Data Discovery
DATA MODELLING
Y  F( X, random noise, parameters)
ALGORITHMIC MODELLING
Y  [ BLACK BOX ]  X
Source: Applied Data Labs & Leo Breiman
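The contrast between the two modelling styles can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the synthetic data, the straight-line form, and the helper names `fit_line` and `knn_predict` are hypothetical and not from the slides. The first function assumes a formula for Y and estimates its parameters; the second assumes no formula and predicts from nearby observations, standing in for the "black box".

```python
import random


def fit_line(xs, ys):
    """Data modelling: assume Y = a + b*X + noise and estimate a, b by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def knn_predict(xs, ys, x_new, k=5):
    """Algorithmic modelling: no assumed formula; average the Ys of the k nearest Xs."""
    nearest = sorted(zip(xs, ys), key=lambda pair: abs(pair[0] - x_new))[:k]
    return sum(y for _, y in nearest) / len(nearest)


# Synthetic data (an assumption for illustration): Y = 2 + 3*X + Gaussian noise.
rng = random.Random(0)
xs = [i / 10 for i in range(100)]
ys = [2 + 3 * x + rng.gauss(0, 0.5) for x in xs]

a, b = fit_line(xs, ys)
print(f"data model: Y = {a:.2f} + {b:.2f} * X")
print(f"data model prediction at X=5: {a + b * 5:.2f}")
print(f"black box prediction at X=5: {knn_predict(xs, ys, 5.0):.2f}")
```

The data model returns interpretable parameters but is only as good as the assumed formula; the nearest-neighbour predictor makes no such assumption and explains nothing about why its prediction holds.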
BIG DATA JOURNEY
WHERE
WHAT WHY
HOW
Why is Big Data needed ?
VOLUME: Exponential growth; 2x in 2 yrs; PB (1000 TB) is now common
VELOCITY: Event streams; never at rest; 640k GB per internet minute
VARIETY: 100s of data sources; 85% not in a table
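To get a feel for the volume and velocity figures, the arithmetic can be sketched as follows. This is a rough illustration, not a forecast: the function names and the 1 PB starting volume are hypothetical, while the doubling period (2 years), the 640k GB per minute figure, and the 1 PB = 1000 TB conversion are taken from the slide.

```python
GB_PER_PB = 1_000_000  # 1 PB = 1000 TB = 1,000,000 GB, matching the slide's convention


def projected_volume(initial_tb, years, doubling_years=2.0):
    """Volume after `years` if data doubles every `doubling_years` (2x in 2 yrs)."""
    return initial_tb * 2 ** (years / doubling_years)


def pb_per_day(gb_per_minute):
    """Convert a per-minute data rate in GB into PB per day."""
    return gb_per_minute * 60 * 24 / GB_PER_PB


# Hypothetical starting point: 1 PB (1000 TB) today, projected 10 years ahead.
print(f"volume after 10 years: {projected_volume(1000, 10):,.0f} TB")

# The slide's velocity figure: 640k GB per internet minute.
print(f"internet traffic per day: {pb_per_day(640_000):,.1f} PB")
```

Doubling every two years means five doublings in a decade, so the same store grows 32-fold if the trend holds.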
BIG DATA JOURNEY
WHERE
WHAT WHY
HOW
Big Data Heat Map – Gartner 2012
Big Data Potential by Sector – McKinsey for USBLS, 2011
Big Data Investment by Industry – Gartner, 2012
Top Big Data Challenges – Gartner, 2012
CIO Survey on Big Data Investments – IDG Survey, 2013
CIO Survey on Main Drivers to Invest – IDG Survey, 2014
BIG DATA JOURNEY
WHERE
WHAT WHY
HOW
How will Big Data Evolve?
EXTERNAL ALIGNMENT INTERNAL COHERENCE
Align with Existing BI; Maximise Value
Exploit Capability; Respond Rapidly
Focus; Innovate; Stay Ahead
Repeat; Stabilize; Governance
RECAP OF BENEFITS
COST SPEED
AGILITY CAPABILITY
LAST WORDS OF WISDOM
NOT ALL ROADS LEAD TO ROME
TIME VALUE OF DATA
KNOWLEDGE IS POWER
I AM AN INDIVIDUAL
“The price of light is far less than the cost of darkness”


Big Data Science: Intro and Benefits


Editor's notes

  1. COST – 20x less per TB vs. Teradata, Netezza, Oracle; 75% less average marginal cost per capacity
     SPEED – 10x faster than Teradata, Netezza
     AGILITY – 115% lower average cost per data source vs. Oracle
     SCIENCE – Machine learning, prediction
  2. WHAT - What is Big Data Science?
     WHY - Why is it needed?
     WHERE - Where is it being used?
     HOW - How will it evolve?
  3. COST – 20x less per TB vs. Teradata, Netezza, Oracle; 75% less average marginal cost per capacity
     SPEED – 10x faster than Teradata, Netezza
     AGILITY – 115% lower average cost per data source vs. Oracle
     SCIENCE – Machine learning, prediction
  4. TIME VALUE - Yesterday’s data is less valuable than today’s data; historical data is more valuable than current data alone
     POWER - Getting from unknown unknowns to known unknowns or known knowns is powerful
     LEAD TO ROME - Exploring with no direct business impact is not a bad thing
     INDIVIDUAL - Treat every customer as an individual, not an aggregate, and analyse; aggregate only individual insights