Amir Sedighi
February 2017
Dark Data
Risks and Opportunities
@amirsedighi
Speaker
Amir Sedighi
Software Engineer

Data Solutions Architect
Founder at recommender.ir
twitter: @amirsedighi
By even the most conservative estimates, the amount
of data in the world doubles every two years.
Data Era
Maybe a Venn diagram can help us!
Dark Data
Tabular/Relational/RDBMS Data
(Structured/Unstructured)
(Almost Unstructured)
(Structured)
Big Data
Can hardly be processed or analyzed
Gartner defines dark data as the information assets
organizations collect, process and store during
regular business activities, but generally fail to use
for other purposes (for example, analytics, business
relationships and direct monetizing).
Similar to dark matter in physics, dark data often
comprises most organizations’ universe of
information assets.
Thus, organizations often retain dark data for
compliance purposes only. Storing and securing
data typically incurs more expense (and sometimes
greater risk) than value.
Dark Data Definition by Gartner
Dark Data - A More Sensible Definition
Organizations generate and gather data.
A large portion of the collected data is never even analyzed!
90% of the data is never analyzed.
• Customer Information
• Log Files
• Former Employee Information
• Old Web Pages
• Sensor Data
• Email Correspondence
• Account Information
• Notes or Presentations
• Old Versions of Relevant Documents
80% to 90% is Dark Data
Does Your Org have any Dark Data?
I am just going to
check if we have
any dark data in
the cellar…
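One rough way to "check the cellar" is to list files that nobody has touched for a long time. The sketch below is only an illustration, not a method from the talk: it assumes that dark data can be approximated by files whose last access and last modification are both older than a chosen threshold. The function name `find_stale_files` and the `max_idle_days` parameter are hypothetical, and access times may be unreliable on file systems mounted with options such as `noatime`.

```python
import os
import time


def find_stale_files(root, max_idle_days, now=None):
    """Return sorted paths under `root` not accessed or modified for `max_idle_days`.

    Assumption for illustration only: "stale" stands in for "dark".
    """
    now = time.time() if now is None else now
    cutoff = now - max_idle_days * 86400  # threshold as a Unix timestamp
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            # Use the most recent of access time and modification time.
            if max(st.st_atime, st.st_mtime) < cutoff:
                stale.append(path)
    return sorted(stale)
```

Calling `find_stale_files("/var/log", 365)` would, under these assumptions, list log files untouched for a year. Such a list is a starting point for discussion, not proof that the data has no value.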
Bringing Dark Data into Light
1. Gathering
2. Storing/Processing
3. Analyzing and bringing it into decisions
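The three steps (gathering, storing/processing, analyzing) can be sketched on a tiny scale. This is a minimal illustration under stated assumptions, not the speaker's implementation: the sample log lines, the log format, and the function names `gather`, `store`, and `analyze` are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical raw log lines standing in for data that is collected but never analyzed.
RAW_LOGS = [
    "2017-02-01 10:00:01 INFO user=alice action=login",
    "2017-02-01 10:00:05 ERROR user=bob action=payment",
    "2017-02-01 10:01:12 INFO user=alice action=logout",
    "malformed line without structure",
]

# Assumed log format: "<date> <time> <LEVEL> user=<name> action=<name>".
LINE_RE = re.compile(r"^(\S+ \S+) (\w+) user=(\w+) action=(\w+)$")


def gather():
    """Step 1: gather raw records (here, from an in-memory sample)."""
    return list(RAW_LOGS)


def store(lines):
    """Step 2: parse lines into structured records; keep unparsed lines aside."""
    records, rejected = [], []
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            ts, level, user, action = match.groups()
            records.append({"ts": ts, "level": level, "user": user, "action": action})
        else:
            rejected.append(line)
    return records, rejected


def analyze(records):
    """Step 3: aggregate records into a summary that can inform a decision."""
    return Counter(record["level"] for record in records)
```

A typical call chain is `records, rejected = store(gather())` followed by `analyze(records)`. Keeping the rejected lines, rather than dropping them, leaves the unparsed portion visible instead of letting it become dark again.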
Question
All companies know data is going to provide value.
Why is there so much dark data?
Why is there so much dark data?
• Lack of insight about data
• Lack of ambition to improve
• Disconnect among departments
• Lopsided priorities
• Lack of technologies to capture and store
• Lack of resources/infrastructure to make it available
• Lack of computing power and techniques to analyze the data
The issues you face with Dark Data
• Legal and Regulatory Issues
• Loss of Reputation
• Intelligence Risk
• Operational Costs
• Opportunity Costs
Some essential questions
• What can we gather?
• What may we extract from it?
• How can we prune it?
• How long should we keep it?
• What are the storage options?
• What are the processing options?
• Approximately how much is each block of data worth?
• Running limited-boundary scenarios
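The questions about pruning, retention, and approximate value can be combined into a simple rule of thumb. The sketch below is an assumption made for illustration, not a policy from the talk: it flags a block of data for pruning when it has been idle longer than a retention window and its estimated value is below its storage cost. The `DataBlock` fields, the value score, and the cost figures are all hypothetical, and legal retention requirements are deliberately left out.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class DataBlock:
    name: str
    last_used: date
    size_gb: float
    est_value: float  # hypothetical value estimate, same unit as the storage cost


def prune_candidates(blocks, today, max_idle_days, storage_cost_per_gb):
    """Return names of blocks that are idle too long AND worth less than they cost."""
    candidates = []
    for block in blocks:
        idle_days = (today - block.last_used).days
        storage_cost = block.size_gb * storage_cost_per_gb
        if idle_days > max_idle_days and block.est_value < storage_cost:
            candidates.append(block.name)
    return candidates


# Hypothetical inventory used only to exercise the rule.
INVENTORY = [
    DataBlock("old_logs", date(2015, 1, 1), 500.0, 10.0),
    DataBlock("customer_emails", date(2014, 6, 1), 20.0, 100.0),
    DataBlock("sensor_feed", date(2017, 1, 15), 1000.0, 5.0),
]
```

With `today=date(2017, 2, 1)`, a 365-day window, and a cost of 0.1 per GB, only `old_logs` is flagged: `customer_emails` is old but valuable, and `sensor_feed` is cheap in value but still in use.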
Software Tools & Frameworks for Dark Data
Log Management
Indexing and Search
Data Streaming
Machine Learning and Graph Processing
• Mahout
• MLlib
• FlinkML
• Theano
• Torch
• TensorFlow
• GraphX
• Gelly
A common Pipeline
Machine Learning
Stream Processing
Query
Already Processed Data
Real-World Real-Time Events
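The slide gives only the labels of the pipeline, so the wiring in the sketch below is an assumption: a query combines a view built from already processed data with a view updated by stream processing as real-time events arrive. The class name `Pipeline`, the event shape, and the counting logic are hypothetical, and the machine learning stage is omitted.

```python
from collections import Counter


class Pipeline:
    """Toy pipeline: batch view + real-time view, merged at query time."""

    def __init__(self, processed_counts):
        # View computed earlier from already processed data.
        self.batch_view = Counter(processed_counts)
        # View updated incrementally by stream processing.
        self.realtime_view = Counter()

    def on_event(self, event):
        """Stream processing step: fold one real-time event into the live view."""
        self.realtime_view[event["key"]] += event.get("count", 1)

    def query(self, key):
        """Query step: answer from both views so recent events are included."""
        return self.batch_view[key] + self.realtime_view[key]
```

For example, a pipeline created with `Pipeline({"login": 100})` that then receives two `{"key": "login"}` events answers `query("login")` with the batch count plus the two new events.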
New Pipeline
Questions?
Keep in touch:
twitter: @amirsedighi
References
1. http://www.gartner.com/it-glossary/dark-data/
2. http://www.itproportal.com/2016/03/07/5-benefits-of-putting-dark-data-to-work/
3. http://www.kdnuggets.com/2015/11/importance-dark-data-big-data-world.html
4. https://www.youtube.com/watch?v=_fBMmQo-Z4E
5. http://confluent.io
6. https://www.ecmconnection.com/doc/the-various-shades-of-dark-data-0001
7. https://www.datanami.com/2015/11/30/spark-streaming-what-is-it-and-whos-using-it/

 

Dernier (20)

Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Research
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
 
➥🔝 7737669865 🔝▻ Thrissur Call-girls in Women Seeking Men 🔝Thrissur🔝 Escor...
➥🔝 7737669865 🔝▻ Thrissur Call-girls in Women Seeking Men  🔝Thrissur🔝   Escor...➥🔝 7737669865 🔝▻ Thrissur Call-girls in Women Seeking Men  🔝Thrissur🔝   Escor...
➥🔝 7737669865 🔝▻ Thrissur Call-girls in Women Seeking Men 🔝Thrissur🔝 Escor...
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysis
 
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men 🔝mahisagar🔝 Esc...
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men  🔝mahisagar🔝   Esc...➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men  🔝mahisagar🔝   Esc...
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men 🔝mahisagar🔝 Esc...
 
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
 
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night StandCall Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
 
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night StandCall Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Detecting Credit Card Fraud: A Machine Learning Approach
Detecting Credit Card Fraud: A Machine Learning ApproachDetecting Credit Card Fraud: A Machine Learning Approach
Detecting Credit Card Fraud: A Machine Learning Approach
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceBDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
 
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 

Dark data

  • 1. Amir Sedighi February 2017 Dark Data Risks and Opportunities @amirsedighi
  • 2. Speaker: Amir Sedighi, Software Engineer, Data Solutions Architect, Founder at recommender.ir. Twitter: @amirsedighi
  • 3. By even the most conservative estimates, the amount of data in the world doubles every two years. Data Era
  • 4. A Venn diagram may help us! Big Data
  • 5. A Venn diagram may help us! Tabular/ Relational/ RDBMS Data Big Data
  • 6. A Venn diagram may help us! Dark Data Tabular/ Relational/ RDBMS Data Big Data
  • 7. A Venn diagram may help us! Dark Data Tabular/ Relational/ RDBMS Data (Structured/Unstructured) (Almost Unstructured) (Structured) Big Data
  • 8. A Venn diagram may help us! Dark Data Tabular/ Relational/ RDBMS Data (Structured/Unstructured) (Almost Unstructured) (Structured) Big Data Almost can’t be processed or analyzed
  • 9. Gartner defines dark data as the information assets organizations collect, process and store during regular business activities, but generally fail to use for other purposes (for example, analytics, business relationships and direct monetizing). Dark Data Definition by Gartner
  • 10. Gartner defines dark data as the information assets organizations collect, process and store during regular business activities, but generally fail to use for other purposes (for example, analytics, business relationships and direct monetizing). Similar to dark matter in physics, dark data often comprises most organizations’ universe of information assets. Dark Data Definition by Gartner
  • 11. Gartner defines dark data as the information assets organizations collect, process and store during regular business activities, but generally fail to use for other purposes (for example, analytics, business relationships and direct monetizing). Similar to dark matter in physics, dark data often comprises most organizations’ universe of information assets. Thus, organizations often retain dark data for compliance purposes only. Storing and securing data typically incurs more expense (and sometimes greater risk) than value. Dark Data Definition by Gartner
  • 13. Dark Data - A more Sensible Definition
  • 14. Dark Data - A more Sensible Definition Organizations Generate and Gather Data
  • 15. Dark Data - A more Sensible Definition Organizations Generate and Gather Data A large portion of the collected data are never even analyzed!
  • 16. Dark Data - A more Sensible Definition Organizations Generate and Gather Data A large portion of the collected data are never even analyzed! 90% of the data are never analyzed
  • 17. Dark Data - A more Sensible Definition Organizations Generate and Gather Data A large portion of the collected data are never even analyzed! 90% of the data are never analyzed. • Customer Information • Log Files • Previous Employee Information • Previous Webpages • Sensor Data • Email Correspondences • Account Information • Notes or Presentations • Old Versions of Relevant Documents
  • 19. Does Your Org have any Dark Data? I am just going to check if we have any dark data in the cellar…
  • 20. Bringing Dark Data into Light 1. Gathering 2. Storing/Processing 3. Analyzing and Bringing it into decisions
  • 21. Bringing Dark Data into Light
  • 22. Bringing Dark Data into Light
  • 23. Bringing Dark Data into Light
  • 24. Bringing Dark Data into Light
  • 25. Bringing Dark Data into Light
  • 26. Bringing Dark Data into Light
  • 27. Question All companies know data is going to provide value. Why is there so much dark data?
  • 28. Why is there so much dark data? • Lack of insight about data • Lack of ambition to improve • Disconnect among departments • Lopsided priorities • Lack of technologies to capture and store • Lack of resources/infrastructure to make it available • Lack of CPU and techniques to analyze the data
  • 29. The issues you face with Dark Data • Legal and Regulatory Issues • Loss of Reputation • Intelligence Risk • Operation Costs • Opportunity Costs
  • 30. Some essential questions • What can we gather? • What may we extract from it? • How may we prune it? • How long should we keep it? • What are the storage options? • What are the processing options? • How much is each block of data worth (approximately)? • Running limited boundary scenarios
  • 31. Software Tools & Frameworks on DD
  • 32. Software Tools & Frameworks on DD
  • 33. Software Tools & Frameworks on DD Log Management
  • 34. Software Tools & Frameworks on DD Indexing and Search
  • 35. Software Tools & Frameworks on DD Data Streaming
  • 36. Software Tools & Frameworks on DD
  • 37. Software Tools & Frameworks on DD
  • 38. Software Tools & Frameworks on DD Machine Learning and Graph Processing • Mahout • MLlib • FlinkML • Theano • Torch • TensorFlow • GraphX • Gelly
  • 39. A common Pipeline Machine Learning Stream Processing Query Already Processed Data Real World RT Events
  • 40. A common Pipeline Machine Learning Stream Processing Query Already Processed Data Real World RT Events New Pipeline
  • 42. 1. http://www.gartner.com/it-glossary/dark-data/ 2. http://www.itproportal.com/2016/03/07/5-benefits-of-putting-dark-data-to-work/ 3. http://www.kdnuggets.com/2015/11/importance-dark-data-big-data-world.html 4. https://www.youtube.com/watch?v=_fBMmQo-Z4E 5. http://confluent.io 6. https://www.ecmconnection.com/doc/the-various-shades-of-dark-data-0001 7. https://www.datanami.com/2015/11/30/spark-streaming-what-is-it-and-whos-using-it/ References
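
The three steps on slide 20 (gathering, storing/processing, analyzing) can be sketched for one of the dark data sources listed on slide 17, log files. This is only a minimal illustration, not something taken from the deck: the log format, the sample lines, and the function names `gather`, `process`, and `analyze` are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical log format, assumed for illustration only:
# "<date> <time> <LEVEL> <message>"
LOG_LINE = re.compile(r"^(\S+) (\S+) (\w+) (.*)$")


def gather(raw_text):
    """Step 1 (gathering): split raw log text into non-empty lines."""
    return [line for line in raw_text.splitlines() if line.strip()]


def process(lines):
    """Step 2 (storing/processing): parse each line into a structured record."""
    records = []
    for line in lines:
        match = LOG_LINE.match(line)
        if match:  # lines that do not fit the assumed format are skipped
            date, time, level, message = match.groups()
            records.append(
                {"date": date, "time": time, "level": level, "message": message}
            )
    return records


def analyze(records):
    """Step 3 (analyzing): count records per log level."""
    return Counter(record["level"] for record in records)


# Made-up sample data standing in for an unused log file.
sample = """2017-02-01 10:00:01 INFO user login
2017-02-01 10:00:05 ERROR payment timeout
2017-02-01 10:00:09 INFO user logout
"""

summary = analyze(process(gather(sample)))
print(summary.most_common())
```

In practice each step would probably be handled by one of the tool categories named on slides 33 to 38 (log management, indexing and search, data streaming, machine learning); the plain functions here only show how the data changes shape from raw text to records to a summary that can feed a decision.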