December 2nd, 2014
@NasoLuca
The What, Why and How of Big Data
Definitions, Examples, Suggestions, Howtos and much more
Agenda
✤ What is Big Data?
✤ Big Data Examples
✤ How to Tackle a Big Data Problem
✤ Sentiment Analysis
✤ Big Data tools
Part I Part II
How relevant is it?
Big Data
Social Media
Digital Marketing
Machine Learning
Computer Vision
Who’s more relevant to the people?
Let’s ask Google!
Google Trends
From 2007 to end 2014
Big Data Market
Worth $28B in 2012, with $37B expected for 2013
Scattered across a number of IT landscapes. 45% for new
social network analysis and content analytics tools[1]
Jobs to support Big Data
4.4 million IT jobs globally by 2015, 1.9 million in the US[1]
By 2018, the US alone could face a shortage of 200k people
with deep analytical skills, as well as 1.5m managers and
analysts[2]
Definition
Big Data according to Oxford Dictionary[3]:
big data n. Computing (also with capital initials) data of a very
large size, typically to the extent that its manipulation and
management present significant logistical challenges; (also) the
branch of computing involving such data.
Big Data according to Gartner[4]:
Big data is high-volume, high-velocity and high-variety
information assets that demand cost-effective, innovative forms of
information processing for enhanced insight and decision making.
This is where the 3 Vs originated from:
Volume Velocity Variety
VOLUME
About: Amount of data. Unit: bytes
Information about the general population, education, health,
medicine, travel, geographic locations, shopping, financial
transactions, jobs, scientific experiments, emails, sensors,
texts, photos, videos, activity on social networks …
2.5 Exabytes of data are created each day worldwide[5]
Facebook (2012): 200 PB of data each year
In 3 years CERN collected 75 PB of data (with LHC)
Most US companies have 100 TB[5]
1 ZB = 1000² PB = 1000³ TB = 1000⁴ GB
How much is Big Data? > 5 TB (as of 2014)
VELOCITY
About: moving data. Unit: bytes per second
This really has two interpretations:
Data Generation Rate or Data Processing Rate
Every minute (2014)[6]:
200M emails
4M Google searches
277k more tweets
216k pictures on Instagram
What’s the limit to be considered big data?
As of 2014
Generation: time to reach 5TB < Project Life Time
Processing: > 1 MB/s ≈ 5TB/2mo
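The processing threshold can be checked with a little arithmetic. This is a minimal sketch, assuming decimal units (1 TB = 1000⁴ bytes) and a 60-day span for "2 months"; the function name is illustrative only.

```python
TB = 1000 ** 4  # bytes in one terabyte (decimal units, an assumption)
MB = 1000 ** 2  # bytes in one megabyte


def required_rate_mb_per_s(volume_bytes: float, days: float) -> float:
    """Average rate in MB/s needed to process `volume_bytes` within `days` days."""
    seconds = days * 24 * 60 * 60
    return volume_bytes / seconds / MB


# 5 TB in 2 months (assumed to be 60 days) is roughly 1 MB/s
rate = required_rate_mb_per_s(5 * TB, days=60)
print(f"{rate:.2f} MB/s")  # prints "0.96 MB/s"
```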
VARIETY
About: Form of the data.
3 Types: structured, semi-structured, unstructured
1. Structured = Data in a fixed field within a record
(spreadsheets, Relational Database)
2. Semi-Structured = XML, JSON, CSV (Text with columns,
with a separator)
3. Unstructured = Data stored without any model, or that
does not have any organisation
All of them can be Big Data
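To make the three types concrete, here is a small sketch that holds the same made-up record as JSON, as CSV and as free text. The field names and values are invented for illustration; only Python's standard `json` and `csv` modules are used.

```python
import csv
import io
import json

# The same (invented) information in three forms
json_doc = '{"user": "Mike87", "product": "iPhone", "rating": 5}'
csv_doc = "user,product,rating\nMike87,iPhone,5\n"
free_text = "Mike87 says the iPhone is such a nice phone."

# Semi-structured data carries its own field names, so parsing is direct
from_json = json.loads(json_doc)
from_csv = next(csv.DictReader(io.StringIO(csv_doc)))

print(from_json["rating"])  # JSON keeps the number as an int
print(from_csv["rating"])   # CSV values arrive as strings

# Unstructured text has no fields at all: extracting "product" or "rating"
# from free_text needs text analysis, not a parser.
```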
This concludes the classical 3 Vs of Big Data.
To better describe Big Data we can add a couple more Vs.
VERACITY
Lack of accuracy
Data itself is often imprecise or incomplete (typos, empty
fields, errors, source changes, …)
The time of small and tidy samples is over
VALUE
About the actionable insights one can get
People do not need data, they need insights which are hidden
in the data: Value is a concentrated data-juice.
Obtaining correct, but irrelevant, information is a waste of
time, effort and resources.
Close interactions between an analytics team and business
managers can help you address the right questions.
“Datafication” is the movement behind Big Data[7]
What is Big Data? Implications
Big Data implicitly requires 3 paradigm shifts:
1. from “some” to “all”
2. from “clean” to “messy”
3. from “causation” to “correlation”
http://xkcd.com/552/
Big Data Examples
General Application Fields
Not only business: Big Data has implications far beyond
marketing and consumer goods
It will profoundly change how governments work and alter
the nature of politics and our daily life too (smart cities).
When it comes to generating economic growth, providing
public services, or fighting wars, those who can harness big
data effectively will have a significant edge over others.
Forbes thinks that it will influence us in 5 ways[8]:
1. how we spend
2. how we vote
3. how we study
4. how we stay healthy
5. how we keep/lose privacy
1. Fire-prevention @ New York City[7]
Big Data Examples - Real Life Applications
Problem
Imbalance between needs and resources:
too many complaints (25,000 per year), too few inspectors
(200).
You want your inspectors to tackle the most relevant cases
only/first.
How to prioritise the complaints?
Solution
a. Database with information about buildings (crime rates,
ambulance visits, utility usage, missed payments, …)
b. Compare database to records of building fires, looking for
correlations
c. Estimate the probability of fire for each complaint
Result
The efficiency of the inspectors rose from 13% to 70%
Among the predictors of a fire were:
the type of building and the year it was built
permits for exterior brickwork correlated with lower risks
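Steps b and c of the solution can be sketched in a few lines. This is only an illustration, not the city's actual model: the building categories and records are invented, and the "probability" is a plain historical frequency per building type.

```python
from collections import defaultdict

# Hypothetical historical records: (building_type, had_fire)
history = [
    ("pre-war walk-up", True),
    ("pre-war walk-up", True),
    ("pre-war walk-up", False),
    ("modern high-rise", False),
    ("modern high-rise", False),
    ("modern high-rise", True),
    ("modern high-rise", False),
]


def fire_rates(records):
    """Empirical fire frequency for each building type."""
    fires = defaultdict(int)
    totals = defaultdict(int)
    for building_type, had_fire in records:
        totals[building_type] += 1
        fires[building_type] += int(had_fire)
    return {t: fires[t] / totals[t] for t in totals}


def prioritise(complaints, rates, default=0.0):
    """Sort complaints by estimated fire probability, highest first."""
    return sorted(
        complaints,
        key=lambda c: rates.get(c["type"], default),
        reverse=True,
    )


rates = fire_rates(history)
complaints = [
    {"id": 1, "type": "modern high-rise"},
    {"id": 2, "type": "pre-war walk-up"},
]
ranked = prioritise(complaints, rates)
print([c["id"] for c in ranked])  # prints [2, 1]
```

Inspectors would then work through the ranked list from the top. A real system would combine many predictors rather than a single one.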
2. Improve Formula 1 car performance[9]
Why is this Big Data?
Volume = average 10+ TB of data at each GP per team
Velocity = teams take decisions in <~ 30 seconds
Main goals
1. get real time alarms on brakes, tires, fuel and other factors
that affect car performance during a race
2. find ways to improve car performance in the long term
a. Collect data: 130-160 sensors on a car during a race, plus
weather conditions, track conditions …
b. Compare data with records of success/failures
c. Look for correlations to get (1) real-time alarms and (2)
long term insights
$1B cost of saving 0.1s from a single lap
$60M money spent by a team on a supercomputer
3. Predict Flu Outbreak in Real-Time
Flu can spread very fast, with catastrophic consequences;
traditional methods can be too slow.
Each day, millions of users around the world search for health
information online. As you might expect, there are more flu-
related searches during flu season.
Of course, not every person who searches for "flu" is actually
sick, but a pattern emerges when all the flu-related search
queries are added together.
a. Collect data: keywords searched on the web; data collected by
national medical authorities (US Centers for Disease Control
and Prevention - CDC)
b. Compare the trends of search queries (top 50M) with the
records in real data
c. Find the keywords that correlate with the actual trends, to
make predictions based on current searches.
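Steps b and c can be sketched as a correlation filter. This is an illustration with invented weekly numbers, not the method used by the project itself; the 0.9 threshold and the keyword names are assumptions.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no trend information
    return cov / sqrt(var_x * var_y)


def select_keywords(query_counts, reported_cases, threshold=0.9):
    """Keep the keywords whose search volume tracks the reported cases."""
    return sorted(
        keyword
        for keyword, counts in query_counts.items()
        if pearson(counts, reported_cases) >= threshold
    )


# Made-up weekly figures: official case counts and search volumes
reported_cases = [10, 20, 40, 80, 40, 20]
query_counts = {
    "flu symptoms": [100, 210, 390, 820, 410, 190],
    "fever remedy": [50, 90, 210, 400, 190, 110],
    "football scores": [300, 310, 290, 305, 295, 300],
}

selected = select_keywords(query_counts, reported_cases)
print(selected)  # prints ['fever remedy', 'flu symptoms']
```

The selected keywords could then feed a model that estimates current flu activity from today's searches, before official figures are available.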
There are 45 keywords that correlate well with the historical data
The predictions from this system can improve the CDC data by up
to 50% [Royal Society Open Science, 2014]
Orange: US real data
Blue: predictions based on keywords
Google Flu Trends (GFT) project: www.google.org/flutrends/
Published in Nature in 2009[10]
An example of both the power and the failure of Big Data.
4. Reduce injuries in sports[11]
Injuries are probably the largest market inefficiency in pro
sports
In 2013, teams in the Major League Baseball spent $665
million on the salaries of injured players and replacements
Goal
anticipate when an athlete will get hurt, before it actually
happens, so as to avoid it
a. Collect data: data about how players actually move
(accelerations, elevations, jumping ranges, …) and at what
intensity.
b. Compare with records of injuries; let doctors analyse the
data
c. Predict the chances of injury and intervene before it
happens, during both workouts and matches
Catapult, founded in 2006, has seen sales increase ~70% for six
consecutive years and is on track to gross $20 million in 2013.
5. Running massive multiplayer games
“Infinity Challenge”, a massive 5-week online battle.
Two needs: handle a massive amount of data in almost real time
to update leaderboards and detect cheaters.
The development team was taking these insights and
updating the game almost weekly, using direct player
feedback to tweak the game.
Behind the scenes there was the Microsoft Big Data cloud
platform - HDInsight on Azure.
6. Transparency of Governments
Improving politics for all
In 2009 the US government started www.data.gov
Today there are 133k datasets in different fields:
Agriculture, Climate, Education, Energy, Finance, Geospatial,
Global Development, Health, Jobs & Skills, Public Safety,
Science & Research, Weather
Many countries have followed
including Italy (from 2011):
~9k datasets from 80 public administrations (PA)
Code4italy @Montecitorio
The Dark Side
There is one massive downside to this: Privacy concerns
Do we really want all our data to be logged and stored?
Data that can say where we are every day, which products we
buy, which movies we watch, how fast (or slow) we drive our
car, where we park it, which roads we usually take, where we
go with our bike, how much exercise we do (or don’t), what
we eat, how much we spend, which drugs we take, …
Security issues: track my position, steal my identity
Not all applications are customer-centric: insurance
companies (use data to increase costs)
Governments need to protect citizens against unhealthy
market dominance: data antitrust
Also, they need to better regulate the ways companies ask for
and obtain data (just asking for permission with Terms of Use
is not enough!)
At present the control of information is being taken away
from citizens
The danger is that individuals will not be able to control the
ways they are monitored or what happens to the information
How to Tackle a Big Data Problem
Preliminary Steps
First things first: check if it really is a Big Data problem
From the examples we have seen that the 3 common steps are:
1. collect data
2. find correlations (compare with historical records)
3. make predictions
Do not follow these steps!
These are relevant phases to execute a Big Data project, once
everything is in place.
Preliminary steps:
1. Goals and timescale
what you want to achieve and by when
2. Data
which data you have or need to get
3. Team
which skills you need (can change with data)
4. Silo breaking
connections you need to create (crm, it, marketing)
5. Budget
how much money you can put overall (business stakeholders)
How to Tackle a Big Data Problem - Four Universal Steps
1. Collect & store data (source, privacy, real-time)
2. Clean data (na, errors)
3. Analyse data (correlations)
4. Visualise data (kpi)
It is very unlikely that you will get everything right (or
everything you need) at the first attempt. Be prepared to iterate.
Agenda
✤ What is Big Data?
✤ Big Data Examples
✤ How to Tackle a Big Data Problem
Part I Part II
✤ Sentiment Analysis
✤ Big Data tools
Sentiment Analysis
What is Sentiment Analysis?
Sentiment Analysis according to Oxford[14]:
The process of computationally identifying and categorising
opinions expressed in a piece of text, especially in order to
determine whether the writer’s attitude towards a particular
topic, product, etc. is positive, negative, or neutral.
Operative definition in steps:
Trying to understand what people think about a subject,
from what they write,
automatically,
producing a measure of what they think.
The challenge:
Hundreds (if not more) of scientific papers have been
published on this topic.
None of the problems is solved, yet applications are flourishing
(plenty of space for new ideas)
What humans readily grasp from context is very difficult for
computers to detect.
Abbreviations, bad spelling and grammar, sarcasm, irony,
slang, idiom and personality
Show me the data!
Where is the sentiment expressed?
Activity on social networks
Surveys
CRM notes
Reviews (movies, restaurants, events,…)
Blogs
News
Why is it important?
Today people are different, they are:
1. more digital/technological
2. more connected
3. less loyal to brands
Communication is bidirectional and people’s reach is large
The People, not the Companies, have the power …
… and they are not afraid to use it.
Nestlé censors a Greenpeace video criticising the company
Domino’s Pizza employees post a video showing health code
violations
United Airlines broke a guitar and did not reimburse its owner
Some reasons to do sentiment analysis:
Gather feedback from customers (automatic, reliable)
• Give chance to react in real time
Sentiment as proxy of sales, opinions influence a lot
• To make predictions
Gather information from/about competitors (so start
“listening”!)
• Find ways to get new customers
Sentiment Analysis - Techniques[13]
One Technique consists in (mainly) looking for:
Lexical choice, Negator, Intensifier, Modal operators
Here is an (old) opinion:
I bought an iPhone a few days ago. It is such a nice phone. The
touch screen is really cool. The voice quality is clear too. It is
much better than my old Blackberry, which was a terrible
phone and so difficult to type with its tiny keys. However, my
mother was mad with me as I did not tell her before I bought
the phone. She also thought the phone was too expensive.
Lexical choice (words):
positive: nice, boost, benefit, brave
negative: terrible, conspire, catastrophe, cowardly
Negator: can flip the valence,
not, never
Intensifier: give the strength of the sentiment,
really, very, most
Modal operators: distinguish hypothetical from real situations
and weaken intensity,
might, could, should
A text can contain multiple sentiments, which will usually be
connected to each other, maybe through a comparison (as for
products)
Analyse the whole text, and each sentence
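The four ingredients (lexical choice, negator, intensifier, modal operator) can be combined in a toy scorer. This is a minimal sketch, not a production technique: the word lists are the ones given on the slide, while the weights (doubling for intensifiers, halving for modals) and the rule that modifiers apply to the next sentiment word are assumptions.

```python
import re

# Word lists taken from the slide; valence is +1 or -1
LEXICON = {
    "nice": 1, "boost": 1, "benefit": 1, "brave": 1,
    "terrible": -1, "conspire": -1, "catastrophe": -1, "cowardly": -1,
}
NEGATORS = {"not", "never"}
INTENSIFIERS = {"really", "very", "most"}
MODALS = {"might", "could", "should"}


def score_sentence(sentence):
    """Sum the valence of sentiment words, adjusted by preceding modifiers."""
    score = 0.0
    negate = False
    strength = 1.0
    for token in re.findall(r"[a-z']+", sentence.lower()):
        if token in NEGATORS:
            negate = not negate          # flip the valence
        elif token in INTENSIFIERS:
            strength *= 2.0              # assumed weight
        elif token in MODALS:
            strength *= 0.5              # assumed weight
        elif token in LEXICON:
            value = LEXICON[token] * strength
            score += -value if negate else value
            negate, strength = False, 1.0  # modifiers are used up
    return score


def score_text(text):
    """Score each sentence separately, as a text can hold several sentiments."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [score_sentence(s) for s in sentences]


scores = score_text("It is such a nice phone. It is not really nice. It might be terrible.")
print(scores)  # prints [1.0, -2.0, -0.5]
```

Sarcasm, irony and slang defeat a scorer like this, which is part of why the problem remains open.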
There is a market of fake opinions!
Every opinion is a quintuple:
entity, feature, sentiment value, holder, time
Mike87 on 23-06-2009 “I bought an iPhone a few days ago. It is
such a nice phone. The touch screen is really cool. The voice
quality is clear too. It is much better than my old Blackberry,
which was a terrible phone and so difficult to type with its tiny
keys. However, my mother was mad with me as I did not tell her
before I bought the phone. She also thought the phone was too
expensive”
(iPhone, GENERAL , +, Mike87, 23-06-2009)
(iPhone, touch_screen, +, Mike87, 23-06-2009)
…
We are turning unstructured data into structured data
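The quintuple maps naturally onto a small record type. A minimal sketch using Python's standard `dataclasses`; the class and field names are illustrative, and the two records are the ones listed above.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Opinion:
    """One opinion as a quintuple: entity, feature, sentiment, holder, time."""
    entity: str
    feature: str
    sentiment: str  # "+" or "-"
    holder: str
    time: date


opinions = [
    Opinion("iPhone", "GENERAL", "+", "Mike87", date(2009, 6, 23)),
    Opinion("iPhone", "touch_screen", "+", "Mike87", date(2009, 6, 23)),
]

# Once structured, opinions can be filtered and counted like any other table
positive_features = [o.feature for o in opinions if o.sentiment == "+"]
print(positive_features)  # prints ['GENERAL', 'touch_screen']
```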
An Operative Plan
Preliminary:
What’s your goal?
e.g. Reaction to my new product launch (1 month tail)
How can you obtain it?
e.g. Twitter, Facebook and related-field blogs (want to use
google alert?)
How can I measure it? Which KPI? Which test?
e.g. KPI: # of mentions/comments/posts, % of positive over
total; choose threshold values for the goal to be met (for each
KPI)
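The two example KPIs and their thresholds can be computed as follows. This is a sketch under stated assumptions: each collected item has already been labelled "positive", "negative" or "neutral", and the threshold values are invented.

```python
def kpi_report(labels, min_mentions, min_positive_share):
    """Compute the example KPIs and check them against their thresholds."""
    total = len(labels)
    positive = sum(1 for label in labels if label == "positive")
    share = positive / total if total else 0.0
    return {
        "mentions": total,
        "positive_share": share,
        "mentions_ok": total >= min_mentions,
        "positive_ok": share >= min_positive_share,
    }


# Invented sentiment labels for five collected mentions
labels = ["positive", "positive", "negative", "neutral", "positive"]
report = kpi_report(labels, min_mentions=4, min_positive_share=0.5)
print(report["mentions"], report["positive_share"])  # prints "5 0.6"
```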
Universal step 1: Collect and Store The Data
Identify the data
tweets that mention the product (or the company?),
comments to your Facebook page posts, select the specific
blogs to follow
Setup a system that can get the data
create/buy some tool to get the data automatically and
programmatically
Store the data
somewhere useful for the project and for your company
(you don’t want to create new silos!)
Universal step 2: Clean The Data
Act on the data
deal with writer mistakes: replace, modify text
deal with program errors: remove records
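Both kinds of cleaning can be sketched in one pass. The record layout (a dict with a "text" field) and the specific rules are assumptions chosen for illustration: records with missing text are treated as program errors and dropped, while writer quirks are normalised in place.

```python
import re


def clean_records(records):
    """Drop broken records and normalise the text of the remaining ones."""
    cleaned = []
    for record in records:
        text = record.get("text")
        if not text or not text.strip():
            continue  # program error: missing or empty text, remove the record
        text = re.sub(r"\s+", " ", text).strip()    # collapse stray whitespace
        text = re.sub(r"(.)\1{2,}", r"\1\1", text)  # "sooooo" becomes "soo"
        cleaned.append({**record, "text": text})
    return cleaned


raw = [
    {"id": 1, "text": "  Sooooo   nice!!! "},
    {"id": 2, "text": None},
    {"id": 3, "text": ""},
]
cleaned = clean_records(raw)
print(cleaned)  # prints [{'id': 1, 'text': 'Soo nice!!'}]
```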
Universal step 3: Analyse The Data
Analyse the data, extract the sentiment
Build the KPI
Universal step 4: Visualise The Data
Learn from the numbers: you need to come up with a
story
e.g. Reaction was massive on Twitter and Facebook (2 x
threshold), initially very positive (1.5x), then reduced but
still good (1.3x); for blog posts the positive test was just
passed (1x)
Visualise the story,
create a dashboard to follow evolution in real-time
create a static infographic to describe what happened
Big Data Tools
What is Hadoop?
Apache Hadoop is an open-source software framework
for distributed storage and distributed processing of Big Data
on clusters of commodity hardware
Created in 2005 by Doug Cutting and Mike Cafarella, and named
after a toy elephant belonging to Cutting's son. Originally
developed to support the Nutch search engine project
The base Apache Hadoop framework is composed of the
following modules:
1. Hadoop Common – libraries and utilities for other modules
2. Hadoop Distributed File System (HDFS) – a distributed
file-system that splits files into large blocks and distributes
them among the machines
3. Hadoop MapReduce – a programming model for large
scale data processing. MapReduce ships code (.jar files) to
the nodes that have the required data, and the nodes then
process the data in parallel.
4. Hadoop YARN - resource-management platform
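The MapReduce programming model can be illustrated with the classic word count. This is a single-process simulation in plain Python, not the Hadoop API: on a real cluster the map and reduce functions would run in parallel on the nodes holding the data blocks, and the framework would perform the shuffle.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values of one key into a single result."""
    return key, sum(values)


def word_count(lines):
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Each string stands in for a block of data stored on a different node
blocks = ["big data big insights", "data tools"]
counts = word_count(blocks)
print(counts)  # prints {'big': 2, 'data': 2, 'insights': 1, 'tools': 1}
```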
The Hadoop Ecosystem
Since 2012, "Hadoop" often refers not just to the base modules but
rather to the Hadoop Ecosystem, which includes all of the additional
packages that can be installed on top of or alongside Hadoop.
Let us meet some of the “Hadoop tools”:
Hive
Pig
Sqoop
Oozie
Both Hive and Pig allow you to run MapReduce jobs using simple
query languages
Hive
provides a SQL-like interface to data and lets you impose a
schema on the data; it is best suited for structured and
semi-structured data
Pig
translates the Pig Latin language so that scripts can run on
Hadoop. Best suited for data flow jobs, for semi-structured
and unstructured data
Sqoop
tool designed for efficiently transferring bulk data
between Apache Hadoop and structured data stores.
Oozie
workflow scheduler system to manage Apache Hadoop jobs.
Oozie is integrated with the rest of the Hadoop Ecosystem
supporting several types of Hadoop jobs out of the box
(including Pig, Hive and Sqoop) as well as system specific jobs
(such as Java programs and shell scripts).
Big Data with Microsoft
Hadoop can be deployed on premises as well as in the cloud.
The cloud allows organisations to deploy Hadoop without
having to acquire hardware or specific setup expertise.
Vendors who currently have an offer for the cloud include
Microsoft, Amazon and Google.
Let us focus on Microsoft
The key product is: HDInsight for Microsoft Azure
Azure is Microsoft's cloud platform, which offers several services
Azure HDInsight
deploys and provisions Apache Hadoop clusters in the cloud,
it is compatible with: Ambari, Avro, HBase, HDFS, Hive,
Mahout, MapReduce and YARN, Oozie, Pig, Sqoop, Storm,
Zookeeper.
Azure PowerShell
A scripting environment to control and automate the
deployment and management of your workloads in Azure
Windows Azure Blob Storage (WASB)
Blob Storage is a general-purpose Hadoop-compatible Azure
storage solution that integrates with HDInsight.
Store data in Azure (blob) instead of in the cluster (HDFS)
(Positive) Consequences:
Data are still there after you finish MapReduce jobs and
shut the cluster down
Easier to share data with other applications
Excel on steroids,
thanks to some powerful add-ins
Power Query
simplifies data discovery and access.
You can connect to data across a wide variety of sources,
including relational databases, Web and Hadoop
You can combine and refine the data
You can save queries and refresh the data
Power Pivot
allows non-specialised users to do some Business Intelligence
on different data sources and create interactive reports,
sharable as web applications
Power View
is a very interactive data exploration, visualisation and
presentation tool
Power Map
is a data visualisation tool that lets you plot geographic and
temporal data on a 3D map, show it over time, and create
visual tours
it.linkedin.com/in/lucanaso/
@NasoLuca
Contacts
www.edisonweb.com
References
Big Data & Digital Marketing
Most of the original material
has been posted on:

More Related Content

What's hot

Trends in Big Data & Business Challenges
Trends in Big Data & Business Challenges   Trends in Big Data & Business Challenges
Trends in Big Data & Business Challenges Experian_US
 
BIG Data & Hadoop Applications in Social Media
BIG Data & Hadoop Applications in Social MediaBIG Data & Hadoop Applications in Social Media
BIG Data & Hadoop Applications in Social MediaSkillspeed
 
Big data - a review (2013 4)
Big data - a review (2013 4)Big data - a review (2013 4)
Big data - a review (2013 4)Sonu Gupta
 
Big Data - 25 Amazing Facts Everyone Should Know
Big Data - 25 Amazing Facts Everyone Should KnowBig Data - 25 Amazing Facts Everyone Should Know
Big Data - 25 Amazing Facts Everyone Should KnowBernard Marr
 
Fundamentals of Big Data in 2 minutes!!
Fundamentals of Big Data in  2 minutes!!Fundamentals of Big Data in  2 minutes!!
Fundamentals of Big Data in 2 minutes!!Simplify360
 
Data science innovations
Data science innovations Data science innovations
Data science innovations suresh sood
 
Big data analytics and its impact on internet users
Big data analytics and its impact on internet usersBig data analytics and its impact on internet users
Big data analytics and its impact on internet usersStruggler Ever
 
The Pros and Cons of Big Data in an ePatient World
The Pros and Cons of Big Data in an ePatient WorldThe Pros and Cons of Big Data in an ePatient World
The Pros and Cons of Big Data in an ePatient WorldPYA, P.C.
 
Impact of big data on analytics
Impact of big data on analyticsImpact of big data on analytics
Impact of big data on analyticsCapgemini
 
BIG DATA | How to explain it & how to use it for your career?
BIG DATA | How to explain it & how to use it for your career?BIG DATA | How to explain it & how to use it for your career?
BIG DATA | How to explain it & how to use it for your career?Tuan Yang
 
The current challenges and opportunities of big data and analytics in emergen...
The current challenges and opportunities of big data and analytics in emergen...The current challenges and opportunities of big data and analytics in emergen...
The current challenges and opportunities of big data and analytics in emergen...IBM Analytics
 
Challenges and outlook with Big Data
Challenges and outlook with Big Data Challenges and outlook with Big Data
Challenges and outlook with Big Data IJCERT JOURNAL
 
Data Science Introduction - Data Science: What Art Thou?
Data Science Introduction - Data Science: What Art Thou?Data Science Introduction - Data Science: What Art Thou?
Data Science Introduction - Data Science: What Art Thou?Gregg Barrett
 
A Journey into bringing (Artificial) Intelligence to the Enterprise
A Journey into bringing (Artificial) Intelligence to the EnterpriseA Journey into bringing (Artificial) Intelligence to the Enterprise
A Journey into bringing (Artificial) Intelligence to the EnterprisePatrick Deglon
 

What's hot (20)

Trends in Big Data & Business Challenges
Trends in Big Data & Business Challenges   Trends in Big Data & Business Challenges
Trends in Big Data & Business Challenges
 
BIG Data & Hadoop Applications in Social Media
BIG Data & Hadoop Applications in Social MediaBIG Data & Hadoop Applications in Social Media
BIG Data & Hadoop Applications in Social Media
 
NewMR 2016 presents: 9 Big Applications of Big Data
NewMR 2016 presents: 9 Big Applications of Big DataNewMR 2016 presents: 9 Big Applications of Big Data
NewMR 2016 presents: 9 Big Applications of Big Data
 
Big data - a review (2013 4)
Big data - a review (2013 4)Big data - a review (2013 4)
Big data - a review (2013 4)
 
Business analytics
Business analyticsBusiness analytics
Business analytics
 
Big data
Big dataBig data
Big data
 
Big Data - 25 Amazing Facts Everyone Should Know
Big Data - 25 Amazing Facts Everyone Should KnowBig Data - 25 Amazing Facts Everyone Should Know
Big Data - 25 Amazing Facts Everyone Should Know
 
Fundamentals of Big Data in 2 minutes!!
Fundamentals of Big Data in  2 minutes!!Fundamentals of Big Data in  2 minutes!!
Fundamentals of Big Data in 2 minutes!!
 
Data science innovations
Data science innovations Data science innovations
Data science innovations
 
Big data analytics and its impact on internet users
Big data analytics and its impact on internet usersBig data analytics and its impact on internet users
Big data analytics and its impact on internet users
 
Big Data Analysis
Big Data AnalysisBig Data Analysis
Big Data Analysis
 
The Pros and Cons of Big Data in an ePatient World
The Pros and Cons of Big Data in an ePatient WorldThe Pros and Cons of Big Data in an ePatient World
The Pros and Cons of Big Data in an ePatient World
 
Impact of big data on analytics
Impact of big data on analyticsImpact of big data on analytics
Impact of big data on analytics
 
BIG DATA | How to explain it & how to use it for your career?
BIG DATA | How to explain it & how to use it for your career?BIG DATA | How to explain it & how to use it for your career?
BIG DATA | How to explain it & how to use it for your career?
 
Big Data Trends
Big Data TrendsBig Data Trends
Big Data Trends
 
Big Data
Big DataBig Data
Big Data
 
The current challenges and opportunities of big data and analytics in emergen...
The current challenges and opportunities of big data and analytics in emergen...The current challenges and opportunities of big data and analytics in emergen...
The current challenges and opportunities of big data and analytics in emergen...
 
Challenges and outlook with Big Data
Challenges and outlook with Big Data Challenges and outlook with Big Data
Challenges and outlook with Big Data
 
Data Science Introduction - Data Science: What Art Thou?
Data Science Introduction - Data Science: What Art Thou?Data Science Introduction - Data Science: What Art Thou?
Data Science Introduction - Data Science: What Art Thou?
 
A Journey into bringing (Artificial) Intelligence to the Enterprise
A Journey into bringing (Artificial) Intelligence to the EnterpriseA Journey into bringing (Artificial) Intelligence to the Enterprise
A Journey into bringing (Artificial) Intelligence to the Enterprise
 

Viewers also liked

Working With Big Data
Working With Big DataWorking With Big Data
Working With Big DataSeth Familian
 
How to plan a successful Digital Signage Campaign in 5 steps
How to plan a successful Digital Signage Campaign in 5 stepsHow to plan a successful Digital Signage Campaign in 5 steps
How to plan a successful Digital Signage Campaign in 5 stepsLuca Naso
 
Problem Solving
Problem SolvingProblem Solving
Problem SolvingLuca Naso
 
Stalling investments in infrastructure and the expanding infra debt burden in...
Stalling investments in infrastructure and the expanding infra debt burden in...Stalling investments in infrastructure and the expanding infra debt burden in...
Stalling investments in infrastructure and the expanding infra debt burden in...Kyna Tsai
 
Entirely tailored sentiment analysis - MeaningCloud webinar
Entirely tailored sentiment analysis - MeaningCloud webinarEntirely tailored sentiment analysis - MeaningCloud webinar
Entirely tailored sentiment analysis - MeaningCloud webinarMeaningCloud
 
Big Data - 25 Facts to Know
Big Data - 25 Facts to KnowBig Data - 25 Facts to Know
Big Data - 25 Facts to KnowTamela Coval
 
Big data should be simple
Big data should be simpleBig data should be simple
Big data should be simpleDori Waldman
 
Bringing big data to life
Bringing big data to lifeBringing big data to life
Bringing big data to lifeSKIM
 
5 Big Data stats that will convince your boss
5 Big Data stats that will convince your boss5 Big Data stats that will convince your boss
5 Big Data stats that will convince your bossRedPixie
 
Opensource Frameworks and BigData Processing
Opensource Frameworks and BigData ProcessingOpensource Frameworks and BigData Processing
Opensource Frameworks and BigData ProcessingAmir Sedighi
 
Business Intelligence, Analytics e Big Data: una guida per capire e orientarsi
Business Intelligence, Analytics e Big Data: una guida per capire e orientarsiBusiness Intelligence, Analytics e Big Data: una guida per capire e orientarsi
Business Intelligence, Analytics e Big Data: una guida per capire e orientarsiSMAU
 
Big Data for the Retail Business I Swan Insights I Solvay Business School
Big Data for the Retail Business I Swan Insights I Solvay Business SchoolBig Data for the Retail Business I Swan Insights I Solvay Business School
Big Data for the Retail Business I Swan Insights I Solvay Business SchoolLaurent Kinet
 
Guide to big data analytics
Guide to big data analyticsGuide to big data analytics
Guide to big data analyticsGahya Pandian
 

The What, Why and How of Big Data

  • 1. December 2nd, 2014. @NasoLuca. The What, Why and How of Big Data: Definitions, Examples, Suggestions, Howtos and much more
  • 2. Agenda. Part I: ✤ What is Big Data? ✤ Big Data Examples ✤ How to Tackle a Big Data Problem. Part II: ✤ Sentiment Analysis ✤ Big Data tools
  • 3. How relevant is it? Big Data, Social Media, Digital Marketing, Machine Learning, Computer Vision: which is more relevant to people? Let’s ask Google!
  • 4. How relevant is it? Google Trends for Big Data, Social Media, Digital Marketing, Machine Learning and Computer Vision, from 2007 to the end of 2014
  • 5. What is Big Data? How relevant is it? Big Data Market: in 2012 it was $28B; for 2013 it was expected to be $37B, scattered across a number of IT landscapes, with 45% for new social network analysis and content analytics tools[1]. Jobs to support Big Data: 4.4 million IT jobs globally by 2015, 1.9 million in the US[1]. By 2018, the US alone could face a shortage of 200k people with deep analytical skills as well as 1.5 million managers and analysts[2]
  • 6. Definition. Big Data according to the Oxford Dictionary[3]: big data n. Computing (also with capital initials) data of a very large size, typically to the extent that its manipulation and management present significant logistical challenges; (also) the branch of computing involving such data. Big Data according to Gartner[4]: Big data is high-volume, high-velocity and high-variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making. This is where the 3 Vs originate: Volume, Velocity, Variety
  • 7. What is Big Data? Definition. VOLUME. About: amount of data. Unit: bytes. Information about the general population, education, health, medicine, travel, geographic locations, shopping, financial transactions, jobs, scientific experiments, emails, sensors, texts, photos, videos, activity on social networks … 2.5 exabytes of data are created each day worldwide[5]. Facebook (2012): 200 PB of data each year. In 3 years CERN collected 75 PB of data (with the LHC). Most US companies have 100 TB[5]. 1 ZB = 1000² PB = 1000³ TB = 1000⁴ GB. How much is Big Data? > 5 TB (as of 2014)
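The unit arithmetic on the volume slide can be checked with a few lines of Python. This is only a sketch: it assumes decimal (SI) units, where each step is a factor of 1000, and the `convert` helper is a hypothetical name introduced here for illustration.

```python
# Decimal (SI) byte units: each step up the list is a factor of 1000.
UNITS = ["B", "kB", "MB", "GB", "TB", "PB", "EB", "ZB"]

def convert(value, from_unit, to_unit):
    """Convert a data size between SI byte units (hypothetical helper)."""
    steps = UNITS.index(from_unit) - UNITS.index(to_unit)
    return value * 1000 ** steps

# The identities quoted on the slide.
zb_in_pb = convert(1, "ZB", "PB")  # 1000**2
zb_in_tb = convert(1, "ZB", "TB")  # 1000**3
zb_in_gb = convert(1, "ZB", "GB")  # 1000**4

# The slide's 2.5 EB created per day, expressed in TB.
daily_tb = convert(2.5, "EB", "TB")
```

Expressed this way, the 5 TB threshold on the slide is a tiny fraction of the 2.5 EB the slide says are created worldwide each day.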
  • 8. What is Big Data? Definition. VELOCITY. About: moving data. Unit: bytes per second. This really has two interpretations: data generation rate or data processing rate. Every minute (2014)[6]: 200M emails, 4M Google searches, 277k more tweets, 216k pictures on Instagram. What’s the limit to be considered big data? As of 2014: Generation: time to reach 5 TB < project lifetime. Processing: > 1 MB/s (≈ 5 TB in 2 months)
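The two velocity thresholds can be verified with simple arithmetic. The sketch below assumes that "2 months" means 60 days and that 1 MB is 10^6 bytes; the function names are hypothetical and exist only for this illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60
BIG_DATA_VOLUME_BYTES = 5e12  # the slide's 5 TB threshold (as of 2014)

def average_rate_mb_per_s(total_bytes, days):
    """Average rate needed to move total_bytes in the given number of days."""
    return total_bytes / (days * SECONDS_PER_DAY) / 1e6

def reaches_big_data_volume(rate_bytes_per_s, project_days):
    """Generation criterion: does the project accumulate 5 TB within its lifetime?"""
    return rate_bytes_per_s * project_days * SECONDS_PER_DAY >= BIG_DATA_VOLUME_BYTES

# Processing criterion: 5 TB in 2 months (assumed to be 60 days) is just under 1 MB/s.
rate = average_rate_mb_per_s(BIG_DATA_VOLUME_BYTES, days=60)
```

With these assumptions, a source producing 1 MB/s crosses the 5 TB mark in a 60-day project but not in a 30-day one.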
  • 9. What is Big Data? Definition. VARIETY. About: form of the data. 3 types: structured, semi-structured, unstructured. 1. Structured = data in a fixed field within a record (spreadsheets, relational databases). 2. Semi-structured = XML, JSON, CSV (text with columns, with a separator). 3. Unstructured = data stored without any model, or without any organisation. All of them can be Big Data
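To make the three forms concrete, here is a small sketch that reads the same made-up temperature reading from CSV, from JSON and from free text. The sample data is invented for illustration, and the classification of CSV as semi-structured follows the slide.

```python
import csv
import io
import json
import re

# Semi-structured (per the slide): delimited text with a header row.
csv_text = "id,city,temp_c\n7,Rome,21.5\n"
csv_rows = list(csv.DictReader(io.StringIO(csv_text)))

# Semi-structured: JSON carries its own nested, self-describing layout.
json_text = '{"id": 7, "city": "Rome", "readings": [{"temp_c": 21.5}]}'
json_doc = json.loads(json_text)

# Unstructured: free text has no model, so values must be extracted by pattern.
free_text = "Sensor 7 in Rome reported 21.5 degrees this morning."
match = re.search(r"(\d+(?:\.\d+)?) degrees", free_text)
temp_from_text = float(match.group(1)) if match else None
```

The further the data is from a fixed schema, the more the reading code has to guess: the regular expression breaks as soon as the sentence is phrased differently.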
  • 10. What is Big Data? Definition. This concludes the classical 3 Vs of Big Data. To better describe Big Data we can add a couple more Vs. VERACITY: lack of accuracy. Data itself is often imprecise or incomplete (typos, empty fields, errors, source changes, …). The time of small and tidy samples is over.
  • 11. What is Big Data? Definition. VALUE: about the actionable insights one can get. People do not need data, they need the insights hidden in the data: Value is a concentrated data-juice. Obtaining correct, but irrelevant, information is a waste of time, effort and resources. Close interactions between an analytics team and business managers can help you address the right questions.
  • 12. What is Big Data? Implications. “Datafication” is the movement behind Big Data[7]. Big Data implicitly requires 3 paradigm shifts: 1. from “some” to “all”; 2. from “clean” to “messy”; 3. from “causation” to “correlation”
  • 13. What is Big Data? Implications
  • 14. What is Big Data? Implications http://xkcd.com/552/
  • 16. General Application Fields. Not only business: Big Data has implications far beyond marketing and consumer goods. It will profoundly change how governments work and alter the nature of politics and our daily life too (smart cities). When it comes to generating economic growth, providing public services, or fighting wars, those who can harness big data effectively will have a significant edge over others.
  • 17. Big Data Examples - General Application Fields. Forbes thinks that it will influence us in 5 ways[8]: 1. how we spend; 2. how we vote; 3. how we study; 4. how we stay healthy; 5. how we keep/lose privacy
  • 18. 1. Fire-prevention @ New York City[7] Big Data Examples - Real Life Applications
  • 19. Big Data Examples - Real Life Applications. 1. Fire-prevention @ New York City[7]. Problem: imbalance between needs and resources. Too many complaints (25,000 per year), too few inspectors (200). You want your inspectors to tackle the most relevant cases only/first. How to prioritise the complaints?
  • 20. Big Data Examples - Real Life Applications. 1. Fire-prevention @ New York City[7]. Solution: a. Build a database with information about buildings (crime rates, ambulance visits, utility usage, missed payments, …); b. Compare the database to records of building fires, looking for correlations; c. Estimate the probability of fire for each complaint. Result: the efficiency of the inspectors rose from 13% to 70%. Among the predictors of a fire were the type of building and the year it was built; permits for exterior brickwork correlated with lower risk.
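Steps b and c of the fire-prevention example can be sketched in a few lines. This is not New York City's actual model: the records below are invented, a single feature (building type) stands in for the many predictors mentioned on the slide, and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical historical records: (building_type, had_fire).
history = [
    ("warehouse", True), ("warehouse", True), ("warehouse", False),
    ("residential", False), ("residential", False), ("residential", True),
    ("office", False), ("office", False),
]

def fire_rate_by_type(records):
    """Step b: observed fire frequency for each building type."""
    counts = defaultdict(lambda: [0, 0])  # type -> [fires, total]
    for building_type, had_fire in records:
        counts[building_type][0] += int(had_fire)
        counts[building_type][1] += 1
    return {t: fires / total for t, (fires, total) in counts.items()}

def prioritise(complaints, rates):
    """Step c: order complaints by the estimated fire probability of their building."""
    return sorted(complaints, key=lambda c: rates.get(c["type"], 0.0), reverse=True)

rates = fire_rate_by_type(history)
complaints = [
    {"id": "C1", "type": "office"},
    {"id": "C2", "type": "warehouse"},
    {"id": "C3", "type": "residential"},
]
ranked = prioritise(complaints, rates)
```

Inspectors would then work down `ranked` from the top, which is the sense in which correlations alone, without a causal explanation, are enough to prioritise.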
  • 21. 2. Improve Formula 1 car performance[9] Big Data Examples - Real Life Applications
  • 22. Big Data Examples - Real Life Applications. 2. Improve Formula 1 car performance[9]. Why is this Big Data? Volume = on average 10+ TB of data per team at each GP. Velocity = teams take decisions in about 30 seconds or less. Main goals: 1. get real-time alarms on brakes, tires, fuel and other factors that affect car performance during a race; 2. find ways to improve car performance in the long term
  • 23. Big Data Examples - Real Life Applications. 2. Improve Formula 1 car performance[9]. a. Collect data: 130-160 sensors on a car during a race, plus weather conditions, track conditions, …; b. Compare data with records of successes/failures; c. Look for correlations to get (1) real-time alarms and (2) long-term insights. $1B: cost of saving 0.1 s from a single lap. $60M: money spent by a team on a supercomputer
  • 24. 3. Predict Flu Outbreak in Real-Time Big Data Examples - Real Life Applications
  • 25. Big Data Examples - Real Life Applications. 3. Predict Flu Outbreak in Real-Time. Flu can spread very fast with catastrophic consequences; traditional methods can be too slow. Each day, millions of users around the world search for health information online. As you might expect, there are more flu-related searches during flu season. Of course, not every person who searches for "flu" is actually sick, but a pattern emerges when all the flu-related search queries are added together.
  • 26. Big Data Examples - Real Life Applications. 3. Predict Flu Outbreak in Real-Time. a. Collect data: keywords searched on the web; data collected by national medical authorities (US Centers for Disease Control and Prevention - CDC); b. Compare the trends of search queries (top 50M) with the real-data records; c. Find the keywords that correlate with the actual trends, to make predictions based on current searches. There are 45 keywords that correlate well with the historical data. The predictions from this system can improve the CDC data by up to 50% [Royal Society Open Science, 2014]
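Step c, keeping only the keywords whose search volume tracks the official case counts, can be illustrated with a Pearson correlation. The weekly numbers, the keyword names and the 0.9 threshold below are all invented for this sketch; the slide does not say which correlation measure or cut-off the real system used.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no trend to correlate with
    return cov / sqrt(var_x * var_y)

# Hypothetical weekly data: official case counts and search volume per keyword.
official_cases = [10, 20, 40, 80, 40, 20]
search_counts = {
    "flu symptoms": [100, 210, 390, 820, 410, 190],
    "fever remedy": [50, 90, 210, 400, 190, 110],
    "football scores": [300, 310, 290, 305, 295, 300],
}

def select_keywords(series_by_keyword, target, threshold=0.9):
    """Keep the keywords whose series correlates with the target at or above threshold."""
    return sorted(
        keyword
        for keyword, series in series_by_keyword.items()
        if pearson(series, target) >= threshold
    )

selected = select_keywords(search_counts, official_cases)
```

In this toy data the two health-related keywords pass the threshold and the unrelated one does not; predictions would then be built from current search volumes of the selected keywords only.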
  • 27. 3. Predict Flu Outbreak in Real-Time Big Data Examples - Real Life Applications Orange: US real data Blue: predictions based on keywords
  • 28. Big Data Examples - Real Life Applications. 3. Predict Flu Outbreak in Real-Time. Google Flu Trends (GFT) project: www.google.org/flutrends/ (published in Nature in 2009[10]). An example of both the power and the failure of Big Data.
  • 29. 4. Reduce injuries in sports[11] Big Data Examples - Real Life Applications
  • 30. Big Data Examples - Real Life Applications. 4. Reduce injuries in sports[11]. Injuries are probably the largest market inefficiency in pro sports. In 2013, Major League Baseball teams spent $665 million on the salaries of injured players and their replacements. Goal: anticipate when an athlete will get hurt before it actually happens, so as to avoid it
  • 31. Big Data Examples - Real Life Applications. 4. Reduce injuries in sports[11]. a. Collect data: how players actually move (accelerations, elevations, jumping ranges, …) and at what intensity; b. Compare with records of injuries; let doctors analyse the data; c. Predict the chance of injury and intervene before it happens, during both workouts and matches. Catapult, founded in 2006, has increased sales by ~70% for six consecutive years and is on track to gross $20 million in 2013.
  • 32. 5. Running massive multiplayer games Big Data Examples - Real Life Applications
  • 33. Big Data Examples - Real Life Applications. 5. Running massive multiplayer games. “Infinity Challenge”, a massive 5-week online battle. Two needs: handle a massive amount of data in almost real time to update leaderboards and detect cheaters. The development team took these insights and updated the game almost weekly, using direct player feedback to tweak the game. Behind the scenes was the Microsoft Big Data cloud platform: HDInsight on Azure.
  • 34. 6. Transparency of Governments Improving politics for all Big Data Examples - Real Life Applications
  • 35. Big Data Examples - Real Life Applications. 6. Transparency of Governments: improving politics for all. In 2009 the US government started www.data.gov. Today there are 133k datasets in different fields: Agriculture, Climate, Education, Energy, Finance, Geospatial, Global Development, Health, Jobs & Skills, Public Safety, Science & Research, Weather. Many countries have followed, including Italy (from 2011): ~9k datasets from 80 public administrations (PA). Code4italy @Montecitorio
  • 36. The Dark Side. There is one massive downside to this: privacy concerns. Do we really want all our data to be logged and stored? Data that can say where we are every day, which products we buy, which movies we watch, how fast (or slow) we drive our car, where we park it, which roads we usually take, where we go with our bike, how much exercise we do (or don’t), what we eat, how much we spend, which drugs we take, … Security issues: track my position, steal my identity. Not all applications are customer-centric: insurance companies (use data to increase costs)
  • 37. Big Data Examples - The Dark Side. At present the control of information is being taken away from citizens. The danger is that individuals will not be able to control the ways they are monitored or what happens to the information. Governments need to protect citizens against unhealthy market dominance: data antitrust. They also need to better regulate how companies ask for and obtain data (just asking for permission with Terms of Use is not enough!)
  • 38. How to Tackle a Big Data Problem
  • 39. Preliminary Steps. First things first: check whether it really is a Big Data problem. From the examples we have seen that three common steps are: 1. collect data; 2. find correlations (compare with historical records); 3. make predictions. Do not follow these steps! These are the relevant phases for executing a Big Data project, once everything is in place.
  • 40. How to Tackle a Big Data Problem - Preliminary Steps. Preliminary steps: 1. Goals and timescale: what you want to achieve and by when; 2. Data: which data you have or need to get; 3. Team: which skills you need (can change with data); 4. Silo breaking: connections you need to create (CRM, IT, marketing); 5. Budget: how much money you can commit overall (business stakeholders)
  • 41. How to Tackle a Big Data Problem - Four Universal Steps. 1. Collect & store data (source, privacy, real-time); 2. Clean data (NA, errors); 3. Analyse data (correlations); 4. Visualise data (KPI). It is very unlikely that you will get everything right (or everything you need) at the first attempt. Be prepared to iterate.
  • 42. Agenda. Part I: ✤ What is Big Data? ✤ Big Data Examples ✤ How to Tackle a Big Data Problem. Part II: ✤ Sentiment Analysis ✤ Big Data tools
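The four universal steps map naturally onto four small functions chained together. The sketch below is a toy, single-machine illustration under invented assumptions: the sales records, the choice of mean sales per store as the KPI, and all function names are hypothetical.

```python
from statistics import mean

def collect():
    """Step 1: stand-in for reading from a real source (files, APIs, sensors)."""
    return [
        {"store": "A", "sales": "120"},
        {"store": "A", "sales": "n/a"},   # missing value
        {"store": "B", "sales": "80"},
        {"store": "B", "sales": "100"},
        {"store": "", "sales": "55"},     # record with an empty field
    ]

def clean(rows):
    """Step 2: drop rows with empty store names or non-numeric sales."""
    cleaned = []
    for row in rows:
        if not row["store"]:
            continue
        try:
            cleaned.append({"store": row["store"], "sales": float(row["sales"])})
        except ValueError:
            continue
    return cleaned

def analyse(rows):
    """Step 3: compute a simple KPI, here the mean sales per store."""
    by_store = {}
    for row in rows:
        by_store.setdefault(row["store"], []).append(row["sales"])
    return {store: mean(values) for store, values in by_store.items()}

def visualise(kpis):
    """Step 4: render the KPI as a text bar chart, one '#' per 10 units."""
    lines = []
    for store, value in sorted(kpis.items()):
        lines.append(f"{store} {'#' * int(value // 10)} {value:.1f}")
    return "\n".join(lines)

kpis = analyse(clean(collect()))
report = visualise(kpis)
```

Keeping each step as its own function makes iteration cheap: when the first pass shows that the cleaning rule or the KPI was wrong, only that function changes.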
  • 44. What is Sentiment Analysis? Sentiment Analysis according to Oxford[14]: The process of computationally identifying and categorising opinions expressed in a piece of text, especially in order to determine whether the writer’s attitude towards a particular topic, product, etc. is positive, negative, or neutral.
  • 45. Operative definition in steps: Trying to understand what people think about a subject, from what they write, automatically, producing a measure of what they think. Sentiment Analysis - What is Sentiment Analysis?
  • 46. Sentiment Analysis - What is Sentiment Analysis? The challenge: what humans readily grasp from context is very difficult for computers to detect: abbreviations, bad spelling and grammar, sarcasm, irony, slang, idiom and personality. Hundreds (if not more) of scientific papers have been published on this topic. No part of the problem is fully solved, yet applications are flourishing (plenty of space for new ideas).
  • 47. Show me the data! Where is the sentiment expressed? Activity on social network Survey CRM notes Reviews (movies, restaurants, events,…) Blogs News Sentiment Analysis - What is Sentiment Analysis?
  • 48. Why is it important? Today people are different, they are: 1. more digital/technological 2. more connected 3. less loyal to brands Communication is bidirectional and people’s reach is large The People, not the Companies, have the power … … and they are not afraid to use it.
  • 49. Sentiment Analysis - Why is it important? Nestlé censors a Greenpeace video criticising the company. Domino’s Pizza employees post a video showing health code violations. United Airlines broke a guitar and did not reimburse the owner.
  • 50. Sentiment Analysis - Why is it important? Some reasons to do sentiment analysis: Gather feedback from customers (automatic, reliable) • Gives a chance to react in real time. Sentiment as a proxy for sales, since opinions influence a lot • To make predictions. Gather information from/about competitors (so start “listening”!) • Find ways to get new customers.
  • 51. Sentiment Analysis - Techniques[13] One technique consists (mainly) in looking for: lexical choice, negators, intensifiers, modal operators. Here is an (old) opinion: “I bought an iPhone a few days ago. It is such a nice phone. The touch screen is really cool. The voice quality is clear too. It is much better than my old Blackberry, which was a terrible phone and so difficult to type with its tiny keys. However, my mother was mad with me as I did not tell her before I bought the phone. She also thought the phone was too expensive.”
  • 52. Sentiment Analysis - Techniques Lexical choice (words): positive: nice, boost, benefit, brave negative: terrible, conspire, catastrophe, cowardly Negator: can flip the valence, not, never Intensifier: give the strength of the sentiment, really, very, most Modal operators: distinguish hypothetical from real situations and weaken intensity, might, could, should
  • 53. Sentiment Analysis - Techniques A text can contain multiple sentiments, which will usually be connected to each other, perhaps through a comparison (as for products). Analyse the whole text, sentence by sentence.
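
The cues listed in the slides (lexical choice, negators, intensifiers, modal operators) can be combined into a very small rule-based scorer. The sketch below is only an illustration: the word lists are taken from the slides, while the numeric weights (2.0 for intensifiers, 0.5 for modals), the tokenisation, and the function names are assumptions made for this example, not a reference method.

```python
import re

# Word lists from the slides; a real lexicon would be far larger.
POSITIVE = {"nice", "boost", "benefit", "brave"}
NEGATIVE = {"terrible", "conspire", "catastrophe", "cowardly"}
NEGATORS = {"not", "never"}
INTENSIFIERS = {"really", "very", "most"}
MODALS = {"might", "could", "should"}


def score_sentence(sentence):
    """Return a signed score for one sentence (>0 positive, <0 negative)."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    score = 0.0
    negate = False   # set by a negator, consumed by the next sentiment word
    weight = 1.0     # scaled by intensifiers and modals (assumed factors)
    for token in tokens:
        if token in NEGATORS:
            negate = not negate
        elif token in INTENSIFIERS:
            weight *= 2.0
        elif token in MODALS:
            weight *= 0.5
        elif token in POSITIVE or token in NEGATIVE:
            valence = 1.0 if token in POSITIVE else -1.0
            if negate:
                valence = -valence
            score += valence * weight
            negate, weight = False, 1.0  # reset modifiers after use
    return score


def score_text(text):
    """Split a text into sentences and score each one separately."""
    sentences = [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]
    return [(s, score_sentence(s)) for s in sentences]
```

Scoring sentence by sentence keeps mixed opinions apart, for example praise for one product and criticism of another in the same review. Sarcasm, irony and slang are out of reach for rules this simple.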
  • 54. Sentiment Analysis - Techniques There is a market of fake opinions!
  • 55. Sentiment Analysis - Techniques Every opinion is a quintuple: entity, feature, sentiment value, holder, time. Mike87 on 23-06-2009: “I bought an iPhone a few days ago. It is such a nice phone. The touch screen is really cool. The voice quality is clear too. It is much better than my old Blackberry, which was a terrible phone and so difficult to type with its tiny keys. However, my mother was mad with me as I did not tell her before I bought the phone. She also thought the phone was too expensive” (iPhone, GENERAL, +, Mike87, 23-06-2009) (iPhone, touch_screen, +, Mike87, 23-06-2009) … We are turning unstructured data into structured data.
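
One way to hold such quintuples in code is a small typed record. The sketch below uses only the two tuples shown on the slide. The class name `Opinion`, the field names, and the helper `sentiments_by_feature` are hypothetical choices for this illustration.

```python
from collections import defaultdict
from datetime import date
from typing import NamedTuple


class Opinion(NamedTuple):
    """One opinion quintuple: entity, feature, sentiment value, holder, time."""
    entity: str
    feature: str
    sentiment: str  # "+" or "-" as on the slide
    holder: str
    time: date


# The two quintuples given on the slide.
opinions = [
    Opinion("iPhone", "GENERAL", "+", "Mike87", date(2009, 6, 23)),
    Opinion("iPhone", "touch_screen", "+", "Mike87", date(2009, 6, 23)),
]


def sentiments_by_feature(records, entity):
    """Group the sentiment values of one entity by feature."""
    grouped = defaultdict(list)
    for opinion in records:
        if opinion.entity == entity:
            grouped[opinion.feature].append(opinion.sentiment)
    return dict(grouped)
```

Once opinions are stored this way they can be filtered, grouped and counted like any other structured data.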
  • 56. An Operative Plan Preliminary: What’s your goal? e.g. Reaction to my new product launch (1 month tail). How can you obtain it? e.g. Twitter, Facebook and related-field blogs (perhaps using Google Alerts?). How can you measure it? Which KPI? Which test? e.g. KPI: # of mentions/comments/posts, % of positive over total; choose threshold values for the goal to be met (for each KPI).
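
The two KPIs named above (number of mentions and share of positive items) and their thresholds can be computed in a few lines. This is a minimal sketch: the function name `kpi_report`, the label strings, and the example threshold values are assumptions, since the slides give no concrete numbers.

```python
def kpi_report(labels, mention_threshold, positive_threshold):
    """Compute both KPIs and compare each with its threshold.

    labels: one sentiment label per collected item ("positive",
    "negative" or "neutral"); thresholds are chosen by the project team.
    """
    mentions = len(labels)
    positive = sum(1 for label in labels if label == "positive")
    share = positive / mentions if mentions else 0.0
    return {
        "mentions": mentions,
        "positive_share": share,
        # Ratios to the thresholds, e.g. 2.0 means "2x threshold".
        "mentions_ratio": mentions / mention_threshold,
        "positive_ratio": share / positive_threshold,
        "goal_met": mentions >= mention_threshold and share >= positive_threshold,
    }


# Hypothetical data: 10 items, thresholds of 5 mentions and 50% positive.
example = ["positive"] * 6 + ["negative"] * 2 + ["neutral"] * 2
report = kpi_report(example, mention_threshold=5, positive_threshold=0.5)
```

Reporting each KPI as a ratio to its threshold makes it easy to tell a story later, such as "reaction was twice the threshold".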
  • 57. Sentiment Analysis - An Operative Plan Universal step 1: Collect and Store The Data. Identify the data: tweets that mention the product (or the company?), comments on your Facebook page posts, the specific blogs to follow. Set up a system that can get the data: create or buy a tool to get the data automatically and programmatically. Store the data somewhere useful for the project and for your company (you don’t want to create new silos!).
  • 58. Sentiment Analysis - An Operative Plan Universal step 2: Clean The Data. Act on the data: deal with writer mistakes (replace, modify text); deal with program errors (remove records). Universal step 3: Analyse The Data. Analyse the data, extract the sentiment. Build the KPI.
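
The cleaning step can be sketched as two small functions: one that repairs writer mistakes in the text, and one that removes records damaged by the collection program. The abbreviation table, the record layout (a dict with a `"text"` key), and the rule that exact duplicates count as collector errors are all assumptions made for this illustration.

```python
import re

# Hypothetical abbreviation table; a real project would build its own.
ABBREVIATIONS = {"gr8": "great", "u": "you", "thx": "thanks"}


def clean_text(text):
    """Writer mistakes: normalise whitespace and case, expand abbreviations."""
    text = re.sub(r"\s+", " ", text).strip().lower()
    words = [ABBREVIATIONS.get(word, word) for word in text.split(" ")]
    return " ".join(words)


def clean_records(records):
    """Program errors: drop records with missing text and exact duplicates."""
    cleaned = []
    seen = set()
    for record in records:
        text = record.get("text")
        if not isinstance(text, str) or not text.strip():
            continue  # missing or empty text, nothing to analyse
        text = clean_text(text)
        if text in seen:
            continue  # same text collected twice
        seen.add(text)
        cleaned.append({**record, "text": text})
    return cleaned
```

Keeping the two concerns separate mirrors the slide: text is replaced or modified, whereas broken records are removed.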
  • 59. Sentiment Analysis - An Operative Plan Universal step 4: Visualise The Data. Learn from the numbers: you need to come up with a story, e.g. the reaction was massive on Twitter and Facebook (2x threshold), initially very positive (1.5x), then weaker but still good (1.3x); for blog posts the positive test was just passed (1x). Visualise the story: create a dashboard to follow the evolution in real time, and create a static infographic to describe what happened.
  • 61. What is Hadoop? Apache Hadoop is an open-source software framework for distributed storage and distributed processing of Big Data on clusters of commodity hardware. Created in 2005 by Doug Cutting and Mike Cafarella. Named after a toy elephant belonging to Cutting’s son. Originally developed to support the Nutch search engine project.
  • 62. Big Data Tools - What is Hadoop? The base Apache Hadoop framework is composed of the following modules: 1. Hadoop Common – libraries and utilities for other modules 2. Hadoop Distributed File System (HDFS) – a distributed file system that splits files into large blocks and distributes them among the machines 3. Hadoop MapReduce – a programming model for large-scale data processing. MapReduce ships code (.jar files) to the nodes that have the required data, and the nodes then process the data in parallel. 4. Hadoop YARN – a resource-management platform
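
The MapReduce programming model itself can be illustrated without a cluster. The sketch below runs the three phases (map, shuffle, reduce) of a word count in a single Python process. It is not Hadoop code and uses no Hadoop API; the function names are chosen for this example only. On a real cluster the map and reduce functions would run in parallel on the nodes holding the data.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values of one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce in sequence over a list of documents."""
    pairs = [pair for document in documents for pair in map_phase(document)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
```

Because each map call sees only one document and each reduce call sees only one key, both phases can be spread across many machines, which is the property Hadoop exploits.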
  • 63. The Hadoop Ecosystem Since 2012, "Hadoop" often refers not to just the base modules but rather to the Hadoop Ecosystem, which includes all of the additional packages that can be installed on top of or alongside Hadoop.
  • 64. Let us meet some of the “Hadoop tools”: Hive Pig Sqoop Oozie Big Data Tools - The Hadoop Ecosystem
  • 65. Big Data Tools - The Hadoop Ecosystem Both Hive and Pig allow you to run MapReduce jobs using simple query languages. Hive provides a SQL-like interface to data and allows you to impose a schema on the data; it is best suited for structured and semi-structured data. Pig translates the Pig Latin language so that scripts can run on Hadoop. It is best suited for data flow jobs, and for semi-structured and unstructured data.
  • 66. Big Data Tools - The Hadoop Ecosystem Sqoop: a tool designed for efficiently transferring bulk data between Apache Hadoop and structured data stores. Oozie: a workflow scheduler system to manage Apache Hadoop jobs. Oozie is integrated with the rest of the Hadoop Ecosystem, supporting several types of Hadoop jobs out of the box (including Pig, Hive and Sqoop) as well as system-specific jobs (such as Java programs and shell scripts).
  • 67. Big Data Tools - The Hadoop Ecosystem
  • 68. Big Data with Microsoft Hadoop can be deployed on premises as well as in the cloud. The cloud allows organisations to deploy Hadoop without acquiring hardware or needing specific setup expertise. Vendors who currently have a cloud offering include Microsoft, Amazon and Google. Let us focus on Microsoft. The key product is: HDInsight for Microsoft Azure
  • 69. Big Data Tools - Big Data with Microsoft Azure is Microsoft’s cloud platform, which offers several services. Azure HDInsight deploys and provisions Apache Hadoop clusters in the cloud. It is compatible with: Ambari, Avro, HBase, HDFS, Hive, Mahout, MapReduce and YARN, Oozie, Pig, Sqoop, Storm, Zookeeper. Azure PowerShell: a scripting environment to control and automate the deployment and management of your workloads in Azure.
  • 70. Big Data Tools - Big Data with Microsoft Windows Azure Blob Storage (WASB): Blob Storage is a general-purpose Hadoop-compatible Azure storage solution that integrates with HDInsight. Store data in Azure (blob) instead of in the cluster (HDFS). (Positive) consequences: data are still there after you finish MapReduce jobs and shut the cluster down; it is easier to share data with other applications.
  • 71. Big Data Tools - Big Data with Microsoft Windows Azure Blob Storage WASB
  • 72. Big Data Tools - Big Data with Microsoft Excel on steroids, thanks to some powerful add-ins. Power Query simplifies data discovery and access. You can connect to data across a wide variety of sources, including relational databases, the Web and Hadoop. You can combine and refine the data. You can save queries and refresh the data.
  • 73. Big Data Tools - Big Data with Microsoft Power Pivot allows non-specialised users to do some Business Intelligence on different data sources and create interactive reports, sharable as web applications. Power View is a very interactive data exploration, visualisation and presentation tool. Power Map is a data visualisation tool that allows you to plot geographic and temporal data on a 3D map, show it over time, and create visual tours.
  • 75. References Big Data & Digital Marketing Most of the original material has been posted on:

Editor's Notes

  1. 300 BC: Library of Alexandria Today: 320 Alexandria Library per person What about our knowledge level?
  2. 1. how we spend: real-time & targeted 2. vote: micro-targeting & n=all approach 3. study: reduce drop-out in schools[9] 4. stay healthy: wearable monitor 24h. Analyse to get suggestion on better life style[10] 5. keep/lose privacy: Big Data also attracts criminal hackers and identity thieves
  3. Imbalance between needs and resource: Too many complaints (25,000 per year) too few inspectors (200). You want your inspectors to tackle the most relevant cases only/first.
  4. Volume = 240+ TB of data at each GP per team. Velocity = take a decision in <~ 30 seconds. Main goal: to get real-time alarms on brakes, tires, fuel and other factors that affect car performance during a race.
  5. Flu can spread very fast with catastrophic consequences, traditional methods can be too slow. Each day, millions of users around the world search for health information online. As you might expect, there are more flu-related searches during flu season.
  6. Of course, not every person who searches for "flu" is actually sick, but a pattern emerges when all the flu-related search queries are added together. 
  7. Of course, not every person who searches for "flu" is actually sick, but a pattern emerges when all the flu-related search queries are added together. 
  8. Predictions based only on GFT can be very inaccurate; if used as a complementary tool, it is incredibly powerful.
  9. Injuries = largest market inefficiency in pro sports. In 2013, teams in Major League Baseball spent $665 million on the salaries of injured players and their replacements. Goal: anticipate when an athlete will get hurt before it actually happens.
  10. Big Data can help in running massive multiplayer games successfully and also in tailoring games to the player.
  11. Open data: certain data should be freely available to everyone to use and republish as they wish.
  12. Companies such as Google, Amazon, and Facebook are amassing vast amounts of information on everyone and everything. 
  13. Sentiment Analysis according to New York Times[15]: translating the vagaries of human emotion into hard data, mining the web for feelings, not facts
  14. Sentiment is categorised as negative, neutral or positive. Several works try to distinguish more states.