NASSCOM Future Skills Training
Course – Data Science & Analytics
Dhruv Saxena
Assistant Professor (TEQIP-NPIU)
Introduction to Data Science
Mr. Dhruv Saxena, Assistant Professor (TEQIP-NPIU) 8
OBJECTIVES
The objective of this course is to impart the necessary knowledge of the
mathematical foundations needed for data science and to develop the
programming skills required to build data science applications.
Duration – 60 Hours (40L + 20C)
LEARNING OUTCOMES
At the end of this course, the students will be able to:
● Demonstrate understanding of the mathematical foundations
needed for data science.
● Collect, explore, clean, munge and manipulate data.
● Implement models such as k-nearest neighbors, Naïve Bayes,
linear and logistic regression, decision trees, neural networks, and
clustering.
● Build data science applications using Python-based toolkits.
Data, Big Data and Challenges
Data Science
◦ Introduction
◦ Why Data Science
Data Scientists
◦ What do they do?
Major/Concentration in Data Science
◦ What courses to take.
Data All Around
Lots of data is being collected and warehoused
◦ Web data, e-commerce
◦ Financial transactions, bank/credit transactions
◦ Online trading and purchasing
◦ Social networks
How Much Data Do We Have?
Google processes 20 PB a day (2008)
Facebook has 60 TB of daily logs
eBay has 6.5 PB of user data + 50 TB/day (5/2009)
1000 genomes project: 200 TB
Cost of 1 TB of disk: $35
Time to read a 1 TB disk: about 3 hrs (at 100 MB/s)
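The read-time figure above can be checked with a quick calculation. The sketch below assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) and a constant sequential throughput; the function names are illustrative and not part of the course material.

```python
import math

# Assumption: decimal units, as disk vendors usually quote them.
TB = 10**12   # bytes in one terabyte
PB = 10**15   # bytes in one petabyte
MB = 10**6    # bytes in one megabyte


def read_time_hours(size_bytes, throughput_bytes_per_s):
    """Hours needed to scan `size_bytes` once at a constant throughput."""
    return size_bytes / throughput_bytes_per_s / 3600


def disks_needed_per_day(size_bytes, throughput_bytes_per_s):
    """Disks that must be read in parallel to scan `size_bytes` within 24 hours."""
    bytes_per_disk_per_day = throughput_bytes_per_s * 86400
    return math.ceil(size_bytes / bytes_per_disk_per_day)


hours = read_time_hours(1 * TB, 100 * MB)
print(f"1 TB at 100 MB/s: {hours:.2f} hours")   # roughly 2.78 hours

# Scanning 20 PB in one day at the same per-disk rate needs thousands of disks.
print(disks_needed_per_day(20 * PB, 100 * MB))
```

The point of the arithmetic is that data at this scale cannot be scanned on a single machine in reasonable time, which is why parallel processing comes up as soon as volumes reach petabytes.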
Big Data
Big Data is any data that is expensive to manage and hard to extract value from
◦ Volume
  ◦ The size of the data
◦ Velocity
  ◦ The latency of data processing relative to the growing demand for interactivity
◦ Variety and Complexity
  ◦ The diversity of sources, formats, quality, structures
Big Data vs Data Science vs Data Analytics
What is Data Science?
Dealing with both unstructured and structured data, Data Science is a
field that comprises everything related to data cleansing,
preparation, and analysis.
Data Science is the combination of statistics, mathematics,
programming, problem-solving, capturing data in ingenious ways,
the ability to look at things differently, and the activity of cleansing,
preparing, and aligning the data.
In simple terms, it is the umbrella of techniques used when trying
to extract insights and information from data.
What is Big Data?
Big Data refers to humongous volumes of data that cannot be processed effectively with
traditional applications. The processing of Big Data begins with raw data
that isn’t aggregated and is most often impossible to store in the memory of a single
computer.
A buzzword used to describe immense volumes of data, both unstructured and
structured, Big Data inundates a business on a day-to-day basis. Big Data can be
analyzed for insights that lead to better decisions and strategic business
moves.
The definition of Big Data, given by Gartner, is, “Big data is high-volume, high-velocity
and/or high-variety information assets that demand cost-effective, innovative forms of
information processing that enable enhanced insight, decision making, and process
automation.”
Big Data
What is Data Analytics?
Data Analytics is the science of examining raw data to draw conclusions about
that information.
Data Analytics involves applying an algorithmic or mechanical process to
derive insights, for example, running through several data sets to look for
meaningful correlations between them.
It is used in several industries to allow organizations and companies to
make better decisions as well as to verify or disprove existing theories or
models. The focus of Data Analytics lies in inference, which is the process of
deriving conclusions that are solely based on what the researcher already
knows.
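Looking for correlations between data sets, as described above, can be made concrete with the Pearson correlation coefficient. The following is a minimal sketch in plain Python; the two data series (`ad_spend`, `sales`) are made-up illustrative values, not data from the course.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: weekly advertising spend and weekly sales.
ad_spend = [10, 20, 30, 40, 50]
sales = [12, 24, 33, 46, 55]

r = pearson(ad_spend, sales)
print(f"r = {r:.3f}")  # close to +1: the two series rise together
```

A value near +1 or -1 indicates a strong linear relationship, and a value near 0 indicates none. A high correlation does not by itself show that one variable causes the other.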
Types of Data We Have
Relational Data (Tables/Transaction/Legacy Data)
Text Data (Web)
Semi-structured Data (XML)
Graph Data
◦ Social Network, Semantic Web (RDF), …
Streaming Data
◦ You can afford to scan the data only once
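The single-scan constraint on streaming data means statistics have to be updated one value at a time, without storing the stream. A minimal sketch of one such technique, Welford's online algorithm for the mean and variance, is given below; the class name and the `sensor_stream` generator are illustrative assumptions.

```python
class RunningStats:
    """Mean and sample variance computed in a single pass (Welford's algorithm)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        """Fold one new observation into the running statistics."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        """Sample variance; 0.0 until at least two values have been seen."""
        return self._m2 / (self.n - 1) if self.n > 1 else 0.0


def sensor_stream():
    """Hypothetical stream: each value is produced once and never stored."""
    for reading in [2, 4, 4, 4, 5, 5, 7, 9]:
        yield reading


stats = RunningStats()
for value in sensor_stream():
    stats.update(value)

print(stats.n, stats.mean, stats.variance)
```

Memory use stays constant however long the stream runs, because only the count, the mean, and one accumulator are kept.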
What To Do With These Data?
Aggregation and Statistics
◦ Data warehousing and OLAP
Indexing, Searching, and Querying
◦ Keyword-based search
◦ Pattern matching (XML/RDF)
Knowledge discovery
◦ Data Mining
◦ Statistical Modeling
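Two of the operations listed above, aggregation and keyword-based search, can be sketched in a few lines of plain Python. The records and documents below are hypothetical; a real system would use a data warehouse or a search engine rather than in-memory dictionaries.

```python
from collections import defaultdict

# --- Aggregation: total sales per region (a tiny OLAP-style roll-up) ---
# Hypothetical transaction records: (region, amount).
transactions = [
    ("north", 120.0),
    ("south", 80.0),
    ("north", 30.0),
    ("east", 50.0),
]

totals = defaultdict(float)
for region, amount in transactions:
    totals[region] += amount

# --- Keyword search: an inverted index from word to document ids ---
# Hypothetical document collection.
documents = {
    1: "big data needs new tools",
    2: "data science extracts knowledge from data",
    3: "statistics and modeling",
}

index = defaultdict(set)
for doc_id, text in documents.items():
    for word in text.lower().split():
        index[word].add(doc_id)


def search(keyword):
    """Return the sorted ids of documents containing `keyword`."""
    return sorted(index.get(keyword.lower(), set()))


print(dict(totals))
print(search("data"))
```

The inverted index is built once, so each query is a dictionary lookup instead of a scan over every document.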
Big Data and Data Science
“… the sexy job in the next 10 years will be statisticians,” Hal Varian, Google Chief
Economist
The U.S. will need 140,000-190,000 predictive analysts and 1.5 million managers/analysts
by 2018 (McKinsey Global Institute, June 2011).
India will need more than 160,000 data scientists by 2020, and worldwide demand is
predicted to be around 2.7 million by 2020.
New Data Science institutes being created or repurposed – NYU, Columbia, Washington,
UCB,...
New degree programs, courses, boot-camps:
◦ e.g., at Berkeley: Stats, I-School, CS, Astronomy…
◦ One proposal (elsewhere) for an MS in “Big Data Science”
What is Data Science?
An area that manages, manipulates, extracts, and interprets knowledge from
a tremendous amount of data.
Data science (DS) is a multidisciplinary field of study whose goal is to address the challenges
in big data.
Data science principles apply to all data – big and small.
Simply – Extraction of knowledge from large volumes of data that are structured or
unstructured.
What is Data Science?
Theories and techniques from many fields and disciplines are used to
investigate and analyze a large amount of data to help decision makers in
many industries such as science, engineering, economics, politics, finance,
and education.
◦ Computer Science
  ◦ Pattern recognition, visualization, data warehousing, high-performance computing,
    databases, AI
◦ Mathematics
  ◦ Mathematical modeling
◦ Statistics
  ◦ Statistical and stochastic modeling, probability
Why is it sexy?
Gartner’s 2014 Hype Cycle
Data Science
Real Life Examples
Companies learn your secrets, shopping patterns, and preferences
◦ For example, can we know if a woman is pregnant, even if she doesn’t want us to know?
(Target case study)
Data science and elections (2008, 2012)
◦ 1 million people installed the Obama Facebook app that gave access to info on “friends”
Applications of Data Science
Internet Search
Search engines make use of data science algorithms to deliver the best results for search queries
in a fraction of a second.
Digital Advertisements
The entire digital marketing spectrum uses data science algorithms, from display banners to
digital billboards. This is the main reason digital ads get a higher CTR than traditional
advertisements.
Recommender Systems
Recommender systems not only make it easy to find relevant products among the billions of
products available but also add a lot to the user experience. Many companies use such systems to
promote their products and suggestions in accordance with the user’s demands and the relevance of
information. The recommendations are based on the user’s previous search results.
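One simple way to base recommendations on previous activity is to count which items tend to appear together in user histories. The sketch below is a toy item co-occurrence recommender; the histories and function names are hypothetical, and production systems use far richer signals and models.

```python
from collections import Counter
from itertools import combinations

# Hypothetical histories: user -> set of items viewed or searched for.
histories = {
    "u1": {"laptop", "mouse", "keyboard"},
    "u2": {"laptop", "mouse"},
    "u3": {"mouse", "mousepad"},
    "u4": {"laptop", "keyboard"},
}


def build_cooccurrence(histories):
    """Count, for every ordered item pair, how many users have both items."""
    co = Counter()
    for items in histories.values():
        for a, b in combinations(sorted(items), 2):
            co[(a, b)] += 1
            co[(b, a)] += 1
    return co


def recommend(user_items, co, k=2):
    """Suggest up to k unseen items that most often co-occur with the user's items."""
    scores = Counter()
    for (seen, candidate), count in co.items():
        if seen in user_items and candidate not in user_items:
            scores[candidate] += count
    # Highest score first; ties broken alphabetically for a stable result.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:k]]


co = build_cooccurrence(histories)
print(recommend({"mouse"}, co))
```

For a user who has only looked at a mouse, the laptop ranks first because it co-occurs with the mouse in two histories, while the keyboard and mousepad each co-occur once.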
Big Data for Retail
Whether a brick-and-mortar store or an online e-tailer, the key to staying in the
game and being competitive is understanding the customer better
in order to serve them. This requires the ability to analyze all the disparate
data sources that companies deal with every day, including
weblogs, customer transaction data, social media, store-branded
credit card data, and loyalty program data.
Applications of Big Data
Big Data for Financial Services
Credit card companies, retail banks, private wealth management
advisories, insurance firms, venture funds, and institutional investment
banks use big data for their financial services. The common problem
among them all is the massive amount of multi-structured data living
in multiple disparate systems, which big data can help solve. Big
data is thus used in several ways, such as:
Customer analytics
Compliance analytics
Fraud analytics
Operational analytics
Big Data in Communications
Gaining new subscribers, retaining customers, and
expanding within current subscriber bases are top
priorities for telecommunication service providers. The
solutions to these challenges lie in the ability to combine
and analyze the masses of customer-generated data and
machine-generated data that is being created every day.
Applications of Data Analytics
Healthcare
As cost pressures tighten, the main challenge for hospitals is to treat as many patients
as they can efficiently while also improving the quality of care. Instrument
and machine data are being used increasingly to track as well as optimize patient flow,
treatment, and equipment use in hospitals. It is estimated that a 1%
efficiency gain could yield more than $63 billion in global healthcare savings.
Travel
Data analytics can optimize the buying experience through mobile/weblog and social
media data analysis. Travel sites can gain insights into the customer’s desires and
preferences. Products can be up-sold by correlating current sales with subsequent
browsing, and browse-to-buy conversions can be increased via customized packages and offers.
Personalized travel recommendations can also be delivered by data analytics based on
social media data.
Gaming
Data Analytics helps in collecting data to optimize spend within as well as
across games. Game companies gain insight into the likes, dislikes, and
relationships of their users.
Energy Management
Most firms are using data analytics for energy management, including smart-grid
management, energy optimization, energy distribution, and building
automation in utility companies. The application here is centered on
controlling and monitoring network devices, dispatching crews, and managing
service outages. Utilities can integrate millions of data
points on network performance, and engineers can use the analytics to
monitor the network.
Data Scientists
Data Scientist
◦ The Sexiest Job of the 21st Century
“They find stories, extract
knowledge. They are not reporters.”
Data Scientists
Data scientists are the key to realizing the opportunities presented by big data. They bring
structure to it, find compelling patterns in it, and advise executives on the implications for
products, processes, and decisions.
What do Data Scientists do?
National Security
Cyber Security
Business Analytics
Engineering
Healthcare
And more ….
Concentration in Data Science
Mathematics and Applied Mathematics
Applied Statistics/Data Analysis
Solid Programming Skills (R, Python, Julia, SQL)
Data Mining
Database Storage and Management
Machine Learning and Discovery
Machine Learning
What is Machine Learning?
Machine learning (ML) is the study of computer algorithms
that improve automatically through experience.
It is seen as a subset of artificial intelligence.
Machine learning algorithms build a mathematical model
based on sample data, known as "training data", in order to
make predictions or decisions without being explicitly
programmed to do so.
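To illustrate predicting from training data without explicit rules, here is a minimal sketch of k-nearest neighbors, one of the models named in the learning outcomes. The training points and labels are made up for illustration, and `math.dist` requires Python 3.8 or later.

```python
from collections import Counter
from math import dist  # Euclidean distance, available from Python 3.8


def knn_predict(training_data, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    neighbors = sorted(training_data, key=lambda pair: dist(pair[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical training data: (feature vector, label).
training_data = [
    ((1.0, 1.0), "small"),
    ((1.5, 2.0), "small"),
    ((2.0, 1.5), "small"),
    ((6.0, 6.5), "large"),
    ((7.0, 6.0), "large"),
    ((6.5, 7.5), "large"),
]

print(knn_predict(training_data, (1.2, 1.4)))  # nearest points are all "small"
print(knn_predict(training_data, (6.8, 6.9)))  # nearest points are all "large"
```

No rule such as "x below 4 means small" is written anywhere in the code. The prediction comes entirely from the labeled examples, so changing the training data changes the behavior.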
What is Machine Learning?
Machine learning algorithms are used in a wide variety of
applications, such as email filtering and computer vision,
where it is difficult or infeasible to develop conventional
algorithms to perform the needed tasks.
Machine learning is closely related to computational
statistics, which focuses on making predictions using
computers.
Real-time applications
Video
NASSCOM Formative Assessments (Mid-training)
Formative assessment of students shall be conducted for 100 marks, and the test duration shall be
between 45 and 60 min.
Post-training assessment and certification shall be conducted after the successful completion of
training.
Only those students who are registered for and attending training on Future Skills shall be eligible for
mid-training and post-training assessment.
All assessments shall be conducted online and auto-proctored through NASSCOM SSC.
The assessment results shall be shared with the SPOC of the institute within 3 working days.
Formative assessment scores are independent and shall not be counted in the final assessment
scores for certification.
Tentative Date – 16th August 2020
NASSCOM Formative Assessment
Syllabus for Data Sci. & Analytics
Module | No. of Questions | Type of Questions | Indicative Time/Module | Marks
Introduction to Data Science | 2 | MCQ & DC | 2 min | 6
Mathematical Foundations | 18 | MCQ, DC & ScB | 20 min | 44
Multiple Choice Questions (MCQ)
In this type of question, the candidate is asked to choose one or more
responses from a limited list of choices. It also includes True/False
(T/F) questions, depending on the level of difficulty.
Scenario Based (ScB)
This question asks the candidate to describe how they might respond
to a hypothetical situation.
Direct Concept (DC)
This type of question revolves around the concept that a particular subject
deals with. The candidate is asked a direct question pertaining
to the concept of that particular subject. This can be an MCQ, Fill in
the Blank, or Multiple Response question.
Next Lecture
Mathematical Foundations
Introduction & Syllabus
Linear Algebra – Vectors & Matrices
Mr. Dhruv Saxena
Asst. Professor (TEQIP-NPIU)

Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiLow Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
Ukraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSUkraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICS
 
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
 
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships
 

Introduction to Data Science and Analytics

  • 1. NASSCOM Future Skills Training Course – Data Science & Analytics. Dhruv Saxena, Assistant Professor (TEQIP-NPIU)
  • 8. Introduction to Data Science
  • 10. OBJECTIVES: The objective of this course is to impart the necessary knowledge of the mathematical foundations needed for data science and to develop the programming skills required to build data science applications. Duration: 60 hours (40L + 20C).
  • 11. LEARNING OUTCOMES: At the end of this course, the students will be able to: ● Demonstrate understanding of the mathematical foundations needed for data science. ● Collect, explore, clean, munge and manipulate data. ● Implement models such as k-nearest neighbors, Naïve Bayes, linear and logistic regression, decision trees, neural networks and clustering. ● Build data science applications using Python-based toolkits.
  • 12. Data, Big Data and Challenges. Data Science: ◦ Introduction ◦ Why Data Science. Data Scientists: ◦ What do they do? Major/Concentration in Data Science: ◦ What courses to take.
  • 13. Data All Around: Lots of data is being collected and warehoused: ◦ Web data, e-commerce ◦ Financial transactions, bank/credit transactions ◦ Online trading and purchasing ◦ Social networks
  • 14. How Much Data Do We Have? Google processes 20 PB a day (2008). Facebook has 60 TB of daily logs. eBay has 6.5 PB of user data + 50 TB/day (5/2009). 1000 Genomes Project: 200 TB. Cost of 1 TB of disk: $35. Time to read a 1 TB disk: about 3 hrs (at 100 MB/s).
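The read-time figure on slide 14 can be checked with a short calculation. The sketch below assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) and a constant sequential throughput. The function name `read_time_hours` is illustrative and does not come from the slides.

```python
def read_time_hours(size_bytes: float, throughput_bytes_per_s: float) -> float:
    """Hours needed to read `size_bytes` at a constant throughput."""
    seconds = size_bytes / throughput_bytes_per_s
    return seconds / 3600


TB = 10**12  # assumption: decimal terabyte
MB = 10**6   # assumption: decimal megabyte

# 1 TB at 100 MB/s takes 10,000 seconds.
hours = read_time_hours(1 * TB, 100 * MB)
print(f"{hours:.2f} hours")  # roughly 2.78 hours, which the slide rounds to 3
```

The same function shows why scale matters: at this rate, a single disk would need on the order of weeks to read the 20 PB that the slide attributes to one day of processing at Google, so such workloads have to be spread across many machines.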
  • 15. Big Data: Big Data is any data that is expensive to manage and hard to extract value from. ◦ Volume: the size of the data. ◦ Velocity: the latency of data processing relative to the growing demand for interactivity. ◦ Variety and Complexity: the diversity of sources, formats, quality, and structures.
  • 16. Big Data vs Data Science vs Data Analytics
  • 17. What is Data Science? Dealing with both unstructured and structured data, Data Science is a field that comprises everything related to data cleansing, preparation, and analysis. It combines statistics, mathematics, programming, problem-solving, capturing data in ingenious ways, the ability to look at things differently, and the activity of cleansing, preparing, and aligning the data. In simple terms, it is the umbrella of techniques used when trying to extract insights and information from data.
  • 18. What is Big Data? Big Data refers to humongous volumes of data that cannot be processed effectively with traditional applications. The processing of Big Data begins with raw data that isn’t aggregated and is most often impossible to store in the memory of a single computer. A buzzword used to describe immense volumes of data, both unstructured and structured, Big Data inundates a business on a day-to-day basis. Big Data can be analyzed for insights that lead to better decisions and strategic business moves. The definition of Big Data given by Gartner is: “Big data is high-volume, high-velocity and/or high-variety information assets that demand cost-effective, innovative forms of information processing that enable enhanced insight, decision making, and process automation.”
  • 20. Big Data
  • 21. What is Data Analytics? Data Analytics is the science of examining raw data to draw conclusions about that information. It involves applying an algorithmic or mechanical process to derive insights, for example, running through several data sets to look for meaningful correlations between them. It is used in several industries to allow organizations and companies to make better decisions as well as to verify or disprove existing theories or models. The focus of Data Analytics lies in inference, which is the process of deriving conclusions that are solely based on what the researcher already knows.
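As a small illustration of "looking for meaningful correlations" between two data sets, the sketch below computes the Pearson correlation coefficient in plain Python. The function name `pearson_r` and the numbers in `ad_spend` and `sales` are made up for this example and do not come from the slides.

```python
from math import sqrt
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of products of deviations (the 1/n factors cancel in the ratio).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: weekly advertising spend and weekly sales.
ad_spend = [10, 12, 15, 18, 20, 25]
sales = [100, 115, 140, 160, 185, 230]
print(round(pearson_r(ad_spend, sales), 3))
```

A value close to +1 or -1 indicates a strong linear relationship and a value near 0 indicates little linear relationship. A high correlation alone does not show that one variable causes the other.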
  • 22. Types of Data We Have: Relational Data (Tables/Transaction/Legacy Data). Text Data (Web). Semi-structured Data (XML). Graph Data: Social Network, Semantic Web (RDF), … Streaming Data: you can afford to scan the data only once.
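The constraint on streaming data (the data can be scanned only once) shapes which algorithms are usable. One common single-pass technique, shown here as a sketch rather than as anything prescribed by the slides, is Welford's method for a running mean and variance. The function name and the sample readings are illustrative.

```python
from typing import Iterable, Tuple


def streaming_mean_variance(stream: Iterable[float]) -> Tuple[int, float, float]:
    """Return (count, mean, population variance) after a single pass."""
    count = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in stream:
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)
    variance = m2 / count if count else 0.0
    return count, mean, variance


# A generator can be consumed only once, much like a real stream.
readings = (float(v) for v in [2, 4, 4, 4, 5, 5, 7, 9])
n, mean, var = streaming_mean_variance(readings)
print(n, mean, var)
```

Only three numbers are kept in memory regardless of how long the stream is, which is the property that makes the method suitable when the data cannot be stored and rescanned.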
  • 23. What To Do With These Data? Aggregation and Statistics: ◦ Data warehousing and OLAP. Indexing, Searching, and Querying: ◦ Keyword-based search ◦ Pattern matching (XML/RDF). Knowledge discovery: ◦ Data Mining ◦ Statistical Modeling.
  • 24. Big Data and Data Science: “… the sexy job in the next 10 years will be statisticians,” Hal Varian, Google Chief Economist. The U.S. will need 140,000-190,000 predictive analysts and 1.5 million managers/analysts by 2018 (McKinsey Global Institute, June 2011). India will need 160,000+ Data Scientists by 2020, and world demand is predicted to be around 2.7 million by 2020. New Data Science institutes are being created or repurposed: NYU, Columbia, Washington, UCB,... New degree programs, courses, and boot-camps: ◦ e.g., at Berkeley: Stats, I-School, CS, Astronomy… ◦ One proposal (elsewhere) for an MS in “Big Data Science”
  • 25. What is Data Science? An area that manages, manipulates, extracts, and interprets knowledge from tremendous amounts of data. Data science (DS) is a multidisciplinary field of study with the goal of addressing the challenges in big data. Data science principles apply to all data, big and small. Simply put: the extraction of knowledge from large volumes of data that are structured or unstructured.
  • 26. What is Data Science? Theories and techniques from many fields and disciplines are used to investigate and analyze a large amount of data to help decision makers in many industries such as science, engineering, economics, politics, finance, and education. ◦ Computer Science: pattern recognition, visualization, data warehousing, high-performance computing, databases, AI. ◦ Mathematics: mathematical modeling. ◦ Statistics: statistical and stochastic modeling, probability.
  • 27. Why is it sexy? Gartner’s 2014 Hype Cycle
  • 28. Data Science
  • 30. Real Life Examples: Companies learn your secrets, shopping patterns, and preferences. ◦ For example, can we know if a woman is pregnant, even if she doesn’t want us to know? (Target case study.) Data Science and elections (2008, 2012): ◦ 1 million people installed the Obama Facebook app that gave access to info on “friends”.
  • 31. Applications of Data Science. Internet Search: Search engines make use of data science algorithms to deliver the best results for search queries in a fraction of a second. Digital Advertisements: The entire digital marketing spectrum uses data science algorithms, from display banners to digital billboards. This is the main reason digital ads get a higher CTR than traditional advertisements. Recommender Systems: Recommender systems not only make it easy to find relevant products among the billions available but also add a lot to the user experience. Many companies use such systems to promote their products and suggestions in accordance with the user’s demands and the relevance of information. The recommendations are based on the user’s previous search results.
  • 32. Big Data for Retail: Whether for a brick-and-mortar store or an online e-tailer, the answer to staying in the game and being competitive is understanding customers better in order to serve them. This requires the ability to analyze all the disparate data sources that companies deal with every day, including weblogs, customer transaction data, social media, store-branded credit card data, and loyalty program data.
  • 33. Applications of Big Data. Big Data for Financial Services: Credit card companies, retail banks, private wealth management advisories, insurance firms, venture funds, and institutional investment banks use big data for their financial services. The common problem among them all is the massive amount of multi-structured data living in multiple disparate systems, which big data can address. Big data is thus used in several ways, such as: customer analytics, compliance analytics, fraud analytics, and operational analytics.
  • 34. Big Data in Communications: Gaining new subscribers, retaining customers, and expanding within current subscriber bases are top priorities for telecommunication service providers. The solutions to these challenges lie in the ability to combine and analyze the masses of customer-generated and machine-generated data that are being created every day.
  • 35. Applications of Data Analytics. Healthcare: The main challenge for hospitals, as cost pressures tighten, is to treat as many patients as they can efficiently while improving the quality of care. Instrument and machine data are increasingly being used to track and optimize patient flow, treatment, and equipment use in hospitals. It is estimated that a 1% efficiency gain could yield more than $63 billion in global healthcare savings. Travel: Data analytics can optimize the buying experience through analysis of mobile, weblog, and social media data. Travel sites can gain insights into customers’ desires and preferences. Products can be up-sold by correlating current sales with subsequent browsing, increasing browse-to-buy conversions via customized packages and offers. Personalized travel recommendations can also be delivered by data analytics based on social media data.
  • 36. Gaming Data Analytics helps in collecting data to optimize and spend within as well as across games. Game companies gain insight into the dislikes, the relationships, and the likes of the users. Energy Management Most firms are using data analytics for energy management, including smart- grid management, energy optimization, energy distribution, and building automation in utility companies. The application here is centered on the controlling and monitoring of network devices, dispatch crews, and manage service outages. Utilities are given the ability to integrate millions of data points in the network performance and lets the engineers use the analytics to monitor the network. 36
  • 37. Data Scientists. Data Scientist: ◦ The Sexiest Job of the 21st Century. “They find stories, extract knowledge. They are not reporters.”
  • 38. Data Scientists: Data scientists are the key to realizing the opportunities presented by big data. They bring structure to it, find compelling patterns in it, and advise executives on the implications for products, processes, and decisions.
  • 39. What do Data Scientists do? National Security, Cyber Security, Business Analytics, Engineering, Healthcare, and more…
  • 40. Concentration in Data Science: Mathematics and Applied Mathematics; Applied Statistics/Data Analysis; Solid Programming Skills (R, Python, Julia, SQL); Data Mining; Database Storage and Management; Machine Learning and Discovery.
  • 41. Machine Learning
  • 42. What is Machine Learning? Machine learning (ML) is the study of computer algorithms that improve automatically through experience. It is seen as a subset of artificial intelligence. Machine learning algorithms build a mathematical model based on sample data, known as "training data", in order to make predictions or decisions without being explicitly programmed to do so.
  • 43. What is Machine Learning? Machine learning algorithms are used in a wide variety of applications, such as email filtering and computer vision, where it is difficult or infeasible to develop conventional algorithms to perform the needed tasks. Machine learning is closely related to computational statistics, which focuses on making predictions using computers.
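To make "predictions from training data without explicit programming" concrete, here is a minimal sketch of a k-nearest-neighbors classifier, one of the models named in the learning outcomes. The two features (number of links and number of exclamation marks per email), the labels, and every data point are invented for illustration. A real email filter would use far richer features and far more data.

```python
from collections import Counter
from math import dist
from typing import List, Sequence, Tuple

Example = Tuple[Sequence[float], str]  # (feature vector, label)


def knn_predict(train: List[Example], query: Sequence[float], k: int = 3) -> str:
    """Label `query` by majority vote among its k nearest training examples."""
    if not train:
        raise ValueError("training data must not be empty")
    # Sort training examples by Euclidean distance to the query point.
    neighbors = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical training data: (links, exclamation marks) -> label.
training_data: List[Example] = [
    ((8, 6), "spam"), ((7, 9), "spam"), ((9, 7), "spam"),
    ((1, 0), "ham"), ((0, 1), "ham"), ((2, 1), "ham"),
]

print(knn_predict(training_data, (6, 7)))  # nearest examples are all "spam"
print(knn_predict(training_data, (1, 1)))  # nearest examples are all "ham"
```

No rule such as "more than five links means spam" is written anywhere in the code. The decision comes entirely from the labeled examples, so changing the training data changes the behavior.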
  • 50. NASSCOM Formative Assessments (Mid-training): Formative assessment of students shall be conducted for 100 marks, and the test duration shall be 45-60 min. Post-training assessment and certification shall be conducted after the successful completion of training. Only students who are registered for and attending the Future Skills training shall be eligible for mid-training and post-training assessment. All assessments shall be conducted online and auto-proctored through NASSCOM SSC. The assessment results shall be shared with the SPOC of the institute within 3 working days. Formative Assessment scores are independent and shall not be counted in the final assessment scores for certification. Tentative date: 16th August 2020.
  • 51. NASSCOM Formative Assessment Syllabus for Data Sci. & Analytics
      Module | No. of Questions | Type of Questions | Indicative Time/Module | Marks
      Introduction to Data Science | 2 | MCQ & DC | 2 min | 6
      Mathematical Foundations | 18 | MCQ, DC & ScB | 20 min | 44
  • 52. Multiple Choice Questions (MCQ): In this type of question, the candidate is asked to choose one or more responses from a limited list of choices. It also includes True/False (T/F) questions, depending on the level of difficulty. Scenario-based (ScB): This question asks the candidate to describe how they might respond to a hypothetical situation. Direct Concept (DC): This type of question revolves around the concept that a particular subject deals with. The candidate is asked a direct question pertaining to that concept. It can be an MCQ, Fill in the Blank, or Multiple Response question.
  • 53. Next Lecture: Mathematical Foundations. Introduction & Syllabus; Linear Algebra: Vectors & Matrices.