This document gives an overview of data science: what big data and data science are, where data science is applied, and the system infrastructure behind it. It then looks at recommendation systems in more detail, describing them as systems that predict user preferences for items. A case study on recommendation systems follows, outlining collaborative filtering and content-based recommendation algorithms and examining the two main collaborative filtering approaches, user-based and item-based. Challenges with collaborative filtering are also noted.
2. Agenda
What is big data
What is data science
Data science applications
System infrastructure
Case study – recommendation system
4. Data Scientist
Analytics
Artificial Intelligence
Statistics
Natural Language Processing
Feature Engineering
Scientific Method
Simulation
Data & Text Mining
Machine Learning
Predictive Modeling
Graph Analytics
Data Management
Data Warehousing
Mashups
Databases
Business Intelligence
Big Data
Information Retrieval
Art & Design
Business Mindset
Computer Science
Visualization
Communication
Data Product Design
Domain Knowledge
Ethics
Privacy & Security
Programming
Cloud Computing
Distributed Systems
Technology & Infrastructure
Growth Hacking
Social Network
Public Relations
Online Tools
Resource
8. Recommendation System
Recommendation systems are a subclass of information filtering systems that
seek to predict the “rating” or “preference” that a
user would give to an item (source: Wikipedia)
11. Collaborative Filtering
Basic Assumptions
• Users with similar interests have common
preferences
• A sufficiently large number of user preferences is
available
Main Approaches
• User-based
• Item-based
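The two approaches can be illustrated with a minimal sketch. The slides do not specify a similarity measure or prediction formula, so everything here is an assumption: cosine similarity over co-rated entries, a similarity-weighted average for the prediction, and a hypothetical toy `ratings` dictionary. The function names are illustrative only.

```python
from math import sqrt

# Hypothetical toy data: user -> {item: rating on a 1-5 scale}.
ratings = {
    "alice": {"A": 5, "B": 3, "C": 4},
    "bob":   {"A": 4, "B": 2, "C": 5, "D": 4},
    "carol": {"A": 1, "B": 5, "D": 2},
}

def cosine(u, v):
    """Cosine similarity restricted to the keys both rating dicts share."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[k] * v[k] for k in common)
    norm_u = sqrt(sum(u[k] ** 2 for k in common))
    norm_v = sqrt(sum(v[k] ** 2 for k in common))
    return dot / (norm_u * norm_v)

def predict_user_based(ratings, user, item):
    """User-based: average other users' ratings of `item`,
    weighted by how similar each of them is to `user`."""
    num = den = 0.0
    for other, their in ratings.items():
        if other == user or item not in their:
            continue
        sim = cosine(ratings[user], their)
        num += sim * their[item]
        den += abs(sim)
    return num / den if den else None

def transpose(ratings):
    """Turn user -> {item: rating} into item -> {user: rating}."""
    by_item = {}
    for user, their in ratings.items():
        for item, r in their.items():
            by_item.setdefault(item, {})[user] = r
    return by_item

def predict_item_based(ratings, user, item):
    """Item-based: average the user's own ratings of other items,
    weighted by how similar each of those items is to `item`."""
    by_item = transpose(ratings)
    if item not in by_item:
        return None
    num = den = 0.0
    for other_item, r in ratings[user].items():
        if other_item == item:
            continue
        sim = cosine(by_item[item], by_item[other_item])
        num += sim * r
        den += abs(sim)
    return num / den if den else None

user_pred = predict_user_based(ratings, "alice", "D")
item_pred = predict_item_based(ratings, "alice", "D")
```

In this toy data, alice has not rated item D. The user-based prediction leans toward bob's rating because his ratings resemble hers more than carol's do, while the item-based prediction is built from alice's own ratings of A, B, and C. Both functions return `None` when no usable ratings exist for the item.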
15. Problems with Collaborative Filtering
New user cold start problem
New item cold start problem
Popularity bias: tends to recommend only popular items
Sparsity problem: if there are many items to be recommended, the user/rating
matrix is sparse and it is hard to find users who have rated the same item
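These problems can be made concrete with a small sketch. The data and helper names below are hypothetical and chosen only for illustration: "dave" stands in for a new user, "E" for a new item, and the rating counts serve as a rough proxy for popularity.

```python
# Hypothetical toy data: user -> {item: rating}. "dave" has no ratings yet.
ratings = {
    "alice": {"A": 5, "B": 3},
    "bob":   {"A": 4, "C": 5},
    "carol": {"D": 2},
    "dave":  {},
}
catalog = ["A", "B", "C", "D", "E"]  # "E" has not been rated by anyone

def density(ratings, catalog):
    """Fraction of the user/item matrix that holds a rating."""
    filled = sum(len(r) for r in ratings.values())
    return filled / (len(ratings) * len(catalog))

def cold_start_users(ratings):
    """Users with no ratings: no basis for finding similar users."""
    return [u for u, r in ratings.items() if not r]

def cold_start_items(ratings, catalog):
    """Items with no ratings: they cannot be recommended from ratings alone."""
    rated = {i for r in ratings.values() for i in r}
    return [i for i in catalog if i not in rated]

def co_rated(ratings, u, v):
    """Items both users have rated; an empty set means no similarity signal."""
    return set(ratings[u]) & set(ratings[v])

def rating_counts(ratings):
    """Number of ratings per item; heavily rated items dominate predictions."""
    counts = {}
    for r in ratings.values():
        for item in r:
            counts[item] = counts.get(item, 0) + 1
    return counts

matrix_density = density(ratings, catalog)
new_users = cold_start_users(ratings)
new_items = cold_start_items(ratings, catalog)
overlap = co_rated(ratings, "alice", "carol")
counts = rating_counts(ratings)
```

Here only a quarter of the matrix is filled, alice and carol share no rated item, so no similarity can be computed between them, and item A has more ratings than any other item, which is the kind of imbalance that leads to popularity bias.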