DataScience and BigData Cebu 1st meetup
1. 1st Meetup Event : Meet and Greet
DataScience &
BigData Cebu Meetup
Friday, May 13, 2016 at
7:00 PM
A SPACE Cebu, Unit KLM
Crossroads Banilad, Cebu City,
6000 Cebu Philippines
2. Profile
❖ Data Engineer @ nanu
❖ Worked at IBM, Toshiba,
Lexmark, NEC
❖ Co-founder, Jaga-me Pte. Ltd.
❖ Founder, HandyNanay.co
❖ Master of Technology in
Knowledge Engineering @
National University of Singapore
(NUS)
❖ Organizer, IoTCebu Meetup
❖ Node.js, Python, C/C++
3. DataScience and BigData Cebu Meetup
❖ About
An avenue for students, tech entrepreneurs, professionals, businesspeople, hobbyists,
designers, developers and members of the academe to collaborate, share skills and
knowledge, and improve their overall understanding of Big Data, Data Analytics, Machine
Learning, Hadoop and Data Science through meetups, clinics, trainings, hackathons and ideation sessions.
❖ Mission
Train, mentor and educate members on current trends and best practices in Data Science and Big Data
through clinics, demos, presentations, ideation sessions, workshops and competitions (Kaggle, etc.)
❖ Vision
- Become the largest pool of Big Data and Data Science practitioners in Cebu
- Produce more experts and evangelists of Data Science and Big Data
❖ Goal
Develop more talents/members in the field of Big Data, Data Analytics and Data Science
4. What is Data Science
❖ Science - “the intellectual and practical activity
encompassing the systematic study of the structure and
behavior of the physical and natural world through
observation and experiment.”
❖ Data Science - is the intellectual and practical activity to s
❖ Data - raw, unprocessed, unorganised facts
6. Data Science Process
0. Ask important/interesting questions
1. Data Collection / Elicitation
2. Data Preparation (cleansing, cleaning, munging, trans
3. Data Exploration
4. Data Analysis
5. Data Modelling
6. Data Visualization (Results)
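The steps listed above can be walked through on a toy problem. The following is a minimal sketch in plain Python, not a prescribed workflow: the question, the `raw` records and the helper names (`prepare`, `pearson`, `fit_line`) are all made up for illustration, and a real project would normally use dedicated libraries and a real data source.

```python
from statistics import mean

# 0. Question (hypothetical): do more study hours go with higher quiz scores?

# 1. Collection: invented in-memory records standing in for a real source.
raw = [
    {"hours": "1", "score": "52"},
    {"hours": "2", "score": "58"},
    {"hours": "", "score": "61"},     # missing value
    {"hours": "3", "score": "n/a"},   # unparseable value
    {"hours": "4", "score": "71"},
    {"hours": "5", "score": "75"},
]

# 2. Preparation: keep only rows where both fields parse as numbers.
def prepare(rows):
    clean = []
    for row in rows:
        try:
            clean.append((float(row["hours"]), float(row["score"])))
        except ValueError:
            continue  # drop dirty rows
    return clean

data = prepare(raw)
hours = [h for h, _ in data]
scores = [s for _, s in data]

# 3. Exploration: basic summary of what is left after cleaning.
summary = {"rows": len(data), "mean_hours": mean(hours), "mean_score": mean(scores)}

# 4. Analysis: Pearson correlation between the two variables.
def pearson(xs, ys):
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

r = pearson(hours, scores)

# 5. Modelling: ordinary least-squares line, score = slope * hours + intercept.
def fit_line(xs, ys):
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx

slope, intercept = fit_line(hours, scores)

# 6. Visualization: a text bar chart, one '#' per 5 score points.
print(summary)
print(f"correlation r = {r:.3f}, model: score = {slope:.2f} * hours + {intercept:.2f}")
for h, s in data:
    print(f"{h:>3.0f} h | {'#' * int(s // 5)} {s:.0f}")
```

Each numbered comment maps to one step of the process; in practice the steps are revisited repeatedly rather than run once in order.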
7. Standard extended to DS
CRISP-DM
(Cross Industry Standard Process for Data Mining)
8. What is Big Data
❖ large, complex data sets
❖ data that does not fit on ordinary desktop or server storage
❖ 4 Vs (Volume, Velocity, Variety, Veracity)
9. The Rise of Data
• Social Media
• Banking
• Telecommunications
• IoT (Internet of Things)
• Web
• Mobile
• Government
• By 2017 global mobile data traffic will reach 11.2 exabytes p
1 EB = 1000^6 bytes = 10^18 bytes = 1000 petabytes = 1 million terabytes = 1 billion gigabytes.
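The exabyte arithmetic above can be checked with a few lines of Python. This is a small sketch assuming decimal (SI) units, where each step is a factor of 1000; the `BYTES_PER` table and the `convert` helper are illustrative names, not part of the original slides.

```python
# Decimal (SI) storage units: each step up is a factor of 1000.
BYTES_PER = {
    "GB": 1000 ** 3,
    "TB": 1000 ** 4,
    "PB": 1000 ** 5,
    "EB": 1000 ** 6,
}

def convert(value, from_unit, to_unit):
    """Convert a quantity between the decimal storage units in BYTES_PER."""
    return value * BYTES_PER[from_unit] / BYTES_PER[to_unit]

print(BYTES_PER["EB"] == 10 ** 18)   # 1 EB expressed in bytes
print(convert(1, "EB", "PB"))        # petabytes in one exabyte
print(convert(1, "EB", "TB"))        # terabytes in one exabyte
print(convert(1, "EB", "GB"))        # gigabytes in one exabyte
```

Binary units (EiB, PiB, and so on) use factors of 1024 instead and would give different figures.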
10. The Data Workers
• Data Scientist
• Data Engineer
• Data Analyst
• Business Analyst
12. The Data Products
• Actionable Insights (Data Analysis reports)
• Data Visualization
- Interactive
- Static reports
• Data Analytics
- Descriptive Analytics Model
- Predictive Analytics Model
• Machine Learning Model
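To make the difference between the descriptive and predictive analytics models listed above concrete, here is a minimal sketch in plain Python. The `monthly_sales` figures are invented, and the moving-average forecast is only one very simple choice of predictive model, picked here for brevity.

```python
from statistics import mean, median

# Hypothetical monthly sales figures, invented for illustration.
monthly_sales = [120, 135, 128, 150, 162, 171]

def describe(values):
    """Descriptive analytics: summarize what has already happened."""
    return {
        "count": len(values),
        "mean": mean(values),
        "median": median(values),
        "min": min(values),
        "max": max(values),
    }

def forecast_next(values, window=3):
    """Predictive analytics (naive): estimate the next value as the
    mean of the last `window` observations (a moving average)."""
    return mean(values[-window:])

print(describe(monthly_sales))
print(forecast_next(monthly_sales))
```

A moving average lags behind a rising or falling trend, so it serves as a baseline to compare richer models against rather than as a finished forecast.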
13. Data Science and Big Data Landscape in Cebu
(Philippines)
• IBM, HP, Cisco, Microsoft, Accenture, etc.
• DataSeer
• Exists Global
• SavvySherpa
• ANALITIKA - DTI, DOST, PLDT
• Big Data Analytics Summit Cebu
14. The Big GAP
• Not Enough Startups or Local Companies offering Data Scien
• Shortage of Math, Engineering and IT graduates with Data S
• Limited support from the government
• Not enough local experts
15. Opportunities
• Grassroots and local BigData / Data Science companies
• Local Data Analytics Startup
• BigData / Data Science Institutes or Learning Centers offere
• International DataScience Competitions ( Kaggle, Google, AW
• Train younger generation for DS and BigData Skills and Too
17. DEMO
A. Quick Introduction to Apache Zeppelin for Data Science Life Cycle
1. Download here - https://zeppelin.incubator.apache.org/
2. Author - https://spark-summit.org/eu-2015/speakers/moon-soo-lee/
3. macOS installation - http://www.makedatauseful.com/apache-zeppelin-on-osx-ultra-quick-start/
4. Sample notebooks - https://github.com/hortonworks-gallery/zeppelin-notebooks