Java BigData Full Stack
Development as is ...
Alexey Zinovyev, Java Trainer at EPAM
About
In IT since 2007
Working with Java since 2009
Working with Hadoop since 2012
At EPAM since 2015
3Java Big Data Full Stack Development
Contacts
E-mail : Alexey_Zinovyev@epam.com
Twitter : @zaleslaw @BigDataRussia
vk.com/big_data_russia Big Data Russia
vk.com/java_jvm Java & JVM langs
The Good Old Days
HRs & RMs are looking for Java developers
Is the Java Dream Team waiting for you?
Required Skills
• Advanced SQL
• Basic Linux
• Core Java & JVM
• Backend Development Experience
• Basic Computer Science Level
REAL WORLD
Let’s use JavaScript on the frontend ONLY
On the frontend ONLY?
Cruel world
Do you know an ML library for JavaScript?
Wild animals everywhere
And what I tell you
It’s time for a Java superhero, yeah!
Before discovering patterns, you should:
• Select small pieces of the data
• Define default values for missing data
• Remove strange signals from the data
• Merge several tables into one if required
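The preparation steps above can be sketched in plain Java. This is only an illustration under assumptions: the class name `DataPrepSketch`, the helper names, the default value and the plausible range are all hypothetical and do not come from the slides. Merging tables is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DataPrepSketch {

    /** Takes the first n rows as a small piece of the data to experiment on. */
    static <T> List<T> sample(List<T> rows, int n) {
        return new ArrayList<>(rows.subList(0, Math.min(n, rows.size())));
    }

    /** Replaces missing (null) readings with a default value. */
    static List<Double> fillMissing(List<Double> values, double defaultValue) {
        List<Double> filled = new ArrayList<>();
        for (Double v : values) {
            filled.add(v == null ? defaultValue : v);
        }
        return filled;
    }

    /** Drops readings outside the [min, max] range assumed to be plausible. */
    static List<Double> removeOutliers(List<Double> values, double min, double max) {
        List<Double> kept = new ArrayList<>();
        for (double v : values) {
            if (v >= min && v <= max) {
                kept.add(v);
            }
        }
        return kept;
    }

    /** Runs the two cleaning steps on one column of readings. */
    static List<Double> clean(List<Double> raw, double defaultValue, double min, double max) {
        return removeOutliers(fillMissing(raw, defaultValue), min, max);
    }

    public static void main(String[] args) {
        // Hypothetical temperature readings: one missing value, one strange signal.
        List<Double> raw = Arrays.asList(36.6, null, 37.1, 980.0, 36.9);
        System.out.println(clean(sample(raw, 5), 36.6, 30.0, 45.0));
    }
}
```

A fixed default and a fixed range are the simplest possible choices; real projects often derive both from the data itself.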
How it really works
• Share your data with us
• Our magic manipulations
• Building an answering machine
• PROFIT!!!
How to start?
WHAT IS BIG DATA?
Joke about Excel
5V
Every 60 seconds…
From Mobile Devices
From Industry
We started to keep and handle stupid new things!
10^6 rows
in MySQL
GB->TB->PB->?
Is BigData about PBs?
It’s hard to …
• .. store
• .. handle
• .. search in
• .. visualize
• .. send over the network
Likes in Odnoklassniki (Classmates): how to count them?
Crazy Zoo
2012
Crazy Zoo
2016
What this training will cover
NOSQL
What’s the problem with RDBMSs?
• Caching
• Master/Slave
• Cluster
• Table Partitioning
• Sharding
Family
Database party
Spring Data
How to start?
Java MongoDB Driver + Robomongo
BIG DATA TOOL MASTER
VS
DATA SCIENTIST
TRAIN
MODEL
Datasets
• Facebook users, tweets
• Trade transactions
• Government
• Medicine (genomic data)
• Telecommunications
Data Sources
• Relational Databases
• Data warehouses (Historical data)
• Files in CSV or in binary format
• The Internet or e-mail
• Scientific and research tools (R, Octave, Matlab)
Hey, man, predict something!
Man or sofa?
Typical questions for DM
• Which loan applicants are high-risk?
• How do we detect phone card fraud?
• What is the revenue prediction for next year?
• Can you recommend music for users?
Is the green circle a blue square or a red triangle? Let’s ask its neighbors!
kNN (k-nearest neighbor)
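The "ask the neighbors" idea can be sketched in a few lines of plain Java. This is a minimal illustration, not the speaker's code: the class `KnnSketch`, the `Point` type, the sample coordinates and the choice of Euclidean distance are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class KnnSketch {

    /** A labeled 2-D point; class and field names are illustrative. */
    static final class Point {
        final double x;
        final double y;
        final String label;

        Point(double x, double y, String label) {
            this.x = x;
            this.y = y;
            this.label = label;
        }
    }

    /** Returns the majority label among the k training points closest to (qx, qy). */
    static String classify(List<Point> training, final double qx, final double qy, int k) {
        List<Point> sorted = new ArrayList<>(training);
        // Sort by squared Euclidean distance to the query point.
        sorted.sort(Comparator.comparingDouble(
                (Point p) -> (p.x - qx) * (p.x - qx) + (p.y - qy) * (p.y - qy)));

        // Let the k nearest neighbors vote with their labels.
        Map<String, Integer> votes = new HashMap<>();
        for (Point p : sorted.subList(0, Math.min(k, sorted.size()))) {
            votes.merge(p.label, 1, Integer::sum);
        }

        // Pick the label with the most votes (ties are resolved arbitrarily).
        String best = null;
        int bestVotes = -1;
        for (Map.Entry<String, Integer> e : votes.entrySet()) {
            if (e.getValue() > bestVotes) {
                best = e.getKey();
                bestVotes = e.getValue();
            }
        }
        return best;
    }

    /** Hypothetical training set: three blue squares and three red triangles. */
    static List<Point> sampleData() {
        return Arrays.asList(
                new Point(1.0, 1.0, "blue square"),
                new Point(1.5, 2.0, "blue square"),
                new Point(2.0, 1.0, "blue square"),
                new Point(6.0, 6.0, "red triangle"),
                new Point(7.0, 6.5, "red triangle"),
                new Point(6.5, 7.5, "red triangle"));
    }

    public static void main(String[] args) {
        // The "green circle" at (2, 2) sits among the blue squares.
        System.out.println(classify(sampleData(), 2.0, 2.0, 3));
    }
}
```

Sorting the whole training set is fine for a demo; on large datasets a spatial index or an approximate search would usually replace it.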
Collaborative Filtering
Machine Learning vs Traditional Programming
Data
Science
Can a Java programmer be a Data Scientist?
Sexy Data Scientist
Real Data Scientist
How to start?
Weka
HADOOP
Hadoop and Data Knights
Hadoop
MapReduce in different languages
MapReduce for WordCount
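The slide's diagram is not preserved in this transcript, so here is a minimal in-memory sketch of the same idea in plain Java. It deliberately avoids the Hadoop API: the class `WordCountSketch` and its `map`, `reduce` and `wordCount` helpers are hypothetical names, and everything runs sequentially in one JVM. It only shows the three phases (map, shuffle, reduce) that a real job would distribute across a cluster.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class WordCountSketch {

    /** Map phase: emit a (word, 1) pair for every word in one input line. */
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                pairs.add(new SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    /** Reduce phase: sum all the counts emitted for one word. */
    static int reduce(String word, List<Integer> counts) {
        int sum = 0;
        for (int c : counts) {
            sum += c;
        }
        return sum;
    }

    /** Runs map, shuffle (group by key) and reduce one after another in memory. */
    static Map<String, Integer> wordCount(List<String> lines) {
        // Shuffle: group the emitted values by key.
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> pair : map(line)) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                       .add(pair.getValue());
            }
        }
        // Reduce each group to a single count.
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            result.put(e.getKey(), reduce(e.getKey(), e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(wordCount(Arrays.asList("big data is big", "data is data")));
    }
}
```

In Hadoop the grouping step is done by the framework between mappers and reducers; here it is an explicit map of lists so the data flow stays visible.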
Hadoop
Jobs
Hadoop frameworks
• Universal (MapReduce, Tez, RDD in Spark)
• Abstract (Pig, Pipeline Spark)
• SQL - like (Hive, Impala, Spark SQL)
• Graph processing (Giraph, GraphX)
• Machine Learning (Mahout, MLlib)
• Stream processing (Spark Streaming, Storm)
SPARK
SPARK: the bloody son of MR
• MapReduce in memory
• Up to 50x faster than Hadoop
• RDD is the basic building block (an immutable distributed collection of objects)
• Pipeline API (no need for Pig)
Spark
Family
MLlib supports
• Classification and regression
• Collaborative filtering
• Clustering
• Dimensionality reduction
• Optimization
Code sample MLlib (K-Means)
import org.apache.spark.mllib.clustering.KMeans;
import org.apache.spark.mllib.clustering.KMeansModel;

// Assumes `sc` is a JavaSparkContext and `parsedData` is a JavaRDD<Vector> of feature vectors
// Cluster the data into two classes using KMeans
int numClusters = 2;
int numIterations = 20;
KMeansModel clusters = KMeans.train(parsedData.rdd(), numClusters, numIterations);
// Evaluate clustering by computing Within Set Sum of Squared Errors
double WSSSE = clusters.computeCost(parsedData.rdd());
System.out.println("Within Set Sum of Squared Errors = " + WSSSE);
// Save and load model
clusters.save(sc.sc(), "myModelPath");
KMeansModel sameModel = KMeansModel.load(sc.sc(), "myModelPath");
MLlib
• .. borrows ideas from scikit-learn (Python lib) and Mahout
• .. runs fully on Spark and supports Spark’s Pipeline API
• .. represents datasets as Spark SQL’s SchemaRDD (renamed DataFrame in Spark 1.3)
• .. supports Hive as an external data source
• .. works well for large datasets and parallelized algorithms
It solves all problems!
How to start?
HDP Zoo
Ok, Google!
AWS Amazon
Infrastructure issues are waiting for YOU!
DEEP LEARNING
Deep Learning helps us build a NEW FUTURE
HOW TO LEARN?
DIFFERENT WAYS
1. Read books and write ‘pet’ projects
2. Become a mentee in a mentoring program
3. Take a MOOC
4. Take a training course
5. Visit conferences
Recommended Books

Contenu connexe

Tendances

MongoDB Pros and Cons
MongoDB Pros and ConsMongoDB Pros and Cons
MongoDB Pros and Consjohnrjenson
 
HPTS 2011: The NoSQL Ecosystem
HPTS 2011: The NoSQL EcosystemHPTS 2011: The NoSQL Ecosystem
HPTS 2011: The NoSQL EcosystemAdam Marcus
 
What is NoSQL and CAP Theorem
What is NoSQL and CAP TheoremWhat is NoSQL and CAP Theorem
What is NoSQL and CAP TheoremRahul Jain
 
NoSQL Databases
NoSQL DatabasesNoSQL Databases
NoSQL DatabasesBADR
 
Solr cloud the 'search first' nosql database extended deep dive
Solr cloud the 'search first' nosql database   extended deep diveSolr cloud the 'search first' nosql database   extended deep dive
Solr cloud the 'search first' nosql database extended deep divelucenerevolution
 
introduction to Neo4j (Tabriz Software Open Talks)
introduction to Neo4j (Tabriz Software Open Talks)introduction to Neo4j (Tabriz Software Open Talks)
introduction to Neo4j (Tabriz Software Open Talks)Farzin Bagheri
 
Introduction to Cassandra (June 2010)
Introduction to Cassandra (June 2010)Introduction to Cassandra (June 2010)
Introduction to Cassandra (June 2010)gdusbabek
 
Building Google-in-a-box: using Apache SolrCloud and Bigtop to index your big...
Building Google-in-a-box: using Apache SolrCloud and Bigtop to index your big...Building Google-in-a-box: using Apache SolrCloud and Bigtop to index your big...
Building Google-in-a-box: using Apache SolrCloud and Bigtop to index your big...rhatr
 
Elephants vs. Dolphins: Comparing PostgreSQL and MySQL for use in the DoD
Elephants vs. Dolphins:  Comparing PostgreSQL and MySQL for use in the DoDElephants vs. Dolphins:  Comparing PostgreSQL and MySQL for use in the DoD
Elephants vs. Dolphins: Comparing PostgreSQL and MySQL for use in the DoDJamey Hanson
 
Plmce2012 scaling pinterest
Plmce2012 scaling pinterestPlmce2012 scaling pinterest
Plmce2012 scaling pinterestMohit Jain
 
Non Relational Databases
Non Relational DatabasesNon Relational Databases
Non Relational DatabasesChris Baglieri
 
Big Data tools in practice
Big Data tools in practiceBig Data tools in practice
Big Data tools in practiceDarko Marjanovic
 
Hadoop and Cassandra at Rackspace
Hadoop and Cassandra at RackspaceHadoop and Cassandra at Rackspace
Hadoop and Cassandra at RackspaceStu Hood
 
A Hitchhiker's Guide to NOSQL v1.0
A Hitchhiker's Guide to NOSQL v1.0A Hitchhiker's Guide to NOSQL v1.0
A Hitchhiker's Guide to NOSQL v1.0Krishna Sankar
 

Tendances (20)

MongoDB Pros and Cons
MongoDB Pros and ConsMongoDB Pros and Cons
MongoDB Pros and Cons
 
HPTS 2011: The NoSQL Ecosystem
HPTS 2011: The NoSQL EcosystemHPTS 2011: The NoSQL Ecosystem
HPTS 2011: The NoSQL Ecosystem
 
What is NoSQL and CAP Theorem
What is NoSQL and CAP TheoremWhat is NoSQL and CAP Theorem
What is NoSQL and CAP Theorem
 
NoSQL Databases
NoSQL DatabasesNoSQL Databases
NoSQL Databases
 
Solr cloud the 'search first' nosql database extended deep dive
Solr cloud the 'search first' nosql database   extended deep diveSolr cloud the 'search first' nosql database   extended deep dive
Solr cloud the 'search first' nosql database extended deep dive
 
introduction to Neo4j (Tabriz Software Open Talks)
introduction to Neo4j (Tabriz Software Open Talks)introduction to Neo4j (Tabriz Software Open Talks)
introduction to Neo4j (Tabriz Software Open Talks)
 
Introduction to Cassandra (June 2010)
Introduction to Cassandra (June 2010)Introduction to Cassandra (June 2010)
Introduction to Cassandra (June 2010)
 
NoSQL-Overview
NoSQL-OverviewNoSQL-Overview
NoSQL-Overview
 
Introduction to NoSQL
Introduction to NoSQLIntroduction to NoSQL
Introduction to NoSQL
 
Building Google-in-a-box: using Apache SolrCloud and Bigtop to index your big...
Building Google-in-a-box: using Apache SolrCloud and Bigtop to index your big...Building Google-in-a-box: using Apache SolrCloud and Bigtop to index your big...
Building Google-in-a-box: using Apache SolrCloud and Bigtop to index your big...
 
Elephants vs. Dolphins: Comparing PostgreSQL and MySQL for use in the DoD
Elephants vs. Dolphins:  Comparing PostgreSQL and MySQL for use in the DoDElephants vs. Dolphins:  Comparing PostgreSQL and MySQL for use in the DoD
Elephants vs. Dolphins: Comparing PostgreSQL and MySQL for use in the DoD
 
No sq lv1_0
No sq lv1_0No sq lv1_0
No sq lv1_0
 
Sql vs nosql
Sql vs nosqlSql vs nosql
Sql vs nosql
 
Plmce2012 scaling pinterest
Plmce2012 scaling pinterestPlmce2012 scaling pinterest
Plmce2012 scaling pinterest
 
Non Relational Databases
Non Relational DatabasesNon Relational Databases
Non Relational Databases
 
Big Data tools in practice
Big Data tools in practiceBig Data tools in practice
Big Data tools in practice
 
NOSQL Overview
NOSQL OverviewNOSQL Overview
NOSQL Overview
 
Hadoop and Cassandra at Rackspace
Hadoop and Cassandra at RackspaceHadoop and Cassandra at Rackspace
Hadoop and Cassandra at Rackspace
 
A Hitchhiker's Guide to NOSQL v1.0
A Hitchhiker's Guide to NOSQL v1.0A Hitchhiker's Guide to NOSQL v1.0
A Hitchhiker's Guide to NOSQL v1.0
 
NoSQL
NoSQLNoSQL
NoSQL
 

En vedette

Мастер-класс по BigData Tools для HappyDev'15
Мастер-класс по BigData Tools для HappyDev'15Мастер-класс по BigData Tools для HappyDev'15
Мастер-класс по BigData Tools для HappyDev'15Alexey Zinoviev
 
Google Docs. Zinoviev Alexey
Google Docs. Zinoviev AlexeyGoogle Docs. Zinoviev Alexey
Google Docs. Zinoviev AlexeyAlexey Zinoviev
 
HappyDev'15 Keynote: Когда все данные станут большими...
HappyDev'15 Keynote: Когда все данные станут большими...HappyDev'15 Keynote: Когда все данные станут большими...
HappyDev'15 Keynote: Когда все данные станут большими...Alexey Zinoviev
 
MongoDB первые впечатления
MongoDB первые впечатленияMongoDB первые впечатления
MongoDB первые впечатленияfudz1k
 
MongoDB basics in Russian
MongoDB basics in RussianMongoDB basics in Russian
MongoDB basics in RussianOleg Kachan
 
Кратко о MongoDB
Кратко о MongoDBКратко о MongoDB
Кратко о MongoDBGleb Lebedev
 
MongoDB. Области применения, преимущества и узкие места, тонкости использован...
MongoDB. Области применения, преимущества и узкие места, тонкости использован...MongoDB. Области применения, преимущества и узкие места, тонкости использован...
MongoDB. Области применения, преимущества и узкие места, тонкости использован...phpdevby
 
Преимущества NoSQL баз данных на примере MongoDB
Преимущества NoSQL баз данных на примере MongoDBПреимущества NoSQL баз данных на примере MongoDB
Преимущества NoSQL баз данных на примере MongoDBUNETA
 
Docker 基本概念與指令操作
Docker  基本概念與指令操作Docker  基本概念與指令操作
Docker 基本概念與指令操作NUTC, imac
 
Spark Solution for Rank Product
Spark Solution for Rank ProductSpark Solution for Rank Product
Spark Solution for Rank ProductMahmoud Parsian
 
Выбор NoSQL базы данных для вашего проекта: "Не в свои сани не садись"
Выбор NoSQL базы данных для вашего проекта: "Не в свои сани не садись"Выбор NoSQL базы данных для вашего проекта: "Не в свои сани не садись"
Выбор NoSQL базы данных для вашего проекта: "Не в свои сани не садись"Alexey Zinoviev
 
Performance in Spark 2.0, PDX Spark Meetup 8/18/16
Performance in Spark 2.0, PDX Spark Meetup 8/18/16Performance in Spark 2.0, PDX Spark Meetup 8/18/16
Performance in Spark 2.0, PDX Spark Meetup 8/18/16pdx_spark
 
JavaDayKiev'15 Java in production for Data Mining Research projects
JavaDayKiev'15 Java in production for Data Mining Research projectsJavaDayKiev'15 Java in production for Data Mining Research projects
JavaDayKiev'15 Java in production for Data Mining Research projectsAlexey Zinoviev
 
Joker'16 Spark 2 (API changes; Structured Streaming; Encoders)
Joker'16 Spark 2 (API changes; Structured Streaming; Encoders)Joker'16 Spark 2 (API changes; Structured Streaming; Encoders)
Joker'16 Spark 2 (API changes; Structured Streaming; Encoders)Alexey Zinoviev
 
使用 CLI 管理 OpenStack 平台
使用 CLI 管理 OpenStack 平台使用 CLI 管理 OpenStack 平台
使用 CLI 管理 OpenStack 平台NUTC, imac
 
Joker'15 Java straitjackets for MongoDB
Joker'15 Java straitjackets for MongoDBJoker'15 Java straitjackets for MongoDB
Joker'15 Java straitjackets for MongoDBAlexey Zinoviev
 

En vedette (20)

Мастер-класс по BigData Tools для HappyDev'15
Мастер-класс по BigData Tools для HappyDev'15Мастер-класс по BigData Tools для HappyDev'15
Мастер-класс по BigData Tools для HappyDev'15
 
Google Docs. Zinoviev Alexey
Google Docs. Zinoviev AlexeyGoogle Docs. Zinoviev Alexey
Google Docs. Zinoviev Alexey
 
HappyDev'15 Keynote: Когда все данные станут большими...
HappyDev'15 Keynote: Когда все данные станут большими...HappyDev'15 Keynote: Когда все данные станут большими...
HappyDev'15 Keynote: Когда все данные станут большими...
 
MongoDB первые впечатления
MongoDB первые впечатленияMongoDB первые впечатления
MongoDB первые впечатления
 
MongoDB basics in Russian
MongoDB basics in RussianMongoDB basics in Russian
MongoDB basics in Russian
 
Кратко о MongoDB
Кратко о MongoDBКратко о MongoDB
Кратко о MongoDB
 
JBoss seam 2 part
JBoss seam 2 partJBoss seam 2 part
JBoss seam 2 part
 
MongoDB. Области применения, преимущества и узкие места, тонкости использован...
MongoDB. Области применения, преимущества и узкие места, тонкости использован...MongoDB. Области применения, преимущества и узкие места, тонкости использован...
MongoDB. Области применения, преимущества и узкие места, тонкости использован...
 
A22 Introduction to DTrace by Kyle Hailey
A22 Introduction to DTrace by Kyle HaileyA22 Introduction to DTrace by Kyle Hailey
A22 Introduction to DTrace by Kyle Hailey
 
Преимущества NoSQL баз данных на примере MongoDB
Преимущества NoSQL баз данных на примере MongoDBПреимущества NoSQL баз данных на примере MongoDB
Преимущества NoSQL баз данных на примере MongoDB
 
Docker 基本概念與指令操作
Docker  基本概念與指令操作Docker  基本概念與指令操作
Docker 基本概念與指令操作
 
Spark Solution for Rank Product
Spark Solution for Rank ProductSpark Solution for Rank Product
Spark Solution for Rank Product
 
Выбор NoSQL базы данных для вашего проекта: "Не в свои сани не садись"
Выбор NoSQL базы данных для вашего проекта: "Не в свои сани не садись"Выбор NoSQL базы данных для вашего проекта: "Не в свои сани не садись"
Выбор NoSQL базы данных для вашего проекта: "Не в свои сани не садись"
 
Apache Spark Essentials
Apache Spark EssentialsApache Spark Essentials
Apache Spark Essentials
 
Performance in Spark 2.0, PDX Spark Meetup 8/18/16
Performance in Spark 2.0, PDX Spark Meetup 8/18/16Performance in Spark 2.0, PDX Spark Meetup 8/18/16
Performance in Spark 2.0, PDX Spark Meetup 8/18/16
 
JavaDayKiev'15 Java in production for Data Mining Research projects
JavaDayKiev'15 Java in production for Data Mining Research projectsJavaDayKiev'15 Java in production for Data Mining Research projects
JavaDayKiev'15 Java in production for Data Mining Research projects
 
Joker'16 Spark 2 (API changes; Structured Streaming; Encoders)
Joker'16 Spark 2 (API changes; Structured Streaming; Encoders)Joker'16 Spark 2 (API changes; Structured Streaming; Encoders)
Joker'16 Spark 2 (API changes; Structured Streaming; Encoders)
 
Meetup Spark 2.0
Meetup Spark 2.0Meetup Spark 2.0
Meetup Spark 2.0
 
使用 CLI 管理 OpenStack 平台
使用 CLI 管理 OpenStack 平台使用 CLI 管理 OpenStack 平台
使用 CLI 管理 OpenStack 平台
 
Joker'15 Java straitjackets for MongoDB
Joker'15 Java straitjackets for MongoDBJoker'15 Java straitjackets for MongoDB
Joker'15 Java straitjackets for MongoDB
 

Similaire à Java BigData Full Stack Development (version 2.0)

Bringing Deep Learning into production
Bringing Deep Learning into production Bringing Deep Learning into production
Bringing Deep Learning into production Paolo Platter
 
Ncku csie talk about Spark
Ncku csie talk about SparkNcku csie talk about Spark
Ncku csie talk about SparkGiivee The
 
Introduction to Big Data and AI for Business Analytics and Prediction
Introduction to Big Data and AI for Business Analytics and PredictionIntroduction to Big Data and AI for Business Analytics and Prediction
Introduction to Big Data and AI for Business Analytics and PredictionJongwook Woo
 
Introduction to Big Data and its Trends
Introduction to Big Data and its TrendsIntroduction to Big Data and its Trends
Introduction to Big Data and its TrendsJongwook Woo
 
Big Data for Data Scientists - Info Session
Big Data for Data Scientists - Info SessionBig Data for Data Scientists - Info Session
Big Data for Data Scientists - Info SessionWeCloudData
 
[NYJavaSig] Riding the Distributed Streams - Feb 2nd, 2017
[NYJavaSig] Riding the Distributed Streams - Feb 2nd, 2017[NYJavaSig] Riding the Distributed Streams - Feb 2nd, 2017
[NYJavaSig] Riding the Distributed Streams - Feb 2nd, 2017Viktor Gamov
 
JDD2015: Thorny path to Data Mining projects - Alexey Zinoviev
JDD2015: Thorny path to Data Mining projects - Alexey Zinoviev JDD2015: Thorny path to Data Mining projects - Alexey Zinoviev
JDD2015: Thorny path to Data Mining projects - Alexey Zinoviev PROIDEA
 
Beauty and Big Data
Beauty and Big DataBeauty and Big Data
Beauty and Big DataSri Ambati
 
Quick dive into the big data pool without drowning - Demi Ben-Ari @ Panorays
Quick dive into the big data pool without drowning - Demi Ben-Ari @ PanoraysQuick dive into the big data pool without drowning - Demi Ben-Ari @ Panorays
Quick dive into the big data pool without drowning - Demi Ben-Ari @ PanoraysDemi Ben-Ari
 
State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...
State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...
State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...Big Data Spain
 
Embracing Hadoop with a musical touch!
Embracing Hadoop with a musical touch!Embracing Hadoop with a musical touch!
Embracing Hadoop with a musical touch!DataWorks Summit
 
PyData Texas 2015 Keynote
PyData Texas 2015 KeynotePyData Texas 2015 Keynote
PyData Texas 2015 KeynotePeter Wang
 
Big Data and High Performance Computing
Big Data and High Performance ComputingBig Data and High Performance Computing
Big Data and High Performance ComputingAbzetdin Adamov
 
Hadoop @ Yahoo! - Internet Scale Data Processing
Hadoop @ Yahoo! - Internet Scale Data ProcessingHadoop @ Yahoo! - Internet Scale Data Processing
Hadoop @ Yahoo! - Internet Scale Data ProcessingYahoo Developer Network
 
Hadoop Master Class : A concise overview
Hadoop Master Class : A concise overviewHadoop Master Class : A concise overview
Hadoop Master Class : A concise overviewAbhishek Roy
 
2015 Data Science Summit @ dato Review
2015 Data Science Summit @ dato Review2015 Data Science Summit @ dato Review
2015 Data Science Summit @ dato ReviewHang Li
 
DATA SCIENCE TRAINING IN CHENNAI
DATA SCIENCE TRAINING IN CHENNAIDATA SCIENCE TRAINING IN CHENNAI
DATA SCIENCE TRAINING IN CHENNAIshivajirao12345
 

Similaire à Java BigData Full Stack Development (version 2.0) (20)

Bringing Deep Learning into production
Bringing Deep Learning into production Bringing Deep Learning into production
Bringing Deep Learning into production
 
Ncku csie talk about Spark
Ncku csie talk about SparkNcku csie talk about Spark
Ncku csie talk about Spark
 
Introduction to Big Data and AI for Business Analytics and Prediction
Introduction to Big Data and AI for Business Analytics and PredictionIntroduction to Big Data and AI for Business Analytics and Prediction
Introduction to Big Data and AI for Business Analytics and Prediction
 
Hadoop and SAP BI
Hadoop and SAP BI   Hadoop and SAP BI
Hadoop and SAP BI
 
Introduction to Big Data and its Trends
Introduction to Big Data and its TrendsIntroduction to Big Data and its Trends
Introduction to Big Data and its Trends
 
Big Data for Data Scientists - Info Session
Big Data for Data Scientists - Info SessionBig Data for Data Scientists - Info Session
Big Data for Data Scientists - Info Session
 
Big Data made easy with a Spark
Big Data made easy with a SparkBig Data made easy with a Spark
Big Data made easy with a Spark
 
[NYJavaSig] Riding the Distributed Streams - Feb 2nd, 2017
[NYJavaSig] Riding the Distributed Streams - Feb 2nd, 2017[NYJavaSig] Riding the Distributed Streams - Feb 2nd, 2017
[NYJavaSig] Riding the Distributed Streams - Feb 2nd, 2017
 
JDD2015: Thorny path to Data Mining projects - Alexey Zinoviev
JDD2015: Thorny path to Data Mining projects - Alexey Zinoviev JDD2015: Thorny path to Data Mining projects - Alexey Zinoviev
JDD2015: Thorny path to Data Mining projects - Alexey Zinoviev
 
Beauty and Big Data
Beauty and Big DataBeauty and Big Data
Beauty and Big Data
 
Quick dive into the big data pool without drowning - Demi Ben-Ari @ Panorays
Quick dive into the big data pool without drowning - Demi Ben-Ari @ PanoraysQuick dive into the big data pool without drowning - Demi Ben-Ari @ Panorays
Quick dive into the big data pool without drowning - Demi Ben-Ari @ Panorays
 
State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...
State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...
State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...
 
Embracing Hadoop with a musical touch!
Embracing Hadoop with a musical touch!Embracing Hadoop with a musical touch!
Embracing Hadoop with a musical touch!
 
PyData Texas 2015 Keynote
PyData Texas 2015 KeynotePyData Texas 2015 Keynote
PyData Texas 2015 Keynote
 
Big Data and High Performance Computing
Big Data and High Performance ComputingBig Data and High Performance Computing
Big Data and High Performance Computing
 
AI on Big Data
AI on Big DataAI on Big Data
AI on Big Data
 
Hadoop @ Yahoo! - Internet Scale Data Processing
Hadoop @ Yahoo! - Internet Scale Data ProcessingHadoop @ Yahoo! - Internet Scale Data Processing
Hadoop @ Yahoo! - Internet Scale Data Processing
 
Hadoop Master Class : A concise overview
Hadoop Master Class : A concise overviewHadoop Master Class : A concise overview
Hadoop Master Class : A concise overview
 
2015 Data Science Summit @ dato Review
2015 Data Science Summit @ dato Review2015 Data Science Summit @ dato Review
2015 Data Science Summit @ dato Review
 
DATA SCIENCE TRAINING IN CHENNAI
DATA SCIENCE TRAINING IN CHENNAIDATA SCIENCE TRAINING IN CHENNAI
DATA SCIENCE TRAINING IN CHENNAI
 

Plus de Alexey Zinoviev

Kafka pours and Spark resolves
Kafka pours and Spark resolvesKafka pours and Spark resolves
Kafka pours and Spark resolvesAlexey Zinoviev
 
Python's slippy path and Tao of thick Pandas: give my data, Rrrrr...
Python's slippy path and Tao of thick Pandas: give my data, Rrrrr...Python's slippy path and Tao of thick Pandas: give my data, Rrrrr...
Python's slippy path and Tao of thick Pandas: give my data, Rrrrr...Alexey Zinoviev
 
Thorny path to the Large-Scale Graph Processing (Highload++, 2014)
Thorny path to the Large-Scale Graph Processing (Highload++, 2014)Thorny path to the Large-Scale Graph Processing (Highload++, 2014)
Thorny path to the Large-Scale Graph Processing (Highload++, 2014)Alexey Zinoviev
 
Joker'14 Java as a fundamental working tool of the Data Scientist
Joker'14 Java as a fundamental working tool of the Data ScientistJoker'14 Java as a fundamental working tool of the Data Scientist
Joker'14 Java as a fundamental working tool of the Data ScientistAlexey Zinoviev
 
First steps in Data Mining Kindergarten
First steps in Data Mining KindergartenFirst steps in Data Mining Kindergarten
First steps in Data Mining KindergartenAlexey Zinoviev
 
EST: Smart rate (Effective recommendation system for Taxi drivers based on th...
EST: Smart rate (Effective recommendation system for Taxi drivers based on th...EST: Smart rate (Effective recommendation system for Taxi drivers based on th...
EST: Smart rate (Effective recommendation system for Taxi drivers based on th...Alexey Zinoviev
 
Android Geo Apps in Soviet Russia: Latitude and longitude find you
Android Geo Apps in Soviet Russia: Latitude and longitude find youAndroid Geo Apps in Soviet Russia: Latitude and longitude find you
Android Geo Apps in Soviet Russia: Latitude and longitude find youAlexey Zinoviev
 
Keynote on JavaDay Omsk 2014 about new features in Java 8
Keynote on JavaDay Omsk 2014 about new features in Java 8Keynote on JavaDay Omsk 2014 about new features in Java 8
Keynote on JavaDay Omsk 2014 about new features in Java 8Alexey Zinoviev
 
Big data algorithms and data structures for large scale graphs
Big data algorithms and data structures for large scale graphsBig data algorithms and data structures for large scale graphs
Big data algorithms and data structures for large scale graphsAlexey Zinoviev
 
"Говнокод-шоу"
"Говнокод-шоу""Говнокод-шоу"
"Говнокод-шоу"Alexey Zinoviev
 
Алгоритмы и структуры данных BigData для графов большой размерности
Алгоритмы и структуры данных BigData для графов большой размерностиАлгоритмы и структуры данных BigData для графов большой размерности
Алгоритмы и структуры данных BigData для графов большой размерностиAlexey Zinoviev
 
ALMADA 2013 (computer science school by Yandex and Microsoft Research)
ALMADA 2013 (computer science school by Yandex and Microsoft Research)ALMADA 2013 (computer science school by Yandex and Microsoft Research)
ALMADA 2013 (computer science school by Yandex and Microsoft Research)Alexey Zinoviev
 
GDG Devfest Omsk 2013. Year of events!
GDG Devfest Omsk 2013. Year of events!GDG Devfest Omsk 2013. Year of events!
GDG Devfest Omsk 2013. Year of events!Alexey Zinoviev
 
How to port JavaScript library to Android and iOS
How to port JavaScript library to Android and iOSHow to port JavaScript library to Android and iOS
How to port JavaScript library to Android and iOSAlexey Zinoviev
 
Поездка на IT-DUMP 2012
Поездка на IT-DUMP 2012Поездка на IT-DUMP 2012
Поездка на IT-DUMP 2012Alexey Zinoviev
 
MyBatis и Hibernate на одном проекте. Как подружить?
MyBatis и Hibernate на одном проекте. Как подружить?MyBatis и Hibernate на одном проекте. Как подружить?
MyBatis и Hibernate на одном проекте. Как подружить?Alexey Zinoviev
 
Google I/O туда и обратно.
Google I/O туда и обратно.Google I/O туда и обратно.
Google I/O туда и обратно.Alexey Zinoviev
 
Google Maps. Zinoviev Alexey.
Google Maps. Zinoviev Alexey.Google Maps. Zinoviev Alexey.
Google Maps. Zinoviev Alexey.Alexey Zinoviev
 
ORM battle. MyBatis vs Hibernate
ORM battle. MyBatis vs HibernateORM battle. MyBatis vs Hibernate
ORM battle. MyBatis vs HibernateAlexey Zinoviev
 


Java BigData Full Stack Development (version 2.0)

  • 1. Java BigData Full Stack Development as is ... Alexey Zinovyev, Java Trainer in EPAM
  • 2. About With IT since 2007 With Java since 2009 With Hadoop since 2012 With EPAM since 2015
  • 3. Contacts: E-mail: Alexey_Zinovyev@epam.com; Twitter: @zaleslaw, @BigDataRussia; vk.com/big_data_russia (Big Data Russia); vk.com/java_jvm (Java & JVM langs)
  • 4. The Good Old Days
  • 5. HRs & RMs are looking for Java developers
  • 6. Is the Java Dream Team waiting for you?
  • 7. Required Skills • Advanced SQL • Basic Linux • Core Java & JVM • Backend Development Experience • Basic Computer Science Level
  • 8. REAL WORLD
  • 9. Let’s just use JavaScript in the frontend ONLY
  • 10. In the frontend ONLY?
  • 11. Cruel world
  • 12. Do you know an ML JS library?
  • 13. Wild animals everywhere
  • 14. And what I tell you
  • 15. And what I tell you
  • 16. It’s time for a Java superhero, yeah!
  • 17. Before discovering patterns you should: • Select small pieces • Define default values for missing data • Remove strange signals from data • Merge some tables into one if required
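
Two of the preparation steps from slide 17 (default values for missing data, removing strange signals) can be sketched in plain Java. This is a minimal illustration, not code from the talk: the class `CleaningSketch`, its method names, and the sample readings are all hypothetical, and a fixed valid range is only one possible way to decide what counts as a "strange signal".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper illustrating two data-cleaning steps; not from the talk. */
final class CleaningSketch {

    /** Replaces missing readings (represented here as null) with a default value. */
    static List<Double> fillMissing(List<Double> values, double defaultValue) {
        List<Double> result = new ArrayList<>();
        for (Double v : values) {
            result.add(v == null ? defaultValue : v);
        }
        return result;
    }

    /** Keeps only values inside [min, max]; anything outside is assumed to be noise. */
    static List<Double> dropOutOfRange(List<Double> values, double min, double max) {
        List<Double> result = new ArrayList<>();
        for (double v : values) {
            if (v >= min && v <= max) {
                result.add(v);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up body-temperature readings: one missing, one clearly wrong.
        List<Double> raw = Arrays.asList(36.6, null, 37.1, 999.0);
        List<Double> filled = fillMissing(raw, 36.6);
        System.out.println(dropOutOfRange(filled, 30.0, 45.0));
    }
}
```

Filling defaults before filtering matters here: `dropOutOfRange` unboxes every element, so it assumes the list no longer contains nulls.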
  • 18. How it really works • Share your data with us • Our magic manipulations • Building an answering machine • PROFIT!!!
  • 19. How to start?
  • 20.
  • 21. WHAT IS BIG DATA?
  • 22. Joke about Excel
  • 23. 5V
  • 24. Every 60 seconds…
  • 25. From Mobile Devices
  • 26. From Industry
  • 27. We started to keep and handle stupid new things!
  • 28. 10^6 rows in MySQL
  • 29. GB->TB->PB->?
  • 30. Is BigData about PBs?
  • 31. Is BigData about PBs?
  • 32. It’s hard to … • .. store • .. handle • .. search in • .. visualize • .. send over the network
  • 33. Likes in Classmates: how to count?
  • 34. Crazy Zoo 2012
  • 35. Crazy Zoo 2016
  • 36. What this training will cover
  • 37. NOSQL
  • 38. What’s the problem with RDBMSs • Caching • Master/Slave • Cluster • Table Partitioning • Sharding
  • 39. Family
  • 40. Database party
  • 41. Spring Data
  • 42. How to start?
  • 43. Java MongoDB Driver + Robomongo
  • 44. BIG DATA TOOL MASTER VS DATA SCIENTIST
  • 45. TRAIN MODEL
  • 46. Datasets • Facebook users, tweets • Trade transactions • Government • Medicine (genomic data) • Telecommunications
  • 47. Data Sources • Relational Databases • Data warehouses (Historical data) • Files in CSV or in binary format • Internet or e-mail • Scientific, research (R, Octave, Matlab)
  • 48. Hey, man, predict something!
  • 49. Man or sofa?
  • 50. Typical questions for DM • Which loan applicants are high-risk?
  • 51. Typical questions for DM • Which loan applicants are high-risk? • How do we detect phone card fraud?
  • 52. Typical questions for DM • Which loan applicants are high-risk? • How do we detect phone card fraud? • What is the revenue prediction for next year?
  • 53. Typical questions for DM • Which loan applicants are high-risk? • How do we detect phone card fraud? • What is the revenue prediction for next year? • Can you recommend music for users?
  • 54. Is the green circle a blue square or a red triangle? Let’s ask its neighbors! kNN (k-nearest neighbors)
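
The "ask its neighbors" idea from slide 54 can be sketched in a few lines of plain Java. This is an illustrative sketch under stated assumptions, not code from the talk: the class `KnnSketch`, the `Point` type, the coordinates, and the use of squared Euclidean distance in two dimensions are all choices made here for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical k-nearest-neighbors classifier for labeled 2-D points. */
final class KnnSketch {

    /** A labeled training point; field names are illustrative. */
    static final class Point {
        final double x;
        final double y;
        final String label;

        Point(double x, double y, String label) {
            this.x = x;
            this.y = y;
            this.label = label;
        }
    }

    /** Squared Euclidean distance; enough for ranking, so no square root is needed. */
    static double squaredDistance(Point p, double x, double y) {
        double dx = p.x - x;
        double dy = p.y - y;
        return dx * dx + dy * dy;
    }

    /**
     * Returns the most frequent label among the k training points closest to (x, y).
     * On a tie, the label that reached the top count first (closest neighbors first) wins.
     * Returns null when the training set is empty.
     */
    static String classify(List<Point> training, double x, double y, int k) {
        List<Point> sorted = new ArrayList<>(training);
        sorted.sort(Comparator.comparingDouble(p -> squaredDistance(p, x, y)));

        Map<String, Integer> votes = new HashMap<>();
        String best = null;
        int limit = Math.min(k, sorted.size());
        for (int i = 0; i < limit; i++) {
            String label = sorted.get(i).label;
            int count = votes.merge(label, 1, Integer::sum);
            if (best == null || count > votes.get(best)) {
                best = label;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<Point> training = Arrays.asList(
                new Point(1, 1, "blue square"), new Point(1, 2, "blue square"),
                new Point(2, 1, "blue square"), new Point(5, 5, "red triangle"),
                new Point(6, 5, "red triangle"), new Point(5, 6, "red triangle"));
        // The "green circle" at (2, 2) is labeled by its three nearest neighbors.
        System.out.println(classify(training, 2, 2, 3));
    }
}
```

Sorting the whole training set is the simplest approach and is fine for small data; large datasets would typically call for a spatial index or a distributed implementation instead.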
  • 55. Collaborative Filtering
  • 56. Machine Learning vs Traditional Programming
  • 57. Data Science
  • 58. Can a Java programmer be a Data Scientist?
  • 59. Sexy Data Scientist
  • 60. Real Data Scientist
  • 61. How to start?
  • 62. Weka
  • 63. HADOOP
  • 64. Hadoop and Data Knights
  • 65. Hadoop
  • 66. MapReduce in different languages
  • 67. MapReduce for WordCount
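
The slide text for WordCount did not survive in this transcript, so the following is only a sketch of the MapReduce idea in plain Java, without the Hadoop API. The class `WordCountSketch` and its `map`, `shuffle`, and `reduce` methods are hypothetical names that mirror the three phases; a real Hadoop job would instead implement the framework's Mapper and Reducer classes and run across a cluster.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Single-process imitation of the map, shuffle and reduce phases for word counting. */
final class WordCountSketch {

    /** Map phase: emit a (word, 1) pair for every word in one input line. */
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                pairs.add(new SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    /** Shuffle phase: group all emitted values by their key. */
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), key -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    /** Reduce phase: sum the values collected for one key. */
    static int reduce(List<Integer> counts) {
        int sum = 0;
        for (int count : counts) {
            sum += count;
        }
        return sum;
    }

    /** Runs the three phases in sequence over all input lines. */
    static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(map(line));
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : shuffle(pairs).entrySet()) {
            result.put(entry.getKey(), reduce(entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(wordCount(Arrays.asList("to be or not to be", "To be is to do")));
    }
}
```

Because `map` looks at one line and `reduce` at one key, both phases can run in parallel on different machines; only the shuffle needs to move data between them.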
  • 68. Hadoop Jobs
  • 69. Hadoop frameworks • Universal (MapReduce, Tez, RDD in Spark) • Abstract (Pig, Pipeline Spark) • SQL-like (Hive, Impala, Spark SQL) • Graph processing (Giraph, GraphX) • Machine Learning (Mahout, MLlib) • Stream processing (Spark Streaming, Storm)
  • 70. SPARK
  • 71. SPARK: the bloody son of MR • MapReduce in memory • Up to 50x faster than Hadoop • RDD is a basic building block (an immutable distributed collection of objects) • Pipeline API (no need for Pig)
  • 72. Spark Family
  • 73. MLlib supports • Classification and regression • Collaborative filtering • Clustering • Dimensionality reduction • Optimization
  • 74. Code sample MLlib (K-Means)

        // Cluster the data into two classes using KMeans
        int numClusters = 2;
        int numIterations = 20;
        KMeansModel clusters = KMeans.train(parsedData.rdd(), numClusters, numIterations);

        // Evaluate clustering by computing Within Set Sum of Squared Errors
        double WSSSE = clusters.computeCost(parsedData.rdd());
        System.out.println("Within Set Sum of Squared Errors = " + WSSSE);

        // Save and load model
        clusters.save(sc.sc(), "myModelPath");
        KMeansModel sameModel = KMeansModel.load(sc.sc(), "myModelPath");

  • 75. MLlib • .. extends scikit-learn (Python lib) and Mahout • .. runs fully on Spark and supports Spark’s Pipeline API • .. represents a dataset as Spark SQL’s SchemaRDD • .. supports Hive as an external data source • .. works well for large datasets and parallelized algorithms
  • 76. It solves all problems!
  • 77. How to start?
  • 78. HDP Zoo
  • 79. Ok, Google!
  • 80. AWS Amazon
  • 81. Infrastructure issues are waiting for YOU!
  • 82. DEEP LEARNING
  • 83. Deep Learning helps us build a NEW FUTURE
  • 84. Deep Learning helps us build a NEW FUTURE
  • 85. HOW TO LEARN?
  • 86. DIFFERENT WAYS 1. Read books and write ‘pet’ projects
  • 87. DIFFERENT WAYS 1. Read books and write ‘pet’ projects 2. Become a mentee in a mentoring process
  • 88. DIFFERENT WAYS 1. Read books and write ‘pet’ projects 2. Become a mentee in a mentoring process 3. MOOC
  • 89. DIFFERENT WAYS 1. Read books and write ‘pet’ projects 2. Become a mentee in a mentoring process 3. MOOC 4. Take a training course
  • 90. DIFFERENT WAYS 1. Read books and write ‘pet’ projects 2. Become a mentee in a mentoring process 3. MOOC 4. Take a training course 5. Visit conferences
  • 91. Recommended Books
  • 92. Contacts: E-mail: Alexey_Zinovyev@epam.com; Twitter: @zaleslaw, @BigDataRussia; vk.com/big_data_russia (Big Data Russia); vk.com/java_jvm (Java & JVM langs)