© 2016 Ness SES. All Rights Reserved
BIG DATA
Open Source Projects vs Amazon Services
MOLDOVAN Radu Adrian
Iasi, May 2016
Who am I? :)
❏ passionate about technology
❏ 20 years of programming using open source
❏ last 4 years in Big Data
❏ Big Data Architect @
… where Enterprise ends and Big Data starts
- www.XYZ.com
- Load Balancer 1 … Load Balancer n
- Web Server 1.1 … Web Server 1.x, Web Server n.1 … Web Server n.x
- Database, search index, Cache
- read / write
- ← Single Point of Failure
- ← Limited Scalability
… where Enterprise ends and Big Data starts
- www.XYZ.com
- Load Balancer 1 … Load Balancer n
- Web Server 1.1 … Web Server 1.x, Web Server n.1 … Web Server n.x
- read / write
- noSQL Ring: nodes 1, 2, 3, 4, 5
- search: nodes 1, 2, 3, 4, … n
- DFS, Resource Manager
- nodes 1, 2, … n: HDDs, CPU, RAM each
- DFS, MPP, RES. MANAGER
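The "noSQL Ring" on this slide can be illustrated with a small consistent-hashing sketch. The slide does not say how keys are assigned to nodes, so the scheme below (hash ring with virtual nodes, as used in several ring-based stores) is an assumption, and the names `HashRing`, `node_for`, and `node1` … `node5` are hypothetical.

```python
import bisect
import hashlib


def ring_position(key: str) -> int:
    """Map a string to a position on the ring (a large integer from SHA-256)."""
    return int(hashlib.sha256(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Toy consistent-hash ring: each key belongs to the next node clockwise."""

    def __init__(self, nodes, vnodes=64):
        # Each physical node is placed on the ring several times (virtual
        # nodes) so that keys spread more evenly across nodes.
        self._ring = sorted(
            (ring_position(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._positions = [pos for pos, _ in self._ring]

    def node_for(self, key: str) -> str:
        # First ring position after the key's hash, wrapping around at the end.
        idx = bisect.bisect_right(self._positions, ring_position(key))
        return self._ring[idx % len(self._ring)][1]


ring5 = HashRing(["node1", "node2", "node3", "node4", "node5"])
ring4 = HashRing(["node1", "node2", "node3", "node4"])  # node5 removed

keys = [f"user:{i}" for i in range(1000)]
moved = sum(1 for k in keys if ring5.node_for(k) != ring4.node_for(k))
print(f"{moved} of {len(keys)} keys changed owner after removing node5")
```

In this sketch only the keys that were owned by the removed node change owner, which is the property that lets such a ring grow or shrink without reshuffling all data.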
Architecture - High Level
INFRASTRUCTURE LAYER: Database, Analytics, Bigdata
INFORMATION LAYER
MULTI CHANNEL DELIVERY: Dashboard, Laptop, Mobile/Tablet, Email, SMS, Print
ANALYTICS LAYER: Realtime, Near Realtime, Reports + Statistics, Custom Tools
Data Processing
- system generated data
- dimensional data
- de/normalize data
Data Ingestion/Extraction
- external data
- reference internal data
- discovery data
Data Loading
- operational data
- business information data
Big data - ETL + BI
- ERP, Flat Files, CRM, Live Stream, RDBMS, Web Services
- Extract, Transform, Load
- Massive Parallel Processing, Distributed System
- noSQL DB, warehouse DB (OLAP), search engines
- Business Intelligence
- Web Services, Data Science, Data Monetization, Data Exploration, Data Visualisation
- ETL, BI
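The Extract, Transform, Load flow on this slide can be sketched on a single machine as follows. This is a minimal illustration, not the pipeline used by the author: the flat-file contents, the `sales` table, and the function names are all hypothetical, and an in-memory SQLite database stands in for the warehouse DB.

```python
import csv
import io
import sqlite3

# Hypothetical flat-file export; the column names are made up for this example.
FLAT_FILE = """customer,amount,currency
alice, 10.50 ,eur
bob,3,EUR
alice,not-a-number,EUR
"""


def extract(text):
    """Extract: read rows from a CSV flat file."""
    return csv.DictReader(io.StringIO(text))


def transform(rows):
    """Transform: normalize fields and drop rows that fail to parse."""
    for row in rows:
        try:
            amount = float(row["amount"].strip())
        except ValueError:
            continue  # skip rows with a malformed amount
        yield row["customer"].strip(), amount, row["currency"].strip().upper()


def load(conn, records):
    """Load: write cleaned records into a warehouse-style table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (customer TEXT, amount REAL, currency TEXT)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", records)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(FLAT_FILE)))

# BI side: a simple aggregate query over the loaded data.
totals = conn.execute(
    "SELECT customer, SUM(amount) FROM sales GROUP BY customer ORDER BY customer"
).fetchall()
print(totals)
```

At cluster scale the same three stages would be spread over many nodes (the "Massive Parallel Processing" box), but the separation of concerns stays the same.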
CAP Theorem
CONSISTENCY (quorum), AVAILABILITY, PARTITION TOLERANCE
- RDBMS
- HP Vertica (Columnar)
- Cassandra (Columnar)
- Dynamo (Key-Value)
- Couchbase (Document)
- Riak (Key-Value)
- HDFS
- HBase (Columnar)
- MongoDB (Document)
- Redis (Key-Value)
- Memcached (Key-Value)
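The "(quorum)" note next to CONSISTENCY can be made concrete with the usual overlap rule for replicated stores: with N replicas, a write acknowledged by W replicas and a read that consults R replicas are guaranteed to share at least one replica when R + W > N. The sketch below only checks that arithmetic; the function names are hypothetical and it does not model any specific database.

```python
def quorums_overlap(n: int, w: int, r: int) -> bool:
    """True if every read quorum shares a replica with every write quorum."""
    if not (1 <= w <= n and 1 <= r <= n):
        raise ValueError("w and r must be between 1 and n")
    return r + w > n


def majority(n: int) -> int:
    """Smallest quorum size that is a strict majority of n replicas."""
    return n // 2 + 1


N = 3
q = majority(N)

print(q)                              # quorum size for 3 replicas
print(quorums_overlap(N, w=q, r=q))   # majority writes and majority reads
print(quorums_overlap(N, w=1, r=1))   # single-replica writes and reads
```

Lower values of W and R trade this overlap guarantee for lower latency and higher availability, which is the tension the CAP triangle describes.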
Cluster ecosystem - components
- Coordinator: ZooKeeper
- Management: Ambari
- Workflow: Oozie, ???NiFi
- Security: Ranger+Knox+Falcon, Kerberos, LDAP
- Monitoring: Ganglia, Nagios
- Logs: Kibana, Logstash
COLLECT PROCESS STORE VISUALIZE
Cluster ecosystem - COLLECT
- Data Integration: Talend, Informatica
- Data Streaming: Storm, MapR Streams, Spark Streaming, Flink Stream
- Data Aggregation: Flume, Scribe
- Msg Brokers + Streams: RabbitMQ, ActiveMQ, Kafka
- Data Loader: Sqoop
- Data Governance: Atlas
- Amazon Services: Amazon Simple Queue Service (SQS), Amazon Kinesis
Cluster ecosystem - PROCESS
- HADOOP (HDFS)
- Res. Manager: Mesos, Yarn
- MapReduce, PIG
- Analytics
- Impala (Drill)
- GRAPHs: Spark GraphX, Neo4J, Titan, Flink Gelly
- HBase, MongoDB
- HIVE
- In Memory: Spark, TEZ
- Cloudera, Hortonworks, MapR
- Amazon Services: Amazon DynamoDB, Amazon EC2, Amazon EMR, Amazon S3, Amazon Glacier
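The MapReduce entry on this slide refers to a programming model with three steps: map, shuffle (group by key), and reduce. The sketch below runs a word count in plain Python on one machine to show the shape of the model; it does not use the Hadoop API, and the function names and sample lines are made up for illustration.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values of one key into a single result."""
    return key, sum(values)


lines = ["big data", "Big clusters", "data data"]

# On a cluster, map and reduce calls would run in parallel on different nodes.
mapped = chain.from_iterable(map_phase(line) for line in lines)
counts = dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())
print(counts)
```

Because each map call sees one line and each reduce call sees one key, a resource manager can schedule both phases across many nodes independently.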
Cluster ecosystem - STORE
- Warehouse DB: Presto (ANSI), HP Vertica
- Search Engines: SolrCloud, Elasticsearch
- Columnar Store: Cassandra, Accumulo
- Machine Learning: Spark ML, FlinkML, Mahout
- Key - Value Store: Redis, Riak, Memcached
- Amazon Services: Amazon Redshift, Amazon DynamoDB, Amazon ElastiCache, Amazon Elasticsearch, Amazon ML
Cluster ecosystem - components
- Tableau
- Logi
- Jasper Reports
- D3
- Pentaho*
- Crystal Reports*
Cluster ecosystem - VISUALIZE
- HADOOP (HDFS)
- Res. Manager: Mesos, Yarn
- Warehouse DB: Presto (ANSI), HP Vertica
- MapReduce, PIG
- Search Engines: SolrCloud, Elasticsearch
- Data Integration: Talend, Informatica
- Analytics
- Columnar Store: Cassandra, Accumulo
- Impala (Drill)
- GRAPHs: Spark GraphX, Titan, Neo4J, Flink Gelly
- Machine Learning: Spark ML, FlinkML, Mahout
- HBase, MongoDB
- Data Streaming: Storm, MapR Streams, Spark Streaming, Flink Stream
- HIVE
- Tableau
- Key - Value Store: Redis, Riak, Memcached
- Data Aggregation: Flume, Scribe
- Msg Brokers + Streams: RabbitMQ, ActiveMQ, Kafka
- Data Loader: Sqoop
- In Memory: Spark, TEZ
- Cloudera, Hortonworks, MapR
- Logi, Jasper Reports, D3, Pentaho*
- Interactive Reporting
- Crystal Reports
- Data Governance: Atlas
Trends - Forbes report Q1 2016
http://www.forbes.com/sites/gilpress/2016/03/14/top-10-hot-big-data-technologies/#7cd07887f26a
Thank you!
Skype: r.moldovan

Big data advanced topics - part I
