SlideShare une entreprise Scribd logo
1  sur  23
Télécharger pour lire hors ligne
Better Together: How Graph
database enables easy data
integration with Spark and
Kafka in the Cloud
September 30th 2020
1
| GRAPHAIWORLD.COM | #GRAPHAIWORLD |
Today's Speakers
Emma Liu
Product Manager
● BS in Engineering from Harvey Mudd College, MS
in Engineering Systems from MIT
● Prior work experience at Oracle and MarkLogic
● Focus - Cloud, Containers, Enterprise Infra,
Monitoring, Management, Connectors
Rayees Pasha
Product Manager
● MS in Computer Science from University of Memphis
● Prior Lead PM and ENG positions at Workday, Hitachi
and HP
● Expertise in Database Management and Big Data
Technologies
2
| GRAPHAIWORLD.COM | #GRAPHAIWORLD |
1
TigerGraph Architecture and Data
Ingestion Overview
TigerGraph and Spark Data Pipeline
TigerGraph and Kafka Data Pipeline
Today’s Outline
3
2
3
| GRAPHAIWORLD.COM | #GRAPHAIWORLD |
SYSTEM
ARCHITECTURE
OVERVIEW
4
| GRAPHAIWORLD.COM | #GRAPHAIWORLD |
The TigerGraph Difference
Feature Design Difference Benefit
Real-Time Deep-Link Querying ● Native Graph design
● C++ engine, for high performance
● Storage Architecture
● Uncovers hard-to-find patterns
● Operational, real-time
● HTAP: Transactions+Analytics
Handling Massive Scale ● Distributed DB architecture
● Massively parallel processing
● Compressed storage reduces
footprint and messaging
● Integrates all your data
● Automatic partitioning
● Elastic scaling of resource usage
In-Database Analytics ● GSQL: High-level yet
Turing-complete language
● User-extensible graph algorithm
library, runs in-DB
● ACID (OLTP) and Accumulators
(OLAP)
● Avoids transferring data
● Richer graph context
● In-DB machine learning
5 to 10+ hops deep
5
| GRAPHAIWORLD.COM | #GRAPHAIWORLD |
TigerGraph Architecture
| GRAPHAIWORLD.COM | #GRAPHAIWORLD |
Data Ingestion
7
Step 3
Each GPE consumes the
partial data updates,
processes it and puts it on
disk.
Loading Jobs and POST use
UPSERT semantics:
● If vertex/edge doesn't
yet exist, create it.
● If vertex/edge already
exists, update it.
● Idempotent
Step 1
Data integration through the
following ways to ingest in
user source data.
● Bulk load of data files or
a Kafka stream in CSV or
JSON format
● HTTP POSTs via REST
services (JSON)
● GSQL Insert commands
Step 2
Dispatcher takes in the data
ingestion requests in the form of
updates to the database.
1. Query IDS to get internal
IDs
2. Convert data to internal
format
3. Send data to one or more
corresponding GPEs
| GRAPHAIWORLD.COM | #GRAPHAIWORLD |
Data Ingestion
8
Incremental
Data
Nginx Restpp
GPE GPE GPE
Disk Disk Disk
CSV/JSON Insert/Update/Delete
Vertices and Edges
Listen to
corresponding
topic for new
messages
Acknowledge
Response
Incoming
Outgoing
Synchronize
data to disk
GSE(IDS)
ID Translation
Kafka Kafka Kafka
Server 1 Server 2 Server 3
Kafka Cluster
In-memory
copy of data
| GRAPHAIWORLD.COM | #GRAPHAIWORLD |
Spark and
TigerGraph
9
| GRAPHAIWORLD.COM | #GRAPHAIWORLD |
Spark + TigerGraph Data Pipeline
| GRAPHAIWORLD.COM | #GRAPHAIWORLD |
Typical Spark + TigerGraph Integration
● Data Preparation and Integration (TigerGraph/Spark)
● Unsupervised Learning (TigerGraph)
● Feature Extraction for Supervised Learning (TigerGraph/Spark)
● Model Training (Spark)
● Validate and Apply Model (TigerGraph)
● Visualize and Explore Interconnected Data (TigerGraph)
11
| GRAPHAIWORLD.COM | #GRAPHAIWORLD |
Spark and TigerGraph Data Pipeline
Static
Data
Sources
TigerGraph
JDBC
Driver
Streaming
Data
Sources
12
| GRAPHAIWORLD.COM | #GRAPHAIWORLD |
JDBC Driver
● Type 4 driver
● Support Read and Write bi-directional data flow to TigerGraph
● Read: Converts ResultSet to DataFrame
● Write: Load DataFrame and files to vertex/edge in TigerGraph
● Supports REST endpoints of built-in, compiled and interpreted GSQL queries from
TigerGraph
● Open Source:
● https://github.com/tigergraph/ecosys/tree/master/tools/etl/tg-jdbc-driver
13
| GRAPHAIWORLD.COM | #GRAPHAIWORLD |
Supervised ML with TigerGraph - Detecting Phone-Based Fraud
by Analyzing Network or Graph Relationship Features at China
Mobile
Download the solution brief at - https://info.tigergraph.com/MachineLearning
14
| GRAPHAIWORLD.COM | #GRAPHAIWORLD |
DEMO
15
| GRAPHAIWORLD.COM | #GRAPHAIWORLD |
Kafka and
TigerGraph
16
| GRAPHAIWORLD.COM | #GRAPHAIWORLD |
Kafka and TigerGraph Data Pipeline
Static
Data
Sources
Streaming
Data
Sources
Kafka
Loader
17
| GRAPHAIWORLD.COM | #GRAPHAIWORLD |
Kafka Loader - Speed to Value from Real-time
Streaming Data
• Reduce Data Availability Gap and Accelerate Time to Value
• Native Integration with Real-time Streaming Data and Batch
Data
• Enables Real-time Graph Feature Updates with Streaming Data
in Machine Learning Use Cases
• Decrease Learning Curve With Familiar Syntax
• GSQL Support with Consistent Data Loading Syntax
• Maintain Separation of Control for Data Loading
• Designed with Built-in MultiGraph Support
18
| GRAPHAIWORLD.COM | #GRAPHAIWORLD |
Kafka Loader : Three Steps
Consistent with GSQL Data Loading Steps
Step 1: Define the Data Source
Step 2: Create a Loading Job
Step 3: Run the Loading Job
19
| GRAPHAIWORLD.COM | #GRAPHAIWORLD |
Kafka Loader High Level Architecture
● Connect to External Kafka Cluster
● User Commands Through GSQL Server
● Configuration Settings:
○ Config 1: Kakfa Cluster Configuration
○ Config 2: Topic/Partition/Offset Info
20
| GRAPHAIWORLD.COM | #GRAPHAIWORLD |
DEMO
21
| GRAPHAIWORLD.COM | #GRAPHAIWORLD |
TigerGraph Architecture + Spark + Kakfa
22
Get Started for Free
● Try TigerGraph Cloud ( tgcloud.io )
● Download TigerGraph’s Developer Edition
● Take a Test Drive - Online Demo
● Get TigerGraph Certified
● Join the Community
@TigerGraphDB /tigergraph /TigerGraphDB /company/TigerGraph
23

Contenu connexe

Tendances

Introduction to Graph Databases
Introduction to Graph DatabasesIntroduction to Graph Databases
Introduction to Graph DatabasesDataStax
 
How to Choose The Right Database on AWS - Berlin Summit - 2019
How to Choose The Right Database on AWS - Berlin Summit - 2019How to Choose The Right Database on AWS - Berlin Summit - 2019
How to Choose The Right Database on AWS - Berlin Summit - 2019Randall Hunt
 
Mastering Customer Data on Apache Spark
Mastering Customer Data on Apache SparkMastering Customer Data on Apache Spark
Mastering Customer Data on Apache SparkCaserta
 
Five Things to Consider About Data Mesh and Data Governance
Five Things to Consider About Data Mesh and Data GovernanceFive Things to Consider About Data Mesh and Data Governance
Five Things to Consider About Data Mesh and Data GovernanceDATAVERSITY
 
Introduction of Knowledge Graphs
Introduction of Knowledge GraphsIntroduction of Knowledge Graphs
Introduction of Knowledge GraphsJeff Z. Pan
 
Databricks + Snowflake: Catalyzing Data and AI Initiatives
Databricks + Snowflake: Catalyzing Data and AI InitiativesDatabricks + Snowflake: Catalyzing Data and AI Initiatives
Databricks + Snowflake: Catalyzing Data and AI InitiativesDatabricks
 
Tiger graph 2021 corporate overview [read only]
Tiger graph 2021 corporate overview [read only]Tiger graph 2021 corporate overview [read only]
Tiger graph 2021 corporate overview [read only]ercan5
 
Mastering Your Customer Data on Apache Spark by Elliott Cordo
Mastering Your Customer Data on Apache Spark by Elliott CordoMastering Your Customer Data on Apache Spark by Elliott Cordo
Mastering Your Customer Data on Apache Spark by Elliott CordoSpark Summit
 
What’s New with Databricks Machine Learning
What’s New with Databricks Machine LearningWhat’s New with Databricks Machine Learning
What’s New with Databricks Machine LearningDatabricks
 
Neo4j – The Fastest Path to Scalable Real-Time Analytics
Neo4j – The Fastest Path to Scalable Real-Time AnalyticsNeo4j – The Fastest Path to Scalable Real-Time Analytics
Neo4j – The Fastest Path to Scalable Real-Time AnalyticsNeo4j
 
Data Mesh Part 4 Monolith to Mesh
Data Mesh Part 4 Monolith to MeshData Mesh Part 4 Monolith to Mesh
Data Mesh Part 4 Monolith to MeshJeffrey T. Pollock
 
Architecting Agile Data Applications for Scale
Architecting Agile Data Applications for ScaleArchitecting Agile Data Applications for Scale
Architecting Agile Data Applications for ScaleDatabricks
 
Introduction to Graph Databases
Introduction to Graph DatabasesIntroduction to Graph Databases
Introduction to Graph DatabasesMax De Marzi
 
Getting started with BigQuery
Getting started with BigQueryGetting started with BigQuery
Getting started with BigQueryPradeep Bhadani
 
Delta Lake OSS: Create reliable and performant Data Lake by Quentin Ambard
Delta Lake OSS: Create reliable and performant Data Lake by Quentin AmbardDelta Lake OSS: Create reliable and performant Data Lake by Quentin Ambard
Delta Lake OSS: Create reliable and performant Data Lake by Quentin AmbardParis Data Engineers !
 
Using Amazon Neptune to power identity resolution at scale - ADB303 - Atlanta...
Using Amazon Neptune to power identity resolution at scale - ADB303 - Atlanta...Using Amazon Neptune to power identity resolution at scale - ADB303 - Atlanta...
Using Amazon Neptune to power identity resolution at scale - ADB303 - Atlanta...Amazon Web Services
 
Data Catalog as a Business Enabler
Data Catalog as a Business EnablerData Catalog as a Business Enabler
Data Catalog as a Business EnablerSrinivasan Sankar
 
Databricks: A Tool That Empowers You To Do More With Data
Databricks: A Tool That Empowers You To Do More With DataDatabricks: A Tool That Empowers You To Do More With Data
Databricks: A Tool That Empowers You To Do More With DataDatabricks
 
JanusGraph DataBase Concepts
JanusGraph DataBase ConceptsJanusGraph DataBase Concepts
JanusGraph DataBase ConceptsSanil Bagzai
 
Revolutionizing the Energy Industry with Graphs
Revolutionizing the Energy Industry with GraphsRevolutionizing the Energy Industry with Graphs
Revolutionizing the Energy Industry with GraphsNeo4j
 

Tendances (20)

Introduction to Graph Databases
Introduction to Graph DatabasesIntroduction to Graph Databases
Introduction to Graph Databases
 
How to Choose The Right Database on AWS - Berlin Summit - 2019
How to Choose The Right Database on AWS - Berlin Summit - 2019How to Choose The Right Database on AWS - Berlin Summit - 2019
How to Choose The Right Database on AWS - Berlin Summit - 2019
 
Mastering Customer Data on Apache Spark
Mastering Customer Data on Apache SparkMastering Customer Data on Apache Spark
Mastering Customer Data on Apache Spark
 
Five Things to Consider About Data Mesh and Data Governance
Five Things to Consider About Data Mesh and Data GovernanceFive Things to Consider About Data Mesh and Data Governance
Five Things to Consider About Data Mesh and Data Governance
 
Introduction of Knowledge Graphs
Introduction of Knowledge GraphsIntroduction of Knowledge Graphs
Introduction of Knowledge Graphs
 
Databricks + Snowflake: Catalyzing Data and AI Initiatives
Databricks + Snowflake: Catalyzing Data and AI InitiativesDatabricks + Snowflake: Catalyzing Data and AI Initiatives
Databricks + Snowflake: Catalyzing Data and AI Initiatives
 
Tiger graph 2021 corporate overview [read only]
Tiger graph 2021 corporate overview [read only]Tiger graph 2021 corporate overview [read only]
Tiger graph 2021 corporate overview [read only]
 
Mastering Your Customer Data on Apache Spark by Elliott Cordo
Mastering Your Customer Data on Apache Spark by Elliott CordoMastering Your Customer Data on Apache Spark by Elliott Cordo
Mastering Your Customer Data on Apache Spark by Elliott Cordo
 
What’s New with Databricks Machine Learning
What’s New with Databricks Machine LearningWhat’s New with Databricks Machine Learning
What’s New with Databricks Machine Learning
 
Neo4j – The Fastest Path to Scalable Real-Time Analytics
Neo4j – The Fastest Path to Scalable Real-Time AnalyticsNeo4j – The Fastest Path to Scalable Real-Time Analytics
Neo4j – The Fastest Path to Scalable Real-Time Analytics
 
Data Mesh Part 4 Monolith to Mesh
Data Mesh Part 4 Monolith to MeshData Mesh Part 4 Monolith to Mesh
Data Mesh Part 4 Monolith to Mesh
 
Architecting Agile Data Applications for Scale
Architecting Agile Data Applications for ScaleArchitecting Agile Data Applications for Scale
Architecting Agile Data Applications for Scale
 
Introduction to Graph Databases
Introduction to Graph DatabasesIntroduction to Graph Databases
Introduction to Graph Databases
 
Getting started with BigQuery
Getting started with BigQueryGetting started with BigQuery
Getting started with BigQuery
 
Delta Lake OSS: Create reliable and performant Data Lake by Quentin Ambard
Delta Lake OSS: Create reliable and performant Data Lake by Quentin AmbardDelta Lake OSS: Create reliable and performant Data Lake by Quentin Ambard
Delta Lake OSS: Create reliable and performant Data Lake by Quentin Ambard
 
Using Amazon Neptune to power identity resolution at scale - ADB303 - Atlanta...
Using Amazon Neptune to power identity resolution at scale - ADB303 - Atlanta...Using Amazon Neptune to power identity resolution at scale - ADB303 - Atlanta...
Using Amazon Neptune to power identity resolution at scale - ADB303 - Atlanta...
 
Data Catalog as a Business Enabler
Data Catalog as a Business EnablerData Catalog as a Business Enabler
Data Catalog as a Business Enabler
 
Databricks: A Tool That Empowers You To Do More With Data
Databricks: A Tool That Empowers You To Do More With DataDatabricks: A Tool That Empowers You To Do More With Data
Databricks: A Tool That Empowers You To Do More With Data
 
JanusGraph DataBase Concepts
JanusGraph DataBase ConceptsJanusGraph DataBase Concepts
JanusGraph DataBase Concepts
 
Revolutionizing the Energy Industry with Graphs
Revolutionizing the Energy Industry with GraphsRevolutionizing the Energy Industry with Graphs
Revolutionizing the Energy Industry with Graphs
 

Similaire à Better Together: How Graph database enables easy data integration with Spark and Kafka in the Cloud

How a distributed graph analytics platform uses Apache Kafka for data ingesti...
How a distributed graph analytics platform uses Apache Kafka for data ingesti...How a distributed graph analytics platform uses Apache Kafka for data ingesti...
How a distributed graph analytics platform uses Apache Kafka for data ingesti...HostedbyConfluent
 
RAPIDS cuGraph – Accelerating all your Graph needs
RAPIDS cuGraph – Accelerating all your Graph needsRAPIDS cuGraph – Accelerating all your Graph needs
RAPIDS cuGraph – Accelerating all your Graph needsConnected Data World
 
Comparing three data ingestion approaches where Apache Kafka integrates with ...
Comparing three data ingestion approaches where Apache Kafka integrates with ...Comparing three data ingestion approaches where Apache Kafka integrates with ...
Comparing three data ingestion approaches where Apache Kafka integrates with ...HostedbyConfluent
 
NVIDIA Rapids presentation
NVIDIA Rapids presentationNVIDIA Rapids presentation
NVIDIA Rapids presentationtestSri1
 
GPU-Accelerating UDFs in PySpark with Numba and PyGDF
GPU-Accelerating UDFs in PySpark with Numba and PyGDFGPU-Accelerating UDFs in PySpark with Numba and PyGDF
GPU-Accelerating UDFs in PySpark with Numba and PyGDFKeith Kraus
 
Deploying Data Science Engines to Production
Deploying Data Science Engines to ProductionDeploying Data Science Engines to Production
Deploying Data Science Engines to ProductionMostafa Majidpour
 
What's New in Upcoming Apache Spark 2.3
What's New in Upcoming Apache Spark 2.3What's New in Upcoming Apache Spark 2.3
What's New in Upcoming Apache Spark 2.3Databricks
 
RAPIDS – Open GPU-accelerated Data Science
RAPIDS – Open GPU-accelerated Data ScienceRAPIDS – Open GPU-accelerated Data Science
RAPIDS – Open GPU-accelerated Data ScienceData Works MD
 
GOAI: GPU-Accelerated Data Science DataSciCon 2017
GOAI: GPU-Accelerated Data Science DataSciCon 2017GOAI: GPU-Accelerated Data Science DataSciCon 2017
GOAI: GPU-Accelerated Data Science DataSciCon 2017Joshua Patterson
 
NETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCE
NETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCENETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCE
NETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCEcscpconf
 
Build Deep Learning Applications for Big Data Platforms (CVPR 2018 tutorial)
Build Deep Learning Applications for Big Data Platforms (CVPR 2018 tutorial)Build Deep Learning Applications for Big Data Platforms (CVPR 2018 tutorial)
Build Deep Learning Applications for Big Data Platforms (CVPR 2018 tutorial)Jason Dai
 
2018 02-08-what's-new-in-apache-spark-2.3
2018 02-08-what's-new-in-apache-spark-2.3 2018 02-08-what's-new-in-apache-spark-2.3
2018 02-08-what's-new-in-apache-spark-2.3 Chester Chen
 
NETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCE
NETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCENETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCE
NETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCEcsandit
 
PNDA - Platform for Network Data Analytics
PNDA - Platform for Network Data AnalyticsPNDA - Platform for Network Data Analytics
PNDA - Platform for Network Data AnalyticsJohn Evans
 
Pivotal Greenplum: Postgres-Based. Multi-Cloud. Built for Analytics & AI - Gr...
Pivotal Greenplum: Postgres-Based. Multi-Cloud. Built for Analytics & AI - Gr...Pivotal Greenplum: Postgres-Based. Multi-Cloud. Built for Analytics & AI - Gr...
Pivotal Greenplum: Postgres-Based. Multi-Cloud. Built for Analytics & AI - Gr...VMware Tanzu
 
Cloud-Native Patterns for Data-Intensive Applications
Cloud-Native Patterns for Data-Intensive ApplicationsCloud-Native Patterns for Data-Intensive Applications
Cloud-Native Patterns for Data-Intensive ApplicationsVMware Tanzu
 
apidays LIVE Paris - GraphQL meshes by Jens Neuse
apidays LIVE Paris - GraphQL meshes by Jens Neuseapidays LIVE Paris - GraphQL meshes by Jens Neuse
apidays LIVE Paris - GraphQL meshes by Jens Neuseapidays
 
Accelerating Apache Spark by Several Orders of Magnitude with GPUs and RAPIDS...
Accelerating Apache Spark by Several Orders of Magnitude with GPUs and RAPIDS...Accelerating Apache Spark by Several Orders of Magnitude with GPUs and RAPIDS...
Accelerating Apache Spark by Several Orders of Magnitude with GPUs and RAPIDS...Databricks
 

Similaire à Better Together: How Graph database enables easy data integration with Spark and Kafka in the Cloud (20)

How a distributed graph analytics platform uses Apache Kafka for data ingesti...
How a distributed graph analytics platform uses Apache Kafka for data ingesti...How a distributed graph analytics platform uses Apache Kafka for data ingesti...
How a distributed graph analytics platform uses Apache Kafka for data ingesti...
 
RAPIDS cuGraph – Accelerating all your Graph needs
RAPIDS cuGraph – Accelerating all your Graph needsRAPIDS cuGraph – Accelerating all your Graph needs
RAPIDS cuGraph – Accelerating all your Graph needs
 
Comparing three data ingestion approaches where Apache Kafka integrates with ...
Comparing three data ingestion approaches where Apache Kafka integrates with ...Comparing three data ingestion approaches where Apache Kafka integrates with ...
Comparing three data ingestion approaches where Apache Kafka integrates with ...
 
Rapids: Data Science on GPUs
Rapids: Data Science on GPUsRapids: Data Science on GPUs
Rapids: Data Science on GPUs
 
NVIDIA Rapids presentation
NVIDIA Rapids presentationNVIDIA Rapids presentation
NVIDIA Rapids presentation
 
GPU-Accelerating UDFs in PySpark with Numba and PyGDF
GPU-Accelerating UDFs in PySpark with Numba and PyGDFGPU-Accelerating UDFs in PySpark with Numba and PyGDF
GPU-Accelerating UDFs in PySpark with Numba and PyGDF
 
Deploying Data Science Engines to Production
Deploying Data Science Engines to ProductionDeploying Data Science Engines to Production
Deploying Data Science Engines to Production
 
What's New in Upcoming Apache Spark 2.3
What's New in Upcoming Apache Spark 2.3What's New in Upcoming Apache Spark 2.3
What's New in Upcoming Apache Spark 2.3
 
RAPIDS – Open GPU-accelerated Data Science
RAPIDS – Open GPU-accelerated Data ScienceRAPIDS – Open GPU-accelerated Data Science
RAPIDS – Open GPU-accelerated Data Science
 
GOAI: GPU-Accelerated Data Science DataSciCon 2017
GOAI: GPU-Accelerated Data Science DataSciCon 2017GOAI: GPU-Accelerated Data Science DataSciCon 2017
GOAI: GPU-Accelerated Data Science DataSciCon 2017
 
NETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCE
NETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCENETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCE
NETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCE
 
Build Deep Learning Applications for Big Data Platforms (CVPR 2018 tutorial)
Build Deep Learning Applications for Big Data Platforms (CVPR 2018 tutorial)Build Deep Learning Applications for Big Data Platforms (CVPR 2018 tutorial)
Build Deep Learning Applications for Big Data Platforms (CVPR 2018 tutorial)
 
2018 02-08-what's-new-in-apache-spark-2.3
2018 02-08-what's-new-in-apache-spark-2.3 2018 02-08-what's-new-in-apache-spark-2.3
2018 02-08-what's-new-in-apache-spark-2.3
 
NETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCE
NETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCENETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCE
NETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCE
 
PNDA - Platform for Network Data Analytics
PNDA - Platform for Network Data AnalyticsPNDA - Platform for Network Data Analytics
PNDA - Platform for Network Data Analytics
 
Pivotal Greenplum: Postgres-Based. Multi-Cloud. Built for Analytics & AI - Gr...
Pivotal Greenplum: Postgres-Based. Multi-Cloud. Built for Analytics & AI - Gr...Pivotal Greenplum: Postgres-Based. Multi-Cloud. Built for Analytics & AI - Gr...
Pivotal Greenplum: Postgres-Based. Multi-Cloud. Built for Analytics & AI - Gr...
 
Cloud-Native Patterns for Data-Intensive Applications
Cloud-Native Patterns for Data-Intensive ApplicationsCloud-Native Patterns for Data-Intensive Applications
Cloud-Native Patterns for Data-Intensive Applications
 
apidays LIVE Paris - GraphQL meshes by Jens Neuse
apidays LIVE Paris - GraphQL meshes by Jens Neuseapidays LIVE Paris - GraphQL meshes by Jens Neuse
apidays LIVE Paris - GraphQL meshes by Jens Neuse
 
Accelerating Apache Spark by Several Orders of Magnitude with GPUs and RAPIDS...
Accelerating Apache Spark by Several Orders of Magnitude with GPUs and RAPIDS...Accelerating Apache Spark by Several Orders of Magnitude with GPUs and RAPIDS...
Accelerating Apache Spark by Several Orders of Magnitude with GPUs and RAPIDS...
 
RAPIDS Overview
RAPIDS OverviewRAPIDS Overview
RAPIDS Overview
 

Plus de TigerGraph

MAXIMIZING THE VALUE OF SCIENTIFIC INFORMATION TO ACCELERATE INNOVATION
MAXIMIZING THE VALUE OF SCIENTIFIC INFORMATION TO ACCELERATE INNOVATIONMAXIMIZING THE VALUE OF SCIENTIFIC INFORMATION TO ACCELERATE INNOVATION
MAXIMIZING THE VALUE OF SCIENTIFIC INFORMATION TO ACCELERATE INNOVATIONTigerGraph
 
Building an accurate understanding of consumers based on real-world signals
Building an accurate understanding of consumers based on real-world signalsBuilding an accurate understanding of consumers based on real-world signals
Building an accurate understanding of consumers based on real-world signalsTigerGraph
 
Care Intervention Assistant - Omaha Clinical Data Information System
Care Intervention Assistant - Omaha Clinical Data Information SystemCare Intervention Assistant - Omaha Clinical Data Information System
Care Intervention Assistant - Omaha Clinical Data Information SystemTigerGraph
 
Correspondent Banking Networks
Correspondent Banking NetworksCorrespondent Banking Networks
Correspondent Banking NetworksTigerGraph
 
Delivering Large Scale Real-time Graph Analytics with Dell Infrastructure and...
Delivering Large Scale Real-time Graph Analytics with Dell Infrastructure and...Delivering Large Scale Real-time Graph Analytics with Dell Infrastructure and...
Delivering Large Scale Real-time Graph Analytics with Dell Infrastructure and...TigerGraph
 
Deploying an End-to-End TigerGraph Enterprise Architecture using Kafka, Maria...
Deploying an End-to-End TigerGraph Enterprise Architecture using Kafka, Maria...Deploying an End-to-End TigerGraph Enterprise Architecture using Kafka, Maria...
Deploying an End-to-End TigerGraph Enterprise Architecture using Kafka, Maria...TigerGraph
 
Fraud Detection and Compliance with Graph Learning
Fraud Detection and Compliance with Graph LearningFraud Detection and Compliance with Graph Learning
Fraud Detection and Compliance with Graph LearningTigerGraph
 
Fraudulent credit card cash-out detection On Graphs
Fraudulent credit card cash-out detection On GraphsFraudulent credit card cash-out detection On Graphs
Fraudulent credit card cash-out detection On GraphsTigerGraph
 
FROM DATAFRAMES TO GRAPH Data Science with pyTigerGraph
FROM DATAFRAMES TO GRAPH Data Science with pyTigerGraphFROM DATAFRAMES TO GRAPH Data Science with pyTigerGraph
FROM DATAFRAMES TO GRAPH Data Science with pyTigerGraphTigerGraph
 
Customer Experience Management
Customer Experience ManagementCustomer Experience Management
Customer Experience ManagementTigerGraph
 
Graph+AI for Fin. Services
Graph+AI for Fin. ServicesGraph+AI for Fin. Services
Graph+AI for Fin. ServicesTigerGraph
 
Davraz - A graph visualization and exploration software.
Davraz - A graph visualization and exploration software.Davraz - A graph visualization and exploration software.
Davraz - A graph visualization and exploration software.TigerGraph
 
Plume - A Code Property Graph Extraction and Analysis Library
Plume - A Code Property Graph Extraction and Analysis LibraryPlume - A Code Property Graph Extraction and Analysis Library
Plume - A Code Property Graph Extraction and Analysis LibraryTigerGraph
 
GRAPHS FOR THE FUTURE ENERGY SYSTEMS
GRAPHS FOR THE FUTURE ENERGY SYSTEMSGRAPHS FOR THE FUTURE ENERGY SYSTEMS
GRAPHS FOR THE FUTURE ENERGY SYSTEMSTigerGraph
 
Hardware Accelerated Machine Learning Solution for Detecting Fraud and Money ...
Hardware Accelerated Machine Learning Solution for Detecting Fraud and Money ...Hardware Accelerated Machine Learning Solution for Detecting Fraud and Money ...
Hardware Accelerated Machine Learning Solution for Detecting Fraud and Money ...TigerGraph
 
How to Build An AI Based Customer Data Platform: Learn the design patterns fo...
How to Build An AI Based Customer Data Platform: Learn the design patterns fo...How to Build An AI Based Customer Data Platform: Learn the design patterns fo...
How to Build An AI Based Customer Data Platform: Learn the design patterns fo...TigerGraph
 
Machine Learning Feature Design with TigerGraph 3.0 No-Code GUI
Machine Learning Feature Design with TigerGraph 3.0 No-Code GUIMachine Learning Feature Design with TigerGraph 3.0 No-Code GUI
Machine Learning Feature Design with TigerGraph 3.0 No-Code GUITigerGraph
 
Recommendation Engine with In-Database Machine Learning
Recommendation Engine with In-Database Machine LearningRecommendation Engine with In-Database Machine Learning
Recommendation Engine with In-Database Machine LearningTigerGraph
 
Supply Chain and Logistics Management with Graph & AI
Supply Chain and Logistics Management with Graph & AISupply Chain and Logistics Management with Graph & AI
Supply Chain and Logistics Management with Graph & AITigerGraph
 

Plus de TigerGraph (20)

MAXIMIZING THE VALUE OF SCIENTIFIC INFORMATION TO ACCELERATE INNOVATION
MAXIMIZING THE VALUE OF SCIENTIFIC INFORMATION TO ACCELERATE INNOVATIONMAXIMIZING THE VALUE OF SCIENTIFIC INFORMATION TO ACCELERATE INNOVATION
MAXIMIZING THE VALUE OF SCIENTIFIC INFORMATION TO ACCELERATE INNOVATION
 
Building an accurate understanding of consumers based on real-world signals
Building an accurate understanding of consumers based on real-world signalsBuilding an accurate understanding of consumers based on real-world signals
Building an accurate understanding of consumers based on real-world signals
 
Care Intervention Assistant - Omaha Clinical Data Information System
Care Intervention Assistant - Omaha Clinical Data Information SystemCare Intervention Assistant - Omaha Clinical Data Information System
Care Intervention Assistant - Omaha Clinical Data Information System
 
Correspondent Banking Networks
Correspondent Banking NetworksCorrespondent Banking Networks
Correspondent Banking Networks
 
Delivering Large Scale Real-time Graph Analytics with Dell Infrastructure and...
Delivering Large Scale Real-time Graph Analytics with Dell Infrastructure and...Delivering Large Scale Real-time Graph Analytics with Dell Infrastructure and...
Delivering Large Scale Real-time Graph Analytics with Dell Infrastructure and...
 
Deploying an End-to-End TigerGraph Enterprise Architecture using Kafka, Maria...
Deploying an End-to-End TigerGraph Enterprise Architecture using Kafka, Maria...Deploying an End-to-End TigerGraph Enterprise Architecture using Kafka, Maria...
Deploying an End-to-End TigerGraph Enterprise Architecture using Kafka, Maria...
 
Fraud Detection and Compliance with Graph Learning
Fraud Detection and Compliance with Graph LearningFraud Detection and Compliance with Graph Learning
Fraud Detection and Compliance with Graph Learning
 
Fraudulent credit card cash-out detection On Graphs
Fraudulent credit card cash-out detection On GraphsFraudulent credit card cash-out detection On Graphs
Fraudulent credit card cash-out detection On Graphs
 
FROM DATAFRAMES TO GRAPH Data Science with pyTigerGraph
FROM DATAFRAMES TO GRAPH Data Science with pyTigerGraphFROM DATAFRAMES TO GRAPH Data Science with pyTigerGraph
FROM DATAFRAMES TO GRAPH Data Science with pyTigerGraph
 
Customer Experience Management
Customer Experience ManagementCustomer Experience Management
Customer Experience Management
 
Graph+AI for Fin. Services
Graph+AI for Fin. ServicesGraph+AI for Fin. Services
Graph+AI for Fin. Services
 
Davraz - A graph visualization and exploration software.
Davraz - A graph visualization and exploration software.Davraz - A graph visualization and exploration software.
Davraz - A graph visualization and exploration software.
 
Plume - A Code Property Graph Extraction and Analysis Library
Plume - A Code Property Graph Extraction and Analysis LibraryPlume - A Code Property Graph Extraction and Analysis Library
Plume - A Code Property Graph Extraction and Analysis Library
 
TigerGraph.js
TigerGraph.jsTigerGraph.js
TigerGraph.js
 
GRAPHS FOR THE FUTURE ENERGY SYSTEMS
GRAPHS FOR THE FUTURE ENERGY SYSTEMSGRAPHS FOR THE FUTURE ENERGY SYSTEMS
GRAPHS FOR THE FUTURE ENERGY SYSTEMS
 
Hardware Accelerated Machine Learning Solution for Detecting Fraud and Money ...
Hardware Accelerated Machine Learning Solution for Detecting Fraud and Money ...Hardware Accelerated Machine Learning Solution for Detecting Fraud and Money ...
Hardware Accelerated Machine Learning Solution for Detecting Fraud and Money ...
 
How to Build An AI Based Customer Data Platform: Learn the design patterns fo...
How to Build An AI Based Customer Data Platform: Learn the design patterns fo...How to Build An AI Based Customer Data Platform: Learn the design patterns fo...
How to Build An AI Based Customer Data Platform: Learn the design patterns fo...
 
Machine Learning Feature Design with TigerGraph 3.0 No-Code GUI
Machine Learning Feature Design with TigerGraph 3.0 No-Code GUIMachine Learning Feature Design with TigerGraph 3.0 No-Code GUI
Machine Learning Feature Design with TigerGraph 3.0 No-Code GUI
 
Recommendation Engine with In-Database Machine Learning
Recommendation Engine with In-Database Machine LearningRecommendation Engine with In-Database Machine Learning
Recommendation Engine with In-Database Machine Learning
 
Supply Chain and Logistics Management with Graph & AI
Supply Chain and Logistics Management with Graph & AISupply Chain and Logistics Management with Graph & AI
Supply Chain and Logistics Management with Graph & AI
 

Dernier

5 Ds to Define Data Archiving Best Practices
5 Ds to Define Data Archiving Best Practices5 Ds to Define Data Archiving Best Practices
5 Ds to Define Data Archiving Best PracticesDataArchiva
 
Master's Thesis - Data Science - Presentation
Master's Thesis - Data Science - PresentationMaster's Thesis - Data Science - Presentation
Master's Thesis - Data Science - PresentationGiorgio Carbone
 
Rock Songs common codes and conventions.pptx
Rock Songs common codes and conventions.pptxRock Songs common codes and conventions.pptx
Rock Songs common codes and conventions.pptxFinatron037
 
How is Real-Time Analytics Different from Traditional OLAP?
How is Real-Time Analytics Different from Traditional OLAP?How is Real-Time Analytics Different from Traditional OLAP?
How is Real-Time Analytics Different from Traditional OLAP?sonikadigital1
 
Strategic CX: A Deep Dive into Voice of the Customer Insights for Clarity
Strategic CX: A Deep Dive into Voice of the Customer Insights for ClarityStrategic CX: A Deep Dive into Voice of the Customer Insights for Clarity
Strategic CX: A Deep Dive into Voice of the Customer Insights for ClarityAggregage
 
Mapping the pubmed data under different suptopics using NLP.pptx
Mapping the pubmed data under different suptopics using NLP.pptxMapping the pubmed data under different suptopics using NLP.pptx
Mapping the pubmed data under different suptopics using NLP.pptxVenkatasubramani13
 
The Universal GTM - how we design GTM and dataLayer
The Universal GTM - how we design GTM and dataLayerThe Universal GTM - how we design GTM and dataLayer
The Universal GTM - how we design GTM and dataLayerPavel Šabatka
 
CCS336-Cloud-Services-Management-Lecture-Notes-1.pptx
CCS336-Cloud-Services-Management-Lecture-Notes-1.pptxCCS336-Cloud-Services-Management-Lecture-Notes-1.pptx
CCS336-Cloud-Services-Management-Lecture-Notes-1.pptxdhiyaneswaranv1
 
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptx
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptxTINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptx
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptxDwiAyuSitiHartinah
 
ChistaDATA Real-Time DATA Analytics Infrastructure
ChistaDATA Real-Time DATA Analytics InfrastructureChistaDATA Real-Time DATA Analytics Infrastructure
ChistaDATA Real-Time DATA Analytics Infrastructuresonikadigital1
 
Cash Is Still King: ATM market research '2023
Cash Is Still King: ATM market research '2023Cash Is Still King: ATM market research '2023
Cash Is Still King: ATM market research '2023Vladislav Solodkiy
 
CI, CD -Tools to integrate without manual intervention
CI, CD -Tools to integrate without manual interventionCI, CD -Tools to integrate without manual intervention
CI, CD -Tools to integrate without manual interventionajayrajaganeshkayala
 
Virtuosoft SmartSync Product Introduction
Virtuosoft SmartSync Product IntroductionVirtuosoft SmartSync Product Introduction
Virtuosoft SmartSync Product Introductionsanjaymuralee1
 
Elements of language learning - an analysis of how different elements of lang...
Elements of language learning - an analysis of how different elements of lang...Elements of language learning - an analysis of how different elements of lang...
Elements of language learning - an analysis of how different elements of lang...PrithaVashisht1
 
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024Guido X Jansen
 
Optimal Decision Making - Cost Reduction in Logistics
Optimal Decision Making - Cost Reduction in LogisticsOptimal Decision Making - Cost Reduction in Logistics
Optimal Decision Making - Cost Reduction in LogisticsThinkInnovation
 

Dernier (16)

5 Ds to Define Data Archiving Best Practices
5 Ds to Define Data Archiving Best Practices5 Ds to Define Data Archiving Best Practices
5 Ds to Define Data Archiving Best Practices
 
Master's Thesis - Data Science - Presentation
Master's Thesis - Data Science - PresentationMaster's Thesis - Data Science - Presentation
Master's Thesis - Data Science - Presentation
 
Rock Songs common codes and conventions.pptx
Rock Songs common codes and conventions.pptxRock Songs common codes and conventions.pptx
Rock Songs common codes and conventions.pptx
 
How is Real-Time Analytics Different from Traditional OLAP?
How is Real-Time Analytics Different from Traditional OLAP?How is Real-Time Analytics Different from Traditional OLAP?
How is Real-Time Analytics Different from Traditional OLAP?
 
Strategic CX: A Deep Dive into Voice of the Customer Insights for Clarity
Strategic CX: A Deep Dive into Voice of the Customer Insights for ClarityStrategic CX: A Deep Dive into Voice of the Customer Insights for Clarity
Strategic CX: A Deep Dive into Voice of the Customer Insights for Clarity
 
Mapping the pubmed data under different suptopics using NLP.pptx
Mapping the pubmed data under different suptopics using NLP.pptxMapping the pubmed data under different suptopics using NLP.pptx
Mapping the pubmed data under different suptopics using NLP.pptx
 
The Universal GTM - how we design GTM and dataLayer
The Universal GTM - how we design GTM and dataLayerThe Universal GTM - how we design GTM and dataLayer
The Universal GTM - how we design GTM and dataLayer
 
CCS336-Cloud-Services-Management-Lecture-Notes-1.pptx
CCS336-Cloud-Services-Management-Lecture-Notes-1.pptxCCS336-Cloud-Services-Management-Lecture-Notes-1.pptx
CCS336-Cloud-Services-Management-Lecture-Notes-1.pptx
 
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptx
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptxTINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptx
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptx
 
ChistaDATA Real-Time DATA Analytics Infrastructure
ChistaDATA Real-Time DATA Analytics InfrastructureChistaDATA Real-Time DATA Analytics Infrastructure
ChistaDATA Real-Time DATA Analytics Infrastructure
 
Cash Is Still King: ATM market research '2023
Cash Is Still King: ATM market research '2023Cash Is Still King: ATM market research '2023
Cash Is Still King: ATM market research '2023
 
CI, CD -Tools to integrate without manual intervention
CI, CD -Tools to integrate without manual interventionCI, CD -Tools to integrate without manual intervention
CI, CD -Tools to integrate without manual intervention
 
Virtuosoft SmartSync Product Introduction
Virtuosoft SmartSync Product IntroductionVirtuosoft SmartSync Product Introduction
Virtuosoft SmartSync Product Introduction
 
Elements of language learning - an analysis of how different elements of lang...
Elements of language learning - an analysis of how different elements of lang...Elements of language learning - an analysis of how different elements of lang...
Elements of language learning - an analysis of how different elements of lang...
 
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024
 
Optimal Decision Making - Cost Reduction in Logistics
Optimal Decision Making - Cost Reduction in LogisticsOptimal Decision Making - Cost Reduction in Logistics
Optimal Decision Making - Cost Reduction in Logistics
 

Better Together: How Graph database enables easy data integration with Spark and Kafka in the Cloud

  • 1. Better Together: How Graph database enables easy data integration with Spark and Kafka in the Cloud September 30th 2020 1
  • 2. | GRAPHAIWORLD.COM | #GRAPHAIWORLD | Today's Speakers Emma Liu Product Manager ● BS in Engineering from Harvey Mudd College, MS in Engineering Systems from MIT ● Prior work experience at Oracle and MarkLogic ● Focus - Cloud, Containers, Enterprise Infra, Monitoring, Management, Connectors Rayees Pasha Product Manager ● MS in Computer Science from University of Memphis ● Prior Lead PM and ENG positions at Workday, Hitachi and HP ● Expertise in Database Management and Big Data Technologies 2
  • 3. | GRAPHAIWORLD.COM | #GRAPHAIWORLD | 1 TigerGraph Architecture and Data Ingestion Overview TigerGraph and Spark Data Pipeline TigerGraph and Kafka Data Pipeline Today’s Outline 3 2 3
  • 4. | GRAPHAIWORLD.COM | #GRAPHAIWORLD | SYSTEM ARCHITECTURE OVERVIEW 4
  • 5. | GRAPHAIWORLD.COM | #GRAPHAIWORLD | The TigerGraph Difference Feature Design Difference Benefit Real-Time Deep-Link Querying ● Native Graph design ● C++ engine, for high performance ● Storage Architecture ● Uncovers hard-to-find patterns ● Operational, real-time ● HTAP: Transactions+Analytics Handling Massive Scale ● Distributed DB architecture ● Massively parallel processing ● Compressed storage reduces footprint and messaging ● Integrates all your data ● Automatic partitioning ● Elastic scaling of resource usage In-Database Analytics ● GSQL: High-level yet Turing-complete language ● User-extensible graph algorithm library, runs in-DB ● ACID (OLTP) and Accumulators (OLAP) ● Avoids transferring data ● Richer graph context ● In-DB machine learning 5 to 10+ hops deep 5
  • 6. | GRAPHAIWORLD.COM | #GRAPHAIWORLD | TigerGraph Architecture
  • 7. | GRAPHAIWORLD.COM | #GRAPHAIWORLD | Data Ingestion 7 Step 3 Each GPE consumes the partial data updates, processes it and puts it on disk. Loading Jobs and POST use UPSERT semantics: ● If vertex/edge doesn't yet exist, create it. ● If vertex/edge already exists, update it. ● Idempotent Step 1 Data integration through the following ways to ingest in user source data. ● Bulk load of data files or a Kafka stream in CSV or JSON format ● HTTP POSTs via REST services (JSON) ● GSQL Insert commands Step 2 Dispatcher takes in the data ingestion requests in the form of updates to the database. 1. Query IDS to get internal IDs 2. Convert data to internal format 3. Send data to one or more corresponding GPEs
  • 8. | GRAPHAIWORLD.COM | #GRAPHAIWORLD | Data Ingestion 8 Incremental Data Nginx Restpp GPE GPE GPE Disk Disk Disk CSV/JSON Insert/Update/Delete Vertices and Edges Listen to corresponding topic for new messages Acknowledge Response Incoming Outgoing Synchronize data to disk GSE(IDS) ID Translation Kafka Kafka Kafka Server 1 Server 2 Server 3 Kafka Cluster In-memory copy of data
  • 9. | GRAPHAIWORLD.COM | #GRAPHAIWORLD | Spark and TigerGraph 9
  • 10. | GRAPHAIWORLD.COM | #GRAPHAIWORLD | Spark + TigerGraph Data Pipeline
  • 11. | GRAPHAIWORLD.COM | #GRAPHAIWORLD | Typical Spark + TigerGraph Integration ● Data Preparation and Integration (TigerGraph/Spark) ● Unsupervised Learning (TigerGraph) ● Feature Extraction for Supervised Learning (TigerGraph/Spark) ● Model Training (Spark) ● Validate and Apply Model (TigerGraph) ● Visualize and Explore Interconnected Data (TigerGraph) 11
  • 12. | GRAPHAIWORLD.COM | #GRAPHAIWORLD | Spark and TigerGraph Data Pipeline Static Data Sources TigerGraph JDBC Driver Streaming Data Sources 12
  • 13. | GRAPHAIWORLD.COM | #GRAPHAIWORLD | JDBC Driver ● Type 4 driver ● Support Read and Write bi-directional data flow to TigerGraph ● Read: Converts ResultSet to DataFrame ● Write: Load DataFrame and files to vertex/edge in TigerGraph ● Supports REST endpoints of built-in, compiled and interpreted GSQL queries from TigerGraph ● Open Source: ● https://github.com/tigergraph/ecosys/tree/master/tools/etl/tg-jdbc-driver 13
  • 14. | GRAPHAIWORLD.COM | #GRAPHAIWORLD | Supervised ML with TigerGraph - Detecting Phone-Based Fraud by Analyzing Network or Graph Relationship Features at China Mobile Download the solution brief at - https://info.tigergraph.com/MachineLearning 14
  • 15. | GRAPHAIWORLD.COM | #GRAPHAIWORLD | DEMO 15
  • 16. | GRAPHAIWORLD.COM | #GRAPHAIWORLD | Kafka and TigerGraph 16
  • 17. | GRAPHAIWORLD.COM | #GRAPHAIWORLD | Kafka and TigerGraph Data Pipeline Static Data Sources Streaming Data Sources Kafka Loader 17
  • 18. | GRAPHAIWORLD.COM | #GRAPHAIWORLD | Kafka Loader - Speed to Value from Real-time Streaming Data • Reduce Data Availability Gap and Accelerate Time to Value • Native Integration with Real-time Streaming Data and Batch Data • Enables Real-time Graph Feature Updates with Streaming Data in Machine Learning Use Cases • Decrease Learning Curve With Familiar Syntax • GSQL Support with Consistent Data Loading Syntax • Maintain Separation of Control for Data Loading • Designed with Built-in MultiGraph Support 18
  • 19. | GRAPHAIWORLD.COM | #GRAPHAIWORLD | Kafka Loader : Three Steps Consistent with GSQL Data Loading Steps Step 1: Define the Data Source Step 2: Create a Loading Job Step 3: Run the Loading Job 19
  • 20. | GRAPHAIWORLD.COM | #GRAPHAIWORLD | Kafka Loader High Level Architecture ● Connect to External Kafka Cluster ● User Commands Through GSQL Server ● Configuration Settings: ○ Config 1: Kakfa Cluster Configuration ○ Config 2: Topic/Partition/Offset Info 20
  • 21. | GRAPHAIWORLD.COM | #GRAPHAIWORLD | DEMO 21
  • 22. | GRAPHAIWORLD.COM | #GRAPHAIWORLD | TigerGraph Architecture + Spark + Kakfa 22
  • 23. Get Started for Free ● Try TigerGraph Cloud ( tgcloud.io ) ● Download TigerGraph’s Developer Edition ● Take a Test Drive - Online Demo ● Get TigerGraph Certified ● Join the Community @TigerGraphDB /tigergraph /TigerGraphDB /company/TigerGraph 23