ELIS – Multimedia Lab
Reducing HTTP traffic for
scalable linked data consumption
Query Execution Optimization for
Clients of Triple Pattern Fragments
Joachim Van Herwegen, Ruben Verborgh, Erik Mannens, Rik Van De Walle
SPARQL endpoints, data dumps, simple interfaces, …
Still looking for the ultimate linked data solution
Full SPARQL support
High scalability
Fast response time
Low server & client load
…
Not found yet, so we focused on improving the response time
for clients using simple interfaces (Triple Pattern Fragments).
Accessing linked data
Accessing Linked Data
Problem statement
Improved join tree
Optimizing local joins
Bringing it all together
Query Execution Optimization for
Clients of Triple Pattern Fragments
Linked Data access extremes
SPARQL protocol
Live data
Full SPARQL support
High server load
Data dump
Static data
Remote: 1 query
Local: full queries
High client load
Generic way to describe how linked data can be accessed
Data: Results when accessing a selector
Metadata: Description of the fragment
Controls: Links to other fragments
Verborgh et al. – Web-scale querying through Linked Data Fragments
Linked Data Fragments
Accessing data through a SPARQL endpoint
Data: Bindings matching a SPARQL query
Metadata: { } (data contains everything needed)
Controls: { } (interface can answer everything)
SPARQL endpoint
Accessing data through Triple Pattern Fragments
Data: Triples matching a triple pattern
Metadata: Count estimate, page size, etc.
Controls: First page, next page, root fragment
Triple Pattern Fragments
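As a rough illustration of how a client might consume this interface, the sketch below models one page of a fragment with its data, metadata, and controls, and downloads a whole fragment by following the next-page control. `FragmentPage`, `download_all`, and the `fetch` callback are hypothetical names, and the pages live in an in-memory dictionary instead of being requested over HTTP.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Triple = Tuple[str, str, str]


@dataclass
class FragmentPage:
    triples: List[Triple]      # data: triples matching the pattern
    count_estimate: int        # metadata: estimated total number of matches
    page_size: int             # metadata: maximum triples per page
    next_url: Optional[str]    # control: link to the next page, None on the last


def download_all(first_url: str, fetch: Callable[[str], FragmentPage]):
    """Follow next-page controls; return all triples and the number of requests."""
    triples, requests, url = [], 0, first_url
    while url is not None:
        page = fetch(url)      # one HTTP call per page in a real client
        requests += 1
        triples.extend(page.triples)
        url = page.next_url
    return triples, requests


# Toy fragment of three triples served in pages of two (made-up data).
PAGES = {
    "page1": FragmentPage([("a", "p", "b"), ("c", "p", "d")], 3, 2, "page2"),
    "page2": FragmentPage([("e", "p", "f")], 3, 2, None),
}
all_triples, calls = download_all("page1", PAGES.__getitem__)
print(len(all_triples), calls)
```

The number of requests grows with the fragment size divided by the page size, which is why downloading a large pattern completely is expensive.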
Triple Pattern Fragments
[Figure: an example Triple Pattern Fragment with its URI, query, results, and metadata/controls labelled]
SELECT ?person ?city WHERE {
  ?person a db:Architect.                            # 1,200 triples
  ?person db:birthPlace ?city.                       # 430,000 triples
  ?city dc:subject db:Category:Capitals_in_Europe.   # 60 triples
}
Start from the smallest pattern, apply bindings and do recursion
Greedy algorithm
[Figure: greedy join order Capitals → birthPlace → architect, labelled with 1, 400, and 40,000 HTTP calls]
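The greedy strategy can be sketched on a toy dataset as follows. This is a simplified illustration, not the client's actual implementation: `ToyFragments` is a hypothetical in-memory stand-in for a server, the triples are made up, and reading a pattern's count is treated as free even though a real client obtains it from fragment metadata.

```python
from math import ceil


class ToyFragments:
    """In-memory stand-in for a Triple Pattern Fragments server."""

    def __init__(self, triples, page_size):
        self.triples = list(triples)
        self.page_size = page_size
        self.calls = 0  # number of simulated HTTP calls

    def _matches(self, pattern):
        return [t for t in self.triples
                if all(term.startswith("?") or term == value
                       for term, value in zip(pattern, t))]

    def count(self, pattern):
        # Treated as free here; a real client reads it from fragment metadata.
        return len(self._matches(pattern))

    def fetch(self, pattern):
        matches = self._matches(pattern)
        # Every page is one call; an empty fragment still costs one call.
        self.calls += max(1, ceil(len(matches) / self.page_size))
        return matches


def greedy(server, patterns, binding=None):
    """Yield solution bindings, always expanding the smallest pattern first."""
    binding = binding or {}
    if not patterns:
        yield dict(binding)
        return
    bound = [tuple(binding.get(term, term) for term in p) for p in patterns]
    smallest = min(bound, key=server.count)
    rest = [p for p in bound if p is not smallest]
    for triple in server.fetch(smallest):
        extended = dict(binding)
        extended.update({term: value for term, value in zip(smallest, triple)
                         if term.startswith("?")})
        yield from greedy(server, rest, extended)


server = ToyFragments([
    ("alice", "type", "Architect"), ("bob", "type", "Architect"),
    ("alice", "birthPlace", "Paris"), ("bob", "birthPlace", "Ghent"),
    ("carol", "birthPlace", "Paris"),
    ("Paris", "subject", "Capitals"), ("Rome", "subject", "Capitals"),
], page_size=2)
query = [("?person", "type", "Architect"),
         ("?person", "birthPlace", "?city"),
         ("?city", "subject", "Capitals")]
solutions = list(greedy(server, query))
print(solutions, server.calls)
```

Each binding of the first pattern triggers its own requests for the remaining patterns, so the call count grows with the number of intermediate bindings, not with the size of the final result.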
SELECT ?person ?city WHERE {
  ?person a db:Architect.                            # 1,200 triples
  ?person db:birthPlace ?city.                       # 430,000 triples
  ?city dc:subject db:Category:Capitals_in_Europe.   # 60 triples
}
Find optimal solution for every pattern
Optimized algorithm
[Figure: optimized join order Capitals → birthPlace → architect, labelled with 1 and 400 HTTP calls; architect is marked "local" with 12]
Goal
Minimize the HTTP calls required to solve a basic graph pattern (BGP) query
Solution
Two possible roles for every pattern in the query:
Download the pattern completely
or
Bind a variable and download the resulting patterns
Estimate the best option for every pattern
Optimized algorithm
?player :team ?club 365,000 triples
?club :type :SoccerClub 16,000 triples
?club :ground ?city 15,000 triples
?city :country :Spain 7,000 triples
?player :birthPlace ?city 430,000 triples
Always download smallest pattern
Determine others on shared variables and results so far
Can change during runtime
Extended example
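To make "shared variables" concrete, the sketch below finds the smallest of the five patterns and lists the patterns that share a variable with it. The short labels (`team`, `type`, and so on) and the helper names are hypothetical; the patterns and counts are those listed on the slide.

```python
PATTERNS = {
    "team":       ("?player", ":team", "?club"),
    "type":       ("?club", ":type", ":SoccerClub"),
    "ground":     ("?club", ":ground", "?city"),
    "country":    ("?city", ":country", ":Spain"),
    "birthPlace": ("?player", ":birthPlace", "?city"),
}
COUNTS = {"team": 365_000, "type": 16_000, "ground": 15_000,
          "country": 7_000, "birthPlace": 430_000}


def variables(pattern):
    """Return the variables (terms starting with '?') of a triple pattern."""
    return {term for term in pattern if term.startswith("?")}


def neighbours(name):
    """Patterns sharing at least one variable with `name`, and which variables."""
    own = variables(PATTERNS[name])
    return {other: own & variables(p)
            for other, p in PATTERNS.items()
            if other != name and own & variables(p)}


smallest = min(COUNTS, key=COUNTS.get)   # the pattern that is always downloaded
print(smallest, neighbours(smallest))
```

Here the smallest pattern is the `:country :Spain` one, and it can pass `?city` bindings to the `:ground` and `:birthPlace` patterns.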
Extended example
[Figure: graph of the five triple patterns connected through their shared variables ?city, ?club, and ?player; the edge labels read "supplies ?city" and "supplied by ?city"]
First iteration
[Figure: the same graph of five triple patterns and their shared variables, as it stands in the first iteration]
Further iterations
[Figure: the same graph of five triple patterns and their shared variables, as it stands in later iterations]
Making sure no pattern is ignored
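Estimate which option requires the fewest HTTP calls.
Download:
$$\frac{\#\text{triples}}{\text{page size}}$$
Bind:
$$\underbrace{\frac{\text{avg triples per binding}}{\text{page size}}}_{\text{avg pages per binding}} \cdot \max_{\text{all suppliers}} \left( \underbrace{\frac{\text{bindings found}}{\text{triples downloaded}}}_{\text{avg bindings per triple}} \cdot \#\text{triples} \right)$$
Swap when necessary, taking into account work done so far
Updating pattern roles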
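The two estimates can be compared in a few lines of code. The sketch below is one possible reading of the formulas, not the authors' implementation: the names `Supplier`, `download_cost`, `bind_cost`, and `choose_role` are hypothetical, rounding pages up to at least one call per binding is an added assumption, and every supplier is assumed to have downloaded at least one triple. The example numbers are illustrative.

```python
from dataclasses import dataclass
from math import ceil


@dataclass
class Supplier:
    total_triples: int       # count estimate of the supplying pattern
    triples_downloaded: int  # how many of its triples were fetched so far
    bindings_found: int      # bindings extracted from those triples


def download_cost(total_triples, page_size):
    """Calls needed to page through the whole pattern."""
    return ceil(total_triples / page_size)


def bind_cost(avg_triples_per_binding, page_size, suppliers):
    """Estimated calls when the pattern is requested once per supplied binding."""
    # Assumption: a bound request costs at least one call, so round up.
    pages_per_binding = max(1, ceil(avg_triples_per_binding / page_size))
    expected_bindings = max(
        s.bindings_found / s.triples_downloaded * s.total_triples
        for s in suppliers
    )
    return pages_per_binding * expected_bindings


def choose_role(total_triples, avg_triples_per_binding, page_size, suppliers):
    """Pick the role with the lower estimated number of calls."""
    download = download_cost(total_triples, page_size)
    bind = bind_cost(avg_triples_per_binding, page_size, suppliers)
    return "download" if download <= bind else "bind"


# Illustrative numbers: a 1,200-triple pattern, pages of 100 triples, and one
# supplier that has yielded 100 bindings from its first 100 of 40,000 triples.
supplier = Supplier(total_triples=40_000, triples_downloaded=100, bindings_found=100)
print(download_cost(1_200, 100), bind_cost(1, 100, [supplier]),
      choose_role(1_200, 1, 100, [supplier]))
```

Because the supplier statistics change as more triples arrive, re-running such a comparison during execution is what allows a pattern's role to be swapped.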
Most join data from previous iterations can be reused.
Challenge: reuse as much data as possible.
Local joining
[Figure: triple data at iteration i and at iteration i+1, where new triples have arrived]
Join tree step
[Figure: at iteration i, Bindings i-1 and Triples i are joined into Bindings i; at iteration i+1, the new bindings and new triples produce the new part of Bindings i]
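One way to reuse join data between iterations is to keep each step's inputs and outputs and join only what is new. The sketch below is a minimal illustration under that assumption, not the authors' implementation; `join` and `JoinStep` are hypothetical names and the triples are toy data.

```python
def join(bindings, triples, pattern):
    """Extend every binding with every triple that is compatible with it."""
    results = []
    for binding in bindings:
        for triple in triples:
            extended = dict(binding)
            for term, value in zip(pattern, triple):
                if term.startswith("?"):
                    if extended.setdefault(term, value) != value:
                        break  # variable already bound to something else
                elif term != value:
                    break      # constant in the pattern does not match
            else:
                results.append(extended)
    return results


class JoinStep:
    """One node of the join tree that remembers its inputs and outputs."""

    def __init__(self, pattern):
        self.pattern = pattern
        self.bindings_in, self.triples, self.bindings_out = [], [], []

    def update(self, new_bindings, new_triples):
        """Join only what changed and return the newly produced bindings."""
        # Old bindings only need to be checked against the new triples.
        delta = join(self.bindings_in, new_triples, self.pattern)
        # New bindings must be checked against all triples, old and new.
        delta += join(new_bindings, self.triples + new_triples, self.pattern)
        self.bindings_in += new_bindings
        self.triples += new_triples
        self.bindings_out += delta
        return delta


step = JoinStep(("?club", ":ground", "?city"))
first = step.update([{"?city": "Madrid"}], [("Real", ":ground", "Madrid")])
second = step.update([], [("Atletico", ":ground", "Madrid"),
                          ("Ajax", ":ground", "Amsterdam")])
print(first, second)
```

The second call never revisits the pair that was already joined in the first call, yet the accumulated output equals what a full re-join would produce.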
Start with the largest unchanged, connected set of patterns.
Estimate remainder of join order based on pattern size and
connectivity.
Minimizing joins
[Figure: join tree in which new triples enter at one step and the changes propagate to the later steps]
Prevent local optima
Join tree instead of join path
Reuse local join data
Summary
Single machine
Intel Core i5-3230M CPU @ 2.60GHz
8 GB RAM
Both client and server
Artificial delay of 100 ms on the server to simulate network delay
Test setup
WatDiv benchmark queries, 100 ms delay on server
[Figure: two charts showing the median number of HTTP calls and the median time (s)]
Results
Fewer HTTP calls with more client-side processing
Ideal for slow connection situations
Still room for improvements
No parallelism
Focus on BGPs
More work per HTTP call
Not guaranteed to be better
Conclusion
Thank you!
Come see demo #13 on Thursday
Questions?

ESWC2015 - Query Optimization for Clients of Linked Data Fragments

  • 1. ELIS – Multimedia Lab Reducing HTTP traffic for scalable linked data consumption Query Execution Optimization for Clients of Triple Patterns Fragments Joachim Van Herwegen, Ruben Verborgh, Erik Mannens, Rik Van De Walle
  • 2. 2 ELIS – Multimedia Lab SPARQL endpoints, data dumps, simple interfaces, … Still looking for the ultimate linked data solution Full SPARQL support High scalability Fast response time Low server & client load … Not found yet, so we focused on improving the response time for clients using simple interfaces (Triple Pattern Fragments). Accessing linked data
  • 3. Outline: Accessing Linked Data; Problem statement; Improved join tree; Optimizing local joins; Bringing it all together.
  • 4. Section: Accessing Linked Data
  • 5. Linked Data access extremes. SPARQL protocol: live data; full SPARQL support; high server load. Data dump: static data; remote: 1 query; local: full queries; high client load.
  • 6. Linked Data Fragments. Generic way to describe how linked data can be accessed. Data: results when accessing a selector. Metadata: description of the fragment. Controls: links to other fragments. (Verborgh et al., Web-scale querying through Linked Data Fragments)
  • 7. SPARQL endpoint. Accessing data through a SPARQL endpoint. Data: bindings matching a SPARQL query. Metadata: { } (data contains everything needed). Controls: { } (interface can answer everything).
  • 8. Triple Pattern Fragments. Accessing data through Triple Pattern Fragments. Data: triples matching a triple pattern. Metadata: count estimate, page size, etc. Controls: first page, next page, root fragment.
  • 9. Triple Pattern Fragments. Figure labels: URI, query, results, metadata/controls.
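A minimal sketch of how a client might page through one Triple Pattern Fragment, assuming an in-memory list of triples in place of a real server. `FragmentPage`, `fetch_page`, and `download_all` are hypothetical names, and the integer `next_page` stands in for the next-page link; only the split into data, metadata, and controls comes from the slides.

```python
from dataclasses import dataclass
from typing import Optional

Triple = tuple[str, str, str]
# A pattern position is either a constant or None (standing in for a variable).
Pattern = tuple[Optional[str], Optional[str], Optional[str]]


@dataclass
class FragmentPage:
    triples: list[Triple]      # data: triples matching the pattern (one page)
    count_estimate: int        # metadata: total number of matching triples
    page_size: int             # metadata: maximum triples per page
    next_page: Optional[int]   # control: next page index, None on the last page


def fetch_page(store: list[Triple], pattern: Pattern, page: int,
               page_size: int = 100) -> FragmentPage:
    """Simulate one HTTP request for a single page of a fragment."""
    matching = [t for t in store
                if all(p is None or p == v for p, v in zip(pattern, t))]
    start = page * page_size
    has_next = start + page_size < len(matching)
    return FragmentPage(matching[start:start + page_size], len(matching),
                        page_size, page + 1 if has_next else None)


def download_all(store: list[Triple], pattern: Pattern,
                 page_size: int = 100) -> tuple[list[Triple], int]:
    """Follow the next-page control until it runs out; count the requests."""
    triples: list[Triple] = []
    calls = 0
    page: Optional[int] = 0
    while page is not None:
        fragment = fetch_page(store, pattern, page, page_size)
        calls += 1
        triples.extend(fragment.triples)
        page = fragment.next_page
    return triples, calls
```

Downloading a pattern completely therefore costs roughly one request per page, which is why the count estimate and page size in the metadata matter for planning.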
  • 10. Section: Problem statement
  • 11. Greedy algorithm
    SELECT ?person ?city WHERE {
      ?person a db:Architect.                             # 1200 triples
      ?person db:birthPlace ?city.                        # 430,000 triples
      ?city dc:subject db:Category:Capitals_in_Europe.    # 60 triples
    }
    Start from the smallest pattern, apply bindings, and recurse.
    Figure labels: Capitals 1, birthPlace 400, architect 40,000.
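The greedy strategy on this slide can be sketched as follows. This is an illustration under stated assumptions, not the authors' client: the store is an in-memory list, pattern sizes are computed locally instead of being read from fragment metadata, and `is_var`, `matches`, `bind`, and `greedy` are hypothetical helper names. Patterns that repeat the same variable are not handled.

```python
def is_var(term):
    """Variables are written with a leading '?', as in SPARQL."""
    return term.startswith("?")


def matches(store, pattern):
    """All triples in the store that match a (partially bound) pattern."""
    return [t for t in store
            if all(is_var(p) or p == v for p, v in zip(pattern, t))]


def bind(pattern, binding):
    """Replace every variable that already has a value."""
    return tuple(binding.get(term, term) for term in pattern)


def greedy(store, patterns, binding=None):
    """Start from the smallest pattern, apply its bindings, and recurse."""
    binding = binding or {}
    if not patterns:
        yield dict(binding)
        return
    bound = [bind(p, binding) for p in patterns]
    # Assumption: the size comes from a local count; a real client would
    # use the count estimate in the fragment metadata.
    smallest = min(bound, key=lambda p: len(matches(store, p)))
    rest = list(bound)
    rest.remove(smallest)
    for triple in matches(store, smallest):
        extended = dict(binding)
        extended.update({term: value
                         for term, value in zip(smallest, triple)
                         if is_var(term)})
        yield from greedy(store, rest, extended)
```

Each recursive step looks up one bound pattern per binding found so far, which is how a large intermediate result turns into a large number of requests.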
  • 12. Optimized algorithm
    SELECT ?person ?city WHERE {
      ?person a db:Architect.                             # 1200 triples
      ?person db:birthPlace ?city.                        # 430,000 triples
      ?city dc:subject db:Category:Capitals_in_Europe.    # 60 triples
    }
    Find the optimal solution for every pattern.
    Figure labels: Capitals 1, birthPlace 400, architect 12, local.
  • 13. Section: Improved join tree
  • 14. Optimized algorithm. Goal: minimize the HTTP calls required to solve a BGP query. Solution: every pattern in the query has 2 possible roles: download the pattern completely, or bind a variable and download the resulting patterns. Estimate the best option for every pattern.
  • 15. Extended example
    ?player :team ?club           365,000 triples
    ?club :type :SoccerClub        16,000 triples
    ?club :ground ?city            15,000 triples
    ?city :country :Spain           7,000 triples
    ?player :birthPlace ?city     430,000 triples
    Always download the smallest pattern. Determine the roles of the others based on shared variables and the results so far. Roles can change during runtime.
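  • 16. Extended example. Figure: graph of the patterns ?city :country :Spain, ?club :ground ?city, ?player :birthPlace ?city, ?player :team ?club, and ?club :type :SoccerClub, connected through the variables ?city, ?club, and ?player; edges are read as "supplies ?city" in one direction and "supplied by ?city" in the other.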
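One way to picture the supply relation between patterns is a breadth-first walk from the smallest pattern over shared variables. The sketch below is a simplification under stated assumptions: `variables` and `initial_roles` are hypothetical names, each pattern gets a single supplier, and the runtime role changes mentioned on the slides are not modeled.

```python
def variables(pattern):
    """The set of variables ('?'-prefixed terms) in a triple pattern."""
    return {term for term in pattern if term.startswith("?")}


def initial_roles(counts):
    """Assign an initial supplier to every pattern.

    counts maps each pattern to its estimated triple count. The smallest
    pattern is downloaded (role None); every other pattern is supplied by
    the first already-placed pattern it shares a variable with.
    """
    root = min(counts, key=counts.get)
    roles = {root: None}
    frontier = [root]
    while frontier:
        supplier = frontier.pop(0)
        for pattern in counts:
            if pattern in roles:
                continue
            shared = variables(supplier) & variables(pattern)
            if shared:
                # Record which pattern supplies which variable.
                roles[pattern] = (supplier, sorted(shared)[0])
                frontier.append(pattern)
    return roles
```

With the five patterns and counts of the extended example, `?city :country :Spain` becomes the downloaded root, and the two patterns containing `?city` are supplied by it.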
  • 17. First iteration. Figure: the pattern graph of the extended example (?city :country :Spain, ?club :ground ?city, ?player :birthPlace ?city, ?player :team ?club, ?club :type :SoccerClub).
  • 18. Further iterations. Making sure no pattern is ignored. Figure: the pattern graph of the extended example (?city :country :Spain, ?club :ground ?city, ?player :birthPlace ?city, ?player :team ?club, ?club :type :SoccerClub).
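  • 19. Updating pattern roles. Estimate which option requires the fewest HTTP calls. Download: $\frac{\#\text{triples}}{\text{pagesize}}$. Bind: $\frac{\text{avg triples/binding}}{\text{pagesize}} \cdot \max\left(\frac{\text{bindings found}}{\text{triples downloaded}} \cdot \#\text{triples}\right)$ over all suppliers, where the first factor is the average number of pages per binding and the fraction inside the max is the average number of bindings per triple. Swap when necessary, taking into account work done so far.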
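The two estimates on this slide can be written out as small functions. This is a sketch: the function and parameter names are hypothetical, rounding the download cost up to whole pages is an assumption, and the bind estimate follows the slide formula literally without rounding.

```python
from math import ceil


def download_cost(triple_count, page_size):
    """Estimated HTTP calls to download a pattern completely.

    Assumption: one call per page, rounded up to whole pages.
    """
    return ceil(triple_count / page_size)


def bind_cost(avg_triples_per_binding, page_size, suppliers):
    """Estimated HTTP calls when the pattern is bound by its suppliers.

    suppliers is a list of (bindings_found, triples_downloaded, triple_count)
    tuples, one per supplying pattern that has downloaded at least one triple.
    """
    avg_pages_per_binding = avg_triples_per_binding / page_size
    # bindings_found / triples_downloaded is the average number of bindings
    # per triple; scaling by the supplier's total count extrapolates the
    # number of bindings still to come.
    expected_bindings = max(found / downloaded * total
                            for found, downloaded, total in suppliers)
    return avg_pages_per_binding * expected_bindings
```

For example, assuming a page size of 100, `download_cost(1200, 100)` gives 12 and `download_cost(430000, 100)` gives 4300; a pattern would keep or swap its role depending on which of the two estimates is lower at that point.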
  • 20. Section: Optimizing local joins
  • 21. Local joining. Most join data from previous iterations can be reused. Challenge: reuse as much data as possible. Figure: triple data at iteration i and at iteration i+1, with the new triples highlighted.
  • 22. Join tree step. Figure: Bindings i-1 and Triples i are joined into Bindings i, shown for iteration i and for iteration i + 1; in iteration i + 1 the new parts are marked "New".
  • 23. Minimizing joins. Start with the largest unchanged, connected set of patterns. Estimate the remainder of the join order based on pattern size and connectivity. Figure labels: Bindings i-1, Triples i, New, new triples, propagated changes.
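The reuse idea behind local joining can be illustrated with a join that keeps its previous result and only processes newly arrived triples. This is a minimal sketch under stated assumptions: `join` and `CachedJoin` are hypothetical names, the input bindings are assumed unchanged between iterations, and propagating changes further up the join tree is left out.

```python
def join(bindings, pattern, triples):
    """Nested-loop join of existing bindings with triples of one pattern."""
    results = []
    for binding in bindings:
        for triple in triples:
            merged = dict(binding)
            compatible = True
            for term, value in zip(pattern, triple):
                if term.startswith("?"):
                    # Bind the variable, or check it against its earlier value.
                    if merged.setdefault(term, value) != value:
                        compatible = False
                        break
                elif term != value:
                    compatible = False
                    break
            if compatible:
                results.append(merged)
    return results


class CachedJoin:
    """Keeps the join result of earlier iterations and extends it."""

    def __init__(self, bindings, pattern):
        self.bindings = bindings   # assumed unchanged between iterations
        self.pattern = pattern
        self.result = []

    def add_triples(self, new_triples):
        # Only the new triples are joined; earlier results are reused as is.
        self.result.extend(join(self.bindings, self.pattern, new_triples))
        return self.result
```

Because a join distributes over the union of old and new triples, the cached result plus the join of the new triples equals a full recomputation, at a fraction of the work.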
  • 24. Section: Bringing it all together
  • 25. Summary. Prevent local optima. Join tree instead of join path. Reuse local join data.
  • 26. Test setup. Single machine (Intel Core i5-3230M CPU @ 2.60GHz, 8 GB RAM) running both client and server. Artificial delay of 100 ms on the server to simulate network delay.
  • 27. Results. WatDiv benchmark queries, 100 ms delay on server. Charts: median # HTTP calls; median time (s).
  • 28. Conclusion. Fewer HTTP calls with more client-side processing. Ideal for slow connection situations. Still room for improvement. No parallelism. Focus on BGPs. More work per HTTP call. Not guaranteed to be better.
  • 29. Thank you! Come see demo #13 on Thursday. Questions?