2. About the Paper
The paper was published at the Third Latin
American Web Congress in 2005.
It has 56 citations.
Andreas Harth
National University of Ireland, Galway
Prof. Stefan Decker
National University of Ireland, Galway
3. Outline
Overview of Semantic Web.
Overview of Indexes.
Paper motivation.
Methodology.
Experiment & Result.
Conclusion.
4. Semantic Web
Also called:
Web 3.0.
The Linked Data Web.
The Web of Data.
Whatever you call it, it is the next major evolution in connecting information.
5. Why semantic web?
It enables data to be linked from a source to any
other source.
It can be understood by computers so that they can
perform increasingly sophisticated tasks on our
behalf.
7. Semantic Web Standards
RDF (Resource Description Framework): The data modeling
language for the Semantic Web (like UML). All Semantic Web
information is stored and represented in RDF.
SPARQL: The query language of the Semantic Web.
OWL (Web Ontology Language): The schema language, or
knowledge representation (KR) language, of the Semantic Web.
8. What is RDF?
RDF is the data model of the Semantic Web.
That means that all data in Semantic Web
technologies is represented as RDF.
If you store Semantic Web data, it's in RDF.
If you query Semantic Web data (typically using
SPARQL), it's RDF data. If you send Semantic Web
data to your friend, it's RDF.
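As a rough illustration (not taken from the paper), an RDF statement can be pictured as a (subject, predicate, object) triple. The sketch below uses plain Python tuples. The example.org URIs and the `match` helper are made up for this example, and a real store would use an RDF library and SPARQL instead.

```python
# Each RDF statement is a (subject, predicate, object) triple.
# The example.org URIs are hypothetical; the FOAF URIs are only used as labels here.
triples = [
    ("http://example.org/alice", "http://xmlns.com/foaf/0.1/name", '"Alice"'),
    ("http://example.org/alice", "http://xmlns.com/foaf/0.1/knows", "http://example.org/bob"),
    ("http://example.org/bob", "http://xmlns.com/foaf/0.1/name", '"Bob"'),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# "Who has a name?" is a pattern with only the predicate fixed.
names = match(triples, p="http://xmlns.com/foaf/0.1/name")
for subject, _, name in names:
    print(subject, name)
```

A query is then just a pattern with some positions fixed and the others left open, which is roughly what a SPARQL triple pattern expresses.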
15. What is database index?
A database index is a data structure that improves the
speed of data retrieval operations on a database table at
the cost of additional writes and storage space to maintain
the index data structure.
Index goal: The index structure enables fast retrieval of
data.
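The trade-off described above can be sketched in a few lines. This is a toy, in-memory illustration only: the `rows` table, the column names, and the helper functions are assumptions made for the example, and a real database would typically use an on-disk structure such as a B+-tree.

```python
from collections import defaultdict

# A tiny "table": each row is a dict. Names and values are made up.
rows = [
    {"id": 1, "city": "Galway"},
    {"id": 2, "city": "Dublin"},
    {"id": 3, "city": "Galway"},
]


def scan(rows, city):
    """Without an index: inspect every row."""
    return [r["id"] for r in rows if r["city"] == city]


def build_index(rows, column):
    """With an index: map each column value to the ids of matching rows."""
    index = defaultdict(list)
    for r in rows:
        index[r[column]].append(r["id"])
    return index


def insert(rows, index, row, column):
    """Every write now costs extra work: the index must be updated too."""
    rows.append(row)
    index[row[column]].append(row["id"])


city_index = build_index(rows, "city")
insert(rows, city_index, {"id": 4, "city": "Dublin"}, "city")

# Both approaches return the same ids; the index avoids the full scan.
print(scan(rows, "Dublin"), city_index["Dublin"])
```

The lookup through `city_index` touches only the matching entries, while `insert` shows the extra write and storage cost the index introduces.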
18. Paper motivation
Previous systems provide a storage infrastructure for RDF data, but
their index structures do not support typical query scenarios
for data from the Web, which results in poor query-answering
performance in some cases.
19. Methodology
The researchers present a new index structure that handles
data from the Web.
They implemented the index structure in a lightweight software system called
YARS.
20. RDF Index structures
The authors suggested an index structure that contains two
sets:
1. Lexicon: covers the string representations of the RDF graph
(resources r, literals l, blank nodes b).
2. Quad indexes: cover the quads (triples plus a context).
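The two sets can be illustrated with a minimal in-memory sketch. It is not the YARS implementation: the class names, the single (subject, predicate, object, context) ordering, and the sorted-list storage are all assumptions chosen to keep the example short, and the prefixed names such as `ex:alice` are made up.

```python
import bisect


class Lexicon:
    """Maps each term's string form to a compact integer id and back."""

    def __init__(self):
        self._ids = {}
        self._strings = []

    def encode(self, term):
        if term not in self._ids:
            self._ids[term] = len(self._strings)
            self._strings.append(term)
        return self._ids[term]

    def decode(self, term_id):
        return self._strings[term_id]


class QuadIndex:
    """Keeps quads of ids sorted so that prefix lookups are a range scan."""

    def __init__(self):
        self._quads = []

    def add(self, quad):
        pos = bisect.bisect_left(self._quads, quad)
        if pos == len(self._quads) or self._quads[pos] != quad:
            self._quads.insert(pos, quad)

    def prefix(self, *ids):
        """Return all quads whose leading positions equal the given ids."""
        wanted = tuple(ids)
        start = bisect.bisect_left(self._quads, wanted)
        result = []
        for quad in self._quads[start:]:
            if quad[:len(wanted)] != wanted:
                break
            result.append(quad)
        return result


lexicon = Lexicon()
index = QuadIndex()


def add_statement(s, p, o, c):
    index.add(tuple(lexicon.encode(term) for term in (s, p, o, c)))


add_statement("ex:alice", "foaf:knows", "ex:bob", "ex:graph1")
add_statement("ex:alice", "foaf:name", '"Alice"', "ex:graph1")
add_statement("ex:bob", "foaf:name", '"Bob"', "ex:graph2")

# All quads with subject ex:alice, decoded back to strings.
alice = lexicon.encode("ex:alice")
about_alice = [tuple(lexicon.decode(i) for i in q) for q in index.prefix(alice)]
print(about_alice)
```

A single ordering only answers patterns where the leading positions are known. Patterns that fix, say, only the object would need an index sorted in a different order, which is why a store keeps more than one quad index.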
29. YARS
A web application built in Java.
Has two parts:
a storage component that handles both persistent and in-memory
indexes.
a query handler that performs query processing and evaluation.
30. Experiment
They evaluated the performance based on a dataset of 2.8
million triples (293 MB).
The test server had:
Pentium 4, 2.4 GHz
4 GB RAM
Debian Sarge
31. Experiment
They considered the following RDF stores for evaluation:
Sesame.
Kowari (failed to get a running version).
Redland.
Jena2 ([9] shows that Sesame generally outperforms Jena).
33. Result – index construction
System          Index size (bytes)
Redland         2.164.019.200
Sesame MySQL    340.381.636
Sesame native   39.997.992
YARS            1.090.002.944
Table 8: Index size for the synthetic Univ20 dataset.
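To make the figures in the table easier to compare, the short script below converts them to MiB and expresses each as a multiple of the smallest index. It assumes the dots in the table are thousands separators; the variable and function names are made up for the example.

```python
# Index sizes copied from the table, in the original dotted notation.
# Assumption: the dots are thousands separators.
index_sizes = {
    "Redland": "2.164.019.200",
    "Sesame MySQL": "340.381.636",
    "Sesame native": "39.997.992",
    "YARS": "1.090.002.944",
}


def to_bytes(text):
    """Strip the thousands separators and parse the value as an integer."""
    return int(text.replace(".", ""))


sizes = {system: to_bytes(text) for system, text in index_sizes.items()}
smallest = min(sizes.values())
ratios = {system: size / smallest for system, size in sizes.items()}

# Print systems from smallest to largest index.
for system, size in sorted(sizes.items(), key=lambda item: item[1]):
    print(f"{system:14s} {size / 2**20:10.1f} MiB {ratios[system]:7.1f}x")
```

Read this way, the YARS index is considerably larger than the Sesame native index but about half the size of the Redland index.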
36. Conclusion
The authors addressed query processing for RDF, which is an
important issue in the Semantic Web.
Compared with the other systems, YARS has some overhead for
resolving dependencies and ordering.
37. Criticism
- In the experiment, the researchers excluded the "Kowari" engine
because they could not get a running version.