making sense of text and data
Atanas Kiryakov
Webinar, July 2020
Reasoning with Big Knowledge Graphs:
Choices, Pitfalls and Proven Recipes
Who are we?
o Leader
ü Semantic technology vendor established in 2000
ü Part of Sirma Group: 400 persons, listed on the Sofia Stock Exchange
o Profitable and growing
ü Global: 80% of revenue from London and New York
ü Clients: S&P, BBC, FT, Top-5 US Bank, UK Parliament, Fujitsu, …
ü Verticals: Financial services, Health care and Life sciences, Publishing, Manufacturing
o Innovator
ü Attracted over $15M in innovation funding
ü Member of W3C, EDMC, ODI, STI and LDBC, developing next gen. standards
…, the market leaders in this space
continue to be Neo4J and Ontotext
(GraphDB), which are graph and RDF
database providers respectively.
These are the longest established
vendors in this space (both founded
in 2000) so they have a longevity and
experience that other suppliers
cannot yet match.
Bloor Research
Graph Database Market Update 2020
Ontotext GraphDB™ - the Flagship Product
Ontotext Portfolio
Presentation Outline
o Reasoning Introduction: Benefits and Pitfalls
o Reasoning Use Cases and Demos
o RDFS and OWL 2 Profiles
o Reasoning Implementation Choices
o Reasoning With GraphDB
Knowledge Graphs = Rich Data in Context
KGs put data in context via
linking and semantic metadata
We help enterprises get profound insights
via interlinking, analyzing and exploring:
o diverse databases
o text documents and other content
o proprietary & global data
What is a Knowledge Graph?
o The KG represents a collection
of interlinked descriptions
of concepts and entities
ü Concepts describe each other
ü Connections provide context
ü Context helps comprehension!
o A KG can be used as:
ü Database: can be queried
ü Graph: can be analyzed as network
ü Knowledge base: new facts can be inferred
Read more: https://www.ontotext.com/knowledgehub/fundamentals/what-is-a-knowledge-graph/
What is Semantics?
o Formal semantics allows new valid
facts to be inferred
ü Both data and schema can be interpreted
ü Semantic schema = ontology
ü Languages: RDF Schema (RDFS), OWL
o Only the relevant semantics is
formalized in the schema
ü The meaning of relativeOf is not fully described by
defining it as owl:SymmetricProperty
ü The best model is the simplest one that can do the
work. But not simpler!
[Figure: example graph linking myData:Maria and myData:Ivan. Node and edge labels: ptop:Agent, ptop:Person, ptop:Woman, ptop:childOf, ptop:parentOf, owl:relativeOf, rdf:type, rdfs:range, rdfs:subPropertyOf, owl:inverseOf, owl:SymmetricProperty; some edges are marked as inferred.]
Reasoning Benefits
o Schema alignment and easy querying in diverse datasets
ü Across sources similar relationships can be modeled in a different way - one can use parentOf, another
childOf and a third one just the more general relativeOf
ü The database will return Ivan as a result of the query (Maria relativeOf ?x) when the fact derived from
the source and asserted is (Ivan childOf Maria)
o Getting deeper and more complete results
ü Finding patterns and inferring new relationships
ü Instant discovery of hidden relationships scattered across multiple sources
o Consistency checking and quality validation
ü RDF Shapes ensure graph consistency and quality
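The parentOf / childOf / relativeOf example above can be sketched as a toy forward-chaining loop. This is a minimal illustration, not GraphDB's engine; the schema constants (which properties are inverse, sub-property or symmetric) are assumptions chosen to match the example.

```python
# Toy forward-chaining sketch. The schema below is an assumption for illustration.
INVERSE = {("childOf", "parentOf")}
SUB_PROPERTY = {("childOf", "relativeOf"), ("parentOf", "relativeOf")}
SYMMETRIC = {"relativeOf"}


def materialize(explicit):
    """Apply the three rules until no new triple appears (a fixpoint)."""
    closure = set(explicit)
    while True:
        new = set()
        for s, p, o in closure:
            for a, b in INVERSE:
                if p == a:
                    new.add((o, b, s))
                if p == b:
                    new.add((o, a, s))
            for sub, sup in SUB_PROPERTY:
                if p == sub:
                    new.add((s, sup, o))
            if p in SYMMETRIC:
                new.add((o, p, s))
        if new <= closure:
            return closure
        closure |= new


def query(closure, subject, predicate):
    """Return all objects matching the pattern (subject predicate ?x)."""
    return sorted(o for s, p, o in closure if (s, p) == (subject, predicate))


closure = materialize({("Ivan", "childOf", "Maria")})
print(query(closure, "Maria", "relativeOf"))
```

Only (Ivan childOf Maria) is asserted, yet the query (Maria relativeOf ?x) finds Ivan because the inverse, sub-property and symmetry rules were applied before the query ran.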
The Pitfalls of Reasoning
o Over-engineered ontologies
ü Too expressive ontology language
ü Results of inference hard to understand and verify
ü Performance penalties far greater than the benefits
o Inappropriate reasoning support
ü Inference implementations that work well with taxonomies and conceptual models of a few
thousand concepts, but cannot cope with KGs of millions of entities
o Inappropriate data layer architecture
ü One such example is reasoning with a virtual KG, which is often infeasible
Search in British Museum’s Collection
o Artefacts are described via the granular ontology CIDOC CRM
o Searching in such a collection requires Fundamental Relations (FR)
ü Aggregation of a large number of paths through CRM data into a smaller number of searchable relations
o E.g.: FR "Thing from Place"
British Museum’s Collection: Volumetrics
o Museum objects: 2,051,797
ü Thesaurus entries: 415,509
o Explicit statements: 195,208,156
o Total statements: 916,735,486
ü Expansion ratio is 4.7x, i.e., for each statement, 3.7 more are inferred
ü Nodes (unique URLs and literals): 53,803,189
o Loading time (including materialization):
ü 22.2h on RAM drive
ü 32.9h on non-SSD hard drives
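The expansion ratio quoted above follows directly from the two statement counts; a quick check, using only the figures on this slide:

```python
explicit = 195_208_156   # explicit statements
total = 916_735_486      # explicit + inferred statements

inferred = total - explicit
ratio = total / explicit                     # total statements per explicit one
inferred_per_explicit = inferred / explicit  # inferred statements per explicit one

print(f"inferred: {inferred:,}")
print(f"expansion ratio: {ratio:.1f}x")
print(f"inferred per explicit statement: {inferred_per_explicit:.1f}")
```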
GraphDB Benchmarking
o LDBC: TPC-like benchmarks for graph databases
o Members include: Ontotext, OpenLink, Neo4j, CWI, UPM, Oracle,
IBM, Sparsity
o LDBC Semantic Publishing Benchmark
ü Based on BBC’s Dynamic Semantic Publishing editorial workflow
ü Updates, adding new content metadata or updating the reference knowledge (e.g., new people)
ü Aggregation queries retrieve content according to various criteria (e.g., to generate a topic web page)
ü The only benchmark that involves reasoning and updates
LDBC SPB Results of GraphDB
Clients (reading / writing)   Reads/s   Writes/s
 0 / 1                         0.0000    11.4067
 0 / 2                         0.0000    14.3033
 0 / 4                         0.0000    14.6700
 0 / 8                         0.0000    15.1067
 1 / 0                        17.8258     0.0000
 4 / 0                        43.0833     0.0000
 8 / 0                        70.3767     0.0000
16 / 0                        83.2633     0.0000
 8 / 2                        52.5667     9.2867
 8 / 4                        54.0233     9.6167
 8 / 8                        54.9067     9.5733
10 / 2                        59.9467     8.5333
10 / 4                        62.2867     8.4767
10 / 8                        61.7167     8.6067
16 / 2                        68.8100     5.0600
16 / 4                        70.3900     5.1067
16 / 8                        70.2300     4.9967
16 / 16                       70.9467     5.0567
o CPU: 1 x E5-1650
o RAM: 20G heap
o Dataset: LDBC SPB 256
o DB: GraphDB SE 8.0
o RDF statements: 254,948,985 (explicit), 480,405,141 (total)
o Rule set: OWL-Horst-optimized
o Creative works: 8,821,535
FactForge: Data Integration
o DBpedia (the English version) 496M
o GeoNames (all geographic features on Earth) 150M
o owl:sameAs links between DBpedia and Geonames 471K
o GLEI (global company register data) 3M
o Panama Papers DB (#LinkedLeaks) 20M
o Other datasets and ontologies: WordNet, WorldFacts, FIBO
o News metadata (2000 articles/day enriched by NOW) 1 023M
o Total size (2.2B explicit + 328M inferred statements) 2 522M
FIBO: Financial Industry Business Ontology
o Developed by EDMC, https://spec.edmcouncil.org/fibo/
o We loaded FIBO Foundations and BE
ü About 35 RDF files all together (old version)
o Reasoning profile: OWL 2 RL
o Loading takes 2-3 sec.
o Number of explicit statements: 5 696
o Number of total statements, including inferred: 15 713
ü About 10k statements materialized
FIBO-PROTON Mapping
o PROTON is an upper-level ontology
ü 500 classes, 200 properties; developed by Ontotext since 2004
ü used in semantic annotation and LOD integration services, e.g., FactForge
ü mapped to DBpedia, Freebase, GeoNames
o A very basic mapping for public companies and a few related
properties was loaded in 4 hours in FactForge:
fb:business.issuer rdfs:subClassOf pext:PublicCompany.
pext:PublicCompany rdfs:subClassOf fibo-be-corp-corp:PubliclyHeldCompany.
ptop:Organization rdfs:subClassOf fibo-fnd-org-fm:FormalOrganization.
dbp-prop:industry rdfs:subPropertyOf pext:industryOf.
pext:industryOf rdfs:subPropertyOf fibo-fnd-rel-rel:isClassifiedBy.
dbp-ont:subsidiary rdfs:subPropertyOf ptop:controls.
ptop:controls rdfs:subPropertyOf fibo-fnd-rel-rel:controls.
Rule-Based Reasoning
o Description Logic (DL) doesn’t scale
ü Satisfiability checking is not tractable
ü Complexity grows exponentially with size
o Rule-based inference engine
ü R-Entailment rules, PROLOG-style, as defined in [1]
o Sound and complete in PSPACE
ü Under some constraints: do not introduce
blank nodes, bound size of the rule bodies,
ground RDF graph, [1]
[1] Combining RDF and Part of OWL with Rules: Semantics, Decidability, Complexity
Herman J. ter Horst, published at the International Semantic Web Conference 2005
More at: http://graphdb.ontotext.com/documentation/standard/reasoning.html
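Complexity*
[Figure: language landscape between the DL family and the Rules / LP family, positioning OWL Full, OWL DL, OWL Lite, RDFS, SWRL, Datalog, OWL 2 QL, OWL 2 RL and OWL Horst. A highlighted region marks the expressivity supported by GraphDB.]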
Forward-Chaining and Materialization
o All possible inferences are made upon update and are stored
ü The inferred statements are stored and indexed alongside the explicit ones
ü Inferences that are no longer supported upon delete are retracted
o Forward-chaining works, subject to conscious modeling
ü The overheads of the materialization approach are bearable
ü Say, 2x index size and 2x slower loading and updates
ü Marginal (if any) slowdown of queries
Query-time Reasoning and Backward-Chaining
o Perform reasoning at query time
ü No overhead upon data loading and updates
ü Two basic approaches: Backward-chaining and Query rewriting
o Backward-chaining slows down query evaluation dramatically
ü As with PROLOG unification, the engine “dives” recursively, in order to exhaust all alternative
ways to find bindings for each separate triple pattern in the query
ü There is no way to guess before the actual evaluation the cardinality of the results for each
triple pattern
ü This makes query plan optimization impossible and ruins query performance
Query Rewriting
o Each pattern in the query is rewritten as disjunction of several
alternatives, based on reasoning on the schema/ontology/TBox
<?a rdf:type ptop:Person> query pattern will be expanded to something like
<?a rdf:type ptop:Person> OR
(<?p rdfs:range ptop:Person> AND <?b ?p ?a>) OR
(<?a rdf:type ?c> AND <?c rdfs:subClassOf ptop:Person >) …
o Execution of tens of combinations of variants is slow
ü Imagine a query with two patterns: the first one expands into 5 variants and the second into 6
variants. The engine will have to evaluate 30 alternative combinations
ü Think of implementing the semantics of owl:sameAs via query rewriting
o Query rewriting also delivers incomplete results
ü Recursion is not possible with SPARQL query rewriting
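The combinatorial cost of query rewriting can be sketched in a few lines. The schema facts below (`ptop:Man`, the range of `ptop:parentOf`) and the `A1…`/`B1…` variant labels are hypothetical placeholders; only the 5 x 6 = 30 arithmetic comes from the slide.

```python
from itertools import product

# Hypothetical schema facts, for illustration only.
SUBCLASSES = {"ptop:Person": ["ptop:Woman", "ptop:Man"]}
RANGE_OF = {"ptop:Person": ["ptop:parentOf"]}


def expand_type_pattern(var, cls):
    """List the alternatives that together stand in for '<var> rdf:type <cls>'."""
    variants = [f"{var} rdf:type {cls}"]
    variants += [f"{var} rdf:type {sub}" for sub in SUBCLASSES.get(cls, [])]
    variants += [f"?b {prop} {var}" for prop in RANGE_OF.get(cls, [])]
    return variants


def combinations(*variant_lists):
    """Every conjunction the engine has to evaluate for a multi-pattern query."""
    return list(product(*variant_lists))


person_variants = expand_type_pattern("?a", "ptop:Person")
print(person_variants)

# The slide's example: 5 variants for one pattern, 6 for the other.
first = [f"A{i}" for i in range(1, 6)]
second = [f"B{i}" for i in range(1, 7)]
print(len(combinations(first, second)))
```

Each additional pattern multiplies the number of conjunctions, which is why rewriting degrades quickly on queries with many patterns.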
Presentation Outline
o Reasoning Introduction: Benefits and Pitfalls
o Reasoning Use Cases and Demos
o RDFS and OWL 2 Profiles
o Reasoning Implementation Choices
o GraphDB
o Reasoning with GraphDB
o Reasoning Optimizations in GraphDB
GraphDB Essentials
o Scalable RDF / SPARQL engine
ü W3C standards support
ü NEW: RDF* support, property annotations
o Platform independent (100% Java)
o Open source API
ü Main contributor to the RDF4J project
o Reasoning and consistency checking
ü UNIQUE! Efficient reasoning support for big data
sets across the full lifecycle of the data: load, query, updates
Architecture
GraphDB Workbench
User friendly interface for database
administration
GraphDB Engine
REST API for database access
Plugin / Connectors
GraphDB Workbench
o SPARQL editor & autocomplete
o Schema visualization
o Graph exploration
o Database monitoring and administration
Visual Graph
Features Free Standard Enterprise
RDF 1.1 support
SPARQL 1.1 support
RDFS, OWL2 RL and QL reasoning
Efficient query execution
Workbench interface
Community support
Unlimited number of CPU cores
Commercial support
Connectors for Elasticsearch & SOLR
High-availability cluster
Managed service
GraphDB Enterprise: Resilience & Availability
Reasoning in GraphDB
o Fast forward-chaining materialization
ü Allows for efficient query evaluation on big datasets
o Incremental for both inserts and deletes
ü Inferred closure is updated transparently upon commit of transaction
o Sample rules:
ENTAILMENT
p <rdf:type> <owl:FunctionalProperty>
x p y
x p z
-------------------------------
y <owl:sameAs> z

CONSISTENCY
x owl:sameAs y
x owl:differentFrom y
------------------------
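The two sample rules can be read as the following sketch. It is a plain-Python illustration rather than GraphDB's rule language, and the `ex:` data is hypothetical. The sketch skips the trivial case y = z, which the rule as written would also match.

```python
RDF_TYPE = "rdf:type"
SAME_AS = "owl:sameAs"
DIFFERENT_FROM = "owl:differentFrom"
FUNCTIONAL = "owl:FunctionalProperty"


def entail_functional(triples):
    """Entailment rule: if p is functional, x p y and x p z entail y owl:sameAs z."""
    functional = {s for s, p, o in triples if p == RDF_TYPE and o == FUNCTIONAL}
    inferred = set()
    for x1, p1, y in triples:
        if p1 not in functional:
            continue
        for x2, p2, z in triples:
            if p2 == p1 and x2 == x1 and y != z:
                inferred.add((y, SAME_AS, z))
    return inferred


def violations(triples):
    """Consistency rule: report pairs stated to be both the same and different."""
    return {(x, y) for x, p, y in triples
            if p == SAME_AS and (x, DIFFERENT_FROM, y) in triples}


# Hypothetical data: one person with two recorded birth mothers.
data = {
    ("ex:hasBirthMother", RDF_TYPE, FUNCTIONAL),
    ("ex:Ivan", "ex:hasBirthMother", "ex:Maria"),
    ("ex:Ivan", "ex:hasBirthMother", "ex:MariaP"),
    ("ex:Maria", DIFFERENT_FROM, "ex:MariaP"),
}
inferred = entail_functional(data)
conflicts = violations(data | inferred)
print(sorted(inferred))
print(sorted(conflicts))
```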
OWL 2 Reasoning
o Built-in rule-sets for: RDFS, OWL-Horst, OWL2-RL, OWL2-QL
o Custom rule-sets easily defined
ü Ruleset optimizer/profiler
o Configurations with multiple rule-sets
ü E.g. one with consistency checking to be used for internal data and another one
with "open-world" semantics for LOD and other external datasets
o NEW: Proof plug-in provides inference explanation
Predefined Rule-Sets
Ruleset             Description
Empty               No reasoning
rdfs                Standard RDFS: subClassOf, subPropertyOf, domain and range of properties
rdfs-plus           RDFS plus symmetric, transitive and inverse properties
owl-horst (pD*)     sameAs, equivalentClass, equivalentProperty, SymmetricProperty,
                    TransitiveProperty, inverseOf, FunctionalProperty, InverseFunctionalProperty.
                    Partial support for: intersectionOf, someValuesFrom, hasValue, allValuesFrom
owl-max             See the spec http://graphdb.ontotext.com/documentation/standard/reasoning.html
owl-rl (DL-LiteR)   AsymmetricProperty, IrreflexiveProperty, propertyChainAxiom,
                    AllDisjointProperties, hasKey, unionOf, complementOf, oneOf, differentFrom,
                    AllDisjointClasses and all the property cardinality primitives. Adds more complete
                    support for intersectionOf, someValuesFrom, hasValue, allValuesFrom
owl-ql              Partial compliance. See the spec https://www.w3.org/TR/owl2-profiles
Optimized Rule-Sets
o These versions exclude some RDFS reasoning rules, which are not useful
for most of the applications, but add substantial reasoning overheads
o “Optimized” ruleset versions suppress this rule
Id: rdf1_rdfs4a_4b
x a y
-------------------------------
x <rdf:type> <rdfs:Resource>
a <rdf:type> <rdfs:Resource>
y <rdf:type> <rdfs:Resource>
Efficient Retraction of Inferred Facts
o Materialization causes trouble upon delete
ü It is not trivial to figure out which inferred statements are no longer supported
o Deletion without recomputing the inference closure is needed
ü Without it forward-chaining is not feasible for dynamic environments
o GraphDB retracts statements via a unique algorithm
ü Forward-chaining to find potentially affected inferences
ü Backward-chaining to test which inferences are still supported
ü No truth maintenance information overheads
ü Fast – the same order of magnitude as materialization upon insert
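To make the retraction problem concrete, here is a deliberately simplified sketch. It is not GraphDB's algorithm: it keeps the first step (forward-chaining from the deleted statement to find potentially affected inferences) but swaps the backward-chaining support test for a plain re-derivation from the surviving statements, the textbook delete-and-rederive idea. It also assumes single-premise rules only; the property names are illustrative.

```python
# Assumed single-premise rules, for illustration only.
SUB_PROPERTY = {"childOf": "relativeOf", "parentOf": "relativeOf"}
INVERSE = {"childOf": "parentOf", "parentOf": "childOf"}


def consequences(triple):
    """Statements that follow from one statement in a single rule application."""
    s, p, o = triple
    out = set()
    if p in SUB_PROPERTY:
        out.add((s, SUB_PROPERTY[p], o))
    if p in INVERSE:
        out.add((o, INVERSE[p], s))
    return out


def materialize(statements):
    """Forward-chain to a fixpoint."""
    closure, frontier = set(statements), list(statements)
    while frontier:
        for c in consequences(frontier.pop()):
            if c not in closure:
                closure.add(c)
                frontier.append(c)
    return closure


def retract(explicit, closure, deleted):
    """Delete one explicit statement without recomputing the closure from scratch."""
    explicit = explicit - {deleted}
    # Step 1: forward-chain from the deleted statement to collect
    # every statement whose support might have been lost.
    affected, frontier = set(), [deleted]
    while frontier:
        t = frontier.pop()
        if t not in affected:
            affected.add(t)
            frontier.extend(consequences(t))
    # Step 2: keep what is unaffected or still explicit, then re-derive;
    # affected statements with another source of support come back.
    survivors = (closure - affected) | explicit
    return explicit, materialize(survivors)


explicit = {("Ivan", "childOf", "Maria"), ("Maria", "parentOf", "Ivan")}
closure = materialize(explicit)

explicit1, closure1 = retract(explicit, closure, ("Ivan", "childOf", "Maria"))
explicit2, closure2 = retract(explicit1, closure1, ("Maria", "parentOf", "Ivan"))
print(len(closure), len(closure1), len(closure2))
```

After the first delete nothing disappears, because (Maria parentOf Ivan) still supports every inference; after the second delete the closure is empty.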
The Honey of owl:sameAs Equivalence
o owl:sameAs links the datasets in the Linked Open Data cloud
o owl:sameAs declares that two different URIs denote one and the same object
ü Aligns different identifiers of the same real-world entity used in different data sources
o For example, let’s say that we have three different URIs for Bulgaria and two for
Sofia (its capital)
dbpedia:Sofia owl:sameAs geonames:727011
geonames:727011 geo-ont:parentFeature geonames:732800
dbpedia:Bulgaria owl:sameAs geonames:732800
dbpedia:Bulgaria owl:sameAs opencyc-en:Bulgaria
The Sting of owl:sameAs Equivalence
o According to the standard semantics of owl:sameAs
ü It is a transitive and symmetric relationship
ü Statements, asserted using one of the equivalent URIs, should be inferred to appear with all equivalent
URIs placed in the same position
ü Thus the 4 statements in the example lead to 10 inferred statements:
geonames:727011 owl:sameAs dbpedia:Sofia
geonames:732800 owl:sameAs dbpedia:Bulgaria
geonames:732800 owl:sameAs opencyc-en:Bulgaria
opencyc-en:Bulgaria owl:sameAs dbpedia:Bulgaria
opencyc-en:Bulgaria owl:sameAs geonames:732800
dbpedia:Sofia geo-ont:parentFeature geonames:732800
dbpedia:Sofia geo-ont:parentFeature opencyc-en:Bulgaria
dbpedia:Sofia geo-ont:parentFeature dbpedia:Bulgaria
geonames:727011 geo-ont:parentFeature opencyc-en:Bulgaria
geonames:727011 geo-ont:parentFeature dbpedia:Bulgaria
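The count of 10 can be reproduced with a small sketch of the standard owl:sameAs semantics. It is an illustration only: it ignores equivalents in the predicate position and leaves out reflexive statements such as x owl:sameAs x, as the list above does.

```python
from itertools import product

SAME_AS = "owl:sameAs"

explicit = {
    ("dbpedia:Sofia", SAME_AS, "geonames:727011"),
    ("geonames:727011", "geo-ont:parentFeature", "geonames:732800"),
    ("dbpedia:Bulgaria", SAME_AS, "geonames:732800"),
    ("dbpedia:Bulgaria", SAME_AS, "opencyc-en:Bulgaria"),
}


def equivalence_classes(triples):
    """Map every URI to the set of URIs it is sameAs-equivalent to."""
    classes = {}
    for s, p, o in triples:
        if p != SAME_AS:
            continue
        merged = classes.get(s, {s}) | classes.get(o, {o})
        for uri in merged:
            classes[uri] = merged
    return classes


def expand(triples):
    """Closure under symmetry and transitivity of sameAs, plus statement copying."""
    classes = equivalence_classes(triples)
    closure = set()
    for members in classes.values():
        closure |= {(a, SAME_AS, b) for a in members for b in members if a != b}
    for s, p, o in triples:
        if p == SAME_AS:
            continue
        for s2, o2 in product(classes.get(s, {s}), classes.get(o, {o})):
            closure.add((s2, p, o2))
    return closure


inferred = expand(explicit) - explicit
for triple in sorted(inferred):
    print(*triple)
print(len(inferred))
```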
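The Honey and the Sting of owl:sameAs
[Figure: two groups of sameAs-equivalent nodes, labelled E11, E12 and E21, E22, E23. The URIs shown are geonames:727011 and dbpedia:Sofia on one side and geonames:732800, dbpedia:Bulgaria and opencyc-en:Bulgaria on the other, connected by geo-ont:parentFeature.]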
owl:sameAs Optimization
o GraphDB features an optimization of owl:sameAs
ü It can use a single master-node in its indices to represent a class of sameAs-equivalent URIs
o Avoids inflating the indices with multiple equivalent statements
ü Imagine a statement that has 5 sameAs-equivalents of its subject, 2 of its predicate and 3 of its object.
Such statement would have 30 replicas in the indices after forward-chaining if such an optimization is
not used
o Helps present compact query results
ü The owl:sameAs equivalence can result in multiplication of the bindings of the variables in the process
of query evaluation with both forward- and backward-chaining. This leads to expansion of the
result-set with rows that differ only by referring to different URIs, which are sameAs-equivalent
ü Optionally, query results can be expanded, as if there is no optimization
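The effect of a master node can be sketched as follows. This is a simplified picture of the idea, not GraphDB's index structure; the choice of the GeoNames URIs as masters is an arbitrary assumption.

```python
def replicas(n_subject, n_predicate, n_object):
    """Copies of one statement stored when every equivalent URI is materialized."""
    return n_subject * n_predicate * n_object


def canonicalize(triples, master):
    """Rewrite every URI to the master node of its equivalence class."""
    return {tuple(master.get(term, term) for term in triple) for triple in triples}


# Assumed master nodes for the Sofia / Bulgaria example.
master = {
    "dbpedia:Sofia": "geonames:727011",
    "dbpedia:Bulgaria": "geonames:732800",
    "opencyc-en:Bulgaria": "geonames:732800",
}

# The six parentFeature statements that standard sameAs semantics would store.
expanded = {
    (s, "geo-ont:parentFeature", o)
    for s in ("geonames:727011", "dbpedia:Sofia")
    for o in ("geonames:732800", "dbpedia:Bulgaria", "opencyc-en:Bulgaria")
}
stored = canonicalize(expanded, master)

print(replicas(5, 2, 3))
print(len(expanded), len(stored))
```

With master nodes the six expanded statements collapse into one stored statement, and the 5 x 2 x 3 case from the slide would likewise be stored once instead of 30 times.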
Questions?
Experience the technology with our demonstrators
FactForge: Knowledge graph of linked open data and news
about People and Organizations http://factforge.net
RANK: News popularity ranking for companies http://rank.ontotext.com
NOW: Semantic News Portal http://now.ontotext.com
 
The Knowledge Discovery Quest
The Knowledge Discovery Quest The Knowledge Discovery Quest
The Knowledge Discovery Quest Ontotext
 
Best Practices for Large Scale Text Mining Processing
Best Practices for Large Scale Text Mining ProcessingBest Practices for Large Scale Text Mining Processing
Best Practices for Large Scale Text Mining ProcessingOntotext
 
Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage
Build Narratives, Connect Artifacts: Linked Open Data for Cultural HeritageBuild Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage
Build Narratives, Connect Artifacts: Linked Open Data for Cultural HeritageOntotext
 
Semantic Data Normalization For Efficient Clinical Trial Research
Semantic Data Normalization For Efficient Clinical Trial ResearchSemantic Data Normalization For Efficient Clinical Trial Research
Semantic Data Normalization For Efficient Clinical Trial ResearchOntotext
 

Reasoning with Big Knowledge Graphs: Choices, Pitfalls and Proven Recipes

  • 1. making sense of text and data Atanas Kiryakov Webinar, July 2020 Reasoning with Big Knowledge Graphs: Choices, Pitfalls and Proven Recipes
  • 2. Who are we? o Leader ü Semantic technology vendor established year 2000 ü Part of Sirma Group: 400 persons, listed at Sofia Stock Exchange o Profitable and growing ü Global: 80% of revenue from London and New York ü Clients: S&P, BBC, FT, Top-5 US Bank, UK Parliament, Fujitsu, … ü Verticals: Financial services, Health care and Life sciences, Publishing, Manufacturing o Innovator ü Attracted over $15M in innovation funding ü Member of W3C, EDMC, ODI, STI and LDBC, developing next gen. standards
  • 3. …, the market leaders in this space continue to be Neo4J and Ontotext (GraphDB), which are graph and RDF database providers respectively. These are the longest established vendors in this space (both founded in 2000) so they have a longevity and experience that other suppliers cannot yet match. Bloor Research Graph Database Market Update 2020 Ontotext GraphDB™ - the Flagship Product
  • 5. Presentation Outline o Reasoning Introduction: Benefits and Pitfalls o Reasoning Use Cases and Demos o RDFS and OWL 2 Profiles o Reasoning Implementation Choices o Reasoning With GraphDB
  • 6. Knowledge Graphs = Rich Data in Context KGs put data in context via linking and semantic metadata We help enterprises get profound insights via interlinking, analyzing and exploring: o diverse databases o text documents and other content o proprietary & global data
  • 7. What is a Knowledge Graph? o The KG represents a collection of interlinked descriptions of concepts and entities ü Concepts describe each other ü Connections provide context ü Context helps comprehension! o A KG can be used as: ü Database: can be queried ü Graph: can be analyzed as network ü Knowledge base: new facts can be inferred Read more: https://www.ontotext.com/knowledgehub/fundamentals/what-is-a-knowledge-graph/
  • 8. What is Semantics? o Formal semantics allows new valid facts to be inferred ü Both data and schema can be interpreted ü Semantic schema = ontology ü Languages: RDF Schema (RDFS), OWL o Only the relevant semantics is formalized in the schema ü The meaning of relativeOf is not fully described by defining it as owl:SymmetricProperty ü The best model is the simplest one that can do the work. But not simpler! [Diagram: myData:Maria and myData:Ivan; classes ptop:Agent, ptop:Person, ptop:Woman; properties ptop:childOf, ptop:parentOf and relativeOf; edge labels rdf:type, rdfs:range, rdfs:subPropertyOf, owl:inverseOf, owl:SymmetricProperty; inferred edges are marked]
  • 9. Reasoning Benefits o Schema alignment and easy querying in diverse datasets ü Across sources similar relationships can be modeled in a different way - one can use parentOf, another childOf and a third one just the more general relativeOf ü The database will return Ivan as a result of the query (Maria relativeOf ?x) when the fact derived from the source and asserted is (Ivan childOf Maria) o Getting deeper and more complete results ü Finding patterns and inferring new relationships ü Instant discovery of hidden relationships scattered across multiple sources o Consistency checking and quality validation ü RDF Shapes ensure graph consistency and quality
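The schema-alignment benefit on this slide can be sketched in a few lines of Python. This is only an illustration: the property names come from the slides, while the rule encoding, the assumption that childOf and parentOf are both sub-properties of relativeOf, and the helper names `infer` and `query` are hypothetical.

```python
# Hypothetical mini-schema mirroring the slide. The property names come from the
# text; the axioms chosen here are assumptions made for illustration.
INVERSE_OF = {("childOf", "parentOf")}
SUB_PROPERTY_OF = {("childOf", "relativeOf"), ("parentOf", "relativeOf")}
SYMMETRIC = {"relativeOf"}


def infer(explicit):
    """Apply inverse, sub-property and symmetry rules until nothing new appears."""
    closure = set(explicit)
    while True:
        new = set()
        for s, p, o in closure:
            for a, b in INVERSE_OF:
                if p == a:
                    new.add((o, b, s))
                if p == b:
                    new.add((o, a, s))
            for sub, sup in SUB_PROPERTY_OF:
                if p == sub:
                    new.add((s, sup, o))
            if p in SYMMETRIC:
                new.add((o, p, s))
        if new <= closure:
            return closure
        closure |= new


def query(closure, subject, predicate):
    """Return all objects ?x matching (subject predicate ?x)."""
    return sorted(o for s, p, o in closure if s == subject and p == predicate)


facts = {("Ivan", "childOf", "Maria")}   # the only asserted statement
closure = infer(facts)
print(query(closure, "Maria", "relativeOf"))   # ['Ivan']
```

The query for (Maria relativeOf ?x) succeeds even though only (Ivan childOf Maria) was asserted, which is the behaviour the slide describes.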
  • 10. The Pitfalls of Reasoning o Over-engineered ontologies ü Overly expressive ontology language ü Results of inference are hard to understand and verify ü Performance penalties far greater than the benefits o Inappropriate reasoning support ü Inference implementations that work well with taxonomies and conceptual models of a few thousand concepts, but cannot cope with KGs of millions of entities o Inappropriate data layer architecture ü One such example is reasoning with a virtual KG, which is often infeasible
  • 12. Search in British Museum’s Collection o Artefacts are described via the granular ontology CIDOC CRM o Searching in such a collection requires Fundamental Relations (FR) ü Aggregation of a large number of paths through CRM data into a smaller number of searchable relations o E.g.: FR "Thing from Place"
  • 13. British Museum’s Collection: Volumetrics o Museum objects: 2,051,797 ü Thesaurus entries: 415,509 o Explicit statements: 195,208,156 o Total statements: 916,735,486 ü Expansion ratio is 4.7x, i.e., for each explicit statement, 3.7 more are inferred ü Nodes (unique URLs and literals): 53,803,189 o Loading time (including materialization): ü 22.2h on RAM drive ü 32.9h on non-SSD hard drives
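The expansion ratio quoted on this slide follows directly from the two statement counts. A minimal check, using only the figures given above (the variable names are chosen here for illustration):

```python
explicit = 195_208_156   # explicit statements, from the slide
total = 916_735_486      # explicit + inferred statements, from the slide

inferred = total - explicit
expansion_ratio = total / explicit
inferred_per_explicit = inferred / explicit

print(f"inferred statements: {inferred:,}")
print(f"expansion ratio: {expansion_ratio:.1f}x")
print(f"inferred per explicit statement: {inferred_per_explicit:.1f}")
```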
  • 14. GraphDB Benchmarking o LDBC: TPC-like benchmarks for graph databases o Members include: Ontotext, OpenLink, neo4j, CWI, UPM, ORACLE, IBM, *Sparsity o LDBC Semantic Publishing Benchmark ü Based on BBC’s Dynamic Semantic Publishing editorial workflow ü Updates, adding new content metadata or updating the reference knowledge (e.g., new people) ü Aggregation queries retrieve content according to various criteria (e.g., to generate a topic web page) ü The only benchmark that involves reasoning and updates
  • 15. LDBC SPB Results of GraphDB
      Clients (reading / writing)   Reads/s    Writes/s
      0 / 1                          0.0000    11.4067
      0 / 2                          0.0000    14.3033
      0 / 4                          0.0000    14.6700
      0 / 8                          0.0000    15.1067
      1 / 0                         17.8258     0.0000
      4 / 0                         43.0833     0.0000
      8 / 0                         70.3767     0.0000
      16 / 0                        83.2633     0.0000
      8 / 2                         52.5667     9.2867
      8 / 4                         54.0233     9.6167
      8 / 8                         54.9067     9.5733
      10 / 2                        59.9467     8.5333
      10 / 4                        62.2867     8.4767
      10 / 8                        61.7167     8.6067
      16 / 2                        68.8100     5.0600
      16 / 4                        70.3900     5.1067
      16 / 8                        70.2300     4.9967
      16 / 16                       70.9467     5.0567
      o CPU: 1 x E5-1650
      o RAM: 20G heap
      o Dataset: LDBC SPB 256
      o DB: GraphDB SE 8.0, OWL-Horst-optimized rule set
      o RDF statements: 254,948,985 (explicit), 480,405,141 (total)
      o Creative works: 8,821,535
  • 16. FactForge: Data Integration
      o DBpedia (the English version): 496M
      o GeoNames (all geographic features on Earth): 150M
      o owl:sameAs links between DBpedia and GeoNames: 471K
      o GLEI (global company register data): 3M
      o Panama Papers DB (#LinkedLeaks): 20M
      o Other datasets and ontologies: WordNet, WorldFacts, FIBO
      o News metadata (2000 articles/day enriched by NOW): 1,023M
      o Total size (2.2B explicit + 328M inferred statements): 2,522M
  • 17. FIBO: Financial Industry Business Ontology o Developed by EDMC, https://spec.edmcouncil.org/fibo/ o We loaded FIBO Foundations and BE ü About 35 RDF files altogether (old version) o Reasoning profile: OWL 2 RL o Loading takes 2-3 sec. o Number of explicit statements: 5,696 o Number of total statements, including inferred: 15,713 ü About 10k statements materialized
  • 18. FIBO-PROTON Mapping o PROTON is an upper-level ontology ü 500 classes, 200 properties; developed by Ontotext since 2004 ü used in semantic annotation and LOD integration services, e.g., FactForge ü mapped to DBpedia, Freebase, GeoNames o A very basic mapping for public companies and a few related properties was loaded in 4 hours in FactForge:
      fb:business.issuer rdfs:subClassOf pext:PublicCompany.
      pext:PublicCompany rdfs:subClassOf fibo-be-corp-corp:PubliclyHeldCompany.
      ptop:Organization rdfs:subClassOf fibo-fnd-org-fm:FormalOrganization.
      dbp-prop:industry rdfs:subPropertyOf pext:industryOf.
      pext:industryOf rdfs:subPropertyOf fibo-fnd-rel-rel:isClassifiedBy.
      dbp-ont:subsidiary rdfs:subPropertyOf ptop:controls.
      ptop:controls rdfs:subPropertyOf fibo-fnd-rel-rel:controls.
  • 20. Rule-Based Reasoning o Description Logic (DL) doesn’t scale ü Satisfiability checking is not tractable ü Complexity grows exponentially with size o Rule-based inference engine ü R-Entailment rules, PROLOG-style, as defined in [1] o Sound and complete in PSPACE ü Under some constraints: do not introduce blank nodes, bound size of the rule bodies, ground RDF graph, [1] More at: http://graphdb.ontotext.com/documentation/standard/reasoning.html [Diagram: languages placed by complexity, with a DL branch and a Rules/LP branch; labels: OWL Full, OWL DL, OWL Lite, RDFS, SWRL, Datalog, OWL 2 QL, OWL 2 RL, OWL Horst; the expressivity supported by GraphDB is marked]
      [1] Herman J. ter Horst. Combining RDF and Part of OWL with Rules: Semantics, Decidability, Complexity. International Semantic Web Conference, 2005.
  • 22. Forward-Chaining and Materialization o All possible inferences are made upon update and are stored ü The inferred statements are stored and indexed alongside the explicit ones ü Inferences that are no longer supported upon delete are retracted o Forward-chaining works, subject to conscious modeling ü The overheads of the materialization approach are bearable ü Say, 2x index size and 2x slower loading and updates ü Marginal (if any) slowdown of queries
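A toy store makes the materialization trade-off concrete: inference work happens on every update, and queries only read the stored closure. The sketch below is an assumption-laden illustration, not GraphDB code. The class `MaterializingStore`, its two RDFS-style rules (subclass transitivity and type inheritance) and the plain-string predicates are hypothetical, and its `delete` naively recomputes the closure from scratch, which is a simplification rather than the retraction method the slides describe.

```python
class MaterializingStore:
    """Toy triple store that keeps the inferred closure current on every update."""

    def __init__(self):
        self.explicit = set()
        self.closure = set()   # explicit + inferred; this is what queries read

    @staticmethod
    def _consequences(stmt, closure):
        """Statements that follow from joining `stmt` with the current closure."""
        s, p, o = stmt
        out = set()
        if p == "subClassOf":
            for s2, p2, o2 in closure:
                if p2 == "subClassOf" and s2 == o:   # s < o and o < o2 give s < o2
                    out.add((s, "subClassOf", o2))
                if p2 == "subClassOf" and o2 == s:   # s2 < s and s < o give s2 < o
                    out.add((s2, "subClassOf", o))
                if p2 == "type" and o2 == s:         # x type s and s < o give x type o
                    out.add((s2, "type", o))
        elif p == "type":
            for s2, p2, o2 in closure:
                if p2 == "subClassOf" and s2 == o:   # s type o and o < o2 give s type o2
                    out.add((s, "type", o2))
        return out

    def _extend(self, frontier):
        """Forward-chain from the given statements until no new ones appear."""
        frontier = set(frontier)
        while frontier:
            stmt = frontier.pop()
            if stmt in self.closure:
                continue
            self.closure.add(stmt)
            frontier |= self._consequences(stmt, self.closure) - self.closure

    def add(self, stmt):
        self.explicit.add(stmt)
        self._extend({stmt})

    def delete(self, stmt):
        # Naive retraction: rebuild the closure from the remaining explicit facts.
        self.explicit.discard(stmt)
        self.closure = set()
        self._extend(self.explicit)


store = MaterializingStore()
store.add(("Woman", "subClassOf", "Person"))
store.add(("Person", "subClassOf", "Agent"))
store.add(("Maria", "type", "Woman"))
print(len(store.explicit), len(store.closure))     # 3 6
store.delete(("Person", "subClassOf", "Agent"))
print(("Maria", "type", "Agent") in store.closure)  # False
```

Queries against `store.closure` are plain lookups, which is why materialization adds little or no query-time cost; the price is paid in `add` and `delete`.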
  • 23. Query-time Reasoning and Backward-Chaining o Perform reasoning at query time ü No overhead upon data loading and updates ü Two basic approaches: Backward-chaining and Query rewriting o Backward-chaining slows down query evaluation dramatically ü As with PROLOG unification, the engine “dives” recursively in order to exhaust all alternative ways to find bindings for each separate triple pattern in the query ü There is no way to estimate the cardinality of the results for each triple pattern before the actual evaluation ü This makes query plan optimization impossible and ruins query performance
  • 24. Query Rewriting o Each pattern in the query is rewritten as a disjunction of several alternatives, based on reasoning on the schema/ontology/TBox. The query pattern <?a rdf:type ptop:Person> will be expanded to something like:
      <?a rdf:type ptop:Person>
      OR (<?p rdfs:range ptop:Person> AND <?b ?p ?a>)
      OR (<?a rdf:type ?c> AND <?c rdfs:subClassOf ptop:Person>)
      …
    o Execution of tens of combinations of variants is slow ü Imagine a query with two patterns: the first one expands into 5 variants and the second into 6 variants. The engine will have to evaluate 30 alternative combinations ü Think of implementing the semantics of owl:sameAs via query rewriting o Query rewriting also delivers incomplete results ü Recursion is not possible with SPARQL query rewriting
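The combinatorial cost described on this slide can be reproduced with a small sketch. Everything here is hypothetical: the tiny TBox, the class and property names, and the helpers `rewrite_type_pattern` and `rewrite_query`. Unlike the slide's expansion, which keeps schema lookups inside the query, this simplified version resolves subclasses and ranges against the schema at rewrite time.

```python
from itertools import product

# Hypothetical TBox, invented for this illustration.
SUBCLASSES = {
    "Person": ["Woman", "Man"],
    "Organization": ["Company", "Bank", "Charity"],
}
# Properties whose rdfs:range is the given class.
RANGES = {
    "Person": ["parentOf", "childOf"],
    "Organization": ["worksFor", "memberOf"],
}


def rewrite_type_pattern(var, cls):
    """Expand (?var rdf:type cls) into the alternatives implied by the schema."""
    fresh = f"?_{var.lstrip('?')}_s"   # helper variable, unique per pattern
    variants = [f"{var} rdf:type {cls}"]
    variants += [f"{var} rdf:type {sub}" for sub in SUBCLASSES.get(cls, [])]
    variants += [f"{fresh} {prop} {var}" for prop in RANGES.get(cls, [])]
    return variants


def rewrite_query(patterns):
    """Each combination of per-pattern variants is one conjunctive query to run."""
    expanded = [rewrite_type_pattern(var, cls) for var, cls in patterns]
    return list(product(*expanded))


combos = rewrite_query([("?a", "Person"), ("?b", "Organization")])
print(len(rewrite_type_pattern("?a", "Person")))        # 5
print(len(rewrite_type_pattern("?b", "Organization")))  # 6
print(len(combos))                                       # 30
```

Because the number of combinations is the product of the per-pattern variant counts, each extra pattern multiplies the work, which is the slowdown the slide warns about.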
  • 25. Presentation Outline o Reasoning Introduction: Benefits and Pitfalls o Reasoning Use Cases and Demos o RDFS and OWL 2 Profiles o Reasoning Implementation Choices o GraphDB o Reasoning with GraphDB o Reasoning Optimizations in GraphDB
  • 26. GraphDB Essentials o Scalable RDF / SPARQL engine ü W3C standards support ü NEW: RDF* support, property annotations o Platform independent (100% Java) o Open source API ü Main contributor to the RDF4J project o Reasoning and consistency checking ü UNIQUE! Efficient reasoning support for big data sets across the full lifecycle of the data: load, query, updates
  • 27. Architecture GraphDB Workbench User friendly interface for database administration GraphDB Engine REST API for database access Plugin / Connectors
  • 28. GraphDB Workbench o SPARQL editor & autocomplete o Schema visualization o Graph exploration o Database monitoring and administration
  • 30. Features compared across the Free, Standard and Enterprise editions: RDF 1.1 support; SPARQL 1.1 support; RDFS, OWL2 RL and QL reasoning; Efficient query execution; Workbench interface; Community support; Unlimited number of CPU cores; Commercial support; Connectors for Elasticsearch & SOLR; High-availability cluster; Managed service. GraphDB Enterprise: Resilience & Availability
  • 32. Reasoning in GraphDB o Fast forward-chaining materialization ü Allows for efficient query evaluation on big datasets o Incremental for both inserts and deletes ü Inferred closure is updated transparently upon commit of transaction o Sample rules:
      ENTAILMENT
        p <rdf:type> <owl:FunctionalProperty>
        x p y
        x p z
        ------------------------
        y <owl:sameAs> z
      CONSISTENCY (the premises must not hold together; there is no conclusion)
        x owl:sameAs y
        x owl:differentFrom y
        -------------------------------
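The two sample rules can be read as the following sketch. It is a plain-Python paraphrase under stated assumptions, not GraphDB's rule language: the function names and the `ex:` data are hypothetical, and the entailment function skips the trivial case y = z.

```python
def functional_property_entailment(statements):
    """p a FunctionalProperty, x p y, x p z  =>  y owl:sameAs z (for y != z)."""
    functional = {s for s, p, o in statements
                  if p == "rdf:type" and o == "owl:FunctionalProperty"}
    inferred = set()
    for x1, p1, y in statements:
        if p1 not in functional:
            continue
        for x2, p2, z in statements:
            if x1 == x2 and p1 == p2 and y != z:
                inferred.add((y, "owl:sameAs", z))
    return inferred


def consistency_violations(statements):
    """Pairs asserted or inferred to be both owl:sameAs and owl:differentFrom."""
    same = {(s, o) for s, p, o in statements if p == "owl:sameAs"}
    different = {(s, o) for s, p, o in statements if p == "owl:differentFrom"}
    return sorted(same & different)


# Hypothetical data: a functional property with two distinct values.
data = {
    ("ex:hasBirthMother", "rdf:type", "owl:FunctionalProperty"),
    ("ex:Ivan", "ex:hasBirthMother", "ex:Maria"),
    ("ex:Ivan", "ex:hasBirthMother", "ex:MariaP"),
    ("ex:Maria", "owl:differentFrom", "ex:MariaP"),
}
inferred = functional_property_entailment(data)
violations = consistency_violations(data | inferred)
print(sorted(inferred))
print(violations)   # [('ex:Maria', 'ex:MariaP')]
```

The entailment rule adds statements to the closure, while the consistency rule only reports a conflict, which is why the second rule on the slide has premises but no conclusion.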
  • 33. OWL 2 Reasoning o Built-in rule-sets for: RDFS, OWL-Horst, OWL2-RL, OWL2-QL o Custom rule-sets easily defined ü Ruleset optimizer/profiler o Configurations with multiple rule-sets ü E.g. one with consistency checking to be used for internal data and another one with "open-world" semantics for LOD and other external datasets o NEW: Proof plug-in provides inference explanation
  • 34. Predefined Rule-Sets (Ruleset: Description)
      o Empty: No reasoning
      o rdfs: Standard RDFS: subClassOf, subPropertyOf, domain and range of properties
      o rdfs-plus: RDFS plus symmetric, transitive and inverse properties
      o owl-horst (pD*): sameAs, equivalentClass, equivalentProperty, SymmetricProperty, TransitiveProperty, inverseOf, FunctionalProperty, InverseFunctionalProperty. Partial support for: intersectionOf, someValuesFrom, hasValue, allValuesFrom
      o owl-max: See the spec http://graphdb.ontotext.com/documentation/standard/reasoning.html
      o owl-rl (DL-LiteR): AsymmetricProperty, IrreflexiveProperty, propertyChainAxiom, AllDisjointProperties, hasKey, unionOf, complementOf, oneOf, differentFrom, AllDisjointClasses and all the property cardinality primitives. Adds more complete support for intersectionOf, someValuesFrom, hasValue, allValuesFrom
      o owl-ql: Partial compliance. See the spec https://www.w3.org/TR/owl2-profiles
  • 35. Optimized Rule-Sets o These versions exclude some RDFS reasoning rules, which are not useful for most of the applications, but add substantial reasoning overheads o “Optimized” ruleset versions suppress this rule:
      Id: rdf1_rdfs4a_4b
        x a y
        -------------------------------
        x <rdf:type> <rdfs:Resource>
        a <rdf:type> <rdfs:Resource>
        y <rdf:type> <rdfs:Resource>
  • 36. Presentation Outline
      o Reasoning Introduction: Benefits and Pitfalls
      o Reasoning Use Cases and Demos
      o RDFS and OWL 2 Profiles
      o Reasoning Implementation Choices
      o GraphDB
      o Reasoning with GraphDB
      o Reasoning Optimizations in GraphDB
  • 37. Efficient Retraction of Inferred Facts
      o Materialization complicates deletes
        ü It is not trivial to figure out which inferred statements are no longer supported
      o Deletion without recomputing the full inference closure is needed
        ü Without it, forward-chaining is not feasible for dynamic environments
      o GraphDB retracts statements via a unique algorithm
        ü Forward-chaining to find potentially affected inferences
        ü Backward-chaining to test which inferences are still supported
        ü No truth maintenance information overhead
        ü Fast: the same order of magnitude as materialization upon insert
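The two-phase idea on slide 37 can be sketched in Python for a single toy rule, transitivity: (a,b), (b,c) => (a,c). This is only an illustration of the forward-then-backward pattern under stated assumptions, not GraphDB's actual algorithm, which the slides do not describe in detail. All function names are hypothetical, and statements are simple (source, target) pairs.

```python
def materialize(explicit):
    """Forward-chain the toy rule (a,b),(b,c) => (a,c) to a fixpoint."""
    closure = set(explicit)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (b2, c) in list(closure):
                if b == b2 and (a, c) not in closure:
                    closure.add((a, c))
                    changed = True
    return closure


def potentially_affected(deleted, closure):
    """Phase 1: forward-chain from the deleted statement to collect
    everything it may have helped to derive."""
    affected = {deleted}
    frontier = [deleted]
    while frontier:
        (a, b) = frontier.pop()
        for (x, y) in closure:
            candidates = []
            if x == b:  # (a,b),(b,y) => (a,y)
                candidates.append((a, y))
            if y == a:  # (x,a),(a,b) => (x,b)
                candidates.append((x, b))
            for c in candidates:
                if c not in affected:
                    affected.add(c)
                    frontier.append(c)
    return affected


def still_provable(goal, explicit):
    """Phase 2: backward-chain from the goal (a,c): it holds if some chain
    of explicit statements leads from a to c."""
    a, c = goal
    stack, seen = [a], set()
    while stack:
        node = stack.pop()
        for (x, y) in explicit:
            if x != node:
                continue
            if y == c:
                return True
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return False


def retract(deleted, explicit, closure):
    """Delete one explicit statement without recomputing the whole closure."""
    remaining = explicit - {deleted}
    affected = potentially_affected(deleted, closure)
    unsupported = {s for s in affected if not still_provable(s, remaining)}
    return remaining, closure - unsupported


explicit = {("A", "B"), ("B", "C"), ("C", "D"), ("A", "C")}
closure = materialize(explicit)
new_explicit, new_closure = retract(("B", "C"), explicit, closure)
print(sorted(new_closure))
```

The sketch keeps no justification records per statement, in the spirit of the "no truth maintenance" bullet. Only the statements reachable from the deleted one are re-examined, and here (A, D) survives because it is still supported through (A, C) and (C, D).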
  • 38. The Honey of owl:sameAs Equivalence
      o owl:sameAs links the datasets in the Linked Open Data cloud
      o owl:sameAs declares that two different URIs denote one and the same object
        ü Aligns different identifiers of the same real-world entity used in different data sources
      o For example, let’s say that we have three different URIs for Bulgaria and two for Sofia (its capital):
          dbpedia:Sofia owl:sameAs geonames:727011
          geonames:727011 geo-ont:parentFeature geonames:732800
          dbpedia:Bulgaria owl:sameAs geonames:732800
          dbpedia:Bulgaria owl:sameAs opencyc-en:Bulgaria
  • 39. The Sting of owl:sameAs Equivalence
      o According to the standard semantics of owl:sameAs
        ü It is a transitive and symmetric relationship
        ü Statements asserted using one of the equivalent URIs should be inferred to appear with all equivalent URIs placed in the same position
        ü Thus the 4 statements in the example lead to 10 inferred statements:
          geonames:727011 owl:sameAs dbpedia:Sofia
          geonames:732800 owl:sameAs dbpedia:Bulgaria
          geonames:732800 owl:sameAs opencyc-en:Bulgaria
          opencyc-en:Bulgaria owl:sameAs dbpedia:Bulgaria
          opencyc-en:Bulgaria owl:sameAs geonames:732800
          dbpedia:Sofia geo-ont:parentFeature geonames:732800
          dbpedia:Sofia geo-ont:parentFeature opencyc-en:Bulgaria
          dbpedia:Sofia geo-ont:parentFeature dbpedia:Bulgaria
          geonames:727011 geo-ont:parentFeature opencyc-en:Bulgaria
          geonames:727011 geo-ont:parentFeature dbpedia:Bulgaria
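The count of 10 inferred statements can be reproduced with a short Python sketch. It is a toy closure computation, not how GraphDB evaluates owl:sameAs. The function names are hypothetical, and reflexive statements (x owl:sameAs x) are left out, as they are on the slide.

```python
from itertools import product

SAME_AS = "owl:sameAs"

explicit = {
    ("dbpedia:Sofia", SAME_AS, "geonames:727011"),
    ("geonames:727011", "geo-ont:parentFeature", "geonames:732800"),
    ("dbpedia:Bulgaria", SAME_AS, "geonames:732800"),
    ("dbpedia:Bulgaria", SAME_AS, "opencyc-en:Bulgaria"),
}


def equivalence_classes(triples):
    """Group URIs into sameAs-equivalence classes with a small union-find."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for s, p, o in triples:
        if p == SAME_AS:
            parent[find(s)] = find(o)

    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)
    # Map every URI to the full set of its equivalents (itself included).
    return {node: members for members in groups.values() for node in members}


def sameas_closure(triples):
    """Symmetric and transitive sameAs, plus substitution of equivalents."""
    classes = equivalence_classes(triples)
    closure = set()
    for members in classes.values():
        for a, b in product(members, repeat=2):
            if a != b:  # reflexive statements are omitted
                closure.add((a, SAME_AS, b))
    for s, p, o in triples:
        if p == SAME_AS:
            continue
        for replica in product(classes.get(s, {s}),
                               classes.get(p, {p}),
                               classes.get(o, {o})):
            closure.add(replica)
    return closure


inferred = sameas_closure(explicit) - explicit
print(len(inferred))  # 10
```

Five of the inferred statements are owl:sameAs links and five are geo-ont:parentFeature replicas, matching the list on the slide.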
  • 40-41. The Honey and the Sting of owl:sameAs [Figure, shown on two consecutive slides: graph of the example URIs (nodes labelled E11, E12, E21, E22, E23; URIs geonames:727011, dbpedia:Sofia, geonames:732800, dbpedia:Bulgaria, opencyc-en:Bulgaria) connected by geo-ont:parentFeature]
  • 42. owl:sameAs Optimization
      o GraphDB features an optimization of owl:sameAs
        ü It can use a single master-node in its indices to represent a class of sameAs-equivalent URIs
      o Avoids inflating the indices with multiple equivalent statements
        ü Imagine a statement that has 5 sameAs-equivalents of its subject, 2 of its predicate and 3 of its object. Such a statement would have 30 replicas in the indices after forward-chaining if no such optimization were used
      o Helps present compact query results
        ü The owl:sameAs equivalence can multiply the bindings of the variables during query evaluation with both forward- and backward-chaining. This expands the result set with rows that differ only by referring to different URIs, which are sameAs-equivalent
        ü Optionally, query results can be expanded, as if there is no optimization
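The arithmetic behind the 30 replicas (5 x 2 x 3) and the master-node idea can be sketched in Python. GraphDB's internal index layout is not described on the slides, so this is only an illustration: the function names and `ex:` URIs are hypothetical, and choosing the lexicographically smallest URI as the master is an arbitrary assumption.

```python
from itertools import product


def build_master_map(classes):
    """Map every URI to one representative of its sameAs class
    (assumption: the lexicographically smallest member)."""
    return {uri: min(members) for members in classes for uri in members}


def store(triples, master):
    """Index each statement once, rewritten to master nodes."""
    return {tuple(master.get(term, term) for term in triple)
            for triple in triples}


def expand(triple, classes_by_master):
    """Optionally expand a stored statement to all equivalent replicas."""
    options = [sorted(classes_by_master.get(term, {term})) for term in triple]
    return set(product(*options))


# Hypothetical classes: 5 equivalent subjects, 2 predicates, 3 objects.
classes = [
    {"ex:s1", "ex:s2", "ex:s3", "ex:s4", "ex:s5"},
    {"ex:p1", "ex:p2"},
    {"ex:o1", "ex:o2", "ex:o3"},
]
master = build_master_map(classes)
classes_by_master = {min(members): members for members in classes}

index = store({("ex:s3", "ex:p2", "ex:o1"), ("ex:s5", "ex:p1", "ex:o3")}, master)
expanded = expand(next(iter(index)), classes_by_master)
print(len(index), len(expanded))  # 1 30
```

Both input statements collapse to the same master statement, so the index holds one entry instead of 30, while the expansion step recovers all replicas when a query asks for them.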
  • 43. Questions? Experience the technology with our demonstrators
      o FactForge: Knowledge graph of linked open data and news about People and Organizations http://factforge.net
      o RANK: News popularity ranking for companies http://rank.ontotext.com
      o NOW: Semantic News Portal http://now.ontotext.com