mm-ADT
A Multi-Model Abstract Datatype
Dr. Marko A. Rodriguez
Founder, RReduX Inc.
Project Management Committee, Apache TinkerPop
Rodriguez, M.A., “mm-ADT: A Multi-Model Abstract Datatype,” ApacheCon 2019, Las Vegas, NV, September 2019.
Funded by: RReduX
Collaborators
Daniel Kuppitz and Stephen Mallette
WARNING
This Slide Presentation Should Not Be Used as a Reference
As of September 2019, there exists a rough draft of the mm-ADT
specification (0.1-alpha) and a Java-based prototype of the
mm-ADT-bc compiler. The specifics presented here are likely to
change as the project matures.
The goal is to release the following artifacts by early 2020.
1.) A stable 1.0 mm-ADT specification document.
2.) A Java-based mm-ADT-bc compiler and virtual machine.
3.) A basic reference implementation of an mm-ADT database.
Database Datatypes
WIDE-COLUMN DATABASE
RELATIONAL DATABASE
KEY/VALUE DATABASE
PROPERTY GRAPH DATABASE
DOCUMENT DATABASE
RDF DATABASE
Database Components
STORAGE SYSTEM PROCESSING ENGINE
QUERY LANGUAGE
Universal Database Components
UNIVERSAL DATA STRUCTURE
UNIVERSAL PROCESSING MODEL
UNIVERSAL INSTRUCTION SET
Maturity estimates shown on the slide: 80% solid, 90% solid, 60% solid (as of Sept. 2019).
Presentation is primarily on this topic.
Synthetic Database Systems
mm-ADT Compliant
The Benefit to Component Providers
QUERY LANGUAGE PROVIDERS
Your query language’s queries can execute real-time, near-time, or batch-time over any database.
PROCESSING ENGINE PROVIDERS
Your processing engine can be programmed by any query language and compute over any database.
STORAGE SYSTEM PROVIDERS
Your database can be processed by many processors whose queries are defined by many different query languages.
Multi-model programmability
Larger user base
mm-ADT
A Multi-Model Abstract Datatype
Any Data Structure/Language Storage/Processing Agnosticism Universal Operators
GET
PUT
FILTER
SELECT
PROJECT
ORDER
DEDUP
GROUP
COUNT
SUM
MEAN
COALESCE
REPEAT
BRANCH
ARITHMETIC
RECORD LAYOUT
INDEX STRUCTURES
PARTITIONING
DENORMALIZATIONS
SORT ORDERS
…
PULL-BASED ITERATION
PUSH-BASED REACTIVE
MESSAGE PASSING
LINEAR SCAN/FILTER
LAZY/STRICT EVALUATION
QUANTUM WAVE INTERFERENCE
KEY/VALUE
JSON/XML DOCUMENT
PROPERTY/RDF GRAPH
RELATIONAL
WIDE-COLUMN
QUANTUM
…
SQL
SPARQL
CYPHER
GREMLIN
GRAPHQL
CQL
QUERY DOCUMENT
QUANTUM UNITARY MATRICES
yes. any data structure.
yes. any data query language.
yes. any algorithm.
yes. addition. :)
mm-ADT Requirements
A Cluster-Oriented Bytecode and Virtual Machine
A type system for capturing database models and schemas.
Datatypes, composite structures, constraints, foreign keys,
cardinalities, dependencies.
A database perspective w/ loose coupling for future use cases.
Indices, denormalizations, read/write costs, transactions,
offloading computations, complex data access paths, multi-user,
distributed data, data locality, real-time/batch executions.
A bytecode foundation for use across software and languages.
Compatible with Java, C++, Go, etc.-based databases.
Reasonable compilation strategies to/from:
SQL, SPARQL, Cypher, Gremlin, GraphQL, QueryDocs, …future.
A universal processing model able to capture various strategies.
Single-threaded, single-machine queries.
Distributed, cluster-oriented queries.
Lazy and space-efficient for main-memory queries (real-time).
Eager and time-efficient for disk-based queries (batch).
Turing-complete for any known query/algorithm.
Part 1
Universal Data Structure
mm-ADT Bytecode Uses
model-ADT Definitions vs. model-ADT Queries
model-ADT definition: Created by the engineers developing an mm-ADT compliant storage system. Defines the storage system’s abstract data type. A complex bytecode, but even for feature-rich storage systems, it is typically less than 100 lines of code.
Sub-model definition: Created by the storage system user. A subtype of the storage system’s model-ADT. Best understood as the user’s application’s “schema.” Can be generated by the storage system via its schema language.
model-ADT query: Created by higher-level query languages. However, it can be written directly by a human user. A simple bytecode with a similar structure to the popular query language of the respective model-ADT.
Model definition supported by the storage system:
[model,pg=>mm,
[define,vertex,…]
[define,edge,…]
[define,db,…]]
Sub-model definitions describing the user’s domain of discourse:
[model,social=>pg,
[define,person,vertex&…]
[define,likes,edge&…]
[define,db,…]]
Instance data manipulation and analysis:
[db][get,age]
[dedup]
[sum]
When compiling query bytecode,
definitions are used for type inference/
checking, query optimization, and
cross model embeddings.
The most complex model to date is 65 lines of code.
The language used for all examples is the (somewhat) human readable/writable bytecode language called mm-ADT-bc.
Built-In Datatypes
Fundamental Structures Defined Outside of mm-ADT
@obj: object
@bool: true|false
@int: …,-2,-1,0,1,2,… (-infty..infty)
@real: floating point (-infty..infty)
@str: ‘marko’ (a char is typically a one-element str)
@list: [@obj*] (mixed-typed array)
@rec: [(@obj:@obj)*] (key/value associative array)
@inst: [@str,@obj*] (bytecode)
@q: quantifier
@db: database, root structure
@num, @seq: default reserved type aliases
Custom Datatypes
Type Definitions
[define,person,[name:@str,age:@int]]
opcode: define
symbol: person
structure: [name:@str,age:@int]
@str
[is,[type][eq, str ]]: @obj => @str?
@int
[is,[type][eq, int ]]: @obj => @int?
@person
[is,[get,name][type][eq, str ]][is,[get,age][type][eq, int ]]: @obj => @person?
@inst
mm-ADT types are filter bytecode (predicates). If an object is unfiltered by the bytecode, then the object is that type.
[define,person,[name:[is,[type][eq, str ]],age:[is,[type][eq, int ]]]]
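The "types are filters" idea can be sketched outside of mm-ADT-bc. The Python below is only an illustration under assumptions: the function names are hypothetical, a record is modeled as a dict, and returning None stands in for an object being filtered out.

```python
# Hypothetical rendering of mm-ADT types as predicates (filters).
# An object that survives the filter "is" that type.

def is_str(obj):
    """@str: pass strings through, filter everything else."""
    return obj if isinstance(obj, str) else None

def is_int(obj):
    """@int: pass integers through; bool is excluded because Python treats it as an int."""
    return obj if isinstance(obj, int) and not isinstance(obj, bool) else None

def is_person(obj):
    """@person: a record whose name passes @str and whose age passes @int."""
    if not isinstance(obj, dict):
        return None
    if is_str(obj.get("name")) is None or is_int(obj.get("age")) is None:
        return None
    return obj

marko = {"name": "marko", "age": 29}
not_a_person = {"name": "marko", "age": "twenty-nine"}
```

Here is_person(marko) returns the record unchanged, while is_person(not_a_person) returns None, mirroring @obj => @person?.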
[define,person,[name:@str,age:@int&gte(0)]]
Custom Datatypes
Type Refinement
[define,probability, @real&gte(0.0)&lte(1.0)]
[define,varchar255, @str&[is,[len][lte,255]]]
[define,short, @int&gte(-32767)&lte(32767)]
[define,pair, [@obj,@obj]]
[define,triple, [@obj,@obj,@obj]]
[define,years,@int&gte(0)]
[define,person,[name:@str,age:@years]]
anonymous
type
named
type
Anonymous types are the norm in mm-ADT.
[define,person,[name:@str,age:@int&gte(0),friend:@person]]
Custom Datatypes
Type Recursion
[define,vertex,[inE:@edge*,outE:@edge*]]
[define,edge,[outV:@vertex,inV:@vertex]]
Graph model-ADTs are recursive structures.
[Figure: a vertex and an edge linked by outE/inE and outV/inV]
[define,person,[name:@str,age:@int&gte(0),friend:@person?]]
Custom Datatypes
Type Quantifiers
{3,5}: 3 to 5
* : {0,infty}
+ : {1,infty}
? : {0,1}
{2} : {2,2}
{3,} : {3,infty}
All standard query languages are bound to the natural numbers with standard multiplication and addition definitions.
[define,none, @obj{0} ]
[define,some, @obj{1} ] // @obj
[define,maybe,@obj{0,1}] // @obj?
Quantifiers can be matrices composed of complex numbers.
Constructive and destructive wave interference patterns can be incorporated into a query.
This does not radically alter the bytecode as quantifiers are fundamental to mm-ADT.
Quantifiers can be real numbers (for example, values between 0.0 and 1.0).
Real quantifiers can model energy diffusions, fuzzy set semantics, probabilistic/stochastic behavior.
Quantifiers are any algebraic ring with unity.
[mult][add][sub][zero][one]
{min,max}
@str? @person? @vertex?
@none | @some = @maybe
{0,0} + {1,1} = {0,1}
Unitary Matrix Quantifiers
[define,q,@unitary]
@obj{[1,0,1,0]}
Real Number Quantifiers
[define,q,@real&gte(0.0)&lte(1.0)]
@obj{0.92}
[define,person,[name:@str,age:@int&gte(0),friend:@person?]]
[define,loner,@person&[friend:@person{0}]]
Custom Datatypes
Subtypes
{x1,x2}&{y1,y2} => {x1*y1,x2*y2}
{x1,x2}|{y1,y2} => {min(x1,y1),max(x2,y2)}
{0,1}
{0,0}
[name:@str,age:@int&gte(0),friend:@person?]&[friend:@person{0}]
=>
[name:@str,age:@int&gte(0),friend:@person{0}]
Composition through & and | are defined for each type with default definitions from the underlying fundamental type.
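The two quantifier composition rules above can be checked with a few lines of Python. This is a sketch under assumptions: quantifiers are modeled as (min, max) tuples over the natural numbers, and 0 times infinity is defined as 0 so that {0,infty}&{0} collapses to {0}. That last convention is a choice made here, not something the slides state.

```python
import math

INF = math.inf

# Shorthand expansions taken from the quantifier table.
SHORTHAND = {"*": (0, INF), "+": (1, INF), "?": (0, 1)}

def _mul(a, b):
    """Multiply bounds, treating 0 * infinity as 0 (assumption of this sketch)."""
    return 0 if a == 0 or b == 0 else a * b

def q_and(x, y):
    """{x1,x2}&{y1,y2} => {x1*y1, x2*y2}"""
    return (_mul(x[0], y[0]), _mul(x[1], y[1]))

def q_or(x, y):
    """{x1,x2}|{y1,y2} => {min(x1,y1), max(x2,y2)}"""
    return (min(x[0], y[0]), max(x[1], y[1]))

loner_friend = q_and(SHORTHAND["?"], (0, 0))  # friend:@person? & friend:@person{0}
maybe = q_or((0, 0), (1, 1))                  # @none | @some
```

With these definitions, loner_friend evaluates to (0, 0) and maybe to (0, 1), matching the {0} and @maybe results on the slides.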
loner <: person (person is the super type)
Note that there are also axioms for the composition of a type’s instruction rewrites. Not discussed in this presentation.
Type quantifier composition: {0,1}&{0,0}=>{0,0}=>{0}
[define,person,[name:@str,age:@int&gte(0),friend:@person?]]
[define,older,@person[age :@int&gte(0)~x,
friend:@person&[age:@int&gte(0)&lt(~x)]?]]
Custom Datatypes
Type Dependents
[define,complex,[@real~x1,@real~x2]
-> [add,@complex&[@real~y1,@real~y2]] => [map,@complex&[~x1+~y1,~x2+~y2]]
-> [mult,@complex&[@real~y1,@real~y2]] => [map,@complex&[(~x1*~y1)-(~x2*~y2),
(~x1*~y2)+(~x2*~y1)]]]
Infix arithmetic notation is still being designed. Primary issue is * and + parser ambiguities due to quantifier shorthands.
anonymous
subtype
mm-ADT Datatypes
The General Structure of a Type
A{x,y} <= [i]*
-> [f]+ => [j]*
-> [g]+ => [k]*
-> [h]+ => [l]*
Labeled components: type, value/structure, type quantifier, instance access, mapsFrom token (<=), instruction token (->), instruction pattern, mapsTo token (=>), instruction rewrite, type instructions.
The type is represented in mm-ADT-bc syntax. All mm-ADT bytecode generating languages leverage these components.
Inspired by OOP, where methods/instructions are grouped by domain class/type.
A <= [i]
-> [f] => [B <= [j]]
-> [g] => [C <= [k]]
def: f(a) = b
spec: f : A → B (domain A, range B)
Types with instructions denote function branches.
A -> [f] => B
-> [g] => C
f : A → B
g : A → C
Paths from type to type via instructions denote function composition.
A -> [f] => B -> [h] => D
f · h : A → D
mm-ADT Datatypes
Instance Data Access Path
[define,person,
[name:@str~x,age:@int] <= [db][get,people][is,[get,name][eq,~x]]]
A <= […] // instance access (canonical representation)
@person&[name:marko,age:29] // instance: all constant values
@person&[age:29] // type: name unbound (“people 29 years of age”)
@person&[name:marko] // reference: accessible via <=
mm-ADT is a cluster-oriented programming language where data access has different costs depending on locality.
mm-ADT Datatypes
Instruction Rewrites
[define,person,[name:@str,age:@int]]
[define,db,person*
-> [order,[gt,[get,age]]] => // no-op
-> [dedup,[get,name]] => // no-op
-> [count] => [ref,@int <= [db][count]] // aggregate
-> [is,[get,name][eq,@str~x]] => [ref,@person? <= [db][is,[get,name][eq,~x]]]] // index lookup
All [ref] instructions are resolved by the storage system or processing engine and are used to access secondary structures.
Submitted bytecode, O(n) linear scan: [db][is,[get,name][eq,marko]]
Rewritten bytecode, O(log n) index query: [ref,@person? <= [db][is,[get,name][eq,marko]]]
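The name-lookup rewrite can be mimicked with ordinary pattern matching. In this Python sketch, bytecode is modeled as nested lists; the helper names is_name_eq and rewrite are hypothetical and only the single [is,[get,name][eq,x]] rule is handled, so it is an illustration rather than how the mm-ADT-bc compiler works.

```python
def is_name_eq(inst):
    """True if inst has the shape [is,[get,name],[eq,x]]."""
    return (
        len(inst) == 3
        and inst[0] == "is"
        and inst[1] == ["get", "name"]
        and isinstance(inst[2], list)
        and len(inst[2]) == 2
        and inst[2][0] == "eq"
    )

def rewrite(bytecode):
    """Replace a leading [db][is,[get,name][eq,x]] with a single [ref,...] instruction."""
    if len(bytecode) >= 2 and bytecode[0] == ["db"] and is_name_eq(bytecode[1]):
        access_path = bytecode[:2]  # kept so the reference can be resolved later
        return [["ref", "@person?", access_path]] + bytecode[2:]
    return bytecode  # no rule matched: bytecode is left as submitted

submitted = [["db"], ["is", ["get", "name"], ["eq", "marko"]]]
rewritten = rewrite(submitted)
```

The rewritten list carries the original access path inside the ref, which is what lets a storage system answer it from an index instead of a scan.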
Primary Structure: The structure associated with the database’s conceptual model.
Secondary Structures: The auxiliary structures used to improve query performance and ensure data integrity.
Relational model-ADT
Primary and Secondary Structures
[Figure: a relational table (schema: str, bool, int, int) surrounded by its secondary structures: index, denormalization, sort order, unique, checks (< 10), and aggregates (SUM=567)]
Relational model-ADT
RELATIONAL DATABASE
[model,rdb=>mm,
[define,value,@bool|@num|@str]
[define,row,[(@str:@value)*]]
[define,table,@row*]
[define,db,[(@str:@table)*]]]
DOMAIN SPECIFIC RELATIONAL SCHEMA
[model,social=>rdb,
[define,person,[name:@str?,age:@int?]]
[define,persons,@person*]
[define,db,[people:@persons]]]
CREATE TABLE people (
name varchar(255),
age int
)
[define,person,[name:@str?,age:@int?]]
[define,persons,@person*]
[define,db,[people:@persons]]
Relational model-ADT
Primary Structure (Subtype)
Relational model-ADT
Secondary Structures (Primary Key)
[define,person,[name:@str,age:@int?]]
[define,persons,@person*
-> [dedup,[get,name]] => ]
[define,db,[people:@persons]]
CREATE TABLE people (
name varchar(255),
age int,
PRIMARY KEY(name)
)
Relational model-ADT
Secondary Structures (Sort Orders)
[define,person,[name:@str,age:@int?]]
[define,persons,@person*
-> [dedup,[get,name]] =>
-> [order,[gt,[get,name]]] => ]
[define,db,[people:@persons]]
CREATE TABLE people (
name varchar(255),
age int,
PRIMARY KEY (name ASC)
)
[define,person,[name:@str,age:@int]]
[define,persons,@person*
-> [dedup,[get,name]] =>
-> [order,[gt,[get,name]]] => ]
[define,db,[people:@persons]]
Relational model-ADT
Secondary Structures (Null Values)
CREATE TABLE people (
name varchar(255),
age int NOT NULL,
PRIMARY KEY (name ASC)
)
Relational model-ADT
Secondary Structures (Check Values)
[define,person,[name:@str,age:@int&gte(0)]]
[define,persons,@person*
-> [dedup,[get,name]] =>
-> [order,[gt,[get,name]]] => ]
[define,db,[people:@persons]]
CREATE TABLE people (
name varchar(255),
age int NOT NULL,
PRIMARY KEY (name ASC),
CHECK (age >= 0)
)
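The correspondence between the CREATE TABLE constraints and the person definition is mechanical enough to sketch. The Python below is a hypothetical generator, not part of mm-ADT: the column spec format and the function names are assumptions, and it only covers the nullable, NOT NULL, and CHECK cases shown on these slides.

```python
# Hypothetical mapping from SQL column types to mm-ADT built-in types.
SQL_TO_MMADT = {"varchar": "@str", "int": "@int"}

def column_type(sql_type, not_null=False, check=None):
    """Build the mm-ADT-bc type for one column."""
    mm_type = SQL_TO_MMADT[sql_type]
    if check:
        mm_type += "&" + check          # e.g. CHECK (age >= 0) -> &gte(0)
    return mm_type if not_null else mm_type + "?"  # nullable -> {0,1} quantifier

def define_person(columns):
    """Render [define,person,[...]] from an ordered dict of column specs."""
    fields = ",".join(f"{name}:{column_type(**spec)}" for name, spec in columns.items())
    return f"[define,person,[{fields}]]"

nullable = define_person({
    "name": {"sql_type": "varchar"},
    "age": {"sql_type": "int"},
})
checked = define_person({
    "name": {"sql_type": "varchar", "not_null": True},  # primary key implies NOT NULL
    "age": {"sql_type": "int", "not_null": True, "check": "gte(0)"},
})
```

The two results reproduce the definitions used for the plain table and for the table with NOT NULL and CHECK constraints.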
[define,person,[name:@str~x,
age:@int&gte(0),
friend:@person&[name:@str]] <= [db][get,people]
[is,[get,name]
[eq,~x]]]
[define,persons,@person*
-> [dedup,[get,name]] =>
-> [order,[gt,[get,name]]] => ]
[define,db,[people:@persons]]
Relational model-ADT
Secondary Structures (Foreign Key)
CREATE TABLE people (
name varchar(255),
age int NOT NULL,
friend varchar(255),
PRIMARY KEY (name ASC),
CHECK (age >= 0),
FOREIGN KEY (friend) REFERENCES people(name)
)
instance access
[define,person,[name:@str~x,
age:@int&gte(0),
friend:@person&[name:@str]] <= [db][get,people]
[is,[get,name]
[eq,~x]]]
[define,persons,@person*
-> [dedup,[get,name]] =>
-> [order,[gt,[get,name]]] =>
-> [is,[get,name][eq,@str~x]] => [ref,@person&[name:~x]?]]
[define,db,[people:@persons]]
Relational model-ADT
Secondary Structures (Single Key Index)
CREATE TABLE people (
name varchar(255),
age int NOT NULL,
friend varchar(255),
PRIMARY KEY (name ASC),
CHECK (age >= 0),
FOREIGN KEY (friend) REFERENCES people(name)
);
CREATE UNIQUE INDEX name_idx ON people(name);
[define,person,[name:@str~x,
age:@int&gte(0),
friend:@person&[name:@str]] <= [db][get,people]
[is,[get,name]
[eq,~x]]]
[define,persons,@person*
-> [dedup,[get,name]] =>
-> [order,[gt,[get,name]]] =>
-> [is,[get,name][eq,@str~x]] => [ref,@person&[name:~x]?
-> [is,[get,age][eq,@int~y]] => [ref,@person&[name:~x,age:~y]?]]
-> [is,[get,age][eq,@int~y]] => [ref,@person&[age:~y]*
-> [is,[get,name][eq,@str~x]] => [ref,@person&[name:~x,age:~y]?]]]
[define,db,[people:@persons]]
Relational model-ADT
Secondary Structures (Multi-Key Index)
CREATE TABLE people (
name varchar(255),
age int NOT NULL,
friend varchar(255),
PRIMARY KEY (name ASC),
CHECK (age >= 0),
FOREIGN KEY (friend) REFERENCES people(name)
);
CREATE UNIQUE INDEX name_idx ON people(name);
CREATE INDEX name_age_idx ON people(name,age);
[define,person,[name:@str~x,
age:@int&gte(0),
friend:@person&[name:@str]] <= [db][get,people]
[is,[get,name]
[eq,~x]]]
[define,persons,@person*
-> [dedup,[get,name]] =>
-> [order,[gt,[get,name]]] =>
-> [is,[get,name][eq,@str~x]] => [ref,@person&[name:~x]?
-> [is,[get,age][eq,@int~y]] => [ref,@person&[name:~x,age:~y]?]]
-> [is,[get,age][eq,@int~y]] => [ref,@person&[age:~y]*
-> [is,[get,name][eq,@str~x]] => [ref,@person&[name:~x,age:~y]?]]
-> [count] => [ref,@int <= [db][get,people]
[count]]]
[define,db,[people:@persons]]
Relational model-ADT
Secondary Structures (Aggregates)
CREATE TABLE people (
name varchar(255),
age int NOT NULL,
friend varchar(255),
PRIMARY KEY (name ASC),
CHECK (age >= 0),
FOREIGN KEY (friend) REFERENCES people(name)
);
CREATE UNIQUE INDEX name_idx ON people(name);
CREATE INDEX name_age_idx ON people(name,age);
[define,person,[name:@str~x,
age:@int&gte(0),
friend:@person&[name:@str]] <= [db][get,people]
[is,[get,name]
[eq,~x]]
-> [get,friend][get,friend] => ]
[define,persons,@person*
-> [dedup,[get,name]] =>
-> [order,[gt,[get,name]]] =>
-> [is,[get,name][eq,@str~x]] => [ref,@person&[name:~x]?
-> [is,[get,age][eq,@int~y]] => [ref,@person&[name:~x,age:~y]?]]
-> [is,[get,age][eq,@int~y]] => [ref,@person&[age:~y]*
-> [is,[get,name][eq,@str~x]] => [ref,@person&[name:~x,age:~y]?]]
-> [count] => [ref,@int <= [db][get,people]
[count]]]
[define,db,[people:@persons]]
Relational model-ADT
Secondary Structures (Domain Logic)
CREATE TABLE people (
name varchar(255),
age int NOT NULL,
friend varchar(255),
PRIMARY KEY (name ASC),
CHECK (age >= 0),
FOREIGN KEY (friend) REFERENCES people(name)
);
CREATE UNIQUE INDEX name_idx ON people(name);
CREATE INDEX name_age_idx ON people(name,age);
Conceptual secondary structures: friends are pair bonded, so [get,friend][get,friend] rewrites to a no-op.
It’s a weird example, but we got stuck with such a simple model.
model-ADT Data Access Graph
Primary and Secondary Structures Specify Data Access Paths
[Figure: the primary structure linked to its secondary structures by reference and dereference edges]
Secondary structures “teleport” the processor from one location in the primary structure to another.
Data Access Path
Primary Structure Only
How many unique names are there for people over 21 years of age?
CREATE TABLE people (
name varchar(255) NOT NULL,
age int NOT NULL
)
[define,person,[name:@str,age:@int]]
[define,persons,@person*]
[define,db,[people:@persons]]
[db][get,people]
[is,[get,age][gt,21]]
[get,name]
[dedup]
[count]
[Figure: execution pipeline from [db] through @db, @persons, each @person, @str, and finally @int. [is,[get,age][gt,21]] and [get,name] are applied once per @person. Instruction types shown: initial, map, flatmap, filter, barrier, reduce. Legend: ref, deref, time spent.]
Data Access Path
Primary and Secondary Structures
CREATE TABLE people (
name varchar(255),
age int NOT NULL,
PRIMARY KEY (name)
);
CREATE INDEX idx ON people(age);
[define,person,[name:@str,age:@int]]
[define,persons,@person*
-> [dedup,[get,name]] =>
-> [is,[get,age][@rel~x,@int~y]] => [ref,@person[age:~x(~y)]*
-> [get,name] => [ref,@str* <= […]
-> [dedup] =>
-> [count] => [ref,@int <= […]]]]]
[define,db,[people:@persons]]
[db][get,people]
[is,[get,age][gt,21]]
[get,name]
[dedup]
[count]
[Figure: runtime instruction rewrites. [db] (initial) yields @db and [get,people] yields @persons. [is,[get,age][gt,21]] becomes an index query yielding @person[age:gt(21)]*. [get,name] yields @str* (names in index). [dedup] is a no-op (names are unique). [count] yields @int (index hit count), which is then dereferenced. Legend: ref, deref, time spent.]
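The difference between the two access paths can be sketched in Python. Everything here is an assumption for illustration: the sample rows are made up, a sorted list with bisect stands in for the age index, and the unique-name guarantee is what justifies skipping the dedup step.

```python
import bisect

# Made-up sample rows; name is unique, as a primary key would guarantee.
people = [
    {"name": "marko", "age": 29},
    {"name": "vadas", "age": 27},
    {"name": "josh", "age": 32},
    {"name": "peter", "age": 10},
]

def count_by_scan(rows, min_age):
    """Primary structure only: O(n) scan, then get name, dedup, count."""
    names = [row["name"] for row in rows if row["age"] > min_age]
    return len(set(names))

# Secondary structure: ages kept sorted, standing in for an index on age.
ages = sorted(row["age"] for row in people)

def count_by_index(min_age):
    """O(log n) seek into the index; the hit count is the answer.
    Dedup is a no-op because names are unique."""
    first_hit = bisect.bisect_right(ages, min_age)
    return len(ages) - first_hit
```

Both functions return the same count for any threshold, but the indexed version never touches the rows themselves.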
Common model-ADTs
Primary Structure
KEY-VALUE STORE
[model,kv=>mm,
[define,k,@num|@str]
[define,v,@obj]
[define,kv,[@k,@v]]
[define,db,@kv*]]
PROPERTY GRAPH DATABASE
[model,pg=>mm,
[define,properties,[(@str:@str|@num|@bool)*]]
[define,element,@properties&[id:@obj,label:@str]]
[define,edge,@element&[outV:@vertex,inV:@vertex]]
[define,vertex,@element&[inE:@edge*,outE:@edge*]]
[define,db,@vertex*]]
RELATIONAL DATABASE
[model,rdb=>mm,
[define,value,@bool|@num|@str]
[define,row,[(@str:@value)*]]
[define,table,@row*]
[define,db,[(@str:@table)*]]]
DOCUMENT DATABASE
[model,doc=>mm,
[define,dval,@bool|@num|@str|@dobj|@dlist]
[define,dlist,[(@dval)*]]
[define,dobj,[(@str:@dval)*]]
[define,doc,@dobj&[_id:@str]]
[define,collection,@doc*]
[define,db,[(@str:@collection)*]]]
A model-ADT is an abstract datatype modeled/embedded in mm-ADT.
[model]
domain of discourse
[model, rdb=>mm, [define,…][define,…][define,…]]
[model, pg=>rdb,[define,…][define,…][define,…]]
[model, doc=>rdb,[define,…][define,…][define,…]]
[model, kv=>rdb,[define,…][define,…][define,…]]
[model,quantum=>pg, [define,…][define,…][define,…]]
Multiple model-ADTs
Models, Embeddings, and Queries
[Figure: embedding graph connecting the models rdb, pg, doc, and kv to mm]
Figure annotations:
the most expressive model aligned with the storage system
storage system implements mm-ADT objects
storage system’s supported models
X model embedded in Y model
all models must have an embedding path to mm-ADT
property graph query (Gremlin=>pg)
[db][is,[get,label][eq,person]]
[get,outE]
[is,[get,label][eq,phones]]
[get,inV]
[get,home]
document query (MongoDB Query=>doc)
[db][get,people]
[get,phones]
[get,home]
key/value query (REST=>kv)
[db][get,people]
[get,phones]
[get,home]
relational query (SQL=>rdb)
[db][get,people]
[at,[db][get,phones],
[get,person_id][is,[eq,[get,id]]]]
[get,home]
A query language compiles to the most aligned model-ADT. The generated bytecode is translated to mm-ADT via rewrites.
Query languages: SQL, Gremlin, MongoDB Query, REST API, QQL (quantum)
mm-ADT-bc: model-ADT definition bytecode and model-ADT query bytecode
X must be a subset of Y to be embedded.
All embeddings are injective. X embedded in Y states that there exists a bijection from X to some subset of Y.
Part 2
Universal Processing Model
mm-ADT Bytecode
Query Instructions
[define,person,[name:@str,age:@int]]
[define,db,[people:@person*]]
What are the ages of the people named marko?
[db] :@obj{0} => @db
[get,people] :@db => @person*
[is, :@person => @person?
[get,name] :@person => @str
[eq,marko]] :@str => @bool
[get,age] :@person => @int
mm-ADT Bytecode
Query Execution
What are the ages of the people named marko?
[Figure: [db] yields [people:…]. [get,people] yields [name:marko,…], [name:stephen,…], [name:kuppitz,…]. Inside [is,…], [get,name] maps [name:marko,…] to marko and [eq,marko] maps it to true, so [name:marko,…] passes. [get,age] yields 29.]
Universal computing via streams. Real-time, near-time, batch-time. Single machine, cluster-oriented. RAM, disk-based.
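The query above can be traced with Python generators, which pull one object at a time through the pipeline. This is a sketch under assumptions: the helper names are hypothetical, records are dicts, and the ages for stephen and kuppitz are invented because the slides elide them.

```python
db = {"people": [
    {"name": "marko", "age": 29},
    {"name": "stephen", "age": 30},   # age invented for the example
    {"name": "kuppitz", "age": 31},   # age invented for the example
]}

def get(key):
    """[get,key]: emit the value under key, flattening lists into the stream."""
    def step(stream):
        for obj in stream:
            value = obj[key]
            if isinstance(value, list):
                yield from value
            else:
                yield value
    return step

def is_eq(key, expected):
    """[is,[get,key][eq,expected]]: keep only objects whose key equals expected."""
    def step(stream):
        return (obj for obj in stream if obj[key] == expected)
    return step

def run(start, *steps):
    """Chain the steps lazily; nothing executes until the result is iterated."""
    stream = iter([start])
    for step in steps:
        stream = step(stream)
    return stream

query = (get("people"), is_eq("name", "marko"), get("age"))
```

Iterating run(db, *query) yields the single age 29; no intermediate list of people is ever materialized.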
Stream Ring Theory
An Algebraic Ring for Stream Computing
f*g “Multiplication” is instruction composition
f+g “Addition” is stream branching (clone/split)
-f “Negative” inverts the object quantifier
0 [_]{0}: @obj* => @obj{0}
1 [_]{1}: @obj* => @obj*
Axioms
f+(g+h) = (f+g)+h // addition associativity
f*(g*h) = (f*g)*h // multiplication associativity
f*(g+h) = (f*g)+(f*h) // left distributive
(f+h)*g = (f*g)+(h*g) // right distributive
f*0 = 0 // multiplicative zero
f*1 = f // multiplicative one
f+0 = f // additive zero
f+1 = f+1 // additive one
f-f = f+(-f) = 0 // negative
The stream ring is the product of the quantifier ring and the instruction ring.
Turing Complete: https://zenodo.org/record/2565243
not XOR
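A few of these axioms can be checked numerically with a toy model. The Python below is a sketch under assumptions, not the formal construction: a stream is a dict from object to integer quantifier, an instruction maps one object to such a dict, and all helper names are hypothetical.

```python
from collections import defaultdict

def normalize(stream):
    """Drop objects whose quantifier is 0."""
    return {obj: q for obj, q in stream.items() if q != 0}

def apply(inst, stream):
    """Run inst on every object, multiplying quantifiers along the way."""
    out = defaultdict(int)
    for obj, q in stream.items():
        for obj2, q2 in inst(obj).items():
            out[obj2] += q * q2
    return normalize(out)

def mult(f, g):
    """f*g: instruction composition (f, then g)."""
    return lambda obj: apply(g, f(obj))

def add(f, g):
    """f+g: branch the stream into f and g, then merge the results."""
    def branch(obj):
        out = defaultdict(int)
        for part in (f(obj), g(obj)):
            for obj2, q in part.items():
                out[obj2] += q
        return normalize(out)
    return branch

def neg(f):
    """-f: invert the quantifier of every emitted object."""
    return lambda obj: {o: -q for o, q in f(obj).items()}

zero = lambda obj: {}            # [_]{0}
one = lambda obj: {obj: 1}       # [_]{1}

inc = lambda obj: {obj + 1: 1}                        # a map
dbl = lambda obj: {obj * 2: 1}                        # another map
even = lambda obj: {obj: 1} if obj % 2 == 0 else {}   # a filter

stream = {1: 1, 2: 1, 3: 2}
left = apply(mult(inc, add(dbl, even)), stream)
right = apply(add(mult(inc, dbl), mult(inc, even)), stream)
```

Here left and right are equal, which is the left distributive axiom on this sample stream, and apply(add(inc, neg(inc)), stream) comes out empty, matching f-f = 0.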
Stream Ring Theory
Addition, Multiplication, and Distribution
[Figure: stream diagrams for a + b, a * b * c, and a * b * (c + (d*e)) * f, and for the two distributive laws]
(a+b)*c = (a*c)+(b*c)
a*(b+c) = (a*b)+(a*c)
@inst~a * @inst~b = ~a~b
@inst~a + @inst~b = [branch,~a,~b]
@inst~a{x,y} * @inst~b{w,z} = ~a~b{x*w,y*z}
@inst~a{x,y} + @inst~b{w,z} = [branch,~a,~b]{x+w,y+z}
Instruction and object quantifiers must be from the same algebraic ring as instruction quantifiers modulate object quantifiers.
Stream Ring Theory
Additive and Multiplicative Identities
[Figure: stream diagrams for a + 0 = a and a * 1 = a, where a(x) = y]
a + 0 = a
a * 1 = a
0 = [_]{0} (filter)
1 = [_]{1} (identity)
Given two filters:
a: x => x?
b: x => x?
Set union a ∪ b: a + b - ab = x?
x(a + b - ab) = xa + xb - x(ab) = xa + xb + -x(ab)
0 + 0 + 0 = 0
x + 0 + 0 = x
0 + x + 0 = x
x + x + -x = x
Result: x{0,1} => x?
Multi-set union of a and b: a + b = x{0,2}
x(a + b) = xa + xb
0 + 0 = 0
0 + x = x
x + 0 = x
x + x = x{2}
Result: x{0,2}
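The two case tables can be reproduced by plain arithmetic on filter outcomes. In this Python sketch, 1 means the filter passed x and 0 means it filtered x out; the function names are hypothetical.

```python
from itertools import product

def set_union(a, b):
    """a + b - ab: x survives at most once."""
    return a + b - a * b

def multiset_union(a, b):
    """a + b: x survives once per passing branch."""
    return a + b

# All four combinations of the two filter outcomes: (0,0), (0,1), (1,0), (1,1).
cases = list(product((0, 1), repeat=2))
set_counts = [set_union(a, b) for a, b in cases]
multiset_counts = [multiset_union(a, b) for a, b in cases]
```

The set union counts stay within {0,1}, while the multi-set union counts reach 2 when both filters pass, which is the x{0,2} quantifier.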
Stream Ring Theory
Binomial and Multinomial Expansions
(a+b)*(b+c) = ab + ac + b² + bc (FOIL method)
[Figure: the branching stream (a+b)*(b+c) drawn alongside its expanded form]
Part 3
Universal Instruction Set
mm-ADT Instruction Set
Instruction Types
inst: map, filter, flatmap, barrier, reduce, branch, sideeffect, initial, terminal, machine
Type checking includes both object type and quantifier.
flatmap : @obj => @obj*
map : @obj => @obj
filter : @obj => @obj?
barrier : @obj* => @obj*
reduce : @obj* => @obj
initial : @obj{0} => @obj*
terminal: @obj* => @obj{0}
Algebraic structures labeled on the slide: ring, ring, commutative ring, ring, near-ring.
Machine instructions: virtual machine operational instructions.
mm-ADT Instruction Set
Instruction Types
MAP
[get,@obj]: @rec => @obj
[path] : @obj => @list
[type] : @obj => @obj
…
REDUCE
[count]: @obj* => @int
[max] : @num* => @num
[sum] : @num* => @num
…
BARRIER
[dedup,@obj*] : @obj* => @obj*
[order,[lt|gt,@obj]*]: @obj* => @obj*
[range,@int,@int] : @obj* => @obj*
…
SIDE-EFFECT
[put,@obj,@obj] : @seq => @seq
[drop,@obj] : @seq => @seq
[fit,@int?,@obj]: @list => @list
…
FILTER
[_] : @obj* => @obj*
[and,@obj{2,}]: @obj => @obj?
[or,@obj{2,}] : @obj => @obj?
[is,@bool] : @obj => @obj?
…
BRANCH
[repeat,@inst{1,3}] : @obj => @obj*
[ifelse,@bool,@inst{2}]: @obj => @obj*
[choose,[@bool,@inst]+]: @obj => @obj*
[coalesce,@inst{2,}] : @obj => @obj*
…
INITIAL
TERMINAL
MACHINE
[ref,@obj*]: @obj => @obj*
SELECT name FROM people WHERE age < 29
[db][get,people]
[is,[get,age][lt,29]]
[get,name]
g.V(1).out('created')
.in('created')
.hasId(neq(1))
.groupCount().by('name')
[db][get,V]
[is,[get,id][eq,1]]
[get,outE]
[is,[get,label][eq,created]]
[get,inV]
[get,inE]
[is,[get,label][eq,created]]
[get,outV]
[is,[get,id][neq,1]]
[group,[get,name],[_],[count]]
person(name:marko){friends{name}}
[db][get,people]
[is,[get,name][eq,marko]]
[get,friends]
[get,name]
Query Language Compilation
Generating model-ADT Bytecode
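To make the compilation step concrete, here is a deliberately tiny Python sketch that turns one shape of SQL query into the bytecode string from the SQL example. It is hypothetical: compile_select and its regular expression are inventions for illustration, and a real compiler would parse SQL properly rather than pattern-match it.

```python
import re

# SQL comparison operators mapped to the bytecode predicates used on the slides.
OPS = {"<": "lt", ">": "gt", "=": "eq"}

# Only matches: SELECT <col> FROM <table> WHERE <col> <op> <value>
PATTERN = re.compile(
    r"SELECT\s+(\w+)\s+FROM\s+(\w+)\s+WHERE\s+(\w+)\s*(<|>|=)\s*(\w+)\s*$",
    re.IGNORECASE,
)

def compile_select(sql):
    """Compile a single-table, single-predicate SELECT into mm-ADT-bc text."""
    match = PATTERN.match(sql.strip())
    if match is None:
        raise ValueError("unsupported query shape")
    column, table, where_col, op, value = match.groups()
    return f"[db][get,{table}][is,[get,{where_col}][{OPS[op]},{value}]][get,{column}]"

bytecode = compile_select("SELECT name FROM people WHERE age < 29")
```

The resulting string is [db][get,people][is,[get,age][lt,29]][get,name], the same bytecode shown for the SQL example.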
Query Language Compilation
Query Language and Storage System Decoupling
[model,doc=>mm] [model,rdb=>mm] [model,pg=>mm]
model-ADT embeddings use
rewrite rules to encode one
model-ADT within another.
This enables query
bytecode intended for one
model-ADT to execute
against another model-ADT.
Query languages compile to
their respective model-ADT
bytecode specification.
doc=>rdb
doc=>pg
rdb=>doc
rdb=>pg
pg=>doc
pg=>rdb
mm-ADT objects or language translation
mm-ADT Components
Storage System, Processing Engine, and Query Language
[model,wc=>mm,…]
[model,kv=>mm,…]
[model,pg=>mm,…]
An mm-ADT storage system
publishes the model-ADTs it
supports (and is optimized for).
Within a particular model-ADT
context, the database will produce
respective mm-ADT objects for
the processor to consume.
For all unsupported models, a
model-ADT embedding can be
used to translate to a supported
model-ADT. However, while
semantically correct, the storage
system might lack appropriate
secondary structures.
An mm-ADT processing engine is
able to accept arbitrary mm-ADT
bytecode and generate an
execution pipeline that can read/
write mm-ADT objects to/from the
underlying mm-ADT compliant
storage system.
mm-ADT processing engines are
agnostic to the model-ADT
bytecode encoding. They must
faithfully implement every
instruction in the mm-ADT
instruction set.
An mm-ADT query language has a
compiler to a specific model-ADT
bytecode specification.
The processing engine translates
the language compiler’s model-
ADT bytecode into an execution
pipeline. At runtime, the processor
and storage system communicate
via mm-ADT objects to ultimately
yield the answer to the query.
[db][get,outE]
[get,inV]
[get,name]
mm-ADT Compliant Components
Storage System, Processing Engine, and Query Language
compilation: @str => @inst*
g.V().out().values('name') => [db][get,outE][get,inV][get,name]
rewrite : @inst* => @inst*
[db][get,outE][get,inV][get,name] => [db][get,name]
evaluation : @inst* => @obj*
[db][get,name] => marko,kuppitz,stephen
Common model-ADTs
Key/Value Store
[db][count] // 0
[db][put,[a,[name:marko,age:29]]]
[put,[b,[name:vadas,age:27]]]
[put,[c,[name:josh,age:32]]]
[put,[d,[name:peter,age:10]]]
[db][put,[d,[name:peter,age:35]]] // updated d value
[db][count] // 4
[db][is,[get,0][eq,a]]
[get,1] // [name:marko,age:29]
[db][group,[get,1]
[get,age]
[ifelse,[is,[gt,30]],
[map,old],
[map,young]],
[map,1],
[sum]] // [old:2,young:2]
GET/PUT semantics
mm-ADT kv=>mm Bytecode
Key/Value Store
Primary and Secondary Structures
[model,kv=>mm,
[define,k,@num|@str]
[define,v,@bool|@num|@str|@list|@rec]
[define,kv,[@k,@v]
-> [put,0,@k] => [error]
-> [drop,0|1] => [error]]
[define,db,kv*
-> [put,@kv[@k~k,@v~v]] => [coalesce,
[is,[get,0][eq,~k]][put,1,~v],
[put,[~k,~v]]]
-> [dedup,[get,0]] =>
-> [count] => [ref,@int <= [db][count]]
-> [is,[get,0][eq,@k~k]] => [ref,@kv? <= [db][is,[get,0]
[eq,~k]]]
-> [group,@inst*{3}~mr] => [ref,@rec <= [db][group,~mr]]]]
Secondary structures: sorted keys, key counter, key index, map-reduce framework
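The [put] rewrite uses [coalesce] to update an existing key and fall back to an insert. A rough Python equivalent, with hypothetical helper names and a plain list of [key, value] pairs standing in for the store, replays the key/value session from the earlier slide.

```python
def put(db, key, value):
    """First coalesce branch: overwrite the value if the key exists.
    Second branch: otherwise append a new [key, value] pair."""
    for pair in db:
        if pair[0] == key:
            pair[1] = value
            return db
    db.append([key, value])
    return db

def group_old_young(db):
    """Group values by age (> 30 is old), emit 1 per value, and sum per group."""
    counts = {}
    for _, value in db:
        label = "old" if value["age"] > 30 else "young"
        counts[label] = counts.get(label, 0) + 1
    return counts

db = []
put(db, "a", {"name": "marko", "age": 29})
put(db, "b", {"name": "vadas", "age": 27})
put(db, "c", {"name": "josh", "age": 32})
put(db, "d", {"name": "peter", "age": 10})
put(db, "d", {"name": "peter", "age": 35})  # updates d instead of adding a fifth pair
```

After these calls the store holds four pairs and the grouping yields two old and two young entries, in line with the comments in the bytecode session.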
Document Database
db.people.insertOne({name:"marko",
age:29,
projects:[{name:"lop",lang:"java"}],
knows:[{name:"josh"},{name:"vadas"}]})
db.people.find({"knows.name":{$regex:/^j.*/}},{name:1,age:1})
[db][get,people]
[put,[name:marko,
age:29,
projects:[[name:lop,lang:java]],
knows:[[name:josh],[name:vadas]]]]
[db][get,people]
[is,[get,knows][get,name][regex,'^j.*']]
[ref,[name:@str,age:@int]]
MongoDB
JSON QueryDoc
mm-ADT
doc=>mm Bytecode
Document Database
Primary and Secondary Structures
[model,doc=>mm,
[define,dval,@bool|@num|@str|@dobj|@dlist
-> [is,[get,@str~x][eq,@dval~y]] => [ref,@dval? <= […]]]
[define,dlist,[(@dval)*]]
[define,dobj,[(@str:@dval)*]]
[define,doc,@dobj&[_id:@str]]
[define,collection,@doc*
-> [dedup,[get,_id]] =>
-> [is,[get,_id][eq,@str~x]] => [ref,@doc? <= […][is,[get,_id][eq,~x]]]
-> [count] => [ref,@int <= […][count]]
-> [group,@inst*{3}~mr] => [ref,@rec <= […][group,~mr]]]
[define,db,[(@str:@collection)*]]]
Secondary structures: recursive pattern matching, index by _id, doc counter, map-reduce framework
Property Graph Database
[model,pg=>mm,
[define,tokens,id|label|inE|outE|outV|inV]
[define,properties,[(@str&not(tokens):@str|@num|@bool)*]]
[define,element,@properties&[id:@obj~x,label:@str]
-> [drop,id|label] => [error]
-> [put,id|label,@obj] => [error]]
[define,vertex,@element&[inE:@edge*,outE:@edge*] <= [db][is,[get,id][eq,~x]]
-> [get,outE] => [ref,@edge* <= […][get,outE]
-> [put,@edge~x] => [branch,[put,~x],[get,inV][get,inE][put,~x]]
-> [is,[get,@str~y][eq,@obj~z]] => [ref,@edge* <= […][is,[get,~y][eq,~z]]
-> [count] => [ref,@int <= […][count]]]
-> [is,[get,label][eq,@str~y]] => [ref,@edge* <= […][is,[get,label][eq,~y]]
-> [count] => [ref,@int <= […][count]]
-> [get,inV] => [ref,@vertex* <= […][get,inV]
-> [get,(label|id)~y] => [ref,@str|@obj <= […][get,~y]]
-> [count] => [ref,@int <= […][count]]]]]]
[define,edge,@element&[outV:@vertex,inV:@vertex]
-> [put,outV|inV,@vertex] => [error]
-> [drop,outV|inV] => [error]]
[define,db,@vertex* <= [db]
-> [dedup,[get,id]] =>
-> [get,outE] => [ref,@edge* <= […][get,outE]
-> [dedup,[get,id]] => ]
-> [is,[get,id][eq,@obj~x]] => [ref,@vertex? <= […][is,[get,id][eq,~x]]]
-> [count] => [ref,@int <= […][count]]
-> [get,(inE|outE)~x] => [ref,@edge* <= […][get,~x]
-> [count] => [ref,@int <= […][count]]
-> [is,[get,label][eq,@str~y]] => [ref,@edge* <= […][is,[get,label][eq,~y]]
-> [count] => [ref,@int <= […][count]]]]
Property Graph Database
Primary and Secondary Structures
Secondary structures: vertex-centric edge property indices, vertex-centric edge label indices, adjacent vertex id and label denormalization, vertex count, edge count, edge count by label
RDF Triple Store
[model,rdf=>mm,
[define,uri,@str&regex('//^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(?([^#]*))?(#(.*))?')]
[define,lit,@str|@num|@bool]
[define,bnode,@str&regex('_:.*')]
[define,s,@uri|@bnode]
[define,p,@uri]
[define,o,@uri|@bnode|@lit]
[define,stmt,[s:@s,p:@p,o:@o]
-> [put,s|p|o,@object] => [error]
-> [drop,s|p|o] => [error]]
[define,db,stmt*
-> [dedup] =>
-> [count] => [ref,@int <= […][count]
-> [is,[get,s][eq,@s~x]] => [ref,@stmt? <= […][is,[get,s][eq,~x]]
-> [is,[get,p][eq,@p~y]] => [ref,@stmt? <= […][is,[get,s][eq,~x]][is,[get,p][eq,~y]]]
-> [at,[db],[is,…] => [ref,@stmt? <= […]
-> [is,[get,o][eq,@p~y]] => [ref,@stmt? <= […][is,[get,s][eq,~x]][is,[get,o][eq,~y]]]]
-> [is,[get,p][eq,@p~x]] => [ref,@stmt? <= […][is,[get,p][eq,~x]]
-> [dedup]
-> [count] => [ref,@int <= […][count]]]
-> [is,[get,o][eq,@o~x]] => [ref,@stmt? <= […][is,[get,o][eq,~x]]]]
RDF Triple Store
Primary and Secondary Structures
Secondary structures: denormalized length-2 paths, unique predicate count, statements indexed by subject, statement count, statements indexed by subject/predicate
Next Generation Models
mm-ADT is for cluster-oriented, general purpose computing.
Marko A. Rodriguez
Daniel Kuppitz
Stephen Mallette
Patronage from RReduX and DataStax
Credits

mm-ADT: A Multi-Model Abstract Data Type

  • 1. mm-ADT: A Multi-Model Abstract Datatype. Dr. Marko A. Rodriguez, Founder, RReduX Inc., and Project Management Committee, Apache TinkerPop. Rodriguez, M.A., “mm-ADT: A Multi-Model Abstract Datatype,” ApacheCon 2019, Las Vegas, NV, September 2019. Funded by: RReduX. Collaborators: Daniel Kuppitz and Stephen Mallette.
  • 2. WARNING: This Slide Presentation Should Not Be Used as a Reference. As of September 2019, there exists a rough draft of the mm-ADT specification (0.1-alpha) and a Java-based prototype of the mm-ADT-bc compiler. The specifics presented here are likely to change as the project matures. The goal is to release the following artifacts by early 2020. 1.) A stable 1.0 mm-ADT specification document. 2.) A Java-based mm-ADT-bc compiler and virtual machine. 3.) A basic reference implementation of an mm-ADT database.
  • 3. Database Datatypes WIDE-COLUMN DATABASE RELATIONAL DATABASE KEY/VALUE DATABASE PROPERTY GRAPH DATABASE DOCUMENT DATABASE RDF DATABASE
  • 4. Database Components STORAGE SYSTEM PROCESSING ENGINE QUERY LANGUAGE
  • 5. Universal Database Components UNIVERSAL DATA STRUCTURE UNIVERSAL PROCESSING MODEL UNIVERSAL INSTRUCTION SET 80% solid. 90% solid. 60% solid. as of Sept, 2019 Presentation is primarily on this topic.
  • 10. mm-ADT Compliant: The Benefit to Component Providers. QUERY LANGUAGE PROVIDERS: Your query language’s queries can execute real-time, near-time, or batch-time over any database. PROCESSING ENGINE PROVIDERS: Your processing engine can be programmed by any query language and compute over any database. STORAGE SYSTEM PROVIDERS: Your database can be processed by many processors whose queries are defined by many different query languages. Figure labels: Multi-model, Programmability, Larger user base.
  • 11. mm-ADT A Multi-Model Abstract Datatype Any Data Structure/Language Storage/Processing Agnosticism Universal Operators GET PUT FILTER SELECT PROJECT ORDER DEDUP GROUP COUNT SUM MEAN COALESCE REPEAT BRANCH ARITHMETIC RECORD LAYOUT INDEX STRUCTURES PARTITIONING DENORMALIZATIONS SORT ORDERS … PULL-BASED ITERATION PUSH-BASED REACTIVE MESSAGE PASSING LINEAR SCAN/FILTER LAZY/STRICT EVALUATION QUANTUM WAVE INTERFERENCE KEY/VALUE JSON/XML DOCUMENT PROPERTY/RDF GRAPH RELATIONAL WIDE-COLUMN QUANTUM … SQL SPARQL CYPHER GREMLIN GRAPHQL CQL QUERY DOCUMENT QUANTUM UNITARY MATRICES yes. any data structure. yes. any data query language. yes. any algorithm. yes. addition. :)
  • 12. mm-ADT Requirements A Cluster-Oriented Bytecode and Virtual Machine A type system for capturing database models and schemas. Datatypes, composite structures, constraints, foreign keys, cardinalities, dependencies. A database perspective w/ loose coupling for future use cases. Indices, denormalizations, read/write costs, transactions, offloading computations, complex data access paths, multi-user, distributed data, data locality, real-time/batch executions. A bytecode foundation for use across software and languages. Compatible with Java, C++, Go, etc.-based databases. Reasonable compilation strategies to/from: SQL, SPARQL, Cypher, Gremlin, GraphQL, QueryDocs, …future. A universal processing model able to capture various strategies. Single-threaded, single-machine queries. Distributed, cluster-oriented queries. Lazy and space-efficient for main-memory queries (real-time). Eager and time-efficient for disk-based queries (batch). Turing-complete for any known query/algorithm.
  • 14. mm-ADT Bytecode Uses: model-ADT Definitions vs. model-ADT Queries. (1) Model definition supported by the storage system: created by the engineers developing an mm-ADT compliant storage system. Defines the storage system’s abstract data type. A complex bytecode, but even for feature-rich storage systems, it’s typically less than 100 lines of code (the most complex model to date is 65 lines of code). Example: [model,pg=>mm, [define,vertex,…] [define,edge,…] [define,db,…]]. (2) Sub-model definitions describing the user’s domain of discourse: created by the storage system user. A subtype of the storage system’s model-ADT. Best understood as the user’s application’s “schema.” Can be generated by the storage system via its schema language. Example: [model,social=>pg, [define,person,vertex&…] [define,likes,edge&…] [define,db,…]]. (3) Instance data manipulation and analysis: created by higher-level query languages. However, it can be written directly by a human user. A simple bytecode with a similar structure to the popular query language of the respective model-ADT. Example: [db][get,age] [dedup] [sum]. When compiling query bytecode, definitions are used for type inference/checking, query optimization, and cross-model embeddings. The language used for all examples is the (somewhat) human readable/writable bytecode language called mm-ADT-bc.
  • 15. Built-In Datatypes: Fundamental Structures Defined Outside of mm-ADT. @obj; @bool: true|false; @int: …,-2,-1,0,1,2,… (-infty..infty); @real: floating point (-infty..infty); @str: ‘marko’ (a char is typically a one element str); @list: [@obj*], mixed-typed array; @rec: [(@obj:@obj)*], key/value associative array; @inst: [@str,@obj*], bytecode; @q: object quantifier; @db: database root structure; @num, @seq. Other figure labels: default reserved type aliases.
  • 17. Custom Datatypes: Type Definitions. [define,person,[name:@str,age:@int]] (opcode: define; symbol: person; structure: [name:@str,age:@int]). mm-ADT types are filter bytecode (predicates). If an object is unfiltered by the bytecode, then the object is that type. @str: [is,[type][eq,str]] : @obj => @str?. @int: [is,[type][eq,int]] : @obj => @int?. @person: [is,[get,name][type][eq,str]][is,[get,age][type][eq,int]] : @obj => @person?. Expanded form: [define,person,[name:[is,[type][eq,str]],age:[is,[type][eq,int]]]]. Figure label: @inst.
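The idea that a type is a filter can be sketched outside mm-ADT with ordinary predicates. The Python below is a rough analogy only: `is_str`, `is_int`, `record`, and `filter_type` are hypothetical helper names, and tolerating extra record keys is an assumption the slide does not settle.

```python
def is_str(obj):
    # analogue of [is,[type][eq,str]]
    return isinstance(obj, str)


def is_int(obj):
    # analogue of [is,[type][eq,int]]; bool is a subclass of int in Python, so exclude it
    return isinstance(obj, int) and not isinstance(obj, bool)


def record(**fields):
    """Build a record-type predicate from per-field predicates."""
    def check(obj):
        # extra keys are tolerated here, which is an assumption of this sketch
        return isinstance(obj, dict) and all(
            key in obj and pred(obj[key]) for key, pred in fields.items())
    return check


# analogue of [define,person,[name:@str,age:@int]]
person = record(name=is_str, age=is_int)


def filter_type(pred, objs):
    """Objects that pass the filter are, by definition, of that type."""
    return [o for o in objs if pred(o)]
```

An object "is a person" exactly when `person(obj)` lets it through, which mirrors the slide's `@obj => @person?` signature: zero or one object comes out.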
  • 18. Custom Datatypes: Type Refinement. [define,probability,@real&gte(0.0)&lte(1.0)] [define,varchar255,@str&[is,[len][lte,255]]] [define,short,@int&gte(-32767)&lte(32767)] [define,pair,[@obj,@obj]] [define,triple,[@obj,@obj,@obj]]. Anonymous type: [define,person,[name:@str,age:@int&gte(0)]]. Named type: [define,years,@int&gte(0)] [define,person,[name:@str,age:@years]]. Anonymous types are the norm in mm-ADT.
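Refinement with `&` can be read as conjunction of predicates. The following Python sketch mirrors a few of the slide's definitions under that reading; the helper names (`both`, `gte`, `lte`) are hypothetical and stand in for the bytecode, they are not an mm-ADT API.

```python
def is_str(obj):
    return isinstance(obj, str)


def is_int(obj):
    return isinstance(obj, int) and not isinstance(obj, bool)


def is_real(obj):
    return isinstance(obj, float)


def gte(bound):
    return lambda x: x >= bound


def lte(bound):
    return lambda x: x <= bound


def both(*preds):
    """Analogue of '&': all predicates must pass, checked left to right."""
    return lambda x: all(p(x) for p in preds)


# [define,probability,@real&gte(0.0)&lte(1.0)]
probability = both(is_real, gte(0.0), lte(1.0))
# [define,years,@int&gte(0)]
years = both(is_int, gte(0))
# [define,varchar255,@str&[is,[len][lte,255]]]
varchar255 = both(is_str, lambda s: len(s) <= 255)
```

Because `all` short-circuits, the base-type check runs first and the range checks never see a value of the wrong type.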
  • 20. Custom Datatypes: Type Quantifiers. [define,person,[name:@str,age:@int&gte(0),friend:@person?]] Quantifiers have the form {min,max}, with shorthands: * : {0,infty}; + : {1,infty}; ? : {0,1}; {2} : {2,2}; {3,} : {3,infty}; {3,5} : 3 to 5. [define,none,@obj{0}] [define,some,@obj{1}] // @obj [define,maybe,@obj{0,1}] // @obj? @none | @some = @maybe: {0,0} + {1,1} = {0,1}. Quantifiers are any algebraic ring with unity: [mult][add][sub][zero][one]. All standard query languages are bound to the natural numbers with standard multiplication and addition definitions. Real Number Quantifiers: [define,q,@real&gte(0.0)&lte(1.0)], for example @obj{0.92}. Quantifiers can be real numbers (for example, values between 0.0 and 1.0). Real quantifiers can model energy diffusions, fuzzy set semantics, probabilistic/stochastic behavior. Unitary Matrix Quantifiers: [define,q,@unitary], for example @obj{[1,0,1,0]}. Quantifiers can be matrices composed of complex numbers. Constructive and destructive wave interference patterns can be incorporated into a query. This does not radically alter the bytecode as quantifiers are fundamental to mm-ADT. Examples of quantified types: @str?, @person?, @vertex?.
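The shorthand table on this slide is mechanical enough to sketch as a small expander. The Python below is an illustration for natural-number quantifiers only; the function name `expand_quantifier` is hypothetical, and the real and unitary-matrix quantifiers are not covered.

```python
import math
import re


def expand_quantifier(q):
    """Return (min, max) for a quantifier written in the slide's shorthand."""
    shorthands = {"*": (0, math.inf), "+": (1, math.inf), "?": (0, 1)}
    if q in shorthands:
        return shorthands[q]
    m = re.fullmatch(r"\{(\d+)(,(\d*))?\}", q)
    if m is None:
        raise ValueError(f"unrecognized quantifier: {q!r}")
    low = int(m.group(1))
    if m.group(2) is None:   # {2}   -> exactly 2
        return (low, low)
    if m.group(3) == "":     # {3,}  -> 3 or more
        return (low, math.inf)
    return (low, int(m.group(3)))  # {3,5} -> 3 to 5
```

For instance, `expand_quantifier("?")` gives `(0, 1)`, the range the slide writes as `{0,1}`.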
  • 21. Custom Datatypes: Subtypes. [define,person,[name:@str,age:@int&gte(0),friend:@person?]] (supertype) [define,loner,@person&[friend:@person{0}]] (loner <: person). Type composition: [name:@str,age:@int&gte(0),friend:@person?]&[friend:@person{0}] => [name:@str,age:@int&gte(0),friend:@person{0}]. Quantifier composition: {x1,x2}&{y1,y2} => {x1*y1,x2*y2}; {x1,x2}|{y1,y2} => {min(x1,y1),max(x2,y2)}; for example, {0,1}&{0,0} => {0,0} => {0}. Composition through & and | is defined for each type with default definitions from the underlying fundamental type. Note that there are also axioms for the composition of a type’s instruction rewrites. Not discussed in this presentation.
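The two quantifier composition rules on this slide are simple arithmetic and can be checked directly. This Python sketch represents a quantifier as a `(min, max)` tuple of finite natural numbers; the function names `q_and` and `q_or` are hypothetical.

```python
def q_and(x, y):
    """{x1,x2}&{y1,y2} => {x1*y1, x2*y2}"""
    # finite bounds only: an unbounded max would need special handling,
    # because float('inf') * 0 is nan in Python
    return (x[0] * y[0], x[1] * y[1])


def q_or(x, y):
    """{x1,x2}|{y1,y2} => {min(x1,y1), max(x2,y2)}"""
    return (min(x[0], y[0]), max(x[1], y[1]))


# friend:@person? composed with friend:@person{0}, as in the loner definition
loner_friend = q_and((0, 1), (0, 0))
# @none | @some
maybe = q_or((0, 0), (1, 1))
```

Here `loner_friend` evaluates to `(0, 0)`, matching the slide's `{0,1}&{0,0} => {0,0}`, and `maybe` evaluates to `(0, 1)`, the `?` quantifier.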
  • 22. Custom Datatypes: Type Dependents. [define,person,[name:@str,age:@int&gte(0),friend:@person?]] [define,older,@person&[age:@int&gte(0)~x, friend:@person&[age:@int&gte(0)&lt(~x)]?]] (the friend field is an anonymous subtype). [define,complex,[@real~x1,@real~x2] -> [add,@complex&[@real~y1,@real~y2]] => [map,@complex&[~x1+~y1,~x2+~y2]] -> [mult,@complex&[@real~y1,@real~y2]] => [map,@complex&[(~x1*~y1)-(~x2*~y2), (~x1*~y2)+(~x2*~y1)]]] Infix arithmetic notation is still being designed. The primary issue is * and + parser ambiguities due to quantifier shorthands.
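The `complex` type binds the two components of each operand to variables and rewrites `add` and `mult` into component arithmetic. A plain Python rendering of that arithmetic is sketched below, using the standard complex product, whose imaginary part is the sum x1*y2 + x2*y1; the function names are hypothetical, and Python's built-in `complex` is used only as a cross-check.

```python
def c_add(x, y):
    # [x1,x2] + [y1,y2] = [x1+y1, x2+y2]
    return (x[0] + y[0], x[1] + y[1])


def c_mult(x, y):
    # (x1 + i*x2)(y1 + i*y2) = (x1*y1 - x2*y2) + i*(x1*y2 + x2*y1)
    return (x[0] * y[0] - x[1] * y[1], x[0] * y[1] + x[1] * y[0])


a, b = (1.0, 2.0), (3.0, -1.0)
expected = complex(*a) * complex(*b)  # built-in arithmetic as a reference
product = c_mult(a, b)
```

With these operands, `product` and `(expected.real, expected.imag)` agree.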
  • 23. mm-ADT Datatypes: The General Structure of a Type. A{x,y} <= [i]* -> [f]+ => [j]* -> [g]+ => [k]* -> [h]+ => [l]* Components: type value/structure (A), type quantifier ({x,y}), mapsFrom token (<=), instance access ([i]*), instruction token (->), instruction pattern ([f]+), mapsTo token (=>), instruction rewrite ([j]*); the -> … => … entries are the type instructions. The type is represented in mm-ADT-bc syntax. All mm-ADT bytecode generating languages leverage these components. Inspired by OOP where methods/instructions are grouped by domain class/type. Types with instructions denote function branches. Paths from type to type via instructions denote function composition. Examples: A <= [i] -> [f] => [B <= [j]] -> [g] => [C <= [k]]; A -> [f] => B -> [g] => C; A -> [f] => B -> [h] => D. Function view: def f(a) = b; spec f : A → B; g : A → C; f · h : A → D. Figure labels: +, *, domain, range.
  • 24. mm-ADT Datatypes: Instance Data Access Path. [define,person, [name:@str~x,age:@int] <= [db][get,people][is,[get,name][eq,~x]]] The A <= […] clause is the instance access (canonical representation). Instance: @person&[name:marko,age:29] (all constant values). Type: @person&[age:29] (name unbound; “people 29 years of age”). Reference: @person&[name:marko] (accessible via <=). mm-ADT is a cluster-oriented programming language where data access has different costs depending on locality.
  • 25. mm-ADT Datatypes: Instruction Rewrites. [define,person,[name:@str,age:@int]] [define,db,person* -> [order,[gt,[get,age]]] => -> [dedup,[get,name]] => -> [count] => [ref,@int <= [db][count]] -> [is,[get,name][eq,@str~x]] => [ref,@person? <= [db][is,[get,name][eq,~x]]]] Rewrite kinds: no-op ([order] and [dedup]), aggregate ([count]), index lookup ([is]). All [ref] instructions are resolved by the storage system or processing engine and are used to access secondary structures. Submitted bytecode: [db][is,[get,name][eq,marko]] (O(n) linear scan) => rewritten bytecode: [ref,@person? <= [db][is,[get,name][eq,marko]]] (O(log n) index query).
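To make the rewrite step concrete, here is a minimal sketch in which bytecode is modeled as nested Python lists and a single rule turns [db][is,[get,name][eq,x]] into a [ref,...] instruction. Everything here is an assumption for illustration: the list encoding, the `rewrite` and `evaluate` functions, the `name_index` label, and the sample rows. A Python dict also gives a hash lookup rather than the O(log n) tree index named on the slide.

```python
# Sample rows (illustrative data only).
PEOPLE = [
    {"name": "marko", "age": 29},
    {"name": "vadas", "age": 27},
    {"name": "josh", "age": 32},
]

# Stand-in for a secondary structure maintained by the storage system.
NAME_INDEX = {p["name"]: p for p in PEOPLE}


def rewrite(bytecode):
    """Replace [db][is,[get,name][eq,x]] with one [ref,...] instruction."""
    if len(bytecode) == 2 and bytecode[0] == ["db"]:
        inst = bytecode[1]
        if (len(inst) == 3 and inst[0] == "is"
                and inst[1] == ["get", "name"] and inst[2][0] == "eq"):
            return [["ref", "name_index", inst[2][1]]]
    return bytecode  # no rule matched: leave the bytecode untouched


def evaluate(bytecode):
    """Evaluate either the rewritten form (index) or the submitted form (scan)."""
    if bytecode[0][0] == "ref":
        hit = NAME_INDEX.get(bytecode[0][2])
        return [] if hit is None else [hit]
    wanted = bytecode[1][2][1]
    return [p for p in PEOPLE if p["name"] == wanted]  # linear scan


query = [["db"], ["is", ["get", "name"], ["eq", "marko"]]]
print(rewrite(query))
print(evaluate(query) == evaluate(rewrite(query)))
```

The point of the sketch is that both forms return the same objects; only the access path differs.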
  • 26. Relational model-ADT: Primary and Secondary Structures. Primary Structure: The structure associated with the database’s conceptual model. Secondary Structures: The auxiliary structures used to improve query performance and ensure data integrity. [Figure: a relational table with typed columns (str, bool, int), annotated with its secondary structures: index, schema, denormalization, sort order, unique checks (for example, < 10), and aggregates (for example, SUM=567).]
  • 27. Relational model-ADT RELATIONAL DATABASE [model,rdb=>mm, [define,value,@bool|@num|@str] [define,row,[(@str:@value)*]] [define,table,@row*] [define,db,[(@str:@table)*]]] DOMAIN SPECIFIC RELATIONAL SCHEMA [model,social=>rdb, [define,person,[name:@str?,age:@int?]] [define,persons,@person*] [define,db,[people:@persons]]]
  • 28. CREATE TABLE people ( name varchar(255), age int ) [define,person,[name:@str?,age:@int?]] [define,persons,@person*] [define,db,[people:@persons]] Relational model-ADT Primary Structure (Subtype)
  • 29. Relational model-ADT Secondary Structures (Primary Key) [define,person,[name:@str,age:@int?]] [define,persons,@person* -> [dedup,[get,name]] => ] [define,db,[people:@persons]] CREATE TABLE people ( name varchar(255), age int, PRIMARY KEY(name) )
  • 30. Relational model-ADT Secondary Structures (Sort Orders) [define,person,[name:@str,age:@int?]] [define,persons,@person* -> [dedup,[get,name]] => -> [order,[gt,[get,name]]] => ] [define,db,[people:@persons]] CREATE TABLE people ( name varchar(255), age int, PRIMARY KEY(name,ASC) )
  • 31. [define,person,[name:@str,age:@int]] [define,persons,@person* -> [dedup,[get,name]] => -> [order,[gt,[get,name]]] => ] [define,db,[people:@persons]] Relational model-ADT Secondary Structures (Null Values) CREATE TABLE people ( name varchar(255), age int NOT NULL, PRIMARY KEY(name,ASC) )
  • 32. Relational model-ADT Secondary Structures (Check Values) [define,person,[name:@str,age:@int&gte(0)]] [define,persons,@person* -> [dedup,[get,name]] => -> [order,[gt,[get,name]]] => ] [define,db,[people:@persons]] CREATE TABLE people ( name varchar(255), age int NOT NULL, PRIMARY KEY(name,ASC), CHECK (age >= 0) )
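One way to read the empty rewrites on [dedup,[get,name]] and [order,[gt,[get,name]]] is that the storage system already guarantees uniqueness and sort order, so those instructions have nothing left to do. The sketch below illustrates that reading with a hypothetical `People` table that enforces the primary key, the age check, and a sorted key list at insert time. The class, its methods, and the ascending order (taken from the ASC in the SQL) are assumptions, not the reference implementation.

```python
import bisect


class People:
    """Hypothetical table that maintains its constraints on insert."""

    def __init__(self):
        self._names = []  # primary keys, kept sorted
        self._rows = {}

    def insert(self, name, age):
        if not isinstance(name, str):
            raise TypeError("name must be a string")      # name:@str
        if not isinstance(age, int) or age < 0:
            raise ValueError("age must be an int >= 0")   # age:@int&gte(0)
        if name in self._rows:
            raise KeyError(f"duplicate primary key: {name}")
        bisect.insort(self._names, name)                  # keep sort order
        self._rows[name] = {"name": name, "age": age}

    def scan(self):
        # Rows come back unique by name and sorted by name,
        # so a later dedup or order step would change nothing.
        return [self._rows[n] for n in self._names]


table = People()
table.insert("marko", 29)
table.insert("josh", 32)
print([row["name"] for row in table.scan()])
```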
  • 33. [define,person,[name:@str~x, age:@int&gte(0), friend:@person&[name:@str]] <= [db][get,people] [is,[get,name] [eq,~x]]] [define,persons,@person* -> [dedup,[get,name]] => -> [order,[gt,[get,name]]] => ] [define,db,[people:@persons]] Relational model-ADT Secondary Structures (Foreign Key) CREATE TABLE people ( name varchar(255), age int NOT NULL, friend varchar(255), PRIMARY KEY(name,ASC), CHECK (age >= 0), FOREIGN KEY (friend) REFERENCES people(name) ) instance access
  • 34. [define,person,[name:@str~x, age:@int&gte(0), friend:@person&[name:@str]] <= [db][get,people] [is,[get,name] [eq,~x]] [define,persons,@person* -> [dedup,[get,name]] => -> [order,[gt,[get,name]]] => -> [is,[get,name][eq,@str~x]] => [ref,@person&[name:~x]?]] [define,db,[people:@persons]] Relational model-ADT Secondary Structures (Single Key Index) CREATE TABLE people ( name varchar(255), age int NOT NULL, friend varchar(255), PRIMARY KEY(name,ASC), CHECK (age >= 0), FOREIGN KEY (friend) REFERENCES people(name), CREATE UNIQUE INDEX name_idx ON people(name) )
  • 35. [define,person,[name:@str~x, age:@int&gte(0), friend:@person&[name:@str]] <= [db][get,people] [is,[get,name] [eq,~x]] [define,persons,@person* -> [dedup,[get,name]] => -> [order,[gt,[get,name]]] => -> [is,[get,name][eq,@str~x]] => [ref,@person&[name:~x]? -> [is,[get,age][eq,@int~y]] => [ref,@person&[name:~x,age:~y]?]] -> [is,[get,age][eq,@int~y]] => [ref,@person&[age:~y]* -> [is,[get,name][eq,@str~x]] => [ref,@person&[name:~x,age:~y]?]]] [define,db,[people:@persons]] Relational model-ADT Secondary Structures (Multi-Key Index) CREATE TABLE people ( name varchar(255), age int NOT NULL, friend varchar(255), PRIMARY KEY(name,ASC), CHECK (age >= 0), FOREIGN KEY (friend) REFERENCES people(name), CREATE UNIQUE INDEX name_idx ON people(name), CREATE INDEX name_age_idx ON people(name,age) )
  • 36. [define,person,[name:@str~x, age:@int&gte(0), friend:@person&[name:@str]] <= [db][get,people] [is,[get,name] [eq,~x]] [define,persons,@person* -> [dedup,[get,name]] => -> [order,[gt,[get,name]]] => -> [is,[get,name][eq,@str~x]] => [ref,@person&[name:~x]? -> [is,[get,age][eq,@int~y]] => [ref,@person&[name:~x,age:~y]?]] -> [is,[get,age][eq,@int~y]] => [ref,@person&[age:~y]* -> [is,[get,name][eq,@str~x]] => [ref,@person&[name:~x,age:~y]?]] -> [count] => [ref,@int <= [db][get,people] [count]]] [define,db,[people:@persons]] Relational model-ADT Secondary Structures (Aggregates) CREATE TABLE people ( name varchar(255), age int NOT NULL, friend varchar(255), PRIMARY KEY(name,ASC), CHECK (age >= 0), FOREIGN KEY (friend) REFERENCES people(name), CREATE UNIQUE INDEX name_idx ON people(name), CREATE INDEX name_age_idx ON people(name,age) )
  • 37. Relational model-ADT: Secondary Structures (Domain Logic). [define,person,[name:@str~x, age:@int&gte(0), friend:@person&[name:@str]] <= [db][get,people] [is,[get,name] [eq,~x] -> [get,friend][get,friend] => ] [define,persons,@person* -> [dedup,[get,name]] => -> [order,[gt,[get,name]]] => -> [is,[get,name][eq,@str~x]] => [ref,@person&[name:~x]? -> [is,[get,age][eq,@int~y]] => [ref,@person&[name:~x,age:~y]?]] -> [is,[get,age][eq,@int~y]] => [ref,@person&[age:~y]* -> [is,[get,name][eq,@str~x]] => [ref,@person&[name:~x,age:~y]?]] -> [count] => [ref,@int <= [db][get,people] [count]]] [define,db,[people:@persons]] CREATE TABLE people ( name varchar(255), age int NOT NULL, friend varchar(255), PRIMARY KEY(name,ASC), CHECK (age >= 0), FOREIGN KEY (friend) REFERENCES people(name), CREATE UNIQUE INDEX name_idx ON people(name), CREATE INDEX name_age_idx ON people(name,age) ) Conceptual secondary structures: the [get,friend][get,friend] rewrite is a no-op because friends are pair bonded. It's a weird example, but we got stuck with such a simple model.
  • 38. model-ADT Data Access Graph: Primary and Secondary Structures Specify Data Access Paths. Secondary structures “teleport” the processor from one location in the primary structure to another. [Figure: a primary structure with secondary structures layered over it; edges are labeled reference and dereference.]
  • 39. Data Access Path: Primary Structure Only. How many unique names are there for people over 21 years of age? CREATE TABLE people ( name varchar(255) NOT NULL, age int NOT NULL ) [define,person,[name:@str,age:@int]] [define,persons,@person*] [define,db,[people:@persons]] [db][get,people] [is,[get,age][gt,21]] [get,name] [dedup] [count] [Figure: the access path from @db through @persons, each @person, and each @str name to the final @int. Instructions are labeled by type (initial, map, flatmap, filter, barrier, reduce); [is,[get,age][gt,21]] and [get,name] are applied once per @person; ref/deref steps are plotted against time spent.]
  • 40. Data Access Path: Primary and Secondary Structures. [db][get,people] [is,[get,age][gt,21]] [get,name] [dedup] [count] CREATE TABLE people ( name varchar(255), age int NOT NULL, PRIMARY KEY(name), CREATE INDEX idx ON people(age) ) [define,person,[name:@str,age:@int]] [define,persons,@person* -> [dedup,[get,name]] => -> [is,[get,age][@rel~x,@int~y]] => [ref,@person[age:~x(~y)]* -> [get,name] => [ref,@str* <= […] -> [dedup] => -> [count] => [ref,@int <= […]]]]] [define,db,[people:@persons]] Runtime instruction rewrites: [is,[get,age][gt,21]] becomes an index query yielding @person[age:gt(21)]*; [get,name] reads the names in the index, yielding @str*; [dedup] is a no-op because names are unique; [count] is an index hit count, yielding @int. [Figure: the access path @db, @persons, @person[age:gt(21)]*, @str*, @int with instruction types (initial, map) and ref/deref steps plotted against time spent.]
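The difference between the two access paths can be sketched by answering the same question twice: once by scanning the primary structure, and once by consulting a sorted age index. The sample rows, the function names, and the use of `bisect` over a sorted list as the "index" are assumptions for illustration.

```python
import bisect

# Illustrative rows; names are unique, as a primary key on name would require.
PEOPLE = [
    {"name": "marko", "age": 29},
    {"name": "vadas", "age": 27},
    {"name": "josh", "age": 32},
    {"name": "peter", "age": 10},
]


def primary_path():
    """[get,people][is,[get,age][gt,21]][get,name][dedup][count] by full scan."""
    names = [p["name"] for p in PEOPLE if p["age"] > 21]  # filter, then map
    return len(set(names))                                # dedup, then count


# Stand-in for CREATE INDEX idx ON people(age).
AGE_INDEX = sorted(p["age"] for p in PEOPLE)


def secondary_path():
    """Names are unique, so dedup is a no-op and count is an index hit count."""
    return len(AGE_INDEX) - bisect.bisect_right(AGE_INDEX, 21)


print(primary_path(), secondary_path())
```

Both functions return the same number; the second never touches a person object.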
  • 41. Common model-ADTs Primary Structure KEY-VALUE STORE [model,kv=>mm, [define,k,@num|@str] [define,v,@obj] [define,kv,[@k,@v]] [define,db,@kv*]] PROPERTY GRAPH DATABASE [model,pg=>mm, [define,properties,[(@str:@str|@num|@bool)*]] [define,element,@properties&[id:@obj,label:@str]] [define,edge,@element&[outV:@vertex,inV:@vertex]] [define,vertex,@element&[inE:@edge*,outE:@edge*]] [define,db,@vertex*]] RELATIONAL DATABASE [model,rdb=>mm, [define,value,@bool|@num|@str] [define,row,[(@str:@value)*]] [define,table,@row*] [define,db,[(@str:@table)*]]] DOCUMENT DATABASE [model,doc=>mm, [define,dval,@bool|@num|@str|@dobj|@dlist] [define,dlist,[(@dval)*]] [define,dobj,[(@str:@dval)*]] [define,doc,@dobj&[_id:@str]] [define,collection,@doc*] [define,db,[(@str:@collection)*]]] A model-ADT is an abstract datatype modeled/embedded in mm-ADT. [model] domain of discourse
  • 42. Multiple model-ADTs: Models, Embeddings, and Queries. model-ADT definition bytecode: [model, rdb=>mm, [define,…][define,…][define,…]] [model, pg=>rdb,[define,…][define,…][define,…]] [model, doc=>rdb,[define,…][define,…][define,…]] [model, kv=>rdb,[define,…][define,…][define,…]] [model,quantum=>pg, [define,…][define,…][define,…]] model-ADT query bytecode: property graph query (Gremlin=>pg) [db][is,[get,label][eq,person]] [get,outE] [is,[get,label][eq,phones]] [get,inV] [get,home] document query (MongoDB Query=>doc) [db][get,people] [get,phones] [get,home] key/value query (REST=>kv) [db][get,people] [get,phones] [get,home] relational query (SQL=>rdb) [db][get,people] [at,[db][get,phones], [get,person_id][is,[eq,[get,id]]]] [get,home] A query language compiles to the most aligned model-ADT. The generated bytecode is translated to mm-ADT via rewrites. [Figure: the query languages SQL, Gremlin, MongoDB Query, REST API, and QQL (quantum) compile to mm-ADT-bc for the models rdb, pg, doc, kv, and quantum, which embed into mm. Labels: the storage system implements mm-ADT objects; the storage system’s supported models; the most expressive model aligned with the storage system; X model embedded in Y model; all models must have an embedding path to mm-ADT.] X must be a subset of Y to be embedded. All embeddings are injective. X embedded in Y states that there exists a bijection from X to some subset of Y.
  • 44. mm-ADT Bytecode: Query Instructions. What are the ages of the people named marko? [define,person,[name:@str,age:@int]] [define,db,[people:@person*]] [db] :@obj{0} => @db [get,people] :@db => @person* [is, :@person => @person? [get,name] :@person => @str [eq,marko]] :@str => @bool [get,age] :@person => @int
  • 45. mm-ADT Bytecode: Query Execution. What are the ages of the people named marko? [db] :@obj{0} => @db [get,people] :@db => @person* [is, :@person => @person? [get,name] :@person => @str [eq,marko]] :@str => @bool [get,age] :@person => @int [Figure: execution trace. [db] yields [people:…]; [get,people] yields [name:marko,…], [name:stephen,…], [name:kuppitz,…]; inside [is,…], [get,name] maps [name:marko,…] to marko and [eq,marko] maps it to true, so [name:marko,…] passes the filter; [get,age] yields 29.] Universal computing via streams: real-time, near-time, batch-time; single machine, cluster-oriented; RAM, disk-based.
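The execution trace can be imitated with Python generators, where each instruction consumes a stream and produces a stream. This is only a sketch of the streaming idea: the helper names `db`, `get`, `is_`, and `run` are hypothetical, and the rows for stephen and kuppitz carry no age because the slide elides those fields.

```python
def db():
    """[db]: start a stream holding the single database object."""
    yield {"people": [{"name": "marko", "age": 29},
                      {"name": "stephen"},
                      {"name": "kuppitz"}]}


def get(key):
    """[get,key]: look up a field; a list value is unrolled into the stream."""
    def step(stream):
        for obj in stream:
            value = obj[key]
            if isinstance(value, list):
                yield from value
            else:
                yield value
    return step


def is_(predicate):
    """[is,...]: keep only the objects for which the predicate holds."""
    def step(stream):
        return (obj for obj in stream if predicate(obj))
    return step


def run(source, *steps):
    stream = source
    for step in steps:
        stream = step(stream)  # nothing executes until the stream is consumed
    return list(stream)


ages = run(db(), get("people"), is_(lambda p: p["name"] == "marko"), get("age"))
print(ages)
```

Because the filter runs before [get,age], the rows without an age field are never asked for one.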
  • 46. Stream Ring Theory: An Algebraic Ring for Stream Computing. f*g: “Multiplication” is instruction composition. f+g: “Addition” is stream branching (clone/split). -f: “Negative” inverts the object quantifier. 0 = [_]{0}: @obj* => @obj{0}. 1 = [_]{1}: @obj* => @obj*. Axioms: f+(g+h) = (f+g)+h (addition associativity); f*(g*h) = (f*g)*h (multiplication associativity); f*(g+h) = (f*g)+(f*h) (left distributive); (f+h)*g = (f*g)+(h*g) (right distributive); f*0 = 0 (multiplicative zero); f*1 = f (multiplicative one); f+0 = f (additive zero); f+1 = f+1 (additive one); f-f = f+(-f) = 0 (negative). Diagram label: notXOR. The stream ring is the product of the quantifier ring and the instruction ring. Turing Complete: https://zenodo.org/record/2565243
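A few of these axioms can be spot-checked numerically. In the sketch below a stream is a dict from object to an integer quantifier, and instructions are functions from stream to stream. That encoding, and the names `lift`, `mult`, `add`, `neg`, `zero`, and `one`, are assumptions chosen for illustration; the check covers only per-object instructions (map, filter, flatmap style), not barriers or reducers.

```python
from collections import defaultdict


def _merge(*streams):
    out = defaultdict(int)
    for stream in streams:
        for obj, q in stream.items():
            out[obj] += q
    return {obj: q for obj, q in out.items() if q != 0}  # drop {0} objects


def lift(fn):
    """Turn a per-object function (returning a list) into a stream instruction."""
    def inst(stream):
        return _merge(*({r: q} for obj, q in stream.items() for r in fn(obj)))
    return inst


def mult(f, g):   # f*g: instruction composition (f first, then g)
    return lambda s: g(f(s))


def add(f, g):    # f+g: branch the stream through f and g, then merge
    return lambda s: _merge(f(s), g(s))


def neg(f):       # -f: invert the object quantifier
    return lambda s: {obj: -q for obj, q in f(s).items()}


zero = lambda s: {}        # plays the role of [_]{0}
one = lambda s: dict(s)    # plays the role of [_]{1}

f = lift(lambda x: [x + 1])                       # map
g = lift(lambda x: [x * 2])                       # map
h = lift(lambda x: [x] if x % 2 == 0 else [])     # filter

s = {1: 1, 2: 1, 3: 2}
print(mult(f, add(g, h))(s) == add(mult(f, g), mult(f, h))(s))  # left distributive
print(mult(add(f, h), g)(s) == add(mult(f, g), mult(h, g))(s))  # right distributive
print(add(f, neg(f))(s) == zero(s))                              # negative
```

Right distributivity holds here because each lifted instruction acts on objects independently; an instruction that inspects the whole stream would not be covered by this check.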
  • 47. Stream Ring Theory: Addition, Multiplication, and Distribution. [Figure: stream diagrams for a + b, a * b * c, and a * b * (c + (d*e)) * f, and for the two distributive laws.] (a+b)*c = (a*c)+(b*c) a*(b+c) = (a*b)+(a*c) @inst~a * @inst~b = ~a~b @inst~a + @inst~b = [branch,~a,~b] @inst~a{x,y} * @inst~b{w,z} = ~a~b{x*w,y*z} @inst~a{x,y} + @inst~b{w,z} = [branch,~a,~b]{x+w,y+z} Instruction and object quantifiers must be from the same algebraic ring, as instruction quantifiers modulate object quantifiers.
  • 48. Stream Ring Theory: Additive and Multiplicative Identities. 0 = [_]{0} (filter) and 1 = [_]{1} (identity), so a + 0 = a and a * 1 = a, where a(x) = y. Let a: x => x? and b: x => x? be filters. Set union (a ∪ b): a + b - ab = x?, since x(a + b - ab) = xa + xb - x(ab) = xa + xb + -x(ab), which is one of 0 + 0 + 0 = 0, x + 0 + 0 = x, 0 + x + 0 = x, or x + x + -x = x, giving x{0,1} => x?. Multi-set union: a + b = x{0,2}, since x(a + b) = xa + xb, which is one of 0 + 0 = 0, 0 + x = x, x + 0 = x, or x + x = x{2}, giving x{0,2}.
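The case analysis for the two unions reduces to arithmetic on the quantifier that each filter leaves on x (0 if x is filtered out, 1 if it passes). A minimal sketch that tabulates all four cases; the function names are illustrative.

```python
def multiset_union(a, b):
    """a + b: both branches contribute, so x can arrive twice."""
    return a + b


def set_union(a, b):
    """a + b - ab: subtract the overlap so x arrives at most once."""
    return a + b - a * b


print("a b  a+b  a+b-ab")
for a in (0, 1):
    for b in (0, 1):
        print(a, b, "", multiset_union(a, b), "  ", set_union(a, b))
```

The multi-set column ranges over {0,2} and the set column over {0,1}, matching x{0,2} and x?.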
  • 49. Stream Ring Theory: Binomial and Multinomial Expansions. (a+b)*(b+c) = ab + ac + b² + bc (FOIL method). [Figure: the branch diagram for (a+b)*(b+c) and its equivalent expanded form.]
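The expansion is a Cartesian product over the branches of each factor, keeping instruction order within every term because composition is not assumed to commute. A small sketch using `itertools.product`; the `expand` helper is hypothetical.

```python
from itertools import product


def expand(*factors):
    """Expand a product of sums into a sum of products, preserving order."""
    return ["".join(term) for term in product(*factors)]


# (a+b)*(b+c): 'bb' is the b² term.
print(expand(["a", "b"], ["b", "c"]))
```

Passing three or more factors gives the multinomial case in the same way.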
  • 51. mm-ADT Instruction Set: Instruction Types. inst: map, filter, flatmap, barrier, reduce, branch, sideeffect, initial, terminal, machine (virtual machine operational instructions). Type checking includes both object type and quantifier. flatmap : @obj => @obj* map : @obj => @obj filter : @obj => @obj? barrier : @obj* => @obj* reduce : @obj* => @obj initial : @obj{0} => @obj* terminal: @obj* => @obj{0} Algebraic labels on the diagram: ring, ring, commutative ring, ring, near-ring.
  • 52. mm-ADT Instruction Set Instruction Types MAP [get,@obj]: @rec => @obj [path] : @obj => @list [type] : @obj => @obj … REDUCE [count]: @obj* => @int [max] : @num* => @num [sum] : @num* => @num … BARRIER [dedup,@obj*] : @obj* => @obj* [order,[lt|gt,@obj]*]: @obj* => @obj* [range,@int,@int] : @obj* => @obj* … SIDE-EFFECT [put,@obj,@obj] : @seq => @seq [drop,@obj] : @seq => @seq [fit,@int?,@obj]: @list => @list … FILTER [_] : @obj* => @obj* [and,@obj{2,}]: @obj => @obj? [or,@obj{2,}] : @obj => @obj? [is,@bool] : @obj => @obj? … BRANCH [repeat,@inst{1,3}] : @obj => @obj* [ifelse,@bool,@inst{2}]: @obj => @obj* [choose,[@bool,@inst]+]: @obj => @obj* [coalesce,@inst{2,}] : @obj => @obj* … INITIAL TERMINAL MACHINE [ref,@obj*]: @obj => @obj*
  • 53. Query Language Compilation: Generating model-ADT Bytecode. SELECT name FROM people WHERE age < 29 compiles to [db][get,people] [is,[get,age][lt,29]] [get,name]. g.V(1).out('created').in('created').hasId(neq(1)).groupCount().by('name') compiles to [db][get,V] [is,[get,id][eq,1]] [get,outE] [is,[get,label][eq,created]] [get,inV] [get,inE] [is,[get,label][eq,created]] [get,outV] [is,[get,id][neq,1]] [group,[get,name],[_],[count]]. person(name:marko){friends{name}} compiles to [db][get,people] [is,[get,name][eq,marko]] [get,friends] [get,name].
  • 54. Query Language Compilation: Query Language and Storage System Decoupling. [model,doc=>mm] [model,rdb=>mm] [model,pg=>mm] Query languages compile to their respective model-ADT bytecode specification. model-ADT embeddings use rewrite rules to encode one model-ADT within another. This enables query bytecode intended for one model-ADT to execute against another model-ADT. Embeddings: doc=>rdb, doc=>pg, rdb=>doc, rdb=>pg, pg=>doc, pg=>rdb. Diagram label: mm-ADT objects or language translation.
  • 55. mm-ADT Components: Storage System, Processing Engine, and Query Language. [model,wc=>mm,…] [model,kv=>mm,…] [model,pg=>mm,…] An mm-ADT storage system publishes the model-ADTs it supports (and is optimized for). Within a particular model-ADT context, the database will produce respective mm-ADT objects for the processor to consume. For all unsupported models, a model-ADT embedding can be used to translate to a supported model-ADT. However, while semantically correct, the storage system might lack appropriate secondary structures. An mm-ADT processing engine is able to accept arbitrary mm-ADT bytecode and generate an execution pipeline that can read/write mm-ADT objects to/from the underlying mm-ADT compliant storage system. mm-ADT processing engines are agnostic to the model-ADT bytecode encoding. They must faithfully implement every instruction in the mm-ADT instruction set. An mm-ADT query language has a compiler to a specific model-ADT bytecode specification. The processing engine translates the language compiler’s model-ADT bytecode into an execution pipeline. At runtime, the processor and storage system communicate via mm-ADT objects to ultimately yield the answer to the query. Example bytecode: [db][get,outE] [get,inV] [get,name]
  • 56. mm-ADT Compliant Components: Storage System, Processing Engine, and Query Language. compilation: @str => @inst* g.V().out().values('name') => [db][get,outE][get,inV][get,name] rewrite : @inst* => @inst* [db][get,outE][get,inV][get,name] => [db][get,name] evaluation : @inst* => @obj* [db][get,name] => marko,kuppitz,stephen
  • 58. Key/Value Store: mm-ADT kv=>mm Bytecode (GET/PUT semantics). [db][count] // 0 [db][put,[a,[name:marko,age:29]]] [put,[b,[name:vadas,age:27]]] [put,[c,[name:josh,age:32]]] [put,[d,[name:peter,age:10]]] [db][put,[d,[name:peter,age:35]]] // updated d value [db][count] // 4 [db][is,[get,0][eq,a]] [get,1] // [name:marko,age:29] [db][group,[get,1] [get,age] [ifelse,[is,[gt,30]], [map,old], [map,young]], [map,1], [sum]] // [old:2,young:2]
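The GET/PUT session above can be replayed against a plain Python list of [key, value] pairs. This is a sketch of the semantics only: the `put` helper (update the value if the key exists, otherwise append) is an assumption modeled on the comments in the slide, not the mm-ADT implementation.

```python
def put(db, key, value):
    """Update the value if the key exists, otherwise append a new pair."""
    for pair in db:
        if pair[0] == key:
            pair[1] = value
            return
    db.append([key, value])


db = []
put(db, "a", {"name": "marko", "age": 29})
put(db, "b", {"name": "vadas", "age": 27})
put(db, "c", {"name": "josh", "age": 32})
put(db, "d", {"name": "peter", "age": 10})
put(db, "d", {"name": "peter", "age": 35})   # updates d instead of adding a pair

count = len(db)                                            # [db][count]
marko = [value for key, value in db if key == "a"]         # [db][is,[get,0][eq,a]][get,1]

groups = {}                                                # [db][group,...,[map,1],[sum]]
for _, value in db:
    label = "old" if value["age"] > 30 else "young"
    groups[label] = groups.get(label, 0) + 1

print(count, marko, groups)
```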
  • 59. Key/Value Store Primary and Secondary Structures [model,kv=>mm, [define,k,@num|@str] [define,v,@bool|@num|@str|@list|@rec] [define,kv,[@k,@v] -> [put,0,@k] => [error] -> [drop,0|1] => [error]] [define,db,kv* -> [put,@kv[@k~k,@v~v]] => [coalesce, [is,[get,0][eq,~k]][put,1,~v], [put,[~k,~v]]] -> [dedup,[get,0]] => -> [count] => [ref,@int <= [db][count]] -> [is,[get,0][eq,@k~k]] => [ref,@kv? <= [db][is,[get,0] [eq,~k]] -> [group,@inst*{3}~mr] => [ref,@rec <= [db][group,~mr]]]] sorted keys key counter key index map-reduce framework
  • 61. Document Database: Primary and Secondary Structures. [model,doc=>mm, [define,dval,@bool|@num|@str|@dobj|@dlist -> [is,[get,@str~x][eq,@dval~y]] => [ref,@dval? <= […]]] [define,dlist,[(@dval)*]] [define,dobj,[(@str:@dval)*]] [define,doc,@dobj&[_id:@str]] [define,collection,@doc* -> [dedup,[get,_id]] => -> [is,[get,_id][eq,@str~x]] => [ref,@doc? <= […][is,[get,_id][eq,~x]]] -> [count] => [ref,@int <= […][count]] -> [group,@inst*{3}~mr] => [ref,@rec <= […][group,~mr]]] [define,db,[(@str:@collection)*]]] Secondary structures: recursive pattern matching, index by _id, doc counter, map-reduce framework.
  • 63. Property Graph Database: Primary and Secondary Structures. [model,pg=>mm, [define,tokens,id|label|inE|outE|outV|inV] [define,properties,[(@str&not(tokens):@str|@num|@bool)*]] [define,element,@properties&[id:@obj~x,label:@str] -> [drop,id|label] => [error] -> [put,id|label,@obj] => [error]] [define,vertex,@element&[inE:@edge*,outE:@edge*] <= [db][is,[get,id][eq,~x]] -> [get,outE] => [ref,@edge* <= […][get,outE] -> [put,@edge~x] => [branch,[put,~x],[get,inV][get,inE][put,~x]] -> [is,[get,@str~y][eq,@obj~z]] => [ref,@edge* <= […][is,[get,~y][eq,~z]] -> [count] => [ref,@int <= […][count]]] -> [is,[get,label][eq,@str~y]] => [ref,@edge* <= […][is,[get,label][eq,~y]] -> [count] => [ref,@int <= […][count]] -> [get,inV] => [ref,@vertex* <= […][get,inV] -> [get,(label|id)~y] => [ref,@str|@obj <= […][get,~y]] -> [count] => [ref,@int <= […][count]]]]]] [define,edge,@element&[outV:@vertex,inV:@vertex] -> [put,outV|inV,@vertex] => [error] -> [drop,outV|inV] => [error]] [define,db,@vertex* <= [db] -> [dedup,[get,id]] => -> [get,outE] => [ref,@edge* <= […][get,outE] -> [dedup,[get,id]] => ] -> [is,[get,id][eq,@obj~x]] => [ref,@vertex? <= […][is,[get,id][eq,~x]]] -> [count] => [ref,@int <= […][count]] -> [get,(inE|outE)~x] => [ref,@edge* <= […][get,~x] -> [count] => [ref,@int <= […][count]] -> [is,[get,label][eq,@str~y]] => [ref,@edge* <= […][is,[get,label][eq,~y]] -> [count] => [ref,@int <= […][count]]]] Secondary structures: vertex-centric edge property indices, vertex-centric edge label indices, adjacent vertex id and label denormalization, vertex count, edge count, edge count by label.
  • 65. RDF Triple Store: Primary and Secondary Structures. [model,rdf=>mm, [define,uri,@str&regex('//^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(?([^#]*))?(#(.*))?')] [define,lit,@str|@num|@bool] [define,bnode,@str&regex('_:.*')] [define,s,@uri|@bnode] [define,p,@uri] [define,o,@uri|@bnode|@lit] [define,stmt,[s:@s,p:@p,o:@o] -> [put,s|p|o,@object] => [error] -> [drop,s|p|o] => [error]] [define,db,stmt* -> [dedup] => -> [count] => [ref,@int <= […][count] -> [is,[get,s][eq,@s~x]] => [ref,@stmt? <= […][is,[get,s][eq,~x]] -> [is,[get,p][eq,@p~y]] => [ref,@stmt? <= […][is,[get,s][eq,~x]][is,[get,p][eq,~y]]] -> [at,[db],[is,…] => [ref,@stmt? <= […] -> [is,[get,o][eq,@p~y]] => [ref,@stmt? <= […][is,[get,s][eq,~x]][is,[get,o][eq,~y]]]] -> [is,[get,p][eq,@p~x]] => [ref,@stmt? <= […][is,[get,p][eq,~x]] -> [dedup] -> [count] => [ref,@int <= […][count]]] -> [is,[get,o][eq,@o~x]] => [ref,@stmt? <= […][is,[get,o][eq,~x]]]] Secondary structures: denormalized length-2 paths, unique predicate count, statements indexed by subject, statement count, statements indexed by subject/predicate.
  • 66. Next Generation Models mm-ADT is for cluster-oriented, general purpose computing. Marko A. Rodriguez Daniel Kuppitz Stephen Mallette Patronage from RReduX and DataStax Credits