A discussion of strategies for designing application schemas that use the Riak distributed key-value store.
Video available here: http://vimeo.com/17604126
20. Walking without Relational crutches
• Sparse data (optional/multi-value fields)
• Richer data structures
• Meaningful identifiers
• Innovative access patterns
60. Layouts and Snippets
• Layouts and Snippets are accessed by name
• SQL: WHERE name=?
• Simple access by key, e.g. layouts/Main or snippets/top-nav
• Simple value structure (content + metadata)
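A minimal sketch of access by key, assuming an in-memory dict as a stand-in for Riak buckets. The names `store`, `put`, and `get` are hypothetical helpers rather than a Riak client API, and the content strings and metadata fields are made up for illustration.

```python
# In-memory stand-in for Riak: maps (bucket, key) -> stored value.
# Hypothetical helper names; a real client would make network requests.
store = {}

def put(bucket, key, content, **metadata):
    """Store a simple value: the content plus arbitrary metadata."""
    store[(bucket, key)] = {"content": content, "metadata": metadata}

def get(bucket, key):
    """Fetch by bucket and key; returns None when the key is absent."""
    return store.get((bucket, key))

put("layouts", "Main", "<html>...</html>", content_type="text/html")
put("snippets", "top-nav", "<ul>...</ul>")

# Takes the place of the SQL lookup WHERE name = 'Main'.
layout = get("layouts", "Main")
```

Because the name is the key, no secondary index is involved in the lookup.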
70. Users
Normally accessed by login or email: WHERE login=? OR email=?
• #1: Login or Email key
  • Easy lookup on one, manual index other
• #2: Arbitrary key
  • Independent of email/login changes
  • Manually index both
• #3: Users-list object
  • Fits “small teams” philosophy
  • Quick lookup (one fetch)
  • Bad for large list, still need unique IDs
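Option #2 could be sketched roughly as follows, assuming three dicts stand in for Riak buckets. The bucket names, the use of a UUID as the arbitrary key, and the functions `create_user`, `find_user`, and `change_email` are all hypothetical choices made for this illustration.

```python
import uuid

# Dicts standing in for three Riak buckets (an assumption for illustration).
users = {}            # arbitrary key -> user object
users_by_login = {}   # manual index: login -> user key
users_by_email = {}   # manual index: email -> user key

def create_user(login, email):
    """Store the user under an arbitrary key and write both index entries."""
    key = uuid.uuid4().hex
    users[key] = {"login": login, "email": email}
    users_by_login[login] = key
    users_by_email[email] = key
    return key

def find_user(login_or_email):
    """Two fetches: one for the index entry, one for the user object."""
    key = users_by_login.get(login_or_email) or users_by_email.get(login_or_email)
    return users.get(key) if key else None

def change_email(key, new_email):
    """The user key is unaffected; only the email index entry moves."""
    user = users[key]
    del users_by_email[user["email"]]
    user["email"] = new_email
    users_by_email[new_email] = key
```

The cost of the arbitrary key shows up in `create_user` and `change_email`: every write that touches a login or email must also maintain the matching index entry.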
75. Page Rendering
• Load and render layout in page context
• Render parts, snippets via template tags
• Iterate over sections of page hierarchy
80. Rendering Page Parts
• Parts are dependent on the page
• Compose / Denormalize (part of page)
• Reduces lookups, conceptual unity, no index needed
• Larger objects, no lazy fetching
81. Denormalized Page
{
  title: "Home Page",
  parts: [
    {name: "body", content: "..."},
    {name: "sidebar", content: "..."}
  ],
  // ...
}
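To illustrate why denormalizing reduces lookups, here is a small sketch that renders a page from the single fetched object. The part contents, the `part_content` and `render` helpers, and the HTML shape are assumptions for the example, not the CMS's actual rendering code.

```python
# A denormalized page as it might be decoded from JSON; values are illustrative.
page = {
    "title": "Home Page",
    "parts": [
        {"name": "body", "content": "Welcome!"},
        {"name": "sidebar", "content": "Links"},
    ],
}

def part_content(page, name):
    """Look up a part inside the already-fetched page: no extra fetch, no index."""
    for part in page["parts"]:
        if part["name"] == name:
            return part["content"]
    return ""

def render(page):
    """Hypothetical renderer that pulls every part from the one object."""
    return "<h1>{}</h1><main>{}</main><aside>{}</aside>".format(
        page["title"], part_content(page, "body"), part_content(page, "sidebar")
    )
```

The trade-off noted on the previous slide is visible here as well: every part travels with the page, whether or not the template uses it.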
87. Site Hierarchy
...or, how to find a page (currently)
• Start at the root: WHERE parent_id IS NULL
• Check if the current page matches URL
• Check children for matching path segment, recurse
• Return “not found” page or 404
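The relational lookup above can be sketched as a walk from the root, assuming a list of rows stands in for the pages table. The column names `id`, `parent_id`, and `slug` and the sample rows are assumptions; the loop is the iterative equivalent of the recursion on the slide.

```python
# Rows standing in for a relational "pages" table (illustrative data).
PAGES = [
    {"id": 1, "parent_id": None, "slug": ""},
    {"id": 2, "parent_id": 1, "slug": "about"},
    {"id": 3, "parent_id": 2, "slug": "team"},
]

def children_of(parent_id):
    """Stand-in for: SELECT * FROM pages WHERE parent_id = ?"""
    return [row for row in PAGES if row["parent_id"] == parent_id]

def find_page(url):
    """Walk down from the root, issuing one query per path segment."""
    segments = [s for s in url.split("/") if s]
    current = children_of(None)[0]          # WHERE parent_id IS NULL
    for segment in segments:
        matches = [c for c in children_of(current["id"]) if c["slug"] == segment]
        if not matches:
            return None                     # caller renders "not found" or 404
        current = matches[0]
    return current
```

Each loop iteration corresponds to one database query, which is where the per-level cost of this approach comes from.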
92. Site Hierarchy
...or, how to find a page (analysis)
• O(log N) queries to find requested page
• Index on parent_id speeds retrieval
• Traversing/iterating for content generation
• Generating URL paths
97. Site Hierarchy
...or, how to find a page (in Riak)
• Blog post: http://ow.ly/3jDAO
• #1 Parent/child links
• #2 Material path key
• #3 Tree object
101. #1: Parent/Child Links
A Natural Tree
• Easy to understand & implement
• Doubly-linked for easy traversal in either direction
• Easy to move entire subtrees
Link header examples from the diagram: </riak/pages/C>; riaktag="child" and </riak/pages/A>; riaktag="parent"
[Diagram: tree with root A, children B and C, and leaves D, E, F]
105. #1: Parent/Child Links
A Natural Tree
• Child order is arbitrary
• Still O(log N) traversal
• Two writes to add or update
[Diagram: tree with root A, children B and C, and leaves D, E, F]
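A rough sketch of doubly-linked pages, assuming a dict stands in for the "pages" bucket and each object carries a list of (bucket, key, tag) tuples that loosely mirror the link headers. The helper names `add_child`, `linked`, and `path_to_root` are hypothetical, and the tree shape is only an example.

```python
# Dict standing in for the "pages" bucket; each object carries a list of
# (bucket, key, tag) links, loosely mirroring Riak's Link header.
pages = {"A": {"links": []}}

def add_child(parent_key, child_key):
    """Adding a page takes two writes: the new child and the updated parent."""
    pages[child_key] = {"links": [("pages", parent_key, "parent")]}   # write 1
    pages[parent_key]["links"].append(("pages", child_key, "child"))  # write 2

def linked(key, tag):
    """Follow links carrying the given tag from one object."""
    return [k for (_bucket, k, t) in pages[key]["links"] if t == tag]

def path_to_root(key):
    """Walk parent links upward, for example to build a URL path."""
    path = [key]
    while linked(path[-1], "parent"):
        path.append(linked(path[-1], "parent")[0])
    return path

add_child("A", "B")
add_child("A", "C")
add_child("B", "D")
```

In this sketch children come back in insertion order only because a Python list is used; the slide's point is that the links themselves do not promise any ordering.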
110. #2: Material Path Key
“Best Guess” discovery
• Use path-to-page as key
• Best case: one lookup to find requested page
• Ancestor pages listed inside object
{
  title: "D",
  ancestors: ["B", "_root_"]
}
[Diagram: “pages” bucket with keys _root_ (A), B, C, B/D, C/E]
115. #2: Material Path Key
“Best Guess” discovery
• Large update cost for internal nodes
• Dynamic URLs hard
• Key filters (Riak 0.14) needed for efficient child lists
• Need fallback for “misses”
[Diagram: “pages” bucket with keys _root_ (A), B, C, B/D, C/E]
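One possible shape for the "best guess" lookup with a fallback, assuming a dict stands in for the "pages" bucket. The fallback strategy shown here (drop the last path segment and retry) is an assumption for illustration and is not taken from the talk; `find_page` is a hypothetical name.

```python
# Dict standing in for the "pages" bucket, keyed by path from the site root.
pages = {
    "_root_": {"title": "A", "ancestors": []},
    "B":      {"title": "B", "ancestors": ["_root_"]},
    "C":      {"title": "C", "ancestors": ["_root_"]},
    "B/D":    {"title": "D", "ancestors": ["B", "_root_"]},
    "C/E":    {"title": "E", "ancestors": ["C", "_root_"]},
}

def find_page(url):
    """Return (page, unmatched_segments); the best case is a single lookup."""
    segments = [s for s in url.split("/") if s]
    unmatched = []
    while segments:
        page = pages.get("/".join(segments))
        if page is not None:
            return page, unmatched
        # Miss: drop the last segment and retry with the parent path.
        unmatched.insert(0, segments.pop())
    return pages["_root_"], unmatched
```

With this fallback, a request such as /B/D/2010/12 would resolve to the page stored at B/D and hand the leftover segments to that page, which is one way dynamic URLs might be handled at the price of extra lookups.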
120. #3: Tree Object
“Branches and Leaves”
• One request to get site structure
• Separates content from organization
• Intrinsic ordering
• Massive structure changes are quick
{
  key: "A",
  children: [
    {key: "B", children: ["D"]},
    {key: "C", children: ["E", "F"]}
  ]
}
124. #3: Tree Object
“Branches and Leaves”
• Large site = large tree object (expensive to transfer)
• Some metadata (slug?) will need to be in-tree for efficient traversal
• Multiple writers problematic
{
  key: "A",
  children: [
    {key: "B", children: ["D"]},
    {key: "C", children: ["E", "F"]}
  ]
}
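Resolving a URL against the tree object might look roughly like this. The sketch assumes each page's key doubles as its URL slug, which is exactly the kind of in-tree metadata the slide says would be needed; `node_key`, `node_children`, and `resolve` are hypothetical helpers.

```python
# The tree object from the slide, as decoded JSON. Branches are dicts,
# leaves are bare key strings.
tree = {
    "key": "A",
    "children": [
        {"key": "B", "children": ["D"]},
        {"key": "C", "children": ["E", "F"]},
    ],
}

def node_key(node):
    """Leaves are plain strings; branches keep their key in a field."""
    return node if isinstance(node, str) else node["key"]

def node_children(node):
    """Leaves have no children."""
    return [] if isinstance(node, str) else node["children"]

def resolve(tree, url):
    """Return the page key for a URL, or None, using only the one tree object."""
    node = tree
    for segment in (s for s in url.split("/") if s):
        # Assumption: each page's key doubles as its URL slug.
        matches = [c for c in node_children(node) if node_key(c) == segment]
        if not matches:
            return None
        node = matches[0]
    return node_key(node)
```

After the single fetch of the tree, resolution is pure in-memory traversal; the page content itself would still need one more fetch by the returned key.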
127. Hybrid Solutions
TIMTOWTDI (there is more than one way to do it)
• Material paths for quick lookups, links for relative traversal
• Tree object for relative lookups, secondary index for material paths
132. Key Takeaways
• Design for most common access pattern: the key is your index
• Denormalize dependent data types
• Build richer (or simpler) data structures
• Use links to connect normalized or independent types
136. Useful Tips
• Use query or identity caches to reduce duplicate fetches
• Store data in JSON, XML, or Erlang terms for MapReduce
• Use Riak Search where appropriate to reduce complexity
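As an illustration of the first tip, an identity cache can be as small as a dict keyed by bucket and key. The `IdentityCache` class, its `misses` counter, and the dict-backed store below are hypothetical; in practice the fetch callable would wrap whatever Riak client is in use, and the cache would live for a single request.

```python
class IdentityCache:
    """Remembers objects already fetched during one request."""

    def __init__(self, fetch):
        self._fetch = fetch     # callable (bucket, key) -> object
        self._seen = {}
        self.misses = 0         # counts real fetches, for illustration

    def get(self, bucket, key):
        ident = (bucket, key)
        if ident not in self._seen:
            self.misses += 1
            self._seen[ident] = self._fetch(bucket, key)
        return self._seen[ident]

# Hypothetical backing store standing in for Riak.
backend = {("snippets", "top-nav"): {"content": "<ul>...</ul>"}}
cache = IdentityCache(lambda bucket, key: backend.get((bucket, key)))

first = cache.get("snippets", "top-nav")
second = cache.get("snippets", "top-nav")   # served from the cache
```

A snippet used several times on one page is then fetched once, and every caller sees the same object.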
141. Review
• Analyze your relational model
• Identify pain points, take stats
• Design some alternatives
• Test, Measure, Repeat!
143. Plug
Interested in learning about support, consulting, or Enterprise features?
Email info@basho.com or go to http://www.basho.com/contact.html to talk with us.
www.basho.com
sean@basho.com
@seancribbs