Analytics and Graph Traversal with Solr
Yonik Seeley
Cloudera
© Cloudera, Inc. All rights reserved.
  
My Background
• Creator of Solr
• Cloudera Engineer
• LucidWorks Co-Founder
• Lucene/Solr committer, PMC member
• Apache Software Foundation member
• M.S. in Computer Science, Stanford
  
Graph Traversal
  
Graph Databases
• Graph Databases are all about Nodes and Edges (relationships)
[Diagram: nodes Stanford, RPI, Ann, Cloudera, NJ, Mike; edge labels attended (x3), recommended, works_at, lives_in]
  
Properties
[Diagram: Ann attended Stanford, with properties attached to both nodes and to the edge]
• Ann (node): bday: 5/01/1970
• attended (edge): start: 1992, end: 1993, degree: MS, subject: Computer Science
• Stanford (node): type: private, opened: 1891, location: Stanford, CA
  
Graph to Document Mapping
Relationships without properties
• Only index the nodes
  • properties are field values
  • nodes without properties can be skipped
• Edges defined at query-time only
  • implicit based on field value matches
[Diagram: Node1 points to Node2]
Node1
  id: node1
  relation: node2
Node2
  id: node2
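The mapping above can be sketched in a few lines of Python. This is only an illustration: the `docs` list and the `edges` helper are hypothetical names, not Solr APIs. Only node documents are stored, and edges are recovered at query time by matching one document's `relation` values against other documents' `id` values.

```python
# Node documents only; no edge documents are indexed.
docs = [
    {"id": "node1", "relation": ["node2"]},
    {"id": "node2"},
]


def edges(docs, from_field="id", to_field="relation"):
    """Derive (source, target) pairs by matching to_field values to from_field values."""
    ids = {d[from_field] for d in docs}
    return [
        (d[from_field], target)
        for d in docs
        for target in d.get(to_field, [])
        if target in ids
    ]


print(edges(docs))
```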
  
Graph to Document Mapping
Relationships with properties
• Relationships with properties:
  • Model the relationship as a document
  • "Pointers" can be field values on any of the documents
[Diagram, variant 1: pointers held on the relationship document]
Relationship1
  id: relationship1
  target1: node1
  target2: node2
  OR target: [node1, node2]
Node1
  id: node1
Node2
  id: node2
[Diagram, variant 2: pointers held on the node documents]
Relationship1
  id: relationship1
Node1
  id: node1
  rel: relationship1
Node2
  id: node2
  rel: relationship1
  
Document Mapping
Document 1
  type: edu
  name: Stanford
  opened: 1891
  address: Stanford, CA
  state: CA
Document 2
  type: attendance
  who: Ann
  where: Stanford
  start: 1992
  end: 1993
  degree: MS
  subject: Computer Science
Document 3
  type: person
  name: Ann
  bday: 5/01/1970
  address: Branchburg, NJ
  state: NJ
  
Graph Query
  
Graph Query (filter)
• Breadth-first graph traversal
• Modeled as a normal Query
  • usable as main query, filter query, facet query, input to another query, etc.
  • cached by default in filterCache

q={!graph from=nodeIdField to=edgeIdField}<starting_query>

• Output is a set of documents
  • edges are defined by matches between the fromField and toField
  • each iteration moves to nodes identified by the edge field
  
Graph Filter – continued
• Optional arguments
  • maxDepth – maximum number of hops from the root
  • traversalFilter – arbitrary query applied to nodes on each hop
  • returnRoot – (true/false) include the root in the final set
  • leafNodesOnly – (true/false) return only docs w/o value in the "to" field
• NOTE: {!graph} isn't (currently) distributed!
  • Edges are only followed within a shard
  • Still useful, and compatible with distributed search
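The traversal described on the last two slides can be sketched as a breadth-first loop over in-memory documents. This is a simplified model under stated assumptions, not Solr's implementation: `graph_query` and the sample `docs` are hypothetical, the "from" field is treated as single-valued, and only the `maxDepth`, `returnRoot` and `traversalFilter` options are modelled.

```python
def graph_query(docs, start, from_field, to_field,
                max_depth=None, return_root=True, traversal_filter=None):
    """Return the documents reached breadth-first from those matching `start`."""
    frontier = [i for i, d in enumerate(docs) if start(d)]
    roots = set(frontier)
    visited = set(frontier)
    depth = 0
    while frontier and (max_depth is None or depth < max_depth):
        # Collect the edge values held in the "to" field of the current frontier.
        edge_values = set()
        for i in frontier:
            edge_values.update(docs[i].get(to_field, []))
        # Move to unvisited documents whose "from" field matches an edge value.
        frontier = [
            i for i, d in enumerate(docs)
            if i not in visited
            and d.get(from_field) in edge_values
            and (traversal_filter is None or traversal_filter(d))
        ]
        visited.update(frontier)
        depth += 1
    result = visited if return_root else visited - roots
    return [docs[i] for i in sorted(result)]


# Hypothetical node documents: a -> b, a -> c, b -> d, d -> a (a cycle).
docs = [
    {"id": "a", "out": ["b", "c"]},
    {"id": "b", "out": ["d"]},
    {"id": "c"},
    {"id": "d", "out": ["a"]},
]
is_a = lambda d: d["id"] == "a"
print([d["id"] for d in graph_query(docs, is_a, "id", "out")])
print([d["id"] for d in graph_query(docs, is_a, "id", "out",
                                    max_depth=1, return_root=False)])
```

The visited set is what keeps the cycle d -> a from looping forever.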
  
Twitter Example
q={!graph from=user_id to=following}name:Yonik

user_id: lucene_solr
name: Yonik Seeley
following: [heismark,shalinmangar]

user_id: heismark
name: Mark Miller
following: [lucene_solr,GRRMSpeaking,...]

user_id: shalinmangar
name: Shalin Mangar
following: [romseygeek,_hossman,...]

• Finds everyone that Yonik follows, everyone they follow in turn, etc.
  
	
  
{!graph} vs {!join}

q={!join from=following to=user_id}name:Yonik
q={!graph from=user_id to=following maxDepth=1 returnRoot=false}name:Yonik

• pseudo-join filter query {!join} == single-step {!graph}
• Note the from/to switch (a discrepancy caught too late!)
  • graph: travels "to" nodes identified by the edge field
  • join: looks at values in the "from" field and travels to documents with those values in the "to" field.
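To make the from/to switch concrete, here is a small Python model of both operations over the Twitter documents from the previous slide. The helper names (`join`, `graph_one_hop`, `as_list`) are illustrative only, and the entries elided with "..." on the slide are simply left out.

```python
docs = [
    {"user_id": "lucene_solr", "name": "Yonik Seeley",
     "following": ["heismark", "shalinmangar"]},
    {"user_id": "heismark", "name": "Mark Miller",
     "following": ["lucene_solr", "GRRMSpeaking"]},
    {"user_id": "shalinmangar", "name": "Shalin Mangar",
     "following": ["romseygeek", "_hossman"]},
]


def as_list(value):
    return value if isinstance(value, list) else [value]


def join(docs, start, from_field, to_field):
    """{!join}: read from_field on the matching docs, return docs holding those values in to_field."""
    values = {v for d in docs if start(d) for v in as_list(d.get(from_field, []))}
    return [d for d in docs if values & set(as_list(d.get(to_field, [])))]


def graph_one_hop(docs, start, from_field, to_field):
    """{!graph maxDepth=1 returnRoot=false}: read the edge (to) field, land on from_field matches."""
    roots = [d for d in docs if start(d)]
    values = {v for d in roots for v in as_list(d.get(to_field, []))}
    return [d for d in docs
            if d not in roots and values & set(as_list(d.get(from_field, [])))]


is_yonik = lambda d: "Yonik" in d["name"]
via_join = join(docs, is_yonik, "following", "user_id")
via_graph = graph_one_hop(docs, is_yonik, "user_id", "following")
print([d["name"] for d in via_join])
print(via_join == via_graph)
```

Both calls land on the same documents; only the meaning of the `from` and `to` arguments is reversed.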
  
	
  
Graph Streaming Expressions
  
Graph streaming expressions
• Breadth-first graph traversals
• Part of streaming expressions
  • fully distributed
  • cross collections as well as shards
  • parallelizable
  
	
  
Graph streaming expressions example
• Index some books in one collection

curl http://localhost:8983/solr/books/update -H 'Content-type:text/csv' -d '
id,cat,pubyear_i,title,author,series_s,sequence_i
book1,fantasy,2000,A Storm of Swords,George R.R. Martin,A Song of Ice and Fire,3
book2,fantasy,2005,A Feast for Crows,George R.R. Martin,A Song of Ice and Fire,4
book3,fantasy,2011,A Dance with Dragons,George R.R. Martin,A Song of Ice and Fire,5
book4,sci-fi,1987,Consider Phlebas,Iain M. Banks,The Culture,1
book5,sci-fi,1988,The Player of Games,Iain M. Banks,The Culture,2
book6,sci-fi,1990,Use of Weapons,Iain M. Banks,The Culture,3
book7,fantasy,1984,Shadows Linger,Glen Cook,The Black Company,2
book8,fantasy,1984,The White Rose,Glen Cook,The Black Company,3
book9,fantasy,1989,Shadow Games,Glen Cook,The Black Company,4
book10,sci-fi,2001,Gridlinked,Neal Asher,Ian Cormac,1
book11,sci-fi,2003,The Line of Polity,Neal Asher,Ian Cormac,2
book12,sci-fi,2005,Brass Man,Neal Asher,Ian Cormac,3
'
  
Graph streaming expressions example
• Index some book reviews into another collection

curl http://localhost:8983/solr/reviews/update -H 'Content-type:text/csv' -d '
id,book_s,user_s,rating_i,review_t
book1_r1,book1,Yonik,5,awesome book!
book1_r2,book1,Aarav,2,too bloody
book1_r3,book1,Haruka,5,awesome world building
book2_r1,book2,Yonik,5,another great one
book2_r2,book2,Maria,5,wow!
book4_r1,book4,Yonik,2,i am lying... actually liked it
book4_r2,book4,Aarav,5,Loved it
book7_r1,book7,Yonik,4,read back in college but it was good
book10_r1,book10,Maria,5,I want a gridlink!
book11_r1,book11,Maria,1,Blech
book11_r2,book11,Aarav,4,is this the first book?
book12_r1,book12,Yonik,5,Mr. Crane is scary...
'

1. Find books I like
2. Find who else rated those books highly
3. Find other books they rated highly
4. Profit!
  
1. Search expression to find my high ratings

URL="http://localhost:8983/solr/reviews/stream"

# Use a search expression to find reviews where I gave the book a "5"
curl $URL -d 'expr=search(reviews, q="user_s:Yonik AND rating_i:5",
    fl="id,book_s,user_s,rating_i", sort="user_s asc")'

{"result-set":{"docs":[
{"rating_i":5,"id":"book2_r1","user_s":"Yonik","book_s":"book2"},
{"rating_i":5,"id":"book1_r1","user_s":"Yonik","book_s":"book1"},
{"rating_i":5,"id":"book12_r1","user_s":"Yonik","book_s":"book12"},
{"EOF":true,"RESPONSE_TIME":4}]}}
  
2. gatherNodes expression to find users

curl $URL -d 'expr=gatherNodes(reviews,
    search(reviews, q="user_s:Yonik AND rating_i:5",
           fl="book_s,user_s,rating_i",sort="user_s asc"),
    walk="book_s->book_s",
    gather="user_s",
    fq="rating_i:[4 TO *] -user_s:Yonik",
    trackTraversal=true )'

{"result-set":{"docs":[
{"node":"Haruka","collection":"reviews","field":"user_s","ancestors":["book1"],"level":1},
{"node":"Maria","collection":"reviews","field":"user_s","ancestors":["book2"],"level":1},
{"EOF":true,"RESPONSE_TIME":22}]}}

Callout on the output: "gather" values
  
3. gatherNodes to find high ratings by those users

curl $URL -d 'expr=gatherNodes(reviews,
    gatherNodes(reviews,
        search(reviews,q="user_s:Yonik AND rating_i:5",fl="id,book_s,user_s,rating_i",sort="user_s asc"),
        walk="book_s->book_s",
        gather="user_s",fq="rating_i:[4 TO *] -user_s:Yonik"),
    walk="node->user_s", gather="book_s", fq="rating_i:[4 TO *]",
    avg(rating_i),
    trackTraversal=true)'

{"result-set":{"docs":[
{"node":"book10","avg(rating_i)":5.0,"field":"book_s","level":2,"collection":"reviews","ancestors":["Maria"]},
{"EOF":true,"RESPONSE_TIME":65}]}}
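The three steps can be traced by hand in plain Python over the review rows indexed earlier. This is a sketch of the logic only, not of how streaming expressions execute: `recommend` is a hypothetical helper, and it assumes that nodes already visited at an earlier level are not emitted again, which is consistent with the single `book10` result shown above.

```python
from collections import defaultdict

# (id, book_s, user_s, rating_i) rows from the reviews CSV on the earlier slide.
REVIEWS = [
    ("book1_r1", "book1", "Yonik", 5), ("book1_r2", "book1", "Aarav", 2),
    ("book1_r3", "book1", "Haruka", 5), ("book2_r1", "book2", "Yonik", 5),
    ("book2_r2", "book2", "Maria", 5), ("book4_r1", "book4", "Yonik", 2),
    ("book4_r2", "book4", "Aarav", 5), ("book7_r1", "book7", "Yonik", 4),
    ("book10_r1", "book10", "Maria", 5), ("book11_r1", "book11", "Maria", 1),
    ("book11_r2", "book11", "Aarav", 4), ("book12_r1", "book12", "Yonik", 5),
]


def recommend(reviews, me, min_rating=4):
    # Level 0: books I rated 5 (the inner search expression).
    my_books = {book for _, book, user, rating in reviews
                if user == me and rating == 5}
    # Level 1: walk book_s->book_s, gather user_s, keep high ratings by other users.
    users = {user for _, book, user, rating in reviews
             if book in my_books and rating >= min_rating and user != me}
    # Level 2: walk node->user_s, gather book_s, keep high ratings,
    # skipping books that were already visited at level 0.
    ratings = defaultdict(list)
    for _, book, user, rating in reviews:
        if user in users and rating >= min_rating and book not in my_books:
            ratings[book].append(rating)
    return users, {book: sum(r) / len(r) for book, r in ratings.items()}


users, books = recommend(REVIEWS, "Yonik")
print(sorted(users), books)
```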
  
Retrieving complete traversal

curl $URL -d 'expr=gatherNodes(reviews, [...], scatter="branches,leaves")'

{"result-set":{"docs":[
{"node":"book12","collection":"reviews","field":"book_s","level":0},
{"node":"book1","collection":"reviews","field":"book_s","level":0},
{"node":"book2","collection":"reviews","field":"book_s","level":0},
{"node":"Haruka","collection":"reviews","field":"user_s","level":1},
{"node":"Maria","collection":"reviews","field":"user_s","level":1},
{"node":"book10","avg(rating_i)":5.0,"field":"book_s","level":2,
"collection":"reviews","ancestors":["Maria"]},
{"EOF":true,"RESPONSE_TIME":111}]}}
  
{!graph} single collection/shard version

curl "http://localhost:8983/solr/reviews/query" -d '
q={!graph from=user_s to=user_s
     returnRoot=false traversalFilter=$f1 v=$g1}&
g1={!graph from=book_s to=book_s
     returnRoot=false traversalFilter=$f1 v=$q1}&
q1=user_s:Yonik AND rating_i:5&
f1=rating_i:[4 TO *]
'
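When a client library sends this request instead of curl, the four parameters need URL-encoding because they contain spaces, braces, brackets and `$` references. A minimal sketch using only the Python standard library follows; it builds the form body and sends nothing, and the variable names are illustrative.

```python
from urllib.parse import parse_qs, urlencode

# The same four parameters as the curl command; $g1, $q1 and $f1 are
# references to other request parameters, resolved on the Solr side.
params = {
    "q": "{!graph from=user_s to=user_s returnRoot=false traversalFilter=$f1 v=$g1}",
    "g1": "{!graph from=book_s to=book_s returnRoot=false traversalFilter=$f1 v=$q1}",
    "q1": "user_s:Yonik AND rating_i:5",
    "f1": "rating_i:[4 TO *]",
}

body = urlencode(params)  # application/x-www-form-urlencoded body
decoded = {key: values[0] for key, values in parse_qs(body).items()}

print(body)
print(decoded == params)
```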
  
	
  
	
  
	
  
More graph expressions
• shortestPath
  • Finds the shortest path between "from" and "to"
• scoreNodes: tf-idf inspired scoring
  • wraps a gatherNodes expression that finds the co-occurrence count
  • tf factor – the co-occurrence count
  • idf factor – boosts nodes that are rarer overall
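The idea behind this kind of scoring can be illustrated with a toy ranking function. The weighting below (count times the log of an inverse document frequency) is an assumption chosen for illustration and is not claimed to be the formula scoreNodes uses; `score_nodes` and the counts are hypothetical.

```python
import math


def score_nodes(cooccurrence, doc_freq, num_docs):
    """Rank nodes by co-occurrence count (tf) weighted by overall rarity (idf)."""
    scores = {}
    for node, count in cooccurrence.items():
        # Assumed idf form: rarer nodes (small doc_freq) get a larger weight.
        idf = math.log((num_docs + 1) / (doc_freq[node] + 1))
        scores[node] = count * idf
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical counts: both nodes co-occur 3 times,
# but "rare" appears in far fewer documents overall.
ranking = score_nodes(
    cooccurrence={"common": 3, "rare": 3},
    doc_freq={"common": 900, "rare": 10},
    num_docs=1000,
)
print(ranking)
```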
  
Network analysis and visualization

curl http://localhost:8983/solr/reviews/graph -d 'expr=gatherNodes(reviews, [...],
    scatter="branches,leaves")'

<?xml version="1.0" encoding="UTF-8"?>
<graphml xmlns="http://graphml.graphdrawing.org/xmlns"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://graphml.graphdrawing.org/xmlns http://graphml.graphdrawing.org/xmlns/1.0/graphml.xsd">
<graph id="G" edgedefault="directed">
<node id="book12">
  <data key="field">book_s</data>
  <data key="level">0</data>
</node>
<node id="book1">
  <data key="field">book_s</data>
[...]
  
  
Analyzing Book Reviews w/ JSON Facet API
  
JSON Facet API w/ Book Reviews
• Same books & reviews data set as before, except:
  • Index books and reviews into the same collection
  • Index a book and its reviews into the same shard
    • eliminates cross-shard "edges" between books & reviews
  
compositeId router
[Diagram: a 32-bit hash ring divided among shard1, shard2 and shard3; ids sharing the "book1!" prefix fall within a 16 bit range of the ring]
• id:book1 – full 32 bit hash of "book1"
• id:book1!review1 – top 16 bits of "book1", bottom 16 "review1"
• id:book1!review2 – top 16 bits of "book1", bottom 16 "review2"

• Easy collocation of documents in SolrCloud
• Works right out of the box (it's default!)
• Restrict queries to shards for performance:
    &q=reviewer:yonik AND book_id:book1
    &_route_=book1!
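The bit arithmetic on this slide can be sketched as follows. CRC32 from the standard library is used purely as a stand-in 32-bit hash so the example stays self-contained; it is not the router's real hash function, so the printed values will not match Solr's. `h32` and `composite_hash` are hypothetical names.

```python
import zlib


def h32(text):
    """Stand-in 32-bit hash (CRC32); an assumption, not the router's real hash."""
    return zlib.crc32(text.encode("utf-8")) & 0xFFFFFFFF


def composite_hash(doc_id):
    """Top 16 bits from the prefix hash, bottom 16 bits from the suffix hash."""
    if "!" not in doc_id:
        return h32(doc_id)  # plain ids use the full 32-bit hash
    prefix, suffix = doc_id.split("!", 1)
    return (h32(prefix) & 0xFFFF0000) | (h32(suffix) & 0x0000FFFF)


for doc_id in ("book1", "book1!review1", "book1!review2"):
    print(doc_id, format(composite_hash(doc_id), "08x"))
```

Because all three ids share the same top 16 bits, they land in the same 16 bit slice of the ring, and therefore on the same shard as long as that slice is not split across shards.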
  
Refresher: Facet commands and Domains
• Domain: A set of documents
• Facet command: create sub-domains / "facet buckets"
[Diagram: a root Domain feeds Facet Command A, which produces sub-domains; Facet Commands B and C split those into further Domains]
  
Unique authors, books by genre

curl http://localhost:8983/solr/books/query -d '
q=cat:*&
json.facet=
{
  num_authors : "hll(author)",
  genres : {
    type: terms,
    field: cat
  }
}
'

[…]
  "facets":{
    "count":13,
    "num_authors":5,
    "genres":{
      "buckets":[{
          "val":"fantasy",
          "count":7},
        {
          "val":"sci-fi",
          "count":6}]}}}

Callouts:
• q=cat:* – root domain defined by docs matching the query
• hll(author) – hyper-log-log distributed cardinality function
• genres – one bucket per unique value in the "cat" field
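What this request computes can be mimicked in a few lines of Python over the books indexed earlier. The sketch is illustrative only: `facet` is a hypothetical helper, and it counts distinct authors exactly, whereas hll() is an estimate. On the 12 books listed earlier it reports 12 documents, 4 authors and 6 books per genre; the response on the slide (13, 5, 7 and 6) appears to come from a slightly different index.

```python
from collections import Counter

# (id, cat, author) taken from the books CSV on the earlier slide.
BOOKS = [
    ("book1", "fantasy", "George R.R. Martin"),
    ("book2", "fantasy", "George R.R. Martin"),
    ("book3", "fantasy", "George R.R. Martin"),
    ("book4", "sci-fi", "Iain M. Banks"),
    ("book5", "sci-fi", "Iain M. Banks"),
    ("book6", "sci-fi", "Iain M. Banks"),
    ("book7", "fantasy", "Glen Cook"),
    ("book8", "fantasy", "Glen Cook"),
    ("book9", "fantasy", "Glen Cook"),
    ("book10", "sci-fi", "Neal Asher"),
    ("book11", "sci-fi", "Neal Asher"),
    ("book12", "sci-fi", "Neal Asher"),
]


def facet(books):
    """Root domain = all given books; one terms bucket per distinct category."""
    per_cat = Counter(cat for _, cat, _ in books)
    return {
        "count": len(books),
        "num_authors": len({author for _, _, author in books}),  # exact count
        "genres": {"buckets": [{"val": cat, "count": n}
                               for cat, n in per_cat.most_common()]},
    }


result = facet(BOOKS)
print(result)
```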
  
domain change: join
  
Number of book reviews per genre

json.facet={
  genres : {
    type: terms,
    field: cat,
    facet: {
      reviews : {
        type: query,
        domain:{join:{from:id,
                      to:book_s}}
      }
    }
  }
}

  "facets":{
    "count":13,
    "genres":{
      "buckets":[{
          "val":"fantasy",
          "count":7,
          "reviews":{
            "count":7}},
        {
          "val":"sci-fi",
          "count":6,
          "reviews":{
            "count":5}}]}}}

Callouts:
• reviews – Calculated per-bucket
• domain – domain switch! happens before faceting
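The per-bucket domain switch can be modelled directly: for each genre bucket, take the ids of its books and follow them to the reviews whose `book_s` matches. The sketch below uses the books and reviews listed earlier and also averages the ratings over the joined domain; `reviews_per_genre` is a hypothetical helper. Its figures differ from the response on the slide, which appears to come from a somewhat different index.

```python
# Genre of each book, and (book_s, rating_i) pairs, from the earlier CSV slides.
BOOK_CAT = {
    "book1": "fantasy", "book2": "fantasy", "book3": "fantasy",
    "book4": "sci-fi", "book5": "sci-fi", "book6": "sci-fi",
    "book7": "fantasy", "book8": "fantasy", "book9": "fantasy",
    "book10": "sci-fi", "book11": "sci-fi", "book12": "sci-fi",
}
REVIEWS = [
    ("book1", 5), ("book1", 2), ("book1", 5), ("book2", 5), ("book2", 5),
    ("book4", 2), ("book4", 5), ("book7", 4), ("book10", 5),
    ("book11", 1), ("book11", 4), ("book12", 5),
]


def reviews_per_genre(book_cat, reviews):
    buckets = {}
    for genre in dict.fromkeys(book_cat.values()):
        # Parent bucket domain: the books in this genre.
        book_ids = {book for book, cat in book_cat.items() if cat == genre}
        # Domain change: join from the books' id to the reviews' book_s.
        ratings = [rating for book, rating in reviews if book in book_ids]
        buckets[genre] = {
            "count": len(book_ids),
            "reviews": {
                "count": len(ratings),
                "rating": sum(ratings) / len(ratings) if ratings else None,
            },
        }
    return buckets


buckets = reviews_per_genre(BOOK_CAT, REVIEWS)
print(buckets)
```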
  
Average rating for each genre

json.facet={
  genres : {
    type: terms,
    field: cat,
    facet: {
      reviews : {
        type: query,
        domain:{join:{from:id,
                      to:book_s}},
        facet: {
          rating:"avg(rating_i)"
        }
      }}}}

  "facets":{
    "count":13,
    "genres":{
      "buckets":[{
          "val":"fantasy",
          "count":7,
          "reviews":{
            "count":7,
            "rating":3.857142}},
        {
          "val":"sci-fi",
          "count":6,
          "reviews":{
            "count":5,
            "rating":4.2}}]}}}
  
Who gives the highest ratings per genre?

json.facet={
  genres : {
    type: terms,
    field: cat,
    facet: {
      reviews : {
        type: terms, field: user_s,
        sort: "rating desc", limit:3,
        domain:{join:{from:id,
                      to:book_s}},
        facet: {
          rating:"avg(rating_i)"
        }
[...]

  "facets":{
    "count":13,
    "genres":{
      "buckets":[{
          "val":"fantasy",
          "count":7,
          "reviews":{
            "buckets":[
              { "val":"Haruka",
                "count":1,
                "rating":5.0},
              { "val":"Yonik",
                "count":3,
                "rating":4.66666667},
              { "val":"Maria",
                "count":2,
                "rating":3.0}]}},
        { "val":"sci-fi",
          "count":6,
          "reviews":{
            "buckets":[
  
Histogram: average rating trends over time

json.facet={
  genres : {
    type: terms,
    field: cat,
    facet: {
      reviews : {
        domain:{join:{from:id, to:book_s}},
        type: range,
        field: review_date_i,
        start: 1980,
        end: 2020,
        gap: 10,
        facet: { rating:"avg(rating_i)" }
}}}}

  "facets":{
    "count":13,
    "genres":{
      "buckets":[{
          "val":"fantasy",
          "count":7,
          "reviews":{
            "buckets":[
              { "val":1980,
                "count":1323,
                "rating":3.17},
              { "val":1990,
                "count":1452,
                "rating":3.26},
              { "val":2000,
                "count":1559,
                "rating":3.48},
              { "val":2010,
                "count":1793,
                "rating":3.54}]}},
        {
          "val":"sci-fi",
  
Streaming Expressions vs JSON Facets
  
JSON Facet API
• More focused on web-scale interactive responses
• Tighter integration
• Just another search component
• Utilizes existing distributed search framework
• Single request-response top-N, grouping, highlighting, faceting, etc.
• Multiple-facets in single request
• Block join / nested document support
  • Document centric
  
Streaming Expressions
• More general purpose, larger scope
  • Wrap streams within streams to do pretty much anything
  • Not tied to documents (analytics across joins w/ external DBs)
  • Update streams, machine learning streams, etc.
• Exact results in distributed mode (e.g. cardinality)
• Distributed joins, graph
• Synergy: Increasingly works with JSON Facet API to push down work to leaves

Thank you!
  

Contenu connexe

Tendances

Automatically Build Solr Synonyms List using Machine Learning - Chao Han, Luc...
Automatically Build Solr Synonyms List using Machine Learning - Chao Han, Luc...Automatically Build Solr Synonyms List using Machine Learning - Chao Han, Luc...
Automatically Build Solr Synonyms List using Machine Learning - Chao Han, Luc...Lucidworks
 
Apache Sentry for Hadoop security
Apache Sentry for Hadoop securityApache Sentry for Hadoop security
Apache Sentry for Hadoop securitybigdatagurus_meetup
 
Web Development with Laravel 5
Web Development with Laravel 5Web Development with Laravel 5
Web Development with Laravel 5Soheil Khodayari
 
Hadoop Summit 2012 | Optimizing MapReduce Job Performance
Hadoop Summit 2012 | Optimizing MapReduce Job PerformanceHadoop Summit 2012 | Optimizing MapReduce Job Performance
Hadoop Summit 2012 | Optimizing MapReduce Job PerformanceCloudera, Inc.
 
Building Robust ETL Pipelines with Apache Spark
Building Robust ETL Pipelines with Apache SparkBuilding Robust ETL Pipelines with Apache Spark
Building Robust ETL Pipelines with Apache SparkDatabricks
 
Metadata is a Love Note to the Future
Metadata is a Love Note to the FutureMetadata is a Love Note to the Future
Metadata is a Love Note to the FutureRachel Lovinger
 
Streaming Data Lakes using Kafka Connect + Apache Hudi | Vinoth Chandar, Apac...
Streaming Data Lakes using Kafka Connect + Apache Hudi | Vinoth Chandar, Apac...Streaming Data Lakes using Kafka Connect + Apache Hudi | Vinoth Chandar, Apac...
Streaming Data Lakes using Kafka Connect + Apache Hudi | Vinoth Chandar, Apac...HostedbyConfluent
 
[제1회 루씬 한글분석기 기술세미나] solr로 나만의 검색엔진을 만들어보자
[제1회 루씬 한글분석기 기술세미나] solr로 나만의 검색엔진을 만들어보자[제1회 루씬 한글분석기 기술세미나] solr로 나만의 검색엔진을 만들어보자
[제1회 루씬 한글분석기 기술세미나] solr로 나만의 검색엔진을 만들어보자Donghyeok Kang
 
Mantenimiento de la base de datos Oracle 11g
Mantenimiento de la base de datos Oracle 11gMantenimiento de la base de datos Oracle 11g
Mantenimiento de la base de datos Oracle 11gCarmen Soler
 
HBase at Bloomberg: High Availability Needs for the Financial Industry
HBase at Bloomberg: High Availability Needs for the Financial IndustryHBase at Bloomberg: High Availability Needs for the Financial Industry
HBase at Bloomberg: High Availability Needs for the Financial IndustryHBaseCon
 
Hive Tutorial | Hive Architecture | Hive Tutorial For Beginners | Hive In Had...
Hive Tutorial | Hive Architecture | Hive Tutorial For Beginners | Hive In Had...Hive Tutorial | Hive Architecture | Hive Tutorial For Beginners | Hive In Had...
Hive Tutorial | Hive Architecture | Hive Tutorial For Beginners | Hive In Had...Simplilearn
 
Apache Solr crash course
Apache Solr crash courseApache Solr crash course
Apache Solr crash courseTommaso Teofili
 
Introduction to HiveQL
Introduction to HiveQLIntroduction to HiveQL
Introduction to HiveQLkristinferrier
 
Apache Mahout Architecture Overview
Apache Mahout Architecture OverviewApache Mahout Architecture Overview
Apache Mahout Architecture OverviewStefano Dalla Palma
 
Hive spark-s3acommitter-hbase-nfs
Hive spark-s3acommitter-hbase-nfsHive spark-s3acommitter-hbase-nfs
Hive spark-s3acommitter-hbase-nfsYifeng Jiang
 
Apache Calcite (a tutorial given at BOSS '21)
Apache Calcite (a tutorial given at BOSS '21)Apache Calcite (a tutorial given at BOSS '21)
Apache Calcite (a tutorial given at BOSS '21)Julian Hyde
 
Building a real time, solr-powered recommendation engine
Building a real time, solr-powered recommendation engineBuilding a real time, solr-powered recommendation engine
Building a real time, solr-powered recommendation engineTrey Grainger
 

Tendances (20)

Automatically Build Solr Synonyms List using Machine Learning - Chao Han, Luc...
Automatically Build Solr Synonyms List using Machine Learning - Chao Han, Luc...Automatically Build Solr Synonyms List using Machine Learning - Chao Han, Luc...
Automatically Build Solr Synonyms List using Machine Learning - Chao Han, Luc...
 
Apache Sentry for Hadoop security
Apache Sentry for Hadoop securityApache Sentry for Hadoop security
Apache Sentry for Hadoop security
 
Web Development with Laravel 5
Web Development with Laravel 5Web Development with Laravel 5
Web Development with Laravel 5
 
Hadoop Summit 2012 | Optimizing MapReduce Job Performance
Hadoop Summit 2012 | Optimizing MapReduce Job PerformanceHadoop Summit 2012 | Optimizing MapReduce Job Performance
Hadoop Summit 2012 | Optimizing MapReduce Job Performance
 
Building Robust ETL Pipelines with Apache Spark
Building Robust ETL Pipelines with Apache SparkBuilding Robust ETL Pipelines with Apache Spark
Building Robust ETL Pipelines with Apache Spark
 
Metadata is a Love Note to the Future
Metadata is a Love Note to the FutureMetadata is a Love Note to the Future
Metadata is a Love Note to the Future
 
Streaming Data Lakes using Kafka Connect + Apache Hudi | Vinoth Chandar, Apac...
Streaming Data Lakes using Kafka Connect + Apache Hudi | Vinoth Chandar, Apac...Streaming Data Lakes using Kafka Connect + Apache Hudi | Vinoth Chandar, Apac...
Streaming Data Lakes using Kafka Connect + Apache Hudi | Vinoth Chandar, Apac...
 
[제1회 루씬 한글분석기 기술세미나] solr로 나만의 검색엔진을 만들어보자
[제1회 루씬 한글분석기 기술세미나] solr로 나만의 검색엔진을 만들어보자[제1회 루씬 한글분석기 기술세미나] solr로 나만의 검색엔진을 만들어보자
[제1회 루씬 한글분석기 기술세미나] solr로 나만의 검색엔진을 만들어보자
 
Internal Hive
Internal HiveInternal Hive
Internal Hive
 
Mantenimiento de la base de datos Oracle 11g
Mantenimiento de la base de datos Oracle 11gMantenimiento de la base de datos Oracle 11g
Mantenimiento de la base de datos Oracle 11g
 
HBase at Bloomberg: High Availability Needs for the Financial Industry
HBase at Bloomberg: High Availability Needs for the Financial IndustryHBase at Bloomberg: High Availability Needs for the Financial Industry
HBase at Bloomberg: High Availability Needs for the Financial Industry
 
Hive Tutorial | Hive Architecture | Hive Tutorial For Beginners | Hive In Had...
Hive Tutorial | Hive Architecture | Hive Tutorial For Beginners | Hive In Had...Hive Tutorial | Hive Architecture | Hive Tutorial For Beginners | Hive In Had...
Hive Tutorial | Hive Architecture | Hive Tutorial For Beginners | Hive In Had...
 
Apache Solr crash course
Apache Solr crash courseApache Solr crash course
Apache Solr crash course
 
Introduction to HiveQL
Introduction to HiveQLIntroduction to HiveQL
Introduction to HiveQL
 
Apache Mahout Architecture Overview
Apache Mahout Architecture OverviewApache Mahout Architecture Overview
Apache Mahout Architecture Overview
 
OpenRefine
OpenRefineOpenRefine
OpenRefine
 
Spark sql
Spark sqlSpark sql
Spark sql
 
Hive spark-s3acommitter-hbase-nfs
Hive spark-s3acommitter-hbase-nfsHive spark-s3acommitter-hbase-nfs
Hive spark-s3acommitter-hbase-nfs
 
Apache Calcite (a tutorial given at BOSS '21)
Apache Calcite (a tutorial given at BOSS '21)Apache Calcite (a tutorial given at BOSS '21)
Apache Calcite (a tutorial given at BOSS '21)
 
Building a real time, solr-powered recommendation engine
Building a real time, solr-powered recommendation engineBuilding a real time, solr-powered recommendation engine
Building a real time, solr-powered recommendation engine
 


Analytics and Graph Traversal with Solr - Yonik Seeley, Cloudera

  • 1. Analytics and Graph Traversal with Solr. Yonik Seeley, Cloudera
  • 2. My Background (© Cloudera, Inc. All rights reserved.)
      • Creator of Solr
      • Cloudera Engineer
      • LucidWorks Co-Founder
      • Lucene/Solr committer, PMC member
      • Apache Software Foundation member
      • M.S. in Computer Science, Stanford
  • 3. Graph Traversal
  • 4. Graph Databases
      • Graph Databases are all about Nodes and Edges (relationships)
      [Diagram: nodes Stanford, RPI, Ann, Cloudera, NJ, Mike; edges labeled attended (three times), recommended, works_at, lives_in]
  • 5. Properties
      [Diagram: Ann attended Stanford, with properties on both nodes and on the edge]
      • Ann: bday: 5/01/1970
      • attended (edge): start: 1992, end: 1993, degree: MS, subject: Computer Science
      • Stanford: type: private, opened: 1891, location: Stanford, CA
  • 6. Graph to Document Mapping: relationships without properties
      • Only index the nodes
          • properties are field values
          • nodes without properties can be skipped
      • Edges defined at query-time only
          • implicit based on field value matches
      • Example: Node1 (id: node1, relation: node2) points to Node2 (id: node2)
  • 7. Graph to Document Mapping: relationships with properties
      • Model the relationship as a document
      • "Pointers" can be field values on any of the documents
      [Diagram: Relationship1 between Node1 and Node2, drawn twice]
      • Pointers on the relationship document: id: relationship1, target1: node1, target2: node2 (OR target: [node1, node2])
      • Pointers on the node documents: id: node1, rel: relationship1 and id: node2, rel: relationship1
  • 8. Document Mapping
      • type: edu, name: Stanford, opened: 1891, address: Stanford, CA, state: CA
      • type: attendance, who: Ann, where: Stanford, start: 1992, end: 1993, degree: MS, subject: Computer Science
      • type: person, name: Ann, bday: 5/01/1970, address: Branchburg, NJ, state: NJ
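The mapping on slides 6 to 8 can be sketched with plain dictionaries standing in for Solr documents. This is only an in-memory illustration: the `follow` helper is hypothetical, and the assumption that `who` and `where` point at the `name` field of the node documents is inferred from the slide values, not stated on them.

```python
# Three flat documents mirroring slide 8: two nodes and one relationship.
docs = [
    {"type": "edu", "name": "Stanford", "opened": 1891,
     "address": "Stanford, CA", "state": "CA"},
    {"type": "attendance", "who": "Ann", "where": "Stanford",
     "start": 1992, "end": 1993, "degree": "MS", "subject": "Computer Science"},
    {"type": "person", "name": "Ann", "bday": "5/01/1970",
     "address": "Branchburg, NJ", "state": "NJ"},
]


def follow(docs, source, pointer_field, target_field):
    """Hypothetical helper: resolve an edge at query time by field match.

    Returns every other document whose target_field equals the value
    stored in source[pointer_field].
    """
    value = source.get(pointer_field)
    if value is None:
        return []
    return [d for d in docs if d is not source and d.get(target_field) == value]


attendance = docs[1]
# Assumption: "who" and "where" are pointers to the "name" field.
print(follow(docs, attendance, "who", "name"))    # the person document for Ann
print(follow(docs, attendance, "where", "name"))  # the edu document for Stanford
```

No edge is stored anywhere: the relationship document only holds field values, and the edge exists once a query matches those values against another field.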
  • 9. Graph Query
  • 10. Graph Query (filter)
      • Breadth-first graph traversal
      • Modeled as a normal Query
          • usable as main query, filter query, facet query, input to another query, etc.
          • cached by default in filterCache
      q={!graph from=nodeIdField to=edgeIdField}<starting_query>
      • Output is a set of documents
          • edges are defined by matches between the fromField and toField
          • each iteration moves to nodes identified by the edge field
  • 11. Graph Filter (continued)
      • Optional arguments
          • maxDepth: maximum number of hops from the root
          • traversalFilter: arbitrary query applied to nodes on each hop
          • returnRoot: (true/false) include the root in the final set
          • leafNodesOnly: (true/false) return only docs without a value in the "to" field
      • NOTE: {!graph} isn't (currently) distributed!
          • Edges are only followed within a shard
          • Still useful, and compatible with distributed search
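To make the traversal and its optional arguments concrete, here is a simplified in-memory model of the breadth-first walk described on slides 10 and 11. It is not Solr's implementation. The function `graph_query`, its Python parameter names, and the four-document `DOCS` graph are all hypothetical; the option names follow the slide, so check the Solr reference guide for the exact parameter names in your version.

```python
def field_values(doc, field):
    """Return a field's values as a list (single- or multi-valued)."""
    v = doc.get(field)
    if v is None:
        return []
    return list(v) if isinstance(v, (list, tuple, set)) else [v]


def graph_query(docs, is_root, from_field, to_field, max_depth=-1,
                return_root=True, leaf_nodes_only=False, traversal_filter=None):
    """Breadth-first traversal over documents.

    An edge exists when a value in a visited doc's to_field matches a
    value in another doc's from_field.
    """
    roots = {i for i, d in enumerate(docs) if is_root(d)}
    visited = set(roots)
    frontier = sorted(roots)
    depth = 0
    while frontier and (max_depth < 0 or depth < max_depth):
        # Values on the outgoing-edge field of everything reached last hop.
        edge_values = {v for i in frontier for v in field_values(docs[i], to_field)}
        frontier = [
            i for i, d in enumerate(docs)
            if i not in visited
            and edge_values.intersection(field_values(d, from_field))
            and (traversal_filter is None or traversal_filter(d))
        ]
        visited.update(frontier)
        depth += 1
    result = [i for i in sorted(visited) if return_root or i not in roots]
    if leaf_nodes_only:
        result = [i for i in result if not field_values(docs[i], to_field)]
    return [docs[i] for i in result]


# Hypothetical graph: a -> b, a -> c, b -> d.
DOCS = [
    {"id": "a", "edge": ["b", "c"]},
    {"id": "b", "edge": ["d"]},
    {"id": "c"},
    {"id": "d"},
]


def ids(result):
    return [d["id"] for d in result]


def is_a(d):
    return d["id"] == "a"


print(ids(graph_query(DOCS, is_a, "id", "edge")))                      # ['a', 'b', 'c', 'd']
print(ids(graph_query(DOCS, is_a, "id", "edge", max_depth=1)))         # ['a', 'b', 'c']
print(ids(graph_query(DOCS, is_a, "id", "edge", return_root=False,
                      max_depth=1)))                                   # ['b', 'c']
print(ids(graph_query(DOCS, is_a, "id", "edge", leaf_nodes_only=True)))  # ['c', 'd']
```

Because the traversal filter is applied on every hop, filtering out a node also cuts off everything that is only reachable through it.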
  • 12. Twitter Example
      q={!graph from=user_id to=following}name:Yonik
      Documents:
          user_id: lucene_solr, name: Yonik Seeley, following: [heismark,shalinmangar]
          user_id: heismark, name: Mark Miller, following: [lucene_solr,GRRMSpeaking,...]
          user_id: shalinmangar, name: Shalin Mangar, following: [romseygeek,_hossman,...]
      • Finds everyone that Yonik follows, everyone they follow in turn, and so on
  • 13. {!graph} vs {!join}
      q={!join from=following to=user_id}name:Yonik
      q={!graph from=user_id to=following maxDepth=1 returnRoot=false}name:Yonik
      • pseudo-join filter query {!join} == single-step {!graph}
      • Note the from/to switch (a discrepancy caught too late!)
          • graph: travels "to" nodes identified by the edge field
          • join: looks at values in the "from" field and travels to documents with those values in the "to" field
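The from/to switch is easier to see side by side. The sketch below models one `{!join}` step and one `{!graph maxDepth=1 returnRoot=false}` step over the three Twitter documents from slide 12. The helpers `join` and `graph_one_hop` are hypothetical in-memory stand-ins, the `name:Yonik` query is approximated by a substring match, and the entries elided with `...` on the slide are simply left out.

```python
# Documents from slide 12; entries elided with "..." on the slide are omitted.
docs = [
    {"user_id": "lucene_solr", "name": "Yonik Seeley",
     "following": ["heismark", "shalinmangar"]},
    {"user_id": "heismark", "name": "Mark Miller",
     "following": ["lucene_solr", "GRRMSpeaking"]},
    {"user_id": "shalinmangar", "name": "Shalin Mangar",
     "following": ["romseygeek", "_hossman"]},
]


def as_list(v):
    """Normalize a missing, single, or multi-valued field to a list."""
    if v is None:
        return []
    return list(v) if isinstance(v, (list, tuple, set)) else [v]


def join(docs, start, from_field, to_field):
    """{!join}-style step: read from_field on the start docs, then return
    docs that carry one of those values in to_field."""
    wanted = {v for d in start for v in as_list(d.get(from_field))}
    return [d for d in docs if wanted.intersection(as_list(d.get(to_field)))]


def graph_one_hop(docs, start, from_field, to_field):
    """{!graph maxDepth=1 returnRoot=false}-style step: read to_field (the
    edge field) on the start docs, then return other docs whose from_field
    (the node id field) matches."""
    wanted = {v for d in start for v in as_list(d.get(to_field))}
    return [d for d in docs
            if d not in start and wanted.intersection(as_list(d.get(from_field)))]


# Stand-in for the starting query name:Yonik.
start = [d for d in docs if "Yonik" in d["name"]]

via_join = join(docs, start, from_field="following", to_field="user_id")
via_graph = graph_one_hop(docs, start, from_field="user_id", to_field="following")

print([d["user_id"] for d in via_join])   # ['heismark', 'shalinmangar']
print([d["user_id"] for d in via_graph])  # ['heismark', 'shalinmangar']
```

Both calls read the `following` field on Yonik's document and land on documents by `user_id`; only the roles of the `from` and `to` arguments are swapped.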
  • 14. Graph Streaming Expressions
  • 15. Graph streaming expressions
      • Breadth-first graph traversals
      • Part of streaming expressions
          • fully distributed
          • cross collections as well as shards
          • parallelizable
  • 16. Graph streaming expressions example
      • Index some books in one collection
      curl http://localhost:8983/solr/books/update -H 'Content-type:text/csv' -d '
      id,cat,pubyear_i,title,author,series_s,sequence_i
      book1,fantasy,2000,A Storm of Swords,George R.R. Martin,A Song of Ice and Fire,3
      book2,fantasy,2005,A Feast for Crows,George R.R. Martin,A Song of Ice and Fire,4
      book3,fantasy,2011,A Dance with Dragons,George R.R. Martin,A Song of Ice and Fire,5
      book4,sci-fi,1987,Consider Phlebas,Iain M. Banks,The Culture,1
      book5,sci-fi,1988,The Player of Games,Iain M. Banks,The Culture,2
      book6,sci-fi,1990,Use of Weapons,Iain M. Banks,The Culture,3
      book7,fantasy,1984,Shadows Linger,Glen Cook,The Black Company,2
      book8,fantasy,1984,The White Rose,Glen Cook,The Black Company,3
      book9,fantasy,1989,Shadow Games,Glen Cook,The Black Company,4
      book10,sci-fi,2001,Gridlinked,Neal Asher,Ian Cormac,1
      book11,sci-fi,2003,The Line of Polity,Neal Asher,Ian Cormac,2
      book12,sci-fi,2005,Brass Man,Neal Asher,Ian Cormac,3
      '
  • 17. Graph streaming expressions example
      • Index some book reviews into another collection
      curl http://localhost:8983/solr/reviews/update -H 'Content-type:text/csv' -d '
      id,book_s,user_s,rating_i,review_t
      book1_r1,book1,Yonik,5,awesome book!
      book1_r2,book1,Aarav,2,too bloody
      book1_r3,book1,Haruka,5,awesome world building
      book2_r1,book2,Yonik,5,another great one
      book2_r2,book2,Maria,5,wow!
      book4_r1,book4,Yonik,2,i am lying... actually liked it
      book4_r2,book4,Aarav,5,Loved it
      book7_r1,book7,Yonik,4,read back in college but it was good
      book10_r1,book10,Maria,5,I want a gridlink!
      book11_r1,book11,Maria,1,Blech
      book11_r2,book11,Aarav,4,is this the first book?
      book12_r1,book12,Yonik,5,Mr. Crane is scary...
      '
      1. Find books I like
      2. Find who else rated those books highly
      3. Find other books they rated highly
      4. Profit!
  • 18. 1. Search expression to find my high ratings
      URL="http://localhost:8983/solr/reviews/stream"

      # Use a search expression to find reviews where I gave the book a "5"
      curl $URL -d 'expr=search(reviews, q="user_s:Yonik AND rating_i:5",
          fl="id,book_s,user_s,rating_i", sort="user_s asc")'

      {"result-set":{"docs":[
      {"rating_i":5,"id":"book2_r1","user_s":"Yonik","book_s":"book2"},
      {"rating_i":5,"id":"book1_r1","user_s":"Yonik","book_s":"book1"},
      {"rating_i":5,"id":"book12_r1","user_s":"Yonik","book_s":"book12"},
      {"EOF":true,"RESPONSE_TIME":4}]}}
  • 19. 2. gatherNodes expression to find users
      curl $URL -d 'expr=gatherNodes(reviews,
          search(reviews, q="user_s:Yonik AND rating_i:5",
                 fl="book_s,user_s,rating_i", sort="user_s asc"),
          walk="book_s->book_s",
          gather="user_s",
          fq="rating_i:[4 TO *] -user_s:Yonik",
          trackTraversal=true)'

      {"result-set":{"docs":[
      {"node":"Haruka","collection":"reviews","field":"user_s","ancestors":["book1"],"level":1},
      {"node":"Maria","collection":"reviews","field":"user_s","ancestors":["book2"],"level":1},
      {"EOF":true,"RESPONSE_TIME":22}]}}
      Callout on the "node" entries: "gather" values
  • 20. 3. gatherNodes to find high ratings by those users
      curl $URL -d 'expr=gatherNodes(reviews,
          gatherNodes(reviews,
              search(reviews, q="user_s:Yonik AND rating_i:5",
                     fl="id,book_s,user_s,rating_i", sort="user_s asc"),
              walk="book_s->book_s",
              gather="user_s", fq="rating_i:[4 TO *] -user_s:Yonik"),
          walk="node->user_s", gather="book_s", fq="rating_i:[4 TO *]",
          avg(rating_i),
          trackTraversal=true)'

      {"result-set":{"docs":[
      {"node":"book10","avg(rating_i)":5.0,"field":"book_s","level":2,"collection":"reviews","ancestors":["Maria"]},
      {"EOF":true,"RESPONSE_TIME":65}]}}
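The three steps (my top-rated books, then like-minded users, then their top-rated books) can be traced by hand. Below is a minimal in-memory sketch of the same breadth-first walk over the reviews listed earlier. The `recommend` function is hypothetical, and it assumes the traversal skips nodes it has already visited, which is consistent with book1 and book2 being absent from the level-2 output above.

```python
from collections import defaultdict

# (book_s, user_s, rating_i) triples copied from the reviews CSV in the talk.
REVIEWS = [
    ("book1", "Yonik", 5), ("book1", "Aarav", 2), ("book1", "Haruka", 5),
    ("book2", "Yonik", 5), ("book2", "Maria", 5),
    ("book4", "Yonik", 2), ("book4", "Aarav", 5),
    ("book7", "Yonik", 4),
    ("book10", "Maria", 5),
    ("book11", "Maria", 1), ("book11", "Aarav", 4),
    ("book12", "Yonik", 5),
]


def recommend(me, min_rating=4):
    # Level 0: books I rated 5 (the inner search expression).
    roots = {b for b, u, r in REVIEWS if u == me and r == 5}
    # Level 1: walk book_s->book_s, gather user_s, keeping only high
    # ratings from other users.
    users = {u for b, u, r in REVIEWS
             if b in roots and r >= min_rating and u != me}
    # Level 2: walk node->user_s, gather book_s, averaging the ratings on the
    # traversed edges and skipping books already seen at level 0.
    ratings = defaultdict(list)
    for b, u, r in REVIEWS:
        if u in users and r >= min_rating and b not in roots:
            ratings[b].append(r)
    books = {b: sum(rs) / len(rs) for b, rs in ratings.items()}
    return roots, users, books


roots, users, books = recommend("Yonik")
print(sorted(roots), sorted(users), books)
```

On this data the sketch gathers Haruka and Maria at level 1 and only book10, with an average rating of 5.0, at level 2.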
  • 21. Retrieving complete traversal
      curl $URL -d 'expr=gatherNodes(reviews, [...], scatter="branches,leaves")'

      {"result-set":{"docs":[
      {"node":"book12","collection":"reviews","field":"book_s","level":0},
      {"node":"book1","collection":"reviews","field":"book_s","level":0},
      {"node":"book2","collection":"reviews","field":"book_s","level":0},
      {"node":"Haruka","collection":"reviews","field":"user_s","level":1},
      {"node":"Maria","collection":"reviews","field":"user_s","level":1},
      {"node":"book10","avg(rating_i)":5.0,"field":"book_s","level":2,
       "collection":"reviews","ancestors":["Maria"]},
      {"EOF":true,"RESPONSE_TIME":111}]}}
  • 22. {!graph} single collection/shard version
      curl "http://localhost:8983/solr/reviews/query" -d '
      q={!graph from=user_s to=user_s
          returnRoot=false traversalFilter=$f1 v=$g1}&
      g1={!graph from=book_s to=book_s
          returnRoot=false traversalFilter=$f1 v=$q1}&
      q1=user_s:Yonik AND rating_i:5&
      f1=rating_i:[4 TO *]
      '
  • 23. More graph expressions
      • shortestPath
          • Finds the shortest path between "from" and "to"
      • scoreNodes: tf-idf inspired scoring
          • wraps a gatherNodes expression that finds the co-occurrence count
          • tf factor: the co-occurrence count
          • idf factor: boosts nodes that are rarer overall
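The two factors can be combined in the usual tf-idf way. The sketch below is an illustration only: it assumes the classic `count * log(N / docFreq)` form, which the talk does not spell out, and the node names and numbers are made up.

```python
import math


def score_nodes(nodes, num_docs):
    """Rank (node, cooccurrence_count, doc_freq) tuples tf-idf style.

    Assumed formula: score = count * log(num_docs / doc_freq). The exact
    formula used by scoreNodes is not given in the talk.
    """
    scored = []
    for node, count, doc_freq in nodes:
        idf = math.log(num_docs / doc_freq)  # larger for rarer nodes
        scored.append((node, count * idf))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical nodes with the same co-occurrence count but different rarity.
ranked = score_nodes([("common_item", 3, 500), ("rare_item", 3, 5)], num_docs=1000)
print(ranked)
```

With equal co-occurrence counts, the node that appears in fewer documents overall ranks first, which is the boost described in the idf bullet.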
  • 24. Network analysis and visualization
      curl http://localhost:8983/solr/reviews/graph -d 'expr=gatherNodes(reviews, [...],
          scatter="branches,leaves")'

      <?xml version="1.0" encoding="UTF-8"?>
      <graphml xmlns="http://graphml.graphdrawing.org/xmlns"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://graphml.graphdrawing.org/xmlns http://graphml.graphdrawing.org/xmlns/1.0/graphml.xsd">
      <graph id="G" edgedefault="directed">
      <node id="book12">
          <data key="field">book_s</data>
          <data key="level">0</data>
      </node>
      <node id="book1">
          <data key="field">book_s</data>
      [...]
  • 26. Analyzing Book Reviews w/ JSON Facet API
  • 27. JSON Facet API w/ Book Reviews
      • Same books & reviews data set as before, except:
          • Index books and reviews into the same collection
          • Index a book and its reviews into the same shard
          • eliminates cross-shard "edges" between books & reviews
  • 28. compositeId router
      [Diagram: a 32-bit hash ring divided among shard1, shard2 and shard3, with "a 16 bit range" marked for the ids below]
      id:book1            full 32 bit hash of "book1"
      id:book1!review1    top 16 bits of "book1", bottom 16 "review1"
      id:book1!review2    top 16 bits of "book1", bottom 16 "review2"
      • Easy collocation of documents in SolrCloud
      • Works right out of the box (it's default!)
      • Restrict queries to shards for performance:
          &q=reviewer:yonik AND book_id:book1
          &_route_=book1!
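The bit-splitting described on this slide can be sketched in a few lines. This is an illustration, not Solr's router: `zlib.crc32` stands in for Solr's real hash function, and `shard_for` assumes shard ranges are decided by the top 16 bits only, which is a simplification. All function names are hypothetical.

```python
import zlib


def h32(text):
    """Stand-in 32-bit hash (assumption: Solr uses a different function)."""
    return zlib.crc32(text.encode("utf-8")) & 0xFFFFFFFF


def composite_hash(doc_id):
    """Top 16 bits from the part before '!', bottom 16 from the rest.
    Ids without '!' use the full 32-bit hash."""
    if "!" in doc_id:
        prefix, rest = doc_id.split("!", 1)
        return (h32(prefix) & 0xFFFF0000) | (h32(rest) & 0x0000FFFF)
    return h32(doc_id)


def shard_for(doc_id, num_shards=3):
    """Map the top 16 bits onto equal slices of the ring (simplified)."""
    top16 = composite_hash(doc_id) >> 16
    return (top16 * num_shards) >> 16


for doc_id in ("book1", "book1!review1", "book1!review2", "book1!"):
    print(doc_id, hex(composite_hash(doc_id)), "shard", shard_for(doc_id))
```

Because every id that starts with `book1!` shares the top 16 bits of the hash of `book1`, the book and all of its reviews land in the same slice of the ring. A route key such as `book1!` resolves to that same slice.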
  • 29. Refresher: Facet commands and Domains
      • Domain: A set of documents
      • Facet command: create sub-domains / "facet buckets"
      [Diagram: Domains feeding Facet Commands A, B and C, each of which produces further Domains]
  • 30. Unique authors, books by genre
      curl http://localhost:8983/solr/books/query -d '
      q=cat:*&
      json.facet=
      {
        num_authors : "hll(author)",
        genres : {
          type: terms,
          field: cat
        }
      }
      '

      […]
      "facets":{
        "count":13,
        "num_authors":5,
        "genres":{
          "buckets":[{
              "val":"fantasy",
              "count":7},
            {
              "val":"sci-fi",
              "count":6}]}}}

      Callouts:
      • q: root domain defined by docs matching the query
      • hll: hyper-log-log distributed cardinality function
      • terms facet: one bucket per unique value in the "cat" field
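In domain terms, this request starts from a root domain, computes one statistic over it, and splits it into one sub-domain per genre. A minimal in-memory sketch follows, using the twelve books listed earlier as (cat, author) pairs. The `facet` function is hypothetical, and an exact set stands in for the hll estimate.

```python
from collections import Counter

# (cat, author) for the twelve books in the CSV shown earlier in the talk.
BOOKS = (
    [("fantasy", "George R.R. Martin")] * 3
    + [("sci-fi", "Iain M. Banks")] * 3
    + [("fantasy", "Glen Cook")] * 3
    + [("sci-fi", "Neal Asher")] * 3
)


def facet(domain):
    """Mimic the response shape: a root count, a distinct-author count
    (exact here, estimated by hll in Solr) and one bucket per genre."""
    by_cat = Counter(cat for cat, _ in domain)
    return {
        "count": len(domain),
        "num_authors": len({author for _, author in domain}),
        "genres": {
            "buckets": [{"val": v, "count": c} for v, c in by_cat.most_common()]
        },
    }


result = facet(BOOKS)
print(result)
```

The numbers differ slightly from the response on the slide, which reports 13 documents rather than the 12 rows listed earlier.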
  • 31. domain change: join
  • 32. Number of book reviews per genre
      json.facet={
        genres : {
          type: terms,
          field: cat,
          facet: {
            reviews : {
              type: query,
              domain:{join:{from:id,
                            to:book_s}}
            }
          }
        }
      }

      "facets":{
        "count":13,
        "genres":{
          "buckets":[{
              "val":"fantasy",
              "count":7,
              "reviews":{
                "count":7}},
            {
              "val":"sci-fi",
              "count":6,
              "reviews":{
                "count":5}}]}}}

      Callouts:
      • reviews: Calculated per-bucket
      • domain: domain switch! happens before faceting
  • 33. Average rating for each genre
      json.facet={
        genres : {
          type: terms,
          field: cat,
          facet: {
            reviews : {
              type: query,
              domain:{join:{from:id,
                            to:book_s}},
              facet: {
                rating:"avg(rating_i)"
              }
            }}}}

      "facets":{
        "count":13,
        "genres":{
          "buckets":[{
              "val":"fantasy",
              "count":7,
              "reviews":{
                "count":7,
                "rating":3.857142}},
            {
              "val":"sci-fi",
              "count":6,
              "reviews":{
                "count":5,
                "rating":4.2}}]}}}
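The join domain change swaps each genre bucket's set of books for the set of reviews pointing at those books, before any nested facet runs. Here is a minimal in-memory sketch using the books and reviews listed earlier. The function name is hypothetical, and its results will not match the slide's response, which appears to come from a slightly different index.

```python
# Genre per book id, from the books CSV shown earlier in the talk.
BOOK_CAT = {f"book{i}": "fantasy" for i in (1, 2, 3, 7, 8, 9)}
BOOK_CAT.update({f"book{i}": "sci-fi" for i in (4, 5, 6, 10, 11, 12)})

# (book_s, rating_i) pairs from the reviews CSV shown earlier in the talk.
REVIEWS = [
    ("book1", 5), ("book1", 2), ("book1", 5), ("book2", 5), ("book2", 5),
    ("book4", 2), ("book4", 5), ("book7", 4), ("book10", 5),
    ("book11", 1), ("book11", 4), ("book12", 5),
]


def reviews_per_genre():
    buckets = {}
    for genre in sorted(set(BOOK_CAT.values())):
        # Bucket domain: the books of this genre.
        book_ids = {b for b, cat in BOOK_CAT.items() if cat == genre}
        # Domain change (join from:id to:book_s): switch to the reviews
        # whose book_s matches an id in the bucket.
        ratings = [r for b, r in REVIEWS if b in book_ids]
        # Nested stat computed over the new domain.
        buckets[genre] = {
            "count": len(book_ids),
            "reviews": {"count": len(ratings),
                        "rating": sum(ratings) / len(ratings)},
        }
    return buckets


stats = reviews_per_genre()
print(stats)
```

The outer `count` still counts books, while the nested `reviews` block counts and averages over review documents, which is the effect of switching domains per bucket.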
  • 34. Who gives the highest ratings per genre?
      json.facet={
        genres : {
          type: terms,
          field: cat,
          facet: {
            reviews : {
              type: terms, field: user_s,
              sort: "rating desc", limit:3,
              domain:{join:{from:id,
                            to:book_s}},
              facet: {
                rating:"avg(rating_i)"
              }
      [...]

      "facets":{
        "count":13,
        "genres":{
          "buckets":[{
              "val":"fantasy",
              "count":7,
              "reviews":{
                "buckets":[
                  { "val":"Haruka",
                    "count":1,
                    "rating":5.0},
                  { "val":"Yonik",
                    "count":3,
                    "rating":4.66666667},
                  { "val":"Maria",
                    "count":2,
                    "rating":3.0}]}},
            { "val":"sci-fi",
              "count":6,
              "reviews":{
                "buckets":[
  • 35. Histogram: average rating trends over time
      json.facet={
        genres : {
          type: terms,
          field: cat,
          facet: {
            reviews : {
              domain:{join:{from:id, to:book_s}},
              type: range,
              field: review_date_i,
              start: 1980,
              end: 2020,
              gap: 10,
              facet: { rating:"avg(rating_i)" }
      }}}}

      "facets":{
        "count":13,
        "genres":{
          "buckets":[{
              "val":"fantasy",
              "count":7,
              "reviews":{
                "buckets":[
                  { "val":1980,
                    "count":1323,
                    "rating":3.17},
                  { "val":1990,
                    "count":1452,
                    "rating":3.26},
                  { "val":2000,
                    "count":1559,
                    "rating":3.48},
                  { "val":2010,
                    "count":1793,
                    "rating":3.54}]}},
            {
              "val":"sci-fi",
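The range facet's `start`, `end` and `gap` parameters define fixed-width buckets, each labelled by its lower bound. The sketch below shows that bucketing with an average per bucket. The `range_facet` function and the (year, rating) pairs are hypothetical, and it assumes lower bounds are inclusive and upper bounds exclusive.

```python
def range_facet(values, start, end, gap):
    """Bucket (year, rating) pairs into [lo, lo + gap) ranges and average
    the ratings in each bucket. Empty buckets report a count of 0 only."""
    grouped = {lo: [] for lo in range(start, end, gap)}
    for year, rating in values:
        if start <= year < end:
            lo = start + ((year - start) // gap) * gap
            grouped[lo].append(rating)
    buckets = []
    for lo, ratings in grouped.items():
        bucket = {"val": lo, "count": len(ratings)}
        if ratings:
            bucket["rating"] = sum(ratings) / len(ratings)
        buckets.append(bucket)
    return buckets


# Hypothetical review years and ratings, not taken from the talk's data set.
sample = [(1984, 3), (1989, 4), (1995, 5), (2011, 2)]
histogram = range_facet(sample, start=1980, end=2020, gap=10)
print(histogram)
```

With `start: 1980`, `end: 2020` and `gap: 10` this yields four buckets labelled 1980, 1990, 2000 and 2010, the same labels as in the response above.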
  • 36. Streaming Expressions vs JSON Facets
  • 37. JSON Facet API
      • More focused on web-scale interactive responses
      • Tighter integration
      • Just another search component
      • Utilizes existing distributed search framework
      • Single request-response top-N, grouping, highlighting, faceting, etc.
      • Multiple facets in a single request
      • Block join / nested document support
      • Document centric
  • 38. Streaming Expressions
      • More general purpose, larger scope
      • Wrap streams within streams to do pretty much anything
      • Not tied to documents (analytics across joins w/ external DBs)
      • Update streams, machine learning streams, etc.
      • Exact results in distributed mode (e.g. cardinality)
      • Distributed joins, graph
      • Synergy: Increasingly works with JSON Facet API to push down work to leaves