Data exploration with Elasticsearch
Aleksander M. Stensby
Monokkel A/S
  
• Aleksander M. Stensby
• CEO of Monokkel AS
• Previously COO at Integrasco AS
• Working with search and data analysis since 2004
www.monokkel.io
  
• Persistence, processing and presentation of data
  
Persistence – Processing – Presentation
  
Agenda
• Search fundamentals primer
• Intro to elasticsearch
• Search, filter and aggregate!
  
  
… and some bonus visualisation!
  
What we will not cover today…
• All the different searches, filters and aggregations available in elasticsearch :)
• Details on tokenization, analyzers…
• Elasticsearch in production and performance tuning…
• Data integration
  
Search fundamentals 101
  
Document
Fields (key/value): Title, Content, Signature

“We know what we are, but know not what we may be.”

Term Vector
Term    Frequency
we      3
know    2
what    2
are     1
but     1
not     1
may     1
be      1
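The term vector above can be reproduced with a few lines of Python. This is a minimal sketch, assuming a very simple analysis step (lowercase, then split on letters only); the function name `term_vector` is made up for illustration and is not part of any Elasticsearch API.

```python
import re
from collections import Counter

def term_vector(text):
    """Lowercase the text, split it into word tokens and count each term."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(tokens)

vector = term_vector("We know what we are, but know not what we may be.")

# Print terms by descending frequency, like the table on the slide.
for term, freq in vector.most_common():
    print(f"{term:5} {freq}")
```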
Index

1. “We were born to run”
2. “No one told you when to run”
3. “Some were born to sing the blues”

The Inverted Index

Term (dictionary)   Frequency   Documents (postings)
blues               1           3
born                2           1,3
no                  1           2
one                 1           2
run                 2           1,2
sing                1           3
some                1           3
the                 1           3
to                  3           1,2,3
told                1           2
we                  1           1
were                2           1,3
when                1           2
you                 1           2
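A rough sketch of how such a dictionary and its postings could be built from the three example documents. It assumes the same naive analysis as before (lowercase, letters only); `build_inverted_index` is an illustrative name, and real Lucene indexes store considerably more than document ids.

```python
import re
from collections import defaultdict

docs = {
    1: "We were born to run",
    2: "No one told you when to run",
    3: "Some were born to sing the blues",
}

def build_inverted_index(docs):
    """Map every term to the sorted list of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in re.findall(r"[a-z]+", text.lower()):
            index[term].add(doc_id)
    # Sort terms alphabetically so the output reads like the dictionary column.
    return {term: sorted(ids) for term, ids in sorted(index.items())}

inverted_index = build_inverted_index(docs)
for term, postings in inverted_index.items():
    print(f"{term:6} {len(postings)}  {postings}")
```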
Searching

Query: born

1. “We were born to run”
2. “No one told you when to run”
3. “Some were born to sing the blues”
The Boolean Model

Queries evaluated against the dictionary and postings of the inverted index:
born
born blues
born OR blues
born AND blues
born NOT blues
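With postings held as sets, the Boolean operators map directly onto set operations. The sketch below hard-codes the two postings lists from the inverted index; treating the bare query "born blues" as an OR is an assumption here (it matches the default operator of the Lucene query parser as far as I know, but it is configurable).

```python
# Postings for the two terms, taken from the inverted index table.
postings = {
    "born": {1, 3},
    "blues": {3},
}

born_or_blues = postings["born"] | postings["blues"]    # union
born_and_blues = postings["born"] & postings["blues"]   # intersection
born_not_blues = postings["born"] - postings["blues"]   # difference

print("born OR blues :", sorted(born_or_blues))
print("born AND blues:", sorted(born_and_blues))
print("born NOT blues:", sorted(born_not_blues))
```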
  
Relevancy and Ranking
• Term frequency
• Inverse document frequency
• Field-length norm
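The three factors can be sketched as small functions. The exact formulas below follow the classic Lucene TF/IDF similarity as I understand it (square root for term frequency, a logarithm for inverse document frequency, inverse square root for the field-length norm); treat the constants as assumptions rather than a statement of what any particular Elasticsearch version does.

```python
import math

def term_frequency(freq):
    """More occurrences of a term in a field give a higher, but dampened, weight."""
    return math.sqrt(freq)

def inverse_document_frequency(num_docs, doc_freq):
    """Terms that appear in many documents carry less weight."""
    return 1 + math.log(num_docs / (doc_freq + 1))

def field_length_norm(num_terms):
    """A match in a short field counts for more than a match in a long one."""
    return 1 / math.sqrt(num_terms)

# "blues" in document 3 of the example: 1 occurrence, 1 of 3 documents, 7 terms.
print(term_frequency(1))
print(inverse_document_frequency(3, 1))
print(field_length_norm(7))
```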
  
Similarity

1. “We were born to run”
2. “No one told you when to run”
3. “Some were born to sing the blues”

Term weight vectors, as [“born”, “blues”]:
query:  [2,5]
doc 1:  [2,0]
doc 2:  [0,0]
doc 3:  [2,5]
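One common way to compare such vectors is the cosine of the angle between them. The slide does not say which measure is used, so cosine similarity is an assumption here; the vectors are the ones from the slide.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either one is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

query = [2, 5]
documents = {1: [2, 0], 2: [0, 0], 3: [2, 5]}

# Rank document ids by similarity to the query, best match first.
ranked = sorted(
    documents,
    key=lambda doc_id: cosine_similarity(query, documents[doc_id]),
    reverse=True,
)
print(ranked)
```

Document 3 points in the same direction as the query and ranks first; document 2 shares no terms with the query and ranks last.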
  
Search fundamentals 101!
• Tokenization
• Normalization (case, stop words etc)
• Stemming, synonyms
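These steps can be chained into a toy analysis pipeline. Everything below is a deliberately naive sketch: the stop word list is a tiny made-up sample, the stemmer just strips a few suffixes, and synonyms are left out; real analyzers are far more careful.

```python
import re

STOP_WORDS = {"the", "to", "a", "an", "and"}  # tiny illustrative list

def tokenize(text):
    """Split text into word tokens."""
    return re.findall(r"\w+", text)

def normalize(tokens):
    """Lowercase every token and drop stop words."""
    lowered = (token.lower() for token in tokens)
    return [token for token in lowered if token not in STOP_WORDS]

def stem(token):
    """Very naive stemmer: strip one common suffix from longer words."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

def analyze(text):
    return [stem(token) for token in normalize(tokenize(text))]

print(analyze("Some were born to sing the blues"))
```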
  
Brief history of elasticsearch
Shay Banon
-> Abstraction Layer on top of Lucene
-> Compass
-> Rewritten high performance, real-time, distributed
-> Elasticsearch
-> February 2010
  
elasticsearch
• Open source search engine - written in Java
• Built on top of Lucene
• Simple, coherent, RESTful API
• Distributed, scalable search engine with real-time analytics

“more useable and concise API, scalability, and operational tools on top of Lucene’s search implementation”
  
Elasticsearch nodes and cluster
[Figure: three nodes forming a cluster]

Elasticsearch shards, nodes
[Figure labels: index = shard, node]

Lucene index and segments
[Figure labels: segments, Lucene index]
Much more than just search!
• Real-time analytics
• Log analysis
• Prediction modelling
• Recommendations
  
 
	
  
	
  
	
  
	
  
in 5 minutes
DEMO

DEMO
• Install ElasticSearch
• Load in some data
• Run a very basic search

in 15 minutes
DEMO
  
	
  
Easy peasy…
• http://www.elasticsearch.org/download
• bin/elasticsearch
  or bin/elasticsearch.bat on windows
• http://localhost:9200/
  or curl -X GET http://localhost:9200/

Easy peasy lemon squeezy!
http://localhost:9200/<index>/<type>/[<id>]
  
	
  
Indexing data
curl -XPUT 'http://localhost:9200/monokkel/user/aleks' -d '{ "name" : "Aleksander Stensby" }'
  
	
  
	
  
Indexing data
• shakespeare.json
  – http://www.elasticsearch.org/guide/en/kibana/current/snippets/shakespeare.json
• curl -XPUT localhost:9200/_bulk --data-binary @shakespeare.json
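The file sent to `_bulk` is newline-delimited JSON: an action line followed by the document source, repeated, with a trailing newline. A minimal sketch of building such a body in Python follows; the helper name `bulk_body` is made up, the document is the one from the earlier indexing slide, and the `_type` key assumes an Elasticsearch version of the same era as these slides.

```python
import json

def bulk_body(index, doc_type, docs):
    """Build a _bulk request body from (id, source) pairs."""
    lines = []
    for doc_id, source in docs:
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        lines.append(json.dumps(action))   # what to do, and where
        lines.append(json.dumps(source))   # the document itself
    # The bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"

body = bulk_body("monokkel", "user", [("aleks", {"name": "Aleksander Stensby"})])
print(body, end="")
```

The resulting string could be written to a file and posted with the same curl command as shakespeare.json.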
  
http://localhost:9200/<index>/<type>/_search
http://localhost:9200/<index>/_search
http://localhost:9200/_search
  
Mapping
• Is it a number? String? Date?
• Combining multiple fields?
• Default values?
• Stored?
• Analyzed?
• How should we tokenize/analyse/normalize the field?
  
Mapping
curl -XPUT http://localhost:9200/shakespeare -d '
{
  "mappings" : {
    "_default_" : {
      "properties" : {
        "speaker" : {"type": "string", "index" : "not_analyzed" },
        "play_name" : {"type": "string", "index" : "not_analyzed" },
        "line_id" : { "type" : "integer" },
        "speech_number" : { "type" : "integer" }
      }
    }
  }
}
';
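To see why `not_analyzed` matters for fields such as `speaker`, here is a toy illustration of which terms end up in the index. It is only a sketch: the "analyzer" below just lowercases and splits on whitespace, which is a simplification of what a real analyzer does, and `index_terms` is a made-up helper.

```python
def index_terms(value, analyzed):
    """Return the terms that would be indexed for a string field value."""
    if analyzed:
        # Simplified analysis: lowercase and split on whitespace.
        return value.lower().split()
    # not_analyzed: the whole value is kept as one exact term.
    return [value]

print(index_terms("FRIAR LAURENCE", analyzed=True))
print(index_terms("FRIAR LAURENCE", analyzed=False))
```

With the exact term kept intact, the speaker name can later be matched and aggregated as a single value rather than as separate words.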
  
The Query DSL
{
    "query": {YOUR_QUERY_HERE}
}

Match Query
{
    "query": {
        "match": {"text_entry" : "romeo"}
    }
}

Multi Match Query
{
    "query": {
        "multi_match": {
            "query":  "romeo",
            "fields": [ "text_entry", "speaker" ]
        }
    }
}
  
Bool Query
{
    "query": {
        "bool": {
            "must":     [ ],
            "must_not": [ ],
            "should":   [ ]
        }
    }
}

Bool Query
{
    "query": {
        "bool": {
            "must":     { "match": {"text_entry": "romeo" }},
            "must_not": { "match": {"speaker": "ROMEO" }},
            "should": [
                { "match": {"speaker": "JULIET" }},
                { "match": {"speaker": "FRIAR LAURENCE" }}
            ]
        }
    }
}
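The roles of the three clause types can be illustrated with a toy in-memory evaluator. This is a sketch only: the sample documents are hand-written rather than taken from shakespeare.json, the score is a simple count of matching clauses rather than real relevance scoring, and it only covers the case where a `must` clause is present (so `should` clauses are optional and merely raise the score).

```python
import re

# Hand-written sample documents for illustration only.
docs = [
    {"speaker": "NURSE", "text_entry": "Where is Romeo"},
    {"speaker": "ROMEO", "text_entry": "Romeo is here"},
    {"speaker": "JULIET", "text_entry": "O Romeo, Romeo! wherefore art thou Romeo?"},
]

def text_contains(word):
    return lambda doc: word in re.findall(r"\w+", doc["text_entry"].lower())

def speaker_is(name):
    return lambda doc: doc["speaker"] == name

def bool_query(docs, must=(), must_not=(), should=()):
    hits = []
    for doc in docs:
        if not all(clause(doc) for clause in must):
            continue  # every must clause has to match
        if any(clause(doc) for clause in must_not):
            continue  # no must_not clause may match
        # Toy score: one point per matching must or should clause.
        score = sum(1 for clause in (*must, *should) if clause(doc))
        hits.append((score, doc))
    return sorted(hits, key=lambda hit: hit[0], reverse=True)

results = bool_query(
    docs,
    must=[text_contains("romeo")],
    must_not=[speaker_is("ROMEO")],
    should=[speaker_is("JULIET"), speaker_is("FRIAR LAURENCE")],
)
for score, doc in results:
    print(score, doc["speaker"])
```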
  
And lots more…
filtered query
prefix query
simple query string query
range query
regexp query
term query
terms query
wildcard query
dis max query
geoshape query
nested query
more like this query
more like this field query
boosting query
common terms query
constant score query
fuzzy like this query
fuzzy like this field query
function score query
fuzzy query
has child query
has parent query
ids query
indices query
span first query
span multi term query
span near query
span not query
span or query
span term query
top children query
minimum should match
multi term query rewrite
template query

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-queries.html
  
Filtering
• Filters do not score so they are faster to execute than queries
• Filters can be cached in memory - significantly faster than queries

If relevance is not important, use filters, otherwise, use queries!
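Why caching works well for filters can be sketched in a few lines: a filter only answers yes or no per document, so its result is a plain set of ids that can be reused as-is. The example below is a toy model with a hypothetical `SPEAKERS` lookup and Python's `lru_cache` standing in for the filter cache; it says nothing about how Elasticsearch implements caching internally.

```python
from functools import lru_cache

# Hypothetical data: document id -> speaker.
SPEAKERS = {1: "ROMEO", 2: "JULIET", 3: "ROMEO"}

@lru_cache(maxsize=None)
def term_filter(speaker):
    """Return the ids of all documents with exactly this speaker (no scoring)."""
    return frozenset(
        doc_id for doc_id, value in SPEAKERS.items() if value == speaker
    )

first = term_filter("ROMEO")    # computed by scanning the documents
second = term_filter("ROMEO")   # served from the cache
print(sorted(first), term_filter.cache_info().hits)
```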
The Filtered Query:
{
    "query": {
        "filtered": {
            "query":  {YOUR_QUERY_HERE},
            "filter": {YOUR_FILTER_HERE}
        }
    }
}

The Filtered Query:
{
    "query": {
        "filtered": {
            "query":  { "match": {"content": "monokkel" }},
            "filter": { "term": { "tag": "awesome" }}
        }
    }
}
  
Term Filter
{
    "query": {
        "filtered": {
            "filter": {
                "term": {
                    "speaker": "ROMEO"
                }
            }
        }
    }
}

Terms Filter
{
    "query": {
        "filtered": {
            "filter": {
                "terms": {
                    "speaker": ["ROMEO", "JULIET"]
                }
            }
        }
    }
}
  
Bool Filter
{
    "query": {
        "filtered": {
            "filter": {
                "bool" : {
                    "must" :     [],
                    "should" :   [],
                    "must_not" : []
                }
            }
        }
    }
}

Range Filter
{
    "query": {
        "filtered": {
            "filter": {
                "range" : {
                    "price" : {
                        "gt" : 20,
                        "lt" : 40
                    }
                }
            }
        }
    }
}
  
And lots more…
match all filter
and filter
not filter
or filter
prefix filter
query filter
regexp filter
type filter
geo bounding box filter
geo distance filter
geo distance range filter
geo polygon filter
geoshape filter
geohash cell filter
has child filter
has parent filter
ids filter
indices filter
limit filter
nested filter
script filter

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-filters.html
  
Kibana
• http://www.elasticsearch.org/overview/kibana/installation/
• bin/kibana
  or bin/kibana.bat on windows
• http://localhost:5601/
  
	
  
Aggregations
• Buckets and Metrics: partitioning documents based on a criterion

SELECT COUNT(color)
FROM table
GROUP BY color

metric: COUNT(color)
bucket: GROUP BY color

An aggregation is a combination of buckets and metrics
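The same bucket-plus-metric idea in plain Python: group rows by a value (the bucket) and count what lands in each group (the metric). The `rows` list is made-up sample data.

```python
from collections import Counter

# Hypothetical rows, standing in for the SQL table.
rows = [{"color": "red"}, {"color": "blue"}, {"color": "red"}]

# Bucket by color, metric is the number of rows per bucket.
counts = Counter(row["color"] for row in rows)
print(dict(counts))
```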
Aggregations
{
    "aggs": {
        "speakers": {
            "terms": {
                "field": "speaker"
            }
        }
    }
}
your aggregation name: "speakers"
bucket type: "terms"

Aggregations
{
    "aggs": {
        "beertypes": {
            "terms": {
                "field": "beertype"
            }
        }
    }
}

Aggregations
{
    "aggs": {
        "beertypes": {
            "terms": {
                "field": "beertype"
            },
            "aggs": {
                "avg_ibu": {
                    "avg": {
                        "field": "ibu"
                    }
                }
            }
        }
    }
}
your aggregation name: "avg_ibu"
metric type: "avg"
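What the nested request asks for can be mimicked in memory: bucket the documents by one field, then compute an average inside each bucket. The beer documents below are invented for illustration, and the returned shape is a simplification, not the actual Elasticsearch response format.

```python
from collections import defaultdict

# Invented sample documents.
beers = [
    {"beertype": "ipa", "ibu": 60},
    {"beertype": "ipa", "ibu": 40},
    {"beertype": "stout", "ibu": 30},
]

def terms_with_avg(docs, bucket_field, metric_field):
    """Bucket docs by bucket_field, then average metric_field per bucket."""
    buckets = defaultdict(list)
    for doc in docs:
        buckets[doc[bucket_field]].append(doc[metric_field])
    return {
        key: {"doc_count": len(values), "avg": sum(values) / len(values)}
        for key, values in buckets.items()
    }

result = terms_with_avg(beers, "beertype", "ibu")
for beertype, stats in result.items():
    print(beertype, stats)
```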
Aggregations
min
max
sum
avg
stats
extended stats
value count
percentiles
percentile ranks
cardinality
top hits
scripted metric
global
filter
filters
missing
nested
reverse nested
children
terms
significant terms
range
date range
ipv4 range
histogram
date histogram
geo bounds
geo distance
geohash grid

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations.html
  
And a whole lot more!
• Geosearch, distance and bounds
• ”More Like This”
• Suggesters / Autocomplete
• Percolation
• Language drivers
• Scripting
  
Further reading and some great resources!
• http://www.elasticsearch.org/guide/
• http://blog.monokkel.io/
• https://found.no/foundation/
  
Shameful self-promotion
/ Tarjei Romtveit

Contenu connexe

Tendances

Beyond full-text searches with Lucene and Solr
Beyond full-text searches with Lucene and SolrBeyond full-text searches with Lucene and Solr
Beyond full-text searches with Lucene and SolrBertrand Delacretaz
 
ElasticSearch for .NET Developers
ElasticSearch for .NET DevelopersElasticSearch for .NET Developers
ElasticSearch for .NET DevelopersBen van Mol
 
Solr vs. Elasticsearch, Case by Case: Presented by Alexandre Rafalovitch, UN
Solr vs. Elasticsearch,  Case by Case: Presented by Alexandre Rafalovitch, UNSolr vs. Elasticsearch,  Case by Case: Presented by Alexandre Rafalovitch, UN
Solr vs. Elasticsearch, Case by Case: Presented by Alexandre Rafalovitch, UNLucidworks
 
Elastic search Walkthrough
Elastic search WalkthroughElastic search Walkthrough
Elastic search WalkthroughSuhel Meman
 
Webinar: Modern Techniques for Better Search Relevance with Fusion
Webinar: Modern Techniques for Better Search Relevance with FusionWebinar: Modern Techniques for Better Search Relevance with Fusion
Webinar: Modern Techniques for Better Search Relevance with FusionLucidworks
 
Scaling Recommendations, Semantic Search, & Data Analytics with solr
Scaling Recommendations, Semantic Search, & Data Analytics with solrScaling Recommendations, Semantic Search, & Data Analytics with solr
Scaling Recommendations, Semantic Search, & Data Analytics with solrTrey Grainger
 
Harnessing The Power of Search - Liferay DEVCON 2015, Darmstadt, Germany
Harnessing The Power of Search - Liferay DEVCON 2015, Darmstadt, GermanyHarnessing The Power of Search - Liferay DEVCON 2015, Darmstadt, Germany
Harnessing The Power of Search - Liferay DEVCON 2015, Darmstadt, GermanyAndré Ricardo Barreto de Oliveira
 
Elasticsearch And Ruby [RuPy2012]
Elasticsearch And Ruby [RuPy2012]Elasticsearch And Ruby [RuPy2012]
Elasticsearch And Ruby [RuPy2012]Karel Minarik
 
Elasticsearch (Rubyshift 2013)
Elasticsearch (Rubyshift 2013)Elasticsearch (Rubyshift 2013)
Elasticsearch (Rubyshift 2013)Karel Minarik
 
Elasticsearch in 15 Minutes
Elasticsearch in 15 MinutesElasticsearch in 15 Minutes
Elasticsearch in 15 MinutesKarel Minarik
 
Introduction to Apache Solr
Introduction to Apache SolrIntroduction to Apache Solr
Introduction to Apache SolrChristos Manios
 
Introduction to Lucene & Solr and Usecases
Introduction to Lucene & Solr and UsecasesIntroduction to Lucene & Solr and Usecases
Introduction to Lucene & Solr and UsecasesRahul Jain
 
Beautiful REST+JSON APIs with Ion
Beautiful REST+JSON APIs with IonBeautiful REST+JSON APIs with Ion
Beautiful REST+JSON APIs with IonStormpath
 
Solr: 4 big features
Solr: 4 big featuresSolr: 4 big features
Solr: 4 big featuresDavid Smiley
 
Search Evolution - Von Lucene zu Solr und ElasticSearch
Search Evolution - Von Lucene zu Solr und ElasticSearchSearch Evolution - Von Lucene zu Solr und ElasticSearch
Search Evolution - Von Lucene zu Solr und ElasticSearchFlorian Hopf
 
Using Apache Solr
Using Apache SolrUsing Apache Solr
Using Apache Solrpittaya
 

Tendances (20)

Beyond full-text searches with Lucene and Solr
Beyond full-text searches with Lucene and SolrBeyond full-text searches with Lucene and Solr
Beyond full-text searches with Lucene and Solr
 
ElasticSearch for .NET Developers
ElasticSearch for .NET DevelopersElasticSearch for .NET Developers
ElasticSearch for .NET Developers
 
Solr vs. Elasticsearch, Case by Case: Presented by Alexandre Rafalovitch, UN
Solr vs. Elasticsearch,  Case by Case: Presented by Alexandre Rafalovitch, UNSolr vs. Elasticsearch,  Case by Case: Presented by Alexandre Rafalovitch, UN
Solr vs. Elasticsearch, Case by Case: Presented by Alexandre Rafalovitch, UN
 
Elastic search Walkthrough
Elastic search WalkthroughElastic search Walkthrough
Elastic search Walkthrough
 
Webinar: Modern Techniques for Better Search Relevance with Fusion
Webinar: Modern Techniques for Better Search Relevance with FusionWebinar: Modern Techniques for Better Search Relevance with Fusion
Webinar: Modern Techniques for Better Search Relevance with Fusion
 
Introduction to Apache Solr
Introduction to Apache SolrIntroduction to Apache Solr
Introduction to Apache Solr
 
Scaling Recommendations, Semantic Search, & Data Analytics with solr
Scaling Recommendations, Semantic Search, & Data Analytics with solrScaling Recommendations, Semantic Search, & Data Analytics with solr
Scaling Recommendations, Semantic Search, & Data Analytics with solr
 
elasticsearch
elasticsearchelasticsearch
elasticsearch
 
Harnessing The Power of Search - Liferay DEVCON 2015, Darmstadt, Germany
Harnessing The Power of Search - Liferay DEVCON 2015, Darmstadt, GermanyHarnessing The Power of Search - Liferay DEVCON 2015, Darmstadt, Germany
Harnessing The Power of Search - Liferay DEVCON 2015, Darmstadt, Germany
 
Elasticsearch And Ruby [RuPy2012]
Elasticsearch And Ruby [RuPy2012]Elasticsearch And Ruby [RuPy2012]
Elasticsearch And Ruby [RuPy2012]
 
Elasticsearch (Rubyshift 2013)
Elasticsearch (Rubyshift 2013)Elasticsearch (Rubyshift 2013)
Elasticsearch (Rubyshift 2013)
 
Elasticsearch in 15 Minutes
Elasticsearch in 15 MinutesElasticsearch in 15 Minutes
Elasticsearch in 15 Minutes
 
Introduction to Apache Solr
Introduction to Apache SolrIntroduction to Apache Solr
Introduction to Apache Solr
 
it's just search
it's just searchit's just search
it's just search
 
Introduction to Lucene & Solr and Usecases
Introduction to Lucene & Solr and UsecasesIntroduction to Lucene & Solr and Usecases
Introduction to Lucene & Solr and Usecases
 
Beautiful REST+JSON APIs with Ion
Beautiful REST+JSON APIs with IonBeautiful REST+JSON APIs with Ion
Beautiful REST+JSON APIs with Ion
 
Solr: 4 big features
Solr: 4 big featuresSolr: 4 big features
Solr: 4 big features
 
How Solr Search Works
How Solr Search WorksHow Solr Search Works
How Solr Search Works
 
Search Evolution - Von Lucene zu Solr und ElasticSearch
Search Evolution - Von Lucene zu Solr und ElasticSearchSearch Evolution - Von Lucene zu Solr und ElasticSearch
Search Evolution - Von Lucene zu Solr und ElasticSearch
 
Using Apache Solr
Using Apache SolrUsing Apache Solr
Using Apache Solr
 

En vedette

Managing Your Content with Elasticsearch
Managing Your Content with ElasticsearchManaging Your Content with Elasticsearch
Managing Your Content with ElasticsearchSamantha Quiñones
 
Utah Code Camp 2014 - Learning from Data by Thomas Holloway
Utah Code Camp 2014 - Learning from Data by Thomas HollowayUtah Code Camp 2014 - Learning from Data by Thomas Holloway
Utah Code Camp 2014 - Learning from Data by Thomas HollowayThomas Holloway
 
Quality Reach I SMX West Slides
Quality Reach I SMX West SlidesQuality Reach I SMX West Slides
Quality Reach I SMX West Slides97th Floor
 
ElasticSearch in action
ElasticSearch in actionElasticSearch in action
ElasticSearch in actionCodemotion
 
ElasticSearch: la tenés atroden Google
ElasticSearch: la tenés atroden GoogleElasticSearch: la tenés atroden Google
ElasticSearch: la tenés atroden GoogleMariano Iglesias
 
10 pasos para desarrollar un plan de negocios en internet.
10 pasos para desarrollar un plan de negocios en internet. 10 pasos para desarrollar un plan de negocios en internet.
10 pasos para desarrollar un plan de negocios en internet. Interlat
 
From Zero to Production Hero: Log Analysis with Elasticsearch (from Velocity ...
From Zero to Production Hero: Log Analysis with Elasticsearch (from Velocity ...From Zero to Production Hero: Log Analysis with Elasticsearch (from Velocity ...
From Zero to Production Hero: Log Analysis with Elasticsearch (from Velocity ...Sematext Group, Inc.
 
Social Media Report - Marketing Week Live & Insight Show 2017
Social Media Report - Marketing Week Live & Insight Show 2017Social Media Report - Marketing Week Live & Insight Show 2017
Social Media Report - Marketing Week Live & Insight Show 2017Linkfluence
 

En vedette (10)

Managing Your Content with Elasticsearch
Managing Your Content with ElasticsearchManaging Your Content with Elasticsearch
Managing Your Content with Elasticsearch
 
Utah Code Camp 2014 - Learning from Data by Thomas Holloway
Utah Code Camp 2014 - Learning from Data by Thomas HollowayUtah Code Camp 2014 - Learning from Data by Thomas Holloway
Utah Code Camp 2014 - Learning from Data by Thomas Holloway
 
Quality Reach I SMX West Slides
Quality Reach I SMX West SlidesQuality Reach I SMX West Slides
Quality Reach I SMX West Slides
 
ElasticSearch in action
ElasticSearch in actionElasticSearch in action
ElasticSearch in action
 
Cadenas marquistas 2009
Cadenas marquistas 2009Cadenas marquistas 2009
Cadenas marquistas 2009
 
ElasticSearch: la tenés atroden Google
ElasticSearch: la tenés atroden GoogleElasticSearch: la tenés atroden Google
ElasticSearch: la tenés atroden Google
 
10 pasos para desarrollar un plan de negocios en internet.
10 pasos para desarrollar un plan de negocios en internet. 10 pasos para desarrollar un plan de negocios en internet.
10 pasos para desarrollar un plan de negocios en internet.
 
Elasticsearch Introduction
Elasticsearch IntroductionElasticsearch Introduction
Elasticsearch Introduction
 
From Zero to Production Hero: Log Analysis with Elasticsearch (from Velocity ...
From Zero to Production Hero: Log Analysis with Elasticsearch (from Velocity ...From Zero to Production Hero: Log Analysis with Elasticsearch (from Velocity ...
From Zero to Production Hero: Log Analysis with Elasticsearch (from Velocity ...
 
Social Media Report - Marketing Week Live & Insight Show 2017
Social Media Report - Marketing Week Live & Insight Show 2017Social Media Report - Marketing Week Live & Insight Show 2017
Social Media Report - Marketing Week Live & Insight Show 2017
 

Similaire à Data Exploration with Elasticsearch

Practical Elasticsearch - real world use cases
Practical Elasticsearch - real world use casesPractical Elasticsearch - real world use cases
Practical Elasticsearch - real world use casesItamar
 
Elasticsearch Distributed search & analytics on BigData made easy
Elasticsearch Distributed search & analytics on BigData made easyElasticsearch Distributed search & analytics on BigData made easy
Elasticsearch Distributed search & analytics on BigData made easyItamar
 
Elasticsearch at EyeEm
Elasticsearch at EyeEmElasticsearch at EyeEm
Elasticsearch at EyeEmLars Fronius
 
Optimizing Multilingual Search: Presented by David Troiano, Basis Technology
Optimizing Multilingual Search: Presented by David Troiano, Basis TechnologyOptimizing Multilingual Search: Presented by David Troiano, Basis Technology
Optimizing Multilingual Search: Presented by David Troiano, Basis TechnologyLucidworks
 
useR! 2012 Talk
useR! 2012 TalkuseR! 2012 Talk
useR! 2012 Talkrtelmore
 
Strengths and Weaknesses of MongoDB
Strengths and Weaknesses of MongoDBStrengths and Weaknesses of MongoDB
Strengths and Weaknesses of MongoDBlehresman
 
Locality sensitive hashing
Locality sensitive hashingLocality sensitive hashing
Locality sensitive hashingSEMINARGROOT
 
Architecture of a search engine
Architecture of a search engineArchitecture of a search engine
Architecture of a search engineSylvain Utard
 
Creating an Open Source Genealogical Search Engine with Apache Solr
Creating an Open Source Genealogical Search Engine with Apache SolrCreating an Open Source Genealogical Search Engine with Apache Solr
Creating an Open Source Genealogical Search Engine with Apache SolrBrooke Ganz
 
NEW LAUNCH! Natural Language Processing for Data Analytics - MCL343 - re:Inve...
NEW LAUNCH! Natural Language Processing for Data Analytics - MCL343 - re:Inve...NEW LAUNCH! Natural Language Processing for Data Analytics - MCL343 - re:Inve...
NEW LAUNCH! Natural Language Processing for Data Analytics - MCL343 - re:Inve...Amazon Web Services
 
Introduction to Elasticsearch
Introduction to ElasticsearchIntroduction to Elasticsearch
Introduction to ElasticsearchLuiz Messias
 
Solr vs. Elasticsearch - Case by Case
Solr vs. Elasticsearch - Case by CaseSolr vs. Elasticsearch - Case by Case
Solr vs. Elasticsearch - Case by CaseAlexandre Rafalovitch
 
You're not using ElasticSearch (outdated)
You're not using ElasticSearch (outdated)You're not using ElasticSearch (outdated)
You're not using ElasticSearch (outdated)Timon Vonk
 
bm25 demystified
bm25 demystifiedbm25 demystified
bm25 demystifiedFan Robbin
 
Python.pptx
Python.pptxPython.pptx
Python.pptxAshaS74
 
<Little Big Data #1> 한국어 채팅 데이터로 머신러닝 하기 (한국어 보이게 수정)
<Little Big Data #1> 한국어 채팅 데이터로  머신러닝 하기 (한국어 보이게 수정)<Little Big Data #1> 한국어 채팅 데이터로  머신러닝 하기 (한국어 보이게 수정)
<Little Big Data #1> 한국어 채팅 데이터로 머신러닝 하기 (한국어 보이게 수정)Han-seok Jo
 

Similaire à Data Exploration with Elasticsearch (20)

Practical Elasticsearch - real world use cases
Practical Elasticsearch - real world use casesPractical Elasticsearch - real world use cases
Practical Elasticsearch - real world use cases
 
Elasticsearch Distributed search & analytics on BigData made easy
Elasticsearch Distributed search & analytics on BigData made easyElasticsearch Distributed search & analytics on BigData made easy
Elasticsearch Distributed search & analytics on BigData made easy
 
Elasticsearch at EyeEm
Elasticsearch at EyeEmElasticsearch at EyeEm
Elasticsearch at EyeEm
 
Optimizing Multilingual Search: Presented by David Troiano, Basis Technology
Optimizing Multilingual Search: Presented by David Troiano, Basis TechnologyOptimizing Multilingual Search: Presented by David Troiano, Basis Technology
Optimizing Multilingual Search: Presented by David Troiano, Basis Technology
 
useR! 2012 Talk
useR! 2012 TalkuseR! 2012 Talk
useR! 2012 Talk
 
NLTK
NLTKNLTK
NLTK
 
Strengths and Weaknesses of MongoDB
Strengths and Weaknesses of MongoDBStrengths and Weaknesses of MongoDB
Strengths and Weaknesses of MongoDB
 
Locality sensitive hashing
Locality sensitive hashingLocality sensitive hashing
Locality sensitive hashing
 
Architecture of a search engine
Architecture of a search engineArchitecture of a search engine
Architecture of a search engine
 
Stopwords in Search
Stopwords in SearchStopwords in Search
Stopwords in Search
 
Elasto Mania
Elasto ManiaElasto Mania
Elasto Mania
 
Creating an Open Source Genealogical Search Engine with Apache Solr
Creating an Open Source Genealogical Search Engine with Apache SolrCreating an Open Source Genealogical Search Engine with Apache Solr
Creating an Open Source Genealogical Search Engine with Apache Solr
 
NEW LAUNCH! Natural Language Processing for Data Analytics - MCL343 - re:Inve...
NEW LAUNCH! Natural Language Processing for Data Analytics - MCL343 - re:Inve...NEW LAUNCH! Natural Language Processing for Data Analytics - MCL343 - re:Inve...
NEW LAUNCH! Natural Language Processing for Data Analytics - MCL343 - re:Inve...
 
Introduction to Elasticsearch
Introduction to ElasticsearchIntroduction to Elasticsearch
Introduction to Elasticsearch
 
Solr vs. Elasticsearch - Case by Case
Solr vs. Elasticsearch - Case by CaseSolr vs. Elasticsearch - Case by Case
Solr vs. Elasticsearch - Case by Case
 
You're not using ElasticSearch (outdated)
You're not using ElasticSearch (outdated)You're not using ElasticSearch (outdated)
You're not using ElasticSearch (outdated)
 
bm25 demystified
bm25 demystifiedbm25 demystified
bm25 demystified
 
Python.pptx
Python.pptxPython.pptx
Python.pptx
 
<Little Big Data #1> 한국어 채팅 데이터로 머신러닝 하기 (한국어 보이게 수정)
<Little Big Data #1> 한국어 채팅 데이터로  머신러닝 하기 (한국어 보이게 수정)<Little Big Data #1> 한국어 채팅 데이터로  머신러닝 하기 (한국어 보이게 수정)
<Little Big Data #1> 한국어 채팅 데이터로 머신러닝 하기 (한국어 보이게 수정)
 
Text Mining
Text MiningText Mining
Text Mining
 

Data Exploration with Elasticsearch

  • 1. Data exploration with Elasticsearch. Aleksander M. Stensby, Monokkel A/S
  • 5. About the speaker
      • Aleksander M. Stensby
      • CEO in Monokkel AS
      • Previously COO in Integrasco AS
      • Working with search and data analysis since 2004
      www.monokkel.io
  • 6. About the speaker (translated from Norwegian)
      • CEO of Monokkel AS
      • Previously COO at Integrasco AS
      • Persistence, processing and presentation of data
      Persistence – Processing – Presentation
  • 12. Agenda
      • Search fundamentals primer
      • Intro to elasticsearch
      • Search, filter and aggregate!
  • 13. Agenda
      • Search fundamentals primer
      • Intro to elasticsearch
      • Search, filter and aggregate!
      … and some bonus visualisation!
  • 14. What we will not cover today…
      • All the different searches, filters and aggregations available in elasticsearch :)
      • Details on tokenization, analyzers…
      • Elasticsearch in production and performance tuning…
      • Data integration
  • 18. “We know what we are, but know not what we may be.”
  • 19. Term vector for “We know what we are, but know not what we may be.”
      Term    Frequency
      we      3
      know    2
      what    2
      are     1
      but     1
      not     1
      may     1
      be      1
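
The term vector on slide 19 can be reproduced with a few lines of Python. This is a minimal sketch, assuming a very simple tokenizer (lowercase, split on non-letters); the function name `term_frequencies` is made up for illustration, and a real analyzer does considerably more.

```python
import re
from collections import Counter


def term_frequencies(text):
    """Lowercase the text, split it into word tokens, and count each term."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(tokens)


quote = "We know what we are, but know not what we may be."
tf = term_frequencies(quote)

# List terms from most to least frequent, as on the slide.
for term, count in tf.most_common():
    print(f"{term:<6}{count}")
```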
  • 21. Index
  • 22. “We were born to run” “No one told you when to run” “Some were born to sing the blues”
  • 26. The Inverted Index
      Documents:
      1. “We were born to run”
      2. “No one told you when to run”
      3. “Some were born to sing the blues”

      Dictionary            Postings
      Term     Frequency    Documents
      blues    1            3
      born     2            1,3
      no       1            2
      one      1            2
      run      2            1,2
      sing     1            3
      some     1            3
      the      1            3
      to       3            1,2,3
      told     1            2
      we       1            1
      were     2            1,3
      when     1            2
      you      1            2
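
A small sketch of how such a dictionary and postings list could be built from the three example documents. The tokenizer (lowercase, letters only) and the name `build_inverted_index` are assumptions for illustration; Lucene's actual index structures are far more elaborate.

```python
import re
from collections import defaultdict

DOCS = {
    1: "We were born to run",
    2: "No one told you when to run",
    3: "Some were born to sing the blues",
}


def build_inverted_index(docs):
    """Map each term to the sorted list of document ids that contain it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in re.findall(r"[a-z]+", text.lower()):
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in sorted(postings.items())}


index = build_inverted_index(DOCS)

# Print term, document frequency, and postings, mirroring the slide's table.
for term, ids in index.items():
    print(f"{term:<6} {len(ids)}  {ids}")
```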
  • 27. Searching
      Query: born
      1. “We were born to run”
      2. “No one told you when to run”
      3. “Some were born to sing the blues”
  • 28. The Boolean Model (dictionary and postings as on slide 26). Query: born
  • 29. (Dictionary and postings as on slide 26.) Query: born blues
  • 30. (Dictionary and postings as on slide 26.) Query: born OR blues
  • 31. (Dictionary and postings as on slide 26.) Query: born AND blues
  • 32. (Dictionary and postings as on slide 26.) Query: born NOT blues
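
In the Boolean model, OR, AND and NOT map naturally onto set operations over postings lists. The sketch below assumes the postings from slide 26, held as Python sets; `POSTINGS` and `docs_for` are names made up for this illustration.

```python
# Postings for a few terms from the slide 26 index, as sets of document ids.
POSTINGS = {
    "born": {1, 3},
    "blues": {3},
    "run": {1, 2},
}


def docs_for(term):
    """Return the set of document ids containing the term (empty if unknown)."""
    return POSTINGS.get(term, set())


born_or_blues = docs_for("born") | docs_for("blues")    # union
born_and_blues = docs_for("born") & docs_for("blues")   # intersection
born_not_blues = docs_for("born") - docs_for("blues")   # difference

print(sorted(born_or_blues))   # [1, 3]
print(sorted(born_and_blues))  # [3]
print(sorted(born_not_blues))  # [1]
```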
  • 33. Relevancy and Ranking
      • Term frequency
      • Inverse document frequency
      • Field-length norm
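
To make the three factors concrete, here is a hedged sketch using the classic Lucene TF/IDF formulas as described in the Elasticsearch guide (square root of the term count, a logarithmic inverse document frequency, and an inverse square root of the field length). Which similarity a given Elasticsearch version uses by default is an assumption here, and the simple product at the end is for illustration only; the full scoring function has more factors.

```python
import math


def tf(term_count):
    """Term frequency: more occurrences in a field means a higher weight."""
    return math.sqrt(term_count)


def idf(num_docs, doc_freq):
    """Inverse document frequency: rarer terms across the index weigh more."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


def field_norm(num_terms):
    """Field-length norm: a match in a shorter field weighs more."""
    return 1.0 / math.sqrt(num_terms)


# "blues" occurs once in document 3 (7 terms); 1 of the 3 documents contains it.
weight = tf(1) * idf(3, 1) * field_norm(7)
print(round(weight, 3))
```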
  • 34. Similarity
      1. “We were born to run”
      2. “No one told you when to run”
      3. “Some were born to sing the blues”
      [Figure: documents and query plotted as vectors, with axes “born” and “blues”]
      query:  [2,5]
      doc 3:  [2,5]
      doc 2:  [0,0]
      doc 1:  [2,0]
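
Assuming the slide illustrates the vector space model, with cosine similarity as the measure (the slide text does not name the measure explicitly), the ranking of the three documents against the query vector can be sketched as follows.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


query = [2, 5]
doc_vectors = {"doc 1": [2, 0], "doc 2": [0, 0], "doc 3": [2, 5]}

# Rank documents by similarity to the query, most similar first.
ranked = sorted(
    doc_vectors,
    key=lambda name: cosine_similarity(query, doc_vectors[name]),
    reverse=True,
)
print(ranked)
```

Document 3 points in the same direction as the query, document 1 shares only one of the two terms, and document 2 shares none.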
  • 35. Search fundamentals 101!
      • Tokenization
      • Normalization (case, stop words etc.)
      • Stemming, synonyms
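
A toy analysis chain covering the first three steps could look like this. The stop word list and the suffix-stripping `naive_stem` are deliberately simplistic stand-ins invented for this sketch; they are not what Elasticsearch or Lucene ship.

```python
import re

STOP_WORDS = {"the", "to", "a", "an", "and"}  # hypothetical stop list


def naive_stem(token):
    """Toy stemmer: strip a trailing 'ing' or 's' from longer tokens."""
    for suffix in ("ing", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def analyze(text):
    tokens = re.findall(r"[A-Za-z]+", text)              # tokenization
    tokens = [t.lower() for t in tokens]                 # case normalization
    tokens = [t for t in tokens if t not in STOP_WORDS]  # stop word removal
    return [naive_stem(t) for t in tokens]               # stemming


terms = analyze("Some were born to sing the blues")
print(terms)
```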
  • 36. Brief history of elasticsearch
      Shay Banon
      -> Abstraction layer on top of Lucene
      -> Compass
      -> Rewritten: high performance, real-time, distributed
      -> Elasticsearch
      -> February 2010
  • 37. elasticsearch
      • Open source search engine, written in Java
      • Built on top of Lucene
      • Simple, coherent, RESTful API
      • Distributed, scalable search engine with real-time analytics
  • 38. “more useable and concise API, scalability, and operational tools on top of Lucene’s search implementation”
  • 39. Elasticsearch nodes and cluster [Diagram: a cluster containing several nodes]
  • 40. Elasticsearch shards, nodes [Diagram: an index is made up of shards, which are placed on nodes]
  • 41. Lucene index and segments [Diagram: a Lucene index made up of segments]
  • 42. Much more than just search!
      • Real-time analytics
      • Log analysis
      • Prediction modelling
      • Recommendations
  • 43. DEMO in 5 minutes
  • 44. DEMO
      • Install ElasticSearch
      • Load in some data
      • Run a very basic search
  • 45. DEMO in 15 minutes
  • 46. Easy peasy…
      • http://www.elasticsearch.org/download
      • bin/elasticsearch or bin/elasticsearch.bat on Windows
      • http://localhost:9200/ or curl -X GET http://localhost:9200/
  • 47. Easy peasy lemon squeezy!
  • 49. Indexing data
      curl -XPUT 'http://localhost:9200/monokkel/user/aleks' -d '{ "name" : "Aleksander Stensby" }'
  • 50. Indexing data
      • shakespeare.json: http://www.elasticsearch.org/guide/en/kibana/current/snippets/shakespeare.json
      • curl -XPUT localhost:9200/_bulk --data-binary @shakespeare.json
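
The `_bulk` endpoint expects newline-delimited JSON: one action line followed by one source line per document, with a trailing newline at the end. The sketch below builds such a payload in memory; the helper name `bulk_index_body` is hypothetical, the sample document reuses the one from slide 49, and the `_type` field reflects the typed indices used in this talk (an assumption that does not hold for newer versions).

```python
import json


def bulk_index_body(index, doc_type, docs):
    """Build a newline-delimited bulk payload from (id, source) pairs."""
    lines = []
    for doc_id, source in docs:
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        lines.append(json.dumps(action))  # action/metadata line
        lines.append(json.dumps(source))  # document source line
    return "\n".join(lines) + "\n"        # the payload must end with a newline


body = bulk_index_body(
    "monokkel", "user", [("aleks", {"name": "Aleksander Stensby"})]
)
print(body, end="")
```

The resulting string is what a file such as shakespeare.json contains, and what `--data-binary` sends unchanged.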
  • 54. Mapping
      • Is it a number? String? Date?
      • Combining multiple fields?
      • Default values?
      • Stored?
      • Analyzed?
      • How should we tokenize/analyse/normalize the field?
  • 56. Mapping
      curl -XPUT http://localhost:9200/shakespeare -d '
      {
        "mappings": {
          "_default_": {
            "properties": {
              "speaker":       { "type": "string", "index": "not_analyzed" },
              "play_name":     { "type": "string", "index": "not_analyzed" },
              "line_id":       { "type": "integer" },
              "speech_number": { "type": "integer" }
            }
          }
        }
      }
      ';
  • 59. The Query DSL
      {
        "query": {YOUR_QUERY_HERE}
      }
  • 60. Match Query
      {
        "query": {
          "match": { "text_entry": "romeo" }
        }
      }
  • 61. Multi Match Query
      {
        "query": {
          "multi_match": {
            "query":  "romeo",
            "fields": [ "text_entry", "speaker" ]
          }
        }
      }
  • 62. Bool Query
      {
        "query": {
          "bool": {
            "must":     [ ],
            "must_not": [ ],
            "should":   [ ]
          }
        }
      }
  • 63. Bool Query
      {
        "query": {
          "bool": {
            "must":     { "match": { "text_entry": "romeo" } },
            "must_not": { "match": { "speaker": "ROMEO" } },
            "should": [
              { "match": { "speaker": "JULIET" } },
              { "match": { "speaker": "FRIAR LAURENCE" } }
            ]
          }
        }
      }
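
As a rough mental model of the bool query on slide 63: `must` clauses have to match, `must_not` clauses exclude, and `should` clauses only raise the rank of documents that already qualify. The in-memory sketch below mimics that behaviour on made-up documents (the lines are invented, not taken from shakespeare.json); real Elasticsearch scoring is considerably more involved than counting matching `should` clauses.

```python
LINES = [  # hypothetical sample documents
    {"speaker": "BENVOLIO", "text_entry": "have you seen romeo"},
    {"speaker": "ROMEO", "text_entry": "romeo is my name"},
    {"speaker": "JULIET", "text_entry": "romeo where are you"},
    {"speaker": "JULIET", "text_entry": "good night"},
]


def matches(doc, field, value):
    """True if value equals the whole field or one of its lowercase words."""
    text = doc[field].lower()
    return value.lower() == text or value.lower() in text.split()


def bool_search(docs, must, must_not, should):
    hits = []
    for doc in docs:
        if not all(matches(doc, f, v) for f, v in must):
            continue  # every must clause has to match
        if any(matches(doc, f, v) for f, v in must_not):
            continue  # any must_not match excludes the document
        bonus = sum(matches(doc, f, v) for f, v in should)
        hits.append((bonus, doc))
    hits.sort(key=lambda pair: pair[0], reverse=True)  # more should matches first
    return [doc for _, doc in hits]


results = bool_search(
    LINES,
    must=[("text_entry", "romeo")],
    must_not=[("speaker", "ROMEO")],
    should=[("speaker", "JULIET"), ("speaker", "FRIAR LAURENCE")],
)
print([doc["speaker"] for doc in results])
```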
  • 64. And lots more…
      filtered query, prefix query, simple query string query, range query, regexp query, term query, terms query, wildcard query, dis max query, geoshape query, nested query, more like this query, more like this field query, boosting query, common terms query, constant score query, fuzzy like this query, fuzzy like this field query, function score query, fuzzy query, has child query, has parent query, ids query, indices query, span first query, span multi term query, span near query, span not query, span or query, span term query, top children query, minimum should match, multi term query rewrite, template query
      http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-queries.html
  • 65. Filtering
      • Filters do not score, so they are faster to execute than queries
      • Filters can be cached in memory, which makes them significantly faster than queries
      If relevance is not important, use filters; otherwise, use queries!
  • 66. The Filtered Query:
      {
        "query": {
          "filtered": {
            "query":  {YOUR_QUERY_HERE},
            "filter": {YOUR_FILTER_HERE}
          }
        }
      }
  • 67. The Filtered Query:
      {
        "query": {
          "filtered": {
            "query":  { "match": { "content": "monokkel" } },
            "filter": { "term": { "tag": "awesome" } }
          }
        }
      }
  • 68. Term Filter
      {
        "query": {
          "filtered": {
            "filter": {
              "term": { "speaker": "ROMEO" }
            }
          }
        }
      }
  • 69. Terms Filter
      {
        "query": {
          "filtered": {
            "filter": {
              "terms": { "speaker": ["ROMEO", "JULIET"] }
            }
          }
        }
      }
  • 70. Bool Filter
      {
        "query": {
          "filtered": {
            "filter": {
              "bool": {
                "must":     [],
                "should":   [],
                "must_not": []
              }
            }
          }
        }
      }
  • 71. Range Filter
      {
        "query": {
          "filtered": {
            "filter": {
              "range": {
                "price": { "gt": 20, "lt": 40 }
              }
            }
          }
        }
      }
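
A filter is a plain yes/no test per document, with no scoring involved. The sketch below mimics the range filter from slide 71 in memory; `gt` and `lt` are exclusive bounds, and the product list is made up for illustration.

```python
def range_filter(docs, field, gt=None, lt=None):
    """Keep documents whose field is strictly above gt and strictly below lt."""
    kept = []
    for doc in docs:
        value = doc[field]
        if gt is not None and not value > gt:
            continue
        if lt is not None and not value < lt:
            continue
        kept.append(doc)
    return kept


PRODUCTS = [  # hypothetical documents
    {"name": "a", "price": 10},
    {"name": "b", "price": 20},
    {"name": "c", "price": 25},
    {"name": "d", "price": 40},
]

in_range = range_filter(PRODUCTS, "price", gt=20, lt=40)
print([p["name"] for p in in_range])  # prints ['c']
```

Because the outcome is only "in" or "out", the result for a given filter can be cached and reused, which is the point slide 65 makes.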
  • 72. And lots more…
      match all filter, and filter, not filter, or filter, prefix filter, query filter, regexp filter, type filter, geo bounding box filter, geo distance filter, geo distance range filter, geo polygon filter, geoshape filter, geohash cell filter, has child filter, has parent filter, ids filter, indices filter, limit filter, nested filter, script filter
      http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-filters.html
  • 74. Kibana
      • http://www.elasticsearch.org/overview/kibana/installation/
      • bin/kibana or bin/kibana.bat on Windows
      • http://localhost:5601/
  • 77. Aggregations
      • Buckets and metrics: partitioning documents based on a criterion
      SELECT COUNT(color)      <- metric
      FROM table
      GROUP BY color           <- bucket
      An aggregation is a combination of buckets and metrics
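
The SQL analogy translates directly into a few lines of Python: the bucket is the grouping key, and the metric is whatever gets computed per group (here a count). The rows are invented for illustration.

```python
from collections import Counter

ROWS = [  # hypothetical documents
    {"color": "red"},
    {"color": "blue"},
    {"color": "red"},
    {"color": "green"},
]

# Bucket by color, then apply the count metric to each bucket.
counts = Counter(row["color"] for row in ROWS)
print(dict(counts))
```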
  • 78. Aggregations ("speakers" is your aggregation name; "terms" is the bucket type)
      {
        "aggs": {
          "speakers": {
            "terms": { "field": "speaker" }
          }
        }
      }
  • 80. Aggregations
      {
        "aggs": {
          "beertypes": {
            "terms": { "field": "beertype" }
          }
        }
      }
  • 81. Aggregations ("avg_ibu" is your aggregation name; "avg" is the metric type)
      {
        "aggs": {
          "beertypes": {
            "terms": { "field": "beertype" },
            "aggs": {
              "avg_ibu": {
                "avg": { "field": "ibu" }
              }
            }
          }
        }
      }
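
What the nested aggregation on slide 81 computes can be sketched in memory: a terms bucket per `beertype`, with an `avg` metric over `ibu` inside each bucket. The beer documents and the function name `terms_with_avg` are invented, and the returned dictionary is a simplification rather than the shape of an actual Elasticsearch response.

```python
from collections import defaultdict

BEERS = [  # hypothetical documents
    {"beertype": "IPA", "ibu": 60},
    {"beertype": "IPA", "ibu": 70},
    {"beertype": "Stout", "ibu": 40},
]


def terms_with_avg(docs, bucket_field, metric_field):
    """Group documents by bucket_field, then average metric_field per group."""
    buckets = defaultdict(list)
    for doc in docs:
        buckets[doc[bucket_field]].append(doc[metric_field])
    return {
        key: {"doc_count": len(values), "avg": sum(values) / len(values)}
        for key, values in buckets.items()
    }


result = terms_with_avg(BEERS, "beertype", "ibu")
print(result)
```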
  • 82. Aggregations
      min, max, sum, avg, stats, extended stats, value count, percentiles, percentile ranks, cardinality, top hits, scripted metric, global, filter, filters, missing, nested, reverse nested, children, terms, significant terms, range, date range, ipv4 range, histogram, date histogram, geo bounds, geo distance, geohash grid
      http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations.html
  • 83. And a whole lot more!
      • Geosearch, distance and bounds
      • “More Like This”
      • Suggesters / Autocomplete
      • Percolation
      • Language drivers
      • Scripting
  • 84. Further reading and some great resources!
      • http://www.elasticsearch.org/guide/
      • http://blog.monokkel.io/
      • https://found.no/foundation/
  • 85. Shameful self-promotion: Tarjei Romtveit