Cassandra Basics: Indexing
Benjamin Black
An introduction to indexing with supercolumns and range queries in Cassandra.
1.
Cassandra Basics: Indexing
Benjamin Black, b@b3k.us
2.
Relational stores are SCHEMA ORIENTED
3.
Start from your SCHEMA & WORK FORWARDS
4.
Column stores are QUERY ORIENTED
5.
Start from your QUERIES & WORK BACKWARDS
6.
AT SCALE
7.
AT SCALE
Denormalization is THE NORM
8.
AT SCALE
9.
AT SCALE
Everything depends on THE INDICES
10.
Cassandra is an INDEX CONSTRUCTION KIT
11.
Column Family
12.
Two-level Map
key: { column: value, column: value, ... }
13.
Super Column Family
14.
Three-level Map
key: { supercolumn: { column: value, column: value }, supercolumn: { ... } }
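The two map shapes can be pictured as nested dictionaries. The following is a minimal sketch, not Cassandra client code: the helper names `cf_get` and `scf_get` are hypothetical, and the sample entries are borrowed from the user example later in the deck.

```python
# Column family: row key -> {column name -> value} (two levels).
users = {
    "b": {"name": "Ben", "city": "Seattle", "state": "WA"},
}

# Super column family: row key -> {supercolumn -> {column name -> value}} (three levels).
location_index = {
    "WA": {"Seattle": {"Ben": "b"}},
}


def cf_get(cf, key, column):
    """Hypothetical helper: read one value from a two-level map."""
    return cf[key][column]


def scf_get(scf, key, supercolumn, column):
    """Hypothetical helper: read one value from a three-level map."""
    return scf[key][supercolumn][column]


print(cf_get(users, "b", "name"))
print(scf_get(location_index, "WA", "Seattle", "Ben"))
```

The only structural difference is the extra supercolumn level between the row key and the columns.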
15.
column sorting defined by CompareWith / CompareSubcolumnsWith
16.
TimeUUIDType
UTF8Type
ASCIIType
LongType
LexicalUUIDType
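Why the comparator matters can be shown with a small analogy. This sketch assumes nothing about Cassandra internals (real comparators work on the serialized column names); it only contrasts comparing the same names as text, roughly what a UTF8Type-style ordering does, with comparing them as numbers, roughly what a LongType-style ordering does.

```python
# The same four column names, ordered two different ways.
column_names = ["10", "9", "2", "100"]

# Text comparison (analogous to UTF8Type): character by character.
as_text = sorted(column_names)

# Numeric comparison (analogous to LongType): by integer value.
as_long = sorted(column_names, key=int)

print(as_text)  # ['10', '100', '2', '9']
print(as_long)  # ['2', '9', '10', '100']
```

Because columns come back in comparator order, the choice decides what a slice over column names returns.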
17.
row placement determined by Partitioner
18.
RandomPartitioner: place based on MD5 of key
OrderPreservingPartitioner: place based on actual key
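The difference between the two placement rules can be sketched in a few lines. This is an illustration only: the function names are hypothetical, and Cassandra's actual token arithmetic may differ from the plain big-endian integer used here.

```python
import hashlib


def random_partitioner_token(key: str) -> int:
    """Illustrative token: the MD5 digest of the key read as an integer."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")


def order_preserving_token(key: str) -> str:
    """Illustrative token: the key itself, so key order is kept."""
    return key


keys = ["WA/Seattle", "CA/San Francisco", "WA/Bellingham", "MA/Boston"]

# Placement order around the ring under each rule.
by_md5 = sorted(keys, key=random_partitioner_token)
by_key = sorted(keys, key=order_preserving_token)

print(by_md5)  # hash order: neighbouring keys are scattered
print(by_key)  # key order: the "WA/..." rows end up adjacent
```

Hashing spreads load evenly but destroys adjacency between related keys; keeping key order is what makes range scans over keys meaningful.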
19.
Rows are sorted by key on each node, regardless of partitioner
20.
One example in TWO ACTS
21.
Prelude: A USER DATABASE
22.
<ColumnFamily Name="Users" CompareWith="UTF8Type" />
23.
"b": {"name": "Ben", "street": "1234 Oak St.", "city": "Seattle", "state": "WA"}
"jason": {"name": "Jason", "street": "456 First Ave.", "city": "Bellingham", "state": "WA"}
"zack": {"name": "Zack", "street": "4321 Pine St.", "city": "Seattle", "state": "WA"}
"jen1982": {"name": "Jennifer", "street": "1120 Foo Lane", "city": "San Francisco", "state": "CA"}
"albert": {"name": "Albert", "street": "2364 South St.", "city": "Boston", "state": "MA"}
24.
SELECT name FROM Users WHERE state='WA'
25.
SELECT name FROM Users WHERE state='WA'
How is the WHERE clause formed?
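To see why the WHERE clause is the hard part, consider what answering it takes when rows can only be looked up by key. The sketch below models the Users column family as a plain dictionary (an assumption for illustration) and answers the query the only way available without an index: by scanning every row. The function name `names_in_state` is hypothetical.

```python
# The Users rows from the slide, keyed by row key.
users = {
    "b": {"name": "Ben", "street": "1234 Oak St.", "city": "Seattle", "state": "WA"},
    "jason": {"name": "Jason", "street": "456 First Ave.", "city": "Bellingham", "state": "WA"},
    "zack": {"name": "Zack", "street": "4321 Pine St.", "city": "Seattle", "state": "WA"},
    "jen1982": {"name": "Jennifer", "street": "1120 Foo Lane", "city": "San Francisco", "state": "CA"},
    "albert": {"name": "Albert", "street": "2364 South St.", "city": "Boston", "state": "MA"},
}


def names_in_state(rows, state):
    """Full scan: inspect every row and keep those in the given state."""
    return sorted(row["name"] for row in rows.values() if row.get("state") == state)


print(names_in_state(users, "WA"))  # ['Ben', 'Jason', 'Zack']
```

The scan touches every row regardless of how many match, which is the cost an index is built to avoid.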
26.
Act One: Supercolumn Indexing
27.
<ColumnFamily Name="LocationUserIndexSCF" CompareWith="UTF8Type" CompareSubcolumnsWith="UTF8Type" ColumnType="Super" />
28.
[state]: {
  [city1]: {[name1]:[user1], [name2]:[user2], ... },
  [city2]: {[name3]:[user3], [name4]:[user4], ... },
  ...
  [cityX]: {[name5]:[user5], [name6]:[user6], ... }
}
29.
"CA": {
  "San Francisco": {"Jennifer": "jen1982"}
}
"MA": {
  "Boston": {"Albert": "albert"}
}
"WA": {
  "Bellingham": {"Jason": "jason"},
  "Seattle": {"Ben": "b", "Zack": "zack"}
}
30.
Row Key
"CA": {
  "San Francisco": {"Jennifer": "jen1982"}
}
"MA": {
  "Boston": {"Albert": "albert"}
}
"WA": {
  "Bellingham": {"Jason": "jason"},
  "Seattle": {"Ben": "b", "Zack": "zack"}
}
31.
Row Key, Super Column
32.
Row Key, Super Column, Column
33.
Row Key, Super Column, Column, Value
34.
Show me EVERYONE IN WASHINGTON
35.
get(:LocationUserIndexSCF, 'WA')
36.
{
  "Bellingham": {"Jason": "jason"},
  "Seattle": {"Ben": "b", "Zack": "zack"}
}
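The supercolumn index can be sketched as a three-level dictionary built from the user rows. This is a minimal in-memory model, not Cassandra client code: `build_supercolumn_index` and `get_row` are hypothetical names, and the street column is left out because the index does not use it.

```python
# User rows from the Prelude (street omitted; the index does not need it).
users = {
    "b": {"name": "Ben", "city": "Seattle", "state": "WA"},
    "jason": {"name": "Jason", "city": "Bellingham", "state": "WA"},
    "zack": {"name": "Zack", "city": "Seattle", "state": "WA"},
    "jen1982": {"name": "Jennifer", "city": "San Francisco", "state": "CA"},
    "albert": {"name": "Albert", "city": "Boston", "state": "MA"},
}


def build_supercolumn_index(rows):
    """Build state -> city -> name -> user key from the user rows."""
    index = {}
    for user_key, row in rows.items():
        cities = index.setdefault(row["state"], {})       # row key: state
        names = cities.setdefault(row["city"], {})        # supercolumn: city
        names[row["name"]] = user_key                     # column: name, value: user key
    return index


def get_row(index, state):
    """Single-key lookup, standing in for get(:LocationUserIndexSCF, 'WA')."""
    return index.get(state, {})


location_user_index = build_supercolumn_index(users)
print(get_row(location_user_index, "WA"))
```

One key lookup now returns every user in the state, at the price of writing each user into the index as well as into Users.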
37.
Act Two: Composite Key Indexing
38.
Order Preserving Partitioner + Range Queries
39.
<ColumnFamily Name="LocationUserIndexCF" CompareWith="UTF8Type" />
40.
[state1]/[city1]: {[name1]:[user1], [name2]:[user2], ... }
[state1]/[city2]: {[name3]:[user3], [name4]:[user4], ... }
[state2]/[city1]: {[name5]:[user5], [name6]:[user6], ... }
...
[stateX]/[cityY]: {[name7]:[user7], [name8]:[user8], ... }
41.
"CA/San Francisco": {"Jennifer": "jen1982"}
"MA/Boston": {"Albert": "albert"}
"WA/Bellingham": {"Jason": "jason"}
"WA/Seattle": {"Ben": "b", "Zack": "zack"}
42.
Show me EVERYONE IN WASHINGTON
43.
get_range(:LocationUserIndexCF, {:start => 'WA', :finish => 'WB'})
44.
{
  "WA/Bellingham": {"Jason": "jason"},
  "WA/Seattle": {"Ben": "b", "Zack": "zack"}
}
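The range query works because the composite keys are stored in sorted order, so every key beginning with "WA" falls between 'WA' and 'WB'. A minimal sketch of that idea over an in-memory dictionary follows; the `get_range` function here is a hypothetical stand-in for the client call, and treating `finish` as inclusive is an assumption of the sketch.

```python
from bisect import bisect_left, bisect_right

# Composite-key index rows from the slide: "state/city" -> {name: user key}.
location_user_index = {
    "CA/San Francisco": {"Jennifer": "jen1982"},
    "MA/Boston": {"Albert": "albert"},
    "WA/Bellingham": {"Jason": "jason"},
    "WA/Seattle": {"Ben": "b", "Zack": "zack"},
}


def get_range(cf, start, finish):
    """Return the rows whose keys sort between start and finish (both inclusive)."""
    keys = sorted(cf)                 # order-preserving placement keeps keys sorted
    lo = bisect_left(keys, start)     # first key >= start
    hi = bisect_right(keys, finish)   # one past the last key <= finish
    return {key: cf[key] for key in keys[lo:hi]}


print(get_range(location_user_index, "WA", "WB"))
```

Narrowing the bounds narrows the result: a range from "WA/Seattle" to "WA/Seattle" returns only that city's row.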
45.
Finale: BUILD SOMETHING AWESOME
46.
(This part is up to you)
47.
Appendix: EXAMPLE KEYSPACE
48.
<Keyspace Name="UserDb">
  <ColumnFamily Name="Users" CompareWith="UTF8Type" />
  <ColumnFamily Name="LocationUserIndexSCF" CompareWith="UTF8Type" CompareSubcolumnsWith="UTF8Type" ColumnType="Super" />
  <ColumnFamily Name="LocationUserIndexCF" CompareWith="UTF8Type" />
  <ReplicaPlacementStrategy>org.apache.cassandra.locator.RackUnawareStrategy</ReplicaPlacementStrategy>
  <ReplicationFactor>1</ReplicationFactor>
  <EndPointSnitch>org.apache.cassandra.locator.EndPointSnitch</EndPointSnitch>
</Keyspace>