The aim of this presentation is to give an enterprise architect enough information to decide whether Cassandra should be the project's data store. It covers the Cassandra architecture, how to design data for it, and how to work with that data.
3. SEDA Architecture
SEDA stands for staged event-driven architecture. Every unit of work is split into several stages that are executed in parallel threads. Each stage consists of an input and an output event queue, an event handler, and a stage controller.
6. SEDA in Cassandra - Design
The Stage Manager holds a map between stage names and Java 5 thread pool executors. Each controller, together with its queue, is represented by a ThreadPoolExecutor that can be configured through JMX.
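The stage-to-executor mapping described above can be sketched in a few lines. This is a minimal Python model, not Cassandra's Java code: the class names Stage and StageManager, the stage names PARSE and RESPOND, and the helper run_pipeline are all hypothetical, and Python's ThreadPoolExecutor stands in for the Java one (its internal work queue plays the role of the stage's input queue).

```python
import queue
from concurrent.futures import ThreadPoolExecutor


class Stage:
    """One stage: the executor supplies the input queue and worker threads."""

    def __init__(self, name, handler, workers=2):
        self.name = name
        self.handler = handler  # event handler run for every queued event
        self.executor = ThreadPoolExecutor(max_workers=workers, thread_name_prefix=name)

    def submit(self, event):
        # Enqueue the event; a free worker thread picks it up later.
        return self.executor.submit(self.handler, event)


class StageManager:
    """Map from stage name to stage (hypothetical model of the slide's description)."""

    def __init__(self):
        self.stages = {}

    def register(self, name, handler, workers=2):
        self.stages[name] = Stage(name, handler, workers)

    def get(self, name):
        return self.stages[name]

    def shutdown(self):
        # Stages are drained in registration order, so register upstream stages first.
        for stage in self.stages.values():
            stage.executor.shutdown(wait=True)


def run_pipeline(raw_values):
    manager = StageManager()
    results = queue.Queue()
    # Hypothetical stage names; PARSE hands its output to RESPOND.
    manager.register("PARSE", lambda raw: manager.get("RESPOND").submit(int(raw)))
    manager.register("RESPOND", lambda number: results.put(number * 2))
    for raw in raw_values:
        manager.get("PARSE").submit(raw)
    manager.shutdown()
    doubled = []
    while not results.empty():
        doubled.append(results.get())
    return sorted(doubled)
```

Calling run_pipeline(["1", "2", "3"]) pushes each value through both stages on separate thread pools. Sizing each pool independently is the point of the design: a slow stage can be given more threads without touching the others.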
9. Installing and launching Cassandra
Launching the server:
    bin/cassandra.bat
Use the "-f" switch to run the server in the foreground, so that all server logs are printed to standard output.
The server starts as a single-node cluster called "Test Cluster", listening on port 9160.
10. Installing and launching Cassandra
Starting the command-line client interface:
    bin/cassandra-cli.bat
The prompt [username@keyspace] is shown at the beginning of every line.
11. Creating a cluster
In the configuration file cassandra.yaml specify:
    seeds - the list of seeds for the cluster
    rpc_address and listen_address - network addresses
12. Creating a cluster
    initial_token - defines the node's token range
    auto_bootstrap - enables automatic migration of data to the new node
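Choosing initial_token values by hand is easier with a small calculation. The sketch below assumes the cluster uses RandomPartitioner, whose token space runs from 0 to 2^127, and spaces the tokens evenly so that each node owns an equal share of the ring. The function name balanced_tokens is hypothetical.

```python
def balanced_tokens(node_count, ring_size=2**127):
    """Return evenly spaced initial_token values, one per node.

    Assumes RandomPartitioner (token space 0 .. 2**127).
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return [i * ring_size // node_count for i in range(node_count)]


if __name__ == "__main__":
    for node, token in enumerate(balanced_tokens(2)):
        print(f"node {node}: initial_token: {token}")
```

For a two-node cluster this yields 0 and 2^126, a value that starts with 850705, which is consistent with the two 50% ranges in the nodetool ring output on slide 13.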
13. nodetool ring
Use nodetool to view the ring configuration:
    ~$ nodetool -h localhost -p 8080 ring
    Address         Status  State   Load     Owns   Range      Ring
                                                    850705…
    10.203.71.154   Up      Normal  2.53 KB  50.00  0          |<--|
    10.203.55.186   Up      Normal  1.33 KB  50.00  850705…    |-->|
14. Connecting to server
Connect from the command line:
    connect <HOSTNAME>/<PORT> [<USERNAME> '<PASSWORD>'];
Examples:
    connect localhost/9160;
    connect 127.0.0.1/9160 user 'password';
Connect when starting the command-line client:
    cassandra-cli -h,--host <HOSTNAME> -p,--port <PORT> -k,--keyspace <KEYSPACE> -u,--username <USERNAME> -pw,--password <PASSWORD>
15. Describing environment
    show cluster name;
    show keyspaces;
    show api version;
    describe cluster;
    describe keyspace [<KEYSPACE>];
17. Create keyspace
Example:
    create keyspace Keyspace1 with placement_strategy = 'org.apache.cassandra.locator.RackUnawareStrategy' and replication_factor = 4;
18. Update keyspace
Update the attributes of an existing keyspace:
    update keyspace <KEYSPACE> with <ATTR1> = <VAL1> and <ATTR2> = <VAL2> ...;
19. Switch to keyspace
    use <KEYSPACE>;
    use <KEYSPACE> [<USERNAME> '<PASSWORD>'];
If you don't specify a username and password, the credentials supplied to the 'connect' statement are used.
If the server doesn't support authentication, it ignores the credentials.
20. Switch to keyspace
Example:
    use Keyspace1 user1 'qwerty123';
After switching to the keyspace, the prompt [user1@Keyspace1] is shown at the beginning of every line.
21. Create column family
    create column family <COL_FAMILY>;
    create column family <COL_FAMILY> with <ATTR1> = <VAL1> and <ATTR2> = <VAL2> ...;
Example:
    create column family Users with column_type = Super and comparator = UTF8Type and rows_cached = 1000;
22. Update column family
Once a column family is created, you can update its attributes:
    update column family <COL_FAMILY> with <ATTR1> = <VAL1> and <ATTR2> = <VAL2> ...;
24. Comparators and validators
You can specify a comparator for the column family and one for all subcolumns in the column family (a single comparator for all of them).
You can specify a validator for each known column of the column family.
You can specify a default validator for the column family; it is used for columns that have no validator of their own.
You can specify a key validator, which validates row keys.
25. Attributes of column family
    column_type: can be Standard or Super (default: Standard)
    comparator: specifies how column names are compared for sort order
    column_metadata: defines the validation and indexes for known columns
    default_validation_class: validator to use for values in columns that are not listed in column_metadata (default: BytesType)
    key_validation_class: validator for keys
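The effect of the comparator attribute on column order is easiest to see with numeric column names. The following is a rough Python model, not Cassandra's implementation: it sorts the same column names once as text (as a UTF8Type comparator would) and once as numbers (as a LongType comparator would). The function name sort_columns is hypothetical.

```python
# Hypothetical model: each comparator name maps to a sort key for column names.
COMPARATORS = {
    "UTF8Type": lambda name: name,       # compare column names as text
    "LongType": lambda name: int(name),  # compare column names as numbers
}


def sort_columns(column_names, comparator):
    """Return column names in the order the given comparator would store them."""
    return sorted(column_names, key=COMPARATORS[comparator])


if __name__ == "__main__":
    names = ["10", "9", "2"]
    print(sort_columns(names, "UTF8Type"))
    print(sort_columns(names, "LongType"))
```

With text comparison "10" sorts before "2", while numeric comparison gives 2, 9, 10. Because the comparator determines the stored column order, it should match the kind of column slices the application intends to read.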
26. Column metadata
You can define a validator for each known column in the family:
    create column family User with column_metadata = [
        {column_name: name, validation_class: UTF8Type},
        {column_name: age, validation_class: IntegerType},
        {column_name: birth, validation_class: UTF8Type}
    ];
Columns not listed in this section are validated with default_validation_class.
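The lookup rule (use the column's own validator if one is listed, otherwise fall back to default_validation_class) can be modelled as a dictionary lookup with a default. This is a simplified sketch: the type checks below are stand-ins for Cassandra's validators, BytesType is modelled as accepting any value, and the function name is_valid is hypothetical.

```python
# Simplified stand-ins for the validators; not Cassandra's actual checks.
VALIDATORS = {
    "UTF8Type": lambda value: isinstance(value, str),
    "IntegerType": lambda value: isinstance(value, int) and not isinstance(value, bool),
    "BytesType": lambda value: True,  # modelled as accepting anything
}

# Mirrors the column_metadata of the User column family on the slide.
COLUMN_METADATA = {"name": "UTF8Type", "age": "IntegerType", "birth": "UTF8Type"}


def is_valid(column, value, metadata=COLUMN_METADATA, default_validation_class="BytesType"):
    """Validate a value with the column's validator, or the default if none is listed."""
    validator_name = metadata.get(column, default_validation_class)
    return VALIDATORS[validator_name](value)
```

Here is_valid("age", "thirty") is rejected because age is declared as IntegerType, while an unlisted column such as "nickname" falls through to the default validator and accepts any value.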
27. Secondary indexes
Secondary indexes allow queries by column value:
    get users where name = 'Some user';
An index can be built in the background.
28. Creating index
Define the index in the column metadata. For example, in cassandra-cli:
    create column family users with comparator = UTF8Type and column_metadata = [{column_name: birth_date, validation_class: LongType, index_type: KEYS}];
29. Some restrictions
Cassandra uses hash indexes instead of B-tree indexes. Thus, the where condition must contain at least one indexed field with the "=" operator.
So you can't use:
    get users where birth_date > 1970;
but you can use:
    get users where birth_date = 1990 and karma > 50;
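The reason for this restriction can be illustrated with a toy hash index. The sketch below is a Python model, not Cassandra's implementation: the class KeysIndex, the function query, and the sample rows are hypothetical. The index maps a column value to the set of row keys that have it, so it can answer an equality lookup directly, and any further condition such as karma > 50 is applied as a filter over the rows that lookup returns.

```python
from collections import defaultdict


class KeysIndex:
    """Toy hash index: column value -> set of row keys."""

    def __init__(self, column):
        self.column = column
        self.buckets = defaultdict(set)

    def add(self, row_key, row):
        if self.column in row:
            self.buckets[row[self.column]].add(row_key)

    def lookup(self, value):
        # Only exact matches are possible; a hash index has no value ordering.
        return self.buckets.get(value, set())


def query(rows, index, equals_value, extra_predicate=lambda row: True):
    """Equality lookup through the index, then filter the candidate rows."""
    candidates = index.lookup(equals_value)
    return sorted(key for key in candidates if extra_predicate(rows[key]))


# Hypothetical sample data.
USERS = {
    "ivan": {"birth_date": 1990, "karma": 70},
    "makar": {"birth_date": 1990, "karma": 10},
    "olga": {"birth_date": 1985, "karma": 90},
}

BIRTH_INDEX = KeysIndex("birth_date")
for user_key, user_row in USERS.items():
    BIRTH_INDEX.add(user_key, user_row)
```

In this model, query(USERS, BIRTH_INDEX, 1990, lambda row: row["karma"] > 50) first narrows the rows to those born in 1990 and then filters by karma. A condition like birth_date > 1970 has no bucket to start from, which is why at least one equality condition is required.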
31. Writing data
To write data, use the set command:
    set Customers['ivan']['name'] = 'Ivan';
    set Customers['makar']['info']['age'] = 96;
32. Reading data
To read data, use the get command:
    get Customers['ivan']['name'];   - displays 'Ivan'
    get Customers['makar'];          - displays all columns for key 'makar'
33. Reading data
To list a range of rows, use the list command:
    list Customers;
    list Customers[a:];
    list Customers[a:c] limit 40;
You can limit the number of rows displayed (default: 100).
34. Reading data
To get the number of columns, use the count command:
    count Customers['ivan'];   - displays the number of columns for key 'ivan'
35. Deleting data
To delete a row, a column, or a subcolumn, use the del command:
    del Customers['ivan'];   - deletes all columns for key 'ivan'
    del Customers['ivan']['name'];   - deletes column 'name' for key 'ivan'
    del Customers['ivan']['accounts']['2312784829312343'];   - deletes the subcolumn with this account number from the 'accounts' column for key 'ivan'
36. Deleting data
To delete all data in a column family, use the truncate command:
    truncate Customers;
37. Drop column family or keyspace
    drop column family Customers;
    drop keyspace Keyspace1;
39. Resources
Home of the Apache Cassandra project: http://cassandra.apache.org/
Apache Cassandra Wiki: http://wiki.apache.org/cassandra/
Documentation provided by DataStax: http://www.datastax.com/docs/0.8/
A good explanation of creating secondary indexes: http://www.anuff.com/2010/07/secondary-indexes-in-cassandra.html
Eben Hewitt, "Cassandra: The Definitive Guide", O'Reilly, 2010, ISBN: 978-1-449-39041-9