At Choice Hotels International, we are in the midst of a multi-year effort to replace our 25-year-old monolithic reservation system with a cloud-based, microservice-style architecture using Cassandra. Since processing the first live reservation on the new system in December 2015, we've been shifting an increasing amount of shopping and booking traffic to the new system, with retirement of the old system scheduled for early 2017.
After a quick review of our problem space, architecture, schema design, and Cassandra deployment, we'll take a closer look at several challenges we faced and discuss how they impacted our data modeling, development, and deployment:
* Managing data with varying consistency requirements
* Maintaining data integrity across microservice boundaries
* Performing complex queries involving overlapping time ranges
* Relying on time-to-live (TTL) for data cleanup
* Balancing denormalization, performance and cost
About the Speakers
Andrew Baker Senior Software Engineer, Choice Hotels International
Andrew is the technical lead of the service development team responsible for storage and maintenance of rates and reservations for thousands of hotels around the world.
Jeffrey Carpenter Systems Architect, Choice Hotels International
Jeff Carpenter is a software and systems architect with experience in the hospitality and defense industries. Jeff is currently working on a cloud-based hotel reservation system using Cassandra and is the author of the new O'Reilly book "Cassandra: The Definitive Guide, 2nd Edition".
(Jeff)
The reservation system interfaces with many of our other IT systems, so replacing it is a major undertaking
We interface with property systems so our franchisees can tell us about their room types, rates, and inventory
Internal channels like our website and mobile applications allow customers to shop and book rooms
External channels as well
Reporting and billing systems pull information about reservations
We interface with customer and loyalty systems to credit stays and support reward reservations
(Jeff)
Our current reservation system is over 25 years old - written in C and running on a large UNIX box with a traditional RDBMS
We’re currently making reservations for over 6000 hotels worldwide, and distributing through over 50 different channels – everything from our own website and mobile apps to GDS and OTA partners
This system is very performant and reliable, servicing over 4000 TPS
However, the system scales vertically - we need horizontal scalability for future growth
(Jeff)
Here are some of the tenets of our architecture that led to our use of Cassandra:
We wanted a microservices architecture based primarily on RESTful APIs.
We designed for the cloud to run anywhere
Externalize business rules
We use open source infrastructure where possible
The key architectural qualities we focus on are scalability, stability and security
(Jeff)
These are a few of the elements in our stack. We are mostly OSS
We use Cassandra as our primary data store
We use Spark and related technologies for reporting and analytics tasks
Drools is our rule engine
Our RESTful microservices are deployed using technologies such as Tomcat and Spring
We’re using Netflix open source technologies including Hystrix and the Simian Army to build in high availability
(Jeff)
We’ve deployed our system in 3 AWS regions, two of which are currently active, with more on the way
We’ve recently upgraded to Direct Connect between regions and our legacy data center, which has helped resolve some of the latency issues we’ll discuss later.
Our application is currently running on a single 12 node cluster (although we’ve been up as high as 18)
We’re running Cassandra 2.2 series releases but are planning to upgrade to 3.x releases in order to take advantage of materialized views
This is a starting point; we’ll see where it grows…
(Jeff)
As our project has matured, so has our use of Cassandra
When we started the project in 2014, one of our first tasks was a Cassandra proof of concept where we scaled up to more writes a second than we thought we might reasonably need. We were excited to learn the ways of Cassandra data modeling and configuration.
As we worked toward an initial capability at the end of 2015, we had some growing pains as we learned the ropes of operating Cassandra in production. We put a lot of automation around Cassandra deployment and cluster maintenance and learned a lot about how to manage repairs. We ended 2015 with a successful Beta release where we booked our first reservation in production
In 2016 we’re continuing to add functionality while increasing the traffic and working on improving scalability, stability and security
In 2017 we’ll complete migration of our various partners and internal systems to the new system and retire the old reservation system
(Andrew)
Hotels - descriptive data about the hotels, their products, and their policies. Quite static
Rates - prices that are charged for the products.
These can change many times a day and may be driven by an automated pricing system
Inventory - constantly changes as rooms are booked, cancelled, etc. Data quality and currency is extremely important here so we don’t oversell our hotels
Reservations - our contract with the customer. Generally changed only when initiated by the customer, so changes are infrequent (The Marriott may disagree after dealing with me)
(Andrew)
After we identified our key data types, they seemed like a good way to divide the work and the system landscape
As work commenced, we actually divided things a little further and kept the keyspace-per-service idea going, approaching a shared-nothing architecture style.
We stayed with a single cluster for now to ease operations and reduce cost.
On top of this we added services to encapsulate the complexities of shopping and booking
We used rules at this level to define business logic likely to change
We also built data maintenance applications to:
Synchronize data from other systems – our legacy system as well as some other systems that will stay in operation, such as property management systems
Verify data accuracy across systems and across service boundaries
Correct data issues caused by defects
(Andrew)
One of the things we’ve learned over time is that not all of our data has the same consistency requirements.
Our hotel owners make their living off of their rates and inventory; if we sell too few rooms or offer the wrong rate, they are well justified in complaining and may be compensated for the trouble.
Customers hate being sold a room that wasn’t actually available, and it costs us money.
With all of the talk of consistency requirements, there is always an undercurrent of performance requirements. Slow websites don’t sell rooms; we have to deliver the goods quickly, so we have to give ourselves some wiggle room on consistency – we can’t use EACH_QUORUM everywhere
And sometimes we might really need a LWT
We implemented varying consistency by using separate CLs for each query, carefully considering the tradeoffs and documenting the decisions
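As an illustration, setting a per-query consistency level with the Java driver looks something like this (a minimal sketch; the table and key names are made up for the example, not our actual schema):

```java
import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;

public class ConsistencyByQuery {
    // Hotel descriptions are nearly static, so a fast, weaker read is acceptable
    public static Row readHotel(Session session, String hotelId) {
        SimpleStatement read = new SimpleStatement(
                "SELECT * FROM hotels.hotels_by_id WHERE hotel_id = ?", hotelId);
        read.setConsistencyLevel(ConsistencyLevel.LOCAL_ONE);
        return session.execute(read).one();
    }

    // Inventory counts are money, so writes pay for stronger consistency
    public static void writeInventory(Session session, String hotelId,
                                      String roomType, int available) {
        SimpleStatement write = new SimpleStatement(
                "UPDATE inventory.counts SET available = ? " +
                "WHERE hotel_id = ? AND room_type = ?",
                available, hotelId, roomType);
        write.setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM);
        session.execute(write);
    }
}
```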
(Jeff)
One of the challenges of a microservices architecture is keeping changes in sync across service boundaries. One example situation is in booking a reservation.
Since the reservation represents our contract with the customer to reserve a specific room at a specific price and with certain conditions, we need to mark a reservation as committed at the same time as we reserve the inventory.
This is important so that we don’t accidentally overbook our hotel. Making the situation more complicated, there could be simultaneous bookings and data maintenance activities also trying to access the same inventory
Since these data types are split across microservice boundaries, there is no transaction mechanism. In fact, since the data lives in different rows (and different tables), Cassandra’s lightweight transactions are of no use to us here.
We solved this with a layered approach – LWTs to protect inventory counts, retries within the booking service, and compensating processes to detect and clean up failures
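A sketch of the inventory-protecting LWT (illustrative table and keys; in the real flow the booking service retries on contention and a compensating process reconciles failures):

```java
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;

public class InventoryGuard {
    // Decrement the count only if it still holds the value we read, so a
    // concurrent booking can't slip in between our read and our write.
    public static boolean reserveOne(Session session, String hotelId,
                                     String roomType, String date, int expected) {
        ResultSet rs = session.execute(new SimpleStatement(
                "UPDATE inventory.counts SET available = ? " +
                "WHERE hotel_id = ? AND room_type = ? AND date = ? " +
                "IF available = ?",
                expected - 1, hotelId, roomType, date, expected));
        // wasApplied() is false when another writer won the race; the caller
        // re-reads the count and retries, or fails the booking and lets the
        // compensating process clean up
        return rs.wasApplied();
    }
}
```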
(Jeff)
Thankfully we have a variety of tools in our toolbox for guaranteeing consistency. Some of these are provided by Cassandra, but some of them are architectural approaches.
(Andrew)
Our hard work of planning consistency levels bit us in short order when we began migrating our data
We chose to migrate inventory in the same way we would maintain it, by processing update notifications
Our legacy system was able to perform a recap, sending an update for every record
When updating inventory counts from other systems, we wanted to make sure that we only accept counts for blocks of inventory that already exist.
Creating a block is an infrequent and low priority activity, so we picked CL_ONE.
Unfortunately, block creation is quite frequent when a hotel is not already set up, and it is immediately followed by setting the inventory count
This breaks the “check block” step, which then breaks the inventory sync for that hotel.
This was a problem that only appeared when we were running at scale and doing a large inventory sync.
We realized that we needed to take a step back and evaluate flows more carefully in order to assign the right consistency levels.
(Andrew)
When we deployed multiple regions for the first time, we soon realized more of our CLs were wrong, with our EACH_QUORUM requests all timing out for short periods
Having gotten it wrong the first time, we had little confidence in getting it right now. Rather than mope, we made it easy to change, exposing environment variables for the read and write consistency settings
Making consistency level a matter of configuration enables us to launch instances with different consistency levels and measure the differences in speed and accuracy of data.
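A minimal sketch of that configuration hook (the variable names are illustrative):

```java
import com.datastax.driver.core.ConsistencyLevel;

public class ConsistencyConfig {
    // Each instance reads its consistency levels from the environment, so we
    // can run instances side by side with different settings and compare
    // latency and data accuracy.
    public static final ConsistencyLevel READ_CL = fromEnv("READ_CONSISTENCY");
    public static final ConsistencyLevel WRITE_CL = fromEnv("WRITE_CONSISTENCY");

    private static ConsistencyLevel fromEnv(String name) {
        return ConsistencyLevel.valueOf(
                System.getenv().getOrDefault(name, "LOCAL_QUORUM"));
    }
}
```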
(Andrew)
We started with EACH_QUORUM CLs but found our VPN configuration didn’t allow us to achieve QUORUM across regions with the default timeout settings
We had considered that and created a downgrading consistency retry policy.
If you think you can tolerate downgrading your consistency on retry, you can probably tolerate that low of a level to begin with.
If the initial request times out regularly, you are essentially adding the timeout setting to your latency
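For reference, wiring in the driver's downgrading policy looks roughly like this (wrapping it in the driver's LoggingRetryPolicy is a suggestion here, to make the hidden cost of each downgrade visible):

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.policies.DowngradingConsistencyRetryPolicy;
import com.datastax.driver.core.policies.LoggingRetryPolicy;

public class DowngradingClusterFactory {
    public static Cluster build(String... contactPoints) {
        return Cluster.builder()
                .addContactPoints(contactPoints)
                // retry failed requests at a lower consistency level,
                // logging every downgrade so it shows up in our metrics
                .withRetryPolicy(new LoggingRetryPolicy(
                        DowngradingConsistencyRetryPolicy.INSTANCE))
                .build();
    }
}
```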
Our initial deployment had our nodes communicating with each other through a VPN in Phoenix, back when we did not have Direct Connect
We have just switched to Direct Connect and hope to soon try higher consistency levels for some of our queries
(Jeff)
We’ve made use of Netflix’s Simian Army in order to build reliability into the system
Part of this was allowing Chaos Monkey to kill Cassandra nodes to make sure our clusters could survive losing nodes abruptly
This helped us mature our cluster monitoring and test automated cluster management capabilities.
It also helped us uncover an unexpected behavior of the Java driver, which has since been fixed in the 3.0 driver.
Our configuration used a DNS name to locate the nodes in the cluster. Calling the “addContactPoint()” operation with the well-known name initially bound to a single address record. If this happened to be the node that was killed by Chaos Monkey, then until the record was cleared from DNS, the driver would fail to connect to that node and would be unable to bootstrap. We worked around this by calling addContactPoints() instead, which binds to multiple IP addresses so that it can make multiple connection attempts.
This has been fixed in the 3.0 driver – addContactPoint() now binds to multiple IPs if you use a DNS name.
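A sketch of the workaround (the hostname is illustrative):

```java
import com.datastax.driver.core.Cluster;
import java.net.InetAddress;
import java.net.UnknownHostException;

public class ClusterConnector {
    // Resolve the well-known name to all of its address records up front and
    // hand the driver every IP, so bootstrap can survive one dead node.
    public static Cluster connect(String dnsName) throws UnknownHostException {
        InetAddress[] nodes = InetAddress.getAllByName(dnsName);
        return Cluster.builder()
                .addContactPoints(nodes)
                .build();
    }
}
```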
The moral of the story - to help mitigate common connection issues, we created a common library to manage connections and connectivity across services. It loads connection information from the environment, including the cluster name and security credentials.
(Andrew)
We store the rates per day
We need all the rates that overlap the requested arrival and departure dates.
We could iterate through the days in the search range, but we wanted to keep ourselves as flexible as we could, in order to handle other rate units, such as meeting rooms for 4 hours or cabins for 7 days.
Doing a true search for overlapping ranges requires a range search against the start and end time of each rate. Since Cassandra doesn’t support range queries over multiple attributes, we determined that searching for all rates starting before the departure date would usually eliminate the largest volume of data; the service could then trim the rest.
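A sketch of that split between Cassandra and the service layer, using the 3.x Java driver (the schema and column names are illustrative):

```java
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

public class RateSearch {
    // Cassandra evaluates the single range predicate (rates starting before
    // the departure date); the service trims rates that end before arrival.
    public static List<Row> overlapping(Session session, String hotelId,
                                        Date arrival, Date departure) {
        List<Row> result = new ArrayList<>();
        for (Row row : session.execute(new SimpleStatement(
                "SELECT * FROM rates.rates_by_hotel " +
                "WHERE hotel_id = ? AND rate_start < ?",
                hotelId, departure))) {
            if (!row.getTimestamp("rate_end").before(arrival)) {
                result.add(row);
            }
        }
        return result;
    }
}
```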
(Jeff)
After religiously following the mantra of designing a table for each access pattern, we soon ran into cases where adding a table per unique access pattern proved to be too much
Take for example hotels and the number of ways by which various clients could search for hotels
Since the hotel records are quite large, imagine the impact of all of these tables on our cluster size and storage requirements for 6000+ hotels.
We reined this in by designing tables to support multiple queries and doing some filtering at the service layer, which helped us control our computing costs.
We’re also looking to move to Cassandra 3.X in order to take advantage of materialized views and SASI indexes, which will allow us to shift some of the processing burden back to the database
(Andrew)
As the previous slide implies, we are frequently adding to our schemas
Our platform team made the schema creation and evolution part of our deployment pipeline
When changing schemas, we produce separate CQL files, in numbered order, so that the pipeline can apply them sequentially. We also provide rollback scripts so that we can return the database to a stable state if a deployment fails.
This is a fairly standard schema management approach.
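A minimal sketch of the apply step under that numbered-file convention (the names are illustrative, and the statement splitting is deliberately naive):

```java
import com.datastax.driver.core.Session;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

public class SchemaMigrator {
    // Apply numbered CQL files (001_..., 002_...) in lexical order so every
    // environment sees the same sequence of schema changes.
    public static void apply(Session session, String dir) throws IOException {
        try (Stream<Path> files = Files.list(Paths.get(dir))) {
            files.filter(p -> p.toString().endsWith(".cql"))
                 .sorted()
                 .forEach(p -> execute(session, p));
        }
    }

    private static void execute(Session session, Path script) {
        try {
            // naive split on ';' – fine for DDL scripts without literals
            for (String stmt : new String(Files.readAllBytes(script)).split(";")) {
                if (!stmt.trim().isEmpty()) {
                    session.execute(stmt.trim());
                }
            }
        } catch (IOException e) {
            throw new RuntimeException("Failed to apply " + script, e);
        }
    }
}
```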
(Jeff)
In order to keep ourselves on track for database sizing and cost management, we’ve implemented a process to estimate these up front
Our estimation process begins with proposed schemas
Then in order to project size, we need to make some educated guesses – how long will string attributes be?
How many elements will be in a Set, List, or Map?
What are the likely partition and row counts for each table?
Given this information, we use Artem Chebotko’s formulas to estimate the total data size for each table and then keyspace
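Roughly, those formulas look like this (Chebotko's notation, as presented in the Definitive Guide: N_r is rows per partition, N_c total columns, N_pk primary key columns, N_s static columns; c_k, c_s, c_r, c_c are partition key, static, regular, and clustering columns, and t_avg is the average timestamp overhead per cell):

```latex
% number of values (cells) in a partition
N_v = N_r \left( N_c - N_{pk} - N_s \right) + N_s

% estimated size of a partition
S_t = \sum_i \mathrm{sizeOf}(c_{k_i}) + \sum_j \mathrm{sizeOf}(c_{s_j})
    + N_r \times \left( \sum_k \mathrm{sizeOf}(c_{r_k}) + \sum_l \mathrm{sizeOf}(c_{c_l}) \right)
    + N_v \times \mathrm{sizeOf}(t_{avg})
```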
Based on our estimates, the majority of our data consists of rates and inventory
(Andrew)
We have separated the shopping and booking concerns from our analysis and history uses, which means that in the shopping and booking systems, data about the past is of little use.
As we insert our data, we set the TTL for when it will no longer be needed, which saves us from developing our own cleanup process and reduces our storage footprint.
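A sketch of such a TTL'd write (the table and names are illustrative; the expiry is computed from when the record stops mattering for shopping and booking):

```java
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;
import java.time.Duration;
import java.time.Instant;

public class ExpiringWriter {
    // Set the TTL at insert time so Cassandra purges the row on its own
    // once it is no longer useful for shopping/booking (e.g., after checkout).
    public static void insertReservation(Session session, String confirmation,
                                         String payload, Instant purgeAt) {
        long ttlSeconds = Duration.between(Instant.now(), purgeAt).getSeconds();
        session.execute(new SimpleStatement(
                "INSERT INTO reservations.by_confirmation " +
                "(confirmation_number, data) VALUES (?, ?) USING TTL " + ttlSeconds,
                confirmation, payload));
    }
}
```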
We still need the historic data for analysis and customer service purposes, though, so we store it in a separate data platform which we feed from the reservation system using asynchronous event processing
Our colleague Narasimhan Sampath is talking at Strata NYC later this month about our data and analytics platform, which is based on Spark and Hadoop. Make sure to check out his talk if you’re attending Strata.
(Jeff)
We’ve defined SLAs for the overall performance of the system for key flows such as shopping
SLAs are defined in terms of 95th percentile response times under specified operating conditions
We allocate the response times down to individual services and progressively down to individual Cassandra queries
We can then create load tests to measure these targeted response times
We’ve focused on using actual services for load testing rather than cassandra-stress, due to its limitations
(Jeff)
Moving forward, we’ll be investigating materialized views and row caching to improve performance
We’re also looking at using Spark in some environments in order to support ad-hoc searching and exploration
(Jeff)
Cassandra is only part of your system architecture – don’t ask it to do too much or too little
Choose the right consistency level for each data type and query
Microservice architectures are great, but you’ll have to address joins and data consistency across services
Get to scale as quickly as possible to discover the edge cases
“Automate everything” includes monitoring data size and quality