11. • 10 TB of high-frequency event data daily
• Constantly increasing volume
“The web server that powers the interface can query both
datacenters, depending on which the user is closest to.”
“A small set of signals tend to double every eight months. So
we needed a model that can scale linearly.”
- Arun Jayandra, Microsoft
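To make the doubling figure in the quote concrete, here is a rough sketch of the arithmetic. It assumes steady exponential growth with an eight-month doubling period, which is a simplification (the quote only says the signals "tend to" double), and the function name is illustrative.

```python
def growth_factor(months, doubling_period_months=8):
    """Multiplier on data volume after `months`, assuming steady doubling.

    Assumption: smooth exponential growth; real growth is lumpier.
    """
    return 2 ** (months / doubling_period_months)


if __name__ == "__main__":
    # Print the multiplier at a few horizons.
    for months in (8, 12, 24, 36):
        print(f"after {months} months: x{growth_factor(months):.2f}")
```

Under this assumption a signal grows eightfold in two years, which is why capacity has to scale by adding nodes rather than by resizing a single machine.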
13. Data Protection
• Maximilian Schrems v Data Protection Commissioner
• No longer OK to ship EU data to US under “Safe Harbour”
[Diagram: replication factor (RF) per keyspace in each of three datacenters. Product_Catalog has RF=3 in every datacenter; Customer_Data has RF=3 in two datacenters and RF=0 in the third.]
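A per-datacenter layout like this is typically expressed in Cassandra with `NetworkTopologyStrategy`, where each keyspace lists how many replicas each datacenter holds. The sketch below only builds the CQL text and does not connect to a cluster. The datacenter names (`EU_DC1`, `EU_DC2`, `US_DC1`) and the choice of which one gets RF=0 are assumptions for illustration; the slide does not name them.

```python
def create_keyspace_cql(keyspace, replicas_per_dc):
    """Build a CREATE KEYSPACE statement using NetworkTopologyStrategy.

    Datacenters with RF=0 are left out of the replication map, so no
    replicas of the keyspace are placed there.
    """
    placed = {dc: rf for dc, rf in replicas_per_dc.items() if rf > 0}
    if not placed:
        raise ValueError("at least one datacenter needs RF > 0")
    parts = ["'class': 'NetworkTopologyStrategy'"]
    parts += [f"'{dc}': {rf}" for dc, rf in placed.items()]
    body = ", ".join(parts)
    return ("CREATE KEYSPACE IF NOT EXISTS " + keyspace
            + " WITH replication = {" + body + "};")


if __name__ == "__main__":
    # Hypothetical datacenter names; substitute the names your cluster reports.
    print(create_keyspace_cql(
        "Product_Catalog", {"EU_DC1": 3, "EU_DC2": 3, "US_DC1": 3}))
    print(create_keyspace_cql(
        "Customer_Data", {"EU_DC1": 3, "EU_DC2": 3, "US_DC1": 0}))
```

With these settings the catalog is readable locally everywhere, while customer records are stored only in the datacenters listed for that keyspace.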
14. • 300k customers
• Report on energy usage
• Predict boiler failure
“We’re dealing largely with time series data, and Spark is 10 to 100
times quicker as it is operating on data in-memory…Cassandra
delivers what we need today and if you look at the Internet of Things
space; that is what is really useful right now.” - Jim Anning, British Gas
Hive Active Heating™