Hadoop use cases have historically trended towards cost reduction through data warehouse offload. More recently, an uptick in customer-centric use cases has proven Hadoop's ability to drive top-line revenue. In this session, Platfora solution architect Rob Rosen will discuss how the ability to correlate multi-structured data in Hadoop leads to greater customer adoption, expanded cross-selling, and reduced customer churn for enterprises deploying Hadoop-centric data lakes.
Rob asked me to discuss Customer Analytics
Have several clients who are using Hadoop infrastructure for CA
Visualizations
Let’s have some fun and talk about Customer Analytics
Over the past 12 months we’ve had great momentum growing our company. We’ve also had multiple product versions with feature innovations. We’ve received a lot of accolades in the industry. We have some of the best investors in the industry.
And on the right, we’re really proud to be partnering with our customers as they pioneer the way they analyze MSD within their organizations.
So, you might be asking yourself why traditional tools can’t do what we do. Well, today’s solutions weren’t built for Multi-Structured Data.
In the traditional BI/DW approach you do a lot of modeling up front, and it’s typically focused on only the 10% of data that is structured. The other classes of data are too big and unstructured and have never ended up in these systems, partly because you can’t anticipate how you’ll want to use that data. And, in this world, when you change your questions you have to spend 6-12 months re-architecting the data to ask a new question. And, we all know that just doesn’t work.
On the right side, the analogy is that Hadoop is that large cardboard box in your garage where you can throw a bunch of stuff into it. And, it’s not just junk. These are diverse datasets that will help you answer the right types of questions immediately.
The Hadoop foundation just gives you the box. The value is when you make that box flexible and iterative. And, that’s what Big Data Analytics does.
Uses Hadoop as a tool in the toolbox, not just to reduce costs
Used to Augment existing architecture
Enables correlation of multi-structured data
Fundamental Shift happening
Moving away from focusing on cost reduction
So, the other side of the challenge is that the Business Analyst, or in this case the Tableau user, is focusing on just the tip of the iceberg. Tableau’s value proposition holds here. As a Business Analyst, I can go into a self-service tool and visualize my data. It’s simple and easy.
But the reality is that you’re only seeing 12% of the data. The other 88% is being ignored. This is OK if you’re asking the simple siloed questions. Except...
Except… if you want to iterate on your questions, you have to spend 18 months of IT work to distill this down and push the interesting data and the questions that are going to be important above the water line. And, that’s the classic problem. The IT work up front delays the time to value for the Business Analyst. But that’s nothing new. So what’s really changing here?
Well when you’re working with MSD:
The iceberg grows exponentially larger. Volume itself becomes a challenge in its own right. Even just knowing what you have becomes very difficult. So that means that both IT and business analysts carry the burden.
And, at the same time, the questions that you want to ask don’t stay above the water line anymore. Before, I could look at “Sales by region” and pre-can what was going to be important. But now, I might find an interesting segment of people that purchase a product on my website after the second ad, and I want to look at that group of people and see how they engaged, how they consumed different product lines, how they behaved in social channels, etc. So, I’m looking at patterns of behavior against a group of people I’ve just selected and identified. This simply isn’t possible in the above-the-water-line scenario: months of work for each insight, and our questions will probably never get answered.
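A minimal sketch of the segment-then-explore workflow described above, using a hypothetical per-user event stream (the data and event names are invented for illustration, not drawn from the talk):

```python
from collections import Counter, defaultdict

# Hypothetical per-user event streams (illustrative data only)
events = [
    ("u1", "ad"), ("u1", "ad"), ("u1", "purchase"),
    ("u2", "ad"), ("u2", "bounce"),
    ("u3", "ad"), ("u3", "ad"), ("u3", "purchase"), ("u3", "social_share"),
]

# Segment: users who purchased after seeing at least two ads
ads_seen = defaultdict(int)
segment = set()
for user, event in events:
    if event == "ad":
        ads_seen[user] += 1
    elif event == "purchase" and ads_seen[user] >= 2:
        segment.add(user)

# Now iterate: look at everything that segment did, across all touchpoints
behavior = Counter(event for user, event in events if user in segment)
print(sorted(segment))          # the ad-driven purchasers
print(behavior.most_common())   # how that segment behaved overall
```

The point is the second step: once a segment is identified from one behavior, every other touchpoint for those same users can be interrogated immediately, rather than waiting on a new pre-canned report.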
So the key to analyzing MSD that is not just a nice-to-have but an absolute must have is that the Business Analyst workflow goes all the way to the bottom of the iceberg. So for the first time, they’re able to ask the questions that matter, the questions that weren’t possible before.
This is the litmus test for any solution that claims to be able to tackle MSD. You have to ask “Can the Business Analyst go end to end without having to involve IT”? Don’t get me wrong, we love IT. They play a critical role here but they shouldn’t be in the Business Analyst Workflow.
And Platfora creates the opportunity to ask new questions iteratively, against 100% of your data. Creating new answers to questions not possible before.
In today’s Multi-Structured Data world where data is coming in many forms, sizes, and at greater speed, you need a true Big Data Analytics solution. Platfora allows you to gain a holistic view of your customer’s journey by letting you:
Understand your audience better than you ever have before. You can analyze a wide variety of audience characteristics with advanced segmentation analysis. Platfora uncovers hidden relationships and drivers that lie behind the standard attributes normally used for segmentation, enabling deep behavioral analysis that gets at why your audience makes the choices they make.
Ask unlimited questions against your data. Platfora is a self-service, iterative, and fast platform that encourages you to ask new questions about your data. Platfora’s intuitive visual interface makes it easy for marketing professionals to follow hunches, test theories, and basically just keep refining their search until they find exactly what they are looking for – all with no coding required.
Enjoy the benefits of economies of scale. With Platfora, you can analyze 100% of your data, enabling you to identify more accurate insights over time while giving you the confidence that you’re making the right decisions. Best of all, you can get up and running with Platfora within days vs. months.
Today, companies are capturing information about customers at every touchpoint but the reality is that most companies are working with siloed marketing data because they’re using disparate tools to track online, offline, web, social, mobile, and advertising data.
Yesterday’s BI solutions might offer beautiful visualizations but the reality is that they only provide insights based on a subset of data. This can be misleading and dangerous as they force big data to become small. According to Forrester Research, “Most companies estimate they're analyzing a mere 12% of the data they have…missing out on data-driven insights hidden inside the 88% of data they're ignoring.”
This makes it really difficult to understand your audience.
The new world of data is multi-structured. What does that mean? It means that data is coming in new forms, in greater size, and at increasing speed.
Let’s start with the kind of data that most companies are generating. So much of the conversation around big data, data warehousing, and BI is very much around the traditional data that ends up in databases, which is transactional data (records of point of sale, CRM, supply chain, etc.). And, that’s about 10% of the data in most companies. So where is the other 90% coming from? Two exploding classes of data. One is Customer Interaction data (all the digital touch points of a customer like web, mobile, social, ads, etc.) and another is Machine data (IoT, sensors, security logs, etc.).
We all understand that there’s more and more data being generated, more types of data across an organization and the data landscape inside any company is becoming more complicated.
The bigger questions that we want to answer aren’t solvable by today’s solutions.
So, for example you might want to ask “How do we increase upsell across our customer base?”
So you would start by asking “What patterns of engagement actually lead to upsell?” but the problem is it’s not about going into one silo to find your answer. It’s about connecting the dots across patterns of behavior across a large set of different touch points. You can’t understand how to answer this question until you understand how these different datasets weave together.
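A minimal sketch of what “connecting the dots” across silos can look like, assuming three hypothetical datasets (support tickets, webinar attendance, and upsell outcomes) keyed by account:

```python
# Three hypothetical silos, keyed by account (illustrative data only)
support = {"acme": 1, "globex": 7}        # support tickets filed
webinars = {"acme": 3, "initech": 2}      # webinars attended
upsold = {"acme", "initech"}              # accounts that upgraded

# Weave the silos into one profile per account
accounts = set(support) | set(webinars) | upsold
profile = {a: {"tickets": support.get(a, 0),
               "webinars": webinars.get(a, 0),
               "upsold": a in upsold}
           for a in accounts}

def avg_webinars(did_upsell):
    """Average webinar attendance for accounts that did / did not upsell."""
    vals = [p["webinars"] for p in profile.values() if p["upsold"] == did_upsell]
    return sum(vals) / len(vals)

# The engagement pattern only appears once the silos are joined
print(avg_webinars(True), avg_webinars(False))
```

No single silo answers the question; only the joined profile reveals that upsold accounts engaged differently before upgrading.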
Or, maybe you’re wondering if you’re victim to low-and-slow cyber attacks.
You’re probably using one of the point solutions that are good at finding specific attacks, but you could still be victim to the low-and-slow cyber attacks (e.g., Target).
Again, it’s about seeing patterns across a lot of data over time to be able to identify these low and slow attacks.
Or you might be wondering what your customers really think of your product’s user experience.
This is a great question. One of our large customers was just talking to us about how they include over 20 different datasets in their linear viewing pattern analysis. And to do this you really have to understand how consumers are using their DVRs, the types of shows they’re watching, the movies they’re buying, how they’re responding to advertising, what are they doing on the web, etc.
There’s a lot of complex behavior in there and if you really want to understand the patterns of engagement, the quality of the service, etc you absolutely have to connect the dots.
Segment - Screenshot of the segment builder summoned from the vizboard
Behavioral - show one of the screen/viz you have on that page already
Event Series - show a viz with a funnel on it with a Dimensional analytic split, include the funnel builder too if you have room
Add New Dataset: Screenshot of Step 3 of Data Ingest, make sure to have all the rows visible
Easily Prepare Your Data: Show image of the lens builder on top of the vizboard (DONE)
No Code Required: Close up on the Vizbuilder in the Vizboard with drop zone and all
Get up and running: show the viz you have at the top (DONE)
Analyze 100% of your data: Show the Data Catalog you have in the middle (DONE)
Eliminate Storage Costs: not sure what to show from the product here...
In the beginning everyone wanted to be part of the “Big Data” revolution.
Many people saying they were doing Big Data
Now, it doesn’t matter. The key is Hadoop, and the potential from bringing in new data quickly
NOT “Big Data”…..Hadoop Infrastructure
Start with a business problem
Architecture Mandated
Create Solution
On the Internet of Things side, an example customer is Vivint. They’re unlocking insights around how consumers are using home automation devices and identifying patterns of behavior to drive product improvements:
With Platfora they:
Analyzed data from security and energy sensors inside consumer homes.
Performed limitless segmentation of data and received fast responses to queries.
Identified patterns in customer behavior to improve service and create new offerings.
Cohorts
Combines Customer with Internet of Things
How do we engage customers
Never been able to track customer logs
All info comes off the logs
Never been able to tie logs to anything else
Realized that the devices (iPhones, etc.) were not the devices customers were using
Engaged with Apple TV and Roku – saw an immediate spike in VOD usage
2 services: Live and VOD
Viewership of Live was 5% of overall viewership
Promote Live Viewership / Cannibalization
Icon on
Viewership by Demographic (age group)
Realized viewer profile for Modern Family vs. Good Luck Charlie
12 week
AutoTrader (ATG) tracks automobile searches on their online automobile shopping website. They sell advertising and listings to various dealers (customers) and OEMs (automobile manufacturers). For the Super Bowl, ATG wanted to track Autotrader.com searches as they aligned with the automobile commercials run during the game: for example, when a commercial for the Audi A3 runs at 8pm, searches on AT.com immediately increase. This gave ATG demonstrable metrics to provide to customers and OEMs to encourage ad sales and inventory placement on AT.com in direct accordance with advertising campaigns. Our project was a great success. We were able to show a marked increase in searches directly correlating to commercial airings. Moreover, we were also able to show the drop-off in searches over the following days. And while this scenario was initially executed for the Super Bowl, ATG saw tremendous value in doing this for other events, as well as on a more scheduled basis, to drive ad sales/placements and show direct feedback on the value of advertising/listing on AT.com.
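The core of this kind of analysis reduces to counting searches in a window around each airing and comparing against the baseline. A sketch with invented timestamps (hours, decimal form) and airing times:

```python
# Hypothetical search timestamps in decimal hours (illustrative data only)
search_times = [19.9, 20.0, 20.02, 20.05, 20.07, 20.1,
                20.5, 21.0, 21.02, 21.05]

# Hypothetical commercial airing times
airings = {"Audi A3": 20.0, "Other OEM": 21.0}

def searches_within(center, minutes=10):
    """Count searches inside +/- `minutes` of a given airing time."""
    window = minutes / 60
    return sum(1 for t in search_times if abs(t - center) <= window)

for model, t in airings.items():
    print(model, searches_within(t))
```

The real analysis also tracked the decay over the following days; that is the same computation with the window widened and shifted forward in time.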
Cc marketing
In recent years, Paytronix has experienced dramatic growth in customer data volumes, ultimately leading them to adopt Hadoop. Providing customers the analysis they need has required extensive ETL (extract-transform-load) processes to make the Hadoop data available in datamarts for traditional BI (business intelligence) applications. Before long, the highly capable Paytronix team found that they were devoting significant time and resources to moving data around at the expense of performing value-adding analysis for their customers. The complexity of managing the Hive environment, the need to reconcile disparate data formats, and the work cycles built into managing a standard BI environment all combined to limit the depth of analysis they were able to provide.
But now, with Platfora, all of that has changed. Because Platfora provides analysis directly into Hadoop, Paytronix is no longer burdened with time- and resource-consuming ETL cycles. Meanwhile, Platfora’s intuitive visual interface for self-service analytics has eliminated the need for much of the complex and painstaking Hive programming that previously defined what analysis could (and could not) be done. And by providing access to all of the Hadoop data at once, Platfora provides a much more complete picture than could ever have been available from the piecemeal datamarts. This all adds up to more time and greater capability and flexibility for the Paytronix team:
Platfora is doing something fundamentally different vs. the competition.
We have a lot of unique IP that is both rich and deep up and down the stack. At the heart of it is what we call “Lenses”, which give you the ability to change your view within that iceberg and incorporate new datasets on the fly as your questions shift, without having to go back to IT and wait months and months.
And, getting up and running is incredibly easy.
Talk about TUI quote. “Completed in 20 minutes” & “Simply not possible before”
Here’s a workflow that we see that is typical:
Point Platfora at Hadoop – this supports Cloudera, Hortonworks, MapR, Pivotal, Amazon
Add and connect details – Visual catalog and raw data wrangling
Pick interesting data – Automatically drives Hadoop to build/refine memory lens
Visualize & find patterns – Follow patterns down to any level of detail
Share or drive actions – Collaborate, publish or export downstream feed
The key here is that it’s Iterative.