Making Sense of NoSQL and Big Data Amidst High Expectations
Filed in Cloud Industry Insights by Gerardo Dada | September 6, 2012 3:30 pm
In the last six months there has been a dramatic increase in interest in NoSQL and Big Data. You have probably
heard that “NoSQL is the future of databases,” or that “Big Data is a key technology that will allow
businesses to get much smarter.”
Analysts are bold in their predictions. Gartner, for example, predicts[1] that “Big Data will deliver
transformational benefits to enterprises within two to five years, and by 2015 will enable enterprises
adopting this technology to outperform competitors by 20% in every available financial metric.”
In the same report, Gartner places Big Data near the “Peak of Inflated Expectations” in the hype cycle, which
can be defined[2] as a phase that generates high amounts of enthusiasm and unrealistic expectations (i.e. what
most people would call a buzzword). Given the current hype, it is useful to take a step back and understand
where these technologies can be useful and try to distinguish hype from reality.
One aspect of the vision for Big Data is related to business intelligence applications, which seek to empower
businesses and organizations to derive intelligence and insights that will enable them to act smarter, resulting
in a significant competitive advantage. New forms of processing are needed to deal with the three core
characteristics of Big Data (from Gartner’s own definition): high volume, high velocity and/or high variety
of data.
Solutions such as Hadoop and NoSQL technologies facilitate storage and analysis of very large, unstructured
data sets that have been challenging to manage with traditional SQL databases. While these technologies
relieve a significant part of the burden associated with business intelligence efforts, there are two key problems
that still need to be addressed.
The first problem is that it is still highly complex to source and integrate enterprise data. Extracting,
de-duplicating and correlating data about, say, customers and profitability continue to be monumental tasks,
particularly because they tend to involve a large number of source databases and information systems with
potentially different definitions for the same piece of information.
The second problem is probably the harder one to solve because it goes beyond technology and into the skills
available to the organization. Having access to an incredible amount of data and the ability to run complex
queries is only part of the solution. To produce business value, one must derive insights from the data and
be able to act on them. For example, marketers seldom act on the data available to them[3]. In my experience,
most marketers (with the possible exception of those at companies that extract direct revenue from website
visitors via online retail or advertising) rarely look at web analytics data and therefore fail to act on any
insights that these tools may offer.
Regardless of the challenges, Big Data can be incredibly powerful when properly applied, but it will require
expertise and skills that may not exist today in many enterprises (which is expected with any new
technology). In addition, the tools used to visualize, query and summarize data will need to mature. Given the
interest in this technology, I expect both of the challenges discussed above to be solved quickly by the
industry.
What seems to be lacking is a deep understanding of the type of problems Big Data is designed to solve. Big
Data or NoSQL technologies will not replace traditional databases that are designed to maintain relationships
between structured data sets and to perform operations such as transaction processing that require the
ACIDity provided by SQL (Atomicity, Consistency, Isolation and Durability of transactions). SQL databases
will continue to be fundamental technology tools for many, many years.
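To make the ACID point concrete, here is a minimal sketch of atomicity using Python's built-in sqlite3 module. The accounts table, the account names and the transfer helper are hypothetical and exist only for illustration; the idea is that a transfer either applies both of its updates or neither of them.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with a hypothetical accounts table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        " name TEXT PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one transaction; True if committed."""
    try:
        # `with conn` commits if the block succeeds and rolls back on error.
        with conn:
            # Credit first, so a failed debit has something to undo.
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # The CHECK constraint rejects an overdraft on this statement.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


if __name__ == "__main__":
    db = make_demo_db()
    print(transfer(db, "alice", "bob", 30), balances(db))
    print(transfer(db, "alice", "bob", 500), balances(db))
```

In the second call the credit to bob succeeds before the debit to alice violates the constraint, and the rollback undoes the credit as well. That all-or-nothing behavior is what transaction processing depends on.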
From a market perspective, Microsoft SQL Server’s revenue is roughly $2.5 billion and grew by 20 percent
in the last year[4]. Meanwhile, the total revenue for NoSQL databases, according to The451[5],
reached just $20 million in 2011. The451 expects the total NoSQL market to grow to $215 million by 2015,
which is still less than half the growth in license revenue that SQL Server saw in 2011. I use this comparison
only to highlight the sheer volume of problems that remain the sweet spot for enterprise mission-critical
applications backed by relational databases.
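The comparison can be worked through as back-of-the-envelope arithmetic. The sketch below uses only the rounded figures quoted in this article and assumes that the 20 percent growth applies to a base of roughly $2.5 billion; the cited sources may define these numbers differently, so treat the output as illustrative rather than as market data.

```python
# Rounded figures quoted in the article, in millions of US dollars.
SQL_SERVER_REVENUE = 2500   # "roughly $2.5 billion"
SQL_SERVER_GROWTH_PCT = 20  # "grew by 20 percent"
NOSQL_2011 = 20             # The451 figure for 2011
NOSQL_2015 = 215            # The451 projection for 2015
YEARS = 2015 - 2011

# Assumption: the 20 percent is applied to the ~$2.5 billion base.
sql_server_growth = SQL_SERVER_REVENUE * SQL_SERVER_GROWTH_PCT / 100

# Compound annual growth rate implied by the NoSQL projection.
nosql_cagr = (NOSQL_2015 / NOSQL_2011) ** (1 / YEARS) - 1

# Projected 2015 NoSQL market relative to SQL Server's one-year growth
# and relative to SQL Server's revenue.
share_of_growth = NOSQL_2015 / sql_server_growth
share_of_revenue = NOSQL_2015 / SQL_SERVER_REVENUE

if __name__ == "__main__":
    print(f"SQL Server one-year growth: ${sql_server_growth:.0f}M")
    print(f"Implied NoSQL annual growth rate: {nosql_cagr:.0%}")
    print(f"NoSQL 2015 vs. SQL Server growth: {share_of_growth:.0%}")
    print(f"NoSQL 2015 vs. SQL Server revenue: {share_of_revenue:.1%}")
```

Under these assumptions the NoSQL market grows very quickly in relative terms, yet its projected 2015 size remains a small fraction of SQL Server's revenue.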
The main point is to set the right expectations. As is usually the case, it is about selecting the right tool for
the job, as GigaOM points out in the article “MongoDB or MySQL why not both?[6]” Because NoSQL
databases give organizations the advantages of scale and flexibility of data structures, they are a good tool for
managing large amounts of data where the relationship between the data elements is less important.
As the Wikipedia article[7] states: “NoSQL database management systems are useful when working with a
huge quantity of data and the data’s nature does not require a relational model for the data structure.” To
choose the right tool for your data problem, you should try to understand the business requirements across
three dimensions: size, variety of data types (unstructured versus highly structured data) and velocity
of ingestion and removal of data. In addition, NoSQL databases can often be deployed on commodity
hardware, making them an affordable technology from a hardware requirements perspective.
I propose that there are three key aspects of NoSQL and Big Data technologies that we should remember:
1. Organizations should choose database technologies based on the business requirements and the
problem at hand[8]. This requires understanding the virtues and challenges of each technology, and
fighting our natural inclination to favor “cool technologies.”
2. No single information management technology is the right solution for all needs, whether it is SQL,
NoSQL or any other. NoSQL and Big Data offer organizations very high value for specific business
and technology problems that involve large amounts of varied data changing at high velocity.
Some examples include log analysis, transaction analysis, very large data sets and many applications
that require computations and analysis that are impractical to perform in a relational database.
3. Most organizations will struggle to realize the utopian vision that Big Data will deliver unlimited
customer insights and “automatic” business value. It is not a panacea. It is still important to ask the
right questions and be able to act on insights, and to develop the right skills across the organization.
Rackspace has been involved in NoSQL technologies for quite some time[9] (interestingly, a fellow Racker
coined the term NoSQL[10]). Our own IT department has deployed a NoSQL cluster on its own OpenStack
private cloud[11] to provide the business intelligence our management team needs (stay tuned for details).
At Rackspace we operate under a fundamental principle of openness[12], which means that we should
support the technology choices of our customers. Whether you need MySQL as a service[13], SQL
Server[14] on dedicated or cloud infrastructure or a NoSQL cluster using technologies from our partners
(such as Mongo[15] or Infochimps[16]), we aspire to offer you the right tool for your job.
Endnotes:
1. predicts: http://www.forbes.com/sites/louiscolumbus/2012/08/04/hype-cycle-for-cloud-computing-