HDP 2.3 introduces new capabilities that improve the user experience, security, and data governance for Hadoop users. Key features include dramatically simpler Hadoop administration, enhanced encryption of data at rest, and extended data governance through the Apache Atlas metadata service. Atlas provides a scalable metadata service, integration with Hive metadata, and a user interface for lineage searches. These new capabilities aim to eliminate complexity and improve productivity for Hadoop users.
TYPE OF ANALYSIS: SQL QUERIES WITH HIVE
+ A MAJOR HOME IMPROVEMENT RETAILER
+ SINGLE VIEW OF ITS CUSTOMERS – “THE GOLDEN RECORD”
+ SINGLE VIEW OF INVENTORY FOR SUPPLY CHAIN OPTIMIZATION
+ AND ALSO A SINGLE VIEW OF ITS OWN AND ITS COMPETITORS’ PRICES
+ LOW COST OF STORAGE = MORE DATA RETAINED FOR LONGER
+ LONGER RETENTION POWER = MULTIPLE “SINGLE VIEWS”
TYPE OF ANALYSIS: STREAM ANALYSIS
A MAJOR PROVIDER OF DIGITAL SECURITY SOLUTIONS CUT THE TIME IT TAKES TO PROCESS THE THREAT LANDSCAPE FROM FOUR HOURS TO TWO SECONDS, WHICH DRAMATICALLY REDUCED ITS CLIENTS’ WINDOW OF VULNERABILITY
STATS
+ PROCESSES 105 MILLION LOG EVENTS PER MINUTE
Dynamic availability
A data governance framework in any organization comprises a combination of people, process and technology that are in place to establish decision rights and accountabilities for information. A governance policy defines who can take what actions with what information, and when, under what circumstances, using what methods.
The technology goals for a data governance framework are to provide a platform for a common approach across all systems and data within the organization. Explicitly, they need to be:
- Transparent: Governance standards & protocols must be clearly defined and available to all
- Reproducible: Recreate the relevant data landscape at a point in time
- Auditable: All relevant events and assets must be traceable with appropriate historical lineage
- Consistent: Compliance practices must be consistent
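The policy definition above (who can take what actions with what information, when, and using what methods) can be sketched as a small data structure with an audited check. This is only an illustrative sketch: `Policy`, `PolicyEngine`, and every field name are hypothetical and are not part of HDP, Apache Atlas, or any other Hadoop component.

```python
from dataclasses import dataclass, field
from datetime import datetime, time


@dataclass(frozen=True)
class Policy:
    """One hypothetical governance rule (names are illustrative only)."""
    principal: str       # who
    actions: frozenset   # what actions
    resource: str        # what information
    start: time          # when: start of the allowed daily window
    end: time            # when: end of the allowed daily window
    methods: frozenset   # using what methods, e.g. "hive" or "hdfs"


@dataclass
class PolicyEngine:
    policies: list
    audit_log: list = field(default_factory=list)

    def is_allowed(self, principal, action, resource, method, at):
        """Return True if any policy permits the request at datetime `at`."""
        allowed = any(
            p.principal == principal
            and action in p.actions
            and p.resource == resource
            and method in p.methods
            and p.start <= at.time() <= p.end
            for p in self.policies
        )
        # Record every decision, granted or denied, so access stays auditable.
        self.audit_log.append(
            (at.isoformat(), principal, action, resource, method, allowed)
        )
        return allowed


# Hypothetical example: analysts may read one table through Hive in office hours.
engine = PolicyEngine([
    Policy("analyst", frozenset({"read"}), "sales.customers",
           time(8, 0), time(18, 0), frozenset({"hive"})),
])
read_ok = engine.is_allowed("analyst", "read", "sales.customers", "hive",
                            datetime(2015, 7, 1, 9, 30))
write_ok = engine.is_allowed("analyst", "write", "sales.customers", "hive",
                             datetime(2015, 7, 1, 9, 30))
```

Keeping the audit log inside the same check that makes the decision is one simple way to keep the "Auditable" and "Consistent" goals from drifting apart.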
Apache Atlas is the only open source project created to solve the governance challenge in the open. The founding members of the project include all the members of the data governance initiative and others from the Hadoop community. The core functionality defined by the project includes the following:
Data Classification – create an understanding of the data within Hadoop and provide a classification of this data to external and internal sources
Centralized Auditing – provide a framework to capture and report on access to and modifications of data within Hadoop
Search & Lineage – allow pre-defined and ad hoc exploration of data and metadata while maintaining a history of how a data source or explicit data was constructed.
Security and Policy Engine – implement engines to protect and rationalize data access according to compliance policy.
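To make the Search & Lineage idea concrete, the sketch below records which inputs each data set was built from and walks that history upstream. It is a toy model under stated assumptions, not the Atlas API: `LineageGraph`, `record`, `upstream`, and the data set names are all hypothetical.

```python
from collections import defaultdict


class LineageGraph:
    """Hypothetical in-memory lineage store: data set -> its direct inputs."""

    def __init__(self):
        self._inputs = defaultdict(set)

    def record(self, output, inputs):
        """Note that `output` was constructed from the given input data sets."""
        self._inputs[output].update(inputs)

    def upstream(self, dataset):
        """Return every data set that `dataset` was directly or indirectly built from."""
        seen = set()
        stack = [dataset]
        while stack:
            current = stack.pop()
            # .get avoids creating empty entries for source data sets.
            for parent in self._inputs.get(current, ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


# Illustrative data set names only.
graph = LineageGraph()
graph.record("golden_record", ["crm_customers", "web_orders"])
graph.record("web_orders", ["raw_clickstream"])
sources = graph.upstream("golden_record")
```

A production metadata service would also persist this history and attach timestamps, which is what makes a point-in-time view of the data landscape reproducible.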
You should be hiring people focused on your unique data and application needs, not support engineers focused on the complicated internals of the data platform.
Many users who started out self-supporting have ultimately come to us to support the platform so they can focus on their application and business needs.
We enable HDP in the market through three types of offerings: 1) software support subscriptions, 2) expert consulting services, and 3) training.
Our primary focus is on our annual support subscriptions for HDP, which provide the 24x7 support enterprises expect, along with the patches, updates, and hot fixes that help keep their mission-critical workloads running.
Since we have the most committers working on the dozens of open source projects, we’re uniquely able to:
-- Define and deliver an enterprise-focused roadmap for Enterprise Hadoop
-- Provide customers and partners a direct way to engage with the community to affect that roadmap (you can think of us as the product management function for these projects)
-- And finally, we ensure the patches and updates we make available to our customers are applied to the corresponding open source projects so there are no regressions in future releases of those open source components.
To net out: we enable customer success by listening to customers’ needs and driving innovation into HDP, and our open source model provides the leverage to evolve the technology faster than any single vendor could accomplish alone.