How Converged Infrastructure Will Change IT Operations Management
Over the past decade, enterprises have leveraged a shared-service model to make IT more cost-effective. The emergence of “Converged Infrastructure” and “Fabric-Based Infrastructure” will allow IT to offer purpose-driven solutions rather than the function-driven solutions of the past. To do this, IT will need to evolve toward more modular designs, rely more on open standards, and rethink its approach to management frameworks.
In this session you will learn:
How converged infrastructure is used to create purpose-driven solutions
Why new operational challenges arise as this approach is adopted broadly
What changes need to occur to succeed with this new paradigm
2. Andrew White
Cloud and Smarter Infrastructure Solution Specialist
IBM Corporation
Mr. White has fifteen years of experience designing and managing the
deployment of Systems Monitoring and Event Management software. Prior
to joining IBM, Mr. White held various positions including the leader of the
Monitoring and Event Management organization of a Fortune 100 company
and developing solutions as a consultant for a wide variety of organizations,
including the Mexican Secretaría de Hacienda y Crédito Público, Telmex,
Wal-Mart of Mexico, JP Morgan Chase, Nationwide Insurance and the US
Navy Facilities and Engineering Command.
4. Ground rules for this session…
• If you can’t tell if I am trying to be funny… GO AHEAD AND LAUGH!
• Feel free to text, tweet, yammer, or whatever to share with the rest of the attendees
• If you have a question, no need to wait until the end. Just interrupt me. Seriously… I don’t mind.
17. Plus we created “Technical Debt” adding
additional complexity unnecessarily.
18. Architecture by Accident
The Humble Start…
Meeting Demand…
The First Bottleneck…
The Second Bottleneck…
Becoming Mission Critical…
Enabling SOA…
The Fun Begins…
How Did We Get Here?
20. This was always a people problem
According to Gartner:
“…when asked what the single biggest challenge for
companies deploying cloud services, respondents cited
management and operational processes at the top of the list
rather than technical issues… Most legacy management
software and infrastructure solutions are mainly point products
instead of end-to-end solutions and rely on manual processes
and a high level of experience and specialized skill sets. The
lack of integration and collective complexity impact operational
efficiencies, slow business agility, and increase operational
costs for IT environments.”
OUCH!!!
21. What Is a System?
It is a set of interconnected actors that change
over time when they are influenced by other
elements of the system.
[Diagram: eight interconnected Actors]
22. Two Important Properties
• The causal effect between two actors will
always impact the entire system
• Correlation != Causation
23. Systems are Volatile
These properties make it difficult to control the
behavior of the system. The good news is that
systems are perfect. They always deliver the
optimum result given a specific stimulus.
24. Feedback Loops
Unfortunately, “positive” and “negative” feedback have taken on the
connotations of praise and criticism. In reality, positive feedback is not
“praise” and negative feedback is not “criticism.” Positive feedback
reinforces while negative feedback balances.
Profits
Reinforcing
Cost Cutting
Productivity
Balancing
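A minimal Python sketch of the difference between the two loop types. The function name, the gain of 0.5 and the starting values are illustrative assumptions, not figures from this session. A reinforcing loop compounds on itself each step, while a balancing loop closes part of the gap to a target each step.

```python
def simulate(initial, gain, target=None, steps=5):
    """Return the trajectory of one quantity under a single feedback loop.

    target=None  -> reinforcing loop: each step adds gain * current value.
    target given -> balancing loop: each step closes gain * (target - current).
    All parameters are hypothetical and chosen only for illustration.
    """
    values = [float(initial)]
    for _ in range(steps):
        current = values[-1]
        if target is None:
            change = gain * current              # growth feeds further growth
        else:
            change = gain * (target - current)   # the gap shrinks every step
        values.append(current + change)
    return values


reinforcing = simulate(initial=100, gain=0.5, steps=3)
balancing = simulate(initial=100, gain=0.5, target=60, steps=3)
print(reinforcing)  # [100.0, 150.0, 225.0, 337.5]
print(balancing)    # [100.0, 80.0, 70.0, 65.0]
```

The reinforcing series keeps accelerating away from its start, while the balancing series settles toward its target.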
25. The Profit Equation
[Causal loop diagram: Business Growth, Profits, Cost Cutting and Productivity, joined by (+) and (-) links into a Reinforcing loop and a Balancing loop]
26. Mitigating Consequences
[Causal loop diagram: Business Growth, Profits, Cost Cutting, Productivity and Leverage IT, joined by (+) and (-) links into a Reinforcing loop and a Balancing loop]
27. The Plot Thickens
[Causal loop diagram: Business Growth, Profits, Cost Cutting, Productivity, Leverage IT, IT Expense, Sustaining Engineering, Application Portfolio, Server Count, Storage Consumption, IT Developers, Supportability, Complexity and Facilities, joined by (+) and (-) links, with the Reinforcing and Balancing loops marked]
29. We have to get better
• Rigid and aging infrastructure
• Inefficient and unnecessary processes
64% of IT Spending is “Run the Engine”
• Application and information complexity are increasing
exponentially requiring more work to maintain
• The portion of IT’s focus on new business capabilities
is decreasing at an increasing rate
72% of IT Budgets are OPEX
• Technical debt is being accumulated
• Organizations are separated into “stovepipes” and
technology decisions are heavily influenced by
“religion” and self-serving interest
Personnel Represent 63% of IT Expenses
Source: Gartner IT Key Metrics Database 2013
32. The CIO Agenda
• Top 3 Areas of Focus
– Reduce the time to deliver value to the business
– Reduce the cost of IT and create cost efficiencies
– Improve Operations by driving simplicity
• Problems to Solve
– How to add scale without adding complexity
– How to support increased consumption without additional cost
– How to deliver these without sacrificing customer experience
33. What a Coincidence!
In 2012, IDC conducted a study titled “Converged Systems: State of the
Market and Future Outlook 2012: Market Analysis” which concluded the top
three reasons for adopting Converged Infrastructure over the traditional
component-based approach are:
1. Time to Service – IT needs to respond more quickly to new business
requests
2. Cost Efficiency – Converged Infrastructure delivers overall reduced cost
of ownership through workload consolidation, reduced space, and
reduced power and cooling costs
3. Operations Improvements – Consolidation of vendors, streamlined
support, pre-validated interoperability greatly simplify data center
operations
34. Cleaning Up the Landscape
Adapted from: Akella, Janaki. “IT Architecture: Cutting costs and complexity.” McKinsey Quarterly 13 Nov 2009
https://www.mckinseyquarterly.com/IT_architecture_Cutting_costs_and_complexity_2391
Silo
Monolithic
Framework
Niche
Management, Security, Business Continuity
Launch Pad
Information Bus
Management, Security, Business Continuity
35. Perceived Value
According to Gartner’s “Market Share Analysis:
Data Center Hardware Integrated Systems,
1Q11-2Q12,” integrated systems are perceived to offer:
• Better performance
• Improved cost/performance ratio
• Simplified deployment
• Increased optimization
• Increased automation
• Lower cost of IT operations
• Simplified sourcing and support
• Change in focus from IT maintenance to IT innovation
37. Like any journey…
We have a beginning [What is our product]
and a map [the 4 part plan]
the destination [Software Defined Environment]
something to lighten the load [Converged Infrastructure]
and some required skills [Cost and Capacity Management]
38. The experience of working with IT has
become the product offered to the Business
http://www.flickr.com/photos/anneacaso/3693155059/sizes/l/in/photostream/
40. 1. Understand cost
2. Identify and remove waste
3. Manage to capacity
4. Execute good change management
41. Software Defined Environments provide abstractions of workloads,
services and infrastructure, and end-to-end mappings between them
Workload Abstraction
Based on pattern and
functional and non-functional requirements
Resource Abstraction
Semantically rich abstractions of heterogeneous
resource capabilities and system components
Mapping to resource
Map requirements to potential system
architectures. Proactively orchestrate
infrastructure and workload
Continuous Optimization
Autonomously construct available system
architecture to optimize workload outcome
Agility
Consumability Efficiency
Software Defined Environments
Agile Workload
Development Services
Workload Abstraction
Analytics
Map/Reduce
Web 2.0 Pattern
Continuous, Autonomous Mapping
SSD HDD
Tape
Resource Abstraction
PowerVM
x86 KVM
Transactional
J2EE/OLTP
PowerVM
x86 KVM
RDMA
Ethernet
Software Defined Compute, Network and Storage
Agility, Consumability, Efficiency (ACE)
Web
42. Where we are headed
Traditional IT:
Appliances, pre-integrated systems and standard hardware, software and networking.
Private cloud:
On or off premises cloud infrastructure operated solely for an organization and managed by the organization or a third party.
Public cloud:
Available to the general public or a large industry group and owned by an organization selling cloud services.
Hybrid IT:
Traditional IT and clouds (public and/or private) that remain separate but are bound together by technology that enables data and application portability.
43. Architecture on Purpose
Environments
QA
PROD
Banking Application
Banking Application
Banking Application
DEV
IBM UrbanCode Deploy
OpenStack Heat
IBM Platform Resource Scheduler
Network Server
Storage
Application Lifecycle
Applications
Heat Orchestration Template (HOT)
Heat Orchestration Template (HOT)
OpenStack Heat
IBM Platform Resource Scheduler
Network Server
Storage
TEST
IBM Cloud Orchestrator
Public
Dedicated
Traditional Private
IT
Application
template
Infrastructure
template
Hardware
45. Top 5 reasons IT projects fail
1. The inability to challenge assumptions
2. Poor role definitions and unclear priorities
3. A “silo” mentality
4. The unwillingness to compromise
5. A focus on the technology rather than on the solution
46. How many times have we
documented these lessons learned?
• The expectations of the stakeholders were not in
touch with the reality of what IT could deliver
• We underestimated the complexity
• The market changed before we finished
• The request was driven by the perception of a need
and not the reality
• Assumptions were undocumented and requirements
were hastily defined
47. Not having a common
understanding of quality puts more
pain into an organization than
anything else I have ever known.
Philip Crosby, Let’s Talk Quality, 1989
49. ITIL Overview of Capacity Management
Business Objective
IT Strategy
Tactical Processes
Service Desk, Incidents, Problems,
Changes, Releases, Configuration
Strategic Processes
SLM, Finance, Capacity, Availability,
Business Continuity
50. Why capacity?
• This process is typically run ad-hoc (e.g. spreadsheets and
“gut feel”)
• Planning is typically limited to individual silos
Requirements
Business Case
Return on Investment
Total Cost of Ownership
Availability
Performance
Risk
51. Summary of CLOUD Recommendations
In 2011, the TechAmerica Foundation published a report for the Obama Administration
titled “US Deployment of the Cloud (CLOUD2)”
• Trust: Ensuring the combination of factors that allows consumers of cloud services to be confident that the services are meeting their computing needs
• Transnational Data Flows: Need for collaboration & standardization of data access across national borders
• Transparency: Require vendors to share relevant information about their capabilities, offerings and service levels
• Transformation: Recommendations in policy, infrastructure, and training to help facilitate broader adoption of the cloud
52. CLOUD Recommendations on Trust
Ensuring that the cloud is meeting consumers’ needs for security, privacy, and availability
Factors Contributing to Trust
• Transparency of practices
• Accountability
• Resiliency
• Redundancy
• Access and Connectivity
• Supply chain provenance
• Life cycle integrity
• Governance
53. Capacity Sub-Processes
Business Capacity
Service Capacity
Resource Capacity
Application Sizing
Demand Mgmt
Capacity Plan
Data Warehouse
Iterative Activities
• Monitor
• Analysis
• Tuning
• Implement
Modeling
• Trend Analysis
Capacity Data Storage
• Business
• Service
• Technical
• Utilization
54. The 3 Needs of the Business
Service Level Management:
Meet the consumer’s expectations for service availability
Performance Management:
Ensure good performance for each consumer’s application
Resource Optimization:
Continuously rebalance resources to limit unnecessary capital expenses
55. Capacity Management at the Resource Level
• Identify and understand the Capacity and utilization of
each component part of the IT infrastructure
• Recommend optimization of hardware and software
• Measure and store resource usage at a process level
• Identify bottlenecks and potential future problems
• Characterize workloads and business drivers
• Evaluate alternative upgrades to meet workloads
• Proactive rather than reactive
• No surprises in performance or IT budgets
56. Capacity Management at the Service Level
• Identify and understand the IT services
• Assess their use of resources
• Identify their working patterns, peaks & troughs
• Ensure that SLA targets are viable
• Monitor performance to identify violations
• Resource data aggregated by application
• Pre-empt difficulties wherever possible
• Proactive rather than reactive
57. Capacity Management at the Business Level
• Published corporate performance objectives
– Standard local metrics defining contribution
– Unification of analytical information
– Improved managers’ business insight
– Greater local accountability via KPIs
– Resource data aggregated by application and then weighted
• Enterprise framework for measurement
– Published Reports and exception reports
– Automated alarms and interpretation
– Interactive Dashboard for alert/drill down
– Predicted outcomes across framework
• Business agility to adjust as necessary
– Strategic modeling to view scenarios
– Ensured focus and drive to growth
– Effective liaison between IT & Management
58. Capacity Management Imperatives
Trending Organic Growth:
Analytic tools to help forecast demand and
identify opportunities for efficiency
Modeling Capacity Consumption:
Leveraging elastic resource capacity can help
delay capital expenditures
Providing Cost Transparency:
Metering allows you to affect behavior through
service pricing and helps control “sprawl”
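As a rough illustration of trending organic growth, the Python sketch below fits a least-squares line to past consumption and estimates how many periods remain before installed capacity is reached. The function names and the monthly figures are hypothetical, and a straight-line trend is only one assumption among many. Real demand may be seasonal or bursty.

```python
def linear_trend(history):
    """Least-squares line through the points (0, history[0]), (1, history[1]), ..."""
    n = len(history)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_x = (n - 1) / 2
    mean_y = sum(history) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(history))
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def periods_until_exhausted(history, installed_capacity):
    """Periods after the last observation until the trend reaches installed capacity.

    Returns None when the trend is flat or falling (no exhaustion forecast).
    """
    slope, intercept = linear_trend(history)
    if slope <= 0:
        return None
    crossing = (installed_capacity - intercept) / slope
    return max(0.0, crossing - (len(history) - 1))


# Hypothetical monthly storage consumption in TB
monthly_tb_used = [40, 44, 48, 52, 56, 60]
print(periods_until_exhausted(monthly_tb_used, installed_capacity=100))  # 10.0
```

The same forecast can be rerun with a smaller installed capacity to see how much time elastic resources would need to buy before a capital purchase.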
59. Looking back to a simpler time
Answering “what if” questions…
• Change in technology, demand, etc… impact?
• Focus on Optimizing Server Cost versus Performance
Extremely Technology-centric
• Servers, Mainframes
• Occasionally Storage or Network – in isolation
• Few distributed servers, even fewer critical apps running on them
• No web-based applications or e-commerce
Big Value and Return, but also effort
• Highly trained staff
• Requires building a central, long term repository (CMIS)
• Scalability of Staff, Tools, …, Politics!
• Many analysts, few systems
Capacity planning was Resource-oriented, not Business/Service oriented
60. Capacity Models Used to Be Simple
[Chart: Capacity versus Time. Labels: CAPEX, Installed Capacity, Consumed Capacity, Forecasted Demand, Rising Demand Scenario, Falling Demand Scenario, Overhead, Downtime]
61. A new thought process
In the past, the approach to capacity
was similar to an apartment complex.
Tenants arrive and occupy space for
several years at a time.
Consumption was fairly static and
easy to predict…
… In the future, our approach to
capacity will need to be more like a
hotel. Some tenants may be long
term consumers but most will occupy
the space for a short time and then
vacate. This will make forecasting
demand more difficult.
63. The evolution of cost and capacity
Used Capacity
Allocated Capacity
Usable Capacity
Raw Capacity
Stranded Capacity
Allocated Capacity:
The sum of all assignments granted to all customers.
Each individual customer is paying for and expects to have
access to their entire assignment regardless of whether it
exists or not.
Usable Capacity:
The capability of the infrastructure after losses to
administration, hypervisors, redundancy, etc.
Rebalancing Threshold:
When the consumption crosses this threshold the
environment is rebalanced. If consumption does
not fall below the threshold then more capacity is
purchased.
Stand Alone Deployment
Cloud Deployment
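The definitions above can be sketched in Python as follows. The class, its field names, the overhead fraction and the 80% threshold are assumptions made for illustration. The slide does not say how the rebalancing threshold is expressed, so it is modeled here as a fraction of usable capacity.

```python
from dataclasses import dataclass, field


@dataclass
class CapacityPool:
    raw: float                       # installed capacity before any losses
    overhead_fraction: float         # assumed share lost to administration, hypervisors, redundancy
    assignments: dict = field(default_factory=dict)  # customer -> capacity granted
    used: float = 0.0                # capacity actually consumed

    @property
    def usable(self) -> float:
        return self.raw * (1 - self.overhead_fraction)

    @property
    def allocated(self) -> float:
        return sum(self.assignments.values())

    @property
    def overcommitted(self) -> float:
        """Allocated capacity that does not physically exist."""
        return max(0.0, self.allocated - self.usable)


def capacity_action(pool, threshold_fraction=0.8, already_rebalanced=False):
    """Rebalance first; purchase only if consumption is still above the threshold."""
    if pool.used <= threshold_fraction * pool.usable:
        return "none"
    return "purchase" if already_rebalanced else "rebalance"


# Hypothetical pool: 100 units raw, 25% lost to overhead
pool = CapacityPool(raw=100, overhead_fraction=0.25,
                    assignments={"tenant_a": 50, "tenant_b": 40}, used=65)
print(pool.usable, pool.allocated, pool.overcommitted)  # 75.0 90 15.0
print(capacity_action(pool))                            # rebalance
print(capacity_action(pool, already_rebalanced=True))   # purchase
```

In this example the customers have been granted 90 units against 75 usable, which is the "regardless of whether it exists or not" case described above.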
64. Where does oversubscription occur?
Load Balancer
Corporate LANs & VPNs
Load Balancer
Firewall
Switch
VM Server Farm
Database
NAS
Appliances
Storage Frame
Web Servers
Load Balancer
Common Locations
1. Hypervisor
2. CPU Cycles
3. Memory
4. Blade Backplane I/O
5. SAN Fabric
6. Network Interfaces
7. Host Bus Adapters
8. Backup Device
9. WAN Circuits
10. Storage Processors
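One simple way to look for oversubscription at any of these locations is to compare what has been promised with what physically exists. The Python sketch below does that. The function name and every figure in it are hypothetical.

```python
def oversubscription_ratio(allocated, physical):
    """Ratio of capacity promised to capacity that physically exists."""
    if physical <= 0:
        raise ValueError("physical capacity must be positive")
    return allocated / physical


# Hypothetical (allocated, physical) figures for three of the locations listed above
locations = {
    "CPU Cycles": (256, 64),      # vCPUs granted vs. physical cores
    "Memory": (768, 512),         # GB granted vs. GB installed
    "WAN Circuits": (800, 1000),  # Mbps committed vs. Mbps provisioned
}

oversubscribed = [
    name for name, (allocated, physical) in locations.items()
    if oversubscription_ratio(allocated, physical) > 1.0
]
print(oversubscribed)  # ['CPU Cycles', 'Memory']
```

A ratio above 1.0 is not automatically a problem. It only marks where contention can appear if every consumer asks for its full assignment at once.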
66. The new KPIs
• Buffering Capacity:
The amount of capacity kept in reserve to absorb spikes in demand
• Flexibility vs Stiffness:
The system’s ability to restructure itself as used capacity increases
beyond the balancing threshold
• Margin:
The maximum acceptable load before measurable impacts occur to
application performance
• Tolerance:
How the applications behave as the system reaches the margin. This
can be either observed or forced behaviors.
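Two of these KPIs lend themselves to a quick calculation. The Python sketch below is illustrative only. It assumes Buffering Capacity can be taken as usable minus used capacity, and that Margin can be estimated from load-test samples as the highest load observed before response time first exceeds a limit. The function names, the samples and the 200 ms limit are hypothetical.

```python
def buffering_capacity(usable, used):
    """Capacity kept in reserve to absorb spikes (assumed: usable minus used)."""
    return usable - used


def observed_margin(samples, max_response_ms):
    """Highest load seen before response time first exceeded the limit.

    samples: iterable of (load, response_ms) pairs from observed or forced tests.
    Returns None if even the lightest load breached the limit.
    """
    margin = None
    for load, response_ms in sorted(samples):
        if response_ms > max_response_ms:
            break
        margin = load
    return margin


# Hypothetical load-test results: (requests per second, response time in ms)
samples = [(100, 120), (200, 125), (300, 140), (400, 310), (500, 900)]
print(buffering_capacity(usable=75, used=65))         # 10
print(observed_margin(samples, max_response_ms=200))  # 300
```

How response time climbs in the samples past the margin (310 ms, then 900 ms) is a small view of Tolerance: the behavior of the application as the system reaches its limit.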
68. The importance of cost management
The “Showback” Model – A Pragmatic Approach
A “showback” system shows individual business units
or projects how much is being spent on cloud services.
The “Chargeback” Model – The Ideal
A chargeback system holds business units or projects
accountable for cloud costs. Costs are “charged back” to
units or projects responsible for consumption.
An Ideal Chargeback/Showback Cycle:
1. Increase transparency of costs and usage
2. Increase accountability within business units
3. Promote cost-conscious consumption
5. Improve business/IT alignment
6. Associate costs with actual benefits
6. Reduce IT services costs
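A showback report can start as a simple roll-up of metered usage by business unit. The Python sketch below assumes usage arrives as (business unit, resource, quantity) records and that each resource has a flat unit rate. The record layout, the resource names and the rates are all hypothetical.

```python
from collections import defaultdict


def showback(usage_records, unit_rates):
    """Total spend per business unit from metered usage and flat unit rates."""
    totals = defaultdict(float)
    for business_unit, resource, quantity in usage_records:
        totals[business_unit] += quantity * unit_rates[resource]
    return dict(totals)


# Hypothetical rates (currency units per unit of resource) and metered usage
unit_rates = {"vcpu_hours": 0.5, "gb_month": 0.25}
usage_records = [
    ("Retail", "vcpu_hours", 1000),
    ("Retail", "gb_month", 500),
    ("Finance", "vcpu_hours", 200),
]
print(showback(usage_records, unit_rates))  # {'Retail': 625.0, 'Finance': 100.0}
```

The same totals could feed a chargeback process. The difference is organizational (whether the units are actually billed), not computational.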
69. Tool Requirements
• The ability to collect performance and resource
consumption monitors for all systems which contribute to
the service
• A repository to warehouse the historical data
• The ability to import cost data and calculate consumption
in Natural Forecast Units
• Provide a facility to generate reports automatically
• Offer a policy engine to direct workload placement and
generate events to trigger a capacity review
• Include a modeling engine that can help forecast
consumption and provide recommendations for
rebalancing
• The tool needs to be “VM-aware”
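To make the policy engine requirement concrete, here is a minimal Python sketch of one possible placement policy. It places a workload on the host that would remain least utilized, and raises an event to trigger a capacity review when no host can take the workload without crossing a utilization threshold. The function name, the event fields, the host data and the 80% threshold are assumptions for illustration and do not describe any particular product.

```python
def place_workload(hosts, demand, threshold=0.8):
    """Pick a host for a workload, or emit a capacity-review event.

    hosts: mapping of host name -> (capacity, used), in the same units as demand.
    Returns (host_name, None) on success or (None, event_dict) when every host
    would exceed the utilization threshold.
    """
    candidates = []
    for name, (capacity, used) in hosts.items():
        projected = (used + demand) / capacity
        if projected <= threshold:
            candidates.append((projected, name))
    if not candidates:
        event = {"event": "capacity_review", "unplaced_demand": demand}
        return None, event
    _, best = min(candidates)  # the host left with the lowest utilization
    return best, None


# Hypothetical hosts: name -> (capacity, used)
hosts = {"host_a": (100, 70), "host_b": (100, 40)}
print(place_workload(hosts, demand=20))  # ('host_b', None)
print(place_workload(hosts, demand=50))  # (None, {'event': 'capacity_review', 'unplaced_demand': 50})
```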