Networking for DevOps
Application Architect and Networking
Traditionally, the application architect's foray into networking dealt with solving the server IO bottleneck
and offloading the CPU. Virtualization did not change this focus.
Application architects focused on solving the IO bottleneck
in order to minimize wasted CPU cycles. Technologies
such as RSS, LSO and TSO were incorporated into an
intelligent NIC to load-balance traffic across the multiple cores
in the server and thereby avoid CPU starvation.
A parallel focus, driven by the cost savings of converging
storage and Ethernet traffic, was the converged NIC
(CNIC), which carried storage and Ethernet traffic on a single
wire.
Virtualization did not shift the focus away from the IO
bottleneck. PCIe innovations such as SR-IOV and MR-IOV
were incorporated into CNICs. IOV technologies enabled
vNICs and VM-specific offload services such as hypervisor
bypass.
The scale of applications in the Web 1.0 world did not
require application architects to focus on network topology,
segmentation or control-plane protocols; a 3-tiered
datacenter network was sufficient.
[Figure: Networking Focus of the Application Architect in Web 1.0 – intelligent NIC (TCP offload, converged wire, RSS/LSO, flow classification), VSM and N1Kv, with callouts 1-4 matching the paragraphs above]
Why Care About Network Topology
Today the network plays a critical role in distributed application execution. Two key service assurances,
latency and bandwidth, are influenced by the network.
• Today's programming frameworks widely
use asynchronous IO, latency shifting
(caching) and message-based
communication. These frameworks enable
application logic and data to be distributed
among tens of thousands of servers across
multiple tiers. The nodes within a tier and
across tiers communicate synchronously or
asynchronously over a routed IP network.
• A distributed application execution
environment has to arbitrate the tradeoffs
between latency and bandwidth, both of which are
greatly influenced by the underlying network
topology and routing control plane (a small sketch follows the figure below).
[Figure: Distributed System Latency Hierarchy – processors with L1/L2 caches, DRAM and disk at the local-system, rack and cluster levels. Local system: memory 80 ns, disk 10 ms; local rack: memory 200 us, disk 28 ms; remote rack: memory 500 us, disk 30 ms]
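To make the tradeoff concrete, here is a minimal Python sketch that uses the round-number latencies from the hierarchy above to decide where a distributed store should fetch a block from; the names and values are illustrative assumptions, not measurements.

```python
# Illustrative latencies (seconds) taken from the hierarchy above.
LATENCY_S = {
    "local_dram":   80e-9,
    "local_disk":   10e-3,
    "rack_dram":    200e-6,   # remote DRAM in the same rack
    "rack_disk":    28e-3,
    "cluster_dram": 500e-6,   # remote DRAM in a different rack
    "cluster_disk": 30e-3,
}

def cheapest(options):
    """Pick the lowest-latency location to read a block from."""
    return min(options, key=LATENCY_S.get)

# On a local DRAM miss, fetching from a peer's memory beats the local disk
# by roughly 50x, which is why caching tiers span the network.
print(cheapest(["local_disk", "rack_dram", "cluster_dram"]))  # -> rack_dram
```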
Network Topology – Graph Model
A network layout is a combination of a chosen topology (a design decision) and a chosen technology
(an architecture decision). A graph is a concise and precise notation for describing a network topology.
• Crossbar
• Good for a small number of inputs/outputs
• Complexity is N^2, where N is the number of
inputs/outputs
• The N^2 crosspoint count becomes a problem
when N is large
• Fat-tree Clos
• Can be non-blocking (1:1) or blocking (x:1)
• Characterized as Clos(m, n, r)
• Complexity is (2n + r)*rn crosspoints when m = n,
fewer than the crossbar's N^2
• Torus
• Blocking network, but great at scale
• Optimized for data locality
• Good for growth and hybrid networks
• Complexity increases with switch port
count, roughly k/(2*log_{k/2} N), where k = port
count and N = number of servers
• High port-count switches are better suited to
Clos than to tori
• Direct and Indirect Topologies
• Crossbar and fat-tree Clos are indirect networks,
i.e. the end nodes are not part of the switching
topology; a torus is a direct network.
(A switch-count comparison follows the topology figure below.)
[Figure: Example topologies – an n x n crossbar, a three-stage fat-tree (Clos) with r edge switches of n ports and m middle switches, and a 2D torus]
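As a quick check of the complexity figures in the list above, here is a minimal sketch comparing crosspoint counts for an N-port crossbar and a three-stage Clos(m, n, r); the port counts chosen below are assumptions for illustration only.

```python
def crossbar_crosspoints(N: int) -> int:
    """An N x N crossbar needs N^2 crosspoints."""
    return N * N

def clos_crosspoints(n: int, r: int, m=None) -> int:
    """Clos(m, n, r): r ingress (n x m), m middle (r x r), r egress (m x n)
    switches; total crosspoints = 2*r*n*m + m*r^2, which is (2n + r)*r*n
    when m = n (the rearrangeably non-blocking case)."""
    m = n if m is None else m
    return 2 * r * n * m + m * r * r

n, r = 32, 32            # 32 hosts per edge switch, 32 edge switches
N = n * r                # 1024 endpoints in total
print("crossbar:", crossbar_crosspoints(N))   # 1,048,576
print("clos m=n:", clos_crosspoints(n, r))    #    98,304 -> far fewer crosspoints
```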
Characterizing Network Performance
Latency = Sending overhead
          + T_LinkProp x (d + 1)
          + (T_r + T_s + T_a) x d
          + (PacketSize / BW) x (d + 1)
          + Receiving overhead
where
  d          = number of hops
  T_r        = switch routing delay
  T_a        = switch arbitration delay
  T_s        = switch switching delay (pin to pin)
  T_LinkProp = per-link propagation delay

Effective Bandwidth = min( N x BW_Ingress,
                           s x N,
                           r x (BW_Bisection / g),
                           s x N x BW_Egress )
where
  s = fraction of traffic that is accepted
  r = network efficiency
  g = fraction of traffic that crosses the bisection
Network Performance
• Port buffers directly affect s. Port buffers
sized to the length of the link optimize s,
and s = 1 can then be assumed.
• g is directly correlated with the application traffic
pattern; a well-distributed application will
max out BW_Bisection.
• r, the network efficiency, is a function of
multiple factors; the most prominent are link
and routing efficiency, i.e. the control plane.
• Effective bandwidth is the bandwidth
between user and application, i.e. north-south.
Bisection bandwidth is the minimum bandwidth
across any cut that splits the network into two halves,
and it bounds east-west capacity.
Network topology determines the hop count, i.e. the paths through the network, and therefore bisection
bandwidth and latency. Application traffic patterns drive the rest of the performance metrics.
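As a worked example of the latency model above, the sketch below plugs in illustrative values; the hop count, per-link propagation delay, switch delays, packet size and link bandwidth are all assumptions, not vendor figures.

```python
def packet_latency(d, t_link_prop, t_r, t_s, t_a, packet_bits, bw_bps,
                   send_overhead=0.0, recv_overhead=0.0):
    """Latency = send + T_LinkProp*(d+1) + (Tr + Ts + Ta)*d
               + PacketSize/BW*(d+1) + receive, with d = number of hops."""
    return (send_overhead
            + t_link_prop * (d + 1)
            + (t_r + t_s + t_a) * d
            + (packet_bits / bw_bps) * (d + 1)
            + recv_overhead)

# 3 switch hops, 100 ns propagation per link, 500 ns total per-switch delay,
# 1500-byte packet on 10 Gb/s links.
lat = packet_latency(d=3, t_link_prop=100e-9, t_r=200e-9, t_s=200e-9,
                     t_a=100e-9, packet_bits=1500 * 8, bw_bps=10e9)
print(f"end-to-end latency ~ {lat * 1e6:.1f} us")  # ~6.7 us, dominated by serialization
```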
Traditional Datacenter Network
Traditionally, datacenter networks were optimized to remove bottlenecks in north-south traffic,
i.e. to optimize effective bandwidth. That architecture is not suitable for a distributed
application whose dominant flows traverse east-west.
Main Issues with this Architecture
• The topology is a single-rooted tree with a single
spanning-tree path between source and destination, which
makes bisection bandwidth much lower than effective
bandwidth, i.e. there are no multiple paths (a rough
comparison follows the figure below)
• Traffic among servers is 4x or more the
traffic into/out of the datacenter
• Not optimized for small flows; observed flows inside
the datacenter are short, with 10-20 flows per server
• Adaptive routing is not fast enough, and optimization
requires complex L2/L3 configuration
• The ratio of memory/disk-to-CPU bandwidth to
server-to-server bandwidth is at an all-time high, which
hurts distributed computing workloads that rely on
inter-server bandwidth
[Figure: Traditional datacenter – Access, Aggregation and Core tiers, a single L2 domain below the L3 boundary]
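The first bullet above is easy to quantify. Below is a rough, simplified model, with purely illustrative numbers, of why a single-rooted tree starves east-west traffic compared with a leaf-spine fabric that offers multiple equal-cost paths.

```python
LINK_GBPS = 10

def tree_bisection_gbps(root_links_per_half: int) -> int:
    """In a single-rooted tree, only the root's links toward one half of the
    fabric cross the bisection, and spanning tree leaves a single active path."""
    return root_links_per_half * LINK_GBPS

def leaf_spine_bisection_gbps(leaves: int, spines: int) -> int:
    """In a leaf-spine (folded-Clos) fabric every leaf has one uplink per
    spine, so half of those uplinks cross any equal cut of the leaves."""
    return (leaves // 2) * spines * LINK_GBPS

# 20 racks: a tree with 2 root links per half vs. a 4-spine leaf-spine fabric.
print("tree      :", tree_bisection_gbps(2), "Gb/s")            # 20 Gb/s
print("leaf-spine:", leaf_spine_bisection_gbps(20, 4), "Gb/s")  # 400 Gb/s
```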
Changing Traffic Pattern in Datacenter
The ratio of north-south traffic coming into a web application to the traffic generated
inside the datacenter to serve that session has been observed to be 1:80 or higher.
[Figure: Changing traffic pattern – a public-profile web app (GUI layer, business-logic layer, session cache) receives north-south traffic and fans out east-west over HTTP-RPC/JMS calls to profile, messenger, groups, news and search services, an external ad server and an internal private cloud, with read-only/read-write access to replicated DBs, a core DB and update servers (graph updates, profile updates, JDBC, etc.); north-south to east-west ratio roughly 1:80]
Datacenter Fabric
The industry took two approaches to scaling the datacenter network: overlays and
interconnects.
• Issues that Overlays Address
• Multi-tenant Scalability
• VM mobility
• Virtual Network Scalability
• VM Placement
• Virtual to Physical and Virtual to Virtual
communication scalability
• Asymmetry of network innovation between
physical and virtual world
• What is not addressed by Overlays
• A standard way to terminate the tunnel on the
hypervisor and the physical switch
• Mapping between virtual addresses and
physical addresses (who fills that table at the
border gateway?)
• Network flooding (ARP and L2 multicast)
• Topology unaware and unoptimized
• Compatibility with ECMP
• Inter-datacenter traffic mobility
• Traffic tromboning due to the L2 focus of overlays
• Future-proofing with SDN
Overlays should address the challenges
presented by:
a. Highly distributed virtual applications such as
Hadoop/Big Data, where an application can span
multiple physical and virtual switches; any overlay
tunnel should support both virtual and physical
endpoints
b. Sparse and intermittent connectivity of virtual
machines; the access switch may drop in and out of
participation in the virtual network
c. Dynamic VMs; creation, deletion and
suspend/resume cycles present a challenge for the
network
d. Working with existing physical switches without a
software upgrade; only the first hop that
adds/removes packet markings should require a
new purchase
e. Limiting failure domains to the tunnel endpoints
f. Defining multiple administrative domains
Datacenter Overlay Landscape
Overlay technologies, their adjacency, pros and cons:

FabricPath (L2)
- Pros: vPC support; up to 256-way ECMP; faster convergence; multiple L2 VLANs
- Cons: no inter-DC; needs ASIC support; not VM aware; no FCoE support

TRILL (L2)
- Pros: unlimited ECMP; SPF delivery of unicast; fast convergence
- Cons: no inter-DC; needs ASIC support; new OA&M tools; not VM aware

Shortest Path Bridging, 802.1aq (L2)
- Pros: supports existing Ethernet data-plane standards (.ah and .ad); unicast/multicast; faster convergence
- Cons: 16-way ECMP only; limited market traction; not VM aware

VXLAN (L2)
- Pros: MAC-in-UDP with a 24-bit VNI; scalable; enables virtual L2 segments
- Cons: lacks an explicit control plane; requires IP multicast; needs ASIC support; virtual tunnel endpoints only

NVGRE, Microsoft (L2)
- Pros: GRE tunnels; most ASICs already support GRE
- Cons: does not use UDP, so the outer packet headers cannot be leveraged (e.g. for ECMP hashing)

OTV/LISP (L2)
- Pros: datacenter interconnect
- Cons: limited platform support

VPN4DC (L3)
- Pros: proposed by service providers
- Cons: not much vendor support
There are multiple competing standards for overlays, i.e. for using L3 network infrastructure
to solve L2 scalability problems.
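To make the VXLAN row concrete, here is a minimal sketch of the MAC-in-UDP encapsulation: an 8-byte VXLAN header carrying a 24-bit VNI is prepended to the original L2 frame and sent inside an outer UDP datagram (IANA port 4789, per RFC 7348). The VNI value is a made-up example.

```python
import struct

VXLAN_UDP_PORT = 4789        # IANA-assigned destination port for VXLAN
VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag: the VNI field is valid

def vxlan_header(vni: int) -> bytes:
    """8-byte VXLAN header: 1B flags, 3B reserved, 3B VNI, 1B reserved."""
    assert 0 <= vni < 2**24, "the VNI is a 24-bit value"
    return struct.pack("!B3xI", VXLAN_FLAG_VNI_VALID, vni << 8)

def encapsulate(inner_l2_frame: bytes, vni: int) -> bytes:
    """Payload to carry inside the outer UDP datagram to the remote VTEP."""
    return vxlan_header(vni) + inner_l2_frame

print(vxlan_header(5001).hex())
# -> 0800000000138900 : flags=0x08, 24-bit VNI=0x001389 (5001), reserved byte
```

Because the outer header is UDP, intermediate switches can hash on it for ECMP, which is exactly the entropy the table notes NVGRE's GRE encapsulation lacks.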
Datacenter Fabric – Programmatic View
The management plane offers DevOps the opportunity to influence the path their
application data takes over the network. It is also the plane used by cloud controllers to
provision resources along that path (a minimal API sketch follows the figure below).
• Thus far, applications adapted to a network. With
the new management plane, the network can
adapt to the application.
• Intelligence shifts to the edge of the network.
Applications can use APIs to probe the network and
alter their consumption and constraints.
• Policy definition points can analyze network data
to create patterns which drive policy-creation
tools, e.g. triangulating privacy zones, sampling at
100 Gb/s rates, etc.
• The network comes under pressure to scale
up/down to application needs. All the datacenter
fabric technologies aim to enable this elasticity in
the network.
[Figure: Cloud controllers (e.g. OpenStack compute, storage and network services) expose APIs to DevOps; the network service provisions virtual switches on servers/VMs and physical switches, where network virtualization technologies such as FabricPath, TRILL, VXLAN, NVGRE and SPB play]
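As one example of that programmatic path, the sketch below has a DevOps script ask an OpenStack Neutron endpoint to create a tenant network and subnet. The endpoint URL, token and names are placeholders and error handling is omitted; treat it as a sketch of the management-plane interaction rather than a production client.

```python
import json
import urllib.request

NEUTRON = "http://controller:9696/v2.0"   # assumed Neutron API endpoint
TOKEN = "<keystone-token>"                # obtained from Keystone out of band

def neutron_post(path: str, body: dict) -> dict:
    """POST a JSON body to the Neutron network service and return the reply."""
    req = urllib.request.Request(
        NEUTRON + path,
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json", "X-Auth-Token": TOKEN},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Create an application-specific virtual network, then a subnet on it.
net = neutron_post("/networks",
                   {"network": {"name": "app-east-west", "admin_state_up": True}})
neutron_post("/subnets",
             {"subnet": {"network_id": net["network"]["id"],
                         "ip_version": 4, "cidr": "10.10.0.0/24"}})
```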
Virtual Networking
The industry has a few competing virtualization stacks. The components may differ,
but the networking issues are similar for DevOps.
Components and the embedded networking functionality DevOps needs to be aware of:

Hypervisor
- Implements the v-switch; examples of virtual switches include Cisco N1Kv, Open vSwitch, etc.
- Initiates vMotion, which requires L2 adjacency, i.e. stays within a VLAN
- Challenge: scaling L2 across datacenters (DCI)

Virtual Switches
- VLAN capable; port groups are associated with VLANs
- The host processor does the packet processing
- Challenges include trunking of links between switch and server, and mapping server VLANs (in the hypervisor) to physical switch VLANs
- VLAN scale is increasingly an issue, being resolved by encapsulating L2 frames inside L3 (VXLAN, NVGRE) or inside a new L2 transport header (FabricPath, TRILL)

Virtual NICs
- Increasingly intelligent, with hardware-assisted vNICs
- Offloads to reduce TCP latency
- Teaming to increase bandwidth into the server
- Multi-tenancy with FEX (adapter and VM)

Cloud Orchestration Directors
- What changed is scalability and integration with external orchestration systems
- Distributed virtual switches (spanning servers) presented coordination challenges; single control points are called directors
- Each hypervisor in a cluster continues to switch at L2 independently, i.e. the data paths are not centralized
[Figure: Virtual Networking Basics – virtual machines and virtual service appliances (vFW, vSLB, vWAAS) attached to a virtual switch on virtualized servers, connected to the physical network and managed from a management center]
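The VLAN plumbing in the table above looks roughly like the following sketch, using Open vSwitch as the example virtual switch (assumed installed); bridge, interface and VLAN names are placeholders, and on other stacks the equivalent is a port group tagged with a VLAN ID.

```python
import subprocess

def ovs(*args: str) -> None:
    """Thin wrapper around the ovs-vsctl CLI."""
    subprocess.run(["ovs-vsctl", *args], check=True)

ovs("--may-exist", "add-br", "br0")             # the host's virtual switch
ovs("--may-exist", "add-port", "br0", "eth1")   # uplink, an 802.1Q trunk to the ToR
ovs("--may-exist", "add-port", "br0", "vnet0",  # the VM's vNIC as an access port
    "tag=100")                                  # maps the VM into VLAN 100
```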
Software Defined Networking
[Figure: Host-based centralized controller – orchestration and topology directors above the controller, which programs the physical network]
Component descriptions:

Directors
- Directors for orchestration and topology need to scale; the topology graph must scale to MSDC-class datacenters. The storage model (asset inventory, configuration, etc.) is an open question.
- No explicit DevOps support, i.e. no server-side support and tooling for developers

Controller
- A centralized controller is yet to be proven for datacenter-class deployment; issues remain around scalability, redundancy, security, etc.
- Theoretically good for large-scale tables, but does not solve per-device table overflow
- Programmability comes at the cost of configuration latency

Physical Network
- The existing network, with support for OpenFlow
[Figure: Three switches, each with its own management plane, control plane and data plane (features and forwarding)]
SDN decouples the control plane from the data plane on the yet-to-be-proven assumption
that the economics of the two planes are distinct.
Note: A software-defined network is different from a software-driven
network. The latter refers to applications using available
APIs to provision network services for higher-level SLAs
such as reservation, security, etc.
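To illustrate the decoupling, here is a minimal sketch of a centralized controller pushing a default rule into a switch's data plane over OpenFlow 1.3, using the Ryu framework as one example (run with ryu-manager); it is a teaching sketch under those assumptions, not a statement of how any particular SDN product works.

```python
from ryu.base import app_manager
from ryu.controller import ofp_event
from ryu.controller.handler import CONFIG_DISPATCHER, set_ev_cls
from ryu.ofproto import ofproto_v1_3

class TableMissToController(app_manager.RyuApp):
    """On switch connect, install a table-miss flow that punts unmatched
    packets to the controller - the control plane now lives off-box."""
    OFP_VERSIONS = [ofproto_v1_3.OFP_VERSION]

    @set_ev_cls(ofp_event.EventOFPSwitchFeatures, CONFIG_DISPATCHER)
    def on_switch_features(self, ev):
        dp = ev.msg.datapath
        ofp, parser = dp.ofproto, dp.ofproto_parser
        match = parser.OFPMatch()                     # wildcard: match everything
        actions = [parser.OFPActionOutput(ofp.OFPP_CONTROLLER,
                                          ofp.OFPCML_NO_BUFFER)]
        inst = [parser.OFPInstructionActions(ofp.OFPIT_APPLY_ACTIONS, actions)]
        dp.send_msg(parser.OFPFlowMod(datapath=dp, priority=0,
                                      match=match, instructions=inst))
```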
HyperScale Datacenter
HSDC addresses the scale-out networking requirements of very large datacenters with 100K+
hosts. Innovations target four key areas.
• Topology
- To overcome the limitations of the traditional tree, folded-Clos
inspired topologies are used.
- Some topologies use the ToR switch as the leaf node, while
others, such as BCube, use host-based software switches as
the leaves.
• CPU vs. ASIC
- Switch microarchitectures based on merchant silicon
implement a Clos inside the switch; InfiniBand started this
trend in the early 2000s.
- MSDC is biased towards merchant silicon, even though no
single compelling feature has been identified.
• Layer 2 vs. Layer 3
- FabricPath and TRILL scale the layer-2 network by
encapsulating MAC frames inside a new L2 transport header
(MAC-in-MAC). Other protocols prefer IP-in-IP to scale the
network, e.g. Cisco's Vinci.
• Multipath Forwarding
- Static hash-based ECMP load balancing has increased
TCP-layer latency. New proposals to introduce dynamic
traffic engineering are being discussed (see the sketch below).
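The multipath-forwarding bullet refers to static hash-based ECMP; the sketch below shows the idea with a 5-tuple hash (flow tuples and uplink count are made up). Because the mapping is fixed per flow, two elephant flows can land on the same uplink while others sit idle, which is the problem the dynamic traffic-engineering proposals target.

```python
import zlib

UPLINKS = 4   # e.g. four equal-cost spine uplinks from a leaf switch

def ecmp_uplink(src_ip, dst_ip, proto, src_port, dst_port, links=UPLINKS):
    """Hash the flow 5-tuple and statically map it to one equal-cost link."""
    key = f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}".encode()
    return zlib.crc32(key) % links

flows = [("10.0.0.1", "10.0.1.1", "tcp", 40001, 80),
         ("10.0.0.2", "10.0.1.1", "tcp", 40002, 80),
         ("10.0.0.3", "10.0.1.2", "tcp", 40003, 443)]
for flow in flows:
    print(flow, "-> uplink", ecmp_uplink(*flow))
```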
Editor's Notes
1. SR-IOV is a specification that allows a PCIe device to appear as multiple separate physical PCIe devices for a single server. MR stands for Multi-Root, so MR-IOV does the same for multiple servers. RSS (Receive Side Scaling) spreads incoming packets across the available cores/CPUs; traditionally core 0 received all incoming traffic and became the bottleneck. LSO (Large Send Offload) lets the NIC segment large TCP send buffers into MTU-sized frames, offloading that work from the CPU.
2. If you look at a distributed system as a hierarchy of stores (SRAM, DRAM and disk), then the art or science of distributing a networked application boils down to managing the latency and bandwidth offered to an executing thread at different points in the distributed application. For example, an executing thread has to trade off a locally cached line against fetching from memory; that tradeoff is arbitrated by the local operating system. Similarly, the tradeoff between reading/writing a local disk or a remote disk is arbitrated by the network operating system together with the application infrastructure manager. This is where the topology can serve as a catalyst or an inhibitor. Knowing the topology, and therefore the capabilities or biases of the network operating system, enables a DevOps/application architect to design better systems.
3. Network layout, that which is actually deployed, is a combination of chosen topology (design decision) plus chosen technology (architecture decision). The best way to analyze or design a topology is graph theory.
Clos: the classic paper is "BlackWidow: High-Radix Clos Networks", S. Scott, D. Abts, J. Kim, W.J. Dally.
A Clos(m, n, r) has rn inputs and rn outputs. (Note: in a switch the ports are bidirectional, so the graph looks like it has an even number of stages; a Clos always has an odd number of stages, i.e. 3, 5, 7, ...) So rn = total port count.
Crosspoint count is 2rnm + mr^2, which is less than (rn)^2, the complexity of a crossbar.
Let m = n (non-blocking); then for rn inputs the count is 2rn^2 + nr^2 = (2n + r)*rn, versus (rn)^2 for a crossbar.
Optimal choice of n and r? It depends. For the N3064, n = 32 and r = number of leaves = number of spines (assuming m = n).
The proof for Clos is by mathematical induction, i.e. it holds for all n if it holds for n = 1, n - 1 and n + 1. When m = 1 and n = 1, i.e. C(1,1,r), the Clos trivializes to a crossbar. For higher stage counts (i.e. r > 1) we have:
C(1) = N^2 (crossbar)
C(3) = 6N^(3/2) - 3N
C(5) = 16N^(4/3) - 14N + 3N^(2/3)
C(7) = 36N^(5/4) - 46N + 20N^(3/4) - 3N^(1/2)
C(9) = 76N^(6/5) - 130N + 86N^(4/5) - 26N^(3/5) + 3N^(2/5)
This says we need more stages to scale the network larger, i.e. it is a bad idea to just increase N (port count); m = spine width.
4. Switch microarchitecture is optimized to improve s, r and g. To make s = 1, buffer organizations mitigate head-of-line (HOL) blocking. r is optimized through the design of pipelining, queuing, routing and arbitration within the switch boundary, and is calculated as r = r_L x r_R x r_A x r_S x r_mArch x …
5. Traffic matrix analysis research shows that the patterns are difficult to summarize; they are non-repeating and unpredictable, i.e. difficult to optimize for. Failure analysis research shows that failures are mostly small in size but long in duration: 50% involve < 4 devices and 95% < 20 devices, yet downtimes can be significant (95% < 1 min, 98% < 1 hr, 99.6% < 1 day, 0.09% > 10 days). With 1:1 redundancy, 0.3% of failures affect all redundant components; use n:m redundancy.
  6. Study of ARP in a datacenter http://www.nanog.org/meetings/nanog52/presentations/Tuesday/Karir-4-ARP-Study-Merit%20Network.pdf
7. TRILL vs FP: http://www.networkworld.com/community/blog/full-tilt-boogie-networking-cisco’s-fabricpat http://tools.ietf.org/html/draft-sridharan-virtualization-nvgre-00 http://tools.ietf.org/html/draft-mahalingam-dutt-dcops-vxlan-00. Both VXLAN and NVGRE use a scalable L3 network infrastructure to scale L2. However, they do not solve the security issues: one can still poison the ARP cache, and on the physical network one can spoof the network. Also, there is no support for physical (non-virtual) tunnel endpoints. VPN4DC lets VPN clients connect to their leased or purchased computing resources in public data centers via their own VPNs: http://tools.ietf.org/html/draft-so-vpn4dc-00. There is L2 LISP, but it is not clear how anything LISP-related is an overlay. Juniper's QFabric and Brocade's VCS are not mentioned as they are neither on a standards path nor from Cisco.
8. Virtual networking depends heavily on the hypervisor selection, but the end goal is the same: connect the virtual machines to each other and to the outside world using virtual plus physical networking. Removing the disparity between the network services consumed by a physical server and those consumed by a virtual server was the initial focus of innovation in the virtual networking space. Recently the focus has shifted to (a) scaling the virtual network and (b) enabling hybrid networks where physical and virtual resources co-exist in a policy domain.
9. SDN decouples the control plane from the data plane under the assumption that a separate control plane will follow a different economic curve from the data plane. More specifically, the control plane will follow the curve of server economics and the data plane will follow the curve of commodity, low-end networking gear. The faults are already surfacing when we discuss the scalability of overlay protocols like VXLAN and NVGRE: both require ASIC support, i.e. they cannot be run at speed on existing merchant silicon.