Jacek Wosz - Wasko
Marek Plaza - Wasko
Reflections on modern data center architecture, from "out-of-the-box" products to open designs based on the theory of switching fabrics (Clos networks). An overview of the options available when designing a network that ties all the elements together into a single, smoothly functioning ecosystem. From theory to the practical implementation of the presented variants, we will show how to choose a specific solution for your own use case.
Register for the next edition of PLNOG today: krakow.plnog.pl
3. DATA CENTER TIMELINE
• Traditional Ethernet: L2 + STP + L3 + RVI; MC-LAG
• Ethernet Fabric: QFabric; Virtual Chassis Fabric
• IP Fabric: 3-Stage; 5-Stage Performance; 5-Stage Real Estate; VXLAN + EVPN Fabric
4. DATA CENTER EVOLUTION
March Towards IP Fabrics

• Traditional – L2 at the access with L3 only above it
• Ethernet Fabric – ubiquitous L2/L3; simple management; storage convergence; active-active forwarding
• IP Fabric – L3 only; ubiquitous L2/L3 and full control of the app via an overlay architecture; hosting environment
5. DATA CENTER PROGRESSION
Trending Towards Ethernet Fabrics and IP Fabrics
[Chart: deployment mix (0–100%) of Traditional, Ethernet Fabric, and IP Fabric per segment – Mid Market, F500, T2 SP, T1 SP, MSDC – shown for today and for the next generation.]
6. INDUSTRY TRENDS
Enterprise DC and Cloud

• Public Cloud (XaaS, IaaS) – L3 CLOS + overlay: scale-out IP fabric, small blast radius, hyper-scale multi-tenancy, overlay virtual networks
• Private Cloud (business-critical IT) – L3 CLOS with overlay: virtualized IT, low-scale multi-tenancy
• Cloud-enabled Campus (ITaaS) – L3: consolidated IT, converged storage, simplified operation
• Vanilla Enterprise – L2/L3: simplified network & ops, virtualized network services
7. DECISION TREE
4 Questions to Ask (candidate architectures: VCF, MC-LAG, QFabric, IP Fabric)

1. Do you have end-to-end storage convergence?
2. Do you need NSX or Contrail integration?
3. Which interface types and what port density?
4. What scale?

(For each question the original slide marks which architectures are favorable.)
8. JUNIPER ARCHITECTURES
Juniper Architectures – single point of management and control; purpose-built and turnkey:
• Virtual Chassis – up to 10 members
• Virtual Chassis Fabric – up to 32 members
• QFabric – up to 128 members

Open Architectures – flexible deployment scenarios; open choice of technologies and protocols:
• MC-LAG
• IP Fabric (L3 Fabric)

All of the above run on the QFX5100. One architecture does not fit all – the QFX5100 enables choices!
10. SPINE NODE DEPLOYMENTS
2 or 4 Spine Node deployments (QFX5100-24Q spines, 40GbE fabric links)

2 Spine Nodes – QFX5100-48T leaves, 2 x 40GbE uplinks each:
• 30 x 10GbE racks
• 1,440 x 10GbE ports at 6:1 oversubscription

4 Spine Nodes – 4 x 40GbE uplinks per leaf:
• 28 x 10GbE racks
• 1,344 x 10GbE ports at 3:1 oversubscription (QFX5100-48T leaves)
• 2,688 x 10GbE ports at 6:1 oversubscription (QFX5100-96S leaves)
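The oversubscription ratios above follow directly from the port counts: each leaf's server-facing bandwidth divided by its fabric-facing uplink bandwidth. A quick sanity check (plain arithmetic, no vendor tooling assumed):

```python
def oversubscription(down_ports, down_gbps, up_ports, up_gbps):
    """Ratio of server-facing bandwidth to fabric-facing uplink bandwidth."""
    return (down_ports * down_gbps) / (up_ports * up_gbps)

# QFX5100-48T leaf, 2 x 40GbE uplinks (2-spine design)
assert oversubscription(48, 10, 2, 40) == 6.0   # 6:1
# QFX5100-48T leaf, 4 x 40GbE uplinks (4-spine design)
assert oversubscription(48, 10, 4, 40) == 3.0   # 3:1
# QFX5100-96S leaf, 4 x 40GbE uplinks (4-spine design)
assert oversubscription(96, 10, 4, 40) == 6.0   # 6:1

# Total access ports scale with the rack count
assert 30 * 48 == 1440   # 2-spine: 1,440 x 10GbE
assert 28 * 48 == 1344   # 4-spine: 1,344 x 10GbE
assert 28 * 96 == 2688   # 4-spine: 2,688 x 10GbE
```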
11. SERVER AND STORAGE CONNECTIVITY
Any Ethernet media, high resiliency, flexible deployment

Media: 10/100/1000M copper, 10/100/1000M fiber, 10G copper, 10G fiber; 10G or 40G fabric links

• Any-port connectivity
• In-Service Software Upgrade
• n-way multi-homing
• Active-active paths
• Single point of management
• FCoE transit
• iSCSI / NFS / CIFS
• Lossless Ethernet / DCB
• Hardware SDN support

[Diagram: servers and storage multi-homed to a fabric of QFX5100 switches.]
12. VCF INTEGRATED CONTROL PLANE
Integrated Routing Engine (RE), in-band control plane

• Dual RE (routing engine) with backups (master and backup)
• Distributed in-band control plane
• VCCPD running on all members
• Automatic fabric topology discovery
• Loop-free fabric forwarding path construction
• Control traffic protection for converged fabric
13. VCF INTEGRATED DATA PLANE
Intelligent spine and leaf nodes, federated state, distributed forwarding

Fabric:
• All fabric links active-active
• Traffic load-balanced across all links
• 1.8 µs inter-rack latency

Leaf:
• In-rack switching
• 550 ns in-rack latency
• 16-way server multi-homing
14. RESILIENT CONTROL AND DATA PLANES

Control Plane Redundancy:
• Quad RE (routing engine) redundancy
• Resilient in-band control plane
• GRES, NSR, NSB

Data Plane Redundancy:
• Active-active uplink forwarding
• Server multi-homing
• Uplink redundancy

[Diagram: active, backup, and hot-backup routing engines across 1RU members (48 SFP+ & 1 QIC); uplink redundancy and server multi-homing towards virtual servers (VMs behind an OvSwitch).]
15. VCF DEPLOYMENT
request virtual-chassis {fabric | [disable]} devices same all-members [reboot]
Provisioning flow: rack and cabling → system bring-up in the default mode → set the mode (VC mode or fabric mode) → choose a provisioning method: auto-provisioned, pre-provisioned, or non-provisioned.
16. VCF DEPLOYMENT METHODS
Auto-provisioned:
• Plug and play
• Pre-provision the spine switches using a single CLI command
• The remaining switches join the VCF automatically as line cards

Pre-provisioned:
• No ambiguity of member roles
• All switches are pre-provisioned into the VCF

Non-provisioned:
• Flexible
• Configure the VCP ports, then the regular VC master election happens automatically
{set | delete} virtual-chassis {pre-provisioned | auto-provisioned}
17. NG DATA CENTER WITH OVERLAY
[Diagrams: each POD is a Virtual Chassis Fabric (VCF) of 4 spine (S) and 16 leaf (L) nodes; PODs attach to an upper spine tier (S1 to S8) and edge nodes (E1 to E4).]

• Small Data Center: a single POD – 768 ports
• Medium Data Center: 16 PODs – 12,288 ports
• Large Data Center: 32 PODs – 24,576 ports
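The three sizes above scale linearly with the 768-port POD:

```python
POD_PORTS = 768  # access ports per Virtual Chassis Fabric POD (from the slide)

def fabric_ports(pods):
    """Total access ports for a deployment of `pods` VCF PODs."""
    return pods * POD_PORTS

assert fabric_ports(1) == 768      # small data center
assert fabric_ports(16) == 12288   # medium data center
assert fabric_ports(32) == 24576   # large data center
```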
19. WHY IP FABRICS?
Three Primary Use Cases (each positioned across the Mid Market, F500, T2 SP, T1 SP, and MSDC segments)

1. IT-as-a-Service / Software-Defined Data Center / Self-Service: an IP fabric underlay carrying an overlay (overlay controller; VMs on hypervisors on the servers).
2. Software-as-a-Service / Over-the-Top Web Services: an IP fabric connecting physical servers, with an edge/transit network towards peers and the Internet – hyper physical scale.
3. Hosting / IX with multi-tenancy: a VXLAN fabric (IP Fabric + VXLAN + EVPN) connecting multi-tenant physical servers (T1/T2/T3), with an edge/transit network towards peers and the Internet – hyper logical scale.
23. IP FABRIC TOPOLOGIES
One Size Doesn't Fit All (S = Spine, L = Leaf, A = Access)

• 3-Stage IP Fabric: 4 spines, 16 leaves; 3:1 oversubscription
• 5-Stage IP Fabric – Performance: 8 spines, 16 leaves, 8 access switches; 3:1 oversubscription
• 5-Stage IP Fabric – Real Estate / POD: 4 spines, 8 leaves, 16 access switches; 3:1 oversubscription at the access layer, 24:1 from leaf to spine
24. 5-STAGE IP FABRIC
Single Use Case: MSDC Performance Architecture (S = Spine, L = Leaf, A = Access)

[Diagram: four vSpines – each a spine/leaf pair acting as one logical switch – above the access layer; 3:1 oversubscription.]

25. 5-STAGE IP FABRIC
Three Use Cases (S = Spine, L = Leaf, A = Access)

[Diagram: four PODs (spine, leaves, access; 3:1 oversubscription within a POD, 24:1 from leaf to spine) serving:]
• IT-as-a-Service – Enterprise, overlay architecture
• Software-as-a-Service – large web services, over-the-top
• Hosting – Infrastructure-as-a-Service, overlay architecture
27. MULTI-STAGE CLOS ROLES

vSpine:
• Combination of spine and leaf; acts as a logical switch
• Virtual peering point for access
• Oversubscription depends on the spine and leaf roles
• Single BGP autonomous system number
• Peers via eBGP with access switches

Spine:
• Backplane of the multi-stage CLOS
• Always 1:1 oversubscription
• Provides BGP route reflection
• Peers via iBGP with leaf nodes

Leaf:
• NNI of the multi-stage CLOS
• Variable oversubscription
• Peers via iBGP with spine nodes
• Peers via eBGP with access nodes

Access:
• Provides access to end-points such as compute and storage
• Typically 3:1 oversubscription in enterprise and SP environments, 1:1 for HPC
• Peers via eBGP with vSpine nodes
• Provides L3 gateway services to end-points
• Provides link aggregation to end-points
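To illustrate the peering rules above, here is a minimal sketch (device names and counts are hypothetical, not from the slide): spines and leaves inside a vSpine peer via iBGP, while access switches peer via eBGP with the leaf tier:

```python
# Sketch of the BGP sessions implied by the role definitions above.
# Device names are illustrative only.

def peerings(spines, leaves, access):
    """Return (session_type, a, b) tuples for a multi-stage CLOS."""
    sessions = []
    # Spine <-> Leaf: iBGP inside the vSpine (spine acts as route reflector)
    for s in spines:
        for l in leaves:
            sessions.append(("iBGP", s, l))
    # Leaf <-> Access: eBGP at the vSpine boundary
    for l in leaves:
        for a in access:
            sessions.append(("eBGP", l, a))
    return sessions

sess = peerings(["spine1", "spine2"], ["leaf1", "leaf2"],
                ["acc1", "acc2", "acc3"])
assert sum(1 for t, _, _ in sess if t == "iBGP") == 4  # spine x leaf mesh
assert sum(1 for t, _, _ in sess if t == "eBGP") == 6  # leaf x access mesh
```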
29. MULTI-STAGE CLOS BENEFITS
Massive scale – over 73,000 x 10GbE access ports
High performance – variable oversubscription from 1:1 to N:1
Pay as you grow – start small and increment 1U at a time
Low latency with fixed switches
Very small “blast radius” upon failures in the network
Standards-based design – supports multiple vendors
Deterministic latency with a fixed spine-and-leaf topology
Very flexible physical deployments: TOR, EOR, MOR
30. How to quickly create an IP CLOS fabric?
OpenClos
https://github.com/Juniper/OpenClos
https://techwiki.juniper.net/Automation_Scripting/010_Getting_Started_and_Reference
31. OpenClos structure overview
Components: HTTP server, DHCP server, TRAP receiver, database
Runs on CentOS or Ubuntu with Python 2.7
A collection of Python files + configuration files, driven from the CLI
32. How to interact with OpenClos
• CLI (read | write)
• Python classes and scripts (read | write)
• REST API (read | write)
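As a sketch of the REST option (written in Python 3 here, although OpenClos itself targets Python 2.7): build the device-config URL that appears later in this deck and fetch it. The host, pod, and device names are deployment-specific assumptions:

```python
from urllib.parse import quote
from urllib.request import urlopen

BASE = "http://localhost/openclos"  # adjust to your OpenClos host

def device_config_url(pod_id, device_id):
    """Build the REST URL that returns a device's generated configuration."""
    return f"{BASE}/ip-fabric/{quote(pod_id)}/devices/{quote(device_id)}/conf"

url = device_config_url("PLNOG", "leaf-PLNOG-11")
assert url == "http://localhost/openclos/ip-fabric/PLNOG/devices/leaf-PLNOG-11/conf"

# With the OpenClos REST server running, the config can be fetched with:
# config = urlopen(url).read().decode()
```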
33. Directory structure
├── jnpr
│ └── openclos/
│ ├── conf/
│ ├── data
│ ├── out
│ ├── script/
│ └── tests/
│ ├── performance/
│ └── unit/
conf/ contains user configuration files in YAML and JSON, as well as configuration templates
openclos/ contains all Python files
data/ contains the local database if SQLite is used
out/ contains all generated files (DHCP config, cabling plan, etc.)
script/ contains scripts used by Network Director
tests/ contains all unit tests (for development)
36. PREREQUISITES
1. An inventory of MAC addresses
2. Console access (recommended, not mandatory)
3. All spines must be the same model
37. STEPS TO CREATE A NEW IP FABRIC POD
1. Create an inventory file for this fabric
2. Adapt the configuration templates (optional)
3. Define the new POD in conf/closTemplate.yaml
4. Create the POD in the database
5. Generate the cabling plan and configurations
41. PRE-ZTP VERIFICATION
1 – Check that the DHCP configuration looks OK:
cat /etc/dhcp/dhcpd.conf
2 – Check that the DHCP server is running:
service isc-dhcp-server status (Ubuntu)
3 – Check that the REST server is running:
ps aux | grep rest
curl http://localhost/openclos
4 – Check that the configurations listed in the DHCP configuration are available:
curl http://localhost/openclos/ip-fabric/{pod-id}/devices/{device-id}/conf
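The same four checks can be scripted; a minimal Python sketch under the same assumptions as above (Ubuntu paths, REST server on localhost):

```python
import subprocess
from urllib.error import URLError
from urllib.request import urlopen

def dhcp_config_readable(path="/etc/dhcp/dhcpd.conf"):
    """Check 1: the DHCP configuration file exists and is non-empty."""
    try:
        with open(path) as f:
            return bool(f.read().strip())
    except OSError:
        return False

def service_running(name="isc-dhcp-server"):
    """Check 2: ask the init system whether the service is up (Ubuntu)."""
    return subprocess.call(["service", name, "status"],
                           stdout=subprocess.DEVNULL,
                           stderr=subprocess.DEVNULL) == 0

def rest_server_up(url="http://localhost/openclos"):
    """Checks 3-4: the REST server answers; the same helper can fetch
    /ip-fabric/{pod-id}/devices/{device-id}/conf for each device."""
    try:
        return urlopen(url, timeout=5).status == 200
    except (URLError, OSError):
        return False
```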
42. ZTP VERIFICATION
1 – On the device: run "show dhcp client binding" to see whether the device has obtained an IP address
2 – Monitor the REST server logs to watch the queries coming in from the devices
48. Fabric parameters (openclos.yaml)

wasko@openclos:~/OpenClos-master$ vim jnpr/openclos/conf/openclos.yaml

outputDir : out
# Logging level possible values: DEBUG, INFO, WARNING, ERROR, CRITICAL
logLevel :
  fabric : INFO
# Device family and port names
deviceFamily :
  QFX5100-24Q :
    uplinkPorts :
    downlinkPorts :
    ports : 'et-0/0/[0-23]'
  QFX5100-48S :
    uplinkPorts : 'et-0/0/[48-53]'
    downlinkPorts : 'xe-0/0/[0-47]'
    ports :
# HttpServer for REST and ZTP.
httpServer :
  ipAddr : 10.1.1.114
  port : 80
# various scripts; the backup database script is engine specific
script :
  database:
    backup : script/backup_sqlite.sh
49. Global parameters (closTemplate.yaml)

wasko@openclos:~/OpenClos-master$ vim jnpr/openclos/conf/closTemplate.yaml

ztp:
  dhcpSubnet : 10.0.2.0/24
  dhcpOptionRoute : 10.0.2.254
pods:
  # pod name or pod identifier
  anotherPod:
    spineCount : 2
    # possible options for spineDeviceType are QFX5100-24Q
    spineDeviceType : QFX5100-24Q
    leafCount : 2
    # possible options for leafDeviceType are QFX5100-96S, QFX5100-48S
    leafDeviceType : QFX5100-48S
    hostOrVmCountPerLeaf : 254
    interConnectPrefix : 192.169.0.0
    vlanPrefix : 172.17.0.0
    loopbackPrefix : 10.11.0.0
    spineAS : 300
    leafAS : 400
    topologyType : threeStage
    inventory : inventoryPLNOG.json
    spineJunosImage : jinstall-qfx-5-flex-13.2X51-D30.4-domestic-signed.tgz
    leafJunosImage : jinstall-qfx-5-flex-13.2X51-D30.4-domestic-signed.tgz
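A pod definition like the one above fully determines the wiring of a three-stage fabric: every leaf gets one uplink to every spine. A small sketch of the derived numbers (values taken from the template above):

```python
pod = {"spineCount": 2, "leafCount": 2, "hostOrVmCountPerLeaf": 254}

def fabric_links(pod):
    """Each leaf has one link to each spine in a 3-stage fabric."""
    return pod["spineCount"] * pod["leafCount"]

def max_hosts(pod):
    """Upper bound on attached hosts/VMs across all leaves."""
    return pod["leafCount"] * pod["hostOrVmCountPerLeaf"]

assert fabric_links(pod) == 4
assert max_hosts(pod) == 508
```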
50. Let’s start
{master:0}
root@QFX5100> request system zeroize
warning: System will be rebooted and may not boot without configuration
Erase all data, including configuration and log files? [yes,no] (no) yes
warning: ipsec-key-management subsystem not running - not needed by configuration.
warning: zeroizing fpc0
{master:0}
root@QFX5100> Mar 2 04:38:09 init: tnp-process (PID 1230) stopped by signal 17
Terminated
.
Terminated
root@QFX5100:RE:0% Mar 2 04:38:15 init: event-processing (PID 984) exited with status=0 Normal Exit
Waiting (max 60 seconds) for system process `vnlru_mem' to stop...done
Waiting (max 60 seconds) for system process `vnlru' to stop...done
Waiting (max 60 seconds) for system process `bufdaemon' to stop...done
Waiting (max 60 seconds) for system process `syncer' to stop...
Syncing disks, vnodes remaining...0 0 0 done
syncing disks... All buffers synced.
Uptime: 9h57m4s
recorded reboot as normal shutdown
unloading fpga driver
unloading host-dev
Shutting down ACPI
Rebooting...
51. Let’s start
wasko@openclos:~/OpenClos-master$ sudo -H python jnpr/openclos/tests/sampleApplication.py
[sudo] password for wasko:
Couldn't import dot_parser, loading of dot files will not be possible.
INFO:fabric:Created pod name: ' PLNOG'
INFO:writer:Writing cabling plan: /home/wasko/OpenClos-master/jnpr/openclos/out/anotherPod/cablingPlan.json
INFO:writer:Writing cabling plan: /home/wasko/OpenClos-master/jnpr/openclos/out/anotherPod/cablingPlan.dot
INFO:writer:Writing config for device: leaf-PLNOG-11
INFO:writer:Writing config for device: leaf-PLNOG-12
INFO:writer:Writing config for device: spine-PLNOG-11
INFO:writer:Writing config for device: spine-PLNOG-12
INFO:writer:Writing dhcpd.conf for pod: PLNOG
Reading package lists... Done
* Starting ISC DHCP server dhcpd
...done.
INFO:rest:REST server started at 10.1.1.114:80
Bottle v0.12.8 server starting up (using WSGIRefServer())...
Listening on http://10.1.1.114:80/
Hit Ctrl-C to quit.
10.1.1.254 - - [01/Mar/2015 19:02:11] "GET / HTTP/1.1" 200 401
10.1.1.254 - - [01/Mar/2015 19:02:12] "GET / HTTP/1.1" 200 401
10.1.1.254 - - [01/Mar/2015 19:02:29] "GET /pods/PLNOG/devices/leaf-PLNOG-12/config HTTP/1.1" 200 4529
10.0.2.21 - - [01/Mar/2015 19:04:44] "GET /pods/PLNOG/devices/leaf-PLNOG-12/config HTTP/1.1" 200 4529
10.0.2.21 - - [01/Mar/2015 19:05:07] "GET /jinstall-qfx-5-flex-13.2X51-D30.4-domestic-signed.tgz HTTP/1.1"
200 450420680
(...)
53. Fetching soft/conf + ZTP
root>
Auto Image Upgrade: DHCP Options for client interface vme.0:
ConfigFile: pods/anotherPod/devices/spine-PLNOG-11/config ImageFile: jinstall-qfx-5-flex-
13.2X51-D30.4-domestic-signed.tgz Gateway: 10.0.2.254 DHCP Server: 10.0.2.254 File Server:
10.1.1.114 Options state: All options set
Auto Image Upgrade: DHCP Client Bound interfaces: vme.0
Auto Image Upgrade: DHCP Client Unbound interfaces: irb.0 em1.0
Auto Image Upgrade: To stop, on CLI apply "delete chassis auto-image-upgrade" and commit
Auto Image Upgrade: Active on client interface: vme.0
Auto Image Upgrade: Interface:: "vme"
Auto Image Upgrade: Server:: "10.1.1.114"
Auto Image Upgrade: Image File:: "jinstall-qfx-5-flex-13.2X51-D30.4-domestic-signed.tgz"
Auto Image Upgrade: Server File:: "config"
Auto Image Upgrade: Gateway:: "10.0.2.254"
Auto Image Upgrade: Protocol:: "http"
Auto Image Upgrade: Start fetching config file from server 10.1.1.114 through vme using http
Auto Image Upgrade: File config fetched from server 10.1.1.114 through vme
Auto Image Upgrade: Start fetching jinstall-qfx-5-flex-13.2X51-D30.4-domestic-signed.tgz file
from server 10.1.1.114 through vme using http
Auto Image Upgrade: Committed Configuration config received from 10.1.1.114 through vme