© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Network Performance:
Making Every Packet Count
Mike Furr, Principal Engineer, EC2
November 29, 2017
NET401
What to expect from this session
• TCP
• Tuning TCP on Linux
• Performance
• Application
TCP
• Transmission Control Protocol
• Underlies SSH, HTTP, *SQL, SMTP
• Stream delivery, flow control
[Diagram: TCP connection between two hosts, Jack and Jill]
Limiting in-flight data
[Diagram: Jack and Jill, each with a receive window and a congestion window]
Bandwidth delay product
[Diagram: path between Jack and Jill at 2 ms round-trip time]
[Diagram: path between Jack and Jill at 100 ms round-trip time]
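The bandwidth delay product is the amount of data that has to be in flight to keep a path full. A minimal sketch of the arithmetic for the two round-trip times shown follows; the 10 Gbps link rate is an assumption chosen for illustration and is not a figure from these slides.

```python
def bandwidth_delay_product_bytes(bandwidth_bps: int, rtt_ms: int) -> float:
    """Bytes that must be in flight to fill a path of the given rate and RTT."""
    # bits per second * seconds = bits in flight; divide by 8 for bytes
    return bandwidth_bps * rtt_ms / 1000 / 8


ASSUMED_LINK_BPS = 10_000_000_000  # assumption: a 10 Gbps path

bdp_short_path = bandwidth_delay_product_bytes(ASSUMED_LINK_BPS, 2)
bdp_long_path = bandwidth_delay_product_bytes(ASSUMED_LINK_BPS, 100)

print(f"2 ms RTT:   {bdp_short_path / 1e6:.1f} MB in flight")
print(f"100 ms RTT: {bdp_long_path / 1e6:.1f} MB in flight")
```

At the same link rate, the 100 ms path needs 50 times as much data in flight as the 2 ms path, which is why window limits matter far more on long paths.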
Receive window
Receiver controlled, signaled to sender
Congestion window
• Sender controlled
• Window is managed by the congestion control algorithm
• Inputs—vary by algorithm

Initial congestion window
$ ip route list
default via 10.16.16.1 dev eth0
10.16.16.0/24 dev eth0 proto kernel scope link
169.254.169.254 dev eth0 scope link
1448 + 1448 + 1448 = 4344 bytes (3 segments of 1448 bytes)
# ip route change 10.16.16.0/24 dev eth0 \
    proto kernel scope link initcwnd 16
$ ip route list
default via 10.16.16.1 dev eth0
10.16.16.0/24 dev eth0 proto kernel scope link initcwnd 16
169.254.169.254 dev eth0 scope link
1448 + 1448 + 1448 + 1448 [+ 12 more] = 23168 bytes (16 segments of 1448 bytes)
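The byte totals on these two slides are the initial congestion window in segments multiplied by the 1448-byte MSS. The sketch below reproduces that arithmetic; the 20,000-byte response size is a hypothetical value used only to show when a response fits in the first flight.

```python
MSS_BYTES = 1448  # segment payload size shown on the slides


def initial_window_bytes(initcwnd_segments: int, mss_bytes: int = MSS_BYTES) -> int:
    """Bytes the sender may transmit before the first acknowledgement arrives."""
    return initcwnd_segments * mss_bytes


def fits_in_first_flight(response_bytes: int, initcwnd_segments: int) -> bool:
    """True if the whole response can be sent without waiting for an ACK."""
    return response_bytes <= initial_window_bytes(initcwnd_segments)


HYPOTHETICAL_RESPONSE_BYTES = 20_000  # assumption, for illustration only

for initcwnd in (3, 16):
    print(initcwnd, initial_window_bytes(initcwnd),
          fits_in_first_flight(HYPOTHETICAL_RESPONSE_BYTES, initcwnd))
```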
Impact of loss on TCP throughput
[Chart: TCP throughput (axis 0 to 100) against loss rate (axis 0% to 10%)]
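One widely cited approximation for loss-based congestion control (the Mathis model) bounds throughput at roughly (MSS / RTT) * (C / sqrt(p)), with C about sqrt(3/2). The sketch below uses it to show how throughput falls as loss rises. It is a simplified model and not the source of the chart on this slide; the MSS and RTT values are assumptions.

```python
import math


def mathis_throughput_bps(mss_bytes: int, rtt_s: float, loss_rate: float,
                          c: float = math.sqrt(1.5)) -> float:
    """Approximate upper bound on loss-based TCP throughput (Mathis model)."""
    if not 0 < loss_rate <= 1:
        raise ValueError("loss_rate must be in (0, 1]")
    return (mss_bytes * 8 / rtt_s) * (c / math.sqrt(loss_rate))


ASSUMED_MSS = 1448      # assumption: MSS seen elsewhere in this deck
ASSUMED_RTT_S = 0.080   # assumption: an 80 ms path

for loss in (0.001, 0.005, 0.01, 0.05, 0.10):
    mbps = mathis_throughput_bps(ASSUMED_MSS, ASSUMED_RTT_S, loss) / 1e6
    print(f"loss {loss:6.1%}: about {mbps:.2f} Mbps per connection")
```

Under this model, quadrupling the loss rate halves the achievable throughput.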
Loss is visible as TCP retransmissions
$ netstat -s | grep retransmit
58496 segments retransmitted
52788 fast retransmits
135 forward retransmits
3659 retransmits in slow start
392 SACK retransmits failed
Socket level diagnostic
$ ss -ite
State Recv-Q Send-Q Local Address:Port Peer Address:Port
ESTAB 0 3829960 10.16.16.18:https 10.16.16.75:52008
timer:(on,012ms,0) uid:498 ino:7116021 sk:0001c286 <->
ts sack cubic wscale:7,7 rto:204 rtt:1.423/0.14 ato:40
mss:1448 cwnd:138 ssthresh:80 send 1123.4Mbps unacked:138
retrans:0/11737 rcv_space:26847
Fields highlighted on the slides:
• ESTAB: TCP state
• Send-Q (3829960): bytes queued for transmission
• cubic: congestion control algorithm
• rto:204: retransmission timeout
• cwnd:138: congestion window
• retrans:0/11737: retransmissions
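The `send` rate in the `ss` output can be cross-checked from the other fields: one congestion window of `cwnd` segments, each `mss` bytes, per round trip. The sketch below parses the `key:value` pairs from the sample output and redoes that calculation; the parsing helper is a hypothetical convenience, not part of `ss`.

```python
import re

# Two lines of the sample ss output shown on the slides, joined into one string.
SS_SAMPLE = ("ts sack cubic wscale:7,7 rto:204 rtt:1.423/0.14 ato:40 "
             "mss:1448 cwnd:138 ssthresh:80 send 1123.4Mbps unacked:138")


def parse_fields(text: str) -> dict:
    """Collect the key:value tokens from a line of ss output."""
    return dict(re.findall(r"(\w+):(\S+)", text))


fields = parse_fields(SS_SAMPLE)
mss_bytes = int(fields["mss"])
cwnd_segments = int(fields["cwnd"])
rtt_ms = float(fields["rtt"].split("/")[0])  # first number is the smoothed RTT

# One full congestion window delivered per round trip, in bits per second.
send_rate_bps = cwnd_segments * mss_bytes * 8 / (rtt_ms / 1000)
print(f"{send_rate_bps / 1e6:.1f} Mbps")
```

The result agrees with the `send 1123.4Mbps` value that `ss` reports for this socket.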
Monitoring retransmissions in real time
Observable using Linux kernel tracing
# tcpretrans
TIME PID LADDR:LPORT -- RADDR:RPORT STATE
03:31:07 106588 10.16.16.18:443 R> 10.16.16.75:52291 ESTABLISHED
https://github.com/brendangregg/perf-tools/
Congestion control algorithm
Congestion control algorithms in Linux
• New Reno: Pre-2.6.8
• BIC: 2.6.8–2.6.18
• CUBIC: 2.6.19+
• Pluggable architecture
• Other algorithms often available
• BBR, Vegas, Illinois, Westwood, Highspeed, Scalable
Tuning congestion control algorithm
$ sysctl net.ipv4.tcp_available_congestion_control
net.ipv4.tcp_available_congestion_control = cubic reno
$ find /lib/modules -name 'tcp_*'
[…]
# modprobe tcp_illinois
$ sysctl net.ipv4.tcp_available_congestion_control
net.ipv4.tcp_available_congestion_control = cubic reno illinois
# sysctl net.ipv4.tcp_congestion_control=illinois
net.ipv4.tcp_congestion_control = illinois
# echo "net.ipv4.tcp_congestion_control = illinois" > \
    /etc/sysctl.d/01-tcp.conf
[Restart network processes]
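Before switching algorithms it can help to confirm that the one you want is actually loaded. A minimal sketch that parses the `sysctl` output lines shown above; the helper names are hypothetical, and the input strings are the two outputs from the slides.

```python
from typing import List


def available_algorithms(sysctl_line: str) -> List[str]:
    """Split 'name = value value ...' sysctl output into a list of algorithm names."""
    _, _, value = sysctl_line.partition("=")
    return value.split()


def can_select(algorithm: str, sysctl_line: str) -> bool:
    """True if the algorithm appears in tcp_available_congestion_control."""
    return algorithm in available_algorithms(sysctl_line)


BEFORE_MODPROBE = "net.ipv4.tcp_available_congestion_control = cubic reno"
AFTER_MODPROBE = "net.ipv4.tcp_available_congestion_control = cubic reno illinois"

print(can_select("illinois", BEFORE_MODPROBE))  # module not loaded yet
print(can_select("illinois", AFTER_MODPROBE))   # after modprobe tcp_illinois
```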
TCP-BBR
• Available since Linux 4.9
• Uses pacing and active probing to estimate bandwidth and RTT
• Starting in 4.13, the fq qdisc is no longer required
# modprobe sch_fq
# modprobe tcp_bbr
# sysctl net.core.default_qdisc=fq
# sysctl net.ipv4.tcp_congestion_control=bbr
Retransmission timer
• Determines when the congestion control algorithm considers a packet lost
• Too low: spurious retransmissions; congestion control can over-react and be slow to re-open the congestion window
• Too high: increased latency while the algorithm waits to determine that a packet is lost and retransmit it
Tuning retransmission timer minimum
• Default minimum: 200 ms
# ip route list
default via 10.16.16.1 dev eth0
10.16.16.0/24 dev eth0 proto kernel scope link
169.254.169.254 dev eth0 scope link
10.16.16.0/24 is the route to other instances in our subnet (same AZ)
# ip route change 10.16.16.0/24 dev eth0 proto kernel \
    scope link rto_min 50ms
# ip route list
default via 10.16.16.1 dev eth0
10.16.16.0/24 dev eth0 proto kernel scope link rto_min lock 50ms
169.254.169.254 dev eth0 scope link
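To see why the minimum dominates on a low-latency path, here is a simplified sketch of an RFC 6298 style timeout: smoothed RTT plus four times the RTT variance, clamped to the configured minimum. Linux's own calculation differs in detail (the `ss` sample earlier reports `rto:204`), so treat this as an approximation; the RTT figures are taken from that sample's `rtt:1.423/0.14` field.

```python
def rto_ms(srtt_ms: float, rttvar_ms: float, rto_min_ms: float) -> float:
    """Simplified RFC 6298 style RTO: SRTT + 4 * RTTVAR, never below the minimum."""
    return max(rto_min_ms, srtt_ms + 4 * rttvar_ms)


SRTT_MS = 1.423    # smoothed RTT from the ss sample
RTTVAR_MS = 0.14   # RTT variance from the ss sample

unclamped_ms = SRTT_MS + 4 * RTTVAR_MS
default_rto = rto_ms(SRTT_MS, RTTVAR_MS, rto_min_ms=200)
tuned_rto = rto_ms(SRTT_MS, RTTVAR_MS, rto_min_ms=50)

print(f"RTT-derived timeout: {unclamped_ms:.3f} ms")
print(f"with rto_min 200 ms: {default_rto} ms")
print(f"with rto_min 50 ms:  {tuned_rto} ms")
```

On this path the RTT-derived value is about 2 ms, so the effective timeout is whatever the minimum is set to.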
Queueing along the network path
• Intermediate routers along a path have interface buffers
• High load leads to more packets in buffer
• Latency increases due to queue time
• Can trigger retransmission timeouts
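The added latency from a queue is the time needed to drain the bytes ahead of a packet at the link rate. A minimal sketch of that relationship follows; the buffer occupancy and link rate are assumptions picked for illustration, not measurements from this talk.

```python
def queueing_delay_ms(queued_bytes: int, link_bps: int) -> float:
    """Time in milliseconds to drain the bytes already queued ahead of a packet."""
    return queued_bytes * 8 * 1000 / link_bps


ASSUMED_QUEUED_BYTES = 1_000_000   # assumption: 1 MB sitting in a router buffer
ASSUMED_LINK_BPS = 1_000_000_000   # assumption: 1 Gbps egress link

delay_ms = queueing_delay_ms(ASSUMED_QUEUED_BYTES, ASSUMED_LINK_BPS)
print(f"{delay_ms:.1f} ms of added latency")
```

Each additional queue along the path adds its own delay, and once the total exceeds the sender's retransmission timeout, the sender retransmits packets that were merely waiting.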
Active queue management
$ tc qdisc list
qdisc mq 0: dev eth0 root
qdisc pfifo_fast 0: dev eth0 parent :1 bands 3 […]
qdisc pfifo_fast 0: dev eth0 parent :2 bands 3 […]
# tc qdisc add dev eth0 root fq_codel
$ tc qdisc list
qdisc fq_codel 8006: dev eth0 root refcnt 9 limit 10240p flows 1024 quantum 9015 target 5.0ms interval 100.0ms ecn
www.bufferbloat.net/projects/codel/wiki
Maximum transmission unit
• 1500-byte MTU: 1448 B payload, 3.47% overhead
• 9001-byte MTU: 8949 B payload, 0.58% overhead
• Improvement seen among instances in your VPC
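The two overhead figures follow from 52 bytes of headers per packet divided by the MTU. The breakdown used below (20 bytes IPv4, 20 bytes TCP, 12 bytes for the timestamp option) is an assumption that is consistent with the 1448 B and 8949 B payloads on the slide.

```python
IPV4_HEADER_BYTES = 20
TCP_HEADER_BYTES = 20
TCP_TIMESTAMP_OPTION_BYTES = 12  # assumption: timestamps enabled, padded to 12 bytes
HEADER_BYTES = IPV4_HEADER_BYTES + TCP_HEADER_BYTES + TCP_TIMESTAMP_OPTION_BYTES


def payload_bytes(mtu: int) -> int:
    """TCP payload that fits in one packet of the given MTU."""
    return mtu - HEADER_BYTES


def overhead_percent(mtu: int) -> float:
    """Share of each packet spent on IP and TCP headers."""
    return HEADER_BYTES / mtu * 100


for mtu in (1500, 9001):
    print(mtu, payload_bytes(mtu), f"{overhead_percent(mtu):.2f}%")
```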
Tuning maximum transmission unit
# ip link list
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc
mq state UP mode DEFAULT group default qlen 1000
link/ether 06:f1:b7:e1:3b:e7
# ip route list
default via 10.16.16.1 dev eth0
10.16.16.0/24 dev eth0 proto kernel scope link
169.254.169.254 dev eth0 scope link
# ip route change default via 10.16.16.1 dev eth0 mtu 1500
# ip route list
default via 10.16.16.1 dev eth0 mtu 1500
10.16.16.0/24 dev eth0 proto kernel scope link
169.254.169.254 dev eth0 scope link
Amazon EC2 enhanced networking
[Diagram: Xen-PV path through the virtualization layer to the hardware NIC]
[Diagram: Intel 82599 virtual functions (VF) on the hardware NIC, 10 Gbps]
Amazon EC2 Elastic Network Adapter
[Diagram: ENA virtual functions (VF), 20 Gbps and 25 Gbps]
Verifying ENA is enabled
PV-XEN
$ ethtool -i eth0
driver: vif
Enhanced Networking (C3, C4, D2, I2, R3, M4 except m4.16xlarge)
$ ethtool -i eth0
driver: ixgbevf
Elastic Network Adapter (F1, G3, I3, P2, P3, R4, X1, m4.16xlarge)
$ ethtool -i eth0
driver: ena
https://github.com/amzn/amzn-drivers
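The driver name reported by `ethtool -i` is enough to tell the three interface types apart. A small sketch that classifies captured `ethtool -i` output using only the three driver names shown on this slide; the function name is hypothetical and any other driver is reported as unknown.

```python
DRIVER_TO_INTERFACE = {
    "vif": "PV-XEN",
    "ixgbevf": "Enhanced Networking",
    "ena": "Elastic Network Adapter",
}


def classify_interface(ethtool_output: str) -> str:
    """Map the 'driver:' line of `ethtool -i` output to an interface type."""
    for line in ethtool_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "driver":
            return DRIVER_TO_INTERFACE.get(value.strip(), "unknown")
    return "unknown"


print(classify_interface("driver: ena"))
print(classify_interface("driver: ixgbevf"))
print(classify_interface("driver: vif"))
```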
Applying our new knowledge
Test setup
• m4.16xlarge instances—Jack and Jill
• Amazon Linux 2017.09 (Kernel 4.9.51-10.52.amzn1)
• Web Server: Nginx 1.12.1
• Client: ApacheBench 2.3
• TLSv1.2,ECDHE-RSA-AES256-GCM-SHA384,2048,256
• Transferring incompressible data (random bits)
• Origin data stored in tmpfs (RAM based; no server disk I/O)
• Data discarded once retrieved (no client disk I/O)
Application 1
HTTPS with intermediate network loss
[Diagram: path between Jack and Jill with 0.5% packet loss]
Test setup
• 1 test server instance, 1 test client instance
• 80 ms RTT
• 80 parallel clients retrieving a 100 MB object
$ ab -n 1600 -c 80 https://server/100m
• Simulated packet loss
# tc qdisc add dev eth0 root netem loss 0.5%
Goal: Minimize throughput impact with 0.5% loss
Results: application 1
[Bar charts: two values per configuration; the axis labels are not recoverable]
Configuration               Bar 1     Bar 2
Cubic (default), no loss    23.2 s    37.6 s
Cubic, 0.5% loss            42.8 s    52.3 s
Illinois, 0.5% loss         20.7 s    41.5 s
BBR, 0.5% loss              11.1 s    38.3 s
BBR, no loss                 8.8 s    44.7 s
Cubic to BBR at 0.5% loss: 42.8 s to 11.1 s, a 74% decrease
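The 74% figure is the relative drop from the Cubic result under 0.5% loss (42.8 s) to the BBR result under the same loss (11.1 s). The sketch below repeats that calculation and applies it to the Illinois result (20.7 s) as well; the Illinois percentage is derived here and does not appear on the slides.

```python
def percent_decrease(baseline_s: float, value_s: float) -> float:
    """Relative reduction from baseline, as a percentage."""
    return (baseline_s - value_s) / baseline_s * 100


CUBIC_WITH_LOSS_S = 42.8
ILLINOIS_WITH_LOSS_S = 20.7
BBR_WITH_LOSS_S = 11.1

bbr_decrease = percent_decrease(CUBIC_WITH_LOSS_S, BBR_WITH_LOSS_S)
illinois_decrease = percent_decrease(CUBIC_WITH_LOSS_S, ILLINOIS_WITH_LOSS_S)

print(f"BBR vs Cubic at 0.5% loss:      {bbr_decrease:.0f}% decrease")
print(f"Illinois vs Cubic at 0.5% loss: {illinois_decrease:.0f}% decrease")
```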
Application 2
Data transfer; low RTT path
Test setup
• 1 test server instance, 1 test client instance
• 1 ms RTT
• 8 parallel clients retrieving a 10 MB object
$ ab -n 100000 -c 8 https://server/10m
• Start at default RTO, then decrease
Goal: Minimize latency at high percentiles with 0.2% loss
Results: application 2
• Default RTO: p50 latency 2 ms, p99.99 latency 200 ms
• RTO:200, p99.99 latency: 200 ms
• RTO:50, p99.99 latency: 100 ms (50% decrease)
Application 3
High transaction rate HTTP service
Test setup
• 1 test server instance, 1 test client instance
• 80 ms RTT
• HTTP, not HTTPS
• 1500 MTU
• 200k requests for a 10k object
$ ab -n 200000 -c 200 http://server/10k
Goal: Minimize latency
Results: application 3
Test                                     P50 latency   Avg BW
Initial congestion window, 3 packets     321 ms        12.550 Mbps
Initial congestion window, 10 packets    241 ms        16.765 Mbps
Initial congestion window, 16 packets    161 ms        22.518 Mbps
Average bandwidth, 3 packets to 16 packets: a 79% increase
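Dividing each P50 latency by the 80 ms round-trip time from the test setup gives roughly 4, 3 and 2 round trips, which is consistent with each larger initial window saving about one round trip per request. That reading is an interpretation of the numbers rather than something the slides state. The sketch also recomputes the 79% bandwidth increase.

```python
RTT_MS = 80  # round-trip time from the test setup

# (initial congestion window in packets, P50 latency in ms, average bandwidth in Mbps)
RESULTS = [
    (3, 321, 12.550),
    (10, 241, 16.765),
    (16, 161, 22.518),
]


def round_trips(latency_ms: float, rtt_ms: float = RTT_MS) -> int:
    """Latency expressed as a whole number of round trips."""
    return round(latency_ms / rtt_ms)


def percent_increase(before: float, after: float) -> float:
    """Relative gain from before to after, as a percentage."""
    return (after - before) / before * 100


trips = {initcwnd: round_trips(p50_ms) for initcwnd, p50_ms, _ in RESULTS}
bandwidth_gain = percent_increase(RESULTS[0][2], RESULTS[-1][2])

print(trips)
print(f"{bandwidth_gain:.0f}% more average bandwidth at 16 packets than at 3")
```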
Takeaways
• The network doesn't have to be a black box: Linux tools can be used to interrogate and understand it
• Simple tweaks to settings can dramatically increase performance: test, measure, change
• Understand what your application needs from the network, and tune accordingly
Thank you!
Remember to complete
your evaluations!

Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

AWS TCP Performance Guide

  • 1. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Network Performance: Making Every Packet Count. Mike Furr, Principal Engineer, EC2. November 29, 2017. NET401
  • 2. What to expect from this session: Tuning TCP on Linux; TCP Performance Application
  • 4. TCP
  • 5. TCP: Transmission Control Protocol. Underlies SSH, HTTP, *SQL, SMTP. Stream delivery, flow control.
  • 6. TCP (diagram: Jack and Jill)
  • 7. (diagram: Jack and Jill)
  • 8. Limiting in-flight data (diagram: Jack and Jill, each with a receive window and a congestion window)
  • 9. Bandwidth delay product (diagram: Jack and Jill, 2 ms round-trip time)
  • 10. Bandwidth delay product (diagram: Jack and Jill, 100 ms round-trip time)
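The two round-trip times on slides 9 and 10 can be turned into numbers with the usual definition, bandwidth delay product = bandwidth × round-trip time. The sketch below is illustrative only: the slides give the RTTs but not a link speed, so the 10 Gbps figure is an assumption, and the function name is hypothetical.

```python
def bandwidth_delay_product_bytes(bandwidth_bps: float, rtt_seconds: float) -> float:
    """Bytes that must be in flight to keep a path of this bandwidth and RTT full."""
    return bandwidth_bps * rtt_seconds / 8


# Assumed link speed, for illustration only; the slides state just the RTTs.
ASSUMED_BANDWIDTH_BPS = 10e9

for rtt_ms in (2, 100):
    bdp = bandwidth_delay_product_bytes(ASSUMED_BANDWIDTH_BPS, rtt_ms / 1000)
    print(f"RTT {rtt_ms:>3} ms -> about {bdp / 1e6:.1f} MB in flight")
```

At a fixed bandwidth the amount of data that has to be in flight grows linearly with RTT, which is why the receive and congestion windows matter far more on the 100 ms path than on the 2 ms one.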
  • 11. Receive window: receiver controlled, signaled to sender
  • 12. Congestion window (diagram: Jack and Jill, each with a receive window and a congestion window)
  • 13. Congestion window: sender controlled. The window is managed by the congestion control algorithm. Inputs vary by algorithm.
  • 14. Initial congestion window
      $ ip route list
      default via 10.16.16.1 dev eth0
      10.16.16.0/24 dev eth0 proto kernel scope link
      169.254.169.254 dev eth0 scope link
      Initial window: 3 segments of 1448 bytes = 4344 bytes
  • 15. Initial congestion window
      # ip route change 10.16.16.0/24 dev eth0 proto kernel scope link initcwnd 16
      $ ip route list
      default via 10.16.16.1 dev eth0
      10.16.16.0/24 dev eth0 proto kernel scope link initcwnd 16
      169.254.169.254 dev eth0 scope link
      Initial window: 16 segments of 1448 bytes (4 shown + 12) = 23168 bytes
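The byte counts on slides 14 and 15 are simply the initial congestion window in segments multiplied by the 1448-byte segment size. The sketch below reproduces that arithmetic and adds a rough round-trip estimate. The round-trip model is an assumption of this example, not something from the talk: it idealizes slow start as doubling the window every RTT with no loss and no receive-window limit, and the 100,000-byte object size is made up for illustration.

```python
import math

MSS_BYTES = 1448  # segment payload size shown on the slides


def initial_window_bytes(initcwnd_segments: int, mss_bytes: int = MSS_BYTES) -> int:
    """Bytes the sender may transmit before the first ACK arrives."""
    return initcwnd_segments * mss_bytes


def round_trips_to_send(object_bytes: int, initcwnd_segments: int,
                        mss_bytes: int = MSS_BYTES) -> int:
    """Idealized slow start: the window doubles each RTT, with no loss."""
    segments_left = math.ceil(object_bytes / mss_bytes)
    cwnd = initcwnd_segments
    rounds = 0
    while segments_left > 0:
        segments_left -= cwnd  # send one full window this round trip
        cwnd *= 2              # slow start doubles the window
        rounds += 1
    return rounds


for initcwnd in (3, 16):
    print(initcwnd, initial_window_bytes(initcwnd),
          round_trips_to_send(100_000, initcwnd))
```

Under these assumptions a larger initial window mostly helps short transfers, where the connection ends before slow start has opened the window.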
  • 16. Impact of loss on TCP throughput (chart: y-axis 0 to 100, x-axis loss rate 0% to 10%)
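The chart data did not survive in this transcript. As a rough stand-in, a widely cited approximation for loss-based (Reno-style) TCP, usually attributed to Mathis et al., estimates steady-state throughput as (MSS / RTT) × C / √p with C ≈ 1.22. This is not necessarily the model behind the slide's chart, and the 80 ms RTT used below is borrowed from a later test setup purely as an example.

```python
import math


def mathis_throughput_bps(mss_bytes: int, rtt_seconds: float, loss_rate: float,
                          c: float = math.sqrt(3 / 2)) -> float:
    """Approximate steady-state throughput of a Reno-style flow under random loss."""
    if not 0 < loss_rate <= 1:
        raise ValueError("loss_rate must be in (0, 1]")
    return (mss_bytes * 8 / rtt_seconds) * c / math.sqrt(loss_rate)


# Example inputs (assumptions): 1448-byte segments, 80 ms RTT.
for loss in (0.005, 0.01, 0.02, 0.05, 0.10):
    mbps = mathis_throughput_bps(1448, 0.080, loss) / 1e6
    print(f"loss {loss:>5.1%}: about {mbps:.2f} Mbps per flow")
```

The inverse square-root dependence means that quadrupling the loss rate halves the estimated throughput, which matches the general shape one expects from such a chart.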
  • 17. Loss is visible as TCP retransmissions
      $ netstat -s | grep retransmit
      58496 segments retransmitted
      52788 fast retransmits
      135 forward retransmits
      3659 retransmits in slow start
      392 SACK retransmits failed
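For monitoring, the counters above can be parsed into numbers. The following is a minimal sketch that assumes the "count description" line format shown on the slide; other netstat versions print some counters differently (for example as "Name: value"), and those lines would be skipped here. The function name is hypothetical.

```python
import re

# Output copied from the slide.
SAMPLE = """\
    58496 segments retransmitted
    52788 fast retransmits
    135 forward retransmits
    3659 retransmits in slow start
    392 SACK retransmits failed
"""


def parse_retransmit_counters(text: str) -> dict:
    """Map each 'N description' line mentioning retransmits to its integer count."""
    counters = {}
    for line in text.splitlines():
        match = re.match(r"\s*(\d+)\s+(.*\S)", line)
        if match and "retransmit" in match.group(2):
            counters[match.group(2)] = int(match.group(1))
    return counters


counters = parse_retransmit_counters(SAMPLE)
fast_share = counters["fast retransmits"] / counters["segments retransmitted"]
print(counters)
print(f"fast retransmits as a share of all retransmitted segments: {fast_share:.1%}")
```

Sampling these counters periodically and looking at the difference between samples gives a retransmission rate rather than a lifetime total.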
  • 18–23. Socket level diagnostic
      $ ss -ite
      State Recv-Q Send-Q Local Address:Port Peer Address:Port
      ESTAB 0 3829960 10.16.16.18:https 10.16.16.75:52008
        timer:(on,012ms,0) uid:498 ino:7116021 sk:0001c286 <->
        ts sack cubic wscale:7,7 rto:204 rtt:1.423/0.14 ato:40 mss:1448 cwnd:138 ssthresh:80 send 1123.4Mbps unacked:138 retrans:0/11737 rcv_space:26847
      Fields highlighted across these slides:
      TCP state: ESTAB
      Bytes queued for transmission: Send-Q 3829960
      Congestion control algorithm: cubic
      Retransmission timeout: rto:204
      Congestion window: cwnd:138
      Retransmissions: retrans:0/11737
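The fields in the `ss` output are related to one another: the reported send rate is approximately cwnd × mss × 8 / rtt. The sketch below parses the key:value pairs from the info line on the slide and recomputes that rate. The parser is a simplified illustration written for this one line, not a general `ss` parser, and the function names are hypothetical.

```python
import re

# Info line copied from the slide.
SS_INFO = ("ts sack cubic wscale:7,7 rto:204 rtt:1.423/0.14 ato:40 mss:1448 "
           "cwnd:138 ssthresh:80 send 1123.4Mbps unacked:138 "
           "retrans:0/11737 rcv_space:26847")


def parse_ss_info(info: str) -> dict:
    """Extract a few numeric fields from the key:value pairs of an ss info line."""
    fields = dict(re.findall(r"(\w+):(\S+)", info))
    rtt_ms, rtt_var_ms = (float(x) for x in fields["rtt"].split("/"))
    return {
        "rto_ms": float(fields["rto"]),
        "rtt_ms": rtt_ms,
        "rtt_var_ms": rtt_var_ms,
        "mss": int(fields["mss"]),
        "cwnd": int(fields["cwnd"]),
        # retrans is shown as <currently outstanding>/<total for the connection>
        "total_retrans": int(fields["retrans"].split("/")[1]),
    }


def estimated_send_rate_mbps(cwnd: int, mss: int, rtt_ms: float) -> float:
    """One congestion window of data per round trip, expressed in Mbps."""
    return cwnd * mss * 8 / (rtt_ms / 1000) / 1e6


info = parse_ss_info(SS_INFO)
rate = estimated_send_rate_mbps(info["cwnd"], info["mss"], info["rtt_ms"])
print(info)
print(f"estimated send rate: {rate:.1f} Mbps")
```

With the slide's values (cwnd 138, mss 1448, rtt 1.423 ms) the estimate agrees with the `send 1123.4Mbps` figure that `ss` prints.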
  • 24. Monitoring retransmissions in real time. Observable using Linux kernel tracing.
      # tcpretrans
      TIME PID LADDR:LPORT -- RADDR:RPORT STATE
      03:31:07 106588 10.16.16.18:443 R> 10.16.16.75:52291 ESTABLISHED
      https://github.com/brendangregg/perf-tools/
  • 25. Congestion control algorithm (diagram: Jack and Jill)
  • 26. Congestion control algorithms in Linux: New Reno (pre-2.6.8), BIC (2.6.8–2.6.18), CUBIC (2.6.19+). Pluggable architecture; other algorithms are often available: BBR, Vegas, Illinois, Westwood, Highspeed, Scalable.
  • 27. Tuning congestion control algorithm
      $ sysctl net.ipv4.tcp_available_congestion_control
      net.ipv4.tcp_available_congestion_control = cubic reno
      $ find /lib/modules -name 'tcp_*'
      […]
      # modprobe tcp_illinois
      $ sysctl net.ipv4.tcp_available_congestion_control
      net.ipv4.tcp_available_congestion_control = cubic reno illinois
  • 28. Tuning congestion control algorithm
      # sysctl net.ipv4.tcp_congestion_control=illinois
      net.ipv4.tcp_congestion_control = illinois
      # echo "net.ipv4.tcp_congestion_control = illinois" > /etc/sysctl.d/01-tcp.conf
      [Restart network processes]
  • 29. TCP-BBR: available in Linux 4.9. Uses pacing and active probing to estimate bandwidth and RTT. Starting in 4.13, fq is no longer required.
      # modprobe sch_fq
      # modprobe tcp_bbr
      # sysctl net.core.default_qdisc=fq
      # sysctl net.ipv4.tcp_congestion_control=bbr
  • 30. Retransmission timer: an input to when the congestion control algorithm considers a packet lost. Too low: spurious retransmissions; congestion control can over-react and be slow to re-open the congestion window. Too high: increased latency while the algorithm determines that a packet is lost and retransmits it.
  • 31. Tuning retransmission timer minimum. Default minimum: 200 ms.
      # ip route list
      default via 10.16.16.1 dev eth0
      10.16.16.0/24 dev eth0 proto kernel scope link
      169.254.169.254 dev eth0 scope link
      The 10.16.16.0/24 entry is the route to other instances in our subnet (same AZ).
  • 32. Tuning retransmission timer minimum
      # ip route list
      default via 10.16.16.1 dev eth0
      10.16.16.0/24 dev eth0 proto kernel scope link
      169.254.169.254 dev eth0 scope link
      # ip route change 10.16.16.0/24 dev eth0 proto kernel scope link rto_min 50ms
      # ip route list
      default via 10.16.16.1 dev eth0
      10.16.16.0/24 dev eth0 proto kernel scope link rto_min lock 50ms
      169.254.169.254 dev eth0 scope link
  • 33. Queueing along the network path (diagram: Jack and Jill)
  • 34. Queueing along the network path: intermediate routers along a path have interface buffers. High load leads to more packets in the buffers. Latency increases due to queue time, which can trigger retransmission timeouts.
  • 35. Active queue management
      $ tc qdisc list
      qdisc mq 0: dev eth0 root
      qdisc pfifo_fast 0: dev eth0 parent :1 bands 3 […]
      qdisc pfifo_fast 0: dev eth0 parent :2 bands 3 […]
      # tc qdisc add dev eth0 root fq_codel
      qdisc fq_codel 8006: dev eth0 root refcnt 9 limit 10240p flows 1024 quantum 9015 target 5.0ms interval 100.0ms ecn
      www.bufferbloat.net/projects/codel/wiki
  • 36. Maximum transmission unit: 3.47% overhead with a 1448 B payload versus 0.58% overhead with an 8949 B payload. Improvement seen among instances in your VPC.
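The overhead percentages on slide 36 can be reproduced by dividing the per-packet header bytes by the MTU. The sketch below assumes 52 header bytes per packet (20 for IPv4, 20 for TCP, 12 for the TCP timestamp option). That breakdown is an inference from the 1448-byte and 8949-byte payloads on the slide, which correspond to MTUs of 1500 and 9001; the talk does not spell it out.

```python
# Assumed header breakdown: IPv4 + TCP + TCP timestamp option.
HEADER_BYTES = 20 + 20 + 12


def header_overhead(mtu_bytes: int, header_bytes: int = HEADER_BYTES):
    """Return (payload bytes per packet, header share of the MTU)."""
    payload = mtu_bytes - header_bytes
    return payload, header_bytes / mtu_bytes


for mtu in (1500, 9001):
    payload, overhead = header_overhead(mtu)
    print(f"MTU {mtu}: {payload} B payload, {overhead:.2%} overhead")
```

Beyond the smaller header share, a larger MTU also means fewer packets for the same amount of data, so there is less per-packet processing on both hosts.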
  • 37. Tuning maximum transmission unit
      # ip link list
      2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc mq state UP mode DEFAULT group default qlen 1000
          link/ether 06:f1:b7:e1:3b:e7
      # ip route list
      default via 10.16.16.1 dev eth0
      10.16.16.0/24 dev eth0 proto kernel scope link
      169.254.169.254 dev eth0 scope link
  • 38. Tuning maximum transmission unit
      # ip route change default via 10.16.16.1 dev eth0 mtu 1500
      # ip route list
      default via 10.16.16.1 dev eth0 mtu 1500
      10.16.16.0/24 dev eth0 proto kernel scope link
      169.254.169.254 dev eth0 scope link
  • 39. Amazon EC2 enhanced networking
  • 40. Amazon EC2 enhanced networking (diagram: each side has Xen-PV, a virtualization layer, and a HW NIC)
  • 41. Amazon EC2 enhanced networking (diagram: each side has a virtualization layer and an Intel 82599 HW NIC exposing a VF; 10 Gbps)
  • 42. Amazon EC2 Elastic Network Adapter (diagram: each side has a virtualization layer and an ENA exposing a VF; 20 Gbps, 25 Gbps)
  • 43. Verifying ENA is enabled
      PV-XEN:
      $ ethtool -i eth0
      driver: vif
      Enhanced Networking (C3, C4, D2, I2, R3, M4 except m4.16xlarge):
      $ ethtool -i eth0
      driver: ixgbevf
      Elastic Network Adapter (F1, G3, I3, P2, P3, R4, X1, m4.16xlarge):
      $ ethtool -i eth0
      driver: ena
      https://github.com/amzn/amzn-drivers
  • 44. Applying our new knowledge
  • 45. Test setup: m4.16xlarge instances (Jack and Jill); Amazon Linux 2017.09 (kernel 4.9.51-10.52.amzn1); web server: Nginx 1.12.1; client: ApacheBench 2.3; TLSv1.2,ECDHE-RSA-AES256-GCM-SHA384,2048,256; transferring incompressible data (random bits); origin data stored in tmpfs (RAM based; no server disk I/O); data discarded once retrieved (no client disk I/O).
  • 46. Application 1: HTTPS with intermediate network loss (diagram: Jack and Jill, 0.5% loss)
  • 47. Test setup: 1 test server instance, 1 test client instance; 80 ms RTT; 80 parallel clients retrieving a 100 MB object.
      $ ab -n 1600 -c 80 https://server/100m
      Simulated packet loss:
      # tc qdisc add dev eth0 root netem loss 0.5%
      Goal: minimize throughput impact with 0.5% loss
  • 48. Results, application 1 (chart). Defaults: 23.2 s, 37.6 s. Defaults w/0.5% loss: 42.8 s, 52.3 s.
  • 49. Results, application 1 (chart). Cubic w/0.5% loss: 42.8 s, 52.3 s. Illinois w/0.5% loss: 20.7 s, 41.5 s.
  • 50. Results, application 1 (chart). Cubic w/0.5% loss: 42.8 s, 52.3 s. BBR w/0.5% loss: 11.1 s, 38.3 s. 74% decrease!
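The "74% decrease" headline appears to correspond to the change from 42.8 s (Cubic with 0.5% loss) to 11.1 s (BBR with 0.5% loss). Which pair of chart values the slide compares is an inference from the numbers, since the chart labels are not preserved. A small check, with a hypothetical helper name:

```python
def percent_change(before: float, after: float) -> float:
    """Signed percentage change from `before` to `after` (negative means a decrease)."""
    return (after - before) / before * 100


cubic_with_loss_s = 42.8  # from the slide
bbr_with_loss_s = 11.1    # from the slide

change = percent_change(cubic_with_loss_s, bbr_with_loss_s)
print(f"Cubic -> BBR under 0.5% loss: {change:.0f}%")
```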
  • 51. Results, application 1 (chart). BBR no loss: 8.8 s, 44.7 s. BBR w/0.5% loss: 11.1 s, 38.3 s.
  • 52. Results, application 1 (chart). BBR no loss: 8.8 s, 44.7 s. BBR w/0.5% loss: 11.1 s, 38.3 s. Cubic no loss: 23.2 s, 37.6 s.
  • 53. Application 2: data transfer; low RTT path (diagram: Jack and Jill)
  • 54. Test setup: 1 test server instance, 1 test client instance; 1 ms RTT; 8 parallel clients retrieving a 10 MB object.
      $ ab -n 100000 -c 8 https://server/10m
      Start at default RTO, then decrease.
      Goal: minimize latency at high percentiles with 0.2% loss
  • 55. Results, application 2 (chart). p50: 2 ms. p99.99: 200 ms.
  • 56. Results, application 2. RTO:200 p99.99 latency: 200 ms. RTO:50 p99.99 latency: 100 ms. 50% decrease!
  • 57. Application 3: high transaction rate HTTP service (diagram: Jack and Jill)
  • 58. Test setup: 1 test server instance, 1 test client instance; 80 ms RTT; HTTP, not HTTPS; 1500 MTU; 200k requests for a 10k object.
      $ ab -n 200000 -c 200 http://server/10k
      Goal: minimize latency
  • 59. Results, application 3
      Test                                    P50 latency   Avg BW
      Initial congestion window: 3 packets    321 ms        12.550 Mbps
      Initial congestion window: 10 packets   241 ms        16.765 Mbps
      Initial congestion window: 16 packets   161 ms        22.518 Mbps
      79% increase!
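One way to read the application 3 table is to express each P50 latency in units of the 80 ms round-trip time from the test setup. The sketch below does that and also recomputes the bandwidth gain. The interpretation that each step in initial congestion window saves about one round trip is an observation from the numbers, not a claim made on the slides.

```python
RTT_MS = 80  # from the application 3 test setup

# (initial congestion window in packets, P50 latency in ms, average bandwidth in Mbps)
RESULTS = [
    (3, 321, 12.550),
    (10, 241, 16.765),
    (16, 161, 22.518),
]


def latency_in_round_trips(latency_ms: float, rtt_ms: float = RTT_MS) -> float:
    """Express a latency as a multiple of the path round-trip time."""
    return latency_ms / rtt_ms


def gain_percent(baseline: float, tuned: float) -> float:
    """Percentage increase of `tuned` over `baseline`."""
    return (tuned - baseline) / baseline * 100


for initcwnd, p50_ms, bw_mbps in RESULTS:
    rtts = latency_in_round_trips(p50_ms)
    print(f"initcwnd {initcwnd:>2}: P50 {p50_ms} ms is about {rtts:.1f} RTTs, {bw_mbps} Mbps")

print(f"bandwidth gain, 3 -> 16 packets: {gain_percent(RESULTS[0][2], RESULTS[-1][2]):.0f}%")
```

The P50 latencies land close to whole multiples of the RTT, which fits the idea that short HTTP responses are bounded by round trips rather than by link bandwidth.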
  • 60. Takeaways
  • 61. Takeaways: The network doesn't have to be a black box; Linux tools can be used to interrogate and understand it. Simple tweaks to settings can dramatically increase performance: test, measure, change. Understand what your application needs from the network, and tune accordingly.
  • 62–63. Thank you!