Intel's trusted execution environment, SGX, offers an attractive way to protect private data in a public cloud, even in the presence of a malicious OS or VMM.
In this talk, we will:
* explore how SGX mitigates various attack surfaces, and the caveats of naively using the technology to protect applications,
* discuss the performance implications of SGX on common applications and the new bottlenecks it creates, which may lead to a 5x performance degradation,
* describe an optimized SGX interface, HotCalls, that provides a 13-27x speedup over the built-in mechanism supplied by the SGX SDK,
* discuss how the OS can manage secure memory without having access to it,
* explore published attacks that require collusion with the OS, specifically page-fault and page-fault-less “controlled channel attacks” and branch-shadowing attacks, along with potential mitigations.
Ofir Weisse is a PhD student and researcher at the University of Michigan.
Video available at: https://www.youtube.com/watch?v=I3TCctdnOEc
SGX Trusted Execution Environment
1. SGX Trusted Execution Environment
Linux Kernel Meetup
Tel-Aviv, May 10, 2018
Ofir Weisse
2. Cloud Computing Attack Surface
Service Hosting
Medical Records
Intellectual Property
Private Data
3. Cloud Computing Attack Surface
To lower costs, computation and storage are moved to third-party machines
This implies trust
Cloud provider employees
OS
Virtualization
Software
SMM code (firmware)
Hardware
The attack surface is large
4. SGX Secure Execution
What is the impact on the application’s overall performance?
What creates the bottlenecks? Can we alleviate them?
How can the kernel attack SGX?
How can we defend against a malicious kernel?
Authenticated code
Malicious environment
Is it practical?
Chart: throughput degradation, No SGX vs. With SGX
5. Outline
Part 1 – Performance Optimization
Intel SGX background
Measuring SGX performance bottlenecks
Improving SGX performance with HotCalls
Part 2 – Attacks on SGX
6. SGX in a nutshell
User Space
Enclave
OS Kernel
VMM
SMM
HW: RAM, CPU
Remote Client
Attestation
7. SGX – Memory Organization
Physical Memory
Enclave Page Cache (EPC)
EPC metadata
Encrypted by the Memory Encryption Engine (MEE)
8. SGX – Memory Organization
Physical Memory
Enclave Page Cache (EPC)
EPC metadata
No roll-back
9. SGX Encrypted Memory Management
Virtual address space (>4GB): code, data
Physical memory: Enclave Page Cache (EPC)
10. SGX Instructions
Supervisor Instructions (Ring 0)
ECREATE
EADD – copy to EPC
EEXTEND – add to the SHA-256 measurement
EINIT
EDBGRD
EDBGWR
EWB – evict from EPC
ELD (ELDB/ELDU) – load to EPC
ETRACK
User Instructions (Ring 3)
EENTER
EEXIT
ERESUME
EGETKEY
EREPORT
11. SGX Encrypted Memory Management
Virtual address space (>4GB): code, data
Physical memory: Enclave Page Cache (EPC)
EADD
12. SGX - Secure Enclave Life-cycle
Application memory address space
Plaintext Shared Memory
Enclave – Trusted Code (Encrypted Memory)
• Can access all memory
• No access to system calls
Application – Untrusted Code
• Can call system API functions (send, fread, etc.)
ecall
ocall
External Verifier
SGX operations may become a bottleneck
13. Outline
Part 1 – Performance Optimization
Intel SGX background
Measuring SGX performance bottlenecks
Improving SGX performance with HotCalls
Part 2 – Attacks on SGX
14. What are the Potential Bottlenecks?
Accessing encrypted memory
Read
Write
Control transfers
Ecalls (EENTER+EEXIT)
Ocalls (EEXIT+ERESUME)
SDK inefficiencies
15. Cost of Accessing Encrypted Memory
Chart: Write Latency, Read Latency
Chart labels: 102% overhead; 6% overhead; (Cache-miss: 30%); (Cache-miss: 20%)
Encrypted memory is a potential bottleneck
16. Cost of Secure Context Switch
Ecalls: SDK code, EENTER, EEXIT
Ocalls: SDK code, EEXIT, ERESUME
Gathering required enclave information
Defensive checks of pointers
Legal destination
No overlaps
18. Context Switch in Perspective
Call Type | Cycles | Relative cost | Source
Linux System Call | 150 | x1 | OSDI 2010
KVM Hypercall | 1,300 | x8 | ISCA 2016
SGX calls (warm cache) | 8,600 | x57 |
SGX calls (cold cache, median) | 14,100 | x94 |
SGX calls (cold cache, top 5%) | 16,000 | x106 |

#Calls per second | Cycles overhead @ 4GHz
10,000 | 2.15%
50,000 | 10.75%
100,000 | 21.5%
200,000 | 43%

Real Applications
Application | # Calls/second | Core spending
Memcached | 200,000 | 43%
OpenVPN | 275,000 | 57%
Lighttpd | 270,000 | 56%
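The overhead percentages in the calls-per-second table follow from simple arithmetic. Here is a minimal sketch, assuming the warm-cache cost of 8,600 cycles per SGX call and a 4 GHz core as stated on the slide. The function and constant names are illustrative and do not come from the talk.

```python
# Back-of-the-envelope estimate of the core time lost to SGX call transitions.
# Assumptions taken from the slide: 8,600 cycles per warm-cache call, 4 GHz core.
SGX_CALL_CYCLES = 8_600
CORE_HZ = 4_000_000_000


def call_overhead_percent(calls_per_second,
                          cycles_per_call=SGX_CALL_CYCLES,
                          core_hz=CORE_HZ):
    """Return the percentage of one core's cycles spent on call transitions."""
    return 100.0 * calls_per_second * cycles_per_call / core_hz


for rate in (10_000, 50_000, 100_000, 200_000):
    print(f"{rate:>7,} calls/s -> {call_overhead_percent(rate):.2f}% of a core")
```

The per-application figures on the slides are the speaker's own estimates and need not match this uniform formula exactly.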
19. Outline
Part 1 – Performance Optimization
Intel SGX background
Measuring SGX performance bottlenecks
Improving SGX performance with HotCalls
Part 2 – Attacks on SGX
20. HotCalls – New Calling Interface
HotCalls: Fast Ecalls, Fast Ocalls
Enclave, Application
Properties:
Not dependent on OS mutexes, semaphores or signals
Maintains security properties of SGX
No context switch
21. HotCalls – New Calling Interface
HotCalls: Fast Ecalls, Fast Ocalls
Requester, Responder
Additional thread
Shared memory fields: Spinlock, void *data, call_ID, Go | Done
No context switch
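The requester/responder exchange through shared memory can be sketched as follows. This is only an illustration of the protocol under stated assumptions, not the HotCalls implementation: a `threading.Lock` stands in for the spinlock, Python threads stand in for the enclave and application sides, and every name here (`HotCall`, `responder`, `hot_call`, `call_table`) is hypothetical.

```python
import threading
import time


class HotCall:
    """Shared-memory mailbox: lock, call_id, data, and a Go/Done flag."""

    def __init__(self):
        self.lock = threading.Lock()  # stand-in for the slide's spinlock
        self.call_id = None
        self.data = None
        self.go = False               # True = "Go" (request pending), False = "Done"
        self.stop = False


def responder(hc, call_table):
    """Runs on an additional thread and polls the mailbox for requests."""
    while True:
        with hc.lock:
            if hc.stop:
                return
            if hc.go:
                hc.data = call_table[hc.call_id](hc.data)  # result written in place
                hc.go = False                              # signal "Done"
        time.sleep(0)  # yield to other threads; a native responder would busy-wait


def hot_call(hc, call_id, data):
    """Requester side: publish the request, then poll until it is done."""
    with hc.lock:
        hc.call_id, hc.data, hc.go = call_id, data, True
    while True:
        with hc.lock:
            if not hc.go:
                return hc.data
        time.sleep(0)


def shutdown(hc):
    with hc.lock:
        hc.stop = True


call_table = {0: lambda x: x + 1, 1: lambda s: s.upper()}  # hypothetical call IDs
hc = HotCall()
worker = threading.Thread(target=responder, args=(hc, call_table))
worker.start()
results = [hot_call(hc, 0, 41), hot_call(hc, 1, "ocall")]
shutdown(hc)
worker.join()
print(results)
```

The sketch supports a single requester at a time. The point it illustrates is that both sides only read and write shared memory, so no EENTER/EEXIT transition is needed per call.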
23. HotCalls in Practice
OS Calls
- Porting strategy similar to Haven & SCONE
- Developed an SGX porting framework to automate the process
24. Lost Cycles Estimation
Application | Frequent Calls (calls x1000 / second) | Total Calls | Core Time
Memcached | read(66.5), sendmsg(66.5), RunEnclaveFunction(66.5) | 200K | 42%
OpenVPN | poll(87), time(87), getpid(13.6), write(30), recvfrom(30), read(13.6), sendto(13.6) | 275K | 57%
Lighttpd | read(49), fcntl(25), epoll_ctl(25), close(25), setsockopt(25), __fxstat64(25), inet_ntop(12), accept(12), inet_addr(12), ioctl(12), __open64_2(12), sendfile64(12), shutdown(12), writev(12) | 270K | 56%

#Calls per second | Core overhead
10,000 | 2.15%
50,000 | 10.75%
100,000 | 21.5%
200,000 | 43%

Context switches consume up to 57% of the cycles
26. Part 1 Conclusion
Naively porting applications may derail performance
Memory access may be expensive
Interaction with the OS may be costly
Can optimize performance with HotCalls
Request latency is reduced by up to 13X
Throughput can be boosted to near-native performance
27. Attacks on SGX
Controlled Channel Attacks – with Page Faults
Controlled Channel Attacks – with Page Table Side Channels
Branch Shadowing
Defense Mechanisms
28. Virtual to Physical Mapping 101
Virtual Memory: code, data
Physical memory
Enclave Page Cache (EPC)
The OS can induce a page fault on every memory access
29. Controlled Channel Attack (The Original)
Source: “Controlled Channel Attacks”, IEEE S&P 2015
30. Controlled Channel Attack (The Original)
Image Source: “Controlled Channel Attacks”, IEEE S&P 2015
31. Controlled Channel Attack
Image Source: “Controlled Channel Attacks”, IEEE S&P 2015
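To make the page-fault channel concrete, here is a toy simulation of how a sequence of faulting pages can leak secret-dependent control flow. It is a sketch under assumptions, not the attack from the cited paper: the addresses, the two-handler victim layout, and all function names are hypothetical, and the OS is modeled as simply seeing the page number of every access.

```python
PAGE_SHIFT = 12  # 4 KiB pages

# Hypothetical victim layout: two handlers that live on different pages.
HANDLE_ZERO = 0x401000
HANDLE_ONE = 0x402000


def victim(secret_bits):
    """Yield the addresses the victim touches; control flow depends on the secret."""
    for bit in secret_bits:
        yield HANDLE_ONE + 0x10 if bit else HANDLE_ZERO + 0x20


def os_observed_pages(accesses):
    """What a fault-inducing OS learns: the page number, not the in-page offset."""
    return [addr >> PAGE_SHIFT for addr in accesses]


def recover_bits(pages):
    """Map each observed page back to the secret bit that selected it."""
    return [1 if page == HANDLE_ONE >> PAGE_SHIFT else 0 for page in pages]


secret = [1, 0, 1, 1, 0, 0, 1]
trace = os_observed_pages(victim(secret))
recovered = recover_bits(trace)
print(recovered == secret)
```

The in-page offsets never reach the attacker in this model, yet the page sequence alone is enough to recover the secret whenever different secret values touch different pages.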
32. Source: “Telling Your Secrets Without Page Faults”, USENIX Security 2017
33. Controlled Channel Attack Without Page Faults
Dirty bits in PTEs
Cache side channels
34. Attacks on SGX
Controlled Channel Attacks – with Page Faults
Controlled Channel Attacks – with Page Table Side Channels
Branch Shadowing
Defense Mechanisms
35. Branch Shadowing Attacks
Branch prediction and BTB
Source: “Inferring Fine-grained Control Flow Inside SGX Enclaves with Branch Shadowing”, USENIX Security 2017
36. Last Branch Record (LBR)
A trace of all recently taken branches and branch mispredictions
Alas, LBR is disabled when SGX enclaves are executing
37. Branch Prediction and Branch Target Buffer (BTB)
Current RIP [Bits 31:0] | Taken/Not-taken | Predicted Destination
0x7FADEA1050DE0000 | Taken | 0x7FADEA1050DE0300
0x7FADEA1050DE0100 | Taken | 0x7FADEA1050DE0200
0x7FADEA1050DE0200 | Not Taken | 0x7FADEA1050DE0100
0x7FADEA1050DE0300 | Not Taken | 0x7FADEA1050DE0500
What happens if RIP=0x0000000050DE0000?
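The question on the slide can be explored with a toy model. This is a sketch that assumes, as the column header suggests, that the BTB is indexed only by bits 31:0 of the instruction pointer. The dictionary-based BTB and the names `btb_tag` and `shadow_rip` are illustrative and do not describe real hardware structures.

```python
BTB_TAG_MASK = 0xFFFFFFFF  # bits 31:0, per the slide's column header


def btb_tag(rip):
    """Return the part of the instruction pointer the BTB is assumed to use."""
    return rip & BTB_TAG_MASK


# Toy BTB keyed by the truncated RIP, filled with the rows from the slide.
btb = {}
for rip, taken, target in [
    (0x7FADEA1050DE0000, True, 0x7FADEA1050DE0300),
    (0x7FADEA1050DE0100, True, 0x7FADEA1050DE0200),
    (0x7FADEA1050DE0200, False, 0x7FADEA1050DE0100),
    (0x7FADEA1050DE0300, False, 0x7FADEA1050DE0500),
]:
    btb[btb_tag(rip)] = (taken, target)

shadow_rip = 0x0000000050DE0000  # address whose low 32 bits match the first row
entry = btb.get(btb_tag(shadow_rip))
print(hex(btb_tag(shadow_rip)), entry)
```

Under this assumption the two addresses share one BTB entry, so a branch placed at the colliding address is predicted using the history left by the first row's branch.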
39. Single step(ish)
How can we pause execution after every branch?
A clock interrupt causes an Asynchronous Exit (AEX)
Then single-step the branch shadow code
Observe LBR
40. Branch Shadowing Results
66% of the bits of a 1024-bit RSA key were recovered in a single run
With 10 runs – all the bits were recovered
41. Defense Mechanisms
T-SGX – using Transactional Synchronization Extensions (TSX)
SGX-Shield – ASLR for SGX
Racing in Hyperspace
42. TSX – Transactional Synchronization eXtensions
Computation performed in cache
Complete rollback upon abort
Faults are suppressed
Transaction
XBEGIN
XEND
TSX Abort
Transaction complete
44. T-SGX (2)
XBEGIN
XEND
TSX Abort
Transaction complete
Timer Interrupt -> TSX abort
Image Source: “Inferring Fine-grained Control Flow Inside SGX Enclaves with Branch Shadowing”, USENIX Security 2017
45. Conclusion
Secure execution alone is only the first step for secure systems
Performance impact may be prohibitive but can be optimized with HotCalls
Including the OS/VMM in the threat model presents new challenges
www.OfirWeisse.com
github.com/oweisse/hot-calls