
NBTC#2 - Why instrumentation is cooler than ice

A lightning introduction to coverage-guided fuzzing, the technology involved, and a quick tour of the core tools.



  1. Why instrumentation is cooler than ice (Alex Moneger)
  2. INTRODUCTION
  3. The myth • Fuzzing is easy • Fuzzing is simple • Instrumentation is left as an exercise to the reader
  4. The truth • Fuzzing requires effort • Generally requires adapting the target code • Usually requires building a corpus of inputs • Requires minimizing that corpus • Requires instrumentation: – Did my target crash? – On which input? – Are my new inputs useful?
  5. The hurdles • Tool selection • Tool integration • Reliability • Scale • A found bug prevents the fuzzer from reaching deeper areas of code
  6. INSTRUMENTATION
  7. Before • 2 approaches: – Mutate data forever (randomly, byte flips, …) – Model the data, mutate fields separately (Spike, Peach, Codenomicon, …) • Run for some number of iterations, or until all states are modeled • Hope for the best
  8. Today • Genetic algorithms => retain only the best inputs for further mutation 1. Mutate the best input 2. Send it to the target 3. Measure its impact based on some metric 4. Discard or prioritize the input, go back to 1.
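The four-step loop above can be sketched in a few lines of C. Everything here is a toy stand-in: `run_target` is a hypothetical "send to target and measure" function whose metric is trivially checkable, and mutation is a single random byte flip.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define INPUT_LEN 8

/* Toy stand-in for "send to target, measure impact": the score is the
 * number of leading 'A' bytes, so progress is easy to verify. */
static int run_target(const unsigned char *buf, size_t len) {
    int score = 0;
    for (size_t i = 0; i < len && buf[i] == 'A'; i++)
        score++;
    return score;
}

/* One round of the genetic loop: mutate the best input, run it,
 * keep the mutant only if the metric improved. */
static int evolve(unsigned char *best, size_t len, int best_score) {
    unsigned char mutant[INPUT_LEN];
    memcpy(mutant, best, len);
    mutant[rand() % len] ^= (unsigned char)(rand() & 0xff); /* byte flip */
    int score = run_target(mutant, len);
    if (score > best_score) {          /* prioritize: new best input */
        memcpy(best, mutant, len);
        return score;
    }
    return best_score;                 /* discard mutant, back to step 1 */
}

int fuzz_rounds(unsigned char *best, size_t len, int rounds) {
    int score = run_target(best, len);
    for (int i = 0; i < rounds; i++)
        score = evolve(best, len, score);
    return score;
}
```

A real fuzzer replaces `run_target` with actual execution plus a coverage metric (next slides), but the keep/discard skeleton stays the same.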
  9. Code coverage • Code coverage is the most widely used metric • Tells you whether an input triggered new code paths • All tools try to measure code coverage one way or another • Can be achieved via: – binary instrumentation (PIN, DynamoRIO) – static rewriting (Dyninst) – kernel probing (perf) – HW (Intel BTS => Branch Trace Store)
  10. How does it work • Model control flow using basic blocks • Discard unconditional edges (JMPs) • First approach: trace the callgraph • But 2 callgraphs are hard to compare • Best approach: retain edge counts • Provides an unordered code coverage heatmap
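One well-known way to retain edge counts is AFL's: each basic block gets an ID, and every edge (prev, cur) increments a bucket in a small map indexed by a hash of the two IDs. A minimal sketch (map size and block IDs are made up for illustration; a real tool would inject `cov_hit` at every block via PIN, Dyninst, or compile-time instrumentation):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAP_SIZE 65536

static uint8_t edge_map[MAP_SIZE];   /* the unordered coverage heatmap */
static uint16_t prev_loc;

/* Called at the start of every basic block; cur is the block's ID.
 * XOR-ing with the previous block's (shifted) ID makes the bucket
 * depend on the edge taken, not just the block reached. */
static void cov_hit(uint16_t cur) {
    edge_map[(uint32_t)(cur ^ prev_loc) % MAP_SIZE]++;
    prev_loc = cur >> 1;   /* shift so edge A->B hashes differently from B->A */
}

static void cov_reset(void) {
    memset(edge_map, 0, sizeof(edge_map));
    prev_loc = 0;
}
```

The shift trick is the one AFL documents in its technical notes; without it, A->B and B->A would land in the same bucket.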
  11. 11. Example callgraph
  12. Compare code coverage maps? • Gained edges - lost edges > 0? • Simple, but will crush path divergence • Solution: keep track of interesting diverging paths • When there are no new edges, check edge hitcounts • A higher hitcount means you control a loop boundary
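A sketch of such a comparison, assuming coverage lives in a per-run array of edge hitcounts: an input is kept if it reaches an edge the current best never hit, or, failing that, if it pushes some known edge's hitcount higher (suggesting control over a loop bound).

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Returns 1 if `cur` is interesting relative to `best`:
 * it hits an edge `best` never hit, or hits a known edge more often. */
int map_is_interesting(const uint8_t *cur, const uint8_t *best, size_t n) {
    int more_hits = 0;
    for (size_t i = 0; i < n; i++) {
        if (cur[i] && !best[i])
            return 1;          /* new edge: always keep */
        if (cur[i] > best[i])
            more_hits = 1;     /* same path, but deeper into a loop */
    }
    return more_hits;
}
```

A naive "gained minus lost" score would discard an input that trades one path for another; checking per-edge lets diverging paths survive.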
  13. CORPUS MINIMIZATION
  14. Corpus minimization • You have collected all the xml documents or IM packets from the internet • What is the minimal set of inputs that achieves maximal code coverage? • Open all inputs and record code coverage • Keep only the valuable inputs
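Building a minset is essentially greedy set cover: repeatedly keep the input that covers the most not-yet-covered edges. A toy sketch, with each input's coverage as a bitmap (the real coverage data would come from one of the tracing tools above; `__builtin_popcount` is a GCC/Clang builtin):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Greedy minset: keep picking the input that adds the most new edges.
 * cov[i] is the edge bitmap of input i; keep[i] is set to 1 if kept.
 * Returns the number of inputs kept. */
int minset(const uint32_t *cov, int n, int *keep) {
    uint32_t covered = 0;
    memset(keep, 0, n * sizeof(int));
    for (;;) {
        int best = -1, best_gain = 0;
        for (int i = 0; i < n; i++) {
            if (keep[i]) continue;
            int gain = __builtin_popcount(cov[i] & ~covered);
            if (gain > best_gain) { best_gain = gain; best = i; }
        }
        if (best < 0) break;        /* no remaining input adds coverage */
        keep[best] = 1;
        covered |= cov[best];
    }
    int kept = 0;
    for (int i = 0; i < n; i++) kept += keep[i];
    return kept;
}
```

Inputs whose edges are fully subsumed by already-kept ones are dropped, which is exactly the "keep only valuable inputs" step.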
  15. In practice • No open source tools to achieve this • Notable exception: with source, on *nix, for file inputs => afl-cmin to the rescue • Otherwise, a good base is runtracer, drcov, or the coco.cpp pintool • Building the minset is up to you after that
  16. WHAT NOW
  17. An application • You want to fuzz an application/library • What next?
  18. A few obvious questions first • Do you have source code? • Where does it take input from? – Network – File – … • Do you already have valid inputs? – Packets – PDFs – …
  19. First of all • Turn on coredumps • Throw whatever you have at the binary • dd if=/dev/urandom bs=1024 count=1 | nc localhost 1234 • Or mutate some corpus inputs with radamsa • Keep the CPU busy whilst you figure out a plan • Now think
  20. You have source code • Find a way to get it to work with American Fuzzy Lop (AFL) • AFL comes "batteries included" • AFL works great: – File input – Amazing performance/reliability (forkserver) – Instrumentation/stats built in (ASM instrumentation) – Scaling (distributed fuzzing) • Limitations: – Network fuzzing – Any form of daemon
  21. Wrapping for AFL • If the target reads from stdin or argv, you're good • Otherwise, write a wrapper around your target functions • read_from_stdin(char *buf) { target_func(buf); exit(0); } • Problem: complex when functions are tightly coupled (globals, complex structs, …)
  22. No source? • Things start to get messy • Options: – afl-qemu – afl-pin – afl-dyninst – Honggfuzz (Linux, or requires HW support) – …
  23. Mo problem • The idea is always the same • Through instrumentation, get code coverage info • Bind it somehow to AFL: – afl-qemu => use QEMU userland emulation to hook BBLs – afl-pin => use PIN to hook BBLs, no forkserver support – afl-dyninst => static rewriting to hook BBLs
  24. TODAY'S GAPS
  25. Gaps • Smart fuzzing of network daemons • Corpus minimization • Windows support • Triaging (exploitable doesn't work on core dumps) • We need to build bricks, not solutions
  26. Reference • Best advice on fuzzing, by Ben Nagy: http://seclists.org/dailydave/2010/q4/47
