Exploiting Undefined Behaviors for Efficient Symbolic Execution Asankhaya Sharma National University of Singapore
Motivation 17/2/2016 ICSE SRC 2014 2
Rainbow Pony 17/2/2016 ICSE SRC 2014 3
Software Engineering Works Hall, Anthony. "Realising the Benefits of Formal Methods." J. UCS 13.5 (2007): 669-678. 17/2/2016 ICSE SRC 2014 4
Symbolic Execution • Software engineering tools (KLEE, PEX, Klover etc.) routinely use symbolic execution for – Test case generation – Bug finding – Debugging – Performance analysis – Verification • Improving the performance can have big impact 17/2/2016 ICSE SRC 2014 5
Efficient Symbolic Execution • Bottlenecks for performance – Path explosion – Complexity of generated constraints • Existing approaches – Constraint subsumption and simplification – Concoclic execution – State merging, caching and reusing constraints – Static analysis 17/2/2016 ICSE SRC 2014 6
Key Idea • Compilers are really good at optimization based on undefined behaviors • Design a static analysis (Change Value Analysis) to introduce undefined behaviors in programs • Use the optimized binaries for symbolic execution 17/2/2016 ICSE SRC 2014 7
Main Benefits • Does not require any change in the underlying symbolic execution engine to use the results from static analysis for dynamic path exploration • Allows reduction in size of compiled binaries and prevents generation of irrelevant constraints 17/2/2016 ICSE SRC 2014 8
Overview of Approach Change Value Analysis of Program Program with Undefined Behaviors Compiler Optimizations Binary for Symbolic Execution 17/2/2016 ICSE SRC 2014 9
Change Value Analysis int foo (int x, int y, int z) { int a; a = z; if (x – y > 0) a = x; else a = y; if (z > a) printf(“z is max”); return a; } Changed Unchanged Undefined 17/2/2016 ICSE SRC 2014 10
Change Value Analysis int foo (int x, int y, int z) { int a; a = z; if (x – y > 0) a = x; else a = y; if (z > a) printf(“z is max”); return a; } Changed Unchanged Undefined a 17/2/2016 ICSE SRC 2014 11
Change Value Analysis int foo (int x, int y, int z) { int a; a = z; if (x – y > 0) a = x; else a = y; if (z > a) printf(“z is max”); return a; } Changed Unchanged Undefined a, x, y z 17/2/2016 ICSE SRC 2014 12
Introduce Undefined Behaviors int foo (int x, int y, int z) { int a; a = *; if (x – y > 0) a = x; else a = y; if (* > a) printf(“z is max”); return a; } Changed Unchanged Undefined a, x, y z Replace Unchanged variables with nondeterministic value * 17/2/2016 ICSE SRC 2014 13
After Compiler Optimizations int foo (int x, int y, int z) { int a; a = *; if (x – y > 0) a = x; else a = y; if (* > a) printf(“z is max”); return a; } int foo (int x, int y, int z) { int a; if (x – y > 0) a = x; else a = y; return a; } Eliminates 3 lines from the program 17/2/2016 ICSE SRC 2014 14
Experiments • Change Value Analysis implemented as a compiler pass in LLVM • Use an existing tool Fuzzgrind for symbolic execution of binaries • Benchmarks from Software-artifact Infrastructure Repository (SIR) 17/2/2016 ICSE SRC 2014 15
Results 0 200 400 600 800 1000 1200 Constraints (Num) Constraints with CVA 30% Reduction in Number of Constraints 17/2/2016 ICSE SRC 2014 16
Results 0 20 40 60 80 100 120 140 160 180 200 Time (Secs) Time with CVA 48% Reduction in Time taken for Symbolic Execution 17/2/2016 ICSE SRC 2014 17
Conclusions • Systematically introducing undefined behaviors in programs can speed up symbolic execution • Compilers already optimize code for efficiency and size, we show that they can also optimize code for use in symbolic execution and testing 17/2/2016 ICSE SRC 2014 18
Thank You ! • Questions ? • Source Code (GPL 3) – https://github.com/codelion/pathgrind – https://github.com/codelion/pa.llvm • Contact – asankhaya@nus.edu.sg 17/2/2016 ICSE SRC 2014 19
Comparison with Slicing 17/2/2016 ICSE SRC 2014 20 x = a; y = 5; if (a > 0) b = x + y; if (*) x = 1; else y = 0; if (y > 0) z = x; Slicing w.r.t variable z x = a; y = 5; if (a > 0) b = x + y; if (*) x = 1; else y = 0; if (y > 0) z = x; Slicing may include dependencies from infeasible paths Jaffar, Joxan, et al. "Path-sensitive backward slicing." Static Analysis. Springer Berlin Heidelberg, 2012. 231-247. Using Change Value analysis - “a” is in Unchanged set

Exploiting undefined behaviors for efficient symbolic execution

  • 1.
    Exploiting Undefined Behaviors forEfficient Symbolic Execution Asankhaya Sharma National University of Singapore
  • 2.
  • 3.
  • 4.
    Software Engineering Works Hall,Anthony. "Realising the Benefits of Formal Methods." J. UCS 13.5 (2007): 669-678. 17/2/2016 ICSE SRC 2014 4
  • 5.
    Symbolic Execution • Softwareengineering tools (KLEE, PEX, Klover etc.) routinely use symbolic execution for – Test case generation – Bug finding – Debugging – Performance analysis – Verification • Improving the performance can have big impact 17/2/2016 ICSE SRC 2014 5
  • 6.
    Efficient Symbolic Execution •Bottlenecks for performance – Path explosion – Complexity of generated constraints • Existing approaches – Constraint subsumption and simplification – Concoclic execution – State merging, caching and reusing constraints – Static analysis 17/2/2016 ICSE SRC 2014 6
  • 7.
    Key Idea • Compilersare really good at optimization based on undefined behaviors • Design a static analysis (Change Value Analysis) to introduce undefined behaviors in programs • Use the optimized binaries for symbolic execution 17/2/2016 ICSE SRC 2014 7
  • 8.
    Main Benefits • Doesnot require any change in the underlying symbolic execution engine to use the results from static analysis for dynamic path exploration • Allows reduction in size of compiled binaries and prevents generation of irrelevant constraints 17/2/2016 ICSE SRC 2014 8
  • 9.
    Overview of Approach ChangeValue Analysis of Program Program with Undefined Behaviors Compiler Optimizations Binary for Symbolic Execution 17/2/2016 ICSE SRC 2014 9
  • 10.
    Change Value Analysis intfoo (int x, int y, int z) { int a; a = z; if (x – y > 0) a = x; else a = y; if (z > a) printf(“z is max”); return a; } Changed Unchanged Undefined 17/2/2016 ICSE SRC 2014 10
  • 11.
    Change Value Analysis intfoo (int x, int y, int z) { int a; a = z; if (x – y > 0) a = x; else a = y; if (z > a) printf(“z is max”); return a; } Changed Unchanged Undefined a 17/2/2016 ICSE SRC 2014 11
  • 12.
    Change Value Analysis intfoo (int x, int y, int z) { int a; a = z; if (x – y > 0) a = x; else a = y; if (z > a) printf(“z is max”); return a; } Changed Unchanged Undefined a, x, y z 17/2/2016 ICSE SRC 2014 12
  • 13.
    Introduce Undefined Behaviors intfoo (int x, int y, int z) { int a; a = *; if (x – y > 0) a = x; else a = y; if (* > a) printf(“z is max”); return a; } Changed Unchanged Undefined a, x, y z Replace Unchanged variables with nondeterministic value * 17/2/2016 ICSE SRC 2014 13
  • 14.
    After Compiler Optimizations intfoo (int x, int y, int z) { int a; a = *; if (x – y > 0) a = x; else a = y; if (* > a) printf(“z is max”); return a; } int foo (int x, int y, int z) { int a; if (x – y > 0) a = x; else a = y; return a; } Eliminates 3 lines from the program 17/2/2016 ICSE SRC 2014 14
  • 15.
    Experiments • Change ValueAnalysis implemented as a compiler pass in LLVM • Use an existing tool Fuzzgrind for symbolic execution of binaries • Benchmarks from Software-artifact Infrastructure Repository (SIR) 17/2/2016 ICSE SRC 2014 15
  • 16.
    Results 0 200 400 600 800 1000 1200 Constraints (Num) Constraints withCVA 30% Reduction in Number of Constraints 17/2/2016 ICSE SRC 2014 16
  • 17.
    Results 0 20 40 60 80 100 120 140 160 180 200 Time (Secs) Time withCVA 48% Reduction in Time taken for Symbolic Execution 17/2/2016 ICSE SRC 2014 17
  • 18.
    Conclusions • Systematically introducingundefined behaviors in programs can speed up symbolic execution • Compilers already optimize code for efficiency and size, we show that they can also optimize code for use in symbolic execution and testing 17/2/2016 ICSE SRC 2014 18
  • 19.
    Thank You ! •Questions ? • Source Code (GPL 3) – https://github.com/codelion/pathgrind – https://github.com/codelion/pa.llvm • Contact – asankhaya@nus.edu.sg 17/2/2016 ICSE SRC 2014 19
  • 20.
    Comparison with Slicing 17/2/2016ICSE SRC 2014 20 x = a; y = 5; if (a > 0) b = x + y; if (*) x = 1; else y = 0; if (y > 0) z = x; Slicing w.r.t variable z x = a; y = 5; if (a > 0) b = x + y; if (*) x = 1; else y = 0; if (y > 0) z = x; Slicing may include dependencies from infeasible paths Jaffar, Joxan, et al. "Path-sensitive backward slicing." Static Analysis. Springer Berlin Heidelberg, 2012. 231-247. Using Change Value analysis - “a” is in Unchanged set