Visualizing Symbolic Execution with Bokeh Asankhaya Sharma SRC:CLR
Symbolic Execution (SE) • Analyzing a program to determine what inputs cause each part of a program to execute [Wikipedia] • The idea – Execute the program with an input – Build a symbolic formula during execution which captures the path taken by the input through the program 10 June 2015 PyData Singapore 2
Path Condition (PC) int max(int x, int y, int z){ int m = x; if(y>m && y>z) m = y; else if(z>m) m = z; return m; } max(1,3,2) = 3 Inputs: x0,y0,z0 PC: true PC: m0=x0 PC: m0=x0∧y0>m0∧y0>z0 ∧m1=y0 Output: m1 10 June 2015 PyData Singapore 3
10 June 2015 PyData Singapore 4 m = x m = y y>m && y>z z > m m = z return m true m=x …∧y>m∧y>z …∧¬(y>m∧y>z) …∧z>m …∧¬(z>m) …∧m=z …∧m=y Execution Tree
Path Exploration PC: m0=x0∧y0>m0∧y0>z0∧m1=y0 PC1: y0>x0∧y0>z0∧3=y0 Negate first constraint PC2: y0<=x0∧y0>z0∧3=y0 Check satisfiability using a constraint solver New Inputs: x0=3, y0=3, z0=2 Repeat SE with new inputs 10 June 2015 PyData Singapore 5
Why is SE useful? • Automated Fuzzing • Test Case Generation • Debugging Error Traces • Program Analysis • … 10 June 2015 PyData Singapore 6
Bottlenecks • Path Explosion – Loops and recursion – Unbounded number of paths in a program • Constraint Solving – int is easy but what about other data types floats, strings, bit vectors etc. – Handling data structures with pointers 10 June 2015 PyData Singapore 7
Exploiting Undefined Behaviors for Efficient Symbolic Execution [ICSE 14] 10 June 2015 PyData Singapore 8
Demo 1 • Symbolic execution with Pathgrind – fuzz/fuzz.py 10 June 2015 PyData Singapore 9
Bokeh • Bo(w)-Ke(ttle) 10 June 2015 PyData Singapore 10
10 June 2015 PyData Singapore 11
Demo 2 • Plotting with Bokeh – Line Plot – Scatter Plot – Bokeh Server 10 June 2015 PyData Singapore 12
Visualizing SE • Time Taken – Generate path conditions (path exploration) – Generate new inputs (by solving constraints) 10 June 2015 PyData Singapore 13
Demo 3 • Pathgrind + Bokeh = Visualize SE – fuzz/plotfuzz.py 10 June 2015 PyData Singapore 14
10 June 2015 PyData Singapore 15
All paths are not equal • Use Levenshtein distance to measure the similarity between the path conditions when represented as strings • Scatter plot of similarity using Bokeh 10 June 2015 PyData Singapore 16
10 June 2015 PyData Singapore 17
Optimization for SE • Prune paths that are >90% similar – As measured using Levenshtein edit distance 10 June 2015 PyData Singapore 18
10 June 2015 PyData Singapore 19
10 June 2015 PyData Singapore 20
Take Away • Symbolic Execution • Using Bokeh to Visualize SE • Identify Optimizations for SE • Future – Statically Sampling of Paths – Probabilistic Analysis 10 June 2015 PyData Singapore 21
We are hiring … Shape the future of software security at SourceClear. By joining our team, you can help define the way modern developers identify and fix vulnerabilities in their code. Check out https://jobs.lever.co/sourceclear 10 June 2015 PyData Singapore 22
Thank You! • Questions? • Contact – Twitter: @asankhaya • Links – Source Code: https://github.com/codelion/pathgrind – Slides: http://asankhaya.github.io/ppt/PyDataSing.pptx 10 June 2015 PyData Singapore 23

Visualizing Symbolic Execution with Bokeh

  • 1.
    Visualizing Symbolic Execution withBokeh Asankhaya Sharma SRC:CLR
  • 2.
    Symbolic Execution (SE) •Analyzing a program to determine what inputs cause each part of a program to execute [Wikipedia] • The idea – Execute the program with an input – Build a symbolic formula during execution which captures the path taken by the input through the program 10 June 2015 PyData Singapore 2
  • 3.
    Path Condition (PC) intmax(int x, int y, int z){ int m = x; if(y>m && y>z) m = y; else if(z>m) m = z; return m; } max(1,3,2) = 3 Inputs: x0,y0,z0 PC: true PC: m0=x0 PC: m0=x0∧y0>m0∧y0>z0 ∧m1=y0 Output: m1 10 June 2015 PyData Singapore 3
  • 4.
    10 June 2015PyData Singapore 4 m = x m = y y>m && y>z z > m m = z return m true m=x …∧y>m∧y>z …∧¬(y>m∧y>z) …∧z>m …∧¬(z>m) …∧m=z …∧m=y Execution Tree
  • 5.
    Path Exploration PC: m0=x0∧y0>m0∧y0>z0∧m1=y0 PC1:y0>x0∧y0>z0∧3=y0 Negate first constraint PC2: y0<=x0∧y0>z0∧3=y0 Check satisfiability using a constraint solver New Inputs: x0=3, y0=3, z0=2 Repeat SE with new inputs 10 June 2015 PyData Singapore 5
  • 6.
    Why is SEuseful? • Automated Fuzzing • Test Case Generation • Debugging Error Traces • Program Analysis • … 10 June 2015 PyData Singapore 6
  • 7.
    Bottlenecks • Path Explosion –Loops and recursion – Unbounded number of paths in a program • Constraint Solving – int is easy but what about other data types floats, strings, bit vectors etc. – Handling data structures with pointers 10 June 2015 PyData Singapore 7
  • 8.
    Exploiting Undefined Behaviorsfor Efficient Symbolic Execution [ICSE 14] 10 June 2015 PyData Singapore 8
  • 9.
    Demo 1 • Symbolicexecution with Pathgrind – fuzz/fuzz.py 10 June 2015 PyData Singapore 9
  • 10.
    Bokeh • Bo(w)-Ke(ttle) 10 June2015 PyData Singapore 10
  • 11.
    10 June 2015PyData Singapore 11
  • 12.
    Demo 2 • Plottingwith Bokeh – Line Plot – Scatter Plot – Bokeh Server 10 June 2015 PyData Singapore 12
  • 13.
    Visualizing SE • TimeTaken – Generate path conditions (path exploration) – Generate new inputs (by solving constraints) 10 June 2015 PyData Singapore 13
  • 14.
    Demo 3 • Pathgrind+ Bokeh = Visualize SE – fuzz/plotfuzz.py 10 June 2015 PyData Singapore 14
  • 15.
    10 June 2015PyData Singapore 15
  • 16.
    All paths arenot equal • Use Levenshtein distance to measure the similarity between the path conditions when represented as strings • Scatter plot of similarity using Bokeh 10 June 2015 PyData Singapore 16
  • 17.
    10 June 2015PyData Singapore 17
  • 18.
    Optimization for SE •Prune paths that are >90% similar – As measured using Levenshtein edit distance 10 June 2015 PyData Singapore 18
  • 19.
    10 June 2015PyData Singapore 19
  • 20.
    10 June 2015PyData Singapore 20
  • 21.
    Take Away • SymbolicExecution • Using Bokeh to Visualize SE • Identify Optimizations for SE • Future – Statically Sampling of Paths – Probabilistic Analysis 10 June 2015 PyData Singapore 21
  • 22.
    We are hiring… Shape the future of software security at SourceClear. By joining our team, you can help define the way modern developers identify and fix vulnerabilities in their code. Check out https://jobs.lever.co/sourceclear 10 June 2015 PyData Singapore 22
  • 23.
    Thank You! • Questions? •Contact – Twitter: @asankhaya • Links – Source Code: https://github.com/codelion/pathgrind – Slides: http://asankhaya.github.io/ppt/PyDataSing.pptx 10 June 2015 PyData Singapore 23