Hacking Bootcamp S4: ~help disassemble~

Today

  • Static analysis
  • Dynamic analysis
  • Exercises

Tools

Please install:

  • Editor of choice (e.g., VS Code, VIM, Emacs)
  • cfr
  • java
  • python
  • ipython
  • Ghidra

Reverse engineering

Reverse engineering is the process of understanding the behaviour of a system based on an analysis of its source code or source code artifacts (compilation output).

Normally we forward engineer a program from a specification; in reverse engineering we try to infer the specification from the program.

General workflow

  1. Identify entry point (main)
  2. Find user inputs
  3. Identify the flow of data (especially input)
  4. Understand (and reverse) transformations
  5. Reconstruct valid input (gets us the flag)
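
The five steps can be walked through on a toy target. This is only a sketch: the checker, its key, and its stored constant are invented for illustration (`KEY`, `TARGET`, `check`, and `recover_input` are all hypothetical names), and real challenges usually chain several transformations.

```python
# Hypothetical challenge: every name and constant here is made up for illustration.
KEY = 0x42
TARGET = bytes.fromhex("242e23253926272f2d3f")  # constant the program compares against


def check(user_input: bytes) -> bool:
    """Steps 1-3: the entry point receives the input and feeds it to a transformation."""
    transformed = bytes(b ^ KEY for b in user_input)
    return transformed == TARGET


def recover_input() -> bytes:
    """Steps 4-5: XOR is its own inverse, so applying it to TARGET rebuilds a valid input."""
    return bytes(b ^ KEY for b in TARGET)


candidate = recover_input()
print(candidate, check(candidate))
```

The transformation here is trivially invertible; when it is not (a hash, for example), step 5 usually turns into a search over candidate inputs instead.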

Static analysis

  • Analyze the program without running it (statically)
  • We might have the source code of the program, or just the compilation artifacts
    • If only the compilation artifacts are present, our job is harder
    • We might be able to recover plausible source code from the artifacts (decompilation)

Compiled forms

  • Java class
  • Python Bytecode
  • x86-64
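
For the Python case, the standard library's `dis` module can disassemble bytecode without running the function. A minimal sketch, assuming a hypothetical `check` function that stands in for code we only have in compiled form:

```python
import dis


def check(password: str) -> bool:
    # Hypothetical function standing in for code we only have as bytecode.
    return password[::-1] == "terces"


# Constants embedded in the code object are often the quickest clue.
print(check.__code__.co_consts)

# Walk the instructions; opcode names differ between Python versions.
for instruction in dis.get_instructions(check):
    print(instruction.opname, instruction.argrepr)
```

`dis.dis(check)` prints a similar listing in one call. The exact opcodes depend on the interpreter version, so it helps to disassemble with the same Python version that produced the bytecode.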

Dynamic analysis

  • We run the program and observe its behaviour
  • We can insert debugging statements and other instrumentation
  • We might also control some parts of the program execution using debugging tools
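
One lightweight form of instrumentation in Python is to wrap a function so that every call is logged. The sketch below assumes a hypothetical `scramble` function inside the target; `traced` and `calls` are illustrative names too.

```python
import functools

calls = []  # recorded (function name, positional args, result) tuples


def traced(func):
    """Wrap func so every call is recorded and printed without changing its result."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        calls.append((func.__name__, args, result))
        print(f"{func.__name__}{args} -> {result!r}")
        return result
    return wrapper


def scramble(text: str) -> str:
    # Hypothetical transformation inside the target program.
    return "".join(chr(ord(c) + 1) for c in text)


scramble = traced(scramble)  # instrument the existing function in place
scramble("abc")
```

Rebinding the name leaves the original function body untouched, so the wrapper can be removed again once the behaviour is understood.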

Static and dynamic analysis are complementary

Static analysis gives us a first idea of what the program might be doing

Dynamic analysis allows us to test our hypothesis

Tools

Python

print

Good ol’ print
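
When printing values during analysis, `repr` is often more useful than the plain string because it makes invisible characters explicit. A small sketch with a made-up value:

```python
data = "flag\x00 \n"  # hypothetical input containing invisible characters

print(data)        # invisible characters are easy to miss
print(repr(data))  # repr shows the escapes explicitly
print(f"{data!r} has length {len(data)}")
```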

prettyprint

from pprint import pprint

A prettier print
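
The difference shows up with nested data. A sketch, assuming a hypothetical `state` dictionary pulled out of a program under analysis:

```python
from pprint import pformat, pprint

# Hypothetical state pulled out of a program under analysis.
state = {
    "user": "guest",
    "checks": [{"name": f"stage{i}", "passed": i % 2 == 0} for i in range(4)],
}

print(state)    # everything on one long line
pprint(state)   # wrapped and indented, with dictionary keys sorted

report = pformat(state)  # same formatting, returned as a string
```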

breakpoint()

Pauses execution and opens the Python debugger

Type help at the debugger prompt for the list of commands
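
A minimal sketch of where a breakpoint might go, assuming a hypothetical `check` function. The `debug` flag is an addition for this example so that the code runs straight through unless a debugger session is requested.

```python
def check(candidate: str, debug: bool = False) -> bool:
    # Hypothetical check; the names and the constant are made up for illustration.
    total = sum(ord(c) for c in candidate)
    if debug:
        breakpoint()  # drops into pdb here; `total` and `candidate` are in scope
    return total == 294


print(check("abc"))  # runs straight through because debug is False
```

Calling `check("abc", debug=True)` opens the `(Pdb)` prompt, where `p total` prints a variable, `n` executes the next line, `c` continues execution, and `q` quits.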

ipython

A better, interactive Python shell

Useful for testing how snippets of code behave

Java

System.out.println

Java’s print with a trailing newline

x86-64

Ghidra

Reverse engineering tool that analyzes compiled binaries

Decompiles machine code into plausible C code (C code that would compile to similar machine code)

  • Basic usage
    • The decompiler output is not the most readable C code
    • It’s our job to make it make sense
    • To do so, we can:
      • Rename variables and functions (right click -> rename)
      • Change the types of variables and functions (right click -> change type)
      • Add comments (right click -> add comment) to explain our inferred logic

GDB

GNU Debugger

Allows us to run and debug programs compiled from C (and other languages)

As most of our programs will be compiled without debugging information (source code information), we will need to understand x86-64 assembly to be able to use GDB

Language Documentation

You should also look up the documentation of any libraries the program uses.

Exercises - Java

Exercises - Python

Exercises - x86-64

Further learning

For learning x86-64

Hack time!