Obfuscation Flashcards

1
Q

What is an informal definition of obfuscation?

A

To obfuscate a program P means to transform it into an executable program P’ from which it is harder to extract information than from P.

2
Q

What is an informal definition of reverse engineering?

A

The process of extracting data or a model of the system by inspecting its lower level description and/or behavior.

3
Q

Name 2 attack scenarios addressed by obfuscation

A

Stealing intellectual property, stealing secrets embedded in the program

4
Q

Name the two main types of obfuscation and their respective properties

A

Static obfuscation:

  • obfuscated program remains fixed at runtime
  • raises bar against static analysis
  • can be attacked through dynamic techniques

Dynamic obfuscation:

  • program keeps changing at runtime -> self modifying code
  • raises the bar against static analysis
5
Q

What are different ‘Points of insertion’ for obfuscation?

A

Source code, Intermediate representation, machine code

6
Q

What are the different Transformation targets?

A
  • layout -> scramble identifiers and code layout
  • data -> obfuscate data embedded in code
  • control flow -> obfuscate secret algorithms
7
Q

Name 9 different static obfuscation techniques

A

Confuse Code Reader:

  • Scramble identifiers
  • Instruction substitution
  • Garbage code insertion
  • Merging and splitting functions
  • Control-flow flattening

Confuse Code Reader and Compiler:

  • Opaque predicates
  • Virtualization obfuscation
  • Opaque expressions
  • White-box cryptography
8
Q

What is Scrambling identifiers?

A

Identifier names are replaced with random strings
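
Example (a minimal sketch, not a production tool): the function name compute_discount, its parameters, and the regex-based renamer are all made up for illustration. A real scrambler would work on the syntax tree so that strings and comments are left alone.

```python
import random
import re

# Hypothetical input program with meaningful identifier names.
SOURCE = """
def compute_discount(price, loyalty_years):
    rate = min(loyalty_years, 10) * 2
    return price * (100 - rate) // 100
"""

def scramble(source, identifiers, seed=0):
    """Replace each listed identifier with a random-looking name."""
    rng = random.Random(seed)
    # The index suffix guarantees that two identifiers never collide.
    mapping = {
        name: f"_{rng.getrandbits(32):08x}_{i}"
        for i, name in enumerate(identifiers)
    }
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, identifiers)) + r")\b")
    return pattern.sub(lambda m: mapping[m.group(1)], source), mapping

scrambled, mapping = scramble(
    SOURCE, ["compute_discount", "price", "loyalty_years", "rate"]
)

# Both versions behave identically; only the names carry less information.
original_ns, scrambled_ns = {}, {}
exec(SOURCE, original_ns)
exec(scrambled, scrambled_ns)
same = (original_ns["compute_discount"](200, 3)
        == scrambled_ns[mapping["compute_discount"]](200, 3))
print(scrambled)
print(same)
```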

9
Q

What is instruction substitution?

A

Replace a binary operation with a functionally equivalent but more complicated computation
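
Example (a sketch using one well-known identity; the function names are illustrative): integer addition can be rewritten with XOR and AND, because XOR adds the bits without carries and AND finds the positions that carry.

```python
def add_plain(a: int, b: int) -> int:
    return a + b

def add_substituted(a: int, b: int) -> int:
    # a + b == (a ^ b) + 2 * (a & b)
    # (a ^ b) is the carry-less sum, (a & b) << 1 is the carry.
    return (a ^ b) + ((a & b) << 1)

print(add_plain(19, 23), add_substituted(19, 23))
```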

10
Q

What is garbage code insertion?

A

Dead code is added: code that is never executed, or whose results never influence the program's output
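
Example (a sketch; the checksum function and the junk variable are invented for illustration): the second version computes the same result, but carries extra statements that a reader has to rule out first.

```python
def checksum(data: bytes) -> int:
    total = 0
    for byte in data:
        total = (total + byte) % 65521
    return total

def checksum_with_garbage(data: bytes) -> int:
    total = 0
    junk = 0x5A5A                              # garbage: never reaches the result
    for byte in data:
        total = (total + byte) % 65521
        junk = (junk * 31 + byte) & 0xFFFF     # garbage: result is discarded
    if junk < 0:                               # never true, junk is masked to 0..65535
        total = -1                             # dead code
    return total

print(checksum(b"abc"), checksum_with_garbage(b"abc"))
```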

11
Q

What are opaque predicates?

A

An opaque predicate is a branch condition that always evaluates to the same value, so the same branch is always taken and the other one is bogus. The obfuscator knows the outcome, but it is hard for an attacker to determine.
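
Example (a sketch; the predicate is a textbook number-theoretic one and the surrounding function is made up): x * (x + 1) is a product of two consecutive integers, so it is always even and the else branch never runs.

```python
def opaquely_true(x: int) -> bool:
    # One of x and x + 1 is even, so their product is always even.
    return (x * (x + 1)) % 2 == 0

def scaled(value: int, x: int) -> int:
    if opaquely_true(x):
        return value * 2      # real path, always taken
    else:
        return value - 7      # bogus path, never taken

print(scaled(21, 12345))
```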

12
Q

What is control-flow flattening?

A
  1. Put each basic block in a case of a switch statement
  2. Wrap the switch statement in an infinite loop
  3. Have each block set a variable (next) that selects the case to run in the following iteration
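
Example (a sketch; gcd is just a convenient function to flatten, and an if/elif chain stands in for the switch statement): the loop structure of the original disappears, and every block is dispatched through the state variable.

```python
def gcd(a: int, b: int) -> int:
    while b != 0:
        a, b = b, a % b
    return a

def gcd_flat(a: int, b: int) -> int:
    state = 0                       # selects the next basic block
    while True:                     # the infinite dispatch loop
        if state == 0:              # block 0: loop test
            state = 1 if b != 0 else 2
        elif state == 1:            # block 1: loop body
            a, b = b, a % b
            state = 0
        elif state == 2:            # block 2: exit
            return a

print(gcd(48, 18), gcd_flat(48, 18))
```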

13
Q

What is a possible attack on control-flow flattening and how could it be countered?

A
  1. Find next blocks of every basic block
  2. Rebuild original CFG
    Mitigation: assign an opaque expression to next, so that the successor of each block is hard to determine statically
14
Q

What is an opaque expression?

A

An opaque expression is an expression that will always evaluate to the same value in a way not obvious for an attacker.

15
Q

How do opaque expressions from array aliasing work?

A
  1. A statically initialized array with seemingly random values
  2. The values are generated such that some invariant holds
  3. Update array cells with values that respect invariants
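
Example (a sketch; the modulus, the concrete invariant and the function names are assumptions chosen for illustration): cells at even indices are congruent to 1 modulo 7 and cells at odd indices are congruent to 5 modulo 7. Updates add multiples of 7, so the invariant survives while the contents keep changing.

```python
import random

MODULUS = 7

# Step 1: statically initialized, looks random.
# Step 2: even indices hold values = 1 (mod 7), odd indices values = 5 (mod 7).
table = [36, 54, 1, 47, 22, 12, 50, 26, 43, 5]

def opaque_one(i: int) -> int:
    # Any even index works; the table length is even, so the index stays even.
    return table[(2 * i) % len(table)] % MODULUS

def opaque_five(i: int) -> int:
    return table[(2 * i + 1) % len(table)] % MODULUS

def update_cells(rng: random.Random) -> None:
    # Step 3: change every cell, but only by multiples of the modulus.
    for idx in range(len(table)):
        table[idx] += MODULUS * rng.randrange(1, 100)

update_cells(random.Random(1))
print(opaque_one(3), opaque_five(8))
```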
16
Q

How does virtualization obfuscation work?

A
  1. Generate a random bytecode instruction set architecture (ISA) L that covers all instructions of P
  2. Translate P into an L bytecode program
  3. Generate an emulator that interprets L bytecode on the target machine
    Output: P’ consisting of the bytecode and the emulator
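
Example (a toy sketch of the three steps; the four-instruction stack machine and all names are assumptions made for illustration): the opcode numbers are drawn at random per build, so the bytecode is meaningless without the matching emulator.

```python
import random

OPERATIONS = ["PUSH", "ADD", "MUL", "HALT"]

def generate_isa(seed: int) -> dict:
    # Step 1: assign a random, distinct opcode to every operation.
    rng = random.Random(seed)
    return dict(zip(OPERATIONS, rng.sample(range(256), len(OPERATIONS))))

def translate(program, isa) -> list:
    # Step 2: translate the program into bytecode for this ISA.
    bytecode = []
    for name, *operands in program:
        bytecode.append(isa[name])
        bytecode.extend(operands)
    return bytecode

def make_emulator(isa):
    # Step 3: generate an interpreter that understands only this ISA.
    decode = {opcode: name for name, opcode in isa.items()}
    def run(bytecode):
        stack, pc = [], 0
        while True:
            name = decode[bytecode[pc]]
            pc += 1
            if name == "PUSH":
                stack.append(bytecode[pc])
                pc += 1
            elif name == "ADD":
                right, left = stack.pop(), stack.pop()
                stack.append(left + right)
            elif name == "MUL":
                right, left = stack.pop(), stack.pop()
                stack.append(left * right)
            elif name == "HALT":
                return stack.pop()
    return run

# P computes (2 + 3) * 4.
PROGRAM = [("PUSH", 2), ("PUSH", 3), ("ADD",), ("PUSH", 4), ("MUL",), ("HALT",)]
isa = generate_isa(seed=1234)
bytecode = translate(PROGRAM, isa)
emulator = make_emulator(isa)
result = emulator(bytecode)
print(bytecode, result)
```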
17
Q

What is the goal and the idea behind White-Box cryptography?

A

Goal: Hide encryption/decryption key
Idea: Embed the key within the cipher
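
Example (a toy sketch with no real security; the one-byte cipher, the random S-box and the key value 0x3C are all invented for illustration): the key is folded into a lookup table, so it never appears as a separate value at runtime. Real white-box schemes additionally hide the tables behind random encodings, since a table like this one still leaks the key to anyone who knows the S-box.

```python
import random

# A random byte permutation standing in for a cipher's S-box (assumption).
_rng = random.Random(2024)
SBOX = list(range(256))
_rng.shuffle(SBOX)

def encrypt_byte(x: int, key: int) -> int:
    # Conventional form: the key is an explicit input.
    return SBOX[x ^ key]

def build_keyed_table(key: int) -> list:
    # Done once at build time; only the resulting table is shipped.
    return [SBOX[x ^ key] for x in range(256)]

KEYED_TABLE = build_keyed_table(0x3C)

def encrypt_byte_whitebox(x: int) -> int:
    # White-box form: no key parameter, the key lives inside the table.
    return KEYED_TABLE[x]

print(encrypt_byte(0x41, 0x3C) == encrypt_byte_whitebox(0x41))
```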

18
Q

What are some issues with software diversity?

A
  • analysis of crash dumps
  • incremental updates
  • digitally signing all versions
19
Q

Name two types of software diversity

A
  • Pre-distribution Software Diversity
  • Post-distribution Software Diversity

20
Q

In which phases does dynamic obfuscation run?

A
  1. At compile time
    - initial program configuration is generated
    - runtime code-transformer is added
  2. At runtime
    - interleave execution of the program with calls to the code-transformer T
    - T changes the code at runtime
    - ideally a non-repeating series of configurations, in practice they repeat
21
Q

How does replacing instructions work?

A
  1. Replace real instructions with bogus instructions
  2. Just before execution replace bogus instructions with real instructions
  3. After execution replace real instructions with bogus instructions
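
Example (a simulation, not real self-modifying machine code; the list of callables stands in for a code segment and all names are illustrative): at any moment at most one slot holds a real instruction.

```python
def bogus(value: int) -> int:
    # Plausible-looking filler that is never actually executed.
    return value ^ 0xDEAD

# The real instructions of the protected region: ((x + 3) * 2) - 1.
REAL = [lambda v: v + 3, lambda v: v * 2, lambda v: v - 1]

def run_protected(x: int):
    code = [bogus] * len(REAL)            # step 1: only bogus instructions at rest
    max_real_visible = 0
    for pc in range(len(REAL)):
        code[pc] = REAL[pc]               # step 2: restore just before execution
        visible = sum(slot is not bogus for slot in code)
        max_real_visible = max(max_real_visible, visible)
        x = code[pc](x)
        code[pc] = bogus                  # step 3: overwrite again after execution
    return x, code, max_real_visible

result, final_code, max_real_visible = run_protected(5)
print(result, max_real_visible)
```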
22
Q

How does dynamic code merging work?

A
  1. Have two or more functions share the same location in memory
  2. Create templates for functions that share the same location
  3. Before function is called, patch memory using edit script to load it
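
Example (a simulation over a toy instruction list; the instruction format, the two functions F and G, and the helper names are assumptions): the shared region holds the template, and each function's edit script patches in the instructions that differ before the call.

```python
def execute(code, x: int) -> int:
    for op, arg in code:
        if op == "add":
            x += arg
        elif op == "mul":
            x *= arg
        # "nop" placeholders do nothing
    return x

F = [("add", 1), ("mul", 2), ("add", 5)]   # f(x) = (x + 1) * 2 + 5
G = [("add", 1), ("mul", 7), ("add", 9)]   # g(x) = (x + 1) * 7 + 9

def merge(f, g):
    """Build one template plus an edit script per function (steps 1 and 2)."""
    assert len(f) == len(g)
    template, edit_f, edit_g = [], [], []
    for i, (fi, gi) in enumerate(zip(f, g)):
        if fi == gi:
            template.append(fi)            # shared instruction stays in place
        else:
            template.append(("nop", 0))    # slot filled in by the edit scripts
            edit_f.append((i, fi))
            edit_g.append((i, gi))
    return template, edit_f, edit_g

def call(region, edit_script, x: int) -> int:
    for index, instruction in edit_script:  # step 3: patch memory, then run
        region[index] = instruction
    return execute(region, x)

region, edit_f, edit_g = merge(F, G)
print(call(region, edit_f, 3), call(region, edit_g, 3), call(region, edit_f, 3))
```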
23
Q

How does dynamic decryption and re-encryption work?

A
  1. Execute current basic block
  2. At some point the current block decrypts the next basic block
  3. Decryption key could be hash of some other basic block
  4. Jump to decrypted block
  5. Encrypt the previously executed basic block
    goto 1
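
Example (a simulation with XOR as a stand-in cipher; the toy block format, GUARD_BLOCK and the helper names are assumptions): keys are derived from the hash of another block, and at most two blocks are in the clear at any time.

```python
import hashlib

# Stand-in for "some other basic block" whose hash yields the key (assumption).
GUARD_BLOCK = b"bytes of another basic block"

def key_for(index: int) -> bytes:
    return hashlib.sha256(GUARD_BLOCK + bytes([index])).digest()

def xor(data: bytes, key: bytes) -> bytes:
    # XOR is its own inverse, so the same call encrypts and decrypts.
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

# Toy basic blocks: ((x + 3) * 2) - 1.
PLAIN_BLOCKS = [b"add 3", b"mul 2", b"sub 1"]

def execute_block(block: bytes, x: int) -> int:
    op, arg = block.decode().split()
    n = int(arg)
    return {"add": x + n, "mul": x * n, "sub": x - n}[op]

def run(x: int):
    # At rest only the entry block is in the clear.
    memory = [PLAIN_BLOCKS[0]] + [
        xor(PLAIN_BLOCKS[i], key_for(i)) for i in range(1, len(PLAIN_BLOCKS))
    ]
    current = 0
    while True:
        x = execute_block(memory[current], x)                        # step 1
        nxt = current + 1
        if nxt == len(memory):
            return x, memory
        memory[nxt] = xor(memory[nxt], key_for(nxt))                 # steps 2 and 3
        previous, current = current, nxt                             # step 4
        memory[previous] = xor(memory[previous], key_for(previous))  # step 5

result, final_memory = run(5)
print(result)
```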
24
Q

What is a non-obvious but annoying problem with self-modifying code?

A

Virus scanners will complain

25
Q

Name 3 dynamic obfuscation techniques

A

replacing instructions, dynamic code merging, dynamic decryption and re-encryption

26
Q

What are the 4 dimensions of Collberg’s obfuscation taxonomy?

A

Potency: comprehensibility of code by humans
Stealth: identifiability of obfuscated code
Resilience: resistance against automatic deobfuscation
Cost: performance and resource overhead of obfuscation

27
Q

Describe methodological steps to characterize and predict the strength of obfuscation

A
  1. Model MATE as attack-nets
  2. Model transitions as search problems
  3. Identify features to generate programs
  4. Obfuscate programs
  5. Attack obfuscated programs
  6. Feature extraction
  7. Predict average effort of attack