Week 4 - Data Encoding and Basic File Forensics Flashcards

(20 cards)

1
Q

How are files physically stored on storage media?

A

As strings of 1s and 0s (binary data) held in data blocks.
How those bits are interpreted depends on standards, manufacturer specifications, and hardware structure.

2
Q

What command shows a file in binary (bits) and hexadecimal?

A

xxd -b myfile → binary (bits)
xxd myfile → hexadecimal dump
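
The same two views can be approximated in Python. This is a minimal sketch, not a replacement for xxd: the helper names to_bits and to_hex are hypothetical, and the output omits the offsets and ASCII column that xxd prints.

```python
def to_bits(data: bytes) -> str:
    """Render each byte as 8 binary digits, separated by spaces."""
    return " ".join(f"{byte:08b}" for byte in data)


def to_hex(data: bytes) -> str:
    """Render each byte as 2 hexadecimal digits, separated by spaces."""
    return " ".join(f"{byte:02x}" for byte in data)


sample = b"Hi"  # illustrative input; a real file would be read with open(path, "rb")
print(to_bits(sample))  # 01001000 01101001
print(to_hex(sample))   # 48 69
```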

3
Q

What is Encoding?

A

Rules for how data is stored and retrieved.

Binary encoding → for machines
Character encoding → for human understanding

4
Q

What is ASCII?

A

American Standard Code for Information Interchange.
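A 7-bit code (values 0-127), normally stored as 1 byte per character. Values 128-255 are not part of standard ASCII; they belong to 8-bit "extended ASCII" code pages, which differ from one another.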

5
Q

Compare UTF-8, UTF-16, and UTF-32

A

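UTF-8: Variable length (1 to 4 bytes per character), backward compatible with ASCII
UTF-16: Variable length (2 or 4 bytes per character), byte order is Little or Big Endian
UTF-32: Fixed length (4 bytes per character), byte order is Little or Big Endian
All three can represent every Unicode character.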
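
The size differences can be checked with a short Python sketch. The helper name encoded_sizes is hypothetical, and the little-endian codec variants are chosen so that no byte order mark is added to the counts.

```python
def encoded_sizes(text: str) -> dict[str, int]:
    """Return the number of bytes the text occupies in each encoding."""
    codecs = ("utf-8", "utf-16-le", "utf-32-le")
    return {name: len(text.encode(name)) for name in codecs}


print(encoded_sizes("A"))           # {'utf-8': 1, 'utf-16-le': 2, 'utf-32-le': 4}
print(encoded_sizes("\u20ac"))      # euro sign: {'utf-8': 3, 'utf-16-le': 2, 'utf-32-le': 4}
print(encoded_sizes("\U0001F600"))  # emoji: {'utf-8': 4, 'utf-16-le': 4, 'utf-32-le': 4}
```

The last line shows a character outside the Basic Multilingual Plane, which UTF-16 stores as a surrogate pair of two 16-bit units.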

6
Q

What is Base64 encoding used for?

A

Converts binary data into text drawn from an alphabet of 64 printable ASCII characters (every 3 bytes become 4 characters, with = as padding). Commonly used in URLs, email attachments, and payloads.
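
A minimal round trip using Python's standard base64 module illustrates the idea; the two-byte input is only an example.

```python
import base64

raw = b"Hi"                          # 2 bytes of input
encoded = base64.b64encode(raw)      # 4 ASCII characters, padded with '='
decoded = base64.b64decode(encoded)  # back to the original bytes

print(encoded)  # b'SGk='
print(decoded)  # b'Hi'
```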

7
Q

Main categories of files in forensics?

A

Data Files – Binary or Character encoded
Program Files – Executables and libraries

8
Q

Difference between Character-encoded and Binary-encoded files?

A

Character-encoded: Human-readable in text editor (XML, JSON, CSV, YAML, etc.)
Binary-encoded: Machine-readable, illegible in a text editor, often begin with a "magic number"

9
Q

Describe the JSON format structure

A

Uses { } for objects, : to separate a key from its value, , to separate pairs and list items, and [ ] for lists (arrays). Supports nesting.
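
As a sketch of that structure, the snippet below parses a small nested document with Python's json module. The field names and values are invented for illustration.

```python
import json

# Hypothetical record: an object containing a list of nested objects.
text = '{"case": "demo", "files": [{"name": "a.txt", "size": 12}], "closed": false}'
record = json.loads(text)

print(record["files"][0]["name"])  # a.txt
print(record["closed"])            # False
```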

10
Q

Describe the CSV format

A

Values separated by commas (some variants use another delimiter, such as a semicolon or tab). One record per line. Optional header row with field names.
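
A short sketch with Python's csv module shows the header row being used as field names. The sample data is invented for illustration.

```python
import csv
import io

# Hypothetical CSV text: a header row followed by two records.
text = "name,size\na.txt,12\nb.bin,2048\n"
rows = list(csv.DictReader(io.StringIO(text)))

print(len(rows))        # 2
print(rows[1]["name"])  # b.bin
print(rows[1]["size"])  # 2048 (CSV values are read as strings)
```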

11
Q

Describe the XML format

A

Data enclosed in tags <tag>content</tag>. Supports nesting. Processing instructions start with <?
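
The tag structure can be explored with Python's xml.etree.ElementTree, as in this minimal sketch. The element and attribute names are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: an XML declaration, then one nested element.
text = '<?xml version="1.0"?><case><file name="a.txt">notes</file></case>'
root = ET.fromstring(text)
child = root.find("file")

print(root.tag)           # case
print(child.get("name"))  # a.txt
print(child.text)         # notes
```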

12
Q

Describe YAML and INI formats

A

YAML: Key: Value pairs, uses indentation for nesting, - for lists
INI: Key=Value pairs, sections in [Section], no standard list support
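
The INI half can be sketched with Python's standard configparser module. The section and key names below are invented for illustration. YAML is left out because parsing it in Python needs a third-party library.

```python
import configparser

# Hypothetical INI text with two sections.
text = "[general]\nname = demo\n\n[paths]\nlog = /var/log/demo.log\n"
config = configparser.ConfigParser()
config.read_string(text)

print(config.sections())       # ['general', 'paths']
print(config["paths"]["log"])  # /var/log/demo.log
```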

13
Q

How can you identify a file type without relying on the extension?

A

Using the Magic Number (file signature) – specific bytes at the start of the file

14
Q

Give examples of common magic numbers.

A

GIF → GIF89a (or GIF87a)
WAV → RIFF (followed by WAVE at byte offset 8)
ZIP (including .docx) → PK (hex 50 4B 03 04)
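
A signature check can be sketched as a prefix lookup. The table below covers only the examples on this card, and the function name identify is hypothetical. Tools such as file use far larger signature databases.

```python
# Minimal signature table; an illustrative subset, not a complete database.
SIGNATURES = {
    b"GIF87a": "GIF",
    b"GIF89a": "GIF",
    b"RIFF": "RIFF container (e.g. WAV)",
    b"PK\x03\x04": "ZIP",
}


def identify(header: bytes) -> str:
    """Return a type name if the header starts with a known signature."""
    for magic, name in SIGNATURES.items():
        if header.startswith(magic):
            return name
    return "unknown"


print(identify(b"GIF89a\x01\x00"))  # GIF
print(identify(b"PK\x03\x04rest"))  # ZIP
print(identify(b"hello"))           # unknown
```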

15
Q

Why can .docx files be opened with a ZIP tool?

A

Because .docx is a ZIP archive containing XML files (word/document.xml, etc.)
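
The sketch below builds a tiny ZIP archive in memory as a stand-in for a .docx (it is not a valid Word document) and then lists its members with Python's zipfile module. A real file would be opened by path instead.

```python
import io
import zipfile

# Build a stand-in archive in memory with one member named like a .docx part.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("word/document.xml", "<document/>")
data = buffer.getvalue()

print(data[:2])  # b'PK' (the ZIP signature)

# Read it back the way a ZIP tool would.
with zipfile.ZipFile(io.BytesIO(data)) as archive:
    names = archive.namelist()
print(names)     # ['word/document.xml']
```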

16
Q

Signed vs Unsigned Integers – explain the difference

A

Unsigned: Non-negative numbers only (0 to max, e.g. 0 to 255 for 1 byte)
Signed: Usually uses Two's Complement to represent negative numbers (e.g. -128 to 127 for 1 byte)
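
The same byte can be read both ways, as this minimal sketch with Python's int.from_bytes shows. The helper name read_both is hypothetical.

```python
def read_both(raw: bytes) -> tuple[int, int]:
    """Interpret the bytes as an unsigned and as a two's complement signed integer."""
    unsigned = int.from_bytes(raw, "big", signed=False)
    signed = int.from_bytes(raw, "big", signed=True)
    return unsigned, signed


print(read_both(b"\xff"))  # (255, -1)
print(read_both(b"\x80"))  # (128, -128)
print(read_both(b"\x7f"))  # (127, 127)
```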

17
Q

What is Little Endian vs Big Endian?

A

Byte order for multi-byte data types:

Little Endian: Least significant byte first
Big Endian: Most significant byte first
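
A short Python sketch makes the two byte orders visible. The value 0x12345678 is just an example.

```python
value = 0x12345678

little = value.to_bytes(4, "little")  # least significant byte first
big = value.to_bytes(4, "big")        # most significant byte first

print(little.hex(" "))  # 78 56 34 12
print(big.hex(" "))     # 12 34 56 78

# Reading little-endian bytes with the wrong byte order gives a different number.
misread = int.from_bytes(little, "big")
print(hex(misread))     # 0x78563412
```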

18
Q

What standard is used for floating-point numbers?

A

IEEE 754 (for example, 32-bit single precision and 64-bit double precision)
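
Floating-point values are commonly stored in IEEE 754 formats. As a minimal sketch, Python's struct module packs floats into the 32-bit and 64-bit IEEE 754 layouts when an explicit byte order such as > (big endian) is given.

```python
import struct

single = struct.pack(">f", 1.0)     # 32-bit single precision
double = struct.pack(">d", 1.0)     # 64-bit double precision
negative = struct.pack(">f", -2.0)  # the sign bit is the top bit

print(single.hex())    # 3f800000
print(double.hex())    # 3ff0000000000000
print(negative.hex())  # c0000000
```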

19
Q

List the 4 main methods to identify a file type (in order of forensic preference)

A

Try to open it (on isolated/offline machine – risky)
Check file extension (easily changed/hidden)
Check Magic Number / File Signature (file command)
Examine contents (text editor, hexdump, strings, etc.)
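
The last method can be sketched as a simplified take on what the strings tool does: pull runs of printable ASCII out of raw bytes. The function name ascii_strings and the default minimum length of 4 are assumptions for illustration.

```python
import re


def ascii_strings(data: bytes, min_len: int = 4) -> list[bytes]:
    """Return runs of at least min_len printable ASCII bytes (0x20 to 0x7e)."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return re.findall(pattern, data)


blob = b"\x00\x01hello\xffab\x00world!"  # illustrative bytes, not a real file
print(ascii_strings(blob))  # [b'hello', b'world!']
```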

20
Q

What are the main problems forensic analysts face when examining files?

A

Older or proprietary (private) formats
Overwritten/changed signatures
Fragmented or partially overwritten files
Encrypted files
Steganography (hidden data)
Context-dependent interpretation (endianness, encoding, structures)
Proving the interpretation matches the creator’s original intent