Lecture 3 - Processor and memory Flashcards

1
Q

What is processor architecture?

A

What components the processor consists of, and how they are connected

2
Q

What are the two types of processor architectures?

A

The two types differ in how memory is organized.

Von Neumann: shared memory for instructions and data (general purpose).

Harvard: separate memories for instructions and data (mostly used in embedded systems).

3
Q

Why is the Harvard architecture best for embedded systems?

A

To optimize memory use.
The text segment is already known at run time: no new instructions will appear, so data and text can therefore be kept in separate memories.

4
Q

What are the main processor functions?

A

Fetch: get the instruction from memory (send the PC to memory and read the instruction at that address).

Decode: determine which operation needs to be performed and what the operands are; locate the operands.

Execute: read the operands, execute the instruction, and save the result.

5
Q

What is a single cycle processor design?

A

All functions (fetch, decode, execute) are performed in one cycle.

The cycle needs to be long enough to cover all the functions.

6
Q

What is a downside with using single cycle designs?

A

Only one piece of hardware is in use at a time: during fetch, the decode and execute hardware is idle; during decode, the fetch and execute hardware is idle.

7
Q

What is pipelined processor design?

A

Breaks instruction execution into separate phases.

One phase is executed per cycle. Because less work is done per cycle, the clock frequency can be increased.

The phases of different instructions can overlap, as they are executed on different pieces of hardware.

8
Q

Which stages update the PC in a pipelined design?

A

Fetch must update the PC, so that the next instruction is ready to be fetched in the next cycle.

For branch instructions, the execute stage sets the PC; it cannot be set earlier because the branch condition needs to be resolved first.

9
Q

What is the formula for execution time?

A

Execution time = instruction count * CPI * cycle time
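
As a rough illustration of the formula (the numbers below are made up, not from the lecture):

```python
# Hypothetical workload, chosen only to exercise the formula.
instruction_count = 1_000_000_000   # 10^9 instructions
cpi = 1.2                           # average cycles per instruction
cycle_time = 1e-9                   # 1 ns cycle time (1 GHz clock)

execution_time = instruction_count * cpi * cycle_time
print(f"Execution time: {execution_time:.2f} s")   # 1.20 s
```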

10
Q

What are the types of dependencies?

A

Data dependencies and control dependencies

11
Q

Compare execution time of single- and pipeline processors

A

Instruction count is the same

CPI is the same (although one instruction takes n cycles, the overlap means that one instruction finishes every cycle)

The pipelined design has 1/n of the single-cycle design's cycle time

The pipeline can therefore provide n times the performance (in the ideal case)
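
A small worked example with assumed numbers: for a 5-stage pipeline (n = 5), a single-cycle design might need a 5 ns cycle, while the pipelined design needs only 1 ns. With the same instruction count and a CPI of 1, execution time drops from IC * 1 * 5 ns to IC * 1 * 1 ns, i.e. 5 times faster in the ideal case.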

12
Q

Why is the n-times speedup only the ideal performance increase of a pipeline, and not the actual one?

A

To achieve a 1/n cycle time in a pipelined processor, the work must be evenly distributed across the phases. This is difficult to do.

Even if the cycle time is 1/n, achieving a CPI of 1 is difficult because dependencies cause hazards.

13
Q

What are data dependencies?

A

An instruction reads the result of a previous instruction before the result is ready to be used
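
A minimal sketch (the variables are made up) of two "instructions" with a data dependency:

```python
x, y = 3, 4
a = x + y   # instruction 1 produces a
b = a * 2   # instruction 2 reads a; in a pipeline it may need a before instruction 1 has written it
```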

14
Q

What is a control dependency?

A

Instruction execution depends on the outcome of branch instructions

15
Q

What is one way to avoid hazards?

A

Stall (pause) the pipeline - this increases execution time

16
Q

What are true (data) dependencies?

A

Data written by one instruction is used by a later instruction

17
Q

What are name dependencies / false dependencies?

A

There is no data movement between the instructions, but they use the same registers.

18
Q

What are anti-dependencies?

A

A type of name (false) dependency.

One instruction reads a register, and a later instruction writes to that register. These instructions cannot be reordered, as the first would read a different value if it executed after the second.
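
A minimal sketch (made-up values) of an anti-dependency on a "register" r:

```python
r = 5
a = r + 1   # earlier instruction reads r (a = 6)
r = 9       # later instruction writes r; swapping the two would make a = 10
```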

19
Q

What are output dependencies?

A

A type of name/false dependency.

Two instructions use the same output register.
For example, two instructions write to the same register. If other instructions depend on the value stored by one of them, reordering the writes will cause incorrect execution.
The two writing instructions are not affected by each other, but other instructions are.
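
A minimal sketch (made-up values) of an output dependency on a "register" r:

```python
r = 1       # first instruction writes r
a = r + 1   # an instruction in between reads r (a = 2)
r = 7       # second instruction writes r; reordering the two writes would make a = 8
```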

20
Q

What is the branch penalty?

A

When the pipeline needs to be flushed because the wrong execution path was fetched.
This results in lost cycles.

21
Q

What can be done to avoid control hazards?

A

While a branch instruction is passing through the pipeline, start executing instructions that are completely independent of it.

Disadvantages:
It is generally difficult to find such instructions.

This becomes even harder for deeper pipelines.

It exposes the pipeline design to the programmer/compiler -> architecture-dependent code.

A branch predictor can also be used.

22
Q

What is a branch predictor?

A

Hardware that predicts the direction taken by the branch, before the next fetch is executed.
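
One common scheme (used here only as an illustration; the card does not specify one) is a 2-bit saturating counter per branch. A minimal sketch:

```python
class TwoBitPredictor:
    """2-bit saturating counter: states 0-1 predict not taken, 2-3 predict taken."""

    def __init__(self):
        self.counter = 1  # start in 'weakly not taken'

    def predict(self):
        return self.counter >= 2  # True = predict taken

    def update(self, taken):
        # Move the counter towards the actual outcome, saturating at 0 and 3.
        self.counter = min(3, self.counter + 1) if taken else max(0, self.counter - 1)
```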

23
Q

What can be done to boost performance?

A

Out-of-order execution

Superscalar processors (fetch, decode and execute multiple instructions per cycle)

Multithreaded processors (execute multiple instruction streams in parallel)

Caching, prefetching

24
Q

What are the different memory technologies that are available?

(speed, and cost per 1 GB)

A

SRAM: 1-10 ns, expensive (~$1000)

DRAM: ~100 ns, ~$10

Flash/SSD: ~100 µs, ~$1

Magnetic disk: ~10 ms, ~$0.1

25
Q

What are the trade-offs in memory technology?

A

Speed and cost

26
Q

What is SRAM, and how does it store a bit?

A

Static Random Access Memory.

It uses two cross-coupled inverters: the output of the first is connected to the input of the second, and the output of the second is connected back to the input of the first. This feedback loop holds the bit.

A bit written into the first inverter is inverted, then inverted back by the second inverter.

An SRAM cell needs 6 transistors: 2 per inverter and 2 to access the cell.
The access transistors are used to read/write the bit.

27
Q

What is DRAM?

A

Dynamic Random Access Memory.

A single capacitor stores one bit. The charge of the capacitor indicates the value of the bit: charged = 1, discharged = 0.

One access transistor per cell.

28
Q

What is a problem with DRAM?

A

The capacitor leaks charge over time.

A value stored in a DRAM cell can therefore disappear: a stored 1 can flip to 0 over time.

29
Q

How is charge leakage handled in DRAM?

A

The DRAM is refreshed periodically.

Each cell is checked: if the value stored in the capacitor is 1, there is still some charge left, and more charge is put in to keep it full.

30
Q

How is data fetched from a 2D DRAM memory?

A

The address is stored in an address register.

The most significant bits of the address select the row; the least significant bits select the column.

The row address is decoded by the row decoder: based on these bits, one row is activated, and all cells in that row are read from their capacitors.

The values from the row are driven out of the cells and amplified, because some capacitors may have partially discharged.

The column decoder then selects the bits of interest.
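
A sketch of the row/column split, assuming a hypothetical 1024 x 1024 array (10 row bits and 10 column bits; these sizes are not from the lecture):

```python
COL_BITS = 10  # assumed: 1024 columns

def split_address(addr):
    row = addr >> COL_BITS               # most significant bits select the row
    col = addr & ((1 << COL_BITS) - 1)   # least significant bits select the column
    return row, col

print(split_address(0b1100110011_0101010101))   # (819, 341)
```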

31
Q

Compare DRAM and SRAM

A

DRAM is slower because it uses capacitors

DRAM provides higher density - it only needs one capacitor and one transistor per cell

DRAM has lower cost

DRAM requires refresh because of leakage; the refresh logic costs power and area, and the processor cannot access the memory during a refresh, which reduces performance

32
Q

What is the memory hierarchy?

A

A combination of memory types: small amounts of fast memory close to the processor,
and large amounts of slow memory farther from the processor.

This creates the appearance of a single large, fast memory.

Registers (flip-flops): 100s of bytes

Cache (SRAM): 1 kB-10 MB, can be multiple levels

Main memory (DRAM): 1-64 GB

Disk/SSD: 100s of GB (SSD), 1-2 TB (disk)

33
Q

At what granularity is data transferred between memory levels?

A

Registers <-> cache: words (4-8 bytes)

Cache <-> main memory: blocks (16-128 bytes)

Main memory <-> disk: pages (1 kB-2 MB)

34
Q

When transferring data between memory levels, why is more data transferred the further down the hierarchy we go?

A

To compensate for the initial access latency.
Fetching the first byte of data is expensive, but the subsequent bytes come quickly.

35
Q

Why is the memory hierarchy effective?

A

Because of temporal and spatial locality

36
Q

What is temporal locality?

A

A recently accessed memory location is likely to be accessed again in the near future

37
Q

What is spatial locality?

A

Memory locations close to a recently accessed location are likely to be accessed in the near future.

38
Q

What is a cache block/line?

A

The unit/granularity of data stored in the cache (32-128 bytes)

39
Q

What is a cache hit?

A

Data is found in the cache

40
Q

What is a cache miss?

A

Data is not found in the cache.

The access needs to go further down the hierarchy (lower-level caches, main memory).

When the data is found, it is copied into the cache (because of temporal locality).

41
Q

What is hit rate?

A

The fraction of accesses that hit in the given memory level

42
Q

What is hit time?

A

Time required to access a memory level

43
Q

What is miss penalty?

A

The time required to fetch a block into a level from the next level down the hierarchy

44
Q

How is data identified in main memory?

A

By its full 32-bit address

45
Q

How do we map a 32-bit address to a smaller memory such as a cache?

A

Only part of the address is used to select a location in the cache.

A tag field is introduced in the cache.
The tag field stores the higher-order address bits that are not used to locate the line in the cache.

Since multiple addresses map to the same cache line, the stored tag is compared with the higher-order bits of the requested address to identify which address is currently held in the cache.
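
A sketch of how the lookup could work, assuming a hypothetical direct-mapped cache with 64 lines of 32 bytes (these parameters are not from the lecture):

```python
BLOCK_SIZE = 32   # bytes per cache line -> 5 offset bits (assumed)
NUM_LINES  = 64   # lines in the cache   -> 6 index bits  (assumed)

def split(addr):
    offset = addr % BLOCK_SIZE                  # byte within the line
    index  = (addr // BLOCK_SIZE) % NUM_LINES   # which cache line
    tag    = addr // (BLOCK_SIZE * NUM_LINES)   # remaining high-order bits
    return tag, index, offset

# An access to `addr` is a hit if line `index` is valid and its stored tag equals `tag`.
print(split(0x1234ABCD))
```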

46
Q

What is the valid bit in the cache?

A

Indicates whether valid data has been loaded into the cache line.

Even though a cache line may be empty, a memory address can still map to it. The valid bit indicates that the line has not been written yet, and that the address still needs to be fetched from memory.

47
Q

What address fields are used in a direct-mapped cache?

A

Tag: identifies which address is stored in the cache line

Index: used to choose the cache line

Byte offset: chooses the byte within the cache line
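
As a worked example with an assumed geometry (not from the lecture): with 32-byte lines and 64 lines, a 32-bit address splits into 5 byte-offset bits, 6 index bits and 32 - 5 - 6 = 21 tag bits.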