Processor microarchitecture Flashcards

(89 cards)

1
Q

Explain architecture vs microarchitecture

A
  • The architecture is the programmer's view of the processor, whereas the microarchitecture is the hardware view
  • In the architecture, the registers, instructions and addressing modes are visible, but the microarchitecture's internals are not exposed to the programmer
2
Q

Explain Von Neumann vs Harvard architecture

A
  • VN has a single memory store for both instructions and data and uses same address and data buses for both
  • Harvard has separate memories and buses for instructions and data (common in small embedded processors)
3
Q

What is a drawback of Von Neumann architecture?

A
  • Instruction fetch and data fetch cannot happen at the same time, which causes a bottleneck
  • Harvard can fetch both at once, so it can be faster
4
Q

Explain the difference between CISC and RISC

A
  • CISC (complex instruction set computing) processors have complex instructions that perform multiple operations
  • RISC (reduced instruction set computing) optimises hardware for common instructions rather than complex ones
5
Q

Explain RISC in more detail

A
  • Separates instructions into data movement, control flow and arithmetic operations
  • Instruction size is fixed
  • Load-store architecture
6
Q

Give the bit width for all the buses and registers in MU0

A
  • 16-bit memory and instructions
  • 12-bit address bus and PC
  • 16-bit IR, accumulator and ALU
7
Q

Explain the instructions and flags used in MU0

A
  • 2 arithmetic operations ADD and SUB
  • Branch instructions use absolute addressing
  • 2 status flags N and Z
8
Q

Explain STUMP architecture

A
  • Load/store architecture: data is operated on only in registers, and only the load/store operations interact with main memory
  • 16-bit processor
  • 8 registers (R0=0, R7=PC)
9
Q

What does S at the end of a STUMP instruction mean?

A
  • Indicates we are updating the status flags
10
Q

Give the 4 STUMP flags

A
  • N: Negative
  • Z: Zero
  • V: Overflow
  • C: Carry-out
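As a rough illustration, the four flags could be derived for a 16-bit addition as sketched here. The function name `add_flags` is hypothetical and this is not taken from the STUMP definition; it only shows the usual meaning of N, Z, V and C.

```python
def add_flags(a: int, b: int, width: int = 16) -> dict:
    """Add two unsigned `width`-bit values and derive the N, Z, V, C flags."""
    mask = (1 << width) - 1
    sign = 1 << (width - 1)
    a, b = a & mask, b & mask
    total = a + b
    result = total & mask
    return {
        "result": result,
        "N": int(bool(result & sign)),   # top bit of the result is set
        "Z": int(result == 0),           # result is all zeros
        # signed overflow: operands share a sign that the result does not
        "V": int(bool(~(a ^ b) & (a ^ result) & sign)),
        "C": int(total > mask),          # carry out of the top bit
    }

print(add_flags(0x7FFF, 0x0001))  # largest positive + 1 sets N and V
print(add_flags(0xFFFF, 0x0001))  # wraps to zero, sets Z and C
```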
11
Q

In Verilog what are inputs and outputs defined as?

A
  • Inputs are always wires
  • Outputs are either wires or regs
12
Q

Explain combinatorial vs sequential logic and how we implement them

A
  • Use blocking assignments (=) for combinatorial logic and non-blocking assignments (<=) for sequential logic
  • Combinatorial is implemented using always @(*) but sequential uses always @(posedge/negedge)
13
Q

What must we always include in combinatorial logic and why?

A
  • Else statements (or a default assignment), otherwise we would be introducing a latch (which would make it not combinatorial)
14
Q

How do we declare outputs in sequential and combinatorial logic

A
  • Use reg, since the output is assigned inside an always block (only sequential logic actually stores the value)
15
Q

What is the difference between a Verilog task and function?

A
  • Functions only take input parameters and only return one output
  • Tasks take inputs and return multiple outputs
16
Q

Why is explicit description better than implicit in structural Verilog?

A
  • Can see where connections are made and the order doesn’t matter, unlike in implicit
17
Q

When do we declare an output as a wire?

A
  • In structural Verilog
  • During a continuous assignment (assign statement)
18
Q

What does the output of sequential systems depend on vs what output of combinatorial systems depend on?

A
  • Sequential: depends on current inputs and past history (requires registers)
  • Combinatorial: output depends only on the current inputs
19
Q

What do the datapath and control path do?

A
  • Datapath handles flow and transformation of data (arithmetic and logical ops)
  • Control path generates control signals for data path operations (determines when components are active)

Sequential systems need a control path, but not always a data path

20
Q

What is the difference between a Mealy and Moore machine?

A
  • Moore machine: outputs depend only on current state and change synchronously with clock edge
  • Mealy machine: outputs depend on current state and inputs and can change asynchronously
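A small sketch of the difference, using a hypothetical detector for two consecutive 1s (the function names and the state encoding are illustrative assumptions). The Moore version derives its output from the state alone, so it reacts one step later than the Mealy version, which also looks at the current input.

```python
def run_moore(bits):
    """Output depends on the state only (state = consecutive 1s seen, capped at 2)."""
    state, outputs = 0, []
    for x in bits:
        outputs.append(1 if state == 2 else 0)  # output uses state only
        state = min(state + 1, 2) if x else 0   # state update on the "clock edge"
    return outputs

def run_mealy(bits):
    """Output depends on the state and the current input (state = previous bit)."""
    state, outputs = 0, []
    for x in bits:
        outputs.append(1 if (state == 1 and x == 1) else 0)  # uses state and input
        state = 1 if x else 0
    return outputs

stimulus = [0, 1, 1, 0, 1, 1, 1]
print(run_mealy(stimulus))  # [0, 0, 1, 0, 0, 1, 1]
print(run_moore(stimulus))  # [0, 0, 0, 1, 0, 0, 1]
```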
21
Q

Which element of an SM diagram is only in Mealy machines?

A

Conditional output box

22
Q

What 2 tests need to be done in the testbench to ensure the program is correct?

A
  • Unit testing (black box) makes sure each section does what is required
  • Integration testing to ensure they work together
23
Q

What are the 5 stages of CAD?

A
  • Design space exploration + modelling
  • RTL design
  • Logic synthesis
  • Layout, placing and routing
  • Electrical checking
24
Q

What is the difference between behavioural/architectural and RTL design?

A
  • Behavioural describes what the system does and is untimed
  • RTL describes how data moves between registers and is timed and cycle accurate

RTL is typically written in Verilog

25
What does logical synthesis do?
- Maps a netlist into target technology- results in another netlist where all functions are represented by library cells
26
During synthesis, what is the benefit of flattening the resulting netlist?
Gives greater representational flexibility
27
What is the floorplan of an Integrated Circuit in EDA? ## Footnote EDA = Electronic design automation
Schematic representation of tentative placement of its major functional blocks
28
Give 2 examples of geometrical constraints in floorplanning
- IP blocks come in pre-defined areas - Chip area
29
What checks do we need to do after floorplanning?
- **Equivalence** checking which compares synthesised netlist to the original RTL and highlights potential differences
30
What is the difference between layout and floorplanning?
- Floorplanning only allocates **regions**, whereas layout places **individual** devices and wires
31
What does the extractor do during chip layout?
- Interprets layout geometry as electrical components - Identifies transistors and interconnections - Results in schematic-like representation
32
What is buffer insertion?
- Placing buffers along long wires, as long wires slow down signal transitions, which can violate timing constraints
33
Give 4 post-layout tools
- **Design rule check**: ensures layout geometry obeys manufacturing rules - **Electrical rule check**: verifies power supply integrity - **Layout vs schematic**: confirms layout matches the intended schematic - **Process, voltage, temperature**: checks tolerances so the chip functions reliably
34
What is verification and testing important for?
- Important for Very Large Scale Integration VLSI - This is when we create integrated circuits by combining millions of MOS transistors onto a single chip
35
When is verification done and why?
- Done **before** silicon development as fixing can be costly after silicon - Does quality checking and bug fixing
36
What are the 2 types of software simulation during verification? | Explain and compare them
- **Digital** simulation: verify **logic**, reveal initialisation problems, **quicker** and doesn't waste hardware - **Analogue** simulation: verify **timing**, reveal edge speeds but is **slow**
37
Give the benefits and drawbacks of software and hardware simulation during verification
- Software: *controllable*, *affordable*, and *good observability* **but** is *slow* and has *no timing model* - Hardware: *fast*, allows us to *access buried signals* **but** is *inaccurate*, has limited observability and is *expensive*
38
What is the difference between testing and validation?
- Validation checks it fulfils requirements - Testing checks particular hardware parts work
39
What are the 2 ways to test chips?
- **Probe** test: tests **unpackaged chip** but requires time and a jig - **Package** test: tests for **damage after packaging** and is easier, but has more expensive discards
40
What 3 levels of coverage should chip testing be done on?
- **Design-time**: test all source code and every alternate route - **Circuit-level**: check synthesised circuit - **Production**: check every wire can change logical state
41
What are 2 important concepts for designing systems so they can be easily tested?
- **Controllability**: making it easy to inject test patterns- not deeply buried circuits where we can't easily change states - **Observability**: making it easy to observe the outputs- having a test port where you can deduce internal states by looking at subsequent circuits
42
What are LFSRs and what are they used for?
- Linear Feedback Shift Registers - Generate **pseudo random sequences** - Used for **circuit testing** to test random patterns - Good for Built-In Self Test BIST where the chip tests its own modules at certain FPGA positions
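A minimal sketch of how an LFSR steps through a pseudo-random sequence. The 4-bit width and the choice of taps (XOR of the two top bits) are assumptions made for illustration; real test hardware would use whatever polynomial the design specifies.

```python
def lfsr4_step(state: int) -> int:
    """Advance a 4-bit Fibonacci LFSR by one clock (tap choice is an assumption)."""
    feedback = ((state >> 3) ^ (state >> 2)) & 1  # XOR of the two top bits
    return ((state << 1) | feedback) & 0xF        # shift left, feed back into bit 0

def lfsr4_sequence(seed: int = 0b0001):
    """Collect states until the register returns to its seed."""
    states, state = [], seed
    while True:
        states.append(state)
        state = lfsr4_step(state)
        if state == seed:
            return states

patterns = lfsr4_sequence()
print(len(patterns))                             # 15: every non-zero 4-bit pattern once
print([format(p, "04b") for p in patterns[:4]])  # ['0001', '0010', '0100', '1001']
```

The all-zero state is excluded because the register would stay stuck there, which is why a 4-bit LFSR gives at most 15 patterns.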
43
What are the 2 parts of the control?
- **Finite state machine**: defines states and transitions - **Control decode**: derives control signal from the state + instruction (Mealy)
44
What signals do MU0 and STUMP have in common and what do they have different?
- Both have Clock, Reset, mem_rd and mem_wr - Only MU0 has a halt (STP) signal
45
What are the states in MU0 and STUMP
- MU0: fetch, decode/ execute - STUMP: fetch, execute, memory
46
Explain charge and current
- **Charge Q** is measured in **Coulombs** C - **Current I** is the flow of charge and is measured in **Amps**
47
Explain voltage and resistance
- **Voltage V** is what causes current to flow and is measured in **volts** - **Resistance R** opposes the flow of current and is measured in **Ohms**
48
What is Ohm's law?
V = I x R
49
How does current flow through resistors in parallel and in series?
- Series: **same current** through resistors - Parallel: current is **split** between resistors ## Footnote VOLTAGE IS OPPOSITE
50
How is resistance calculated in parallel and series?
- **Series**: Sum of all resistances - **Parallel**: 1/R = 1/R1 + 1/R2 + 1/R3 ## Footnote CAPACITANCE IS OPPOSITE
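A quick worked example of both formulas; the helper names `series` and `parallel` and the resistor values are just illustrative.

```python
def series(*resistances):
    """Total resistance in series: R = R1 + R2 + ..."""
    return sum(resistances)

def parallel(*resistances):
    """Total resistance in parallel: 1/R = 1/R1 + 1/R2 + ..."""
    return 1 / sum(1 / r for r in resistances)

print(series(100, 200, 300))  # 600 ohms
print(parallel(100, 100))     # two equal resistors in parallel halve the resistance
```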
51
Explain capacitance and what is its formula?
- Capacitance C is the ability to store charge and is measured in Farads F - C = Q / V
52
What do capacitors do, what are they made from and how do they work?
- Capacitors **store electric charge** - Made of 2 conducting plates separated by insulating dielectric - When connected to voltage: **charge accumulates** on plates and voltage rises - Stops charging when capacitor voltage = source voltage
53
What are RC circuits?
- Made up of a resistor and capacitor - When the switch closes the capacitor voltage rises exponentially
54
What is the time constant in RC circuits?
- T = R x C - Capacitor is effectively fully charged (about 99%) after 5T
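To see where the 5T rule of thumb comes from, this sketch evaluates the standard charging curve v(t) = Vs x (1 - e^(-t/RC)). The component values are arbitrary examples.

```python
import math

def capacitor_voltage(t, r, c, v_source):
    """Voltage across a charging capacitor at time t (starting from 0 V)."""
    return v_source * (1 - math.exp(-t / (r * c)))

R, C, VS = 10e3, 1e-6, 5.0  # example values: 10 kOhm, 1 uF, 5 V supply
tau = R * C                 # time constant T = R x C (10 ms here)

for n in (1, 3, 5):
    fraction = capacitor_voltage(n * tau, R, C, VS) / VS
    print(f"after {n}T: {fraction:.3f} of the source voltage")
```

After one time constant the capacitor is at roughly 63% of the source voltage, and after five it is above 99%, which is why 5T is treated as fully charged.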
55
In simple terms, what are MOSFETs, and what are they used to build?
- Metal Oxide Semiconductor Field Effect Transistors - Control electric current using an electric field - Used to make CMOS (Complementary Metal Oxide Semiconductor) logic gates
56
What are the 3 MOSFET terminals and how do they connect based on the state?
- Gate, Drain and Source - Path between the drain and source is the **channel** - ON (conducting) when channel exists and OFF (non-conducting) when channel doesn't exist
57
How do MOSFETs act like voltage-controlled switches?
- Gate terminal has resistance and capacitance - Gate voltage can't change instantly; once it reaches the threshold voltage, the drain-source current increases rapidly ## Footnote Switching speed is limited due to this
58
How does an nMOS MOSFET behave?
- Gate=0 -> **OFF** (switch is open) - Gate=1 -> **ON** (switch is closed) - Passes 0 well, passes 1 poorly
59
How does a pMOS MOSFET behave?
- Gate=0 -> **ON** (switch is closed) - Gate=1 -> **OFF** (switch is open) - Passes 1 well, passes 0 poorly
60
What does a CMOS circuit consist of?
- pMOS pull-up network connected to **power supply** - nMOS pull-down network connected to **ground** - Both connected to output- if true pulled high if false pulled low ## Footnote Mutually exclusive- both can't be on
61
What is a Pull-Up network in CMOS and how does it work?
- Made of pMOS transistors that pull **output high** - pMOS=**ON** when **input is 0** - Series: a̅.b̅, Parallel: a̅+b̅
62
What is a Pull-Down network and how does it work?
- Made of nMOS transistors that pull **output low** - nMOS=**ON** when **input is 1** - Series: a.b, Parallel: a+b
63
How is an inverter implemented in CMOS?
- 1 pMOS transistor and 1 nMOS transistor - pMOS -> VDD - nMOS -> GND
64
How is a NOR gate implemented in a CMOS circuit?
- pMOS transistors in series - nMOS transistors in parallel
65
How is a NAND gate implemented in a CMOS circuit?
- pMOS transistors in parallel - nMOS transistors in series
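A switch-level sketch of the NAND and NOR gates, treating each transistor as an ideal switch (pMOS conducts on 0, nMOS conducts on 1). The function names are hypothetical; series networks are modelled with `and`, parallel networks with `or`.

```python
def cmos_output(pull_up_on: bool, pull_down_on: bool) -> int:
    """Resolve the output node; exactly one network may conduct."""
    if pull_up_on == pull_down_on:
        raise ValueError("pull-up and pull-down must be mutually exclusive")
    return 1 if pull_up_on else 0

def nand(a: int, b: int) -> int:
    pull_up = (not a) or (not b)  # pMOS in parallel: either input low pulls high
    pull_down = a and b           # nMOS in series: both inputs high pull low
    return cmos_output(bool(pull_up), bool(pull_down))

def nor(a: int, b: int) -> int:
    pull_up = (not a) and (not b)  # pMOS in series: both inputs low pull high
    pull_down = a or b             # nMOS in parallel: either input high pulls low
    return cmos_output(bool(pull_up), bool(pull_down))

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "NAND:", nand(a, b), "NOR:", nor(a, b))
```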
66
What considerations do we need when designing a CMOS circuit?
- Gate length should be as small as possible as it determines switching speed - Gate width should be as wide as possible as it determines drive strength - Capacitance is **proportional** to gate area
67
Why is NAND logic preferred over NOR logic in CMOS design?
- NAND: pMOS in parallel-> smaller area, lower impedance - NOR: pMOS in series-> larger area, higher impedance so slower - pMOS has **higher impedance** so is wider than nMOS to match nMOS drive strength
68
What causes power dissipation and why is CMOS energy efficient?
- **Dynamic power dissipation** happens when switching due to **stray capacitance** - **Static power dissipation** is prevented by high input impedance when idle
69
Why is capacitive load important in CMOS circuits and how is it affected by decreasing the size of the device?
- Smaller device = thinner and closer wires therefore smaller capacitance - Higher capacitance = slower switching and more power consumed - **Dynamic power loss** is proportional to **capacitance x VDD^2**
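A small numeric sketch of the proportionality above. Only the C x VDD^2 relationship comes from the card; the function name and values are illustrative, and the result is a relative figure rather than a power in watts.

```python
def relative_dynamic_power(capacitance: float, vdd: float) -> float:
    """Quantity proportional to dynamic power loss: C x VDD^2."""
    return capacitance * vdd ** 2

baseline = relative_dynamic_power(1.0, 1.2)
half_voltage = relative_dynamic_power(1.0, 0.6)
half_capacitance = relative_dynamic_power(0.5, 1.2)

print(half_voltage / baseline)      # halving VDD quarters the dynamic power
print(half_capacitance / baseline)  # halving C only halves it
```

This is why reducing the supply voltage is the more effective lever of the two.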
70
Why is stray capacitance undesirable in CMOS circuits?
- Stray capacitance means RC delay which **slows output transitions**- switching not instantaneous - This increases dynamic power consumption
71
What are 3 ways of reducing dynamic power dissipation?
- **Reduce the capacitance**: difficult as depends on physical layout - **Increase gate size**: increases area so reduces output impedance but increases load - **Reduce supply voltage**: effective for power reduction but reduces noise margin so 0/1 are harder to distinguish
72
How do different adder design affect speed and power?
- Propagation delay: depends on how many gates a signal passes through - **Ripple-carry adder**: simple and small but high propagation delay - **Carry-lookahead adder**: faster and reduces carry-bit determination time - **Spintronics adder**: all-magnetic, non-volatile and low-power alternative
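A bit-level sketch of a ripple-carry adder, showing why its delay grows with width: each full adder has to wait for the carry from the stage before it. The function names and the 8-bit default width are illustrative assumptions.

```python
def full_adder(a: int, b: int, carry_in: int):
    """One-bit full adder: returns (sum, carry_out)."""
    total = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return total, carry_out

def ripple_carry_add(x: int, y: int, width: int = 8):
    """Chain `width` full adders; the carry ripples from bit 0 upwards."""
    carry, result = 0, 0
    for i in range(width):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result, carry

print(ripple_carry_add(25, 17))    # (42, 0)
print(ripple_carry_add(200, 100))  # (44, 1): 300 overflows 8 bits, carry out is set
```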
73
What are binary multipliers and how do they work?
- Multiplies two binary numbers, using binary adders - Works by **generating partial products**- multiply one input by each digit of the other, then **shift and sum** partial products - Can be **parallelised** to save time and improve speed
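The shift-and-sum idea can be sketched in a few lines. This is a sequential software model with a hypothetical function name; hardware can generate and sum the partial products in parallel.

```python
def shift_add_multiply(a: int, b: int, width: int = 8) -> int:
    """Multiply by summing shifted partial products, one per bit of b."""
    product = 0
    for i in range(width):
        if (b >> i) & 1:       # partial product is a (bit set) or 0 (bit clear)
            product += a << i  # shift it into position, then add
    return product

print(shift_add_multiply(13, 11))  # 143
```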
74
What are shift registers and how do they work?
- **Chain of flip-flops** sharing a clock: each output feeds the next input- shifts data by one position per clock - **One-place/bidirectional**: shifts sequentially left or right - **Barrel shifter**: shifts by any number of positions in one clock cycle using combinatorial logic ## Footnote Commonly used in hardware for floating-point arithmetic
75
How is system performance defined and measured?
- time per task = clock period x clock cycles per task - **Performance = 1/time per task** - Indicates **how fast a task can be executed**- depends on clock frequency and cycles per task - **Synchronous systems**: measured as **Cycles per instruction** or **Instructions per cycle**
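A worked example of the formulas above, with made-up numbers (a 100 MHz clock and a task of 1000 instructions at 1.5 cycles per instruction):

```python
def time_per_task_ns(clock_period_ns: float, cycles_per_task: float) -> float:
    """time per task = clock period x clock cycles per task."""
    return clock_period_ns * cycles_per_task

clock_period_ns = 10  # 100 MHz clock
instructions = 1000   # example task size
cpi = 1.5             # example cycles per instruction

cycles = instructions * cpi
task_time_ns = time_per_task_ns(clock_period_ns, cycles)
performance = 1 / (task_time_ns * 1e-9)  # tasks per second

print(task_time_ns)        # 15000.0 ns, i.e. 15 microseconds
print(round(performance))  # about 66667 tasks per second
```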
76
What is latency and throughput?
- **Latency = execution time**: time to finish a fixed task - **Throughput = bandwidth**: number of tasks in a fixed time
77
What determines the minimum clock period in a synchronous system?
- **Propagation delay** of source register and combinatorial logic - **Setup time** of target register - **Timing margin** for uncertainties ## Footnote Ensures signals are stable and predictable for correct operation
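Adding the contributions above gives the minimum clock period. The delay values in this sketch are invented for illustration and the function name is hypothetical.

```python
def min_clock_period_ps(t_reg_pd, t_logic_pd, t_setup, t_margin):
    """Register propagation + logic propagation + setup time + margin, in ps."""
    return t_reg_pd + t_logic_pd + t_setup + t_margin

period_ps = min_clock_period_ps(t_reg_pd=100, t_logic_pd=700, t_setup=50, t_margin=150)
max_frequency_mhz = 1_000_000 / period_ps  # 1 / period, converted from ps to MHz

print(period_ps)          # 1000 ps
print(max_frequency_mhz)  # 1000.0 MHz
```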
78
What 3 timing constraints ensure correct flip-flop operation?
- **Setup time**: input must be stable before clock edge - **Hold time**: input must remain stable after clock edge - **Propagation delay**: time for output to reflect input change after clock edge ## Footnote The simplest synchronous circuit consists of **only flip-flops**: no combinatorial logic
79
What is timing closure in digital design?
- Process of modifying a logic design to **meet timing requirements** - Ensures the system operates correctly at **target clock speed** - Simulation alone is unreliable unless **critical paths are known**
80
What is Static Timing Analysis?
- Timing analysis that is **independent of input states** - Examines **all combinatorial logic** blocks between flip-flops - Identifies **critical paths** and setup/hold **time violations** ## Footnote Used to quickly identify paths that are significantly slow
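A toy sketch of the idea: the worst-case arrival time at each node is found by walking the logic graph, without simulating any input values. The netlist, its node names and the delays are all hypothetical, and real tools also model wire delay, hold checks and clock skew.

```python
from functools import lru_cache

# Hypothetical netlist: node -> (own delay in ps, list of fan-in nodes).
NETLIST = {
    "ff_a": (0, []),
    "ff_b": (0, []),
    "and1": (40, ["ff_a", "ff_b"]),
    "xor1": (60, ["ff_a", "and1"]),
    "or1": (30, ["and1", "xor1"]),
}

@lru_cache(maxsize=None)
def arrival_time(node: str) -> int:
    """Latest possible arrival at a node's output, whatever the inputs are."""
    delay, fan_in = NETLIST[node]
    return delay + max((arrival_time(n) for n in fan_in), default=0)

def critical_path(node: str) -> list:
    """Follow the slowest fan-in back to a source register."""
    _, fan_in = NETLIST[node]
    if not fan_in:
        return [node]
    return critical_path(max(fan_in, key=arrival_time)) + [node]

print(arrival_time("or1"))   # 130 ps
print(critical_path("or1"))  # ['ff_a', 'and1', 'xor1', 'or1']
```

Because the analysis always takes the worst fan-in, it gives a conservative upper bound and can flag paths that no real input pattern would ever exercise.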
81
What are the advantages and disadvantages of Static Timing Analysis?
- **Advantages**: low computational cost, fast, conservative with upper bounds - **Disadvantages**: can be too pessimistic, may report false critical paths
82
What is clock skew and how is it managed in synchronous systems?
- Clock skew is **difference in clock arrival times** at flip-flops - **Unavoidable** but kept small to avoid timing violations - **Balanced clock distribution** (eg. H-tree) equalises path length, load and drive to minimise skew
83
How are systems that require multiple clock signals handled safely?
- Must be partitioned into **clock domains** - Signals crossing domains require **resynchronisation** to **reduce metastability** - If a flip-flop goes metastable, it has **one clock cycle** to resolve
84
Why is floating-point representation used instead of fixed-point?
- **Fixed point**: radix point fixed- limited range=**precision loss** for large/small values - **Floating-point**: significand x base^exponent provides a **wide dynamic range** with fixed number of bits - Precision **scales** with magnitude which makes it flexible
85
What do the mantissa and exponent represent in floating-point number?
- **Exponent**: shifts the radix- controls **range** - **Mantissa/significand**: stores meaningful digits- controls **precision** ## Footnote Normalisation means the (binary) number is stored as +-1.xxxx x 2^(exponent-bias)
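To make the fields concrete, this sketch pulls apart a 32-bit IEEE 754 single (1 sign bit, 8 exponent bits with a bias of 127, 23 fraction bits) and rebuilds the value. The function name is hypothetical, and the reconstruction only holds for normalised numbers (not zero, subnormals, infinity or NaN).

```python
import struct

def decode_float32(x: float):
    """Return the (sign, biased exponent, fraction) fields of x as a 32-bit float."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    return sign, exponent, fraction

sign, exponent, fraction = decode_float32(6.5)
print(sign, exponent, hex(fraction))  # 0 129 0x500000

# Rebuild the value: +-1.xxxx x 2^(exponent - bias), with bias = 127
value = (-1) ** sign * (1 + fraction / 2 ** 23) * 2 ** (exponent - 127)
print(value)  # 6.5, i.e. 1.625 x 2^2
```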
86
Why is IEEE 754 important in floating-point representation?
- Guarantees **consistent** and **reproducible** results across processors - Defines formats (16, 32 and 64-bit) and field sizes - Explicitly specifies zero, infinity and NaN
87
How are arithmetic operations performed on floating-point numbers?
- **Multiply/divide**: operate on mantissas, add/subtract exponents, normalise, round - **Add/subtract**: align exponents by shifting smaller mantissa, then add/subtract and renormalise
88
Why do processors use specialised architectures instead of single general-purpose design?
- Different applications have different performance needs - Floating-point units (eg 8087 coprocessor) were originally separate, later integrated on chip - **Digital Signal Processors** optimise hardware for specific workloads to achieve higher performance and efficiency
89
How do modern architectures increase throughput without increasing clock frequency?
- **Single-Instruction Multiple-Data** SIMD + **Superscalar**: can execute more than one instruction per clock cycle- dispatches multiple instructions to different execution units - **Very Long Instruction Words** VLIWs: Compiler explicitly schedules parallel instructions- simpler hardware ## Footnote - Both need wide instruction fetch buses - Superscalar is faster but needs more area and power