W5 - Disks and IO Flashcards

(87 cards)

1
Q

What hides the complexity of storage devices from us?

A

File systems

2
Q

Is there any guarantee our files will be stored in contiguous locations on the disk?

A

No. Our files are bits of data most likely spread over many different parts of the storage device (e.g. sectors of a HDD).

3
Q

What does Input/Output refer to?

A

The communication between the processor and external devices.

4
Q

What do I/O controllers do?

A

They serve as intermediaries between the processor and I/O devices. They implement buffering to smooth out speed differences between fast CPU operations and slower device operations.

5
Q

How are I/O controllers connected to storage devices?

A

Physically, via wires.

6
Q

How does the CPU interact with I/O controllers?

A

By reading/writing to I/O registers, as if they were memory.

7
Q

How can controllers signal the processor?

A

Via interrupts.

8
Q

What is a DMA transfer?

A

A transfer that moves data between memory and storage without constant CPU involvement. Without it, the CPU would waste cycles waiting for slow devices.

9
Q

L1/L2 vs L3 cache

A

Each core gets its own L1/L2 cache. L3 cache is shared across all cores.

10
Q

What can controllers be thought of as?

A

Little brains with a dedicated purpose.

11
Q

What controller connects the processor and memory?

A

Bridge/memory controller

12
Q

What do controllers follow to enable communication with each other?

A

Protocols.

13
Q

What is a bus?

A

A set of wires for communication among devices plus protocols for data transfer operations.

14
Q

Name some universal bus standards for external connections

A

IDE/ATA, SATA, PCI Express

15
Q

What is a cache in the context of a disk controller?

A

A data buffer which temporarily stores information to improve performance.

16
Q

What role do drivers play in HDD controllers?

A

They can communicate with different disk interfaces.

17
Q

What kind of information will a HDD controller cache?

A

Neighbouring sectors from where the head is reading.
Recently accessed blocks.

18
Q

What are the typical tasks of a HDD controller?

A

Read/write operations
Validation and error correction
Communicating with CPU

19
Q

Describe the set of steps in a typical interaction between OS and an I/O device.

A
  1. The OS uses the status register to detect when the device is NOT BUSY.
  2. The OS writes into the data register and sets the command register.
  3. The controller sets the status to BUSY.
  4. The controller reads the command and data registers, and launches the execution of the command.
  5. Once the command is executed, the controller clears the command register and resets its BUSY status, setting the status to ERROR if needed. The OS detects completion via the status register.
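The handshake above can be sketched in code. This is a toy, single-threaded model: the `Device` class and the register names are purely illustrative (a real driver would touch memory-mapped hardware registers, and the controller would run concurrently).

```python
READY, BUSY, ERROR = 0, 1, 2

class Device:
    """Toy controller with status, command, and data registers."""
    def __init__(self):
        self.status, self.command, self.data = READY, None, None

    def execute(self):
        # Controller side (steps 3-5): set BUSY, run the command,
        # then clear the command register and return to READY.
        self.status = BUSY
        result = f"wrote {self.data!r}"    # pretend to perform the I/O
        self.command, self.status = None, READY
        return result

def os_write(dev, payload):
    # OS side (steps 1-2): wait until NOT BUSY, then program registers.
    while dev.status == BUSY:
        pass
    dev.data, dev.command = payload, "WRITE"
    result = dev.execute()        # controller picks the command up
    assert dev.status == READY    # OS observes completion via status
    return result
```

Because real hardware runs concurrently, the OS would poll the status register (or take an interrupt) rather than call `execute()` directly; the direct call just keeps the sketch single-threaded.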
20
Q

How does Memory Mapped I/O work?

A

Specific addresses in the memory address space are reserved for I/O device registers; regular load/store instructions to those addresses reach the device instead of RAM.

21
Q

What are the two methods that enable the OS to know when an I/O device has completed an operation or encountered an error?

A

I/O Interrupt
Polling

22
Q

How do I/O interrupts work?

A

The device generates an interrupt whenever it needs service.

23
Q

Pros/Cons of I/O interrupts

A

Pro: Handles unpredictable events well
Con: Interrupt has relatively high overhead (saving/loading contexts costs many instructions)

24
Q

How does Polling work?

A

The I/O device puts completion info in a status register. The OS periodically checks the device-specific status register (basically asking “are you done yet?”).

25
Q

Pros/cons of Polling

A

Pro: Low overhead
Con: May waste many cycles on polling if I/O operations are infrequent or unpredictable

26
Q

Are devices restricted to using only one of polling and interrupts?

A

Nope, they can combine both.

27
Q

Why is interrupt-only-driven network data transmission over 10GBit Ethernet problematic?

A

The NIC would need to generate roughly 800k interrupts per second (each interrupt signalling a packet arrival or transmission completion). The CPU would spend more time handling interrupts than doing actual processing, creating an "interrupt storm".

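The ~800k figure can be sanity-checked with a quick back-of-envelope calculation, assuming full-size Ethernet frames and one interrupt per frame:

```python
# A full-size Ethernet frame occupies 1538 bytes on the wire
# (1500 B payload + 38 B of headers, preamble, and inter-frame gap).
link_bits_per_s = 10 * 10**9    # 10 Gbit/s
frame_bits = 1538 * 8
frames_per_s = link_bits_per_s / frame_bits
# ~812,744 frames per second, i.e. ~800k interrupts if each frame
# triggers one interrupt
```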
28
Q

What are the two methods for transferring data to/from controllers?

A

Programmed I/O and Direct Memory Access (DMA)

29
Q

How does Programmed I/O work?

A

The CPU handles every byte of the data transfer directly through processor instructions. Specifically, data is moved byte-by-byte using processor input/output or load/store instructions.

30
Q

Pros/Cons of Programmed I/O

A

Pros: Simple hardware implementation and straightforward programming.
Cons: The CPU must process each byte, consuming processor cycles proportional to the size of the data.

31
Q

How does Direct Memory Access work?

A

Controllers can access main memory without CPU involvement for each byte:
- The controller gets direct access to the memory bus
- It can transfer entire blocks of data to/from memory without CPU intervention for each byte

32
Q

Break down the Direct Memory Access process into steps.

A

1. The CPU initiates the process: the device driver is told to transfer data from the disk to a specific memory buffer (address X)
2. The driver tells the disk controller to transfer C bytes from the disk into address X (the specific memory buffer)
3. The disk controller triggers the DMA process, through the DMA controller.
4. The disk controller sends bytes to the DMA controller.
5. The DMA controller transfers bytes to address X, increasing the memory address and decreasing C (the number of bytes) until C = 0.
6. When C = 0, the DMA controller interrupts the CPU to signal transfer completion.

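Steps 4-6 can be illustrated with a toy model; the function and variable names here are made up, and a real DMA engine works in hardware, one bus cycle at a time:

```python
def dma_transfer(memory, x, disk_bytes):
    """Toy DMA: copy C bytes into memory at address X, counting C down."""
    c = len(disk_bytes)                 # C: number of bytes to transfer
    i = 0
    while c > 0:
        memory[x] = disk_bytes[i]       # write byte to current address...
        x += 1                          # ...then advance the address
        i += 1
        c -= 1                          # ...and count down the bytes left
    # C == 0: signal the CPU that the transfer is done
    return "interrupt: transfer complete"

memory = bytearray(16)                  # pretend main memory
msg = dma_transfer(memory, 4, b"DATA")  # memory[4:8] now holds b"DATA"
```

The point of the model is that the CPU appears nowhere inside the loop; it is only involved when the final "interrupt" is raised.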
33
Q

Buffer vs Cache

A

Buffers temporarily hold data while it is being transferred from one place to another, which is useful for managing data flow between operations/devices working at different speeds. A cache is a high-speed storage area that keeps frequently accessed data for faster retrieval, avoiding the need to fetch it from slower storage or recalculate it.

34
Q

The DMA controller shares the bus with the CPU. What two behaviours do we observe from the DMA controller?

A

1. The DMA controller uses the bus when the processor isn't using it
2. The processor is forced to suspend operation temporarily (cycle stealing)

35
Q

Is cycle stealing problematic?

A

Not really. It slows down the CPU, but not as much as if the CPU were doing the data transfer itself instead of the DMA controller.

36
Q

Compare how occupied the CPU is with DMA vs with programmed I/O

A

With DMA, the CPU is only interrupted at the start and end of the transaction, as opposed to programmed I/O, which occupies the CPU for the entire duration of the read/write operation.

37
Q

What are the two common models for processes doing I/O?

A

- Synchronous, blocking ("wait")
- Asynchronous, non-blocking ("tell me later")

38
Q

How does synchronous, blocking I/O work?

A

The process requests data, and sleeps until the operation is finished.

39
Q

Benefits of synchronous, blocking I/O

A

Simple to write and manage. Easy to reason about.

40
Q

How does asynchronous, non-blocking I/O work?

A

The process requests an I/O operation, and gives a pointer to a buffer for the kernel to fill/take data from. The kernel performs the I/O in the background, and the process continues executing. When done, the kernel notifies the process.

41
Q

What are device drivers?

A

Device-specific code in the kernel that interacts directly with the device hardware.

42
Q

What are the two parts of a typical device driver?

A

A top half (system call path) and a bottom half (interrupt routine)

43
Q

Summarize how the top half of device drivers works.

A

It implements a set of standard, cross-device calls, e.g. open(), close(), read(), write(). This is the kernel's interface to the device driver. It starts I/O operations, and may put the process to sleep until they finish.

44
Q

Summarize how the bottom half of device drivers works.

A

It gets input or transfers the next block of output. It may wake sleeping processes if their I/O is now complete. It is only needed when receiving something from the device.

45
Q

Break down the steps in the life cycle of an I/O request.

A

[USER PROCESS]
1. The user program requests I/O via a system call.
[KERNEL I/O SUBSYSTEM]
2. Check whether the system call can be satisfied immediately, e.g. from the cache. If so, the I/O completes right here; skip to step 8.
3. Send a request to the device driver, and block the process if appropriate.
[DRIVER - TOP HALF]
4. Process the request and issue commands to the device controller. Block until interrupted (I/O done).
[DEVICE HARDWARE]
5. The device controller begins the I/O operation. It monitors the device, and generates an interrupt when done.
[DRIVER - BOTTOM HALF]
6. Receive input: store data into a device-driver buffer if it was input, and signal to unblock the device driver.
[DRIVER - TOP HALF]
7. Determine which I/O completed, and indicate the state change to the I/O subsystem.
[KERNEL I/O SUBSYSTEM]
8. Transfer data to the process, and return a completion or error code.
[USER PROGRAM]
9. I/O completed: input available or output completed.

46
Q

HDD progress over time

A

Capacity, RPM, and data rate all went up. Price/GB, seek time, power, and weight all went down.

47
Q

Imagine a HDD with 8 platters. How many R/W heads are there?

A

One for the top side of a platter + one for the bottom side = 2 heads per platter. 2 x 8 = 16 heads.

48
Q

What is seek time in HDDs?

A

How long it takes to move the head from the current track to the desired track.

49
Q

What is latency in HDDs due to?

A

A combination of factors: seek time, rotational latency, and controller time.

50
Q

HDD: Track

A

Ring around the disk surface

51
Q

HDD: Cylinder

A

A stack of tracks that are the same distance from the spindle. It's the same track across all the platters.

52
Q

HDD: Sector

A

Segment of a track (cake slice)

53
Q

HDD: Spindle

A

Rotates the disks.

54
Q

Cylinder-Head-Sector Tuple

A

A way to divide the HDD into specific locations to look for data: the combination of which cylinder the data is in, which head is involved, and which sector it is. Each CHS tuple can be mapped to an LBA value.

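The CHS-to-LBA mapping mentioned above follows a standard formula; the disk geometry used below (16 heads per cylinder, 63 sectors per track) is just an example:

```python
# Standard CHS-to-LBA mapping (note that sectors are numbered from 1):
#   LBA = (C * heads_per_cylinder + H) * sectors_per_track + (S - 1)
HEADS_PER_CYL, SECTORS_PER_TRACK = 16, 63   # example geometry

def chs_to_lba(c, h, s):
    return (c * HEADS_PER_CYL + h) * SECTORS_PER_TRACK + (s - 1)

assert chs_to_lba(0, 0, 1) == 0        # first sector on the disk
assert chs_to_lba(0, 1, 1) == 63       # first sector under head 1
assert chs_to_lba(1, 0, 1) == 16 * 63  # first sector of cylinder 1
```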
55
Q

HDD: Rotational Delay

A

The time we need to wait for the beginning of the desired sector to rotate beneath the head

56
Q

What do we get latency from when reading a block from a random place on disk?

A

Seek time + rotational delay + transfer time

57
Q

What do we get latency from when reading a block from a random place in the same cylinder?

A

Rotational delay + transfer time

58
Q

What do we get latency from when reading the next block on the same track? (we've just read one block on this exact track)

A

Transfer time

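The three cases above can be put into numbers; the figures below (4 ms average seek, 7200 RPM, 0.1 ms transfer per block) are assumed purely for illustration:

```python
seek_ms = 4.0                        # assumed average seek time
rotation_ms = 60_000 / 7200          # one full rotation at 7200 RPM = ~8.33 ms
rot_delay_ms = rotation_ms / 2       # on average, wait half a rotation
transfer_ms = 0.1                    # assumed per-block transfer time

random_block  = seek_ms + rot_delay_ms + transfer_ms   # ~8.27 ms
same_cylinder = rot_delay_ms + transfer_ms             # ~4.27 ms
next_on_track = transfer_ms                            # 0.1 ms
```

The gap between ~8.27 ms and 0.1 ms is why minimizing seeks and rotational delays matters so much.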
59
Q

What is the key to using the disk effectively to minimize latency?

A

Minimize seek and rotational delays

60
Q

Disk Scheduling: FIFO

A

First In First Out. We serve requests in the order they arrive. This results in long seek times. The most fair type.

61
Q

Disk Scheduling: SSTF

A

Shortest Seek Time First. We go to the request that's closest to where we are on disk. This results in shorter seek times, but waiting time is unbounded, so starvation may occur (if requests keep arriving that are closer to the current track than existing requests).

62
Q

Disk Scheduling: SCAN / Elevator Algorithm

A

Take the closest request in the direction of travel. When we reach the end of the tracks, we reverse direction and satisfy the remaining requests. Waiting time is bounded, so starvation is avoided. It is not fair (biased against the area of the disk that the head has most recently passed).

63
Q

Disk Scheduling: C-SCAN

A

Circular SCAN. Take the closest request in the direction of travel. When we reach the 'end' of the tracks, we return the head to the opposite end of the disk and the scan begins again, skipping any requests on the way back. Fairer than SCAN.

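A minimal sketch comparing total head movement under FIFO, SSTF, and an upward-sweeping SCAN on one example request queue (the track numbers and starting position are made up):

```python
def fifo(start, reqs):
    # Serve requests in arrival order, summing head movement.
    total, pos = 0, start
    for r in reqs:
        total, pos = total + abs(r - pos), r
    return total

def sstf(start, reqs):
    # Always pick the pending request closest to the current position.
    total, pos, pending = 0, start, list(reqs)
    while pending:
        r = min(pending, key=lambda t: abs(t - pos))
        pending.remove(r)
        total, pos = total + abs(r - pos), r
    return total

def scan(start, reqs):
    # Sweep upward serving everything above the head, then sweep back down.
    up = sorted(r for r in reqs if r >= start)
    down = sorted((r for r in reqs if r < start), reverse=True)
    return fifo(start, up + down)

queue = [98, 183, 37, 122, 14, 124, 65, 67]
# Starting at track 53: FIFO moves 640 tracks, SSTF 236, SCAN 299.
```

SSTF wins on this queue, but unlike SCAN it offers no bound on how long an unlucky request can wait.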
64
Q

Principle of locality of reference

A

The tendency of a system to access the same set of memory locations, or nearby locations, repeatedly.

65
Q

The SCAN / elevator algorithm works against locality of reference. What does this mean?

A

We're likely to get read requests for tracks/sectors near current requests, since files are likely to have some of their blocks adjacent to each other (stored contiguously), yet SCAN has just moved the head away from that area.

66
Q

How do you make an anticipatory I/O scheduler?

A

Create a delay after satisfying a read request to see if a new nearby request arrives. We may actually get faster overall I/O from the disk this way.

67
Q

What is an SSD?

A

A device that stores data persistently using integrated circuits, without moving mechanical parts.

68
Q

Describe how SSDs store data

A

They use NAND flash memory, in which each cell is a transistor that stores one or more bits. A trapped electric charge distinguishes between 1 and 0.

69
Q

What are the two types of NAND flash technology in SSDs?

A

Single-Level Cell (SLC)
Multi-Level Cell (MLC)

70
Q

Single-Level Cell

A

Stores 1 bit per cell. Fastest, most durable, also most expensive.

71
Q

Multi-Level Cell

A

Stores 2 or more bits per cell. Cheaper, but slower and with lower endurance.

72
Q

How are SSDs organized internally?

A

Flash chips are organized in banks. Banks are divided into blocks. Blocks are divided into pages.

73
Q

What structure/unit do SSDs offer operations on to the OS?

A

512-byte "sectors".

74
Q

Do SSDs have rotational and seek delay? Why?

A

No. They have no moving parts.

75
Q

How does SSD writing work?

A

You have to erase an entire block of pages before writing a page, because SSDs can only write to empty pages within a block. Erasing a block sets all bits to 1 (data we want to keep is copied back). Writing a page sets some bits to 0.

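A toy model of the erase-before-write rule above, with made-up page and block sizes: erasing sets every byte of every page to 0xFF (all bits 1), and a page may only be programmed while it is still in the erased state.

```python
PAGES_PER_BLOCK, PAGE_SIZE = 4, 8   # illustrative sizes only

def erase_block():
    # Erasing sets all bits to 1, i.e. every byte becomes 0xFF.
    return [bytearray(b"\xff" * PAGE_SIZE) for _ in range(PAGES_PER_BLOCK)]

def write_page(block, page_no, data):
    page = block[page_no]
    if any(b != 0xFF for b in page):
        raise ValueError("page not erased: must erase the whole block first")
    page[: len(data)] = data        # programming only clears bits to 0

block = erase_block()
write_page(block, 0, b"hello")      # fine: page 0 was in the erased state
# write_page(block, 0, b"again")    # would raise: page 0 is dirty now
```

Updating page 0 in place would require copying the live pages out, erasing the whole block, and writing everything back, which is exactly the overhead the controller's pool of pre-erased blocks is meant to hide.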
76
Q

How does the SSD's controller handle the complexity of writing?

A

It maintains a pool of empty blocks ready for writing.

77
Q

Rule of thumb for predicting SSD latency

A

Write latency is roughly 10x read latency.
Erase latency is roughly 10x write latency.

78
Q

How does flash memory wear down in SSDs?

A

Each block can only be programmed and erased a limited number of times. With each erase cycle, electrical charge builds up in the cells. Over time, this makes it too hard to tell 0 and 1 apart.

79
Q

How does the SSD controller combat flash memory wear?

A

It performs wear levelling, spreading write/erase cycles evenly across all blocks. This prevents single blocks from wearing out prematurely.

80
Q

Why do SSDs beat HDDs in random-access I/O?

A

The head movement in a HDD would cause significant delays.

81
Q

In sequential-access I/O, explain the performance difference between HDD and SSD

A

The differences are smaller, since HDD head seeks are minimised. For reads, SSDs are still a few orders of magnitude faster. For writes, it's less clear-cut, as it depends on the wear of the SSD and how full it is.

82
Q

Why are SSDs seen as more reliable compared to HDDs?

A

Lack of mechanical parts.

83
Q

How is SSD lifespan measured?

A

Drive Writes Per Day (DWPD): how many times the full drive capacity can be written per day over the drive's warranty period.

84
Q

Average failure age of SSDs

A

Around 6 years. Life expectancy is 9-11 years.

85
Q

SSD general pros

A

Lightweight, low power, silent, shock-insensitive.

86
Q

How do we go from a physical address to a logical address using a CHS value?

A

Convert it to an LBA index.