File System Flashcards

Question

* Where should we keep the file offset?
* What if two processes want to read from the same file, each starting from a different offset?
* What if I want a couple of threads to write to the same file one after the other?

Answer 1

* We keep the offset within the *file struct*.
* Two file structs may refer to the same i-node but hold different offsets, so there is no problem reading different parts of the data.
* For writing/reading together, **Unix** provides the following solution:
  * A parent's and a child's file descriptor table entries point to the **same** file struct, thus sharing the same offset and flags.
  * Unrelated processes point to **different** file structs and thus have different offsets.
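The difference between sharing a file struct and having separate ones can be observed from user space. The sketch below is only an illustration under assumptions: it uses `os.dup` as a stand-in for the parent/child case (a duplicated descriptor refers to the same file struct), and a second `os.open` as a stand-in for an unrelated process. The helper name `offsets_demo` is made up for this example.

```python
import os
import tempfile


def offsets_demo():
    """Return (offset seen through a dup'ed fd, offset seen through a separately opened fd)."""
    fd_tmp, path = tempfile.mkstemp()
    try:
        os.write(fd_tmp, b"0123456789")
        os.close(fd_tmp)

        fd_a = os.open(path, os.O_RDONLY)    # first file struct
        fd_dup = os.dup(fd_a)                # second descriptor, SAME file struct
        fd_b = os.open(path, os.O_RDONLY)    # separate open: a DIFFERENT file struct

        os.read(fd_a, 4)                     # advances the offset stored in fd_a's file struct

        shared = os.lseek(fd_dup, 0, os.SEEK_CUR)      # follows fd_a
        independent = os.lseek(fd_b, 0, os.SEEK_CUR)   # unaffected by fd_a

        for fd in (fd_a, fd_dup, fd_b):
            os.close(fd)
        return shared, independent
    finally:
        os.remove(path)


shared_offset, independent_offset = offsets_demo()
```

After reading 4 bytes through `fd_a`, the duplicated descriptor reports offset 4, while the separately opened one is still at 0.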

Answer 2

**No.** Each process has its own *file descriptor table*, whose entries point to entries in the **kernel's** *open file description table.* *The entries of the kernel's open file description table point to the i-node of the file.*

Answer 3

It adds an entry to both the *kernel's open file description table* and the *process's* file descriptor table.

Answer 4

*A simple directory (like in MS-DOS):*
* *fixed-size entries*
* *disk addresses and attributes in the directory entry*

*Alternatively, directory entries simply point to i-nodes (as in Unix).*

Answer 5

*In-line* means the content is included in the file itself. That is, there is no pointer to some outside location that holds the content; it is "physically" here.

Answer 6

Implemented using struct *dirent*, which holds:
* *name*
* *inode array*

Answer 7

*_Pros:_*
* *Typically a multiple of the sector size*
* *Because its size requires division*
* *In sequential access, fewer blocks to read/write: fewer seeks/searches.*

*_Cons:_*
* *High internal fragmentation*
* *In random access: larger transfer time, larger memory buffers.*

Answer 8

*_Pros:_*
* Smaller internal fragmentation
* Faster random access

*_Cons:_*
* Slower sequential access (*more seeks*)

Answer 9

* A - Average seek time
  * Average time for the head to get above a cylinder.
* B - Average time to get to the block within the track.
* C - Rotation time
  * Time for the disk to complete a full rotation.
* D - Transfer time
  * (block size / track size) \* rotation time

*Average time to access a block is given by:* **A + B + D.**
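As a worked example of A + B + D, here is a minimal sketch. It assumes that B (the average time to reach the block within the track) is half a rotation, which the card does not state explicitly; the function name and the sample disk parameters are invented for illustration.

```python
def average_access_time_ms(seek_ms, rotation_ms, block_bytes, track_bytes):
    """Average block access time A + B + D, in milliseconds."""
    a = seek_ms                                      # A: average seek time
    b = rotation_ms / 2                              # B: ASSUMED to be half a rotation on average
    d = (block_bytes / track_bytes) * rotation_ms    # D: transfer time
    return a + b + d


# Hypothetical disk: 8 ms seek, 8 ms per rotation, 4 KiB blocks, 512 KiB tracks.
example_ms = average_access_time_ms(
    seek_ms=8.0, rotation_ms=8.0, block_bytes=4096, track_bytes=524288
)
```

With these made-up numbers the transfer time (0.0625 ms) is tiny compared to the seek and rotational delays, which is why fewer, larger sequential transfers are cheaper per byte.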

Answer 10

***As a linked list of blocks***
* *Addresses of the first n free blocks are stored in the _super block._*
* *The first n-1 blocks are free to be assigned.*
* *The last address points to a block that contains n more addresses of free blocks.*
* *Addresses of many free blocks are retrieved with one disk access.*
* *Unix maintains a single block of free-block addresses in memory.*
* *Whenever the last free block is reached, the next block of free-block addresses is read and used.*

***As a bit map***
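The linked-list scheme above can be sketched as follows. This is a toy model, not the actual Unix code: the `disk` dictionary, the class name `FreeList`, and the choice to hand out the link block itself once its contents have been copied into memory are all assumptions made for the example.

```python
class FreeList:
    """Toy free-block list: an in-memory batch of addresses, chained through link blocks."""

    def __init__(self, disk, first_batch):
        self.disk = disk                 # hypothetical: link block number -> next batch of addresses
        self.cache = list(first_batch)   # the single in-memory block of free addresses
        self.disk_reads = 0

    def allocate(self):
        if not self.cache:
            raise RuntimeError("no free blocks left")
        if len(self.cache) > 1:
            return self.cache.pop(0)     # one of the first n-1 addresses: no disk access
        link = self.cache.pop(0)         # last address: the block holding the next batch
        if link in self.disk:
            self.cache = list(self.disk[link])   # one disk read fetches many addresses
            self.disk_reads += 1
        return link                      # ASSUMPTION: the link block is then handed out too


# Hypothetical layout: block 13 holds the second batch, block 30 the third.
disk = {13: [20, 21, 22, 30], 30: [40, 41]}
free_list = FreeList(disk, first_batch=[10, 11, 12, 13])
allocated = [free_list.allocate() for _ in range(10)]
```

Ten allocations here cost only two disk reads, which illustrates the point that many free-block addresses are retrieved with a single access.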

Answer 11

* *Both counts 0 - the block is missing*
  * *Add it to the free list*
* *More than once in the free list*
  * *Delete all references but one*
* *More than once in files - **trouble***
  * *Two or more different files read/write the same place. Which one should be deleted? This is a problem.*
* *In both a file and the free list*
  * *Delete from the free list.*

***_Hard links may solve the problem_***
* *count - the number of references to each i-node, found by _descending_ down the file system tree.*
* *Compare the count with the link-count field in the i-node struct.*
* *If they differ, correct the link-count field.*
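The block checks above amount to keeping two counters per block (appearances in files and appearances in the free list) and comparing them. The following is a rough sketch under assumed data structures: the function name, the `file_blocks` mapping, and the report strings are invented for illustration.

```python
from collections import Counter


def check_blocks(total_blocks, file_blocks, free_list):
    """Return {block: diagnosis} for every inconsistent block."""
    in_use = Counter(b for blocks in file_blocks.values() for b in blocks)
    free = Counter(free_list)
    report = {}
    for block in range(total_blocks):
        used, spare = in_use[block], free[block]
        if used == 0 and spare == 0:
            report[block] = "missing: add to free list"
        elif used == 0 and spare > 1:
            report[block] = "duplicate in free list: keep one reference"
        elif used > 1:
            report[block] = "shared by files: trouble"
        elif used == 1 and spare >= 1:
            report[block] = "in file and free list: delete from free list"
        # used == 1, spare == 0  or  used == 0, spare == 1: consistent, nothing to report
    return report


# Hypothetical 6-block disk with one instance of each problem.
report = check_blocks(
    total_blocks=6,
    file_blocks={"a": [0, 1], "b": [1, 2]},
    free_list=[2, 3, 3, 5],
)
```

Blocks 0 and 5 are consistent and do not appear in the report; blocks 1 to 4 each show one of the four cases.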

Answer 12

_*Link-count \>* actual number of links_
* The file will not be removed.
  * Even after removing all actual links, there is a fake one which tells the kernel to avoid removal.
  * A waste of space.

_*Link-count \<* actual number of links_
* The file will be removed even though there is still a link pointing to it. This is a much greater problem.

Answer 13

**Using *filename* caching**

Answer 14

It is a pool of internal **data buffers**, the buffer cache, maintained by the *kernel*.

_Important notes_
* Data *written to disk* is cached for later use.
  * Algorithms instruct the buffer to _delay-write._
* If data is not found in the cache, it is read from disk into the cache.
  * Algorithms instruct the buffer to _pre-cache._

Answer 15

*Buffers are categorized into:*
* *Busy - currently being used.*
* *Clean - available for use, in sync with the block content on disk.*
* *Free - empty and not used yet.*
* *Dirty - needs to be moved to the write list.*
* *Each has a header that includes the pair .*

*The buffer cache is implemented using an **LRU list:***
* *Buffers are on a doubly-linked list in LRU order.*
* *Each hash-queue entry points to a linked list of buffers that have the same hash value.*
* *A block may be in only **one** hash-queue.*
* *A **free block** is on the free list (maintained by the kernel, remember?) _in addition_ to being on a hash-queue.*
* *When looking for a particular block, its hash-queue is searched. When a new block is needed, it is removed from the free list.*

Answer 16

1. *Found in its hash queue and is free*
   * *_The buffer is marked busy._*
   * *_The buffer is removed from the free list._*
2. *Found in its hash queue and is busy*
   * *_The process sleeps until the buffer is freed, then **rechecks**._*
3. *Not found in the hash queue and there are free buffers*
   * *_A free buffer is allocated from the pool._*
4. *Not found in the hash queue and, while searching the free list for a free buffer, one or more "delayed-write" buffers are found*
   * *_Write the delayed-write buffer(s) to disk._*
   * *_Move them to the most recently used side of the list and find a free buffer._*
5. *Not found in the hash queue and the free list is empty*
   * *_Block the requesting process; when it is scheduled again, it goes through the hash queue again._*
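The five scenarios can be traced with a small simulation. This is a simplified sketch, not kernel code: a dictionary stands in for the hash queues, the delayed write is performed synchronously, sleeping is modelled by returning `None` so the caller can retry, and the names `Buffer`, `BufferCache`, `getblk` and `brelse` are chosen for the example.

```python
class Buffer:
    def __init__(self, ident):
        self.ident = ident
        self.block = None            # disk block currently held, if any
        self.busy = False
        self.delayed_write = False


class BufferCache:
    def __init__(self, size):
        self.free_list = [Buffer(i) for i in range(size)]   # index 0 = least recently used
        self.hash_queue = {}         # block number -> buffer (stands in for the hash queues)
        self.written = []            # blocks flushed to "disk", kept for illustration

    def getblk(self, block):
        """Return (buffer, scenario). buffer is None when the caller must sleep and retry."""
        buf = self.hash_queue.get(block)
        if buf is not None:
            if buf.busy:
                return None, 2                   # scenario 2: sleep, then recheck
            self.free_list.remove(buf)           # scenario 1: take it off the free list
            buf.busy = True
            return buf, 1
        if not self.free_list:
            return None, 5                       # scenario 5: nothing free, sleep
        scenario = 3
        while True:
            buf = self.free_list.pop(0)          # least recently used buffer
            if buf.delayed_write:
                self.written.append(buf.block)   # scenario 4: write it out first
                buf.delayed_write = False
                self.free_list.append(buf)       # move to the most recently used side
                scenario = 4
                continue
            if buf.block is not None:
                del self.hash_queue[buf.block]   # leave the old hash queue
            buf.block = block
            buf.busy = True
            self.hash_queue[block] = buf
            return buf, scenario

    def brelse(self, buf, dirty=False):
        """Release a buffer: it goes to the most recently used end of the free list."""
        buf.busy = False
        buf.delayed_write = buf.delayed_write or dirty
        self.free_list.append(buf)


cache = BufferCache(size=2)
trace = []

b7, s = cache.getblk(7)        # not cached, free buffer available
trace.append(s)
_, s = cache.getblk(7)         # cached but busy
trace.append(s)
cache.brelse(b7, dirty=True)
b7, s = cache.getblk(7)        # cached and free
trace.append(s)
cache.brelse(b7)
b8, s = cache.getblk(8)        # not cached, free buffer available
trace.append(s)
b9, s = cache.getblk(9)        # only a delayed-write buffer is on the free list
trace.append(s)
_, s = cache.getblk(10)        # free list is empty
trace.append(s)
```

With a two-buffer cache, this sequence of requests passes through scenarios 3, 2, 1, 3, 4 and 5 in that order, and block 7 is written out when its delayed-write buffer is reclaimed.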

Answer 17

Because while it was waiting, some other process **C** might have gotten the buffer and loaded it with another block **c.**

Answer 18

* Some blocks should be written as quickly as possible.
  * Insert critical blocks at the head of the queue, to be replaced soon and written to disk.
  * Have a system daemon that calls *sync* every 30 seconds, to help in updating blocks.
* Some blocks are likely to be used again (partly filled blocks being written).
  * Insert them at the end so they stay longer in the cache.

Answer 19

***NTFS:***
* A journaling file system: a file system that keeps track of changes not yet committed to the file system's main part by *_recording_* the intentions of such changes.
* Each file is represented by a record in a special file called the *master file table (MFT).*

Answer 20

MFT - *Master File Table*
* Each file/directory has one or more **1K records** in the *MFT.*
  * A record contains file attributes and a list of block numbers.
  * Larger files need more than one MFT record for the list of blocks; records are **extended** by pointing to other records.
* Disk blocks are described by a sequence of records, each of which is a series of *runs.*
  * *A run* is a contiguous sequence of blocks, represented by a pair: (offset, length).
* If the file is very small, its data can be kept *directly* in the record.
* The boot sector contains the MFT address.
* NTFS tries to allocate blocks contiguously.
  * That is why a *run* is a contiguous sequence of blocks.
* No upper bound on file size.

Answer 21

**Yes.** **Assume files A and B, where B is longer.**

A - most of its blocks are not sequential.
* A few *run* structs are needed.
* Might need a few MFT records.

B - all of its blocks are sequential.
* Only one *run* struct is needed.
* Only one MFT record is needed.
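To see why the longer file can need fewer runs, the sketch below collapses a list of block numbers into (offset, length) pairs, treating "offset" as the first disk block of the run. The function name `to_runs` and the block numbers are invented for the example; real NTFS run encoding is more compact than this.

```python
def to_runs(blocks):
    """Collapse an ordered list of block numbers into (offset, length) runs."""
    runs = []
    for block in blocks:
        if runs and runs[-1][0] + runs[-1][1] == block:
            start, length = runs[-1]
            runs[-1] = (start, length + 1)   # block extends the current run
        else:
            runs.append((block, 1))          # block starts a new run
    return runs


# Hypothetical files: A is short but scattered, B is long but contiguous.
runs_a = to_runs([10, 11, 50, 51, 52, 90])
runs_b = to_runs(list(range(200, 300)))
```

File A has 6 blocks and needs three runs, while file B has 100 blocks and needs a single run.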

Answer 22

*With small directories:*
* *Standard info*
* *A directory entry contains the MFT index for the file, the length of the file's name, the file's name, and various fields and flags.*

*With large directories:*
* *Organized as B+ trees*

*Supports transparent file compression*
* *Compresses in groups of 16 blocks.*
* *Can select to compress a whole volume, specific directories, or files.*

Answer 23

**Disk throughput is increased**
* *Because the size is smaller.*

**CPU works harder**
* Because it needs to compress/decompress.

**Access to RAM is slowed down**

Answer 24

*_DFS - Distributed File Systems_*
* A collection of interconnected machines that **do not share memory or a clock.**
* We can use a *file naming scheme* (**{hostname:path}**).
  * This method is not **transparent**.
  * We want the kernel to hide the details.
* This can be solved using _mount_.
  * *Mounting* is a process by which the OS makes files and directories on a storage device available for users to access via the computer's file system.
* A server exports part of its file system.
  * We don't want clients to be exposed to all of the server's data.
* *Stateless protocol* - the server remembers nothing about previous requests from a client.
* *Stateful protocol* - more efficient (information is kept in the server's kernel).
* _Client and server can be different machines running different OSs._
* _Any machine can be both a client and a server._
* _Clients access directories by *mounting*._
* *_Remote mounts are invisible._*
  * *_Presumably this refers to A requesting from B, which in turn requests from C._*
* *_Servers specify their exported directories upon boot, listed in /etc/exports._*
* *_File sharing: accessing a file in a directory mounted by other (sharing) clients._*

Answer 25

NFS (v3) protocols
* *Send:* {host-name; target-directory path}; *get:* a *file handle*.
* *A file handle contains:*
  * *File system type*
  * *Disk ID*
  * *i-node number*
  * *Protection information*
* *File operations - directory and file access:*
  * *lookup* provides a *file handle*.
  * *reads* and *writes* have all the needed information by using file handles; offsets are absolute.
* No *open* or *close* calls.
* The server **does not** keep open-file tables.
* Files cannot be locked.
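The stateless design can be illustrated with a toy server: because every read carries the file handle and an absolute offset, the server needs no open-file table. This is not the NFS wire protocol; `StatelessServer`, the `FileHandle` tuple, the file contents and the handle field values are all made up for the sketch.

```python
from collections import namedtuple

# Fields follow the card's list; the concrete values used below are placeholders.
FileHandle = namedtuple("FileHandle", "fs_type disk_id inode protection")


class StatelessServer:
    """Toy server that keeps no per-client state between requests."""

    def __init__(self, files):
        self.files = files                                   # path -> (i-node number, data)
        self.by_inode = {ino: data for ino, data in files.values()}

    def lookup(self, path):
        inode, _ = self.files[path]
        return FileHandle("ufs", 0, inode, 0o644)

    def read(self, handle, offset, count):
        # Everything needed is in the request: no open(), no server-side offset.
        return self.by_inode[handle.inode][offset:offset + count]


server = StatelessServer({"/export/notes.txt": (42, b"hello, nfs world")})
handle = server.lookup("/export/notes.txt")
first = server.read(handle, 0, 5)
second = server.read(handle, 7, 3)
repeat = server.read(handle, 0, 5)    # repeating a request is harmless
```

Since the offset is absolute, a client that retries a request after a timeout gets the same bytes back, which is one reason the protocol tolerates server restarts.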

Answer 26

*Remote Procedure Calls (RPC)*
* How does it activate code on a remote machine?
  * Using a send/receive protocol.
* A process on one machine calls a procedure on another machine. What must be maintained?
  * Synchronous(!) - *blocking* send.
* Problems?
  * Different address spaces.
  * Parameters and results have to be passed, but they refer to addresses the client/server don't have.
  * Machines can crash...
* General scheme
  * A client *stub* and a server *stub*.
  * A stub is a piece of code that converts parameters passed between *client* and *server*.
  * The server *stub* is blocked on receive.
  * The client *stub* blocks waiting for the returning message.
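The stub scheme can be sketched in a single process. In this illustration the "network" is just a function call, JSON is used for converting parameters (an arbitrary choice, not something the card specifies), and the names `client_stub`, `server_stub` and `PROCEDURES` are hypothetical.

```python
import json


def add(a, b):
    """The actual procedure, which lives on the 'server' side."""
    return a + b


PROCEDURES = {"add": add}


def server_stub(request_bytes):
    """Unpack the request, call the local procedure, pack the result."""
    request = json.loads(request_bytes.decode("utf-8"))
    result = PROCEDURES[request["proc"]](*request["args"])
    return json.dumps({"result": result}).encode("utf-8")


def client_stub(proc, *args, transport=server_stub):
    """Pack the parameters by value, send them, and block until the reply returns."""
    request = json.dumps({"proc": proc, "args": list(args)}).encode("utf-8")
    reply = transport(request)        # stands in for a blocking send/receive
    return json.loads(reply.decode("utf-8"))["result"]


remote_sum = client_stub("add", 2, 3)
```

Parameters travel as values inside the message rather than as addresses, which sidesteps the different-address-spaces problem for simple arguments; crash handling is not modelled here.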

Answer 27

*NFS* is a distributed file system protocol allowing a user on a client computer to access files over a _computer network_ much like local storage is accessed.

Answer 28

***_NFS implementation:_***
* Client and server have:
  * A Virtual File System (VFS) layer.
    * Keeps *v-nodes* for *open files.*
    * *v-node* = {*i-node* | *r-node*}
  * An *NFS* module.
    * Used for remote requests.
    * The client creates an *r-node* in its internal tables and stores the *file handle.*

Answer 29

*_NFS mounting_*
* The mount program is invoked manually by the admin.
  * Input: {remote target | local dir pathname}
  * Remote target = {remote server name | dir path name}
  * *_The local dir is where the file will stay in the **client's** VFS._*
* The server sends a *file handle* representing the directory.
* The kernel constructs a *v-node* for the remote directory.
* The client (NFS) creates an r-node for the remote directory.

_Opening a remote file_
* The client kernel parses the path name to reach the desired directory.
* It finds a pointer to the *r-node* in the *v-node*.
* It asks the *NFS* client to open the file.
* The *NFS* client looks up the last name of the path name and gets back a *file handle.*
* The *NFS* client creates an *r-node* entry for the open file and stores the *file handle* in it; the VFS creates a *v-node* pointing to the *r-node*.
* The calling process is given a file descriptor in return, pointing to the *v-node.*
* Any subsequent calls on this file descriptor will be traced by the VFS to the *r-node*, and the suitable *read/write* operations will be performed.

Answer 30

_obsolete value:_ *A value which is not in use anymore.*

_read-ahead:_ *Reading data in advance so that it is immediately available when requested.*

Answer 31

*_NFS: performance issues_*
* Data is sent in 8KB chunks.
  * Not necessarily efficient.
* *Read-ahead* might not be coherent.
* Client caching is important for efficiency.
* If the policy is **not** *write-through*, NFS is exposed to problems with coherency and data loss.
* Cached blocks are discarded:
  * Data blocks after 3 seconds.
  * Directory blocks after 30 seconds.
* Every 30 seconds all dirty cached blocks are written.
* Check with the server whenever a cached file is opened.
  * If it is not in use, discard it from the cache.


(56 cards)