C191-Terms-Chapter-10 Flashcards

1
Q

file system (FS)

A

an integral part of every OS, whose function is to implement the concept of files.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

file

A

a named collection of information managed on secondary storage by the FS. The user can view a file as a single abstract unit of data storage and access the file’s contents using high-level operations provided by the FS’s interface, without knowing the specific characteristics of the underlying storage devices. Depending on the FS, a file can be viewed as an unstructured stream of bytes or a series of records.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

record

A

a structure of related data items, possibly of different data types, identified within a file by a record number or a unique key field. Ex: A student ID would uniquely distinguish all student records in a file.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

access method

A

a set of operations provided by the OS as part of the user interface to access files. The most common access method is sequential.

The FS maintains the current position within the file and each read/write operation accesses the next n bytes or the next record of the file. Some OSs (Ex: most IBM systems) support also direct access methods, where a record can be accessed directly by specifying a record number or a key value.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Metadata

A

information about the format and organization of a file’s data and is generally stored in a file header

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

file header

A

a portion of the file preceding the actual data and is visible to only the FS itself.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

magic number

A

a short sequence of characters at the start of the file header, which identifies the file type. The file type, in turn, determines which programs are allowed to access and interpret the file. Ex: The OS will allow only properly compiled and linked binary files to be loaded into memory for execution.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

file extension

A

Another way to represent file types is to use file extensions. A file extension is a sequence of one or more characters following the file name.

A file extension, unlike a magic number, is not hidden within the file header and thus can conveniently be examined by the user. The drawback is that a file extension can easily be changed without changing the file’s type.

Thus file extensions support only a weak form of file typing by providing convenient hints about a file’s type but do not rigorously enforce which operations are valid for a given file type.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

file directory (or folder)

A

a special-purpose file that records information about other files and possibly other directories. The directory consists of a set of entries, each containing the name of a file followed by other information necessary to access the file.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

tree-structured directory hierarchy

A

An approach to organizing the individual file directories into a global directory structure.

A tree-structured directory hierarchy is a collection of directories organized such that (1) every directory points to zero or more files or directories at the next lower level, and (2) every file and directory except the root is pointed to by exactly one parent directory at the next higher level.

The root of a tree-structured directory hierarchy is the highest level directory, which does not have a parent directory.

Every file and every directory has a unique ID, assigned by the FS at the time of creation. This ID is used only by the FS for internal management purposes and is not visible to the users.

Instead, users assign arbitrary ASCII names to files and directories, which become part of path names.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

absolute path name

A

An absolute path name of a file, uniquely identified by an internal ID, f, is the concatenation of the directory and file names leading from the root to the file f. The individual names are separated by an agreed-upon delimiter, typically a forward slash or a backslash.

To avoid using long path names, the FS allows the user to designate one directory as the current working directory.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

relative path name

A

a concatenation of file names starting with the current directory.

To be able to navigate the hierarchy both up and down, the FS provides a special convention, generally “..”, to refer to a parent directory.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

Operations on directories

A

The FS provides a set of operations to allow the user to manipulate and use the directory hierarchy. The operations include:

Change: change the current working directory to another directory specified by a path name.

Create: create a new named directory and a new entry in the current working directory, which points to the new directory.

Delete: delete directory d specified by a path name. Delete also all files and directories reachable using any path name starting from d.

Move: move a file or directory from one directory to another.

Rename: change the name of a file or directory specified by a path name.

List: list the file names and optionally other attributes of all files contained in a directory specified by a path name.

Find: find a file or directory specified by a name.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

Main drawback of a tree-structured directory hierarchy

A

is that file sharing is asymmetric. Only one directory can be the parent of any file or another directory. Other users must refer to the file by first navigating up to a common directory and then down to the desired location.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

directed acyclic directory hierarchy

A

organizes directories such that any directory at a given level may point to zero or more files or other directories at lower levels but also permits any file or directory to have more than one parent directory.

To permit the creation of multiple parents, the FS provides an additional operation of the form “link path1 path2”, which creates a new link from the directory identified by the path name path1 to the directory or file identified by path2.

Since any file, including any directory, can now be pointed to by more than one directory, the semantics of file deletion must be extended.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

reference count

A

a non-negative integer associated with a file f, which indicates how many directories are pointing to the file. The file is deleted only when the reference count is 1, indicating that f has only a single parent directory.

Otherwise only the pointer to f is deleted from the parent directory and the reference count of f is decremented.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

A general graph structure with symbolic links

A

If a directory is allowed to point to a file or another directory at a higher level, then a cycle can form in the hierarchy.

A cycle, if undetected, can lead to an infinite loop in algorithms that search the hierarchy for a given file. File deletion also becomes more difficult.

A reference count is not sufficient to prevent the creation of unreachable subgraphs in the directory hierarchy, which can only be removed using a garbage collection algorithm.

A compromise solution is to allow only a single parent directory for any file or directory but to provide symbolic links to support file sharing.

A symbolic link (or shortcut) is a directory entry that points to a file or directory just like a regular entry but is treated differently with respect to deletion. A delete operation only removes the link but not the file itself.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

symbolic link (or shortcut)

A

a directory entry that points to a file or directory just like a regular entry but is treated differently with respect to deletion. A delete operation only removes the link but not the file itself.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

Contents of file directories

A

The typical file attributes include:

Size: The current size in bytes or words.

Type: Information to differentiate directories, regular files, executable files, and other types of files supported by the system.

Location: Information necessary to locate the file’s physical blocks on disk.

Protection: Information about who can access the file and the permitted type of access (Ex: read only or execute only).

Use: The date and time of file creation, last access, or last modification.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

Where to keep file attributes?

A

One approach is to keep all attributes together with the file name inside each file directory.

The main drawback is that file directories can become very long, which slows down the search.

Another approach is to keep only the file name inside the directory and provide a pointer to a separate data structure to store the file attributes.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
21
Q

File control block (FCB)

A

a data structure associated with a filename that contains all relevant attributes of the file. FCBs are stored apart from file directories and are pointed to by the corresponding directory entries. In Unix and other OSs an FCB is called an i-node.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
22
Q

Internal structure of file directories

A

The FS must be able to efficiently (1) search directories for a file name, (2) find and allocate an entry when a new file is created, (3) free an entry when a file is deleted, and (4) change the file name, which may shorten or lengthen the name length. Several approaches exist to organizing directory entries.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
23
Q

Number 1 approach to organizing directory entries

A

The directory is an array with fixed-length entries. Each entry contains a valid file name or is free.

Finding, freeing, or allocating entries is simple because all have the same size.

Main drawback: All file names have a maximum length.

24
Q

Number 2 approach to organizing directory entries

A

The directory is an array with variable-length entries. Each entry contains a length field and either a valid file name or is free.

Searching for a file name or a hole is accomplished by following the length fields.

Deleting a file requires coalescing the newly free entry with any free neighboring entries to prevent fragmentation.

Changing the file name length requires freeing the current entry and allocating a new entry to reflect the new name length.

25
Q

Number 3 approach to organizing directory entries

A

The directory is an array with variable-length entries. Each entry contains a valid file name, followed by a possibly empty hole. Each entry also contains two pointers, one pointing to the end of the entry and the second to the end of the file name.

Finding, freeing, or allocating entries is analogous to implementation 2.

The main advantage is that, in most cases, changing the file name length only requires adjusting the second pointer.

A new entry must be allocated only when the new length exceeds the end of the hole.

26
Q

create file operation

A

causes the creation of new named file. As part of the create operation, the FS allocates and initializes a new File control block (FCB). The FS also allocates a free directory entry, which contains the file’s name and a pointer to the FCB.

27
Q

destroy file operation

A

removes an existing file from the FS. The operation reverses the process of creation by freeing up the file’s FCB and directory entry.

If the file is not empty, then the location information in the FCB points to blocks on the disk, where the file’s contents are stored. The FS marks the file’s disk blocks as free.

28
Q

Open a file using open file table (OTF)

A

In order to read, write, or otherwise manipulate a file, the corresponding operations need to retrieve information about the file’s length and location on the disk.

To avoid accessing the information repeatedly and to keep track of the file’s current state, the FS maintains all relevant information in an open file table kept in main memory.

The open file table (OFT) is a data structure that keeps track of all files currently in use to facilitate efficient access to and manipulation of the files.

29
Q

open file operation

A

prepares a file for efficient access and manipulation by retrieving relevant file information from the FCB and storing the information in an entry of the OFT. Subsequent accesses to the file then occur via an index into the OFT.

30
Q

The steps performed by the open operation include:

A

Verify that the invoking process has the right to access the file and perform the requested operation.

Find and allocate a free entry in the OFT.

Allocate read/write buffers as necessary for the given type of file access.

Copy relevant information from the FCB to the OFT entry, including the file length and location on the disk.

Initialize other information in the OFT entry, such as the initial position of a sequentially accessed file.

Return the index of the allocated OFT entry to the calling process for subsequent access to the file.

31
Q

read file operation

A

copies data from an open file to a specified area in main memory. The data may be accessed either directly, one record at a time, or sequentially, by specifying the number of bytes to be read next. A generic sequential read instruction has the form:

read i, m, n

where i is an OFT index corresponding to an open file, n specifies the number of bytes to be read, and m is a location in memory (ex: an array) where the data is to be stored.

The contents of a file are kept on disk, which can only be accessed one block at a time. To avoid the overhead of reading an entire disk block for each read operation, the FS maintains a current disk block in a r/w buffer and copies the specified numbers of bytes from the buffer to the memory area m.

Only when the end of the buffer is reached does the FS access the disk to get the next disk block, which replaces the previous disk block in the buffer.

32
Q

write file operation

A

copies data from an area in main memory to a specified open file. A generic sequential write operation has the form:

write i, m, n

where i is an OFT index corresponding to an open file, n specifies the number of bytes to write, and m is the starting location in memory (ex: the name of an array) from which the data is to be copied.

Analogous to the read operation, the write operation also uses the r/w buffer in the OFT. The operation starts copying the specified number of bytes from the memory area to the r/w buffer.

When the desired number of bytes are copied, the operation terminates. If the end of the buffer is reached before all bytes are copied, the operation allocates a new disk block and enters the block number into the FCB.

The operation then stores the current r/w buffer in the new block and continues copying the remaining bytes from memory to the now empty buffer.

33
Q

seek operation

A

moves the current position of an open file to a new specified position. A generic seek operation has the form:

seek i, k

where i is an OFT index corresponding to an open file and k specifies the new position within the file.

The operation sets the current position in the OFT entry to k. In addition, if k is not within the block currently held in the r/w buffer, the operation must save the r/w buffer to disk and load the new target block.

The block containing position k is determined by dividing k by the block size. The offset within the buffer, corresponding to the position k, is determined by taking k modulo the buffer size.

34
Q

close file operation

A

reverses the effects of the open operation by saving the current state of the file in the FCB and freeing the OFT entry.

The steps performed by the close operation include:

If the current content of the r/w buffer is modified, then save the buffer in the corresponding block on the disk.

Update the file length in the FCB.

Free the OFT entry.

Return the status of the close operation to the calling process.

35
Q

disk block

A

a fixed sequence of bytes on the disk, which can only be accessed as a single unit using low-level read-block and write-block operations.

The FS views the disk as a sequence of blocks, numbered consecutively 0 through D - 1, where D is the total number of blocks on the disk.

A file is stored on the disk in units of disk blocks. Thus a file may be viewed as a series of 1 or more blocks, numbered 0 through f - 1, where f is the number of blocks constituting the file.

The way disk blocks are allocated to file blocks has a major impact on the efficiency of most file operations.

36
Q

contiguous block allocation scheme

A

every file is mapped into a contiguous sequence of disk blocks. The FCB points to the first disk block. The file length, also maintained in the FCB, determines how many blocks are occupied by the file.

The main advantages are:

Fast sequential access because neighboring blocks do not require any seek operations.

Fast direct access because a target block number can be computed using the file position and the block length.

The main drawbacks are:
Disk fragmentation, in that over time, the disk consists of variable-length sequences of free and occupied blocks and requires search algorithms to find free areas of appropriate size for a given file.

Inability to expand a file if the block following the last file block is not free. The entire file must be copied into a larger area.

Difficulty in deciding how much space to allocate to a new file to allow for potential expansion.

37
Q

linked block allocation scheme

A

the blocks containing a file may be scattered throughout the disk. The FCB points to the first block and each block points to the logically next block.

The main advantage is:
Easy expandability of a file by linking additional blocks to the last block.

The main drawbacks are:
Slower sequential access than with a contiguous allocation, since each block access requires a seek operation to a different position on the disk.

Inability to perform direct access since the location of the target block cannot be computed. To access any block requires following the pointers and reading all blocks preceding the target block.

Decreased reliability of the disk. If a disk block becomes physically damaged, the rest of the file cannot be found.

Considerable waste of disk space since every disk block must include a pointer to the next block.

38
Q

clustered block allocation scheme

A

links together sequences of contiguous blocks. The last block of any cluster points to the beginning of the logically next cluster.

The last block also records the number of blocks belonging to the next cluster to facilitate direct access within each cluster. The number of blocks in the first cluster is recorded in the FCB along with the pointer.

The approach is a compromise between the contiguous and linked allocations.

39
Q

advantages and disadvantages of a clustered block allocation scheme.

A

The main advantages are:
Easy expandability of a file. If the block following the last file block is free, than the last cluster is extended as with the contiguous allocation. If the block following the last file block is occupied, then a new cluster is started and the last block points to the first block of the new cluster.

A reduced number of pointers since only the last block of a cluster contains a pointer.

Improved sequential access over the purely linked allocation since accessing blocks within a cluster does not require any seek operations.

The main drawbacks are:
Slower sequential access than with contiguous allocation since each cluster requires a seek operation to a different position on the disk.

Inability to perform direct access since the location of the target block cannot be computed. To access any block requires following the pointers until the cluster containing the target block is reached.

40
Q

File allocation table (FAT)

A

an array where each entry corresponds to a disk block. The FAT keeps track of which disk blocks belong to a file by linking the blocks in a chain of indices.

A file’s FCB contains the index of the first file block, which in turn contains the index of the next block, etc.

The FAT provides all the advantages of a linked allocation, including the ability to easily expand a file. Since the FAT resides in a small number of designated disk blocks, the blocks can be cached in main memory.

Consequently:
Sequential access is more efficient since no seek operations are necessary to follow pointers.

Direct access is also possible, because the location of the desired block can be found efficiently by scanning the chain of indices in the FAT.

41
Q

indexed block allocation scheme

A

file blocks may reside anywhere on the disk. An index table is provided for each file, which keeps track of the blocks belonging to the file.

The main advantages are:
Fast sequential as well as direct access.

Ability to efficiently expand a file by simply adding a new block number to the index table.

The main drawback is:
Necessity to decide a priori how big the index table should be. A small table limits the maximum size of any file. A large table wastes disk space, since most file will have no need for the large size.

42
Q

A variation of the simple indexed allocation by UNIX OS

A

is an index table that can expand as necessary. The FCB contains the block numbers of the first n file blocks, which can be accessed directly as with the simple index table.

If more than n blocks are needed, the next entry of the FCB points to a second level index table of size m. Thus a file can have n + m blocks but the additional m blocks incur an access penalty, because the second-level index must be accessed first.

If still more blocks are needed, the expansion can continue with additional multi-level indices, each providing additional blocks but at an increasing cost of access.

43
Q

bitmap

A

Since disk space is allocated in fixed-length blocks, the allocation status of the disk can conveniently be represented by a bitmap.

A bitmap is a data structure where each bit represents one disk block. A 1 indicates that the block is allocated and a 0 indicates that the block is free.

The space used up by the bitmap is minimal. Ex: With a block size of 512 bytes, the bitmap overhead is only 100/(512 * 8) = 0.024%.

If the system architecture does not support individual bits, the bitmap can be implemented as an array of bytes or integers.

Setting, resetting, and searching for an individual bit value is accomplished by bitwise logical operations on the entire byte or word containing the target bit.

44
Q

memory mapping a file

A

allows a part of the virtual address space to be logically associated with the file.

Memory mapping a file is accomplished by mapping a disk block to a page (or pages) in memory. Initial access to the file proceeds through ordinary demand paging, resulting in a page fault.

However, a page-sized portion of the file is read from the file system into a physical page (some systems may opt to read in more than a page-sized chunk of memory at a time). Subsequent reads and writes to the file are handled as routine memory accesses.

Note that writes to the file mapped in memory are not necessarily immediate (synchronous) writes to the file on secondary storage. Generally, systems update the file based on changes to the memory image only when the file is closed.

Under memory pressure, systems will have any intermediate changes to swap space to not lose them when freeing memory for other uses. When the file is closed, all the memory-mapped data are written back to the file on secondary storage and removed from the virtual memory of the process.

45
Q

How different OSs use the memory mapped file mechanism

A

Some operating systems provide memory mapping only through a specific system call and use the standard system calls to perform all other file I/O.

However, some systems choose to memory-map a file regardless of whether the file was specified as memory-mapped. Let’s take Solaris as an example. If a file is specified as memory-mapped (using the mmap() system call), Solaris maps the file into the address space of the process.

If a file is opened and accessed using ordinary system calls, such as open(), read(), and write(), Solaris still memory-maps the file; however, the file is mapped to the kernel address space. Regardless of how the file is opened, then, Solaris treats all file I/O as memory-mapped, allowing file access to take place via the efficient memory subsystem and avoiding system call overhead caused by each traditional read() and write().

46
Q

processes and file mapping

A

Multiple processes may be allowed to map the same file concurrently, to allow sharing of data. Writes by any of the processes modify the data in virtual memory and can be seen by all others that map the same section of the file.

Given our earlier discussions of virtual memory, it should be clear how the sharing of memory-mapped sections of memory is implemented: the virtual memory map of each sharing process points to the same page of physical memory—the page that holds a copy of the disk block.

The memory-mapping system calls can also support copy-on-write functionality, allowing processes to share a file in read-only mode but to have their own copies of any data they modify.

So that access to the shared data is coordinated, the processes involved might use one of the mechanisms for achieving mutual exclusion.

47
Q

shared memory and memory mapping files

A

Quite often, shared memory is in fact implemented by memory mapping files. Under this scenario, processes can communicate using shared memory by having the communicating processes memory-map the same file into their virtual address spaces. The memory-mapped file serves as the region of shared memory between the communicating processes.

We have already seen this in Section 3.8, where a POSIX shared-memory object is created and each communicating process memory-maps the object into its address space. In the following section, we discuss support in the Windows API for shared memory using memory-mapped files.

48
Q

Shared memory in the windows API

A

The general outline for creating a region of shared memory using memory-mapped files in the Windows API involves first creating a file mapping for the file to be mapped and then establishing a view of the mapped file in a process’s virtual address space.

A second process can then open and create a view of the mapped file in its virtual address space. The mapped file represents the shared-memory object that will enable communication to take place between the processes.

49
Q

steps for creating a region of shared memory using memory-mapped files in the Windows API

A

In this example, a producer process first creates a shared-memory object using the memory-mapping features available in the Windows API.

The producer then writes a message to shared memory. After that, a consumer process opens a mapping to the shared-memory object and reads the message written by the consumer.

To establish a memory-mapped file, a process first opens the file to be mapped with the CreateFile() function, which returns a HANDLE to the opened file. The process then creates a mapping of this file HANDLE using the CreateFileMapping() function.

Once the file mapping is done, the process establishes a view of the mapped file in its virtual address space with the MapViewOfFile() function.

The view of the mapped file represents the portion of the file being mapped in the virtual address space of the process—the entire file or only a portion of it may be mapped.

50
Q

CreateFileMapping()

A

The call to CreateFileMapping() creates a named shared-memory object called SharedObject. The consumer process will communicate using this shared-memory segment by creating a mapping to the same named object. The producer then creates a view of the memory-mapped file in its virtual address space.

By passing the last three parameters the value 0, it indicates that the mapped view is the entire file. It could instead have passed values specifying an offset and size, thus creating a view containing only a subsection of the file.

(It is important to note that the entire mapping may not be loaded into memory when the mapping is established. Rather, the mapped file may be demand-paged, thus bringing pages into memory only as they are accessed.)

51
Q

MapViewOfFile() function

A

The MapViewOfFile() function returns a pointer to the shared-memory object; any accesses to this memory location are thus accesses to the memory-mapped file. In this instance, the producer process writes the message “Shared memory message” to shared memory.

Finally, both processes remove the view of the mapped file with a call to UnmapViewOfFile(). We provide a programming exercise at the end of this chapter using shared memory with memory mapping in the Windows API.

52
Q

memory mapping

A

A file-access method in which a file is mapped into the process memory space so that standard memory access instructions read and write the contents of the file; an alternative to the use of read() and write() calls.

53
Q

file mapping

A

In Windows, the first step in memory-mapping a file.

54
Q

view

A

In Windows, an address range mapped in shared memory. Also, the second step in memory-mapping a file, allowing a process to access the file contents.

55
Q

named shared-memory object

A

In Windows API, a section of a memory-mapped file accessible by name from multiple processes.