File Systems Flashcards
Give five different path names for the file /etc/passwd. (Hint: Think about the directory
entries “.” and “..”.)
You can go up and down the tree as often as you want using "..". Some of the many possible paths:
/etc/passwd
/./etc/passwd
/././etc/passwd
/./././etc/passwd
/etc/../etc/passwd
/etc/../etc/../etc/passwd
/etc/../etc/../etc/../etc/passwd
/etc/../etc/../etc/../etc/../etc/passwd
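A quick, purely lexical check with Python's os.path.normpath confirms that these spellings all name the same file. (normpath does not consult the file system, so in a directory tree with symbolic links ".." can behave differently; os.path.realpath resolves those.)

```python
import os.path

# All of these spellings collapse to the same canonical path.
# normpath simplifies "." and ".." components lexically.
spellings = [
    "/etc/passwd",
    "/./etc/passwd",
    "/././etc/passwd",
    "/etc/../etc/passwd",
    "/etc/../etc/../etc/passwd",
]
normalized = {os.path.normpath(p) for p in spellings}
print(normalized)  # {'/etc/passwd'}
```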
In Windows, when a user double clicks on a file listed by Windows Explorer, a program is
run and given that file as a parameter. List two different ways the operating system could
know which program to run.
The Windows way is to use the file extension. Each extension corresponds to a file type and to
some program that handles that type. Another way is to remember which program created the
file and run that program. The Macintosh works this way.
In early UNIX systems, executable files (a.out files) began with a very specific magic
number, not one chosen at random. These files began with a header, followed by the text and
data segments. Why do you think a very specific number was chosen for executable files,
whereas other file types had a more-or-less random magic number as the first word?
(TUT10)
These systems loaded the program directly in memory and began executing at word 0, which
was the magic number. To avoid trying to execute the header as code, the magic number was a
BRANCH instruction with a target address just above the header. In this way it was possible to
read the binary file directly into the new process’ address space and run it at 0, without even
knowing how big the header was.
Is the open() system call in UNIX absolutely essential? What would the consequences be of
not having it?
(TUT10)
To start with, if there were no open, every read would have to specify the name of
the file to be read. The system would then have to fetch the i-node for it, although that could
be cached. One issue that quickly arises is deciding when to flush the i-node back to disk; a
timeout could be used for that. The scheme would be a bit clumsy, but it might work.
Systems that support sequential files always have an operation to rewind files. Do systems
that support random-access files need this, too?
No. If you want to read the file again, just randomly access byte 0.
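On such a system "rewind" is just a seek to offset 0, as this small sketch with os.lseek shows:

```python
import os
import tempfile

# Write a small file, then "rewind" by seeking the current position to byte 0.
fd, path = tempfile.mkstemp()
os.write(fd, b"hello")

os.lseek(fd, 0, os.SEEK_SET)   # rewind: move the current position to byte 0
first = os.read(fd, 5)
os.lseek(fd, 0, os.SEEK_SET)   # rewind again and re-read
second = os.read(fd, 5)

print(first == second)  # True

os.close(fd)
os.remove(path)
```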
Some operating systems provide a system call rename to give a file a new name. Is there
any difference at all between using this call to rename a file and just copying the file to a new
file with the new name, followed by deleting the old one?
(TUT10)
Yes. The rename call does not change the creation time or the time of last modification, but
creating a new file causes it to get the current time as both the creation time and the time of
last modification. Also, if the disk is nearly full, the copy might fail.
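The timestamp difference is easy to observe from Python: os.rename keeps the file's modification time, while a copy gets the current time. (The file names below are invented for the demonstration.)

```python
import os
import shutil
import tempfile
import time

d = tempfile.mkdtemp()
old = os.path.join(d, "old.txt")
with open(old, "w") as f:
    f.write("data")

# Backdate the file's modification time by one day.
past = time.time() - 86400
os.utime(old, (past, past))

# rename keeps the old timestamps; copy-then-delete yields a fresh file.
renamed = os.path.join(d, "renamed.txt")
os.rename(old, renamed)
copied = os.path.join(d, "copied.txt")
shutil.copyfile(renamed, copied)

preserved = abs(os.stat(renamed).st_mtime - past) < 1   # rename kept mtime
fresh = os.stat(copied).st_mtime - past > 86000         # copy got current time
print(preserved, fresh)  # True True

shutil.rmtree(d)
```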
In some systems it is possible to map part of a file into memory. What restrictions must
such systems impose? How is this partial mapping implemented?
The mapped portion of the file must start at a page boundary and be an integral number of
pages in length. Each mapped page uses the file itself as backing store. Unmapped memory uses
a scratch file or partition as backing store.
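Python's mmap module exposes exactly this restriction: a mapping's file offset must be a multiple of mmap.ALLOCATIONGRANULARITY (the page size on most Unix systems). A small sketch:

```python
import mmap
import os
import tempfile

# Create a file two "granules" long so we can map its second half.
gran = mmap.ALLOCATIONGRANULARITY
fd, path = tempfile.mkstemp()
os.write(fd, b"x" * (2 * gran))

# Mapping may only start on an allocation-granularity (page) boundary.
m = mmap.mmap(fd, gran, offset=gran)   # OK: offset is page aligned
m.close()

try:
    mmap.mmap(fd, gran, offset=1)      # not page aligned: rejected
    aligned_required = False
except (ValueError, OSError):
    aligned_required = True
print(aligned_required)  # True

os.close(fd)
os.remove(path)
```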
A simple operating system supports only a single directory but allows it to have arbitrarily
many files with arbitrarily long file names. Can something approximating a hierarchical file
system be simulated? How?
(TUT10)
Use file names such as /usr/ast/file. While it looks like a hierarchical path name, it is really just a
single name containing embedded slashes.
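A toy sketch of the idea, using a Python dict as the single directory; the names and contents are invented for illustration. Slashes are ordinary name characters here, yet listing by prefix makes the flat namespace feel hierarchical:

```python
# A flat, single-directory "file system": every file lives in one dict,
# keyed by its full (arbitrarily long) name.
flat = {
    "/usr/ast/mbox": b"mail",
    "/usr/ast/notes": b"todo",
    "/usr/jim/mbox": b"other mail",
}

def listdir(prefix):
    """Simulate listing a 'directory' by scanning name prefixes."""
    return sorted(name for name in flat if name.startswith(prefix + "/"))

print(listdir("/usr/ast"))  # ['/usr/ast/mbox', '/usr/ast/notes']
```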
In UNIX and Windows, random access is done by having a special system call that moves
the ‘‘current position’’ pointer associated with a file to a given byte in the file. Propose an
alternative way to do random access without having this system call.
One way is to add an extra parameter to the read system call that tells what address to read
from. In effect, every read then has a potential for doing a seek within the file. The
disadvantages of this scheme are (1) an extra parameter in every read call, and (2) requiring the
user to keep track of where the file pointer is.
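POSIX later standardized exactly this alternative as pread()/pwrite(), which read or write at an explicit offset without consulting or moving the current-position pointer. Python exposes it as os.pread:

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
os.write(fd, b"abcdefghij")

# os.pread(fd, length, offset): the offset parameter replaces the seek,
# and the file's current-position pointer is left untouched.
middle = os.pread(fd, 3, 4)
print(middle)  # b'efg'

os.close(fd)
os.remove(path)
```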
Consider the directory tree of Fig. 4-8 (https://imgur.com/a/U8wMPfK). If /usr/jim is the working directory, what is the
absolute path name for the file whose relative path name is ../ast/x?
The dotdot component moves the search to /usr, so ../ast puts it in /usr/ast. Thus ../ast/x is the
same as /usr/ast/x.
Contiguous allocation of files leads to disk fragmentation, as mentioned in the text,
because some space in the last disk block will be wasted in files whose length is not an
integral number of blocks. Is this internal fragmentation or external fragmentation? Make an
analogy with something discussed in the previous chapter.
Since the wasted storage is between the allocation units (files), not inside them, this is external
fragmentation. It is precisely analogous to the external fragmentation of main memory that
occurs with a swapping system or a system using pure segmentation.
Describe the effects of a corrupted data block for a given file for: (a) contiguous, (b) linked,
and (c) indexed (or table based).
(TUT10)
If a data block gets corrupted in a contiguous allocation system, then only this block is affected;
the remainder of the file’s blocks can be read. In the case of linked allocation, the corrupted
block cannot be read; also, location data about all blocks starting from this corrupted block is
lost. In the case of indexed allocation, only the corrupted data block is affected.
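A toy simulation makes the linked-allocation case concrete; the block numbers and contents are invented:

```python
# Toy linked allocation: each block stores (data, next_block).
blocks = {
    0: ("AAA", 1),
    1: ("BBB", 2),
    2: ("CCC", 3),
    3: ("DDD", None),
}

def read_file(first):
    """Follow the chain of next-pointers from the first block."""
    data, cur = [], first
    while cur is not None and cur in blocks:
        chunk, nxt = blocks[cur]
        data.append(chunk)
        cur = nxt
    return data

print(read_file(0))        # ['AAA', 'BBB', 'CCC', 'DDD']
blocks[1] = (None, None)   # corrupt block 1: its data AND its next-pointer are gone
print(read_file(0))        # ['AAA', None] -- blocks 2 and 3 are now unreachable
```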
One way to use contiguous allocation of the disk and not suffer from holes is to compact
the disk every time a file is removed. Since all files are contiguous, copying a file requires a seek and rotational delay to read the file, followed by the transfer at full speed. Writing the
file back requires the same work. Assuming a seek time of 5 msec, a rotational delay of 4
msec, a transfer rate of 80 MB/sec, and an average file size of 8 KB, how long does it take to
read a file into main memory and then write it back to the disk at a new location? Using these
numbers, how long would it take to compact half of a 16-GB disk?
(FFE18)
It takes 9 msec (5 msec + 4 msec) to start the transfer.
The given transfer rate is 80 MB/sec = 10 × 2²³ bytes per second.
The size of file is 8 KB = 2¹³ bytes.
The time taken to read the file = Size of file / Transfer rate = 2¹³ / (10 × 2²³) sec ≈ 0.0977 msec.
Total time needed copying one file = 2 × 9.0977 = 18.1954 msec (to read and write back file)
To compact 8 GB of storage (half of a 16-GB disk), which is 8 GB / 8 KB = 2²⁰ files, it will take 2²⁰
× 18.1954 msec = 19,079.26 sec = 5.3 hours. Clearly, compacting the disk after every file
removal is not a great idea.
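The arithmetic can be replayed in a few lines:

```python
# The compaction estimate, step by step (times in seconds).
seek_plus_rotation = 0.005 + 0.004     # 9 msec to start each transfer
rate = 80 * 2**20                      # 80 MB/sec in bytes/sec
file_size = 8 * 2**10                  # 8 KB

transfer = file_size / rate            # ~0.0977 msec per transfer
per_file = 2 * (seek_plus_rotation + transfer)   # read it, then write it back
files = (8 * 2**30) // file_size       # 2**20 files in 8 GB

total = files * per_file
print(round(total))            # 19079 seconds
print(round(total / 3600, 1))  # 5.3 hours
```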
Does compacting the disk ever make any sense?
If done right, yes. While compacting, each file should be organized so that all of its blocks are
consecutive, for fast access. Windows has a program that defragments and reorganizes the disk.
Users are encouraged to run it periodically to improve system performance. But given how long
it takes, running once a month might be a good frequency.
Some digital consumer devices need to store data, for example as files. Name a modern
device that requires file storage and for which contiguous allocation would be a fine idea.
A digital still camera records some number of photographs in sequence on a nonvolatile storage
medium (e.g., flash memory). When the camera is reset, the medium is emptied. Thereafter,
pictures are recorded one at a time in sequence until the medium is full, at which time they are
uploaded to a hard disk. For this application, a contiguous file system inside the camera (e.g., on
the picture storage medium) is ideal.
Consider the i-node shown in Fig. 4-13 (https://imgur.com/a/gxnY2FF). If it contains 10 direct addresses and these were 8
bytes each and all disk blocks were 1024 bytes, what would the largest possible file be?
The indirect block can hold 128 disk addresses. Together with the 10 direct disk addresses, the
maximum file has 138 blocks. Since each block is 1 KB, the largest file is 138 KB.
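The calculation in a few lines:

```python
# Largest file for an i-node with 10 direct addresses plus one single
# indirect block, assuming 1024-byte blocks and 8-byte disk addresses.
block_size = 1024
addr_size = 8

direct = 10
indirect = block_size // addr_size      # 128 addresses fit in the indirect block
total_blocks = direct + indirect        # 138 blocks

print(total_blocks)                              # 138
print(total_blocks * block_size // 1024, "KB")   # 138 KB
```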
For a given class, the student records are stored in a file. The records are randomly accessed and updated. Assume that each student’s record is of fixed size. Which of the three allocation schemes (contiguous, linked and table/indexed) will be most appropriate?
(TUT10)
For random access, table/indexed and contiguous will be both appropriate, while linked
allocation is not as it typically requires multiple disk reads for a given record.
Consider a file whose size varies between 4 KB and 4 MB during its lifetime. Which of the
three allocation schemes (contiguous, linked and table/indexed) will be most appropriate?
Since the file size changes a lot, contiguous allocation will be inefficient requiring reallocation of
disk space as the file grows in size and compaction of free blocks as the file shrinks in size. Both
linked and table/indexed allocation will be efficient; between the two, table/indexed allocation
will be more efficient for random-access scenarios.
It has been suggested that efficiency could be improved and disk space saved by storing
the data of a short file within the i-node. For the i-node of Fig. 4-13 (https://imgur.com/a/gxnY2FF), how many bytes of data
could be stored inside the i-node?
(TUT10)
There must be a way to signal that the address-block pointers hold data, rather than pointers. If
there is a bit left over somewhere among the attributes, it can be used. This leaves all nine
pointers for data. If the pointers are k bytes each, the stored file could be up to 9k bytes long. If
no bit is left over among the attributes, the first disk address can hold an invalid address to mark
the following bytes as data rather than pointers. In that case, the maximum file is 8k bytes.
Two computer science students, Carolyn and Elinor, are having a discussion about inodes.
Carolyn maintains that memories have gotten so large and so cheap that when a file is opened, it is simpler and faster just to fetch a new copy of the i-node into the i-node table,
rather than search the entire table to see if it is already there. Elinor disagrees. Who is right?
Elinor is right. Having two copies of the i-node in the table at the same time is a disaster, unless
both are read only. The worst case is when both are being updated simultaneously. When the i-
nodes are written back to the disk, whichever one gets written last will erase the changes made
by the other one, and disk blocks will be lost.
Name one advantage of hard links over symbolic links and one advantage of symbolic links
over hard links.
Hard links do not require any extra disk space, just a counter in the i-node to keep track of how
many there are. Symbolic links need space to store the name of the file pointed to. Symbolic
links can point to files on other machines, even over the Internet. Hard links are restricted to
pointing to files within their own partition.
Explain how hard links and soft links differ with respect to i-node allocation.
A single i-node is pointed to by all directory entries of hard links for a given file. In the case of
soft-links, a new i-node is created for the soft link and this i-node essentially points to the
original file being linked.
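Both behaviors can be observed directly on a Unix system (the file names below are invented for the demo):

```python
import os
import tempfile

d = tempfile.mkdtemp()
target = os.path.join(d, "target")
open(target, "w").close()

hard = os.path.join(d, "hard")
soft = os.path.join(d, "soft")
os.link(target, hard)       # hard link: a second directory entry, same i-node
os.symlink(target, soft)    # soft link: its own i-node holding the target path

same_inode = os.stat(target).st_ino == os.stat(hard).st_ino   # shared i-node
own_inode = os.lstat(soft).st_ino != os.stat(target).st_ino   # separate i-node
link_count = os.stat(target).st_nlink                         # i-node's counter

print(same_inode)  # True
print(own_inode)   # True
print(link_count)  # 2
```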
Consider a 4-TB disk that uses 4-KB blocks and the free-list method. How many block
addresses can be stored in one block?
Block size = 4 KB = 2¹² bytes
Disk size = 4 TB = 2⁴² bytes
The number of blocks on the disk = 4 TB / 4 KB = 2³⁰.
Thus, each block address needs 30 bits, rounded up to 32 bits = 4 bytes (the nearest multiple of 8 bits).
Thus, each block can store 2¹² / 4 = 1024 addresses.
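The same calculation in code:

```python
# Block addresses per block, from the numbers above.
block_size = 2**12                   # 4 KB
disk_size = 2**42                    # 4 TB
n_blocks = disk_size // block_size   # 2**30 blocks on the disk

# 2**30 blocks need 30-bit addresses; round up to 32 bits = 4 bytes.
addr_bytes = 4
per_block = block_size // addr_bytes

print(n_blocks == 2**30)   # True
print(per_block)           # 1024
```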
Free disk space can be kept track of using a free list or a bitmap. Disk addresses require D
bits. For a disk with B blocks, F of which are free, state the condition under which the free list uses less space than the bitmap. For D having the value 16 bits, express your answer as a
percentage of the disk space that must be free.
The bitmap requires B bits. The free list requires D × F bits. The free list requires fewer bits if
D × F < B. Alternatively, the free list is shorter if F / B < 1 / D, where F / B is the fraction of blocks
free. For 16-bit disk addresses, the free list is shorter if 6.25% or less of the disk is free.
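A quick check of the threshold, with a concrete (made-up) disk size for the sanity test:

```python
# Free list beats the bitmap when D * F < B, i.e. when F/B < 1/D.
D = 16                     # bits per disk address

threshold = 1 / D          # fraction of the disk that must be free
print(threshold * 100)     # 6.25 (percent)

# Sanity check on a concrete disk of B blocks with F free:
B = 2**20
assert D * (B // 20) < B   # 5% free: the free list is smaller
assert D * (B // 10) > B   # 10% free: the bitmap is smaller
```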