Study Guide - Chapter 4: Searching and Analyzing Text Flashcards
(31 cards)
1- The cat -E MyFile.txt command is entered, and at the end of every line displayed is a $. What does this indicate?
- The text file has been corrupted somehow.
- The text file records end in the ASCII character NUL.
- The text file records end in the ASCII character LF.
- The text file records end in the ASCII character $.
- The text file records contain a $ at their end.
The text file records end in the ASCII character LF.
A text file record is considered to be a single file line that ends in a newline linefeed that is the ASCII character LF. You can see if your text file uses this end‐of‐line character by issuing the cat -E command. Therefore, option C is the correct answer. The text file may have been corrupted, but this command does not indicate it, so option A is an incorrect choice. The text file records end in the ASCII character LF and not NUL or $. Therefore, options B and D are incorrect. The text file records may very well contain a $ at their end, but you cannot tell by the situation description, so option E is a wrong answer.
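The behavior described above can be checked with a short sketch. It assumes GNU coreutils (the -E switch is a GNU cat option), and the two sample records are made up for illustration.

```shell
# Two records, each terminated by an LF character.
# GNU cat -E marks every LF with a trailing $ in its output.
marked=$(printf 'alpha\nbeta\n' | cat -E)
printf '%s\n' "$marked"
```

Each displayed line ends in $, which marks where the LF sits in the record; the $ is not stored in the file itself.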
3- Which of the following utilities change text within a file? (Choose all that apply.)
- cut
- sort
- vim
- nano
- sed
- vim
- nano
Recall that many utilities that process text do not change the text within a file unless redirection is employed to do so. The only utilities in this list that will allow you to modify text are the text editors vim and nano. Therefore, options C and D are the correct answers. By default, the cut, sort, and sed utilities gather the data from a designated text file(s), modify it according to the options used, and display the modified text to standard output. The text in the file is not modified (sed edits a file in place only when given an option such as GNU sed's -i). Therefore, options A, B, and E are incorrect choices.
sed
Stream Editor: A powerful text processing utility that performs operations like search, find and replace, insertion, and deletion on text input; commonly used with the syntax sed 's/pattern/replacement/g' filename for substitutions or sed -n '5,10p' filename to print specific lines.
displays the modified text to standard output
cut
Command-line utility used to extract sections from each line of input; works with delimiter-based fields using -d option (e.g., cut -d ":" -f1 /etc/passwd to extract usernames) or character positions using -c option (e.g., cut -c1-5 filename to extract first 5 characters).
displays the modified text to standard output
sort
Used to arrange text lines in specified order; can sort alphabetically (sort filename), numerically (sort -n filename), in reverse (sort -r filename), remove duplicates (sort -u filename), or by specific fields (sort -k2 filename sorts by second field).
displays the modified text to standard output
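The three cards above all say the result goes to standard output. A minimal sketch of that claim follows; the file name fruit.txt and its contents are hypothetical, and the file is created in a temporary directory.

```shell
# Build a small sample file (hypothetical data).
workdir=$(mktemp -d)
printf 'pear:3\napple:1\nfig:2\n' > "$workdir/fruit.txt"
before=$(cat "$workdir/fruit.txt")

# Each utility writes its result to standard output only.
sed_out=$(sed 's/pear/plum/' "$workdir/fruit.txt")
cut_out=$(cut -d ":" -f 1 "$workdir/fruit.txt")
sort_out=$(sort "$workdir/fruit.txt")

# The file on disk is the same as before the commands ran.
after=$(cat "$workdir/fruit.txt")
printf '%s\n' "$sed_out" "$cut_out" "$sort_out"
rm -r "$workdir"
```

Comparing before and after shows that none of the three commands touched the file; only redirection or an in-place option would do that.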
4- You have a text file, monitor.txt, which contains information concerning the monitors used within the data center. Each record ends with the ASCII LF character and fields are delimited by a comma (,). A text record has the monitor ID, manufacturer, serial number, and location. To display each data center monitor’s monitor ID, serial number, and location, you’d use which cut command?
- cut -d "," -f 1,3,4 monitor.txt
- cut -z -d "," -f 1,3,4 monitor.txt
- cut -f "," -d 1,3,4 monitor.txt
- cut monitor.txt -d "," -f 1,3,4
- cut monitor.txt -f "," -d 1,3,4
cut -d "," -f 1,3,4 monitor.txt
The cut command gathers data from the text file, listed as its last argument, and displays it according to the options used. To define field delimiters as a comma and display each data center monitor’s monitor ID, serial number, and location, the options to use are -d "," -f 1,3,4. Also, since the text file’s records end with an ASCII LF character, no special options, such as the -z option, are needed to process these records. Therefore, option A is the correct choice. Option B uses the unneeded -z option and is therefore a wrong answer. Option C is an incorrect choice because it reverses the -f and -d options. Options D and E are wrong answers because they put the filename before the command switches.
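As a rough illustration of option A, the sketch below runs the command against two made-up records. The monitor IDs, manufacturers, serial numbers, and locations are assumptions, not data from the question.

```shell
# Hypothetical monitor.txt: ID, manufacturer, serial number, location.
workdir=$(mktemp -d)
printf 'M001,Acme,SN123,Rack 4\nM002,Globex,SN456,Rack 9\n' > "$workdir/monitor.txt"

# Comma delimiter; keep fields 1, 3, and 4 (drops the manufacturer).
fields=$(cut -d "," -f 1,3,4 "$workdir/monitor.txt")
printf '%s\n' "$fields"
rm -r "$workdir"
```

The selected fields are printed with the same comma delimiter that was used to split them.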
6- You are a system administrator on a Red Hat Linux server. You need to view records in the /var/log/messages file that start with the date May 30 and end with the IPv4 address 192.168.10.42. Which of the following is the best grep command to use?
- grep "May 30?192.168.10.42" /var/log/messages
- grep "May 30.*192.168.10.42" /var/log/messages
- grep -i "May 30.*192.168.10.42" /var/log/messages
- grep -i "May 30?192.168.10.42" /var/log/messages
- grep -v "May 30.*192.168.10.42" /var/log/messages
grep "May 30.*192.168.10.42" /var/log/messages
Option B is the best command because this grep command employs the correct syntax. It uses the quotation marks around the pattern to avoid unexpected results and uses the .* regular expression characters to indicate that anything can be between May 30 and the IPv4 address. No additional switches are necessary. Option A is not the best grep command because it uses the wrong regular expression character, ?, which does not match arbitrary text: in a BRE it is a literal question mark, and in an ERE it only makes the preceding character optional, so the pattern cannot span the text between May 30 and the IPv4 address. Options C and D are not the best grep commands because they employ the -i switch to ignore case, which is not needed in this case. The grep command in option E is an incorrect choice, because it uses the -v switch, which will display text records that do not match the PATTERN.
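A small sketch of options B and E, run against a stand-in for /var/log/messages. The three log records are invented for illustration.

```shell
# Hypothetical log records; only the first has both May 30 and 192.168.10.42.
workdir=$(mktemp -d)
cat > "$workdir/messages" <<'EOF'
May 30 10:15:01 host sshd[812]: Accepted connection from 192.168.10.42
May 30 10:16:44 host sshd[813]: Accepted connection from 192.168.10.7
May 31 08:02:13 host sshd[901]: Accepted connection from 192.168.10.42
EOF

# Option B: .* allows any text between the date and the address.
matches=$(grep "May 30.*192.168.10.42" "$workdir/messages")

# Option E's -v inverts the match; -c counts the non-matching records.
nonmatches=$(grep -v -c "May 30.*192.168.10.42" "$workdir/messages")
printf '%s\n' "$matches" "$nonmatches"
rm -r "$workdir"
```

Note that the unescaped dots in the address also match any character; a stricter pattern would write them as \. instead.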
7- Which of the following is a BRE pattern that could be used with the grep command? (Choose all that apply.)
- Sp?ce
- "Space, the .*frontier"
- ^Space
- (lasting | final)
- frontier$
- Sp?ce
- "Space, the .*frontier"
- ^Space
- frontier$
A BRE is a basic regular expression that describes certain patterns you can use with the grep command. An ERE is an extended regular expression and it requires the use of grep -E or the egrep command. Options A, B, C, and E are all BRE patterns that can be used with the grep command, so they are correct choices. The only ERE is in option D, and therefore, it is an incorrect choice.
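The difference between the two syntaxes can be sketched as follows, assuming GNU grep. The three input lines are made up, and the alternation pattern is written without the spaces that appear in option D.

```shell
# Three sample lines (hypothetical input).
text=$(printf 'Space, the final frontier\nSp?ce\n(lasting|final)\n')

# BRE (plain grep): ? is an ordinary character, so only the literal text matches.
bre_q=$(printf '%s\n' "$text" | grep 'Sp?ce')

# ERE (grep -E): parentheses group and | means "or".
ere_alt=$(printf '%s\n' "$text" | grep -E '(lasting|final)')

# BRE: the same pattern is taken literally, parentheses and | included.
bre_alt=$(printf '%s\n' "$text" | grep '(lasting|final)')
printf '%s\n' "$bre_q" "$ere_alt" "$bre_alt"
```

The same pattern string selects different lines depending on whether -E is given, which is why option D only behaves as intended as an ERE.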
8- You need to search through a large text file and find any record that contains either Luke or Laura at the record’s beginning. Also, the phrase Father is must be located somewhere in the record’s middle. Which of the following is an ERE pattern that could be used with the egrep command to find this record?
- "Luke$|Laura$.*Father is"
- "^Luke|^Laura.Father is"
- "(^Luke|^Laura).Father is"
- "(Luke$|Laura$).* Father is$"
- "(^Luke|^Laura).*Father is.*"
"(^Luke|^Laura).*Father is.*"
To meet the search requirements, option E is the ERE to use with the egrep command. Therefore, option E is the correct answer. Option A will return either a record that ends with Luke or a record that ends with Laura. Thus, option A is the wrong answer. Option B is an incorrect choice because it will return either a record that begins with Luke or a record that begins with Laura and has one character between Laura and the Father is phrase. Option C has the Luke and Laura portion of the ERE correct, but it only allows one character between the names and the Father is phrase, which will not meet the search requirements. Thus, option C is a wrong choice. Option D will try to return either a record that ends with Luke or a record that ends with Laura and contains the Father is phrase, so the egrep command will display nothing. Thus, option D is an incorrect choice.
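Option E can be tried on a few invented records. The sketch uses grep -E, which accepts the same ERE syntax as egrep; the record text is hypothetical.

```shell
# Four sample records; only the first two start with Luke or Laura
# and also contain the phrase "Father is".
records=$(printf '%s\n' \
  'Luke heard that his Father is alive' \
  'Laura said her Father is home' \
  'Leia knows who her Father is' \
  'Luke went home')

found=$(printf '%s\n' "$records" | grep -E "(^Luke|^Laura).*Father is.*")
printf '%s\n' "$found"
```

The grouping parentheses make the .*Father is part apply to both names, which is what option B's ungrouped pattern fails to do.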
9- A file data.txt needs to be sorted numerically and its output saved to a new file newdata.txt. Which of the following commands can accomplish this task? (Choose all that apply.)
- sort -n -o newdata.txt data.txt
- sort -n data.txt > newdata.txt
- sort -n -o data.txt newdata.txt
- sort -o newdata.txt data.txt
- sort data.txt > newdata.txt
- sort -n -o newdata.txt data.txt
- sort -n data.txt > newdata.txt
To sort the data.txt file numerically and save its output to the new file, newdata.txt, you can either use the -o switch to save the file or employ standard output redirection with the > symbol. In both cases, however, you need to use the -n switch to properly enact a numerical sort. Therefore, both options A and B are correct. Option C is a wrong answer because the command has the newdata.txt and data.txt flipped in the command’s syntax. Options D and E do not employ the -n switch, so they are incorrect answers as well.
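A quick sketch of why -n matters, using four made-up numbers. The extra file name redirected.txt is hypothetical and only keeps the two correct options from overwriting each other.

```shell
workdir=$(mktemp -d)
printf '10\n9\n100\n2\n' > "$workdir/data.txt"

# Option A: -o names the output file.
sort -n -o "$workdir/newdata.txt" "$workdir/data.txt"
with_o=$(cat "$workdir/newdata.txt")

# Option B: redirect standard output instead.
sort -n "$workdir/data.txt" > "$workdir/redirected.txt"
with_redirect=$(cat "$workdir/redirected.txt")

# Without -n the lines are compared as text, character by character.
as_text=$(LC_ALL=C sort "$workdir/data.txt")
printf '%s\n' "$with_o" "$as_text"
rm -r "$workdir"
```

Both numeric forms produce the same file contents, while the plain sort places 100 ahead of 2 because it compares the leading characters.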
10- Which of the following commands can display the data.txt and datatoo.txt files’ content one after the other to STDOUT? (Choose all that apply.)
- ls data.txt datatoo.txt
- sort -n data.txt > datatoo.txt
- cat -n data.txt datatoo.txt
- ls -l data.txt datatoo.txt
- sort data.txt datatoo.txt
- cat -n data.txt datatoo.txt
- sort data.txt datatoo.txt
The commands in both options C and E will display the data.txt and datatoo.txt files’ content one after the other to STDOUT. The cat -n command will also prepend a line number to each line, but it will still concatenate the files’ content to standard output. Therefore, options C and E are correct. Option A will just display the files’ names to STDOUT, so it is a wrong answer. Option B will numerically sort the data.txt, wipe out the datatoo.txt file’s contents, and replace it with the numerically sorted contents from the data.txt file. Therefore, option B is an incorrect answer. Option D will show the two files’ metadata to STDOUT instead of their contents, so it also is a wrong choice.
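The contrast between option C and option A can be sketched with two tiny files whose contents are invented for the example.

```shell
workdir=$(mktemp -d)
printf 'a\nb\n' > "$workdir/data.txt"
printf 'c\n' > "$workdir/datatoo.txt"

# cat concatenates the files in the order given.
plain=$(cat "$workdir/data.txt" "$workdir/datatoo.txt")

# -n numbers the lines across both files.
numbered=$(cat -n "$workdir/data.txt" "$workdir/datatoo.txt")
numbered_lines=$(printf '%s\n' "$numbered" | wc -l | tr -d ' ')

# ls only reports the names, not the contents.
names=$(cd "$workdir" && ls data.txt datatoo.txt)
printf '%s\n' "$numbered" "$names"
rm -r "$workdir"
```

The numbering continues from the first file into the second, so the single line of datatoo.txt is shown as line 3.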
11- A text file, StarGateAttacks.txt, needs to be specially formatted for review. Which of the following commands is the best command to accomplish this task quickly?
- printf
- wc
- pr
- paste
- nano
pr
The pr command’s primary purpose in life is to specially format a text file for printing, and it can accomplish the required task fairly quickly. Therefore, option C is the best choice. While the pr utility can handle formatting entire text files, the printf command is geared toward formatting the output of a single text line. While you could write a shell script to read and format each text file’s line via the printf command, it would not be the quickest method to employ. Therefore, option A is a wrong answer. Option B’s wc command will perform counts on a text file and does not format text file contents, so it is also an incorrect answer. The paste command will “sloppily” put together two or more text files side by side. Thus, option D is a wrong answer. Option E is an incorrect choice because the nano text editor would force you to manually format the text file, which is not the desired action.
printf
Formatting command that displays text with precise control; uses format specifiers like %s (strings) and %d (numbers) and supports escape sequences (e.g., printf "Name: %s\nID: %d\n" "john" 1001).
wc
Counts lines (-l), words (-w), and bytes (-c) in files, or characters with -m; useful for quickly analyzing file size and content structure (e.g., wc -lwc config.txt for complete file statistics).
pr
Formats text files for printing with headers, footers, and pagination; supports multi-column output and file merging (e.g., pr -h "Report" -2 data.txt).
paste
Merges lines from multiple files horizontally using tabs or custom delimiters (-d option); essential for combining related data from separate files (e.g., paste -d ":" users.txt ids.txt).
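Two of the example commands on these cards can be run as a short sketch. The user names and IDs are hypothetical sample data.

```shell
workdir=$(mktemp -d)
printf 'alice\nbob\n' > "$workdir/users.txt"
printf '1001\n1002\n' > "$workdir/ids.txt"

# paste joins line 1 with line 1, line 2 with line 2, using : as the delimiter.
merged=$(paste -d ":" "$workdir/users.txt" "$workdir/ids.txt")

# printf formats one line at a time from its arguments.
formatted=$(printf "Name: %s\nID: %d\n" "john" 1001)
printf '%s\n' "$merged" "$formatted"
rm -r "$workdir"
```

paste works on whole files side by side, whereas printf only formats the arguments it is handed, which is the distinction the explanation for card 11 relies on.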
12- You need to format the string 42.777 into the correct two‐digit floating number. Which of the following printf command FORMAT settings is the correct one to use?
- "%s\n"
- "%.2s\n"
- "%d\n"
- "%.2c\n"
- "%.2f\n"
"%.2f\n"
The printf FORMAT "%.2f\n" will produce the desired result of 42.78, and therefore option E is the correct answer. The FORMAT in option A will simply output 42.777, so it is an incorrect choice. The FORMAT in option B will output 42 and therefore is a wrong answer. The printf FORMAT setting in option C will produce an error, and therefore, it is an incorrect choice. Option D’s printf FORMAT "%.2c\n" will display 42 and thus is also an incorrect answer.
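Options A, B, and E can be tried directly in a shell. The sketch sets LC_ALL=C as an assumption so that the period is treated as the decimal separator regardless of the system locale.

```shell
# %.2f rounds to two digits after the decimal point.
rounded=$(LC_ALL=C printf "%.2f\n" 42.777)

# %s prints the argument unchanged.
as_string=$(printf "%s\n" 42.777)

# %.2s keeps only the first two characters of the string.
truncated=$(printf "%.2s\n" 42.777)
printf '%s\n' "$rounded" "$as_string" "$truncated"
```

Only the %f conversion treats the argument as a number; the %s forms handle it as plain text.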
13- A Unicode‐encoded text file, MyUCode.txt, needs to be perused. Before you decide what utility to use in order to view the file’s contents, you employ the wc command on it. This utility displays 2020 6786 11328 to STDOUT. Which of the following is true? (Choose all that apply.)
- The file has 2,020 lines in it.
- The file has 2,020 characters in it.
- The file has 6,786 words in it.
- The file has 11,328 characters in it.
- The file has 11,328 lines in it.
- The file has 2,020 lines in it.
- The file has 6,786 words in it.
The first item output by the wc utility is the number of lines within a designated text file. Therefore, option A is correct. Option C is also correct, because the second item output by the wc utility is the number of words within a designated text file. Option B is a wrong answer because the file contains 2,020 lines and not characters. Option D is an incorrect choice because you do not know whether or not only the ASCII subset of Unicode is used for the text file’s encoding. You should always assume the last number is the number of bytes within the file. Use the -m or --chars switch on the wc command to get a character count. Therefore, the file could have 11,328 bytes in it instead of characters. Option E is also a wrong choice because the file has 2,020 lines in it.
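The byte-versus-character point can be sketched with a two-line sample that contains one non-ASCII letter. The sample text is made up, and the é is written as its two UTF-8 bytes (octal 303 251) so the example does not depend on how the script itself is encoded.

```shell
workdir=$(mktemp -d)
# "café au lait" and "two words": é takes two bytes in UTF-8.
printf 'caf\303\251 au lait\ntwo words\n' > "$workdir/sample.txt"

# tr strips the padding that some wc implementations add.
lines=$(wc -l < "$workdir/sample.txt" | tr -d ' ')
words=$(wc -w < "$workdir/sample.txt" | tr -d ' ')
bytes=$(wc -c < "$workdir/sample.txt" | tr -d ' ')

# wc -m counts characters, but the result depends on the active locale,
# so it is shown here without an expected value.
chars=$(wc -m < "$workdir/sample.txt" | tr -d ' ')
printf 'lines=%s words=%s bytes=%s chars=%s\n' "$lines" "$words" "$bytes" "$chars"
rm -r "$workdir"
```

In a UTF-8 locale the character count should come out one lower than the byte count for this sample, which is the gap option D overlooks.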
14- Which of the following best defines a file descriptor?
- A letter that represents the file’s type
- A number that represents a process’s open files
- Another term for the file’s name
- A six‐character name that represents standard output
- A symbol that indicates the file’s classification
A number that represents a process’s open files
A file descriptor is a number that represents a process’s open files. Therefore, option B is the correct answer. A file type code is a letter that represents the file’s type, displayed as the first item in the ls -l output line. Therefore, option A is a wrong answer. Option C is also wrong, because it is a made‐up answer. Option D is incorrect because it describes only STDOUT, which has a file descriptor number of 1 and is only one of several file descriptors. A file indicator code is a symbol that indicates the file’s classification, and it is generated by the ls -F command. Therefore, option E is also a wrong choice.
File Descriptor
A small non-negative integer that the kernel uses to identify an open file within a process; the first three descriptors are standard input (0), standard output (1), and standard error (2).
STDOUT
Standard Output:
* default output stream (file descriptor 1) where a program writes its normal output
* typically directed to the terminal but can be redirected to files or other programs using the > or | operators.
STDIN
Standard Input
* The default input stream (file descriptor 0) from which a program reads input
* typically connected to the keyboard but can receive input from files or other programs using the < operator or pipes (e.g., cat < input.txt or echo "text" | grep "t").
STDERR
Standard Error
* The default error output stream (file descriptor 2) where programs write error messages and diagnostics
* separated from stdout to distinguish normal output from errors and can be redirected independently (e.g., command 2> errors.log).
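The three standard descriptors can be seen working together in a small sketch. The function name emit and the log file names are hypothetical.

```shell
workdir=$(mktemp -d)

# A stand-in command that writes one line to each output stream.
emit() {
  echo "normal output"          # file descriptor 1 (STDOUT)
  echo "error message" >&2      # file descriptor 2 (STDERR)
}

# Send each stream to its own file.
emit > "$workdir/out.log" 2> "$workdir/errors.log"
out=$(cat "$workdir/out.log")
err=$(cat "$workdir/errors.log")

# Feed a file to a command through file descriptor 0 (STDIN).
upper=$(tr 'a-z' 'A-Z' < "$workdir/out.log")
printf '%s\n' "$out" "$err" "$upper"
rm -r "$workdir"
```

Because the two output streams have separate descriptors, redirecting one leaves the other untouched.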
16- Which of the following commands will display the file SpaceOpera.txt to output as well as save a copy of it to the file SciFi.txt?
- cat SpaceOpera.txt | tee SciFi.txt
- cat SpaceOpera.txt > SciFi.txt
- cat SpaceOpera.txt 2> SciFi.txt
- cp SpaceOpera.txt SciFi.txt
- cat SpaceOpera.txt &> SciFi.txt
cat SpaceOpera.txt | tee SciFi.txt
The command in option A will display the SpaceOpera.txt file to output as well as save a copy of it to the SciFi.txt file. Therefore, option A is the correct answer. Option B is a wrong answer because it will only put a copy of SpaceOpera.txt into the SciFi.txt file. Option C is an incorrect choice because this will display the SpaceOpera.txt file to output and put any error messages into the SciFi.txt file. The cp command will only copy one text file to another. It will not display the original file to output, so option D is a wrong answer. Option E is a wrong choice because it will put a copy of SpaceOpera.txt into the SciFi.txt file and include any error messages that are generated.
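Options A and B can be compared in a brief sketch. The single line of file content and the extra file name Copy.txt are invented for the example.

```shell
workdir=$(mktemp -d)
printf 'A long time ago\n' > "$workdir/SpaceOpera.txt"

# Option A: tee writes its input to the file and also passes it to STDOUT.
shown=$(cat "$workdir/SpaceOpera.txt" | tee "$workdir/SciFi.txt")
saved=$(cat "$workdir/SciFi.txt")

# Option B: plain redirection saves the copy but displays nothing.
silent=$(cat "$workdir/SpaceOpera.txt" > "$workdir/Copy.txt")
printf 'shown=%s saved=%s silent=%s\n' "$shown" "$saved" "$silent"
rm -r "$workdir"
```

With tee the text appears both on screen and in the file, while redirection alone leaves the captured output empty.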