Session 8 - Advanced Programming Techniques Flashcards

1
Q

We know how to read data from a single CSV file.

But often we want to read data from

A

many files

2
Q

We know how to read data from a single CSV file.

But often we want to read data from many files.

For example, we might run the same experiment on 100 participants

Each experiment generated a data file and now you want to process them all in one script.

How do we know which files to read from?

A

One good way is to put all the files into a single directory like /home/alex/subject_data and then find all the files in that directory that match a certain pattern (e.g. ending in .csv).

3
Q

What does this ‘real directory’ show? - (3)

A

Here, the directory stores lots of different files from a single experiment.

The ones we care about are .csv files but there are also some other ones in there as well (like the .log files).

In the analysis we want to find and load in all the .csv files and ignore the other ones.

4
Q

Two ways of listing the contents of a directory - (2)

A
  1. glob
  2. listdir command (part of os)
5
Q

One way to find out what is in a directory is with the

A

listdir command (part of os).

6
Q

One way of listing contents of a directory is using

A

os.listdir

7
Q

To download data from YNiC to practise on, we use the command ‘git’.

Git is a

A

free version control tool (the course describes it as a protocol) that allows you to manage files that are synchronized across the internet.

8
Q

The very common use for ‘Git’ is for

A

distributing software source code and data files

9
Q

One website that runs ‘git’ is called ‘github.com’, which - (2)

A

is a favourite place for people to store their software projects and has become almost synonymous with ‘git’ itself.

Currently, GitHub says that they host over 100 million developers and over 420 million software projects

10
Q

YNiC runs its own git server, which we can use to download

A

some useful data files like this

11
Q

Example of downloading some useful data files from YNiC git server

A
12
Q

What does this code do? - (6)

A
  • This code is used to download a smaller version of a repository called ‘pin-material’ from a specific URL.
  • The ‘!cd /content’ command changes the directory to ‘/content’.
  • ‘!git clone --branch small --depth 1 https://vcs.ynic.york.ac.uk/cn/pin-material.git’ is the main command.
  • It clones the ‘small’ branch of the repository with a depth of 1, meaning it only gets the latest version of the files, not the entire history.
  • The comment explains that this smaller version doesn’t include neuroimaging data, making it much smaller than the full repository.
  • The ‘!ls -lat’ command lists the contents of the current directory in detail, showing the latest changes first.
13
Q

Git is a protocol for getting source files and text files from a server in a

A

particular order

14
Q

os.listdir() is a function from the …. module

A

os

15
Q

We can use the command listdir to see what is in a

A

particular directory

16
Q

We can check the current working directory with the os module by using

A

os.getcwd()

17
Q

We can list the contents of the current working directory using a function from os called

A

os.listdir()

18
Q

We can list the contents of a different directory by passing its path to os.listdir(), e.g.,

A

For instance, os.listdir('/content/pin-material') lists the contents of the ‘/content/pin-material’ directory.

19
Q

Explain this code (using YNiC’s git server to download useful files like pin-material-git) - (8)

A
  • This code uses the os module
  • os.getcwd() prints the current working directory.
  • os.listdir('.') lists the contents of the current directory and stores the result in a variable called ‘contents’.
  • '.' represents the current directory.
  • type(contents) prints the type of the variable contents.
  • print(contents) prints the contents of the current directory.
  • os.listdir('/content/pin-material') lists the contents of a different directory, ‘/content/pin-material’, and stores the result in the variable ‘newcontents’.
  • The contents of the ‘newcontents’ variable are printed out.
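
The original code for this card was an image that has not survived, so the following is only a sketch reconstructed from the bullet points. The temporary directory `demo_dir` and its two files are assumptions standing in for ‘/content/pin-material’, so that the example runs anywhere.

```python
import os
import tempfile

# Hypothetical stand-in for '/content/pin-material': a temporary
# directory holding two empty files.
demo_dir = tempfile.mkdtemp()
for name in ['README.md', 'fft_bw.jpg']:
    open(os.path.join(demo_dir, name), 'w').close()

print(os.getcwd())                  # the current working directory

contents = os.listdir('.')          # '.' means "this directory"
print(type(contents))               # os.listdir returns a list
print(contents)

newcontents = os.listdir(demo_dir)  # a different directory, given by its path
print(newcontents)
```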
20
Q

Output of this code

A
21
Q

Both os.listdir('.') and os.listdir() do the same thing:

A

listing the contents of the current directory.

22
Q

Remember, ‘..’ means

A

‘go up one directory’

23
Q

‘.’ means

A

‘this directory’

24
Q

This is what pin-material directory looks like in file format:

A
25
Q

os.listdir() includes hidden files, which start with a…

Hidden files may….

You may need to filter… - (3)

A

with a dot (e.g., .DS_Store)

  • Hidden files may not be useful and can clutter the list.
  • You may need to filter out hidden files from the list returned by os.listdir()
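
One possible way to filter out hidden files is sketched below. The directory `demo_dir` and its file names are made up for illustration; the only rule used is the one stated above, that hidden files start with a dot.

```python
import os
import tempfile

# Hypothetical directory containing one hidden file and two ordinary ones.
demo_dir = tempfile.mkdtemp()
for name in ['.DS_Store', 'README.md', 'fft_bw.jpg']:
    open(os.path.join(demo_dir, name), 'w').close()

all_names = os.listdir(demo_dir)

# Keep only the names that do not start with a dot.
visible = []
for name in all_names:
    if not name.startswith('.'):
        visible.append(name)

print(sorted(all_names))   # the hidden file sorts first
print(sorted(visible))
```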
26
Q

Example of os.listdir() including hidden files (e.g., .DS_Store)

A
27
Q

A more useful function than os.listdir is

A

glob function from glob module

28
Q

What does ‘glob’ stand for?

A

It is short for ‘global pattern match’

29
Q
  • The glob function from the glob module is used to
A

find files and directories matching a specific pattern.

30
Q

The ‘glob’ function from glob module allows you to use special characters such as ‘*’ and ‘?’ to

A

search for strings that match certain patterns.

31
Q

Example of using glob on YNiC pin material

A
32
Q

Example of using YNiC pin material directory

Explain the code - (5)

A
  • Importing the glob function is achieved with from glob import glob.
  • filelist = glob('/content/pin-material/*.jpg') finds all .jpg files in the ‘pin-material’ directory.
  • print(filelist) displays the list of .jpg files found.
  • pyFiles= glob('/content/pin-material/*.py') finds all Python script files.
  • print(sorted(pyFiles)) prints the Python script files as a sorted list - in ascending order
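
The code being explained was an image, so this is a sketch rebuilt from the bullets rather than the original. It assumes a temporary directory (`demo_dir`, with invented contents) in place of ‘/content/pin-material’.

```python
import os
import tempfile
from glob import glob

# Hypothetical stand-in for the pin-material directory.
demo_dir = tempfile.mkdtemp()
for name in ['fft_bw.jpg', 'fft_colour.jpg', 'README.md',
             'pop3_test_script.py', 'pop2_tidy_script1.py']:
    open(os.path.join(demo_dir, name), 'w').close()

# All .jpg files; each result is a full path because the pattern was a full path.
filelist = glob(os.path.join(demo_dir, '*.jpg'))
print(filelist)

# All Python scripts, printed in ascending (alphabetical) order.
pyFiles = glob(os.path.join(demo_dir, '*.py'))
print(sorted(pyFiles))
```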
33
Q

Output of this code:

A
34
Q

We see in this code that glob returns whatever path we used in the argument

Therefore if we use the full path (as we did above) we now have a set of full paths

In other words:

A
  • When provided with the full path as an argument, glob returns a list of full paths.
35
Q

We could then use this list in a loop to open multiple files and load the data from

A

each one in turn
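
A minimal sketch of such a loop follows. The participant files, their columns and the `row_counts` dictionary are invented for illustration, and the standard-library csv module is used here instead of Pandas only to keep the example self-contained.

```python
import csv
import os
import tempfile
from glob import glob

# Hypothetical experiment directory: one small .csv file per participant,
# plus a .log file that the analysis should ignore.
demo_dir = tempfile.mkdtemp()
for subject in [1, 2, 3]:
    path = os.path.join(demo_dir, 'subject' + str(subject) + '.csv')
    with open(path, 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(['trial', 'rt'])          # header row
        writer.writerow([1, 300 + subject])       # one made-up data row
open(os.path.join(demo_dir, 'run.log'), 'w').close()

row_counts = {}
for filename in sorted(glob(os.path.join(demo_dir, '*.csv'))):
    with open(filename, newline='') as f:
        rows = list(csv.reader(f))
    n_data_rows = len(rows) - 1                   # do not count the header
    row_counts[os.path.basename(filename)] = n_data_rows
    print(filename, 'has', n_data_rows, 'data rows')
```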

36
Q

We can use the sorted function to list these hidden files first when using os.listdir

A
37
Q

What are wildcard characters in the context of glob?

A

Wildcard characters are special symbols used in glob patterns to match filenames or paths.

38
Q

List all the wildcard characters used in the glob function - (4)

A
  1. * (an asterisk)
  2. ? (a question mark)
  3. [1234] (a list of characters)
  4. [1-9] (a range of characters)
39
Q

Explain the wildcard ‘*’ in glob - (2)

A
  • It matches any set of characters, including no characters at all.
  • For example, ‘file*.txt’ matches ‘file.txt’, ‘file123.txt’, and ‘file_backup.txt’, but not ‘file.txt.backup’, because the name must end in ‘.txt’.
40
Q

What does the ‘?’ wildcard match in glob? - (2)

A
  • It matches any single character.
  • For example, ‘file?.txt’ matches ‘file1.txt’, ‘fileA.txt’, but not ‘file12.txt’.
41
Q

How does the wildcard ‘[1234]’ work in glob? - (2)

A
  • ‘[1234]’ is a wildcard character in glob that matches any single character from the list [1234].
  • For example, ‘file[1234].txt’ matches ‘file1.txt’, ‘file2.txt’, but not ‘file5.txt’.
42
Q

Explain the ‘[1-9]’ wildcard in glob - (2)

A
  • ‘[1-9]’ is a wildcard character in glob that matches any single character in the range from 1 to 9.
  • For example, ‘file[1-9].txt’ matches ‘file1.txt’, ‘file2.txt’, but not ‘file10.txt’.
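
The four wildcards can be tried out without creating any files. This sketch uses fnmatchcase from the standard-library fnmatch module, which applies the same wildcard rules to plain strings; the file names and the `matches` dictionary are made up for illustration.

```python
from fnmatch import fnmatchcase

names = ['file.txt', 'file1.txt', 'file5.txt', 'file12.txt', 'fileA.txt']
patterns = ['file*.txt', 'file?.txt', 'file[1234].txt', 'file[1-9].txt']

matches = {}
for pattern in patterns:
    matched = []
    for name in names:
        if fnmatchcase(name, pattern):   # True if the name fits the pattern
            matched.append(name)
    matches[pattern] = matched
    print(pattern, '->', matched)
```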
43
Q

Here are the files in the directory:

[‘pop2_tidy_script2.py’,
‘s3’,
‘README.md’,
‘app_headshape.xlsx’,
‘fft_colour.jpg’,
‘pop2_tidy_script1.py’,
‘fft_bw.jpg’,
‘.DS_Store’,
‘s4’,
‘app_headshape.bin’,
‘pop2_debug_script2.py’,
‘pop2_debug_script1.py’,
‘.git’,
‘pop3_test_script.py’]

What would glob(‘/content/pin-material/fft*’) print? - (4)

A

The glob pattern ‘/content/pin-material/fft*’ matches all files in the ‘/content/pin-material’ directory that start with ‘fft’.

  • From the given list of files:
    • ‘fft_colour.jpg’ and ‘fft_bw.jpg’ match the pattern.
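    • Therefore, glob(‘/content/pin-material/fft*’) would return a list of the two full paths, ‘/content/pin-material/fft_colour.jpg’ and ‘/content/pin-material/fft_bw.jpg’ (in no guaranteed order), because glob returns whatever path was used in the pattern.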
44
Q

Here are the files in the directory:

[‘pop2_tidy_script2.py’,
‘s3’,
‘README.md’,
‘app_headshape.xlsx’,
‘fft_colour.jpg’,
‘pop2_tidy_script1.py’,
‘fft_bw.jpg’,
‘.DS_Store’,
‘s4’,
‘app_headshape.bin’,
‘pop2_debug_script2.py’,
‘pop2_debug_script1.py’,
‘.git’,
‘pop3_test_script.py’]

What would glob(‘/content/pin-material/*md’) print? - (4)

A
  • The glob pattern ‘/content/pin-material/*md’ matches all files in the ‘/content/pin-material’ directory that end with ‘md’.
  • Based on the * wildcard, which matches any set of characters, it will find files ending with ‘md’.
  • From the given list of files, ‘README.md’ matches the pattern.
  • Therefore, glob(‘/content/pin-material/*md’) would return [‘/content/pin-material/README.md’], because glob returns whatever path was used in the pattern.
45
Q

Here are the files in the directory:

[‘pop2_tidy_script2.py’,
‘s3’,
‘README.md’,
‘app_headshape.xlsx’,
‘fft_colour.jpg’,
‘pop2_tidy_script1.py’,
‘fft_bw.jpg’,
‘.DS_Store’,
‘s4’,
‘app_headshape.bin’,
‘pop2_debug_script2.py’,
‘pop2_debug_script1.py’,
‘.git’,
‘pop3_test_script.py’]

What would glob(‘/content/pin-material/pop?_*’) print? - (7)

A

The glob pattern ‘/content/pin-material/pop?_*’ utilizes two wildcard characters: ‘?’ and ‘*’.

  • ‘?’ matches any single character, allowing for flexibility in matching filenames.
  • ‘*’ matches any set of characters, including no characters at all.
  • Therefore, the pattern matches files in the ‘/content/pin-material’ directory that start with ‘pop’, followed by any single character, and then an underscore, and then any set of characters.
  • Based on this pattern:
    • ‘pop2_tidy_script2.py’, ‘pop2_tidy_script1.py’, ‘pop2_debug_script2.py’, ‘pop2_debug_script1.py’ and ‘pop3_test_script.py’ all match (in ‘pop3_test_script.py’ the single character is ‘3’).
  • Therefore, glob(‘/content/pin-material/pop?_*’) would return the full paths of these five files (each prefixed with ‘/content/pin-material/’), in no guaranteed order.
46
Q

Here are the files in the directory:

[‘pop2_tidy_script2.py’,
‘s3’,
‘README.md’,
‘app_headshape.xlsx’,
‘fft_colour.jpg’,
‘pop2_tidy_script1.py’,
‘fft_bw.jpg’,
‘.DS_Store’,
‘s4’,
‘app_headshape.bin’,
‘pop2_debug_script2.py’,
‘pop2_debug_script1.py’,
‘.git’,
‘pop3_test_script.py’]

What would glob(‘/content/pin-material/pop*’) print? - (4)

A
  • The glob pattern ‘/content/pin-material/pop*’ matches all files in the ‘/content/pin-material’ directory that start with ‘pop’.
  • Based on the ‘*’ wildcard, which matches any set of characters, it will find files that start with ‘pop’.
  • From the given list of files, ‘pop2_tidy_script2.py’, ‘pop2_tidy_script1.py’, ‘pop2_debug_script2.py’, ‘pop2_debug_script1.py’, and ‘pop3_test_script.py’ match the pattern.
  • Therefore, glob(‘/content/pin-material/pop*’) would return the full paths of these five files (each prefixed with ‘/content/pin-material/’), in no guaranteed order.
47
Q

Here are the files in the directory:

[‘pop2_tidy_script2.py’,
‘s3’,
‘README.md’,
‘app_headshape.xlsx’,
‘fft_colour.jpg’,
‘pop2_tidy_script1.py’,
‘fft_bw.jpg’,
‘.DS_Store’,
‘s4’,
‘app_headshape.bin’,
‘pop2_debug_script2.py’,
‘pop2_debug_script1.py’,
‘.git’,
‘pop3_test_script.py’]

What would glob(‘/content/pin-material/pop?_tidy_script[1-2]*’) print? - (6)

A
  • The glob pattern ‘/content/pin-material/pop?_tidy_script[1-2]*’ matches files in the ‘/content/pin-material’ directory that start with ‘pop’, followed by any single character, then ‘_tidy_script’, then either ‘1’ or ‘2’, and then any set of characters.
  • ’?’ matches any single character, allowing flexibility in matching filenames.
  • ‘[1-2]’ matches either ‘1’ or ‘2’.
  • ‘*’ matches any set of characters, including no characters at all.
  • From the given list of files, ‘pop2_tidy_script2.py’ and ‘pop2_tidy_script1.py’ match the pattern.
  • Therefore, glob(‘/content/pin-material/pop?_tidy_script[1-2]*’) would return the full paths ‘/content/pin-material/pop2_tidy_script2.py’ and ‘/content/pin-material/pop2_tidy_script1.py’, in no guaranteed order.
48
Q

Here are the files in the directory:

[‘pop2_tidy_script2.py’,
‘s3’,
‘README.md’,
‘app_headshape.xlsx’,
‘fft_colour.jpg’,
‘pop2_tidy_script1.py’,
‘fft_bw.jpg’,
‘.DS_Store’,
‘s4’,
‘app_headshape.bin’,
‘pop2_debug_script2.py’,
‘pop2_debug_script1.py’,
‘.git’,
‘pop3_test_script.py’]

What would glob(‘/content/pin-material/fft*.jpg’) print? - (4)

A
  • The glob pattern ‘/content/pin-material/fft*.jpg’ matches files in the ‘/content/pin-material’ directory that start with ‘fft’, followed by any set of characters, and end with ‘.jpg’.
  • ‘*’ matches any set of characters, including no characters at all.
  • From the given list of files, ‘fft_colour.jpg’ and ‘fft_bw.jpg’ match the pattern.
  • Therefore, glob(‘/content/pin-material/fft*.jpg’) would return the full paths ‘/content/pin-material/fft_colour.jpg’ and ‘/content/pin-material/fft_bw.jpg’, in no guaranteed order.
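
The pattern questions above can be checked with a short script. This is only a sketch: it recreates the listed names as empty files in a temporary directory (`demo_dir`, an assumption standing in for ‘/content/pin-material’; in the real directory some of these entries are presumably folders) and records just the base names of the matches in a made-up `results` dictionary.

```python
import os
import tempfile
from glob import glob
from os.path import basename

names = ['pop2_tidy_script2.py', 's3', 'README.md', 'app_headshape.xlsx',
         'fft_colour.jpg', 'pop2_tidy_script1.py', 'fft_bw.jpg', '.DS_Store',
         's4', 'app_headshape.bin', 'pop2_debug_script2.py',
         'pop2_debug_script1.py', '.git', 'pop3_test_script.py']

# Hypothetical stand-in directory; every entry is created as an empty file.
demo_dir = tempfile.mkdtemp()
for name in names:
    open(os.path.join(demo_dir, name), 'w').close()

patterns = ['fft*', '*md', 'pop?_*', 'pop*',
            'pop?_tidy_script[1-2]*', 'fft*.jpg']

results = {}
for pattern in patterns:
    found = sorted(glob(os.path.join(demo_dir, pattern)))  # full paths, sorted
    short = []
    for path in found:
        short.append(basename(path))                       # strip the directory
    results[pattern] = short
    print(pattern, '->', short)
```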
49
Q

There are cases where you might have full paths (e.g. from glob above) and need to split them up into directory and filename. You may also want to split out the extension of a file from the main part of it (i.e.

A

turn myfile.txt into myfile and txt).

50
Q

There are cases where you might have full paths (e.g. from glob above) and need to split them up into directory and filename. You may also want to split out the extension of a file from the main part of it (i.e. turn myfile.txt into myfile and txt).

You are already thinking of the split() function right? Well that can work but in addition, there are three os functions that can help you with that - (3)

A

1) basename
2) dirname
3) splitext

51
Q

How do we import the three os.path functions that help with splitting full file paths?

basename, dirname, splitext

A
52
Q

Functions like basename, dirname, and splitext from the os.path module can help split full paths into directory, filename, and file extension.

  • These functions provide a convenient way to
A

extract different parts of a file path.

53
Q

Explain the basename function from the os.path module - (3)

A
  • The basename function, from the os.path module, extracts the filename from a full path.
  • It returns the last component of the path, excluding the directory.
  • For example, basename(‘/content/pin-contents/s4/s4_rt_data_part01.hdf5’) would return ‘s4_rt_data_part01.hdf5’.
54
Q

Explain the dirname function from the os.path module.

A
  • The dirname function, from the os.path module, extracts the directory name from a full path.
  • It returns the directory component of the path, excluding the filename.
  • For example, dirname(‘/content/pin-contents/s4/s4_rt_data_part01.hdf5’) would return ‘/content/pin-contents/s4’.
55
Q

Explain the splitext function from the os.path module - (3)

A
  • The splitext function, from the os.path module, splits a filename into its base name and extension.
  • It returns a tuple containing the base name and the extension separately.
  • For example, splitext(‘/content/pin-contents/s4/s4_rt_data_part01.hdf5’) would return (‘/content/pin-contents/s4/s4_rt_data_part01’, ‘.hdf5’)
56
Q

The splitext function returns a tuple (you can treat it as a list) of two items. - (2)

A

The first element is everything except the extension of the file and the second element is the extension (including the leading .).

57
Q

We can use basename, dirname and splitext on variables - (3)

e.g., my_path = ‘/content/pin-contents/s4/s4_rt_data_part01.hdf5’

A

dname = dirname(my_path)

fname = basename(my_path)

print(splitext(my_path))
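
Put together with the import, the three calls might look like the sketch below. The path is the example string from the card; no such file needs to exist, because these functions only manipulate the text of the path. The variable `parts` is an added name.

```python
from os.path import basename, dirname, splitext

my_path = '/content/pin-contents/s4/s4_rt_data_part01.hdf5'

dname = dirname(my_path)     # the directory part
fname = basename(my_path)    # the filename part
parts = splitext(fname)      # (name without extension, extension)

print(dname)
print(fname)
print(splitext(my_path))     # splitting the full path keeps the directory
print(parts)
```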

58
Q

What does the splitext function do when applied to the full path? - (3)

A
  • The splitext function splits the full path into its base name and extension.
  • When applied to the full path, it returns a tuple containing the base name and the extension separately.
  • For example, splitext(‘/content/pin-contents/s4/s4_rt_data_part01.hdf5’) would return (‘/content/pin-contents/s4/s4_rt_data_part01’, ‘.hdf5’).
59
Q

What does the splitext function do when applied to just the filename? - (3)

A
  • When applied to just the filename, the splitext function splits the filename into its base name and extension.
  • It returns a tuple containing the base name and the extension separately.
  • For example, if fname is ‘s4_rt_data_part01.hdf5’, splitext(fname) would return (‘s4_rt_data_part01’, ‘.hdf5’).
60
Q

Write code that uses glob to find all of the files in /content/pin-contents/s4 that end in .hdf5. Sort this list, loop over it and print out just the filename without the extension. Your output should look like:

A
61
Q

Explain this code - (8)

A

The code first imports the glob function from the glob module and the basename, splitext and dirname functions from the os.path module.

Use glob to find files:
The glob function searches for all files ending with ‘.hdf5’ in the ‘/content/pin-material/s4/’ directory.
The resulting list of file paths is stored in the fileList variable.

A for loop iterates over each file path in fileList, storing the current path in thisFileName.

In each iteration:

fNameOnly stores the filename extracted from the full path thisFileName using basename (e.g., if thisFileName is ‘/content/pin-material/s4/s4_rt_data_part04.hdf5’, then fNameOnly will store ‘s4_rt_data_part04.hdf5’).

parts = splitext(fNameOnly): splitext splits the filename stored in fNameOnly into its base name and extension. The base name is stored in parts[0] and the extension in parts[1].
For example, if fNameOnly is ‘s4_rt_data_part04.hdf5’, then after splitting:
parts[0] will store ‘s4_rt_data_part04’ (the base name).

print(parts[0]): Only the base name stored in parts[0] is printed. For example, if parts[0] is ‘s4_rt_data_part04’, then this base name will be printed.

The for loop continues until every element of fileList has been processed.
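
The code itself was an image, so the following is a sketch reconstructed from the explanation, not the original. The variable names fileList, thisFileName, fNameOnly and parts come from the card; the temporary directory, its invented files and the extra `printed` list are assumptions added so the example runs anywhere and can be checked.

```python
import os
import tempfile
from glob import glob
from os.path import basename, splitext

# Hypothetical stand-in for the s4 data directory: three empty .hdf5
# files plus one file that the pattern should ignore.
demo_dir = tempfile.mkdtemp()
for name in ['s4_rt_data_part02.hdf5', 's4_rt_data_part01.hdf5',
             's4_rt_data_part03.hdf5', 'notes.txt']:
    open(os.path.join(demo_dir, name), 'w').close()

# Find every .hdf5 file and sort the resulting list of full paths.
fileList = sorted(glob(os.path.join(demo_dir, '*.hdf5')))

printed = []   # not in the description; kept only so the result can be checked
for thisFileName in fileList:
    fNameOnly = basename(thisFileName)   # drop the directory
    parts = splitext(fNameOnly)          # (base name, extension)
    print(parts[0])                      # base name only
    printed.append(parts[0])
```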

62
Q

Output of this code

A
63
Q

There are two additional things we can do with lists which can make our code more concise and easier to read and write

A

These are list comprehensions and list enumeration.

64
Q

We know how to make a list both by hand and by the range function

A
65
Q

Explain this code - (3)

A
  • list1=[0,1,2,3,4,5]: Defines a list named list1 containing integers 0 through 5, entered manually.
  • list2=list(range(6)): Creates a list named list2 using the range() function to generate integers from 0 to 5.
  • Prints both list1 and list2
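
A sketch of the code these bullets describe (the original was an image):

```python
list1 = [0, 1, 2, 3, 4, 5]   # typed in by hand
list2 = list(range(6))       # generated: range(6) counts from 0 up to 5

print(list1)
print(list2)
```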
66
Q

We often need to manipulate the contents of data in lists and have learned to do this by using

A

for loops

67
Q

Example of manipulating the contents of a list using a loop:

A
68
Q

Explain this code - (8)

A
  • input_list = range(10): Creates a range object containing integers from 0 to 9 (not including 10), assigned to input_list.
  • output_list = []: Initializes an empty list named output_list.
  • for value in input_list:: Iterates over each value in input_list.
    • Inside the loop:
      • value takes on each value from input_list in sequence.
      • output_list.append(value * 2): Multiplies each value by 2 and appends the result to output_list.
  • print(list(input_list)): Prints the contents of input_list, displaying integers from 0 to 9.
  • print(output_list): Prints the contents of output_list, displaying each element multiplied by 2.
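
Reconstructed from the bullets above (the original code was an image), the loop might look like this:

```python
input_list = range(10)        # the integers 0 to 9
output_list = []              # start with an empty list

for value in input_list:
    output_list.append(value * 2)   # double each value and store it

print(list(input_list))
print(output_list)
```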
69
Q

What would be its output?

A
70
Q

For cases where we need to implement a simple transformation like this (such as multiplying by a number or calling a function on each member of a list), like in this example,

A

Python gives us an alternative: the list comprehension.

71
Q

What does ‘list comprehension’ mean in Python?

A

A list comprehension is simply a statement inside square brackets which tells Python how to construct the list.

72
Q

How do we build the list ‘output_list’ using a list comprehension?

A
73
Q

Explain this code - (2)

A

The example above therefore reads as (x * 2) for each value (x) in range(10). i.e., for each value in the list produced by range(10), put it in the variable x, then put the value x*2 into the list.

Note that the variable x is just a placeholder and could be called anything.
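
As a sketch (the original code was an image), the comprehension described here can be written as follows; the name output_list is taken from the earlier loop example.

```python
# "x * 2 for each value x in range(10)"
output_list = [x * 2 for x in range(10)]
print(output_list)
```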

74
Q

What would be output of this code?

A

[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]

75
Q

The trick with list comprehensions is to read them out

A

loud to yourself.

76
Q

List comprehension works with

A

any sort of list and any sort of data, e.g.,

77
Q

Explain this code - (6)

A
  • original_data = ['Alex', 'Bob', 'Catherine', 'Dina']: Defines a list named original_data containing four strings.
  • new_list = ['Hello ' + x for x in original_data]: Utilizes list comprehension to create a new list named new_list.
    • For each element x in original_data, the expression 'Hello ' + x concatenates ‘Hello ‘ with the value of x, which represents each name in original_data.
    • The resulting strings are added to new_list.
  • print(original_data): Prints the contents of original_data, displaying the original list of names.
  • print(new_list): Prints the contents of new_list, displaying each name prefixed with ‘Hello ‘.
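
Reconstructed from the bullets (the original code was an image), this might read:

```python
original_data = ['Alex', 'Bob', 'Catherine', 'Dina']

# Build a new list by putting 'Hello ' in front of every name.
new_list = ['Hello ' + x for x in original_data]

print(original_data)
print(new_list)
```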
78
Q

What would be output of this code?

A
79
Q

We can also call functions in

A

list comprehension

e.g.,

80
Q

Explain this code - (6)

A
  • original_data = ['This', 'is', 'a', 'test']: Defines a list named original_data containing four strings.
  • new_list = [len(x) for x in original_data]: Utilizes list comprehension to create a new list named new_list.
    • For each element x in original_data, the expression len(x) calculates the length of the string x.
    • The resulting lengths are added to new_list.
  • print(new_list): Prints the contents of new_list, displaying the length of each string in original_data.
  • For example, ‘This’ has 4 characters, ‘is’ has 2 characters, ‘a’ has 1 character, and ‘test’ has 4 characters.
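
A sketch of the code these bullets describe (the original was an image):

```python
original_data = ['This', 'is', 'a', 'test']

# Call len() on every string to get a list of word lengths.
new_list = [len(x) for x in original_data]

print(new_list)
```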
81
Q

What would be output of this code?

A
82
Q

Use list comprehension to make the code fragment shorter:

A
83
Q

What would be output of this code?

A
84
Q

Use list comprehension to make the code fragment shorter:

A
85
Q

What would be output of this code?

A
86
Q

What is the purpose of the pass statement in Python? - (2)

A

The pass statement in Python serves as a placeholder and does nothing when executed.

It is often used to create empty loops, functions, or classes.

87
Q

Example of using pass in an empty loop - (5)

A
  • data1 = [10, 20, 30] creates a list containing the three numbers 10, 20 and 30.
    • The for loop iterates over each item in data1.
    • Inside the loop, a comment explains that the loop is pointless but serves as a placeholder for future code.
    • The pass statement is used to indicate that no action needs to be taken inside the loop.
  • Essentially, pass allows the loop to exist without any executable code inside it, avoiding syntax errors in situations where a loop is required but no action is necessary.
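
Reconstructed from the description above (the original code was an image), the loop might look like this; the loop variable name `item` is an assumption.

```python
data1 = [10, 20, 30]

for item in data1:
    # This loop is pointless for now: it is a placeholder for future code.
    pass
```

Removing the pass line would leave the loop body empty, and Python would refuse to run the script.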
88
Q

What would be output of this code? - (3)

A

The pass statement itself does nothing when executed and serves as a placeholder.

  • As a result, there are no print statements or other operations that would produce output

NO OUTPUT

89
Q

Explain the use of break in this code snippet - (5)

A
  • It imports the randint function from the random module to generate random integer numbers.
  • Inside a while loop that runs indefinitely (while True), random numbers between 0 and 5 (inclusive) are generated and printed.
  • If the randomly generated number that was just printed is equal to 0, the break statement is executed.
  • The break statement immediately terminates the while loop; execution continues with any code after the loop, or the program ends if there is none.
  • This allows the program to stop generating random numbers once a 0 is encountered.
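
A sketch of what this snippet might look like, based on the bullets (the original was an image). The `drawn` list and the final print are additions for illustration, so that it is easy to see when and why the loop stopped.

```python
from random import randint

drawn = []                    # added here to record what was generated
while True:                   # would run forever without the break
    number = randint(0, 5)    # random integer from 0 to 5 inclusive
    print(number)
    drawn.append(number)
    if number == 0:
        break                 # leave the loop as soon as a 0 appears

print('Stopped after', len(drawn), 'numbers')
```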
90
Q

Python produces an error if a for loop has

A

no code to execute inside it

91
Q

We are now looking at how to load different data formats

A

(other than .csv)

92
Q

Files come in different

A

formats

93
Q

The format of a file means how the data

A

are stored inside it.

94
Q

In some files the data are stored as ‘plain text’. You can open them in

A

a text editor (like Spyder or Notepad or you can look at them using cat on the command line) and read them although they might not make a lot of sense.

95
Q

In some files the data are stored as ‘plain text’. You can open them in a text editor (like Spyder or Notepad or you can look at them using cat on the command line) and read them although they might not make a lot of sense

A

Files like this include .csv files, files ending in .txt and most types of programming language ‘source code’ like Python files (.py) and web pages (.html).

96
Q

The other ‘family’ of files store their data in formats where you cannot easily read them into a

A

text editor.

97
Q

The other ‘family’ of files store their data in formats where you cannot easily read them into a text editor. These files usually contain - (2)

A

‘numbers’ of some sort stored in a way that computers like to read.

We call them ‘binary’ files.

98
Q

Example of binary files

A

Files like this include the image and video formats that you might be familiar with (.jpg, .gif, .mp4).

99
Q

In neuroimaging, the brain data files we use tend to be in binary format (e.g.

A

.nii, .mat)

100
Q

In neuroimaging, the brain data files we use tend to be in binary format (e.g. .nii, .mat) while the files describing other stuff like experiment structure and subject responses are in

A

‘plain text’ (.csv, .txt).

101
Q

Python can read most file formats – you just need to hunt down the

A

right modules.

102
Q

Previously, we started to use matplotlib for plotting our data.

We imported the pyplot submodule like this:

A
103
Q

pyplot contains most of the functions which we will learn in this section.

Because we imported it ‘as’ plt

We will access the functions using - (2)

A

plt.FUNCTIONNAME,

e.g. plt.plot; the same way as, when using numpy, we use np.array.

104
Q

Plain text files can be formatted in a number of ways, but a common way for numerical data is:

A

i.e., numbers separated - either by spaces, tabs or commas with one row per line of text.

105
Q

Up until now we have used Pandas to load in .csv files. But numpy also knows how to load in data and sometimes we just want to have the data

A

appear directly in a numpy array.

106
Q

MEG systems arrange their sensors in a

A

‘bowl’ over the subject’s head. Like in this picture.

107
Q

Each MEG sensor measures the

A

magnetic field activity in a particular location.

108
Q

Together, MEG sensors tell you

A

what is happening all across the subject’s head at any moment.

109
Q

We are going to load some MEG data from the file s4_meg_sensor_data.txt in the

A

pin-material s4 sub-directory using this code:

110
Q

When you have a text-based data file (e.g., MEG data: s4_meg_sensor_data.txt ) always start… - (2)

A

always start by having a look at it to understand the format.

Most often we just want to see a few lines of the file to get an idea of what is in it.

111
Q

We can get an idea of what is inside a file using shell command

A

head, which prints the first 10 lines of a file by default

112
Q

Using the ‘!’ prefix, we can

A

run shell commands from the notebook. To list the contents of the “s4” subdirectory, we use the ls command:

113
Q

Explain what this code does - (6)

!ls -lh pin-material/s4

A

The ls command lists the contents of a directory.

  • -lh is a combination of options:
    • -l lists detailed information about each file, including permissions, owner, size, and modification time.
    • -h displays file sizes in a human-readable format (e.g., kilobytes, megabytes).
  • pin-material/s4 specifies the directory whose contents will be listed.
  • Therefore, this command lists detailed information about the contents of the “s4” subdirectory within the “pin-material” directory.
114
Q

Explain what this code does - (3)

!head pin-material/s4/s4_meg_sensor_data.txt

A

The head command displays the beginning of a file.

  • pin-material/s4/s4_meg_sensor_data.txt specifies the file whose beginning will be displayed.
  • This command shows the first few lines of the “s4_meg_sensor_data.txt” file located within the “s4” subdirectory of the “pin-material” directory.
115
Q

By executing this code:

A

It gives us this output

116
Q

This output shows - (8)

A

We can see that the first line of the file contains some column headings.

We make a note of these as we will need them later on:

Column 0: Time
Column 1: Left Mean
Column 2: Left Lower CI
Column 3: Left Upper CI
Column 4: Right Mean
Column 5: Right Lower CI
Column 6: Right Upper CI

117
Q

Executing this code line gives us this output:

A
118
Q

We can use the tail command on the MEG data file to check that the last 10 lines of the file really are numbers

A
119
Q

The tail command is used to

A

display the last 10 lines of a file (by default).
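
The same quick look can be taken from Python rather than the shell. This is only a sketch: the file name and its contents are made up for the demonstration and are not the MEG file.

```python
import os
import tempfile
from collections import deque
from itertools import islice

# Hypothetical stand-in file: one header line plus 25 numeric lines.
demo_path = os.path.join(tempfile.mkdtemp(), "demo_data.txt")
with open(demo_path, "w") as f:
    f.write("Time Value\n")
    for i in range(25):
        f.write(f"{i} {i * 0.5}\n")

# Like `head`: read only the first 10 lines.
with open(demo_path) as f:
    first_lines = [line.rstrip("\n") for line in islice(f, 10)]

# Like `tail`: keep only the last 10 lines while reading through the file.
with open(demo_path) as f:
    last_lines = [line.rstrip("\n") for line in deque(f, maxlen=10)]

print(first_lines[0])   # the header line
print(last_lines[-1])   # the final data line
```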

120
Q

What does the following code snippet do, and why might it encounter an issue?

A

The code snippet uses NumPy’s “loadtxt” function to load numerical data from a text file.

  • It attempts to load data from the file “s4_meg_sensor_data.txt” located within the “s4” subdirectory of the “pin-material” directory.
  • However, it may encounter an issue if the first line of the file contains column names or non-numeric data instead of numerical values.
  • By default, “loadtxt” expects numeric data and will raise a ValueError if it encounters non-numeric content in the first row.
121
Q

We can correct this by:

A

telling numpy to skip the first line (skiprows=1), which is the header containing the column names as strings (words) rather than numbers.

122
Q

What would be the shape of the data and type?

A

(400, 7) - 400 rows and 7 columns
float64
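
A minimal sketch of the header problem and the skiprows fix. The file here is a small made-up stand-in (5 rows instead of 400), so the shape it reports differs from the real MEG file.

```python
import os
import tempfile
import numpy as np

# Hypothetical stand-in for the MEG file: a header line, then 5 rows of 7 numbers.
header = "Time LeftMean LeftLowerCI LeftUpperCI RightMean RightLowerCI RightUpperCI"
demo_path = os.path.join(tempfile.mkdtemp(), "demo_meg.txt")
with open(demo_path, "w") as f:
    f.write(header + "\n")
    for i in range(5):
        f.write(" ".join(str(i + 0.1 * c) for c in range(7)) + "\n")

# Without skiprows, the text header cannot be converted to numbers.
try:
    np.loadtxt(demo_path)
    header_failed = False
except ValueError:
    header_failed = True

# Skipping the first row leaves only numeric lines.
imported = np.loadtxt(demo_path, skiprows=1)
print(header_failed, imported.shape, imported.dtype)
```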

123
Q

After loading plain text files using np.loadtxt (e.g., ‘pin-material/s4/s4_meg_sensor_data.txt’, skiprows=1), we can save them out using

A

np.savetxt.

124
Q

Explain the code - (9)

A

This code snippet uses NumPy’s “loadtxt” function to load numerical data from a text file.

  • It loads data from the file “s4_meg_sensor_data.txt” located within the “s4” subdirectory of the “pin-material” directory, skipping the first row (which likely contains column names or non-numeric information).
  • The loaded data is stored in the variable “importedData”.
  • This line extracts a subset of the loaded data (“importedData”).
  • It selects columns 1, 2, and 3 from the loaded data (the end index of a slice is exclusive, so the slice stops before column 4).
  • The extracted data, representing the left mean timecourse and its confidence intervals, is stored in the variable “ourdata”.
  • The “savetxt” function from NumPy is used to save the extracted data (“ourdata”) to a new text file named “my_new_meg_data.txt”.
  • This file is saved in the “/content/pin-material” directory.
  • After saving the data, the “!ls -lth” command is executed to list the files in the directory, providing information about file sizes and modification times.
125
Q

explain what this line np.savetxt mean in this code - (6)

A

np.savetxt: This function from the NumPy library is used to save data to a text file.

‘my_new_meg_data.txt’: Specifies the name of the text file where the data will be saved.

ourdata: Represents the data to be saved. This variable holds the extracted subset of the loaded data, which includes columns 1, 2, and 3.

header=’Mean UppcrCI LowerCI’: Defines a header string that will be written at the beginning of the file.

This header typically contains column names or other descriptive information. In this case, the header string specifies the column names as “Mean”, “UppcrCI”, and “LowerCI”.

fmt=’%1.4e’: Specifies the format string for writing the data. The %1.4e format specifier formats floating-point numbers with scientific notation (exponential format) and exactly 4 digits after the decimal point. This ensures that the data is written with a precision of 4 decimal places.
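
A small sketch of the slice-then-save pattern described above. The array, the output file name and the header names are invented for illustration.

```python
import os
import tempfile
import numpy as np

# Made-up 4x7 array standing in for the loaded MEG data.
imported = np.arange(28, dtype=float).reshape(4, 7) / 3.0

# Columns 1, 2 and 3: the slice 1:4 stops before column 4.
subset = imported[:, 1:4]

out_path = os.path.join(tempfile.mkdtemp(), "demo_subset.txt")
# Hypothetical column names for the header line.
np.savetxt(out_path, subset, header="ColA ColB ColC", fmt="%1.4e")

with open(out_path) as f:
    lines = f.read().splitlines()
print(lines[0])  # header line, which savetxt prefixes with "# "
print(lines[1])  # first data row in exponential notation
```

Because the header line starts with "#", np.loadtxt treats it as a comment, so a file written this way can be read back without skiprows.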

126
Q

The MEG data are

A

average timecourses measured from some sensors on the left and right side of the head after a ‘beep’.

127
Q

Usually in MEG we present the same stimulus many times and combine the recordings from each presentation to get an

A

‘average’ response.

128
Q

Previously we loaded or made small arrays and plotted them with matplotlib. We can pass data to matplotlib either as a

A

list or as a numpy array.

129
Q

Plot data as a list - example code

Explain what’s happening in plt.plot([2, 3, 4, 5]) - (4)

A

Since only one set of values is passed, these values will be used as the y-values, and the x-values will be automatically generated as the indices of the data points (0, 1, 2, 3).

In this example, we did not supply an ‘X’ axis. matplotlib assumed that we just wanted 0,1,2,3 and so on.

So we really plotted (0, 2), (1, 3), (2, 4) and (3, 5) and joined them up with a straight line.

When matplotlib draws a line in this way, it joins the points with straight lines by default.

130
Q

We might, however, want to look at the individual data points. To do this, we can add a ‘format’ string - (2)

A

This is a string which describes how to format the data in the plot.

Here we do this by just adding an ‘o’ to the function call.

131
Q

What does this line of code do?

plt.plot([2, 3, 4, 5], ‘o’)

A

This line plots the data points as circles ('o') without joining lines, giving a scatter-style plot.

132
Q

What does this line of code do?

plt.plot([2, 3, 4, 5], ‘og’)

A

This line plots the same data points as green circles ('og') without joining lines.

133
Q

What does this line of code do?

plt.plot([2, 3, 4, 5], 'o--')

A

This line plots the data points as circles joined by a dashed line ('o--').

134
Q

What are some commonly used markers in Matplotlib? - (8)

A

In Matplotlib, markers are symbols used to denote individual data points in plots.

Here are some commonly used markers:

  • '.': Point marker
  • 'o': Circle marker
  • 'v': Downward triangle marker
  • '^': Upward triangle marker
  • '+': Plus marker
  • '*': Star marker
135
Q

As well as plotting markers, you can also change the line type in matplotlib

A

For instance, '--' means use a dashed line whilst '-.' means use a dash-dotted line

136
Q

In matplotlib we can also

A

adjust the colour of our lines or markers using the format string.

137
Q

By default, matplotlib will use its own colour cycling scheme to choose colours for us

A

but we can override this

138
Q

We can combine markers and colours in the same format string:

A
139
Q

Explain what this code does and output - (5)

A
  1. import matplotlib.pyplot as plt: Imports the Matplotlib library and aliases it as plt for convenience.
  2. plt.cla(): Clears the current axes to ensure a fresh plot.
  3. plt.plot([2, 3, 4, 5], 'r+'): Plots the data points [2, 3, 4, 5] with red plus markers (‘r+’).

The ‘r’ indicates the color red, and the ‘+’ specifies the marker style as a plus sign.

The datapoints are not joined by a line

140
Q

How can we specify basic colors in Matplotlib? - (8)

A
  • ‘b’: blue
  • ‘g’: green
  • ‘r’: red
  • ‘c’: cyan
  • ‘m’: magenta
  • ‘y’: yellow
  • ‘k’: black
  • ‘w’: white
141
Q

What does this code line do?

A

This code plots the points [2, 3, 4, 5] using blue stars at each data point with a dash-dotted line in between.
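
The format strings from the last few cards can be tried side by side. The code line for this card is not shown, so 'b*-.' below is an assumption: it is one format string that would give blue stars joined by a dash-dotted line.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import matplotlib.pyplot as plt

y = [2, 3, 4, 5]

plt.figure()
(circles,) = plt.plot(y, "og")     # green circles, no joining line
(dashed,) = plt.plot(y, "o--")     # circles joined by a dashed line
(plus,) = plt.plot(y, "r+")        # red plus markers, no line
(stars,) = plt.plot(y, "b*-.")     # blue stars joined by a dash-dotted line

# With two sequences, the first gives x and the second gives y.
(xy,) = plt.plot([-2, 1.5, 2, 4], y, "g*")

print(stars.get_marker(), stars.get_linestyle())
```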

142
Q

We can pass two lists/arrays of numbers: the first will be the x values and the second the y values, like this:

A
143
Q

What does this code do: plt.plot([-2, 1.5, 2, 4], [2, 3, 4, 5], ‘g*’)?

A

This code plots a graph with green star markers at the coordinates (-2, 2), (1.5, 3), (2, 4), and (4, 5), without joining them with a line.

144
Q

Why does this code not join the points together: plt.plot([-2, 1.5, 2, 4], [2, 3, 4, 5], ‘g*’)?

A

The absence of a line in the plot is due to the format string 'g*', where 'g' denotes green colour and '*' denotes star markers, but no line style is specified.

145
Q

The default behaviour of matplotlib is to add plots of new data to the

A

existing figure

146
Q

The default behaviour of matplotlib is to add plots of new data to the existing figure.

This allows us to

A

create complex plots from multiple components

147
Q

E.g.,

The default behaviour of matplotlib is to add plots of new data to the existing figure. This allows us to create complex plots from multiple components.

A

A simple example would be a line plot with many lines of data, or a scatter plot where we show different types of data with different symbols

148
Q

What does plt.ylim do?

A

It sets the limits of the y axis; in this example they go from -3 to +4.

149
Q

What does plt.grid() do?

A

It puts a grid on the plot.
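
A short sketch that combines the last few cards: two plt.plot calls land on the same figure, then the y limits and grid are set. The data values are made up.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import matplotlib.pyplot as plt

plt.figure()
plt.plot([-2, 0, 3, 1])                            # first line
plt.plot([0, 1, 2, 3], [1, -1, 2, 0], "g*")        # added to the same axes
plt.ylim(-3, 4)                                    # y axis runs from -3 to +4
plt.grid()                                         # turn on grid lines

ax = plt.gca()
print(len(ax.lines), ax.get_ylim())
```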

150
Q

What does plt.savefig('my_figure.png', dpi=300) do?

A

This code saves the current figure as an image file named “my_figure.png” with a resolution of 300 dots per inch (dpi).

151
Q

You might want to save out your figures to include them in your papers or dissertation.

You can do this using the

A

savefig function

152
Q

There are two ways to use the savefig function

A

You can either select the figure and then use plt.savefig() or you can use the .savefig() method directly on the figure object.

153
Q

You can select the figure and then use plt.savefig()

A
154
Q

You can also use the .savefig() method directly on the figure object.

A

figureHandle = plt.figure()  # Generate a new figure. Hold its ‘ID’ or ‘handle’ in a variable
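
Both ways of saving can be sketched as follows. The output folder and the second file name are hypothetical; a temporary folder is used so the example does not write into your working directory.

```python
import os
import tempfile
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import matplotlib.pyplot as plt

out_dir = tempfile.mkdtemp()  # hypothetical output folder

# Way 1: plt.savefig() saves whichever figure is currently selected.
plt.figure()
plt.plot([2, 3, 4, 5])
path1 = os.path.join(out_dir, "my_figure.png")
plt.savefig(path1, dpi=300)

# Way 2: keep the figure handle and call its .savefig() method.
figureHandle = plt.figure()
plt.plot([5, 4, 3, 2])
path2 = os.path.join(out_dir, "my_other_figure.png")
figureHandle.savefig(path2, dpi=300)

plt.close("all")  # close all existing figures
```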

155
Q

What does plt.close(‘all’) do?

A

Close all existing figures

156
Q

What does the plt.legend() part do in this code? - (2)

A

The plt.legend() function adds a legend to the plot, which provides labels for the plotted lines.

In this code, it labels the first line as “A straight line” and the second line as “A wiggly line”.

157
Q

Legends tell you what each line on a

A

plot means.

158
Q

The ‘newline’ character (‘\n’) forces a new line inside a

A

string.

159
Q

We can add a newline character inside the x-axis label and y-axis label via:

A
160
Q

Why is one line blue and one line green in this code and plot? - (3)

A

In the provided code, the line plt.plot([2, 3, 4, 5]) creates a plot with only y-values, so it auto-generates x-values as sequential integers starting from 0.

This line is plotted in the default color, which is blue.

The subsequent line plt.plot([-2, 1.5, 2, 4], [2, 3, 4, 5], ‘g’) plots the provided x and y values in green color as specified by the ‘g’ argument.

161
Q

We can also add text to a plot by

A

using plt.title to add a title to our plot; plt.text allows us to place text at arbitrary positions in our figure.

162
Q

What does plt.text do in this code? - (2)

A

The plt.text() function in this code adds text to the plot at a specified location.

In this case, it adds the text “Hello World” at the position (2, 2) on the plot, with the color set to red.
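
A sketch pulling together the legend, newline labels, title and text cards. The plotted values, axis labels and title are made up; the legend entries and the "Hello World" text follow the cards above.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import matplotlib.pyplot as plt

plt.figure()
plt.plot([0, 1, 2, 3], [0, 1, 2, 3])
plt.plot([0, 1, 2, 3], [1, 3, 0, 2])
legend = plt.legend(["A straight line", "A wiggly line"])

plt.xlabel("Time\n(made-up units)")     # \n starts a second line in the label
plt.ylabel("Signal\n(made-up units)")
plt.title("A made-up example")
note = plt.text(2, 2, "Hello World", color="red")   # text at position (2, 2)
```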

163
Q

Explain this code plotting MEG data - (6)

A

np.loadtxt() loads data from a text file, skipping the first row.

Time data is stored in the first column (t).

Sensor readings from columns 1 and 4 are extracted (plot_dat).

plot_dat.shape gives the size of the extracted data: (400, 2), i.e. 400 rows and 2 columns.

plt.plot(t, plot_dat) plots sensor data over time.

Legends and gridlines are added for clarity.

164
Q

Wait, in this code

There are two lines but only one plt.plot call! What has happened? - (3)

A

Until this point, each of our plt.plot() calls has only plotted a single line. If the data passed to plt.plot() (here plot_dat) has multiple columns, matplotlib will automatically plot one line per column - magic!

We have passed values for the x axis in t (the time variable).

Notice that we have time before 0s. Time 0 is the time we present the stimulus (in this case a ‘beep’). We see that we get a large deviation in the signals shortly after the presentation of the stimulus.
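
The one-call, two-lines behaviour can be checked with a made-up array of the same shape as the MEG data. The sine and cosine curves are placeholders, not real sensor values.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import matplotlib.pyplot as plt

# Made-up stand-in for the MEG array: 400 rows, 7 columns, time in column 0.
t = np.linspace(-0.2, 0.6, 400)
data = np.zeros((400, 7))
data[:, 0] = t
data[:, 1] = np.sin(2 * np.pi * 5 * t)    # pretend left mean
data[:, 4] = np.cos(2 * np.pi * 5 * t)    # pretend right mean

plot_dat = data[:, [1, 4]]                # shape (400, 2)

plt.figure()
lines = plt.plot(t, plot_dat)             # one call, one line per column
plt.legend(["Left", "Right"])
plt.grid()
print(plot_dat.shape, len(lines))
```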

165
Q

The MEG signals we have plotted so far are the

A

mean response across multiple presentations of the same ‘beep’.

166
Q

Someone has also computed a 95% Confidence Interval. We would like to visualise this. To do this, we will use the

A

plt.fill_between() function.

167
Q

The plt.fill_between() allows us to add ‘error bars’

A

(perhaps an ‘error envelope’ is a better description) to our plots.

168
Q

The plt.fill_between is a function in matplotlib for

A

filling the area between two curves

169
Q

What does plt.fill_between show? - (4)

A

It plots the error envelope: the shaded area between left_lower_ci and left_upper_ci along the time axis.

The t parameter defines the x-axis values.

left_lower_ci = data[:, 2] extracts column 2, the lower CI.

left_upper_ci = data[:, 3] extracts column 3, the upper CI.

170
Q

We would also like to plot the mean line (as well as the error envelope) by

A

first plotting the mean line in black using plt.plot() (color='k'), then plotting the error envelope over the top with plt.fill_between().

When we plot over the top, we have to make the colour a bit transparent, otherwise we will not see the line below.

Computers often refer to the transparency or ‘solidness’ of a colour as its ‘alpha’ value. So if we set alpha to 0.5, it will become 50% see-through. An alpha of 0.2 is even more see-through.
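
A minimal sketch of a mean line with a semi-transparent error envelope. The mean and the confidence limits are invented (a sine wave plus and minus 0.2), not the real MEG values.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import matplotlib.pyplot as plt

# Made-up mean and confidence limits.
t = np.linspace(-0.2, 0.6, 400)
left_mean = np.sin(2 * np.pi * 5 * t)
left_lower_ci = left_mean - 0.2
left_upper_ci = left_mean + 0.2

plt.figure()
plt.plot(t, left_mean, color="k")     # mean line in black
envelope = plt.fill_between(t, left_lower_ci, left_upper_ci,
                            alpha=0.5, color="g")   # 50% see-through band
```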

171
Q

We can also provide a colour argument to - (3)

A

plt.plot and plt.fill_between(), where we use ‘shorthand’ for colours: ‘green’ is ‘g’, ‘blue’ is ‘b’, ‘black’ is ‘k’ and so on.

e.g., plt.plot(data[:,0],color=’r’)

e.g. plt.fill_between(t, left_lower_ci, left_upper_ci,alpha=0.5, color=’g’)

172
Q

We can specify colours to plt.plot in different ways such as - (3)

A

Specifying a single letter, e.g. ‘r’ for red

Red, green, blue (RGB) format

Using colour names like aquamarine or mediumseagreen
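
The three ways of giving a colour might look like this. The RGB values and the data are arbitrary choices for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import matplotlib.pyplot as plt

plt.figure()
(by_letter,) = plt.plot([1, 2, 3], color="r")                # single letter
(by_rgb,) = plt.plot([2, 3, 4], color=(0.2, 0.4, 0.8))       # red, green, blue values from 0 to 1
(by_name,) = plt.plot([3, 4, 5], color="mediumseagreen")     # named colour
```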

173
Q

To visualise a distribution of

A

numbers, we can use histograms (frequency plots) and boxplots.

174
Q

What does plt.plot(data[:,0], color = 'r') mean? - (4)

A

data is a 2-D array with 10 rows and 4 columns filled with random numbers between 0 and 1 (1 itself is excluded).

data[:,0] selects all the rows of the first column (column 0) and plots them on the y axis.

Since no x-axis values are given, matplotlib generates the indices as x values: 0 to 9, since there are 10 rows in the data array.

The values from data are plotted on the y axis against their indices on the x axis as a red line.

175
Q

What does data1 = np.random.randn(10000) mean? - (2)

A

This produces an array containing 10,000 random numbers drawn from the standard normal distribution (mean = 0, SD = 1), unlike rand, which produces numbers from a flat (uniform) distribution.

Majority of numbers would fall around mean (0)

176
Q

What does data2 = np.random.randn(10000) + 1.8 do?

A

It produces an array containing 10,000 random numbers drawn from a normal distribution that has been shifted, so the majority of numbers fall around 1.8 rather than 0.

177
Q

What do plt.figure() and plt.hist(data1, bins = 50) show?

A

plt.figure() creates a new figure for the histogram.

plt.hist(data1, bins = 50) draws the histogram:

x axis = the range of values in data1, divided into 50 bins

y axis = the frequency (count) of data points falling within each bin

bins specifies the number of bins, i.e. how lumpy we want the histogram to be

178
Q

How to produce transparent histograms? - (2)

A

Specifying the alpha parameter in plt.hist()

Alpha parameter controls the transparency of the histograms

179
Q

What does plt.hist with alpha value 0.3 mean?

A

Alpha value of 0.3 means bars in histogram are somewhat transparent

180
Q

The higher the alpha parameter is in plt.hist() the

A

less transparent the histogram will be

181
Q

What does

binEdges = np.linspace(-10, 10, 100)

do, and what does it mean when specified in a histogram?

A

binEdges = np.linspace(-10, 10, 100) produces an array of 100 evenly spaced bin edges ranging from -10 to 10 inclusive. Each adjacent pair of edges defines the boundaries of one histogram bin, so there are 99 bins.

In plt.hist(data1, bins = binEdges), the bins parameter is used to specify the bin edges to use for the histogram.
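
A sketch of two overlapping, semi-transparent histograms that share the same bin edges. The + 1.8 shift for data2 is an assumption based on the earlier card saying its values fall around 1.8, and the fixed seed is only there to make the run repeatable.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import matplotlib.pyplot as plt

np.random.seed(0)                        # fixed seed so the sketch is repeatable
data1 = np.random.randn(10000)           # centred on 0
data2 = np.random.randn(10000) + 1.8     # assumed shift, centred near 1.8

binEdges = np.linspace(-10, 10, 100)     # 100 edges, so 99 bins

plt.figure()
counts1, edges1, _ = plt.hist(data1, bins=binEdges, alpha=0.3)
counts2, edges2, _ = plt.hist(data2, bins=binEdges, alpha=0.3)
print(len(binEdges), len(counts1))
```

Using the same binEdges for both calls keeps the bars aligned, which makes the two distributions easier to compare.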

182
Q

Box and whisker plots are often used to illustrate data which may have a

A

skewed distribution.

183
Q

A box and whisker plot makes it easy to see the interquartile range

A

(25-75% range) as well as the median (50% value). Outlier points (as defined by a multiple of the interquartile range) are plotted as individual points outside the whiskers of the plot.

184
Q

Explain this code which produces a boxplot - (6)

A
data = np.random.rand(1000,3) produces a 1000x3 NumPy array filled with random numbers between 0 and 1 from a uniform distribution.

data[:,1] = data[:,1]*2 + 1 multiplies the second column of the data array (all rows) by 2 and adds 1 to each value, which scales and shifts the values in the second column.

data[:,2] = data[:,2]*1.5 +- 1 multiplies the values in the third column by 1.5 and subtracts 1 from each value (it adds -1), which scales and shifts the values in the third column.

plt.figure() creates a new figure for the plot.

plt.boxplot(data) draws a boxplot from the data array, treating each column as a separate dataset.

plt.show() displays the figure containing the boxplot.

185
Q

What is output of the boxplot?

A
186
Q

What does plt.xticks([1,2,3], ['Set1', 'Set2', 'Set3']) do?

A

It labels the 3 boxplots: the first boxplot is labelled Set1, the second Set2 and the third Set3.

187
Q

xticks does not use a

A

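zero-referenced system here: the boxes drawn by plt.boxplot sit at x positions 1, 2, 3, so the first label goes at 1, not 0.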

188
Q

what does it show here:

plt.xticks([1, 2, 3],[‘Mouse’,’Elephant’,’Badger’])

A

The first argument is a list of numbers indicating the different categories.

The second argument is a list of strings saying what to call them.
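
A sketch of a three-column boxplot with named tick labels, following the scaling and shifting described in the earlier boxplot card. The fixed seed is an addition so the run is repeatable.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import matplotlib.pyplot as plt

np.random.seed(0)                      # fixed seed so the sketch is repeatable
data = np.random.rand(1000, 3)         # uniform numbers between 0 and 1
data[:, 1] = data[:, 1] * 2 + 1        # second column: scale by 2, shift by +1
data[:, 2] = data[:, 2] * 1.5 - 1      # third column: scale by 1.5, shift by -1

plt.figure()
plt.boxplot(data)                      # one box per column, at x = 1, 2, 3
locs, labels = plt.xticks([1, 2, 3], ["Set1", "Set2", "Set3"])
print(np.median(data, axis=0))         # the medians drawn as the bar in each box
```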

189
Q

The ‘style’ of your plots is the default way everything looks. Stuff like

A

the color of the background, the line thickness, the font.

190
Q

Matplotlib has a default plotting style. It also has the ability to change this style: either by means of

A

individual tweaks to plotting layouts, colours etc, or by changing all of its settings in one go.

191
Q

You can set the plotting style using the

A

plt.style.use() function
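
A small sketch of switching styles. 'ggplot' and 'fivethirtyeight' are two of the style names that ship with matplotlib; the plotted data is made up.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import matplotlib.pyplot as plt

print("ggplot" in plt.style.available)   # plt.style.available lists the style names
plt.style.use("ggplot")                  # affects plots made from now on
plt.figure()
plt.plot([2, 3, 4, 5])
plt.style.use("default")                 # switch back to the default style
```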

192
Q

We can change the plotting style using

plt.style.use() function

using

plt.style.use(‘ggplot’)

which shows:

A
193
Q

We can change the plotting style using

plt.style.use() function

using

plt.style.use(‘fivethirtyeight’)

which shows:

A
194
Q

Question 1
Consider the following code snippet:

What is wrong with this code? How should it be corrected?

A

Technically this might execute but the file_path is not an absolute path as expected. Almost certainly it is missing the initial ‘/’

195
Q

Identify the problem in this code and suggest a fix. - (2)

A

The plt.show() command finalises the figure, so the last two lines do not do anything.

Place them before the plt.show() command.

196
Q

This code makes an assumption - what is it? - (2)

A

It assumes there is at least one line of data in the file as well as the header.

If this is not true, it will throw an exception.

197
Q

This code has a similar bug to that in Q2. What is it? - (2)

A

Again - the plt.show() command stops the next line from working.

Change their order.

198
Q

How might this code run into problems depending on the platform it runs on?

A

The ‘directory’ variable hard-codes the ‘/’ separators. This might fail on Windows, where the separator should be ‘\’.

199
Q

This code is designed to plot two random time series. What mistake does it make? - (3)

A

Data is defined as a 2 (down) by 10 (across) array.

So it will plot 10 random time series with two points each.

Change to rand(10,2) to make it work.
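
The difference between the two shapes can be seen by counting the lines that plt.plot returns. This is a sketch with fresh random data, not the original question's code.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import matplotlib.pyplot as plt

plt.figure()
wrong = plt.plot(np.random.rand(2, 10))   # 10 lines of 2 points each

plt.figure()
right = plt.plot(np.random.rand(10, 2))   # 2 lines of 10 points each

print(len(wrong), len(right))
```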

200
Q

Again, this code might work or it might not depending on the operating system, even if you are sure that the directory ‘plots’ exists. What line makes it so ‘fragile’ and how could you fix it? - (3)

A

Line of error - plt.savefig(‘/plots/parabola.png’)

Windows uses \ instead of /, so a hard-coded Windows version would be plt.savefig(r'\plots\parabola.png'), which then breaks on other systems.

Use os.path.join to glue together all the bits in a platform-independent manner.
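
A sketch of the os.path.join approach. The base folder is a temporary directory chosen for the demonstration, and the parabola data is made up.

```python
import os
import tempfile
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import matplotlib.pyplot as plt

base_dir = tempfile.mkdtemp()                 # hypothetical project folder
plots_dir = os.path.join(base_dir, "plots")   # separator chosen by the operating system
os.makedirs(plots_dir, exist_ok=True)         # make sure the folder exists

x = np.linspace(-3, 3, 50)
plt.figure()
plt.plot(x, x ** 2)

out_path = os.path.join(plots_dir, "parabola.png")
plt.savefig(out_path)
print(out_path)
```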

201
Q

Which one of these boxplots will have the highest median value (as indicated by the bar across the middle)? - (2)

A

The second one (‘Group 2’). The offset is defined by the +2. The spread is defined by the *.5. So this one will have a median at +2 which is bigger than any of the others.

This is because the offset (+2) adds a constant value to every data point, so it shifts the median directly and has a bigger effect on it than the spread (*.5).

202
Q

Q10
A directory contains the following files - (2)

A

a: apple.txt, allFruit.csv, allFruit.xls, allFruit.tsv, apple.jpg

b: apple.jpg, banana.jpg