Session 8 - Advanced Programming Techniques Flashcards
We know how to read data from a single CSV file.
But often we want to read data from
many files
For example, we might run the same experiment on 100 participants
Each experiment generated a data file and now you want to process them all in one script.
How do we know which files to read from?
One good way is to put all the files into a single directory like /home/alex/subject_data and then find all the files in that directory that match a certain pattern (e.g. ending in .csv).
What does this ‘real directory’ show? - (3)
Here, the directory stores lots of different files from a single experiment.
The ones we care about are .csv files but there are also some other ones in there as well (like the .log files).
In the analysis we want to find and load in all the .csv files and ignore the other ones.
Two ways of listing the contents of a directory - (2)
- glob
- listdir command (part of os)
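A minimal sketch of both approaches, assuming a throwaway directory filled with made-up file names (sub01.csv and so on are hypothetical, not the real experiment files). os.listdir returns every name, so the .csv filter is applied by hand, while glob.glob applies the pattern itself and returns full paths.

```python
import glob
import os
import tempfile

# Build a temporary directory with made-up file names so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as data_dir:
    for name in ["sub01.csv", "sub02.csv", "sub01.log"]:
        open(os.path.join(data_dir, name), "w").close()

    # os.listdir returns every name in the directory (order is arbitrary, so sort).
    all_names = sorted(os.listdir(data_dir))

    # Filter by hand to keep only the .csv files.
    csv_from_listdir = [n for n in all_names if n.endswith(".csv")]

    # glob matches the pattern directly and returns full paths.
    csv_paths = sorted(glob.glob(os.path.join(data_dir, "*.csv")))
    csv_from_glob = [os.path.basename(p) for p in csv_paths]

    print(all_names)
    print(csv_from_listdir)
    print(csv_from_glob)
```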
One way to find out what is in a directory is with the
listdir command (part of os).
One way of listing contents of a directory is using
os.listdir
To download data from YNiC to practice on, we use the command ‘git’
Git is a
free protocol that allows you to manage files that are synchronized across the internet.
A very common use for ‘Git’ is
distributing software source code and data files
One website that runs ‘git’ is called ‘github.com’, which is - (2)
a favourite place for people to store their software projects and has become almost synonymous with ‘git’ itself.
Currently, GitHub says that they host over 100 million developers and over 420 million software projects
YNiC runs its own git server, which we use to download
some useful data files like this
Example of downloading some useful data files from YNiC git server
What does this code do? - (6)
- This code is used to download a smaller version of a repository called ‘pin-material’ from a specific URL.
- The ‘!cd /content’ command changes the directory to ‘/content’.
- ‘!git clone --branch small --depth 1 https://vcs.ynic.york.ac.uk/cn/pin-material.git’ is the main command.
- It clones the ‘small’ branch of the repository with a depth of 1, meaning it only gets the latest version of the files, not the entire history.
- The comment explains that this smaller version doesn’t include neuroimaging data, making it much smaller than the full repository.
- The ‘!ls -lat’ command lists the contents of the current directory in detail, showing the latest changes first.
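The clone command described above can also be sketched in plain Python rather than as a notebook ‘!’ command. This is only an illustration: build_clone_command is a hypothetical helper, not part of git or of the course code, and the actual clone is left commented out because it needs git and network access.

```python
import subprocess


def build_clone_command(url, branch="small", depth=1):
    """Return the argument list for a shallow clone of a single branch.

    Hypothetical helper for illustration only.
    """
    return ["git", "clone", "--branch", branch, "--depth", str(depth), url]


command = build_clone_command("https://vcs.ynic.york.ac.uk/cn/pin-material.git")
print(" ".join(command))

# Uncomment to actually run the clone inside /content (needs git and network access):
# subprocess.run(command, cwd="/content", check=True)
```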
Git is a protocol for getting source files and text files from a server in a
particular order
os.listdir() is a function from the …. module
os
We can use the command listdir to see what is in a
particular directory
We can check the current working directory with the os module by using
os.getcwd()
We can list the contents of the current working directory using a function that is part of os, called
os.listdir()
We can list the contents of a different directory by passing its path to os.listdir()
For instance, os.listdir('/content/pin-material')
lists the contents of the ‘/content/pin-material’ directory.
Explain this code (using YNiC’s git server to download useful files like pin-material-git) - (8)
- This code uses the os module.
- os.getcwd() prints the current working directory.
- os.listdir('.') lists the contents of the current directory and stores them in a variable called ‘contents’.
- '.' represents the current directory.
- type(contents) prints the type of the variable contents.
- print(contents) prints the contents of the current directory.
- os.listdir('/content/pin-material') lists the contents of the ‘/content/pin-material’ directory (a different directory) and stores them in the variable ‘newcontents’.
- The contents of the ‘newcontents’ variable are printed out.
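A sketch of what the code being explained might look like, reconstructed from the points above rather than copied from the original. The os.path.isdir check is an added assumption so that the sketch still runs on a machine where ‘/content/pin-material’ has not been cloned.

```python
import os

# Show the current working directory.
print(os.getcwd())

# '.' means the current directory; listdir returns a list of names.
contents = os.listdir('.')
print(type(contents))
print(contents)

# List a different directory by passing its path.
# The isdir guard is an addition: it skips this step if the repository is not there.
repo_dir = '/content/pin-material'
if os.path.isdir(repo_dir):
    newcontents = os.listdir(repo_dir)
    print(newcontents)
```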
Output of this code
Both os.listdir('.') and os.listdir() refer to the same thing:
listing the contents of the current directory.
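A quick check of this, as a small sketch: both calls are sorted before comparing because listdir does not promise any particular order.

```python
import os

# With no argument, listdir defaults to the current directory.
explicit = sorted(os.listdir('.'))
default = sorted(os.listdir())

print(explicit == default)
```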
Remember ‘..’ means
‘go up one directory’
‘.’ means
‘this directory’
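One way to see what ‘.’ and ‘..’ stand for is to turn them into full paths with os.path.abspath. This is a small sketch; the paths printed depend on where the script is run.

```python
import os

# '.' resolves to the current working directory.
here = os.path.abspath('.')

# '..' resolves to the directory one level up.
parent = os.path.abspath('..')

print(here)
print(parent)

# The parent of 'here' is the same place that '..' points to.
print(parent == os.path.dirname(here))
```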
This is what the pin-material directory looks like in file format: