Week 3: Reproducible research, sustainable software Flashcards

1
Q

Best Practices for Scientific Computing

A

Scientists spend an increasing amount of time building and using software. However, most scientists are never taught how to do this efficiently. As a result, many are unaware of tools and practices that would allow them to write more reliable and maintainable code with less effort. We describe a set of best practices for scientific software development that have solid foundations in research and experience, and that improve scientists’ productivity and the reliability of their software.

2
Q
Write programs for people, not computers
A

You might publish your code, someone else may benefit from using it, and you might set it aside and come back to it later, so it has to be readable by people.

3
Q
Automate repetitive tasks
A

If you have to do something many times, write a function and call it in a loop, so that you work through the thought process only once and the computer then does the task for you.
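
A minimal sketch of this idea in Python. The function name gc_content and the sample sequences are made up for illustration; they are not from the course material.

```python
def gc_content(sequence):
    """Return the fraction of G and C bases in a DNA sequence."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence is empty")
    gc_count = sequence.count("G") + sequence.count("C")
    return gc_count / len(sequence)


# Hypothetical example data: sample name -> DNA sequence.
samples = {
    "sample_a": "ATGCGC",
    "sample_b": "ATATAT",
    "sample_c": "GGCC",
}

# The calculation is written once and the loop applies it to every sample.
results = {}
for name, seq in samples.items():
    results[name] = gc_content(seq)

for name, value in results.items():
    print(f"{name}: {value:.2f}")
```

Adding a fourth sample only means adding one entry to samples; the calculation itself is not retyped.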

4
Q
Use the computer to record history
A

Log the changes that you make and keep track of all versions.
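
One possible way to let the computer keep such a record is to save a small provenance entry next to each result. This is only a sketch: the function name provenance_record and the parameter values (threshold, input_file) are hypothetical.

```python
import json
import platform
import sys
from datetime import datetime, timezone


def provenance_record(parameters):
    """Collect basic facts about how and when an analysis was run."""
    return {
        "timestamp_utc": datetime.now(timezone.utc).isoformat(),
        "python_version": platform.python_version(),
        "command": sys.argv,
        "parameters": parameters,
    }


# Hypothetical analysis settings, for illustration only.
record = provenance_record({"threshold": 0.05, "input_file": "counts.csv"})
print(json.dumps(record, indent=2))
```

Writing this record to a file alongside the output makes it possible to see later which settings produced which result.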

5
Q
Make incremental changes
A

Make small improvements, so that you can check things are working properly at every step.

6
Q
Use version control
A

Dropbox saves every version of every file for 30 days, so you can restore old versions; this is very handy when you break or delete things by mistake. When working with several people, it is often unclear which file is the most up to date. Version control systems keep track of this and can automatically merge changes together.

7
Q
Do not repeat yourself (or others).
A

Reuse code and don’t reinvent the wheel: build on existing code (your own or other people’s) rather than writing it again.
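
A small sketch of what this can look like in Python: a conversion is defined once and reused, instead of typing the same arithmetic in several places. The function name and the temperatures are illustrative assumptions.

```python
def celsius_to_kelvin(temp_c):
    """Convert a temperature from degrees Celsius to kelvin."""
    return temp_c + 273.15


# The formula lives in one place; every use goes through the function.
incubator_k = celsius_to_kelvin(37.0)
freezer_k = celsius_to_kelvin(-80.0)

print(f"incubator: {incubator_k:.2f} K")
print(f"freezer: {freezer_k:.2f} K")
```

If the conversion ever needs fixing, there is only one line to change.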

8
Q
Plan for mistakes
A

We know we are going to make mistakes, so how do we detect them? One way is to test code (check as you code that everything is working). The best practice is for every piece of code to have automated tests that run whenever you change it. For extreme or invalid inputs, the code should give errors.
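
The two ideas above (automated tests, and errors for extreme inputs) can be sketched as follows. The function mean and the test names are hypothetical examples, and the tests are written with plain assert statements rather than any particular testing framework.

```python
def mean(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    if len(values) == 0:
        # Extreme input: fail loudly instead of returning something misleading.
        raise ValueError("cannot take the mean of an empty list")
    return sum(values) / len(values)


def test_mean_typical():
    assert mean([1, 2, 3]) == 2


def test_mean_empty_raises():
    try:
        mean([])
    except ValueError:
        return  # the expected error was raised
    raise AssertionError("expected ValueError for empty input")


if __name__ == "__main__":
    # Re-run every test after each change to the code.
    test_mean_typical()
    test_mean_empty_raises()
    print("all tests passed")
```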

9
Q
Optimize software only after it works correctly
A

For biologists, speed is usually not so important; we just need the code to work, and to work correctly.

10
Q
Documenting the purpose of code
A

It is very tempting to write small comments as you write code, but you will quickly be able to read the code as it is written. What is harder to identify later is the general aim, so document that.
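
As an illustration, the docstring below states why the function exists rather than restating each line. The function normalize_counts and its use case are assumptions made up for this sketch.

```python
def normalize_counts(counts):
    """Scale raw counts so that they sum to 1.

    Purpose: samples are often measured at different depths, so raw
    counts cannot be compared directly. Converting them to proportions
    makes samples comparable.
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("counts sum to zero, nothing to normalize")
    return [count / total for count in counts]


print(normalize_counts([1, 1, 2]))
```

A comment such as "divide each count by the total" would add little, because the code already says that; the purpose is what a reader cannot get from the code alone.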

11
Q
Conduct code reviews
A

It is really helpful to show what you have done to somebody else; fresh eyes can see what you might have missed.

12
Q

Use a style guide

A

Google has very good style guides that are respected by many.

13
Q

Software Sustainability Institute

A

Its aim is to improve software made for science. It has done a lot of advocacy on how best to do things. Motto: “Better software, better research”.

14
Q

Eliminate redundancy

A

DRY; Don’t Repeat Yourself

15
Q

Track versions of everything

GitHub: “Facebook for code”

A

Random people use your stuff and find problems, so you can fix and improve it!
Great impact, easily updated, easy to collaborate on, lets you identify trends and build an online reputation.

16
Q

Programming language

Excel

A

Good: quick & dirty

Bad: easy to make mistakes, doesn’t scale

17
Q

Programming language

R

A

Good: numbers, stats, genomics

Bad: programming not intuitive

18
Q

Programming language

Unix command-line == shell == bash

A

Good: Can’t escape it, quick & dirty

Bad: programming, complicated things

19
Q

Programming language

Java

A

Good: 1990s user interfaces

Bad: overcomplicated

20
Q

Programming language

Perl

A

Good: 1980s user interfaces

Bad: Everything

21
Q

Programming language

Python

A

Good: scripting, text

22
Q

Programming language

Ruby

A

Good: scripting, text

23
Q

Programming language

JavaScript

A

Good: scripting & flexibility (web & client)

Bad: only a little bio-stuff (few biology tools)