
Data Science Flashcards

(59 cards)

1
Q

quantitative data

A

used to measure the amount of something (eg: mass)

1
Q

categorical data

A

used to classify instead of measure (eg: species of an animal)

2
Q

Definitions area

A

where you write the code in Pyret

3
Q

Interactions area

A

where the output is

4
Q

Pyret decimals

A

must have a digit before the decimal point, so a value between 0 and 1 starts with 0 (eg: 0.5, not .5)

5
Q

bar chart

A

count
Visual representation of value’s frequency
Column for every category

6
Q

Pie chart

A

Percentage
Visual representation of RELATIVE frequency
Slice for every category
Max 7 slices, generally 5
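
The difference between the counts behind a bar chart and the relative frequencies behind a pie chart can be sketched in a few lines. This is a minimal Python illustration rather than Pyret, and the `species` list is made-up sample data, not part of the deck.

```python
from collections import Counter

# Hypothetical sample column (made up for illustration).
species = ["cat", "dog", "cat", "rabbit", "dog", "cat"]

# Bar chart heights: one count per category.
counts = Counter(species)

# Pie chart slices: each count divided by the total number of rows.
total = sum(counts.values())
relative = {name: count / total for name, count in counts.items()}

for name in counts:
    print(name, counts[name], f"{relative[name]:.0%}")
```

The counts add up to the number of rows, while the relative frequencies always add up to 1 (100%).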

7
Q

stacked bar chart

A

shows more detail about another column (eg: count of species and sex)

8
Q

data cycle

A

ask questions, consider data, analyze data, interpret data (mnemonic: QCAI)

9
Q

lookup questions

A

answered by looking up a single value in a table

10
Q

arithmetic questions

A

answered by computing a value within a single column, such as the average, max, or min of that column

11
Q

statistical questions

A

asks a question about the relationship between two columns?

12
Q

null hypothesis

A

a statistical hypothesis proposing that there is no real effect or relationship in a set of observations, so any apparent pattern is due to chance

13
Q

random samples

A

a subset of a population in which each member has an equal chance of being chosen. The larger the random sample, the more accurately it tends to represent the population

14
Q

grouped samples

A

a subset of the population in which each member of the subset was chosen for a specific reason

15
Q

file extension purpose

A

tells your computer which application created or can open the file and which icon to use for the file

16
Q

what does CSV stand for?

A

comma-separated values

17
Q

histograms

A

shows the number of rows that fall within certain intervals (or “bins”) along the horizontal axis
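
To make the idea of bins concrete, here is a small Python sketch (not Pyret) that counts how many values fall in each interval. The `weights` list and the bin width of 10 are assumptions chosen only for illustration.

```python
# Hypothetical values (made up), grouped into bins of width 10.
weights = [3, 8, 12, 15, 19, 24, 31, 38]
bin_width = 10

bin_counts = {}
for w in weights:
    start = (w // bin_width) * bin_width  # left edge of the bin containing w
    bin_counts[start] = bin_counts.get(start, 0) + 1

# Print a rough text histogram, one row per bin.
for start in sorted(bin_counts):
    print(f"{start}-{start + bin_width}: {'#' * bin_counts[start]}")
```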

18
Q

⬛️⬛️
⬛️⬛️⬛️⬛️

A

?

19
Q

type of histogram
📉
(imagine this as a histogram)
⬛️
⬛️⬛️
⬛️⬛️⬛️
⬛️⬛️⬛️⬛️⬛️⬛️⬛️

A

skew right

20
Q

type of histogram
📈
(imagine this as a histogram)
⬛️
⬛️⬛️
⬛️⬛️⬛️
⬛️⬛️⬛️⬛️⬛️⬛️

A

Skew left

21
Q

what does it mean to be an outlier

A

A data point that lies far from the rest of the data; to judge it, compare it to the other data. It is important to think about all extreme data points, not just outliers

22
Q

mean

A

The average
Best for symmetric, medium-to-large datasets

23
Q

median

A

Half the values are smaller and half are larger. The middle number, or the average of the two middle numbers
If the data is asymmetric, use the median

24
mode
the number or numbers that occur most often in a dataset. In a small dataset, the mode will likely be the most accurate measure of center
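
The three measures of center can be compared side by side. This is a minimal Python sketch using the standard `statistics` module (not Pyret); the `values` list is made up, with one extreme value to show how the mean and median can differ on asymmetric data.

```python
import statistics

# Hypothetical data (made up); 40 is an extreme value.
values = [2, 3, 3, 4, 5, 6, 40]

mean_value = statistics.mean(values)      # sum of values divided by how many there are
median_value = statistics.median(values)  # middle value once sorted
modes = statistics.multimode(values)      # every value tied for most frequent

print(mean_value, median_value, modes)
```

Here the extreme value pulls the mean well above the median, which is why the median is the usual choice for asymmetric data.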
25
how many quartiles are in a box plot?
3
26
histogram and box plot shape
the longer whisker points in the same direction as the skew
27
standard deviation
the most useful way to summarize spread of quantitative columns
28
how to calculate standard deviation
roughly the average distance of the values from the mean
29
standard deviation equation
sqrt([sum of squared distances from the mean] / [n - 1]), where n is the number of values
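
The calculation can be followed step by step: find the mean, square each value's distance from it, add the squares, divide by one less than the number of values, and take the square root. The Python sketch below (not Pyret) uses a made-up `values` list and checks the result against the standard library's `statistics.stdev`.

```python
import math
import statistics

# Hypothetical data (made up for illustration).
values = [2, 4, 4, 4, 5, 5, 7, 9]

mean = sum(values) / len(values)

# Squared distance of each value from the mean.
squared_distances = [(v - mean) ** 2 for v in values]

# Sample standard deviation: divide by n - 1, then take the square root.
sample_sd = math.sqrt(sum(squared_distances) / (len(values) - 1))

print(sample_sd, statistics.stdev(values))
```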
30
explanatory variable
the independent variable, plotted on the x-axis of a scatter plot
31
response variable
the dependent variable, plotted on the y-axis of a scatter plot
32
r
correlation statistic between -1 and +1
-1 = strongest negative correlation
+1 = strongest positive correlation
0 = no correlation
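
One common way to compute r is the Pearson correlation coefficient. The Python sketch below (not Pyret) is a minimal illustration; the function name `correlation` and the sample lists are assumptions made up for this example.

```python
import math

def correlation(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How the two columns vary together.
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each column varies on its own.
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariation / (spread_x * spread_y)

# Hypothetical data: y rises with x in the first pair and falls in the second.
rising = correlation([1, 2, 3, 4], [2, 4, 6, 8])
falling = correlation([1, 2, 3, 4], [8, 6, 4, 2])
print(rising, falling)
```

Points that lie exactly on a rising line give an r of about +1, and points exactly on a falling line give about -1.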
33
what is the regression line also known as?
Line of best fit, least squares line, predictor, trendline
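
The name "least squares line" comes from how the line is chosen: it minimizes the sum of squared vertical distances between the points and the line. A minimal Python sketch of that calculation (not Pyret) follows; `least_squares_line` and the sample data are hypothetical names and values for illustration.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    # The line always passes through the point (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data lying exactly on y = 2x + 1.
slope, intercept = least_squares_line([1, 2, 3, 4], [3, 5, 7, 9])

# Use the line as a predictor for a new x value.
predicted = slope * 5 + intercept
print(slope, intercept, predicted)
```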
34
definition of a row
cat-row = row-n(animals-table, #)
35
look up identify
cat-row["species"] (have cat-row predefined)
36
how to make a function
fun name(parameters): body end, where name is the function's name (eg: gt)
37
what is an example for functions?
shows what the function does
38
example example
fun f(x): x / 2 end
examples:
  f(2) is 2 / 2
  f(10) is 10 / 2
end
39
what functions need a helper function
image-scatter-plot, build-column
40
what function to make a specific table
sort, filter, or build-column (eg: filter(build-column(animals-table, "kilos", kilogram), is-heavy))
41
syntax errors
typos that are usually easy to spot. The code will not run
42
runtime error
the app runs for a bit and crashes at specific point in the code
43
logic error
the app runs completely but simply produces the wrong output
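
The three kinds of error can be seen side by side in a short sketch. It is written in Python rather than Pyret, and the function names `average` and `buggy_average` are made up for illustration.

```python
# Syntax error: the code cannot even be parsed, so none of it runs.
# compile() is used here only so the failure can be caught and shown.
try:
    compile("print('hello'", "<example>", "exec")  # missing closing parenthesis
    syntax_ok = True
except SyntaxError:
    syntax_ok = False

# Runtime error: the code starts running, then crashes at a specific point.
def average(values):
    return sum(values) / len(values)

try:
    average([])  # dividing by zero crashes here
    crashed = False
except ZeroDivisionError:
    crashed = True

# Logic error: the code runs to completion but the answer is wrong.
def buggy_average(values):
    return sum(values) / 2  # bug: should divide by len(values)

wrong = buggy_average([1, 2, 3])
right = average([1, 2, 3])
print(syntax_ok, crashed, wrong, right)
```

Only the logic error gives no warning at all, which is why writing examples for a function helps catch it.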
44
four categories of dirty data
missing data, inconsistent types, inconsistent units/invalid range, inconsistent naming
45
missing data
some cells have data. Some do not
46
inconsistent types
a column where the values have different data types. (eg: 2, two)
47
inconsistent unit/ invalid range
where the data types are the same but represent different units
48
inconsistent naming
inconsistent spelling and capitalization in entries
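
Three of these categories can be detected with simple checks. The Python sketch below (not Pyret) uses a made-up `rows` table; the column names and values are assumptions for illustration. Inconsistent units are not covered, because spotting them usually needs knowledge of what range is plausible.

```python
# Hypothetical table (made up), one dictionary per row.
rows = [
    {"name": "Sasha", "species": "cat", "pounds": 6.5},
    {"name": "Boo", "species": "Cat", "pounds": "eleven"},
    {"name": "Felix", "species": "cat ", "pounds": None},
]

# Missing data: cells with no value.
missing = [r["name"] for r in rows if r["pounds"] is None]

# Inconsistent types: values present but not numbers.
wrong_type = [
    r["name"]
    for r in rows
    if r["pounds"] is not None and not isinstance(r["pounds"], (int, float))
]

# Inconsistent naming: the same category spelled several ways.
raw_names = {r["species"] for r in rows}
cleaned_names = {r["species"].strip().lower() for r in rows}

print(missing, wrong_type, raw_names, cleaned_names)
```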
49
selection bias
when the participants selected are not representative of the group being studied
50
bias in the study design
when the study is not designed carefully and ends up not measuring exactly what was asked
51
poor choice of summary
using the wrong data analysis technique (eg: the mean where the median is more appropriate)
52
confounding variables
correlation does not imply causation. An outside influence other than the one being studied
53
intentionally using the wrong chart
misleads the audience and can hide holes in the data, making it inaccurate
54
changing the scale of a chart
makes the data look a certain way