Unit 9 Flashcards

1
Q

what is a fact when looking at visualizations?

A

what the data shows

2
Q

what is an opinion when looking at visualizations?

A

why the fact might be the case

3
Q

what should we be careful about when making assumptions?

A

correlation does not equal causation

4
Q

what is metadata?

A

data about data

5
Q

what happens to the primary data when metadata is changed?

A

nothing; metadata can be changed without impacting the primary data
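
A minimal Python sketch of this idea, assuming a made-up record that stores its primary data and its metadata side by side (the field names and values are hypothetical, chosen only for illustration):

```python
import copy

# Hypothetical record: temperature readings plus data about those readings.
record = {
    "data": [72, 68, 75],
    "metadata": {"units": "Fahrenheit", "collected_by": "unknown"},
}

# Keep a copy of the primary data so we can compare afterwards.
data_before = copy.deepcopy(record["data"])

# Change and extend the metadata only.
record["metadata"]["collected_by"] = "weather club"
record["metadata"]["location"] = "school roof"

# The primary data is the same as before the metadata edits.
print(record["data"] == data_before)
```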

6
Q

what can the metadata be used for?

A

finding, organizing, and managing information

7
Q

what does metadata increase?

A

the effective use of data, by providing extra information about it

8
Q

what does metadata allow the data to be?

A

structured and organized

9
Q

how to create a bar chart

A

count how many times each value in the column appears and make a bar at that height
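
The counting step can be sketched in Python roughly as follows. The column `favorite_pets` is made-up data, and the bars are drawn as text rather than with a charting tool:

```python
from collections import Counter

# Hypothetical column of survey answers (made-up data for illustration).
favorite_pets = ["dog", "cat", "dog", "fish", "cat", "dog"]

# Count how many times each value appears in the column.
counts = Counter(favorite_pets)

# Draw one text "bar" per value; the bar length is that value's count.
for value, count in counts.most_common():
    print(f"{value:<5} {'#' * count} ({count})")
```

The same counts also answer which values are most common, which are least common, and what the unique values are.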

10
Q

information we can get out of bar charts

A

what values are the most common in this column
what values are the least common in this column
what is the unique list of values in this column

11
Q

what happens when all the values in a bar chart are unique?

A

the chart is not useful: every value appears once, so every bar has the same height

12
Q

how to create a histogram

A

similar to a bar chart, but the numbers are grouped into buckets (ranges) and each bar shows how many values fall in that bucket
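
As a rough Python sketch of the bucketing step, assuming a made-up `ages` column and an arbitrarily chosen bucket width of 10:

```python
from collections import Counter

# Hypothetical numeric column (made-up data for illustration).
ages = [3, 7, 12, 15, 18, 21, 24, 29, 33, 41]
bucket_width = 10  # assumed bucket size; other widths work the same way

# Map each number to the start of its bucket, e.g. 24 -> 20, then count.
buckets = Counter((age // bucket_width) * bucket_width for age in ages)

# Draw one text "bar" per bucket, in numeric order.
for start in sorted(buckets):
    end = start + bucket_width - 1
    print(f"{start:>2}-{end:<2} {'#' * buckets[start]}")
```

Changing `bucket_width` changes how coarse the picture is, which is why the same column can look different in different histograms.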

13
Q

when can histograms be created?

A

only with numeric data

14
Q

when is a histogram useful?

A

when a normal bar chart may be difficult to read

15
Q

information we can get out of histograms?

A

which ranges of values are the most common in this column
which ranges of values are the least common in this column
which ranges of values do or do not appear in this column

16
Q

when does data need to be cleaned?

A

data is incomplete
data is invalid
multiple tables are combined into one

17
Q

what leads to messy data

A

users enter different types of data - “two” or 2
users use different abbreviations for the same information - “February” or “Feb”
data may have different spellings - “colour” or “color”
data has inconsistent capitalization - “spring” or “Spring”
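
One way to clean up values like these, sketched in Python. The lookup tables and the function names `clean_text` and `clean_number` are hypothetical, and a real dataset would need its own rules:

```python
# Hypothetical lookup tables covering only the examples above.
NUMBER_WORDS = {"one": 1, "two": 2, "three": 3}
MONTH_ABBREVIATIONS = {"feb": "february"}
SPELLING_VARIANTS = {"colour": "color"}


def clean_text(value):
    """Trim, lowercase, and map known variants to one standard form."""
    text = str(value).strip().lower()
    text = MONTH_ABBREVIATIONS.get(text, text)
    text = SPELLING_VARIANTS.get(text, text)
    return text


def clean_number(value):
    """Turn 2, "2", or "two" into the integer 2."""
    text = str(value).strip().lower()
    if text in NUMBER_WORDS:
        return NUMBER_WORDS[text]
    return int(text)


print(clean_text("Feb"), clean_text("colour"), clean_text("Spring"))
print(clean_number("two"), clean_number(2))
```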

18
Q

what does filtering data allow for?

A

allows the user to look at a subset of the data
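
A small Python sketch of filtering, assuming a made-up table stored as a list of rows (the names and grades are invented for illustration):

```python
# Hypothetical table: one dictionary per row.
rows = [
    {"name": "Ana", "grade": 9},
    {"name": "Ben", "grade": 10},
    {"name": "Cy", "grade": 9},
]

# Keep only the rows that match the condition; the full table is unchanged.
ninth_graders = [row for row in rows if row["grade"] == 9]

for row in ninth_graders:
    print(row["name"])
```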

19
Q

when are bar charts and histograms useful?

A

when looking at one column of data

20
Q

ways to visualize data that look at two columns of data at the same time

A

crosstab chart
scatterplot

21
Q

crosstab chart

A

counts how many times combinations of values appear
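
The counting behind a crosstab can be sketched in Python like this, assuming made-up rows with two columns (grade and favorite pet):

```python
from collections import Counter

# Hypothetical rows: (grade, favorite pet). Made-up data for illustration.
rows = [
    ("9th", "dog"),
    ("9th", "cat"),
    ("10th", "dog"),
    ("9th", "dog"),
    ("10th", "fish"),
]

# Count how many times each (grade, pet) combination appears.
combo_counts = Counter(rows)

grades = sorted({grade for grade, _ in rows})
pets = sorted({pet for _, pet in rows})

# Print a small table: one row per grade, one column per pet.
print("grade " + " ".join(f"{pet:>5}" for pet in pets))
for grade in grades:
    cells = " ".join(f"{combo_counts[(grade, pet)]:>5}" for pet in pets)
    print(f"{grade:<5} {cells}")
```

Combinations that never appear, such as ("10th", "cat") here, simply count as 0.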

22
Q

scatterplot

A

useful for seeing patterns and trends between two columns of numeric data with lots of different values
not useful for lots of repeated values

23
Q

what can we take away from manipulating and visualizing data?

A

develop insights and knowledge about our world by finding patterns

24
Q

what can we see when investigating two columns of data?

A

we can observe patterns in how different values move together (how they are correlated), but we cannot know the cause of the correlation
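
One common way to put a number on "moving together" is the Pearson correlation coefficient, which these cards do not name; it is used here only as an illustration. The sketch below assumes two made-up columns, `hours_studied` and `test_scores`:

```python
from math import sqrt


def correlation(xs, ys):
    """Pearson correlation: near 1 or -1 means the columns move together."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Hypothetical columns (made-up data for illustration).
hours_studied = [1, 2, 3, 4, 5]
test_scores = [52, 60, 61, 70, 75]

r = correlation(hours_studied, test_scores)
print(round(r, 2))
```

A value close to 1 says the two columns rise together. It does not say that one causes the other.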

25
Q

what is open data?

A

sharing data with others so that they can analyze it
publicly available data shared by governments, organizations, and others

26
Q

how is making data open useful?

A

helps spread useful knowledge or creates opportunities for others to use it to solve problems

27
Q

what are citizen science and crowdsourcing?

A

collecting data from others so you can analyze it
examples of how human capabilities can be enhanced by collaboration via computing

28
Q

crowdsourcing

A

the practice of obtaining input or information from a large number of people via the internet

29
Q

what does crowdsourcing offer?

A

new models for collaboration, such as connecting businesses or social causes with funding

30
Q

citizen science

A

research where some of the data collection is done by members of the public using their own computing devices, which leads to solving scientific problems

31
Q

what is big data?

A

collecting huge amounts of data so we can learn even more from it

32
Q

what does the size of datasets analyzed impact? as a result what happens?

A

how much information can be extracted
as a result, people are working with increasingly big data sets in many contexts, like business and science

33
Q

cloud computing

A

parallel systems used when data gets too big to be processed on one computer; they help process all that information

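
The idea of splitting work across parallel workers can be sketched in Python as below. This is only an illustration on one machine: worker threads stand in for the separate computers a cloud system would use, and the helper names `split_into_chunks` and `parallel_sum` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(values, chunk_size):
    """Cut one big list into smaller lists of at most chunk_size items."""
    return [values[i:i + chunk_size] for i in range(0, len(values), chunk_size)]


def parallel_sum(values, chunk_size=250, workers=4):
    """Sum each chunk on a separate worker, then combine the partial sums."""
    chunks = split_into_chunks(values, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_sums = list(pool.map(sum, chunks))
    return sum(partial_sums)


# Made-up data standing in for a dataset too large for one worker.
total = parallel_sum(list(range(1000)))
print(total)
```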
34
Q

what is important to consider when working with big data?

A

scalability: you want your systems to keep working even as you use more and more data

35
Q

what are bar charts and histograms good for knowing?

A

what is in your data set

36
Q

what are crosstab charts and scatterplots good for knowing?

A

finding relationships and patterns across different columns