Lecture 3 Flashcards

(11 cards)

1
Q

Observations/Cases

A

The objects we collect data on.

2
Q

Unit of observation

A

The level at which the observations in our dataset are recorded (for example, individuals, households, or countries).

3
Q

Variables

A
  • Variables are “the pieces of information we collect on our units of observation” (Haan
    & Godley 2017, p. 12).
  • Any property that varies (takes on two or more values) can potentially be a variable.
  • Variables should be both exhaustive and mutually exclusive:
    Exhaustive: there should be enough categories composing the variable to classify every
    observation; every observation or case has to have a place to go.
    Mutually exclusive: there is only one category suitable for each observation; every
    observation must fit into one and only one category.
  • Variables are the columns in a dataset.
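
The exhaustive and mutually exclusive requirements can be checked mechanically. The following is a minimal sketch, assuming a variable coded into numeric brackets; the bracket labels, bounds, and function names are hypothetical and chosen only for illustration.

```python
# Hypothetical age brackets: (label, low, high), with both bounds inclusive.
CATEGORIES = [
    ("under 18", 0, 17),
    ("18 to 64", 18, 64),
    ("65 and over", 65, 150),
]


def matching_categories(value, categories=CATEGORIES):
    """Return the labels of every category the value falls into."""
    return [label for label, low, high in categories if low <= value <= high]


def check_coding(values, categories=CATEGORIES):
    """Return (values with no category, values with more than one category)."""
    unclassified = [v for v in values if len(matching_categories(v, categories)) == 0]
    ambiguous = [v for v in values if len(matching_categories(v, categories)) > 1]
    return unclassified, ambiguous


ages = [5, 17, 18, 40, 64, 65, 90]

# Both lists are empty when the scheme is exhaustive and mutually exclusive
# for the observed values.
unclassified, ambiguous = check_coding(ages)

# A flawed scheme: 18 fits two brackets, and nothing covers ages above 64.
FLAWED = [("0 to 18", 0, 18), ("18 to 64", 18, 64)]
flawed_unclassified, flawed_ambiguous = check_coding(ages, FLAWED)
```

A non-empty first list signals that the scheme is not exhaustive; a non-empty second list signals that it is not mutually exclusive.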
4
Q

2 Purposes/Types of Variables

A
  1. Identification variables: variables that uniquely identify each observational unit
  2. Characteristic or measurement variables: variables that describe properties of our observations

Best practice: identification variables should be the first column(s) in our dataset
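
As a rough illustration of both points, the sketch below checks that a candidate identification variable is unique and moves it to the first column. The records and the column name `respondent_id` are invented for this example.

```python
# Hypothetical survey records; "respondent_id" is an assumed identifier name.
rows = [
    {"age": 34, "respondent_id": 101, "province": "ON"},
    {"age": 51, "respondent_id": 102, "province": "QC"},
    {"age": 27, "respondent_id": 103, "province": "BC"},
]


def is_unique_identifier(rows, key):
    """True when every row has a distinct value for `key`."""
    values = [row[key] for row in rows]
    return len(values) == len(set(values))


def id_first(rows, key):
    """Return copies of the rows with the identifier as the first column."""
    return [
        {key: row[key], **{k: v for k, v in row.items() if k != key}}
        for row in rows
    ]


ids_are_unique = is_unique_identifier(rows, "respondent_id")
ordered = id_first(rows, "respondent_id")
columns = list(ordered[0].keys())
```

Python dictionaries preserve insertion order, so placing the identifier first in each new dictionary is enough to make it the leading column.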

5
Q

4 Levels of measurement

A
  1. Nominal
  2. Ordinal
  3. Interval
  4. Ratio
6
Q

Nominal variables

A
  • Categorical
  • There is no quantifiable difference between categories
  • It is not possible to rank the categories
  • Numeric values used to represent categories are not meaningful
    (do not imply anything about the magnitude or differences
    between categories)
  • Numbers or symbols are assigned to the values of the variable for
    the purpose of classifying, naming, or labelling
  • Often referred to as “qualitative”
  • Examples: sex, religion, political party, ethnicity
7
Q

Ordinal variables

A
  • You can rank the categories from low to high, but you cannot calculate the
    difference between them
  • Many attitudes we measure are ordinal-level variables (level of agreement or
    satisfaction)
  • Examples: social class, education; age and income are also often measured on
    ordinal scales
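
A small sketch of what ranking without measuring distance looks like in practice. The five-point agreement scale and the responses are assumed for illustration; only the order of the labels carries information.

```python
# Assumed response scale, listed from low to high.
AGREEMENT = ["strongly disagree", "disagree", "neutral", "agree", "strongly agree"]

# Position on the scale; the numbers encode order only, not distance.
RANK = {label: position for position, label in enumerate(AGREEMENT)}

responses = ["agree", "neutral", "strongly agree", "disagree", "agree"]

# Ranking is meaningful for ordinal data: sort by position on the scale.
ranked = sorted(responses, key=RANK.get)

# The middle category of the ranked responses is a sensible summary,
# because it relies on order alone.
median_response = ranked[len(ranked) // 2]
```

Averaging the rank numbers would treat the gap between "neutral" and "agree" as equal to the gap between "agree" and "strongly agree", which an ordinal scale does not guarantee.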
8
Q

Interval/ratio

A
  • Values can be compared not only in terms of which is larger or smaller
    but also in terms of how much larger or smaller one is compared
    with another
  • We sometimes distinguish between variables with a natural zero
    point, where zero means the absence of the property (ratio), and those
    where zero is arbitrary, meaning there is no true zero (interval)
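
Temperature is a common way to see the distinction: Celsius has an arbitrary zero (interval), while Kelvin has a true zero (ratio). The sketch below uses two made-up readings to show that differences are preserved across the two scales but ratios are not.

```python
def celsius_to_kelvin(celsius):
    """Convert degrees Celsius to kelvins."""
    return celsius + 273.15


cold_c, warm_c = 10.0, 20.0  # made-up readings in degrees Celsius

# Differences are meaningful on an interval scale and agree on both scales.
difference_c = warm_c - cold_c
difference_k = celsius_to_kelvin(warm_c) - celsius_to_kelvin(cold_c)

# Ratios are only meaningful with a true zero.
ratio_c = warm_c / cold_c  # 2.0, but 20 °C is not "twice as hot" as 10 °C
ratio_k = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cold_c)  # about 1.035
```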
9
Q

Dataset Codebook

A
  • A codebook describes the contents, structure, and layout of a data
    collection.
  • It should include a description of the study, variable names, and
    variable descriptions.
  • It may also include question wording (for survey data), information about
    weights, and summary statistics.
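
One way to picture a codebook is as a small structured record kept alongside the data. The sketch below is purely illustrative: the study, variable names, question wording, and value labels are all invented, and real codebooks are usually documents rather than code.

```python
# A hypothetical codebook for an invented survey.
codebook = {
    "study": "Example household survey (hypothetical)",
    "variables": {
        "respondent_id": {
            "description": "Unique respondent identifier",
            "level": "identifier",
        },
        "party": {
            "description": "Party the respondent voted for",
            "level": "nominal",
            "values": {1: "Party A", 2: "Party B", 3: "Other"},
        },
        "satisfaction": {
            "description": "Satisfaction with local services",
            "level": "ordinal",
            "question": "How satisfied are you with local services?",
            "values": {1: "Low", 2: "Medium", 3: "High"},
        },
        "age": {"description": "Age in years", "level": "ratio"},
    },
}


def decode(variable, code, codebook=codebook):
    """Translate a stored numeric code into its label."""
    return codebook["variables"][variable]["values"][code]


def undocumented(columns, codebook=codebook):
    """Return dataset columns that have no codebook entry."""
    return [column for column in columns if column not in codebook["variables"]]


label = decode("party", 2)
missing = undocumented(["respondent_id", "age", "income"])
```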
10
Q

Tidy Data

A

Tidy data refers to data that is stored in the following format:
* Each observation is a row
* Each variable is a column
* Each type of observational unit is a table

11
Q

Five common problems that make for untidy data

A
  • Column headers are values, not variable names.
  • Multiple variables are stored in one column.
  • Variables are stored in both rows and columns.
  • Multiple types of observational units are stored in the same table.
  • Data values about a single type of observational unit are spread out over
    multiple datasets.
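
The first problem (column headers that are values) is often fixed by reshaping from wide to long. Below is a minimal sketch in plain Python, assuming a small table whose year columns should become values of a `year` variable; the country labels, counts, and the helper name `melt` are invented for illustration.

```python
# Untidy: "2019" and "2020" are values of a year variable, not variable names.
wide = [
    {"country": "A", "2019": 10, "2020": 12},
    {"country": "B", "2019": 7, "2020": 9},
]


def melt(rows, id_column, variable_name, value_name):
    """Turn every non-identifier column into its own row."""
    tidy = []
    for row in rows:
        for column, value in row.items():
            if column == id_column:
                continue
            tidy.append(
                {id_column: row[id_column], variable_name: column, value_name: value}
            )
    return tidy


# Tidy: one row per country-year observation, one column per variable.
tidy = melt(wide, "country", "year", "cases")
```

After reshaping, each row is a single observation and each column is a single variable, which matches the tidy format.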