DVT: Data, Variables and Tables Flashcards

(17 cards)

1
Q

What 2 ways can data variables be classified?

A

Numerical:

  • Quantitative
  • Individuals are measured or counted

Categorical:

  • Qualitative
  • Individuals classified into groups
2
Q

What are examples of numerical variables?

A
  • Weight
  • BP
  • Prothrombin time
  • Age
  • No. long distance flights in last month
  • No. cigarettes per day
3
Q

What are examples of categorical variables?

A
  • Smoker/non-smoker
  • On anticoagulation medicine?
  • History of cancer
  • Alive after 6 months?
  • Blood group type
  • Causes of death
  • Pain assessment
  • Stage of cancer
4
Q

How are numerical values measured?

A

On interval scales (the interval or distance between points on the scale has a precise numerical meaning)

5
Q

What is binary data?

A

A subtype of categorical data

Can only take 2 values (often yes/no)
Also known as dichotomous

6
Q

What is nominal data?

A

A subtype of categorical data

More than 2 categories, but no natural order (A, B, AB, O)

7
Q

What is ordinal data?

A

More than 2 categories, with a natural order e.g. Stage I, II, III, IV

8
Q

How can data be summarised?

A
  • Numerical - Measures of central tendency (mean, median), measures of spread (standard deviation, range)
  • Categorical - Frequencies, proportions, percentages. Use tables and charts to do this
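A minimal sketch of both kinds of summary, using only Python's standard library. The weights and blood groups are made-up illustrative values, not data from any study.

```python
import statistics
from collections import Counter

# Hypothetical numerical variable: weights in kg
weights = [62, 70, 75, 81, 92]

mean_weight = statistics.mean(weights)      # central tendency
median_weight = statistics.median(weights)  # central tendency
sd_weight = statistics.stdev(weights)       # spread (sample standard deviation)
range_weight = max(weights) - min(weights)  # spread

# Hypothetical categorical variable: blood groups
blood_groups = ["A", "O", "O", "B", "A", "O", "AB", "O"]

counts = Counter(blood_groups)  # frequencies per category
percentages = {g: 100 * c / len(blood_groups) for g, c in counts.items()}

print(mean_weight, median_weight, round(sd_weight, 2), range_weight)
print(dict(counts))
print(percentages)
```

The frequency and percentage dictionaries are the raw material for a frequency table or bar chart.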
9
Q

Why should data be summarised?

A
  • Data monitoring - Ensure that what’s being collected is valid, and spot errors so they can be corrected
  • Data checking/cleaning - Ensure collected data are correct and identify any outliers
  • Summary of results - Basic description, potential precursor to more complex analysis
10
Q

How can central tendency be measured?

A

Mean - Average of all values; a good measure of centre for a symmetrical distribution. Much more useful in practice, but overly influenced by extreme values
Median - Value below which 50% of data points lie; better for skewed distributions because it is only slightly affected by extreme values
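A small sketch of how one extreme value affects each measure. The prothrombin times are hypothetical values chosen for illustration.

```python
import statistics

# Hypothetical prothrombin times in seconds (illustrative values only)
times = [11, 12, 12, 13, 14]
with_outlier = times + [60]  # add one extreme value

mean_before = statistics.mean(times)
median_before = statistics.median(times)
mean_after = statistics.mean(with_outlier)
median_after = statistics.median(with_outlier)

# The mean shifts by several seconds; the median moves by half a second.
print(mean_before, mean_after)
print(median_before, median_after)
```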

11
Q

Describe a symmetrical bell-shaped distribution

A

Mean = Median

12
Q

Describe a negatively skewed distribution

A

Mean < median, long tail to left

13
Q

Describe a positively skewed distribution

A

Mean > Median, long tail to right
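The mean/median relationships for symmetrical, negatively skewed and positively skewed data can be sketched as below. The helper `describe_skew` and the three small datasets are hypothetical, and comparing mean with median is only a rough rule of thumb, not a formal test of skewness.

```python
import statistics

def describe_skew(values):
    """Label skew direction by comparing mean and median (rough heuristic)."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    if mean > median:
        return "positive"     # long tail to right
    if mean < median:
        return "negative"     # long tail to left
    return "symmetrical"      # mean = median

# Illustrative datasets only
symmetric = [1, 2, 3, 4, 5]
right_tail = [1, 2, 2, 3, 20]
left_tail = [1, 18, 19, 19, 20]

for data in (symmetric, right_tail, left_tail):
    print(data, describe_skew(data))
```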

14
Q

Can range be a measure of spread?

A

Yes, but it is dependent on outliers (i.e. extreme values)

The range doesn’t indicate whether these values are distinct from the main body of data, and the larger the sample, the wider the range tends to be

  • Useful if data not normal (symmetrical)
    Splits data so there are equal frequencies in each group
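A sketch of why the range depends on outliers, set against quartiles. Reading the last bullet as a description of quantiles (here quartiles and the interquartile range) is an assumption, since the card does not name them. The data are made up, and `method="inclusive"` is just one of the two interpolation methods offered by `statistics.quantiles`.

```python
import statistics

# Illustrative values; the second list replaces the largest value with an outlier
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
data_outlier = [1, 2, 3, 4, 5, 6, 7, 8, 90]

def value_range(values):
    return max(values) - min(values)

def quartiles(values):
    # Three cut points that split the sorted data into four equal-frequency groups
    return statistics.quantiles(values, n=4, method="inclusive")

def iqr(values):
    q1, _, q3 = quartiles(values)
    return q3 - q1

print(value_range(data), value_range(data_outlier))  # range changes a lot
print(iqr(data), iqr(data_outlier))                  # interquartile range does not
```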
15
Q

Define reference range. How can it be estimated?

A

A set of values within which a specific test result is considered to be within the normal or healthy range for a particular population

It can be estimated by recruiting a large sample of individuals from the defined population and collecting their results for the specific test or measurement.

The collected data is analyzed to calculate:
Mean: The average value of the test or measurement across the sample.
Standard Deviation (SD): A measure of how much the individual results deviate from the mean.

The reference range is usually defined as the mean plus or minus a certain number of standard deviations. Commonly, the 95% reference range is calculated as mean ± 2 SD. This means that approximately 95% of individuals in the defined population would be expected to have values within this range.
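A minimal sketch of the mean ± 2 SD calculation. The function name `reference_range_95` and the five results are hypothetical; a real reference range would use a much larger sample, and the rule assumes the values are roughly normally distributed.

```python
import statistics

def reference_range_95(values):
    """Approximate 95% reference range as mean +/- 2 SD."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation
    return mean - 2 * sd, mean + 2 * sd

# Hypothetical test results (far fewer than a real study would collect)
results = [4.0, 4.5, 5.0, 5.5, 6.0]

low, high = reference_range_95(results)
print(f"95% reference range: {low:.2f} to {high:.2f}")
```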

16
Q

How can numerical data be further classified?

A

As continuous or discrete:

Continuous - Can take any value within a given range, including decimals and fractions. Continuous data is measured rather than counted

Discrete - Can only take certain, separate values within a given range, typically whole numbers. Discrete data is usually counted rather than measured

17
Q

How do we calculate confidence intervals?

A

Approximate 95% confidence interval for a mean: Mean ± 2 × standard error, where standard error = SD / √n (n = sample size)
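A short sketch of the calculation, taking the standard error of the mean as SD / √n. The function name `confidence_interval_95` and the sample values are hypothetical, and the multiplier 2 is the usual rounding of 1.96 for an approximate 95% interval.

```python
import math
import statistics

def confidence_interval_95(values):
    """Approximate 95% confidence interval for the mean: mean +/- 2 SE."""
    mean = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(len(values))  # standard error
    return mean - 2 * se, mean + 2 * se

# Hypothetical sample (illustrative values only)
sample = [4.0, 4.5, 5.0, 5.5, 6.0]

ci_low, ci_high = confidence_interval_95(sample)
print(f"Approximate 95% CI: {ci_low:.2f} to {ci_high:.2f}")
```

The interval is narrower than mean ± 2 SD for the same data because the standard error shrinks as the sample size grows.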