Intro Flashcards

1
Q

What infographic do you never use

A

Pie charts

2
Q

Statistics

A

Statistics is the branch of mathematics that examines ways to process and analyse data. Statistics provides procedures to collect and transform data in ways that are useful to business decision makers. To understand anything about statistics, you first need to understand the meaning of a variable.

3
Q

4 fundamental terms of statistics

A

Population
Sample
Parameter
Statistic

4
Q

Population

A

A population consists of all the members of a group about which you want to draw a conclusion.

5
Q

Sample

A

A sample is the portion of the population selected for analysis.

6
Q

Parameter

A

A parameter is a numerical measure that describes a characteristic of a population (a measure used to describe a population). Greek letters refer to parameters (e.g. μ for the population mean).

7
Q

Statistic

A

A statistic is a numerical measure that describes a characteristic of a sample (a measure calculated from sample data). Roman letters refer to statistics (e.g. s for the sample standard deviation).

8
Q

2 types of statistics

A

Descriptive statistics

Inferential statistics

9
Q

Descriptive statistics

A

Collecting, summarising and presenting data

10
Q

Inferential statistics

A

Drawing conclusions about a population based on sample data/results (i.e. estimating a parameter based on a statistic).

11
Q

3 steps of descriptive statistics

A

Collect data
Present data
Characterise data

12
Q

Collect data example

A

Survey

13
Q

Present data example

A

Tables and graphs

14
Q

Characterise data example

A

Sample mean

15
Q

Steps of inferential statistics

A

Estimation

Hypothesis Testing

16
Q

Estimation example

A

Estimate the population mean weight (parameter) using the sample mean weight (statistic).

17
Q

Hypothesis testing example

A

Test the claim that the population mean weight is 100 kilos

18
Q

4 important sources when collecting data

A

Data distributed by organisation or individual

Designed experiment

Survey

Observational study

19
Q

2 classifications of data sources

A

Primary

Secondary

20
Q

2 types of data

A

Categorical (defined categories)

Numerical (quantitative)

21
Q

2 types of numerical variables

A

Discrete (counted items)

Continuous (measured characteristics)

22
Q

Categorical data

A

Simply classifies data into categories (e.g. marital status, hair colour, gender).

23
Q

Numerical discrete data e.g.

A

Counted items, with a finite or countable number of possible values (e.g. number of children, number of people who have type-O blood).

24
Q

Numerical continuous data e.g.

A

Measured characteristics, which can take infinitely many values within a range (e.g. weight, height).

25
Q

4 levels of Measurement and Measurement Scales from highest to lowest

A

Ratio data
Interval data
Ordinal data
Nominal data

26
Q

Ratio data

A

Differences between measurements are meaningful and a true zero exists, so ratios of values are also meaningful.

27
Q

Interval data

A

Differences between measurements are meaningful but no true zero exists.

28
Q

Ordinal data

A

Ordered categories (rankings, order or scaling)

29
Q

Nominal data

A

Categories (no ordering or direction)

30
Q

Ratio data e.g.

A

Height, weight, age, weekly food spending

31
Q

Interval data e.g.

A

Temperature in degrees Celsius, standardised exam score

32
Q

Ordinal data e.g.

A

Rankings in a tennis tournament, student letter grades, Likert scales

33
Q

Nominal data e.g.

A

Marital status, type of car owned, gender, hair colour

34
Q

What data is charted and how is this done

A

Categorical data, through the use of summary tables, bar charts and pie charts

35
Q

What data is graphed and how is this done

A

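Numerical data, through the use of histograms, frequency polygons, ogives, scatter diagrams and time-series plots (bar charts and pie charts are for categorical data)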

36
Q

Ordered array

A

A sequence of data in rank order. Shows range, min to max. Provides some signals about variability within the range and may help identify outliers. If the data set is large or if the data is highly variable the ordered array is less useful.
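
As a minimal sketch, an ordered array and the range it reveals can be produced in Python as follows. The data values are hypothetical and exist only for illustration.

```python
# Hypothetical raw data, invented for illustration only.
data = [27, 21, 35, 18, 24, 41, 21, 30]

# The ordered array: the same values in rank order, smallest to largest.
ordered_array = sorted(data)

# The ends of the ordered array give the minimum, maximum and range.
minimum, maximum = ordered_array[0], ordered_array[-1]
data_range = maximum - minimum

print(ordered_array)  # [18, 21, 21, 24, 27, 30, 35, 41]
print(minimum, maximum, data_range)  # 18 41 23
```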

37
Q

Frequency distribution

A

A frequency distribution is a summary table in which data are arranged into numerically ordered classes or intervals. The number of observations in each ordered class or interval becomes the corresponding frequency of that class or interval.

38
Q

Why use a frequency distribution

A

It is a way to summarise numerical data. It condenses the raw data into a more useful form. It allows for a quick visual interpretation of the data and a first inspection of the shape of the data.

39
Q

Frequency distribution rules

A

Classes must be mutually exclusive and collectively exhaustive: no classes overlap, each data value belongs to exactly one class, and every data value falls into some class. Each class grouping has the same width. Usually use at least 5 but no more than 15 groupings. Round up the interval width to get desirable endpoints.

40
Q

How is width of interval determined in a frequency distribution

A

Width of interval = range / number of desired class groupings (rounded up to a convenient value)
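
The width calculation and the resulting frequency distribution can be sketched in Python as below. This is one possible illustration, not a prescribed method: the data, the function names `class_width` and `frequency_distribution`, the `round_to` parameter and the choice of 5 as the first lower boundary are all assumptions made for the example.

```python
import math

def class_width(values, num_classes, round_to=5):
    """Range / number of classes, rounded up to a multiple of `round_to`."""
    data_range = max(values) - min(values)
    raw_width = data_range / num_classes
    return math.ceil(raw_width / round_to) * round_to

def frequency_distribution(values, lower_start, width, num_classes):
    """Count values in equal-width classes [lower, upper).

    Half-open classes never overlap, so each value lands in exactly one class.
    A value outside every class raises an error (classes must be exhaustive).
    """
    counts = [0] * num_classes
    for v in values:
        index = (v - lower_start) // width
        if not 0 <= index < num_classes:
            raise ValueError(f"{v} falls outside the classes")
        counts[index] += 1
    return [
        (lower_start + i * width, lower_start + (i + 1) * width, counts[i])
        for i in range(num_classes)
    ]

# Hypothetical data, invented for illustration only.
data = [15, 22, 8, 31, 27, 19, 44, 36, 12, 25, 29, 40, 17, 33, 21]

width = class_width(data, num_classes=4)  # range 36 / 4 = 9, rounded up to 10
table = frequency_distribution(data, lower_start=5, width=width, num_classes=4)

for lower, upper, count in table:
    print(f"{lower} to under {upper}: {count}")
```

Only four classes are used here to keep the example short; the guideline of at least 5 groupings applies to realistic data sets.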

41
Q

Histogram

A

A graph of the data in a frequency distribution is called a histogram. The class boundaries (or class midpoints) are shown on the horizontal axis. The vertical axis is either frequency, relative frequency, or percentage. Bars of the appropriate heights are used to represent the frequency (number of observations) within each class, or the relative frequency or percentage of that class.

42
Q

Important note about histograms

A

There should be no gaps between the bars, even though Excel draws them with gaps.

43
Q

What allows you to compare two or more variables

A

Frequency polygons and ogives

44
Q

Scatter diagrams

A

Scatter diagrams are used to examine possible relationships between two numerical variables. In a scatter diagram, one variable is measured on the vertical axis (Y) and the other variable is measured on the horizontal axis (X).

45
Q

Time series plot

A

A time-series plot is used to study patterns in the values of a variable over time. In a time-series plot, one variable is measured on the vertical axis and the time period is measured on the horizontal axis.

46
Q

Stem and leaf display

A

A quick and simple way to see distribution details in a data set.
Method: separate the sorted data series into groups (the stems) and the values within each group (the leaves).
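
The method can be sketched in Python as follows. This is an illustrative sketch that assumes non-negative whole numbers, with the tens digits as stems and the units digits as leaves; the data and the function name `stem_and_leaf` are hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group sorted values by stem (tens) and list their leaves (units)."""
    groups = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)  # e.g. 27 -> stem 2, leaf 7
        groups[stem].append(leaf)
    return dict(groups)

# Hypothetical data, invented for illustration only.
data = [21, 35, 18, 24, 41, 21, 30, 27]
display = stem_and_leaf(data)

for stem in sorted(display):
    print(stem, "|", " ".join(str(leaf) for leaf in display[stem]))
# 1 | 8
# 2 | 1 1 4 7
# 3 | 0 5
# 4 | 1
```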

47
Q

Tables and charts for numerical data

A

Photo 1

48
Q

Stem and leaf display example

A

Photos 2-5

49
Q

Frequency distribution example

A

Photos 6-10

50
Q

Histogram example

A

Photo 11

51
Q

Frequency polygon example

A

Photo 12

52
Q

The ogive example

A

Photo 13

53
Q

Scatter diagrams example

A

Photo 14

54
Q

Time series plot example

A

Photo 15

55
Q

Variables

A

Variables are characteristics of items or individuals.

56
Q

Data

A

Data are the observed values of variables.

57
Q

Operational definition

A

Defines how a variable is to be measured.

58
Q

Big Data

A

Large data sets characterised by their volume, velocity and variety.

59
Q

Statistical packages

A

Computer programs designed to perform statistical analysis.

60
Q

Primary sources

A

Provide information collected by the data analyser.

61
Q

Secondary sources

A

Provide data collected by another person or organisation.

62
Q

Focus group

A

A type of observational study: a group of people who are asked about their attitudes and opinions for qualitative research.

63
Q

Discrete variables

A

Can only take a finite or countable number of values.

64
Q

Continuous variables

A

Can take any value between specified limits.

65
Q

Problems for Section 1.4
Chapter 1 review problems
Problems for Section 2.1
Problems for Section 2.2
Problems for Section 2.3
Problems for Section 2.4
Problems for Section 2.5
Problems for Section 2.6
Chapter 2 review problems

A

Work through problems in textbook

66
Q

Summary table

A

Summarises categorical or numerical data; gives the frequency, proportion or percentage of data values in each category or class.

67
Q

Summary table examples

A

Photos 16-17

68
Q

Bar chart

A

Graphical representation of a summary table for categorical data; the length of each bar represents the proportion, frequency or percentage of data values in a category.

69
Q

Pie Chart

A

Graphical representation of a summary table for categorical data. Each category is represented by a slice of a circle, and the area of the slice represents the category's proportion or percentage share of the total of all categories.

70
Q

Class width (frequency distribution)

A

Distance between upper and lower boundaries of a class.

71
Q

Range

A

Distance measure of variation; difference between maximum and minimum data values

72
Q

Class boundaries (frequency distribution)

A

Upper and lower values used to define classes for numerical data.

73
Q

Class midpoint

A

Centre of a class; representative value of class.

74
Q

Relative Frequency Distributions and Percentage Distributions

A

A relative frequency distribution is obtained by dividing the frequency in each class by the total number of values. From this a percentage distribution can be obtained by multiplying each relative frequency by 100%.
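
A minimal Python sketch of both calculations, using hypothetical class frequencies chosen only for illustration:

```python
# Hypothetical class frequencies, invented for illustration only.
frequencies = [2, 8, 6, 4]
total = sum(frequencies)  # total number of values: 20

# Relative frequency: each class frequency divided by the total.
relative = [f / total for f in frequencies]

# Percentage: the same share expressed out of 100.
percentage = [100 * f / total for f in frequencies]

for f, r, p in zip(frequencies, relative, percentage):
    print(f"frequency {f}: relative {r:.2f}, percentage {p:.0f}%")
```

The relative frequencies sum to 1 and the percentages sum to 100, which is a quick check that no class has been missed.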

75
Q

Relative frequency distribution

A

Summary table for numerical data which gives the relative frequency of data values in each class.

76
Q

Percentage distribution

A

Summary table for numerical data which gives the percentage of data values in each class.

77
Q

Cumulative percentage distribution

A

Summary table for numerical data; gives the cumulative percentage for each successive class, i.e. the percentage of values that are less than a certain value (the upper boundary of that class).
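
A minimal Python sketch of a cumulative percentage distribution, assuming hypothetical class frequencies and upper class boundaries invented for illustration:

```python
from itertools import accumulate

# Hypothetical upper class boundaries and class frequencies.
upper_boundaries = [20, 30, 40, 50]
frequencies = [2, 8, 6, 4]
total = sum(frequencies)

# Running total of frequencies up to and including each class.
cumulative_freq = list(accumulate(frequencies))  # [2, 10, 16, 20]

# Share of all values lying below each upper class boundary.
cumulative_pct = [100 * c / total for c in cumulative_freq]

for upper, pct in zip(upper_boundaries, cumulative_pct):
    print(f"Less than {upper}: {pct:.0f}%")
```

The last class always reaches 100%, since every value lies below the final upper boundary.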

78
Q

Percentage polygon

A

Graphical representation of a percentage distribution.

79
Q

Cumulative percentage polygon (ogive)

A

Graphical representation of a cumulative percentage distribution.

80
Q

Chartjunk

A

Unnecessary information and detail that reduces the clarity of a graph.