# Intro Flashcards

1
Q

Which infographic should you never use?

A

Pie charts

2
Q

Statistics

A

Statistics is the branch of mathematics that examines ways to process and analyse data. Statistics provides procedures to collect and transform data in ways that are useful to business decision makers. To understand anything about statistics, you first need to understand the meaning of a variable.

3
Q

4 fundamental terms of statistics

A

Population
Sample
Parameter
Statistic

4
Q

Population

A

A population consists of all the members of a group about which you want to
draw a conclusion.

5
Q

Sample

A

A sample is the portion of the population selected for analysis.

6
Q

Parameter

A

A parameter is a numerical measure that describes a characteristic of a
population (measures used to describe a population). Greek letters refer
to parameters.

7
Q

Statistic

A

A statistic is a numerical measure that describes a characteristic of a sample
(measures calculated from sample data). Roman letters refer to
statistics.

8
Q

2 types of statistics

A

Descriptive statistics

Inferential statistics

9
Q

Descriptive statistics

A

Collecting, summarising and presenting data

10
Q

Inferential statistics

A

Drawing conclusions about a population based on sample
data/results (i.e. estimating a parameter based on a statistic).

11
Q

3 steps of descriptive statistics

A

Collect data
Present data
Characterise data

12
Q

Collect data example

A

Survey

13
Q

Present data example

A

Tables and graphs

14
Q

Characterise data example

A

Sample mean

15
Q

Steps of inferential statistics

A

Estimation

Hypothesis Testing

16
Q

Estimation example

A

Estimate the population mean weight (parameter) using the
sample mean weight (statistic).
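
The estimation idea can be sketched in a few lines. This is a minimal illustration only: the population is simulated, and the figures (mean 100 kilos, standard deviation 15, sample size 50) are assumptions made up for the example, not values from the course.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population of 10,000 weights in kilos (simulated, not real data).
population_weights = [random.gauss(100, 15) for _ in range(10_000)]

# Select a simple random sample of 50 members of the population.
sample_weights = random.sample(population_weights, 50)

population_mean = mean(population_weights)  # parameter (normally unknown)
sample_mean = mean(sample_weights)          # statistic used as the estimate

print(round(population_mean, 1), round(sample_mean, 1))
```

In practice only the sample is available, so the sample mean stands in for the population mean.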

17
Q

Hypothesis testing example

A

Test the claim that the population mean weight is 100 kilos

18
Q

4 important sources when collecting data

A

Data distributed by organisation or individual

Designed experiment

Survey

Observational study

19
Q

2 classifications of data sources

A

Primary

Secondary

20
Q

2 types of data

A

Categorical (defined categories)

Numerical (quantitative)

21
Q

2 types of numerical variables

A

Discrete (counted items)

Continuous (measured characteristics)

22
Q

Categorical data

A

Simply classifies data into categories (e.g. marital status, hair
colour, gender)

23
Q

Numerical discrete data e.g.

A

Counted items: a finite or countable number of possible values (e.g. number of
children, number of people who have type-O blood).

24
Q

Numerical continuous data e.g.

A

Measured characteristics: an infinite number of possible values
(e.g. weight, height).

25
Q

4 levels of Measurement and Measurement Scales from highest to lowest

A

Ratio data
Interval data
Ordinal data
Nominal data

26
Q

Ratio data

A

Differences between measurements are meaningful and a true zero
exists

27
Q

Interval data

A

Differences between measurements are meaningful but no true zero
exists

28
Q

Ordinal data

A

Ordered categories (rankings, order or scaling)

29
Q

Nominal data

A

Categories (no ordering or direction)

30
Q

Ratio data e.g.

A

Height, weight, age, weekly food spending

31
Q

Interval data e.g.

A

Temperature in degrees Celsius, standardised exam score

32
Q

Ordinal data e.g.

A

Rankings in a tennis tournament, student letter grades, Likert
scales

33
Q

Nominal data e.g.

A

Marital status, type of car owned, gender, hair colour

34
Q

What data is charted and how is this done

A

Categorical data through the use of summary tables

35
Q

What data is graphed and how is this done

A

Categorical data through the use of bar charts and pie charts

36
Q

Ordered array

A

A sequence of data in rank order. Shows range, min to max. Provides some signals about variability within the range and may help identify outliers. If the data set is large or if the data is highly variable the ordered array is less useful.
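
As a small illustration, an ordered array can be produced by sorting the raw values. The data below are hypothetical and the function name `ordered_array` is an assumption made for this sketch.

```python
def ordered_array(values):
    """Return the values sorted from smallest to largest."""
    return sorted(values)

data = [23, 7, 15, 42, 15, 8]  # hypothetical observations
arr = ordered_array(data)

print(arr)               # [7, 8, 15, 15, 23, 42]
print(arr[0], arr[-1])   # minimum and maximum: 7 42
print(arr[-1] - arr[0])  # range: 35
```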

37
Q

Frequency distribution

A

A frequency distribution is a summary table in which data
are arranged into numerically ordered classes or intervals. The number of observations in each ordered class or interval becomes the corresponding
frequency of that class or interval.

38
Q

Why use a frequency distribution?

A

It is a way to summarise numerical data. It condenses the raw data into a more useful form. It allows for a quick visual
interpretation of the data and first inspection of the shape of the data.

39
Q

Frequency distribution rules

A

Classes must be mutually exclusive: no classes overlap, so each data value belongs to only one class.
Classes must be collectively exhaustive: every data value belongs to some class.
Each class grouping has the same width.
Usually at least 5 but no more than 15 groupings.
Round up the interval width to get desirable endpoints.
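
The rules can be sketched as a small counting routine. This is one possible approach under stated assumptions, not the textbook's procedure: classes are half-open, `[lower, upper)`, so no two classes overlap, and any value outside the classes raises an error so the classes stay exhaustive. The function name, its parameters and the data are hypothetical.

```python
def frequency_distribution(values, start, width, n_classes):
    """Count values in n_classes equal-width classes [lower, upper)."""
    counts = [0] * n_classes
    for v in values:
        index = int((v - start) // width)  # which class the value falls in
        if not 0 <= index < n_classes:
            raise ValueError(f"{v} falls outside the classes")
        counts[index] += 1
    return [(start + i * width, start + (i + 1) * width, c)
            for i, c in enumerate(counts)]

# Hypothetical data set of 20 observations.
temps = [12, 13, 17, 21, 24, 24, 26, 27, 27, 30,
         32, 35, 37, 38, 41, 43, 44, 46, 53, 58]

for lower, upper, count in frequency_distribution(temps, start=10, width=10, n_classes=5):
    print(f"{lower} to under {upper}: {count}")
```

With these values the five classes hold 3, 6, 5, 4 and 2 observations, which sum to the 20 values in the data set.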

40
Q

How is the width of the interval determined in a frequency distribution?

A

range/number of desired class groupings
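
A short sketch of this calculation follows, with a hypothetical helper that rounds the raw width up to a tidier value. The data and the choice of rounding to a multiple of 5 are assumptions for illustration.

```python
import math

def class_width(values, n_classes):
    """Range divided by the desired number of class groupings."""
    return (max(values) - min(values)) / n_classes

def round_up_to(width, step):
    """Round width up to the next multiple of step for tidier endpoints."""
    return math.ceil(width / step) * step

data = [12, 13, 17, 21, 24, 27, 30, 35, 41, 46, 53, 58]  # hypothetical values
raw = class_width(data, 5)        # (58 - 12) / 5
print(raw, round_up_to(raw, 5))   # 9.2 10
```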

41
Q

Histogram

A

A graph of the data in a frequency distribution is called a histogram. The class boundaries (or class midpoints) are shown on the horizontal axis. The vertical axis is either frequency, relative frequency, or percentage. Bars of the
appropriate heights are used to represent the frequencies (number of observations) within each class or the relative frequencies (or percentages) of that class.

42
Q

A

A histogram has no gaps between its bars, even though Excel draws gaps between them.

43
Q

What allows you to compare two or more variables?

A

Frequency polygons and ogives

44
Q

Scatter diagrams

A

Scatter diagrams are used to examine possible relationships between two numerical variables. In a scatter diagram, one variable is measured on the vertical axis (Y) and the other variable is measured on the horizontal axis (X).

45
Q

Time series plot

A

A time-series plot is used to study patterns in the values of a
variable over time. In a time-series plot, one variable is measured on the vertical
axis and the time period is measured on the horizontal axis.

46
Q

Stem and leaf display

A

A quick and simple way to see distribution details in a data set.
Method: separate the sorted data series into groups (the stems) and the values within each group (the leaves).
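
The method can be sketched as follows, assuming non-negative integers where the stem is the tens part and the leaf is the units digit. The function name and the data are hypothetical, and stems with no leaves are simply omitted in this sketch.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group sorted integers by stem (tens) and collect leaves (units)."""
    groups = defaultdict(list)
    for v in sorted(values):
        groups[v // 10].append(v % 10)
    return dict(groups)

ages = [21, 24, 24, 26, 27, 27, 30, 32, 38, 41]  # hypothetical data
for stem, leaves in stem_and_leaf(ages).items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```

For these values the display has three rows: stem 2 with leaves 1 4 4 6 7 7, stem 3 with leaves 0 2 8, and stem 4 with leaf 1.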

47
Q

Tables and charts for numerical data

A

Photo 1

48
Q

Stem and leaf display example

A

Photos 2-5

49
Q

Frequency distribution example

A

Photos 6-10

50
Q

Histogram example

A

Photo 11

51
Q

Frequency polygon example

A

Photo 12

52
Q

The ogive example

A

Photo 13

53
Q

Scatter diagrams example

A

Photo 14

54
Q

Time series plot example

A

Photo 15

55
Q

Variables

A

Variables are characteristics of items or individuals.

56
Q

Data

A

Data are the observed values of variables.

57
Q

Operational definition

A

Defines how a variable is to be measured.

58
Q

Big Data

A

Large data sets characterised by their volume, velocity and variety.

59
Q

Statistical packages

A

Computer programs designed to perform statistical analysis.

60
Q

Primary sources

A

Provide information collected by the data analyser.

61
Q

Secondary sources

A

Provide data collected by another person or organisation.

62
Q

Focus group

A

A type of observational study in which a group of people is asked about attitudes and opinions for qualitative research.

63
Q

Discrete variables

A

Can only take a finite or countable number of values.

64
Q

Continuous variables

A

Can take any value between specified limits.

65
Q

Problems for Section 1.4
Chapter 1 review problems
Problems for Section 2.1
Problems for Section 2.2
Problems for Section 2.3
Problems for Section 2.4
Problems for Section 2.5
Problems for Section 2.6
Chapter 2 review problems

A

Work through problems in textbook

66
Q

Summary table

A

Summarises categorical or numerical data; gives the frequency, proportion or percentage of data values in each category or class.
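
A summary table for categorical data can be sketched with a simple tally. The function name and the hair-colour data are hypothetical, chosen only to illustrate the frequency and percentage columns.

```python
from collections import Counter

def summary_table(categories):
    """Map each category to its (frequency, percentage) pair."""
    counts = Counter(categories)
    total = sum(counts.values())
    return {cat: (n, 100 * n / total) for cat, n in counts.items()}

hair = ["brown", "black", "brown", "blonde", "brown", "black", "red", "brown"]
for cat, (n, pct) in summary_table(hair).items():
    print(f"{cat:<8}{n:>3}{pct:>8.1f}%")
```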

67
Q

Summary table examples

A

Photos 16-17

68
Q

Bar chart

A

Graphical representation of a summary table for categorical data; the length of each bar represents the proportion, frequency or percentage of data values in a category.

69
Q

Pie Chart

A

Graphical representation of a summary table for categorical data, each category represented by a slice of a circle of which the area represents the proportion or percentage share of the category relative to the total of all categories.

70
Q

Class width (frequency distribution)

A

Distance between upper and lower boundaries of a class.

71
Q

Range

A

Distance measure of variation; difference between maximum and minimum data values.

72
Q

Class boundaries (frequency distribution)

A

Upper and lower values used to define classes for numerical data.

73
Q

Class midpoint

A

Centre of a class; representative value of class.

74
Q

Relative Frequency Distributions and Percentage Distributions

A

A relative frequency distribution is obtained by dividing the frequency in each class by the total number of values. From this a percentage distribution can be obtained by multiplying each relative frequency by 100%.
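
A minimal sketch of both calculations, using hypothetical class frequencies:

```python
def relative_frequencies(frequencies):
    """Divide each class frequency by the total number of values."""
    total = sum(frequencies)
    return [f / total for f in frequencies]

def percentages(frequencies):
    """Express each class frequency as a percentage of the total."""
    total = sum(frequencies)
    return [100 * f / total for f in frequencies]

freqs = [3, 6, 5, 4, 2]  # hypothetical frequencies for five classes
print(relative_frequencies(freqs))  # [0.15, 0.3, 0.25, 0.2, 0.1]
print(percentages(freqs))           # [15.0, 30.0, 25.0, 20.0, 10.0]
```

The relative frequencies sum to 1 and the percentages sum to 100.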

75
Q

Relative frequency distribution

A

Summary table for numerical data which gives the relative frequency of data values in each class.

76
Q

Percentage distribution

A

Summary table for numerical data which gives the percentage of data values in each class.

77
Q

Cumulative percentage distribution

A

Summary table for numerical data; gives the cumulative percentage of each successive class. A cumulative percentage distribution gives the percentage of values that are less than a certain value.
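
The running totals can be sketched as below, again with hypothetical class frequencies. Each result is the percentage of values falling below the upper boundary of that class.

```python
from itertools import accumulate

def cumulative_percentages(frequencies):
    """Running total of class frequencies, expressed as percentages."""
    total = sum(frequencies)
    return [100 * c / total for c in accumulate(frequencies)]

freqs = [3, 6, 5, 4, 2]  # hypothetical frequencies for five classes
print(cumulative_percentages(freqs))  # [15.0, 45.0, 70.0, 90.0, 100.0]
```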

78
Q

Percentage polygon

A

Graphical representation of a percentage distribution.

79
Q

Cumulative percentage polygon (ogive)

A

Graphical representation of a cumulative percentage distribution.

80
Q

Chartjunk

A

Unnecessary information and detail that reduces the clarity of a graph.