statistics definitions Flashcards Preview

Flashcards in statistics definitions Deck (18)

raw data

data as it is first collected in a statistical investigation before it has been sorted or ordered


what is quantitative data

numerical data such as measures of height and weight


what is qualitative data

non-numerical data such as type of car or colour of hair


what is categorical data

data that can be sorted into non-overlapping categories


what is continuous data

numerical data that can take any value within a range, such as temperature


what is discrete data

numerical data that can only take particular separate values, such as shoe size


what is ordinal data

data that can be written in order of numerical value or rank, such as position in a race or in a class test


what is bivariate data

pairs of related data values such as exam results and time spent on study


what is multivariate data

sets of three or more related data values, such as age, height and weight


what is primary data

data that you collect yourself


what is secondary data

data that has already been collected by someone else, such as a published source


what are the advantages and disadvantages of primary data

advantages:
- collection method known
- accuracy known
- questionnaire or survey can be designed to find answers to specific questions

disadvantages:
- collection of data can be expensive and time-consuming


what are the advantages and disadvantages of secondary data

advantages:
- easy and cheap to obtain
- data from known organisations, such as the UK Office for National Statistics, is usually reliable

disadvantages:
- data source may not be reliable
- data might contain errors
- data might not be suitable to find answers to specific questions
- collection method unknown
- data might be out of date


what needs to be controlled when collecting data

- extraneous variables, which are any variables that the researcher is not interested in but that could affect the results of the experiment
- explanatory variable, which is like the independent variable in science
- response variable, which is like the dependent variable in science


what are field experiments

carried out in an everyday (uncontrolled) environment but the researcher sets up the situation and variables are controlled


what is a natural experiment

carried out in an everyday (uncontrolled) environment but the researcher has no control over any variables


how must you clean data

- identify and correct or remove inaccurate data values or extreme values
- check units are consistent
- record values without units or other symbols
- decide what to do about missing data
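The cleaning steps above can be sketched in code. This is only an illustration under stated assumptions: the function name clean_heights, the example of heights recorded in cm or m, the plausible range of 50 to 250 cm, and the choice to set missing values aside rather than estimate them are all hypothetical and not part of the deck.

```python
import re

def clean_heights(raw_values, plausible_range=(50.0, 250.0)):
    """Return (cleaned heights in cm, rejected entries).

    Hypothetical example: heights typed by hand, some in cm, some in m.
    """
    low, high = plausible_range
    cleaned, rejected = [], []
    for entry in raw_values:
        text = "" if entry is None else str(entry).strip().lower()
        # Decide what to do about missing data: here it is set aside (an assumption).
        if text == "":
            rejected.append(entry)
            continue
        # Record the value without units or other symbols.
        match = re.fullmatch(r"([0-9]*\.?[0-9]+)\s*(cm|m)?", text)
        if match is None:
            rejected.append(entry)  # not a usable number
            continue
        value = float(match.group(1))
        # Check units are consistent: convert metres to centimetres.
        if match.group(2) == "m":
            value *= 100
        # Remove inaccurate or extreme values outside the plausible range.
        if not (low <= value <= high):
            rejected.append(entry)
            continue
        cleaned.append(value)
    return cleaned, rejected
```

For example, clean_heights(["172 cm", "1.65 m", "", "1720"]) keeps the first two entries as values in centimetres and sets the blank entry and the implausible 1720 aside for checking.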


why must you check and clean data

- to ensure it is consistent and accurate before you process it, otherwise your results may be invalid
- collected data may contain outliers or anomalous values that do not fit the pattern of the rest of the data and may skew your results
- outliers can be ignored if they are due to measuring or recording errors
- you need to check that your collection plan hasn't affected the reliability of your results
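The deck does not say how to decide whether a value is an outlier. One common convention, assumed here purely for illustration, is to flag values more than 1.5 times the interquartile range below the lower quartile or above the upper quartile. The function name flag_outliers and the multiplier k are hypothetical.

```python
import statistics

def flag_outliers(values, k=1.5):
    """Return the values lying outside the quartile fences.

    Assumption: the 1.5 x IQR rule is used; the deck does not specify a rule.
    """
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    return [v for v in values if v < lower_fence or v > upper_fence]
```

A flagged value is only a candidate for checking. As the card says, it can be ignored only if it turns out to be due to a measuring or recording error.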