Chapter 13 - Data Analysis Flashcards
What is Data?
Consists of numbers, letters, symbols, raw facts, events and transactions which have been recorded but not yet processed into a form which is suitable for use by management
What is information?
Data which have been processed in such a way that they are meaningful to the person who receives them
Why is information useful to management?
- Helps with planning
- Helps with decision-making
- Helps control day-to-day operations, for example by comparing actual results with those planned
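Worked example: the control point above (comparing actual results with those planned) can be sketched in a few lines of Python. The cost headings and figures are hypothetical, and the labels "adverse" and "favourable" are the usual management accounting terms rather than wording taken from these cards.

```python
# Hypothetical figures: planned (budgeted) vs actual costs for one month.
planned = {"materials": 12000, "labour": 8000, "overheads": 5000}
actual = {"materials": 12900, "labour": 7600, "overheads": 5000}


def variances(planned, actual):
    """Return actual minus planned for each cost heading."""
    return {item: actual[item] - planned[item] for item in planned}


def describe(variance):
    """For costs, spending more than planned is adverse, less is favourable."""
    if variance > 0:
        return "adverse"
    if variance < 0:
        return "favourable"
    return "on plan"


result = variances(planned, actual)
for item, value in result.items():
    print(f"{item}: {value:+d} ({describe(value)})")
```

A manager would normally investigate only the larger differences, here the materials overspend.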
What are the four types of data?
- Quantitative data = numerical data that provides measurements or quantities, expressed as numbers, e.g. the number of kg needed to make a unit of product
- Qualitative data = cannot be expressed as numbers or values and is much harder to analyse
- Discrete data = non-continuous data that can only take certain values, e.g. the number of units produced
- Continuous data = can take on any value (within a range), e.g. time or distance
What are internal sources of data?
- Accounting records
- HR/payroll records
- Machine logs/computer systems
- Procurement data systems
- Timesheets
- Communication to/from staff
What are the sources of external information?
Formally gathered
- Market research e.g. new trends, customer tastes, competitor products
- Research and development
- Tax and accounting specialists
- Legal specialists
Informally gathered
- Any information gathered on an ongoing basis e.g. newspapers, internet, meetings with external business colleagues
What is the Internet of Things (IoT)?
Internet-connected devices that continually collect and exchange data
Using the mnemonic ACCURATE - What are qualities of good information?
A - Accurate e.g. free of typos, appropriately rounded, correctly categorised, assumptions stated
C - Complete e.g. all the information needed for the purpose is provided
C - Cost-beneficial e.g. benefit > cost of producing information
U - User-targeted e.g. understandable and useful to recipient
R - Relevant for purpose intended
A - Authoritative e.g. genuine, highest quality for purpose, source should be known and reliable
T - Timely e.g. produced in advance of when needed
E - Easy to use e.g. clear, concise, constructive, communicated appropriately
What are the steps in data analysis?
- Identify the information needs
- Collect the data
- Analyse the data
- Present the information
- Use the information
What are ways in which data can be analysed?
- Inferential statistics - uses a sample of data taken from a population to describe and draw conclusions (inferences) about that population
- Exploratory data analysis - looks for patterns and relationships in the data. This type of analysis may use regression and correlation analysis.
- Confirmatory data analysis - confirms (or not) a hypothesis using statistical methods, e.g. testing the claim that a price increase of 3% will reduce demand by 5%
- Sample - a group of items drawn from a population. The population may consist of items such as metal bars, invoices or packets of tea
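Worked example: a minimal sketch of the regression and correlation analysis mentioned above, written in plain Python. The price and demand figures are hypothetical, and the function names are illustrative only.

```python
from math import sqrt

# Hypothetical observations: selling price and units demanded.
prices = [10.0, 10.5, 11.0, 11.5, 12.0]
demand = [500, 480, 455, 430, 410]


def mean(values):
    return sum(values) / len(values)


def correlation(x, y):
    """Pearson correlation coefficient, always between -1 and +1."""
    mx, my = mean(x), mean(y)
    sxy = sum((p - mx) * (q - my) for p, q in zip(x, y))
    sxx = sum((p - mx) ** 2 for p in x)
    syy = sum((q - my) ** 2 for q in y)
    return sxy / sqrt(sxx * syy)


def regression_line(x, y):
    """Least-squares line y = a + b*x; returns the pair (a, b)."""
    mx, my = mean(x), mean(y)
    b = sum((p - mx) * (q - my) for p, q in zip(x, y)) / sum((p - mx) ** 2 for p in x)
    a = my - b * mx
    return a, b


r = correlation(prices, demand)
a, b = regression_line(prices, demand)
print(f"r = {r:.3f}, demand = {a:.1f} + ({b:.1f} x price)")
```

A correlation close to -1 suggests that demand falls steadily as price rises. The slope b estimates how many units of demand are lost for each 1.00 increase in price.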
What is sampling?
Selecting units (e.g. people, organisations) from a population, then using the information gathered to generalise to the wider population
What are the three main reasons why sampling is necessary?
- Whole population may not be known
- Even if the population is known the process of testing every item can be extremely costly in time and money e.g. gaining information about the popularity of TV programmes by interviewing every viewer
- Items being tested may be completely destroyed in the process, e.g. in order to check the lifetime of an electric light bulb it is necessary to leave the bulb burning until it breaks and is of no further use
What are the rules involved with sampling?
Sample must be chosen in such a way that it is representative of the population
Sample must be of a certain size. In general, the larger the sample, the more reliable the results will be
What are the four types of sampling?
- Random
- Systematic
- Surveys
- Stratified
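Worked example: a rough sketch of how random, systematic and stratified samples might be drawn in Python. The invoice population, the "region" strata and the function names are all hypothetical. Surveys are a way of collecting data rather than a selection rule, so they are not shown.

```python
import random

# Hypothetical population: 100 invoices, each tagged with a region (the stratum).
population = [{"invoice": i, "region": "North" if i <= 60 else "South"}
              for i in range(1, 101)]


def random_sample(items, n, seed=0):
    """Simple random sample: every item has the same chance of selection."""
    return random.Random(seed).sample(items, n)


def systematic_sample(items, n, seed=0):
    """Pick a random starting point, then take every k-th item."""
    k = len(items) // n
    start = random.Random(seed).randrange(k)
    return items[start::k][:n]


def stratified_sample(items, n, key, seed=0):
    """Sample each stratum in proportion to its share of the population.

    Rounding means the total can differ slightly from n for awkward splits.
    """
    rng = random.Random(seed)
    strata = {}
    for item in items:
        strata.setdefault(item[key], []).append(item)
    sample = []
    for group in strata.values():
        share = round(n * len(group) / len(items))
        sample.extend(rng.sample(group, share))
    return sample


print(len(random_sample(population, 10)))
print([row["invoice"] for row in systematic_sample(population, 10)])
print([row["region"] for row in stratified_sample(population, 10, "region")])
```

The fixed seed is there only so that the example can be repeated. With 60 North and 40 South invoices, a stratified sample of 10 takes 6 from North and 4 from South.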
What are spreadsheets?
- Computer package used to manipulate data
What is the SUM function used for?
Totals the values in the list
What is the AVERAGE function used for?
Returns the average (arithmetic mean) of the values in the list
What is the MAX function used for?
Returns the highest value in the list
What is the MIN function used for?
Returns the lowest value in the list
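Worked example: the four spreadsheet functions above have close equivalents in Python, sketched here for comparison. The values and the cell range A1:A5 are hypothetical.

```python
from statistics import mean

# Hypothetical column of values, e.g. cells A1:A5 in a spreadsheet.
values = [120, 95, 140, 110, 135]

total = sum(values)      # like =SUM(A1:A5)
average = mean(values)   # like =AVERAGE(A1:A5)
highest = max(values)    # like =MAX(A1:A5)
lowest = min(values)     # like =MIN(A1:A5)

print(total, average, highest, lowest)
```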
What are the disadvantages of using spreadsheets?
- Can be time consuming
- Not able to identify data input errors or prevent accidental deletion so training of staff is important
- Sharing violations among users wishing to view or change data at the same time
- Difficult to identify an error in the design of the spreadsheet as some formulae are very complicated
- Spreadsheets are open to cyber-attack through viruses, hackers and general system failure
- Spreadsheets are restricted to a finite number of records and they may not be a true reflection of the ‘real’ world.
What are the two problems with data?
- Comparability: is it possible to compare data from different sources?
- Data bias: when a sample is chosen, does it truly represent the population?
What are the 7 types of bias?
- Selection bias
- Self-selection bias
- Observer bias
- Omitted variable bias
- Cognitive bias
- Confirmation bias
- Survivorship bias
What is selection bias?
When selecting a sample, all items in a population should have the same chance of being picked (true random sampling).
If the selection is not random, selection bias can occur and the sample may not be representative of the population.
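Worked example: a small illustration of selection bias in Python, using hypothetical invoice values filed from smallest to largest. Picking only the first five invoices gives a sample made up of the five smallest values, while a random pick gives every invoice the same chance.

```python
import random

# Hypothetical population: 20 invoice values, filed from smallest to largest.
population = [50 * i for i in range(1, 21)]   # 50, 100, ..., 1000


def mean(values):
    return sum(values) / len(values)


# Biased selection: only the first five invoices in the file are ever picked,
# so the other fifteen have no chance of being chosen.
biased_sample = population[:5]

# Random selection: every invoice has the same chance of being picked.
# The seed is fixed only so that the example can be repeated.
random_sample = random.Random(1).sample(population, 5)

print("population mean:", mean(population))
print("biased sample mean:", mean(biased_sample))
print("random sample mean:", mean(random_sample))
```

The population mean is 525 but the biased sample mean is only 150. Because the biased sample holds the five smallest invoices, no other sample of five can fall further below the population mean.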
What is self-selection bias?
When an individual selects whether or not to include themselves as part of a sample