Paper 2 Unit 6 Flashcards
(42 cards)
What is data?
Data refers to raw facts, observations or measurements that have little meaning on their own
What is information?
Processed and organised data that has meaning and context. It is derived from data through interpretation, analysis and contextualisation.
What is knowledge?
Knowledge goes beyond information and represents the understanding, insights and expertise gained from information and experience.
What is human readable data?
Data presented in a form that people can read and interpret, such as a block of unstructured text
What is machine readable data?
Structured data, like a set of instructions, that can be processed by computer programs
What is big data?
Large, complex, and layered groups of data that can be analysed to spot patterns and trends
Define a data type.
The way data is stored, which determines how it can be processed (string, integer, float, etc.)
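As a rough illustration of the answer above, the short Python sketch below stores three made-up values as a string, an integer and a float, then shows that the same characters behave differently depending on their type. The variable names and values are invented for the example.

```python
# Hypothetical values chosen only for illustration.
name = "Alice"   # string (text)
age = 17         # integer (whole number)
height = 1.68    # float (number with a decimal part)

# type() reports how each value is stored.
type_names = [type(value).__name__ for value in (name, age, height)]
print(type_names)

# The same characters give different results depending on the data type.
text_sum = "2" + "3"              # strings are joined together
number_sum = int("2") + int("3")  # integers are added
print(text_sum, number_sum)
```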
What is data wrangling?
The process of transforming a raw data form into a desired format suitable for purpose
Name the stages of data wrangling
Discover, structure, clean, enrich, validate, share
Why do organisations need data?
To analyse market trends and identify patterns, analyse system performance, monitor users, target marketing, inform decision making, and assess threats and opportunities
How is data generated?
Human input, AI, sensors, Internet of Things, Transactional data
Name different data formats
ASCII, CSV, fixed width text file, XML, JSON
What are the benefits and drawbacks of ASCII?
Benefits: Standard format for all computer systems, communicate using standard English alphabet
Drawbacks: Limited number of characters, replaced by Unicode which contains other alphabets and symbols so can be more widely used
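The character limit described above can be seen in a small Python sketch. The sample words are made up; the sketch assumes UTF-8 as the Unicode encoding, which is one common choice rather than the only one.

```python
plain_text = "Data"
accented_text = "caf\u00e9"  # "café": the final letter is outside the ASCII character set

# Every ASCII character maps to a single numeric code.
ascii_codes = list(plain_text.encode("ascii"))
print(ascii_codes)

# Encoding a non-ASCII character as ASCII fails.
try:
    accented_text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# A Unicode encoding such as UTF-8 can store it (the accented letter takes two bytes).
utf8_bytes = accented_text.encode("utf-8")
print(ascii_ok, len(utf8_bytes))
```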
What are the benefits and drawbacks of CSV?
Benefits: Common format understood by most applications
Drawbacks: The format is delimited but the delimiter is not always a comma; other delimiters can be used (tab is common, making TSV widely used), so the delimiter needs to be known before the file can be read correctly
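A minimal sketch of the delimiter issue, using Python's standard csv module. The sample data and the read_rows helper are hypothetical; the point is only that the same table reads correctly when the right delimiter is given and incorrectly when it is not.

```python
import csv
import io

# The same made-up table stored with two different delimiters.
csv_text = "name,city\nAlice,London\nBob,Leeds\n"
tsv_text = "name\tcity\nAlice\tLondon\nBob\tLeeds\n"

def read_rows(text, delimiter):
    """Parse delimited text into a list of rows (hypothetical helper)."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))

csv_rows = read_rows(csv_text, ",")
tsv_rows = read_rows(tsv_text, "\t")

# Reading the tab-separated text as if it were comma-separated
# leaves each line as one unsplit field.
wrong_rows = read_rows(tsv_text, ",")

print(csv_rows)
print(wrong_rows)
```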
What are the benefits and drawbacks of Fixed Width data formats?
Benefits: For very large data files it’s easy to calculate the location of data to retrieve it since the length is fixed
Drawbacks: Fixed sizes for fields, padding character and alignment need to be known before data can be retrieved accurately, needs to be carefully planned before setting up and saving data
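The benefit and drawback above can both be sketched in Python. The field widths, padding character (a space) and alignment used here are assumptions made for the example; they are exactly the details that need to be agreed before the data can be read back.

```python
# Assumed layout: name is 10 characters, left-aligned; age is 3 characters, right-aligned.
NAME_WIDTH = 10
AGE_WIDTH = 3
RECORD_WIDTH = NAME_WIDTH + AGE_WIDTH

def make_record(name, age):
    """Pad each field with spaces to its fixed width."""
    return name.ljust(NAME_WIDTH) + str(age).rjust(AGE_WIDTH)

people = [("Alice", 17), ("Bob", 42), ("Chandra", 8)]
data = "".join(make_record(name, age) for name, age in people)

def get_record(data, index):
    """Jump straight to a record by calculating where it starts."""
    start = index * RECORD_WIDTH
    record = data[start:start + RECORD_WIDTH]
    name = record[:NAME_WIDTH].rstrip()      # remove padding
    age = int(record[NAME_WIDTH:].strip())
    return name, age

third = get_record(data, 2)
print(third)
```

Because every record has the same length, the third record is found by arithmetic alone, without reading the first two.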
What are the benefits and drawbacks of XML?
Benefits: Platform independent so can be used on any system, supports Unicode so can cope with data in other alphabets and symbols, can be displayed in a GUI using HTML
Drawbacks: Requires a series of complex tags to store the data making files large
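A small sketch of reading XML with Python's standard xml.etree.ElementTree module. The tag names and values are invented for the example; comparing the length of the file with the length of the values it holds gives a feel for how much space the tags take up.

```python
import xml.etree.ElementTree as ET

# Hypothetical record: every value is wrapped in an opening and closing tag.
xml_text = (
    "<students>"
    "<student><name>Alice</name><age>17</age></student>"
    "</students>"
)

root = ET.fromstring(xml_text)
student = root.find("student")
name = student.find("name").text
age = int(student.find("age").text)

# Compare the size of the whole document with the size of the values alone.
total_length = len(xml_text)
value_length = len(name) + len(str(age))
print(name, age, total_length, value_length)
```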
What are the benefits and drawbacks of JSON?
Benefits: Compact format based on JavaScript that can be used on most systems, good and reliable for website-to-website data transfer, many programming languages support JSON
Drawbacks: No error handling for JSON calls, security issues with data transfers if it is being hosted on a vulnerable website so it is open to hacking
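A minimal sketch using Python's standard json module. The record is made up, and safe_parse is a hypothetical helper: it illustrates that the program receiving JSON has to check for badly formed data itself, since the format provides no error handling of its own.

```python
import json

# Hypothetical record to send between systems.
record = {"name": "Alice", "age": 17, "subjects": ["Maths", "Computing"]}

json_text = json.dumps(record)   # convert to a compact JSON string
restored = json.loads(json_text) # convert back into Python data

def safe_parse(text):
    """Return the parsed data, or None if the text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None

# A transfer that was cut off part-way through.
broken = safe_parse('{"name": "Alice", "age": ')
print(restored == record, broken)
```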
What are the differences between file-based directory structure and dictionary-based data structures?
File-based: defines the structure and types of data stored, each application defines and manages own data, data should be consistently structured to maintain access to different processes, different file formats can make data incompatible, data can be duplicated, analysing data can become complex since formats need to be rationalised.
Dictionary-based: typically hierarchical, easier to locate data, a set of keys each mapping to a value, keys are ordered making the structure more logical and searching more efficient.
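The dictionary-based idea can be sketched with a Python dict. The student IDs and fields are invented for the example, and a Python dict is only one possible implementation of a key-to-value structure.

```python
# Hypothetical hierarchical data: each key leads to a nested set of keys and values.
students = {
    "S001": {"name": "Alice", "age": 17},
    "S002": {"name": "Bob", "age": 18},
}

# Data is located directly by key rather than by searching through a file.
bob_age = students["S002"]["age"]

# Asking for a key that does not exist can be handled safely.
missing = students.get("S999")

# Keys can be listed in order.
ordered_keys = sorted(students)
print(bob_age, missing, ordered_keys)
```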
What are the stages of data wrangling?
Discovery: becoming familiar with the data and understanding it so patterns can be identified
Structure: data from different sources restructured into a single format, with additional items of no value removed
Clean: errors identified and fixed, outliers, null values and duplicates removed, format standardised, typos fixed, measurements standardised and data validated
Enrich: data from internal or third-party sources added to fill gaps and enhance the set
Validate: data checked for quality, consistency, accuracy, security and authenticity
Output: data ready to be published and used
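The stages above can be sketched as a tiny Python pipeline. Everything here is hypothetical: the raw records, the second data source used for enrichment, the height range used for validation, and the helper names. Real wrangling tools differ, and the discovery stage (inspecting the raw data) is left out.

```python
# Made-up raw data with inconsistent names, mixed units, a duplicate and a null value.
raw = [
    {"Name": " alice ", "Height": "168cm"},
    {"Name": "Bob", "Height": "1.80m"},
    {"Name": "Bob", "Height": "1.80m"},
    {"Name": "Cara", "Height": None},
]
year_groups = {"Alice": 12, "Bob": 13}  # hypothetical second source used to enrich

def to_cm(height):
    """Standardise a height given in 'cm' or 'm' to centimetres."""
    if height.endswith("cm"):
        return float(height[:-2])
    return float(height[:-1]) * 100  # assumes the only other unit is metres

def wrangle(records):
    # Structure: put every record into one consistent layout.
    rows = [{"name": r["Name"], "height": r["Height"]} for r in records]

    # Clean: drop nulls and duplicates, tidy names, standardise measurements.
    cleaned, seen = [], set()
    for row in rows:
        if row["height"] is None:
            continue
        name = row["name"].strip().title()
        height_cm = round(to_cm(row["height"]))
        if (name, height_cm) in seen:
            continue
        seen.add((name, height_cm))
        cleaned.append({"name": name, "height_cm": height_cm})

    # Enrich: fill in extra detail from another source.
    for row in cleaned:
        row["year_group"] = year_groups.get(row["name"])

    # Validate: check values fall in a sensible (assumed) range.
    for row in cleaned:
        if not 50 <= row["height_cm"] <= 250:
            raise ValueError("height out of range: " + row["name"])

    # Output: the data is ready to be published and used.
    return cleaned

result = wrangle(raw)
print(result)
```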
What are the core functions of a data system?
Input, Search, Save, Integrate, Organise, Output, Feedback loop
Why is maintaining data important?
Making sure data entry is accurate is not enough; regular checks need to be carried out to update important data, e.g. by contacting customers regularly to check their data is correct
How can data be visualised?
Graphs and charts: help clarify data, but the scales and choice of graph can be confusing and misleading
Data tables: provide rapid and easy access so stakeholders can compare information, but they need to be labelled accurately
Reports: show data in an accessible format to assess the performance of a business, but a consistent format is needed so comparisons can be made
Infographics: represent data graphically with minimal text for an easy-to-see overview, but they can sometimes lack detail
What is the purpose of business intelligence software?
To retrieve and analyse data to inform decision making. The applications can provide information that a business can use to inform long-term strategic decisions.
What is the purpose of financial planning and analysis?
To support the financial aspects of the business including financial planning, setting budgets and forecasting future performance including profits