Week 5 - data and databases Flashcards
Big data
- an evolving area of research and application concerned with very large data sets
- different approaches are required for efficient analysis
What does data have to be before analysis?
- cleaned (removal of any errors or inconsistencies)
- prepared (transformed into a format that can be analysed)
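A minimal sketch of the two steps above, assuming a small list of hypothetical records (the field names, values and the helper functions `clean` and `prepare` are illustrative, not from the course material). Cleaning drops an invalid age and a duplicate name; preparing converts the text fields into typed values ready for analysis.

```python
# Hypothetical raw records; the field names and values are illustrative only.
raw_records = [
    {"name": " Alice ", "age": "34"},
    {"name": "Bob", "age": "not known"},  # error: age is not a number
    {"name": "alice", "age": "34"},       # inconsistency: duplicate of Alice
    {"name": "Carol", "age": "29"},
]


def clean(records):
    """Remove records with errors and resolve inconsistencies."""
    seen = set()
    cleaned = []
    for record in records:
        name = record["name"].strip().title()  # normalise spacing and case
        age = record["age"].strip()
        if not age.isdigit():  # drop records whose age is not a whole number
            continue
        if name in seen:  # drop records that repeat an earlier name
            continue
        seen.add(name)
        cleaned.append({"name": name, "age": age})
    return cleaned


def prepare(records):
    """Transform cleaned records into (name, age) tuples with a numeric age."""
    return [(r["name"], int(r["age"])) for r in records]


prepared = prepare(clean(raw_records))
print(prepared)  # [('Alice', 34), ('Carol', 29)]
```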
Techniques to extract information for data analyses
- identifying patterns
- statistics
- building predictive models
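As a rough illustration of these techniques, the sketch below uses made-up observations (hours studied and test scores are assumptions for the example). It summarises the data with a mean, exposes the pattern as a slope, and fits a least-squares straight line as a very simple predictive model; real analyses would normally use larger data and dedicated libraries.

```python
from statistics import mean

# Hypothetical observations: hours studied (x) and test score (y).
hours = [1, 2, 3, 4, 5]
scores = [52, 54, 56, 58, 60]

# Statistics: summarise each variable with its mean.
mean_hours = mean(hours)
mean_scores = mean(scores)

# Pattern: how much the score changes per extra hour (least-squares slope).
slope = sum(
    (x - mean_hours) * (y - mean_scores) for x, y in zip(hours, scores)
) / sum((x - mean_hours) ** 2 for x in hours)
intercept = mean_scores - slope * mean_hours


def predict(x):
    """Predictive model: estimate a score for an unseen number of hours."""
    return intercept + slope * x


print(slope, intercept)  # 2.0 50.0
print(predict(6))        # 62.0
```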
Types of knowledge
Explicit - easy to communicate and share, often in written or other recorded forms
Implicit - embedded in people's skills, experiences and intuition
How knowledge can be contextually applied
- information and decision making
- problem solving
- product development
- service delivery
Database definition
An organised collection of data, which can be physical or electronic
Flat file databases
- typically individual files
- some formats are unstructured and volatile
- some formats may be more technical but offer greater structure
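To illustrate the difference in structure, the sketch below contrasts a free-text note with a CSV file, one common flat file format. The contents are invented for the example, and `io.StringIO` stands in for a file on disk so the block is self-contained.

```python
import csv
import io

# Unstructured flat file: free text with no agreed fields.
notes_text = "Alice ordered 2 widgets. Bob ordered one gadget yesterday."

# More structured flat file: CSV with a header row naming each field.
orders_csv = "customer,item,quantity\nAlice,widget,2\nBob,gadget,1\n"

# The header row lets each record be read as a dictionary keyed by field name.
rows = list(csv.DictReader(io.StringIO(orders_csv)))
total = sum(int(row["quantity"]) for row in rows)

print(rows[0]["customer"], total)  # Alice 3
```

Answering "how many items were ordered?" from `notes_text` would need custom text parsing, whereas the CSV fields can be read directly.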
Relational databases
- intended to address issues around data redundancy, data silos and data complexity
- strong theoretical foundation, maturity and wide support
How is data stored in relational databases
- columns - individual items of data
- rows - a record containing all relevant attributes
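A small sketch of rows and columns, assuming SQLite (via Python's built-in `sqlite3` module) as the relational database and a hypothetical `student` table. Each column holds one item of data; each row is one complete record.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Columns: id, name and course each hold one item of data.
conn.execute(
    "CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, course TEXT)"
)

# Rows: each tuple is one record containing all of its attributes.
conn.executemany(
    "INSERT INTO student (id, name, course) VALUES (?, ?, ?)",
    [(1, "Alice", "Computing"), (2, "Bob", "Business")],
)

cursor = conn.execute("SELECT id, name, course FROM student ORDER BY id")
columns = [description[0] for description in cursor.description]
rows = cursor.fetchall()
conn.close()

print(columns)  # ['id', 'name', 'course']
print(rows[0])  # (1, 'Alice', 'Computing')
```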
Entity relationship diagram
- a model produced following the systematic analysis of business processes
- helps to define and describe any created or required data
- identifies and describes any relationships between objects and things within the real world
Domain definition
The real world context in which entities exist
Entity definition
An object or something which exists independently
Relationship definition
How one entity is related to another within the domain
Cardinality definition
A numerical representation of the nature of the relationship between entities, which can be shown visually
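One way to picture the terms above is a one-to-many relationship. The sketch assumes a hypothetical shop domain with two entities, `customer` and `purchase`, stored in SQLite; the table and column names are invented for illustration. One customer can have many purchases, and each purchase must belong to exactly one existing customer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked

# Entity: customer (the "one" side of the relationship).
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# Entity: purchase (the "many" side); customer_id records the relationship.
conn.execute(
    """
    CREATE TABLE purchase (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(id),
        item TEXT NOT NULL
    )
    """
)

conn.execute("INSERT INTO customer (id, name) VALUES (1, 'Alice')")
conn.executemany(
    "INSERT INTO purchase (customer_id, item) VALUES (?, ?)",
    [(1, "widget"), (1, "gadget")],
)

# One customer, many purchases.
purchase_count = conn.execute(
    "SELECT COUNT(*) FROM purchase WHERE customer_id = 1"
).fetchone()[0]
print(purchase_count)  # 2

# A purchase pointing at a customer that does not exist is rejected.
try:
    conn.execute("INSERT INTO purchase (customer_id, item) VALUES (99, 'orphan')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)  # True
```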
Database management systems
Software for managing and storing data in a structured and efficient way
Structured query language
A domain-specific programming language, used to manage and work with data stores
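A brief sketch of SQL in use, run through Python's built-in `sqlite3` module (SQLite is one example of a database management system; the `sale` table and its values are assumptions for the example). The SQL statements create a table, add data, summarise it and change it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# CREATE TABLE and INSERT: defining a structure and adding data.
conn.execute("CREATE TABLE sale (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sale (region, amount) VALUES (?, ?)",
    [("North", 100), ("South", 250), ("North", 50)],
)

# SELECT with GROUP BY: asking a question of the data.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM sale GROUP BY region ORDER BY region"
).fetchall()
print(totals)  # [('North', 150), ('South', 250)]

# UPDATE: changing stored data that matches a condition.
conn.execute("UPDATE sale SET amount = amount + 10 WHERE region = 'South'")
south_total = conn.execute(
    "SELECT SUM(amount) FROM sale WHERE region = 'South'"
).fetchone()[0]
print(south_total)  # 260
```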