Unit 4 Flashcards
Come on!!! (10 cards)
Data analytics data types
Structured: adheres to a pre-defined model and is straightforward to analyze, e.g. Excel and SQL
Semi-structured: contains tags to separate semantic elements, e.g. JSON and XML
Unstructured: ambiguous, with no pre-defined model, e.g. audio and video; often kept in NoSQL stores
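A rough sketch of the three types, using only Python's standard library. The sample records and the four-byte blob are made up for illustration.

```python
import csv
import io
import json

# Structured: every row follows the same pre-defined columns.
csv_text = "id,name,age\n1,Ana,30\n2,Ben,25\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))

# Semi-structured: keys label each element, but records can differ in shape.
json_text = '[{"id": 1, "name": "Ana", "tags": ["vip"]}, {"id": 2, "name": "Ben"}]'
records = json.loads(json_text)

# Unstructured: raw bytes with no built-in model (stand-in for audio or video).
blob = bytes([0x52, 0x49, 0x46, 0x46])

print(rows[0]["name"])       # every row has a "name" column
print("tags" in records[1])  # the second record simply lacks "tags"
print(len(blob))             # all we know without decoding is the size
```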
Metadata
Data about data, e.g. information about when and where a picture was taken
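The same idea applies to any file, not only pictures. A minimal sketch, assuming a throwaway temporary file: the bytes written are the data, and what `os.stat` reports (size, modification time) is metadata about them.

```python
import os
import tempfile
from datetime import datetime, timezone

# The file's contents are the data.
with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as f:
    f.write(b"hello")
    path = f.name

# What the file system records about the file is metadata.
info = os.stat(path)
metadata = {
    "size_bytes": info.st_size,
    "modified_utc": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc).isoformat(),
}
print(metadata)

os.remove(path)  # clean up the temporary file
```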
Data value chain?
End-to-end process of creating value from data (the data lifecycle)
Steps of data value chain:
Data Acquisition: gathering and filtering data before it is stored (usually in a data warehouse); needs to be flexible
Data Analysis: makes raw data amenable to decision making by extracting the relevant information
Data Curation: manages data accuracy and quality
Data Storage: persistence and management of data in a scalable way; has traditionally used ACID-compliant relational databases and now also uses NoSQL
Data Usage: how businesses use data, analysis, and software to make decisions
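The five steps can be pictured as a toy pipeline. This is only a sketch: the function names, the sample records, and the rule "drop negative amounts" are all hypothetical, and a plain dict stands in for a real database.

```python
import json

def acquire(raw_lines):
    # Acquisition: gather records and filter out ones that cannot be parsed.
    parsed = []
    for line in raw_lines:
        try:
            parsed.append(json.loads(line))
        except json.JSONDecodeError:
            continue
    return parsed

def analyze(records):
    # Analysis: extract only the fields relevant to the decision.
    return [(r["customer"], r["amount"]) for r in records]

def curate(pairs):
    # Curation: enforce accuracy (hypothetical rule: amounts cannot be negative).
    return [(c, a) for c, a in pairs if a >= 0]

def store(pairs):
    # Storage: persist the data (a dict stands in for a database here).
    db = {}
    for customer, amount in pairs:
        db.setdefault(customer, []).append(amount)
    return db

def use(db):
    # Usage: make a decision, e.g. pick the customer with the highest total.
    return max(db, key=lambda c: sum(db[c]))

raw_lines = [
    '{"customer": "a", "amount": 10}',
    "not json",
    '{"customer": "b", "amount": -5}',
    '{"customer": "a", "amount": 7}',
]
db = store(curate(analyze(acquire(raw_lines))))
best = use(db)
print(db, best)
```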
Which step of the data value chain is the biggest infrastructure challenge and needs low, predictable latency?
Data acquisition
Big data
Blanket term for how large datasets that can’t be stored on a single computer are gathered…
Characteristics of big data
Volume
Velocity
Variety
Veracity: how accurate and credible the data is
Value: getting something useful out of the data
Latency
Time it takes for a packet of data to travel from its source to its
destination
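One-way latency is hard to time directly, so a common approximation is to measure a round trip and halve it. A minimal sketch over a local socket pair (no real network involved, so the number will be tiny and will vary per run):

```python
import socket
import time

# Two connected local sockets stand in for a source and a destination.
source, destination = socket.socketpair()

start = time.perf_counter()
source.sendall(b"ping")        # packet leaves the source
data = destination.recv(4)     # packet arrives at the destination
destination.sendall(data)      # destination echoes it back
echo = source.recv(4)          # echo arrives back at the source
round_trip = time.perf_counter() - start

source.close()
destination.close()

# Rough one-way estimate: half the round-trip time.
one_way_estimate = round_trip / 2
print(f"round trip: {round_trip * 1e6:.1f} microseconds")
```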
Data warehouse
Data flows into a data warehouse from transactional systems,
relational databases, and other sources, typically on a regular
schedule.
Data lake
A repository that can store data in its native format and process any
variety of it, regardless of size.
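A small sketch of the contrast between the last two cards. It assumes an in-memory SQLite table as a stand-in for a warehouse and a dict of raw bytes as a stand-in for a lake; the table name, file paths, and records are invented for illustration.

```python
import json
import sqlite3

# Warehouse-style: data is loaded into a table with a fixed schema.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE sales (customer TEXT, amount REAL)")
warehouse.executemany(
    "INSERT INTO sales VALUES (?, ?)", [("a", 10.0), ("b", 4.5)]
)
total = warehouse.execute("SELECT SUM(amount) FROM sales").fetchone()[0]

# Lake-style: objects are kept as raw bytes in whatever format they arrived in.
lake = {
    "raw/sales.json": json.dumps([{"customer": "a", "amount": 10.0}]).encode(),
    "raw/note.txt": "call b back".encode(),
    "raw/clip.wav": bytes([0x52, 0x49, 0x46, 0x46]),
}

# Structure is applied only when an object is read back.
sales = json.loads(lake["raw/sales.json"])

print(total, sales, sorted(lake))
warehouse.close()
```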