Exam 1 Flashcards
To cover Data Fundamentals (23 cards)
What are the steps to make data useful?
Data Acquisition, Data Modeling, Extraction
Name and Describe Data Provisioning
Data Provisioning is the process of providing users and systems with access to data. This includes the security authorizations that limit access to only the data the user or system is officially permitted to view.
Replication
Data is copied from the source and transferred to the analysis system. This is done to keep the source data intact. It is done in real time or in batches.
Structured Data
Structured data is computer readable and usable. Examples: databases, spreadsheets, flat files. It uses specific data types. Metadata is data about the data (meaning, context, purpose).
Create, Read, Update and Delete (CRUD)
Tables: Columns and Rows
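A minimal sketch of the four CRUD operations against one table, using Python's built-in sqlite3 module. The `customers` table and its columns are hypothetical, chosen only for illustration.

```python
import sqlite3

# In-memory database; the "customers" table is a made-up example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")

# Create: add a new row.
conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Ada')")

# Read: fetch rows without changing them.
after_create = conn.execute("SELECT id, name FROM customers").fetchall()

# Update: change a value in an existing row.
conn.execute("UPDATE customers SET name = 'Ada L.' WHERE id = 1")
after_update = conn.execute(
    "SELECT name FROM customers WHERE id = 1"
).fetchone()[0]

# Delete: remove the row.
conn.execute("DELETE FROM customers WHERE id = 1")
after_delete = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]

conn.close()
```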
Unstructured Data
Unstructured data does not conform to a data model and/or has no associated metadata. Examples: pictures, audio, video, tweets, reviews.
Relational Databases
Relationships between tables
Each row has a unique id called a primary key.
Connect linked tables by storing one table's primary key in another table, where it is called a foreign key.
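A small sketch of a primary key and a foreign key, assuming two hypothetical tables, `customers` and `orders`. It uses sqlite3, where foreign keys are only enforced after the `PRAGMA foreign_keys = ON` statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# customer_id is the primary key: one unique id per row.
conn.execute(
    "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT)"
)
# orders.customer_id is a foreign key pointing at customers.customer_id.
conn.execute(
    """CREATE TABLE orders (
           order_id INTEGER PRIMARY KEY,
           customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
           item TEXT)"""
)

conn.execute("INSERT INTO customers VALUES (1, 'Ada')")
conn.execute("INSERT INTO orders VALUES (100, 1, 'Laptop')")  # customer 1 exists

# An order for a customer that does not exist is rejected.
try:
    conn.execute("INSERT INTO orders VALUES (101, 99, 'Mouse')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
conn.close()
```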
CRUD Anomalies
Read - Does not create an anomaly
Create anomalies - repeating data already stored, combining data, possibly creating unstructured data
Update anomalies - storing the same data in many different places
Delete anomalies - deleting a row of data that affects another table's data
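To illustrate an update anomaly, here is a sketch with a hypothetical `orders_flat` table that repeats the customer's city on every order. Updating only one of the copies leaves the data contradicting itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical flat table: the city is stored again on every order row.
conn.execute(
    "CREATE TABLE orders_flat (order_id INTEGER PRIMARY KEY, customer TEXT, city TEXT)"
)
conn.executemany(
    "INSERT INTO orders_flat VALUES (?, ?, ?)",
    [(1, "Ada", "Boston"), (2, "Ada", "Boston")],
)

# The customer moves, but only one of the two copies is updated.
conn.execute("UPDATE orders_flat SET city = 'Denver' WHERE order_id = 1")

# The same customer now appears to live in two cities.
cities = conn.execute(
    "SELECT DISTINCT city FROM orders_flat WHERE customer = 'Ada' ORDER BY city"
).fetchall()
conn.close()
```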
Describe Normalization
Normalization is the process of decomposing a database into more tables until the database is no longer susceptible to anomalies.
Most common normal forms
- First Normal Form (1NF)
- Second Normal Form (2NF)
- Third Normal Form (3NF) (industry standard)
1NF
Each table cell should contain a single value
Each record needs to be unique
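A rough sketch of the single-value rule, assuming a hypothetical `phones` column that packs several numbers into one cell. Splitting that cell into one row per phone number gives each cell a single value.

```python
# Hypothetical rows: the "phones" cell holds more than one value.
rows = [
    {"customer": "Ada", "phones": "555-0100, 555-0101"},
    {"customer": "Grace", "phones": "555-0200"},
]

# One row per (customer, phone) pair, so every cell holds a single value.
first_normal_form = [
    {"customer": row["customer"], "phone": phone.strip()}
    for row in rows
    for phone in row["phones"].split(",")
]

for record in first_normal_form:
    print(record)
```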
2NF
Rule 1- Be in 1NF
Rule 2- Single-column primary key (so no non-key column can depend on only part of the key)
3NF
Rule 1- Be in 2NF
Rule 2- Has no transitive functional dependencies
What is a transitive functional dependency?
A transitive functional dependency is when changing a non-key column might cause any of the other non-key columns to change.
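One way to picture removing a transitive dependency: a department's phone number depends on the department, not on the employee. In this sketch the hypothetical `departments` and `employees` tables keep the phone number in one place, so a single update reaches every employee.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# dept_phone lives with the department, not on each employee row.
conn.execute(
    "CREATE TABLE departments (dept_id INTEGER PRIMARY KEY, dept_name TEXT, dept_phone TEXT)"
)
conn.execute(
    """CREATE TABLE employees (
           emp_id INTEGER PRIMARY KEY,
           name TEXT,
           dept_id INTEGER REFERENCES departments(dept_id))"""
)
conn.execute("INSERT INTO departments VALUES (10, 'Sales', '555-0100')")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(1, "Ada", 10), (2, "Grace", 10)],
)

# The phone number changes in exactly one row.
conn.execute("UPDATE departments SET dept_phone = '555-0199' WHERE dept_id = 10")

# Every employee sees the new number through the join.
phones = conn.execute(
    """SELECT e.name, d.dept_phone
       FROM employees e
       JOIN departments d ON d.dept_id = e.dept_id
       ORDER BY e.emp_id"""
).fetchall()
conn.close()
```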
What are some examples of tagged data?
XML, HTML, and JSON are examples of tagged data.
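A brief sketch of reading the same made-up record from JSON and from XML with Python's standard library. The `customer`, `name`, and `city` tags are illustrative only.

```python
import json
import xml.etree.ElementTree as ET

# The same hypothetical record in two tagged formats.
json_text = '{"customer": {"name": "Ada", "city": "Boston"}}'
xml_text = "<customer><name>Ada</name><city>Boston</city></customer>"

# The tags (keys or element names) tell the program what each value means.
city_from_json = json.loads(json_text)["customer"]["city"]
city_from_xml = ET.fromstring(xml_text).findtext("city")

print(city_from_json, city_from_xml)
```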
What is AI?
The Turing test is a test of a machine's ability to exhibit intelligent behavior equivalent to, or indistinguishable from, that of a human.
Define Natural Language processing
NLP translates human voice and language into computer readable text using programming languages.
Examples:
- Speech recognition
- Sentiment Analysis
Describe Transactional systems (TLP)
TLP systems store and process the business data required for each of the business transaction cycles (OLTP)
- Sales
- Inventory
Designed to process transactions quickly, reliably, and accurately
OLTP
Three tiered architecture
- User interface
- Business Logic
- Data Services
Informational Systems
Informational Systems provide a place for data to be stored and prepared for analytical purposes.
- Data driven decisions
- Read only
- Data extracted from OLTP systems
- Referred to as Online Analytical Processing (OLAP)
Compare OLTP and OLAP
OLTP
Level of Detail – Very detailed
Updatable – Yes
Speed – Quick on writing
Current – Must be current
Requirements – Must be known
OLAP
Level of Detail – Summarized data
Updatable – Read only
Speed – Quick on reading
Current – Depends on extract
Requirements – Ambiguous
Compare OLTP and OLAP Continued
OLTP
Historical Data – not normally kept
Data – sometimes compartmentalized
Availability – Needs to be 100%
OLAP
Historical Data – kept for analysis
Information – shared normally across company
Availability – Needs to be 100%
What is an inner join?
The INNER JOIN selects all rows from both participating tables as long as there is a match between the columns.
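A minimal sketch of an inner join, assuming hypothetical `customers` and `orders` tables. Only rows with a matching `customer_id` in both tables come back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, item TEXT)"
)
# Grace has no orders; order 101 points at a customer that is not in the table.
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ada"), (2, "Grace")])
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(100, 1, "Laptop"), (101, 3, "Mouse")],
)

# INNER JOIN keeps only the rows where the join columns match.
inner_rows = conn.execute(
    """SELECT c.name, o.item
       FROM customers c
       INNER JOIN orders o ON o.customer_id = c.customer_id
       ORDER BY c.customer_id"""
).fetchall()
conn.close()
```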
What is an outer join?
The SQL OUTER JOIN returns all rows from both the participating tables which satisfy the join condition along with rows which do not satisfy the join condition.
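A sketch of one kind of outer join, the LEFT OUTER JOIN, on hypothetical `customers` and `orders` tables. Every row from the left table is returned, and the columns from the right table are NULL (`None` in Python) where no match exists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, item TEXT)"
)
# Grace has no orders.
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ada"), (2, "Grace")])
conn.execute("INSERT INTO orders VALUES (100, 1, 'Laptop')")

# LEFT OUTER JOIN keeps every customer, matched or not.
outer_rows = conn.execute(
    """SELECT c.name, o.item
       FROM customers c
       LEFT OUTER JOIN orders o ON o.customer_id = c.customer_id
       ORDER BY c.customer_id"""
).fetchall()
conn.close()
```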
Pivot Tables and Terms related
Dashboards
Aggregate
Conditional Formatting
Sorting
- ABC
- Numerical
- Chronological
Filtering
- Label
- Value
Ranking
- Top N or Bottom N
- Top % or Bottom %
- Combined
Calculations
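The pivot table terms above (aggregate, sort, filter by value, rank Top N) can be sketched in plain Python. The sales figures and region names are made up for illustration.

```python
from collections import defaultdict

# Hypothetical sales rows: (region, product, amount).
sales = [
    ("East", "Laptop", 1200),
    ("West", "Laptop", 800),
    ("East", "Mouse", 50),
    ("West", "Mouse", 75),
    ("North", "Laptop", 400),
]

# Aggregate: total amount per region.
totals = defaultdict(int)
for region, _product, amount in sales:
    totals[region] += amount

# Sorting (numerical, largest first).
ranked = sorted(totals.items(), key=lambda pair: pair[1], reverse=True)

# Ranking: Top N with N = 2.
top_2 = ranked[:2]

# Filtering by value: keep regions with totals above 500.
over_500 = [region for region, total in ranked if total > 500]

print(top_2, over_500)
```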