Lesson 7 Flashcards

1
Q

How do you verify data quality? (3 checks)

A

DOCUMENTATION

– CONTENT:
- Entity (what is the data about?)
- Property
- Measurement (units), type of variable
- Time

– TECHNICAL
- Abbreviations, codes
- Program code for data set creation, conversion

TRUTHFULNESS
- verify from other sources
- plausibility

COMPLETENESS

– TALL:
- missing observations?
– WIDE:
- encoding of missing values (NaN, 0, ...)
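
The two completeness checks above can be sketched in plain Python. This is only an illustration: the records, the column name `gdp`, and the choice of `0.0` as a possible missing-value code are assumptions, not part of the lesson.

```python
import math

# Hypothetical sample: one record per (country, year); all names are illustrative.
records = [
    {"country": "A", "year": 2020, "gdp": 1.2},
    {"country": "A", "year": 2021, "gdp": math.nan},
    {"country": "B", "year": 2020, "gdp": 0.0},
    # ("B", 2021) is absent entirely
]

def missing_observations(rows, countries, years):
    """Tall check: which expected (country, year) rows are absent?"""
    present = {(r["country"], r["year"]) for r in rows}
    return [(c, y) for c in countries for y in years if (c, y) not in present]

def suspicious_values(rows, column, sentinels=(0.0,)):
    """Wide check: rows whose value is NaN or equals an assumed missing-value code."""
    flagged = []
    for r in rows:
        v = r[column]
        if (isinstance(v, float) and math.isnan(v)) or v in sentinels:
            flagged.append((r["country"], r["year"]))
    return flagged

absent = missing_observations(records, ["A", "B"], [2020, 2021])
flagged = suspicious_values(records, "gdp")
```

A flagged `0.0` is not necessarily missing; whether it is a real zero or a code for "missing" has to come from the documentation.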

2
Q

How do you preserve data quality?

A

CONVERSION:

– DATA TRANSFORMATION
- convert units
- aggregate

MERGING:
- the ‘key’ is key: match records on a reliable identifier

– DATA CLEANING
- limit the scope of the analysis (focus on the scope!)
- check that values fall within a realistic/possible range
- check the origin of the data
- eliminate outliers
- eliminate border observations
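
A minimal sketch of the three steps above (conversion, merging, cleaning), using only plain dictionaries. The person ids, the two tables, and the 100 to 250 cm bounds are hypothetical choices for illustration.

```python
# Hypothetical inputs: heights in inches keyed by person id, and a second table with ages.
heights_in = {"p1": 70.0, "p2": 5.9, "p3": 65.0}
ages = {"p1": 34, "p2": 41, "p4": 29}

INCH_TO_CM = 2.54

# Conversion: change the unit once, in one place.
heights_cm = {pid: h * INCH_TO_CM for pid, h in heights_in.items()}

# Merging: join on the key, and record which keys did not match.
common = sorted(heights_cm.keys() & ages.keys())
unmatched = sorted(heights_cm.keys() ^ ages.keys())
merged = {pid: {"height_cm": heights_cm[pid], "age": ages[pid]} for pid in common}

# Cleaning: keep values inside an assumed plausible range (illustrative bounds).
LOW_CM, HIGH_CM = 100.0, 250.0
clean = {
    pid: row for pid, row in merged.items()
    if LOW_CM <= row["height_cm"] <= HIGH_CM
}
```

Keeping `unmatched` around, rather than silently dropping those keys, makes it easier to see how much data the merge lost.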

3
Q

How can we treat missing data?

A

ELIMINATE: remove horizontally (rows) or vertically (columns); no statement can then be made about the missing values

IMPUTE: make a statement based on other variables (estimation)

INTERPOLATE: make a statement based on the same variable. OK in cross-section, NOT OK in time

IMPROVE ECONOMETRICS: other methods, different from interpolation
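
The first three treatments can be contrasted on a toy example. This is a sketch under stated assumptions: the `income` and `education` values are invented, imputation is done with a simple least-squares line on one other variable, and interpolation takes the midpoint of the two neighbouring observations of the same variable.

```python
# Hypothetical data: income has one gap (None); education is fully observed.
income = [10.0, None, 14.0, 20.0]
education = [8, 10, 12, 14]
gap = income.index(None)

# Eliminate: drop the observation with the missing value.
kept = [(e, y) for e, y in zip(education, income) if y is not None]

# Impute: estimate the gap from another variable (least-squares line on education).
n = len(kept)
mean_x = sum(e for e, _ in kept) / n
mean_y = sum(y for _, y in kept) / n
slope = (
    sum((e - mean_x) * (y - mean_y) for e, y in kept)
    / sum((e - mean_x) ** 2 for e, _ in kept)
)
intercept = mean_y - slope * mean_x
imputed = intercept + slope * education[gap]

# Interpolate: estimate the gap from the same variable (midpoint of its neighbours).
interpolated = (income[gap - 1] + income[gap + 1]) / 2
```

The two estimates differ here (about 12.57 versus 12.0), which shows that the choice of treatment is itself an assumption that should be documented.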

4
Q

What is meant by an audit trail?

A

Documenting the entire process from the original data to the final result.
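
One possible way to keep such a trail in code is to log every processing step together with a fingerprint of its input and output. The helper names `fingerprint` and `apply_step`, and the sample rows, are hypothetical; this is a sketch, not a standard tool.

```python
import hashlib
import json

def fingerprint(data):
    """Short, stable hash of a JSON-serialisable dataset."""
    blob = json.dumps(data, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()[:12]

def apply_step(data, trail, description, func):
    """Run one processing step and append what happened to the trail."""
    result = func(data)
    trail.append({
        "step": description,
        "input": fingerprint(data),
        "output": fingerprint(result),
        "rows_in": len(data),
        "rows_out": len(result),
    })
    return result

trail = []
raw = [
    {"id": 1, "height_cm": 178},
    {"id": 2, "height_cm": 15},
    {"id": 3, "height_cm": 165},
]
cleaned = apply_step(
    raw, trail, "drop heights below 100 cm",
    lambda rows: [r for r in rows if r["height_cm"] >= 100],
)
```

Each entry in `trail` then records what was done and how many rows it affected, so the path from the original data to the final result can be retraced.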

5
Q

6 dimensions of data quality

A
  1. Accuracy -> Data is accurate when it reflects reality. E.g. a person's height recorded as 15 cm is not accurate.
  2. Completeness -> Data is complete when all data required for a particular use is present.
  3. Uniqueness -> Data is unique if each entry appears only once within a dataset, without duplicates.
  4. Consistency -> Consistency is achieved when data values do not conflict within a record or across
  5. Timeliness -> Data is timely if it is available when expected and needed.
  6. Validity -> Data is valid if it conforms to the expected format, type, and range. E.g. the @ in an email address.
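
Some of these dimensions lend themselves to simple automated checks. The sketch below covers uniqueness, validity, and a range check as a rough proxy for accuracy; the sample rows, the deliberately loose email pattern, and the 50 to 250 cm bounds are assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical records with one duplicate id, one malformed email, one implausible height.
rows = [
    {"id": 1, "email": "ann@example.org", "height_cm": 172},
    {"id": 2, "email": "bob.example.org", "height_cm": 15},
    {"id": 2, "email": "bob@example.org", "height_cm": 181},
]

# Deliberately loose pattern: it only requires exactly one "@" with text on both sides.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+$")

# Uniqueness: ids that appear more than once.
duplicate_ids = sorted(k for k, n in Counter(r["id"] for r in rows).items() if n > 1)

# Validity: values that do not match the expected format.
invalid_emails = [r["email"] for r in rows if not EMAIL_RE.match(r["email"])]

# Accuracy (proxy): values outside an assumed plausible range.
implausible_heights = [r["height_cm"] for r in rows if not 50 <= r["height_cm"] <= 250]
```

A range check can only flag implausible values; confirming that a plausible value actually reflects reality still requires another source.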