Lecture 6 – Modelling Data Flashcards

1
Q

In what scenarios do we encounter or need to deal with temporal data?

A
• Data indexed with time or dates
• Data about change, transformation and occurrences
• Time series data

Temporal phrases
* Era: AD 2020, 20 Jan 2020 CE
* Calendar: Lunar, Hebrew, Chinese, etc.
* Time zone: 1pm AEST, 1pm UTC+10:00
* Submultiples: 13:00.001
* Years do not have the same number of days
* Months have different numbers of days
* It can be difficult to identify the day of the week, the day of the month and the week of the year
* Years and months start on different days
* Even specific time phrases can be very complicated to parse!
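As a rough illustration of the points above, here is a minimal Python sketch that parses one time phrase and derives the calendar facts that are hard to work out by hand. It uses only the standard library. The fixed UTC+10:00 offset standing in for AEST is an assumption (it ignores daylight saving), and the month-name parsing assumes an English locale.

```python
from datetime import datetime, timedelta, timezone
import calendar

# Assumption: AEST is modelled as a fixed UTC+10:00 offset (no daylight saving).
AEST = timezone(timedelta(hours=10), name="AEST")

# Parse a phrase with an explicit format, including the sub-second part.
phrase = "20 Jan 2020 13:00:00.001"
local = datetime.strptime(phrase, "%d %b %Y %H:%M:%S.%f").replace(tzinfo=AEST)

# The same instant expressed in UTC.
as_utc = local.astimezone(timezone.utc)

# Calendar facts derived from the parsed value.
weekday_name = calendar.day_name[local.weekday()]   # day of the week
iso_year, iso_week, iso_weekday = local.isocalendar()  # ISO week of the year
day_of_year = local.timetuple().tm_yday

print(local.isoformat(), "=", as_utc.isoformat())
print(weekday_name, "| ISO week", iso_week, "| day of year", day_of_year)
```

Stating the format string explicitly avoids guessing whether a phrase such as 01/02/2020 means 1 February or 2 January.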

Counting time:
* Time is not decimal
* Months and years have different numbers of days
* Be careful how you compare time elements
* Socially, not all time periods are the same
➡ weekends
➡ holidays
➡ pay periods
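The counting pitfalls above can be sketched in Python with the standard library. The helper `business_days` is a hypothetical name introduced here for illustration; it treats Monday to Friday as working days and ignores public holidays, which would need a separate, region-specific list.

```python
import calendar
from datetime import date, timedelta

# Months and years have different numbers of days.
days_feb_2020 = calendar.monthrange(2020, 2)[1]   # February in a leap year
days_feb_2021 = calendar.monthrange(2021, 2)[1]   # February in a common year
days_in_2020 = (date(2021, 1, 1) - date(2020, 1, 1)).days

# Time is not decimal: 90 minutes is 1 hour 30 minutes, not 0.9 hours.
hours, minutes = divmod(90, 60)

# Compare time elements as date objects, not as text.
text_says_later = "9/1/2020" > "10/1/2020"          # text comparison, misleading
date_says_earlier = date(2020, 1, 9) < date(2020, 1, 10)

def business_days(start, end):
    """Count Monday-Friday days in [start, end); holidays are not handled."""
    count = 0
    current = start
    while current < end:
        if current.weekday() < 5:  # 0-4 are Monday to Friday
            count += 1
        current += timedelta(days=1)
    return count

# One calendar week starting on Monday 20 Jan 2020.
week_span = business_days(date(2020, 1, 20), date(2020, 1, 27))

print(days_feb_2020, days_feb_2021, days_in_2020)
print(hours, minutes, text_says_later, date_says_earlier, week_span)
```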

2
Q

Statistical Modelling

A
• Models represent aspects of a scenario to help us understand it.
• Statistical models represent the relationships between variables
➡ Independent variable(s)
➡ Dependent variable
• A model can be used to make predictions about the dependent variable, given information about the independent variable(s)
• Rather than trying to use all the data about the scenario, the model reduces the data set to a low-dimensional summary.
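To make the idea of a low-dimensional summary concrete, here is a minimal sketch of one possible statistical model: a straight line fitted by ordinary least squares. The choice of a linear model, the function name `fit_line` and the data values are all assumptions made up for illustration; they do not come from the lecture.

```python
def fit_line(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up illustrative data: independent variable, then dependent variable.
hours_studied = [1, 2, 3, 4, 5]
exam_score = [52, 55, 61, 64, 68]

# Five data points are reduced to a two-number summary.
slope, intercept = fit_line(hours_studied, exam_score)

# Use the model to predict the dependent variable for a new input.
predicted = intercept + slope * 6

print(f"score = {intercept:.1f} + {slope:.1f} * hours")
print(f"predicted score after 6 hours: {predicted:.1f}")
```

The whole data set is summarised by just the slope and the intercept, which is what allows a prediction for an input that was never observed.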

3
Q

Causation & Correlation

A

Causation:
Causation indicates that one event is the result of another event occurring → cause and effect (e.g. it rains → the street is wet)

Correlation:
Correlation is a statistical measure that describes the strength and direction of a relationship between two or more variables
→ a correlation does not automatically mean that a change in one variable causes the change in the other.

!!!
Causation normally implies correlation, BUT correlation does not imply causation
!!!
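
A small sketch of the distinction, using the Pearson correlation coefficient as one common way to measure correlation (an assumption, since the card does not name a specific measure). The data are made up: two variables are both driven by temperature, so they correlate perfectly with each other even though neither causes the other.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient: sign gives direction, magnitude gives strength."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Made-up illustrative data: temperature is the common cause.
temperature = [18, 21, 24, 27, 30, 33]
ice_cream_sales = [2 * t + 5 for t in temperature]    # rises with temperature
sunburn_cases = [3 * t - 40 for t in temperature]     # rises with temperature
heating_bills = [100 - 2 * t for t in temperature]    # falls with temperature

# Strong positive correlation, yet ice cream does not cause sunburn.
r_positive = pearson(ice_cream_sales, sunburn_cases)

# Strong negative correlation: the direction of the relationship is reversed.
r_negative = pearson(ice_cream_sales, heating_bills)

print(round(r_positive, 3), round(r_negative, 3))
```

Both coefficients are at the extremes only because the toy data are exact linear functions of temperature; real data would give weaker values.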