Data Analysis Flashcards

(68 cards)

1
Q

Where data is compiled from

A

Data used in transport modelling is compiled from samples of the population

2
Q

Sampling Methods

A
  • Simple Random Sampling
  • Stratified Random Sampling

3
Q

Simple Random Sampling

A

Involves associating an identifier (number) with each unit in the population, then selecting numbers at random to obtain the sample
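As a rough illustration of the procedure, the sketch below numbers the units 1..N and draws identifiers at random without replacement. The function name, the population size and the seed are hypothetical choices, not taken from the lecture material.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Return sorted identifiers drawn without replacement from 1..population_size."""
    rng = random.Random(seed)                     # seeded only to make the draw repeatable
    identifiers = range(1, population_size + 1)   # one number per unit in the population
    return sorted(rng.sample(identifiers, sample_size))

# Hypothetical population of 1000 units, sample of 50.
sample = simple_random_sample(population_size=1000, sample_size=50, seed=1)
print(len(sample), sample[:5])
```

Every unit has the same chance of selection, which is what makes a small minority group likely to be under-represented in a modest sample.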

4
Q

Stratified Random Sampling

A

Population subdivided into homogeneous strata and then random samples taken from each of these groups
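A minimal sketch of the idea, assuming proportionate allocation (the same sampling fraction in every stratum). The household data, the `stratum_of` callback and the 10% fraction are hypothetical; a study might instead oversample small strata.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, fraction, seed=None):
    """Draw the same fraction at random from every stratum (at least one unit each)."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)     # group units into strata
    sample = {}
    for name, members in strata.items():
        k = max(1, round(fraction * len(members)))
        sample[name] = rng.sample(members, k)     # simple random sample within the stratum
    return sample

# Hypothetical population: 950 car-owning households and 50 without a car.
households = [(i, "car") for i in range(950)] + [(i, "no car") for i in range(950, 1000)]
picked = stratified_sample(households, stratum_of=lambda h: h[1], fraction=0.1, seed=1)
print({name: len(members) for name, members in picked.items()})
```

Because each stratum is sampled separately, the minority group is guaranteed to appear in the sample rather than being left to chance.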

5
Q

Problem with simple random sampling

A

A far too large sample would be required to ensure sufficient data is collected on minority groups

6
Q

Types of errors that can be introduced in sampling

A
  • Sampling Error
  • Sampling Bias

7
Q

Sampling Error

A

Error that arises because the sample is only a proportion of the population

8
Q

Sampling Bias

A

Caused by mistakes made either:

  1. when defining the population of interest
  2. when selecting the sampling method
9
Q

Equations

A

In lecture slide 6

10
Q

Type of errors

A
  • Errors in modelling and forecasting
  • measurement errors
  • sampling errors
  • specification errors
  • transfer errors
  • aggregation errors
11
Q

Errors in modelling and forecasting

A

ideal req is to find combo of model complexity and data accuracy which best fits required forecasting precision + study budget

12
Q

measurement errors

A

survey questions badly interpreted, answered badly, coding errors, etc, can cause these

13
Q

sampling errors

A

due to representation of population by finite data sets

equation in lecture 6

14
Q

specification errors

A

arise where phenomenon being modeled is not well understood, eg. irrelevant variable included in model or relevant variable is omitted

15
Q

transfer errors

A

arise if a model is transferred from one area to another

16
Q

aggregation errors

A

typically in models, forecasting done for groups of individuals but data is compiled on basis of responses of individuals

17
Q

type of info required by surveys

A
  • infrastructure eg. road network, public transport network
  • land use inventory eg. residential zones
  • O-D travel surveys eg. traffic counts
  • Socio-economic info eg. income, car ownership
18
Q

questionnaire design

A
  • keep qs simple + direct
  • divide into several sections

19
Q

roadside interviews

A

better method of estimating trip matrices than home interviews, as larger samples are available

20
Q

cordon surveys

A

provide useful info about external-external and external-internal trips

21
Q

screen-line surveys

A

divide area into large natural zones eg. at both sides of a river or motorway

22
Q

travel diary surveys

A
  • require similar but more detailed info than an O-D survey
  • diaries distributed to members of a HH, each asked to complete a diary for all travel during the day

23
Q

stated preference surveys

A

where travelers evaluate and rank set of hypothetical options

24
Q

longitudinal/time series collection methods

A
  • repeated cross sectional survey
  • similar measurements conducted on samples at diff times
  • individuals may be included in more than one survey
25
panel survey
similar measurements made on same sample at diff times
26
cohort survey
some individuals included for only a proportion of the survey
27
problems
  • panel surveys become unrepresentative as individuals age
  • may omit phenomena eg. children leaving home
  • typically higher rate of non-response
28
Accuracy
Overall estimate of errors present in measurements, including systematic effects. A set of observations is considered accurate if the mean of the observations is close to the true value
29
Precision
represents repeatability of a measurement + is concerned only with random errors. Good precision is obtained from a set of observations closely grouped together with small deviations from the mean of the observations. A set of observations spread out widely has poor precision
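To make the contrast with accuracy concrete, the sketch below compares two made-up sets of observations against a hypothetical true value of 100. The offset of the mean from the true value stands in for accuracy, and the sample standard deviation stands in for precision. All numbers and names are illustrative assumptions.

```python
from statistics import mean, stdev

def describe(observations, true_value):
    """Return (bias, spread): bias reflects accuracy, spread reflects precision."""
    bias = mean(observations) - true_value   # systematic offset from the true value
    spread = stdev(observations)             # sample standard deviation (random scatter)
    return bias, spread

true_flow = 100  # hypothetical true value, e.g. vehicles per hour
precise_but_inaccurate = [110, 111, 109, 110, 110]
accurate_but_imprecise = [80, 120, 95, 105, 100]

for label, obs in [("precise, inaccurate", precise_but_inaccurate),
                   ("accurate, imprecise", accurate_but_imprecise)]:
    bias, spread = describe(obs, true_flow)
    print(f"{label}: bias={bias:.1f}, spread={spread:.1f}")
```

The first set is tightly grouped but centred away from the true value; the second is centred on it but widely scattered.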
30
Mean
Sum of all data points divided by number of data points
31
Standard deviation
measure of spread or dispersion of set of measurements. If small, measurements have good precision.
32
Standard deviation equation
lecture slides 4&5
33
Standard error of mean (SEM)
Standard deviation of the mean. Estimates variability between samples, whereas standard deviation measures variability within a single sample
34
Differences between standard deviation and standard error of mean
  • SD quantifies scatter: how much values vary from one another
  • SEM quantifies how precisely you know the true mean of the population
  • SEM is always smaller than SD for samples of more than one observation, since SEM = SD / √n
  • SEM gets smaller as samples get larger, as the mean of a large sample is likely to be closer to the true population mean than the mean of a small sample
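A short sketch of the difference, using the common definition SEM = s / √n with s the sample standard deviation. The notation on the lecture slides may differ, and the journey times are invented. Repeating the same values is an artificial way to enlarge the sample while keeping the scatter roughly unchanged.

```python
from math import sqrt
from statistics import mean, stdev

def standard_error(sample):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return stdev(sample) / sqrt(len(sample))

# Hypothetical journey times in minutes.
times = [22, 25, 19, 30, 24, 27, 21, 26]
print(f"n={len(times)}  mean={mean(times):.2f}  SD={stdev(times):.2f}  SEM={standard_error(times):.2f}")

# A four-times-larger sample with similar scatter: SD barely moves, SEM shrinks.
larger = times * 4
print(f"n={len(larger)}  mean={mean(larger):.2f}  SD={stdev(larger):.2f}  SEM={standard_error(larger):.2f}")
```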
35
Range
difference between lowest and highest values in dataset
36
Quartiles
values that divide an ordered dataset into four equal-sized segments
37
outliers
data point in data set that is much larger or smaller than all of the other data points in data set
38
what outliers can do
  • skew mean, standard deviation, standard error
  • can provide incorrect result
  • can indicate incorrect data and point to a problem in data collection process
39
methods for checking for outliers
  • plot data
  • descriptive analysis (average, range, standard error, quartiles)
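One way to turn the quartile check into a rule is the 1.5 × IQR fence. That is a common convention rather than something stated in these notes, and the traffic counts below are invented for illustration.

```python
from statistics import quantiles

def iqr_outliers(data, k=1.5):
    """Flag points lying more than k * IQR outside the first or third quartile."""
    q1, _, q3 = quantiles(data, n=4)          # the three cut points between quartiles
    iqr = q3 - q1                             # interquartile range
    low, high = q1 - k * iqr, q3 + k * iqr    # fences beyond which points are flagged
    return [x for x in data if x < low or x > high]

# Hypothetical hourly traffic counts with one suspicious value.
counts = [410, 395, 402, 420, 415, 398, 405, 1250]
print(iqr_outliers(counts))
```

A flagged value is a prompt to check the data collection process, not proof that the observation is wrong.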
40
importance of transport planning
  • crucial in planning sustainable developments + ensuring accessibility for all individuals
  • design phase of all major public amenities requires significant transport planning
  • important at the planning stage of the following amenities: sporting venues (stadiums), retail parks, shopping centers, residential areas, industrial parks/commercial centers
41
Transport Planning
  • justify funding
  • obtain planning permission
  • environmental considerations
42
justify funding
detailed plan of how road/service will impact population needs to be conducted in justifying expenditure on new road/public transport service
43
obtain planning permissions
traffic impact assessment and transportation plan for new site important when large development being planned. These plans included in application for planning permission
44
environmental considerations
environmental considerations should be taken into account
45
Sustainable development
a socio-ecological process characterized by fulfillment of human needs while maintaining quality of natural environment indefinitely
46
key element in sustainable transport planning
minimize the distance individuals have to travel and, where longer-distance travel is necessary, ensure good public transport links are provided
47
CO2 emissions statistics
  • Road transport accounts for 21% of Irish CO2 emissions
  • Road traffic rising 2% per year
  • Global aviation growing at 5% per year
48
methods of transport planning
  • transport impact assessment (TIA)
  • traffic forecasting
49
transport impact analysis/assessment
study which assesses effects a particular development's traffic will have on transportation network in community
50
traffic impact studies help communities to
  • forecast additional traffic associated w/ new development
  • determine improvements necessary to accommodate new dev
  • assist in land use decision making
  • assist allocating scarce resources to areas which need improvements
  • identify potential problems w/ proposed development which may influence developer's decision to pursue it
  • allow community to assess impacts proposed development may have
51
why traffic forecasting is important
  • plan future transport needs
  • plan for congestion
  • measure maintenance needed on road network
  • plan for new large developments
52
what is traffic forecasting estimated on?
  • population + job forecasts
  • car ownership forecasts
  • travel demand forecasts
  • goods vehicle forecasts
53
capacity of a road
max flow of vehicles, per hour or per day, for a road
54
types of data
  • large scale data
  • in-depth behaviour data
55
large scale data
lots of observations, but little info for each eg. census travel to work/education, Irish Rail Census data
56
in-depth behaviour data
fewer observations, more detail for each eg. trips Trinity students make during a college week
57
transport survey constraints
  • can't collect all the data in all the detail you want
  • travel behavior tends to be complex
  • data costs money, more in-depth data costs more money
  • privacy and data protection issues
58
3 types of transport surveys
  • travel diaries
  • detection apps
  • surveys
59
travel diary considerations
  • sample considerations, who should take part
  • can't get everyone in college to take part, unrealistic, need diff students to get a good reflection of all students
  • should try to be as representative as possible of overall population
  • need a lot of info over prolonged timescale
  • need easy way to capture, store, analyse data
60
travel diary
diary where people record what trips they took, how they traveled, how long it took, why they traveled, what mode they took, etc
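As a sketch of how such diary entries might be captured and analysed, the record below holds the items listed on this card. The class name, field names and sample trips are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass
class DiaryTrip:
    """One trip as a respondent might record it (field names are hypothetical)."""
    origin: str
    destination: str
    mode: str        # how they traveled, e.g. "walk", "bus", "car"
    purpose: str     # why they traveled, e.g. "education", "work"
    minutes: float   # self-reported travel time

def mode_share(trips):
    """Fraction of recorded trips made by each mode."""
    counts = Counter(t.mode for t in trips)
    return {mode: n / len(trips) for mode, n in counts.items()}

# A made-up day of travel for one respondent.
diary = [
    DiaryTrip("home", "college", "bus", "education", 35),
    DiaryTrip("college", "shop", "walk", "shopping", 10),
    DiaryTrip("shop", "home", "bus", "shopping", 40),
    DiaryTrip("home", "gym", "walk", "leisure", 15),
]
print(mode_share(diary))
```

Keeping each trip as a structured record makes the diary easy to store and summarise, although the travel times remain respondents' estimates.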
61
travel diary advantages
  • tend to be simple + easy to interpret
  • don't require large amounts of digital literacy
62
travel diary disadvantages
  • participants may forget to input info or put it in later
  • estimates may not be accurate (travel time, distance, etc)
  • not able to gain more complex data (routes taken, modes available, etc)
63
detection apps
smartphone applications that automatically record trips
64
gps apps advantages
  • huge data collection potential
  • automatic detection
  • graphic + route specific outputs (maps etc)
65
gps apps disadvantages
  • not everyone has smartphone (65+ etc)
  • issues such as battery use + canyon effects
66
transport surveys
widely used to gain info about how people act/will act
67
transport surveys advantages
  • can get large no. of responses
  • can present hypothetical scenarios
  • relatively cheap to do
  • can ask large no. of questions + get large amount of info
68
transport surveys disadvantages
  • non-representative samples can bias results
  • have to assume respondents are reading all questions + answering honestly
  • have to make sure they understand what you are asking them