Module 10 Flashcards Preview

Y3 T2 Geography 322 > Module 10 > Flashcards

Flashcards in Module 10 Deck (44)
1
Q

interpolation

A

interpolation is estimating data points between the data
you already have
• eg, regression analysis and trendlines only apply
within the data set (from xmin to xmax); temperature is
measured only at weather stations, so can we
estimate the temperature between the stations?

2
Q

extrapolation

A

extrapolation is filling in data points beyond the data
that you have
• eg, using regression analysis to predict values
beyond the scale of the observations; estimating
temperature beyond the network of weather
stations
• extrapolation methods assume that the world
outside the data behaves the same as, or similarly to,
the world inside the data

3
Q

IDW

A

• inverse distance weighting estimates the value of each
location by taking the distance-weighted average of
the values of known points in its neighbourhood
• the closer a known point is to the location being
estimated, the more influence or weight it has in the
averaging process (ie, each known point has a local
influence that diminishes with distance)
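The distance-weighted average described above can be sketched in a few lines of Python; the function name and the station readings are hypothetical, not from any GIS package:

```python
import math

def idw(x, y, known, power=2):
    """Estimate the value at (x, y) as a distance-weighted
    average of known (xi, yi, zi) points."""
    num = den = 0.0
    for xi, yi, zi in known:
        d = math.hypot(x - xi, y - yi)
        if d == 0:
            return zi  # the surface passes exactly through known points
        w = 1.0 / d ** power  # closer points receive larger weights
        num += w * zi
        den += w
    return num / den

# hypothetical temperature readings at three weather stations
stations = [(0, 0, 10.0), (4, 0, 20.0), (0, 4, 30.0)]
estimate = idw(1, 1, stations)
```

Increasing `power` gives nearby stations more influence, producing a less smooth surface; lowering it lets distant stations pull the estimate, smoothing the result.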

4
Q

Tobler’s First Law of Geography

A

The First Law of Geography, according to Waldo Tobler, is “everything is related to everything else, but near things are more related than distant things.”

5
Q

IDW importance of the Power

A

• the power parameter determines the influence of the known
points on the interpolated value
• a higher power (eg, > 2) puts more emphasis on the nearby points
and produces a more varying and less smooth surface
• a lower power (eg, < 2) gives more influence to the distant points,
resulting in a smoother surface

6
Q

• neighbourhood size can be defined by the radius of a circle, or by the number of known points – in general, the ______ the neighbourhood the smoother the interpolated surface since the averaging procedure incorporates more of the actual data

A

larger

7
Q

Primary Features of an IDW result

A

• the surface passes through the sample points
• the interpolated values are always within the range of the measured values of
known points and will never be beyond the maximum and minimum values of the
known points

8
Q

Natural Neighbor

A

the natural neighbour interpolation method estimates the value of an unknown location
by finding the closest subset of known points to the location being estimated, then
applying weights to them based on proportionate areas
• each polygon contains 1 known point, and any unknown point within a given polygon is closer to that polygon's known point than to any other known point contained in other polygons
• this technique originated as a method to generate rainfall estimates, and has since spread throughout spatial science
• a new polygon is created around the given unknown point, which also adjusts the surrounding polygons but maintains the basic proximity rules
• only the known points belonging to polygons that have been adjusted will be included in the subset of points for interpolation, and the weight applied to each known point is proportional to the amount of overlap between the new polygon and the original polygons

9
Q

Trend Surface Interpolation: 3 types

A
  • a trend surface interpolation fits a smooth surface defined by a polynomial function to a set of known points, then uses the polynomial function to estimate the values of unknown locations
  • the trend surface is analogous to a least-squares regression equation – use a subset of points to define the relationship, then predict the z value of each point in the sample area
  • like regression analysis, there is a prediction error (the residual) at each known point

1st-order polynomial: planar surface (flat)
2nd-order polynomial: quadratic surface (some degree of curve)
3rd-order polynomial: cubic surface (very curvy)

trend surfaces are also an effective tool for smoothing the data – much like a filter, the trend surface removes high and low values and reveals the underlying spatial trend of the dataset
• orders 1–4 are most commonly used (ArcGIS allows up to 12th order); it is difficult to justify that some natural phenomenon behaves as an 8th-order polynomial, so it is best to avoid these cases
• trend surface interpolation is highly susceptible to extreme outliers (just like regression analysis), so examining the dataset beforehand and objectively removing the outliers is important
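A 1st-order (planar) trend surface is just a least-squares fit of z on x and y; the sketch below uses NumPy, and the sample coordinates and values are invented for illustration:

```python
import numpy as np

def fit_planar_trend(xs, ys, zs):
    """Least-squares fit of the 1st-order trend surface z = a + b*x + c*y."""
    A = np.column_stack([np.ones(len(xs)), xs, ys])  # design matrix: 1, x, y
    coeffs, *_ = np.linalg.lstsq(A, np.asarray(zs, float), rcond=None)
    return coeffs

def predict_trend(coeffs, x, y):
    a, b, c = coeffs
    return a + b * x + c * y

# invented sample points lying exactly on the plane z = 1 + 2x + 3y
xs, ys, zs = [0, 1, 0, 1], [0, 0, 1, 1], [1, 3, 4, 6]
coeffs = fit_planar_trend(xs, ys, zs)
```

Higher-order surfaces work the same way with extra polynomial columns (x², xy, y², …) in the design matrix, which is why they need more data points.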

10
Q

______ order polynomial equations need many data points to produce the surface, so a
bigger dataset is needed for trend surface interpolation

A

higher

11
Q

Spline

A

Estimates values at unknown locations using a mathematical function that minimizes overall surface curvature
▪ while there are several different types of spline functions, the most commonly used in GIS are thin-plate splines, which produce a surface that passes exactly through the known points while ensuring the surface is as smooth as possible

  • both regularized splines and splines with tension create smooth, gradually changing surfaces with estimated values that may lie outside the range of the maximum and minimum values for the known points
  • regularized splines run into significant problems by estimating steep gradients in data-poor regions – these are known as overshoots; in general, when t > 0.5 there are a greater number of overshoots
  • splines with tension allow the user to control the tension to be applied at the edges of the surface as a method of reducing overshoots
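A minimal sketch of thin-plate spline interpolation, assuming the classic basis φ(r) = r² ln r plus an affine term, with no tension or regularization; the point set is invented:

```python
import numpy as np

def tps_fit(pts, vals):
    """Solve for thin-plate spline coefficients so the surface
    passes exactly through the known points."""
    pts = np.asarray(pts, float)
    vals = np.asarray(vals, float)
    n = len(pts)
    d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
    with np.errstate(divide="ignore", invalid="ignore"):
        K = np.where(d > 0, d ** 2 * np.log(d), 0.0)  # phi(r) = r^2 ln r
    P = np.column_stack([np.ones(n), pts])            # affine part: 1, x, y
    A = np.zeros((n + 3, n + 3))
    A[:n, :n], A[:n, n:], A[n:, :n] = K, P, P.T
    coef = np.linalg.solve(A, np.concatenate([vals, np.zeros(3)]))
    return pts, coef

def tps_eval(model, x, y):
    pts, coef = model
    n = len(pts)
    r = np.linalg.norm(pts - np.array([x, y], float), axis=1)
    with np.errstate(divide="ignore", invalid="ignore"):
        phi = np.where(r > 0, r ** 2 * np.log(r), 0.0)
    a = coef[n:]
    return float(a[0] + a[1] * x + a[2] * y + coef[:n] @ phi)

known = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.4, 0.7)]
values = [1.0, 2.0, 3.0, 4.0, 2.8]
model = tps_fit(known, values)
```

Because the system is solved exactly, the surface honours every known point; between points it can overshoot beyond the data's min and max, as the card notes.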
12
Q

while there are several different types of spline functions, the most commonly used in GIS are _____ splines, which produce a surface that passes exactly through the known points while ensuring the surface is as smooth as possible

A

thin-plate

13
Q

Kriging

A

• kriging is a geostatistical method for spatial interpolation that is similar to IDW in that it
estimates the value of a variable at a location by computing a weighted average of the
known z values in its neighbourhood; however, the weights in kriging depend on the spatial variability in the values of the known points

• kriging assumes that in most cases spatial variations observed in environmental
phenomena (eg, variations in soil qualities, changes in the grade of ores) are random
but spatially correlated, and the data values characterizing such phenomena conform
to Tobler’s first law of geography – ie, spatial autocorrelation
• the exact nature of spatial autocorrelation varies from dataset to dataset, and each
set of data has its own unique function of variability and distance between known
points, which can ultimately be represented by the semivariogram

14
Q

Semivariogram

A

a semivariogram is a graph of the semivariance on the y-axis and the distance between known points (the lag) on the x-axis

in order to estimate the semivariance at any given distance, the data points are fitted with a continuous curve called a semivariogram model
• there are several different models, each designed to fit different types of phenomena and having different effects on the estimation of the unknown values, especially for nearby points
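The experimental semivariogram that the model is fitted to can be computed by binning pairwise semivariances by lag; the bin count and sample data below are arbitrary:

```python
import numpy as np

def experimental_semivariogram(pts, vals, n_bins=5):
    """Average the semivariance 0.5*(zi - zj)^2 of all point pairs
    within each lag-distance bin."""
    pts = np.asarray(pts, float)
    vals = np.asarray(vals, float)
    lags, gammas = [], []
    for i in range(len(pts)):
        for j in range(i + 1, len(pts)):
            lags.append(float(np.linalg.norm(pts[i] - pts[j])))
            gammas.append(0.5 * (vals[i] - vals[j]) ** 2)
    lags, gammas = np.array(lags), np.array(gammas)
    edges = np.linspace(0.0, lags.max() + 1e-9, n_bins + 1)
    centres, semivars = [], []
    for k in range(n_bins):
        in_bin = (lags >= edges[k]) & (lags < edges[k + 1])
        if in_bin.any():
            centres.append(lags[in_bin].mean())   # mean lag of the bin
            semivars.append(gammas[in_bin].mean())  # mean semivariance
    return centres, semivars

# a perfectly trending field z = x sampled along a line (invented data)
pts = [(i, 0) for i in range(6)]
vals = [float(i) for i in range(6)]
centres, semivars = experimental_semivariogram(pts, vals)
```

For spatially autocorrelated data like this, semivariance rises with lag, which is the shape the fitted semivariogram model then summarizes.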

15
Q

Kriging: range

A

the range represents the maximum distance between points where spatial autocorrelation occurs
• small ranges indicate that data values change more rapidly over space
• the range is used in kriging for defining the size of the neighbourhood so that spatially correlated known points are selected for interpolation

16
Q

Kriging: the sill

A

the sill represents the semivariance at the range value, and is typically the same as the variance of the whole dataset
• theoretically, at lag = 0, semivariance = 0, but most natural phenomena exhibit a nugget effect, where semivariance > 0 at lag = 0
• the nugget value represents a degree of randomness attributed to measurement error and/or spatial variations that occur at scales smaller than the sampling scale

17
Q

2 main forms of kriging used by ArcGIS

A

• Ordinary Kriging (for random data): assumes that there is no trend in the data and that the mean of the dataset is unknown – the weights are derived by solving a system of linear equations
which minimize the expected variance of the data values

• Universal Kriging (for trending data): assumes that there is an overriding trend in the data in addition to spatial autocorrelation among the known points, and this trend can be modeled by a
polynomial function

18
Q

ordinary kriging is for _______(trending/random) data

A

random

19
Q
  • in the use of kriging, more known points will produce a more accurate ________ model, and a more accurate interpolated surface
  • kriging also produces as additional output a map of ______ ______, which can be interpreted as showing where the interpolated surface is most, or least, accurate
A

semivariogram

standard errors

  • anything white or light grey indicates a poor estimate (high standard error)
  • dark grey/black indicates a good estimate (low standard error)
20
Q

T or F

kriging produces a surface which
passes through the known points and
the interpolated values are bound by the maximum and minimum of the known data

A

F – the interpolated values are NOT bound by the max and min

21
Q

every spatial interpolation method involves errors –

A

▪ an interpolated surface is a mathematical approximation of a continuous surface

22
Q

• the accuracy of an interpolated surface is often evaluated through cross-validation,
which evaluates the performance of the surface in 2 steps:

A
  1. it removes each known point one at a time and estimates its value based on the
    remaining known points using the chosen interpolation method
  2. then it compares the observed and estimated values to calculate estimation errors
    (eg, standard error, or standardized RMSE)
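The two steps above amount to a leave-one-out loop; the sketch below uses a simple IDW estimator only as a stand-in for "the chosen interpolation method", and all names are illustrative:

```python
import math

def idw_estimate(x, y, known, power=2):
    """Distance-weighted average of known (xi, yi, zi) points."""
    num = den = 0.0
    for xi, yi, zi in known:
        d = math.hypot(x - xi, y - yi)
        if d == 0:
            return zi
        w = 1.0 / d ** power
        num += w * zi
        den += w
    return num / den

def loo_rmse(known, power=2):
    """Leave-one-out cross-validation: drop each known point,
    re-estimate it from the rest, and report the RMSE."""
    errs = []
    for i, (x, y, z) in enumerate(known):
        rest = known[:i] + known[i + 1:]          # step 1: remove the point
        errs.append(idw_estimate(x, y, rest, power) - z)  # step 2: compare
    return math.sqrt(sum(e * e for e in errs) / len(errs))
```

A lower RMSE from this procedure suggests the interpolated surface generalizes better to unsampled locations.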
23
Q

in addition to errors inherent in the interpolation method, there are other common
sources of error in spatial interpolation(2)

A
  1. data uncertainty in sample data mainly results from too few known points, limited or clustered distributions of known points, and uncertainty about the locations and/or values of known points
    • in general, more known points = more accurate interpolation, but clustered points yield less information than evenly spread points
  2. edge effects refer to distortions of the interpolated values near the boundary of the study area due to the lack of sample data outside the area
    • in fact, near the edges the method is extrapolating, not interpolating
    • edge effects can be minimized by collecting data from outside the study site, including it in the interpolation, then clipping it out afterwards
24
Q

exploratory spatial data analysis

A

is the process of applying spatial statistical methods and tools to investigate spatial data in order to detect and quantify patterns in the data and to establish spatial associations between a given set of environmental events or phenomena
▪ in most circumstances, objectively random sampling methods are preferred because they are best for reducing bias and are most likely to lead to sample
representativeness of the population

25
Q

Spatial Interpolation Analysis(5)

A
  1. IDW (Inverse Distance Weighting)
  2. Natural Neighbor
  3. Trend Surface
  4. Spline
  5. Kriging
26
Q

Spatial Sampling types(5)

A

1. Simple random sampling

2. Stratified point sampling
▪ divides the study area into a number of mutually exclusive and collectively exhaustive strata, then takes a random sample within each stratum

3. Systematic random sampling
▪ systematic point sampling takes a sample according to some regular pattern, usually a regularly spaced grid
▪ the sampling interval is the chosen distance between sample points; the first point is chosen randomly, then every other point is chosen based on the sampling interval
▪ this method overcomes the problem of spatial unevenness in simple random point sampling and is often used when dealing with continuously distributed environmental phenomena, but may fail to detect the true extent of heterogeneity in the spatial pattern of the phenomenon being investigated

4. Clustered random sampling
▪ clustered point sampling first selects a number of sites randomly, then takes a random sample in the nearby area surrounding each site
▪ it excludes substantial parts of the study area, and it is hard to tell whether the sample is representative

5. Random transects
▪ transect sampling involves taking samples at fixed intervals, usually along lines
▪ a sampling line is set up across areas with clear environmental gradients
▪ the position of the transect line depends on the direction of the environmental gradient being studied
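Systematic point sampling (random origin, fixed sampling interval over a rectangular study area) might be sketched as follows; the function name and bounds are hypothetical:

```python
import random

def systematic_sample(xmin, ymin, xmax, ymax, interval, seed=None):
    """Regularly spaced grid of sample points: the origin is chosen
    randomly within one interval, then points repeat every interval."""
    rng = random.Random(seed)
    x0 = xmin + rng.uniform(0, interval)  # random first point
    y0 = ymin + rng.uniform(0, interval)
    pts = []
    y = y0
    while y <= ymax:
        x = x0
        while x <= xmax:
            pts.append((x, y))
            x += interval
        y += interval
    return pts

grid_sample = systematic_sample(0, 0, 10, 10, interval=2, seed=42)
```

Swapping the grid for purely random coordinates gives simple random sampling; running this inside each stratum gives stratified sampling.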

27
Q

▪ as a rule of thumb, __ sample points are required for detecting significant spatial autocorrelation, and __ or more sample points are needed for obtaining reliable measurements of spatial pattern or structure

A

30

100

28
Q

Weighted Mean Center

A

▪ in some cases, a weight can be assigned to each of the data points, such that points
with a greater weight influence the mean centre more – this is the weighted mean
centre

29
Q

Mean Centre

A

The mean center is the average x and y coordinate of all the features in the study area. It’s useful for tracking changes in the distribution or for comparing the distributions of different types of features.
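Both the mean centre and the weighted mean centre reduce to (weighted) averages of the x and y coordinates; a minimal sketch with invented points, not the ArcGIS tool itself:

```python
def mean_center(points, weights=None):
    """Average x and y of the points; optional weights pull the
    centre toward heavily weighted points (weighted mean centre)."""
    if weights is None:
        weights = [1.0] * len(points)
    total = sum(weights)
    cx = sum(w * x for (x, _), w in zip(points, weights)) / total
    cy = sum(w * y for (_, y), w in zip(points, weights)) / total
    return cx, cy

corners = [(0, 0), (2, 0), (0, 2), (2, 2)]
centre = mean_center(corners)                      # unweighted
pulled = mean_center(corners, weights=[3, 1, 1, 1])  # pulled toward (0, 0)
```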

30
Q

Median Centre

A

▪ the median centre finds the location that minimizes the sum of all distances to all features
▪ the Median Center tool is a measure of central tendency that is robust to outliers; it identifies the location that minimizes travel from it to all other features in the dataset
▪ the median centre is often used for finding the most accessible place or locating services and facilities in terms of accessibility

31
Q

Linear Directional Mean

A

Identifies the mean direction, length, and geographic center for a set of lines
▪ the input must be a line feature class

32
Q

Standard distance

A

▪ standard distance measures dispersion around the mean centre, and is expressed in the unit in which distance is measured in the dataset
▪ a larger standard distance means the data points are
more spread out
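A sketch of the standard distance computation, assuming the usual root-mean-square form about the mean centre; the sample points are invented:

```python
import math

def standard_distance(points):
    """Root-mean-square distance of the points from their mean centre;
    larger values mean the points are more spread out."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    return math.sqrt(sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / n)

compact = standard_distance([(0, 0), (1, 0), (0, 1), (1, 1)])
spread = standard_distance([(0, 0), (10, 0), (0, 10), (10, 10)])
```

The result is in the same unit as the input coordinates, as the card notes.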

33
Q

standard deviational ellipse

A

▪ the standard deviational ellipse measures the directional trend of a spatial distribution by computing an ellipse centred on the mean centre of the distribution
▪ the method calculates the standard deviation of the x coordinates and y coordinates from the mean centre to define the axes of the ellipse; the ellipse allows you to see whether the distribution of features is elongated and hence has a particular orientation

▪ the major axis of the ellipse shows the direction of
maximum dispersion of the features
▪ the minor axis shows the direction of minimum
dispersion

34
Q

spatial pattern analysis aims to

A

measure the degree to which geographical features
or their attribute values are clustered, dispersed, or randomly distributed across a
region

35
Q

Nearest neighbor analysis

A

involves measuring the distance of each feature to its
nearest neighbour, then comparing the observed distances with those expected from a random spatial pattern of features
▪ R ranges between 0 (all points occupy the same position) to 2.149 (regularly dispersed distribution), and R = 1 represents a perfectly random distribution
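The R statistic can be sketched as the ratio of the observed mean nearest-neighbour distance to the value expected under complete spatial randomness, assuming the usual expected distance 0.5·sqrt(A/n), distinct point locations, and no edge correction:

```python
import math

def nearest_neighbour_R(points, area):
    """Observed mean nearest-neighbour distance divided by the
    distance expected for a random pattern of n points in `area`."""
    n = len(points)
    d_obs = sum(
        min(math.hypot(x - x2, y - y2)
            for (x2, y2) in points if (x2, y2) != (x, y))
        for (x, y) in points
    ) / n
    d_exp = 0.5 * math.sqrt(area / n)  # expected under randomness
    return d_obs / d_exp

# a perfectly regular 5 x 5 grid, one point per unit cell
grid = [(i, j) for i in range(5) for j in range(5)]
R_regular = nearest_neighbour_R(grid, area=25)
```

R below 1 indicates clustering, R near 1 randomness, and R above 1 increasing regularity, up to the 2.149 maximum for the most dispersed arrangement.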

36
Q

Problems with Nearest Neighbor Analysis (3)

A
  1. the R statistic is highly dependent on the size and shape of the study area – a long and narrow area may result in a relatively low R value since the points tend
    to be close to each other
  2. the same R value could be obtained from very different point patterns since R only relates to the distance between points and does not account for other
    characteristics of the spatial arrangements of the points (eg, angular configuration)
  3. R only describes a spatial pattern in terms of the number of individual features in a given area and the distribution function of distances between them – it ignores spatial variations in feature attributes and spatial autocorrelation
37
Q

Spatial Autocorrelation

A

spatial autocorrelation occurs whenever the values of a variable or feature attribute at one location depend on the values of the same variables or features at nearby
locations
▪ when data are spatially autocorrelated, it is possible to predict the value at one location based on values sampled from nearby locations using interpolation methods; the absence of autocorrelation implies the data are independent

38
Q

Moran’s I statistic is a commonly used measure for

A

characterizing a spatial pattern in terms of spatial autocorrelation
▪ Moran’s I ranges from -1 to +1, where 0 represents a random pattern, +1 a perfectly clustered pattern; negative values are rare and represent negative spatial
autocorrelation
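Global Moran's I can be sketched directly from its formula, assuming a precomputed spatial weight matrix W (wij > 0 for neighbours, 0 otherwise, zero diagonal); the chain adjacency below is invented:

```python
import numpy as np

def morans_i(values, W):
    """Moran's I: (n / sum(W)) * (z' W z) / (z' z), where z are the
    deviations of the values from their mean."""
    z = np.asarray(values, float) - np.mean(values)
    W = np.asarray(W, float)
    n = len(z)
    return float((n / W.sum()) * (z @ W @ z) / (z @ z))

# four locations along a line, each a neighbour of the next
W_chain = [[0, 1, 0, 0],
           [1, 0, 1, 0],
           [0, 1, 0, 1],
           [0, 0, 1, 0]]
clustered = morans_i([1, 1, 5, 5], W_chain)    # like values adjacent -> positive
alternating = morans_i([1, 5, 1, 5], W_chain)  # like values apart -> negative
```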

39
Q

Local Moran’s I identifies

A

▪ spatial clusters of features with high or low values and spatial outliers (a high value surrounded by low values, or vice versa)
▪ detects spatial clusters

40
Q

Getis-Ord Gi* statistic tests

A

tests whether feature i and its neighbouring features
have higher or lower than average values of a particular attribute or variable

▪ a positive and significant Gi* value indicates that the feature is part of a spatial cluster of particularly high values (hot spot ), while a negative and significant Gi*
value suggests that the feature is located in a spatial cluster of particularly low values (cold spot )
▪ Gi* can be used to identify the clusters of features with values of a particular variable higher or lower in magnitude than might be expected by random chance

41
Q

▪ spatial patterns result from _______ (induced) or ______ (inherent) processes occurring in space and time

A

exogenous

endogenous

42
Q

spatial regression

A

▪ spatial regression follows the same logic as ordinary regression, but applies it to
spatial datasets to make predictions of some phenomenon based on some
influencing phenomena

43
Q

ordinary regression analysis

A

▪ ordinary regression analysis creates a single regression model to represent the
relationship between the dependent and independent variable(s) based on the
assumption that the relationship is static and consistent across the whole study area

44
Q

geographically weighted regression (GWR)

A

is a “local” regression procedure that
produces a unique regression model for each data point in the study area
▪ the emphasis in GWR is on the empirical coefficients produced during the
regression process – for each data point these coefficients change to reflect their
importance in determining the behaviour of the effect variable