Functional summary statistics Flashcards
(29 cards)
What is the first order moment?
It is the intensity measure μ(A) = 𝔼[N(A)].
Campbell’s formula
Suppose the intensity measure μ has an intensity function ρ. For h : R^d → [0, ∞),
𝔼[ ∑_{u ∈ X} h(u) ] = ∫_{R^d} h(u) ρ(u) du.
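As a quick check of Campbell's formula, taking h to be the indicator of a set A recovers the first order moment from the previous card. This is only a worked special case, not an additional result:

```latex
\mathbb{E}\Big[\sum_{u \in X} \mathbf{1}\{u \in A\}\Big]
  = \mathbb{E}[N(A)]
  = \mu(A)
  = \int_A \rho(u)\,\mathrm{d}u .
```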
Explain non-parametric estimators
They are estimators of the intensity that require no parametric model assumptions. They differ between the homogeneous and the inhomogeneous case. In both cases the point pattern x is observed in a bounded observation window W.
Homogeneous:
The natural (and unbiased) estimator for ρ, which is also the MLE under a homogeneous Poisson model, is ρ̂ = n(x ∩ W) / |W|, where n(x ∩ W) is the number of observed points in W.
Inhomogeneous:
For a pdf k on R^d, called the kernel, and an edge correction factor c_W(v) = ∫_W k(u − v) du,
the estimator of ρ(u) is ρ̂(u) = ∑_{v ∈ X ∩ W} k(u − v) / c_W(v).
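The two estimators above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the window is taken to be the rectangle W = [0, width] × [0, height], the kernel is an isotropic Gaussian rescaled by a bandwidth `bw`, and all function names are hypothetical.

```python
import math

def homogeneous_intensity(points, area):
    """Constant-intensity estimate: (number of points in W) / |W|."""
    return len(points) / area

def _norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def gauss_kernel_2d(dx, dy, bw):
    """Isotropic Gaussian density on R^2 with standard deviation bw (assumed kernel)."""
    return math.exp(-(dx * dx + dy * dy) / (2.0 * bw * bw)) / (2.0 * math.pi * bw * bw)

def edge_factor(v, width, height, bw):
    """c_W(v): integral of k(u - v) over W = [0, width] x [0, height]."""
    cx = _norm_cdf((width - v[0]) / bw) - _norm_cdf((0.0 - v[0]) / bw)
    cy = _norm_cdf((height - v[1]) / bw) - _norm_cdf((0.0 - v[1]) / bw)
    return cx * cy

def kernel_intensity(u, points, width, height, bw):
    """Edge-corrected kernel estimate of the intensity at location u."""
    return sum(
        gauss_kernel_2d(u[0] - v[0], u[1] - v[1], bw) / edge_factor(v, width, height, bw)
        for v in points
    )
```

With this edge correction each observed point contributes total mass 1 over W, so integrating the estimate over W gives back the number of observed points.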
List and define the different kernels
Uniform kernel: k(u) = 1{|u| < 1/2}.
Gauss kernel: k(u) = (2π)^{−d/2} exp(−∥u∥^2 / 2).
Epanechnikov (2D) kernel: k((u_1, u_2)) = e(u_1) e(u_2),
where e(u) = (3/4) (1 − u^2) 1{|u| ≤ 1}.
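A short sketch of the three kernels, with a numerical check that each one-dimensional profile integrates to 1, as a pdf must. It assumes the standard Epanechnikov profile (3/4)(1 − u²) on [−1, 1]; the function names and the midpoint-rule helper are illustrative only.

```python
import math

def uniform_kernel(u):
    """Uniform kernel on (-1/2, 1/2) in one dimension."""
    return 1.0 if abs(u) < 0.5 else 0.0

def gauss_kernel(u):
    """Standard Gaussian kernel; u is a sequence of d coordinates."""
    d = len(u)
    sq_norm = sum(x * x for x in u)
    return math.exp(-sq_norm / 2.0) / (2.0 * math.pi) ** (d / 2.0)

def epanechnikov_1d(u):
    """One-dimensional Epanechnikov profile (assumed standard form)."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0

def epanechnikov_2d(u1, u2):
    """Product kernel built from the one-dimensional profile."""
    return epanechnikov_1d(u1) * epanechnikov_1d(u2)

def midpoint_integral(f, lo, hi, n=2000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))
```

For example, `midpoint_integral(epanechnikov_1d, -1.0, 1.0)` should be close to 1, whereas the profile (3/4)(1 − |u|) would integrate to 3/4 and so would not be a pdf.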
What is bandwidth?
The bandwidth b is the spread (scale) parameter of the kernel; it enters through the rescaled kernel k_b(u) = k(u / b) / b^d. A larger bandwidth spreads the influence of each event point over a greater distance, but the estimate is then more strongly affected by edge effects near the boundary of the study region.
In practice, the choice of the bandwidth b matters more than the shape of the kernel k.
What is the second order factorial moment measure?
The second order factorial moment measure α^{(2)} is defined for C ⊆ R^d × R^d by
α^{(2)}(C) = 𝔼[ ∑_{u,v ∈ X}^≠ 1{(u, v) ∈ C} ],
where ∑^≠ denotes summation over pairs of distinct points.
It has a second order product density ρ^{(2)} if α^{(2)}(C) = ∫_C ρ^{(2)}(u, v) du dv.
State some properties of the second order moment
Cov(N(A), N(B)) = α^{(2)} (A × B) + μ(A ∩ B) − μ(A)μ(B).
If A and B are disjoint,
Cov(N(A), N(B)) = α^{(2)} (A × B) − μ(A)μ(B).
Var(N(A)) = α^{(2)}(A × A) + μ(A) − μ(A)^2.
What is the second order Campbell formula?
For h : R^d × R^d → [0, ∞),
𝔼[ ∑_{u,v ∈ X}^≠ h(u, v) ] = ∫∫ h(u, v) ρ^{(2)}(u, v) du dv.
What is the second order product density for Poisson?
For a Poisson process with intensity function ρ, ρ^{(2)}(u, v) = ρ(u) ρ(v).
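Combining this product density with the moment properties listed earlier gives the familiar Poisson variance and the zero covariance between counts in disjoint sets. The following is a worked consequence of the cards above, written out as a sketch:

```latex
\begin{align*}
\alpha^{(2)}(A \times B)
  &= \int_A \int_B \rho(u)\,\rho(v)\,\mathrm{d}v\,\mathrm{d}u
   = \mu(A)\,\mu(B), \\
\operatorname{Var}(N(A))
  &= \mu(A)^2 + \mu(A) - \mu(A)^2 = \mu(A), \\
\operatorname{Cov}(N(A), N(B))
  &= \mu(A)\,\mu(B) - \mu(A)\,\mu(B) = 0
  \quad \text{for disjoint } A, B .
\end{align*}
```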
What is the pair correlation function?
The pair correlation function is defined as
g(u, v) = ρ^{(2)}(u, v) / (ρ(u) ρ(v)),
where a / 0 = 0 for all a ≥ 0.
Poisson process: g = 1.
g > 1: More clustered than Poisson.
g < 1: More regular than Poisson.
If X is both stationary and isotropic, then (with a slight abuse of notation) g(u, v) = g(u − v) = g(∥u − v∥), so g depends only on the distance between u and v.
What is Ripley’s K function?
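Assume X is stationary with constant intensity ρ > 0. For r > 0 and any A ⊆ R^d with 0 < |A| < ∞,
Ripley’s K-function is given by
K(r) = (1 / (ρ^2 |A|)) 𝔼[ ∑_{u ∈ X ∩ A} ∑_{v ∈ X \ {u}} 1{∥u − v∥ ≤ r} ].
By stationarity, the right-hand side does not depend on the choice of A.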
Interpretation of K function in terms of clustering/regular.
For a Poisson process, K(r) = |b(0, r)| = r^d ω_d,
where ω_d = π^{d/2} / Γ(1 + d/2) is the volume of the unit ball in R^d.
If g > 1 (clustering): K(r) > r^d ω_d
If g < 1 (repulsion): K(r) < r^d ω_d
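The Poisson benchmark r^d ω_d is easy to compute. The sketch below uses hypothetical function names and evaluates ω_d with `math.gamma`, then compares a given K value against the benchmark.

```python
import math

def unit_ball_volume(d):
    """omega_d = pi^(d/2) / Gamma(1 + d/2), the volume of the unit ball in R^d."""
    return math.pi ** (d / 2.0) / math.gamma(1.0 + d / 2.0)

def poisson_k(r, d):
    """K(r) for a stationary Poisson process in R^d: the volume of b(0, r)."""
    return unit_ball_volume(d) * r ** d

def compare_to_poisson(k_value, r, d, tol=1e-12):
    """Label a K value relative to the Poisson benchmark at distance r."""
    benchmark = poisson_k(r, d)
    if k_value > benchmark + tol:
        return "clustered"
    if k_value < benchmark - tol:
        return "regular"
    return "poisson-like"
```

In the plane (d = 2) the benchmark reduces to πr², and in R^3 to (4/3)πr³.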
Interpretation of the K function.
ρK(r) can be interpreted as the expected number of further points of X in the ball b(0, r), conditional on 0 being a point of the point process.
What is Besag’s L function?
Besag’s L-function is given by
L(r) = (K(r) / ω_d )^{1 / d}.
Interpretation of the L function in terms of clustering/regularity
Poisson process: L(r) = r.
If g > 1 (clustering): L(r) > r.
If g < 1 (repulsion): L(r) < r.
What is the non-parametric estimation of the K function?
An unbiased estimator (when ρ is known): K̂(r) = (1 / ρ^2) ∑_{u ∈ x_W} ∑_{v ∈ x_W \ {u}} 1{∥v − u∥ ≤ r} / |W ∩ (W − v + u)|, where x_W = x ∩ W. In practice ρ^2 is replaced by an estimate.
What is the non-parametric estimation of the L function?
L̂(r) = (K̂(r) / ω_d)^{1 / d}.
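A minimal sketch of the translation-corrected K estimate and the derived L estimate, assuming a rectangular window W = [0, width] × [0, height] in the plane. For such a window the overlap |W ∩ (W − v + u)| equals (width − |dx|)(height − |dy|). The plug-in choice ρ̂ = n / |W| used when `rho` is not supplied is an assumption of this sketch, as are the function names.

```python
import math

def k_hat(points, r, width, height, rho=None):
    """Translation-corrected estimate of K(r) on W = [0, width] x [0, height]."""
    n = len(points)
    if rho is None:
        rho = n / (width * height)  # assumed plug-in estimate of the intensity
    total = 0.0
    for i, u in enumerate(points):
        for j, v in enumerate(points):
            if i == j:
                continue  # sum over pairs of distinct points only
            dx = abs(v[0] - u[0])
            dy = abs(v[1] - u[1])
            if math.hypot(dx, dy) <= r:
                # area of W intersected with its translate by u - v
                overlap = (width - dx) * (height - dy)
                if overlap > 0.0:
                    total += 1.0 / overlap
    return total / (rho * rho)

def l_hat(points, r, width, height, rho=None):
    """Estimate of Besag's L(r) in the plane: sqrt(K_hat(r) / pi)."""
    return math.sqrt(k_hat(points, r, width, height, rho) / math.pi)
```

The double loop costs O(n²) per value of r, which is fine for small patterns; larger data sets would usually call for a spatial index or an established package.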
What is second-order intensity reweighted stationary?
A point process is said to be second-order intensity reweighted stationary (s.o.i.r.s.) if, for any region A with 0 < |A| < ∞, the measure
K(B) = (1 / |A|) 𝔼[ ∑_{u ∈ X} ∑_{v ∈ X \ {u}} 1{u ∈ A, v − u ∈ B} / (ρ(u) ρ(v)) ],
defined for Borel sets B ⊆ R^d, does not depend on A. In this case K defines a measure called the second order reduced moment measure.
What is the empty space function and its use?
Assume X is stationary with constant intensity ρ > 0. Let r > 0. The empty space function is given by
F(r) = P(d(0, X) ≤ r) = P(X ∩ b(0, r) ≠ ∅),
where d(0, X) denotes the distance from the origin to the nearest point in X.
What is the nearest neighbor distribution function and its use?
Assume X is stationary with constant intensity ρ > 0. Let A ⊆ R^d with 0 < |A| < ∞. The nearest neighbor distribution function is defined by
G(r) = 𝔼[ ∑_{u ∈ X ∩ A} 1{d(u, X \ {u}) ≤ r} ] / (ρ |A|).
G(r) can be interpreted as the probability that a "typical point" in X has another point of X within distance r.
What is the J function?
Assume X is stationary with constant intensity ρ > 0. The J-function is given (for F(r) < 1) by
J(r) = (1 − G(r)) / (1 − F(r)).
Using the interpretation of the F- and G-functions, we get
J(r) = P((X \ {0}) ∩ b(0, r) = ∅ | 0 ∈ X) / P(X ∩ b(0, r) = ∅).
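As a reference point for reading F, G and J, the sketch below evaluates them for a stationary Poisson process. It relies on standard facts not stated on these cards: for Poisson, F(r) = 1 − exp(−ρ ω_d r^d) and G = F, so J ≡ 1; values of J below 1 are usually read as clustering and values above 1 as regularity. Function names are illustrative.

```python
import math

def unit_ball_volume(d):
    """omega_d = pi^(d/2) / Gamma(1 + d/2)."""
    return math.pi ** (d / 2.0) / math.gamma(1.0 + d / 2.0)

def poisson_f(r, rho, d=2):
    """Empty space function of a stationary Poisson process with intensity rho."""
    return 1.0 - math.exp(-rho * unit_ball_volume(d) * r ** d)

def poisson_g(r, rho, d=2):
    """Nearest neighbor distribution function; equals F for a Poisson process."""
    return poisson_f(r, rho, d)

def j_function(g_value, f_value):
    """J = (1 - G) / (1 - F), defined only where F < 1."""
    if f_value >= 1.0:
        raise ValueError("J(r) is only defined where F(r) < 1")
    return (1.0 - g_value) / (1.0 - f_value)
```

For instance, `j_function(poisson_g(0.1, 50.0), poisson_f(0.1, 50.0))` evaluates to 1, while plugging in estimates with G above F gives a value below 1.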
Edge correction factors/methods
The edge correction factor for a pair of points ξ, η is 1 / |W ∩ W_{η−ξ}|, where W_{η−ξ} denotes W translated by η − ξ.
A simpler alternative is minus sampling: for r > 0, let W_r = {ξ ∈ W : b(ξ, r) ⊆ W},
and use only the points in W_r as reference points when estimating the K function. Minus sampling therefore discards data from the outer region of the observation window.
Minus sampling does not use all points, whereas edge correction does. However, given sufficient data, one may prefer the reduced-sample estimate if very large weights 1 / |W ∩ W_{η−ξ}| are given to pairs of points.
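Minus sampling can be sketched as follows for a rectangular window W = [0, width] × [0, height], in which case W_r is the rectangle shrunk by r on every side. The reduced-sample K estimate shown here applies the definition of K with A = W_r and treats ρ as supplied by the caller; both the function names and that choice are assumptions of the sketch.

```python
import math

def minus_sample(points, r, width, height):
    """Points xi of the pattern whose ball b(xi, r) lies inside W."""
    return [
        p for p in points
        if r <= p[0] <= width - r and r <= p[1] <= height - r
    ]

def eroded_area(r, width, height):
    """|W_r| for the rectangle W = [0, width] x [0, height]."""
    return max(width - 2.0 * r, 0.0) * max(height - 2.0 * r, 0.0)

def k_minus(points, r, width, height, rho):
    """Reduced-sample estimate of K(r): reference points restricted to W_r."""
    area = eroded_area(r, width, height)
    if area <= 0.0:
        raise ValueError("r is too large: W_r is empty")
    count = 0
    for u in minus_sample(points, r, width, height):
        for v in points:
            # v ranges over all observed points except the reference point u itself
            if v is not u and math.hypot(u[0] - v[0], u[1] - v[1]) <= r:
                count += 1
    return count / (rho * rho * area)
```

Because every neighbor within distance r of a point in W_r is itself inside W, no weights are needed; the price is that points near the boundary never serve as reference points.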
How is model checking performed using functional summary statistics?
Suppose x is an observed point pattern.
Suppose we have fitted a point process model to the data.
We compute a non-parametric functional summary T_0(r) from our data (we omit the "hat" over T).
We compare this to the theoretical value T(r) under the fitted model.
Then we use envelopes, built from simulations of the fitted model, to check whether the observed summary is consistent with the model.
What is a pointwise envelope?
Simulate n i.i.d. point patterns from the fitted model and calculate the functional summary statistics T_1(r), . . . , T_n(r). Fix r > 0. If the fitted model is correct, (T_0(r), . . . , T_n(r)) is exchangeable (the joint distribution is preserved under permutation). Thus, if the probability of ties is zero,
P( T_0(r) ≤ min_{i=1,...,n} T_i(r) or T_0(r) ≥ max_{i=1,...,n} T_i(r) ) = 2 / (n + 1).
If n = 39, then 2 / (n + 1) = 0.05, so with probability 95% we expect T_0(r) to fall between the smallest and largest value of T_1(r), . . . , T_n(r).
The interval from min_{i=1,...,39} T_i(r) to max_{i=1,...,39} T_i(r)
is called a 95% envelope. Since r was fixed, it is a pointwise envelope.
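The pointwise envelope at a fixed r needs only the simulated summary values. The sketch below assumes those values have already been computed, since simulation from the fitted model is model-specific and not shown; the function names are hypothetical.

```python
def pointwise_envelope(simulated):
    """Lower and upper envelope at a fixed r from T_1(r), ..., T_n(r)."""
    return min(simulated), max(simulated)

def exceedance_probability(n):
    """P(T_0(r) falls on or outside the envelope) under the fitted model: 2 / (n + 1)."""
    return 2.0 / (n + 1)

def outside_envelope(t0, simulated):
    """True if the observed T_0(r) is at or beyond the simulated extremes."""
    lo, hi = pointwise_envelope(simulated)
    return t0 <= lo or t0 >= hi
```

With n = 39 simulations, `exceedance_probability(39)` gives 0.05, matching the 95% envelope described above.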